Things I have learned about XHTML

Published June 21, 2008. Filed under: Pedantics, Web standards.

The following are gleaned from the comments on my recent explanation of why I chose HTML 4.01 Strict for my redesign rather than a flavor of XHTML. In that explanation I mostly boiled the debate (for my needs, here on this site) down to “XHTML doesn’t offer me any compelling advantage, and it’s more complex to do right than most people know/admit”.

Advance warning: yes, this is snarky and is going to make fun of uninformed comments. Yes, I do think it’s necessary to call people out on this kind of thing. Yes, if you don’t like it you should go read something else. So let’s get started.

Craig says that

I don’t really agree that XHTML is any more complex than HTML. If anything, there are fewer tags/attributes.

Since XHTML 1.0 was a tag-for-tag and attribute-for-attribute identical reformulation of HTML 4.01 into XML, I have a tough time understanding this one.

T. Bille chimed in and contributed the fact that

Something left out of the picture imho is the compliance with disability standards which usualy imply stricter checking / xhtml compliance.

For reference, here are the relevant accessibility specs: WCAG 1.0 and WCAG 2.0. Try out your browser’s in-page search, and you’ll find that a certain sequence of characters — to wit, “XHTML” — is conspicuously absent from the contents of both documents. Irony of ironies: WCAG 1.0 is an HTML 4 document.

Next up is Tedel, who mentions

However, I have also found good advantages of using XHTML 1.0 strict over HTML 4.1 strict, especially in search engine optimization techniques.

Since XHTML 1.0 Strict and HTML 4.01 Strict are, again, identical in terms of tags and attributes, I’m somewhat amazed by this.

Then there’s Timbo with an urgent plea:

Please use XHTML. It’s so much easier to scrape your data with.

Sorry, Timbo, I already use the most advanced markup-scraping tool on the planet, and so should you.

Don Ulrich contributed several gems to the conversation, but this one was my favorite:

If ppl only read the w3c spec they could understand how robust (X)HTML is. Most only use a fraction of its resources. We have created many a false social meme about markup.

Indeed. XHTML is so robust that, for example, a document that’s invalid XHTML will be rendered correctly by your web browser even when served as application/xhtml+xml, so long as it’s still well-formed XML. However, it’s also so fragile that a document’s well-formedness status can change based only on the details of the transport protocol used to get it in front of your eyeballs (say, a charset in the HTTP Content-Type header that disagrees with the encoding the document declares for itself).
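For anyone who wants to poke at both halves of that, here is a minimal sketch using the non-validating XML parser in Python's standard library. It is not what any browser actually runs, and the two sample documents, the `is_well_formed` helper, and its `transport_charset` parameter are all made up for illustration. The parameter is only a stand-in for a charset that arrives with the transport rather than inside the document.

```python
import xml.etree.ElementTree as ET

# Well-formed XML, but invalid XHTML 1.0 Strict: <head> has no <title>,
# and a <div> is nested inside a <p>. (Illustrative sample document.)
INVALID_BUT_WELL_FORMED = (
    b'<html xmlns="http://www.w3.org/1999/xhtml">'
    b"<head></head>"
    b"<body><p><div>block inside a paragraph</div></p></body>"
    b"</html>"
)

# A document that declares ISO-8859-1 and really is encoded that way.
# (Illustrative sample document.)
LATIN1_DOCUMENT = (
    '<?xml version="1.0" encoding="iso-8859-1"?>'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>caf\u00e9</title></head><body><p>caf\u00e9</p></body>"
    "</html>"
).encode("iso-8859-1")


def is_well_formed(body, transport_charset=None):
    """Return True if `body` parses as XML with a non-validating parser.

    `transport_charset` is a hypothetical stand-in for a charset parameter
    on the HTTP Content-Type header. When given, it overrides the encoding
    declared inside the document itself.
    """
    parser = ET.XMLParser(encoding=transport_charset)
    try:
        parser.feed(body)
        parser.close()
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    # Invalid XHTML, yet the parser never consults the DTD.
    print(is_well_formed(INVALID_BUT_WELL_FORMED))
    # Same bytes each time; only the charset claimed by the "transport" changes.
    print(is_well_formed(LATIN1_DOCUMENT))
    print(is_well_formed(LATIN1_DOCUMENT, "iso-8859-1"))
    print(is_well_formed(LATIN1_DOCUMENT, "utf-8"))
```

The first three calls should come back `True` and the last one `False`: the bytes never change, but once they are read as UTF-8 the lone `0xE9` byte is no longer a legal character sequence and the parser gives up.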

Although I do have to give an honorable mention to his most recent contribution:

And lastly XHTML is an application. Where HTML is markup.

I could go on like this for a while, and I probably should have expected that my article would bring some uninformed kooks out of the woodwork, but seriously? People? It’s 2008, and the information needed to clear up all of the above confusion is publicly available and has been for years. If you’re a professional web designer or a professional web developer and you can’t spot the problems with these comments, then I weep for the future of our industry.