The Problem with HTML

by Charles Miller on October 18, 2002

I was reading this thread on www-html about the possible addition of contentEditable to HTML (found via manero.org), and specifically this post by Christian Hujer:

Forms are not the task of HTML anymore. They are the task of XForms. Why should only HTML contain forms, but not XSL:FO, SVG, SMIL, MathML, DocBook? So XForms will be the pluggable language to add forms to content.

This really sums up the situation with HTML as far as the W3C is concerned. HTML4 is dead. XHTML1 was just a transition from a monolithic SGML format to a modular XML format. The as-yet-unfinished XHTML2 is the only way forward for the standardized web. Thus, any improvements we might want to make to the specification must end up in, and conform to the goals of, XHTML2.

For those not following the game, XHTML2 is most notable for the fact that it is incredibly backwards-incompatible with HTML. Where XHTML1 just required you to remember to close your tags in the right order, and maybe replace one or two attributes, XHTML2 does some pretty radical things like getting rid of the <img/> tag in favour of <object/>, and completely replacing HTML forms with XForms, which is about twice as complicated.

Add up the three years it's going to take for the standard to be acceptably implemented in browsers, and the three years after that until the majority of users have got around to upgrading. And even then, it's going to be harder to convince web authors to use XHTML2 than it was to convince them to use XHTML1, because it's such a radical change from what they are used to. HTML4 is going to live a very long time.

(As an aside, if you haven't yet, go read Ian Hickson's description of why we shouldn't even be using XHTML now, because browsers don't properly accept text/xml documents.)

So, as far as the W3C is concerned, if we want any new, standard features, we're going to have to wait six years for them, and we're going to have to dive into the XHTML2 Brave New World in order to use them. This leaves the browsers in an interesting position. The browser vendors want to deliver nifty things that users like. With the standards stuck in this time-warp, and the average web-hacker most likely to stick with the moribund HTML4, seeing no reason to switch to something more complex and less forgiving, it is inevitable that the browsers are going to extend HTML4 in weird directions out of the control of the W3C.

The W3C, on the other hand, can't really go back to adding things to the old SGML-based HTML, because that would undermine the push towards XHTML, an important effort that is being made for all the right reasons.

It's a tough one, innit.
