Strained Tag Soup

by Charles Miller on October 13, 2002

Disclaimer: I'm slowly moving this weblog from its rather soupy default templates to correct markup, but I generally can't play with HTML for more than about fifteen minutes at a time before getting bored. Do what I say, not what I do. :)

Dave Winer, What Is Tag Soup?

The Web is tag soup. People use blockquotes to indent. Even though the REST folk argue that it's anti-Web to do RPC, people do RPC anyway. There's a never-ending list of complaints, but they can be resolved. That's why I'm writing this little essaylet. [...] You can't put the genie back in the bottle. Only by making your world very small can you fail to see the enormity of getting everyone to see it your way. Better to adapt your thinking to their way, and see how you can make your vision fit into what is.


The most significant cause of Tag Soup is that the tools we have to work with are always catching up with what we want to do with them. The reason people use <blockquote> for indenting (as they used <ul> before it) is that, not long ago, it was the only way to indent text without resorting to complicated tables or spacer images. Now that the technology to indent text using CSS is widespread, we developers can migrate to more correct and more reliable ways to indent.

The tools aren't there yet, of course. The standards don't support everything we want to do, and the browsers don't support all the standards that do exist. We've got years to go yet, but that doesn't mean we shouldn't be moving in the direction of semantically correct web pages.

The movement is already happening. Like all good Internet movements, it started in the hands of the technical practitioners, as shown by the proliferation of weblogs that are moving away from table-based layouts and towards CSS styling. Any web designer who shows pride in their work will want to use the tools correctly. If there are two ways of doing something, the correct way and the hacky way, a professional will want to do things the correct way.

Wired has shown that a professional publication can follow the same route.

There are three things that, if we continue to do them as a community, may not eliminate tag soup, but will ensure everyone knows it's a bad thing, and at least tries to avoid it:

  1. Continue to improve both the standards and the browsers. Wherever web designers are doing something the hacky way because there is no way to do it correctly, provide a way to do it that doesn't break the semantic structure of HTML.
  2. Continue to evangelize the correct way. Continue to let designers and page authors know that there is a correct way, and there are good reasons to adopt it.
  3. Write applications that take advantage of the benefits of correctly structured web pages. Nothing would speed up the adoption of standards more than a killer application that works better on valid pages than it does on invalid ones.
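
To make the third point a little more concrete, here is a minimal sketch of the kind of application I have in mind: a tiny quotation extractor that trusts <blockquote> to mean "this is a quotation". Everything in it (the QuoteExtractor class, the extract_quotes function, the two sample pages) is made up for illustration; it only assumes Python's standard html.parser module. On the semantic page it finds the one real quote. On the soupy page, where <blockquote> is also used for indenting, it picks up an ordinary paragraph as well.

```python
from html.parser import HTMLParser


class QuoteExtractor(HTMLParser):
    """Collect the text of every <blockquote> element in a page.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.depth = 0     # how many <blockquote> elements we are inside
        self.buffer = []   # text fragments of the quote being read
        self.quotes = []   # finished quotes, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "blockquote":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "blockquote" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Collapse runs of whitespace into single spaces.
                text = " ".join("".join(self.buffer).split())
                if text:
                    self.quotes.append(text)
                self.buffer = []

    def handle_data(self, data):
        # Only keep text that appears inside a <blockquote>.
        if self.depth > 0:
            self.buffer.append(data)


def extract_quotes(html):
    """Return the text of each outermost <blockquote> in the given HTML."""
    parser = QuoteExtractor()
    parser.feed(html)
    parser.close()
    return parser.quotes


# Indentation done with CSS; <blockquote> used only for the actual quote.
SEMANTIC_PAGE = """
<p>Dave Winer wrote:</p>
<blockquote><p>The Web is tag soup.</p></blockquote>
<p style="margin-left: 2em">This paragraph is indented with CSS.</p>
"""

# <blockquote> used both for the quote and as an indenting hack.
SOUPY_PAGE = """
<p>Dave Winer wrote:</p>
<blockquote><p>The Web is tag soup.</p></blockquote>
<blockquote>This paragraph is only here to be indented.</blockquote>
"""

print(extract_quotes(SEMANTIC_PAGE))
print(extract_quotes(SOUPY_PAGE))
```

The extractor itself is identical in both runs; only the markup changes. That is the benefit a real application would be selling: the more honestly a page is tagged, the more useful the tool becomes.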
