Bruce Lawson's personal site

This millennium in HTML 5 (politics)

OK, so it’s not quite a millennium, but the flame wars and bitch fests on the mailing list make it feel that way.

Microdata/ RDFa

According to the W3C’s RDFa primer, it is “a few simple XHTML attributes … [to] mark up human-readable data with machine-readable indicators for browsers and other programs to interpret. A web page can include markup for items as simple as the title of an article, or as complex as a user’s complete social network.”

HTML 5 had the temerity not to care about RDFa, for three reasons (as far as I can tell):

  1. It’s too hard for most markup monkeys to code
  2. It requires XML namespaces and HTML 5 doesn’t do namespaces
  3. If you copy and paste RDFa annotations, it won’t work in the destination page unless you also copy the RDFa definitions, which can be separated in the code.

So instead, HTML 5 introduces microdata, which does the same thing in a different way. To some, this has caused grave anguish. Shelley Powers is anguished because it competes with RDFa, because Ian Hickson wrote it on his own, because the HTML 5 spec is too big already, and because microdata isn’t as “good” as RDFa.
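
To make the namespace and copy-and-paste points concrete, here is a rough sketch of the same annotation in both styles. The example.org URLs and the property values are made up for illustration, and the microdata attribute names (itemscope, itemtype, itemprop) are the ones that were later standardised; the syntax was still being tweaked at the time of this post, so treat this as an illustration rather than a record of what the draft said.

```html
<!-- RDFa: "dc:" only resolves because of the xmlns:dc declaration on an
     ancestor. Copy just the inner <p> elsewhere and the prefix is undefined. -->
<div xmlns:dc="http://purl.org/dc/elements/1.1/">
  <p about="http://example.org/posts/1">
    <span property="dc:title">My blog post</span>
    by <span property="dc:creator">A. Blogger</span>
  </p>
</div>

<!-- Microdata: no prefix declaration to lose; the vocabulary URL
     (hypothetical here) travels with the item itself. -->
<p itemscope itemtype="http://example.org/vocab/BlogPost">
  <span itemprop="title">My blog post</span>
  by <span itemprop="author">A. Blogger</span>
</p>
```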

Jeni Tennison seems unanguished. While microdata isn’t as good as RDFa, she acknowledges, it’s easier to use.

Meanwhile, Philip Jägenstedt (who implements video for Opera but indulges his pash for the Semantic Web at weekends) goes crazy and, rather than philosophising, implements the microdata API in JavaScript to see how it compares. (His feedback to the Working Group.) He concludes:

  • Microformats, you’re a class attribute kludge
  • RDFa, HTML is not your triplestore
  • Microdata, I like you but you need more review

Anyway, this crazy mindset of Philip’s in which things are tested and the best one wins regardless of ideology seems to have infected others. Ian Hickson writes

I am going to try some tweaks to the Microdata syntax. Google has kindly offered to provide usability testing resources so that we can try a variety of different syntaxes and see which one is easiest for authors to understand.

As someone only tangentially concerned with machine-parsable data (I’ve used one microformat in my life, once), I’d love to see the three options compared by someone like Tantek, Jeremy Keith or Ben Ward. (Wish granted in Ben’s comment.)


Summary lovin’ had me a-blast

The decision to deprecate the summary attribute of table caused what we lucky denizens of W3C Working Groups officially term a right shitstorm.

From a barely-used accessibility add-on, it became a talisman whereby the mature accessibility experts with real experience beat up the juvenile upstarts who are all clever ideas and no practical knowledge, while the brave young pioneers with the courage to boldly forge new paths attacked the silly old farts who are afraid of change.

It’s settled down now, with summary remaining in the spec but advised against (a sensible compromise, in my opinion; I think summary is superfluous).

A highly condensed partisan history of Oh, those summary nights is available from Google’s Mark Pilgrim, who memorably did his bit for inter-generational bonding by calling some accessibility practitioners “charlatans and fools”.

ARIA integration

This is vital work. ARIA is a bolt-on spec that extends HTML 4 to cover the accessibility of Web Applications, which stretch HTML 4 to breaking point because it was designed for static documents, not apps.

That stretch is one reason that HTML 5 was invented; its original name was “Web Applications 1.0”. Because HTML 5 is a new spec, it’s better to have accessibility built-in rather than bolted-on.

Also, the ARIA spec is complex and puts a lot of work on the developer, so it’s in everyone’s interests that ARIA be folded into HTML 5’s existing structures as far as possible.
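
As a rough illustration of bolted-on versus built-in, compare a scripted slider annotated with ARIA against the HTML 5 range input. The ARIA attributes and the input type are real; the ids and values are invented, and how well any given browser exposes either one to assistive technology is an assumption you would need to test.

```html
<!-- HTML 4 plus ARIA: the author supplies the role, the state, the
     focusability and (not shown) the script that handles key presses
     and keeps aria-valuenow up to date. -->
<div id="volume" role="slider" tabindex="0"
     aria-valuemin="0" aria-valuemax="100" aria-valuenow="50"
     aria-label="Volume"></div>

<!-- HTML 5: the same semantics are implied by the element itself. -->
<label for="volume5">Volume</label>
<input id="volume5" type="range" min="0" max="100" value="50">
```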

Fortunately (and surprisingly, given the summary fracas) no-one seems to be disagreeing on the need for the HTML 5/ARIA integration process, which appears to be going smoothly and without horses’ heads being left in beds.

canvas accessibility

canvas is the immediate-mode drawing API available in HTML 5. It’s fast, and therefore sexy, because it doesn’t have a “DOM”, but that also means there is nothing for assistive technologies to hook into.

Some have suggested that canvas be removed from the main spec, but that would make little difference except to confuse authors who already have too many specs to cross-reference.

Because the spec retrospectively codified what was already working in the browsers, we just have to accept that those implementations are inaccessible for now and not use canvas where a more accessible method exists.
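
One stopgap, sketched here with an invented chart and placeholder figures, is to put an equivalent inside the canvas element as fallback content. Fallback content is rendered by browsers that don’t support canvas; whether assistive technology can reach it in browsers that do support canvas is an assumption that varies by browser, so this is a sketch and not a fix.

```html
<canvas id="sales-chart" width="300" height="150">
  <!-- Fallback content: a plain table carrying the same data the script
       would draw. The numbers are placeholders. -->
  <table>
    <caption>Sales by quarter</caption>
    <tr><th scope="row">Q1</th><td>120</td></tr>
    <tr><th scope="row">Q2</th><td>150</td></tr>
  </table>
</canvas>
```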

However, the first step towards solving a problem is acknowledging that there is a problem, and it’s acknowledged that the original use-cases anticipated for canvas don’t match what people are trying to do now (further discussion of this in my standards suck video interview).

I know that cabal member and fellow Operative Lachlan Hunt has been investigating how my old chum Bob Regan of Adobe went about retrofitting Flash for accessibility, way back in version 5. That’s a promising sign.

New chairs

Chris Wilson of Microsoft resigned as co-chair, and two people replace him: Maciej Stachowiak of Apple, a cabal member through and through, and Paul Cotton of Microsoft. I don’t know Paul, but it’s good to see Microsoft continuing its involvement, and certainly we’re seeing lots of specific feedback from Internet Explorer Program Manager Adrian Bateman.

Super Friends

Zeldman and a group of celebs including Eric Meyer, Tantek Çelik and Dan Cederholm adopted the moniker HTML5 Super Friends and endorsed the direction the spec’s going, thereby immediately making it primetime.

They have a useful list of concerns. Regular readers (hello Mum!) may recall that I’ve previously submitted concrete proposals to the Working Group about the content model for small, the problem of legend in new elements, and the unnecessary restrictions to the time element, so I’ll not reprise those, but here are my thoughts on the Super Friends’ hiccups.


I agree that hgroup is clumsy and likely to be misused. Rather than wrap an h1 and its h2 subtitle in hgroup to keep the subtitle out of the outlining algorithm, I would prefer to use

<h1>My blog</h1>
<subtitle>My wit and wisdom</subtitle>

as I think that’s easier to understand than a heading-that’s-not-a-heading, and it removes a wrapping element.
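
For comparison, the hgroup version that this would replace looks roughly like the following, as the draft describes it. Only the h1 contributes to the outline; the h2 is masked by the wrapper.

```html
<hgroup>
  <h1>My blog</h1>
  <!-- Kept out of the outline: an hgroup is represented by its
       highest-ranked heading only. -->
  <h2>My wit and wisdom</h2>
</hgroup>
```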

The footer content model

I agree that it’s daft that you can have nav, headings and sections in a header but not a footer, and this contradicts what many people already do—see Webcamp BKK for an example taken at random from

Update 4 September: and, lo!, it is done:

Contexts in which this element may be used: Flow content, but with no header or footer element descendants.
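
With that change, a footer along these lines should be allowed. The heading text and links are invented for illustration.

```html
<footer>
  <h2>Elsewhere on this site</h2>
  <nav>
    <ul>
      <li><a href="/about/">About</a></li>
      <li><a href="/contact/">Contact</a></li>
    </ul>
  </nav>
  <p>Some rights reserved.</p>
</footer>
```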

The dialog element

I’ve said before that dialog is an abomination. It perpetuates the HTML 4 use of dd and dt for dialogues, which are better marked up with blockquote and cite.

That better option is no longer available in HTML 5, which unnecessarily restricts the cite element to marking up only the title of a work (in HTML 4 you can use it to mark up a name).

In my opinion, the HTML 5 spec is overly restrictive: I’d like to continue using the blockquote/cite pattern, because dialog has no mechanism for marking up stage directions such as “Exit, pursued by a bear”.
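
A sketch of the two patterns side by side, using the line from The Winter’s Tale. The dialog markup follows the draft element as described above (dt for the speaker, dd for the speech); the blockquote/cite version is one plausible reading of the HTML 4 pattern, not a canonical form.

```html
<!-- Draft dialog element: speaker and speech only, with nowhere obvious
     for the stage direction to go. -->
<dialog>
  <dt>Antigonus</dt>
  <dd>I am gone for ever.</dd>
</dialog>

<!-- blockquote/cite: cite marks up the speaker's name (fine in HTML 4,
     not allowed by the HTML 5 draft), and the stage direction is an
     ordinary paragraph. -->
<blockquote>
  <p><cite>Antigonus</cite>: I am gone for ever.</p>
  <p>Exit, pursued by a bear.</p>
</blockquote>
```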

Frankly, I’m not even certain that dialog merits its own element. If no-one is convinced that marking up ancient or fuzzy dates has a use-case, what is the use case for dialog? Adrian Bateman of Microsoft agrees with me:

We also don’t think dialog adds sufficient value to justify the spec, implementation, and test cost.

Learn more

If you’re looking to read the HTML 5 spec from a markup author’s perspective (that is, ignoring all the stuff for implementors), you can find the huge single-page Author spec and multipage Author version.

Stack Overflow’s What improvements to accessibility are offered by HTML5? is an unbiased overview.

There’s a 10 min video interview on Standards Suck talking about some of the issues above, and an interview with Government Computer News in which I talk about HTML 5 development and its relationship to the US accessibility standards section 508.

And, by the way, if you’re interested in HTML 5, please vote for my South By Southwest HTML 5 panel.


23 Responses to “This millennium in HTML 5 (politics)”

Comment by Nicolas Gallagher

How would something like the element work if I had multiple subtitles of differing rank? The element would let me have h1, h2, and h3 (and so on) inside it.

The other issue is that if we’re concerned about the way in which people already use elements, then it is worth noting that most people already mark up their subtitles or taglines with headings. Most of them don’t even care about the document outline and don’t seem to be asking for a new element to mark up sub-headings.

Getting into the realms of pedantry, the word “subtitle” also refers to textual versions of the dialog in films and television programs.

The “HTML5 Super Friends” suggestion of using a boolean attribute feels a bit ugly and may be open to some type of “abuse” if people can slap it on any heading they want to be removed from the outline (I don’t know why, but it could end up being an undesired consequence in the future).

It would also be interesting to see what was more intuitive for authors – a new wrapping element or adding boolean attributes to the relevant (sub)headings.

One additional factor that is worth thinking about is how some of these things are going to work (or not) in a CMS WYSIWYG environment.

Comment by Nicolas Gallagher

That first sentence above was meant to say:

How would something like the “subtitle” element work if I had multiple sub-headings of differing rank. The “hgroup” element would let me have h1, h2, and h3 (and so on) inside it.

Comment by Philip Jägenstedt

Actually, Hixie’s call for feedback predates my blog post and it would be fair to say that Hixie’s (crazy) mindset has influenced me rather than the other way around. I agree that feedback/comparisons from people who actually work with these things would be great. But as you’ll see on some of the names you mention already have a vested interest of sorts. That’s not to say I don’t of course…

Comment by Divya

(er.., please disregard my previous comment).

I, actually, like the hgroup, only because it lets me group two/or more headings together. But I only want hgroup to be applicable to the parent container. The use case I see is: I am marking up a music release, and I want to put the title of the release in h1, and the artist name in h2. Somehow, it seems orderly to put them in a hgroup.

Comment by Bruce

Divya, thanks – fixed the typo (Paul Cotton is from MS). Still thinking about your hgroup use-case.

Shelley and foolip, yeah I know that Tantek (and Jeremy and Ben) are microformat evangelicals – and I’d expect them to be partisan, but I’ve heard arguments supporting RDFa and microdata so would like to complete the set (as it were).

Comment by Kyle Weems

This is a great (and amusing) recap of HTML5 so far. Wait, is it HTML5 or HTML 5? I swear I’ve seen two simultaneous declarations to that effect.

Also, I agree that dialog just seems really, really unneeded. It feels like you’re inventing a solution to a problem nobody knew they had, or are already solving quite fine with existing tools.

Comment by Steve

What’s the use case for a “subtitle” element? I can only think of proper subheadings that should appear in an outline or tag-lines that can be marked up as paragraphs within the “header” element (they aren’t headings, after all).
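
The tag-line-as-paragraph approach would look something like this, with the text borrowed from the post’s own example:

```html
<header>
  <h1>My blog</h1>
  <!-- A tag-line, not a heading, so it never enters the outline -->
  <p>My wit and wisdom</p>
</header>
```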

Comment by Bruce

Frankly, Steve, I’m with you, but if people really feel the urge for subheadings, I’d like to use a subheading/ subtitle element rather than the hgroup element.

Comment by Ben Ward

Nice summary as ever, Bruce.

Since you asked, my views on Microdata are generally pretty good. Needs polish, and I share a concern with Tantek that having explicit vocabularies baked into the HTML spec as examples might cause confusion or conflict with existing users of the vocabularies, but it seems to address all the use cases we’ve come across over the years in Microformats.

Critically, there is one divergent design goal. HTML5 microdata has a design goal that user agents should be able to generically parse all microdata from a page, even if the UA itself does not understand what to do with it. It’s a pretty good search engine use case, actually. From a Yahoo perspective, our SearchMonkey product parses and stores structured data from pages in RDFa and microformats using custom parsing rules, and then makes those data structures available to app developers to interpret. Parsing everything from microdata, even if SearchMonkey doesn’t understand it, would be pretty handy.

Microformats, by contrast, are designed in a space where parsing them generically is not enough of a use case in itself. Our limited number of specifications are built around solid use cases for consumption, and so parsers needing to understand the vocabulary to parse microformats isn’t a problem to us.

The generic/specific breakdown aside, Microdata handles some of the edges of microformats parsing gracefully. At various points we had to develop patterns within HTML to handle more complex data relationships (the include-pattern, value-class-pattern, for example.) Microdata adds dedicated tools that could replace these (yay!).

Long term, it means that as a group we will be able to switch from defining vocabularies and syntax, and just do vocabularies.

… Which is not to say that `itemprop` isn’t a really ugly attribute name. But, well, I think we’ve learned to deal with that sort of thing in HTML.

Concerning other parts of your post, one quibble with regard to `dialog`. Personally, I like it. The `dt`, `dd` part comes from HTML4 legitimizing that for `dl`, sure, but it’s pretty clean. The main issue I see with your response (`cite` and `blockquote`) is that `blockquote` should surely only be used for, well, quoting (ideally from another citeable resource). A dialog is not necessarily a quote. You may be paraphrasing, writing fiction, or simply documenting a conversation for the first time.

I regard the semantic strength of blockquote indicating that content comes from elsewhere on the web as a valuable reason not to dilute it, and as such prefer the dedicated dialog. That leaves it open for `blockquote > dialog` structures when you are explicitly quoting a dialog.

Comment by Steve

Thinking about hgroup…

The spec recommends that every header element is at the beginning of some piece of sectioning content, with every header being a . So, how about a simple element, with numbering automatically determined by the level of nesting.

The sectioning element can then act as a , with only the first appearing in the outline and any others in the section represent subheadings not to be in the outline.

Its backwards compatible as only will use this new logic, and h1,h2,… can be as they were. And you can have more than 6 levels of headings (not that you’d need to).

Comment by Steve

Didn’t realise that my elements would be taken literally:

The spec recommends that every header element is at the beginning of some piece of sectioning content, with every header being a ‘h1’. So, how about a simple element ‘h’, with numbering automatically determined by the level of nesting.

The sectioning element can then act as a ‘hgroup’, with only the first ‘h’ appearing in the outline and any others represent subheadings not to be in the outline.

It’s backwards compatible as only ‘h’ will use this new logic, and h1,h2,… can be as they were. And you can have more than 6 levels of headings (not that you’d need to).
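
A sketch of how that might look. The h element is the hypothetical proposal from this comment and is not in the spec; the code comments describe the behaviour proposed above, not anything a browser does.

```html
<section>
  <h>Politics</h>          <!-- level 1: first h in the outermost section -->
  <h>This millennium</h>   <!-- later h in the same section: a subheading, left out of the outline -->
  <section>
    <h>Microdata</h>       <!-- level 2: first h, one section deeper -->
  </section>
</section>
```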

Comment by Michael Kozakewich

This is probably the best recap so far.

I’ve noticed that there’ll be nothing special, but then someone will happen to mention a list of things they’re thinking about, someone will talk more in depth on one or two of those points, and then everyone will start disagreeing on one thing or another.
Suddenly, you’ve got this polarized debate over a single item on someone’s list. You can almost guess when it’s going to happen.
