One of the things that have long irritated me about HTML is the restriction on what elements are allowed inside lists.
The specs for both HTML 4 and 5 allow only li inside ul and ol, and only dt and dd inside dl definition lists. I’d like to expand that to allow headings (h1 to h6), section and div.
I was talking to two of Opera’s Standards reps, Anne van Kesteren and Lachlan Hunt, about this and they suggested that I make a proposal to the HTML 5 working group, with appropriate use cases.
So before I make a tit of myself by putting a flawed proposal to that somewhat grumpy group, I thought I’d do what Eric Meyer did and ask developers at large what you think. Here’s my reasoning, and if you have any more use cases or objections, please let me know.
Allowing headings (h1 to h6) in lists
Until recently, I worked for the Law Society and Solicitors Regulation Authority. In such a business, we spent a lot of time marking up rules, regulations and statutes.
In the UK, as in most (all?) other jurisdictions, laws and rules are written with numbered paragraphs. Within those lists are headings that introduce sections. The headings are not part of a list item, but group list items. Check out any of the thousands of examples at Office of Public Sector Information or the UK Statute Law Database.
Here’s a small but nevertheless real-world example: take a quick look at the Solicitors’ Practising Certificate Regulations 1995 (PDF 34K), which I naturally want to mark up like this:
<ol>
<li>These regulations replace the Practising Certificate Regulations 1976 in relation to all practising certificates, and applications for practising certificates, for any period commencing on or after 1st November 1995.</li>
<h2>Requests for information</h2>
<li>In addition to information supplied on any prescribed form under these regulations, solicitors must supply to the Law Society such information as to their practice as solicitors as the Society shall from time to time reasonably require for the purpose of processing applications.</li>
<h2>Replacement date and conditions</h2>
<li>The replacement date for every practising certificate shall be the 31st October following the issue of the applicant’s current practising certificate.</li>
<li>Every practising certificate shall specify its commencement date, its replacement date, and any conditions imposed by the Law Society</li>
</ol>
You’ll notice that the heading "Replacement date and conditions" is not part of either of the following two items, so it is not a child of either li. Instead, it groups (or introduces) them, and therefore its semantically most appropriate location is as a child of the surrounding ol.
Another way to mark up this document is as a succession of headings and paragraphs, with each paragraph beginning with a hard-coded paragraph number, perhaps surrounded with a span that is styled with display:block; in order to make the number look like a list marker. This spectacularly fails the Bruce Lawson Markup Duck Test, which states that if it looks like a duck, walks like a duck and quacks like a duck then it is a duck: a list of paragraphs, each beginning with a number indicating the order of the paragraphs, is an ordered list and needs to be marked up as one.
Take a more complex example, Legal Services Act 2007, paragraphs 203-206. This legislation is a long list of numbered paragraphs, interspersed with headings to group the following paragraphs into sections. Being more complex, this legislation has nested (ordered) sublists, but the same logic and basic structure holds here too:
<li><h5>The giving of notices, directions and other documents in electronic form</h5>
<li>[subparagraph 2]</li> …
<h4>Orders, rules etc</h4>
<li><h5>Orders, regulations and rules</h5>
… lots of subparagraphs …
<li><h5>Consultation requirements for rules</h5></li>
<li><h5>Parliamentary control of orders and regulations</h5></li>
A counter-argument is that the whole piece of legislation is an ordered list of sections, each containing a sublist of the paragraphs within that section. That is a legitimate way to look at it, except that the actual numbered paragraphs would no longer have the correct paragraph numbers auto-generated, as they’d be split into sublists.
Playing with CSS counters wouldn’t help, as different lists are treated as separate entities, so numbering in one list can’t follow on from numbering in another list. To avoid the paragraph immediately below a section heading (the h4 in my code example above) going back to 1, you would have to give the ol a start attribute and hard-code the paragraph number, making a mockery of the idea of automatically generating numbers in ordered lists. Even if it could be faked with CSS counters or by hard-coding the start attribute, it shouldn’t be, because that fails the Duck Test, too.
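To make the cost of that workaround concrete, here is a minimal sketch of the start-attribute approach. The headings come from the Practising Certificate example above; the bracketed paragraph text is placeholder, and the way the lists are split is my own illustration rather than markup from any real site.

```html
<!-- Sketch of the workaround: one ol per section, renumbered by hand. -->
<ol>
  <li>[paragraph 1]</li>
</ol>

<h2>Requests for information</h2>
<!-- start="2" is hard-coded; nothing ties it to the list above. -->
<ol start="2">
  <li>[paragraph 2]</li>
</ol>

<h2>Replacement date and conditions</h2>
<!-- This value has to be edited by hand whenever an earlier
     paragraph is added or removed. -->
<ol start="3">
  <li>[paragraph 3]</li>
  <li>[paragraph 4]</li>
</ol>
```

Every start value is a number the author has to maintain, which is exactly the bookkeeping an ordered list is supposed to do for you.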
For HTML 5, it would be ideal if the spec allowed the new section element to be a child of a list. This means that content could be pulled from a CMS into different pages with different heading hierarchies, and the headings would automatically be the correct level within that context. This is an idea from the XHTML 2 spec, which has an unnumbered h element:
Structured headings use the single h element, in combination with the section element to indicate the structure of the document, and the nesting of the sections indicates the importance of the heading. The heading for the section is the one that is a child of the section element.
In HTML 5 this is complicated by backwards compatibility, so any heading element from h1 to h6 can be chosen, and the headings and sections algorithm determines what “level” it actually is. (See A Preview of HTML 5 for a more readable discussion of this.)
I’ve marked up the Practising Certificate example as HTML 5 and styled the different levels of h1s using CSS so you can see a practical example of the usefulness of allowing headings and section to be children of a list.
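For readers who want to picture the proposal without opening the demo, the structure would look roughly like this. It is a sketch of the idea only: section is not a permitted child of ol in either spec, the bracketed paragraph text is placeholder, and the real demo page may differ in its details.

```html
<!-- Proposed structure only: not valid under the HTML 4 or HTML 5 specs. -->
<ol>
  <li>[paragraph 1]</li>

  <section>
    <!-- h1 inside a section; its effective level would come from nesting -->
    <h1>Requests for information</h1>
    <li>[paragraph 2]</li>
  </section>

  <section>
    <h1>Replacement date and conditions</h1>
    <li>[paragraph 3]</li>
    <li>[paragraph 4]</li>
  </section>
</ol>
```

The point is that the whole regulation remains one ordered list, while each section carries its own heading.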
Headings in definition lists
An example in a definition list would be similar. Here’s a real-world glossary marked up as a definition list (which is the best way to mark them up, in my opinion, although some favour other markup).
A really long alphabetical glossary would be enhanced by dividing it up with headers for each letter of the alphabet, for reasons of scannability, or so an on-the-fly table of contents generator could make a linked table of contents above the glossary.
That could be done by the following (illegal code):
<dd>Never hurt anybody</dd>
<dd>The lower limbs of people standing side-by-side</dd>
<dd>The finest car known to man</dd>
<dd>See Christian Heilmann, Tom Hughes-Croucher</dd>
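The overall shape of that markup, with a heading per letter sitting directly inside the dl, can be sketched as follows. The bracketed terms and definitions are placeholders rather than the glossary’s real entries, and the choice of h2 is an assumption carried over from the ordered-list example.

```html
<!-- Illustrative only: headings are not permitted children of dl. -->
<dl>
  <h2>A</h2>
  <dt>[term beginning with A]</dt>
  <dd>[definition]</dd>

  <h2>B</h2>
  <dt>[term beginning with B]</dt>
  <dd>[definition]</dd>

  <!-- ...and so on through the alphabet, all inside the one dl -->
</dl>
```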
You might say that each letter of the alphabet should have its own dl. I contend that a glossary is a single entity, not twenty-six different lists and would reply "Tish and pish, sir. You are a nincompoop, and your words are balderdash, poppycock and gobbledegook."
And I’d be right, and you’d be sorry.
div as a child of a list
While we’re talking of rules and specifications, I’d like to know why I can’t use div inside a list.
Mostly I’d like to do this so that I could properly style definition lists to look like tables.
You can’t reliably style definition lists at the moment, but you can if you wrap a dt and its associated dds in a div. This is illegal, but works cross-browser already.
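As a rough sketch of the technique, the wrapper gives CSS a box per term to hang a row on. The class name tabular, the bracketed content and the particular display values are all my own assumptions for illustration; other styling approaches would work with the same wrapper.

```html
<style>
  /* Assumed styling: lay the dl out as a table, one row per wrapper div. */
  dl.tabular { display: table; }
  dl.tabular > div { display: table-row; }
  dl.tabular dt,
  dl.tabular dd { display: table-cell; padding: 0.25em 0.5em; }
</style>

<!-- The div wrappers are the part the specs discussed here do not allow. -->
<dl class="tabular">
  <div>
    <dt>[term]</dt>
    <dd>[definition]</dd>
  </div>
  <div>
    <dt>[another term]</dt>
    <dd>[first definition]</dd>
    <dd>[second definition]</dd>
  </div>
</dl>
```

Without the wrapper there is no element that contains a dt together with its dds, so there is nothing to style as a row.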
I agree with the HTML 5 gang when they refuse a new grouping di element (presumably "definition item"), saying "This is a styling problem and should be fixed in CSS. There’s no reason to add a grouping element to HTML, as the semantics are already unambiguous."
Yes, there is no reason for a new definition grouping element; we already have a generic grouping element called div. And, yes, it’s true that it’s a problem for CSS, but with all the other stuff on the CSS Working Group’s agenda, they’re unlikely to get round to it soon.
It must be a common problem (the HTML 5 crew cite it as a "frequently asked question") and it can be easily solved using the interoperable, backwardly-compatible method I outlined above.
It also raises a philosophical question: I can understand why there are restrictions on where some elements can go (for example, it would make no semantic sense to allow a list inside an image), but why restrict where an author can put an element that has absolutely no meaning ("The div element represents nothing at all")?
I see the argument against over-complicating a specification, but I think that if a new spec can’t accommodate real-world examples of content then the specification is not in danger of over-complication—rather, it’s currently over-simplistic. HTML 5 has been bravely making itself backwards-compatible and thereby becoming more complicated in some areas (such as the algorithm for working out the importance of headings in sections), so slight extra complication to help developers can also help its adoption.
Thank you for reading this far, dear reader. Am I talking nonsense? Have I missed something obvious? Do tell, before the HTML 5 gang turn their sarcasm or scorn on me.