Bruce Lawson’s personal site

Why don’t we add a <lovely> element to HTML?

(Russian translation: Почему мы не добавим в HTML элемент <чудесный>?)

Yesterday, there was an interesting conversation, started by Sara Soueidan:

Now, before people start to think “but colour isn’t content, it’s presentation!”, Sara was talking of pages showing colour swatches. In this case, the colours are the content. It seems like a good candidate for a semantic element, because it has meaning.

In my capacity as Ancient Old Fogey Of The Web, I sat and thought about this.

(Image: Rembrandt’s painting “Philosopher in Meditation”)

HTML: The mis-spent youth

The first iteration of HTML was a small set of tags noted down in an email from Sir Uncle Timbo in October 1991, and added to in November 1992. By the time HTML2 came around, some tags had changed names, and a few tags had been added that showed HTML’s primary use as a language for mathematics and computer geeks: <var>, <samp>, <code>, <pre>, <kbd> as well as the now-defunct <xmp>, <dir> and <listing>.

At this point, we only had three presentational elements: <tt>, <b> and <i> (and arguably, <i> isn’t presentational—the spec says “If a specific rendering is necessary — for example, when referring to a specific text attribute as in “The italic parts are mandatory” — a typographic element can be used to ensure that the intended typography is used where possible.”)

Further generations of HTML reflected the changing uses of the Web; it was no longer the read/write medium that Sir Uncle Timbo had envisaged, so we needed a way of sending information back to sites – thus, a whole set of form-related markup evolved, which subsequently has served eCommerce brilliantly. Tables were added to show data, extending the web’s original use-case of showing and sharing mathematical papers. This had the side-effect of allowing creative people to (mis)use tables to make great-looking sites, which meant ever more consumer-friendly sites (and a menagerie of presentational markup which is deprecated now we have CSS).

By the time HTML5 came around, we added a whole slew of elements to demarcate landmarks in common web page designs – <nav>, <header>, <article>, <main> and the like, which has improved the experience for assistive technology users.

By my count, we now have 124 HTML elements, many of which are unknown to many web authors, or regularly confused with each other—for example, the difference between <article> and <section>. This suggests to me that the cognitive load of learning all these different elements is getting too much.

HTML: comfortable middle-age

There’s loads of stuff we don’t have elements for in HTML. For ages I wanted a <location> element for geo information and a <person> element (<person givenname="Bruce" familyname="Lawson" nickname="Awesome" honorific="Mr."> etc.)

But here are some of the main reasons why we probably won’t get these (or Sara’s <color> element):

The 80/20 rule

The Web exists to share all possible human knowledge. Thus, the list of possible things that we could have a semantic for is infinite. We’re already getting overload on learning or remembering our current list of elements, their semantics and their attributes. So we (hopefully) have a set of elements that express the most commonly-used semantics (ignoring historical artefacts which browsers must continue to support because we can’t break the web).

Fourteen years ago (!) Matthew Thomas wrote

The more complex a markup language, the fewer people understand it, the less conformant the average article will be, so the less useful the Web’s semantics will be.

Testing

Browsers are sophisticated beasts. I’d wager that your browser is the most complex software running on your device right now. As someone who used to work for a browser vendor, I know there’s a lot of resistance to adding new elements to the language – each one adds even more testing to be done and boosts the chances of regressions. As Mat Marquis wrote in his recent history of Responsive Images,

Most important of all, though, it meant we didn’t have to recreate all of the features of img on a brand-new element: because picture didn’t render anything in and of itself

What’s the use-case?

The most important question: if there were a <person>, <location> or <color> element, what would the browser do with it?

Matthew Thomas suggested that new elements need to have some form of User Interface to make it easier for authors to choose the right one:

One way of improving this situation would be to reduce the number of new elements — forget about <article> and <footer>, for example.

Another way would be to recommend more distinct default presentation for each of the elements — for example, default <article> to having a drop cap, default <sidebar> to floating right, default <header>, <footer>, and <navigation> to having a slightly darker background than their parent element, and default <header>…<li> and <footer>…</li> to inline presentation. This would make authors more likely to choose the appropriate element.

As Robin Berjon wrote

Pretty much everyone in the Web community agrees that “semantics are yummy, and will get you cookies”, and that’s probably true. But once you start digging a little bit further, it becomes clear that very few people can actually articulate a reason why.

So before we all go another round on this, I have to ask: what’s it you wanna do with them darn semantics?

The general answer is “to repurpose content”. That’s fine on the surface, but you quickly reach a point where you have to ask “repurpose to what?”. For instance, if you want to render pages to a small screen (a form of repurposing) then <nav> or <footer> tell you that those bits aren’t content, and can be folded away; but if you’re looking into legal issues digging inside <footer> with some heuristics won’t help much.

I think HTML should add only elements that either expose functionality that would be pretty much meaningless otherwise (e.g. <canvas>) or that provide semantics that help repurpose *for Web browsing uses*.

So what can we do?

Luckily, HTML already has a little-known element you can use to wrap data to make it machine-readable: the <data> element. The HTML specification says:

The element can be used for several purposes.

When combined with microformats or microdata, the element serves to provide both a machine-readable value for the purposes of data processors, and a human-readable value for the purposes of rendering in a Web browser. In this case, the format to be used in the value attribute is determined by the microformats or microdata vocabulary in use.

Manuel Strehl mocked up a quick example of Sara’s colour swatch using the <data> element. You could add more semantics to this using microdata and the schema.org color property.
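To make the idea concrete, here is a minimal sketch of what a swatch list built on <data> might look like. This is not Manuel’s mock-up; the class name, colour names and hex values are all made up for illustration, and the script is just one assumed way a page could consume the values.

```html
<!-- Hypothetical swatch list: class name, colour names and values are illustrative only -->
<ul class="swatches">
  <li><data value="#663399">Rebecca purple</data></li>
  <li><data value="#ff6347">Tomato</data></li>
  <li><data value="#4682b4">Steel blue</data></li>
</ul>

<script>
  // Read the machine-readable value back from each <data> element and
  // paint it as a coloured bar beside the human-readable name.
  document.querySelectorAll('.swatches data').forEach(function (swatch) {
    swatch.style.borderLeft = '1em solid ' + swatch.value;
  });
</script>
```

The text content is what people read and the value attribute is what scripts and data processors read. If a swatch sat inside a schema.org item whose type defines a color property (Product is one such type), adding itemprop="color" to the <data> element should make the value attribute the property’s value.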

Some schema.org vocabularies do pass Robin’s and Matthew’s “browser UI test” (kinda-sorta). We know that Google’s Rich Snippets search results make use of some microdata, as does Apple’s WatchOS, which is why I use it to mark up publication dates on this blog:


<article itemscope itemtype="http://schema.org/BlogPosting">
  <header>
    <h2 itemprop="title">
      <a href="https://brucelawson.co.uk/2018/reading-list-201/">Reading List</a>
    </h2>
    <time itemprop="dateCreated pubdate datePublished"
          datetime="2018-06-29">Friday 29 June 2018</time>
  </header>
  <p>Some marvellous, trustworthy content</p>
  <p><strong>Update: <time itemprop="dateModified"
     datetime="2018-06-30">Saturday 30 June 2018</time></strong> Updated content</p>
</article>

Google says

You can add additional schema.org structured data elements to your pages to help Google understand the purpose and content of the page. Structured data can help Google properly classify your page in search results, and also make your page eligible for future search result features.

This is pretty vague (Google secret algorithms, etc) but I don’t believe it can hurt. What’s that you say? It adds dozens of extra bytes of markup to your page? Go and check your kilobytes of jQuery and React, and your hero images before you start to worry about the download overhead of nourishing semantics.

What about Custom Elements?

Custom elephants are Coming Soon™ in Edge, behind a flag in Gecko and already in Blink. These allow you to make your own new tags, which must contain a hyphen – e.g., <lovely-bruce>. However, they’re primarily a way of composing and sharing discrete lumps of functionality (“Components”) and don’t add any semantics.
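As a rough sketch of what that looks like in practice, the snippet below registers the <lovely-bruce> tag mentioned above. The class name, the italic styling and the choice of role="note" are assumptions made purely for illustration; the point is that any semantics have to be bolted on by the author (here via an ARIA role), because the browser gives a custom element none of its own.

```html
<lovely-bruce>Some lovely content</lovely-bruce>

<script>
  // Hypothetical component: the class name and behaviour are illustrative only.
  class LovelyBruce extends HTMLElement {
    connectedCallback() {
      // A custom element has no built-in semantics, so a role
      // must be added explicitly if assistive technology needs one.
      if (!this.hasAttribute('role')) {
        this.setAttribute('role', 'note');
      }
      this.style.fontStyle = 'italic';
    }
  }

  // The registered name must contain a hyphen.
  customElements.define('lovely-bruce', LovelyBruce);
</script>
```

In a browser without Custom Elements support, <lovely-bruce> is simply treated as an unknown element and its content still renders, just without the added role or styling.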

Conclusion

So that’s why we don’t add lots of new semantics to HTML (but feel free to propose some if there’s a real use case). However, you can do a lot using existing semantics, generic containers like <data> and extensibility hooks. Happy marking-up!

15 Responses to “Why don’t we add a <lovely> element to HTML?”

Comment by Stephen Band

What is wrong with using <input> for making colour swatches? I suppose, semantically, you don’t want an input? But they are a great way of storing copyable values, particularly with the readonly attribute…

Comment by Greg

Hi,
Great articles, thank you.
You have a typo here I think: “Custom elephants are Coming Soon™ in Edge”
Don’t think it was elephants😏

Comment by JP

An interesting read, thank you!

It’s calling out a specific detail of your article but I think a [person] tag would be extremely difficult to execute given how non-standardisable names are. Even the ONIX format (the XML standard for defining book metadata) shrinks the world of names down to a mere 12 fields and still doesn’t cover everything you might want. In the past I’ve opted for allowing people to define “preferred name” (what should I call you) and “legal name” (what should documents I produce call you), but I’m certain that wouldn’t cover every HTML use case!

Comment by Bruce

@JP, Oh I completely agree. Likewise geo co-ordinates; which co-ordinate system would you use (especially for elevation)? But I didn’t get into that because it detracts from the main point: what would browsers do with such data, anyway?

Comment by Charlie

Definitely a case for custom elements (which you can use without components, just by extending the DOM as we all did to make IE HTML5-savvy) because this is a case of domain-specific stuff that doesn’t fit the 80/20 rule. “Microdata” as in itemprop always made me ill because it was neither one thing nor the other, difficult to enforce and slow to parse. Then JSON came along.

While I don’t agree that all elements should have a UI representation, I shudder when I think of the missed opportunity that the date-tag provided. Everything needs timestamps, why can’t we have them in HTML with formatting controlled by CSS? %Y-%M-%D %h:%m:%s

Comment by Lewis Cowles

We absolutely need new tags, but also new paradigms in web. I’ve watched from the mid 90s as an enthusiastic amateur into the late 2010s where I now consider myself somewhat of a seasoned expert.

I’ve watched the web become less declarative, various shades of interactive, in some cases unusable garbage obsessed with bombarding the senses.

Worse still, it’s taken the OS market with it, with the people whose job it was to keep my PC running now deciding if my app launcher is aliased, if it will consult Google or Amazon or the file-system in some omni-crap see-everything, know-nothing search, which I’m perfectly capable of doing myself if I want it.

Colour palettes seem like the exact thing you could cook up in an afternoon to display reading from a file. Scss/sass & js already has your back with Naming, already has work on parsing so why bother with html?

You could pop them in existing markup, absorb or push accessibility by convention.

I’m greatly appreciative of people inputting to this, but where does it stop, and what value will be overlooked along the way?

Comment by Lazar Ljubenović

Love the thought. Brought me into existential crisis for a moment when I realized it takes me a while to come up with “an excuse” for semantics, since I’m always talking about how it’s important. But stating a use-case which is actually common indeed takes a while.

BTW, pretty sure that input[type=color][readonly] is what the original idea from the post title is after.

Comment by Neil Osman

Thanks Bruce!
1. custom components can carry semantics through the use of ARIA. So should we enrich the ARIA’s ontology? Well, currently, and this applies to HTML as well, mappings to OS sucks, actually all “OS semantics” is poor.
2. I wonder if we should consider the idea of merging elements, something like a link-button – instead of anchors and buttons – in which author properties should determine the computed semantics, style, and behavior.

Comment by Peter Rushforth

Hi Bruce, Regarding <location>, I think it’s great, but it’s not sufficient, and, what’s the payoff for the user? What I mean by that is that the semantic of what could go in location is not particularly visual, and the Web is a nice GUI for your API really so, why would we use <location> as a textual element when what we want to do is have a visual impact at the same time. That’s where <map> and the proposed <layer> elements come in. By extending image maps to be responsive *and* geo-referenced, we give the user the visual and we get location semantics in the bargain. Plus we get the standardized beauty of HTML. win-win-win I call that. Cheers!
