The medium is the message.
If you’re here, you can probably write some HTML. The web is the #1 hypermedia system after all, the one this book will spend the most time with, and HTML is its format.
As with every aspect of the web, it has been exapted and reinterpreted by web developers in myriad ways. Is it a document format? Is it for applications? Is it a rendering system? Is it (gasp!) a programming language? These are the contenders in the Eternal Debate of the web development world, and none will ever win because none of them are right.
When I say hypertext, I mean the simultaneous presentation of information and controls such that the information becomes the affordance through which the user (or automaton) obtains choices and selects actions.
HTML, like all hypermedia, blurs the distinction between the information being accessed and the application used to access it. HTML is for documents, insofar as you’re willing to adopt a broad definition of “document” — and it is for applications, ones that are interwoven with the data they process.
HTML is a hypermedium.
An HTML file is not a program that produces a human-readable document. It is the document.
Unfortunately, HTML’s development in terms of hypermedia controls has stagnated, and what little there is is often not put to full use.
This chapter looks at HTML as something worth studying in its own right, even in this day and age. It covers our best practices for writing/generating HTML, and why HTML is something far cooler than a programming language. It won’t be an HTML tutorial, as that would take a whole other book, but it can accompany you in your HTML re-learning journey.
Why relearn HTML?
Have you noticed that a lot of websites are bad?
- Pages are bloated with <div> soup, and stylesheets are big as a result of trying to select elements in that mess. The result is slow loading times. Other than <div> being the most common element, the HTTP Archive Web Almanac found that 0.06% of pages surveyed in 2020 contained the nonexistent <h7> element. 0.0015% used
- So-called MVPs (minimum viable products) are released in open beta while being completely unusable by vast swathes of people: the UX is not just buggy, but nonexistent. Is an inaccessible product “viable”?
- Websites, including websites containing public data or results of publicly-funded research, are impossible to scrape programmatically.
- Search engines have a hard time extracting useful information from a page, and rank that page lower as a result.
In the rest of the chapter, we’ll look at these issues in more detail and see how effective HTML can help us develop better websites. However, we should first note that HTML is not a panacea. If you care about machine readability, or human readability, or page weight, the most important thing to do is testing. Test manually. Test automatically. Test with screenreaders, test with a keyboard, test on different browsers and hardware, run linters (while coding and/or in CI).
So where do HTML and the s-word come in?
Knowing HTML well might not absolve you from doing your job, but it makes it a lot easier.
“But I already know HTML well.” Maybe you do. But many people underestimate how sophisticated HTML is. Indeed, it’s very easy (and sometimes acceptable) to produce mediocre HTML that seems to work, and many websites settle for seeming to work. But better websites are possible, and anyone can learn HTML to the level of making websites that actually work.
While programming code is described as spaghetti when it’s not well organized, the food of choice for messy markup is soup.
HTML can turn into soup in a variety of ways, usually due to a disregard for or misunderstanding of best practice or due to an excess of layers between the developer and the HTML.
The best-known kind of messy HTML is <div> soup.
When developers fall back on the generic <div> and <span> elements instead of more meaningful tags,
we either degrade the quality of our websites or create more work for ourselves, and probably both.
For example, instead of adding a button using the dedicated <button> element, a <div> element might have a click event listener added to it.
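A minimal sketch of the kind of markup being described. The class name and the doStuff() function are hypothetical placeholders, not taken from any real codebase.

```html
<!-- A generic element pressed into service as a button. -->
<!-- "fancy-button" and doStuff() are made-up names for illustration. -->
<div class="fancy-button" onclick="doStuff()">Do stuff</div>

<script>
  // Hypothetical action; stands in for whatever the button should do.
  function doStuff() {
    console.log("Doing stuff");
  }
</script>
```

Clicking the <div> with a mouse works, which is why this pattern tends to survive a quick check in the browser.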
Why might a developer do this, when the <button> element is right there?
There could be a few reasons:
- The <button> element might have been harder to style.
- Confusion. The button might have some other interactive features, which could lead to the developer not realizing it was a button.
- Apathy. The developer doesn’t care, and uses <div> except when they have to.
It’s absolutely possible to implement this button, and indeed most kinds of UI, using nothing but <div> elements.
However, it makes the job harder.
There are two main issues with this button:
- It’s not focusable — the Tab key won’t get you to it.
- There’s no way for assistive tools to tell that it’s a button.
Let’s fix that:
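One hedged way to patch the <div> is sketched below. The class name and doStuff() are hypothetical. The role and tabindex attributes address the two issues listed above; the keydown handler is an extra assumption, added because a real <button> also responds to Enter and Space.

```html
<!-- Patched <div>: focusable, announced as a button, and usable from the keyboard. -->
<div class="fancy-button"
     role="button"
     tabindex="0"
     onclick="doStuff()"
     onkeydown="if (event.key === 'Enter' || event.key === ' ') { event.preventDefault(); doStuff(); }">
  Do stuff
</div>

<!-- The dedicated element provides focus, role, and keyboard activation on its own. -->
<button type="button" class="fancy-button" onclick="doStuff()">Do stuff</button>

<script>
  // Hypothetical action; stands in for whatever the button should do.
  function doStuff() {
    console.log("Doing stuff");
  }
</script>
```

The second form needs less code and leaves less room for mistakes, which is the point of reaching for the dedicated element first.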
We often don’t remember to look out for these types of UX and accessibility bugs. F5-Driven Development is the way most of us write HTML: write something, Alt-Tab to the browser to see if it works, and go back to edit. It’s a fast and enjoyable way to build things, but it means that during most of development, developers are biased towards their own UI needs, and users (who might use websites differently) become an afterthought. However, if we use HTML effectively, we can catch many of these issues before they ever occur, even before testing.
Given all this, why are so many developers writing div soup? There is a tendency to understate the sophistication of HTML.
Instead, learn the meaning of every tag and consider each another tool in your tool chest. (With the 113 elements currently defined in the spec, it’s more of a tool shed.)
Markdown soup is the lesser-known sibling of <div> soup.
This is the result of web developers limiting themselves to the set of elements that the Markdown language provides shorthand for,
even when these elements are incorrect.
Consider the following example of an IEEE-style citation:
Here, <em> is used because it’s the only Markdown element that is presented in italics by default.
This indicates that the book title is being stressed, but the purpose is to mark it as the title of a work.
HTML has the <cite> element that’s intended for this exact purpose.
Furthermore, even though this is a numbered list perfect for the <ol> element, which Markdown supports, plain text is used for the reference numbers instead.
Why could this be?
The IEEE citation style requires that these numbers are presented in square brackets.
This could be achieved on an <ol> with CSS,
but Markdown doesn’t have a way to add a class to elements, meaning the square brackets would apply to all ordered lists.
Don’t shy away from using embedded HTML in Markdown. For larger sites, also consider Markdown extensions.
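As a sketch of what embedded HTML could buy us here, compare the two versions below. The citation itself, the class name "references", and the counter name are hypothetical placeholders. The first snippet is roughly what a Markdown-only citation renders to; the second is hand-written HTML that can sit inside a Markdown file.

```html
<!-- Roughly what the Markdown-only version produces: stress emphasis and a hand-typed number. -->
<p>[1] A. Author and B. Writer, <em>An Example Book Title</em>. Example City: Example Press.</p>

<!-- These rules belong in the page's stylesheet; they are shown inline only to keep the sketch in one place. -->
<style>
  /* Square-bracket numbering, scoped by class so other ordered lists keep their default markers. */
  ol.references { list-style: none; padding-left: 0; counter-reset: reference; }
  ol.references > li { counter-increment: reference; }
  ol.references > li::before { content: "[" counter(reference) "] "; }
</style>

<!-- Hand-written HTML: a real list, and <cite> for the title of the work. -->
<ol class="references">
  <li>A. Author and B. Writer, <cite>An Example Book Title</cite>. Example City: Example Press.</li>
</ol>
```

Browsers italicize <cite> by default, so the visual result is similar while the markup says what is actually meant.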
You can also use custom processors to produce extra-detailed HTML instead of writing it by hand:
Remedy: Stay close to the output
In order to avoid <div> soup (or Markdown soup, or similar), you need to constantly be aware of what kind of markup you’re producing and be able to change it.
For example, a popular concept found in many frameworks is components. Components encapsulate a section of a page along with its dynamic behavior. While encapsulating behavior is a good way to organize code, components also separate elements from their surrounding context, which can lead to wrong or inadequate relationships between elements. The result is what one might call component soup, where information is hidden in component state, rather than being present in the HTML, which is now incomprehensible due to missing context. In our Client Side Scripting chapter, we’ll look at alternatives to component-based frameworks that can be used to avoid these shortcomings.
To be abundantly clear, components aren’t the cause of all div soup. Not even most of it. The root cause is the fact that HTML is falsely believed to be very simple, and as a result, developers and organizations don’t invest in learning and applying HTML skills. However, don’t reach for components for reuse without considering other options. Lower-level mechanisms usually (allow you to) produce better HTML.
Components, when used well, can actually improve the clarity of your HTML.
To decide if a component is appropriate for your use case, a good rule of thumb is to ask:
“Could this reasonably be a built-in HTML element?”
For example, a code editor is a good candidate,
since HTML already has
In addition, a fully-featured code editor will have many child elements that won’t provide much information anyway.
We can use features like
to encapsulate these elements.
We can create a custom element, <code-area>, that we can drop into our page whenever we want.
See how we’re extending HTML, rather than abstracting it away.
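A minimal sketch of what such an element might look like, using the standard Custom Elements and Shadow DOM APIs. Shadow DOM and the inner <textarea> are assumptions made for this illustration, and the implementation is deliberately bare: a real editor would add highlighting, line numbers, and so on.

```html
<!-- Usage: the page's markup contains one meaningful element. -->
<code-area></code-area>

<script>
  // Hypothetical minimal implementation of <code-area>.
  class CodeArea extends HTMLElement {
    constructor() {
      super();
      // The shadow root keeps the editor's internal elements out of the page's own markup.
      const shadow = this.attachShadow({ mode: "open" });
      const textarea = document.createElement("textarea");
      textarea.setAttribute("aria-label", "Code");
      textarea.spellcheck = false;
      shadow.append(textarea);
    }
  }
  customElements.define("code-area", CodeArea);
</script>
```

From the outside, the document reads as if <code-area> were just another HTML element, while its internals stay encapsulated.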
“Yeah! Down with <div>! It’s time to use the full power of HTML5!”
Elements like <article> and <figure> have become a sort of shorthand for HTML.
Developers may sprinkle them generously and haphazardly over
This is not an improvement, and can in fact make a website worse.
By using these elements, a page makes false promises, like <article> elements being self-contained, reusable entities, to clients like browsers, search engines and scrapers that can’t know better.
Most HTML isn’t this much of a mess,
but it’s far too common for <article> to be used as a drop-in replacement for <div> instead of adding useful information.
To avoid this:
- Check the HTML spec. Make sure that the element you’re using fits your use case.
- Don’t try to be specific when you can’t or don’t need to.
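To illustrate the difference, here is a hedged sketch with hypothetical class names and content. The first version uses <article> purely as a layout wrapper; the second keeps generic wrappers generic and reserves <article> for content that could stand on its own.

```html
<!-- Misleading: these wrappers exist only for layout, yet claim to be self-contained compositions. -->
<article class="sidebar">
  <article class="ad-slot">...</article>
</article>

<!-- More honest: <div> for generic wrappers, <article> for content that stands alone. -->
<div class="sidebar">
  <div class="ad-slot">...</div>
</div>
<article>
  <h2>Example post title</h2>
  <p>Example post body.</p>
</article>
```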
Keep the spec on hand
The beginning of wisdom is to call things by their right names.
The most authoritative (though not necessarily best) resource for learning about HTML is the HTML specification. The current specification lives on https://html.spec.whatwg.org/multipage. There’s no need to rely on hearsay to keep up with developments in HTML.
Section 4 features a list of all available elements, including what they represent, where they can occur, and what they are allowed to contain. It even tells you when you’re allowed to leave out closing tags!
Section 4 in particular is a great piece of reference material and a useful read in general. Reading it through (skipping over the implementation details, like the several pages of algorithms) will give you a sense of how HTML is intended to be written.
Remedy: Know your budget
The close relationship between the content and the markup means that good HTML is actually quite labor-intensive, often across a whole organization. Most sites have a separation between the authors, who are rarely familiar with HTML and very rarely want to think about it, and the developers, who need to develop a generic system able to handle any content that’s thrown at it — this separation usually taking the form of a CMS. As a result, having markup tailored to content, which is often necessary for advanced HTML, is rarely feasible. Furthermore, for internationalized sites, content in different languages being injected into the same elements can degrade markup quality as stylistic conventions differ between languages. Dishearteningly, but understandably, it’s an expense few organizations can spare.
Thus, we don’t demand that every site contains the most conformant HTML it can. What’s most important is to avoid wrong HTML — it can be better to fall back on a more generic element than to be precisely incorrect. The kinds of defects caused by inadequate HTML can usually be caught through testing.
If you have the resources, however, putting more care into your HTML will produce a more polished site. Much like style guides, well-written HTML gives an air of quality and prestige to a document, even if few notice it. When it comes to HTML, you get what you pay for.
The S word
Gretchen, stop trying to make fetch happen! It’s not going to happen!
You might have noticed how we’ve avoided the use of the word “semantic” so far, partly because many people associate it with annoying pedantic colleagues (couldn’t be us!), and partly because it has multiple meanings, only one of which we care about.
We’re not really about the “Semantic Web”.
The "Semantic Web" was a vision of a system that could both express any kind of human knowledge, and be useful for computing. It planned to achieve this using ontologies, repositories of schemas like "person", "movie" and "species" and relations like "named", "part of" and "created by".
The problem with this vision is that information on the Web rarely fits into neat categories. Because no single ontology can be defined that encapsulates all kinds of information one might wish to publish on the Web, Semantic Web systems need to be pluggable with different schemas. In turn, a Semantic Web client, in order to do something useful with an arbitrary piece of HTML, needs to be able to parse these schemas, which means we need to define a standard machine-readable format for ontologies. But a single format couldn’t express every kind of object and relation… It’s turtles all the way down.
In practice, most implementations stop at the topmost turtle. Ontologies are defined in natural language, and clients are hardcoded to support a fixed set of schemas. The requirement for prior agreement between server and client means this technology does not have the generality of the Web, and for most use cases, you might as well define a JSON API.
Instead of extensibility through custom namespaces,
HTML is extensible through its flexibility — both its tolerance for errors and its well-defined extension points like classes and
These affordances let us embed metadata in it without native support.
They all have the possibility of name collisions,
but fragility and messiness are ultimately unavoidable for a generalized human information exchange language.
Tag and attribute names in such a language are not identifiers for behavior (like function names in a programming language) but words with well-understood meanings.
No amount of namespacing can make fetch happen,
and developers should be able to deal with that.
Embrace the mess and let go of your schemas.
A flexible format — not an infinity of namespaces with URLs pointing to nothing — is “software design on the scale of decades”.
This is a necessarily reductive explanation of the Semantic Web, a field that we’ve described in past tense even though it continues to have some practical use. The reason it doesn’t matter to us is that the Semantic Web has nothing to do with semantic HTML.
Semantic HTML has no ambitions of robotic agents navigating information and helping us make connections and discoveries. It’s actually quite mundane: don’t break the web.
I think being asked to write meaningful HTML better lights the path to realizing that it isn’t about what the text means to humans—it’s about using tags for the purpose outlined in the specs to meet the needs of software like browsers, assistive technologies, and search engines.
Telling people to “use semantic HTML” instead of “read the spec” has led to a lot of people guessing at the meaning of tags (“looks pretty semantic to me!”) instead of engaging with the spec.
I think even “meaningful” is too lofty. Instead, I recommend talking about, and writing, conformant HTML. Use the elements to the full extent provided by the HTML specification, and let the software take from it whatever meaning it can.
Speaking of assistive technologies, by the way…
The A word
Throughout this chapter, we’ve gestured at potential accessibility benefits to be had from effective HTML.
(Re)learning HTML and using it consciously prevents and fixes many accessibility issues.
It’s true that all else being equal, an app that makes full use of HTML will be more accessible than one that is made of soup. However, HTML is not a panacea. Even the adage that HTML is “accessible by default” is a bit misleading.
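The critique that follows examines a tab set built from radio buttons and labels, with no JavaScript. As a reconstruction of that kind of markup (an assumption for illustration, not necessarily the exact code under discussion), it might look like this; the ids, class names, and dummy text are hypothetical.

```html
<style>
  .tabs { display: flex; flex-wrap: wrap; }
  /* The radio buttons hold the state but are hidden from view. */
  .tabs > input[type="radio"] { display: none; }
  .tabs > label { padding: 0.5em 1em; cursor: pointer; }
  .tabs > input:checked + label { font-weight: bold; }
  /* Panels are pushed below the labels and shown only for the checked radio button. */
  .tabs > .panel { display: none; order: 1; width: 100%; }
  .tabs > input:checked + label + .panel { display: block; }
</style>

<div class="tabs">
  <input type="radio" name="tabs" id="tab-1" checked>
  <label for="tab-1">Tab 1</label>
  <div class="panel">Lorem ipsum</div>

  <input type="radio" name="tabs" id="tab-2">
  <label for="tab-2">Tab 2</label>
  <div class="panel">Dolor sit amet</div>
</div>
```

The sibling selectors require every radio button, label, and panel to share one parent, which constrains how the markup can be structured.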
- The tabs can’t be focused with the Tab key. Because the radio buttons are hidden with display: none, they are removed from the focus order, and label elements are not focusable.
- “[…] does not listen for Down Arrow or Up Arrow so those keys can provide their normal browser scrolling functions […]” Radio buttons listen to these events (since they’re usually presented vertically). Thankfully, right and left arrow keys also work.
ARIA roles, states, and properties
- “[The element that contains the tabs] has role tablist.” There is no such element in this implementation, as that would break the CSS.
- “Each [tab] has role tab […]” The tab elements have role label. Furthermore, the elements they are labeling are hidden.
- “Each element that contains the content panel for a tab has role tabpanel.” No, though that could be added.
- “Each [tab] has the property aria-controls referring to its associated tabpanel element.” Nope.
- “The active tab element has the state aria-selected set to true and all other tab elements have it set to false.”
- “Each element with role tabpanel has the property aria-labelledby referring to its associated tab element.” No. The element that is labelled by the tab element is a hidden radio button.
It turns out that fulfilling all of these requirements takes a lot of code.
Some of the ARIA attributes can be added directly in HTML,
but they are repetitive
and others (like
The keyboard interactions can be error-prone too.
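To give a sense of the repetition involved, here is a sketch of only the static markup for a two-tab set carrying the roles and properties quoted above. The ids and dummy text are hypothetical. This is not a working tab set on its own: a script would still have to update aria-selected, tabindex, and hidden when the selection changes, and handle the arrow keys.

```html
<div role="tablist" aria-label="Example tabs">
  <!-- Only the active tab is in the Tab order; the others are reached with arrow keys via script. -->
  <button type="button" role="tab" id="tab-1"
          aria-controls="panel-1" aria-selected="true">Tab 1</button>
  <button type="button" role="tab" id="tab-2"
          aria-controls="panel-2" aria-selected="false" tabindex="-1">Tab 2</button>
</div>

<div role="tabpanel" id="panel-1" aria-labelledby="tab-1">Lorem ipsum</div>
<div role="tabpanel" id="panel-2" aria-labelledby="tab-2" hidden>Dolor sit amet</div>
```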
It’s not impossible to make a good tab set implementation.
However, it’s difficult to trust that a new implementation will work in all environments, since most of us have limited access to testing devices.
This is why it’s often recommended to use established libraries for UI interactions instead of rolling your own.
Before adding a dependency, however, let’s reconsider our design. Does the information really need to be presented as tabs? Sometimes the answer is yes — we used dummy text in our code example, so we can’t tell — but if not, a sequence of details disclosures fulfills a very similar purpose.
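A sketch of that alternative, with hypothetical headings and dummy text. Each disclosure is focusable, keyboard-operable, and exposed to assistive tools by the browser, with no script or ARIA attributes.

```html
<details open>
  <summary>First section</summary>
  <p>Lorem ipsum</p>
</details>

<details>
  <summary>Second section</summary>
  <p>Dolor sit amet</p>
</details>
```

Unlike tabs, several disclosures can be open at once, so whether this fits depends on the content.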
Screen reader rage
The purpose of writing good HTML is not to please the specification deities. It’s to make good websites. The spec is a good starting point when deciding how to mark something up, but when browser implementations don’t conform, we shouldn’t throw up our hands because we did what was specified.
It is of course frustrating when browsers and other tools misbehave. Accessibility itself feels inaccessible sometimes. What helps with the frustration is to recognize that hypermedia exchanges are not machine-to-machine communication. An HTML file is not a program that produces a human-readable document. It is the document. So, instead of banging your head against a wall, focus on people, not the tools they use.
Don’t write HTML for browsers, or assistive tools, or validators. HTML is not for them. HTML is for humans.
The Scrapeable Web
Hypermedia systems perform best with human-operated clients. However, machine-readable information can be embedded into HTML pages through a variety of extension mechanisms:
Link relations (
These mechanisms are fairly unstructured (as per earlier discussion on Semantic Web schematamania), but structure can be imposed upon them if needed. One standard for including structured data in HTML is microformats. Microformats use classes to mark certain elements as containing information to be extracted. The microformats2 standard uses five kinds of classes:
h-classes denote that an element represents a machine-readable entity, e.g.,
The other prefixes denote that an element represents properties of an enclosing entity:
p-classes are plain text properties, from an element’s inner text or
u-classes are URL properties, from an element’s
dt-classes are date/time properties, from
e-classes are embedded markup properties, from an element’s inner HTML, e.g.,
There are also conventions for extracting common properties like name, URL and photo without needing classes for each property.
By adding these classes into the HTML representation of an object, we allow the properties of the object to be recovered from the HTML. For example, this simple HTML:
can be parsed into this JSON-like structure:
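Both halves of that example can be sketched as follows. The person, organization, and URL are hypothetical placeholders, and the parsed form is shown in a comment as an approximation of what a microformats2 parser returns for this item; a full result also wraps items in a top-level object and the key order may differ.

```html
<div class="h-card">
  <a class="p-name u-url" href="https://example.com/jane">Jane Doe</a>,
  <span class="p-org">Example Org</span>
</div>

<!--
Approximate parsed item:

{
  "type": ["h-card"],
  "properties": {
    "name": ["Jane Doe"],
    "url": ["https://example.com/jane"],
    "org": ["Example Org"]
  }
}
-->
```

Note that every property value is a list, since any property may appear more than once in the markup.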
We can see microformats in action by returning to the reference list from the Markdown soup example. Using a variety of properties and nested objects, we can mark up every bit of information about the work being cited in a machine-readable way:
This can be parsed into a JSON-like structure, as follows:
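As a hedged sketch of such markup, the following uses h-cite, a draft microformats2 vocabulary, with a nested h-card for the author. The citation details, URL, and the "references" class are hypothetical placeholders, and the choice of properties is an assumption rather than a complete treatment of the citation.

```html
<ol class="references">
  <li class="h-cite">
    <!-- A nested entity: the author is an h-card inside the h-cite. -->
    <span class="p-author h-card">A. Author</span>,
    <cite class="p-name">An Example Book Title</cite>,
    <time class="dt-published" datetime="2020">2020</time>.
    Available:
    <a class="u-url" href="https://example.com/book">https://example.com/book</a>
  </li>
</ol>
```

The classes sit on the same elements a human reader sees, so the machine-readable data cannot drift away from the visible citation.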
In this example, microformats and the extensibility of HTML proved quite useful. However, embedding data in HTML is hardly appropriate for every use case. Your human-facing and machine-facing interfaces may end up being limited by each other. It’s often the best option to define a JSON data API separate from your HTML, which will be discussed later in this book.
Where to next
Unfortunately, a full HTML tutorial is way out of scope for one chapter of this book. Here are some resources you can check out if you’d like to invest in your HTML knowledge:
- HTML specification: https://html.spec.whatwg.org/multipage
- https://htmhell.dev: Along with sinister abuses of HTML, this website shares development tips that will help you keep up-to-date with best practice.
- Manuel Matuzović, Lost in Translation, https://www.youtube.com/watch?v=Wno1IhEBTxc.
- Manuel Matuzović, Why I’m not the biggest fan of Single Page Applications, https://www.matuzo.at/blog/2023/single-page-applications-criticism/
- semantic: the 8 letter s-word, https://t-ravis.com/post/doc/semantic_the_8_letter_s-word/