Hypermedia is a universal technology today, almost as common as electricity.
Billions of people use hypermedia-based systems every day, mainly by interacting with the Hypertext Markup Language (HTML) being exchanged via the Hypertext Transfer Protocol (HTTP) by using a web browser connected to the World Wide Web.
People use these systems to get their news, check in on friends, buy things online, play games, send emails and so forth: the variety and sheer number of online services being delivered by hypermedia is truly astonishing.
And yet, despite this ubiquity, the topic of hypermedia itself is a strangely under-explored concept today, left mainly to specialists. Yes, you can find a lot of tutorials on how to author HTML, create links and forms, etc. But it is rare to see a discussion of HTML as a hypermedia and, more broadly, on how an entire hypermedia system fits together.
This is in contrast with the early web development era when concepts like Representational State Transfer (REST) and Hypermedia As The Engine of Application State (HATEOAS) were discussed frequently, refined and debated among web developers.
HTML happens to be there, in the browser, and so we have to use it.
This is a shame and we hope to convince you that hypermedia is not simply a piece of legacy technology that we have to accept and deal with. Instead, we aim to show you that hypermedia is a tremendously innovative, simple and flexible way to build robust applications: Hypermedia-Driven Applications.
We hope that by the end of this book you will feel, as we do, that the hypermedia approach deserves a seat at the table when you, a web developer, are considering the architecture of your next application. Creating a Hypermedia-Driven Application on top of a hypermedia system like the web is a viable and, indeed, often excellent choice for modern web applications.
(And, as the section on Hyperview will show, not just web applications.)
What Is Hypermedia?
Hypertexts: new forms of writing, appearing on computer screens, that will branch or perform at the reader’s command. A hypertext is a non-sequential piece of writing; only the computer display makes it practical.
Let us begin at the beginning: what is hypermedia?
Hypermedia is a media, for example a text, that includes non-linear branching from one location in the media to another, via, for example, hyperlinks embedded in the media. The prefix “hyper-” derives from the Greek prefix “ὑπερ-” which means “beyond” or “over”, indicating that hypermedia goes beyond normal, passively consumed media like magazines and newspapers.
Hyperlinks are a canonical example of what is called a hypermedia control:
- Hypermedia Control
A hypermedia control is an element in a hypermedia that describes (or controls) some sort of interaction, often with a remote server, by encoding information about that interaction directly and completely within itself.
Hypermedia controls are what differentiate hypermedia from other sorts of media.
You may be more familiar with the term hypertext, from whose Wikipedia page the above quote is taken. Hypertext is a sub-category of hypermedia and much of this book is going to discuss how to build modern applications using hypertexts such as HTML, the Hypertext Markup Language, or HXML, a hypertext used by the Hyperview mobile hypermedia system.
Hypertexts like HTML function alongside other technologies crucial for making an entire hypermedia system work: network protocols like HTTP, other media types such as images and videos, hypermedia servers (i.e., servers providing hypermedia APIs), sophisticated hypermedia clients (e.g., web browsers), and so on.
Because of this, we prefer the broader term hypermedia systems when describing the underlying architecture of applications built using hypertext, to emphasize the system architecture over the particular hypermedia being used.
It is the entire hypermedia system architecture that is underappreciated and ignored by many modern web developers.
A Brief History of Hypermedia
Where did the idea of hypermedia come from?
While there were many precursors to the modern idea of hypertext and the more general hypermedia, many people point to the 1945 article As We May Think written by Vannevar Bush in The Atlantic as a starting point for looking at what has become modern hypermedia.
In this article Bush described a device called a Memex, which, using a complex mechanical system of reels and microfilm, along with an encoding system, would allow users to jump between related frames of content. The Memex was never actually implemented, but it was an inspiration for later work on the idea of hypermedia.
The terms “hypertext” and “hypermedia” were coined in 1963 by Ted Nelson, who would go on to work on the Hypertext Editing System at Brown University and who later created the File Retrieval and Editing System (FRESS), a shockingly advanced hypermedia system for its time. (This was perhaps the first digital system to have a notion of “undo”.)
While Nelson was working on his ideas, Douglas Engelbart was busy at work at the Stanford Research Institute, explicitly attempting to make Vannevar Bush’s Memex a reality. In 1968, Englebart gave “The Mother of All Demos” in San Francisco, California.
Englebart demonstrated an unbelievable amount of technology:
- Remote, collaborative text editing with his peers in Menlo Park.
- Video and audio chat.
- An integrated windowing system, with window resizing, etc.
- A recognizable hypertext, whereby clicking on underlined text navigated to new content.
Despite receiving a standing ovation from a shocked audience after his talk, it was decades before the technologies Englebart demonstrated became mainstream.
In 1990, Tim Berners-Lee, working at CERN, published the first website. He had been working on the idea of hypertext for a decade and had finally, out of desperation at the fact it was so hard for researchers to share their research, found the right moment and institutional support to create the World Wide Web:
Creating the web was really an act of desperation, because the situation without it was very difficult when I was working at CERN later. Most of the technology involved in the web, like the hypertext, like the Internet, multifont text objects, had all been designed already. I just had to put them together. It was a step of generalising, going to a higher level of abstraction, thinking about all the documentation systems out there as being possibly part of a larger imaginary documentation system.
By 1994 his creation was taking off so quickly that Berners-Lee founded the W3C, a working group of companies and researchers tasked with improving the web. All standards created by the W3C were royalty-free and could be adopted and implemented by anyone, cementing the open, collaborative nature of the web.
In 2000, Roy Fielding, then at U.C. Irvine, published a seminal PhD dissertation on the web: “Architectural Styles and the Design of Network-based Software Architectures.” Fielding had been working on the open source Apache HTTP Server and his thesis was a description of what he felt was a new and distinct networking architecture that had emerged in the early web. Fielding had worked on the initial HTTP specifications and, in the paper, defined the web’s hypermedia network model using the term REpresentational State Transfer (REST).
Fielding’s work became a major touchstone for early web developers, giving them a language to discuss the new technical medium they were building applications in.
We will discuss Fielding’s key ideas in depth in Chapter 2, and try to correct the record with respect to REST, HATEOAS and hypermedia.
The World’s Most Successful Hypertext: HTML
In the beginning was the hyperlink, and the hyperlink was with the web, and the hyperlink was the web. And it was good.
The system that Berners-Lee, Fielding and many others had created revolved around a hypermedia: HTML. HTML started as a read-only hypermedia, used to publish (at first) academic documents. These documents were linked together via anchor tags which created hyperlinks between them, allowing users to quickly navigate between documents.
When HTML, 2.0 was released, it introduced the notion of the
form tag, joining the anchor tag (i.e., hyperlink) as a
second hypermedia control. The introduction of the form tag made building applications on the web viable by providing
a mechanism for updating resources, rather than just reading them.
It was at this point that the web transitioned from an interesting document-oriented system to a compelling application architecture.
Today HTML is the most widely used hypermedia in existence and this book naturally assumes that the reader has a reasonable familiarity with it. You do not need to be an HTML (or CSS) expert to understand the code in this book, but the better you understand the core tags and concepts of HTML, the more you will get out of it.
The Essence of HTML as a Hypermedia
Let us consider these two defining hypermedia elements (that is the two defining hypermedia controls) of HTML, the anchor tag and the form tag, in a bit of detail.
Anchor tags are so familiar as to be boring but, as the original hypermedia control, it is worth reviewing the mechanics of hyperlinks to get our minds in the right place for developing a deeper understanding of hypermedia.
Consider a simple anchor tag, embedded within a larger HTML document:
An anchor tag consists of the tag itself,
<a></a>, as well as the attributes and content within the tag. Of particular
interest is the
href attribute, which specifies a hypertext reference to another document or document fragment. It
is this attribute that makes the anchor tag a hypermedia control.
In a typical web browser, this anchor tag would be interpreted to mean:
- Show the text “Hypermedia Systems” in a manner indicating that it is clickable.
When the user clicks on that text, issue an HTTP
GETrequest to the URL
- Take the HTML content in the body of the HTTP response to this request and replace the entire screen in the browser as a new document, updating the navigation bar to this new URL.
Anchors provide the main mechanism we use to navigate around the web today, by selecting links to navigate from document to document, or from resource to resource.
Here is what a user interaction with an anchor tag/hyperlink looks like in visual form:
When the link is clicked the browser (or, as we sometimes refer to it, the hypermedia client) initiates an HTTP
GET request to the URL encoded in the link’s
Note that the HTTP request includes additional data (i.e., metadata) on what, exactly, the browser wants from the server, in the form of headers. We will discuss these headers, and HTTP in more depth in Chapter 2.
The hypermedia server then responds to this request with a hypermedia response — the HTML — for the new page. This may seem like a small and obvious point, but it is an absolutely crucial aspect of a truly RESTful hypermedia system: the client and server must communicate via hypermedia!
Anchor tags provide navigation between documents or resources, but don’t allow you to update those resources. That functionality falls to the form tag.
Here is a simple example of a form in HTML:
Like an anchor tag, a form tag consists of the tag itself,
<form></form>, combined with the attributes and
content within the tag. Note that the form tag does not have an
href attribute, but rather has an
that specifies where to issue an HTTP request.
Furthermore, it also has a
method attribute, which specifies exactly which HTTP “method” to use. In this example
the form is asking the browser to issue a
In contrast with anchor tags, the content and tags within a form can have an effect on the hypermedia interaction
that the form makes with a server. The values of
input tags and other tags such as
select tags will be included
with the HTTP request when the form is submitted, as URL parameters in the case of a
GET and as part of the request
body in the case of a
POST. This allows a form to include an arbitrary amount of information
collected from a user in a request, unlike the anchor tag.
In a typical browser this form tag and its contents would be interpreted by the browser roughly as follows:
- Show a text input and a “Sign Up” button to the user.
When the user submits the form by clicking the “Sign Up” button or by hitting the enter key while the input element is
focused, issue an HTTP
POSTrequest to the path
/signupon the “current” server.
- Take the HTML content in the body of the HTTP response body and replace the entire screen in the browser as a new document, updating the navigation bar to this new URL.
This mechanism allows the user to issue requests to update the state of resources on the server. Note that despite this new type of request the communication between client and server is still done entirely with hypermedia.
It is the form tag that makes Hypermedia-Driven Applications possible.
If you are an experienced web developer you probably recognize that we are omitting a few details and complications here. For example, the response to a form submission often redirects the client to a different URL.
This is true, and we will get down into the muck with forms in more detail in later chapters but, for now, this simple example suffices to demonstrate the core mechanism for updating system state purely within hypermedia.
Here is a diagram of the interaction:
Web 1.0 applications
As someone interested in web development, the above diagrams and discussion are probably very familiar to you. You may even find this content boring. But take a step back and consider the fact that these two hypermedia controls, anchors and forms, are the only native ways for a user to interact with a server in plain HTML.
Only two tags!
And yet, armed with only these two tags, the early web was able to grow exponentially and offer a staggeringly large amount of online, dynamic functionality to billions of people.
These two tags give a tremendous amount of expressive power to HTML.
So What Isn’t Hypermedia?
So links and forms are the two main hypermedia-based mechanisms for interacting with a server available in HTML.
do this, we will use the
fetch() API, a popular API for
This button has an
GET request to
fetch(). An AJAX request is like a
“normal” HTTP request, but it is issued “behind the scenes” by the browser. The user does not see a
request indicator from the browser as they would with normal links and forms. Additionally, unlike requests issued by those
An HTTP response to this request might look something like this:
The details of exactly what the
updateUI() function does aren’t important for our discussion.
What is important, what is the crucial aspect of this JSON-based server interaction is that it is not using hypermedia. The JSON API being used here does not return a hypermedia response. There are no hyperlinks or other hypermedia-style controls in it.
This JSON API is, rather, a Data API.
updateUI() method must understand how to turn
this contact data into HTML.
In particular, the code in
updateUI() needs to know about the internal structure and meaning of the data.
It needs to know:
- Exactly how the fields in the JSON data object are structured and named.
- How they relate to one another.
- How to update the local data this new data corresponds with.
- How to render this data to the browser.
- What additional actions/API end points can be called with this data.
In short, the logic in
updateUI() needs to have intimate knowledge of the API endpoint at
/api/v1/contact/1, knowledge provided
via some side-channel beyond the response itself. As a result, the
updateUI() code and the
API have a strong relationship, known as tight coupling: if the format of the JSON response changes, then the code for
updateUI() will almost certainly
also need to be changed as well.
Single Page Applications
Instead, the application is exchanging plain data with the server and then updating the content within a single page.
When this strategy or architecture is adopted for an entire application, everything happens on a “Single Page” and, thus the application becomes a “Single Page Application.”
The Single Page Application architecture is extremely popular today and has been the dominant approach to building web applications for the last decade. This can be observed by the high level of mind-share and discussion it has received in the industry.
Today the vast majority of Single Page Applications adopt far more sophisticated frameworks for managing their user interface than this simple example shows. Popular libraries such as React, Angular, Vue.js, etc. are now the common — indeed, the standard — way to build web applications.
When the user interface is updated by a user these changes also flow into the model objects, establishing a “two-way” binding mechanism: the model can update the UI, and the UI can update the model.
This is a much more sophisticated approach to a web client than hypermedia, and it typically does away almost entirely with the underlying hypermedia infrastructure available in the browser.
SPAs are more much like thick client applications, that is, like the client-server applications of the 1980s — an architecture popular before the web came along and that the web was, in many ways, a reaction to.
This approach isn’t necessarily wrong, of course: there are times when a thick client approach is the appropriate choice for an application. But it is worth thinking about why web developers so frequently make this choice without considering other alternatives, and if there are reasons not to go down this path.
Why Use Hypermedia?
The emerging norm for web development is to build a React single-page application, with server rendering. The two key elements of this architecture are something like:
- The backend is an API that that application makes requests against.
This idea has really swept the internet. It started with a few major popular websites and has crept into corners like marketing sites and blogs.
Given the popularity, power and success of this modern approach to building web applications, why on earth would you consider an older, clunkier and less popular approach like hypermedia?
We are glad you asked!
It turns out that the hypermedia architecture, even in its original Web 1.0 form, has a number of advantages when compared with the Single Page Application + JSON Data API approach. Three of the biggest are:
- It is an extremely simple approach to building web applications.
- It is extremely tolerant of content and API changes. In fact, it thrives on them!
- It leverages tried and true features of web browsers, such as caching.
The first two advantages, in particular, address major pain points in modern web development:
- Single Page Application infrastructure has become extremely complex, often requiring an entire team to manage.
- JSON API churn — constant changes made to JSON APIs to support application needs — has become a major pain point for many application teams.
But if hypermedia is so great, and if it addresses so many of the problems that beset the web development industry, why was it set aside in the first place? After all, hypermedia was there first. Why didn’t web developers just stick with it?
There are two major reasons hypermedia hasn’t made a comeback in web development.
The first is this: the expressiveness of HTML as a hypermedia hasn’t changed much, if at all, since HTML 2.0, which was released in the mid 1990s. Many new features have been added to HTML, of course, but there haven’t been any major new ways to interact with a server in HTML in almost three decades.
HTML developers still only have anchor tags and forms available as hypermedia controls, and those hypermedia controls
can still only issue
This baffling lack of progress by HTML leads immediately to the second, and perhaps more practical reason that HTML-as-hypermedia has fallen on hard times: as the interactivity and expressiveness of HTML has remained frozen, the demands of web users have continued to increase, calling for more and more interactive web applications.
It didn’t have to be this way. There is nothing intrinsic to the idea of hypermedia that prevents it from having a richer, more expressive interactivity model than vanilla HTML. Rather than moving away from a hypermedia-based approach, the industry could have demanded more interactivity from HTML.
Instead, building thick-client style applications within web browsers became the standard, in an understandable move to a more familiar model for building rich applications.
Not everyone set aside hypermedia, of course. There have been heroic efforts to continue to advance hypermedia outside of HTML, efforts like HyTime, VoiceXML, and HAL.
A Hypermedia Resurgence?
It is interesting to think about how HTML could have advanced. Instead of stalling as a hypermedia, how could HTML have continued to develop? Could it have kept adding new hypermedia controls and increasing the expressiveness of existing ones? Would it have been possible to build modern web applications within this original, hypermedia-oriented and RESTful model that made the early web so powerful, so flexible, so much fun?
These hypermedia-oriented libraries re-center hypermedia as the core technology in web applications.
In the web development world there is an ongoing debate between the Single Page Application (SPA) approach and what is now being called the “Multi-Page Application” (MPA) approach. MPA is a modern name for the old, Web 1.0 way of building web applications, using links and forms located on multiple web pages, submitting HTTP requests and getting HTML responses.
MPA applications, by their nature, are Hypermedia-Driven Applications: after all, they are exactly what Roy Fielding was describing in his dissertation.
These applications tend to be clunky, but they work reasonably well. Many web developers and teams choose to accept the limitations of plain HTML in the interest of simplicity and reliability.
Rich Harris, creator of Svelte.js, a popular SPA library, and a thought-leader on the SPA side of the debate, has proposed a mix of this older MPA style and the newer SPA style. Harris calls this approach to building web applications “transitional,” in that it attempts to blend the MPA approach and the newer SPA approach into a coherent whole. (This is somewhat similar to the “transitional” trend in architecture, which combines traditional and modern architectural styles.)
“Transitional” is a fitting term for mixed-style applications, and it offers a reasonable compromise between the two approaches, using either one as appropriate on a case-by-case basis.
But this compromise still feels unsatisfactory.
Must we default to having these two very different architectural models in our applications?
Recall that the crux of the trade-off between SPAs and MPAs is the user experience, or interactivity of the application. This typically drives the decision to choose one approach versus the other for an application or — in the case of a “transitional” application — for a particular feature.
It turns out that by adopting a hypermedia-oriented library, the interactivity gap between the MPA and the SPA approach closes dramatically. You can use the MPA approach, that is, the hypermedia approach, for much more of your application without compromising your user interface. You might even be able to use the hypermedia approach for all your application needs.
Rather than having an SPA with a bit of hypermedia around the edges, or some mix of the two approaches, you can often create a web application that is primarily or entirely hypermedia-driven, and that still satisfies the interactivity that your users require.
This can tremendously simplify your web application and produce a much more coherent and understandable piece of software. While there are still times and places for the more complex SPA approach, which we will discuss later in the book, by adopting a hypermedia-first approach and using a hypermedia-oriented library to push HTML as far as possible, your web application can be powerful, interactive and simple.
One such hypermedia oriented library is htmx. Htmx will be the focus of Part Two of this book. We show that you can, in fact, create many common “modern” UI features found in sophisticated Single Page Applications by instead using the hypermedia model.
And, it is refreshingly fun and simple to do so.
When building a web application with htmx the term Multi-Page Application applies roughly, but it doesn’t fully characterize the core of the application architecture. As you will see, htmx doesn’t need to replace entire pages, and, in fact, an htmx-based application can reside entirely within a single page. We don’t recommend this practice, but it is possible!
So it isn’t quite right to call web applications built with htmx “Multi-Page Applications.” What the older Web 1.0 MPA approach and the newer hypermedia-oriented library powered applications have in common is their use of hypermedia as their core technology and architecture.
Therefore, we use the term Hypermedia-Driven Applications (HDAs) to describe both.
This clarifies that the core distinction between these two approaches and the SPA approach isn’t the number of pages in the application, but rather the underlying system architecture.
- Hypermedia-Driven Application (HDA)
A web application that uses hypermedia and hypermedia exchanges as its primary mechanism for communicating with a server.
So, what does an HDA look like up close?
Instead, we have declarative attributes much like the
href attribute on anchor tags and the
action attribute on
form tags. The
hx-get attribute tells htmx: “When the user clicks this button, issue a
GET request to
hx-target attribute tells htmx: “When the response returns, take the resulting HTML and place it into the element
with the id
Here we get to the crux of htmx and how it allows you to build Hypermedia-Driven Applications:
The HTTP response from the server is expected to be in HTML format, not JSON.
An HTTP response to this htmx-driven request might look something like this:
This small bit of HTML would be placed into the element in the DOM with the id
As we walk through building a Hypermedia-Driven Application in this book, the differences between the two approaches will become more and more apparent.
When Should You Use Hypermedia?
Hypermedia is often, though not always, a great choice for a web application.
Perhaps you are building a website or application that simply doesn’t need a huge amount of user-interactivity. There are many useful web applications like this, and there is no shame in it! Applications like Amazon, eBay, any number of news sites, shopping sites, message boards and so on don’t need a massive amount of interactivity to be effective: they are mainly text and images, which is exactly what the web was designed for.
Perhaps your application adds most of its value on the server side, by coordinating users or by applying sophisticated data analysis and then presenting it to a user. Perhaps your application adds value by simply sitting in front of a well-designed database, with simple Create-Read-Update-Delete (CRUD) operations. Again, there is no shame in this!
In any of these cases, using a hypermedia approach would likely be a great choice: the interactivity needs of these applications are not dramatic, and much of the value of these applications lives on the server side, rather than on the client side.
All of these applications are amenable to what Roy Fielding called “large-grain hypermedia data transfers”: you can simply use anchor tags and forms, with responses that return entire HTML documents from requests, and things will work just fine. This is exactly what the web was designed to do!
And, by layering htmx or another hypermedia-oriented library on top of this approach, you can address many of the usability issues that come with vanilla HTML and take advantage of finer-grained hypermedia transfers. This opens up a whole slew of new user interface and experience possibilities, making the set of applications that can be built using hypermedia much larger.
But more on that later.
When Shouldn’t You Use Hypermedia?
So, what about that not always? When isn’t hypermedia going to work well for an application?
One example that springs immediately to mind is an online spreadsheet application. In the case of a spreadsheet, updating one cell could have a large number of cascading changes that need to be made across the entire sheet. Worse, this might need to happen on every keystroke.
In this case we have a highly dynamic user interface without clear boundaries as to what might need to be updated given a particular change. Introducing a hypermedia-style server round-trip on every cell change would hurt performance tremendously.
However even in the case of an online spreadsheet there are likely areas where the hypermedia approach might help.
The spreadsheet application likely also has a settings page. And perhaps that settings page is amenable to the hypermedia approach. If it is simply a set of relatively straight-forward forms that need to be persisted to the server, the chances are good that hypermedia would, in fact, work great for this part of the app.
And, by adopting hypermedia for that part of your application, you might be able to simplify that part of the application quite a bit. You could then save more of your application’s complexity budget for the core, complicated spreadsheet logic, keeping the simple stuff simple.
Hypermedia: A Sophisticated, Modern System Architecture
Hypermedia is often regarded as an old and antiquated technology in web development circles, useful perhaps for static websites but certainly not a realistic choice for modern, sophisticated web applications.
Seriously? Are we claiming that modern web applications can be built using it?
Contrary to current popular opinion, hypermedia is an innovative and modern system architecture for building applications, in some ways more modern than the prevailing Single Page Application approaches. In the remainder of this book we will reintroduce you to the core, practical concepts of hypermedia and then demonstrate exactly how you can take advantage of this system architecture in your own software.
In the coming chapters you will develop a firm understanding of all the benefits and techniques enabled by this approach. We hope that, in addition, you will also become as passionate about it as we are.