In the previous chapter we introduced a simple Web 1.0-style hypermedia application to manage contacts. Our application
supported the normal CRUD operations for contacts, as well as a simple mechanism for searching contacts. Our application
was built using nothing but forms and anchor tags, the traditional hypermedia controls used to interact with servers.
The application exchanges hypermedia (HTML) with the server over HTTP, issuing GET and POST HTTP requests and receiving back full HTML documents in response.
It is a basic web application, but it is also definitely a Hypermedia-Driven Application. It is robust, it leverages the web’s native technologies, and it is simple to understand.
So what’s not to like about the application?
Unfortunately, our application has a few issues common to web 1.0 style applications:
- From a user experience perspective: there is a noticeable refresh when you move between pages of the application, or when you create, update or delete a contact. This is because every user interaction (link click or form submission) requires a full page refresh, with a whole new HTML document to process after each action.
- From a technical perspective: all the updates are done with the POST HTTP method. This, despite the fact that more logical actions and HTTP request types like DELETE exist and would make more sense for some of the operations we implemented. After all, if we wanted to delete a resource, wouldn’t it make more sense to use an HTTP DELETE request to do so? Somewhat ironically, since we have used pure HTML, we are unable to access the full expressive power of HTTP, which was designed specifically for HTML.
We could address this issue by adopting a Single Page Application framework, and updating our server side to provide JSON-based responses. Single Page Applications eliminate the clunkiness of web 1.0 applications by mutating the Document Object Model (DOM) directly, without doing a full page refresh.
In this style of application, communication with the server is typically done via a JSON Data API, with the application sacrificing the advantages of hypermedia in order to provide a better, smoother user experience.
Many web developers today would not even consider the hypermedia approach due to the perceived “legacy” feel of these web 1.0 style applications.
The second, technical point may strike you as a bit pedantic, and we are the first to admit that conversations around REST and which HTTP Action is right for a given operation can become very tedious. But still, it’s odd that, when using plain HTML, it is impossible to use HTTP fully.
A Close Look At A Hyperlink
To understand how htmx allows us to improve the UX of our Web 1.0 style application without abandoning hypermedia, let’s revisit the hyperlink/anchor tag from Chapter 1. Recall, a hyperlink is what is known as a hypermedia control, a mechanism that describes some sort of interaction by encoding information about that interaction directly and completely within itself.
Consider again this simple anchor tag which, when interpreted by a browser, creates a hyperlink to the website for this book:
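The original listing is not reproduced here, so the following is only a minimal sketch of such an anchor tag. The link text matches the description in the breakdown that follows; the href value is an assumption standing in for the address of the book's website.

```html
<!-- A plain hyperlink. The href URL is assumed for illustration. -->
<a href="https://hypermedia.systems/">
  Hypermedia Systems
</a>
```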
Let’s break down exactly what happens with this link:
- The browser will render the text “Hypermedia Systems” to the screen, likely with a decoration indicating it is clickable.
- Then, when a user clicks on the text…
- The browser will issue an HTTP GET request to the URL given in the link’s href attribute.
- The browser will load the HTML body of the HTTP response into the browser window, replacing the current document.
So we have four aspects of a simple hypermedia link like this, with the last three aspects supplying the mechanism that distinguishes a hyperlink from “normal” text and makes this a hypermedia control.
Now, let’s take a moment and think about how we can generalize these last three aspects of a hyperlink.
Why Only Anchors & Forms?
Consider: what makes anchor tags (and forms) so special?
Why can’t other elements issue HTTP requests as well?
For example, why shouldn’t button elements be able to issue HTTP requests? It seems arbitrary to have to wrap a form tag around a button just to make deleting contacts work in our application.
Maybe: other elements should be able to issue HTTP requests as well, and act as hypermedia controls on their own.
This is our first opportunity to generalize HTML as a hypermedia.
HTML could be extended to allow any element to issue a request to the server and act as a hypermedia control.
Why Only Clicks & Submits?
Next, let’s consider the event that triggers the request to the server on our link: a click event.
Well, what’s so special about clicking (in the case of anchors) or submitting (in the case of forms)? Those are just two of many, many events that are fired by the DOM, after all. Events like mouse down, or key up, or blur are all events you might want to use to issue an HTTP request.
Why shouldn’t these other events be able to trigger requests as well?
This gives us our second opportunity to expand the expressiveness of HTML:
HTML could be extended to allow any event — not just a click, as in the case of hyperlinks — to trigger HTTP requests.
Why Only GET & POST?
Getting a bit more technical in our thinking leads us to the problem we noted earlier: plain HTML only gives us access to the GET and POST actions of HTTP.
HTTP stands for Hypertext Transfer Protocol, and yet the format it was explicitly designed for, HTML, only supports two of its five developer-facing request types.
Let’s recall what these different HTTP request types are designed to represent:
- GET corresponds with “getting” a representation for a resource from a URL: it is a pure read, with no mutation of the resource.
- POST submits an entity (or data) to the given resource, often creating or mutating the resource and causing a state change.
- PUT submits an entity (or data) to the given resource for update or replacement, again likely causing a state change.
- PATCH is similar to PUT but implies a partial update and state change rather than a complete replacement of the entity.
- DELETE deletes the given resource.
These operations correspond closely to the CRUD operations we discussed in Chapter 2. By giving us access to only two of the five, HTML hamstrings our ability to take full advantage of HTTP.
This gives us our third opportunity to expand the expressiveness of HTML:
HTML could be extended so that it allows access to the missing three HTTP methods: PUT, PATCH and DELETE.
Why Only Replace The Entire Screen?
As a final observation, consider the last aspect of a hyperlink: it replaces the entire screen when a user clicks on it.
It turns out that this technical detail is the primary culprit for poor user experience in Web 1.0 Applications. A full page refresh can cause a flash of unstyled content, it destroys the scroll state of the user by scrolling to the top of the page, and so forth.
But there is no rule saying that hypermedia exchanges must replace the entire document.
This gives us our fourth, final and perhaps most important opportunity to generalize HTML:
HTML could be extended to allow the responses to requests to replace elements within the current document, rather than requiring that they replace the entire document.
This is actually a very old concept in hypermedia. Ted Nelson, in his 1980 book “Literary Machines” coined the term transclusion to capture this idea: the inclusion of content into an existing document via a hypermedia reference. If HTML supported this style of “dynamic transclusion,” then Hypermedia-Driven Applications could function much more like a Single Page Application, where only part of the DOM is updated by a given user interaction or network request.
Extending HTML as a Hypermedia with Htmx
These four opportunities present us a way to extend HTML well beyond its current abilities, but in a way that is entirely within the original hypermedia model of the web. The fundamentals of HTML, HTTP, the browser, and so on, won’t be changed dramatically. Rather, these generalizations of existing functionality already found within HTML would simply let us accomplish more using HTML.
Installing and Using Htmx
Htmx can be added to a web application by simply including it via a script tag in your HTML.
Because of this simple installation model, you can take advantage of tools like public CDNs to install the library.
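For example, you can use the popular unpkg Content Delivery Network (CDN) to install a specific version of the library. The script tag can carry an integrity hash so that the browser can verify the delivered file; the SHA for a given release can be found on the htmx website.
We also mark the script as crossorigin="anonymous" so no credentials will be sent to the CDN.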
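A hedged sketch of what such a script tag might look like is shown here. VERSION and INTEGRITY_HASH are placeholders rather than real values, and placing the tag in the head element is an assumption; substitute the release you want and the hash published for it on the htmx website.

```html
<head>
  <!-- VERSION and INTEGRITY_HASH are placeholders, not real values:
       use the htmx release you want and the SHA published for it. -->
  <script src="https://unpkg.com/htmx.org@VERSION"
          integrity="INTEGRITY_HASH"
          crossorigin="anonymous"></script>
</head>
```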
This is in the spirit of the early web, when you could simply include a script tag and things would “just work.”
If you don’t want to use a CDN, you can download htmx to your local system and adjust the script tag to point to wherever you keep your static assets. Or, you may have a build system that automatically installs dependencies. In this case you can use the Node Package Manager (npm) name for the library, htmx.org, and install it in the usual manner that your build system supports.
Once htmx has been installed, you can begin using it immediately, and you do not need to write any JavaScript to do so.
Instead, you will use attributes placed directly on elements in your HTML to drive more dynamic behavior. Htmx extends HTML as a hypermedia, and it wants that extension to be as natural and consistent as possible with existing HTML concepts. Just as an anchor tag uses an href attribute to specify the URL to retrieve, and forms use an action attribute to specify the URL to submit the form to, htmx uses HTML attributes to specify the URL that an HTTP request should be issued to.
Triggering HTTP Requests
Let’s look at the first feature of htmx: the ability for any element in a web page to issue HTTP requests. This is the core functionality provided by htmx, and it consists of five attributes that can be used to issue the five different developer-facing types of HTTP requests:
- hx-get - issues an HTTP GET request.
- hx-post - issues an HTTP POST request.
- hx-put - issues an HTTP PUT request.
- hx-patch - issues an HTTP PATCH request.
- hx-delete - issues an HTTP DELETE request.
Each of these attributes, when placed on an element, tells the htmx library: “When a user clicks (or whatever) this element, issue an HTTP request of the specified type.”
The values of these attributes are similar to the values of both href on anchors and action on forms: you specify the URL you wish to issue the given HTTP request type to. Typically, this is done via a server-relative path.
For example, if we wanted a button to issue a GET request to /contacts, we would give it an hx-get attribute with the value /contacts.
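A minimal sketch of such a button follows. The label is borrowed from the “Get Contacts” button referred to later in the chapter; the exact markup of the original listing is assumed.

```html
<!-- Issues GET /contacts when clicked; click is the default trigger for a button. -->
<button hx-get="/contacts">
  Get Contacts
</button>
```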
The htmx library will see the hx-get attribute on this button and issue an HTTP GET AJAX request to the /contacts path when the user clicks on it.
Very easy to understand and very consistent with the rest of HTML.
It’s All Just HTML
With the request issued by the button above, we get to perhaps the most important thing to understand about htmx: it expects the response to this AJAX request to be HTML. Htmx is an extension of HTML. A native hypermedia control like an anchor tag will typically get an HTML response to a request it creates. Similarly, htmx expects the server to respond to the requests that it makes with HTML.
Where many AJAX-driven applications expect a JSON response, htmx simply goes another direction and expects HTML.
Htmx vs. “Plain” HTML Responses
There is an important difference between the HTTP responses to “normal” anchor or form driven HTTP requests and to htmx-powered requests: in the case of htmx triggered requests, responses can be partial bits of HTML.
In htmx-powered interactions, as you will see, we are often not replacing the entire document. Rather we are using “transclusion” to include content within an existing document. Because of this, it is often not necessary or desirable to transfer an entire HTML document from the server to the browser.
This fact can be used to save bandwidth as well as resource loading time. Less overall content is transferred from the server to the client, and it isn’t necessary to reprocess a head tag with style sheets, script tags, and so forth.
When the “Get Contacts” button is clicked, a partial HTML response might look something like this:
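As an illustration only, a partial response of this kind could be as small as the fragment below. The contact names and mailto addresses are made-up placeholders, not data from the application.

```html
<!-- A partial response: a bare list, with no html, head or body tags.
     Names and addresses are hypothetical placeholders. -->
<ul>
  <li><a href="mailto:joe@example.com">Joe</a></li>
  <li><a href="mailto:sarah@example.com">Sarah</a></li>
  <li><a href="mailto:fred@example.com">Fred</a></li>
</ul>
```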
This is just an unordered list of contacts with some clickable elements in it. Note that there is no opening html tag, no head tag, and so forth: it is a raw HTML list, without any decoration around it. A response in a real application might contain more sophisticated HTML than this simple list, but even if it were more complicated it wouldn’t need to be an entire page of HTML: it could just be the “inner” content of the HTML representation for the requested resource.
This small HTML response shows how htmx stays within the hypermedia paradigm: just like a “normal” hypermedia control in a “normal” web application, we see hypermedia being transferred to the client in a stateless and uniform manner.
This button just gives us a slightly more sophisticated mechanism for building a web application using hypermedia.
Targeting Other Elements
Now, given that htmx has issued a request and gotten back some HTML as a response, and that we are going to swap this content into the existing page (rather than replacing the entire page), the question becomes: where should this new content be placed?
It turns out that the default htmx behavior is to simply put the returned content inside the element that triggered the request. That’s obviously not a good thing in the case of our button: we will end up with a list of contacts awkwardly embedded within the button element. That will look pretty silly and is obviously not what we want.
Fortunately htmx provides another attribute, hx-target, which can be used to specify exactly where in the DOM the new content should be placed. The value of the hx-target attribute is a Cascading Style Sheet (CSS) selector that allows you to specify the element to put the new hypermedia content into.
Let’s add a div tag with the id main that encloses the button. We will then target this div with the response.
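A sketch of that arrangement, assuming the same “Get Contacts” button as before, might look like this:

```html
<!-- The enclosing div is the target for the response HTML. -->
<div id="main">
  <button hx-get="/contacts" hx-target="#main">
    Get Contacts
  </button>
</div>
```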
We have added hx-target="#main" to our button, where #main is a CSS selector that says “The thing with the ID ‘main’.”
By using CSS selectors, htmx builds on top of familiar and standard HTML concepts. This keeps the additional conceptual load for working with htmx to a minimum.
Given this new configuration, what would the HTML on the client look like after a user clicks on this button and a response has been received and processed?
It would look something like this:
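Assuming the server returned a bare list of contacts (the names and addresses here are hypothetical placeholders), the resulting DOM could be sketched as:

```html
<!-- After the swap: the list is now the inner content of the target div,
     and the button that was inside the div is gone. -->
<div id="main">
  <ul>
    <li><a href="mailto:joe@example.com">Joe</a></li>
    <li><a href="mailto:sarah@example.com">Sarah</a></li>
    <li><a href="mailto:fred@example.com">Fred</a></li>
  </ul>
</div>
```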
The response HTML has been swapped into the div, replacing the button that triggered the request. Transclusion! And this has happened “in the background” via AJAX, without a clunky page refresh.
Now, perhaps we don’t want to load the content from the server response into the div as child elements. Perhaps, for whatever reason, we wish to replace the entire div with the response. To handle this, htmx provides another attribute, hx-swap, that allows you to specify exactly how the content should be swapped into the DOM.
The hx-swap attribute supports the following values:
- innerHTML - The default, replace the inner HTML of the target element.
- outerHTML - Replace the entire target element with the response.
- beforebegin - Insert the response before the target element.
- afterbegin - Insert the response before the first child of the target element.
- beforeend - Insert the response after the last child of the target element.
- afterend - Insert the response after the target element.
- delete - Deletes the target element regardless of the response.
- none - No swap will be performed.
The first two values, innerHTML and outerHTML, are taken from the standard DOM properties that allow you to replace content within an element or in place of an entire element respectively.
The next four values are taken from the Element.insertAdjacentHTML() DOM API, which allows you to place an element or elements around a given element in various ways.
The last two values, delete and none, are specific to htmx. The first option will remove the target element from the DOM, while the second option will do nothing (you may want to only work with response headers, an advanced technique we will look at later in the book).
Again, you can see htmx stays as close as possible to existing web standards in order to minimize the conceptual load necessary for its use.
So let’s consider that case where, rather than replacing the innerHTML content of the main div above, we want to replace the entire div with the HTML response.
To do so would require only a small change to our button, adding a new hx-swap attribute with the value outerHTML.
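A sketch of the changed button, under the same assumptions as the earlier examples, might be:

```html
<div id="main">
  <!-- hx-swap="outerHTML" replaces the target div itself, not just its contents. -->
  <button hx-get="/contacts" hx-target="#main" hx-swap="outerHTML">
    Get Contacts
  </button>
</div>
```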
Now, when a response is received, the entire div will be replaced with the hypermedia content:
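With the same hypothetical list of contacts as the response, the DOM after an outerHTML swap could be sketched as follows. The div with the id main no longer appears at all.

```html
<!-- After an outerHTML swap: the response stands where the target div used to be.
     Names and addresses are hypothetical placeholders. -->
<ul>
  <li><a href="mailto:joe@example.com">Joe</a></li>
  <li><a href="mailto:sarah@example.com">Sarah</a></li>
  <li><a href="mailto:fred@example.com">Fred</a></li>
</ul>
```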
You can see that, with this change, the target div has been entirely removed from the DOM, and the list that was returned as the response has replaced it.
Later in the book we will see additional uses for hx-swap, for example when we implement infinite scrolling in our contact management application.
Note that with the hx-get, hx-post, hx-put, hx-patch and hx-delete attributes, we have addressed two of the four opportunities for improvement that we enumerated regarding plain HTML:
- Opportunity 1: We can now issue an HTTP request with any element (in this case we are using a button).
- Opportunity 3: We can issue any sort of HTTP request we want, PUT, PATCH and DELETE in particular.
With hx-target and hx-swap we have addressed a third shortcoming: the requirement that the entire page be replaced.
- Opportunity 4: We can now replace any element we want in our page via transclusion, and we can do so in any manner we want.
So, with only seven relatively simple additional attributes, we have addressed most of the shortcomings of HTML as a hypermedia that we identified earlier.
What’s next? Recall the other shortcoming we noted: the fact that only a click event (on an anchor) or a submit event (on a form) can trigger an HTTP request. Let’s look at how we can address that limitation.
Thus far we have been using a button to issue a request with htmx. You have probably intuitively understood that the button would issue its request when you clicked on the button since, well, that’s what you do with buttons: you click on them.
And, yes, by default when an hx-get or another request-driving annotation from htmx is placed on a button, the request will be issued when the button is clicked.
However, htmx generalizes this notion of an event triggering a request by using, you guessed it, another attribute: hx-trigger. The hx-trigger attribute allows you to specify one or more events that will cause the element to trigger an HTTP request.
Often you don’t need to use hx-trigger because the default triggering event will be what you want.
The default triggering event depends on the element type, and should be fairly intuitive:
- Requests on input, textarea and select elements are triggered by the change event.
- Requests on form elements are triggered on the submit event.
- Requests on all other elements are triggered by the click event.
To demonstrate how hx-trigger works, consider the following situation: we want to trigger the request on our button when the mouse enters it. Now, this is certainly not a good UX pattern, but bear with us: we are just using this as an example.
To respond to a mouse entering the button, we would add an hx-trigger attribute with the value mouseenter to our button.
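Sketched against the button and div used so far (both assumed from the earlier examples), this might look like:

```html
<div id="main">
  <!-- The request now fires on mouseenter instead of the default click. -->
  <button hx-get="/contacts" hx-target="#main" hx-swap="outerHTML"
          hx-trigger="mouseenter">
    Get Contacts
  </button>
</div>
```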
Now, with this hx-trigger attribute in place, whenever the mouse enters this button, a request will be triggered. Silly, but it works.
Let’s try something a bit more realistic and potentially useful: let’s add support for a keyboard shortcut for loading the contacts, Ctrl-L (for “Load”). To do this we will need to take advantage of additional syntax that the hx-trigger attribute supports: event filters and additional arguments.
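Event filters are a mechanism for determining if a given event should trigger a request or not. They are applied to an event by adding square brackets containing a filter expression after the event name. The filter is a JavaScript expression that is evaluated when the event occurs: if the result is truthy, the event will trigger the request. If not, the request will not be triggered.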
In the case of keyboard shortcuts, we want to catch the keyup event in addition to the click event.
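A first attempt, sketched with the same assumed button, simply lists both events:

```html
<div id="main">
  <!-- Two triggering events, separated by a comma. -->
  <button hx-get="/contacts" hx-target="#main" hx-swap="outerHTML"
          hx-trigger="click, keyup">
    Get Contacts
  </button>
</div>
```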
Note that we have a comma-separated list of events that can trigger this element, allowing us to respond to more than one potential triggering event. We still want to respond to the click event and load the contacts, in addition to handling the Ctrl-L keyboard shortcut.
There are, unfortunately, two problems with our keyup addition: As it stands, it will trigger requests on any keyup event that occurs. And, worse, it will only trigger when a keyup occurs within this button. The user would need to tab onto the button to make it active and then begin typing.
Let’s fix these two issues. To fix the first one, we will use a trigger filter to test that Control key and the “L” key are pressed together:
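With a filter added to the keyup event, the assumed button from the earlier sketches would become:

```html
<div id="main">
  <!-- The filter in square brackets restricts keyup to Ctrl-L presses. -->
  <button hx-get="/contacts" hx-target="#main" hx-swap="outerHTML"
          hx-trigger="click, keyup[ctrlKey && key == 'l']">
    Get Contacts
  </button>
</div>
```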
The trigger filter in this case is ctrlKey && key == 'l'. This can be read as “A key up event, where the ctrlKey property is true and the key property is equal to l.” Note that the properties ctrlKey and key are resolved against the event rather than the global name space, so you can easily filter on the properties of a given event. You can use any JavaScript expression as a filter.
OK, so this filter limits the keyup events that will trigger the request to only Ctrl-L presses. However, we still have the problem that, as it stands, only keyup events within the button will trigger the request.
Events in the DOM typically “bubble” up to parent elements. So an event like keyup will be triggered first on the focused element, and then on its parent (enclosing) element, and so on, until it reaches the top level document object that is the root of all other elements.
To support a global keyboard shortcut that works regardless of what element has focus, we will take advantage of event bubbling and a feature that the hx-trigger attribute supports: the ability to listen to other elements for events. The syntax for doing this is the from: modifier, which is added after an event name and lets you use a CSS selector to specify the element on which to listen for the given event.
In this case, we want to listen to the body element, which is the parent element of all visible elements on the page.
Our updated hx-trigger attribute therefore adds from:body after the keyup filter.
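Putting the filter and the from: modifier together, and again assuming the button from the earlier sketches, the markup might read:

```html
<div id="main">
  <!-- keyup is now listened for on the body, so focus no longer matters. -->
  <button hx-get="/contacts" hx-target="#main" hx-swap="outerHTML"
          hx-trigger="click, keyup[ctrlKey && key == 'l'] from:body">
    Get Contacts
  </button>
</div>
```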
Now, in addition to clicks, the button will listen for keyup events on the body of the page. So it will issue a request when it is clicked on and also whenever someone hits Ctrl-L within the body of the page.
And now we have a nice keyboard shortcut for our Hypermedia-Driven Application.
The hx-trigger attribute supports many more modifiers, and it is more elaborate than other htmx attributes. This is because events, in general, are complicated and require a lot of details to get just right. The default trigger will often suffice, however, and you typically don’t need to reach for complicated hx-trigger features when using htmx.
Htmx: HTML eXtended
And hey, check it out! With hx-trigger we have addressed the final opportunity for improvement of HTML that we outlined at the start of this chapter:
- Opportunity 2: We can use any event to trigger an HTTP request.
That’s a grand total of eight, count 'em, eight attributes that all fall squarely within the same conceptual model as normal HTML and that, by extending HTML as a hypermedia, open up a whole new world of user interaction possibilities within HTML.
Here is a summary of those opportunities and which htmx attributes address them:
- Any element should be able to make HTTP requests: hx-get, hx-post, hx-put, hx-patch, hx-delete
- Any event should be able to trigger an HTTP request: hx-trigger
- Any HTTP Action should be available: hx-put, hx-patch, hx-delete
- Any place on the page should be replaceable (transclusion): hx-target, hx-swap
Passing Request Parameters
So far we have just looked at a situation where a button makes a simple GET request. This is conceptually very close to what an anchor tag might do. But there is that other native hypermedia control in HTML-based applications: forms. Forms are used to pass additional information beyond just a URL up to the server in a request.
This information is captured via input and input-like elements within the form via the various types of input tags available in HTML.
Htmx allows you to include this additional information in a way that mirrors HTML itself.
The simplest way to pass input values with a request in htmx is to enclose the element making a request within a form tag.
Let’s take our original button for retrieving contacts and repurpose it for searching contacts:
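A sketch of the repurposed button follows. The input id search and the switch to hx-post come from the discussion that follows; the parameter name q, the label text and the button text are assumptions made for illustration.

```html
<div id="main">
  <form>
    <label for="search">Search Contacts:</label>
    <!-- name="q" is an assumed parameter name. -->
    <input id="search" name="q" type="search" placeholder="Search Contacts">
    <!-- The enclosing form's values are sent along with this POST. -->
    <button hx-post="/contacts" hx-target="#main">
      Search The Contacts
    </button>
  </form>
</div>
```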
Here we have added a form tag surrounding the button along with a search input that can be used to enter a term to search contacts.
Now, when a user clicks on the button, the value of the input with the id search will be included in the request. This is by virtue of the fact that there is a form tag enclosing both the button and the input: when an htmx-driven request is triggered, htmx will look up the DOM hierarchy for an enclosing form, and, if one is found, it will include all values from within that form. (This is sometimes referred to as “serializing” the form.)
You might have noticed that the button was switched from a GET request to a POST request. This is because, by default, htmx does not include the closest enclosing form for GET requests, but it does include the form for all other types of requests.
This may seem a little strange, but it avoids junking up URLs that are used within forms when dealing with history entries, which we will discuss in a bit. And you can always include an enclosing form’s values with an element that issues a GET by using the hx-include attribute, discussed next.
While enclosing all the inputs you want included in a request within a form is the most common approach for inputs in htmx requests, it isn’t always possible or desirable: form tags can have layout consequences and simply cannot be placed in some spots in HTML documents. A good example of the latter situation is in table row (tr) elements: the form tag is not a valid child or parent of table rows, so you can’t place a form within or around a row of data in a table.
To address this issue, htmx provides a mechanism for including input values in requests: the hx-include attribute allows you to select input values that you wish to include in a request via CSS selectors.
Here is the above example reworked to include the input, dropping the form:
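One way this reworking might look is sketched here. As in the form-based version, the parameter name q and the label and button text are assumptions.

```html
<div id="main">
  <label for="search">Search Contacts:</label>
  <input id="search" name="q" type="search" placeholder="Search Contacts">
  <!-- No form: hx-include pulls in the value of the input with id "search". -->
  <button hx-post="/contacts" hx-target="#main" hx-include="#search">
    Search The Contacts
  </button>
</div>
```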
The hx-include attribute takes a CSS selector value and allows you to specify exactly which values to send along with the request. This can be useful if it is difficult to colocate an element issuing a request with all the desired inputs.
It is also useful when you do, in fact, want to submit values with a GET request and overcome the default behavior of htmx.
Relative CSS selectors
The hx-include attribute and, in fact, most attributes that take a CSS selector, also support relative CSS selectors.
These allow you to specify a CSS selector relative to the element it is declared on. Here are some examples:
- closest - Find the closest parent element matching the given selector, e.g., closest form.
- next - Find the next element (scanning forward) matching the given selector, e.g., next input.
- previous - Find the previous element (scanning backwards) matching the given selector, e.g., previous input.
- find - Find the next element within this element matching the given selector, e.g., find input.
- this - The current element.
Using relative CSS selectors often allows you to avoid generating ids for elements, since you can take advantage of their local structural layout instead.
A final way to include values in htmx-driven requests is to use the hx-vals attribute, which allows you to include “static” values in the request. This can be useful if you have additional information that you want to include in requests, but you don’t want to have this information embedded in, for example, hidden inputs (which would be the standard mechanism for including additional, hidden information in HTML).
As an example of hx-vals, consider a button whose hx-vals attribute holds a JSON object with a state property set to MT.
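A minimal sketch of such a button is given here; the button text is an assumption.

```html
<!-- Single quotes around the attribute value, so the JSON inside can use double quotes. -->
<button hx-get="/contacts" hx-vals='{"state":"MT"}'>
  Get The Contacts In Montana
</button>
```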
The parameter state with the value MT will be included in the GET request, resulting in a path and parameters that look like this: /contacts?state=MT. Note that the hx-vals attribute uses single quotes around its value. This is because JSON strictly requires double quotes and, therefore, to avoid escaping we need to use the single-quote form for the attribute value.
You can also prefix the value of hx-vals with js: and pass values evaluated at the time of the request, which can be useful for including values that are maintained dynamically.
For example, if there were a JavaScript function, getCurrentState(), that returned the currently selected state, its result could be included dynamically in htmx requests.
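A sketch of the dynamic form is shown here. getCurrentState() is the function named in the text and is assumed to be defined elsewhere on the page; the button text is an assumption.

```html
<!-- The js: prefix makes htmx evaluate the expression when the request is issued.
     getCurrentState() is assumed to exist in the page's JavaScript. -->
<button hx-get="/contacts" hx-vals='js:{"state":getCurrentState()}'>
  Get The Contacts In The Selected State
</button>
```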
These three mechanisms, using form tags, using the hx-include attribute and using the hx-vals attribute, allow you to include values in your hypermedia requests with htmx in a manner that should feel very familiar and in keeping with the spirit of HTML, while also giving you the flexibility to achieve what you want.
We have a final piece of functionality to close out our overview of htmx: browser history support. When you use normal HTML links and forms, your browser will keep track of all the pages that you have visited. You can then use the back button to navigate back to a previous page and, once you have done this, you can use a forward button to go forward to the original page you were on.
This notion of history was one of the killer features of the early web. Unfortunately it turns out that history becomes tricky when you move to the Single Page Application paradigm. An AJAX request does not, by itself, register a web page in your browser’s history, which is a good thing: an AJAX request may have nothing to do with the state of the web page (perhaps it is just recording some activity in the browser), so it wouldn’t be appropriate to create a new history entry for the interaction.
If you have ever used a Single Page Application and accidentally clicked the back button, only to lose your entire application state and have to start over, you have seen this problem in action.
In htmx, as with Single Page Application frameworks, you will often need to explicitly work with the history API. Fortunately, since htmx sticks so close to the original model of the web and since it is declarative, getting web history right is typically much easier to do in an htmx-based application.
Consider the button we have been looking at to load contacts.
As it stands, if you click this button it will retrieve the content from /contacts and load it into the element with the id main, but it will not create a new history entry.
If we wanted it to create a history entry when this request happened, we would add a new attribute to the button, the hx-push-url attribute, with the value true.
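Applied to the contacts button assumed in the earlier sketches, this might look like:

```html
<div id="main">
  <!-- hx-push-url="true" pushes /contacts into the location bar and the history. -->
  <button hx-get="/contacts" hx-target="#main" hx-push-url="true">
    Get Contacts
  </button>
</div>
```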
Now, when the button is clicked, the /contacts path will be put into the browser’s navigation bar and a history entry will be created for it. Furthermore, if the user clicks the back button, the original content for the page will be restored, along with the original URL.
Now, the name hx-push-url might sound a little obscure, but it is based on the JavaScript API history.pushState(). This notion of “pushing” derives from the fact that history entries are modeled as a stack, and so you are “pushing” new entries onto the top of the stack of history entries.
With this relatively simple, declarative mechanism, htmx allows you to integrate with the back button in a way that mimics the “normal” behavior of HTML.
Now, there is one additional thing we need to handle to get history “just right”: we have “pushed” the /contacts path into the browser’s location bar successfully, and the back button works. But what if someone refreshes their browser while on the /contacts URL?
In this case, you will need to handle the htmx-based “partial” response as well as the non-htmx “full page” response. You can do this using HTTP headers, a topic we will go into in detail later in the book.
Htmx aims to incrementally improve HTML as a hypermedia in a manner that is conceptually coherent with the underlying markup language. Like any technical choice, this is not without trade-offs: by staying so close to HTML, htmx does not give developers a lot of infrastructure that many might feel should be there “by default”.
A good example is the concept of modal dialogs. Many web applications today make heavy use of modal dialogs, effectively in-page pop-ups that sit “on top” of the existing page. (Of course, in reality, this is an optical illusion and it is all just a web page: the web has no notion of “modals” in this regard.)
A web developer might expect htmx, as a front end library, to provide some sort of modal dialog component out of the box.
Htmx, however, has no such notion of modals. That’s not to say you can’t use modals with htmx, and we will look at how you can do so later. But htmx, like HTML itself, won’t give you an API specifically for creating modals. You would need to use a 3rd party library or roll your own modal implementation and then integrate htmx into it if you want to use modals within an htmx-based application.
By staying closer to the original model of the web, htmx aims to strike a balance between simplicity and functionality, deferring to other libraries for more elaborate frontend extensions on top of the existing web platform. The good news is that htmx plays well with others, so when these needs arise it is often easy enough to bring in another library to handle them.