web development

April Features!

I’m testing a handful of highly-experimental new features on my personal website using multivariate (“A/B”) testing.

Screenshot of the recent "Quickly Solving JigsawExplorer Puzzles" blog post with a new "Dark mode" switch hovering over it. — “Dark Mode” is just one of the new features I’m testing out.

If you visit within the next day or so you’re likely to be randomly-selected to try out one of them. (If you’re not selected, you can manually enable one of the experiments.)

I’d love to hear your feedback on these Very Serious New Features! Let me know which one(s) you see and whether you think they should become permanent fixtures on my site.

In August, I celebrated my blog – with its homepage weighing-in at a total of just 481kb – being admitted to Kev Quirk‘s 512kb club. 512kb club celebrates websites (often personal sites) whose homepage are neither “ultra minimal” or “link pages” but have a total size, including all assets, of under half a megabyte. It’s about making a commitment to a leaner, more-efficient Web.

My relatively-heavyweight homepage only just slipped in under the line. But, feeling inspired perhaps by some performance enhancements I’ve been planning this week at work, I decided to try to shave a little more off:

Now, at ~234kb, danq.me just beats the excellent gomakethings.com (it’s all those heavyweight fonts, Chris!).

Here’s what I changed:

The “recent article” tiles are dynamically sized based on their number, type, and the visitor’s screen resolution. But apart from the top one they’re almost never very large. Using thumbnail images for the non-first tile shaved off almost 160kb.

Not space-saving, but while I was in there I ensured that the first tile’s image – which almost-certainly comprises part of the Largest Contentful Paint – is never delivered with loading="lazy".
I was providing a shortcut icon in .ico format (<link rel="shortcut icon" href="/_q23t/icons/favicon-16-32-48-64-128.ico" />), which is pretty redundant nowadays because all modern browsers (and even IE11) support .png icons. I was already providing .png and .svg versions, but it turns out that some browsers favour the one with the (harmful?) rel="shortcut icon" over rel="icon" if both are present, and .ico files are – being based on Windows Bitmaps – horrendously inefficient.

By getting under the 250kb threshold, I’ve jumped up a league from Blue Team to Orange Team, so that’s nice too. I can’t see a meaningful path from where I’m at to Green Team (under 100kb) though, so this level might have to suffice.

Last-minute additions:

Here’s the PR updating my site’s record in the club
If you like the animation, here’s the JavaScript code I used to generate it (plus a quick crossfade made in Kdenlive)

The Stupidest CSS

Yesterday, I wrote the stupidest bit of CSS of my entire career.

Owners of online shops powered by WooCommerce can optionally “connect” their stores back to Woo.com. This enables them to manage their subscriptions to any extensions they use to enhance their store¹. They can also browse a marketplace of additional extensions they might like to consider, which is somewhat-tailored to them based on e.g. their geographical location²

In the future, we’ll be adding sponsored products to the marketplace listing, but we want to be transparent about it so yesterday I was working on some code that would determine from the appropriate API whether an extension was sponsored and then style it differently to make this clear. I took a look at the proposal from the designer attached to the project, which called for

the word “Sponsored” to appear alongside the name of the extension’s developer,
a stripe at the top in the brand colour of the extension, and
a strange green blob alongside it

That third thing seemed like an odd choice, but I figured that probably I just didn’t have the design or marketing expertise to understand it, and I diligently wrote some appropriate code.³

I even attached to my PR a video demonstrating how my code reviewers could test it without spoofing actual sponsored extensions.

After some minor tweaks, my change was approved. The designer even swung by and gave it a thumbs-up. All I needed to do was wait for the automated end-to-end tests to complete, and I’d be able to add it to WooCommerce ready to be included in the next-but-one release. Nice.

In the meantime, I got started on my next bit of work. This one also included some design work by the same designer, and wouldn’t you know it… this one also had a little green blob on it?

Then it hit me. The blobs weren’t part of the design at all, but the designer’s way of saying “look at this bit, it’s important!”. Whoops!

So I got to rush over to my (already-approved, somehow!) changeset and rip out the offending CSS: the stupidest bit of CSS of my entire career. Not bad code per se, but reasonable code resulting from a damn-stupid misinterpretation of a designer’s wishes. Brilliant.

¹ WooCommerce extensions serve loads of different purposes, like handling bookings and reservations and integrating with parcel tracking services.

² There’s no point us suggesting an extension that helps you calculate Royal Mail shipping rates if you’re not shipping from the UK, for example!

³ A fun side-effect of working on open-source software is that my silly mistake gets immortalised somewhere where you can go and see it any time you like!

Shiftless Progressive Enhancement

Progressive enhancement is a great philosophy for Web application development. Deliver all the essential basic functionality using the simplest standards available; use advanced technologies to add bonus value and convenience features for users whose platform supports them. Win.

In Three Rings, for example, volunteers can see a “starchart” of the volunteering shifts they’ve done recently, at-a-glance, on their profile page¹. In the most basic case, this is usable in its HTML-only form: even with no JavaScript, no CSS, no images even, it still functions. But if JavaScript is enabled, the volunteer can dynamically “filter” the year(s) of volunteering they’re viewing. Basic progressive enhancement.

If a feature requires JavaScript, my usual approach is to use JavaScript to add the relevant user interface to the page in the first place. Those starchart filters in Three Rings don’t appear at all if JavaScript is disabled. A downside to this approach is that the JavaScript necessarily modifies the DOM on page load, which introduces a delay to the page being interactive as well as potentially resulting in layout shift.

That’s not always the best approach. I was reminded of this today by the website of 7-year-old Shiro (produced with, one assumes, at least a little help from Saneef H. Ansari). Take a look at this progressively-enhanced theme switcher:

No layout shift, no DOM manipulation. And yet it’s still pretty clear what features are available.

The HTML that’s delivered over-the-wire provides a disabled <select> element, which gains the CSS directive cursor: not-allowed;, to make it clear to the used that this dropdown doesn’t do anything. The whole thing’s wrapped in a custom element.

When that custom element is defined by the JavaScript, it enhances the dropdown with an event listener that implements the theme changes, then enables the disabled <select>.

<color-schemer>
  <form>
    <label>
      Theme
      <select disabled>
        <option value="">System</option>
        <option value="dark">Dark</option>
        <option value="light" selected>Light</option>
      </select>
    </label>
  </form>
</color-schemer>

I’m not convinced by the necessity of the <form> if there’s no HTML-only fallback… and the <label> probably should use a for="..." rather than wrapping the <select>, but otherwise this code is absolutely gorgeous.

It’s probably no inconvenience to the minority of JS-less users to see a theme switcher than, when they go to use it, turns out to be disabled. But it saves time for virtually everybody not to have to wait for JavaScript to manipulate the DOM, or else to risk shifting the layout by revealing a previously-hidden element.

Altogether, this is a really clever approach, and I was pleased today to be reminded – by a 7-year-old! – of the elegance of this approach. Nice one Shiro (and Saneef!).

¹ Assuming that administrators at the organisation where they volunteer enable this feature for them, of course: Three Rings‘ permission model is robust and highly-customisable. Okay, that’s enough sales pitch.

Netscape’s Untold Webstories

I mentioned yesterday that during Bloganuary I’d put non-Bloganuary-prompt post ideas onto the backburner, and considered extending my daily streak by posting them in February. Here’s part of my attempt to do that:

Let’s take a trip into the Web of yesteryear, with thanks to our friends at the Internet Archive’s WayBack Machine.

The page we’re interested in used to live at http://www.netscape.com/comprod/columns/webstories/index.html, and promised to be a showcase for best practice in Web development. Back in October 1996, it looked like this:

The page is a placeholder for Netscape Webstories (or Web Site Stories, in some places). It’s part of a digital magazine called Netscape Columns which published pieces written by Marc Andreeson, Jim Barksdale, and other bigwigs in the hugely-influential pre-AOL-acquisition Netscape Communications.

This new series would showcase best practice in designing and building Web sites¹, giving a voice to the technical folks best-placed to speak on that topic. That sounds cool!

Those white boxes above and below the paragraph of text aren’t missing images, by the way: they’re horizontal rules, using the little-known size attribute to specify a thickness of <hr size=4>!²

Certainly you’re excited by this new column and you’ll come back in November 1996, right?

Oh. The launch has been delayed, I guess. Now it’s coming in January.

The <hr>s look better now their size has been reduced, though, so clearly somebody’s paying attention to the page. But let’s take a moment and look at that page title. If you grew up writing web pages in the modern web, you might anticipate that it’s coded something like this:

<h2 style="font-variant: small-caps; text-align: center;">Coming Soon</h2>

There’s plenty of other ways to get that same effect. Perhaps you prefer font-feature-settings: 'smcp' in your chosen font; that’s perfectly valid. Maybe you’d use margin: 0 auto or something to centre it: I won’t judge.

But no, that’s not how this works. The actual code for that page title is:

<center>
  <h2>
    <font size="+3">C</font>OMING
    <font size="+3">S</font>OON
  </h2>
</center>

Back when this page was authored, we didn’t have CSS³. The only styling elements were woven right in amongst the semantic elements of a page⁴. It was simple to understand and easy to learn… but it was a total mess⁵.

Anyway, let’s come back in January 1997 and see what this feature looks like when it’s up-and-running.

Nope, now it’s pushed back to “the spring”.

Under Construction pages were all the rage back in the nineties. Everybody had one (or several), usually adorned with one or more of about a thousand different animated GIFs for that purpose.⁶

Building “in public” was an act of commitment, a statement of intent, and an act of acceptance of the incompleteness of a digital garden. They’re sort-of coming back into fashion in the interpersonal Web, with the “garden and stream” metaphor⁷ taking root. This isn’t anything new, of course – Mark Bernstein touched on the concepts in 1998 – but it’s not something that I can ever see returning to the “serious” modern corporate Web: but if you’ve seen a genuine, non-ironic “under construction” page published to a non-root page of a company’s website within the last decade, please let me know!

RSS doesn’t exist yet (although here’s a fun fact: the very first version of RSS came out of Netscape!). We’re just going to have to bookmark the page and check back later in the year, I guess…

Okay, so February clearly isn’t Spring, but they’ve updated the page… to add a search form.

It’s a genuine <form> tag, too, not one of those old-fashioned <isindex> tags you’d still sometimes find even as late as 1997. Interestingly, it specifies enctype="application/x-www-form-urlencoded". Today’s developers probably don’t think about the enctype attribute except when they’re doing a form that handles file uploads and they know they need to switch it to enctype="multipart/form-data", (or their framework does this automatically for them!).

But these aren’t the only options, and some older browsers at this time still defaulted to enctype="text/plain". So long as you’re using a POST and not GET method, the distinction is mostly academic, but if your backend CGI program anticipates that special characters will come-in encoded, back then you’d be wise to specify that you wanted URL-encoding or you might get a nasty surprise when somebody turns up using LMB or something equally-exotic.

Anyway, let’s come back in June. The content must surely be up by now:

Oh come on! Now we’re waiting until August?

At least the page isn’t abandoned. Somebody’s coming back and editing it from time to time to let us know about the still-ongoing series of delays. And that’s not a trivial task: this isn’t a CMS. They’re probably editing the .html file itself in their favourite text editor, then putting the appropriate file:// address into their copy of Netscape Navigator (and maybe other browsers) to test it, then uploading the file – probably using FTP – to the webserver… all the while thanking their lucky stars that they’ve only got the one page they need to change.

We didn’t have scripting languages like PHP yet, you see⁸. We didn’t really have static site generators. Most servers didn’t implement server-side includes. So if you had to make a change to every page on a site, for example editing the main navigation menu, you’d probably have to open and edit dozens or even hundreds of pages. Little wonder that framesets caught on, despite their (many) faults, with their ability to render your navigation separately from your page content.

Okay, let’s come back in August I guess:

Now we’re told that we’re to come back… in the Spring again? That could mean Spring 1998, I suppose… or it could just be that somebody accidentally re-uploaded an old copy of the page.

Hey: the footer’s gone too? This is clearly a partial re-upload: somebody realised they were accidentally overwriting the page with the previous-but-one version, hit “cancel” in their FTP client (or yanked the cable out of the wall), and assumed that they’d successfully stopped the upload before any damage was done.

They had not.

I didn’t mention that top menu, did I? It looks like it’s a series of links, styled to look like flat buttons, right? But you know that’s not possible because you can’t rely on having the right fonts available: plus you’d have to do some <table> trickery to lay it out, at which point you’d struggle to ensure that the menu was the same width as the banner above it. So how did they do it?

The menu is what’s known as a client-side imagemap. Here’s what the code looks like:

<a href="/comprod/columns/images/nav.map">
  <img src="/comprod/columns/images/websitestories_ban.gif" width=468 height=32 border=0 usemap="#maintopmap" ismap>
</a><map name="mainmap">
  <area coords="0,1,92,24" href="/comprod/columns/mainthing/index.html">
  <area coords="94,1,187,24" href="/comprod/columns/techvision/index.html">
  <area coords="189,1,278,24" href="/comprod/columns/webstories/index.html">
  <area coords="280,1,373,24" href="/comprod/columns/intranet/index.html">
  <area coords="375,1,467,24" href="/comprod/columns/newsgroup/index.html">
</map>

The image (which specifies border=0 because back then the default behaviour for graphical browser was to put a thick border around images within hyperlinks) says usemap="#maintopmap" to cross-reference the <map> below it, which defines rectangular areas on the image and where they link to, if you click them! This ingenious and popular approach meant that you could transmit a single image – saving on HTTP round-trips, which were relatively time-consuming before widespread adoption of HTTP/1.1‘s persistent connections – along with a little metadata to indicate which pixels linked to which pages.

The ismap attribute is provided as a fallback for browsers that didn’t yet support client-side image maps but did support server-side image maps: there were a few! When you put ismap on an image within a hyperlink, then when the image is clicked on the href has appended to it a query parameter of the form ?123,456, where those digits refer to the horizontal and vertical coordinates, from the top-left, of the pixel that was clicked on! These could then be decoded by the webserver via a .map file or handled by a CGI program. Server-side image maps were sometimes used where client-side maps were undesirable, e.g. when you want to record the actual coordinates picked in a spot-the-ball competition or where you don’t want to reveal in advance which hotspot leads to what destination, but mostly they were just used as a fallback.⁹

Both client-side and server-side image maps still function in every modern web browser, but I’ve not seen them used in the wild for a long time, not least because they’re hard (maybe impossible?) to make accessible and they can’t cope with images being resized, but also because nowadays if you really wanted to make an navigation “image” you’d probably cut it into a series of smaller images and make each its own link.

Anyway, let’s come back in October 1997 and see if they’ve fixed their now-incomplete page:

Oh, they have! From the look of things, they’ve re-written the page from scratch, replacing the version that got scrambled by that other employee. They’ve swapped out the banner and menu for a new design, replaced the footer, and now the content’s laid out in a pair of columns.

There’s still no reliable CSS, so you’re not looking at columns: (no implementations until 2014) nor at display: flex (2010) here. What you’re looking at is… a fixed-width <table> with a single row and three columns! Yes: three – the middle column is only 10 pixels wide and provides the “gap” between the two columns of text.¹⁰

This wasn’t Netscape’s only option, though. Did you ever hear of the <multicol> tag? It was the closest thing the early Web had to a semantically-sound, progressively-enhanced multi-column layout! The author of this page could have written this:

<multicol cols=2 gutter=10 width=301>
  <p>
    Want to create the best possible web site? Join us as we explore the newest
    technologies, discover the coolest tricks, and learn the best secrets for
    designing, building, and maintaining successful web sites.
  </p>
  <p>
    Members of the Netscape web site team, recognized designers, and technical
    experts will share their insights and experiences in Web Site Stories. 
  </p>
</multicol>

That would have given them the exact same effect, but with less code and it would have degraded gracefully. Browsers ignore tags they don’t understand, so a browser without support for <multicol> would have simply rendered the two paragraphs one after the other. Genius!

So why didn’t they? Probably because <multicol> only ever worked in Netscape Navigator.

Introduced in 1996 for version 3.0, this feature was absolutely characteristic of the First Browser War. The two “superpowers”, Netscape and Microsoft, both engaged in unilateral changes to the HTML specification, adding new features and launching them without announcement in order to try to get the upper hand over the other. Both sides would often refuse to implement one-another’s new tags unless they were forced to by widespread adoption by page authors, instead promoting their own competing mechanisms¹¹.

Between adding this new language feature to their browser and writing this page, Netscape’s market share had fallen from around 80% to around 55%, and most of their losses were picked up by IE. Using <multicol> would have made their page look worse in Microsoft’s hot up-and-coming browser, which wouldn’t have helped them persuade more people to download a copy of Navigator and certainly wouldn’t be a good image on a soon-to-launch (any day now!) page about best-practice on the Web! So Netscape’s authors opted for the dominant, cross-platform solution on this page¹².

Anyway, let’s fast-forward a bit and see this project finally leave its “under construction” phase and launch!

Oh. It’s gone.

Sometime between October 1997 and February 1998 the long promised “Web Site Stories” section of Netscape Columns quietly disappeared from the website. Presumably, it never published a single article, instead remaining a perpetual “Coming Soon” page right up until the day it was deleted.

I’m not sure if there’s a better metaphor for Netscape’s general demeanour in 1998 – the year in which they finally ceased to be the dominant market leader in web browsers – than the quiet deletion of a page about how Netscape customers are making the best of the Web. This page might not have been important, or significant, or even completed, but its disappearance may represent Netscape’s refocus on trying to stay relevant in the face of existential threat.

Of course, Microsoft won the First Browser War. They did so by pouring a fortune’s worth of developer effort into staying technologically one-step ahead, refusing to adopt standards proposed by their rival, and their unprecedented decision to give away their browser for free¹³.

¹ Yes, we used to write “Web sites” as two words. We also used to consistently capitalise the words Web and Internet. Some of us still do so.

² In case it’s not clear, this blog post is going to be as much about little-known and archaic Web design techniques as it is about Netscape’s website.

³ This is a white lie. CSS was first proposed almost at the same time as the Web! Microsoft Internet Explorer was first to deliver a partial implementation of the initial standard, late in 1996, but Netscape dragged their heels, perhaps in part because they’d originally backed a competing standard called JavaScript Style Sheets (JSSS). JSSS had a lot going for it: if it had enjoyed widespread adoption, for example, we’d have had the equivalent of CSS variables a full twenty years earlier! In any case, back in 1996 you definitely wouldn’t want to rely on CSS support.

⁴ Wondering where the text and link colours come from? <body bgcolor="#ffffff" text="#000000" link="#0000ff" vlink="#ff0000" alink="#ff0000">. Yes really, that’s where we used to put our colours.

⁵ Personally, I really loved the aesthetic Netscape touted when using Times New Roman (or whatever serif font was available on your computer: webfonts weren’t a thing yet) with temporary tweaks to font sizes, and I copied it in some of my own sites. If you look back at my 2018 blog post celebrating two decades of blogging, where I’ve got a screenshot of my blog as it looked circa 1999, you’ll see that I used exactly this technique for the ordinal suffixes on my post dates! On the same post, you’ll see that I somewhat replicated the “feel” of it again in my 2011 design, this time using a stylesheet.

⁶ There’s a whole section of Cameron’s World dedicated to “under construction” banners, and that’s a beautiful thing!

⁷ The idea of “garden and stream” is that you publish early and often, refining as you go, in your garden, which can act as an extension of whatever notetaking system you use already, but publish mostly “finished” content to your (chronological) stream. I see an increasing number of IndieWeb bloggers going down this route, but I’m not convinced that it’s for me.

⁸ Another white lie. PHP was released way back in 1995 and even the very first version supported something a lot like server-side includes, using the syntax . But it was a little computationally-intensive to run willy-nilly.

⁹ Server-side imagemaps are enjoying a bit of a renaissance on .onion services, whose visitors often keep JavaScript disabled, to make image-based CAPTCHAs. Simply show the visitor an image and describe the bit you want them to click on, e.g. “the blue pentagon with one side missing”, then compare the coordinates of the pixel they click on to the knowledge of the right answer. Highly-inaccessible, of course, but innovative from a purely-technical perspective.

¹⁰ Nowadays, use of tables for layout – or, indeed, for anything other than tabular data – is very-much frowned upon: it’s often bad for accessibility and responsive design. But back before we had the features granted to us by the modern Web, it was literally the only way to get content to appear side-by-side on a page, and designers got incredibly creative about how they ~~mis~~used tables to lay out content, especially as browsers became more-sophisticated and began to support cells that spanned multiple rows or columns, tables “nested” within one another, and background images.

¹¹ It was a horrible time to be a web developer: having to make hacky workarounds in order to make use of the latest features but still support the widest array of browsers. But I’d still take that over the horrors of rendering engine monoculture!

¹² Or maybe they didn’t even think about it and just copy-pasted from somewhere else on their site. I’m speculating.

¹³ This turned out to be the master-stroke: not only did it partially-extricate Microsoft from their agreement with Spyglass Inc., who licensed their browser engine to Microsoft in exchange for a percentage of sales value, but once Microsoft started bundling Internet Explorer with Windows it meant that virtually every computer came with their browser factory-installed! This strategy kept Microsoft on top until Firefox and Google Chrome kicked-off the Second Browser War in the early 2010s. But that’s another story.

Lightboxes Without JavaScript

Because I like my blog to be as fast, accessible, and resilient, I try not to use JavaScript for anything I don’t have to¹. One example would be my “lightbox”: the way in which images are blown-up if you click on them:

My solution ensures that:

You can click an image and see a full-window popup dialog box containing a larger version of the image.
The larger version of the image isn’t loaded until it’s needed.
You can close the larger version with a close button. You can also use your browser’s back button.
You can click again to download the larger version/use your browser to zoom in further.
You can share/bookmark etc. the URL of a zoomed-in image and the recipient will see the same image (and return to the image, in the right blog post, if they press the close button).
No HTTP round trip is required when opening/closing a lightbox: it’s functionally-instantaneous.²
No JavaScript is used at all.

Visitors can click on images to see a larger version, with a “close” button. No JavaScript needed.

Here’s how it works –

The Markup

<figure id="img3336" aria-describedby="caption-img3336">
  <a href="#lightbox-img3336" role="button">
    <img src="small-image.jpg" alt="Alt text is important." width="640" height="480">
  </a>
  <figcaption id="caption-img3336">
    Here's the caption.
  </figcaption>
</figure>

... (rest of blog post) ...

<dialog id="lightbox-img3336" class="lightbox">
  <a href="large-image.jpg">
    <img src="large-image.jpg" loading="lazy" alt="Alt text is important.">
  </a>
  <a class="close" href="#img3336" title="Close image" role="button">×</a>
</dialog>

The HTML is pretty simple (and I automatically generate it, of course).

For each lightboxed image in a post, a <dialog> for that image is appended to the post. That dialog contains a larger copy of the image (set to loading="lazy" so the browser have to download it until it’s needed), and a “close” button.

The image in the post contains an anchor link to the dialog; the close button in the dialog links back to the image in the post.³ I wrap the lightbox image itself in a link to the full version of the image, which makes it easier for users to zoom in further using their browser’s own tools, if they like.

Even without CSS, this works (albeit with “scrolling” up and down to the larger image). But the clever bit’s yet to come:

The Style

body:has(dialog:target) {
  /* Prevent page scrolling when lightbox open (for browsers that support :has()) */
  position: fixed;
}

a[href^='#lightbox-'] {
  /* Show 'zoom in' cursor over lightboxed images. */
  cursor: zoom-in;
}

.lightbox {
  /* Lightboxes are hidden by-default, but occupy the full screen and top z-index layer when shown. */
  all: unset;
  display: none;
  position: fixed;
  top: 0;
  left: 0;
  width: 100%;
  height: 100%;
  z-index: 2;
  background: #333;
}

.lightbox:target {
  /* If the target of the URL points to the lightbox, it becomes visible. */
  display: flex;
}

.lightbox img {
  /* Images fill the lightbox. */
  object-fit: contain;
  height: 100%;
  width: 100%;
}

/* ... extra CSS for styling the close button etc. ... */

Here’s where the magic happens.

Lightboxes are hidden by default (display: none), but configured to fill the window when shown.

They’re shown by the selector .lightbox:target, which is triggered by the id of the <dialog> being referenced by the anchor part of the URL in your address bar!

Summary

It’s neither the most-elegant nor cleanest solution to the problem, but for me it hits a sweet spot between developer experience and user experience. I’m always disappointed when somebody’s “lightbox” requires some heavyweight third-party JavaScript (often loaded from a CDN), because that seems to be the epitome of the “take what the Web gives you for free, throw it away, and reimplement it badly in JavaScript” antipattern.

There’s things I’ve considered adding to my lightbox. Progressively-enhanced JavaScript that adds extra value and/or uses the Popover API where available, perhaps? View Transitions to animate the image “blowing up” to the larger size, while the full-size image loads in the background? Optimistic preloading when hovering over the image⁴? “Previous/next” image links when lightboxing a gallery? There’s lots of potential to expand it without breaking the core concept here.

I’d also like to take a deeper dive into the accessibility implications of this approach: I think it’s pretty good, but accessibility is a big topic and there’s always more to learn.

Close-up of a champagne-coloured French Bulldog wearing a teal jumper, lying in a basket and looking towards the camera. — In the meantime, why not try out my lightbox by clicking on this picture of my dog (photographed here staring longingly at the bacon sandwich picture above, perhaps).

I hope the idea’s of use to somebody else looking to achieve this kind of thing, too.

¹ Where JavaScript is absolutely necessary, I (a) host it on the same domain, for performance and privacy-respecting reasons, and (b) try to provide a functional alternative that doesn’t require JavaScript, ideally seamlessly.

² In practice, the lightbox images get lazy-loaded, so there can be a short round trip to fetch the image the first time. But after that, it’s instantaneous.

³ The pair – post image and lightbox image – work basically the same way as footnotes, like this one.

⁴ I already do this with links in general using the excellent instant.page.

Making WordPress Fast (The Hard Way)

This isn’t the guide for you

The Internet is full of guides on easily making your WordPress installation run fast. If you’re looking to speed up your WordPress site, you should go read those, not this.

Those guides often boil down to the same old tips:

uninstall unnecessary plugins,
optimise caching (both on the server and, via your headers, on clients/proxies),
resize your images properly and/or ensure WordPress is doing this for you,
use a CDN (and use DNS prefetch hints)¹,
tune your PHP installation so it’s got enough memory, keeps a process alive, etc.,
ensure your server is minifying² and compressing files, and
run it on a faster server/behind a faster connection³

You’ve heard those tips before, right? Today, let’s try something different.

The hard way

This article is for people who aren’t afraid to go tinkering in their WordPress codebase to squeeze a little extra (real world!) performance.

It’s for people whose neverending quest for perfection is already well beyond the point of diminishing returns.

But mostly, it’s for people who want to gawp at me, the freak who actually did this stuff just to make his personal blog a tiny bit nippier without spending an extra penny on hosting.

You shouldn’t use Lighthouse as your only measure of your site’s performance. But it’s still reassuring when you get to see those fireworks!

Don’t start with the hard way. Exhaust all the easy solutions – or at least, make a conscious effort which easy solutions to enact or reject – first. Only if you really want to get into the weeds should you actually try doing the things I propose here. They’re not for most sites, and they’re not the for faint of heart.

Performance is a tradeoff. Every performance improvement costs you something else: time, money, DX, UX, etc. What you choose to trade for performance gains depends on your priority of constituencies, which may differ from mine.⁴

This is not a recipe book. This won’t tell you what code to change or what commands to run. The right answers for your content will be different than the right answers for mine. Also: you shouldn’t change what you don’t understand! But I hope these tips will help you think about what questions you need to ask to make your site blazing fast.

Okay, let’s get started…

1. Backstab the plugins you can’t live without

If there are plugins you can’t remove because you depend upon their functionality, and those plugins inject content (especially JavaScript) on the front-end… backstab them to undermine that functionality.

For example, if you want Jetpack‘s backup and downtime monitoring features, but you don’t want it injecting random <link rel='stylesheet' id='...-jetpack-css' href='...' media='all' />‘s (an extra stylesheet to download and parse) into your pages: find the add_filter hook it uses and remove_filter it in your theme⁵.

Better yet, remove wp_head() from your theme entirely⁶. Now, instead of blocking the hooks you don’t want polluting your <head>, you’re specifically allowing only those you want. You’ll want to take care to get some semi-essential ones like <link rel="canonical" href="...">⁷.

Now most of your plugins are broken, but in exchange, your theme has reclaimed complete control over what gets sent to the user. You can select what content you actually want delivered, and deliver no more than that. It’s harder work for you, but your site becomes so much lighter.

Animated GIF from The Simpsons. Leonard Nimoy says "Well, my work is done here." Barney says "What do you mean? You didn't do anything?" Nimoy laughs and replies "Didn't I?" before disappearing as if transported away by a Star Trek teleporter. — Your site is faster now. It doesn’t work, but it’s quick about it!

2. Throw away 100% of your render-blocking JavaScript (and as much as you can of the rest)

The single biggest bottleneck to the user viewing a modern WordPress website is the JavaScript that needs to be downloaded, compiled, and executed before the page can be rendered. Most of that’s plugins, but even on a nearly-vanilla installation you might find a copy of jQuery (eww!) and some other files.

In step 1 you threw it all away, which is great… but I’m betting you were depending on some of that to make your site work? Let’s put it back, carefully and selectively, while minimising the impact on load time.

That means scripts should be loaded (a) low-down, and/or (b) marked defer (or, better yet, async), so they don’t block page rendering.

If you haven’t already, you might like to View Source on this page. Count my <script> tags. You’ll probably find just two of them: one external file marked async, and a second block right at the bottom.

The inline <script> in my footer.php wraps a single line of PHP: which looks a little like this: <?php echo implode("\n\n", apply_filters( 'danq_footer_js', [] ) ); ?>. For each item in an initially-empty array, it appends to the script tag. When I render anything that requires JavaScript, e.g. for 360° photography, I can just add to that (keyed, to prevent duplicates when viewing an archive page) array. Thus, the relevant script gets added exclusively to the pages where it’s needed, not to the entire site.

The only inline script added to every page loads my service worker, which itself aims to optimise caching as well as providing limited “offline” functionality.

While you’re tweaking your JavaScript anyway, you might like to check that any suitable addEventListeners are set to passive mode. Especially if you’re doing anything with touch or mousewheel events, you can often increase the perceived performance of these interactions by not letting your custom code block the default browser behaviour.

I promise you; most of your blog’s front-end JavaScript is either (a) garbage nobody wants, (b) polyfills for platforms nobody uses, or (c) huge libraries you’ve imported so you can use just one or two functions form them. Trash them.

3. Don’t use a CDN

Wait, what? That’s the opposite of what everybody else recommends. To understand why, you have to think about why people recommend a CDN in the first place. Their reasons are usually threefold:

Proximity
Claim: A CDN delivers content geographically-closer to the user.
Retort: Often true. But in step 4 we’re going to make sure that everything critical comes within the first TCP sliding window anyway, so there’s little benefit, and there’s a cost to that extra DNS lookup and fresh handshake. Edge caching your own content may have value, but for most sites it’ll have a much smaller impact than almost everything else on this list.
Precaching
Claim: A CDN improves the chance resources are precached in the user’s browser.
Retort: Possibly true, especially with fonts (although see step 6) but less than you’d think with JS libraries because there are so many different versions/hosts of each. Yours may well be the only site in the user’s circuit that uses a particular one!
Power
Claim: A CDN has more resources than you and so can better-withstand spikes of traffic.
Retort: Maybe, but they also introduce an additional single-point-of-failure. CDNs aren’t magically immune to downtime nor content-blocking, and if you depend on one you’ve just doubled the number of potential failure points that can make your site instantly useless. Furthermore: in exchange for those resources you’re trading away your users’ privacy and security: if a CDN gets hacked, every site that uses it gets hacked too.

Consider edge-caching your own content only if you think you need it, but ditch jsDeliver, cdnjs, Google Hosted Libraries etc.

Hell: if you can, ditch all JavaScript served from third-parties and slap a Content-Security-Policy: script-src 'self' header on your domain to dramatically reduce the entire attack surface of your site!⁸

4. Reduce your HTML and CSS size to <12kb compressed

There’s a magic number you need to know: 12kb. Because of some complicated but fascinating maths (and depending on how your hosting is configured), it can be significantly faster to initially load a web resource of up to 12kb than it is to load one of, say, 15kb. Also, for the same reason, loading a web resource of much less than 12kb might not be significantly faster than loading one only a little less than 12kb.

Exploit this by:

Making your pages as light as possible⁹, then
Inlining as much essential content as possible (CSS, SVGs, JavaScript etc.) to bring you back up to close-to that magic number again!

$ curl --compressed -so /dev/null -w "%{size_download}\n" https://danq.me/
	

	10416

Note that this is the compressed, over-the-wire size. Last I checked, my homepage weighed-in at about 10.4kb compressed, which includes the entirety of its HTML and CSS, most of its JS, and a couple of its SVG images.

Again, this probably flies in the face of everything you were taught about performance. I’m sure you were told that you should <link> to your stylesheets so that they can be cached across page loads. But it turns out that if you can make your HTML and CSS small enough, the opposite is true and you should inline the stylesheet again: caching styles becomes almost irrelevant if you get all the content in a single round-trip anyway!

For extra credit, consider optimising your homepage’s CSS so it’s even smaller by excluding directives that only apply to non-homepage pages, and vice-versa. Assuming you’re using a preprocessor, this shouldn’t be too hard: at simplest, you can have a homepage.css and main.css, each derived from a set of source files some of which they share (reset/normalisation, typography, colours, whatever) and the rest which is specific only to that part of the site.

Most web pages should fit entirely onto a floppy disk. This one doesn’t, mostly because of all the Simpsons clips, but most should.

Can’t manage to get your HTML and CSS down below the magic number? Then at least ensure that your HTML alone weighs in at <12kb compressed and you’ll still get some of the benefits. If you’ve got the headroom, you can selectively include a <style> block containing only the most-crucial CSS, with a particular focus on any that results in layout shifts (e.g. anything that specifies the height: of otherwise dynamically-sized block elements, or that declares an element position: absolute or position: fixed). These kinds of changes are relatively computationally-expensive because they cause content to re-flow, so provide hints as soon as possible so that the browser can accommodate for them.

5. Make the first load awesome

We don’t really talk about content being “above the fold” like we used to, because the modern Web has such a diverse array of screen sizes and resolutions that doing so doesn’t make much sense.

But if loading your full page is still going to take multiple HTTP requests (scripts, images, fonts, whatever), you should still try to deliver the maximum possible value in the first round-trip. That means:

Making sure all your textual content loads immediately! Unless you’re delivering a huge amount of text, there’s absolutely no excuse for lazy-loading text: it’s usually tiny, compresses well, and it’s fast to parse. It’s also the most-important content of most pages. Get it delivered to the browser so it can be rendered rightaway.
Lazy-loading images that are “expected” to be below the fold, using the proper HTML mechanism for this (never a JavaScript approach).
Reserving space for blocks by sizing images appropriately, e.g. using <img width="..." height="..." ...> or having them load as a background with background-size: cover or contain in a block sized with CSS delivered in the initial payload. This reduces layout shift, which mitigates the need for computationally-expensive content reflows.
If possible (see point 4), move vector images that support basic site functionality, like logos, inline. This might also apply to icons, if they’re “as important” as text content.
Marking everything up with standard semantic HTML. There’s a trend for component-driven design to go much too far, resulting in JavaScript components being used in place of standard elements like links, buttons, and images, resulting in highly-fragile websites: when those scripts fail (or are very slow to load), the page becomes unusable.

6. Reduce your dependence on downloaded fonts

Fonts are lovely and can be an important part of your brand identity, but they can also add a lot of weight to your web pages.

If you’re ready and able to drop your webfonts and appreciate the beauty and flexibility of a system font stack (I get it: I’m not there quite yet!), you can at least make smarter use of your fonts:

Every modern browser supports WOFF2, so you can ditch those chunky old formats you’re clinging onto.
If you’re only using the Latin alphabet, minify your fonts further by dropping the characters you don’t need: tools like Google Webfonts Helper can help with this, as well as making it easier to selfhost fonts from the most-popular library (is a smart idea for the reasons described under point 3, above!). There are tools available to further minify fonts if e.g. you only need the capital letters for your title font or something.
- - Browsers are pretty clever and will work-around it if you make a mistake. Didn’t include an emoji or some obscure mathematical symbol, and then accidentally used them in a post? Browsers will switch to a system font that can fill in the gap, for you.
Make the most-liberal use of the font-display: CSS directive that you can tolerate!
- Don’t use font-display: block, which is functionally the default in most browsers, unless you absolutely have to.
- font-display: fallback is good if you’re too cowardly/think your font is too important for you to try font-display: optional.
- font-display: optional is an excellent choice for body text: if the browser thinks it’s worthwhile to download the font (it might choose not to if the operating system indicates that it’s using a metered or low-bandwidth connection, for example), it’ll try to download it, but it won’t let doing so slow things down too much and it’ll fall-back to whatever backup (system) font you specify.
- font-display: swap is also worth considering: this will render any text immediately, even if the right font hasn’t downloaded yet, with no blocking time whatsoever, and then swap it for the right font when it appears. It’s probably better for headings, because large paragraphs of text can be a little disorienting if they change font while a user is looking at them!

If writing is for nerds, then typography must be doubly-so. But you’ve read this far, so I’m confident that you qualify…

7. Cache pre-compressed static files

It’s possible that by this point you’re saying “if I had to do this much work, I might as well just use a static site generator”. Well good news: that’s what you’re about to do!

Obviously you should make sure all your regular caching improvements (appropriate HTTP headers for caching, a service worker that further improves on that logic based on your content’s update schedule, etc.) first. Again: everything in this guide presupposes that you’ve already done the things that normal people do.

By aggressively caching pre-compressed copies of all your pages, you’re effectively getting the best of both worlds: a website that, for anonymous visitors, is served directly from .html.gz files on a hard disk or even straight from RAM in memcached¹⁰, but which still maintains all the necessary server-side interactivity to allow it to be used as a conventional Web-based CMS (including accepting comments if that’s your jam).

WP Super Cache can do the heavy lifting for you for a filesystem-based solution so long as you put it into “Expert” mode and amending your webserver configuration. I’m using Nginx, so I needed a try_files directive like this:

location / {
try_files /wp-content/cache/supercache/$http_host/$wp_super_cache_path/index-https.html $uri $uri/ /index.php?$args;
}

8. Optimise image formats

I’m sure your favourite performance testing tool has already complained at you about your failure to use the best formats possible when serving images to your users. But how can you fix it?

There are some great plugins for improving your images automatically and/or in bulk – I use EWWW Image Optimizer – but to really make the most of them you’ll want to reconfigure your webserver to detect clients that Accept: image/webp and attempt to dynamically serve them .webp variants, for example. Or if you’re ready to give up on legacy formats and replace all your .pngs with .webps, that’s probably fine too!

Image containing the text — The image you see at `https://danq.me/_q23u/2023/11/dynamic.png` is *probably* an `image/webp`. But if your browser doesn’t support WebP, you’ll get an `image/png` instead!

Assuming you’ve got curl and Imagemagick‘s identify, you can see this in action:

curl -s https://danq.me/_q23u/2023/11/dynamic.png -H "Accept: image/webp" | identify -
(Will give you a WebP image)
curl -s https://danq.me/_q23u/2023/11/dynamic.png -H "Accept: image/png" | identify -
(Will give you a PNG image, even though the URL is the same)

9. Simplify, simplify, simplify

The single biggest impact you can have upon the performance of your WordPress pages is to make them less complex.

I’m not necessarily saying that everybody should follow in my lead and co-publish their WordPress sites on the Gemini protocol. But you’ve got to admit: the simplicity of the Gemini protocol and the associated Gemtext format makes both lightning fast.

Writing my templates and posts so that they’re compatible with CapsulePress helps keep my code necessarily-simple. You don’t have to do that, though, but you should be asking yourself:

Does my DOM need to cascade so deeply? Could I achieve the same with less?
Am I pre-emptively creating content, e.g. adding a hidden <dialog> directly to the markup in the anticipation that it might be triggered later using JavaScript, rather than having that JavaScript run document.createElement the element after the page becomes readable?
Have I created unnecessarily-long chains of CSS selectors¹¹ when what I really want is a simple class name, or perhaps even a semantic element name?

10. Add a Service Worker

A service worker isn’t magic. In particular, it can’t help you with those new visitors hitting your site for the first time¹².

A service worker lets you do smart things on behalf of the user’s network connection, so that by the time they ask for a resource, you already fetched it for them.

But a suitable service worker can do a few things that can help with performance. In particular, you might consider:

Precaching assets that you anticipate they’re likely to need (e.g. if you use different stylesheets for the homepage and other pages, you can preload both so no matter where a user lands they’ve already got the CSS they’ll need for the entire site).
Preloading popular pages like the homepage and recent articles, allowing them to load quickly.
Caching a fallback pages – and other resources as-they’re-accessed – to support a full experience for users even if they (or your site!) disconnect from the Internet (or even embedding “save for offline” functionality!).

Chapters 7 and 8 of Going Offline by Jeremy Keith are especially good for explaining how this can be achieved, and it’s all much easier than everything else I just described.

Anything else?

Did I miss anything? If you’ve got a tip about ramping up WordPress performance that isn’t one of the “typical seven” – probably because it’s too hard to be worthwhile for most people – I’d love to hear it!

¹ You’ll sometimes see guides that suggest that using a CDN is to be recommended specifically because it splits your assets among multiple domains/subdomains, which mitigates browsers’ limitation on the number of files they can download simultaneously. This is terrible advice, because such limitations essentially don’t exist any more, but DNS lookups and TLS handshakes still have a bandwidth and computational cost. There are good things about CDNs, sometimes, but this has not been one of them for some time now.

² I’m not sure why guides keep stressing the importance of minifying code, because by the time you’re compressing them too it’s almost pointless. I guess it’s helpful if your compression fails?

³ “Use a faster server” is a “just throw money/the environment at it” solution. I’d like to think we can do better.

⁴ For my personal blog, I choose to prioritise user experience, privacy, accessibility, resilience, and standards compliance above almost everything else.

⁵ If you prefer to keep your backstab code separate, you can put it in a custom plugin, but you might find that you have to name it something late in the alphabet – I’ve previously used names like zzz-danq-anti-plugin-hacks – to ensure that they load after the plugins whose functionality you intend to unhook: broadly-speaking, WordPress loads plugins in alphabetical order.

⁶ I’ve assumed you’re using a classic, not block, theme. If you’re using a block theme, you get a whole different set of performance challenges to think about. Don’t get me wrong: I love block themes and think they’re a great way to put more people in control of their site’s design! But if you’re at the point where you’re comfortable digging this deep into your site’s PHP code, you probably don’t need that feature anyway, right?

⁷ WordPress is really good at serving functionally-duplicate content, so search engines appreciate it if you declare a proper canonical URL.

⁸ Before you choose to block all third-party JavaScript, you might have to whitelist Google Analytics if you’re the kind of person who doesn’t mind selling their visitor data to the world’s biggest harvester of personal information in exchange for some pretty graphs. I’m not that kind of person.

⁹ You were looking to join me in 512kb club anyway, right?

¹⁰ I’ve experimented with mounting a ramdisk and storing the WP Super Cache directory there, but it didn’t make a huge difference, probably because my files are so small that the parse/render time on the browser side dominates the total cascade, and they’re already being served from an SSD. I imagine in my case memcached would provide similarly-small benefits.

¹¹ I really love the power of CSS preprocessors like Sass, but they do make it deceptively easy to create many more – and longer – selectors than you intended in your final compiled stylesheet.

¹² Tools like Lighthouse usually simulate first-time visitors, which can be a little unfair to sites with great performance for established visitors. But everybody is a first-time visitor at least once (and probably more times, as caches expire or are cleared), so they’re still a metric you should consider.

CapsulePress – Gemini / Spartan / Gopher to WordPress bridge

For a while now, this site has been partially mirrored via the Gemini¹ and Gopher protocols.² Earlier this year I presented hacky versions of the tools I’d used to acieve this (and made people feel nostalgic).

Now I’ve added support for Spartan³ too and, seeing as the implementations shared functionality, I’ve combined all three – Gemini, Spartan, and Gopher – into a single package: CapsulePress.

CapsulePress is a Gemini/Spartan/Gopher to WordPress bridge. It lets you use WordPress as a CMS for any or all of those three non-Web protocols in addition to the Web.

For example, that means that this post is available on all of:

my weblog, at https://danq.me/2023/09/10/capsulepress‎,
my gemlog, at gemini://danq.me/posts/capsulepress,
my spartanlog, at spartan://danq.me/posts/capsulepress, and
my goperhole, at gopher://danq.me/1/posts/capsulepress

It’s also possible to write posts that selectively appear via different media: if I want to put something exclusively on my gemlog, I can, by assigning metadata that tells WordPress to suppress a post but still expose it to CapsulePress. Neat!

90s-style web banners in the style of Netscape ads, saying "Gemini now!", "Spartan now!", and "Gopher now!". Beneath then a wider banner ad promotes CapsulePress v0.1. — Using Gemini and friends in the 2020s make me feel like the dream of the Internet of the nineties and early-naughties is still alive. But with fewer banner ads.

I’ve open-sourced the whole thing under a super-permissive license, so if you want your own WordPress blog to “feed” your Gemlog… now you can. With a few caveats:

It’s hard to use. While not as hacky as the disparate piles of code it replaced, it’s still not the cleanest. To modify it you’ll need a basic comprehension of all three protocols, plus Ruby, SQL, and sysadmin skills.
It’s super opinionated. It’s very much geared towards my use case. It’s improved by the use of templates. but it’s still probably only suitable for this site for the time being, until you make changes.
It’s very-much unfinished. I’ve got a growing to-do list, which should be a good clue that it’s Not Finished. Maybe it never will but. But there’ll be changes yet to come.

Whether or not your WordPress blog makes the jump to Geminispace⁴, I hope you’ll came take a look at mine at one of the URLs linked above, and then continue to explore.

If you’re nostalgic for the interpersonal Internet – or just the idea of it, if you’re too young to remember it… you’ll find it there. (That Internet never actually went away, but it’s harder to find on today’s big Web than it is on lighter protocols.)

¹ Also available via Gemini.

² Also via Finger (but that’s another story).

³ Also available via Spartan.

⁴ Need a browser? I suggest Lagrange.

512kb.club

Proud to say that danq.me is the newest site to be admitted into Kev Quirk‘s 512kb club. Let’s keep the web lean!

How a 2002 standard made 2022 bearable

This is an alternate history of the Web. The premise is true, but the story diverges from our timeline and looks at an alternative “Web that might have been”.

Prehistory

This is the story of P3P, one of the greatest Web standards whose history has been forgotten¹, and how the abject failure of its first versions paved the way for its bright future decades later. But I’m getting ahead of myself…

Drafted in 2002 in the wake of growing concern about the death of privacy on the Internet, P3P 1.0 aimed to make the collection of personally-identifiable data online transparent. Hurrah, right?

Not so much. Its immediate impact was lukewarm to negative: developers couldn’t understand why their cookies were no longer being accepted by Internet Explorer 6, the first browser to implement the standard, and the whole exercise was slated as providing a false sense of security, not stopping actual bad guys, and an attempt to apply a technical solution to a political problem.²

Flowchart showing the negotiation process between a user, browser, and server as the user browses an ecommerce site. The homepage's P3P policy states that it collects IP addresses, which is compatible with the user's preferences. Later, at checkout, the P3P policy states that the user's address will be collected and shared with a courier. The collection is fine according to the user's preferences, but she's asked to be notified if it'll be shared, so the browser notifies the user. The user approves of the policy and asks that this approval is remembered for this site, and the checkout process continues. — Initially, the principle was sound. The specification was weak. The implementation was apalling. But P3P 1.1 *could* have worked well.

Developers are lazy³ and soon converged on the simplest possible solution: add a garbage HTTP header like P3P: CP="See our website for our privacy policy." and your cookies work just fine! Ignore the problem, ignore the proposed solution, just do what gets the project shipped.

Without any meaningful enforcement it also perfectly feasible to, y’know, just lie about how well you treat user data. Seeing the way the wind was blowing, Mozilla dropped support for P3P, and Microsoft’s support – which had always been half-baked and lacked even the most basic user-facing controls or customisation options – languished in obscurity.

For a while, it seemed like P3P was dying. Maybe, in some alternate timeline, it did die: vanishing into nothing like VRML, WAP, and XBAP.

But fortunately for us, we don’t live in that timeline.

Revival

In 2009, the European Union revisited the Privacy and Electronic Communications Directive. The initial regulations, published in 2002, required that Web users be able to opt-out of tracking cookies, but the amendment required that sites ensure that users opted-in.

As-written, this confusing new regulation posed an immediate problem: if a user clicked the button to say “no, I don’t want cookies”, and you didn’t want to ask for their consent again on every page load… you had to give them a cookie (or use some other technique legally-indistinguishable from cookies). Now you’re stuck in an endless cookie-circle.⁴

This, and other factors of informed consent, quickly introduced a new pattern among those websites that were fastest to react to the legislative change:

Screenshot from how-i-experience-web-today.com showing an article mostly-covered by a cookie privacy statement and configuration options, utilising dark patterns to try to discourage users from opting-out of cookies. — The cookie consent banner, with all its confusing language and dark patterns, looked like it was going to become the new normal for web users in the early 2010s. But thankfully, our saviour had been waiting in the wings all along.

Web users rebelled. These ugly overlays felt like a regresssion to a time when popup ads and splash pages were commonplace. “If only,” people cried out, “There were a better way to do this!”

It was Professor Lorie Cranor, one of the original authors of the underloved P3P specification and a respected champion of usable privacy and security, whose rallying cry gave us hope. Her CNET article, “Why the EU Cookie Directive is a solved problem”⁵, inspired a new generation of development on what would become known as P3P 2.0.

While maintaining backwards compatibility, this new standard:

deprecated those horrible XML documents in favour of HTTP headers and <link> tags alone,
removing support for Set-Cookie2: headers, which nobody used anyway, and
added features by which the provenance and purpose of cookies could be stated in a way that dramatically simplified adoption in browsers

Internet Explorer at this point was still used by a majority of Web users. It still supported the older version of the standard, and – as perhaps the greatest gift that the much-maligned browser ever gave us – provided a reference implementation as well as a stepping-stone to wider adoption.

Opera, then Firefox, then “new kid” Chrome each adopted P3P 2.0; Microsoft finally got on board with IE 8 SP 1. Now the latest versions of all the mainstream browsers had a solid implementation⁶ well before the European data protection regulators began fining companies that misused tracking cookies.

Fabricated screenshot from Microsoft Edge, browsing 3r.org.uk: a "privacy" icon in the address bar has been clicked, and the resulting menu says: About 3r.org.uk. Connection is secure (with link for more info). Privacy and Cookies (with link for more info). Cookies (3 cookies in use) - Strictly necessary (2 in use), dropdown menu set to "Default (accept, delete later)"; Optional (1 in use), dropdown menu set to "Accept for this site". Checkbox for "Treat third-party cookies differently?", unchecked. Privacy (link to full policy): Legitimate interest - this site collects username, IP address, technical logs...; Consenmt - this site collects email address, phone number... Button to manage content. Button to "Exercise data rights". — Nowadays, we’ve pretty-well standardised on the address bar being the place where all cookie *and* privacy information and settings are stored. Can you imagine if things had gone any other way?

But where the story of P3P‘s successes shine brightest came in 2016, with the passing of the GDPR. The W3C realised that P3P could simplify both the expression and understanding of privacy policies for users, and formed a group to work on version 2.1. And that’s the version you use today.

When you launch a new service, you probably use one of the many free wizard-driven tools to express your privacy policy and the bases for your data processing, and it spits out a template privacy policy. You need the human-readable version, of course, since the 2020 German court ruling that you cannot rely on a machine-readable privacy policy alone, but the real gem is the P3P: 2.1 header version.

Assuming you don’t have any unusual quirks in your data processing (ask your lawyer!), you can just paste the relevant code into your server configuration and you’re good to go. Site users get a warning if their personal data preferences conflict with your data policies, and can choose how to act: not using your service, choosing which of your features to opt-in or out- of, or – hopefully! – granting an exception to your site (possibly with caveats, such as sandboxing your cookies or clearing them immediately after closing the browser tab).

Sure, what we’ve got isn’t perfect. Sometimes companies outright lie about their use of information or use illicit methods to track user behaviour. There’ll always be bad guys out there. That’s what laws are there to deal with.

But what we’ve got today is so seamless, it’s hard to imagine a world in which we somehow all… collectively decided that the correct solution to the privacy problem might have been to throw endless popovers into users’ faces, bury consent-based choices under dark patterns, and make humans do the work that should from the outset have been done by machines. What a strange and terrible timeline that would have been.

¹ If you know P3P‘s history, regardless of what timeline you’re in: congratulations! You win One Internet Point.

² Techbros have been trying to solve political problems using technology since long before the word “techbro” was used in its current context. See also: (a) there aren’t enough mental health professionals, let’s make an AI app? (b) we don’t have enough ventilators for this pandemic, let’s 3D print air pumps? (c) banks keep failing, let’s make a cryptocurrency? (d) we need less carbon in the atmosphere or we’re going to go extinct, better hope direct carbon capture tech pans out eh? (e) we have any problem at all, lets somehow shoehorn blockchain into some far-fetched idea about how to solve it without me having to get out of my chair why not?

³ Note to self: find a citation for this when you can be bothered.

⁴ I can’t decide whether “endless cookie circle” is the name of the New Wave band I want to form, or a description of the way I want to eventually die. Perhaps both.

⁵ Link missing. Did I jump timelines?

⁶ Implementation details varied, but that’s part of the joy of the Web. Firefox favoured “conservative” defaults; Chrome and IE had “permissive” ones; and Opera provided an ultra-configrable matrix of options by which a user could specify exactly which kinds of cookies to accept, linked to which kinds of personal data, from which sites, all somehow backed by an extended regular expression parser that was only truly understood by three people, two of whom were Opera developers.

The Page With No Code

It all started when I saw no-ht.ml, Terence Eden‘s hilarious response to Salma Alam-Naylor‘s excellent HTML is all you need to make a website. The latter is an argument against both the silly amount of JavaScript with which websites routinely burden their users, but also even against depending on CSS. As a fan of CSS Naked Day and a firm believer in using JS only for progressive enhancement, I’m obviously in favour.

Screenshot showing Terence Eden's no-ht.ml website, which uses plain text ASCII/Unicode art to argue that you don't need HTML. — Obviously no-ht.ml is to be taken as tongue-in-cheek, but as you’re about to see: it caught my interest and got me thinking: how could I go *even further*.

Terence’s site works by delivering a document with a claimed MIME type of text/html, but which contains only the (invalid) “HTML” code <!doctype UNICODE><meta charset="UTF-8"><plaintext> (to work around browsers’ wish to treat the page as HTML). This is followed by a block of UTF-8 plain text making use of spacing and emoji to illustrate and decorate the content. It’s frankly very silly, and I love it.¹

I think it’s possible to go one step further, though, and create a web page with no code whatsoever. That is, one that you can read as if it were a regular web page, but where using View Source or e.g. downloading the page with curl will show you… nothing.

I present: The Page With No Code! (It’ll probably only work if you’re using Firefox, for reasons that will become apparent later.)

Screenshot showing my webpage, "The Page With No Code". Using white text (and some emojis) on a blue gradient background, it describes the same thought process as I describe in this blog post, and invites the reader to "View Source" and see that the page genuinely does appear to have no code. — I’d encourage you to visit *The Page With No Code*, use View Source to confirm for yourself that it truly has no code, and see if you can work out for yourself how it manages this feat… before coming back here for an explanation. Again: probably Firefox-only.

Once you’ve had a look for yourself and had a chance to form an opinion, here’s an explanation of the black magic that makes this atrocity possible:

The page is blank. It’s delivered with Content-Type: text/html. Your browser interprets a completely-blank page as faulty and corrects it to a functionally-blank minimal HTML page: <html><head></head><body></body></html>.
<body> and <html> elements can be styled with CSS; this includes the ability to add content: ::before and ::after each element. If only we could load a stylesheet then content injection is possible.
We use the fourth way to inject CSS – a Link: HTTP header – to deliver a CSS payload (this, unfortunately, only works in Firefox). To further obfuscate what’s happening and remove the need for a round-trip, this is encoded as a data: URI.

Screenshot showing HTTP headers returned from a request to the No Code Webpage. A Link: header is highlighted, it contains a data: URL with a base64-encoded CSS stylesheet. — The stylesheet – and all the page content – is right there in the Link: header if you just care to decode it! Observe that while 5.84kB of data are transferred, the browser rightly states that the page is zero bytes in size.

This is one of the most disgusting things I’ve ever coded, and that’s saying a lot. I’m so proud of myself. You can view the code I used to generate this awful thing on Github.

My server-side implementation of this broke in 2023 after I upgraded Nginx; my new version doesn’t support the super-long Link: header needed to make this hack work, so I’ve updated the page to use the Link: to reference the CSS file rather than embed it via a data URI. It’s not as cool, but it at least means you can still see the page. Thanks to Thomas Bradshaw for pointing out the problem.

¹ My first reaction was “why not just deliver something with Content-Type: text/plain; charset=utf-8 and dispense with the invalid code, but perhaps that’s just me overthinking the non-existent problem.

Spring ’83 Came And Went

Just in time for Robin Sloan to give up on Spring ’83, earlier this month I finally got aroud to launching STS-6 (named for the first mission of the Space Shuttle Challenger in Spring 1983), my experimental Spring ’83 server. It’s been a busy year; I had other things to do. But you might have guessed that something like this had been under my belt when I open-sourced a keygenerator for the protocol the other day.

If you’ve not played with Spring ’83, this post isn’t going to make much sense to you. Sorry.

Introducing STS-6

My server is, as far as I can tell, very different from any others in a few key ways:

It does not allow third-party publishing at all. Some might argue that this undermines the aim of the exercise, but I disagree. My IndieWeb inclinations lead me to favour “self-hosted” content, shared from its owners’ domain. Also: the specification clearly states that a server must implement a denylist… I guess my denylist simply includes all keys that are not specifically permitted.
It’s geared towards dynamic content. My primary board self-publishes whenever I produce a new blog post, listing the most recent blog posts published. I have another half-implemented which shows a summary of the most-recent post, and another which would would simply use a WordPress page as its basis – yes, this was content management, but published over Spring ’83.
It provides helpers to streamline content production. It supports internal references to other boards you control using the format {{board:123}}which are automatically converted to addresses referencing the public key of the “current” keypair for that board. This separates the concept of a board and its content template from that board’s keypairs, making it easier to link to a board. To put it another way, STS-6 links are self-healing on the server-side (for local boards).
It helps automate content-fitting. Spring ’83 strictly requires a maximum board size of 2,217 bytes. STS-6 can be configured to fit a flexible amount of dynamic content within a template area while respecting that limit. For my posts list board, the number of posts shown is moderated by the size of the resulting board: STS-6 adds more and more links to the board until it’s too big, and then removes one!
It provides “hands-off” key management features. You can pregenerate a list of keys with different validity periods and the server will automatically cycle through them as necessary, implementing and retroactively-modifying <link rel="next"> connections to keep them current.

I’m sure that there are those who would see this as automating something that was beautiful because it was handcrafted; I don’t know whether or not I agree, but had Spring ’83 taken off in a bigger way, it would always only have been a matter of time before somebody tried my approach.

From a design perspective, I enjoyed optimising an SVG image of my header so it could meaningfully fit into the board. It’s pretty, and it’s tolerably lightweight.

If you want to see my server in action, patch this into your favourite Spring ’83 client: https://s83.danq.dev/10c3ff2e8336307b0ac7673b34737b242b80e8aa63ce4ccba182469ea83e0623

A dead end?

Without Robin’s active participation, I feel that Spring ’83 is probably coming to a dead end. It’s been a lot of fun to play with and I’d love to see what ideas the experience of it goes on to inspire next, but in its current form it’s one of those things that’s an interesting toy, but not something that’ll make serious waves.

In his last lab essay Robin already identified many of the key issues with the system (too complicated, no interpersonal-mentions, the challenge of keys-as-identifiers, etc.) and while they’re all solvable without breaking the underlying mechanisms (mentions might be handled by Webmention, perhaps, etc.), I understand the urge to take what was learned from this experiment and use it to help inform the decisions of the next one. Just as John Postel’s Quote of the Day protocol doesn’t see much use any more (although maybe if my finger server could support QotD?) but went on to inspire the direction of many subsequent “call-and-response” protocols, including HTTP, it’s okay if Spring ’83 disappears into obscurity, so long as we can learn what it did well and build upon that.

Meanwhile: if you’re looking for a hot new “like the web but lighter” protocol, you should probably check out Gemini. (Incidentally, you can find me at gemini://danq.me, but that’s something I’ll write about another day…)

.well-known/links in WordPress

Via Jeremy Keith I today discovered Jim Nielsen‘s suggestion for a website’s /.well-known/links to be a place where it can host a JSON-formatted list of all of its outgoing links.

That’s a really useful thing to have in this new age of the web, where Refererer: headers are no-longer commonly passed cross-domain and Google Search no longer provides the link: operator. If you want to know if I’ve ever linked to your site, it’s a bit of a drag to find out.

JSON output from DanQ.me's .well-known/links, showing 1,150 outbound links to en.wikipedia.org domains, for topics like "Bulls and Cows", "Shebang (Unix)", "Imposter syndrome", "Interrobang", and "Blakedown". — To nobody’s surprise whatsoever, I’ve made a so many links to Wikipedia that I might be single-handedly responsible for their PageRank.

So, obviously, I’ve written an implementation for WordPress. It’s really basic right now, but the source code can be found here if you want it. Install it as a plugin and run wp outbound-links to kick it off. It’s fast: it takes 3-5 seconds to parse the entirety of danq.me, and I’ve got somewhere in the region of 5,000 posts to parse.

You can see the results at https://danq.me/.well-known/links – if you’ve ever wondered “has Dan ever linked to my site?”, now you can find the answer.

If this could be useful to you, let’s collaborate on making this into an actually-useful plugin! Otherwise it’ll just languish “as-is”, which is good enough for my purposes.

DNDle (Wordle, but with D&D monster stats)

Don’t have time to read? Just start playing:

Play DNDle

There’s a Wordle clone for everybody

Am I too late to get onto the “making Wordle clones” bandwagon? Probably; there are quite a few now, including:

Hundreds of different languages,
Entirely different word sets (swear words, slang, bird banding codes, posix commands, common passwords…),
Different games in the same style (absurdle plays adversarially like my cheating hangman game, crosswordle involves reverse-engineering a wordle colour grid into a crossword, heardle is like Wordle but sounding out words using the IPA…)
Twists on the idea (try guessing prime numbers, equations, countries, chess openings, chords, or the composition of parties of fantasy adventurers…)
Just plain silly ones (horsle, easy wordle, chortle…)

Screenshot showing a WhatsApp conversation. Somebody shares a Wordle-like "solution" board but it's got six columns, not five. A second person comments "Hang on a minute... that's not Wordle!" — I’m sure that by now all your social feeds are full of people playing Wordle. But the cool nerds are playing something new…

Now, a Wordle clone for D&D players!

But you know what hasn’t been seen before today? A Wordle clone where you have to guess a creature from the Dungeons & Dragons (5e) Monster Manual by putting numeric values into a character sheet (STR, DEX, CON, INT, WIS, CHA):

Screenshot of DNDle, showing two guesses made already. — Just because nobody’s asking for a game doesn’t mean you shouldn’t make it anyway.

What are you waiting for: go give DNDle a try (I pronounce it “dindle”, but you can pronounce it however you like). A new monster appears at 10:00 UTC each day.

And because it’s me, of course it’s open source and works offline.

The boring techy bit

Like Wordle, everything happens in your browser: this is a “backendless” web application.
I’ve used ReefJS for state management, because I wanted something I could throw together quickly but I didn’t want to drown myself (or my players) in a heavyweight monster library. If you’ve not used Reef before, you should give it a go: it’s basically like React but a tenth of the footprint.
A cache-first/background-updating service worker means that it can run completely offline: you can install it to your homescreen in the same way as Wordle, but once you’ve visited it once it can work indefinitely even if you never go online again.
I don’t like to use a buildchain that’s any more-complicated than is absolutely necessary, so the only development dependency is rollup. It resolves my import statements and bundles a single JS file for the browser.

Can I use HTTP Basic Auth in URLs?

Web standards sometimes disappear

Sometimes a web standard disappears quickly at the whim of some company, perhaps to a great deal of complaint (and at least one joke).

But sometimes, they disappear slowly, like this kind of web address:

http://username:password@example.com/somewhere

If you’ve not seen a URL like that before, that’s fine, because the answer to the question “Can I still use HTTP Basic Auth in URLs?” is, I’m afraid: no, you probably can’t.

But by way of a history lesson, let’s go back and look at what these URLs were, why they died out, and how web browsers handle them today. Thanks to Ruth who asked the original question that inspired this post.

Basic authentication

The early Web wasn’t built for authentication. A resource on the Web was theoretically accessible to all of humankind: if you didn’t want it in the public eye, you didn’t put it on the Web! A reliable method wouldn’t become available until the concept of state was provided by Netscape’s invention of HTTP cookies in 1994, and even that wouldn’t see widespread for several years, not least because implementing a CGI (or similar) program to perform authentication was a complex and computationally-expensive option for all but the biggest websites.

Comic showing a conversation between a web browser and server. Browser: "Show me that page. (GET /)" Server: "No, take a ticket and fill this form. (Redirect, Set-Cookie)" Browser: "I've filled your form and here's your ticket (POST request with Cookie set)" Server: "Okay, Keep hold of your ticket. (Redirect, Set-Cookie)" Browser: "Show me that page, here's my ticket. (GET /, Cookie:)" — A simplified view of the form-and-cookie based authentication system used by virtually every website today, but which was too computationally-expensive for many sites in the 1990s.

1996’s HTTP/1.0 specification tried to simplify things, though, with the introduction of the WWW-Authenticate header. The idea was that when a browser tried to access something that required authentication, the server would send a 401 Unauthorized response along with a WWW-Authenticate header explaining how the browser could authenticate itself. Then, the browser would send a fresh request, this time with an Authorization: header attached providing the required credentials. Initially, only “basic authentication” was available, which basically involved sending a username and password in-the-clear unless SSL (HTTPS) was in use, but later, digest authentication and a host of others would appear.

Comic showing conversation between web browser and server. Browser: "Show me that page (GET /)" Server: "No. Send me credentials. (HTTP 401, WWW-Authenticate)" Browser: "Show me that page. I enclose credentials (Authorization)" Server: "Okay (HTTP 200)" — For all its faults, HTTP Basic Authentication (and its near cousins) are certainly elegant.

Webserver software quickly added support for this new feature and as a result web authors who lacked the technical know-how (or permission from the server administrator) to implement more-sophisticated authentication systems could quickly implement HTTP Basic Authentication, often simply by adding a .htaccess file to the relevant directory. .htaccess files would later go on to serve many other purposes, but their original and perhaps best-known purpose – and the one that gives them their name – was access control.

Credentials in the URL

A separate specification, not specific to the Web (but one of Tim Berners-Lee’s most important contributions to it), described the general structure of URLs as follows:

<scheme>://<username>:<password>@<host>:<port>/<url-path>#<fragment>

At the time that specification was written, the Web didn’t have a mechanism for passing usernames and passwords: this general case was intended only to apply to protocols that did have these credentials. An example is given in the specification, and clarified with “An optional user name. Some schemes (e.g., ftp) allow the specification of a user name.”

But once web browsers had WWW-Authenticate, virtually all of them added support for including the username and password in the web address too. This allowed for e.g. hyperlinks with credentials embedded in them, which made for very convenient bookmarks, or partial credentials (e.g. just the username) to be included in a link, with the user being prompted for the password on arrival at the destination. So far, so good.

Comic showing conversation between web browser and server. Browser asks for a page, providing an Authorization: header outright; server responds with the page immediately. — Encoding authentication into the URL provided an incredible shortcut at a time when Web round-trip times were much longer owing to higher latencies and no keep-alives.

This is why we can’t have nice things

The technique fell out of favour as soon as it started being used for nefarious purposes. It didn’t take long for scammers to realise that they could create links like this:

https://YourBank.com@HackersSite.com/

Everything we were teaching users about checking for “https://” followed by the domain name of their bank… was undermined by this user interface choice. The poor victim would actually be connecting to e.g. HackersSite.com, but a quick glance at their address bar would leave them convinced that they were talking to YourBank.com!

Theoretically: widespread adoption of EV certificates coupled with sensible user interface choices (that were never made) could have solved this problem, but a far simpler solution was just to not show usernames in the address bar. Web developers were by now far more excited about forms and cookies for authentication anyway, so browsers started curtailing the “credentials in addresses” feature.

Internet Explorer window showing https://YourBank.com@786590867/ in the address bar. — Users trained to look for “https://” followed by the site they wanted would often fall for scams like this one: the real domain name is *after* the @-sign. (This attacker is also using dword notation to obfuscate their IP address; this dated technique wasn’t often employed alongside this kind of scam, but it’s another historical oddity I enjoy so I’m shoehorning it in.)

(There are other reasons this particular implementation of HTTP Basic Authentication was less-than-ideal, but this reason is the big one that explains why things had to change.)

One by one, browsers made the change. But here’s the interesting bit: the browsers didn’t always make the change in the same way.

How different browsers handle basic authentication in URLs

Let’s examine some popular browsers. To run these tests I threw together a tiny web application that outputs the Authorization: header passed to it, if present, and can optionally send a 401 Unauthorized response along with a WWW-Authenticate: Basic realm="Test Site" header in order to trigger basic authentication. Why both? So that I can test not only how browsers handle URLs containing credentials when an authentication request is received, but how they handle them when one is not. This is relevant because some addresses – often API endpoints – have optional HTTP authentication, and it’s sometimes important for a user agent (albeit typically a library or command-line one) to pass credentials without first being prompted.

In each case, I tried each of the following tests in a fresh browser instance:

Go to http://<username>:<password>@<domain>/optional (authentication is optional).
Go to http://<username>:<password>@<domain>/mandatory (authentication is mandatory).
Experiment 1, then f0llow relative hyperlinks (which should correctly retain the credentials) to /mandatory.
Experiment 2, then follow relative hyperlinks to the /optional.

I’m only testing over the http scheme, because I’ve no reason to believe that any of the browsers under test treat the https scheme differently.

Chromium desktop family

Chrome 93 and Edge 93 both immediately suppressed the username and password from the address bar, along with the “http://” as we’ve come to expect of them. Like the “http://”, though, the plaintext username and password are still there. You can retrieve them by copy-pasting the entire address.

Opera 78 similarly suppressed the username, password, and scheme, but didn’t retain the username and password in a way that could be copy-pasted out.

Authentication was passed only when landing on a “mandatory” page; never when landing on an “optional” page. Refreshing the page or re-entering the address with its credentials did not change this.

Navigating from the “optional” page to the “mandatory” page using only relative links retained the username and password and submitted it to the server when it became mandatory, even Opera which didn’t initially appear to retain the credentials at all.

Navigating from the “mandatory” to the “optional” page using only relative links, or even entering the “optional” page address with credentials after visiting the “mandatory” page, does not result in authentication being passed to the “optional” page. However, it’s interesting to note that once authentication has occurred on a mandatory page, pressing enter at the end of the address bar on the optional page, with credentials in the address bar (whether visible or hidden from the user) does result in the credentials being passed to the optional page! They continue to be passed on each subsequent load of the “optional” page until the browsing session is ended.

Firefox desktop

Firefox 91 does a clever thing very much in-line with its image as a browser that puts decision-making authority into the hands of its user. When going to the “optional” page first it presents a dialog, warning the user that they’re going to a site that does not specifically request a username, but they’re providing one anyway. If the user says that no, navigation ceases (the GET request for the page takes place the same either way; this happens before the dialog appears). Strangely: regardless of whether the user selects yes or no, the credentials are not passed on the “optional” page. The credentials (although not the “http://”) appear in the address bar while the user makes their decision.

Similar to Opera, the credentials do not appear in the address bar thereafter, but they’re clearly still being stored: if the refresh button is pressed the dialog appears again. It does not appear if the user selects the address bar and presses enter.

Similarly, going to the “mandatory” page in Firefox results in an informative dialog warning the user that credentials are being passed. I like this approach: not only does it help protect the user from the use of authentication as a tracking technique (an old technique that I’ve not seen used in well over a decade, mind), it also helps the user be sure that they’re logging in using the account they mean to, when following a link for that purpose. Again, clicking cancel stops navigation, although the initial request (with no credentials) and the 401 response has already occurred.

Visiting any page within the scope of the realm of the authentication after visiting the “mandatory” page results in credentials being sent, whether or not they’re included in the address. This is probably the most-true implementation to the expectations of the standard that I’ve found in a modern graphical browser.

Safari desktop

Safari 14 never displays or uses credentials provided via the web address, whether or not authentication is mandatory. Mandatory authentication is always met by a pop-up dialog, even if credentials were provided in the address bar. Boo!

Once passed, credentials are later provided automatically to other addresses within the same realm (i.e. optional pages).

Older browsers

Let’s try some older browsers.

From version 7 onwards – right up to the final version 11 – Internet Explorer fails to even recognise addresses with authentication credentials in as legitimate web addresses, regardless of whether or not authentication is requested by the server. It’s easy to assume that this is yet another missing feature in the browser we all love to hate, but it’s interesting to note that credentials-in-addresses is permitted for ftp:// URLs…

…and if you go back a little way, Internet Explorer 6 and below supported credentials in the address bar pretty much as you’d expect based on the standard. The error message seen in IE7 and above is a deliberate design decision, albeit a somewhat knee-jerk reaction to the security issues posed by the feature (compare to the more-careful approach of other browsers).

These older versions of IE even (correctly) retain the credentials through relative hyperlinks, allowing them to be passed when they become mandatory. They’re not passed on optional pages unless a mandatory page within the same realm has already been encountered.

Pre-Mozilla Netscape behaved the same way. Truly this was the de facto standard for a long period on the Web, and the varied approaches we see today are the anomaly. That’s a strange observation to make, considering how much the Web of the 1990s was dominated by incompatible implementations of different Web features (I’ve written about the <blink> and <marquee> tags before, which was perhaps the most-visible division between the Microsoft and Netscape camps, but there were many, many more).

Interestingly: by Netscape 7.2 the browser’s behaviour had evolved to be the same as modern Firefox’s, except that it still displayed the credentials in the address bar for all to see.

Now here’s a real gem: pre-Chromium Opera. It would send credentials to “mandatory” pages and remember them for the duration of the browsing session, which is great. But it would also send credentials when passed in a web address to “optional” pages. However, it wouldn’t remember them on optional pages unless they remained in the address bar: this feels to me like an optimum balance of features for power users. Plus, it’s one of very few browsers that permitted you to change credentials mid-session: just by changing them in the address bar! Most other browsers, even to this day, ignore changes to HTTP Authentication credentials, which was sometimes be a source of frustration back in the day.

Finally, classic Opera was the only browser I’ve seen to mask the password in the address bar, turning it into a series of asterisks. This ensures the user knows that a password was used, but does not leak any sensitive information to shoulder-surfers (the length of the “masked” password was always the same length, too, so it didn’t even leak the length of the password). Altogether a spectacular design and a great example of why classic Opera was way ahead of its time.

The Command-Line

Most people using web addresses with credentials embedded within them nowadays are probably working with code, APIs, or the command line, so it’s unsurprising to see that this is where the most “traditional” standards-compliance is found.

I was unsurprised to discover that giving curl a username and password in the URL meant that username and password was sent to the server (using Basic authentication, of course, if no authentication was requested):

$ curl http://alpha:beta@localhost/optional Header: Basic YWxwaGE6YmV0YQ== $ curl http://alpha:beta@localhost/mandatory Header: Basic YWxwaGE6YmV0YQ==

However, wget did catch me out. Hitting the same addresses with wget didn’t result in the credentials being sent except where it was mandatory (i.e. where a HTTP 401 response and a WWW-Authenticate: header was received on the initial attempt). To force wget to send credentials when they haven’t been asked-for requires the use of the --http-user and --http-password switches:

$ wget http://alpha:beta@localhost/optional -qO- Header: $ wget http://alpha:beta@localhost/mandatory -qO- Header: Basic YWxwaGE6YmV0YQ==

lynx does a cute and clever thing. Like most modern browsers, it does not submit credentials unless specifically requested, but if they’re in the address bar when they become mandatory (e.g. because of following relative hyperlinks or hyperlinks containing credentials) it prompts for the username and password, but pre-fills the form with the details from the URL. Nice.

What’s the status of HTTP (Basic) Authentication?

HTTP Basic Authentication and its close cousin Digest Authentication (which overcomes some of the security limitations of running Basic Authentication over an unencrypted connection) is very much alive, but its use in hyperlinks can’t be relied upon: some browsers (e.g. IE, Safari) completely munge such links while others don’t behave as you might expect. Other mechanisms like Bearer see widespread use in APIs, but nowhere else.

The WWW-Authenticate: and Authorization: headers are, in some ways, an example of the best possible way to implement authentication on the Web: as an underlying standard independent of support for forms (and, increasingly, Javascript), cookies, and complex multi-part conversations. It’s easy to imagine an alternative timeline where these standards continued to be collaboratively developed and maintained and their shortfalls – e.g. not being able to easily log out when using most graphical browsers! – were overcome. A timeline in which one might write a login form like this, knowing that your e.g. “authenticate” attributes would instruct the browser to send credentials using an Authorization: header:

<form method="get" action="/" authenticate="Basic"> <label for="username">Username:</label> <input type="text" id="username" authenticate="username"> <label for="password">Password:</label> <input type="text" id="password" authenticate="password"> <input type="submit" value="Log In"> </form>

In such a world, more-complex authentication strategies (e.g. multi-factor authentication) could involve encoding forms as JSON. And single-sign-on systems would simply involve the browser collecting a token from the authentication provider and passing it on to the third-party service, directly through browser headers, with no need for backwards-and-forwards redirects with stacks of information in GET parameters as is the case today. Client-side certificates – long a powerful but neglected authentication mechanism in their own right – could act as first class citizens directly alongside such a system, providing transparent second-factor authentication wherever it was required. You wouldn’t have to accept a tracking cookie from a site in order to log in (or stay logged in), and if your browser-integrated password safe supported it you could log on and off from any site simply by toggling that account’s “switch”, without even visiting the site: all you’d be changing is whether or not your credentials would be sent when the time came.

The Web has long been on a constant push for the next new shiny thing, and that’s sometimes meant that established standards have been neglected prematurely or have failed to evolve for longer than we’d have liked. Consider how long it took us to get the <video> and <audio> elements because the “new shiny” Flash came to dominate, how the Web Payments API is only just beginning to mature despite over 25 years of ecommerce on the Web, or how we still can’t use Link: headers for all the things we can use <link> elements for despite them being semantically-equivalent!

The new model for Web features seems to be that new features first come from a popular JavaScript implementation, and then eventually it evolves into a native browser feature: for example HTML form validations, which for the longest time could only be done client-side using scripting languages. I’d love to see somebody re-think HTTP Authentication in this way, but sadly we’ll never get a 100% solution in JavaScript alone: (distributed SSO is almost certainly off the table, for example, owing to cross-domain limitations).

Or maybe it’s just a problem that’s waiting for somebody cleverer than I to come and solve it. Want to give it a go?