It all started when I saw no-ht.ml, Terence Eden’s hilarious response to Salma
Alam-Naylor’s excellent HTML is all you need to make a website. The latter is an
argument not only against the silly amount of JavaScript with which websites routinely burden their users, but even against depending on CSS. As a fan of CSS Naked Day and a firm
believer in using JS only for progressive enhancement, I’m obviously in favour.
Obviously no-ht.ml is to be taken as tongue-in-cheek, but as you’re about to see, it caught my interest and got me thinking: how could I go even further?
Terence’s site works by delivering a document with a
claimed MIME type of text/html, but which contains only the (invalid) “HTML” code
<!doctype UNICODE><meta charset="UTF-8"><plaintext> (to work around browsers’ wish to treat the page as HTML). This is followed by a block of UTF-8 plain text making use of spacing
and emoji to illustrate and decorate the content. It’s frankly very silly, and I love it.1
I think it’s possible to go one step further, though, and create a web page with no code whatsoever. That is, one that you can read as if it were a regular web page, but where
using View Source or e.g. downloading the page with curl will show you… nothing.
I present: The Page With No Code! (It’ll probably only work if you’re using Firefox, for reasons that will become apparent later.)
I’d encourage you to visit The Page With No Code, use View Source to confirm for yourself that it truly has no code, and see if you can work out for yourself how it manages
this feat… before coming back here for an explanation. Again: probably Firefox-only.
Once you’ve had a look for yourself and had a chance to form an opinion, here’s an explanation of the black magic that makes this atrocity possible:
The page is blank. It’s delivered with Content-Type: text/html. Your browser interprets a completely-blank page as faulty and corrects it to a functionally-blank
minimal HTML page: <html><head></head><body></body></html>.
<body> and <html> elements can be styled with CSS; this includes the ability to add content ::before and ::after each
element. If only we could load a stylesheet, content injection would be possible.
We use the fourth way to inject
CSS – a Link: HTTP header – to deliver a CSS payload (this, unfortunately, only works in Firefox). To further obfuscate what’s happening and remove the need for a round-trip, this is encoded
as a data: URI.
The stylesheet – and all the page content – is right there in the Link: header if you just care to decode it! Observe that while 5.84kB of
data are transferred, the browser rightly states that the page is zero bytes in size.
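For the curious: decoded, that Link: header’s payload is nothing more exotic than ordinary CSS along these lines (a simplified sketch to show the mechanism, not the page’s actual stylesheet):
/* Delivered via a response header shaped like
   Link: <data:text/css;base64,...>; rel="stylesheet"
   and applied to the <html>/<body> that the browser invents for the blank page: */
html::before {
  content: "The Page With No Code\A\A Every visible word is injected by this stylesheet…";
  white-space: pre-wrap;  /* make the \A escapes behave as line breaks */
  display: block;
  font: 1.2em/1.5 sans-serif;
  max-width: 40em;
  margin: 2em auto;
}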
My server-side implementation of this broke in 2023 after I upgraded Nginx; the new version doesn’t support the super-long Link: header needed
to make this hack work, so I’ve updated the page to use the Link: header to reference the CSS file rather than embedding it as a data: URI. It’s not as cool, but it at least means you can
still see the page. Thanks to Thomas Bradshaw for pointing out the problem.
Footnotes
1 My first reaction was “why not just deliver something with Content-Type:
text/plain; charset=utf-8 and dispense with the invalid code?”, but perhaps that’s just me overthinking the non-existent problem.
This is a reply to a post published elsewhere. Its content might be duplicated as a traditional comment at the original source.
In his blog post “The ethics of syndicating comments using WebMentions”, Terence Eden said:
…
I want to see what people are writing in public about my posts. I also want to direct people to the conversations which are happening elsewhere on the web. But people – quite
rightly – might not want their content permanently stored by my site.
So I think I have a few options.
Do nothing. My site; my rules. If you don’t want me to grab your hot takes, don’t post them in public. (Feels a bit rude, TBQH.)
Be reactive. If someone asks me to remove their content, do so. (But, of course, how will they know I’ve made a copy?)
Stop syndicating comments. (I don’t wanna!)
Replace the verbatim comments with a link saying “Fred mentioned this article on Twitter”. (A bit of a disruptive experience for readers.)
Use oEmbed to capture the user’s comment and dynamically load it from the 3rd party site. That would update automatically if the user changes their name or deleted the
comment. (A massive faff to set up.)
…
Terence describes a problem that I’ve wrestled with myself. If somebody comments directly on my blog using the form at the bottom of a post, that’s a pretty strong indicator of
them giving their consent for their comment to be published at the bottom of that post (at my discretion). If somebody publicly replies somewhere my post is syndicated, that’s
less-obvious, but still pretty clear. If somebody merely mentions my post publicly, writing their own post and linking to mine… that’s a real fuzzy area.
I take a minimal approach: I only capture their full content if it’s short, and otherwise try to extract a snippet that contains the bit that mentioned my content. I think that
works great. But Terence points out an important follow-up: what if the commenter deletes that content?
My approach so far has always been a reactive one – the second in Terence’s list – and I think it’s a morally-acceptable stance for a personal blogger. But I’m not sure it scales. I
find myself asking: what if a news outlet did this, taking my self-published feedback to their story and publishing it on their site, even if I later amended, retracted, or deleted it
from my own site? If somebody’s making money out of my content, that feels different: I’ve always been clear that what I write on my blog is permissively-licensed, but that permissiveness is predicated on the prohibition of commercial use of
my content.
Perhaps down the line this can be solved technologically: something machine-readable akin to the <link rel="license" ...> tag could state an author’s preference for
how their content is syndicated by third parties they’ve mentioned, answering questions like:
Can you quote me, or just link to me? Who do these rules apply to? (Should we be attaching metadata to individual links?)
Should you inform me that you’ve done so, and if so: how (WebMention, etc.)?
If you (or your site) observe that my content has disappeared or changed for an extended time, should that be taken as revocation of consent to syndicate it?
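Purely as a sketch of the kind of machine-readable hint I mean – the rel value and attributes below are entirely made up – it might look something like this:
<!-- Hypothetical markup: no browser, parser, or standard understands any of this today. -->
<link rel="syndication-policy"
      data-quote="snippet-only"
      data-notify="webmention"
      data-if-source-removed="withdraw-consent">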
Right now, the relevant technologies are not well-established enough to even begin this kind of work, but if a modern interconnected federated web of personal websites takes off, it’s
the kind of question we might one day have to answer.
For now my gut feeling is that option #2 (reactive moderation of syndicated comments) is ethically-sufficient for personal websites. But I’ll be watching the feedback Terence (who
probably gets many more readers than I) receives in case my gut doesn’t represent the majority!
103: Early Hints (“I’m not sure this can last forever.”)
300: Multiple Choices (“There are so many ways I can do better than you.”)
303: See Other (“You should date other people.”)
304: Not Modified (“With you, I feel like I’m stagnating.”)
402: Payment Required (“I am a prostitute.”)
403: Forbidden (“You don’t get this any more.”)
406: Not Acceptable (“I could never introduce you to my parents.”)
408: Request Timeout (“You keep saying you’ll propose but you never do.”)
409: Conflict (“We hate each other.”)
410: Gone (ghosted)
411: Length Required (“Your penis is too small.”)
413: Payload Too Large (“Your penis is too big.”)
416: Range Not Satisfied (“Our sex life is boring and repetitive.”)
425: Too Early (“Your premature ejaculation is a problem.”)
428: Precondition Failed (“You’re still sleeping with your ex-!?”)
429: Too Many Requests (“You’re so demanding!”)
451: Unavailable for Legal Reasons (“I’m married to somebody else.”)
502: Bad Gateway (“Your pussy is awful.”)
508: Loop Detected (“We just keep fighting.”)
With thanks to Ruth for the conversation that inspired these pictures, and apologies to the rest of the Internet for creating them.
I’m off work sick today: it’s just a cold, but it’s had a damn good go at wrecking my lungs and I feel pretty lousy. You know how it is when you’ve got too much brain-fog to trust
yourself with production systems but you still want to write code (or is that just me?). So this morning I threw together a really,
really stupid project which you can play online here.
It’s a board game. Well, the digital edition of one. Also, it’s not very good.
It’s inspired by a toot by Mason “Tailsteak” Williams (whom I’ve mentioned before once or
twice). At first I thought I’d try to calculate the odds of winning at his proposed game, or how many times one might expect to play before winning,
but I haven’t the brainpower for that in my snot-addled brain. So instead I threw together a terrible, terrible digital implementation.
Go play it if, like me, you’ve got nothing smarter that your brain can be doing today.
Just in time for Robin Sloan to give up on Spring ’83, earlier this month I finally got around to launching STS-6 (named for the first mission of the Space Shuttle Challenger in Spring 1983), my
experimental Spring ’83 server. It’s been a busy year; I had other things to do. But you might have guessed that something like this was in the works when I open-sourced a key generator for the protocol the other day.
If you’ve not played with Spring ’83, this post isn’t going to make much sense to you. Sorry.
My server is, as far as I can tell, very different from any others in a few key ways:
It does not allow third-party publishing at all. Some might argue that this undermines the aim of the exercise, but I disagree. My IndieWeb inclinations lead me to
favour “self-hosted” content, shared from its owners’ domain. Also: the specification clearly states that a server must implement a denylist… I guess my
denylist simply includes all keys that are not specifically permitted.
It’s geared towards dynamic content. My primary board self-publishes whenever I produce a new blog post, listing the most recent
blog posts published. I have another, half-implemented, which shows a summary of the most-recent post, and another which would simply use a WordPress page as its basis – yes, this
would be content management, but published over Spring ’83.
It provides helpers to streamline content production. It supports internal references to other boards you control using the format {{board:123}} which are
automatically converted to addresses referencing the public key of the “current” keypair for that board. This separates the concept of a board and its content template from that
board’s keypairs, making it easier to link to a board. To put it another way, STS-6 links are self-healing on the server-side (for local boards).
It helps automate content-fitting. Spring ’83 strictly requires a maximum board size of 2,217 bytes. STS-6 can be configured to fit a flexible amount of dynamic
content within a template area while respecting that limit. For my posts list board, the number of posts shown is moderated by the size of the resulting board: STS-6 adds more and
more links to the board until it’s too big, and then removes one!
It provides “hands-off” key management features. You can pregenerate a list of keys with different validity periods and the server will automatically cycle through
them as necessary, implementing and retroactively-modifying <link rel="next"> connections to keep them current.
I’m sure that there are those who would see this as automating something that was beautiful because it was handcrafted; I don’t know whether or not I agree, but had Spring ’83
taken off in a bigger way, it would always only have been a matter of time before somebody tried my approach.
From a design perspective, I enjoyed optimising an SVG image of my header so it could meaningfully fit into the board. It’s
pretty, and it’s tolerably lightweight.
If you want to see my server in action, patch this into your favourite Spring ’83 client:
https://s83.danq.dev/10c3ff2e8336307b0ac7673b34737b242b80e8aa63ce4ccba182469ea83e0623
A dead end?
Without Robin’s active participation, I feel that Spring ’83 is probably coming to a dead end. It’s been a lot of fun to play with and I’d love to see what ideas the experience of it
goes on to inspire next, but in its current form it’s one of those things that’s an interesting toy, but not something that’ll make serious waves.
In his last lab essay Robin already identified many of the key issues with the system (too complicated, no interpersonal-mentions, the challenge of keys-as-identifiers, etc.) and while
they’re all solvable without breaking the underlying mechanisms (mentions might be handled by Webmention, perhaps, etc.), I
understand the urge to take what was learned from this experiment and use it to help inform the decisions of the next one. Just as John Postel’s Quote of the Day protocol doesn’t see much use any more (although maybe if my
finger server could support QotD?) but went on to inspire the direction of many subsequent “call-and-response” protocols,
including HTTP, it’s okay if Spring ’83 disappears into obscurity, so long as we can learn what it did
well and build upon that.
Meanwhile: if you’re looking for a hot new “like the web but lighter” protocol, you should probably check out Gemini. (Incidentally, you
can find me at gemini://danq.me, but that’s something I’ll write about another day…)
This weekend I was experimentally reimplementing how my blog displays comments. For testing I needed to find an old post with both trackbacks and pingbacks on it. I found my post that you linked, here, and was delighted to be reminded that despite both of our blogs changing domain name (from photomatt.net to ma.tt
and from blog.scatmania.org to danq.me, respectively), all the links back and forth still work perfectly because clearly we share an appropriate dedication to the principle that
Cool URIs Don’t Change, and set up our redirects accordingly. 🙌
Incidentally, this was about the point in time at which I first thought to myself “hey, I like what Matt’s doing with this Automattic thing; I should work there someday”. It took me
like a decade to a decade-and-a-half to get around to applying, though… 😅
Anyway: thanks for keeping your URIs cool so I could enjoy this trip down memory lane (and debug an experimental wp_list_comments callback!).
That’s a really useful thing to have in this new age of the web, where Referer: headers are no-longer commonly passed cross-domain and Google Search no longer provides the link: operator. If you want to know if I’ve ever
linked to your site, it’s a bit of a drag to find out.
To nobody’s surprise whatsoever, I’ve made so many links to Wikipedia that I might be single-handedly responsible for their PageRank.
So, obviously, I’ve written an implementation for WordPress. It’s really basic right now, but the source code can be
found here if you want it. Install it as a plugin and run wp outbound-links to kick it off. It’s fast: it takes 3-5 seconds to parse the entirety of danq.me,
and I’ve got somewhere in the region of 5,000 posts to parse.
You can see the results at https://danq.me/.well-known/links – if you’ve ever wondered “has Dan ever linked to my site?”, now you can find the
answer.
If this could be useful to you, let’s collaborate on making this into an actually-useful plugin! Otherwise it’ll just languish “as-is”, which is good enough for my purposes.
[Nilay:] It is fashionable to run around saying the web is dead and that apps shape the world, but in my mind, the web’s pretty healthy for at least two things: news
and shopping.
[Matt:] I think that’s your bubble, if I’m totally honest. That’s what’s cool about the web: We can live in a bubble and that can seem like the whole thing. One thing I would
explicitly try to do in 2022 is make the web weirder.
…
The Verge interviewed Matt Mullenweg, and – as both an Automattician and a fan of the Web as
a place for fun and weirdness – I really appreciated the direction the interview went in. I maintain that open web standards and platforms (as opposed to closed social media silos)
are inspirational and innovative.
Emilie Reed‘s Anything a Maze lives on itch.io, and (outside of selfhosting) that’s
clearly the best place for it: you couldn’t tell that story the same way on Medium; even less-so on Facebook or Twitter.
Suppose you’re running an application on a Passenger + Nginx powered server and you want to add caching.
Perhaps your application has a dynamic, public endpoint but the contents don’t change super-frequently or it isn’t critically-important that the user always gets up-to-the-second
accuracy, and you’d like to improve performance with microcaching. How would you do that?
Where you’re at
Not pictured: the rest of the Internet.
Your configuration might look something like this:
server {
    # listen, server_name, ssl, logging etc. directives go here
    # ...
    root /your/application;
    passenger_enabled on;
}
What you’re looking for is proxy_cache and its sister directives, but you can’t just
insert them here because while Passenger does act like an upstream proxy (with parallels like e.g. passenger_pass_header which mirrors the behaviour of proxy_pass_header), it doesn’t provide any of the functions you need to implement proxy caching
of non-static files.
Where you need to be
Instead, what you need to do is define a second server, mount Passenger in that, and then proxy to that second server. E.g.:
# Set up a cache
proxy_cache_path /tmp/cache/my-app-cache keys_zone=MyAppCache:10m levels=1:2 inactive=600s max_size=100m;

# Define the actual webserver that listens for Internet traffic:
server {
    # listen, server_name, ssl, logging etc. directives go here
    # ...

    # You can configure different rules by location etc., but here's a simple microcache:
    location / {
        proxy_pass http://127.0.0.1:4863;  # Proxy all traffic to the application server defined below
        proxy_cache MyAppCache;            # Use the cache defined above
        proxy_cache_valid 200 3s;          # Treat HTTP 200 responses as valid; cache them for 3 seconds
        proxy_cache_use_stale updating;    # (Optional) send outdated response while background-updating cache
        proxy_cache_lock on;               # (Optional) only allow one process to update cache at once
    }
}

# (Local-only) application server on an arbitrary port number to act as the upstream proxy:
server {
    listen 127.0.0.1:4863;
    root /your/application;
    passenger_enabled on;
}
The two key changes are:
Passenger moves to a second server block, localhost-only, on an arbitrary port number (doesn’t need HTTPS, of course, but if your application detects/”expects” HTTPS you
might need to tweak your headers).
Your main server block proxies to the second as its upstream, and you can add whatever caching directives you like.
Obviously you’ll need to be smarter if you host a mixture of public and private content (e.g. send Vary: headers from your application) and if you want different cache
durations on different addresses or types of content, but there are already great guides to help with that. I only wrote this post because I spent some time searching for (nonexistent!)
passenger_cache_ etc. rules and wanted to save the next person from the same trouble!
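If you want to confirm that the cache is actually doing its job, one optional extra (not shown in the configurations above) is to expose nginx’s cache status as a response header:
# Add inside the public-facing server's location / block (optional):
add_header X-Cache-Status $upstream_cache_status;  # MISS, HIT, UPDATING, STALE...
Hit the same page twice within the three-second window and the second response’s X-Cache-Status should read HIT.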
Different games in the same style (absurdle plays adversarially like my cheating hangman
game, crosswordle involves reverse-engineering a wordle colour grid into a crossword, heardle
is like Wordle but sounding out words using the IPA…)
I’m sure that by now all your social feeds are full of people playing Wordle. But the cool nerds are playing something new…
Now, a Wordle clone for D&D players!
But you know what hasn’t been seen before today? A Wordle clone where you have to guess a creature from the Dungeons & Dragons (5e) Monster Manual by putting numeric values into a
character sheet (STR, DEX, CON, INT, WIS, CHA):
Just because nobody’s asking for a game doesn’t mean you shouldn’t make it anyway.
What are you waiting for: go give DNDle a try (I pronounce it “dindle”, but you can pronounce it however you like). A new monster
appears at 10:00 UTC each day.
And because it’s me, of course it’s open source and works offline.
The boring techy bit
Like Wordle, everything happens in your browser: this is a “backendless” web application.
I’ve used ReefJS for state management, because I wanted something I could throw together quickly but I didn’t want to drown myself (or my players)
in a heavyweight monster library. If you’ve not used Reef before, you should give it a go: it’s basically like React but a tenth of the footprint.
A cache-first/background-updating service worker means that it can run completely offline: you can install it to your homescreen in the
same way as Wordle, but once you’ve visited it once it can work indefinitely even if you never go online again.
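If you’ve not written one of these before, the pattern only takes a few lines. Here’s a minimal sketch of a cache-first, update-in-the-background fetch handler – illustrative only, not DNDle’s actual service worker, and the cache name is made up:
self.addEventListener('fetch', event => {
  if (event.request.method !== 'GET') return;         // only cache simple GETs
  event.respondWith(
    caches.open('app-v1').then(async cache => {
      const cached = await cache.match(event.request);
      const network = fetch(event.request)
        .then(response => {
          cache.put(event.request, response.clone()); // refresh the cache in the background
          return response;
        })
        .catch(() => cached);                         // offline: fall back to whatever we had
      return cached || network;                       // serve from the cache first when possible
    })
  );
});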
I don’t like to use a buildchain that’s any more-complicated than is absolutely necessary, so the only development dependency is rollup. It
resolves my import statements and bundles a single JS file for the browser.
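If you’ve not used rollup’s CLI before, that build step amounts to little more than this (the entry point and output filenames are illustrative, not necessarily DNDle’s):
npx rollup js/app.js --file dist/app.bundle.js --format iife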
What happens when you give Gutenberg and Elementor to complete Beginners? In this challenge, Meg and Lily (two of my daughters) are tasked with re-creating a webpage. They’ve never
used Elementor or Gutenberg before, and I only gave them 30 minutes each.
…
Jamie of Pootlepress challenged his daughters – who are presumably both digital natives, but have no WordPress experience – to build a page to a specific design using both Gutenberg and Elementor. In 30 minutes.
Regardless of what you think about the products under test or the competitors in the challenge (Lily + Gutenberg clearly seems to be the fan favourite, which I’d sort-of expect because
IMO Gutenberg’s learning curve is much flatter than Elementor’s), this is a fantastic example of “thinking aloud” (“talkalong”)
UX testing. And with (only) a £20 prize on offer, it’s possibly the best-value testing of its type I’ve ever seen too! Both the
participants do an excellent job of expressing their praise of and frustration with different parts of the interface of their assigned editing platform, and the developers of both – and
other systems besides – could learn a lot from watching this video.
Specifically, this video shows how enormous the gulf is between how developers try to express concepts that are essential to web design and how beginner users assume things will work.
Concepts like thinking in terms of “blocks” that can resize or reposition dynamically, breakpoints, assets as cross-references rather than strictly embedded within documents, style as
an overarching concept in preference to something applied to individual elements, etc. seem second nature once you’re sixteen levels deep into the DOM and you’ve been doing it for years! But they’re rarely intuitive… or, perhaps, not expressed in a way that makes them intuitive… to new users.
Of course, you don’t strictly need a digital wallet to use EGX. But as we’re in a culture where people invariably ask “is
there an app for it?”, I thought I’d make one.
You can install it as an offline-first progressive web application, which means that this could be the first ever digital currency to have an app that works without an Internet
connection. That’s probably something no other digital currency can claim to have, right?
Here’s what it looks like if I send 0.1 EGX to my friend Chris using the app:
Naturally, I wouldn’t be backing Emma Goldcoin if it didn’t represent such a brilliant step up from better-known digital currencies like Bitcoin, Ripple, and Ethereum. Specific features
unique to Emma Goldcoin include:
Using it doesn’t massively contribute to energy wastage and environmental damage.
It doesn’t increase the digital divide by helping early adopters at the expense of late adopters.
It’s entirely secure: it’s mathematically impossible to “steal” EGX.
Emma Goldcoin is so simple that you don’t even need a computer to use it: it “just works”.
Sure, it’s got its downsides, and I’d encourage you to read the specification if you’d like to learn more about what
those are. Or if you already know what EGX is all about and just want to try a new way to manage your portfolio, give my new site EGXchange.org a go!
But sometimes, they disappear slowly, like this kind of web address:
http://username:password@example.com/somewhere
If you’ve not seen a URL like that before, that’s fine, because the answer to the question “Can I still use HTTP Basic Auth in URLs?” is, I’m afraid: no, you probably can’t.
But by way of a history lesson, let’s go back and look at what these URLs were, why they died out, and how web
browsers handle them today. Thanks to Ruth who asked the original question that inspired this post.
Basic authentication
The early Web wasn’t built for authentication. A resource on the Web was theoretically accessible to all of humankind: if you didn’t want it in the public eye, you didn’t put
it on the Web! A reliable method wouldn’t become available until the concept of state was provided by Netscape’s invention of HTTP
cookies in 1994, and even that wouldn’t see widespread use for several years, not least because implementing a CGI (or
similar) program to perform authentication was a complex and computationally-expensive option for all but the biggest websites.
A simplified view of the form-and-cookie based authentication system used by virtually every website today, but which was too computationally-expensive for many sites in the 1990s.
1996’s HTTP/1.0 specification tried to simplify things, though, with the introduction of the WWW-Authenticate header. The idea was that when a browser tried to access something that required
authentication, the server would send a 401 Unauthorized response along with a WWW-Authenticate header explaining how the browser could authenticate
itself. Then, the browser would send a fresh request, this time with an Authorization: header attached providing the required credentials. Initially, only “basic
authentication” was available, which basically involved sending a username and password in-the-clear unless SSL (HTTPS) was in use, but later, digest authentication and a host of others would appear.
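To make the handshake concrete, the exchange looks roughly like this on the wire (hostname, realm, and credentials are illustrative; “alice:secret” base64-encodes to the string shown):
GET /protected HTTP/1.1
Host: example.com

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="Test Site"

GET /protected HTTP/1.1
Host: example.com
Authorization: Basic YWxpY2U6c2VjcmV0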
For all its faults, HTTP Basic Authentication (and its near cousins) are certainly elegant.
Webserver software quickly added support for this new feature and as a result web authors who lacked the technical know-how (or permission from the server administrator) to implement
more-sophisticated authentication systems could quickly implement HTTP Basic Authentication, often simply by adding a .htaccess file to the relevant directory.
.htaccess files would later go on to serve many other purposes, but their original and perhaps best-known purpose – and the one that gives them their name – was access
control.
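For example, password-protecting a directory with Apache needs little more than a .htaccess file like this (paths and realm name are illustrative), with the credentials themselves kept in a .htpasswd file created by the htpasswd utility:
AuthType Basic
AuthName "Members Only"
AuthUserFile /home/example/.htpasswd
Require valid-user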
Credentials in the URL
A separate specification, not specific to the Web (but one of Tim Berners-Lee’s most important contributions to it), described the general structure of URLs as follows:
At the time that specification was written, the Web didn’t have a mechanism for passing usernames and passwords: this general case was intended only to apply to protocols that
did have these credentials. An example is given in the specification, and clarified with “An optional user name. Some schemes (e.g., ftp) allow the specification of a user
name.”
But once web browsers had WWW-Authenticate, virtually all of them added support for including the username and password in the web address too. This allowed for
e.g. hyperlinks with credentials embedded in them, which made for very convenient bookmarks, or partial credentials (e.g. just the username) to be included in a link, with the
user being prompted for the password on arrival at the destination. So far, so good.
Encoding authentication into the URL provided an incredible shortcut at a time when Web round-trip times were much longer owing to higher latencies and no keep-alives.
This is why we can’t have nice things
The technique fell out of favour as soon as it started being used for nefarious purposes. It didn’t take long for scammers to realise that they could create links like this:
https://YourBank.com@HackersSite.com/
Everything we were teaching users about checking for “https://” followed by the domain name of their bank… was undermined by this user interface choice. The poor victim would
actually be connecting to e.g. HackersSite.com, but a quick glance at their address bar would leave them convinced that they were talking to YourBank.com!
Theoretically: widespread adoption of EV certificates coupled with sensible user interface choices (that were never made) could
have solved this problem, but a far simpler solution was just to not show usernames in the address bar. Web developers were by now far more excited about forms and
cookies for authentication anyway, so browsers started curtailing the “credentials in addresses” feature.
Users trained to look for “https://” followed by the site they wanted would often fall for scams like this one: the real domain name is after the @-sign. (This attacker is
also using dword notation to obfuscate their IP address; this
dated technique wasn’t often employed alongside this kind of scam, but it’s another historical oddity I enjoy so I’m shoehorning it in.)
(There are other reasons this particular implementation of HTTP Basic Authentication was less-than-ideal, but this reason is the big one that explains why things had to change.)
One by one, browsers made the change. But here’s the interesting bit: the browsers didn’t always make the change in the same way.
How different browsers handle basic authentication in URLs
Let’s examine some popular browsers. To run these tests I threw together a tiny web application that outputs
the Authorization: header passed to it, if present, and can optionally send a 401 Unauthorized response along with a WWW-Authenticate: Basic realm="Test Site" header in order to trigger basic authentication. Why both? So that I can test not only how browsers handle URLs containing credentials when an authentication request is received, but how they handle them when one is not. This is relevant because
some addresses – often API endpoints – have optional HTTP authentication, and it’s sometimes important for a user agent (albeit typically a library or command-line one) to pass credentials without
first being prompted.
In each case, I tried each of the following tests in a fresh browser instance:
Go to http://<username>:<password>@<domain>/optional (authentication is optional).
Go to http://<username>:<password>@<domain>/mandatory (authentication is mandatory).
Experiment 1, then follow relative hyperlinks (which should correctly retain the credentials) to /mandatory.
Experiment 2, then follow relative hyperlinks to /optional.
I’m only testing over the http scheme, because I’ve no reason to believe that any of the browsers under test treat the https scheme differently.
Chromium desktop family
Chrome 93 and Edge 93 both
immediately suppressed the username and password from the address bar, along with
the “http://” as we’ve come to expect of them. Like the “http://”, though, the plaintext username and password are still there. You can retrieve them by copy-pasting the
entire address.
Opera 78 similarly suppressed the username, password, and scheme, but didn’t retain the username and password in a way that could be copy-pasted out.
Authentication was passed only when landing on a “mandatory” page; never when landing on an “optional” page. Refreshing the page or re-entering the address with its credentials did not
change this.
Navigating from the “optional” page to the “mandatory” page using only relative links retained the username and password and submitted it to the server when it became mandatory,
even Opera which didn’t initially appear to retain the credentials at all.
Navigating from the “mandatory” to the “optional” page using only relative links, or even entering the “optional” page address with credentials after visiting the “mandatory” page, does
not result in authentication being passed to the “optional” page. However, it’s interesting to note that once authentication has occurred on a mandatory page, pressing enter at
the end of the address bar on the optional page, with credentials in the address bar (whether visible or hidden from the user) does result in the credentials being passed to
the optional page! They continue to be passed on each subsequent load of the “optional” page until the browsing session is ended.
Firefox desktop
Firefox 91 does a clever thing very much in-line with its image as a browser that puts decision-making authority into the hands of its user. When going to
the “optional” page first it presents a dialog, warning the user that they’re going to a site that does not specifically request a username, but they’re providing one anyway. If the
user says no, navigation ceases (the GET request for the page takes place the same either way; this happens before the dialog appears). Strangely, regardless of whether the user
selects yes or no, the credentials are not passed to the “optional” page. The credentials (although not the “http://”) appear in the address bar while the user makes their decision.
Similar to Opera, the credentials do not appear in the address bar thereafter, but they’re clearly still being stored: if the refresh button is pressed the dialog appears again. It does
not appear if the user selects the address bar and presses enter.
Similarly, going to the “mandatory” page in Firefox results in an informative dialog warning the user that credentials are being passed. I like this approach: not only does it
help protect the user from the use of authentication as a tracking technique (an old technique that I’ve not seen used in well over a decade, mind), it also helps the user be sure that
they’re logging in using the account they mean to, when following a link for that purpose. Again, clicking cancel stops navigation, although the initial request (with no credentials)
and the 401 response have already occurred.
Visiting any page within the scope of the realm of the authentication after visiting the “mandatory” page results in credentials being sent, whether or not they’re included in the
address. This is probably the most-true implementation to the expectations of the standard that I’ve found in a modern graphical browser.
Safari desktop
Safari 14 never
displays or uses credentials provided via the web address, whether or not authentication is mandatory. Mandatory authentication is always met by a pop-up dialog, even if credentials
were provided in the address bar. Boo!
Once passed, credentials are later provided automatically to other addresses within the same realm (i.e. optional pages).
Older browsers
Let’s try some older browsers.
From version 7 onwards – right up to the final version 11 – Internet Explorer fails to even recognise addresses with authentication credentials in them
as legitimate web addresses, regardless of whether or not authentication is requested by the server. It’s easy to assume that this is yet another missing feature in the browser we all
love to hate, but it’s interesting to note that credentials-in-addresses is permitted for ftp:// URLs…
…and if you go back a
little way, Internet Explorer 6 and below supported credentials in the address bar pretty much as you’d expect based on the standard. The error message seen in IE7 and above is a deliberate design
decision, albeit a somewhat knee-jerk reaction to the security issues posed by the feature (compare to the more-careful approach of other browsers).
These older versions of IE even (correctly) retain the credentials through relative hyperlinks, allowing them to be passed when
they become mandatory. They’re not passed on optional pages unless a mandatory page within the same realm has already been encountered.
Pre-Mozilla Netscape behaved the
same way. Truly this was the de facto standard for a long period on the Web, and the varied approaches we see today are the anomaly. That’s a strange observation to make,
considering how much the Web of the 1990s was dominated by incompatible implementations of different Web features (I’ve written about the
<blink> and <marquee> tags before, which were perhaps the most-visible division between the Microsoft and Netscape camps, but there were many,
many more).
Interestingly: by Netscape 7.2 the browser’s behaviour had evolved to be the same as modern Firefox’s, except that it still displayed the credentials in the
address bar for all to see.
Now here’s a real gem: pre-Chromium Opera. It would send credentials to “mandatory” pages and remember them for the duration of the browsing session, which is
great. But it would also send credentials when passed in a web address to “optional” pages. However, it wouldn’t remember them on optional pages unless they remained
in the address bar: this feels to me like an optimum balance of features for power users. Plus, it’s one of very few browsers that permitted you to change credentials
mid-session: just by changing them in the address bar! Most other browsers, even to this day, ignore changes to HTTP
Authentication credentials, which was sometimes a source of frustration back in the day.
Finally, classic Opera was the only browser I’ve seen to mask the password in the address bar, turning it into a series of asterisks. This ensures the user knows that a
password was used, but does not leak any sensitive information to shoulder-surfers (the masked password was always shown as the same number of asterisks, too, so it didn’t even leak the
length of the password). Altogether a spectacular design and a great example of why classic Opera was way ahead of its time.
The Command-Line
Most people using web addresses with credentials embedded within them nowadays are probably working with code, APIs,
or the command line, so it’s unsurprising to see that this is where the most “traditional” standards-compliance is found.
I was unsurprised to discover that giving curl a username and password in the URL meant that
username and password were sent to the server (using Basic authentication, of course, if no authentication was requested).
However, wget did catch me out. Hitting the same addresses with wget didn’t result in the credentials being sent
except where it was mandatory (i.e. where an HTTP 401 response and a WWW-Authenticate: header were received on the initial attempt). To force wget to
send credentials when they haven’t been asked-for requires the use of the --http-user and --http-password switches:
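In other words (with an illustrative hostname and credentials):
# curl sends the Basic credentials embedded in the URL, challenged or not:
curl http://alice:secret@example.com/optional

# wget's dedicated switches for supplying HTTP credentials:
wget --http-user=alice --http-password=secret http://example.com/mandatory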
lynx does a cute and clever thing. Like most modern browsers, it does not submit credentials unless specifically requested, but if
they’re in the address bar when they become mandatory (e.g. because of following relative hyperlinks or hyperlinks containing credentials) it prompts for the username and password,
but pre-fills the form with the details from the URL. Nice.
What’s the status of HTTP (Basic) Authentication?
HTTP Basic Authentication and its close cousin Digest Authentication (which overcomes some of the security limitations of running Basic Authentication over an
unencrypted connection) are very much alive, but their use in hyperlinks can’t be relied upon: some browsers (e.g. IE, Safari)
completely munge such links while others don’t behave as you might expect. Other mechanisms like Bearer see widespread use in APIs, but nowhere else.
The WWW-Authenticate: and Authorization: headers are, in some ways, an example of the best possible way to implement authentication on the Web: as an
underlying standard independent of support for forms (and, increasingly, Javascript), cookies, and complex multi-part conversations. It’s easy to imagine an alternative
timeline where these standards continued to be collaboratively developed and maintained and their shortfalls – e.g. not being able to easily log out when using most graphical browsers!
– were overcome. A timeline in which one might write a login form like this, knowing that your e.g. “authenticate” attributes would instruct the browser to send credentials using an
Authorization: header:
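Perhaps something like this purely imaginary markup – to be clear, no browser supports an authenticate attribute, and every name here is invented for illustration:
<form method="post" action="/dashboard" authenticate="Basic">
  <label>Username <input type="text" name="username" authenticate="username"></label>
  <label>Password <input type="password" name="password" authenticate="password"></label>
  <input type="submit" value="Log in">
</form>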
In such a world, more-complex authentication strategies (e.g. multi-factor authentication) could involve encoding forms as JSON. And single-sign-on systems would simply involve the browser collecting a token from the authentication provider and passing it on to the
third-party service, directly through browser headers, with no need for backwards-and-forwards redirects with stacks of information in GET parameters as is the case today.
Client-side certificates – long a powerful but neglected authentication mechanism in their own right – could act as first class citizens directly alongside such a system, providing
transparent second-factor authentication wherever it was required. You wouldn’t have to accept a tracking cookie from a site in order to log in (or stay logged in), and if your
browser-integrated password safe supported it you could log on and off from any site simply by toggling that account’s “switch”, without even visiting the site: all you’d be changing is
whether or not your credentials would be sent when the time came.
The Web has long been on a constant push for the next new shiny thing, and that’s sometimes meant that established standards have been neglected prematurely or have failed to evolve for
longer than we’d have liked. Consider how long it took us to get the <video> and <audio> elements because the “new shiny” Flash came to dominate,
how the Web Payments API is only just beginning to mature despite over 25 years of ecommerce on the Web, or how we still can’t
use Link: headers for all the things we can use <link> elements for despite them being semantically-equivalent!
The new model for Web features seems to be that they first come from a popular JavaScript implementation, and only eventually evolve into native browser features: for
example HTML form validations, which for the longest time could only be done client-side using scripting languages. I’d love
to see somebody re-think HTTP Authentication in this way, but sadly we’ll never get a 100% solution in JavaScript alone (distributed SSO is almost certainly off the table, for example, owing to cross-domain limitations).
Or maybe it’s just a problem that’s waiting for somebody cleverer than I to come and solve it. Want to give it a go?
There was a discussion this week in the Abnib WhatsApp group about whether a particular illustration of a farm was full of phallic imagery (it was).
This left me wondering if anybody had ever tried to identify the most-priapic buildings in the world. Of course towers often look at least a little bit like their architects
were compensating for something, but some – like the Ypsilanti Water Tower in Michigan pictured above – go further than
others.
Anyway: a shot tower in Bristol – a part of the UK with a long history of leadworking – was among the latecomer entrants to the competition, and seeing this curious building reminded me about something I’d read, once, about the
manufacture of lead shot. The idea (invented in Bristol by a plumber called William Watts) is that you pour molten lead
through a sieve at the top of a tower, let surface tension pull it into spherical drops as it falls, and eventually catch it in a cold water bath to finish solidifying it. I’d seen an
animation of the process, but I’d never seen a video of it, so I went about finding one.
The animation I saw might have been this one, or perhaps one that wasn’t so obviously-made-in-MS-Paint.
British Pathé’s YouTube Channel provided me with this 1950 film, and if you follow only one hyperlink from this article, let it be this one! It’s well-shot (pun intended, but there’s
a worse pun in the video!), and while I needed to translate all of the references to “hundredweights” and “Fahrenheit” to measurements that I can actually understand, it’s thoroughly
informative.
But there’s a problem with that video: it’s been badly cut from whatever reel it was originally found on, and from about 1 minute and 38 seconds in it switches to what is clearly a very
different film! A mother is seen shepherding her young daughter off to bed, and a voiceover says:
Bedtime has a habit of coming round regularly every night. But for all good parents responsibility doesn’t end there. It’s just the beginning of an evening vigil, ears attuned to cries
and moans and things that go bump in the night. But there’s no reason why those ears shouldn’t be your neighbours’ ears, on occasion.
“Off to bed, you little monster. And no watching TikTok when you should be trying to sleep!”
Now my interest’s piqued. What was this short film going to be about, and where could I find it? There’s no obvious link; YouTube doesn’t even make it easy to find the video
uploaded “next” by a given channel. I manipulated some search filters on British Pathé’s site until I eventually hit upon the right combination of magic words and found a clip called
Radio Baby Sitter. It starts off exactly where the misplaced prior clip cut out, and tells the story of “Mr.
and Mrs. David Hurst, Green Lane, Coventry”, who put a microphone by their daughter’s bed and ran a wire through the wall to their neighbours’ radio’s speaker so they can babysit
without coming over for the whole evening.
It’s a baby monitor, although not strictly a radio one as the title implies (it uses a signal wire!), nor is it groundbreakingly innovative: the first baby monitor predates it by over a decade, and it actually did use
radio waves! Still, it’s a fun watch, complete with its contemporary fashion, technology, and social structures. Here’s the full thing, re-merged for your convenience:
Wait, what was I trying to do when I started, again? What was I even talking about…
It’s harder than it used to be
It used to be easier than this to get lost on the Web, and sometimes I miss that.
Obviously if you go back far enough this is true. Back when search engines were much weaker and Internet content was much less homogeneous and more distributed, we used to engage in
this kind of meandering walk all the time: we called it “surfing” the Web. Second-generation
Web browsers even had names, pretty often, evocative of this kind of experience: Mosaic, WebExplorer, Navigator, Internet Explorer, IBrowse. As people started to engage in the
noble pursuit of creating content for the Web, they cross-linked their sources, their friends, their affiliations (remember webrings? here’s a reminder; they’re not quite as dead as you think!), their favourite sites etc. You’d follow links to other pages, then follow their links to others
still, and so on in that fashion. If you went round the circles enough times you’d start seeing all those invariably-blue hyperlinks turn purple and know you’d found your way home.
Some parts of the Web are perhaps best forgotten, though?
But even after that era, as search engines started to become a reliable and powerful way to navigate the wealth of content on the growing Web, links still dominated our exploration.
Following a link from a resource that was linked to by somebody you know carried the weight of a “web of trust”, and you’d quickly come to learn whose links were consistently valuable
and on what subjects. They also provided a sense of community and interconnectivity that paralleled the organic, chaotic networks of acquaintances people form out in the real world.
In recent times, that interpersonal connectivity has, for many, been filled by social networks (let’s ignore their failings in this regard for now). But linking to resources “outside” of the big
social media silos is hard. These advertisement-funded services work hard to discourage or monetise activity
that takes you off their platform, even at the expense of their users. Instagram limits the number of external links per profile; many social networks push
for resharing of summaries of content or embedding content from other sources, discouraging engagement with the wider Web, Facebook and Twitter both run external links
through a linkwrapper (which sometimes breaks); most large social networks make linking to the profiles of other users
of the same social network much easier than to users anywhere else; and so on.
The net result is that Internet users use fewer different websites today than they did 20 years ago,
and spend most of their “Web” time in app versions of
websites (which often provide a better experience only because site owners strategically make it so to increase their lock-in and data harvesting potential). Truly exploring the Web now
requires extra effort, like exercising an underused muscle. And if you begin and end your Web experience on just one to three services,
that just feels kind of… sad, to me. Wasted potential.
I suppose nowadays we don’t get lost as often outside of the Internet, either. Photo by Leah Kelly.
It sounds like I’m being nostalgic for a less-sophisticated time on the Web (that would certainly be in character!). A time before we’d
fully-refined the technology that would come to connect us in an instant to the answers we wanted. But that’s not exactly what I’m pining for. Instead, what I miss is something
we lost along the way, on that journey: a Web that was more fun-and-weird, more interpersonal, more diverse. More Geocities, less Facebook; there’s a surprising thing to find myself saying.
Somewhere along the way, we ended up with the Web we asked for, but it wasn’t the Web we wanted.
Here’s a perfect example I bumped into earlier this week, courtesy of The Green Web Foundation. This looks like a
hyperlink… but if you open it in a new tab/window, you see a page (not even a 404 page!) with the text “It looks like nothing was found at this location.”
In the site shown in the screenshot above, the developer took something the web gave them for free (a hyperlink), threw it away (by making it a link-to-nowhere), and rebuilt its
functionality with Javascript (without thinking about the fact that you can do more with hyperlinks than click them: you can click-and-drag them, you can bookmark them, you can share
them, you can open them in new tabs etc.). Ugh.
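For contrast, the progressive-enhancement version costs almost nothing: keep a real destination in the href so that bookmarking, copying, and open-in-new-tab all still work, and let the script merely enhance the link. A sketch (openFancyModal stands in for whatever the fancy behaviour is meant to be):
<a href="/sustainability-report/" class="js-modal-link">Read the report</a>
<script>
  document.querySelectorAll('.js-modal-link').forEach(link => {
    link.addEventListener('click', event => {
      // Leave modified clicks (new tab, etc.) to behave like a normal link:
      if (event.ctrlKey || event.metaKey || event.shiftKey || event.button !== 0) return;
      event.preventDefault();
      openFancyModal(link.href);  // hypothetical enhancement
    });
  });
</script>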
Something you can clearly type a numeric day, month and year into is best.
Three dropdowns are slightly worse, but at least if you use native HTML <select> elements keyboard
users can still “type” to filter.
Everything else – including things that look like <select>s but are really funky React <div>s – is pretty terrible.
Calendars can be great for choosing your holiday date range. But pressing “Prev” ~480 times to get to my month of birth isn’t good. Also: what’s with the time “sliders”? (Yes, I know I’ve implemented these myself, in the past, and I’m sorry.)
People designing webforms that require me to enter my birthdate:
I am begging you: just let me type it in.
Typing it in is 6-8 quick keystrokes. Trying to navigate a little calendar or spinny wheels back to the 1970s is time-consuming, frustrating and unnecessary.
They’re right. Those little spinny wheels are a pain in the arse if you’ve got to use one to go back 40+ years.
These things are okay (I guess) on mobile/touchscreen devices, though I’d still prefer the option to type in my date of birth. But send one to my desktop and I will
curse your name.
Can we do worse?
If there’s one thing we learned from making the worst volume control in the world, the other
year, it’s that you can always find a worse UI metaphor. So here’s my attempt at making a date of birth field that’s somehow
even worse than “date spinners”:
My datepicker implements a game of “higher/lower”. Starting from bounds specified in the HTML code and a random guess, it
narrows-down its guess as to what your date of birth is as you click the up or down buttons. If you make a mistake you can start over with the restart button.
Amazingly, this isn’t actually the worst datepicker into which I’ve entered my date of birth! It’s cognitively challenging compared to most, but it’s relatively fast at
narrowing down the options from any starting point. Plus, I accidentally implemented some good features that make it better than plenty of the datepickers out there:
It’s progressively enhanced – if the Javascript doesn’t load, you can still enter your date of birth in a sensible way.
Because it leans on an <input type="date"> control (sketched below), your browser takes responsibility for localising, so if you’re from one of those weird countries that prefers
mm-dd-yyyy then that’s what you should see.
It’s moderately accessible, all things considered, and it could easily be improved further.
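That baseline is nothing fancier than the browser’s own date field, roughly like this (the names and bounds are illustrative; the real page sets its own):
<label for="dob">Date of birth</label>
<input type="date" id="dob" name="dob" min="1900-01-01" max="2022-12-31">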
It turns out that even when you try to make something terrible, so long as you’re building on top of the solid principles the web gives you for free, you can accidentally end
up with something not-so-bad. Who knew?