Roll Your Own Antispam

Three Rings operates a Web contact form to help people get in touch with us: the idea is that it provides a quick and easy way to reach out if you’re a charity that might be able to make use of the system, a user who’s having difficulty with the features of the software, or maybe a potential new volunteer willing to give your time to the project.

But then the volume of spam it received increased dramatically. We don’t want our support team volunteers to spend all their time categorising spam: even if it doesn’t take long, it’s demoralising. So what could we do?

Clearly-spammy message shown in a ticket management system.
It’s clearly spam, but if it takes you 2 seconds to categorise it and there are 30 in your Inbox, that’s still a drag.

Our conventional antispam tools are configured pretty liberally: we don’t want to reject a contact from a legitimate user just because their message hits lots of scammy keywords (e.g. if a user’s having difficulty logging in and has copy-pasted all of the error messages they received, that can look a lot like a password reset spoofing scam to a spam filter). And we don’t want to add a CAPTCHA, because not only do those create a barrier to humans – while not necessarily reducing spam very much, nowadays – they’re often terrible for accessibility, privacy, or both.

But it didn’t take much analysis to spot some patterns unique to our contact form and the questions it asks that might provide an opportunity. For example, we discovered that spam messages would more-often-than-average:

  • Fill in both the “name” and (optional) “Three Rings username” field with the same value. While it’s certainly possible for Three Rings users to have a login username that’s identical to their name, it’s very rare. But automated form-fillers seem to disproportionately pair up these two fields.
  • Fill the phone number field with a known-fake phone number or a non-internationalised phone number from a country in which we currently support no charities. Legitimate non-UK contacts tend to put international-format phone numbers into this optional field, if they fill it at all. Spammers often use NANP (North American Numbering Plan) numbers.
  • Include many links in the body of the message. A few links, especially if they’re to our services (e.g. when people are asking for help) is not-uncommon in legitimate messages. Many links, few of which point to our servers, almost certainly means spam.
  • Choose the first option for the choose-one question “how can we help you?” Of course real humans sometimes pick this option too, but spammers almost always choose it.

None of these characteristics alone – nor any of the half-dozen or so others we analysed (including invisible checks like honeypots and IP-based geofencing) – is reason enough to suspect a message of being spam. But taken together, they’re almost a sure thing.

To begin with, we assigned scores to each characteristic and automated the tagging of messages in our ticketing system with these scores. At this point, we didn’t do anything to block such messages: we were just collecting data. Over time, this allowed us to find a safe “threshold” score above which a message was certainly spam.
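We can’t share the real code, but to illustrate the shape of the approach, here’s a hypothetical sketch in PHP. To be clear: the field names, weights, and threshold below are all invented for this example; ours differ (and are secret!).

<?php
// Hypothetical spam-scoring sketch: every check adds to a score, and only
// the total is compared against an empirically-discovered threshold.
function spam_score(array $form): int {
  $score = 0;

  // Identical "name" and "username" values are rare among real users.
  if ($form['name'] !== '' && $form['name'] === $form['username']) {
    $score += 3;
  }

  // A non-internationalised, NANP-looking phone number is suspicious.
  if (preg_match('/^\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}$/', $form['phone'])) {
    $score += 2;
  }

  // Many links, few of which point to our own servers, is a strong signal.
  $links = preg_match_all('/https?:\/\//i', $form['message']);
  $ours  = preg_match_all('/https?:\/\/([a-z0-9.-]+\.)?threerings\.org\.uk/i', $form['message']);
  if ($links >= 3 && $ours < $links) {
    $score += 3;
  }

  // Spammers disproportionately pick the first "how can we help?" option.
  if ($form['help_topic'] === 'option-1') {
    $score += 1;
  }

  return $score;
}

if (spam_score($_POST) >= 7) { // threshold found by watching tagged scores
  // Soft-block: explain the rejection and suggest other ways to reach us.
}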

Three Rings contact form filled by Spammy McSpamface, showing a 'Security Checks Failed' error message and tips on refining the message.
Even when a message fails our customised spam checks, we only ‘soft-block’ it: telling the user their message was rejected and providing suggestions on working around that or emailing us conventionally. Our experience shows that the spammers aren’t willing to work to overcome this additional hurdle, but on the very rare occasion a human hits them, they are.

Once we’d found our threshold we were able to engage a soft-block of submissions that exceeded it, and immediately the volume of spam making it to the ticketing system dropped considerably. With under 70 lines of PHP code (which sadly I can’t share with you), we reduced our spam rate by over 80% while having, as far as we can see, no impact on the false-positive rate.

Where conventional antispam solutions weren’t quite cutting it, implementing a few rules specific to our particular use-case made all the difference. Sometimes you’ve just got to roll your sleeves up and look at the actual data you do/don’t want, and adapt your filters accordingly.


So… I’m A Podcast

Podcast Version

This post is also available as a podcast. Listen here, download for later, or subscribe wherever you consume podcasts.

Observant readers might have noticed that some of my recent blog posts – like the one about special roads, my idea for pressure-cooking tea, and the one looking at the history of window tax in two countries1 – are also available as podcasts.

Podcast cover showing Dan touching his temple and speaking into a microphone, captioned 'a podcast nobody asked for, about things only Dan Q cares about'.

Why?

Like my occasional video content, this isn’t designed to replace any of my blogging: it’s just a different medium for those that might prefer it.

For some stories, I guess that audio might be a better way to find out what I’ve been thinking about. Just like how the vlog version of my post about my favourite video game Easter Egg might be preferable because video as a medium is better suited to demonstrating a computer game, perhaps audio’s the right medium for some of the things I write about, too?

But as often as not, it’s just a continuation of my efforts to explore different media over which a WordPress blog can be delivered2. Also, y’know, my ongoing effort to do what I’m bad at in the hope that I might get better at a wider diversity of skills.

How?

Let’s start by understanding what a “podcast” actually is. It is, in essence, just an RSS feed (something you might have heard me talk about before…) with audio enclosures – basically, “attachments” – on each item. The idea was spearheaded by Dave Winer back in 2001 as a way of subscribing to rich media like audio or videos in such a way that slow Internet connections could pre-download content so you didn’t have to wait for it to buffer.3

Mapping of wp-admin metadata fields to parts of a podcast feed.
Podcasts are pretty simple, even after you’ve bent over backwards to add all of the metadata that Apple Podcasts (formerly iTunes) expects to see. I looked at a couple of WordPress plugins that claimed to be able to do the work for me, but eventually decided it was simple enough to just add some custom metadata fields that could then be included in my feeds and tweak my theme code a little.

Here’s what I had to do to add podcasting capability to my theme:

The tag

I use a post tag, dancast, to represent posts with accompanying podcast content4. This way, I can add all the podcast-specific metadata only if the user requests the feed of that tag, and leave my regular feeds untampered-with. This means that you don’t get the podcast enclosures in the regular subscription; that might not be what everybody would want, but it suits me to serve podcasts only to people who explicitly ask for them.

It also means that I’m able to use a template, tag-dancast.php, in my theme to generate a customised page for listing podcast episodes.

The feed

Okay, onto the code (which I’ve open-sourced over here). I’ve used a series of standard WordPress hooks to add the functionality I need; there’s a minimal sketch after this list. The important bits are:

  1. rss2_item – to add the <enclosure>, <itunes:duration>, <itunes:image>, and <itunes:explicit> elements to the feed, when requesting a feed with my nominated tag. Only <enclosure> is strictly required, but appeasing Apple Podcasts is worthwhile too. These are lifted directly from the post metadata.
  2. the_excerpt_rss – I have another piece of post metadata in which I can add a description of the podcast (in practice, a list of chapter times); this hook swaps out the existing excerpt for my custom one in podcast feeds.
  3. rss_enclosure – some podcast syndication platforms and players can’t cope with RSS feeds in which an item has multiple enclosures, so as a safety precaution I strip out any enclosures that WordPress has already added (e.g. the featured image).
  4. the_content_feed – my RSS feed usually contains the full text of every post, because I don’t like feeds that try to force you to go to the original web page5 and I don’t want to impose that on others. But for the podcast feed, the text content of the post is somewhat redundant so I drop it.
  5. rss2_ns – of critical importance of course is adding the relevant namespaces to your feed’s root element. I use the itunes namespace, which provides the widest compatibility for specifying metadata, but I also use the newer podcast namespace, which has growing compatibility and provides some modern features, most of which I don’t use except specifying a license. There’s no harm in supporting both.
  6. rss2_head – here’s where I put in the metadata for the podcast as a whole: license, category, type, and so on. Some of these fields are effectively essential for best support.
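The full details are in the open-sourced code linked above, but as a taste of how little is needed, here’s a stripped-down sketch of the first and fifth of those hooks. The metadata field names are invented for illustration; the real ones differ:

<?php
// Sketch: emit podcast elements for items in the "dancast" tag feed only.
add_action('rss2_item', function () {
  if (!is_tag('dancast')) return; // leave regular feeds untampered-with

  $audio_url = get_post_meta(get_the_ID(), 'podcast_url', true);
  $bytes     = get_post_meta(get_the_ID(), 'podcast_bytes', true);
  $duration  = get_post_meta(get_the_ID(), 'podcast_duration', true);
  if (!$audio_url) return;

  // <enclosure> is the only strictly-required element...
  printf('<enclosure url="%s" length="%s" type="audio/mpeg" />' . "\n",
    esc_url($audio_url), esc_attr($bytes));
  // ...but Apple Podcasts appreciates the itunes:* extras:
  printf('<itunes:duration>%s</itunes:duration>' . "\n", esc_html($duration));
});

// Sketch: declare the itunes and podcast namespaces on the feed's root element.
add_action('rss2_ns', function () {
  echo 'xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" ';
  echo 'xmlns:podcast="https://podcastindex.org/namespace/1.0" ';
});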

You’re welcome, of course, to lift any or all of the code for your own purposes. WordPress makes a perfectly reasonable platform for podcasting-alongside-blogging, in my experience.

What?

Finally, there’s the question of what to podcast about.

My intention is to use podcasting as an alternative medium to my traditional blog posts. But not every blog post is suitable for conversion into a podcast! Ones that rely on images (like my post about dithering) aren’t a great choice. Ones that have lots of code that you might like to copy-and-paste are especially unsuitable.

Dan, a microphone in front of him, smiles at the camera.
You’re listening to Radio Dan. 100% Dan, 100% of the time. (Also I suppose you might be able to hear my dog snoring in the background…)

Also: sometimes I just can’t be bothered. It’s already some level of effort to write a blog post; it’s like an extra 25% effort on top of that to record, edit, and upload a podcast version of it.

That’s not nothing, so I’ve tended to reserve podcasts for blog posts that I think have a sort-of eccentric “general interest” vibe to them. When I learn something new and feel the need to write a thousand words about it… that’s the kind of content that makes it into a podcast episode.

Which is why I’ve been calling the endeavour “a podcast nobody asked for, about things only Dan Q cares about”. I’m capable of getting nerdsniped easily and can quickly find my way down a rabbit hole of learning. My podcast is, I guess, just a way of sharing my passion for trivial deep dives with the rest of the world.

My episodes are probably shorter than most podcasts: my longest so far is around fifteen minutes, but my shortest is only two and a half minutes and most are about seven. They’re meant to be a bite-size alternative to reading a post for people who prefer to put things in their ears than into their eyes.

Anyway: if you’re not listening already, you can subscribe from here or in your favourite podcasting app. Or you can just follow my blog as normal and look for a streamable copy of podcasts at the top of selected posts (like this one!).

Footnotes

1 I’ve also retroactively recorded a few older ones. Have a look/listen!

2 As well as Web-based non-textual content like audio (podcasts) and video (vlogs), my blog is wholly or partially available over a variety of more-exotic protocols: did you find me yet on Gemini (gemini://danq.me/), Spartan (spartan://danq.me/), Gopher (gopher://danq.me/), and even Finger (finger://danq.me/, or run e.g. finger blog@danq.me from your command line)? Most of these are powered by my very own tool CapsulePress, and I’m itching to try a few more… how about a WordPress blog that’s accessible over FTP, NNTP, or DNS? I’m not even kidding when I say I’ve got ideas for these…

3 Nowadays, we have specialised media codecs – and co-processors to accelerate them – which reduce the size of media files. But more-importantly, today’s high-speed always-on Internet connections mean that you probably rarely need to make a conscious choice between streaming or downloading.

4 I actually intended to change the tag to podcast when I went-live, but then I forgot, and now I can’t be bothered to change it. It’s only for my convenience, after all!

5 I’m very grateful that my favourite feed reader makes it possible to, for example, use a CSS selector to specify the page content it should pre-download for you! It means I get to spend more time in my feed reader.


Even More 1999!

Spencer’s filter

Last month I implemented an alternative mode to view this website “like it’s 1999”, complete with cursor trails, 88×31 buttons, tables for layout1, tiled backgrounds, and even a (fake) hit counter.

My blog post about 1999 Mode, viewed using 1999 Mode.
Feels like I’m 17 again.

One thing I’d have liked to do for 1999 Mode but didn’t get around to would have been to make the images look like it was the 90s, too.

Back then, many Web users only had graphics hardware capable of displaying 256 distinct colours. Across different platforms and operating systems, they weren’t even necessarily the same 256 colours2! But the early Web agreed on a 216-colour palette that all those 8-bit systems could at least approximate pretty well.
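That “Web safe” palette is nothing magical: it’s every combination of six evenly-spaced levels (0x00, 0x33, 0x66, 0x99, 0xCC and 0xFF) in each of the three channels, i.e. 6 × 6 × 6 = 216 colours. Snapping a colour to its nearest Web-safe equivalent is therefore just a rounding exercise; a quick illustrative sketch in PHP:

<?php
// Snap an RGB colour to its nearest "Web safe" equivalent:
// each channel rounds to the nearest multiple of 51 (0x33).
function web_safe(int $r, int $g, int $b): array {
  $snap = fn(int $c) => (int)(round($c / 51) * 51);
  return [$snap($r), $snap($g), $snap($b)];
}

print_r(web_safe(200, 120, 30)); // [204, 102, 51]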

I had an idea that I could make my images look “216-colour”-ish by using CSS to apply an SVG filter, but didn’t implement it.

A man wearing a cap pours himself a beer from a 10-litre box.
Let’s use this picture, from yesterday’s blog post, to talk about palettes…

But Spencer, a long-running source of excellent blog comments, stepped up and wrote an SVG filter for me! I’ve tweaked 1999 Mode already to use it… and I’ve just got to say it’s excellent: huge thanks, Spencer!

The filter coerces colours to their nearest colour in the “Web safe” palette, resulting in things like this:

A man wearing a cap pours himself a beer from a 10-litre box, reduced to a "Web safe" palette.
The flat surfaces are particularly impacted in this photo (as manipulated by the CSS SVG filter described above). Subtle hues and gradients coalesce into slabs of colour, giving them an unnatural and blocky appearance.

Plenty of pictures genuinely looked like that on the Web of the 1990s, especially if you happened to be using a computer only capable of 8-bit colour to view a page built by somebody who hadn’t realised that not everybody would experience 24-bit colour like they did3.

Dithering

But not all images in the “Web safe” palette looked like this, because savvy web developers knew to dither their images when converting them to a limited palette. Let’s have another go:

A man wearing a cap pours himself a beer from a 10-litre box, reduced to a "Web safe" palette but using Floyd Steinberg dithering to reduce the impact of colour banding.
This image uses exactly the same 216-colour palette as the previous one, but looks a lot more “natural” thanks to the Floyd–Steinberg dithering algorithm.

Dithering introduces random noise to media4 in order to reduce the likelihood that a “block” will all be rounded to the same value. Instead, in our picture, a block of what would otherwise be the same colour ends up being rounded to maybe half a dozen different colours, clustered together such that the ratio in a given part of the picture is, on average, a better approximation of the correct colour.
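For the mechanically-curious: Floyd–Steinberg is an “error diffusion” algorithm. Each pixel is snapped to the nearest palette colour, and the rounding error that introduces is pushed onto the not-yet-visited neighbouring pixels. Here’s a naive, unoptimised sketch in PHP/GD – to be clear, this isn’t the code behind the images in this post:

<?php
// Naive Floyd–Steinberg dithering to the Web-safe palette, using GD.
function dither_web_safe(string $in, string $out): void {
  $img = imagecreatefromjpeg($in);
  $w = imagesx($img);
  $h = imagesy($img);

  // Copy the channels into a float array so diffused errors can accumulate.
  $px = [];
  for ($y = 0; $y < $h; $y++) {
    for ($x = 0; $x < $w; $x++) {
      $c = imagecolorat($img, $x, $y);
      $px[$y][$x] = [($c >> 16) & 0xFF, ($c >> 8) & 0xFF, $c & 0xFF];
    }
  }

  for ($y = 0; $y < $h; $y++) {
    for ($x = 0; $x < $w; $x++) {
      foreach ($px[$y][$x] as $i => $old) {
        $old = max(0, min(255, $old));
        $new = round($old / 51) * 51; // nearest Web-safe level
        $err = $old - $new;
        $px[$y][$x][$i] = $new;
        // Distribute the error using the classic Floyd–Steinberg weights:
        if ($x + 1 < $w)                $px[$y][$x + 1][$i]     += $err * 7 / 16;
        if ($y + 1 < $h && $x > 0)      $px[$y + 1][$x - 1][$i] += $err * 3 / 16;
        if ($y + 1 < $h)                $px[$y + 1][$x][$i]     += $err * 5 / 16;
        if ($y + 1 < $h && $x + 1 < $w) $px[$y + 1][$x + 1][$i] += $err * 1 / 16;
      }
      [$r, $g, $b] = $px[$y][$x];
      imagesetpixel($img, $x, $y, imagecolorallocate($img, (int)$r, (int)$g, (int)$b));
    }
  }

  imagepng($img, $out);
}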

The result is analogous to how halftone printing – the aesthetic of old comics and newspapers, with different-sized dots made from few colours of ink – produces the illusion of a continuous gradient of colour so long as you look at it from far-enough away.

Comparison image showing the original, websafe, and dithered-websafe images, zoomed in so that you can see the speckling of random noise in the dithered version.
Zooming in makes it easy to see the noisy “speckling” effect in the dithered version, but from a distance it’s almost invisible.

The other year I read a spectacular article by Surma that explained in a very-approachable way how and why different dithering algorithms produce the results they do. If you’ve any interest whatsoever in a deep dive or just want to know what blue noise is and why you should care, I’d highly recommend it.

You used to see digital dithering everywhere, but nowadays it’s so rare that it leaps out as a revolutionary aesthetic when, for example, it gets used in a video game.

Comparison image showing the image quantized to monochrome without (looks blocky/barely identifiable) and with (looks like old newspaper photography) dithering.
Dithering can be so effective that it can even make an image “work” all the way down to 1-bit (i.e. true monochrome/black-and-white) colour. Here I’ve used Jarvis, Judice & Ninke’s dithering algorithm, which is highly-effective for picking out subtle colour differences in what would otherwise be extreme dark and light patches, at the expense of being more computationally-expensive (to initially create) than other dithering strategies.

All of which is to say that: I really appreciate Spencer’s work to make my “1999 Mode” impose a 216-colour palette on images. But while it’s closer to the truth, it still doesn’t quite reflect what my website would’ve looked like in the 1990s because I made extensive use of dithering when I saved my images in Web safe palettes5.

Why did I take the time to dither my images, back in the day? Because doing the hard work once, as a creator of graphical Web pages, saves time and computation (and can look better!), compared to making every single Web visitor’s browser do it every single time.

Which, now I think about it, is a lesson that’s still true today (I’m talking to you, developers who send a tonne of JavaScript and ask my browser to generate the HTML for you rather than just sending me the HTML in the first place!).

Footnotes

1 Actually, my “1999 mode” doesn’t use tables for layout; it pretty much only applies a CSS overlay, but it’s deliberately designed to look a lot like my blog did in 1999, which did use tables for layout. For those too young to remember: back before CSS gave us the ability to lay out content in diverse ways, it was commonplace to use a table – often with the borders and cell-padding reduced to zero – to achieve things that today would be simple, like putting a menu down the edge of a page or an image alongside some text content. Using tables for non-tabular data causes problems, though: not only is it hard to make a usable responsive website with them, it also reduces the control you have over the order of the content, which upsets some kinds of accessibility technologies. Oh, and it’s semantically-invalid, of course, to describe something as a table if it’s not.

2 Perhaps as few as 22 colours were defined the same across all widespread colour-capable Web systems. At first that sounds bad. Then you remember that 4-bit (16 colour) palettes used to look perfectly fine in 90s videogames. But then you realise that the specific 22 “very safe” colours are pretty shit and useless for rendering anything that isn’t composed of black, white, bright red, and maybe one of a few greeny-yellows. Ugh. For your amusement, here’s a copy of the image rendered using only the “very safe” 22 colours.

3 Spencer’s SVG filter does pretty-much the same thing as a computer might if asked to render a 24-bit colour image using only 8-bit colour. Simply “rounding” each pixel’s colour to the nearest available colour is a fast operation, even on older hardware and with larger images.

4 Note that I didn’t say “images”: dithering is also used to produce the same “more natural” feel for audio, too, when reducing its bit depth (i.e. reducing the number of discrete states into which the waveform can be quantised for digitisation), for example.

5 I’m aware that my footnotes are capable of nerdsniping Spencer, so by writing this there’s a risk that he’ll, y’know, find a way to express a dithering algorithm as an SVG filter too. Which I suspect isn’t possible, but who knows! 😅


Ladybird Browser

I’ve been playing with the (pre-Alpha version of) Ladybird, and it fills me with such joy and excitement.

This page, as rendered by Ladybird.
As you can see, Ladybird does a perfectly adequate job of rendering this page, including most of its CSS and virtually all of its JavaScript.

Browser diversity

Back in 2018, while other Web developers were celebrating, I expressed my dismay at the news that Microsoft Edge was on the cusp of switching from using Microsoft’s own browser engine EdgeHTML to using Blink. Blink is the engine that powers almost all other mainstream browsers; all but Firefox, which continues to stand atop Gecko.

The developers who celebrated this loss of rendering engine diversity were, I suppose, happy to have one fewer browser in which they must necessarily test their work. I guess these are the same developers who don’t test the sites they develop for accessibility (does your site work if you can’t see the images? what about with a keyboard but without a pointing device? how about if you’re colourblind?), or consider what might happen if a part of their site fails (what if the third-party CDN that hosts your JavaScript libraries goes down or is blocked by the user’s security software or their ISP?).

This blog post viewed in Lynx.
When was the last time you tested your site in a text-mode browser?

But I was sad, because – as I observed after Andre Alves Garzia succinctly spelled it out – browser engines are an endangered species. Building a new browser that supports the myriad complexities of the modern Web is such a huge endeavour that it’s unlikely to occur from scratch: from this point on, all “new” browsers are likely to be based upon an existing browser engine.

Engine diversity is important. Last time we had a lull in engine diversity, the Web got stuck, stagnating in the shadow of Internet Explorer 6’s dominance and under the thumb of Microsoft’s interests. I don’t want those days to come back; that’s a big part of why Firefox is my primary web browser.

A Ladybird book browser

Spoof cover for "The Ladybird Book of The Browser"
I actually still own a copy of the book from which I adapted this cover!

Ladybird is a genuine new browser engine. Y’know, that thing I said that we might never see happen again! So how’ve they made it happen?

It helps that it’s not quite starting from scratch. Its starting point is the HTML viewer component from SerenityOS. And… it’s pretty good. Its DOM processing’s solid, and it seems to support enough JavaScript and CSS that the modern Web is usable, even if it’s not beautiful 100% of the time.

Acid3 test score of 97/100 in Ladybird.
I’ve certainly seen browsers do worse than this at Acid3 and related tests…

They’re not even expecting to make an Alpha release until next year! Right now if you want to use it at all, you’re going to need to compile the code for yourself and fight with a plethora of bugs, but it works and that, all by itself, is really exciting.

They’ve got four full-time engineers, funded off donations, with three more expected to join, plus a stack of volunteer contributors on GitHub. I’ve raised my first issue against the repo; sadly my C++ probably isn’t strong enough to be able to help more-directly, even if I somehow did have enough free time, which I don’t. But I’ll be watching-from-afar this wonderful, ambitious, and ideologically-sound initiative.

#100DaysToOffload

Woop! This is my 100th post of the year (stats), even using my more-conservative/pedant-friendly “don’t count checkins/reposts/etc.” rule. If you’re not a pedant, I achieved #100DaysToOffload when I found a geocache alongside Regents Canal while changing trains to go to Amsterdam where I played games with my new work team, looked at windows and learned about how they’ve been taxed, and got nerdsniped by a bus depot. In any case: whether you’re a pedant or not, you’ve got to agree I’ve achieved it now!


ARCC

In the late ’70s, a shadowy group of British technologists concluded that nuclear war was inevitable and secretly started work on a cutting-edge system designed to help rebuild society. And thanks to Matt Round-and-friends at vole.wtf (who I might have mentioned before), the system they created – ARCC – can now be emulated in your browser.

3D rendering of an ARCC system, by HappyToast.

I’ve been playing with it on-and-off all year, and I’ve (finally) managed to finish exploring pretty-much everything the platform currently has to offer, which makes it pretty damn good value for money for the £6.52 I paid for my ticket (the price started at £2.56 and increases by 2p for every ticket sold). But you can get it cheaper than I did if you score 25+ on one of the emulated games.

ARCC system showing a high score table for M1, with DAN50 (score 13012) at the top.
It gives me more pride than it ought to that I hold the high score for a mostly-unheard-of game for an almost-as-unheard-of computer system.

Most of what I just told you is true. Everything… except the premise. There never was a secretive cabal of engineers who made this whackballs computer system. What vole.wtf emulates is an imaginary system, and playing with that system is like stepping into a bizarre alternate timeline or a weird world. Over several separate days of visits you’ll explore more and more of a beautifully-realised fiction that draws from retrocomputing, Cold War fearmongering, early multi-user networks with dumb terminal interfaces, and aesthetics that straddle the tripoint between VHS, Teletext, and BBS systems. Oh yeah, and it’s also a lot like being in a cult.

Needless to say, therefore, it presses all the right buttons for me.

ARCC terminal in which an email is being written to DAN50.
If you make it onto ARCC – or are already there! – drop me a message. My handle is DAN50.

If you enjoy any of those things, maybe you’d like this too. I can’t begin to explain the amount of work that’s gone into it. If you’re looking for anything more-specific in a recommendation, suffice to say: this is a piece of art worth seeing.


A completely plaintext WordPress Theme

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

This is a silly idea. But it works. I saw Dan Q wondering about plaintext WordPress themes – so I made one.

This is what this blog looks like using it:

Screenshot showing my blog rendered just as text.

I clearly nerdsniped Terence at least a little when I asked whether a blog necessarily had to be HTML, because he went on to implement a WordPress theme that delivers content entirely in plain text.

Naturally, I’ve also shared his accomplishment on my own text/plain blog (which uses a much simpler CMS based on static files).


Does a blog have to be HTML?

Terence Eden wrote about his recent experience of IndieWebCamp Brighton, in which he mentioned that somebody – probably Jeremy Keith – had said, presumably to provoke discussion:

A blog post doesn’t need a title.

Terence disagrees, saying:

In a literal sense, he was wrong. The HTML specification makes it clear that the <title> element is mandatory. All documents have title.

But I think that’s an overreach. After all, where is it written that a blog must be presented in HTML?

Non-HTML blogs

There are plenty of counter-examples already in existence, of course:

But perhaps we can do better…

A totally text/plain blog

We’ve looked at plain text, which as a format clearly does not have to have a title. Let’s go one step further and implement it. What we’d need is:

  1. A webserver configured to deliver plain text files by preference, e.g. by adding directives like index index.txt; (for Nginx).5
  2. An index page listing posts by date and URL. Most browsers won’t render these as “links” so users will have to copy-paste or re-type them, so let’s keep them short,
  3. Pages for each post at those URLs, presumably without any kind of “title” (just to prove a point), and
  4. An RSS feed: usually I use RSS as shorthand for all feed types, but this time I really do mean RSS and not e.g. Atom because RSS, strangely, doesn’t require that an <item> has a <title>!

I’ve implemented it! It’s at textplain.blog.

textplain.blog in Lynx
Unlike other sites, I didn’t need to test textplain.blog in Lynx to know it’d work well. But I did anyway.

In the end I decided it’d benefit from being automated as sort-of a basic flat-file CMS, so I wrote it in PHP. All requests are routed by the webserver to the program, which determines whether they’re a request for the homepage, the RSS feed, or a valid individual post, and responds accordingly.
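The real thing is open-sourced (linked below), but the core of such a front controller needs barely more than this hypothetical sketch (the URL scheme and file layout here are invented for illustration):

<?php
// Hypothetical flat-file text/plain blog router; the real, open-sourced
// code differs, but the shape of the logic is the same.
header('Content-Type: text/plain; charset=utf-8');

$path  = trim(parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH), '/');
$posts = glob(__DIR__ . '/posts/*.txt'); // each post is a single text file

if ($path === '') {
  // Homepage: list posts by date and (short!) URL.
  foreach ($posts as $file) {
    printf("%s https://textplain.blog/%s\n",
      date('Y-m-d', filemtime($file)), basename($file, '.txt'));
  }
} elseif ($path === 'feed.rss') {
  // RSS – not Atom! – because an RSS <item> doesn't require a <title>.
  header('Content-Type: application/rss+xml; charset=utf-8');
  echo '<rss version="2.0"><channel><title>textplain.blog</title>',
       '<link>https://textplain.blog/</link><description>Posts</description>';
  foreach ($posts as $file) {
    printf('<item><link>https://textplain.blog/%s</link><description><![CDATA[%s]]></description></item>',
      basename($file, '.txt'), file_get_contents($file));
  }
  echo '</channel></rss>';
} elseif (preg_match('/^[a-z0-9-]+$/', $path) && is_file(__DIR__ . "/posts/$path.txt")) {
  readfile(__DIR__ . "/posts/$path.txt"); // an individual post: no title!
} else {
  http_response_code(404);
  echo "No such post.\n";
}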

It annoys me that feed discovery doesn’t work nicely when using a Link: header, at least not in any reader I tried. But apart from that, it seems pretty solid, despite its limitations. Is this, perhaps, an argument for my .well-known/feeds proposal?

Anyway, I’ve open-sourced the entire thing in case it’s of any use to anybody at all, which is admittedly unlikely! Here’s the code.

Footnotes

1 no-ht.ml technically does use HTML, but the same content could easily be delivered with an appropriate non-HTML MIME type if he’d wanted.

2 Again, I suppose this technically required HTML, even if what was delivered was an empty file!

3 Gemtext is basically Markdown, and doesn’t require a title.

4 Plain text obviously doesn’t require a title.

5 There’s no requirement that default files served by webservers are HTML, although it’s highly-unusual for that not to be the case.

5 Cool Apps for your Unraid NAS

I’ve got a (now four-year-old) Unraid NAS called Fox and I’m a huge fan. I particularly love the fact that Unraid can work not only as a NAS, but also as a fully-fledged Docker appliance, enabling me to easily install and maintain all manner of applications.

A cube-shaped black computer sits next to a battery pack on a laminated floor. A sign has been left atop it, reading "Caution: Generator connected to this installation."
There isn’t really a generator attached to Fox, just a UPS battery backup. The sign was liberated from our shonky home electrical system.

I was chatting this week to a colleague who was considering getting a similar setup, and he seemed to be taking notes of things he might like to install, once he’s got one. So I figured I’d round up five of my favourite things to install on an Unraid NAS that:

  1. Don’t require any third-party accounts (low dependencies),
  2. Don’t need any kind of high-powered hardware (low specs), and
  3. Provide value with very little set up (low learning curve).
Dan, his finger to his lips and his laptop on his knees, makes a "shush" action. A coworker can be seen working behind him.
It’d have been cooler if I’d have secretly written this blog post while sitting alongside said colleague (shh!). But sadly it had to wait until I was home.

Here we go:

Syncthing

I’ve been raving about Syncthing for years. If I had an “everyday carry” list of applications, it’d be high on that list.

Syncthing screenshot for computer Rebel, sharing with Fox, Idiophone, Lemmy and Maxine.
Syncthing’s just an awesome piece of set-and-forget software that facilitates file synchronisation between all of your devices and can also form part of a backup strategy.

Here’s the skinny: you install Syncthing on several devices, then give each the identification key of another to pair them. Now you can add folders on each and “share” them with the others, and the two are kept in-sync. There’s lots of options for power users, but just as a starting point you can use this to:

  • Manage the photos on your phone and push copies to your desktop whenever you’re home (like your favourite cloud photo sync service, but selfhosted).
  • Keep your Obsidian notes in-sync between all your devices (normally costs $4/month).1
  • Get a copy of the documents from all your devices onto your NAS, for backup purposes (note that sync’ing alone, even with versioning enabled, is not a good backup: the idea is that you run an actual backup from your NAS!).

Huginn

You know IFTTT? Zapier? Services that help you to “automate” things based on inputs and outputs. Huginn’s like that, but selfhosted. Also: more-powerful.

Screenshot showing Huginn workflows.
When we first started looking for a dog to adopt (y’know, before we got this derper), I set up Huginn watchers to monitor the websites of several rescue centres, filter them by some of our criteria, and push the results to us in real-time on Slack, giving us an edge over other prospective puppy-parents.

The learning curve is steeper than anything else on this list, and I almost didn’t include it for that reason alone. But once you’ve learned your way around its idiosyncrasies and dipped your toe into the more-advanced JavaScript-powered magic it can do, you really begin to unlock its potential.

It couples well with Home Assistant, if that’s your jam. But even without it, you can find yourself automating things you never expected to.

FreshRSS

I’ve written a lot about how and why FreshRSS continues to be my favourite RSS reader. But you know what’s even better than an awesome RSS reader? An awesome selfhosted RSS reader!

FreshRSS screenshot.
Yes, I know I have a lot of “unread” items. That’s fine, and I can tell you why.

Many of these suggested apps benefit well from you exposing them to the open Web rather than just running them on your LAN, and an RSS reader is probably the best example (you want to read your news feeds when you’re out and about, right?). What you need for that is a reverse proxy, and there are lots of guides to doing it super-easily, even if you’re not on a static IP address2. Alternatively you can just VPN in to your home: your router might be able to arrange this, or else Unraid can do it for you!

Open Trashmail

You know how sometimes you need to give somebody your email address but you don’t actually want to? Like: sure, I’d like you to email me a verification code for this download, but I don’t trust you not to spam me later! What you need is a disposable email address.3

Open Trashmail screenshot showing a subscription to Thanks for subscribing to Dan Q's Spam-Of-The-Hour List!
How do you feel about having infinite email addresses that you can make up on-demand (without even having access to a computer), subscribe to by RSS, and never have to see unless you specifically want to?

You just need to install Open Trashmail, point the MX records of a few domain names or subdomains (you’ve got some spare domain names lying around, right? If not, they’re pretty cheap…) at it, and it will now accept email to any address on those domains. You can make up addresses off the top of your head, even away from an Internet connection when using a paper-based form, and they work. You can check them later if you want to… or ignore them forever.

Couple it with an RSS reader, or Huginn, or Slack, and you can get a notification or take some action when an email arrives!

  • Need to give that escape room your email address to get a copy of your “team photo”? Give them a throwaway, pick up the picture when you get home, and then forget you ever gave it to them.
  • Company gives you a freebie on your birthday if you sign up to their mailing list? Sign up 366 times with them and write a Huginn workflow that puts “today’s” promo code into your Obsidian notetaking app (sync’d over Syncthing) but filters out everything else.
  • Suspect some organisation is selling your email address on to third parties? Give them a unique email address that you only give to them and catch them in a honeypot.

YOURLS

Finally: a URL shortener. The Internet’s got lots of them, but they’re all at the mercy of somebody else (potentially somebody in a country that might not be very-friendly with yours…).

YOURLS screenshot (Your Own URL Shortener).
It isn’t pretty, but… it doesn’t need to be! Nobody actually sees the admin interface except you anyway.

Plus, it’s just kinda cool to be able to brand your shortlinks with your own name, right? If you follow only one link from this post, let it be to watch this video that helps explain why this is important: danq.link/url-shortener-highlights.

I run many, many other Docker containers and virtual machines on my NAS. These five aren’t even the “top five” that I use… they’re just five that are great starters because they’re easy and pack a lot of joy into their learning curve.

And if your NAS can’t do all the above… consider Unraid for your next NAS!

Footnotes

1 I wrote the beginnings of this post on my phone while in the Channel Tunnel and then carried on using my desktop computer once I was home. Sync is magic.

2 I can’t share or recommend one reverse proxy guide in particular, because I set my own up from scratch (I can configure Nginx in my sleep), but I did a quick search and found several that all look good, so I imagine you can do the same. You don’t have to do it on day one, though!

3 Obviously there are lots of approaches to on-demand disposable email addresses, including the venerable “plus sign in a GMail address” trick, but Open Trashmail is just… better for many cases.


AI isn’t useless. But is it worth it?

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Molly White writes, more-eloquently than I would’ve, almost-exactly my experience of LLMs and similar modern generative AIs:

I, like many others who have experimented with or adopted these products, have found that these tools actually can be pretty useful for some tasks. Though AI companies are prone to making overblown promises that the tools will shortly be able to replace your content writing team or generate feature-length films or develop a video game from scratch, the reality is far more mundane: they are handy in the same way that it might occasionally be useful to delegate some tasks to an inexperienced and sometimes sloppy intern.

Very much this.

I’ve experimented with a handful of generative AIs, such as:

  • GPT-3.5 / ChatGPT, for proofreading, summarisation, experimental rephrasing when writing, and idea generation. I’ve found it to be moderately good at summarisation and proofreading and pretty terrible at producing anything novel without sounding completely artificial and/or getting lost in a hallucination.
  • Bing for coalescing information. I like that it cites its sources. I dislike that it somehow still hallucinates. I might use it, I suppose, to help me re-phrase a search query where I can’t remember the word I’m looking for.
  • Stable Diffusion for image generation. I’ve found it most-useful in image-to-image mode, for making low-effort concept art in bulk. For example, when running online roleplaying games for friends I’ve fed it an image of, say, a skeleton warrior and asked it to make me a few dozen more in a similar style, so as to provide a diverse selection of distinct tokens1. Its completely-original2 work lands squarely in the uncanny valley, though.
  • GitHub Copilot for code assistance. I’ve not tried its “chat”-powered functionality but I quite enjoy its “autocomplete” tool. When I’m coding and I forget the syntax of the command I’m typing, or need to stop and think for a moment about “what comes next”, it’s often there with the answer. I’ve even made use of the “write the comment describing what the code will do, let Copilot suggest the code for you” paradigm (though I’ve been pretty disappointed with the opposite approach: it doesn’t write great comments!). I find Copilot to be a lot like having an enthusiastic, eager-to-please, very well-read but somewhat naive junior programmer sitting beside me. If I ask them for some pairing assistance, they’re great, but I can’t trust them to do anything that I couldn’t do for myself!
  • And surely others that I’ve since forgotten.

Most-recently, I’ve played with music-making AI Suno and… it’s not great… but like all these others it’s really interesting to experiment with and think about. Here: let me just ask it to write some “vocal trance europop about a woman called Molly; Molly has a robot friend who is pretty good at doing many tasks, but the one thing she’ll never trust the robot to do is write in her blog” –

AI-generated content:

AI-generated image of a boxy yellow humanoid robot sitting at a Macbook, typing.
[Verse]
Molly’s got a robot that she keeps by her side
It’s pretty handy
It can do many things right
But there’s one task she won’t let it touch
It’s true
The robot can’t write in her blog
No
That won’t do
[Verse 2]
This robot can clean the house
Sweep it up real nice
It can even cook a meal
Add some spice
But when it comes to sharing her thoughts on the screen
Molly won’t trust the robot
That’s her routine
[Chorus]
Molly’s robot friend
By her side all day and night
There’s one thing it can’t do
Can’t write her blog right
She’ll keep typing away
Sharing her heart and soul
That’s something the robot can’t do
It has no control

So yes, like Molly:

  • I’m absolutely a believer that these kinds of AIs have some value,
  • I’ve been reluctant and slow to say so because they seem to be such a polarising issue that it’s hard to say that you belong to neither “camp”,
  • I’m not entirely convinced that for the value they provide they’ve yet proven to be worth their cost, and I’m not certain that for general-purpose generation they will be any time soon, and
  • I’ve never used AI to write content for my blog, and I can’t see that ever changing.

It’s still an interesting field to follow-along with. Stuff like Sora from OpenAI and VASA-1 from Microsoft are just scary (the latter seems to have little purpose other than for misinformation-generation3!), but the genie’s out of the bottle now.

Footnotes

1 Visually-distinct tokens adds depth to the world and helps players communicate with one another: “You distract the skinny cultist, and I’ll try to creep up on the ugly one!”

2 I’m going to gloss right over the question of whether or not these tools are capable of creating anything truly original. You know what I mean.

3 Gotta admit though that I laughed like a drain at the Mona Lisa singing along with Anne Hathaway’s Lil’ Wayne Style Paparazzi Rap. If you’ve not seen the thing I’m talking about, go do that now.

RSS > ActivityPub

RSS is better than ActivityPub1.

Photograph of a boxing match, but with the heads of the competitors replaced with the ActivityPub and RSS logos (and "AP" or "RSS" written on their clothes, respectively). RSS is delivering a powerful uppercut to ActivityPub.
A devastating blow by RSS against a competitor 19 years his junior! For updates on this bout as it develops, don’t forget to subscribe… using either protocol.

When I subscribe to content, I want:

  • Resilient failsafes. ActivityPub has many points-of-failure. A notification might fail to complete transmission as a result of downtime, faults, or network conditions, and the receiving server might never know. A feed reader, conversely, can tell you that an address 404’d or the server was down.
  • Retroactive access. Once you fix the problem above… you still don’t get the message you missed: it’s probably gone forever – there’s no retroactive access. The same is true when your ActivityPub server connects with a peer for the first time: you only ever get new content after that point. RSS, on the other hand, provides some number of “recent” items the moment you first subscribe.
  • Simple subscriptions. RSS can be served from a statically-hosted single file, which makes it suitable to deploy anywhere as well as consume using anything. It can be read, after a fashion, in anything from Lynx upwards.

RSS ticks all these boxes. If I can choose between RSS and ActivityPub to subscribe to your content, and I don’t need a real-time update, I’m probably going to choose RSS.

About a month later, Matthias Pfefferle wrote a great post that makes a good “next stop” if you’re on a deep dive…

Footnotes

1 I feel like this statement needs a few clarifications and caveats, but my hot take looks spicier if I bury them in a footnote!

  • By RSS, I mean whichever pull-based format over basic HTTP you like, be that Atom, JSON Feed, h-entry, or even just properly-marked-up HTML5: did you know that the <article> element is intended to be suitable for syndication use?
  • Obviously I appreciate that RSS and ActivityPub are different tools for different jobs, and there are doubtless use-cases for which ActivityPub is clearly the superior solution.
  • I certainly don’t object to services providing both RSS and ActivityPub as syndication options, like Mastodon does, where both might be good choices.


Sonarr > Huginn > Slack

I use a tool called Sonarr to, uhh1, keep track of when new episodes of television shows are released, regardless of what platform they’re on (Netflix, Prime, iPlayer, whatever) and notify me so I remember to watch it.

For several years, I’ve used IFTTT as the intermediary, receiving webhooks from Sonarr and translating them for Slack:

A series of webhooks sent from Sonarr to IFTTT to Slack.
This worked for years, but it’s time to retire it.

IFTTT‘s move to kill its Legacy Pro plan2 – which I was on – gave me reason to re-assess this configuration. It turns out that the only Pro feature I was using was an IFTTT “filter” to convert the Sonarr webhooks to a Slack-friendly format.

Given that I’m running an installation of Huginn on my home network anyway, I resolved to re-implement this flow in Huginn and cancel my IFTTT subscription.

A series of webhooks sent from Sonarr to Huginn to Slack.
Raven-powered automation is the new hotness.

This turned out to be so easy I wonder why I never did it before.

First, I created a Webhook Agent and gave the URL to Sonarr.

Then I connected that to a Slack Agent with the following configuration:

{
  "webhook_url": "https://hooks.slack.com/services/...",
  "channel": "#sonarr",
  "username": "Sonarr",
  "message": "*<https://thetvdb.com/?tab=series&id={{series.tvdbId}}|{{series.title}}>*\nNew episodes:{% for episode in episodes %}\n• S{{episode.seasonNumber}}E{{episode.episodeNumber}} {{episode.title}}{% endfor %}",
  "icon": ":tv:"
}
I’ve omitted my Slack webhook URL so you don’t spam me. I tried for far too long to get the pluralize filter to work so it’d say “episode” or “episodes” as appropriate before realising I didn’t care enough and gave up.

Then all I needed to do was re-emit some of the previous webhooks to test it:

Slack chat window showing notifications: (1) a new episode of Resident Alien, announced by IFTTT, (b) the same episode, announced by "Sonarr", (c) two episodes of Marvel's Spidey and His Amazing Friends, also announced by "Sonarr".
As a bonus, I swapped out the IFTTT logo for Slackmoji’s :tv: icon and added the “Sonarr” username, as shown in my code sample.

Now I’ll continue to know when there’s new television to watch3!

I love the power and flexibility that Huginn provides to help automate your life. It does many of the things that I used to do with a handful of cron jobs and shell scripts, but all in one convenient place.

Footnotes

1 I’ve heard there are other uses for the tool. Your mileage may vary. Don’t forget to pay for your content, if possible.

2 Like many others, I originally signed up to the plan under the promise that the price would be honoured forever. Turns out “forever” means “three years”: who knew?

3 It’s especially useful when you’re between seasons or a show is on hiatus to be reminded that it’s back and I should go and watch it. Hey, there’s a thought: I wonder if I can extract the subtitles from shows and run them through a summarising LLM to give me a couple of paragraphs reminding me “what happened last series” if the show’s been on a long break?


Oldest Digital Photo… of Me

Some younger/hipper friends tell me that there was a thing going around on Instagram this week where people post photos of themselves aged 21.

I might not have any photos of myself aged 21! I certainly can’t find any digital ones…

Dan, aged 22, stands in a cluttered flat with his partner Claire and several members of Dan's family.
The closest I can manage is this photo from 23 April 2003, when I was 22 years old.

It must sound weird to young folks nowadays, but prior to digital photography going mainstream in the 2000s (thanks in big part to the explosion of popularity of mobile phones), taking a photo took effort:

  • Most folks didn’t carry their cameras everywhere with them, ready-to-go, so photography was much more-intentional.
  • A roll of film only had the capacity for around 24 photos before you’d need to buy a new one and swap it out (which took much longer than swapping a memory card).
  • You couldn’t even look at the photos you’d taken until they were developed, which you couldn’t do until you finished the roll of film and which took at least hours – more-realistically days – and incurred an additional cost.

I didn’t routinely take digital photos until after Claire and I got together in 2002 (she had a digital camera, with which the photo above was taken). My first cameraphone – I was a relatively early-adopter – was a Nokia 7650, bought late that same year.

It occurs to me that I take more photos in a typical week nowadays, than I took in a typical year circa 2000.

Monochrome photo of a toddler, smiling broadly, pointing at the camera.
The oldest analogue photo of me that I own was taken on 2 October 1982, when I was 22 months old.

This got me thinking: what’s the oldest digital photo that exists of me? So I went digging.

I might not have owned a digital camera in the 1990s, but my dad’s company owned one with which to collect pictures when working on-site. It was a Sony MVC-FD7, a camera most-famous for its quirky use of 3½” floppy disks as media (this was cheap and effective, but meant the camera was about the size and weight of a brick and took about 10 seconds to write each photo from RAM to the disk, during which it couldn’t do anything else).

In Spring 1998, almost 26 years ago, I borrowed it and took, among others, this photo:

Dan aged 17 - a young white man with platinum blonde shoulder-length hair - stands in front of a pink wall, holding up a large, boxy digital camera.
I’m aged 17 in what’s probably the oldest surviving digital photo of me, looking like a refugee from Legoland in 640×480 glorious pixels.

I’m confident a picture of me was taken by a Connectix QuickCam (an early webcam) in around 1996, but I can’t imagine it still exists.

So unless you’re about to comment to tell me otherwise and have an older picture of me: that snap of me taking my own photo with a bathroom mirror is the oldest digital photo of me that exists.


Netscape’s Untold Webstories

I mentioned yesterday that during Bloganuary I’d put non-Bloganuary-prompt post ideas onto the backburner, and considered extending my daily streak by posting them in February. Here’s part of my attempt to do that:

Let’s take a trip into the Web of yesteryear, with thanks to our friends at the Internet Archive’s WayBack Machine.

The page we’re interested in used to live at http://www.netscape.com/comprod/columns/webstories/index.html, and promised to be a showcase for best practice in Web development. Back in October 1996, it looked like this:

Screenshot from Netscape Columns: Web Site Stories: a Coming Soon page which says "The series is scheduled to debut in November."

The page is a placeholder for Netscape Webstories (or Web Site Stories, in some places). It’s part of a digital magazine called Netscape Columns which published pieces written by Marc Andreessen, Jim Barksdale, and other bigwigs in the hugely-influential pre-AOL-acquisition Netscape Communications.

This new series would showcase best practice in designing and building Web sites1, giving a voice to the technical folks best-placed to speak on that topic. That sounds cool!

Those white boxes above and below the paragraph of text aren’t missing images, by the way: they’re horizontal rules, using the little-known size attribute to specify a thickness of <hr size=4>!2

Certainly you’re excited by this new column and you’ll come back in November 1996, right?

Screenshot from Netscape Columns: Web Site Stories: a Coming Soon page which says "The series is scheduled to begin in January."

Oh. The launch has been delayed, I guess. Now it’s coming in January.

The <hr>s look better now their size has been reduced, though, so clearly somebody’s paying attention to the page. But let’s take a moment and look at that page title. If you grew up writing web pages in the modern web, you might anticipate that it’s coded something like this:

<h2 style="font-variant: small-caps; text-align: center;">Coming Soon</h2>

There’s plenty of other ways to get that same effect. Perhaps you prefer font-feature-settings: 'smcp' in your chosen font; that’s perfectly valid. Maybe you’d use margin: 0 auto or something to centre it: I won’t judge.

But no, that’s not how this works. The actual code for that page title is:

<center>
  <h2>
    <font size="+3">C</font>OMING
    <font size="+3">S</font>OON
  </h2>
</center>

Back when this page was authored, we didn’t have CSS3. The only styling elements were woven right in amongst the semantic elements of a page4. It was simple to understand and easy to learn… but it was a total mess5.

Anyway, let’s come back in January 1997 and see what this feature looks like when it’s up-and-running.

Screenshot from Netscape Columns: Web Site Stories: a Coming Soon page which says "The series is scheduled to begin in the spring."

Nope, now it’s pushed back to “the spring”.

Under Construction pages were all the rage back in the nineties. Everybody had one (or several), usually adorned with one or more of about a thousand different animated GIFs for that purpose.6

Rotating animated "under construction" banner.

Building “in public” was an act of commitment, a statement of intent, and an act of acceptance of the incompleteness of a digital garden. They’re sort-of coming back into fashion in the interpersonal Web, with the “garden and stream” metaphor7 taking root. This isn’t anything new, of course – Mark Bernstein touched on the concepts in 1998 – but it’s not something that I can ever see returning to the “serious” modern corporate Web: but if you’ve seen a genuine, non-ironic “under construction” page published to a non-root page of a company’s website within the last decade, please let me know!

Under construction banner with an animated yellow-and-black tape banner between two "men at work" signs.

RSS doesn’t exist yet (although here’s a fun fact: the very first version of RSS came out of Netscape!). We’re just going to have to bookmark the page and check back later in the year, I guess…

Screenshot from Netscape Columns: Web Site Stories: a Coming Soon page identical to the previous version but with a search box ("To search the Netscape Columns, type a word or phrase here:") beneath.

Okay, so February clearly isn’t Spring, but they’ve updated the page… to add a search form.

It’s a genuine <form> tag, too, not one of those old-fashioned <isindex> tags you’d still sometimes find even as late as 1997. Interestingly, it specifies enctype="application/x-www-form-urlencoded". Today’s developers probably don’t think about the enctype attribute except when they’re doing a form that handles file uploads and they know they need to switch it to enctype="multipart/form-data", (or their framework does this automatically for them!).

But these aren’t the only options, and some older browsers at this time still defaulted to enctype="text/plain". So long as you’re using a GET and not a POST method, the distinction is mostly academic, but if your backend CGI program anticipates that special characters will come-in encoded, back then you’d be wise to specify that you wanted URL-encoding or you might get a nasty surprise when somebody turns up using LMB or something equally-exotic.
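To make the distinction concrete, here’s how the same form field arrives in a POST body under each encoding (demonstrated with modern PHP, purely for illustration):

<?php
$fields = ['query' => 'fish & chips?'];

// application/x-www-form-urlencoded: special characters arrive escaped.
echo http_build_query($fields), "\n"; // query=fish+%26+chips%3F

// text/plain: raw name=value lines, ampersands and all – exactly the
// "nasty surprise" awaiting a CGI program that expected encoded input.
foreach ($fields as $name => $value) {
  echo "$name=$value\n";              // query=fish & chips?
}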

Anyway, let’s come back in June. The content must surely be up by now:

Screenshot from Netscape Columns: Web Site Stories: a Coming Soon page which says "The series is scheduled to begin in August."

Oh come on! Now we’re waiting until August?

At least the page isn’t abandoned. Somebody’s coming back and editing it from time to time to let us know about the still-ongoing series of delays. And that’s not a trivial task: this isn’t a CMS. They’re probably editing the .html file itself in their favourite text editor, then putting the appropriate file:// address into their copy of Netscape Navigator (and maybe other browsers) to test it, then uploading the file – probably using FTP – to the webserver… all the while thanking their lucky stars that they’ve only got the one page they need to change.

We didn’t have scripting languages like PHP yet, you see8. We didn’t really have static site generators. Most servers didn’t implement server-side includes. So if you had to make a change to every page on a site, for example editing the main navigation menu, you’d probably have to open and edit dozens or even hundreds of pages. Little wonder that framesets caught on, despite their (many) faults, with their ability to render your navigation separately from your page content.
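For the sake of illustration, here’s a minimal frameset – with hypothetical filenames, of course:

<frameset rows="64,*">
  <frame src="nav.html" name="nav">
  <frame src="content.html" name="content">
  <noframes>
    <body>This site requires a frames-capable browser.</body>
  </noframes>
</frameset>

Links in nav.html would specify target="content" so that pages loaded into the lower frame, meaning the menu lived in exactly one file: edit it once, and every “page” of the site picks up the change.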

Okay, let’s come back in August I guess:

Screenshot from Netscape Columns: Web Site Stories: a Coming Soon page which says "The series is scheduled to begin in the spring." Again.

Now we’re told that we’re to come back… in the Spring again? That could mean Spring 1998, I suppose… or it could just be that somebody accidentally re-uploaded an old copy of the page.

Hey: the footer’s gone too? This is clearly a partial re-upload: somebody realised they were accidentally overwriting the page with the previous-but-one version, hit “cancel” in their FTP client (or yanked the cable out of the wall), and assumed that they’d successfully stopped the upload before any damage was done.

They had not.

Screenshot of a Windows 95 dialog box, asking "Are you sure you want to delete index.html?" The cursor hovers over the "Yes" button.

I didn’t mention that top menu, did I? It looks like it’s a series of links, styled to look like flat buttons, right? But you know that’s not possible because you can’t rely on having the right fonts available: plus you’d have to do some <table> trickery to lay it out, at which point you’d struggle to ensure that the menu was the same width as the banner above it. So how did they do it?

The menu is what’s known as a client-side imagemap. Here’s what the code looks like:

<a href="/comprod/columns/images/nav.map">
  <img src="/comprod/columns/images/websitestories_ban.gif" width=468 height=32 border=0 usemap="#maintopmap" ismap>
</a><map name="maintopmap">
  <area coords="0,1,92,24" href="/comprod/columns/mainthing/index.html">
  <area coords="94,1,187,24" href="/comprod/columns/techvision/index.html">
  <area coords="189,1,278,24" href="/comprod/columns/webstories/index.html">
  <area coords="280,1,373,24" href="/comprod/columns/intranet/index.html">
  <area coords="375,1,467,24" href="/comprod/columns/newsgroup/index.html">
</map>

The image (which specifies border=0 because back then the default behaviour for graphical browsers was to put a thick border around images within hyperlinks) says usemap="#maintopmap" to cross-reference the <map> below it, which defines rectangular areas on the image and where they link to, if you click them! This ingenious and popular approach meant that you could transmit a single image – saving on HTTP round-trips, which were relatively time-consuming before widespread adoption of HTTP/1.1’s persistent connections – along with a little metadata to indicate which pixels linked to which pages.

The ismap attribute is provided as a fallback for browsers that didn’t yet support client-side image maps but did support server-side image maps: there were a few! When you put ismap on an image within a hyperlink, then when the image is clicked on the href has appended to it a query parameter of the form ?123,456, where those digits refer to the horizontal and vertical coordinates, from the top-left, of the pixel that was clicked on! These could then be decoded by the webserver via a .map file or handled by a CGI program. Server-side image maps were sometimes used where client-side maps were undesirable, e.g. when you want to record the actual coordinates picked in a spot-the-ball competition or where you don’t want to reveal in advance which hotspot leads to what destination, but mostly they were just used as a fallback.9
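To make that fallback concrete (with an invented CGI path, purely for illustration), a server-side imagemap needs nothing more than an enclosing link and the ismap attribute:

<a href="/cgi-bin/nav.cgi">
  <img src="menu.gif" width=468 height=32 border=0 ismap>
</a>
<!-- clicking 150 pixels across, 12 pixels down requests /cgi-bin/nav.cgi?150,12 -->

The program (or the .map file consulted by the webserver) matches those coordinates against its list of regions and responds, typically with a redirect to the corresponding page.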

Both client-side and server-side image maps still function in every modern web browser, but I’ve not seen them used in the wild for a long time, not least because they’re hard (maybe impossible?) to make accessible and they can’t cope with images being resized, but also because nowadays if you really wanted to make a navigation “image” you’d probably cut it into a series of smaller images and make each its own link.
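That sliced-up approach might look like this – a hypothetical sketch re-using the section slugs from Netscape’s menu, with image filenames and alt text of my own invention – and each segment gets its own alt text, which side-steps the accessibility problem:

<a href="/comprod/columns/mainthing/index.html"><img src="nav-mainthing.gif" alt="The Main Thing" border=0></a>
<a href="/comprod/columns/techvision/index.html"><img src="nav-techvision.gif" alt="Technology Vision" border=0></a>
<a href="/comprod/columns/webstories/index.html"><img src="nav-webstories.gif" alt="Web Site Stories" border=0></a>

(In practice you’d also need to take care that whitespace between the images didn’t render as visible gaps in the menu.)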

Anyway, let’s come back in October 1997 and see if they’ve fixed their now-incomplete page:

Screenshot from Netscape Columns: Web Site Stories: the Coming Soon page is now laid out in two columns, but the expected launch date has been removed.

Oh, they have! From the look of things, they’ve re-written the page from scratch, replacing the version that got scrambled by that other employee. They’ve swapped out the banner and menu for a new design, replaced the footer, and now the content’s laid out in a pair of columns.

There’s still no reliable CSS, so you’re not looking at the columns: property (no implementations until 2014) nor at display: flex (2010) here. What you’re looking at is… a fixed-width <table> with a single row and three columns! Yes: three – the middle column is only 10 pixels wide and provides the “gap” between the two columns of text.10
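Reconstructed from the rendered page – so treat the exact attribute values as illustrative rather than verbatim – the structure is something like:

<table border=0 cellpadding=0 cellspacing=0 width=612>
  <tr>
    <td width=301 valign=top>Want to create the best possible web site? ...</td>
    <td width=10>&nbsp;</td><!-- the 10-pixel "gutter" -->
    <td width=301 valign=top>Members of the Netscape web site team, ...</td>
  </tr>
</table>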

This wasn’t Netscape’s only option, though. Did you ever hear of the <multicol> tag? It was the closest thing the early Web had to a semantically-sound, progressively-enhanced multi-column layout! The author of this page could have written this:

<multicol cols=2 gutter=10 width=301>
  <p>
    Want to create the best possible web site? Join us as we explore the newest
    technologies, discover the coolest tricks, and learn the best secrets for
    designing, building, and maintaining successful web sites.
  </p>
  <p>
    Members of the Netscape web site team, recognized designers, and technical
    experts will share their insights and experiences in Web Site Stories. 
  </p>
</multicol>

That would have given them the exact same effect, but with less code and it would have degraded gracefully. Browsers ignore tags they don’t understand, so a browser without support for <multicol> would have simply rendered the two paragraphs one after the other. Genius!

So why didn’t they? Probably because <multicol> only ever worked in Netscape Navigator.

Introduced in 1996 for version 3.0, this feature was absolutely characteristic of the First Browser War. The two “superpowers”, Netscape and Microsoft, both engaged in unilateral changes to the HTML specification, adding new features and launching them without announcement in order to try to get the upper hand over the other. Both sides would often refuse to implement one-another’s new tags unless they were forced to by widespread adoption by page authors, instead promoting their own competing mechanisms11.

Between adding this new language feature to their browser and writing this page, Netscape’s market share had fallen from around 80% to around 55%, and most of their losses were picked up by IE. Using <multicol> would have made their page look worse in Microsoft’s hot up-and-coming browser, which wouldn’t have helped them persuade more people to download a copy of Navigator and certainly wouldn’t be a good image on a soon-to-launch (any day now!) page about best-practice on the Web! So Netscape’s authors opted for the dominant, cross-platform solution on this page12.

Anyway, let’s fast-forward a bit and see this project finally leave its “under construction” phase and launch!

Screenshot showing the homepage of Netscape Columns from 15 February 1998; the first recorded copy NOT to have a header link to the Webstories / Web Site Stories page.

Oh. It’s gone.

Sometime between October 1997 and February 1998 the long promised “Web Site Stories” section of Netscape Columns quietly disappeared from the website. Presumably, it never published a single article, instead remaining a perpetual “Coming Soon” page right up until the day it was deleted.

I’m not sure if there’s a better metaphor for Netscape’s general demeanour in 1998 – the year in which they finally ceased to be the dominant market leader in web browsers – than the quiet deletion of a page about how Netscape customers are making the best of the Web. This page might not have been important, or significant, or even completed, but its disappearance may represent Netscape’s refocus on trying to stay relevant in the face of existential threat.

Of course, Microsoft won the First Browser War. They did so by pouring a fortune’s worth of developer effort into staying technologically one-step ahead, refusing to adopt standards proposed by their rival, and their unprecedented decision to give away their browser for free13.

Footnotes

1 Yes, we used to write “Web sites” as two words. We also used to consistently capitalise the words Web and Internet. Some of us still do so.

2 In case it’s not clear, this blog post is going to be as much about little-known and archaic Web design techniques as it is about Netscape’s website.

3 This is a white lie. CSS was first proposed almost at the same time as the Web! Microsoft Internet Explorer was first to deliver a partial implementation of the initial standard, late in 1996, but Netscape dragged their heels, perhaps in part because they’d originally backed a competing standard called JavaScript Style Sheets (JSSS). JSSS had a lot going for it: if it had enjoyed widespread adoption, for example, we’d have had the equivalent of CSS variables a full twenty years earlier! In any case, back in 1996 you definitely wouldn’t want to rely on CSS support.

4 Wondering where the text and link colours come from? <body bgcolor="#ffffff" text="#000000" link="#0000ff" vlink="#ff0000" alink="#ff0000">. Yes really, that’s where we used to put our colours.

5 Personally, I really loved the aesthetic Netscape touted when using Times New Roman (or whatever serif font was available on your computer: webfonts weren’t a thing yet) with temporary tweaks to font sizes, and I copied it in some of my own sites. If you look back at my 2018 blog post celebrating two decades of blogging, where I’ve got a screenshot of my blog as it looked circa 1999, you’ll see that I used exactly this technique for the ordinal suffixes on my post dates! On the same post, you’ll see that I somewhat replicated the “feel” of it again in my 2011 design, this time using a stylesheet.

6 There’s a whole section of Cameron’s World dedicated to “under construction” banners, and that’s a beautiful thing!

7 The idea of “garden and stream” is that you publish early and often, refining as you go, in your garden, which can act as an extension of whatever notetaking system you use already, but publish mostly “finished” content to your (chronological) stream. I see an increasing number of IndieWeb bloggers going down this route, but I’m not convinced that it’s for me.

8 Another white lie. PHP was released way back in 1995 and even the very first version supported something a lot like server-side includes, using the syntax <!--include /file/name.html-->. But it was a little computationally-intensive to run willy-nilly.

9 Server-side imagemaps are enjoying a bit of a renaissance on .onion services, whose visitors often keep JavaScript disabled, to make image-based CAPTCHAs. Simply show the visitor an image and describe the bit you want them to click on, e.g. “the blue pentagon with one side missing”, then compare the coordinates of the pixel they click on to the knowledge of the right answer. Highly-inaccessible, of course, but innovative from a purely-technical perspective.
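A minimal sketch of the trick – the endpoint and coordinates are invented for illustration – is just a server-side imagemap whose target script knows the answer:

<p>Click on the blue pentagon with one side missing:</p>
<a href="/captcha/check">
  <img src="/captcha/image.png" ismap alt="CAPTCHA image">
</a>
<!-- a click requests e.g. /captcha/check?210,88; the server tests whether -->
<!-- those coordinates fall inside the target region it generated earlier -->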

10 Nowadays, use of tables for layout – or, indeed, for anything other than tabular data – is very-much frowned upon: it’s often bad for accessibility and responsive design. But back before we had the features granted to us by the modern Web, it was literally the only way to get content to appear side-by-side on a page, and designers got incredibly creative about how they misused tables to lay out content, especially as browsers became more-sophisticated and began to support cells that spanned multiple rows or columns, tables “nested” within one another, and background images.

11 It was a horrible time to be a web developer: having to make hacky workarounds in order to make use of the latest features but still support the widest array of browsers. But I’d still take that over the horrors of rendering engine monoculture!

12 Or maybe they didn’t even think about it and just copy-pasted from somewhere else on their site. I’m speculating.

13 This turned out to be the master-stroke: not only did it partially-extricate Microsoft from their agreement with Spyglass Inc., who licensed their browser engine to Microsoft in exchange for a percentage of sales value, but once Microsoft started bundling Internet Explorer with Windows it meant that virtually every computer came with their browser factory-installed! This strategy kept Microsoft on top until Firefox and Google Chrome kicked-off the Second Browser War in the early 2010s. But that’s another story.


[Bloganuary] Uninvention

This post is part of my attempt at Bloganuary 2024. Today’s prompt is:

If you could un-invent something, what would it be?

Fucking cryptocurrency.

Industrial sprawl at sunset: countless tall chimneys belch smoke alongside crisscrossing power lines. In the smoke, the outline of "physical" Bitcoins can be seen.
To preempt the inevitable “well actually”: yes, I’m fully aware that there exist cryptocurrencies that have minimal environmental impact. I concede that those cryptocurrencies might only have all the other problems. Stop talking to me about how great you think Ripple is.

I remember when Bitcoin first appeared. A currency based on a ledger recorded in a shared blockchain sounded pretty cool from a technological standpoint, and so – as a technology enthusiast – I experimented with it.

I recall that I bought a couple of Bitcoin; I think they were about 50 pence each? It seemed like a “toy” currency; nothing that would ever attract any mainstream attention. After all: why would it? It’s less-anonymous than cash. It’s less-convenient than cards. It’s (even) less-widely-accepted than cheques. It somehow manages to be slower than everything. And crucially, without any government backing it can’t be used to settle a debt or pay your taxes1. The technology was interesting to me, but it had no real-world application.

Screengrab from BEEF Series 1, Episode 1, showing a held mobile phone showing a Bitcoin wallet's value crashing by 87%
When a conventional currency does something like this, we call it a catastrophe. When a cryptocurrency does it, we call it a Thursday.

Imagine my surprise when people started investing in the cryptocurrency and began accepting it in payment for things. I know a tulip economy when I see one, I figured, so I got rid of my “toy” Bitcoins when the price hit around £750 each2. Sure, it’d have been “smarter” to wait until it hit £45,000 each, but I genuinely thought the bubble was going to burst and, besides, I’d never wanted to get into that game to begin with: I was just playing about with an interesting bit of technology when suddenly half the world began talking about it.

The world taking cryptocurrencies seriously was the worst thing that ever happened to them. When they were just a toy, nobody “invested” in them. Nobody built planet-destroying mining rigs to compete to produce more of them. Nobody used them as a vehicle to make ransomware feasible or set up elaborate Ponzi schemes or get-rich-quick scams off the back of them.

(Fake) cryptolocker screenshot that implies that DanQ.me has encrypted your files and will only decrypt them if you send 1 EGX (Emma GoldCoin).
DanQ.me has encrypted all your files. As Emma GoldCoin is the only cryptocurrency I can get behind, I demand you send me 1 EGX to unlock them. (No, don’t go and check; I promise they’re encrypted! Just take my word on it!)

And yeah, with few exceptions (of which Emma GoldCoin is the best), cryptocurrencies not only provide a vehicle for scammers, do nothing to combat inequality (and potentially make it worse by tying it to the digital divide), and destroy the planet… but they generally don’t even achieve the promises they make of anonymous, decentralised, stable, utilitarian currencies.

I’m not going to deep-dive into everything that’s wrong with cryptocurrencies3 (and I’m not going near NFTs, but rest assured they’re even stupider). There’s plenty of more-eloquent people online who can explain it to you if you need to; start at Web3IsGoingGreat.com if you like.

So yeah, if we could just uninvent cryptocurrencies, or at least uninvent whatever it is the masses think they see in them, then that’d be just great, thanks.

Footnotes

1 Being legal tender and being useful to pay your taxes are the magic beans that make fiat currencies worth something.

2 Sometimes, people mistake me for somebody with any level of interest in cryptocurrency “investment”. After I’m done correcting their misapprehension, I enjoy pointing out that I made a 150,000% return-on-investment on cryptocurrencies and I still recommend against anybody getting involved in them.

3 If I can pick out just one pet hate, though, that trumps all the others: it’s the “cryptobros” who call cryptocurrencies “crypto”, as if that wasn’t a prefix that already had a plethora of better-established uses, all of which are undermined by the co-opting of their name. It’s somehow even worse than the idiots who shorten Wikipedia to “wiki”.


Length Extension Attack Demonstration (Video)

This post is also available as an article. So if you’d rather read a conventional blog post of this content, you can!

This is a video version of my blog post, Length Extension Attack. In it, I talk through the theory of length extension attacks and demonstrate an SHA-1 length extension attack against an (imaginary) website.

The video can also be found on: