Cable Car Marquee

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Was playing around with some HTML and made a cable car for my page. Hmh.

Beautiful. It feels like it ought to have been wrapped in a HTML Web Component, maybe called <cable-car>, with progressive enhancement bonus features (maybe it’ll only run during daylight hours? or when the wind isn’t too fast?)?

But really: I can’t fault this. Beautiful.

Enumerating Domains

I’ve just enumerated my personal domain names. There’s a lot fewer of them than there used to be!1

Anyway: here’s the list –

I think that’s all of them, but it’s hard to be sure…

Footnotes

1 Maybe I’ve finally shaken off my habit of buying a domain name for everything. Or maybe it’s just that I’ve embraced subdomains for more stuff. Probably the latter.

Digital Dustbusting

tl;dr: I’m tidying up and consolidating my personal hosting; I’ve made a little progress, but I’ve got a way to go – fortunately I’ve got a sabbatical coming up at work!

At the weekend, I kicked-off what will doubtless be a multi-week process of gradually tidying and consolidating some of the disparate digital things I run, around the Internet.

I’ve a long-standing habit of having an idea (e.g. gamebook-making tool Twinebook, lockpicking puzzle game Break Into Us, my Cheating Hangman game, and even FreeDeedPoll.org.uk!), deploying it to one of several servers I run, and then finding it a huge headache when I inevitably need to upgrade or move said server because there’s such an insane diversity of different things that need testing!

Screenshot from Cheating Hangman: I guessed an 'E', but when I guessed an 'O' I was told that there was one (the computer was thinking of 'CLOSE'), but now there isn't because it's switched to a different word that ends with 'E'.
My “cheating hangman” game spun out from my analysis of the hardest words for an optimal player to guess, which was in turn inspired by the late Nick Berry’s examination of optimal strategy.

I can simplify, I figured. So I did.

And in doing so, I rediscovered several old projects I’d neglected or forgotten about. I wonder if anybody’s still using any of them?

Hosting I’ve tidied so far…

  • Cheating Hangman is now hosted by GitHub Pages.
  • DNDle, my Wordle-clone where you have to guess the Dungeons & Dragons 5e monster’s stat block, is now hosted by GitHub Pages. Also, I fixed an issue reported a month ago that meant that I was reporting Giant Scorpions as having a WIS of 19 instead of 9.
  • Abnib, which mostly reminds people of upcoming birthdays and serves as a dumping ground for any Abnib-related shit I produce, is now hosted by GitHub Pages.
  • RockMonkey.org.uk, which doesn’t really do much any more, is now hosted by GitHub Pages.
  • EGXchange, my implementation of a digital wallet for environmentally-friendly cryptocurrency EmmaGoldCoin, which I’ve written about before, is now hosted by GitHub Pages.
  • Sour Grapes, the single-page promo for a (remote) murder mystery party I hosted during a COVID lockdown, is now hosted by GitHub Pages.
  • A convenience-page for giving lost people directions to my house is now hosted by GitHub Pages.
  • Dan Q’s Things is now automatically built on a schedule and hosted by GitHub Pages.
  • Robin’s Improbable Blog, which spun out from 52 Reflect, wasn’t getting enough traffic to justify “proper” hosting so now it sits in a Docker container on my NAS.
  • My μlogger server, which records my location based on pings from my phone, has also moved to my NAS. This has broken Find Dan Q, but I’m not sure if I’ll continue with that in its current form anyway.
  • All of my various domain/subdomain redirects have been consolidated on, or are in the process of moving to, to a tiny Linode/Akamai instance. It’s a super simple plain Nginx server that does virtually nothing except redirect people – this is where I’ll park the domains I register but haven’t found a use for yet, in future.
Screenshot showing EGXchange, saying "everybody has an EGX wallet, log in to yours now".
I was pretty proud of EGXchange.org, but I’ll be first to admit that it’s among the stupider of my throwaway domains.

It turns out GitHub pages is a fine place to host simple, static websites that were open-source already. I’ve been working on improving my understanding of GitHub Actions anyway as part of what I’ve been doing while wearing my work, volunteering, and personal hats, so switching some static build processes like DNDle’s to GitHub Actions was a useful exercise.

Stuff I’m still to tidy…

There’s still a few things I need to tidy up to bring my personal hosting situation under control:

DanQ.me

Screenshot showing this blog post.
You’re looking at it. But later this year, you might be looking at it… elsewhere?

This is the big one, because it’s not just a WordPress blog: it’s also a Gemini, Spartan, and Gopher server (thanks CapsulePress!), a Finger server, a general-purpose host to a stack of complex stuff only some of which is powered by Bloq (my WordPress/PHP integrations): e.g. code to generate the maps that appear on my geopositioned posts, code to integrate with the Fediverse, a whole stack of configuration to make my caching work the way I want, etc.

FreeDeedPoll.org.uk

Right now this is a Ruby/Sinatra application, but I’ve got a (long-running) development branch that will make it run completely in the browser, which will further improve privacy, allow it to run entirely-offline (with a service worker), and provide a basis for new features I’d like to provide down the line. I’m hoping to get to finishing this during my Automattic sabbatical this winter.

Screenshot showing freedeedpoll.org.uk
The website’s basically unchanged for most of a decade and a half, and… umm… it looks it!

A secondary benefit of it becoming browser-based, of course, is that it can be hosted as a static site, which will allow me to move it to GitHub Pages too.

Geohashing.site

When I took over running the world’s geohashing hub from xkcd‘s Randall Munroe (and davean), I flung the site together on whatever hosting I had sitting around at the time, but that’s given me some headaches. The outbound email transfer agent is a pain, for example, and it’s a hard host on which to apply upgrades. So I want to get that moved somewhere better this winter too. It’s actually the last site left running on its current host, so it’ll save me a little money to get it moved, too!

Screenshot from Geohashing.site's homepage.
Geohashing’s one of the strangest communities I’m honoured to be a part of. So it’d be nice to treat their primary website to a little more respect and attention.

My FreshRSS instance

Right now I run this on my NAS, but that turns out to be a pain sometimes because it means that if my home Internet goes down (e.g. thanks to a power cut, which we have from time to time), I lose access to the first and last place I go on the Internet! So I’d quite like to move that to somewhere on the open Internet. Haven’t worked out where yet.

Next steps

It’s felt good so far to consolidate and tidy-up my personal web hosting (and to rediscover some old projects I’d forgotten about). There’s work still to do, but I’m expecting to spend a few months not-doing-my-day-job very soon, so I’m hoping to find the opportunity to finish it then!

× × × × ×

WP Engine’s Three Problems

Duration

Podcast Version

This post is also available as a podcast. Listen here, download for later, or subscribe wherever you consume podcasts.

If you’re active in the WordPress space you’re probably aware that there’s a lot of drama going on right now between (a) WordPress hosting company WP Engine, (b) WordPress hosting company (among quite a few other things) Automattic1, and (c) the WordPress Foundation.

If you’re not aware then, well: do a search across the tech news media to see the latest: any summary I could give you would be out-of-date by the time you read it anyway!

Illustration showing relationships between WordPress and Automattic (licensing trademarks and contributing effort to), between WordPress and WP Engine (the latter profits from the former), and between Automattic and WP Engine (throwing lawsuits at one another).
I tried to draw a better diagram with more of the relevant connections, but it quickly turned into spaghetti.

A declaration of war?

Like others, I’m not sure that the way Matt publicly called-out WP Engine at WCUS was the most-productive way to progress a discussion2.

In particular, I think a lot of the conversation that he kicked off conflates three different aspects of WP Engine’s misbehaviour. That muddies the waters when it comes to having a reasoned conversation about the issue3.

Matt Mullwenweg on stage at WordCamp US 2024, stating how he feels that WP Engine exploits WordPress (to great profit) without contributing back.
I’ve heard Matt speak a number of times, including in person… and I think he did a pretty bad job of expressing the problems with WP Engine during his Q&A at WCUS. In his defence, it sounds like he may have been still trying to negotiate a better way forward until the very second he walked on stage that day.

I don’t think WP Engine is a particularly good company, and I personally wouldn’t use them for WordPress hosting. That’s not a new opinion for me: I wouldn’t have used them last year or the year before, or the year before that either. And I broadly agree with what I think Matt tried to say, although not necessarily with the way he said it or the platform he chose to say it upon.

Misdeeds

As I see it, WP Engine’s potential misdeeds fall into three distinct categories: moral, ethical4, and legal.

Morally: don’t take without giving back

Matt observes that since WP Engine’s acquisition by huge tech-company-investor Silver Lake, WP Engine have made enormous profits from selling WordPress hosting as a service (and nothing else) while making minimal to no contributions back to the open source platform that they depend upon.

If true, and it appears to be, this would violate the principle of reciprocity. If you benefit from somebody else’s effort (and you’re able to) you’re morally-obliged to at least offer to give back in a manner commensurate to your relative level of resources.

Two children sit on a bed: one hands a toy dinosaur to the other.
The principle of reciprocity is a moral staple. This is evidenced by the fact that children (and some nonhuman animals) seem to be able to work it out for themselves from first principles using nothing more than empathy. Companies, however aren’t usually so-capable. Photo courtesy Cotton.

Abuse of this principle is… sadly not-uncommon in business. Or in tech. Or in the world in general. A lightweight example might be the many millions of profitable companies that host atop the Apache HTTP Server without donating a penny to the Apache Foundation. A heavier (and legally-backed) example might be Trump Social’s implementation being based on a modified version of Mastodon’s code: Mastodon’s license requires that their changes are shared publicly… but they don’t do until they’re sent threatening letters reminding them of their obligations.

I feel like it’s fair game to call out companies that act amorally, and encourage people to boycott them, so long as you do so without “punching down”.

Ethically: don’t exploit open source’s liberties as weaknesses

WP Engine also stand accused of altering the open source code that they host in ways that maximise their profit, to the detriment of both their customers and the original authors of that code5.

It’s well established, for example, that WP Engine disable the “revisions” feature of WordPress6. Personally, I don’t feel like this is as big a deal as Matt makes out that it is (I certainly wouldn’t go as far as saying “WP Engine is not WordPress”): it’s pretty commonplace for large hosting companies to tweak the open source software that they host to better fit their architecture and business model.

But I agree that it does make WordPress as-provided by WP Engine significantly less good than would be expected from virtually any other host (most of which, by the way, provide much better value-for-money at any price point).

Fake web screenshot showing turdpress.com, "WordPress... But Shit".
There’s nothing to stop me from registering TurdPress.com and providing a premium WordPress web hosting solution with all the best features disabled: I could even disable exports so that my customers wouldn’t even be able to easily leave my service for greener pastures! There’s nothing stop me… but that wouldn’t make it right7.
It also looks like WP Engine may have made more-nefarious changes, e.g. modifying the referral links in open source code (the thing that earns money for the original authors of that code) so that WP Engine can collect the revenue themselves when they deploy that code to their customers’ sites. That to me feels like it’s clearly into the zone ethical bad practice. Within the open source community, it’s not okay to take somebody’s code, which they were kind enough to release under a liberal license, strip out the bits that provide their income, and redistribute it, even just as a network service8.

Again, I think this is fair game to call out, even if it’s not something that anybody has a right to enforce legally. On which note…

Legally: trademarks have value, don’t steal them

Automattic Inc. has a recognised trademark on WooCommerce, and is the custodian of the WordPress Foundation’s trademark on WordPress. WP Engine are accused of unauthorised use of these trademarks.

Obviously, this is the part of the story you’re going to see the most news media about, because there’s reasonable odds it’ll end up in front of a judge at some point. There’s a good chance that such a case might revolve around WP Engine’s willingness (and encouragement?) to allow their business to be called “WordPress Engine” and to capitalise on any confusion that causes.

Screenshot from the WordPress Foundation's Trademark Policy page, with all but the first line highlighted of the paragraph that reads: The abbreviation “WP” is not covered by the WordPress trademarks, but please don’t use it in a way that confuses people. For example, many people think WP Engine is “WordPress Engine” and officially associated with WordPress, which it’s not. They have never once even donated to the WordPress Foundation, despite making billions of revenue on top of WordPress.
I don’t know how many people spotted this ninja-edit addition to the WordPress Foundation’s Trademark Policy page, but I did.

I’m not going to weigh in on the specifics of the legal case: I Am Not A Lawyer and all that. Naturally I agree with the underlying principle that one should not be allowed to profit off another’s intellectual property, but I’ll leave discussion on whether or not that’s what WP Engine are doing as a conversation for folks with more legal-smarts than I. I’ve certainly known people be confused by WP Engine’s name and branding, though, and think that they must be some kind of “officially-licensed” WordPress host: it happens.

If you’re following all of this drama as it unfolds… just remember to check your sources. There’s a lot of FUD floating around on the Internet right now9.

In summary…

With a reminder that I’m sharing my own opinion here and not that of my employer, here’s my thoughts on the recent WP Engine drama:

  1. WP Engine certainly act in ways that are unethical and immoral and antithetical to the spirit of open source, and those are just a subset of the reasons that I wouldn’t use them as a WordPress host.
  2. Matt Mullenweg calling them out at WordCamp US doesn’t get his point across as well as I think he hoped it might, and probably won’t win him any popularity contests.
  3. I’m not qualified to weigh in on whether or not WP Engine have violated the WordPress Foundation’s trademarks, but I suspect that they’ve benefitted from widespread confusion about their status.

Footnotes

1 I suppose I ought to point out that Automattic is my employer, in case you didn’t know, and point out that my opinions don’t necessarily represent theirs, etc. I’ve been involved with WordPress as an open source project for about four times as long as I’ve had any connection to Automattic, though, and don’t always agree with them, so I’d hope that it’s a given that I’m speaking my own mind!

2 Though like Manu, I don’t think that means that Matt should take the corresponding blog post down: I’m a digital preservationist, as might be evidenced by the unrepresentative-of-me and frankly embarrassing things in the 25-year archives of this blog!

3 Fortunately the documents that the lawyers for both sides have been writing are much clearer and more-specific, but that’s what you pay lawyers for, right?

4 There’s a huge amount of debate about the difference between morality and ethics, but I’m using the definition that means that morality is based on what a social animal might be expected to decide for themselves is right, think e.g. the Golden Rule etc., whereas ethics is the code of conduct expected within a particular community. Take stealing, for example, which covers the spectrum: that you shouldn’t deprive somebody else of something they need, is a moral issue; that we as a society deem such behaviour worthy of exclusion is an ethical one; meanwhile the action of incarcerating burglars is part of our legal framework.

5 Not that nobody’s immune to making ethical mistakes. Not me, not you, not anybody else. I remember when, back in 2005, Matt fucked up by injecting ads into WordPress (which at that point didn’t have a reliable source of funding). But he did the right thing by backpedalling, undoing the harm, and apologising publicly and profusely.

6 WP Engine claim that they disable revisions for performance reasons, but that’s clearly bullshit: it’s pretty obvious to me that this is about making hosting cheaper. Enabling revisions doesn’t have a performance impact on a properly-configured multisite hosting system, and I know this from personal experience of running such things. But it does have a significant impact on how much space you need to allocate to your users, which has cost implications at scale.

7 As an aside: if a court does rule that WP Engine is infringing upon WordPress trademarks and they want a new company name to give their service a fresh start, they’re welcome to TurdPress.

8 I’d argue that it is okay to do so for personal-use though: the difference for me comes when you’re making a profit off of it. It’s interesting to find these edge-cases in my own thinking!

9 A typical Reddit thread is about 25% lies-and-bullshit; but you can double that for a typical thread talking about this topic!

× × × × ×

Even More 1999!

Spencer’s filter

Last month I implemented an alternative mode to view this website “like it’s 1999”, complete with with cursor trails, 88×31 buttons, tables for layout1, tiled backgrounds, and even a (fake) hit counter.

My blog post about 1999 Mode, viewed using 1999 Mode.
Feels like I’m 17 again.

One thing I’d have liked to do for 1999 Mode but didn’t get around to would have been to make the images look like it was the 90s, too.

Back then, many Web users only had  graphics hardware capable of displaying 256 distinct colours. Across different platforms and operating systems, they weren’t even necessarily the same 256 colours2! But the early Web agreed on a 216-colour palette that all those 8-bit systems could at least approximate pretty well.

I had an idea that I could make my images look “216-colour”-ish by using CSS to apply an SVG filter, but didn’t implement it.

A man wearing a cap pours himself a beer from a 10-litre box.
Let’s use this picture, from yesterday’s blog post, to talk about palettes…

But Spencer, a long-running source of excellent blog comments, stepped up and wrote an SVG filter for me! I’ve tweaked 1999 Mode already to use it… and I’ve just got to say it’s excellent: huge thanks, Spencer!

The filter coerces colours to their nearest colour in the “Web safe” palette, resulting in things like this:

A man wearing a cap pours himself a beer from a 10-litre box, reduced to a "Web safe" palette.
The flat surfaces are particularly impacted in this photo (as manipulated by the CSS SVG filter described above). Subtle hues and the gradients coalesce into slabs of colour, giving them an unnatural and blocky appearance.

Plenty of pictures genuinely looked like that on the Web of the 1990s, especially if you happened to be using a computer only capable of 8-bit colour to view a page built by somebody who hadn’t realised that not everybody would experience 24-bit colour like they did3.

Dithering

But not all images in the “Web safe” palette looked like this, because savvy web developers knew to dither their images when converting them to a limited palette. Let’s have another go:

A man wearing a cap pours himself a beer from a 10-litre box, reduced to a "Web safe" palette but using Floyd Steinberg dithering to reduce the impact of colour banding.
This image uses exactly the same 216-bit colour palette as the previous one, but looks a lot more “natural” thanks to the Floyd–Steinberg dithering algorithm.

Dithering introduces random noise to media4 in order to reduce the likelihood that a “block” will all be rounded to the same value. Instead; in our picture, a block of what would otherwise be the same colour ends up being rounded to maybe half a dozen different colours, clustered together such that the ratio in a given part of the picture is, on average, a better approximation of the correct colour.

The result is analogous to how halftone printing – the aesthetic of old comics and newspapers, with different-sized dots made from few colours of ink – produces the illusion of a continuous gradient of colour so long as you look at it from far-enough away.

Comparison image showing the original, websafe, and dithered-websafe images, zoomed in so that you can see the speckling of random noise in the dithered version.
Zooming in makes it easy to see the noisy “speckling” effect in the dithered version, but from a distance it’s almost invisible.

The other year I read a spectacular article by Surma that explained in a very-approachable way how and why different dithering algorithms produce the results they do. If you’ve any interest whatsoever in a deep dive or just want to know what blue noise is and why you should care, I’d highly recommend it.

You used to see digital dithering everywhere, but nowadays it’s so rare that it leaps out as a revolutionary aesthetic when, for example, it gets used in a video game.

Comparison image showing the image quantized to monochrome without (looks blocky/barely identifiable) and with (looks like old newspaper photography) dithering.
Dithering can be so effective that it can even make an image “work” all the way down to 1-bit (i.e. true monochrome/black-and-white) colour. Here I’ve used Jarvis, Judice & Ninke’s dithering algorithm, which is highly-effective for picking out subtle colour differences in what would otherwise be extreme dark and light patches, at the expense of being more computationally-expensive (to initially create) than other dithering strategies.

All of which is to say that: I really appreciate Spencer’s work to make my “1999 Mode” impose a 216-colour palette on images. But while it’s closer to the truth, it still doesn’t quite reflect what my website would’ve looked like in the 1990s because I made extensive use of dithering when I saved my images in Web safe palettes5.

Why did I take the time to dither my images, back in the day? Because doing the hard work once, as a creator of graphical Web pages, saves time and computation (and can look better!), compared to making every single Web visitor’s browser do it every single time.

Which, now I think about it, is a lesson that’s still true today (I’m talking to you, developers who send a tonne of JavaScript and ask my browser to generate the HTML for you rather than just sending me the HTML in the first place!).

Footnotes

1 Actually, my “1999 mode” doesn’t use tables for layout; it pretty much only applies a CSS overlay, but it’s deliberately designed to look a lot like my blog did in 1999, which did use tables for layout. For those too young to remember: back before CSS gave us the ability to lay out content in diverse ways, it was commonplace to use a table – often with the borders and cell-padding reduced to zero – to achieve things that today would be simple, like putting a menu down the edge of a page or an image alongside some text content. Using tables for non-tabular data causes problems, though: not only is it hard to make a usable responsive website with them, it also reduces the control you have over the order of the content, which upsets some kinds of accessibility technologies. Oh, and it’s semantically-invalid, of course, to describe something as a table if it’s not.

2 Perhaps as few as 22 colours were defined the same across all widespread colour-capable Web systems. At first that sounds bad. Then you remember that 4-bit (16 colour) palettes used to look look perfectly fine in 90s videogames. But then you realise that the specific 22 “very safe” colours are pretty shit and useless for rendering anything that isn’t composed of black, white, bright red, and maybe one of a few greeny-yellows. Ugh. For your amusement, here’s a copy of the image rendered using only the “very safe” 22 colours.

3 Spencer’s SVG filter does pretty-much the same thing as a computer might if asked to render a 24-bit colour image using only 8-bit colour. Simply “rounding” each pixel’s colour to the nearest available colour is a fast operation, even on older hardware and with larger images.

4 Note that I didn’t say “images”: dithering is also used to produce the same “more natural” feel for audio, too, when reducing its bitrate (i.e. reducing the number of finite states into which the waveform can be quantised for digitisation), for example.

5 I’m aware that my footnotes are capable of nerdsniping Spencer, so by writing this there’s a risk that he’ll, y’know, find a way to express a dithering algorithm as an SVG filter too. Which I suspect isn’t possible, but who knows! 😅

× × × × × ×

Good Food, Bad Authorisation

I was browsing (BBC) Good Food today when I noticed something I’d not seen before: a “premium” recipe, available on their “app only”:

Screenshot showing recipes, one of which is labelled "App only" and "Premium".

I clicked on the “premium” recipe and… it looked just like any other recipe. I guess it’s not actually restricted after all?

Just out of curiosity, I fired up a more-vanilla web browser and tried to visit the same page. Now I saw an overlay and modal attempting1 to restrict access to the content:

Overlay attempting to block content to the page beneath, saying "Try 1 year for just £9.99 and save 81%".

It turns out their entire effort to restrict access to their premium content… is implemented in client-side JavaScript. Even when I did see the overlay and not get access to the recipe, all I needed to do was open my browser’s debugger and run document.body.classList.remove('tp-modal-open'); for(el of document.querySelectorAll('.tp-modal, .tp-backdrop')) el.remove(); and all the restrictions were lifted.

What a complete joke.

Why didn’t I even have to write my JavaScript two-liner to get past the restriction in my primary browser? Because I’m running privacy-protector Ghostery, and one of the services Ghostery blocks by-default is one called Piano. Good Food uses Piano to segment their audience in your browser, but they haven’t backed that by any, y’know, actual security so all of their content, “premium” or not, is available to anybody.

I’m guessing that Immediate Media (who bought the BBC Good Food brand a while back and have only just gotten around to stripping “BBC” out of the name) have decided that an ad-supported model isn’t working and have decided to monetise the site a little differently2. Unfortunately, their attempt to differentiate premium from regular content was sufficiently half-hearted that I barely noticed that, too, gliding through the paywall without even noticing were it not for the fact that I wondered why there was a “premium” badge on some of their recipes.

Screenshot from OpenSourceFood.com, circa 2007.
You know what website I miss? OpenSourceFood.com. It went downhill and then died around 2016, but for a while it was excellent.

Recipes probably aren’t considered a high-value target, of course. But I can tell you from experience that sometimes companies make basically this same mistake with much more-sensitive systems. The other year, for example, I discovered (and ethically disclosed) a fault in the implementation of the login forms of a major UK mobile network that meant that two-factor authentication could be bypassed entirely from the client-side.

These kinds of security mistakes are increasingly common on the Web as we train developers to think about the front-end first (and sometimes, exclusively). We need to do better.

Footnotes

1 The fact that I could literally see the original content behind the modal was a bit of a giveaway that they’d only hidden it, not actually protected it in any way.

2 I can see why they’d think that: personally, I didn’t even know there were ads on the site until I did the experiment above: turns out I was already blocking them, too, along with any anti-ad-blocking scripts that might have been running alongside.

× × ×

Ladybird Browser

I’ve been playing with the (pre-Alpha version of) Ladybird, and it fills me with such joy and excitement.

This page, as rendered by Ladybird.
As you can see, Ladybird does a perfectly adequate job of rendering this page, including most of its CSS and virtually all of its JavaScript.

Browser diversity

Back in 2018, while other Web developers were celebrating, I expressed my dismay at the news that Microsoft Edge was on the cusp of switching from using Microsoft’s own browser engine EdgeHTML to using Blink. Blink is the engine that powers almost all other mainstream browsers; all but Firefox, which continues to stand atop Gecko.

The developers who celebrated this loss of rendering engine diversity were, I suppose, happy to have one fewer browser in which they must necessarily test their work. I guess these are the same developers who don’t test the sites they develop for accessibility (does your site work if you can’t see the images? what about with a keyboard but without a pointing device? how about if you’re colourblind?), or consider what might happen if a part of their site fails (what if the third-party CDN that hosts your JavaScript libraries goes down or is blocked by the user’s security software or their ISP?).

This blog post viewed in Lynx.
When was the last time you tested your site in a text-mode browser?

But I was sad, because – as I observed after Andre Alves Garzia succinctly spelled it outbrowser engines are an endangered species. Building a new browser that supports the myriad complexities of the modern Web is such a huge endeavour that it’s unlikely to occur from scratch: from this point on, all “new” browsers are likely to be based upon an existing browser engine.

Engine diversity is important. Last time we had a lull in engine diversity, the Web got stuck, stagnating in the shadow of Internet Explorer 6’s dominance and under the thumb of Microsoft’s interests. I don’t want those days to come back; that’s a big part of why Firefox is my primary web browser.

A Ladybird book browser

Spoof cover for "The Ladybird Book of The Browser"
I actually still own a copy of the book from which I adapted this cover!

Ladybird is a genuine new browser engine. Y’know, that thing I said that we might never see happen again! So how’ve they made it happen?

It helps that it’s not quite starting from scratch. It’s starting point is the HTML viewer component from SerenityOS. And… it’s pretty good. It’s DOM processing’s solid, it seems to support enough JavaScript and CSS that the modern Web is usable, even if it’s not beautiful 100% of the time.

Acid3 test score of 97/100 in Ladybird.
I’ve certainly seen browsers do worse than this at Acid3 and related tests…

They’re not even expecting to make an Alpha release until next year! Right now if you want to use it at all, you’re going to need to compile the code for yourself and fight with a plethora of bugs, but it works and that, all by itself, is really exciting.

They’ve got four full-time engineers, funded off donations, with three more expected to join, plus a stack of volunteer contributors on Github. I’ve raised my first issue against the repo; sadly my C++ probably isn’t strong enough to be able to help more-directly, even if I somehow did have enough free time, which I don’t. But I’ll be watching-from-afar this wonderful, ambitious, and ideologically-sound initiative.

#100DaysToOffload

Woop! This is my 100th post of the year (stats), even using my more-conservative/pedant-friendly “don’t count checkins/reposts/etc. rule. If you’re not a pedant, I achieved #100DaysToOffload when I found a geocache alongside Regents Canal while changing trains to go to Amsterdam where I played games with my new work team, looked at windows and learned about how they’ve been taxed, and got nerdsniped by a bus depot. In any case: whether you’re a pedant or not you’ve got to agree I’ve achieved it now!

× × × ×

A completely plaintext WordPress Theme

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

This is a silly idea. But it works. I saw Dan Q wondering about plaintext WordPress themes – so I made one.

This is what this blog looks like using it:

Screenshot showing my blog rendered just as text.

I clearly nerdsniped Terence at least a little when I asked whether a blog necessarily had to be HTML, because he went on to implement a WordPress theme that delivers content entirely in plain text.

Naturally, I’ve also shared his accomplishment on my own text/plain blog (which uses a much simpler CMS based on static files).

×

Does a blog have to be HTML?

Terence Eden wrote about his recent experience of IndieWebCamp Brighton, in which he mentioned that somebody – probably Jeremy Keith – had said, presumably to provoke discussion:

A blog post doesn’t need a title.

Terence disagrees, saying:

In a literal sense, he was wrong. The HTML specification makes it clear that the <title> element is mandatory. All documents have title.

But I think that’s an overreach. After all, where is it written that a blog must be presented in HTML?

Non-HTML blogs

There are plenty of counter-examples already in existence, of course:

But perhaps we can do better…

A totally text/plain blog

We’ve looked at plain text, which as a format clearly does not have to have a title. Let’s go one step further and implement it. What we’d need is:

  1. A webserver configured to deliver plain text files by preference, e.g. by adding directives like index index.txt; (for Nginx).5
  2. An index page listing posts by date and URL. Most browser won’t render these as “links” so users will have to copy-paste or re-type them, so let’s keep them short,
  3. Pages for each post at those URLs, presumably without any kind of “title” (just to prove a point), and
  4. An RSS feed: usually I use RSS as shorthand for all feed types, but this time I really do mean RSS and not e.g. Atom because RSS, strangely, doesn’t require that an <item> has a <title>!

I’ve implemented it! it’s at textplain.blog.

textplain.blog in Lynx
Unlike other sites, I didn’t need to test textplain.blog in Lynx to know it’d work well. But I did anyway.

In the end I decided it’d benefit from being automated as sort-of a basic flat-file CMS, so I wrote it in PHP. All requests are routed by the webserver to the program, which determines whether they’re a request for the homepage, the RSS feed, or a valid individual post, and responds accordingly.

It annoys me that feed discovery doesn’t work nicely when using a Link: header, at least not in any reader I tried. But apart from that, it seems pretty solid, despite its limitations. Is this, perhaps, an argument for my .well-known/feeds proposal?

Anyway, I’ve open-sourced the entire thing in case it’s of any use to anybody at all, which is admittedly unlikely! Here’s the code.

Footnotes

1 no-ht.ml technically does use HTML, but the same content could easily be delivered with an appropriate non-HTML MIME type if he’d wanted.

2 Again, I suppose this technically required HTML, even if what was delivered was an empty file!

3 Gemtext is basically Markdown, and doesn’t require a title.

4 Plain text obviously doesn’t require a title.

5 There’s no requirement that default files served by webservers are HTML, although it’s highly-unsual for that not to be the case.

If you’ve ever found yourself missing the “good old days” of the web, what is it that you miss?

Jason Weill said:

Molly White said:

If you’ve ever found yourself missing the “good old days” of the #web, what is it that you miss? (Interpret “it” broadly: specific websites? types of activities? feelings? etc.) And approximately when were those good old days?

No wrong answers — I’m working on an article and wanted to get some outside thoughts.

I miss the era of personal web sites started out of genuine admiration for something, rather than out of a desire to farm a few advertising pennies

This. You wanted to identify a song? Type some of the lyrics into a search engine and hope that somebody transcribed the same lyrics onto their fansite. You needed to know a fact? Better hope some guru had taken the time to share it, or it’d be time for a trip to the library

Not having information instantly easy to find meant that you really treasured your online discoveries. You’d bookmark the best sites on whatever topics you cared about and feel no awkwardness about emailing a fellow netizen (or signing their guestbook to tell them) about a resource they might like. And then you’d check back, manually, from time to time to see what was new.

The young Web was still magical and powerful, but the effort to payoff ratio was harder, and that made you appreciate your own and other people’s efforts more.

Test your site in Lynx

When was the last time you tested your website in a text-only browser like Lynx (or ELinks, or one of several others)? Perhaps you should.

I’m a big fan of CSS Naked Day. I love the idea of JS Naked Day, although I missed it earlier this month (I was busy abroad, plus my aggressive caching, including in service workers, makes it hard to reliably make sweeping changes for short periods). I’m a big fan of the idea that, for the vast majority of websites, if it isn’t at least usable without any CSS or JavaScript, it should probably be considered broken.

This year, I thought I’d celebrate the events by testing DanQ.me in the most-limited browser I had to-hand: Lynx. Lynx has zero CSS or JavaScript support, along with limited-to-no support for heading levels, tables, images, etc. That may seem extreme, but it’s a reasonable analogue for the level of functionality you might routinely expect to see in the toughest environments in which your site is accessed: slow 2G connections from old mobile hardware, people on the other side of highly-restrictive firewalls or overenthusiastic privacy and security software, and of course users of accessibility technologies.

Here’s what broke (and some other observations):

<link rel="alternate">s at the top

I see the thinking that Lynx (and in an even more-extreme fashion, ELinks) have with showing “alternate versions” of a page at the top, but it’s not terribly helpful: most of mine are designed to help robots, not humans!

Screenshot showing four alternate links at the top of DanQ.me as viewed in Lynx.
Four alternates is pretty common for a WordPress site: post feed, comments feed, and two formats of oEmbed.

I wonder if switching from <link rel="alternate"> elements to Link: HTTP headers would indicate to Lynx that it shouldn’t be putting these URLs in humans’ faces, while still making them accessible to all the services that expect to find them? Doing so would require some changes to my caching logic, but might result in a cleaner, more human-readable HTML file as a side-effect. Possibly something worth investigating.

Fortunately, I ensure that my <link rel="alternate">s have a title attribute, which is respected by Lynx and ELinks and makes these scroll-past links slightly less-confusing.

Lynx screenshot from IKEA.com, showing no fewer than 113 anonymous "alternate" links at the top of the page.
Not all sites title their alternate links. IKEA.com requires you to scroll through 113 anonymous links for their alternate language versions, because Lynx doesn’t understand the hreflang attribute.

Post list indentation

Posts on the homepage are structured a little like this:

<li>
  <a href="...">
    <h2>Post Title</h2>
    <p>...post metadata, image, and things...</p>
  </a>
</li>

Strictly-speaking, that’s not valid. Heading elements are only permitted within flow elements. I chose to implement it that way because it seemed to be the most semantically-correct way to describe the literal “list of posts”. But probably my use of <h2> is not the best solution. Let’s see how Lynx handles it:

Screenshot from Lynx showing headings "outdented" from the list items they're children of.
Lynx “outdents” headings so they stand out, and “indents” lists so they look like lists. This causes a quirky clash where a heading is inside a list.

It’s not intolerable, but it’s a little ugly.

CSS lightboxes add a step to images

I use a zero-JavaScript approach to image lightboxes: you can see it by clicking on any of the images in this post! It works by creating a (closed) <dialog> at the bottom of the page, for each image. Each <dialog> has a unique id, and the inline image links to that anchor.

Originally, I used a CSS :target selector to detect when the link had been clicked and show the <dialog>. I’ve since changed this to a :has(:target) and directed the link to an element within the dialog, because it works better on browsers without CSS support.

It’s not perfect: in Lynx navigating on an inline image scrolls down to a list of images at the bottom of the page and selects the current one: hitting the link again now offers to download the image. I wonder if I might be better to use a JavaScript-powered lightbox after all!

gopher: and finger: links work perfectly!

I was pleased to discover that gopher: and finger: links to alternate copies of a post… worked perfectly! That shouldn’t be a surprise – Lynx natively supports these protocols.

Lynx screenshot showing DanQ.me via the Finger protocol.
I can’t conceive that many people access DanQ.me via the Finger protocol, but the option’s there.

In a fun quirk and unusually for a standard of its age, the Finger specification did not state the character encoding that ought to be used. I guess the authors just assumed everybody reading it would use ASCII. But both my WordPress-to-Finger bridge and Lynx instead assume that UTF-8 is acceptable (being a superset of ASCII, that seems fair!) which means that emoji work (as shown in the screenshot above). That’s nuts, isn’t it?

You can’t react to anything

Back in November I added the ability to “react” to a post by clicking an emoji, rather than typing out a full comment. Because I was feeling lazy, the feature was (and remains) experimental, and I didn’t consider it essential functionality, I implemented it mostly in JavaScript. Without JavaScript, all you can do is see what others have clicked.

Emoji reactions screenshot showing thumbs up, star eyes, surprise, muscle flex, nausea, and a dinosaur.
The available emoji vary from post to post; I sometimes like to throw a weird/fun one in there, knowing that it’ll invariably be Ruth that clicks it first.

In a browser with no JavaScript but with functional CSS, the buttons correctly appear disabled.

But with neither technology available, as in Lynx, they look like they should work, but just… don’t. Oops.

Screenshot from Lynx showing a "Bad HTML!! No form action defined." error when clicking an emoji reaction button.
Lynx is correct; this is sloppy code. Without CSS support, it even shows the instruction that implies the buttons will work, but they don’t.

If I decide to keep the reaction buttons long-term, I’ll probably reimplement them so that they function using plain-old HTML and HTTP, using a <form>, and refactor my JavaScript to properly progressively-enhance the buttons for those that support it. For now, this’ll do.

Comment form honeypot

The comment form on my blog posts works… but there’s a quirk:

At the end of the comments form, an additional <textarea> appears!

That’s an annoyance. It turns out it’s a honeypot added by Akismet: a fake comments field, normally hidden, that tries to trick spam bots into filling it (and thus giving themselves away): sort-of a “reverse CAPTCHA” where the robots do something extra, unintentionally, to prove their inhumanity. Lynx doesn’t understand the code that Akismet uses to hide the form, and so it’s visible to humans, which is suboptimal both because it’s confusing but also because a human who puts details into it is more-likely to be branded a spambot!

I might look into suppressing Akismet adding its honeypot field in the first place, or else consider one of the alternative anti-spam plugins for WordPress. I’ve heard good things about Antispam Bee; I ought to try it at some point.

Overall, it’s pretty good

On the whole, DanQ.me works reasonably well in browsers without any JavaScript or CSS capability, with only a few optional features failing to function fully. There’s always room for improvement, of course, and I’ve got a few things now to add to my “one day” to-do list for my little digital garden.

Obviously, this isn’t really about supporting people using text-mode browsers, who probably represent an incredible minority. It’s about making a real commitment to the semantic web, to accessibility, and to progressive enhancement! That making your site resilient, performant, and accessible also helps make it function in even the most-uncommon of browsers is just a bonus.

× × × × ×

Home Page, 50% Off!

In August, I celebrated my blog – with its homepage weighing-in at a total of just 481kbbeing admitted to Kev Quirk‘s 512kb club. 512kb club celebrates websites (often personal sites) whose homepage are neither “ultra minimal” or “link pages” but have a total size, including all assets, of under half a megabyte. It’s about making a commitment to a leaner, more-efficient Web.

My relatively-heavyweight homepage only just slipped in under the line. But, feeling inspired perhaps by some performance enhancements I’ve been planning this week at work, I decided to try to shave a little more off:

Now, at ~234kb, danq.me just beats the excellent gomakethings.com (it’s all those heavyweight fonts, Chris!).

Here’s what I changed:

  1. The “recent article” tiles are dynamically sized based on their number, type, and the visitor’s screen resolution. But apart from the top one they’re almost never very large. Using thumbnail images for the non-first tile shaved off almost 160kb.
Illustration showing how the smaller tiles only need thumbnail images.
You can see the difference, but it’s still acceptable to look at, I think.
  1. Not space-saving, but while I was in there I ensured that the first tile’s image – which almost-certainly comprises part of the Largest Contentful Paint – is never delivered with loading="lazy".
  2. I was providing a shortcut icon in .ico format (<link rel="shortcut icon" href="/_q23t/icons/favicon-16-32-48-64-128.ico" />), which is pretty redundant nowadays because all modern browsers (and even IE11) support .png icons. I was already providing .png and .svg versions, but it turns out that some browsers favour the one with the (harmful?) rel="shortcut icon" over rel="icon" if both are present, and .ico files are – being based on Windows Bitmaps – horrendously inefficient.

By getting under the 250kb threshold, I’ve jumped up a league from Blue Team to Orange Team, so that’s nice too. I can’t see a meaningful path from where I’m at to Green Team (under 100kb) though, so this level might have to suffice.

Last-minute additions:

×

BBC News… without the crap

Did I mention recently that I love RSS? That it brings me great joy? That I start and finish almost every day in my feed reader? Probably.

I used to have a single minor niggle with the BBC News RSS feed: that it included sports news, which I didn’t care about. So I wrote a script that downloaded it, stripped sports news, and re-exported the feed for me to subscribe to. Magic.

RSS reader showing duplicate copies of the news story "Barbie 2? 'We'd love to,' says Warner Bros boss", and an entry from BBC Sounds.
Lately my BBC News feed has caused me some annoyance and frustration.

But lately – presumably as a result of technical changes at the Beeb’s side – this feed has found two fresh ways to annoy me:

  1. The feed now re-publishes a story if it gets re-promoted to the front pagebut with a different <guid> (it appears to get a #0 after it when first published, a #1 the second time, and so on). In a typical day the feed reader might scoop up new stories about once an hour, any by the time I get to reading them the same exact story might appear in my reader multiple times. Ugh.
  2. They’ve started adding iPlayer and BBC Sounds content to the BBC News feed. I don’t follow BBC News in my feed reader because I want to watch or listen to things. If you do, that’s fine, but I don’t, and I’d rather filter this content out.

Luckily, I already have a recipe for improving this feed, thanks to my prior work. Let’s look at my newly-revised script (also available on GitHub):

#!/usr/bin/env ruby
require 'bundler/inline'

# # Sample crontab:
# # At 41 minutes past each hour, run the script and log the results
# */20 * * * * ~/bbc-news-rss-filter-sport-out.rb > ~/bbc-news-rss-filter-sport-out.log 2>>&1

# Dependencies:
# * open-uri - load remote URL content easily
# * nokogiri - parse/filter XML
gemfile do
  source 'https://rubygems.org'
  gem 'nokogiri'
end
require 'open-uri'

# Regular expression describing the GUIDs to reject from the resulting RSS feed
# We want to drop everything from the "sport" section of the website, also any iPlayer/Sounds links
REJECT_GUIDS_MATCHING = /^https:\/\/www\.bbc\.co\.uk\/(sport|iplayer|sounds)\//

# Load and filter the original RSS
rss = Nokogiri::XML(open('https://feeds.bbci.co.uk/news/rss.xml?edition=uk'))
rss.css('item').select{|item| item.css('guid').text =~ REJECT_GUIDS_MATCHING }.each(&:unlink)

# Strip the anchors off the <guid>s: BBC News "republishes" stories by using guids with #0, #1, #2 etc, which results in duplicates in feed readers
rss.css('guid').each{|g|g.content=g.content.gsub(/#.*$/,'')}

File.open( '/www/bbc-news-no-sport.xml', 'w' ){ |f| f.puts(rss.to_s) }
It’s amazing what you can do with Nokogiri and a half dozen lines of Ruby.

That revised script removes from the feed anything whose <guid> suggests it’s sports news or from BBC Sounds or iPlayer, and also strips any “anchor” part of the <guid> before re-exporting the feed. Much better. (Strictly speaking, this can result in a technically-invalid feed by introducing duplicates, but your feed reader oughta be smart enough to compensate for and ignore that: mine certainly is!)

You’re free to take and adapt the script to your own needs, or – if you don’t mind being tied to my opinions about what should be in BBC News’ RSS feed – just subscribe to my copy at: https://fox.q-t-a.uk/bbc-news-no-sport.xml

×

RSS > ActivityPub

RSS is better than ActivityPub1.

Photograph of a boxing match, but with the heads of the competitors replaced with the ActivityPub and RSS logos (and "AP" or "RSS" written on their clothes, respectively). RSS is delivering a powerful uppercut to ActivityPub.
A devastating blow by RSS against a competitor 19 years his junior! For updates on this bout as it develops, don’t forget to subscribe… using either protocol.

When I subscribe to content, I want:

  • Resilient failsafes. ActivityPub has many points-of-failure. A notification might fail to complete transmission as a result of downtime, faults, or network conditions, and the receiving server might never know. A feed reader, conversely, can tell you that an address 404’d or the server was down.
  • Retroactive access. Once you fix the problem above… you still don’t get the message you missed: it’s probably gone forever – there’s no retroactive access. The same is true when your ActivityPub server connects with a peer for the first time: you only ever get new content after that point. RSS, on the other hand, provides some number of “recent” items the moment you first subscribe.
  • Simple subscriptions. RSS can be served from a statically-hosted single file, which makes it suitable to deploy anywhere as well as consume using anything. It can be read, after a fashion, in anything from Lynx upwards.

RSS ticks all these boxes. If I can choose between RSS and ActivityPub to subscribe to your content, and I don’t need a real-time update, I’m probably going to choose RSS.

About a month later, Matthias Pfefferle wrote a great post that makes a good “next stop” if you’re on a deep dive…

Footnotes

1 I feel like this statement needs a few clarifications and caveats, but my hot take looks spicier if I bury them in a footnote!

  • By RSS, I mean whichever pull-based basic HTTP you like, be that Atom, JSON Feed, h-entry, or even just properly-marked-up HTML5: did you know that the <article> element is intended to be suitable for syndication use?
  • Obviously I appreciate that RSS and ActivityPub are different tools for different jobs, and there are doubtless use-cases for which ActivityPub is clearly the superior solution.
  • I certainly don’t object to services providing both RSS and ActivityPub as syndication options, like Mastodon does, where both might be good choices.
×