Had a fight with the Content-Security-Policy header today. Turns out, I won, but not without sacrifices.
Apparently I can’t just insert <style> tags into my posts anymore, because otherwise I’d have to somehow either put nonces on them, or hash their content (which would
be more preferrable, because that way it remains static).
I could probably do the latter by rewriting HTML at publish-time, but I’d need to hook into my Markdown parser and process HTML for that, and, well, that’s really complicated,
isn’t it? (It probably is no harder than searching for Webmention links, and I’m overthinking it.)
I’ve had this exact same battle.
Obviously the intended way to use nonces in a Content-Security-Policy is to have the nonce generated, injected, and served in a single operation. So in PHP,
perhaps, you might do something like this:
<?php$nonce=bin2hex(random_bytes(16));
header("Content-Security-Policy: script-src 'nonce-$nonce'");
?>
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>PHP CSP Nonce Test</title>
</head>
<body>
<h1>PHP CSP Nonce Test</h1>
<p>
JavaScript did not run.
</p>
<!-- This JS has a valid nonce: -->
<script nonce="<?phpecho$nonce; ?>">
document.querySelector('p').textContent = 'JavaScript ran successfully.';
</script>
<!-- This JS does not: -->
<script nonce="wrong-nonce">
alert('The bad guys won!');
</script>
</body>
</html>
Viewing this page in a browser (with Javascript enabled) should show the text “JavaScript ran successfully.”, but should not show an alertbox containing the text “The bad
guys won!”.
But for folks like me – and you too, Vika,, from the sounds of things – who serve most of their pages, most of the time, from the cache or from static HTML files… and who add the CSP
header on using webserver configuration… this approach just doesn’t work.
I experimented with a few solutions:
A long-lived nonce that rotates.
CSP allows you to specify multiple nonces, so I considered having a rotating nonce that was applied to pages (which were then cached for a period) and delivered
by the header… and then a few hours later a new nonce would be generated and used for future page generations and appended to the header… and after the
cache expiry time the oldest nonces were rotated-out of the header and became invalid.
Dynamic nonce injection.
I experimented with having the webserver parse pages and add nonces: randomly generating a nonce, putting it in the header, and then basically doing a
s/<script/<script nonce="..."/ to search-and-replace it in.
Both of these are terrible solutions. The first one leaves a window of, in my case, about 24 hours during which a successfully-injected script can be executed. The second one
effectively allowlists all scripts, regardless of their provenance. I realised that what I was doing was security theatre: seeking to boost my A-rating to an A+-rating on SecurityHeaders.com without actually improving security at all.
But the second approach gave me an idea. I could have a server-side secret that gets search-replaced out. E.g. if I “signed” all of my legitimate scripts with something like
<script nonce="dans-secret-key-goes-here" ...> then I could replace s/dans-secret-key-goes-here/actual-nonce-goes-here/ and thus have the best of both
worlds: static, cacheable pages, and actual untamperable nonces. So long as I took care to ensure that the pages were never delivered to anybody with the secret key still
intact, I’d be sorted!
Alternatively, I was looking into whether Caddy can do something like mod_asis does for Apache: that is, serve a
file “as is”, with headers included in the file. That way, I could have the CSP header generated with the page and then saved into the cache, so it’s delivered with the same
none every time… until the page changes. I’d love more webservers to have an “as is” mode, but I appreciate that might be a big ask (Apache’s mechanism, I suspect,
exploits the fact that HTTP/1.0 and HTTP/1.1 literally send headers, followed by two CRLFs, then content… but that’s not what happens in HTTP/2+).
So yeah, I’ll probably do a server-side-secret approach, down the line. Maybe that’ll work for you, too.
As I mentioned in my recent Blog Questions Challenge, I recently switched my blog from WordPress, which it had been running on for over 20 years of its 26 year history, to ClassicPress.1
I’m aware that I’m not the only person for whom ClassicPress might be a better fit than WordPress2,
so I figured I should share the process by which I undertook the change.
Switching from WordPress to ClassicPress
Switching from WordPress to ClassicPress should be a non-destructive, 100% reversible process, but (even though I’ve got solid backups) I wasn’t ready to
trust that, so I decided to operate on a copy of my site. I’m glad I did, because there were a couple of teething issues I needed to tackle before I could launch.
1. Duplicating the site
I took a simple approach to duplicating the site: (1) I copied the site directory, and (2) I copied the database, and (3) I set up a new subdomain to use for testing. Here’s how I did
each step:
1.1. Copying the site directory
This should’ve been simple, but a du -sh revealed that my /wp-content/uploads directory is massive (I should look into that) and I didn’t want to
clone it. And I didn’t want r need to clone my /wp-content/cache directory either. So I ran:
rsync -av --exclude=wp-content ./old-site-directory/ ./new-site-directory/ to copy everything exceptwp-content, and then
rsync -av --exclude=uploads --exclude=cache ./old-site-directory/wp-content/ ./new-site-directory/wp-content/ to copy wp-contentexcept the
uploads and cache subdirectories, and then finally
ln -s ./old-site-directory/wp-content/uploads ./new-site-directory/wp-content/uploads to symlink the uploads directory, sharing it between the two sites
1.2. Copying the database
I just piped mysqldump into mysql to clone from one database to the other:
mysqldump -uUSERNAME -p --lock-tables=false old-site-database | mysql -uUSERNAME -p new-site-database
I edited DB_NAME in wp-config.php in the new site’s directory to point it at the new database.
If you’re going to clone your WordPress site before converting to ClassicPress, you’ll want to be comfortable editing your wp-config.php.
1.3. Setting up a new subdomain
My DNS is already configured with a wildcard to point (almost) all *.danq.me subdomains to this server already. I decided to use the name classicpress-testing.danq.me as my
temporary/test domain name. To keep any “changes” to my cloned site to a minimum, I overrode the domain name in my wp-config.php rather than in my database, by adding the
following lines:
Because I use Caddy/FrankenPHP as my webserver3,
configuration was really easy: I just copied the relevant part of my Caddyfile (actually an include), changed the domain name and the root, and it just worked,
even provisioning me out a LetsEncrypt SSL certificate. Magical4.
2. Switching the duplicate to ClassicPress
Now that I had a duplicate copy of my blog running at https://classicpress-testing.danq.me/, it was time to switch it to ClassicPress. I started by switching my wp-admin
colour scheme to a different one in my cloned site, so it’d be immediately visually-obvious to me if I’d accidentally switched and was editing the “wrong” site (I also made sure I was
logged-out of my primary, live site, so I was confident I wouldn’t break anything while I was experimenting!).
ClassicPress provides a migration plugin which checks for common problems and then switches your site
from WordPress to ClassicPress, so I installed it and ran it. It said that everything was okay except for my (custom) theme and a my self-built plugins, which it understandably couldn’t
check compatibility of. It recommended that I install Twenty Seventeen – the last WordPress default theme to not
require the block editor – but I didn’t do so: I was confident that my theme would work anyway… and if it didn’t, I’d want to fix it rather than switch theme!
I failed to take a screenshot of the actual process, but it looked broadly like this.
And then… it all broke.
3. Fixing what broke
After swiftly doing a safety-check that my live site was still intact, I started trying to work out why my site wasn’t broken. Debugging a ClassicPress PHP issue is functionally
identical to debugging a similar WordPress issue, for obvious reasons: check the logs, work out what’s broken, realise it’s a plugin, disable that plugin while you investigate further,
etc.
EWWW Image Optimizer: I use this plugin to pregenerate WebP variants of my images, which I then serve using webserver rules. It’s not a
complex job, and I should probably integrate the feature into my theme at some point, but for now I use this plugin. Version 8.0.0 of the plugin doesn’t work on ClassicPress 2.3.1, so
I used WP-CLI to downgrade to the last version that does (7.7.0), and then it worked fine.
Dan’s Geocaching Log Reposter: a self-made plugin that copies my logs from geocaching websites stopped working properly, which I think is because
ClassicPress is doing a more-aggressive job than WordPress at nonce validation on admin REST endpoints? I put a quick hack into my plugin to work around it, but I’ll need to look into
this properly at some point.
Some other bits of my stack, e.g. CapsulePress (my Gemini/Spartan/Nex server), have their own copies of my
database credentials, because I’ve been too lazy to centralise them into environment variables, and needed updating (but not until live switchover time).
I ran the two sites in-parallel for a couple of weeks, with the ClassicPress one as a “read only” version (so I didn’t pollute my uploads directory!), but it was pretty unnecessary
because it all worked pretty seamlessly, despite my complex stack of custom code. When I wanted to switch for-real, all I needed to do was swap the domain names over in my Caddyfile and
edit the wp-config.php of my ClassicPress installation: step 1.3, but in reverse!
If you hadn’t been told5, you probably wouldn’t have even known I’d made a change: I suppress basically all infrastructure-identifying
headers from my server output as a matter of course, and ClassicPress and WordPress are functionally-interchangeable from a front-end perspective6.
So what’s difference?
From my experience, here are the differences I’ve discovered since switching from WordPress to ClassicPress:
The good stuff
😅 ClassicPress has no Gutenberg/block editor. This would absolutely be a showstopper for many people, and that’s fine: I have nothing against the block editor (I
use it basically every day elsewhere!), but I’ve never really used it on danq.me and don’t feel the need to change that! My theme, my workflow, and my custom plugins are all
geared around the perfectly-good “classic” editor, and so getting a more-lightweight CMS by removing a feature I wasn’t using anyway falls somewhere between neutral and a blessing.
⚡The backend is fast again! One of the changes the ClassicPress team have been working on applying to WordPress is to strip out jQuery and other redundancies from
the backend, and I love how much faster and lighter my editor interface is as a result. (With caveat; see below!)
🔌Virtually everything “just works”. With the few exceptions described above, everything works exactly as it does under WordPress. Which is what you’d hope for a fork
that’s mostly “WordPress, but without the block editor”, right, but it’s still reassuring (and, for me, an essential feature). There are a few “new” features to do with paging through
posts and the media library and they’re fine, I suppose, but not by themselves worth switching for (though it might be nice to backport them into WordPress!).
The bad stuff
🏷️ Adding tags to posts takes a step backwards. A side-effect of dropping jQuery is the partial loss of the autocomplete feature when selecting tags to add to a post.
You still get a partial autocomplete, but not after typing a comma: you need to press enter to submit the tag you were writing and then start typing them next, which
frankly sucks. This is because they’re relying on a <datalist>, which isn’t as full-featured as the Javascript solution WordPress employs. This bugs
me almost enough to be a showstopper, but I gather it’s getting fixed in a near-future version.
🗺️ You’re in uncharted territory when things go wrong. One great benefit of WordPress is the side-effects of its ubiquity. If you have a query or a problem
you can throw a stone at your favourite search engine and get a million answers… and some of them will even be right! If you have a problem in ClassicPress and it’s not shared with (or
you’re not sure if it’s shared with) WordPress… you’re mostly on your own. The forums are good and friendly,
but if you want a quick answer to something, you’re likely to have to roll your sleeves up and open some source code. I don’t mind this at all – when I first started using WordPress,
this was the case, too! – but it might be a showstopper for some folks.
In summary: I’m enjoying using ClassicPress, even where there are rough edges. For me, 99% of my experience with it is identical to how I used WordPress anyway, it’s relatively
lightweight and fast, and it’s easy enough to switch back if I change my mind.
Footnotes
1 It saddens me that I have to keep clarifying this, but I feel like I do: my switch from
WordPress to ClassicPress is absolutely nothing to do with any drama in the WordPress space that’s going on right now: in fact, I’d been planning to try it out since before
any of the drama appeared. I appreciate that some people making a similar switch, including folks who use this blog post as a guide, might have different motivations to me, and that’s
fine too. Personally, I think that ditching an installation of open-source WordPress based on your interpretation of what’s going on in the ecosystem is… short-sighted? But
hey: the joy of open source is you can – and should! – do what you want. Anyway: the short of it is – the desire to change from WordPress to ClassicPress was, for me, 100% a
technical decision and 0% a political one. And I’ll thank you for leaving any of your drama at the door if you slide into my comments, ta!
2Matt recently described ClassicPress as “the last decent fork
attempt for WordPress”, and I absolutely agree. There’s been a spate of forks and reimplementations recently. I’ve looked into many of them and been… very much underwhelmed. Want my
hot take? Sure, here you go: AspirePress is all lofty ideas and no deliverables. FreeWP seems to be the same, but somehow without the lofty ideas. ForkPress is a ghost. Speaking of
ghosts, Ghost isn’t a WordPress fork; they have got some cool ideas though. b2evolution is even less a WordPress fork but it’s pretty cool in its own right. I’m not sure what
clamPress is trying to achieve but I’ve not given it a serious look. So yeah: ClassicPress is, in my mind, the only WordPress fork even worth consideration at this point, and as I
describe in this blog post: it’s not for everybody.
3 I switched from Nginx over the winter and it’s been just magical: I really love
Caddy’s minimal approach to production configuration. The only thing I’ve been able to fault it on is that it’s not capable of setting up client-side SSL certificate authentication on
a path, only on an entire domain, which meant I needed to reimplement the authentication mechanism I use on a small part of my (non-blog) internal
infrastructure.
4 To be fair, it wouldn’t have been hard if I’d still be using Nginx, because I’d
set up Certbot to use DNS-based vertification to issue me wildcard SSL certificates. But doing this in Caddy still felt magical.
6 Indeed, I wouldn’t have considered a switch to ClassicPress in the first place if it
wasn’t a closely-aligned-enough fork that I retained the ability to flip-flop between the two to my heart’s content! I’ve loved WordPress for over two decades; that’s not going to
change any time soon… and if e.g. ClassicPress ceased tracking WordPress releases and the fork diverged too far for my comfort, I’d probably switch back to regular old WordPress!
Large companies find HTML & CSS frustrating “at scale” because the web is a fundamentally anti-capitalist mashup art experiment, designed to give consumers all the power.
This. This is what I needed to be reminded, today.
When somebody complains that the Web is hard to scale, they’re already working against the grain of the Web.
At its simplest – and the way we used to use it – a website is a collection of .html files, one of which might have a special name so the webserver knows to put it first.
Writing HTML is punk rock. A “platform” is the tool of the establishment.
While perfectly legal, it is remarkable that to read a Bloomberg article, you must first agree to binding arbitration and waive your class action rights.
I don’t often see dialog boxes like this one. In fact, if I go to the URL of a Bloomberg.com article, I don’t see any popups: nothing about privacy, nothing about cookies,
nothing about terms of service, nothing about only being allowed to read a limited number of articles without signing up an account. I just… get… the article.
The reason for this is, most-likely, because my web browser is configured, among other things, to:
Block all third-party Javascript (thanks, uBlock Origin‘s “advanced mode”), except on domains where they’re explicitly allowed (and even then
with a few exceptions: thanks, Ghostery),
Delete all cookies 30 seconds after I navigate away from a domain, except for domains that are explicitly greylisted/allowlisted (thanks, Cookie-AutoDelete), and
But here’s the thing I’ve always wondered: if I don’t get to see a “do you accept our terms and conditions?” popup, is is still enforceable?
Obviously, one could argue that by using my browser in a non-standard configuration that explicitly results in the non-appearance of “consent” popups that I’m deliberately turning a
blind eye to the popups and accepting them by my continued use of their services1. Like: if I pour a McDonalds coffee on my lap having
deliberately worn blinkers that prevent me reading the warning that it’s hot, it’s not McDonalds’ fault that I chose to ignore their helpful legally-recommended printed warning on the cup, right?2
But I’d counter that if a site chooses to rely on Javascript hosted by a third party in order to ask for consent, but doesn’t rely on that same third-party in
order to provide the service upon which consent is predicated, then they’re setting themselves up to fail!
The very nature of the way the Internet works means that you simply can’t rely on the user successfully receiving content from a CDN. There are all kinds of reasons my browser might not
get the Javascript required to show the consent dialog, and many of them are completely outside of the visitor’s control: maybe there was a network fault, or CDN downtime, or my
browser’s JS engine was buggy, or I have a disability and the technologies I use to mitigate its impact on my Web browsing experience means that the dialog isn’t read out to me. In any
of these cases, a site visitor using an unmodified, vanilla, stock web browser might visit a Bloomberg article and read it without ever being asked to agree to their terms and
conditions.
Would that be enforceable? I hope you’ll agree that the answer is: no, obviously not!
It’s reasonably easy for a site to ensure that consent is obtained before providing services based on that consent. Simply do the processing server-side, ask for whatever
agreement you need, and only then provide services. Bloomberg, like many others, choose not to do this because… well, it’s probably a combination of developer laziness and
search engine optimisation. But my gut feeling says that if it came to court, any sensible judge would ask them to prove that the consent dialog was definitely viewed by
and clicked on by the user, and from the looks of things: that’s simply not something they’d be able to do!
tl;dr: if you want to fight with Bloomberg and don’t want to go through their arbitration, simply say you never saw or never agreed to their terms and conditions – they
can’t prove that you did, so they’re probably unenforceable (assuming you didn’t register for an account with them or anything, of course). This same recommendation applies to many,
many other websites.
Footnotes
1 I’m confident that if it came down to it, Bloomberg’s lawyers would argue
exactly this.
2 I see the plaintiff’s argument that the cups were flimsy and obviously her injuries were
tragic, of course. But man, the legal fallout and those “contents are hot” warnings remain funny to this day.
This was fun. A simple interactive demonstration of ten different dark patterns you’ve probably experienced online. I might
use it as a vehicle for talking about such deceptive tactics with our eldest child, who’s now coming to an age where she starts to see these kinds of things.
After I finished exploring the dark patterns shown, I decided to find out more about the author and clicked the link in the footer, expecting to be taken to their personal web site. But
instead, ironically, I came to a web page on a highly-recognisable site that’s infamous for its dark patterns: 🤣
I’m pretty impressed with running WordPress on Caddy so far.
It took a little jiggerypokery to configure it with an equivalent of the Nginx configuration I use for DanQ.me. But off the back of it I get the capability for HTTP/3,
103 Early Hints, and built-in “batteries included” infrastructure for things like certificate renewal and log rotation.
(why yes, I am celebrating my birthday by doing selfhosting server configuration, why do you ask? 😅)
“We completely redesigned this thing you need to do your job for no good reason” Got it.
“Disable any adblocker.” Absolutely not.
…
I don’t know if I’m supposed to read this as a poem, but I did, and I love it. It speaks to me. It speaks of my experience of using (way too much of) the Web nowadays, enshittified as
it is.
But yeah, I run a fine-tuned setup on most of my computers that works for me… by working against most of the way the Web seems to expect me to use it, these days. I block all
third-party JavaScript and cookies by default (and drops first-party cookies extremely quickly). I use plugins to quietly reject consent banners, suppress soft paywalls, and so on. And
when I come across sites that don’t work that way, I make a case-by-case decision on whether to use them at all (if you hide some features in your “app” only, I just don’t use
those features).
Sure, there are probably half a dozen websites that you might use that I can’t. But in exchange I use a Web that’s fast, clean, and easy-to-read.
And just sometimes: when I’m on somebody else’s computer and I see an ad, or a cookie consent banner, or a “log in to keep reading” message, or a website weighed down and crawling
because of the dozens of tracking scripts, or similar… I’m surprised to remember that these things actually exist, and wonder for a moment how people who do see them
all time time cope with them!
Sigh.
Anyway: this was an excellent poem, assuming it was supposed to be interpreted as being a poem. Otherwise, it was an
excellent whatever-it-is.
Was playing around with some HTML and made a cable car for my page. Hmh.
Beautiful. It feels like it ought to have been wrapped in a HTML Web Component, maybe called <cable-car>, with progressive enhancement bonus features (maybe it’ll
only run during daylight hours? or when the wind isn’t too fast?)?
theimprobable.blog, which I look after on behalf of my partner’s brother after using it to GPS-track his adventures
I think that’s all of them, but it’s hard to be sure…
Footnotes
1 Maybe I’ve finally shaken off my habit of buying a domain name for everything.
Or maybe it’s just that I’ve embraced subdomains for more stuff. Probably the latter.
The Beeb continue to keep adding more and more non-news content to the BBC News RSS feed (like this ad for the iPlayer app!), so I’ve once again had to update my script to “fix” the feed so that it only contains, y’know, news.
tl;dr: I’m tidying up and consolidating my personal hosting; I’ve made a little progress, but I’ve got a way to go – fortunately I’ve got a sabbatical coming up at
work!
At the weekend, I kicked-off what will doubtless be a multi-week process of gradually tidying and consolidating some of the disparate digital things I run, around the Internet.
I’ve a long-standing habit of having an idea (e.g. gamebook-making tool Twinebook, lockpicking puzzle game Break Into Us, my Cheating Hangman game, and even FreeDeedPoll.org.uk!),
deploying it to one of several servers I run, and then finding it a huge headache when I inevitably need to upgrade or move said server because there’s such an insane diversity of
different things that need testing!
DNDle, my Wordle-clone where you have to guess the Dungeons & Dragons 5e monster’s stat block, is now hosted by GitHub Pages. Also, I
fixed an issue reported a month ago that meant that I was reporting Giant Scorpions as having a WIS of 19 instead of 9.
Abnib, which mostly reminds people of upcoming birthdays and serves as a dumping ground for any Abnib-related shit I produce, is now hosted by
GitHub Pages.
RockMonkey.org.uk, which doesn’t really do much any more, is now hosted by GitHub Pages.
Sour Grapes, the single-page promo for a (remote) murder mystery party I hosted during a COVID lockdown, is now hosted by GitHub
Pages.
A convenience-page for giving lost people directions to my house is now hosted by GitHub Pages.
Dan Q’s Things is now automatically built on a schedule and hosted by GitHub Pages.
Robin’s Improbable Blog, which spun out from 52 Reflect, wasn’t getting enough traffic to justify
“proper” hosting so now it sits in a Docker container on my NAS.
My μlogger server, which records my location based on pings from my phone, has also moved to my NAS. This has broken
Find Dan Q, but I’m not sure if I’ll continue with that in its current form anyway.
All of my various domain/subdomain redirects have been consolidated on, or are in the process of moving to, to a tinyLinode/Akamai
instance. It’s a super simple plain Nginx server that does virtually nothing except redirect people – this is where I’ll park the domains I register but haven’t found a use for yet, in
future.
I was pretty proud of EGXchange.org, but I’ll be first to admit that it’s among the stupider of my throwaway domains.
It turns out GitHub pages is a fine place to host simple, static websites that were open-source already. I’ve been working on improving my understanding of GitHub Actions
anyway as part of what I’ve been doing while wearing my work, volunteering, and personal hats, so switching some static build processes like DNDle’s to GitHub
Actions was a useful exercise.
Stuff I’m still to tidy…
There’s still a few things I need to tidy up to bring my personal hosting situation under control:
DanQ.me
You’re looking at it. But later this year, you might be looking at it… elsewhere?
This is the big one, because it’s not just a WordPress blog: it’s also a Gemini, Spartan, and Gopher server (thanks CapsulePress!), a Finger server, a general-purpose host to a stack of complex stuff only some of which is powered by Bloq (my WordPress/PHP integrations): e.g.
code to generate the maps that appear on my geopositioned posts, code to integrate with the Fediverse, a whole stack of configuration to make my caching work the way I want, etc.
FreeDeedPoll.org.uk
Right now this is a Ruby/Sinatra application, but I’ve got a (long-running) development branch that will make it run completely in the browser, which will further improve privacy, allow
it to run entirely-offline (with a service worker), and provide a basis for new features I’d like to provide down the line. I’m hoping to get to finishing this during my Automattic
sabbatical this winter.
The website’s basically unchanged for most of a decade and a half, and… umm… it looks it!
A secondary benefit of it becoming browser-based, of course, is that it can be hosted as a static site, which will allow me to move it to GitHub Pages too.
When I took over running the world’s geohashing hub from xkcd‘s Randall Munroe (and davean), I flung the site together on whatever hosting I had sitting
around at the time, but that’s given me some headaches. The outbound email transfer agent is a pain, for example, and it’s a hard host on which to apply upgrades. So I want to get that
moved somewhere better this winter too. It’s actually the last site left running on its current host, so it’ll save me a little money to get it moved, too!
Geohashing’s one of the strangest communities I’m honoured to be a part of. So it’d be nice to treat their primary website to a little more respect and attention.
Right now I run this on my NAS, but that turns out to be a pain sometimes because it means that if my home Internet goes down (e.g. thanks to a power cut, which we have from time to time), I lose access to the first and last place I
go on the Internet! So I’d quite like to move that to somewhere on the open Internet. Haven’t worked out where yet.
Next steps
It’s felt good so far to consolidate and tidy-up my personal web hosting (and to rediscover some old projects I’d forgotten about). There’s work still to do, but I’m expecting to spend
a few months not-doing-my-day-job very soon, so I’m hoping to find the opportunity to finish it then!
If you’re active in the WordPress space you’re probably aware that there’s a lot of drama going on right now between (a) WordPress hosting company WP Engine, (b) WordPress
hosting company (among quiteafewotherthings) Automattic1,
and (c) the WordPress Foundation.
If you’re not aware then, well: do a search across the tech news media to see the latest: any summary I could give you would be out-of-date by the time you read it anyway!
I tried to draw a better diagram with more of the relevant connections, but it quickly turned into spaghetti.
In particular, I think a lot of the conversation that he kicked off conflates three different aspects of WP Engine’s misbehaviour. That muddies the waters when it comes to
having a reasoned conversation about the issue3.
I’ve heard Matt speak a number of times, including in person… and I think he did a pretty bad job of expressing the problems with WP Engine during his Q&A at WCUS. In his defence,
it sounds like he may have been still trying to negotiate a better way forward until the very second he walked on stage that day.
I don’t think WP Engine is a particularly good company, and I personally wouldn’t use them for WordPress hosting. That’s not a new opinion for me: I wouldn’t have used them last year or
the year before, or the year before that either. And I broadly agree with what I think Matt tried to say, although not necessarily with the way he said it or the platform he
chose to say it upon.
Misdeeds
As I see it, WP Engine’s potential misdeeds fall into three distinct categories: moral, ethical4,
and legal.
Morally: don’t take without giving back
Matt observes that since WP Engine’s acquisition by huge tech-company-investor Silver Lake, WP Engine have made enormous profits from selling WordPress hosting as a service (and nothing else) while
making minimal to no contributions back to the open source platform that they depend upon.
If true, and it appears to be, this would violate the principle of reciprocity. If you benefit from somebody else’s
effort (and you’re able to) you’re morally-obliged to at least offer to give back in a manner commensurate to your relative level of resources.
The principle of reciprocity is a moral staple. This is evidenced by the fact that children (and some nonhuman animals) seem to be able to work it out for themselves
from first principles using nothing more than empathy. Companies, however aren’t usually so-capable. Photo courtesy Cotton.
Abuse of this principle is… sadly not-uncommon in business. Or in tech. Or in the world in general. A lightweight example might be the many millions of profitable companies that host
atop the Apache HTTP Server without donating a penny to the Apache Foundation. A heavier (and legally-backed) example might be Trump Social’s
implementation being based on a modified version of Mastodon’s code:
Mastodon’s license requires that their changes are shared publicly… but they don’t do until they’re sent threatening letters reminding them of their obligations.
I feel like it’s fair game to call out companies that act amorally, and encourage people to boycott them, so long as you do so without “punching down”.
Ethically: don’t exploit open source’s liberties as weaknesses
WP Engine also stand accused of altering the open source code that they host in ways that maximise their profit, to the detriment of both their customers and the original authors of
that code5.
It’s well established, for example, that WP Engine disable the “revisions” feature of WordPress6.
Personally, I don’t feel like this is as big a deal as Matt makes out that it is (I certainly wouldn’t go as far as saying “WP
Engine is not WordPress”): it’s pretty commonplace for large hosting companies to tweak the open source software that they host to better fit their architecture and business model.
But I agree that it does make WordPress as-provided by WP Engine significantly less good than would be expected from virtually any other host (most of which, by the way, provide much
better value-for-money at any price point).
There’s nothing to stop me from registering TurdPress.com and providing a premium WordPress web hosting solution with all the best features disabled: I could even disable
exports so that my customers wouldn’t even be able to easily leave my service for greener pastures! There’s nothing stop me… but that wouldn’t make it
right7.
It also looks like WP Engine may have made more-nefarious changes, e.g. modifying the referral links in open source code (the thing that earns money for the original authors of
that code) so that WP Engine can collect the revenue themselves when they deploy that code to their customers’ sites. That to me feels like it’s clearly into the zone ethical bad
practice. Within the open source community, it’s not okay to take somebody’s code, which they were kind enough to release under a liberal license, strip out the bits that provide
their income, and redistribute it, even just as a network service8.
Again, I think this is fair game to call out, even if it’s not something that anybody has a right to enforce legally. On which note…
Obviously, this is the part of the story you’re going to see the most news media about, because there’s reasonable odds it’ll end up in front of a judge at some point. There’s a good
chance that such a case might revolve around WP Engine’s willingness (and encouragement?) to allow their business to be called “WordPress Engine” and to capitalise on any confusion that
causes.
I’m not going to weigh in on the specifics of the legal case: I Am Not A Lawyer and all that. Naturally I agree with the underlying principle that one should not be allowed to profit
off another’s intellectual property, but I’ll leave discussion on whether or not that’s what WP Engine are doing as a conversation for folks with more legal-smarts than I. I’ve
certainly known people be confused by WP Engine’s name and branding, though, and think that they must be some kind of “officially-licensed” WordPress host: it happens.
If you’re following all of this drama as it unfolds… just remember to check your sources. There’s a lot of FUD floating around on the Internet right now9.
In summary…
With a reminder that I’m sharing my own opinion here and not that of my employer, here’s my thoughts on the recent WP Engine drama:
WP Engine certainly act in ways that are unethical and immoral and antithetical to the spirit of open source, and those are just a subset of the reasons that I wouldn’t use them as
a WordPress host.
Matt Mullenweg calling them out at WordCamp US doesn’t get his point across as well as I think he hoped it might, and probably won’t win him any popularity contests.
I’m not qualified to weigh in on whether or not WP Engine have violated the WordPress Foundation’s trademarks, but I suspect that they’ve benefitted from widespread confusion about
their status.
Footnotes
1 I suppose I ought to point out that Automattic is my employer, in case you didn’t know,
and point out that my opinions don’t necessarily represent theirs, etc. I’ve been involved with WordPress as an open source project for about four times as long as I’ve had any
connection to Automattic, though, and don’t always agree with them, so I’d hope that it’s a given that I’m speaking my own mind!
2 Though like Manu, I don’t
think that means that Matt should take the corresponding blog post down: I’m a digital preservationist, as might be evidenced by the unrepresentative-of-me and frankly embarrassing
things in the 25-year archives of this blog!
3 Fortunately the documents that the lawyers for both sides have been writing are much
clearer and more-specific, but that’s what you pay lawyers for, right?
4 There’s a huge amount of debate about the difference between morality and ethics, but
I’m using the definition that means that morality is based on what a social animal might be expected to decide for themselves is right, think e.g. the Golden Rule etc., whereas ethics is the code of conduct expected within a particular community. Take stealing, for example,
which covers the spectrum: that you shouldn’t deprive somebody else of something they need, is a moral issue; that we as a society deem such behaviour worthy of exclusion is an
ethical one; meanwhile the action of incarcerating burglars is part of our legal framework.
5 Not that nobody’s immune to making ethical mistakes. Not me, not you, not anybody else.
I remember when, back in 2005, Matt fucked up by injecting ads into WordPress (which at that point didn’t have a reliable source of
funding). But he did the right thing by backpedalling, undoing the harm, and apologising publicly and profusely.
6 WP Engine claim that they disable revisions for performance reasons, but that’s clearly
bullshit: it’s pretty obvious to me that this is about making hosting cheaper. Enabling revisions doesn’t have a performance impact on a properly-configured multisite hosting system,
and I know this from personal experience of running such things. But it does have a significant impact on how much space you need to allocate to your users, which has cost
implications at scale.
7 As an aside: if a court does rule that WP Engine is infringing upon
WordPress trademarks and they want a new company name to give their service a fresh start, they’re welcome to TurdPress.
8 I’d argue that it is okay to do so for personal-use though: the difference for
me comes when you’re making a profit off of it. It’s interesting to find these edge-cases in my own thinking!
9 A typical Reddit thread is about 25% lies-and-bullshit; but you can double that for a
typical thread talking about this topic!
Last month I implemented an alternative mode to view this website “like it’s 1999”, complete with with cursor trails, 88×31 buttons, tables for
layout1,
tiled backgrounds, and even a (fake) hit counter.
One thing I’d have liked to do for 1999 Mode but didn’t get around to would have been to make the images look like it was the 90s, too.
Back then, many Web users only had graphics hardware capable of displaying 256 distinct colours. Across different platforms and operating systems, they weren’t even necessarily
the same 256 colours2!
But the early Web agreed on a 216-colour palette that all those 8-bit systems could at least approximate pretty well.
I had an idea that I could make my images look “216-colour”-ish by using CSS to apply an SVG filter, but didn’t implement it.
But Spencer, a long-running source of excellent blog comments, stepped up and wrote an SVG
filter for me! I’ve tweaked 1999 Mode already to use it… and I’ve just got to say it’s excellent: huge thanks, Spencer!
The filter coerces colours to their nearest colour in the “Web safe” palette, resulting in things like this:
The flat surfaces are particularly impacted in this photo (as manipulated by the CSS SVG filter described above). Subtle hues and the gradients coalesce into slabs of colour, giving
them an unnatural and blocky appearance.
Plenty of pictures genuinely looked like that on the Web of the 1990s, especially if you happened to be using a computer only capable of 8-bit colour to view a page built by
somebody who hadn’t realised that not everybody would experience 24-bit colour like they did3.
Dithering
But not all images in the “Web safe” palette looked like this, because savvy web developers knew to dither their images when converting them to a limited palette.
Let’s have another go:
This image uses exactly the same 216-bit colour palette as the previous one, but looks a lot more “natural” thanks to the Floyd–Steinberg dithering algorithm.
Dithering introduces random noise to media4
in order to reduce the likelihood that a “block” will all be rounded to the same value. Instead; in our picture, a block of what would otherwise be the same colour ends up being rounded
to maybe half a dozen different colours, clustered together such that the ratio in a given part of the picture is, on average, a better approximation of the correct
colour.
The result is analogous to how halftone printing – the aesthetic of old comics and newspapers, with different-sized dots made from
few colours of ink – produces the illusion of a continuous gradient of colour so long as you look at it from far-enough away.
Zooming in makes it easy to see the noisy “speckling” effect in the dithered version, but from a distance it’s almost invisible.
The other year I read a spectacular article by Surma that explained in a very-approachable way
how and why different dithering algorithms produce the results they do. If you’ve any interest whatsoever in a deep dive or just want to know what blue noise is and why you
should care, I’d highly recommend it.
You used to see digital dithering everywhere, but nowadays it’s so rare that it leaps out as a revolutionary aesthetic when, for example, it gets used in
a video game.
Dithering can be so effective that it can even make an image “work” all the way down to 1-bit (i.e. true monochrome/black-and-white) colour. Here I’ve used Jarvis, Judice & Ninke’s dithering algorithm, which is highly-effective for picking out subtle colour differences in what would
otherwise be extreme dark and light patches, at the expense of being more computationally-expensive (to initially create) than other dithering strategies.
All of which is to say that: I really appreciate Spencer’s work to make my “1999 Mode” impose a 216-colour palette on images. But while it’s closer to the truth, it still doesn’t
quite reflect what my website would’ve looked like in the 1990s because I made extensive use of dithering when I saved my images in Web safe palettes5.
Why did I take the time to dither my images, back in the day? Because doing the hard work once, as a creator of graphical Web pages, saves time and computation (and can look
better!), compared to making every single Web visitor’s browser do it every single time.
Which, now I think about it, is a lesson that’s still true today (I’m talking to you, developers who send a tonne of JavaScript and ask my browser to generate the HTML for you
rather than just sending me the HTML in the first place!).
Footnotes
1 Actually, my “1999 mode” doesn’t use tables for layout; it pretty much only applies a
CSS overlay, but it’s deliberately designed to look a lot like my blog did in 1999, which did use tables for layout. For those too young to remember: back before CSS
gave us the ability to lay out content in diverse ways, it was commonplace to use a table – often with the borders and cell-padding reduced to zero – to achieve things that today
would be simple, like putting a menu down the edge of a page or an image alongside some text content. Using tables for non-tabular data causes problems, though: not only is
it hard to make a usable responsive website with them, it also reduces the control you have over the order of the content, which upsets some kinds of accessibility
technologies. Oh, and it’s semantically-invalid, of course, to describe something as a table if it’s not.
2Perhaps as few as 22 colours were defined the same across all
widespread colour-capable Web systems. At first that sounds bad. Then you remember that 4-bit (16 colour) palettes used to look look perfectly fine in 90s videogames. But then you
realise that the specific 22 “very safe” colours are pretty shit and useless for rendering anything that isn’t composed of black, white, bright red, and maybe one of a few
greeny-yellows. Ugh. For your amusement, here’s a copy of the image rendered using only the “very safe” 22 colours.
3 Spencer’s SVG filter does pretty-much the same thing as a computer might if asked to
render a 24-bit colour image using only 8-bit colour. Simply “rounding” each pixel’s colour to the nearest available colour is a fast operation, even on older hardware and with larger
images.
4 Note that I didn’t say “images”: dithering is also used to produce the same “more
natural” feel for audio, too, when reducing its bitrate (i.e. reducing the number of finite states into which the waveform can be quantised for digitisation), for example.
5 I’m aware that my footnotes are capable of nerdsniping Spencer, so by writing this
there’s a risk that he’ll, y’know, find a way to express a dithering algorithm as an SVG filter too. Which I suspect isn’t possible, but who knows! 😅
I was browsing (BBC) Good Food today when I noticed something I’d not seen before: a “premium” recipe, available on their “app only”:
I clicked on the “premium” recipe and… it
looked just like any other recipe. I guess it’s not actually restricted after all?
Just out of curiosity, I fired up a more-vanilla web browser and tried to visit the same page. Now I saw an overlay and modal attempting1 to
restrict access to the content:
It turns out their entire effort to restrict access to their premium content… is implemented in client-side JavaScript. Even when I did see the overlay and not get access to
the recipe, all I needed to do was open my browser’s debugger and run document.body.classList.remove('tp-modal-open'); for(el of document.querySelectorAll('.tp-modal,
.tp-backdrop')) el.remove(); and all the restrictions were lifted.
What a complete joke.
Why didn’t I even have to write my JavaScript two-liner to get past the restriction in my primary browser? Because I’m running privacy-protector Ghostery, and one of the services Ghostery blocks by-default is one called Piano. Good Food uses Piano to segment their audience in your
browser, but they haven’t backed that by any, y’know, actual security so all of their content, “premium” or not, is available to anybody.
I’m guessing that Immediate Media (who bought the BBC Good Food brand a while back and have only just gotten around to stripping “BBC” out of
the name) have decided that an ad-supported model isn’t working and have decided to monetise the site a little differently2.
Unfortunately, their attempt to differentiate premium from regular content was sufficiently half-hearted that I barely noticed that, too, gliding through the paywall without
even noticing were it not for the fact that I wondered why there was a “premium” badge on some of their recipes.
You know what website I miss? OpenSourceFood.com. It went downhill and then died around 2016, but for a while it was excellent.
Recipes probably aren’t considered a high-value target, of course. But I can tell you from experience that sometimes companies make basically this same mistake with much
more-sensitive systems. The other year, for example, I discovered (and ethically disclosed) a fault in the implementation of the login forms of a major UK mobile network that meant that
two-factor authentication could be bypassed entirely from the client-side.
These kinds of security mistakes are increasingly common on the Web as we train developers to think about the front-end first (and sometimes, exclusively). We need to do
better.
Footnotes
1 The fact that I could literally see the original content behind the modal
was a bit of a giveaway that they’d only hidden it, not actually protected it in any way.
2 I can see why they’d think that: personally, I didn’t even know there were ads
on the site until I did the experiment above: turns out I was already blocking them, too, along with any anti-ad-blocking scripts that might have been running alongside.
I’ve been playing with the (pre-Alpha version of) Ladybird, and it fills me with such joy and excitement.
As you can see, Ladybird does a perfectly adequate job of rendering this page, including most of its CSS and
virtually all of its JavaScript.
Browser diversity
Back in 2018, while other Web developers were celebrating, I expressed my dismay at the news that Microsoft Edge was on the cusp of switching
from using Microsoft’s own browser engine EdgeHTML to using Blink. Blink is the engine that powers almost all other mainstream browsers; all but Firefox, which continues to
stand atop Gecko.
The developers who celebrated this loss of rendering engine diversity were, I suppose, happy to have one fewer browser in which they must necessarily test their work. I guess these are
the same developers who don’t test the sites they develop for accessibility (does your site work if you can’t see the images? what about with a keyboard but without a pointing device?
how about if you’re colourblind?), or consider what might happen if a part of their site fails (what if the third-party CDN
that hosts your JavaScript libraries goes down or is blocked by the user’s security software or their ISP?).
When was the last time you tested your site in a text-mode browser?
But I was sad, because – as I observed after Andre
Alves Garzia succinctly spelled it out – browser engines are an endangered species. Building a new browser that supports the myriad complexities of the
modern Web is such a huge endeavour that it’s unlikely to occur from scratch: from this point on, all “new” browsers are likely to be based upon an existing browser engine.
Engine diversity is important. Last time we had a lull in engine diversity, the Web got stuck, stagnating in the
shadow of Internet Explorer 6’s dominance and under the thumb of Microsoft’s interests. I don’t want those days to come back; that’s a big part of why Firefox is my primary web browser.
A Ladybird book browser
I actually still own a copy of the book from which I adapted this cover!
Ladybird is a genuine new browser engine. Y’know, that thing I said that we might never see happen again! So how’ve they made it happen?
It helps that it’s not quite starting from scratch. It’s starting point is the HTML viewer component from SerenityOS. And… it’s pretty good. It’s DOM processing’s solid, it seems to support enough JavaScript and CSS that the modern Web is usable, even if it’s not beautiful 100% of the time.
I’ve certainly seen browsers do worse than this at Acid3 and related tests…
They’re not even expecting to make an Alpha release until next year! Right now if you want to use it at all, you’re going to need to compile the code for yourself and fight with a
plethora of bugs, but it works and that, all by itself, is really exciting.
They’ve got four full-time engineers, funded off donations, with three more expected to join, plus a stack of volunteer contributors on Github. I’ve raised my first issue against the repo; sadly my C++ probably isn’t strong enough to be able to help more-directly, even if I
somehow did have enough free time, which I don’t. But I’ll be watching-from-afar this wonderful, ambitious, and ideologically-sound initiative.