Is it possible to allow sideloading *and* keep users safe?

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Terence Eden raises some valid points:

I’ve tried to be pragmatic, but there’s something of a dilemma here.

  1. Users should be free to run whatever code they like.
  2. Vulnerable members of society should be protected from scams.

Do we accept that a megacorporation should keep everyone safe at the expense of a few pesky nerds wanting to run some janky code?

Do we say that the right to run free software is more important than granny being protected from scammers?

Do we pour billions into educating users not to click “yes” to every prompt they see?

Do we try and build a super-secure Operating System which, somehow, gives users complete freedom without exposing them to risk?

Do we hope that Google won’t suddenly start extorting developers, users, and society as a whole?

Do we chase down and punish everyone who releases a scam app?

Do we stick an AI on every phone to detect scam apps and refuse to run them if they’re dodgy?

I don’t know the answers to any of these questions and – if I’m honest – I don’t like asking them.

Google’s gradual locking-down of Android bothers me, too. I’ve rooted many of my phones in order to unlock features that I benefit from (as a developer… and as a nerd!), and it’s bugged me on the occasions where I’ve been unable to run had to use complicated workarounds to trick e.g. a bank’s app. Having gone to the effort to root a phone – which remains outside of the reach of most regular users – I’d be happy to accept an appropriate share of the liability if my mistake, y’know, let a scammer steal all of my money.

That’s the risk you take with any device on which you have root, and it’s why we make it hard to the point of being discouraging. Because you can’t just put up a warning and hope that users will read and understand it, because they won’t. They’ll just click whatever button looks like it’ll get them to the next step without even glancing at the danger signs1.

I’m glad to have been increasingly decoupling myself from Google’s ecosystem, because I’ve been burned by it too. Like Terence, I’ve been hit by “real name” policies that discriminate against people with unusual names or who might be at risk of impersonation2. But I’m not convinced that there’s a good alternative for me to running Android on my mobile devices, at the moment: I really enjoyed Maemo back in the day; what’s the status of Sailfish nowadays?

I get that we need to protect people from dangerous scammy apps. But I’d like to think there’s a middle-ground somewhere between Doctrowian “it’s your device, you’re responsible for what runs on it” and the growing Apple/Google thinking of “if we don’t have the targetting coordinates of the developer that wrote the code, our OS won’t let you run it”. I’m ready to concede that user education alone hasn’t worked, but there’s got to be a better solution than this, Google.

Footnotes

1 Incidentally, I don’t blame users for this behaviour. Users have absolutely been conditioned, and continue to be conditioned, to click-without-reading. Cookie and privacy banners with dark patterns, EULAs and legal small print are notoriously (and often unnecessarily) long and convoluted, and companies routinely try to blur the line between “serious thing you should really read but we want you not to” and “trivial thing that you don’t need to read; it’s just a formality that we have to say it”.

2 Right now, my biggest fight with Google has come from the fact that lately, it seems like every time I upload a Three Rings demo video to YouTube it gets deleted under their harassment policy for doxxing people… people like “Alan Fakename” from Somewhereville, “Betty Notaperson” from Otherplace, and their friend “Chris McMadeup” who lives at 123 Imaginary Street. The appeals process turns out to be that you click a button to appeal, but don’t get to provide any further information (e.g. to explain that these are clearly-fake people who won’t mind being doxxed on account of the fact that they don’t exist), and then a few hours later you get an email to say “nah, we’re keeping it deleted”. I almost expect the YouTube version of my recent video demonstrating FreeDeedPoll.org.uk will be next to be targetted by this policy for showing me scribbling the purported signature Sam McRealName, formerly known as Jo Genuine-Person.

Note #26931

I got kicked off LinkedIn this week. Apparently there was “suspicious behaviour” on my account. To get back in, I needed to go through Persona’s digital ID check (this, despite the fact that I’ve got a Persona-powered verification on my LinkedIn, less than six months old).

After looping around many times identifying which way up a picture of a dog was and repeatedly photographing myself, my passport, and my driving license, I eventually got back in. Personally, I suspect they just rolled out some Online Safety Act functionality and it immediately tripped over my unusual name.

But let this be a reminder to anybody who (unlike me) depends upon their account in a social network: it can be taken away in a moment and be laborious (or impossible) to get back. If you care about your online presence, you should own your own domain name; simple as that!

Lock All The Computers

I wanted a way to simultaneously lock all of the computers – a mixture of Linux, MacOS and Windows boxen – on my desk, when I’m going to step away. Here’s what I came up with:

There’s optional audio in this video, if you want it.

One button. And everything locks. Nice!

Here’s how it works:

  1. The mini keyboard is just 10 cheap mechanical keys wired up to a CH552 chip. It’s configured to send CTRL+ALT+F13 through CTRL+ALT+F221 when one of its keys are pressed.
  2. The “lock” key is captured by my KVM tool Deskflow (which I migrated to when Barrier became neglected, which in turn I migrated to when I fell out of love with Synergy). It then relays this hotkey across to all currently-connected machines2.
  3. That shortcut is captured by each recipient machine in different ways:
    • The Linux computers run LXDE, so I added a line to /etc/xdg/openbox/rc.xml to set a <keybind> that executes xscreensaver-command -lock.
    • For the Macs, I created a Quick Action in Automator that runs pmset displaysleepnow as a shell script3, and then connected that via Keyboard Shortcuts > Services.
    • On the Windows box, I’ve got AutoHotKey running anyway, so I just have it run { DllCall("LockWorkStation") } when it hears the keypress.

That’s all there is to is! A magic “lock all my computers, I’m stepping away” button, that’s much faster and more-convenient than locking two to five computers individually.

Footnotes

1 F13 through F24 are absolutely valid “standard” key assignments, of course: it’s just that the vast majority of keyboards don’t have keys for them! This makes them excellent candidates for non-clashing personal-use function keys, but I like to append one or more modifier keys to the as well to be absolutely certain that I don’t interact with things I didn’t intend to!

2 Some of the other buttons on my mini keyboard are mapped to “jumping” my cursor to particular computers (if I lose it, which happens more often than I’d like to admit), and “locking” my cursor to the system it’s on.

3 These boxes are configured to lock as soon as the screen blanks; if yours don’t then you might need a more-sophisticated script.

Google Shared My Phone Number!

Duration

Podcast Version

This post is also available as a podcast. Listen here, download for later, or subscribe wherever you consume podcasts.

Earlier this month, I received a phone call from a user of Three Rings, the volunteer/rota management software system I founded1.

We don’t strictly offer telephone-based tech support – our distributed team of volunteers doesn’t keep any particular “core hours” so we can’t say who’s available at any given time – but instead we answer email/Web based queries pretty promptly at any time of the day or week.

But because I’ve called-back enough users over the years, it’s pretty much inevitable that a few probably have my personal mobile number saved. And because I’ve been applying for a couple of interesting-looking new roles, I’m in the habit of answering my phone even if it’s a number I don’t recognise.

Dan sits at his laptop in front of a long conference table where a group of people are looking at a projector screen.
Many of the charities that benefit from Three Rings seem to form the impression that we’re all just sat around in an office, like this. But in fact many of my fellow volunteers only ever see me once or twice a year!

After the first three such calls this month, I was really starting to wonder what had changed. Had we accidentally published my phone number, somewhere? So when the fourth tech support call came through, today (which began with a confusing exchange when I didn’t recognise the name of the caller’s charity, and he didn’t get my name right, and I initially figured it must be a wrong number), I had to ask: where did you find this number?

“When I Google ‘Three Rings login’, it’s right there!” he said.

Google Search results page for 'Three Rings CIC', showing a sidebar with information about the company and including... my personal mobile number and a 'Call' button that calls it!
I almost never use Google Search2, so there’s no way I’d have noticed this change if I hadn’t been told about it.

He was right. A Google search that surfaced Three Rings CIC’s “Google Business Profile” now featured… my personal mobile number. And a convenient “Call” button that connects you directly to it.

'Excuse me' GIF reaction. A white man blinks and looks surprised.

Some years ago, I provided my phone number to Google as part of an identity verification process, but didn’t consent to it being shared publicly. And, indeed, they didn’t share it publicly, until – seemingly at random – they started doing so, presumably within the last few weeks.

Concerned by this change, I logged into Google Business Profile to see if I could edit it back.

Screenshot from Google Business Profile, with my phone number and the message 'Your phone number was updated by Google.'.
Apparently Google inserted my personal mobile number into search results for me, randomly, without me asking them to. Delightful.

I deleted my phone number from the business listing again, and within a few minutes it seemed to have stopped being served to random strangers on the Internet. Unfortunately deleting the phone number also made the “Your phone number was updated by Google” message disappear, so I never got to click the “Learn more” link to maybe get a clue as to how and why this change happened.

Last month, high-street bank Halifax posted the details of a credit agreement I have with them to two people who aren’t me. Twice in two months seems suspicious. Did I accidentally click the wrong button on a popup and now I’ve consented to all my PII getting leaked everywhere?

Spoof privacy settings popup, such as you might find on a website, reading: We and our partners work very hard to keep your data safe and secure and to operate within the limitations of the law. It's really hard! Can you give us a break and make it easier for us by consenting for us to not have to do that? By clicking the 'I Agree' button, you consent to us and every other company you do business with to share your personal information with absolutely anybody, at any time, for any reason, forever. That's cool, right?
Don’t you hate it when you click the wrong button. Who reads these things, anyway, right?

Such feelings of rage.

Footnotes

1 Way back in 2002! We’re very nearly at the point where the Three Rings system is older than the youngest member of the Three Rings team. Speaking of which, we’re seeking volunteers to help expand our support team: if you’ve got experience of using Three Rings and an hour or two a week to spare helping to make volunteering easier for hundreds of thousands of people around the world, you should look us up!

2 Seriously: if you’re still using Google Search as your primary search engine, it’s past time you shopped around. There are great alternatives that do a better job on your choice of one or more of the metrics that might matter to you: better privacy, fewer ads (or more-relevant ads, if you want), less AI slop, etc.

× × × × ×

Halifax Shared My Credit Agreement!

Remember that hilarious letter I got from British Gas a in 2023 where they messed-up my name? This is like that… but much, much worse.

Today, Ruth and JTA received a letter. It told them about an upcoming change to the agreement of their (shared, presumably) Halifax credit card.

Except… they don’t have a shared Halifax credit card. Could it be a scam? Some sort of phishing attempt, maybe, or perhaps somebody taking out a credit card in their names?

I happened to be in earshot and asked to take a look at the letter, and was surprised to discover that all of the other details – the last four digits of the card, the credit limit, etc. – all matched my Halifax credit card.

Carefully-censored letter from Halifax, highlighting the parts that show my correct address, last four digits of my card, and my credit limit... and where it shows a pair of names that are not mine.
Halifax sent a letter to me, about my credit card… but addressed it to… two other people I live with

I spent a little over half an hour on the phone with Halifax, speaking to two different advisors, who couldn’t fathom what had happened or how. My credit card is not (and has never been) a joint credit card, and the only financial connection I have to Ruth and JTA is that I share a mortgage with them. My guess is that some person or computer at Halifax tried to join-the-dots from the mortgage outwards and re-assigned my credit card to them, instead?

Eventually I had to leave to run an errand, so I gave up on the phone call and raised a complaint with Halifax in writing. They’ve promised to respond within… eight weeks. Just brilliant.

×

Generative AI use and human agency

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

5. If you use AI, you are the one who is accountable for whatever you produce with it. You have to be certain that whatever you produced was correct. You cannot ask the system itself to do this. You must either already be expert at the task you are doing so you can recognise good output yourself, or you must check through other, different means the validity of any output.

9. Generative AI produces above average human output, but typically not top human output. If you overuse generative AI you may produce more mediocre output than you are capable of.

I was also tempted to include in 9 as a middle sentence “Note that if you are in an elite context, like attending a university, above average for humanity widely could be below average for your context.”

In this excellent post, Joanna says more-succinctly what I was trying to say in my comments on “AI Is Reshaping Software Engineering — But There’s a Catch…” a few days ago. In my case, I was talking very-specifically about AI as a programmer’s assistant, and Joanna’s points 5. and 9. are absolutely spot on.

Point 5 is a reminder that, as I’ve long said, you can’t trust an AI to do anything that you can’t do for yourself. I sometimes use a GenAI-based programming assistant, and I can tell you this – it’s really good for:

  • Fancy autocomplete: I start typing a function name, it guesses which variables I’m going to be passing into the function or that I’m going to want to loop through the output or that I’m going to want to return-early f the result it false. And it’s usually right. This is smart, and it saves me keypresses and reduces the embarrassment of mis-spelling a variable name1.
  • Quick reference guide: There was a time when I had all of my PHP DateTimeInterface::format character codes memorised. Now I’d have to look them up. Or I can write a comment (which I should anyway, for the next human) that says something like // @returns String a date in the form: Mon 7th January 2023 and when I get to my date(...) statement the AI will already have worked out that the format is 'D jS F Y' for me. I’ll recognise a valid format when I see it, and I’ll be testing it anyway.
  • Boilerplate: Sometimes I have to work in languages that are… unnecessarily verbose. Rather than writing a stack of setters and getters, or laying out a repetitive tree of HTML elements, or writing a series of data manipulations that are all subtly-different from one another in ways that are obvious once they’ve been explained to you… I can just outsource that and then check it2.
  • Common refactoring practices: “Rewrite this Javascript function so it doesn’t use jQuery any more” is a great example of the kind of request you can throw at an LLM. It’s already ingested, I guess, everything it could find on StackOverflow and Reddit and wherever else people go to bemoan being stuck with jQuery in their legacy codebase. It’s not perfect – just like when it’s boilerplating – and will make stupid mistakes3 but when you’re talking about a big function it can provide a great starting point so long as you keep the original code alongside, too, to ensure it’s not removing any functionality!

Other things… not so much. The other day I experimentally tried to have a GenAI help me to boilerplate some unit tests and it really failed at it. It determined pretty quickly, as I had, that to test a particular piece of functionality need to mock a function provided by a standard library, but despite nearly a dozen attempts to do so, with copious prompting assistance, it couldn’t come up with a working solution.

Overall, as a result of that experiment, I was less-effective as a developer while working on that unit test than I would have been had I not tried to get AI assistance: once I dived deep into the documentation (and eventually the source code) of the underlying library I was able to come up with a mocking solution that worked, and I can see why the AI failed: it’s quite-possibly never come across anything quite like this particular problem in its training set.

Solving it required a level of creativity and a depth of research that it was simply incapable of, and I’d clearly made a mistake in trying to outsource the problem to it. I was able to work around it because I can solve that problem.

But I know people who’ve used GenAI to program things that they wouldn’t be able to do for themselves, and that scares me. If you don’t understand the code your tool has written, how can you know that it does what you intended? Most developers have a blind spot for testing and will happy-path test their code without noticing if they’ve introduced, say, a security vulnerability owing to their handling of unescaped input or similar… and that’s a problem that gets much, much worse when a “developer” doesn’t even look at the code they deploy.

Security, accessibility, maintainability and performance – among others, I’ve no doubt – are all hard problems that are not made easier when you use an AI to write code that you don’t understand.

Footnotes

1 I’ve 100% had an occasion when I’ve called something $theUserID in one place and then $theUserId in another and not noticed the case difference until I’m debugging and swearing at the computer

2 I’ve described the experience of using an LLM in this way as being a little like having a very-knowledgeable but very-inexperienced junior developer sat next to me to whom I can pass off the boring tasks, so long as I make sure to check their work because they’re so eager-to-please that they’ll choose to assume they know more than they do if they think it’ll briefly impress you.

3 e.g. switching a selector from $(...) to document.querySelector but then failing to switch the trailing .addClass(...) to .classList.add(...)– you know: like an underexperienced but eager-to-please dev!

Reply to Vika, re: Content-Security-Policy

This is a reply to a post published elsewhere. Its content might be duplicated as a traditional comment at the original source.

Vika said:

Had a fight with the Content-Security-Policy header today. Turns out, I won, but not without sacrifices.
Apparently I can’t just insert <style> tags into my posts anymore, because otherwise I’d have to somehow either put nonces on them, or hash their content (which would be more preferrable, because that way it remains static).

I could probably do the latter by rewriting HTML at publish-time, but I’d need to hook into my Markdown parser and process HTML for that, and, well, that’s really complicated, isn’t it? (It probably is no harder than searching for Webmention links, and I’m overthinking it.)

I’ve had this exact same battle.

Obviously the intended way to use nonces in a Content-Security-Policy is to have the nonce generated, injected, and served in a single operation. So in PHP, perhaps, you might do something like this:

<?php
  $nonce = bin2hex(random_bytes(16));
  header("Content-Security-Policy: script-src 'nonce-$nonce'");
?>
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>PHP CSP Nonce Test</title>
</head>
<body>
  <h1>PHP CSP Nonce Test</h1>
  <p>
    JavaScript did not run.
  </p>

  <!-- This JS has a valid nonce: -->
  <script nonce="<?php echo $nonce; ?>">
    document.querySelector('p').textContent = 'JavaScript ran successfully.';
  </script>

  <!-- This JS does not: -->
  <script nonce="wrong-nonce">
    alert('The bad guys won!');
  </script>
</body>
</html>
Viewing this page in a browser (with Javascript enabled) should show the text “JavaScript ran successfully.”, but should not show an alertbox containing the text “The bad guys won!”.

But for folks like me – and you too, Vika,, from the sounds of things – who serve most of their pages, most of the time, from the cache or from static HTML files… and who add the CSP header on using webserver configuration… this approach just doesn’t work.

I experimented with a few solutions:

  • A long-lived nonce that rotates.
    CSP allows you to specify multiple nonces, so I considered having a rotating nonce that was applied to pages (which were then cached for a period) and delivered by the header… and then a few hours later a new nonce would be generated and used for future page generations and appended to the header… and after the cache expiry time the oldest nonces were rotated-out of the header and became invalid.
  • Dynamic nonce injection.
    I experimented with having the webserver parse pages and add nonces: randomly generating a nonce, putting it in the header, and then basically doing a s/<script/<script nonce="..."/ to search-and-replace it in.

Both of these are terrible solutions. The first one leaves a window of, in my case, about 24 hours during which a successfully-injected script can be executed. The second one effectively allowlists all scripts, regardless of their provenance. I realised that what I was doing was security theatre: seeking to boost my A-rating to an A+-rating on SecurityHeaders.com without actually improving security at all.

But the second approach gave me an idea. I could have a server-side secret that gets search-replaced out. E.g. if I “signed” all of my legitimate scripts with something like <script nonce="dans-secret-key-goes-here" ...> then I could replace s/dans-secret-key-goes-here/actual-nonce-goes-here/ and thus have the best of both worlds: static, cacheable pages, and actual untamperable nonces. So long as I took care to ensure that the pages were never delivered to anybody with the secret key still intact, I’d be sorted!

Alternatively, I was looking into whether Caddy can do something like mod_asis does for Apache: that is, serve a file “as is”, with headers included in the file. That way, I could have the CSP header generated with the page and then saved into the cache, so it’s delivered with the same none every time… until the page changes. I’d love more webservers to have an “as is” mode, but I appreciate that might be a big ask (Apache’s mechanism, I suspect, exploits the fact that HTTP/1.0 and HTTP/1.1 literally send headers, followed by two CRLFs, then content… but that’s not what happens in HTTP/2+).

So yeah, I’ll probably do a server-side-secret approach, down the line. Maybe that’ll work for you, too.

UK’s secret Apple iCloud backdoor order is a global emergency, say critics

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

In its latest attempt to erode the protections of strong encryption, the U.K. government has reportedly secretly ordered Apple to build a backdoor that would allow British security officials to access the encrypted cloud storage data of Apple customers anywhere in the world.

The secret order — issued under the U.K.’s Investigatory Powers Act 2016 (known as the Snoopers’ Charter) — aims to undermine an opt-in Apple feature that provides end-to-end encryption (E2EE) for iCloud backups, called Advanced Data Protection. The encrypted backup feature only allows Apple customers to access their device’s information stored on iCloud — not even Apple can access it.

Sigh. A continuation of a long-running saga of folks here in the UK attempting to make it easier for police to catch a handful of (stupid) criminals1… at the expense of making millions of people more-vulnerable to malicious hackers2.

If we continue on this path, it’ll only be a short number of years before you see a headline about a national secret, stored by a government minister (in the kind of ill-advised manner we know happens) on iCloud or similar and then stolen by a hostile foreign power who merely needed to bribe, infiltrate, or in the worst-case hack their way into Apple’s datacentres. And it’ll be entirely our own fault.

Meanwhile the serious terrorist groups will continue to use encryption that isn’t affected by whatever “ban” the UK can put into place (Al Qaeda were known to have developed their own wrapper around PGP, for example, decades ago), the child pornography rings will continue to tunnel traffic around whatever dark web platform they’ve made for themselves (I’m curious whether they’re actually being smart or not, but that’s not something I even remotely want to research), and either will still only be caught when they get sloppy and/or as the result of good old-fashioned police investigations.

Weakened and backdoored encryption in mainstream products doesn’t help you catch smart criminals. But it does help smart criminals to catch regular folks.

Footnotes

1 The smart criminals will start – or more-likely will already be using – forms of encryption that aren’t, and can’t, be prevented by legislation. Because fundamentally, cryptography is just maths. Incidentally, I assume you know that you can send me encrypted email that nobody else can read?

2 Or, y’know, abuse of power by police.

Endless SSH Tarpit on Debian

Tarpitting SSH with Endlessh

I had a smug moment when I saw security researcher Rob Ricci and friends’ paper empirically analysing brute-force attacks against SSH “in the wild”.1 It turns out that putting all your SSH servers on “weird” port numbers – which I’ve routinely done for over a decade – remains a pretty-effective way to stop all that unwanted traffic2, whether or not you decide to enhance that with some fail2ban magic.

But then I saw a comment about Endlessh. Endlessh3 acts like an SSH server but then basically reverse-Slow-Loris’s the connecting client, very gradually feeding it an infinitely-long SSH banner and hanging it for… well, maybe 15 seconds or so but possibly up to a week.

Installing an Endlessh tarpit on Debian 12

I was just setting up a new Debian 12 server when I learned about this. I’d already moved the SSH server port away from the default 224, so I figured I’d launch Endlessh on port 22 to slow down and annoy scanners.

Installation wasn’t as easy as I’d hoped considering there’s a package. Here’s what I needed to do:

  1. Move any existing SSH server to a different port, if you haven’t already, e.g. as shown in the footnotes.
  2. Install the package, e.g.: sudo apt update && sudo apt install -y endlessh
  3. Permit Endlessh to run on port 22: sudo setcap 'cap_net_bind_service=+ep' /usr/bin/endlessh
  4. Modify /etc/systemd/system/multi-user.target.wants/endlessh.service in the following ways:
    1. uncomment AmbientCapabilities=CAP_NET_BIND_SERVICE
    2. comment PrivateUsers=true
    3. change InaccessiblePaths=/run /var into InaccessiblePaths=/var
  5. Reload the modified service: sudo systemctl daemon-reload
  6. Configure Endlessh to run on port 22 rather than its default of 2222: echo "Port 22" | sudo tee /etc/endlessh/config
  7. Start Endlessh: sudo service endlessh start

To test if it’s working, connect to your SSH server on port 22 with your client in verbose mode, e.g. ssh -vp22 example.com and look for banner lines full of random garbage appearing at 10 second intervals.

Screenshot showing SSH connection being established to an Endlessh server, which is returning line after line of randomly-generated text as a banner.

It doesn’t provide a significant security, but you get to enjoy the self-satisfied feeling that you’re trolling dozens of opportunistic script kiddies a day.

Footnotes

1 It’s a good paper in general, if that’s your jam.

2 Obviously you gain very little security by moving to an unusual port number, given that you’re already running your servers in “keys-only” (PasswordAuthentication no) configuration mode already, right? Right!? But it’s nice to avoid all the unnecessary logging that wave after wave of brute-force attempts produce.

3 Which I can only assume is pronounced endle-S-S-H, but regardless of how it’s said out loud I appreciate the wordplay of its name.

4 To move your SSH port, you might run something like echo "Port 12345" | sudo tee /etc/ssh/sshd_config.d/unusual-port.conf and restart the service, of course.

×

Note #25343

As well as the programming tasks I’m working on for Three Rings this International Volunteer Day, I’m also doing a little devops. We’ve got a new server architecture rolling out next week, and I’m tasked with ensuring that the logging on them meets our security standards.

Terminal screenshot showing a directory listing of a logs directory with several gzipped logfiles with different date-stamped suffixes, and the contents of the logrotate configuration file that produced them.

Each server’s on-device logs are retained in date-stamped files for 14 days, but they’re also backed-up offsite daily.

Those bits all seem to be working, so next I need to work out a way to add a notification to our monitoring platform if any server doesn’t successfully push a log to the offsite backup in a timely manner.

×

Good Food, Bad Authorisation

I was browsing (BBC) Good Food today when I noticed something I’d not seen before: a “premium” recipe, available on their “app only”:

Screenshot showing recipes, one of which is labelled "App only" and "Premium".

I clicked on the “premium” recipe and… it looked just like any other recipe. I guess it’s not actually restricted after all?

Just out of curiosity, I fired up a more-vanilla web browser and tried to visit the same page. Now I saw an overlay and modal attempting1 to restrict access to the content:

Overlay attempting to block content to the page beneath, saying "Try 1 year for just £9.99 and save 81%".

It turns out their entire effort to restrict access to their premium content… is implemented in client-side JavaScript. Even when I did see the overlay and not get access to the recipe, all I needed to do was open my browser’s debugger and run document.body.classList.remove('tp-modal-open'); for(el of document.querySelectorAll('.tp-modal, .tp-backdrop')) el.remove(); and all the restrictions were lifted.

What a complete joke.

Why didn’t I even have to write my JavaScript two-liner to get past the restriction in my primary browser? Because I’m running privacy-protector Ghostery, and one of the services Ghostery blocks by-default is one called Piano. Good Food uses Piano to segment their audience in your browser, but they haven’t backed that by any, y’know, actual security so all of their content, “premium” or not, is available to anybody.

I’m guessing that Immediate Media (who bought the BBC Good Food brand a while back and have only just gotten around to stripping “BBC” out of the name) have decided that an ad-supported model isn’t working and have decided to monetise the site a little differently2. Unfortunately, their attempt to differentiate premium from regular content was sufficiently half-hearted that I barely noticed that, too, gliding through the paywall without even noticing were it not for the fact that I wondered why there was a “premium” badge on some of their recipes.

Screenshot from OpenSourceFood.com, circa 2007.
You know what website I miss? OpenSourceFood.com. It went downhill and then died around 2016, but for a while it was excellent.

Recipes probably aren’t considered a high-value target, of course. But I can tell you from experience that sometimes companies make basically this same mistake with much more-sensitive systems. The other year, for example, I discovered (and ethically disclosed) a fault in the implementation of the login forms of a major UK mobile network that meant that two-factor authentication could be bypassed entirely from the client-side.

These kinds of security mistakes are increasingly common on the Web as we train developers to think about the front-end first (and sometimes, exclusively). We need to do better.

Footnotes

1 The fact that I could literally see the original content behind the modal was a bit of a giveaway that they’d only hidden it, not actually protected it in any way.

2 I can see why they’d think that: personally, I didn’t even know there were ads on the site until I did the experiment above: turns out I was already blocking them, too, along with any anti-ad-blocking scripts that might have been running alongside.

× × ×

Brainfart

Brainfart moment this morning when my password safe prompted me to unlock it with a password, and for a moment I thought to myself “Why am I having to manually type in a password? Don’t I have a password safe to do this for me?” 🤦

KeePassXC authentication screen on Windows; no password has been entered.

×

Length Extension Attack Demonstration (Video)

This post is also available as an article. So if you'd rather read a conventional blog post of this content, you can!

This is a video version of my blog post, Length Extension Attack. In it, I talk through the theory of length extension attacks and demonstrate an SHA-1 length extension attack against an (imaginary) website.

The video can also be found on:

Length Extension Attack Demonstration

This post is also available as a video. If you'd prefer to watch/listen to me talk about this topic, give it a look.

Prefer to watch/listen than read? There’s a vloggy/video version of this post in which I explain all the key concepts and demonstrate an SHA-1 length extension attack against an imaginary site.

I understood the concept of a length traversal attack and when/how I needed to mitigate them for a long time before I truly understood why they worked. It took until work provided me an opportunity to play with one in practice (plus reading Ron Bowes’ excellent article on the subject) before I really grokked it.

Would you like to learn? I’ve put together a practical demo that you can try for yourself!

Screenshot of vulnerable site with legitimate "download" link hovered.
For the demonstration, I’ve built a skeletal stock photography site whose download links are protected by a hash of the link parameters, salted using a secret string stored securely on the server. Maybe they let authorised people hotlink the images or something.

You can check out the code and run it using the instructions in the repository if you’d like to play along.

Using hashes as message signatures

The site “Images R Us” will let you download images you’ve purchased, but not ones you haven’t. Links to the images are protected by a SHA-1 hash1, generated as follows:

Diagram showing SHA1 being fed an unknown secret key and the URL params "download=free" and outputting a hash as a "download key".
The nature of hashing algorithms like SHA-1 mean that even a small modification to the inputs, e.g. changing one character in the word “free”, results in a completely different output hash which can be detected as invalid.

When a “download” link is generated for a legitimate user, the algorithm produces a hash which is appended to the link. When the download link is clicked, the same process is followed and the calculated hash compared to the provided hash. If they differ, the input must have been tampered with and the request is rejected.

Without knowing the secret key – stored only on the server – it’s not possible for an attacker to generate a valid hash for URL parameters of the attacker’s choice. Or is it?

Changing download=free to download=valuable invalidates the hash, and the request is denied.

Actually, it is possible for an attacker to manipulate the parameters. To understand how, you must first understand a little about how SHA-1 and its siblings actually work:

SHA-1‘s inner workings

  1. The message to be hashed (SECRET_KEY + URL_PARAMS) is cut into blocks of a fixed size.2
  2. The final block is padded to bring it up to the full size.3
  3. A series of operations are applied to the first block: the inputs to those operations are (a) the contents of the block itself, including any padding, and (b) an initialisation vector defined by the algorithm.4
  4. The same series of operations are applied to each subsequent block, but the inputs are (a) the contents of the block itself, as before, and (b) the output of the previous block. Each block is hashed, and the hash forms part of the input for the next.
  5. The output of running the operations on the final block is the output of the algorithm, i.e. the hash.
Diagram showing message cut into blocks, the last block padded, and then each block being fed into a function along with the output of the function for the previous block. The first function, not having a previous block, receives the IV as its secondary input. The final function outputs the hash.
SHA-1 operates on a single block at a time, but the output of processing each block acts as part of the input of the one that comes after it. Like a daisy chain, but with cryptography.

In SHA-1, blocks are 512 bits long and the padding is a 1, followed by as many 0s as is necessary, leaving 64 bits at the end in which to specify how many bits of the block were actually data.

Padding the final block

Looking at the final block in a given message, it’s apparent that there are two pieces of data that could produce exactly the same output for a given function:

  1. The original data, (which gets padded by the algorithm to make it 64 bytes), and
  2. A modified version of the data, which has be modified by padding it in advance with the same bytes the algorithm would; this must then be followed by an additional block

Illustration showing two blocks: one short and padded, one pre-padded with the same characters, receiving the same IV and producing the same output.
A “short” block with automatically-added padding produces the same output as a full-size block which has been pre-populated with the same data as the padding would add.5
In the case where we insert our own “fake” padding data, we can provide more message data after the padding and predict the overall hash. We can do this because we the output of the first block will be the same as the final, valid hash we already saw. That known value becomes one of the two inputs into the function for the block that follows it (the contents of that block will be the other input). Without knowing exactly what’s contained in the message – we don’t know the “secret key” used to salt it – we’re still able to add some padding to the end of the message, followed by any data we like, and generate a valid hash.

Therefore, if we can manipulate the input of the message, and we know the length of the message, we can append to it. Bear that in mind as we move on to the other half of what makes this attack possible.

Parameter overrides

“Images R Us” is implemented in PHP. In common with most server-side scripting languages, when PHP sees a HTTP query string full of key/value pairs, if a key is repeated then it overrides any earlier iterations of the same key.

Illustration showing variables in a query string: "?one=foo&two=bar&one=baz". When parsed by PHP, the second value of "one" ("baz") only is retained.
Many online sources say that this “last variable matters” behaviour is a fundamental part of HTTP, but it’s not: you can disprove is by examining $_SERVER['QUERY_STRING'] in PHP, where you’ll find the entire query string. You could even implement your own query string handler that instead makes the first instance of each key the canonical one, if you really wanted.6
It’d be tempting to simply override the download=free parameter in the query string at “Images R Us”, e.g. making it download=free&download=valuable! But we can’t: not without breaking the hash, which is calculated based on the entire query string (minus the &key=... bit).

But with our new knowledge about appending to the input for SHA-1 first a padding string, then an extra block containing our payload (the variable we want to override and its new value), and then calculating a hash for this new block using the known output of the old final block as the IV… we’ve got everything we need to put the attack together.

Putting it all together

We have a legitimate link with the query string download=free&key=ee1cce71179386ecd1f3784144c55bc5d763afcc. This tells us that somewhere on the server, this is what’s happening:

Generation of the legitimate hash for the (unknown) secret key a string download=free, with algorithmic padding shown.
I’ve drawn the secret key actual-size (and reflected this in the length at the bottom). In reality, you might not know this, and some trial-and-error might be necessary.7
If we pre-pad the string download=free with some special characters to replicate the padding that would otherwise be added to this final8 block, we can add a second block containing an overriding value of download, specifically &download=valuable. The first value of download=, which will be the word free followed by a stack of garbage padding characters, will be discarded.

And we can calculate the hash for this new block, and therefore the entire string, by using the known output from the previous block, like this:

The previous diagram, but with the padding character manually-added and a second block containing "&download=valuable". The hash is calculated using the known output from the first block as the IV to the function run over the new block, producing a new hash value.
The URL will, of course, be pretty hideous with all of those special characters – which will require percent-encoding – on the end of the word ‘free’.

Doing it for real

Of course, you’re not going to want to do all this by hand! But an understanding of why it works is important to being able to execute it properly. In the wild, exploitable implementations are rarely as tidy as this, and a solid comprehension of exactly what’s happening behind the scenes is far more-valuable than simply knowing which tool to run and what options to pass.

That said: you’ll want to find a tool you can run and know what options to pass to it! There are plenty of choices, but I’ve bundled one called hash_extender into my example, which will do the job pretty nicely:

$ docker exec hash_extender hash_extender \
    --format=sha1 \
    --data="download=free" \
    --secret=16 \
    --signature=ee1cce71179386ecd1f3784144c55bc5d763afcc \
    --append="&download=valuable" \
    --out-data-format=html
Type: sha1
Secret length: 16
New signature: 7b315dfdbebc98ebe696a5f62430070a1651631b
New string: download%3dfree%80%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%e8%26download%3dvaluable

I’m telling hash_extender:

  1. which algorithm to use (sha1), which can usually be derived from the hash length,
  2. the existing data (download=free), so it can determine the length,
  3. the length of the secret (16 bytes), which I’ve guessed but could brute-force,
  4. the existing, valid signature (ee1cce71179386ecd1f3784144c55bc5d763afcc),
  5. the data I’d like to append to the string (&download=valuable), and
  6. the format I’d like the output in: I find html the most-useful generally, but it’s got some encoding quirks that you need to be aware of!

hash_extender outputs the new signature, which we can put into the key=... parameter, and the new string that replaces download=free, including the necessary padding to push into the next block and your new payload that follows.

Unfortunately it does over-encode a little: it’s encoded all the& and = (as %26 and %3d respectively), which isn’t what we wanted, so you need to convert them back. But eventually you end up with the URL: http://localhost:8818/?download=free%80%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%e8&download=valuable&key=7b315dfdbebc98ebe696a5f62430070a1651631b.

Browser at the resulting URL, showing the "valuable" image (a pile of money).
Disclaimer: the image you get when you successfully exploit the test site might not actually be valuable.

And that’s how you can manipulate a hash-protected string without access to its salt (in some circumstances).

Mitigating the attack

The correct way to fix the problem is by using a HMAC in place of a simple hash signature. Instead of calling sha1( SECRET_KEY . urldecode( $params ) ), the code should call hash_hmac( 'sha1', urldecode( $params ), SECRET_KEY ). HMACs are theoretically-immune to length extension attacks, so long as the output of the hash function used is functionally-random9.

Ideally, it should also use hash_equals( $validDownloadKey, $_GET['key'] ) rather than ===, to mitigate the possibility of a timing attack. But that’s another story.

Footnotes

1 This attack isn’t SHA1-specific: it works just as well on many other popular hashing algorithms too.

2 SHA-1‘s blocks are 64 bytes long; other algorithms vary.

3 For SHA-1, the padding bits consist of a 1 followed by 0s, except the final 8-bytes are a big-endian number representing the length of the message.

4 SHA-1‘s IV is 67452301 EFCDAB89 98BADCFE 10325476 C3D2E1F0, which you’ll observe is little-endian counting from 0 to F, then back from F to 0, then alternating between counting from 3 to 0 and C to F. It’s considered good practice when developing a new cryptographic system to ensure that the hard-coded cryptographic primitives are simple, logical, independently-discoverable numbers like simple sequences and well-known mathematical constants. This helps to prove that the inventor isn’t “hiding” something in there, e.g. a mathematical weakness that depends on a specific primitive for which they alone (they hope!) have pre-calculated an exploit. If that sounds paranoid, it’s worth knowing that there’s plenty of evidence that various spy agencies have deliberately done this, at various points: consider the widespread exposure of the BULLRUN programme and its likely influence on Dual EC DRBG.

5 The padding characters I’ve used aren’t accurate, just representative. But there’s the right number of them!

6 You shouldn’t do this: you’ll cause yourself many headaches in the long run. But you could.

7 It’s also not always obvious which inputs are included in hash generation and how they’re manipulated: if you’re actually using this technique adversarily, be prepared to do a little experimentation.

8 In this example, the hash operates over a single block, but the exact same principle applies regardless of the number of blocks.

9 Imagining the implementation of a nontrivial hashing algorithm, the predictability of whose output makes their HMAC vulnerable to a length extension attack, is left as an exercise for the reader.

× × × × × × × ×

WCEU23 – Day 1

The first “full” day of WordCamp Europe 2023 (which kicked-off at Contributor Day) was busy and intense, but I loved it.

This post is basically a live-blog of everything I got up to, and it’s mostly for my own benefit/notetaking. If you don’t read it, nobody will blame you.

Seen from behind, a very long queue runs through a conference centre.
Six minutes after workshop registration opened its queue snaked throughout an entire floor of the conference centre.

Here’s what I got up to:


10 things that all WordPress plugin developers should avoid

David Artiss took the courageous step of installing 36 popular plugins onto a fresh WordPress site and was, unsurprisingly, immediately bombarded by a billion banners on his dashboard. Some were merely unhelpful (“don’t forget to add your API key”), others were annoying (“thanks for installing our plugin”), and plenty more were commercial advertisements (“get the premium version”) despite the fact that WordPress.org guidelines recommend against this. It’s no surprise that this kind of “aggressive promotion” is the single biggest annoyance that people reported when David asked around on social media.

Similarly, plugins which attempt to break the standard WordPress look-and-feel by e.g. hoisting themselves to the top of the menu, showing admin popovers, putting settings sections in places other than the settings submenu, and so on are a huge annoyance to everybody. I get sufficiently frustrated by these common antifeatures of plugins I use that I actually maintain a plugin for my own use that “fixes” the ones that aggrivate me the most!

A man wearing glasses and a t-shirt with a WordPress logo stands on a stage.
David raised lots of other common gripes with WordPress plugins, too: data validation failures, leaving content behind after uninstallation (and “deactivation surveys”, ugh!), and a failure to account for accessibility.

David’s promised to put his slides online, plus to write articles about everything that came up in his Q&A.

I’m unconvinced that we can rely on plugin developers to independently fix the kinds of problems that come high on David’s list. I wonder if there’s mileage in WordPress Core reimplementing the way that the main navigation menu works such that all items in it can be (easily) re-arranged by users to their own preference? This would undermine the perceived value to plugin developers of “hoisting” their own to the top by allowing users to counteract it, and would provide a valuable feature to allow site admins to streamline their workflow: use WooCommerce but only in a way that’s secondary to your blog? Move “Products” below “Posts”! Etc.

Screenshot showing a WordPress admin interface writing this blog post, with the stage in the background.
Why yes, I’m liveblogging this. And yes, I’m not using Gutenberg yet (that’s a whole other story…)

Where did we come from?

Aaron Reimann from ClockworkWP gave us a tour of how WordPress has changed over the course of its 20-year history, starting even slightly before I started using WordPress; my blog (previously powered by some hacky PHP, previouslier powered by some hackier Perl, previousliest written in static HTML) switched to WordPress in 2004, when it hit version 1.2, so it was fun to get the opportunity to see some even older versions illustrated.

A WordPress site, circa 2004, simulated in a virtual machine.
A WordPress site from 2004 would, of course, still be perfectly usable today. How many JS-heavy/API-driven websites of today do you reckon will still function in 20 years time?

It was great to be reminded how far the Core code has come over that time. Early versions of WordPress – as was common among PHP applications at the time! – had very few files and each could reliably be expected to be a stack of SQL, wrapped in a stack of code, wrapped in what’s otherwise a HTML file: no modularity!

A man wearing a flat cap strides across a stage.
Aaron’s passion for this kind of digital archaeology really shows. I dig it.

There were very few surprises for me in this talk, as you might expect for such an “old hand”, but I really enjoyed the nostalgia of exploring WordPress history through his eyes.

I enjoyed putting him on the spot with a “spicy” question at the end of his talk, by asking him if, alongside everything we’ve gained over the years, whether there’s anything we lost along the way. He answered well, pointing out that the somewhat bloated stack of plugins that are commonplace on big sites nowadays and the ease with which admins can just “click and install” more of them. I agree with him, although personally I miss built-in XFN support…

Dan, smiling, wearing a purple t-shirt with a WordPress logo and a Pride flag, hugs a cut-out of a Wappu (itself hugging a "WP 20" balloon and wearing a party hat).
If you’d have told me in advance that hugging a Wapuu would have been a highlight of the day… yeah, that wouldn’t have been a surprise!

Networking And All That

There’s a lot of exhibitors with stands, but I tried to do a circuit or so and pay attention at least to those whose owners I’ve come into contact with in a professional capacity. Many developers who make extensions for WooCommerce, of course, sell those extensions through WooCommerce.com, which means they come into routine direct contact with my code (and it can mean that when their extension’s been initially rejected by our security scanners or linters, it’s me their developers first want to curse!).

A WordCamp Europe Athens 2023 lanyard and name badge for Dan Q, Attendee, onto which a "Woo" sticker has been affixed.
After a while, to spare some of that awkward exchange where somebody tries to sell me their product before I explain that I already sell their product for them, I slapped a “Woo” sticker on my lanyard.

It’s been great to connect with people using WordPress to power the Web in a whole variety of different contexts, but it somehow still feels strange to me that WordPress has such a commercial following! Even speaking as somebody who’s made their living at least partially out of WordPress for the last decade plus, it still feels to me like its greatest value comes from its use for personal publishing.

The feel of a WordCamp with its big shiny sponsors is enormously different from, say, the intimacy and individuality of a Homebrew Website Club meeting, and I think that’s something I still need to come to terms with. WordPress’s success story comes from many different causes, but perhaps chief among them is the fact that it’s versatile enough to power the website of a government, multinational, or household-name brand… but also to run the smallest personal indie blog. I struggle to comprehend that, even with my background.

(Side note, Sophie Koonin says that building a personal website is a radical act in 2023, and I absolutely agree.)

A "Woo" booth, staffed with a variety of people, with Dan at the centre.
My division of Automattic had a presence, of course.

I was proud of my colleagues for the “gimmick” they were using to attract people to the Woo stand: you could pick up a “credit card” and use it to make a purchase (of Greek olive oil) using a website, see your order appear on the app at the backend in real-time, and then receive your purchase as a giveaway. The “credit card” doubles as a business card from the stand, the olive oil is a real product from a real, local producer (who really uses WooCommerce to sell online!), and when you provide an email address at the checkout you can opt-in to being contacted by the team afterwards. That’s some good joined-up thinking by my buddies in marketing!


WordPress extended: build unique websites on top of WP

Petya Petkova observed that it’s commonplace to take the easy approach and make a website look like… well, every other website.  “Web deja-vu” is a real thing, and it’s fed not only by the ebbs and flows of trends in web design but by the proliferation of indistinct themes that people just install-and-use.

A woman with long hair, wearing a green t-shirt, stands before a screen on a stage.
How can we break free from web deja-vu, asks Petya. It almost makes me sad that her slides had been coalesced into the conference’s slidedeck design rather than being her own… although on second though maybe that just helps enhance the point!

Choice of colours and typography can be used to tell a story, to instil a feeling, to encourage engagement. Scrolling can be used as a metaphor for storytelling (“scrolly-telling”, Petya calls it). Animation flow can be used to direct a user’s attention and drive focus and encourage interaction.

A lot of the technical concepts she demonstrated – parts of a page that scroll at different speeds, typography that shifts or changes, videos used in a subtle way to accentuate other content, etc. – can be implemented in the frontend with WebGL, Three.js and the like. Petya observes that moving this kind of content interactivity into the frontend can produce an illusion of a performance improvement, which is an argument I’ve heard before, but personally I think it’s only valuable if it’s built as a progressive enhancement: otherwise, you’re always at risk that your site won’t look like you’d hope.

I note, for example, that Petya’s agency’s site shows only an “endless spinner” when viewed in my browser (which blocks the code.jQuery CDN by default, unless allowlisted for specific sites). All of the content is there, on the page, if you View Source, but it’s completely invisible if an external JavaScript fails to load. That doesn’t just happen when weirdos like me disable JavaScript in their browsers: it can happen if the browser interacts badly with the script, or if the user’s Internet connection is ropey, or a malware scanner misfires, or if government censorship blocks the CDN, or in any number of other conditions.

Screenshot from acceler8design.com, showing an "endless spinner" and no content.
While I agree with Petya about the value of animation and interactivity to make sites awesome, I don’t think it can take second-place to ensuring the most-widespread access and accessibility for your audience. Otherwise we’d still be making Flash sites, right?

So yeah: uniqueness and creativity are great, and I like what she’s proposing, but not the way she goes about it. The first person to ask a question wisely brought up accessibility, and Petya answered well that accessibility technologies can bridge the gap, but I’d counter that it’s preferable to build accessible in the first instance: if you have to use an aria- attribute it’s a good sign that you probably already did something wrong (not always, but it’s certainly a pointer that you ought to take a step back and check!).

Several other good questions and great answers followed: about how to showcase a preliminary design when they design is dependent upon animation and interactivity (which I’ve witnessed before!), on the value of server-side rendering of components, and about how to optimise for smaller screens. Petya clearly knows her stuff in all of these areas and had confident responses.


State of WordPress security – insights from 2022

Oliver Sild is the kind of self-taught hacker, security nerd, and community builder that I love, so I wasn’t going to miss his talk.

A man in a literal black hat stands in the centre of a large theatre stage.
The number of security vulnerability reports in the WordPress ecosystem is up +328%, Oliver opened. But the bugs being reported are increasingly old, so we’re not talking about new issues being created. And only 0.3% of bugs were in WordPress Core (and were patched before they were exploitable).

It’s good news in general in WordPress Security-land… but CSRF is on the up-and-up (overtaking XSS) in the plugin space. That, and all the broken access control we see in the admin area, are things I’ll be keeping in mind next time I’m arguing with a vendor about the importance of using nonces and security checks in their extension (I have this battle from time to time!).

But an interesting development is the growth of the supply chains in the WordPress plugin ecosystem. Nowadays a plugin might depend upon another plugin which might depend upon a library… and a patch applied to the latter of those might take time to be propagated through the chain, providing attackers with a growing window of opportunity.

Sankey chart showing 1160 submitted bugs being separated into pending, accepted, invalid, and (eventually) patched. 26% of critical bugs in 2022 received no timely patch.
I love a good Sankey chart. Even when it says scary things.

A worrying thought is that while plugin directory administrators will pull and remove plugins that have longstanding unactioned security issues. But that doesn’t help the sites that already have that plugin installed and are still using it! There’s a proposal to allow WordPress to notify admins if a plugin used on a site has been dropped for security reasons, but it was opened 9 years ago and hasn’t seen any real movement, soo…

I like that Oliver plugged for security researchers being acknowledged as equal contributors to developers on your software. But then, I would say that, as somebody who breaks into things once in a while and then tells the affected parties how to fix the problem that allowed me to do so! He also provided a whole wealth of tips for site owners and agencies to try to keep their sites safe, but little that I wasn’t aware of already.

A large audience of a few hundred people, seen from above, facing left.
Still, good to see this talk get as good an audience as it did, given the importance of the topic!

It was about this point in the day, glancing at my schedule and realising that at any given time there were up to four other sessions running simultaneously, that I really got a feel for the scale of this conference. Awesome. Meanwhile, Oliver was fielding the question that I’m sure everybody was thinking: with Gutenberg blocks powered by JavaScript that are often backed by a supply-chain of the usual billion-or-so files you find in your .node_modules directory, isn’t the risk of supply chain attacks increasing?

Spoiler: yes. Did you notice earlier in this post I mentioned that I don’t use Gutenberg on this site yet?

Animation showing Dan, wearing a pilot's hat, surrounded by cotton wool clouds, as the camera pans back and forth.
When the Jetpack team told me that they’ve been improving their cloud offering, this wasn’t what I expected.

Typographic readability in theme design & development

My first “workshop” was run by Giulia Laco, on the topic of readable content and design.

A title slide encourages designers to sit on the left (to the right of the speaker), developers to the right (on her left), and "no-coders" in the centre.
Designers to the left of me, coders to the right: here I am, stuck in the middle with you.

Giulia began by reminding us how short the attention span of Web readers is, and how important the right typographic choices are in ensuring that people actually read your content. I fully get this – I think that very few people will have the attention span to read this part of this very blog post, for example! – but I loved that she hammered the point home by presenting every slide of her presentation twice (or more), “improving” the typographic choices as she went along: an excellent and memorable quirk.

Our capacity to read and comprehend a text is affected by a combination of common (distance, lighting, environment, concentration, mood, etc.), personal (age, proficiency, motiviation, accessibility requirements, etc.), and typographic (face, style, size, line length and spacing, contrast, width, rhythm etc.) factors. To explore the impact of the typographic factors, the group dived into a pre-prepared Codepen and a shared Figma diagram. (I immediately had a TIL moment over the font-synthesis: CSS property!)

A presentation of the typography playground, in which the font is being changed.
I appreciated that Giulia stressed the importance of a fallback font. Just like the CDN issues I described above while talking about JavaScript dependencies, not specifying a fallback font puts your design at the mercy of the browser’s defaults. We don’t like to think about what happens when websites partially fail, but they do, and we should.

Things get interesting at the intersection of readability and accessibility. For example, WCAG accessibility requirements demand that you don’t use images of text (we used to do this a lot back before we could reliably use fonts on the web, and before we could easily have background images on e.g. buttons for navigation). But this accessibility requirement also aids screen readability when accounting for e.g. “retina” screens with virtual pixel ratios.

Slide showing a physical pixel and a "virtual pixel" representing a real pixel of a different size.
Do you remember when a pixel was the size of a pixel? Those days are long gone. True story.

Giulia provided a great explanation of why we may well think in pixels (as developers or digital designers) but we’re unlikely to use them everywhere: I’d internalised this lesson long ago but I appreciated a well-explained justification. The short of it is: screen zoom (that fancy zoom feature you use in your browser all the time, especially on mobile) and text zoom (the one you probably don’t use, or don’t use so much) are different things, and setting a pixel-based font size in the root node wrecks the latter, forcing some people with accessibility needs to use the former, which is likely to result in vertical scrolling. Boo!

I also enjoyed seeing this demo of how the different hyphenation-points in different languages (because of syllable stress) can impact on your wrapping points/line lengths when content is translated. This can affect any website, of course, because any website can be the target of automatic translation.

Plus, Giulia’s thoughts on the value of serifed fonts (even on digital displays) for improving typographic readability of the letters d, b, p and q which are often mirror- or rotationally-symmetric to one another in sans-serif fonts. It’s amazing to have something – in this case, a psychological letter transposition – pointed out that I’ve experienced but never pinned down the reason for, before. Neat!

It was a shame that this workshop took place late in the day, because many of the participants (including me) seemed to have flagging energy levels!


Altogether a great (but intense) day. Boggles my mind that there’s another one like it tomorrow.

× × × × × × × × × × × × × × × × ×