WCEU23 – Day 1

The first “full” day of WordCamp Europe 2023 (which kicked-off at Contributor Day) was busy and intense, but I loved it.

This post is basically a live-blog of everything I got up to, and it’s mostly for my own benefit/notetaking. If you don’t read it, nobody will blame you.

Seen from behind, a very long queue runs through a conference centre.
Six minutes after workshop registration opened its queue snaked throughout an entire floor of the conference centre.

Here’s what I got up to:


10 things that all WordPress plugin developers should avoid

David Artiss took the courageous step of installing 36 popular plugins onto a fresh WordPress site and was, unsurprisingly, immediately bombarded by a billion banners on his dashboard. Some were merely unhelpful (“don’t forget to add your API key”), others were annoying (“thanks for installing our plugin”), and plenty more were commercial advertisements (“get the premium version”) despite the fact that WordPress.org guidelines recommend against this. It’s no surprise that this kind of “aggressive promotion” is the single biggest annoyance that people reported when David asked around on social media.

Similarly, plugins which attempt to break the standard WordPress look-and-feel by e.g. hoisting themselves to the top of the menu, showing admin popovers, putting settings sections in places other than the settings submenu, and so on are a huge annoyance to everybody. I get sufficiently frustrated by these common antifeatures of plugins I use that I actually maintain a plugin for my own use that “fixes” the ones that aggrivate me the most!

A man wearing glasses and a t-shirt with a WordPress logo stands on a stage.
David raised lots of other common gripes with WordPress plugins, too: data validation failures, leaving content behind after uninstallation (and “deactivation surveys”, ugh!), and a failure to account for accessibility.

David’s promised to put his slides online, plus to write articles about everything that came up in his Q&A.

I’m unconvinced that we can rely on plugin developers to independently fix the kinds of problems that come high on David’s list. I wonder if there’s mileage in WordPress Core reimplementing the way that the main navigation menu works such that all items in it can be (easily) re-arranged by users to their own preference? This would undermine the perceived value to plugin developers of “hoisting” their own to the top by allowing users to counteract it, and would provide a valuable feature to allow site admins to streamline their workflow: use WooCommerce but only in a way that’s secondary to your blog? Move “Products” below “Posts”! Etc.

Screenshot showing a WordPress admin interface writing this blog post, with the stage in the background.
Why yes, I’m liveblogging this. And yes, I’m not using Gutenberg yet (that’s a whole other story…)

Where did we come from?

Aaron Reimann from ClockworkWP gave us a tour of how WordPress has changed over the course of its 20-year history, starting even slightly before I started using WordPress; my blog (previously powered by some hacky PHP, previouslier powered by some hackier Perl, previousliest written in static HTML) switched to WordPress in 2004, when it hit version 1.2, so it was fun to get the opportunity to see some even older versions illustrated.

A WordPress site, circa 2004, simulated in a virtual machine.
A WordPress site from 2004 would, of course, still be perfectly usable today. How many JS-heavy/API-driven websites of today do you reckon will still function in 20 years time?

It was great to be reminded how far the Core code has come over that time. Early versions of WordPress – as was common among PHP applications at the time! – had very few files and each could reliably be expected to be a stack of SQL, wrapped in a stack of code, wrapped in what’s otherwise a HTML file: no modularity!

A man wearing a flat cap strides across a stage.
Aaron’s passion for this kind of digital archaeology really shows. I dig it.

There were very few surprises for me in this talk, as you might expect for such an “old hand”, but I really enjoyed the nostalgia of exploring WordPress history through his eyes.

I enjoyed putting him on the spot with a “spicy” question at the end of his talk, by asking him if, alongside everything we’ve gained over the years, whether there’s anything we lost along the way. He answered well, pointing out that the somewhat bloated stack of plugins that are commonplace on big sites nowadays and the ease with which admins can just “click and install” more of them. I agree with him, although personally I miss built-in XFN support…

Dan, smiling, wearing a purple t-shirt with a WordPress logo and a Pride flag, hugs a cut-out of a Wappu (itself hugging a "WP 20" balloon and wearing a party hat).
If you’d have told me in advance that hugging a Wapuu would have been a highlight of the day… yeah, that wouldn’t have been a surprise!

Networking And All That

There’s a lot of exhibitors with stands, but I tried to do a circuit or so and pay attention at least to those whose owners I’ve come into contact with in a professional capacity. Many developers who make extensions for WooCommerce, of course, sell those extensions through WooCommerce.com, which means they come into routine direct contact with my code (and it can mean that when their extension’s been initially rejected by our security scanners or linters, it’s me their developers first want to curse!).

A WordCamp Europe Athens 2023 lanyard and name badge for Dan Q, Attendee, onto which a "Woo" sticker has been affixed.
After a while, to spare some of that awkward exchange where somebody tries to sell me their product before I explain that I already sell their product for them, I slapped a “Woo” sticker on my lanyard.

It’s been great to connect with people using WordPress to power the Web in a whole variety of different contexts, but it somehow still feels strange to me that WordPress has such a commercial following! Even speaking as somebody who’s made their living at least partially out of WordPress for the last decade plus, it still feels to me like its greatest value comes from its use for personal publishing.

The feel of a WordCamp with its big shiny sponsors is enormously different from, say, the intimacy and individuality of a Homebrew Website Club meeting, and I think that’s something I still need to come to terms with. WordPress’s success story comes from many different causes, but perhaps chief among them is the fact that it’s versatile enough to power the website of a government, multinational, or household-name brand… but also to run the smallest personal indie blog. I struggle to comprehend that, even with my background.

(Side note, Sophie Koonin says that building a personal website is a radical act in 2023, and I absolutely agree.)

A "Woo" booth, staffed with a variety of people, with Dan at the centre.
My division of Automattic had a presence, of course.

I was proud of my colleagues for the “gimmick” they were using to attract people to the Woo stand: you could pick up a “credit card” and use it to make a purchase (of Greek olive oil) using a website, see your order appear on the app at the backend in real-time, and then receive your purchase as a giveaway. The “credit card” doubles as a business card from the stand, the olive oil is a real product from a real, local producer (who really uses WooCommerce to sell online!), and when you provide an email address at the checkout you can opt-in to being contacted by the team afterwards. That’s some good joined-up thinking by my buddies in marketing!


WordPress extended: build unique websites on top of WP

Petya Petkova observed that it’s commonplace to take the easy approach and make a website look like… well, every other website.  “Web deja-vu” is a real thing, and it’s fed not only by the ebbs and flows of trends in web design but by the proliferation of indistinct themes that people just install-and-use.

A woman with long hair, wearing a green t-shirt, stands before a screen on a stage.
How can we break free from web deja-vu, asks Petya. It almost makes me sad that her slides had been coalesced into the conference’s slidedeck design rather than being her own… although on second though maybe that just helps enhance the point!

Choice of colours and typography can be used to tell a story, to instil a feeling, to encourage engagement. Scrolling can be used as a metaphor for storytelling (“scrolly-telling”, Petya calls it). Animation flow can be used to direct a user’s attention and drive focus and encourage interaction.

A lot of the technical concepts she demonstrated – parts of a page that scroll at different speeds, typography that shifts or changes, videos used in a subtle way to accentuate other content, etc. – can be implemented in the frontend with WebGL, Three.js and the like. Petya observes that moving this kind of content interactivity into the frontend can produce an illusion of a performance improvement, which is an argument I’ve heard before, but personally I think it’s only valuable if it’s built as a progressive enhancement: otherwise, you’re always at risk that your site won’t look like you’d hope.

I note, for example, that Petya’s agency’s site shows only an “endless spinner” when viewed in my browser (which blocks the code.jQuery CDN by default, unless allowlisted for specific sites). All of the content is there, on the page, if you View Source, but it’s completely invisible if an external JavaScript fails to load. That doesn’t just happen when weirdos like me disable JavaScript in their browsers: it can happen if the browser interacts badly with the script, or if the user’s Internet connection is ropey, or a malware scanner misfires, or if government censorship blocks the CDN, or in any number of other conditions.

Screenshot from acceler8design.com, showing an "endless spinner" and no content.
While I agree with Petya about the value of animation and interactivity to make sites awesome, I don’t think it can take second-place to ensuring the most-widespread access and accessibility for your audience. Otherwise we’d still be making Flash sites, right?

So yeah: uniqueness and creativity are great, and I like what she’s proposing, but not the way she goes about it. The first person to ask a question wisely brought up accessibility, and Petya answered well that accessibility technologies can bridge the gap, but I’d counter that it’s preferable to build accessible in the first instance: if you have to use an aria- attribute it’s a good sign that you probably already did something wrong (not always, but it’s certainly a pointer that you ought to take a step back and check!).

Several other good questions and great answers followed: about how to showcase a preliminary design when they design is dependent upon animation and interactivity (which I’ve witnessed before!), on the value of server-side rendering of components, and about how to optimise for smaller screens. Petya clearly knows her stuff in all of these areas and had confident responses.


State of WordPress security – insights from 2022

Oliver Sild is the kind of self-taught hacker, security nerd, and community builder that I love, so I wasn’t going to miss his talk.

A man in a literal black hat stands in the centre of a large theatre stage.
The number of security vulnerability reports in the WordPress ecosystem is up +328%, Oliver opened. But the bugs being reported are increasingly old, so we’re not talking about new issues being created. And only 0.3% of bugs were in WordPress Core (and were patched before they were exploitable).

It’s good news in general in WordPress Security-land… but CSRF is on the up-and-up (overtaking XSS) in the plugin space. That, and all the broken access control we see in the admin area, are things I’ll be keeping in mind next time I’m arguing with a vendor about the importance of using nonces and security checks in their extension (I have this battle from time to time!).

But an interesting development is the growth of the supply chains in the WordPress plugin ecosystem. Nowadays a plugin might depend upon another plugin which might depend upon a library… and a patch applied to the latter of those might take time to be propagated through the chain, providing attackers with a growing window of opportunity.

Sankey chart showing 1160 submitted bugs being separated into pending, accepted, invalid, and (eventually) patched. 26% of critical bugs in 2022 received no timely patch.
I love a good Sankey chart. Even when it says scary things.

A worrying thought is that while plugin directory administrators will pull and remove plugins that have longstanding unactioned security issues. But that doesn’t help the sites that already have that plugin installed and are still using it! There’s a proposal to allow WordPress to notify admins if a plugin used on a site has been dropped for security reasons, but it was opened 9 years ago and hasn’t seen any real movement, soo…

I like that Oliver plugged for security researchers being acknowledged as equal contributors to developers on your software. But then, I would say that, as somebody who breaks into things once in a while and then tells the affected parties how to fix the problem that allowed me to do so! He also provided a whole wealth of tips for site owners and agencies to try to keep their sites safe, but little that I wasn’t aware of already.

A large audience of a few hundred people, seen from above, facing left.
Still, good to see this talk get as good an audience as it did, given the importance of the topic!

It was about this point in the day, glancing at my schedule and realising that at any given time there were up to four other sessions running simultaneously, that I really got a feel for the scale of this conference. Awesome. Meanwhile, Oliver was fielding the question that I’m sure everybody was thinking: with Gutenberg blocks powered by JavaScript that are often backed by a supply-chain of the usual billion-or-so files you find in your .node_modules directory, isn’t the risk of supply chain attacks increasing?

Spoiler: yes. Did you notice earlier in this post I mentioned that I don’t use Gutenberg on this site yet?

Animation showing Dan, wearing a pilot's hat, surrounded by cotton wool clouds, as the camera pans back and forth.
When the Jetpack team told me that they’ve been improving their cloud offering, this wasn’t what I expected.

Typographic readability in theme design & development

My first “workshop” was run by Giulia Laco, on the topic of readable content and design.

A title slide encourages designers to sit on the left (to the right of the speaker), developers to the right (on her left), and "no-coders" in the centre.
Designers to the left of me, coders to the right: here I am, stuck in the middle with you.

Giulia began by reminding us how short the attention span of Web readers is, and how important the right typographic choices are in ensuring that people actually read your content. I fully get this – I think that very few people will have the attention span to read this part of this very blog post, for example! – but I loved that she hammered the point home by presenting every slide of her presentation twice (or more), “improving” the typographic choices as she went along: an excellent and memorable quirk.

Our capacity to read and comprehend a text is affected by a combination of common (distance, lighting, environment, concentration, mood, etc.), personal (age, proficiency, motiviation, accessibility requirements, etc.), and typographic (face, style, size, line length and spacing, contrast, width, rhythm etc.) factors. To explore the impact of the typographic factors, the group dived into a pre-prepared Codepen and a shared Figma diagram. (I immediately had a TIL moment over the font-synthesis: CSS property!)

A presentation of the typography playground, in which the font is being changed.
I appreciated that Giulia stressed the importance of a fallback font. Just like the CDN issues I described above while talking about JavaScript dependencies, not specifying a fallback font puts your design at the mercy of the browser’s defaults. We don’t like to think about what happens when websites partially fail, but they do, and we should.

Things get interesting at the intersection of readability and accessibility. For example, WCAG accessibility requirements demand that you don’t use images of text (we used to do this a lot back before we could reliably use fonts on the web, and before we could easily have background images on e.g. buttons for navigation). But this accessibility requirement also aids screen readability when accounting for e.g. “retina” screens with virtual pixel ratios.

Slide showing a physical pixel and a "virtual pixel" representing a real pixel of a different size.
Do you remember when a pixel was the size of a pixel? Those days are long gone. True story.

Giulia provided a great explanation of why we may well think in pixels (as developers or digital designers) but we’re unlikely to use them everywhere: I’d internalised this lesson long ago but I appreciated a well-explained justification. The short of it is: screen zoom (that fancy zoom feature you use in your browser all the time, especially on mobile) and text zoom (the one you probably don’t use, or don’t use so much) are different things, and setting a pixel-based font size in the root node wrecks the latter, forcing some people with accessibility needs to use the former, which is likely to result in vertical scrolling. Boo!

I also enjoyed seeing this demo of how the different hyphenation-points in different languages (because of syllable stress) can impact on your wrapping points/line lengths when content is translated. This can affect any website, of course, because any website can be the target of automatic translation.

Plus, Giulia’s thoughts on the value of serifed fonts (even on digital displays) for improving typographic readability of the letters d, b, p and q which are often mirror- or rotationally-symmetric to one another in sans-serif fonts. It’s amazing to have something – in this case, a psychological letter transposition – pointed out that I’ve experienced but never pinned down the reason for, before. Neat!

It was a shame that this workshop took place late in the day, because many of the participants (including me) seemed to have flagging energy levels!


Altogether a great (but intense) day. Boggles my mind that there’s another one like it tomorrow.

× × × × × × × × × × × × × × × × ×

Breakups as HTTP Response Codes

103: Early Hints ("I'm not sure this can last forever.")
103: Early Hints (“I’m not sure this can last forever.”)
300: Multiple Choices ("There are so many ways I can do better than you.")
300: Multiple Choices (“There are so many ways I can do better than you.”)
303: See Other ("You should date other people.")
303: See Other (“You should date other people.”)
304: Not Modified ("With you, I feel like I'm stagnating.")
304: Not Modified (“With you, I feel like I’m stagnating.”)
402: Payment Required ("I am a prostitute.")
402: Payment Required (“I am a prostitute.”)
403: Forbidden ("You don't get this any more.")
403: Forbidden (“You don’t get this any more.”)
406: Not Acceptable ("I could never introduce you to my parents.")
406: Not Acceptable (“I could never introduce you to my parents.”)
408: Request Timeout ("You keep saying you'll propose but you never do.")
408: Request Timeout (“You keep saying you’ll propose but you never do.”)
409: Conflict ("We hate each other.")
409: Conflict (“We hate each other.”)
410: Gone (ghosted)
410: Gone (ghosted)
411: Length Required ("Your penis is too small.")
411: Length Required (“Your penis is too small.”)
413: Payload Too Large ("Your penis is too big.")
413: Payload Too Large (“Your penis is too big.”)
416: Range Not Satisfied ("Our sex life is boring and repretitive.")
416: Range Not Satisfied (“Our sex life is boring and repretitive.”)
425: Too Early ("Your premature ejaculation is a problem.")
425: Too Early (“Your premature ejaculation is a problem.”)
428: Precondition Failed ("You're still sleeping with your ex-!?")
428: Precondition Failed (“You’re still sleeping with your ex-!?”)
429: Too Many Requests ("You're so demanding!")
429: Too Many Requests (“You’re so demanding!”)
451: Unavailable for Legal Reasons ("I'm married to somebody else.")
451: Unavailable for Legal Reasons (“I’m married to somebody else.”)
502: Bad Gateway ("Your pussy is awful.")
502: Bad Gateway (“Your pussy is awful.”)
508: Loop Detected ("We just keep fighting.")
508: Loop Detected (“We just keep fighting.”)

With thanks to Ruth for the conversation that inspired these pictures, and apologies to the rest of the Internet for creating them.

× × × × × × × × × × × × × × × × × × ×

Oxford Geek Nights #52

On Wednesday this week, three years and two months after Oxford Geek Nights #51, Oxford Geek Night #52. Originally scheduled for 15 April 2020 and then… postponed slightly because of the pandemic, its reapparance was an epic moment that I’m glad to have been a part of.

Matt Westcott stands to the side of a stage, drinking beer, while centrestage a cross-shaped "pharmacy sign" projects an animation of an ambulance rocketing into a starfield.
A particular highlight of the night was witnessing “Gasman” Matt Westcott show off his epic demoscene contribution Pharmageddon, which is presented via a “pharmacy sign”. Here’s a video, if you’re interested.

Ben Foxall also put in a sterling performance; hearing him talk – as usual – made me say “wow, I didn’t know you could do that with a web browser”. And there was more to learn, too: Jake Howard showed us how robots see, Steve Buckley inspired us to think about how technology can make our homes more energy-smart (this is really cool and sent me down a rabbithole of reading!), and Joe Wass showed adorable pictures of his kid exploring the user interface of his lockdown electronics project.

Digital scoreboard showing Dan Q in the lead with 5,561, Nick in second place with 5,442, and RaidIndigo in third with 5,398.
Oh, and there was a quiz competition too, and guess who came out on top after an incredibly tight race.

But mostly I just loved the chance to hang out with geeks again; chat to folks, make connections, and enjoy that special Oxford Geek Nights atmosphere. Also great to meet somebody from Perspectum, who look like they’d be great to work for and – after hearing about – I had in mind somebody to suggest for a job with them… but it looks like the company isn’t looking for anybody with their particular skills on this side of the pond. Still, one to watch.

Dan, outdoors on a grassy path, wearing a grey hoodie. On his head is a "trucker cap" emblazoned with the word "GEEK" and, in smaller writing "#OGN52".
My prize for winning the competition was an extremely-limited-edition cap which I love so much I’ve barely taken it off since.

Huge thanks are due to Torchbox, Perspectum and everybody in attendance for making this magical night possible!

Oh, and for anybody who’s interested, I’ve proposed to be a speaker at the next Oxford Geek Nights, which sounds like it’ll be towards Spring 2023. My title is “Yesterday’s Internet, Today!” which – spoilers! – might have something to do with the kind of technology I’ve been playing with recently, among other things. Hope to see you there!

× × ×

DNDle (Wordle, but with D&D monster stats)

Don’t have time to read? Just start playing:

Play DNDle

There’s a Wordle clone for everybody

Am I too late to get onto the “making Wordle clones” bandwagon? Probably; there are quite a few now, including:

Screenshot showing a WhatsApp conversation. Somebody shares a Wordle-like "solution" board but it's got six columns, not five. A second person comments "Hang on a minute... that's not Wordle!"
I’m sure that by now all your social feeds are full of people playing Wordle. But the cool nerds are playing something new…

Now, a Wordle clone for D&D players!

But you know what hasn’t been seen before today? A Wordle clone where you have to guess a creature from the Dungeons & Dragons (5e) Monster Manual by putting numeric values into a character sheet (STR, DEX, CON, INT, WIS, CHA):

Screenshot of DNDle, showing two guesses made already.
Just because nobody’s asking for a game doesn’t mean you shouldn’t make it anyway.

What are you waiting for: go give DNDle a try (I pronounce it “dindle”, but you can pronounce it however you like). A new monster appears at 10:00 UTC each day.

And because it’s me, of course it’s open source and works offline.

The boring techy bit

  • Like Wordle, everything happens in your browser: this is a “backendless” web application.
  • I’ve used ReefJS for state management, because I wanted something I could throw together quickly but I didn’t want to drown myself (or my players) in a heavyweight monster library. If you’ve not used Reef before, you should give it a go: it’s basically like React but a tenth of the footprint.
  • A cache-first/background-updating service worker means that it can run completely offline: you can install it to your homescreen in the same way as Wordle, but once you’ve visited it once it can work indefinitely even if you never go online again.
  • I don’t like to use a buildchain that’s any more-complicated than is absolutely necessary, so the only development dependency is rollup. It resolves my import statements and bundles a single JS file for the browser.
×

Making an RSS feed of YOURLS shortlinks

As you might know if you were paying close attention in Summer 2019, I run a “URL shortener” for my personal use. You may be familiar with public URL shorteners like TinyURL and Bit.ly: my personal URL shortener is basically the same thing, except that only I am able to make short-links with it. Compared to public ones, this means I’ve got a larger corpus of especially-short (e.g. 2/3 letter) codes available for my personal use. It also means that I’m not dependent on the goodwill of a free siloed service and I can add exactly the features I want to it.

Diagram showing the relationships of the DanQ.me ecosystem. Highlighted is the injection of links into the "S.2" link shortener and the export of these shortened links by RSS into FreshRSS.
Little wonder then that my link shortener sat so close to me on my ecosystem diagram the other year.

For the last nine years my link shortener has been S.2, a tool I threw together in Ruby. It stores URLs in a sequentially-numbered database table and then uses the Base62-encoding of the primary key as the “code” part of the short URL. Aside from the fact that when I create a short link it shows me a QR code to I can easily “push” a page to my phone, it doesn’t really have any “special” features. It replaced S.1, from which it primarily differed by putting the code at the end of the URL rather than as part of the domain name, e.g. s.danq.me/a0 rather than a0.s.danq.me: I made the switch because S.1 made HTTPS a real pain as well as only supporting Base36 (owing to the case-insensitivity of domain names).

But S.2’s gotten a little long in the tooth and as I’ve gotten busier/lazier, I’ve leant into using or adapting open source tools more-often than writing my own from scratch. So this week I switched my URL shortener from S.2 to YOURLS.

Screenshot of YOURLS interface showing Dan Q's list of shortened links. Six are shown of 1,939 total.
YOURLs isn’t the prettiest tool in the world, but then it doesn’t have to be: only I ever see the interface pictured above!

One of the things that attracted to me to YOURLS was that it had a ready-to-go Docker image. I’m not the biggest fan of Docker in general, but I do love the convenience of being able to deploy applications super-quickly to my household NAS. This makes installing and maintaining my personal URL shortener much easier than it used to be (and it was pretty easy before!).

Another thing I liked about YOURLS is that it, like S.2, uses Base62 encoding. This meant that migrating my links from S.2 into YOURLS could be done with a simple cross-database INSERT... SELECT statement:

INSERT INTO yourls.yourls_url(keyword, url, title, `timestamp`, clicks)
  SELECT shortcode, url, title, created_at, 0 FROM danq_short.links

But do you know what’s a bigger deal for my lifestack than my URL shortener? My RSS reader! I’ve written about it a lot, but I use RSS for just about everything and my feed reader is my first, last, and sometimes only point of contact with the Web! I’m so hooked-in to my RSS ecosystem that I’ll use my own middleware to add feeds to sites that don’t have them, or for which I’m not happy with the feed they provide, e.g. stripping sports out of BBC News, subscribing to webcomics that don’t provide such an option (sometimes accidentally hacking into sites on the way), and generating “complete” archives of series’ of posts so I can use my reader to track my progress.

One of S.1/S.2’s features was that it exposed an RSS feed at a secret URL for my reader to ingest. This was great, because it meant I could “push” something to my RSS reader to read or repost to my blog later. YOURLS doesn’t have such a feature, and I couldn’t find anything in the (extensive) list of plugins that would do it for me. I needed to write my own.

Partial list of Dan's RSS feed subscriptions, including Jeremy Keith, Jim Nielson, Natalie Lawhead, Bruce Schneier, Scott O'Hara, "Yahtzee", BBC News, and several podcasts, as well as (highlighted) "Dan's Short Links", which has 5 unread items.
In some ways, subscribing “to yourself” is a strange thing to do. In other ways… shut up, I’ll do what I like.

I could have written a YOURLS plugin. Or I could have written a stack of code in Ruby, PHP, Javascript or some other language to bridge these systems. But as I switched over my shortlink subdomain s.danq.me to its new home at danq.link, another idea came to me. I have direct database access to YOURLS (and the table schema is super simple) and the command-line MariaDB client can output XML… could I simply write an XML Transformation to convert database output directly into a valid RSS feed? Let’s give it a go!

I wrote a script like this and put it in my crontab:

mysql --xml yourls -e                                                                                                                     \
      "SELECT keyword, url, title, DATE_FORMAT(timestamp, '%a, %d %b %Y %T') AS pubdate FROM yourls_url ORDER BY timestamp DESC LIMIT 30" \
    | xsltproc template.xslt -                                                                                                            \
    | xmllint --format -                                                                                                                  \
    > output.rss.xml

The first part of that command connects to the yourls database, sets the output format to XML, and executes an SQL statement to extract the most-recent 30 shortlinks. The DATE_FORMAT function is used to mould the datetime into something approximating the RFC-822 standard for datetimes as required by RSS. The output produced looks something like this:

<?xml version="1.0"?>
<resultset statement="SELECT keyword, url, title, timestamp FROM yourls_url ORDER BY timestamp DESC LIMIT 30" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <row>
        <field name="keyword">VV</field>
        <field name="url">https://webdevbev.co.uk/blog/06-2021/perfect-is-the-enemy-of-good.html</field>
        <field name="title"> Perfect is the enemy of good || Web Dev Bev</field>
        <field name="timestamp">2021-09-26 17:38:32</field>
  </row>
  <row>
        <field name="keyword">VU</field>
        <field name="url">https://webdevlaw.uk/2021/01/30/why-generation-x-will-save-the-web/</field>
        <field name="title">Why Generation X will save the web  Hi, Im Heather Burns</field>
        <field name="timestamp">2021-09-26 17:38:26</field>
  </row>

  <!-- ... etc. ... -->
  
</resultset>

We don’t see this, though. It’s piped directly into the second part of the command, which  uses xsltproc to apply an XSLT to it. I was concerned that my XSLT experience would be super rusty as I haven’t actually written any since working for my former employer SmartData back in around 2005! Back then, my coworker Alex and I spent many hours doing XML backflips to implement a system that converted complex data outputs into PDF files via an XSL-FO intermediary.

I needn’t have worried, though. Firstly: it turns out I remember a lot more than I thought from that project a decade and a half ago! But secondly, this conversion from MySQL/MariaDB XML output to RSS turned out to be pretty painless. Here’s the template.xslt I ended up making:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="resultset">
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
      <channel>
        <title>Dan's Short Links</title>
        <description>Links shortened by Dan using danq.link</description>
        <link> [ MY RSS FEED URL ] </link>
        <atom:link href=" [ MY RSS FEED URL ] " rel="self" type="application/rss+xml" />
        <lastBuildDate><xsl:value-of select="row/field[@name='pubdate']" /> UTC</lastBuildDate>
        <pubDate><xsl:value-of select="row/field[@name='pubdate']" /> UTC</pubDate>
        <ttl>1800</ttl>
        <xsl:for-each select="row">
          <item>
            <title><xsl:value-of select="field[@name='title']" /></title>
            <link><xsl:value-of select="field[@name='url']" /></link>
            <guid>https://danq.link/<xsl:value-of select="field[@name='keyword']" /></guid>
            <pubDate><xsl:value-of select="field[@name='pubdate']" /> UTC</pubDate>
          </item>
        </xsl:for-each>
      </channel>
    </rss>
  </xsl:template>
</xsl:stylesheet>

That uses the first (i.e. most-recent) shortlink’s timestamp as the feed’s pubDate, which makes sense: unless you’re going back and modifying links there’s no more-recent changes than the creation date of the most-recent shortlink. Then it loops through the returned rows and creates an <item> for each; simple!

The final step in my command runs the output through xmllint to prettify it. That’s not strictly necessary, but it was useful while debugging and as the whole command takes milliseconds to run once every quarter hour or so I’m not concerned about the overhead. Using these native binaries (plus a little configuration), chained together with pipes, had already resulted in way faster performance (with less code) than if I’d implemented something using a scripting language, and the result is a reasonably elegant “scratch your own itch”-type solution to the only outstanding barrier that was keeping me on S.2.

All that remained for me to do was set up a symlink so that the resulting output.rss.xml was accessible, over the web, to my RSS reader. I hope that next time I’m tempted to write a script to solve a problem like this I’ll remember that sometimes a chain of piped *nix utilities can provide me a slicker, cleaner, and faster solution.

Update: Right as I finished writing this blog post I discovered that somebody had already solved this problem using PHP code added to YOURLS; it’s just not packaged as a plugin so I didn’t see it earlier! Whether or not I use this alternate approach or stick to what I’ve got, the process of implementing this YOURLS-database ➡ XML ➡  XSLTRSS chain was fun and informative.

× × ×

The Cursed Computer Iceberg Meme

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

More awesome from Blackle Mori, whose praises I sung recently over The Basilisk Collection. This time we’re treated to a curated list of 182 articles demonstrating the “peculiarities and weirdness” of computers. Starting from relatively well-known memes like little Bobby Tables, the year 2038 problem, and how all web browsers pretend to be each other, we descend through the fast inverse square root (made famous by Quake III), falsehoods programmers believe about time (personally I’m more of a fan of …names, but then you might expect that), the EICAR test file, the “thank you for playing Wing Commander” EMM386 in-memory hack, The Basilisk Collection itself, and the GIF MD5 hashquine (which I’ve shared previously) before eventually reaching the esoteric depths of posuto and the nightmare that is Japanese postcodes

Plus many, many things that were new to me and that I’ve loved learning about these last few days.

It’s definitely not a competition; it’s a learning opportunity wrapped up in the weirdest bits of the field. Have an explore and feed your inner computer science geek.

HTML Movies

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

1. a (1998)

Documentary about the Buddhist sect responsible for the 1995 Tokyo subway Sarin gas attack.

2. body (2003)

Also known as Jism. No, really. A tale of passion and murder featuring an alcoholic lawyer and the wife of a travelling millionaire.

3. canvas (1992)

Gary Busey stars as an artist who takes part in a heist to save his brother from murderous art thieves.

We could have had so many HTML-themed Troma Nights, if we’d wanted…

Cheating Hangman

A long while ago, inspired by Nick Berry‘s analysis of optimal Hangman strategy, I worked it backwards to find the hardest words to guess when playing Hangman. This week, I showed these to my colleague Grace – who turns out to be a fan of word puzzles – and our conversation inspired me to go a little deeper. Is it possible, I thought, for me to make a Hangman game that cheats by changing the word it’s thinking of based on the guesses you make in order to make it as difficult as possible for you to win?

Play “Cheating Hangman”

The principle is this: every time the player picks a letter, but before declaring whether or not it’s found in the word –

  1. Make a list of all possible words that would fit into the boxes from the current game state.
  2. If there are lots of them, still, that’s fine: let the player’s guess go ahead.
  3. But if the player’s managing to narrow down the possibilities, attempt to change the word that they’re trying to guess! The new word must be:
    • Legitimate: it must still be the same length, have correctly-guessed letters in the same places, and contain no letters that have been declared to be incorrect guesses.
    • Harder: after resolving the player’s current guess, the number of possible words must be larger than the number of possible words that would have resulted otherwise.
Gallows on a hill.
Yeah, you’re screwed now.

You might think that this strategy would just involve changing the target word so that you can say “nope” to the player’s current guess. That happens a lot, but it’s not always the case: sometimes, it’ll mean changing to a different word in which the guessed letter also appears. Occasionally, it can even involve changing from a word in which the guessed letter didn’t appear to one in which it does: that is, giving the player a “freebie”. This may seem counterintuitive as a strategy, but it sometimes makes sense: if saying “yeah, there’s an E at the end” increases the number of possible words that it might be compared to saying “no, there are no Es” then this is the right move for a cheating hangman.

Playing against a cheating hangman also lends itself to devising new strategies as a player, too, although I haven’t yet looked deeply into this. But logically, it seems that the optimal strategy against a cheating hangman might involve making guesses that force the hangman to bisect the search space: knowing that they’re always going to adapt towards the largest set of candidate words, a perfect player might be able to make guesses to narrow down the possibilities as fast as possible, early on, only making guesses that they actually expect to be in the word later (before their guess limit runs out!).

Cheating Hangman
The game is brutally-difficult, but surprisingly fun, and you can have it tell you when and how it cheats so you can begin to understand its strategy.

I also find myself wondering how easily I could adapt this into a “helpful hangman”: a game which would always change the word that you’re trying to guess in order to try to make you win. This raises the possibility of a whole new game, “suicide hangman”, in which the player is trying to get themselves killed and so is trying to pick letters that can’t possibly be in the word and the hangman is trying to pick words in which those letters can be found, except where doing so makes it obvious which letters the player must avoid next. Maybe another day.

In the meantime, you’re welcome to go play the game (and let me know what you think, below!) and, if you’re of such an inclination, read the source code. I’ve used some seriously ugly techniques to make this work, including regular expression metaprogramming (using regular expressions to write regular expressions), but the code should broadly make sense if you want to adapt it. Have fun!

Play “Cheating Hangman”

Update 26 September 2019, 16:23: I’ve now added “helpful mode”, where the computer tries to cheat on your behalf rather than against you, but it’s not as helpful as you’d think because it assumes you’re playing optimally and have already memorised the dictionary!

Update 1 October 2019, 06:40: Now featured on MetaFilter; hi, MeFites!

×

css-only-chat/README.md at master · kkuchta/css-only-chat · GitHub

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

CSS-Only Chat

A truly monstrous async web chat using no JS whatsoever on the frontend.

Wait what

This is an asynchronous chat that sends + receives messages in the browser with no reloads and no javascript.

Ok so how

Background-images loaded via pseudoselectors + a forever-loading index page (remember Comet?).

Say that again?

Ok, so there are two things we need the browser to do: send data and receive data. Let’s start with the first.

Sending Data

CSS is really limited in what it can do. However, we can use it to effectively detect button presses:

.some-button:active {
  background-image: url('some_image.jpg')
}

What’s cool is that a browser won’t actually load that background image until this selector is used – that is, when this button is pressed. So now we have a way to trigger a request to a server of our choice on a button press. That sounds like data sending!

Now, of course, this only works once per button (since a browser won’t try to load that image twice), but it’s a start.

Receiving Data

Since we can’t use JS, it’s really hard to change a page after it’s already been loaded. But it is possible.

Back before websockets were widely supported, we had to use clever hacks if we wanted to push data from a server to a client in an ongoing basis. One such hack was just to make the page never finish loading. It turns out that you can tell the browser to start rendering a page before it’s finished loading (using the Transfer-Encoding: chunked http header). And when you do that, you don’t actually have to stop loading the page. You can just keep adding stuff to the bottom of the html forever, at whatever rate you want.

So, for example, you could start sending a normal html page, then just stop sending html (while still telling the client you’re sending) until you’re ready to deliver another message.

Now, all this lets us do is periodically append html to the page. What can we do with that? How about, when you load the index page, this happens:

  1. We load up the first pile of html we want to show. A welcome message, etc.
  2. We stop loading html for a while until we want to send some sort of update.
  3. Now we load up a <style> element that display: none‘s all the previous html
  4. Then we load up whatever new html we want to show
  5. Finally we wait until the next update we want to send and GOTO 3.

Single-use buttons?

Ok, so we have that problem earlier where each button is only single-use. It tries to send a get request once, then never will again.

Thankfully, our method of receiving data fixes that for us. Here’s what happens:

  1. We show an “a” button whose background image is like “img/a”.
  2. When you press it, the server receives the image request for “a”
  3. The server then pushes an update to the client to hide the current button and replace it with one whose background images is “image/aa”.

If the buttons you pressed were “h”, “e”, and “l”, then the “a” button’s background image url would be “img/hela”. And since we’re replacing all buttons every time you press one, the single-use button problem goes away!

Misc other details

  • We actually encode a bit more info into the button urls (like each client’s id)
  • Because the data-sending and data-receiving happens on different threads, we need inter-thread communication. That sounds like work, so we’ll just use Redis pubsub for that.

FAQ

What inspired this? Chernobyl, Hindenburg, The Tacoma Narrows Bridge…

Really? No, it was this clever tweet by davywtf.

Why’s your code suck Why do you suck?

No but really Because I was mostly making this up as I went. There’s a lot of exploratory coding here that I only minimally cleaned up. If I rebuilt it I’d store the UI state for a client in redis and just push it out in its entirety when needed via a single generic screen-updating mechanism.

What could go wrong with this technique? Broken by browser bg-image handling changes; long-request timeouts; running out of threads; fast-clicking bugs; generic concurrency headaches; poor handling by proxies; it’s crazy inaccessible; etc etc

Should I use this in real life? Dear god yes.

I have an idea for how this could be made better/worse/hackier Tweet at me (@kkuchta). I’m always down to see a terrible idea taken further!

Practical Details

If you want to install and use this locally you should:

  1. Re-evaluate your life choices
  2. If you simply must continue, check out INSTALL.md

If you want to contribute, see number 1 above. After that, just fork this repo, make a change, and then open a PR against this repo.

A poem about Silicon Valley, assembled from Quora questions about Silicon Valley

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Enigma, the Bombe, and Typex

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

How to guides

How to encrypt/decrypt with Enigma

We’ll start with a step-by-step guide to decrypting a known message. You can see the result of these steps in CyberChef here. Let’s say that our message is as follows:

XTSYN WAEUG EZALY NRQIM AMLZX MFUOD AWXLY LZCUZ QOQBQ JLCPK NDDRW F

And that we’ve been told that a German service Enigma is in use with the following settings:

Rotors III, II, and IV, reflector B, ring settings (Ringstellung in German) KNG, plugboard (Steckerbrett)AH CO DE GZ IJ KM LQ NY PS TW, and finally the rotors are set to OPM.

Enigma settings are generally given left-to-right. Therefore, you should ensure the 3-rotor Enigma is selected in the first dropdown menu, and then use the dropdown menus to put rotor III in the 1st rotor slot, II in the 2nd, and IV in the 3rd, and pick B in the reflector slot. In the ring setting and initial value boxes for the 1st rotor, put K and O respectively, N and P in the 2nd, and G and M in the 3rd. Copy the plugboard settings AH CO DE GZ IJ KM LQ NY PS TW into the plugboard box. Finally, paste the message into the input window.

The output window will now read as follows:

HELLO CYBER CHEFU SERST HISIS ATEST MESSA GEFOR THEDO CUMEN TATIO N

The Enigma machine doesn’t support any special characters, so there’s no support for spaces, and by default unsupported characters are removed and output is put into the traditional five-character groups. (You can turn this off by disabling “strict input”.) In some messages you may see X used to represent space.

Encrypting with Enigma is exactly the same as decrypting – if you copy the decrypted message back into the input box with the same recipe, you’ll get the original ciphertext back.

Plugboard, rotor and reflector specifications

The plugboard exchanges pairs of letters, and is specified as a space-separated list of those pairs. For example, with the plugboard AB CD, A will be exchanged for B and vice versa, C for D, and so forth. Letters that aren’t specified are not exchanged, but you can also specify, for example, AA to note that A is not exchanged. A letter cannot be exchanged more than once. In standard late-war German military operating practice, ten pairs were used.

You can enter your own components, rather than using the standard ones. A rotor is an arbitrary mapping between letters – the rotor specification used here is the letters the rotor maps A through Z to, so for example with the rotor ESOVPZJAYQUIRHXLNFTGKDCMWB, A maps to E, B to S, and so forth. Each letter must appear exactly once. Additionally, rotors have a defined step point (the point or points in the rotor’s rotation at which the neighbouring rotor is stepped) – these are specified using a < followed by the letters at which the step happens.

Reflectors, like the plugboard, exchange pairs of letters, so they are entered the same way. However, letters cannot map to themselves.

How to encrypt/decrypt with Typex

The Typex machine is very similar to Enigma. There are a few important differences from a user perspective:

  • Five rotors are used.
  • Rotor wirings cores can be inserted into the rotors backwards.
  • The input plugboard (on models which had one) is more complicated, allowing arbitrary letter mappings, which means it functions like, and is entered like, a rotor.
  • There was an additional plugboard which allowed rewiring of the reflector: this is supported by simply editing the specified reflector.

Like Enigma, Typex only supports enciphering/deciphering the letters A-Z. However, the keyboard was marked up with a standardised way of representing numbers and symbols using only the letters. You can enable emulation of these keyboard modes in the operation configuration. Note that this needs to know whether the message is being encrypted or decrypted.

How to attack Enigma using the Bombe

Let’s take the message from the first example, and try and decrypt it without knowing the settings in advance. Here’s the message again:

XTSYN WAEUG EZALY NRQIM AMLZX MFUOD AWXLY LZCUZ QOQBQ JLCPK NDDRW F

Let’s assume to start with that we know the rotors used were III, II, and IV, and reflector B, but that we know no other settings. Put the ciphertext in the input window and the Bombe operation in your recipe, and choose the correct rotors and reflector. We need one additional piece of information to attack the message: a “crib”. This is a section of known plaintext for the message. If we know something about what the message is likely to contain, we can guess possible cribs.

We can also eliminate some cribs by using the property that Enigma cannot encipher a letter as itself. For example, let’s say our first guess for a crib is that the message begins with “Hello world”. If we enter HELLO WORLD into the crib box, it will inform us that the crib is invalid, as the W in HELLO WORLD corresponds to a W in the ciphertext. (Note that spaces in the input and crib are ignored – they’re included here for readability.) You can see this in CyberChef here

Let’s try “Hello CyberChef” as a crib instead. If we enter HELLO CYBER CHEF, the operation will run and we’ll be presented with some information about the run, followed by a list of stops. You can see this here. Here you’ll notice that it says Bombe run on menu with 0 loops (2+ desirable)., and there are a large number of stops listed. The menu is built from the crib you’ve entered, and is a web linking ciphertext and plaintext letters. (If you’re maths inclined, this is a graph where letters – plain or ciphertext – are nodes and states of the Enigma machine are edges.) The machine performs better on menus which have loops in them – a letter maps to another to another and eventually returns to the first – and additionally on longer menus. However, menus that are too long risk failing because the Bombe doesn’t simulate the middle rotor stepping, and the longer the menu the more likely this is to have happened. Getting a good menu is a mixture of art and luck, and you may have to try a number of possible cribs before you get one that will produce useful results.

Bombe menu diagram

In this case, if we extend our crib by a single character to HELLO CYBER CHEFU, we get a loop in the menu (that U maps to a Y in the ciphertext, the Y in the second cipher block maps to A, the A in the third ciphertext block maps to E, and the E in the second crib block maps back to U). We immediately get a manageable number of results. You can see this here. Each result gives a set of rotor initial values and a set of identified plugboard wirings. Extending the crib further to HELLO CYBER CHEFU SER produces a single result, and it has also recovered eight of the ten plugboard wires and identified four of the six letters which are not wired. You can see this here.

We now have two things left to do:

  1. Recover the remaining plugboard settings.
  2. Recover the ring settings.

This will need to be done manually.

Set up an Enigma operation with these settings. Leave the ring positions set to A for the moment, so from top to bottom we have rotor III at initial value E, rotor II at C, and rotor IV at G, reflector B, and plugboard DE AH BB CO FF GZ LQ NY PS RR TW UU.

You can see this here. You will immediately notice that the output is not the same as the decryption preview from the Bombe operation! Only the first three characters – HEL – decrypt correctly. This is because the middle rotor stepping was ignored by the Bombe. You can correct this by adjusting the ring position and initial value on the right-hand rotor in sync. They are currently A and G respectively. Advance both by one to B and H, and you’ll find that now only the first two characters decrypt correctly.

Keep trying settings until most of the message is legible. You won’t be able to get the whole message correct, but for example at F and L, which you can see here, our message now looks like:

HELLO CYBER CHEFU SERTC HVSJS QTEST KESSA GEFOR THEDO VUKEB TKMZM T

At this point we can recover the remaining plugboard settings. The only letters which are not known in the plugboard are J K V X M I, of which two will be unconnected and two pairs connected. By inspecting the ciphertext and partially decrypted plaintext and trying pairs, we find that connecting IJ and KM results, as you can see here, in:

HELLO CYBER CHEFU SERST HISIS ATEST MESSA GEFOR THEDO CUMEO TMKZK T

This is looking pretty good. We can now fine tune our ring settings. Adjusting the right-hand rotor to G and M gives, as you can see here,

HELLO CYBER CHEFU SERST HISIS ATEST MESSA GEFOR THEDO CUMEN WMKZK T

which is the best we can get with only adjustments to the first rotor. You now need to adjust the second rotor. Here, you’ll find that anything from D and F to Z and B gives the correct decryption, for example here. It’s not possible to determine the exact original settings from only this message. In practice, for the real Enigma and real Bombe, this step was achieved via methods that exploited the Enigma network operating procedures, but this is beyond the scope of this document.

What if I don’t know the rotors?

You’ll need the “Multiple Bombe” operation for this. You can define a set of rotors to choose from – the standard WW2 German military Enigma configurations are provided or you can define your own – and it’ll run the Bombe against every possible combination. This will take up to a few hours for an attack against every possible configuration of the four-rotor Naval Enigma! You should run a single Bombe first to make sure your menu is good before attempting a multi-Bombe run.

You can see an example of using the Multiple Bombe operation to attack the above example message without knowing the rotor order in advance here.

What if I get far too many stops?

Use a longer or different crib. Try to find one that produces loops in the menu.

What if I get no stops, or only incorrect stops?

Are you sure your crib is correct? Try alternative cribs.

What if I know my crib is right, but I still don’t get any stops?

The middle rotor has probably stepped during the encipherment of your crib. Try a shorter or different crib.

How things work

How Enigma works

We won’t go into the full history of Enigma and all its variants here, but as a brief overview of how the machine works:

Enigma uses a series of letter->letter conversions to produce ciphertext from plaintext. It is symmetric, such that the same series of operations on the ciphertext recovers the original plaintext.

The bulk of the conversions are implemented in “rotors”, which are just an arbitrary mapping from the letters A-Z to the same letters in a different order. Additionally, to enforce the symmetry, a reflector is used, which is a symmetric paired mapping of letters (that is, if a given reflector maps X to Y, the converse is also true). These are combined such that a letter is mapped through three different rotors, the reflector, and then back through the same three rotors in reverse.

To avoid Enigma being a simple Caesar cipher, the rotors rotate (or “step”) between enciphering letters, changing the effective mappings. The right rotor steps on every letter, and additionally defines a letter (or later, letters) at which the adjacent (middle) rotor will be stepped. Likewise, the middle rotor defines a point at which the left rotor steps. (A mechanical issue known as the double-stepping anomaly means that the middle rotor actually steps twice when the left hand rotor steps.)

The German military Enigma adds a plugboard, which is a configurable pair mapping of letters (similar to the reflector, but not requiring that every letter is exchanged) applied before the first rotor (and thus also after passing through all the rotors and the reflector).

It also adds a ring setting, which allows the stepping point to be adjusted.

Later in the war, the Naval Enigma added a fourth rotor. This rotor does not step during operation. (The fourth rotor is thinner than the others, and fits alongside a thin reflector, meaning this rotor is not interchangeable with the others on a real Enigma.)

There were a number of other variants and additions to Enigma which are not currently supported here, as well as different Enigma networks using the same basic hardware but different rotors (which are supported by supplying your own rotor configurations).

How Typex works

Typex is a clone of Enigma, with a few changes implemented to improve security. It uses five rotors rather than three, and the rightmost two are static. Each rotor has more stepping points. Additionally, the rotor design is slightly different: the wiring for each rotor is in a removable core, which sits in a rotor housing that has the ring setting and stepping notches. This means each rotor has the same stepping points, and the rotor cores can be inserted backwards, effectively doubling the number of rotor choices.

Later models (from the Mark 22, which is the variant we simulate here) added two plugboards: an input plugboard, which allowed arbitrary letter mappings (rather than just pair switches) and thus functioned similarly to a configurable extra static rotor, and a reflector plugboard, which allowed rewiring the reflector.

How the Bombe works

The Bombe is a mechanism for efficiently testing and discarding possible rotor positions, given some ciphertext and known plaintext. It exploits the symmetry of Enigma and the reciprocal (pairwise) nature of the plugboard to do this regardless of the plugboard settings. Effectively, the machine makes a series of guesses about the rotor positions and plugboard settings and for each guess it checks to see if there are any contradictions (e.g. if it finds that, with its guessed settings, the letter A would need to be connected to both B and C on the plugboard, that’s impossible, and these settings cannot be right). This is implemented via careful connection of electrical wires through a group of simulated Enigma machines.

A full explanation of the Bombe’s operation is beyond the scope of this document – you can read the source code, and the authors also recommend Graham Ellsbury’s Bombe explanation, which is very clearly diagrammed.

Implementation in CyberChef

Enigma/Typex

Enigma and Typex were implemented from documentation of their functionality.

Enigma rotor and reflector settings are from GCHQ’s documentation of known Enigma wirings. We currently simulate all basic versions of the German Service Enigma; most other versions should be possible by manually entering the rotor wirings. There are a few models of Enigma, or attachments for the Service Enigma, which we don’t currently simulate. The operation was tested against some of GCHQ’s working examples of Enigma machines. Output should be letter-for-letter identical to a real German Service Enigma. Note that some Enigma models used numbered rather than lettered rotors – we’ve chosen to stick with the easier-to-use lettered rotors.

There were a number of different Typex versions over the years. We implement the Mark 22, which is backwards compatible with some (but not completely with all, as some early variants supported case sensitivity) older Typex models. GCHQ also has a partially working Mark 22 Typex. This was used to test the plugboards and mechanics of the machine. Typex rotor settings were changed regularly, and none have ever been published, so a test against real rotors was not possible. An example set of rotors have been randomly generated for use in the Typex operation. Some additional information on the internal functionality was provided by the Bombe Rebuild Project.

The Bombe

The Bombe was likewise implemented on the basis of documentation of the attack and the machine. The Bombe Rebuild Project at the National Museum of Computing answered a number of technical questions about the machine and its operating procedures, and helped test our results against their working hardware Bombe, for which the authors would like to extend our thanks.

Constructing menus from cribs in a manner that most efficiently used the Bombe hardware was another difficult step of operating the real Bombes. We have chosen to generate the menu automatically from the provided crib, ignore some hardware constraints of the real Bombe (e.g. making best use of the number of available Enigmas in the Bombe hardware; we simply simulate as many as are necessary), and accept that occasionally the menu selected automatically may not always be the optimal choice. This should be rare, and we felt that manual menu creation would be hard to build an interface for, and would add extra barriers to users experimenting with the Bombe.

The output of the real Bombe is optimised for manual verification using the checking machine, and additionally has some quirks (the rotor wirings are rotated by, depending on the rotor, between one and three steps compared to the Enigma rotors). Therefore, the output given is the ring position, and a correction depending on the rotor needs to be applied to the initial value, setting it to W for rotor V, X for rotor IV, and Y for all other rotors. We felt that this would require too much explanation in CyberChef, so the output of CyberChef’s Bombe operation is the initial value for each rotor, with the ring positions set to A, required to decrypt the ciphertext starting at the beginning of the crib. The actual stops are the same. This would not have caused problems at Bletchley Park, as operators working with the Bombe would never have dealt with a real or simulated Enigma, and vice versa.

By default the checking machine is run automatically and stops which fail silently discarded. This can be disabled in the operation configuration, which will cause it to output all stops from the actual Bombe hardware instead. (In this case you only get one stecker pair, rather than the set identified by the checking machine.)

Optimisation

A three-rotor Bombe run (which tests 17,576 rotor positions and takes about 15-20 minutes on original Turing Bombe hardware) completes in about a fifth of a second in our tests. A four-rotor Bombe run takes about 5 seconds to try all 456,976 states. This also took about 20 minutes on the four-rotor US Navy Bombe (which rotates about 30 times faster than the Turing Bombe!). CyberChef operations run single-threaded in browser JavaScript.

We have tried to remain fairly faithful to the implementation of the real Bombe, rather than a from-scratch implementation of the underlying attack. There is one small deviation from “correct” behaviour: the real Bombe spins the slow rotor on a real Enigma fastest. We instead spin the fast rotor on an Enigma fastest. This means that all the other rotors in the entire Bombe are in the same state for the 26 steps of the fast rotor and then step forward: this means we can compute the 13 possible routes through the lower two/three rotors and reflector (symmetry means there are only 13 routes) once every 26 ticks and then save them. This does not affect where the machine stops, but it does affect the order in which those stops are generated.

The fast rotors repeat each others’ states: in the 26 steps of the fast rotor between steps of the middle rotor, each of the scramblers in the complete Bombe will occupy each state once. This means we can once again store each state when we hit them and reuse them when the other scramblers rotate through the same states.

Note also that it is not necessary to complete the energisation of all wires: as soon as 26 wires in the test register are lit, the state is invalid and processing can be aborted.

The above simplifications reduce the runtime of the simulation by an order of magnitude.

If you have a large attack to run on a multiprocessor system – for example, the complete M4 Naval Enigma, which features 1344 possible choices of rotor and reflector configuration, each of which takes about 5 seconds – you can open multiple CyberChef tabs and have each run a subset of the work. For example, on a system with four or more processors, open four tabs with identical Multiple Bombe recipes, and set each tab to a different combination of 4th rotor and reflector (as there are two options for each). Leave the full set of eight primary rotors in each tab. This should complete the entire run in about half an hour on a sufficiently powerful system.

To celebrate their centenary, GCHQ have open-sourced very-faithful reimplementations of Enigma, Typex, and Bombe that you can run in your browser. That’s pretty cool, and a really interesting experimental toy for budding cryptographers and cryptanalysts!

Logitech MX Master 2S

I’m a big believer in the idea that the hardware I lay my hands on all day, every day, needs to be the best for its purpose. On my primary desktop, I type on a Das Keyboard 4 Professional (with Cherry MX brown switches) because it looks, feels, and sounds spectacular. I use the Mac edition of the keyboard because it, coupled with a few tweaks, gives me the best combination of features and compatibility across all of the Windows, MacOS, and Linux (and occasionally other operating systems) I control with it. These things matter.

F13, Lower Brightness, and Raise Brightness keys on Dan's keyboard
I don’t know what you think these buttons do, but if you’re using my keyboard, you’re probably wrong. Also, they need a clean. I swear they don’t look this grimy when I’m not doing macro-photography.

I also care about the mouse I use. Mice are, for the most part, for the Web and for gaming and not for use in most other applications (that’s what keyboard shortcuts are for!) but nonetheless I spend plenty of time holding one and so I want the tool that feels right to me. That’s why I was delighted when, in replacing my four year-old Logitech MX1000 in 2010 with my first Logitech Performance MX, I felt able to declare it the best mouse in the world. My Performance MX lived for about four years, too – that seems to be how long a mouse can stand the kind of use that I give it – before it started to fail and I opted to replace it with an identical make and model. I’d found “my” mouse, and I was sticking with it. It’s a great shape (if you’ve got larger hands), is full of features including highly-configurable buttons, vertical and horizontal scrolling (or whatever you want to map them to), and a cool “flywheel” mouse wheel that can be locked to regular operation or unlocked for controlled high-speed scrolling at the touch of a button: with practice, you can even use it as a speed control by gently depressing the switch like it was a brake pedal. Couple all of that with incredible accuracy on virtually any surface, long battery life, and charging “while you use” and you’ve a recipe for success, in my mind.

My second Performance MX stopped properly charging its battery this week, and it turns out that they don’t make them any more, so I bought its successor, the Logitech MX Master 2S.

(New) Logitech MX Master 2S and (old) Logitech Performance MX
On the left, the (new) Logitech MX Master 2S. On the right, my (old) Logitech Performance MX.

The MX Master 2S is… different… from its predecessor. Mostly in good ways, sometimes less-good. Here’s the important differences:

  • Matte coating: only the buttons are made of smooth plastic; the body of the mouse is now a slightly coarser plastic: you’ll see in the photo above how much less light it reflects. It feels like it would dissipate heat less-well.
  • Horizontal wheel replaces rocker wheel: instead of the Performance MX’s “rocker” scroll wheel that can be pushed sideways for horizontal scroll, the MX Master 2S adds a dedicated horizontal scroll (or whatever you reconfigure it to) wheel in the thumb well. This is a welcome change: the rocker wheel in both my Performance MXes became less-effective over time and in older mice could even “jam on”, blocking the middle-click function. This seems like a far more-logical design.
  • New back/forward button shape: to accommodate the horizontal wheel, the “back” and “forward” buttons in the thumb well have been made smaller and pushed closer together. This is the single biggest failing of the MX Master 2S: it’s clearly a mouse designed for larger hands, and yet these new buttons are slightly, but noticeably, harder to accurately trigger with a large thumb! It’s tolerable, but slightly annoying.
  • Bluetooth support: one of my biggest gripes about the Performance MX was its dependence on Unifying, Logitech’s proprietary wireless protocol. The MX Master 2S supports Unifying but also supports Bluetooth, giving you the best of both worlds.
  • Digital flywheel: the most-noticable change when using the mouse is the new flywheel and braking mechanism, which is comparable to the change in contemporary cars from a mechanical to a digital handbrake. The flywheel “lock” switch is now digital, turning on or off the brake in a single stroke and so depriving you of the satisfaction of using it to gradually “slow down” a long spin-scroll through an enormous log or source code file. But in exchange comes an awesome feature called SmartShift, which dynamically turns on or off the brake (y’know, like an automatic handbrake!) depending on the speed with which you throw the wheel. That’s clever and intuitive and “just works” far better than I’d have imagined: I can choose to scroll slowly or quickly, with or without the traditional ratchet “clicks” of a wheel mouse, with nothing more than the way I flick my finger (and all fully-configurable, of course). And I’ve still got the button to manually “toggle” the brake if I need it. It took some getting used to, but this change is actually really cool! (I’m yet to get used to the sound of the digital brake kicking in automatically, but that’s true of my car too).
  • Basic KVM/multi-computing capability: with a button on the underside to toggle between different paired Unifying/Bluetooth transceivers and software support for seamless edge-of-desktop multi-computer operation, Logitech are clearly trying to target folks who, like me, routinely run multiple computers simultaneously from a single keyboard and mouse. But it’s a pointless addition in my case because I’ve been quite happy using Synergy to do this for the last 7+ years, which does it better. Still, it’s a harmless “bonus” feature and it might be of value to others, I suppose.

All in all, the MX Master 2S isn’t such an innovative leap forward over the Performance MX as the Performance MX was over the MX1000, but it’s still great that this spectacular series of heavyweight workhouse feature-rich mice continues to innovate and, for the most part, improve upon the formula. This mouse isn’t cheap, and it isn’t for everybody, but if you’re a big-handed power user with a need to fine-tune all your hands-on hardware to get it just right, it’s definitely worth a look.

× ×

The “Backendification” of Frontend Development

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

https://hackernoon.com/the-backendification-of-frontend-development-62f218a773d4 (hackernoon.com)

Asynchronous JavaScript in the form of Single Page Applications (SPA) offer an incredible opportunity for improving the user experience of your web applications. CSS frameworks like Bootstrap enable developers to quickly contribute styling as they’re working on the structure and behaviour of things.

Unfortunately, SPA and CSS frameworks tend to result in relatively complex solutions where traditionally separated concerns – HTML-structure, CSS-style, and JS-behaviour – are blended together as a matter of course — Counter to the lessons learned by previous generations.

This blending of concerns can prevent entry level developers and valued specialists (Eg. visual design, accessibility, search engine optimization, and internationalization) from making meaningful contributions to a project.

In addition to the increasing cost of the few developers somewhat capable of juggling all of these concerns, it can also result in other real world business implications.

What is a front-end developer? Does anybody know, any more? And more-importantly, how did we get to the point where we’re actively encouraging young developers into habits like writing (cough React cough) files containing a bloaty, icky mixture of content, HTML (markup), CSS (style), and Javascript (behaviour)? Yes, I get that the idea is that individual components should be packaged together (if you’re thinking in a React-like worldview), but that alone doesn’t justify this kind of bullshit antipattern.

It seems like the Web used to have developers. Then it got complex so we started differentiating back-end from front-end developers and described those who, like me, spanned the divide, as full-stack developers We gradually became a minority as more and more new developers, deprived of the opportunity to learn each new facet organically in this newly-complicated landscape, but that’s fine. But then… we started treating the front-end as the only end, and introducing all kinds of problems as a result… and most people don’t seem to have noticed, yet, exactly how much damage we’re doing to Web applications’ security, maintainability, future-proofibility, archivability, addressibility…

Pac-Man: The Untold Story of How We Really Played The Game

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Unrestored Pac-Man machine with worn paint in a specific place on the left-hand side.

Human beings leave physical impressions upon the things they love and use just as much as their do upon the lives of people and the planet they live upon. For every action, there’s a reaction. For every pressure, there’s an affect on mass and volume. And in the impressions left by that combination, particularly if you’re lucky enough to see the sides of a rare, unrestored  vintage Pac-Man cabinet, lies the never before told story of how we really played the game.

Until now, I don’t believe anyone has ever written about it.

Interesting exploration of the history of the cabinets housing Pac-Man, observing the ergonomic impact of the controls on the way that people would hold the side of the machine and, in turn, how that would affect where and how the paint would wear off.

I love that folks care about this stuff.