Making an RSS feed of YOURLS shortlinks

As you might know if you were paying close attention in Summer 2019, I run a “URL shortener” for my personal use. You may be familiar with public URL shorteners like TinyURL and Bit.ly: my personal URL shortener is basically the same thing, except that only I am able to make short-links with it. Compared to public ones, this means I’ve got a larger corpus of especially-short (e.g. 2/3 letter) codes available for my personal use. It also means that I’m not dependent on the goodwill of a free siloed service and I can add exactly the features I want to it.

Diagram showing the relationships of the DanQ.me ecosystem. Highlighted is the injection of links into the "S.2" link shortener and the export of these shortened links by RSS into FreshRSS.
Little wonder then that my link shortener sat so close to me on my ecosystem diagram the other year.

For the last nine years my link shortener has been S.2, a tool I threw together in Ruby. It stores URLs in a sequentially-numbered database table and then uses the Base62-encoding of the primary key as the “code” part of the short URL. Aside from the fact that when I create a short link it shows me a QR code to I can easily “push” a page to my phone, it doesn’t really have any “special” features. It replaced S.1, from which it primarily differed by putting the code at the end of the URL rather than as part of the domain name, e.g. s.danq.me/a0 rather than a0.s.danq.me: I made the switch because S.1 made HTTPS a real pain as well as only supporting Base36 (owing to the case-insensitivity of domain names).

But S.2’s gotten a little long in the tooth and as I’ve gotten busier/lazier, I’ve leant into using or adapting open source tools more-often than writing my own from scratch. So this week I switched my URL shortener from S.2 to YOURLS.

Screenshot of YOURLS interface showing Dan Q's list of shortened links. Six are shown of 1,939 total.
YOURLs isn’t the prettiest tool in the world, but then it doesn’t have to be: only I ever see the interface pictured above!

One of the things that attracted to me to YOURLS was that it had a ready-to-go Docker image. I’m not the biggest fan of Docker in general, but I do love the convenience of being able to deploy applications super-quickly to my household NAS. This makes installing and maintaining my personal URL shortener much easier than it used to be (and it was pretty easy before!).

Another thing I liked about YOURLS is that it, like S.2, uses Base62 encoding. This meant that migrating my links from S.2 into YOURLS could be done with a simple cross-database INSERT... SELECT statement:

INSERT INTO yourls.yourls_url(keyword, url, title, `timestamp`, clicks)
  SELECT shortcode, url, title, created_at, 0 FROM danq_short.links

But do you know what’s a bigger deal for my lifestack than my URL shortener? My RSS reader! I’ve written about it a lot, but I use RSS for just about everything and my feed reader is my first, last, and sometimes only point of contact with the Web! I’m so hooked-in to my RSS ecosystem that I’ll use my own middleware to add feeds to sites that don’t have them, or for which I’m not happy with the feed they provide, e.g. stripping sports out of BBC News, subscribing to webcomics that don’t provide such an option (sometimes accidentally hacking into sites on the way), and generating “complete” archives of series’ of posts so I can use my reader to track my progress.

One of S.1/S.2’s features was that it exposed an RSS feed at a secret URL for my reader to ingest. This was great, because it meant I could “push” something to my RSS reader to read or repost to my blog later. YOURLS doesn’t have such a feature, and I couldn’t find anything in the (extensive) list of plugins that would do it for me. I needed to write my own.

Partial list of Dan's RSS feed subscriptions, including Jeremy Keith, Jim Nielson, Natalie Lawhead, Bruce Schneier, Scott O'Hara, "Yahtzee", BBC News, and several podcasts, as well as (highlighted) "Dan's Short Links", which has 5 unread items.
In some ways, subscribing “to yourself” is a strange thing to do. In other ways… shut up, I’ll do what I like.

I could have written a YOURLS plugin. Or I could have written a stack of code in Ruby, PHP, Javascript or some other language to bridge these systems. But as I switched over my shortlink subdomain s.danq.me to its new home at danq.link, another idea came to me. I have direct database access to YOURLS (and the table schema is super simple) and the command-line MariaDB client can output XML… could I simply write an XML Transformation to convert database output directly into a valid RSS feed? Let’s give it a go!

I wrote a script like this and put it in my crontab:

mysql --xml yourls -e                                                                                                                     \
      "SELECT keyword, url, title, DATE_FORMAT(timestamp, '%a, %d %b %Y %T') AS pubdate FROM yourls_url ORDER BY timestamp DESC LIMIT 30" \
    | xsltproc template.xslt -                                                                                                            \
    | xmllint --format -                                                                                                                  \
    > output.rss.xml

The first part of that command connects to the yourls database, sets the output format to XML, and executes an SQL statement to extract the most-recent 30 shortlinks. The DATE_FORMAT function is used to mould the datetime into something approximating the RFC-822 standard for datetimes as required by RSS. The output produced looks something like this:

<?xml version="1.0"?>
<resultset statement="SELECT keyword, url, title, timestamp FROM yourls_url ORDER BY timestamp DESC LIMIT 30" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <row>
        <field name="keyword">VV</field>
        <field name="url">https://webdevbev.co.uk/blog/06-2021/perfect-is-the-enemy-of-good.html</field>
        <field name="title"> Perfect is the enemy of good || Web Dev Bev</field>
        <field name="timestamp">2021-09-26 17:38:32</field>
  </row>
  <row>
        <field name="keyword">VU</field>
        <field name="url">https://webdevlaw.uk/2021/01/30/why-generation-x-will-save-the-web/</field>
        <field name="title">Why Generation X will save the web  Hi, Im Heather Burns</field>
        <field name="timestamp">2021-09-26 17:38:26</field>
  </row>

  <!-- ... etc. ... -->
  
</resultset>

We don’t see this, though. It’s piped directly into the second part of the command, which  uses xsltproc to apply an XSLT to it. I was concerned that my XSLT experience would be super rusty as I haven’t actually written any since working for my former employer SmartData back in around 2005! Back then, my coworker Alex and I spent many hours doing XML backflips to implement a system that converted complex data outputs into PDF files via an XSL-FO intermediary.

I needn’t have worried, though. Firstly: it turns out I remember a lot more than I thought from that project a decade and a half ago! But secondly, this conversion from MySQL/MariaDB XML output to RSS turned out to be pretty painless. Here’s the template.xslt I ended up making:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="resultset">
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
      <channel>
        <title>Dan's Short Links</title>
        <description>Links shortened by Dan using danq.link</description>
        <link> [ MY RSS FEED URL ] </link>
        <atom:link href=" [ MY RSS FEED URL ] " rel="self" type="application/rss+xml" />
        <lastBuildDate><xsl:value-of select="row/field[@name='pubdate']" /> UTC</lastBuildDate>
        <pubDate><xsl:value-of select="row/field[@name='pubdate']" /> UTC</pubDate>
        <ttl>1800</ttl>
        <xsl:for-each select="row">
          <item>
            <title><xsl:value-of select="field[@name='title']" /></title>
            <link><xsl:value-of select="field[@name='url']" /></link>
            <guid>https://danq.link/<xsl:value-of select="field[@name='keyword']" /></guid>
            <pubDate><xsl:value-of select="field[@name='pubdate']" /> UTC</pubDate>
          </item>
        </xsl:for-each>
      </channel>
    </rss>
  </xsl:template>
</xsl:stylesheet>

That uses the first (i.e. most-recent) shortlink’s timestamp as the feed’s pubDate, which makes sense: unless you’re going back and modifying links there’s no more-recent changes than the creation date of the most-recent shortlink. Then it loops through the returned rows and creates an <item> for each; simple!

The final step in my command runs the output through xmllint to prettify it. That’s not strictly necessary, but it was useful while debugging and as the whole command takes milliseconds to run once every quarter hour or so I’m not concerned about the overhead. Using these native binaries (plus a little configuration), chained together with pipes, had already resulted in way faster performance (with less code) than if I’d implemented something using a scripting language, and the result is a reasonably elegant “scratch your own itch”-type solution to the only outstanding barrier that was keeping me on S.2.

All that remained for me to do was set up a symlink so that the resulting output.rss.xml was accessible, over the web, to my RSS reader. I hope that next time I’m tempted to write a script to solve a problem like this I’ll remember that sometimes a chain of piped *nix utilities can provide me a slicker, cleaner, and faster solution.

Update: Right as I finished writing this blog post I discovered that somebody had already solved this problem using PHP code added to YOURLS; it’s just not packaged as a plugin so I didn’t see it earlier! Whether or not I use this alternate approach or stick to what I’ve got, the process of implementing this YOURLS-database ➡ XML ➡  XSLTRSS chain was fun and informative.

Can I use HTTP Basic Auth in URLs?

Web standards sometimes disappear

Sometimes a web standard disappears quickly at the whim of some company, perhaps to a great deal of complaint (and at least one joke).

But sometimes, they disappear slowly, like this kind of web address:

http://username:password@example.com/somewhere

If you’ve not seen a URL like that before, that’s fine, because the answer to the question “Can I still use HTTP Basic Auth in URLs?” is, I’m afraid: no, you probably can’t.

But by way of a history lesson, let’s go back and look at what these URLs were, why they died out, and how web browsers handle them today. Thanks to Ruth who asked the original question that inspired this post.

Basic authentication

The early Web wasn’t built for authentication. A resource on the Web was theoretically accessible to all of humankind: if you didn’t want it in the public eye, you didn’t put it on the Web! A reliable method wouldn’t become available until the concept of state was provided by Netscape’s invention of HTTP cookies in 1994, and even that wouldn’t see widespread for several years, not least because implementing a CGI (or similar) program to perform authentication was a complex and computationally-expensive option for all but the biggest websites.

Comic showing a conversation between a web browser and server. Browser: "Show me that page. (GET /)" Server: "No, take a ticket and fill this form. (Redirect, Set-Cookie)" Browser: "I've filled your form and here's your ticket (POST request with Cookie set)" Server: "Okay, Keep hold of your ticket. (Redirect, Set-Cookie)" Browser: "Show me that page, here's my ticket. (GET /, Cookie:)"
A simplified view of the form-and-cookie based authentication system used by virtually every website today, but which was too computationally-expensive for many sites in the 1990s.

1996’s HTTP/1.0 specification tried to simplify things, though, with the introduction of the WWW-Authenticate header. The idea was that when a browser tried to access something that required authentication, the server would send a 401 Unauthorized response along with a WWW-Authenticate header explaining how the browser could authenticate itself. Then, the browser would send a fresh request, this time with an Authorization: header attached providing the required credentials. Initially, only “basic authentication” was available, which basically involved sending a username and password in-the-clear unless SSL (HTTPS) was in use, but later, digest authentication and a host of others would appear.

Comic showing conversation between web browser and server. Browser: "Show me that page (GET /)" Server: "No. Send me credentials. (HTTP 401, WWW-Authenticate)" Browser: "Show me that page. I enclose credentials (Authorization)" Server: "Okay (HTTP 200)"
For all its faults, HTTP Basic Authentication (and its near cousins) are certainly elegant.

Webserver software quickly added support for this new feature and as a result web authors who lacked the technical know-how (or permission from the server administrator) to implement more-sophisticated authentication systems could quickly implement HTTP Basic Authentication, often simply by adding a .htaccess file to the relevant directory. .htaccess files would later go on to serve many other purposes, but their original and perhaps best-known purpose – and the one that gives them their name – was access control.

Credentials in the URL

A separate specification, not specific to the Web (but one of Tim Berners-Lee’s most important contributions to it), described the general structure of URLs as follows:

<scheme>://<username>:<password>@<host>:<port>/<url-path>#<fragment>

At the time that specification was written, the Web didn’t have a mechanism for passing usernames and passwords: this general case was intended only to apply to protocols that did have these credentials. An example is given in the specification, and clarified with “An optional user name. Some schemes (e.g., ftp) allow the specification of a user name.”

But once web browsers had WWW-Authenticate, virtually all of them added support for including the username and password in the web address too. This allowed for e.g. hyperlinks with credentials embedded in them, which made for very convenient bookmarks, or partial credentials (e.g. just the username) to be included in a link, with the user being prompted for the password on arrival at the destination. So far, so good.

Comic showing conversation between web browser and server. Browser asks for a page, providing an Authorization: header outright; server responds with the page immediately.
Encoding authentication into the URL provided an incredible shortcut at a time when Web round-trip times were much longer owing to higher latencies and no keep-alives.

This is why we can’t have nice things

The technique fell out of favour as soon as it started being used for nefarious purposes. It didn’t take long for scammers to realise that they could create links like this:

https://YourBank.com@HackersSite.com/

Everything we were teaching users about checking for “https://” followed by the domain name of their bank… was undermined by this user interface choice. The poor victim would actually be connecting to e.g. HackersSite.com, but a quick glance at their address bar would leave them convinced that they were talking to YourBank.com!

Theoretically: widespread adoption of EV certificates coupled with sensible user interface choices (that were never made) could have solved this problem, but a far simpler solution was just to not show usernames in the address bar. Web developers were by now far more excited about forms and cookies for authentication anyway, so browsers started curtailing the “credentials in addresses” feature.

Internet Explorer window showing https://YourBank.com@786590867/ in the address bar.
Users trained to look for “https://” followed by the site they wanted would often fall for scams like this one: the real domain name is after the @-sign. (This attacker is also using dword notation to obfuscate their IP address; this dated technique wasn’t often employed alongside this kind of scam, but it’s another historical oddity I enjoy so I’m shoehorning it in.)

(There are other reasons this particular implementation of HTTP Basic Authentication was less-than-ideal, but this reason is the big one that explains why things had to change.)

One by one, browsers made the change. But here’s the interesting bit: the browsers didn’t always make the change in the same way.

How different browsers handle basic authentication in URLs

Let’s examine some popular browsers. To run these tests I threw together a tiny web application that outputs the Authorization: header passed to it, if present, and can optionally send a 401 Unauthorized response along with a WWW-Authenticate: Basic realm="Test Site" header in order to trigger basic authentication. Why both? So that I can test not only how browsers handle URLs containing credentials when an authentication request is received, but how they handle them when one is not. This is relevant because some addresses – often API endpoints – have optional HTTP authentication, and it’s sometimes important for a user agent (albeit typically a library or command-line one) to pass credentials without first being prompted.

In each case, I tried each of the following tests in a fresh browser instance:

  1. Go to http://<username>:<password>@<domain>/optional (authentication is optional).
  2. Go to http://<username>:<password>@<domain>/mandatory (authentication is mandatory).
  3. Experiment 1, then f0llow relative hyperlinks (which should correctly retain the credentials) to /mandatory.
  4. Experiment 2, then follow relative hyperlinks to the /optional.

I’m only testing over the http scheme, because I’ve no reason to believe that any of the browsers under test treat the https scheme differently.

Chromium desktop family

Chrome at an "Auth Optional" page, showing no header sent.Chrome 93 and Edge 93 both immediately suppressed the username and password from the address bar, along with the “http://” as we’ve come to expect of them. Like the “http://”, though, the plaintext username and password are still there. You can retrieve them by copy-pasting the entire address.

Opera 78 similarly suppressed the username, password, and scheme, but didn’t retain the username and password in a way that could be copy-pasted out.

Authentication was passed only when landing on a “mandatory” page; never when landing on an “optional” page. Refreshing the page or re-entering the address with its credentials did not change this.

Navigating from the “optional” page to the “mandatory” page using only relative links retained the username and password and submitted it to the server when it became mandatory, even Opera which didn’t initially appear to retain the credentials at all.

Navigating from the “mandatory” to the “optional” page using only relative links, or even entering the “optional” page address with credentials after visiting the “mandatory” page, does not result in authentication being passed to the “optional” page. However, it’s interesting to note that once authentication has occurred on a mandatory page, pressing enter at the end of the address bar on the optional page, with credentials in the address bar (whether visible or hidden from the user) does result in the credentials being passed to the optional page! They continue to be passed on each subsequent load of the “optional” page until the browsing session is ended.

Firefox desktop

Firefox window with popup reading "You are about to log in to the site 192.168.0.11 with the username alpha, but the web site does not require authentication. This may be an attempt to trick you."Firefox 91 does a clever thing very much in-line with its image as a browser that puts decision-making authority into the hands of its user. When going to the “optional” page first it presents a dialog, warning the user that they’re going to a site that does not specifically request a username, but they’re providing one anyway. If the user says that no, navigation ceases (the GET request for the page takes place the same either way; this happens before the dialog appears). Strangely: regardless of whether the user selects yes or no, the credentials are not passed on the “optional” page. The credentials (although not the “http://”) appear in the address bar while the user makes their decision.

Similar to Opera, the credentials do not appear in the address bar thereafter, but they’re clearly still being stored: if the refresh button is pressed the dialog appears again. It does not appear if the user selects the address bar and presses enter.

Firefox dialog reading "You are about to log in to the site 192.168.0.11 with the username alpha".Similarly, going to the “mandatory” page in Firefox results in an informative dialog warning the user that credentials are being passed. I like this approach: not only does it help protect the user from the use of authentication as a tracking technique (an old technique that I’ve not seen used in well over a decade, mind), it also helps the user be sure that they’re logging in using the account they mean to, when following a link for that purpose. Again, clicking cancel stops navigation, although the initial request (with no credentials) and the 401 response has already occurred.

Visiting any page within the scope of the realm of the authentication after visiting the “mandatory” page results in credentials being sent, whether or not they’re included in the address. This is probably the most-true implementation to the expectations of the standard that I’ve found in a modern graphical browser.

Safari desktop

Safari showing a dialog "Log in" / "Your password will be sent unencrypted."Safari 14 never displays or uses credentials provided via the web address, whether or not authentication is mandatory. Mandatory authentication is always met by a pop-up dialog, even if credentials were provided in the address bar. Boo!

Once passed, credentials are later provided automatically to other addresses within the same realm (i.e. optional pages).

Older browsers

Let’s try some older browsers.

Internet Explorer 8 showing the error message "Windows cannot find http://alpha:beta@10.0.2.2/optional. Check the spelling and try again."From version 7 onwards – right up to the final version 11 – Internet Explorer fails to even recognise addresses with authentication credentials in as legitimate web addresses, regardless of whether or not authentication is requested by the server. It’s easy to assume that this is yet another missing feature in the browser we all love to hate, but it’s interesting to note that credentials-in-addresses is permitted for ftp:// URLs…

Internet Explorer 5 showing credentials in the address bar being passed to the server.…and if you go back a little way, Internet Explorer 6 and below supported credentials in the address bar pretty much as you’d expect based on the standard. The error message seen in IE7 and above is a deliberate design decision, albeit a somewhat knee-jerk reaction to the security issues posed by the feature (compare to the more-careful approach of other browsers).

These older versions of IE even (correctly) retain the credentials through relative hyperlinks, allowing them to be passed when they become mandatory. They’re not passed on optional pages unless a mandatory page within the same realm has already been encountered.

Netscape Communicator 4.7 showing credentials in a URL, passed to a server.Pre-Mozilla Netscape behaved the same way. Truly this was the de facto standard for a long period on the Web, and the varied approaches we see today are the anomaly. That’s a strange observation to make, considering how much the Web of the 1990s was dominated by incompatible implementations of different Web features (I’ve written about the <blink> and <marquee> tags before, which was perhaps the most-visible division between the Microsoft and Netscape camps, but there were many, many more).

Screenshot showing Netscape 7.2, with a popup saying "You are about to log in to site 192.168.0.11 with the username alpha, but the website does not require authenticator. This may be an attempt to trick you." The username and password are visible in the address bar.Interestingly: by Netscape 7.2 the browser’s behaviour had evolved to be the same as modern Firefox’s, except that it still displayed the credentials in the address bar for all to see.

Screenshot of Opera 5 showing credentials in a web address with the password masked, being passed to the server on an optional page.Now here’s a real gem: pre-Chromium Opera. It would send credentials to “mandatory” pages and remember them for the duration of the browsing session, which is great. But it would also send credentials when passed in a web address to “optional” pages. However, it wouldn’t remember them on optional pages unless they remained in the address bar: this feels to me like an optimum balance of features for power users. Plus, it’s one of very few browsers that permitted you to change credentials mid-session: just by changing them in the address bar! Most other browsers, even to this day, ignore changes to HTTP Authentication credentials, which was sometimes be a source of frustration back in the day.

Finally, classic Opera was the only browser I’ve seen to mask the password in the address bar, turning it into a series of asterisks. This ensures the user knows that a password was used, but does not leak any sensitive information to shoulder-surfers (the length of the “masked” password was always the same length, too, so it didn’t even leak the length of the password). Altogether a spectacular design and a great example of why classic Opera was way ahead of its time.

The Command-Line

Most people using web addresses with credentials embedded within them nowadays are probably working with code, APIs, or the command line, so it’s unsurprising to see that this is where the most “traditional” standards-compliance is found.

I was unsurprised to discover that giving curl a username and password in the URL meant that username and password was sent to the server (using Basic authentication, of course, if no authentication was requested):

$ curl http://alpha:beta@localhost/optional
Header: Basic YWxwaGE6YmV0YQ==
$ curl http://alpha:beta@localhost/mandatory
Header: Basic YWxwaGE6YmV0YQ==

However, wget did catch me out. Hitting the same addresses with wget didn’t result in the credentials being sent except where it was mandatory (i.e. where a HTTP 401 response and a WWW-Authenticate: header was received on the initial attempt). To force wget to send credentials when they haven’t been asked-for requires the use of the --http-user and --http-password switches:

$ wget http://alpha:beta@localhost/optional -qO-
Header:
$ wget http://alpha:beta@localhost/mandatory -qO-
Header: Basic YWxwaGE6YmV0YQ==

lynx does a cute and clever thing. Like most modern browsers, it does not submit credentials unless specifically requested, but if they’re in the address bar when they become mandatory (e.g. because of following relative hyperlinks or hyperlinks containing credentials) it prompts for the username and password, but pre-fills the form with the details from the URL. Nice.

Lynx browser following a link from an optional-authentication to a mandatory-authentication page. The browser prompts for a username but it's pre-filled with the one provided by the URL.

What’s the status of HTTP (Basic) Authentication?

HTTP Basic Authentication and its close cousin Digest Authentication (which overcomes some of the security limitations of running Basic Authentication over an unencrypted connection) is very much alive, but its use in hyperlinks can’t be relied upon: some browsers (e.g. IE, Safari) completely munge such links while others don’t behave as you might expect. Other mechanisms like Bearer see widespread use in APIs, but nowhere else.

The WWW-Authenticate: and Authorization: headers are, in some ways, an example of the best possible way to implement authentication on the Web: as an underlying standard independent of support for forms (and, increasingly, Javascript), cookies, and complex multi-part conversations. It’s easy to imagine an alternative timeline where these standards continued to be collaboratively developed and maintained and their shortfalls – e.g. not being able to easily log out when using most graphical browsers! – were overcome. A timeline in which one might write a login form like this, knowing that your e.g. “authenticate” attributes would instruct the browser to send credentials using an Authorization: header:

<form method="get" action="/" authenticate="Basic">
<label for="username">Username:</label> <input type="text" id="username" authenticate="username">
<label for="password">Password:</label> <input type="text" id="password" authenticate="password">
<input type="submit" value="Log In">
</form>

In such a world, more-complex authentication strategies (e.g. multi-factor authentication) could involve encoding forms as JSON. And single-sign-on systems would simply involve the browser collecting a token from the authentication provider and passing it on to the third-party service, directly through browser headers, with no need for backwards-and-forwards redirects with stacks of information in GET parameters as is the case today. Client-side certificates – long a powerful but neglected authentication mechanism in their own right – could act as first class citizens directly alongside such a system, providing transparent second-factor authentication wherever it was required. You wouldn’t have to accept a tracking cookie from a site in order to log in (or stay logged in), and if your browser-integrated password safe supported it you could log on and off from any site simply by toggling that account’s “switch”, without even visiting the site: all you’d be changing is whether or not your credentials would be sent when the time came.

The Web has long been on a constant push for the next new shiny thing, and that’s sometimes meant that established standards have been neglected prematurely or have failed to evolve for longer than we’d have liked. Consider how long it took us to get the <video> and <audio> elements because the “new shiny” Flash came to dominate, how the Web Payments API is only just beginning to mature despite over 25 years of ecommerce on the Web, or how we still can’t use Link: headers for all the things we can use <link> elements for despite them being semantically-equivalent!

The new model for Web features seems to be that new features first come from a popular JavaScript implementation, and then eventually it evolves into a native browser feature: for example HTML form validations, which for the longest time could only be done client-side using scripting languages. I’d love to see somebody re-think HTTP Authentication in this way, but sadly we’ll never get a 100% solution in JavaScript alone: (distributed SSO is almost certainly off the table, for example, owing to cross-domain limitations).

Or maybe it’s just a problem that’s waiting for somebody cleverer than I to come and solve it. Want to give it a go?

The Cursed Computer Iceberg Meme

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

More awesome from Blackle Mori, whose praises I sung recently over The Basilisk Collection. This time we’re treated to a curated list of 182 articles demonstrating the “peculiarities and weirdness” of computers. Starting from relatively well-known memes like little Bobby Tables, the year 2038 problem, and how all web browsers pretend to be each other, we descend through the fast inverse square root (made famous by Quake III), falsehoods programmers believe about time (personally I’m more of a fan of …names, but then you might expect that), the EICAR test file, the “thank you for playing Wing Commander” EMM386 in-memory hack, The Basilisk Collection itself, and the GIF MD5 hashquine (which I’ve shared previously) before eventually reaching the esoteric depths of posuto and the nightmare that is Japanese postcodes

Plus many, many things that were new to me and that I’ve loved learning about these last few days.

It’s definitely not a competition; it’s a learning opportunity wrapped up in the weirdest bits of the field. Have an explore and feed your inner computer science geek.

The Coolest Thing About GPS

I’m currently doing a course, through work, delivered by BetterOn Video. The aim of the course is to improve my video presentation skills, in particular my engagement with the camera and the audience.

I made this video based on the week 2 prompt “make a video 60-90 seconds long about something you’re an expert on”. The idea came from a talk I used to give at the University of Oxford.

Watching Films Together… Apart

This weekend I announced and then hosted Homa Night II, an effort to use technology to help bridge the chasms that’ve formed between my diaspora of friends as a result mostly of COVID. To a lesser extent we’ve been made to feel distant from one another for a while as a result of our very diverse locations and lifestyles, but the resulting isolation was certainly compounded by lockdowns and quarantines.

Mark, Sian, Alec, Paul, Kit, Adam, Dan and Claire at Troma Night V.
Long gone are the days when I could put up a blog post to say “Troma Night tonight?” and expect half a dozen friends to turn up at my house.

Back in the day we used to have a regular weekly film night called Troma Night, named after the studio who dominated our early events and whose… genre… influenced many of our choices thereafter. We had over 300 such film nights, by my count, before I eventually left our shared hometown of Aberystwyth ten years ago. I wasn’t the last one of the Troma Night regulars to leave town, but more left before me than after.

Sour Grapes: participants share "hearts" with Ruth
Observant readers will spot a previous effort I made this year at hosting a party online.

Earlier this year I hosted Sour Grapes, a murder mystery party (an irregular highlight of our Aberystwyth social calendar, with thanks to Ruth) run entirely online using a mixture of video chat and “second screen” technologies. In some ways that could be seen as the predecessor to Homa Night, although I’d come up with most of the underlying technology to make Homa Night possible on a whim much earlier in the year!

WhatsApp chat: Dan proposes "Troma Night Remote"; Matt suggests calling it "Troma at Homa"; Dan settles on "Homa Night".
The idea spun out of a few conversations on WhatsApp but the final name – Homa Night – wasn’t agreed until early in November.

How best to make such a thing happen? When I first started thinking about it, during the first of the UK’s lockdowns, I considered a few options:

  • Streaming video over a telemeeting service (Zoom, Google Meet, etc.)
    Very simple to set up, but the quality – as anybody who’s tried this before will attest – is appalling. Being optimised for speech rather than music and sound effects gives the audio a flat, scratchy sound, video compression artefacts that are tolerable when you’re chatting to your boss are really annoying when they stop you reading a crucial subtitle, audio and video often get desynchronised in a way that’s frankly infuriating, and everybody’s download speed is limited by the upload speed of the host, among other issues. The major benefit of these platforms – full-duplex audio – is destroyed by feedback so everybody needs to stay muted while watching anyway. No thanks!
  • Teleparty or a similar tool
    Teleparty (formerly Netflix Party, but it now supports more services) is a pretty clever way to get almost exactly what I want: synchronised video streaming plus chat alongside. But it only works on Chrome (and some related browsers) and doesn’t work on tablets, web-enabled TVs, etc., which would exclude some of my friends. Everybody requires an account on the service you’re streaming from, potentially further limiting usability, and that also means you’re strictly limited to the media available on those platforms (and further limited again if your party spans multiple geographic distribution regions for that service). There’s definitely things I can learn from Teleparty, but it’s not the right tool for Homa Night.
  • “Press play… now!”
    The relatively low-tech solution might have been to distribute video files in advance, have people download them, and get everybody to press “play” at the same time! That’s at least slightly less-convenient because people can’t just “turn up”, they have to plan their attendance and set up in advance, but it would certainly have worked and I seriously considered it. There are other downsides, though: if anybody has a technical issue and needs to e.g. restart their player then they’re basically doomed in any attempt to get back in-sync again. We can do better…
  • A custom-made synchronised streaming service…?
Homa Night architecture: S3 delivers static content to browsers, browsers exchange real-time information via Firebase.
A custom solution that leveraged existing infrastructure for the “hard bits” proved to be the right answer.

So obviously I ended up implementing my own streaming service. It wasn’t even that hard. In case you want to try your own, here’s how I did it:

Media preparation

First, I used Adobe Premiere to create a video file containing both of the night’s films, bookended and separated by “filler” content to provide an introduction/lobby, an intermission, and a closing “you should have stopped watching by now” message. I made sure that the “intro” was a nice round duration (90s) and suitable for looping because I planned to hold people there until we were all ready to start the film. Thanks to Boris & Oliver for the background music!

Dan uses a green screen to add to the intermission.
Honestly, the intermission was just an excuse to keep my chroma key gear out following its most-recent use.

Next, I ran the output through Handbrake to produce “web optimized” versions in 1080p and 720p output sizes. “Web optimized” in this case means that metadata gets added to the start of the file to allow it to start playing without downloading the entire file (streaming) and to allow the calculation of what-part-of-the-file corresponds to what-part-of-the-timeline: the latter, when coupled with a suitable webserver, allows browsers to “skip” to any point in the video without having to watch the intervening part. Naturally I’m encoding with H.264 for the widest possible compatibility.

Handbrake preparing to transcode Premiere's output.
Even using my multi-GPU computer for the transcoding I had time to get up and walk around a bit.

Real-Time Synchronisation

To keep everybody’s viewing experience in-sync, I set up a Firebase account for the application: Firebase provides an easy-to-use Websockets platform with built-in data synchronisation. Ignoring the authentication and chat features, there wasn’t much shared here: just the currentTime of the video in seconds, whether or not introMode was engaged (i.e. everybody should loop the first 90 seconds, for now), and whether or not the video was paused:

Firebase database showing shared currentTime, introMode, and paused values.
Firebase makes schemaless real-time databases pretty easy.

To reduce development effort, I never got around to implementing an administrative front-end; I just manually went into the Firebase database and acknowledged “my” computer as being an administrator, after I’d connected to it, and then ran a little Javascript in my browser’s debugger to tell it to start pushing my video’s currentTime to the server every few seconds. Anything else I needed to edit I just edited directly from the Firebase interface.

Other web clients’ had Javascript to instruct them to monitor these variables from the Firebase database and, if they were desynchronised by more than 5 seconds, “jump” to the correct point in the video file. The hard part of the code… wasn’t really that hard:

// Rewind if we're passed the end of the intro loop
function introModeLoopCheck() {
  if (!introMode) return;
  if (video.currentTime > introDuration) video.currentTime = 0;
}

function fixPlayStatus() {
  // Handle "intro loop" mode
  if (remotelyControlled && introMode) {
    if (video.paused) video.play(); // always play
    introModeLoopCheck();
    return; // don't look at the rest
  }

  // Fix current time
  const desync = Math.abs(lastCurrentTime - video.currentTime);
  if (
    (video.paused && desync > DESYNC_TOLERANCE_WHEN_PAUSED) ||
    (!video.paused && desync > DESYNC_TOLERANCE_WHEN_PLAYING)
  ) {
    video.currentTime = lastCurrentTime;
  }
  // Fix play status
  if (remotelyControlled) {
    if (lastPaused && !video.paused) {
      video.pause();
    } else if (!lastPaused && video.paused) {
      video.play();
    }
  }
  // Show/hide paused notification
  updatePausedNotification();
}

Web front-end

Finally, there needed to be a web page everybody could go to to get access to this. As I was hosting the video on S3+CloudFront anyway, I put the HTML/CSS/JS there too.

Configuring a Homa Night video player.
I decided to carry the background theme of the video through to the web interface too.

I tested in Firefox, Edge, Chrome, and Safari on desktop, and (slightly less) on Firefox, Chrome and Safari on mobile. There were a few quirks to work around, mostly to do with browsers not letting videos make sound until the page has been interacted with after the video element has been rendered, which I carefully worked-around by putting a popup “over” the video to “enable sync”, but mostly it “just worked”.

Delivery

On the night I shared the web address and we kicked off! There were a few hiccups as some people’s browsers got disconnected early on and tried to start playing the film before it was time, and one of these even when fixed ran about a minute behind the others, leading to minor spoilers leaking via the rest of us riffing about them! But on the whole, it worked. I’ve had lots of useful feedback to improve on it for the next version, and I might even try to tidy up my code a bit and open-source the results if this kind of thing might be useful to anybody else.

Endpoint Encabulator

(This video is also available on YouTube.)

I’ve been working as part of the team working on the new application framework called the Endpoint Encabulator and wanted to share with you what I think makes our project so exciting: I promise it’ll make for two minutes of your time you won’t seen forget!

Naturally, this project wouldn’t have been possible without the pioneering work that preceded it by John Hellins Quick, Bud Haggart, and others. Nothing’s invented in a vacuum. However, my fellow developers and I think that our work is the first viable encabulator implementation to provide inverse reactive data binding suitable for deployment in front of a blockchain-driven backend cache. I’m not saying that all digital content will one day be delivered through Endpoint Encabulator, but… well; maybe it will.

If the technical aspects go over your head, pass it on to a geeky friend who might be able to make use of my work. Sharing is caring!

Accidental Geohashing

Over the last six years I’ve been on a handful of geohashing expeditions, setting out to functionally-random GPS coordinates to see if I can get there, and documenting what I find when I do. The comic that inspired the sport was already six years old by the time I embarked on my first outing, and I’m far from the most-active member of the ‘hasher community, but I’ve a certain closeness to them as a result of my work to resurrect and host the “official” website. Either way: I love the sport.

Dan, Ruth, and baby Annabel at geohashpoint 2014-04-21 51 -1
I even managed to drag-along Ruth and Annabel to a hashpoint (2014-04-21 51 -1) once.

But even when I’ve not been ‘hashing, it occurs to me that I’ve been tracking my location a lot. Three mechanisms in particular dominate:

  • Google’s somewhat-invasive monitoring of my phones’ locations (which can be exported via Google Takeout)
  • My personal GPSr logs (I carry the device moderately often, and it provides excellent precision)
  • The personal μlogger server I’ve been running for the last few years (it’s like Google’s system, but – y’know – self-hosted, tweakable, and less-creepy)

If I could mine all of that data, I might be able to answer the question… have I ever have accidentally visited a geohashpoint?

Let’s find out.

KML from Google is coverted into GPX where it joins GPX from my GPSr and real-time position data from uLogger in a MySQL database. This is queried against historic hashpoints to produce a list of candidate accidental hashpoints.
There’s a lot to my process, but it’s technically quite simple.

Data mining my own movements

To begin with, I needed to get all of my data into μLogger. The Android app syncs to it automatically and uploading from my GPSr was simple. The data from Google Takeout was a little harder.

I found a setting in Google Takeout to export past location data in KML, rather than JSON, format. KML is understood by GPSBabel which can convert it into GPX. I can “cut up” the resulting GPX file using a little grep-fu (relevant xkcd?) to get month-long files and import them into μLogger. Easy!

Requesting KML rather than JSON from Google Takeout
It’s slightly hidden, but Google Takeout choose your geoposition output format (from a limited selection).

Well.. μLogger’s web interface sometimes times-out if you upload enormous files like a whole month of Google Takeout logs. So instead I wrote a Nokogiri script to convert the GPX into SQL to inject directly into μLogger’s database.

Next, I got a set of hashpoint offsets. I only had personal positional data going back to around 2010, so I didn’t need to accommodate for the pre-2008 absence of the 30W time zone rule. I’ve had only one trip to the Southern hemisphere in that period, and I checked that manually. A little rounding and grouping in SQL gave me each graticule I’d been in on every date. Unsurprisingly, I spend most of my time in the 51 -1 graticule. Adding (or subtracting, for the Western hemisphere) the offset provided the coordinates for each graticule that I visited for the date that I was in that graticule. Nice.

SQL retreiving hashpoints for every graticule I've been to in the last 10 years, grouped by date.
Preloading the offsets into a temporary table made light work of listing all the hashpoints in all the graticules I’d visited, by date. Note that some dates (e.g. 2011-08-04, above) saw me visit multiple graticules.

The correct way to find the proximity of my positions to each geohashpoint is, of course, to use WGS84. That’s an easy thing to do if you’re using a database that supports it. My database… doesn’t. So I just used Pythagoras’ theorem to find positions I’d visited that were within 0.15° of a that day’s hashpoint.

Using Pythagoras for geopositional geometry is, of course, wrong. Why? Because the physical length of a “degree” varies dependent on latitude, and – more importantly – a degree of latitude is not the same distance as a degree of longitude. The ratio varies by latitude: only an idealised equatorial graticule would be square!

But for this case, I don’t care: the data’s going to be fuzzy and require some interpretation anyway. Not least because Google’s positioning has the tendency to, for example, spot a passing train’s WiFi and assume I’ve briefly teleported to Euston Station, which is apparently where Google thinks that hotspot “lives”.

GIF animation comparing routes recorded by Google My Location with those recorded by my GPSr: they're almost identical
I overlaid randomly-selected Google My Location and GPSr routes to ensure that they coincided, as an accuracy-test. It’s interesting to note that my GPSr points cluster when I was moving slower, suggesting it polls on a timer. Conversely Google’s points cluster when I was using data (can you see the bit where I used a chat app), suggesting that Google Location Services ramps up the accuracy and poll frequency when you’re actively using your device.

I assumed that my algorithm would detect all of my actual geohash finds, and yes: all of these appeared as-expected in my results. This was a good confirmation that my approach worked.

And, crucially: about a dozen additional candidate points showed up in my search. Most of these – listed at the end of this post – were 50m+ away from the hashpoint and involved me driving or cycling past on a nearby road… but one hashpoint stuck out.

Hashing by accident

Annabel riding on Tom's shoulders in Edinburgh.
We all had our roles to play in our trip to Edinburgh. Tom… was our pack mule.

In August 2015 we took a trip up to Edinburgh to see a play of Ruth‘s brother Robin‘s. I don’t remember much about the play because I was on keeping-the-toddler-entertained duty and so had to excuse myself pretty early on. After the play we drove South, dropping Tom off at Lanark station.

We exited Lanark via the Hyndford Bridge… which is – according to the map – tantalisingly-close to the 2015-08-22 55 -3 hashpoint: only about 23 metres away!

The 2015-08-22 55 -3
Google puts the centre of the road I drove down only 23m from the 2015-08-22 55 -3 hashpoint (of course, I was actually driving on the near side of the road and may have been closer still).

That doesn’t feel quite close enough to justify retroactively claiming the geohash, tempting though it would be to use it as a vehicle to my easy geohash ribbon. Google doesn’t provide error bars for their exported location data so I can’t draw a circle of uncertainty, but it seems unlikely that I passed through this very close hashpoint.

Pity. But a fun exercise. This was the nearest of my near misses, but plenty more turned up in my search, too:

  1. 2013-09-28 54 -2 (9,000m)
    Near a campsite on the River Eden. I drove past on the M6 with Ruth on the way to Loch Lomond for a mini-break to celebrate our sixth anniversary. I was never more than 9,000 metres from the hashpoint, but Google clearly had a moment when it couldn’t get good satellite signal and tries to trilaterate my position from cell masts and coincidentally guessed, for a few seconds, that I was much closer. There are a few such erroneous points in my data but they’re pretty obvious and easy to spot, so my manual filtering process caught them.
  2. 2019-09-13 52 -0 (719m)
    A600, near Cardington Airstrip, south of Bedford. I drove past on the A421 on my way to Three Rings‘ “GDPR Camp”, which was more fun than it sounds, I promise.
  3. 2014-03-29 53 -1 (630m)
    Spen Farm, near Bramham Interchange on the A1(M). I drove past while heading to the Nightline Association Conference to talk about Three Rings. Curiously, I came much closer to the hashpoint the previous week when I drove a neighbouring road on my way to York for my friend Matt’s wedding.
  4. 2020-05-06 51 -1 (346m)
    Inside Kidlington Police Station! Short of getting arrested, I can’t imagine how I’d easily have gotten to this one, but it’s moot anyway because I didn’t try! I’d taken the day off work to help with child-wrangling (as our normal childcare provisions had been scrambled by COVID-19), and at some point during the day we took a walk and came somewhat near to the hashpoint.
  5. 2016-02-05 51 -1 (340m)
    Garden of a house on The Moors, Kidlington. I drove past (twice) on my way to and from the kids’ old nursery. Bonus fact: the house directly opposite the one whose garden contained the hashpoint is a house that I looked at buying (and visited), once, but didn’t think it was worth the asking price.
  6. 2017-08-30 51 -1 (318m)
    St. Frieswide Farm, between Oxford and Kidlington. I cycled past on Banbury Road twice – once on my way to and once on my way from work.
  7. 2015-01-25 51 -1 (314m)
    Templar Road, Cutteslowe, Oxford. I’ve cycled and driven along this road many times, but on the day in question the closest I came was cycling past on nearby Banbury Road while on the way to work.
  8. 2018-01-28 51 -1 (198m)
    Stratfield Brake, Kidlington. I took our youngest by bike trailer this morning to his Monkey Music class: normally at this point in history Ruth would have been the one to take him, but she had a work-related event that she couldn’t miss in the morning. I cycled right by the entrance to this nature reserve: it could have been an ideal location for a geohash!
  9. 2014-01-24 51 -1 (114m)
    On the Marston Cyclepath. I used to cycle along this route on the way to and from work most days back when I lived in Marston, but by 2014 I lived in Kidlington and so I’d only cycle past the end of it. So it was that I cycled past the Linacre College of the path, around 114m away from the hashpoint, on this day.
  10. 2015-06-10 51 -1 (112m)
    Meadow near Peartree Interchange, Oxford. I stopped at the filling station on the opposite side of the roundabout, presumably to refuel a car.
  11. 2020-02-27 51 -1 (70m)
    This was a genuine attempt at a hashpoint that I failed to reach and was so sad about that I never bothered to finish writing up. The hashpoint was very close (but just out of sight of, it turns out) a geocache I’d hidden in the vicinity, and I was hopeful that I might be able to score the most-epic/demonstrable déjà vu/hash collision achievement ever, not least because I had pre-existing video evidence that I’d been at the coordinates before! Unfortunately it wasn’t to be: I had inadequate footwear for the heavy rains that had fallen in the days that preceded the expedition and I was in a hurry to get home, get changed, and go catch a train to go and see the Goo Goo Dolls in concert. So I gave up and quit the expedition. This turned out to be the right decision: going to the concert one of the last “normal” activities I got to do before the COVID-19 lockdown made everybody’s lives weird.
  12. 2014-05-23 51 -1 (61m)
    White Way, Kidlington, near the Bicester Road to Green Road footpath. I passed close by while cycling to work, but I’ve since walked through this hashpoint many times: it’s on a route that our eldest sometimes used to take when walking home from her school! With the exception only of the very-near-miss in Lanark, this was my nearest “near miss”.
  13. 2015-08-22 55 -3 (23m)
    So near, yet so far. 🙄
Rain in Lanark, seen through a car window
No silly grin, but coincidentally – perhaps by accident – I took a picture out of the car window shortly after we passed the hashpoint. This is what Lanark looks like when you drive through it in the rain.

Syncthing

This last month or so, my digital life has been dramatically improved by Syncthing. So much so that I want to tell you about it.

Syncthing interface via Synctrayzor on Windows, showing Dan's syncs.
1.25TiB of data is automatically kept in sync between (depending on the data in question) a desktop PC, NAS, media centre, and phone. This computer’s using the Synctrayzor system tray app.

I started using it last month. Basically, what it does is keeps a pair of directories on remote systems “in sync” with one another. So far, it’s like your favourite cloud storage service, albeit self-hosted and much-more customisable. But it’s got a handful of killer features that make it nothing short of a dream to work with:

  • The unique identifier for a computer can be derived from its public key. Encryption comes free as part of the verification of a computer’s identity.
  • You can share any number of folders with any number of other computers, point-to-point or via an intermediate proxy, and it “just works”.
  • It’s super transparent: you can always see what it’s up to, you can tweak the configuration to match your priorities, and it’s open source so you can look at the engine if you like.

Here are some of the ways I’m using it:

Keeping my phone camera synced to my PC

Phone syncing with PC

I’ve tried a lot of different solutions for this over the years. Back in the way-back-when, like everybody else in those dark times, I used to plug my phone in using a cable to copy pictures off and sort them. Since then, I’ve tried cloud solutions from Google, Amazon, and Flickr and never found any that really “worked” for me. Their web interfaces and apps tend to be equally terrible for organising or downloading files, and I’m rarely able to simply drag-and-drop images from them into a blog post like I can from Explorer/Finder/etc.

At first, I set this up as a one-way sync, “pushing” photos and videos from my phone to my desktop PC whenever I was on an unmetered WiFi network. But then I switched it to a two-way sync, enabling me to more-easily tidy up my phone of old photos too, by just dragging them from the folder that’s synced with my phone to my regular picture storage.

Centralising my backups

Phone and desktop backups centralised through the NAS

Now I’ve got a fancy NAS device with tonnes of storage, it makes sense to use it as a central point for backups to run fom. Instead of having many separate backup processes running on different computers, I can just have each of them sync to the NAS, and the NAS can back everything up. Computers don’t need to be “on” at a particular time because the NAS runs all the time, so backups can use the Internet connection when it’s quietest. And in the event of a hardware failure, there’s an up-to-date on-site backup in the first instance: the cloud backup’s only needed in the event of accidental data deletion (which could be sync’ed already, of course!). Plus, integrating the sync with ownCloud running on the NAS gives easy access to my files wherever in the world I am without having to fire up a VPN or otherwise remote-in to my house.

Plus: because Syncthing can share a folder between any number of devices, the same sharing mechanism that puts my phone’s photos onto my main desktop can simultaneously be pushing them to the NAS, providing redundant connections. And it was a doddle to set up.

Maintaining my media centre’s screensaver

PC photos syncing to the media centre.

Since the NAS, running Jellyfin, took on most of the media management jobs previously shared between desktop computers and the media centre computer, the household media centre’s had less to do. But one thing that it does, and that gets neglected, is showing a screensaver of family photos (when it’s not being used for anything else). Historically, we’ve maintained the photos in that collection via a shared network folder, but then you’ve got credential management and firewall issues to deal with, not to mention different file naming conventions by different people (and their devices).

But simply sharing the screensaver’s photo folder with the computer of anybody who wants to contribute photos means that it’s as easy as copying the picture to a particular place. It works on whatever device they care to (computer, tablet, mobile) on any operating system, and it’s quick and seamless. I’m just using it myself, for now, but I’ll be offering it to the rest of the family soon. It’s a trivial use-case, but once you’ve got it installed it just makes sense.

In short: this month, I’m in love with Syncthing. And maybe you should be, too.

Why $FAMOUS_COMPANY switched to $HYPED_TECHNOLOGY

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

As we’ve mentioned in previous blog posts, the $FAMOUS_COMPANY backend has historically been developed in $UNREMARKABLE_LANGUAGE and architected on top of $PRACTICAL_OPEN_SOURCE_FRAMEWORK. To suit our unique needs, we designed and open-sourced $AN_ENGINEER_TOOK_A_MYTHOLOGY_CLASS, a highly-available, just-in-time compiler for $UNREMARKABLE_LANGUAGE.

Saagar Jha tells the now-familiar story of how a bunch of techbros solved their scaling problems by reinventing the wheel. And then, when that didn’t work out, moved the goalposts of success. It’s a story as old as time; or at least as old as the modern Web.

(Should’a strangled the code. Or better yet, just refactored what they had.)

Mackerelmedia Fish

Normally this kind of thing would go into the ballooning dump of “things I’ve enjoyed on the Internet” that is my reposts archive. But sometimes something is so perfect that you have to try to help it see the widest audience it can, right? And today, that thing is: Mackerelmedia Fish.

Mackerelmedia Fish reports: WARNING! Your Fish have escaped!
Historical fact: escaped fish was one of the primary reasons for websites failing in 1996.

What is Mackerelmedia Fish? I’ve had a thorough and pretty complete experience of it, now, and I’m still not sure. It’s one or more (or none) of these, for sure, maybe:

  • A point-and-click, text-based, or hypertext adventure?
  • An homage to the fun and weird Web of yesteryear?
  • A statement about the fragility of proprietary technologies on the Internet?
  • An ARG set in a parallel universe in which the 1990s never ended?
  • A series of surrealist art pieces connected by a loose narrative?

Rock Paper Shotgun’s article about it opens with “I don’t know where to begin with this—literally, figuratively, existentially?” That sounds about right.

I stared into THE VOID and am OK!
This isn’t the reward for “winning” the “game”. But I was proud of it anyway.

What I can tell you with confident is what playing feels like. And what it feels like is the moment when you’ve gotten bored waiting for page 20 of Argon Zark to finish appear so you decide to reread your already-downloaded copy of the 1997 a.r.k bestof book, and for a moment you think to yourself: “Whoah; this must be what living in the future feels like!”

Because back then you didn’t yet have any concept that “living in the future” will involve scavenging for toilet paper while complaining that you can’t stream your favourite shows in 4K on your pocket-sized supercomputer until the weekend.

Dancing... thing?
I was always more of a Bouncing Blocks than a Hamster Dance guy, anyway.

Mackerelmedia Fish is a mess of half-baked puns, retro graphics, outdated browsing paradigms and broken links. And that’s just part of what makes it great.

It’s also “a short story that’s about the loss of digital history”, its creator Nathalie Lawhead says. If that was her goal, I think she managed it admirably.

An ASCII art wizard on a faux Apache directory listing page.
Everything about this, right down to the server signature (Artichoke), is perfect.

If I wasn’t already in love with the game already I would have been when I got to the bit where you navigate through the directory indexes of a series of deepening folders, choose-your-own-adventure style. Nathalie writes, of it:

One thing that I think is also unique about it is using an open directory as a choose your own adventure. The directories are branching. You explore them, and there’s text at the bottom (an htaccess header) that describes the folder you’re in, treating each directory as a landscape. You interact with the files that are in each of these folders, and uncover the story that way.

Back in the naughties I experimented with making choose-your-own-adventure games in exactly this way. I was experimenting with different media by which this kind of branching-choice game could be presented. I envisaged a project in which I’d showcase the same (or a set of related) stories through different approaches. One was “print” (or at least “printable”): came up with a Twee1-to-PDF converter to make “printable” gamebooks. A second was Web hypertext. A third – and this is the one which was most-similar to what Nathalie has now so expertly made real – was FTP! My thinking was that this would be an adventure game that could be played in a browser or even from the command line on any (then-contemporary: FTP clients aren’t so commonplace nowadays) computer. And then, like so many of my projects, the half-made version got put aside “for later” and forgotten about. My solution involved abusing the FTP protocol terribly, but it worked.

(I also looked into ways to make Gopher-powered hypertext fiction and toyed with the idea of using YouTube annotations to make an interactive story web [subsequently done amazingly by Wheezy Waiter, though the death of YouTube annotations in 2017 killed it]. And I’ve still got a prototype I’d like to get back to, someday, of a text-based adventure played entirely through your web browser’s debug console…! But time is not my friend… Maybe I ought to collaborate with somebody else to keep me on-course.)

Three virtual frogs. One needs a hug.
My first batch of pet frogs died quite quickly, but these ones did okay.

In any case: Mackerelmedia Fish is fun, weird, nostalgic, inspiring, and surreal, and you should give it a go. You’ll need to be on a Windows or OS X computer to get everything you can out of it, but there’s nothing to stop you starting out on your mobile, I imagine.

Sso long as you’re capable of at least 800 × 600 at 256 colours and have 4MB of RAM, if you know what I mean.

Avoid rewriting a legacy system from scratch, by strangling it

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Sometimes, code is risky to change and expensive to refactor.

In such a situation, a seemingly good idea would be to rewrite it.

From scratch.

Here’s how it goes:

  1. You discuss with management about the strategy of stopping new features for some time, while you rewrite the existing app.
  2. You estimate the rewrite will take 6 months to cover what the existing app does.
  3. A few months in, a nasty bug is discovered and ABSOLUTELY needs to be fixed in the old code. So you patch the old code and the new one too.
  4. A few months later, a new feature has been sold to the client. It HAS TO BE implemented in the old code—the new version is not ready yet! You need to go back to the old code but also add a TODO to implement this in the new version.
  5. After 5 months, you realize the project will be late. The old app was doing way more things than expected. You start hustling more.
  6. After 7 months, you start testing the new version. QA raises up a lot of things that should be fixed.
  7. After 9 months, the business can’t stand “not developing features” anymore. Leadership is not happy with the situation, you are tired. You start making changes to the old, painful code while trying to keep up with the rewrite.
  8. Eventually, you end up with the 2 systems in production. The long-term goal is to get rid of the old one, but the new one is not ready yet. Every feature needs to be implemented twice.

Sounds fictional? Or familiar?

Don’t be shamed, it’s a very common mistake.

I’ve rewritten legacy systems from scratch before. Sometimes it’s all worked out, and sometimes it hasn’t, but either way: it’s always been a lot more work than I could have possibly estimated. I’ve learned now to try to avoid doing so: at least, to avoid replacing a single monolithic (living) system in a monolithic way. Nicholas gives an even-better description of the true horror of legacy reimplementation, and promotes progressive strangulation as a candidate solution.