How did it take me years of working-from-home before I thought to install one of these in my desk? Brilliant.
Tag: technology
Making an RSS feed of YOURLS shortlinks
As you might know if you were paying close attention in Summer 2019, I run a “URL shortener” for my personal use. You may be familiar with public URL shorteners like TinyURL and Bit.ly: my personal URL shortener is basically the same thing, except that only I am able to make short-links with it. Compared to public ones, this means I’ve got a larger corpus of especially-short (e.g. 2/3 letter) codes available for my personal use. It also means that I’m not dependent on the goodwill of a free siloed service and I can add exactly the features I want to it.
For the last nine years my link shortener has been S.2, a tool I threw together in Ruby. It stores URLs in a
sequentially-numbered database table and then uses the Base62-encoding of the primary key as the “code” part of the short URL. Aside from the fact that when I create a short link it shows me a QR code to I can
easily “push” a page to my phone, it doesn’t really have any “special” features. It replaced S.1, from which it primarily differed by putting the code at the end of the URL rather than as part of the domain name, e.g. s.danq.me/a0 rather than a0.s.danq.me: I made the switch
because S.1 made HTTPS a real pain as well as only supporting Base36 (owing to the case-insensitivity of domain names).
But S.2’s gotten a little long in the tooth and as I’ve gotten busier/lazier, I’ve leant into using or adapting open source tools more-often than writing my own from scratch. So this week I switched my URL shortener from S.2 to YOURLS.
One of the things that attracted to me to YOURLS was that it had a ready-to-go Docker image. I’m not the biggest fan of Docker in general, but I do love the convenience of being able to deploy applications super-quickly to my household NAS. This makes installing and maintaining my personal URL shortener much easier than it used to be (and it was pretty easy before!).
Another thing I liked about YOURLS is that it, like S.2, uses Base62 encoding. This meant that migrating my links from S.2 into YOURLS could be done with a simple cross-database
INSERT... SELECT statement:
INSERT INTO yourls.yourls_url(keyword, url, title, `timestamp`, clicks) SELECT shortcode, url, title, created_at, 0 FROM danq_short.links
But do you know what’s a bigger deal for my lifestack than my URL shortener? My RSS reader! I’ve written about it a lot, but I use RSS for just about everything and my feed reader is my first, last, and sometimes only point of contact with the Web! I’m so hooked-in to my RSS ecosystem that I’ll use my own middleware to add feeds to sites that don’t have them, or for which I’m not happy with the feed they provide, e.g. stripping sports out of BBC News, subscribing to webcomics that don’t provide such an option (sometimes accidentally hacking into sites on the way), and generating “complete” archives of series’ of posts so I can use my reader to track my progress.
One of S.1/S.2’s features was that it exposed an RSS feed at a secret URL for my reader to ingest. This was great, because it meant I could “push” something to my RSS reader to read or repost to my blog later. YOURLS doesn’t have such a feature, and I couldn’t find anything in the (extensive) list of plugins that would do it for me. I needed to write my own.
I could have written a YOURLS plugin. Or I could have written a stack of code in Ruby, PHP, Javascript or
some other language to bridge these systems. But as I switched over my shortlink subdomain s.danq.me to its new home at danq.link, another idea came to me. I
have direct database access to YOURLS (and the table schema is super simple) and the command-line MariaDB client can output XML… could I simply write an XML
Transformation to convert database output directly into a valid RSS feed? Let’s give it a go!
I wrote a script like this and put it in my crontab:
mysql --xml yourls -e \ "SELECT keyword, url, title, DATE_FORMAT(timestamp, '%a, %d %b %Y %T') AS pubdate FROM yourls_url ORDER BY timestamp DESC LIMIT 30" \ | xsltproc template.xslt - \ | xmllint --format - \ > output.rss.xml
The first part of that command connects to the yourls database, sets the output format to XML, and executes an
SQL statement to extract the most-recent 30 shortlinks. The DATE_FORMAT function is used to mould the datetime into
something approximating the RFC-822 standard for datetimes as required by
RSS. The output produced looks something like this:
<?xml version="1.0"?> <resultset statement="SELECT keyword, url, title, timestamp FROM yourls_url ORDER BY timestamp DESC LIMIT 30" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <row> <field name="keyword">VV</field> <field name="url">https://webdevbev.co.uk/blog/06-2021/perfect-is-the-enemy-of-good.html</field> <field name="title"> Perfect is the enemy of good || Web Dev Bev</field> <field name="timestamp">2021-09-26 17:38:32</field> </row> <row> <field name="keyword">VU</field> <field name="url">https://webdevlaw.uk/2021/01/30/why-generation-x-will-save-the-web/</field> <field name="title">Why Generation X will save the web Hi, Im Heather Burns</field> <field name="timestamp">2021-09-26 17:38:26</field> </row> <!-- ... etc. ... --> </resultset>
We don’t see this, though. It’s piped directly into the second part of the command, which uses xsltproc to apply an XSLT to it. I was concerned that my XSLT
experience would be super rusty as I haven’t actually written any since working for my former employer SmartData back in around 2005! Back then, my coworker Alex and I spent many hours doing XML
backflips to implement a system that converted complex data outputs into PDF files via an XSL-FO intermediary.
I needn’t have worried, though. Firstly: it turns out I remember a lot more than I thought from that project a decade and a half ago! But secondly, this conversion from MySQL/MariaDB
XML output to RSS turned out to be pretty painless. Here’s the
template.xslt I ended up making:
<?xml version="1.0"?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> <xsl:template match="resultset"> <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"> <channel> <title>Dan's Short Links</title> <description>Links shortened by Dan using danq.link</description> <link> [ MY RSS FEED URL ] </link> <atom:link href=" [ MY RSS FEED URL ] " rel="self" type="application/rss+xml" /> <lastBuildDate><xsl:value-of select="row/field[@name='pubdate']" /> UTC</lastBuildDate> <pubDate><xsl:value-of select="row/field[@name='pubdate']" /> UTC</pubDate> <ttl>1800</ttl> <xsl:for-each select="row"> <item> <title><xsl:value-of select="field[@name='title']" /></title> <link><xsl:value-of select="field[@name='url']" /></link> <guid>https://danq.link/<xsl:value-of select="field[@name='keyword']" /></guid> <pubDate><xsl:value-of select="field[@name='pubdate']" /> UTC</pubDate> </item> </xsl:for-each> </channel> </rss> </xsl:template> </xsl:stylesheet>
That uses the first (i.e. most-recent) shortlink’s timestamp as the feed’s pubDate, which makes sense: unless you’re going back and modifying links there’s no more-recent
changes than the creation date of the most-recent shortlink. Then it loops through the returned rows and creates an <item> for each; simple!
The final step in my command runs the output through xmllint to prettify it. That’s not strictly necessary, but it was useful while debugging and as the whole command takes
milliseconds to run once every quarter hour or so I’m not concerned about the overhead. Using these native binaries (plus a little configuration), chained together with pipes, had
already resulted in way faster performance (with less code) than if I’d implemented something using a scripting language, and the result is a reasonably elegant “scratch your
own itch”-type solution to the only outstanding barrier that was keeping me on S.2.
All that remained for me to do was set up a symlink so that the resulting output.rss.xml was accessible, over the web, to my RSS reader. I hope that next time I’m tempted to write a script to solve a problem like this I’ll remember that sometimes a chain of piped *nix
utilities can provide me a slicker, cleaner, and faster solution.
Update: Right as I finished writing this blog post I discovered that somebody had already solved this problem using PHP code added to YOURLS; it’s just not packaged as a plugin so I didn’t see it earlier! Whether or not I use this alternate approach or stick to what I’ve got, the process of implementing this YOURLS-database ➡ XML ➡ XSLT ➡ RSS chain was fun and informative.
Can I use HTTP Basic Auth in URLs?
Web standards sometimes disappear
Sometimes a web standard disappears quickly at the whim of some company, perhaps to a great deal of complaint (and at least one joke).
But sometimes, they disappear slowly, like this kind of web address:
http://username:password@example.com/somewhere
If you’ve not seen a URL like that before, that’s fine, because the answer to the question “Can I still use HTTP Basic Auth in URLs?” is, I’m afraid: no, you probably can’t.
But by way of a history lesson, let’s go back and look at what these URLs were, why they died out, and how web browsers handle them today. Thanks to Ruth who asked the original question that inspired this post.
Basic authentication
The early Web wasn’t built for authentication. A resource on the Web was theoretically accessible to all of humankind: if you didn’t want it in the public eye, you didn’t put it on the Web! A reliable method wouldn’t become available until the concept of state was provided by Netscape’s invention of HTTP cookies in 1994, and even that wouldn’t see widespread for several years, not least because implementing a CGI (or similar) program to perform authentication was a complex and computationally-expensive option for all but the biggest websites.
1996’s HTTP/1.0 specification tried to simplify things, though, with the introduction of the WWW-Authenticate header. The idea was that when a browser tried to access something that required
authentication, the server would send a 401 Unauthorized response along with a WWW-Authenticate header explaining how the browser could authenticate
itself. Then, the browser would send a fresh request, this time with an Authorization: header attached providing the required credentials. Initially, only “basic
authentication” was available, which basically involved sending a username and password in-the-clear unless SSL (HTTPS) was in use, but later, digest authentication and a host of others would appear.
Webserver software quickly added support for this new feature and as a result web authors who lacked the technical know-how (or permission from the server administrator) to implement
more-sophisticated authentication systems could quickly implement HTTP Basic Authentication, often simply by adding a .htaccess file to the relevant directory.
.htaccess files would later go on to serve many other purposes, but their original and perhaps best-known purpose – and the one that gives them their name – was access
control.
Credentials in the URL
A separate specification, not specific to the Web (but one of Tim Berners-Lee’s most important contributions to it), described the general structure of URLs as follows:
<scheme>://<username>:<password>@<host>:<port>/<url-path>#<fragment>
At the time that specification was written, the Web didn’t have a mechanism for passing usernames and passwords: this general case was intended only to apply to protocols that did have these credentials. An example is given in the specification, and clarified with “An optional user name. Some schemes (e.g., ftp) allow the specification of a user name.”
But once web browsers had WWW-Authenticate, virtually all of them added support for including the username and password in the web address too. This allowed for
e.g. hyperlinks with credentials embedded in them, which made for very convenient bookmarks, or partial credentials (e.g. just the username) to be included in a link, with the
user being prompted for the password on arrival at the destination. So far, so good.
This is why we can’t have nice things
The technique fell out of favour as soon as it started being used for nefarious purposes. It didn’t take long for scammers to realise that they could create links like this:
https://YourBank.com@HackersSite.com/
Everything we were teaching users about checking for “https://” followed by the domain name of their bank… was undermined by this user interface choice. The poor victim would actually be connecting to e.g. HackersSite.com, but a quick glance at their address bar would leave them convinced that they were talking to YourBank.com!
Theoretically: widespread adoption of EV certificates coupled with sensible user interface choices (that were never made) could have solved this problem, but a far simpler solution was just to not show usernames in the address bar. Web developers were by now far more excited about forms and cookies for authentication anyway, so browsers started curtailing the “credentials in addresses” feature.
(There are other reasons this particular implementation of HTTP Basic Authentication was less-than-ideal, but this reason is the big one that explains why things had to change.)
One by one, browsers made the change. But here’s the interesting bit: the browsers didn’t always make the change in the same way.
How different browsers handle basic authentication in URLs
Let’s examine some popular browsers. To run these tests I threw together a tiny web application that outputs
the Authorization: header passed to it, if present, and can optionally send a 401 Unauthorized response along with a WWW-Authenticate: Basic realm="Test Site" header in order to trigger basic authentication. Why both? So that I can test not only how browsers handle URLs containing credentials when an authentication request is received, but how they handle them when one is not. This is relevant because
some addresses – often API endpoints – have optional HTTP authentication, and it’s sometimes important for a user agent (albeit typically a library or command-line one) to pass credentials without
first being prompted.
In each case, I tried each of the following tests in a fresh browser instance:
- Go to
http://<username>:<password>@<domain>/optional(authentication is optional). - Go to
http://<username>:<password>@<domain>/mandatory(authentication is mandatory). - Experiment 1, then f0llow relative hyperlinks (which should correctly retain the credentials) to
/mandatory. - Experiment 2, then follow relative hyperlinks to the
/optional.
I’m only testing over the http scheme, because I’ve no reason to believe that any of the browsers under test treat the https scheme differently.
Chromium desktop family
Chrome 93 and
Edge 93 both immediately suppressed the username and password from the address bar, along with the “http://” as we’ve come to expect of them. Like the “http://”, though,
the plaintext username and password are still there. You can retrieve them by copy-pasting the entire address.
Opera 78 similarly suppressed the username, password, and scheme, but didn’t retain the username and password in a way that could be copy-pasted out.
Authentication was passed only when landing on a “mandatory” page; never when landing on an “optional” page. Refreshing the page or re-entering the address with its credentials did not change this.
Navigating from the “optional” page to the “mandatory” page using only relative links retained the username and password and submitted it to the server when it became mandatory, even Opera which didn’t initially appear to retain the credentials at all.
Navigating from the “mandatory” to the “optional” page using only relative links, or even entering the “optional” page address with credentials after visiting the “mandatory” page, does not result in authentication being passed to the “optional” page. However, it’s interesting to note that once authentication has occurred on a mandatory page, pressing enter at the end of the address bar on the optional page, with credentials in the address bar (whether visible or hidden from the user) does result in the credentials being passed to the optional page! They continue to be passed on each subsequent load of the “optional” page until the browsing session is ended.
Firefox desktop
Firefox 91 does a clever thing very much in-line with its image as a browser that puts decision-making authority into the hands of its user. When going to
the “optional” page first it presents a dialog, warning the user that they’re going to a site that does not specifically request a username, but they’re providing one anyway. If the
user says that no, navigation ceases (the GET request for the page takes place the same either way; this happens before the dialog appears). Strangely: regardless of whether the user
selects yes or no, the credentials are not passed on the “optional” page. The credentials (although not the “http://”) appear in the address bar while the user makes their decision.
Similar to Opera, the credentials do not appear in the address bar thereafter, but they’re clearly still being stored: if the refresh button is pressed the dialog appears again. It does not appear if the user selects the address bar and presses enter.
Similarly, going to the “mandatory” page in Firefox results in an informative dialog warning the user that credentials are being passed. I like this approach: not
only does it help protect the user from the use of authentication as a tracking technique (an old technique that I’ve not seen used in well over a decade, mind), it also helps the user
be sure that they’re logging in using the account they mean to, when following a link for that purpose. Again, clicking cancel stops navigation, although the initial request (with no
credentials) and the 401 response has already occurred.
Visiting any page within the scope of the realm of the authentication after visiting the “mandatory” page results in credentials being sent, whether or not they’re included in the address. This is probably the most-true implementation to the expectations of the standard that I’ve found in a modern graphical browser.
Safari desktop
Safari 14 never displays or uses credentials provided via the web address, whether or not authentication is mandatory. Mandatory authentication is always met by a pop-up
dialog, even if credentials were provided in the address bar. Boo!
Once passed, credentials are later provided automatically to other addresses within the same realm (i.e. optional pages).
Older browsers
Let’s try some older browsers.
From version 7
onwards – right up to the final version 11 – Internet Explorer fails to even recognise addresses with authentication credentials in as legitimate web addresses, regardless of
whether or not authentication is requested by the server. It’s easy to assume that this is yet another missing feature in the browser we all love to hate, but it’s interesting to note
that credentials-in-addresses is permitted for ftp:// URLs…
…and
if you go back a little way, Internet Explorer 6 and below supported credentials in the address bar pretty much as you’d expect based on the standard. The error message seen in
IE7 and above is a deliberate design
decision, albeit a somewhat knee-jerk reaction to the security issues posed by the feature (compare to the more-careful approach of other browsers).
These older versions of IE even (correctly) retain the credentials through relative hyperlinks, allowing them to be passed when they become mandatory. They’re not passed on optional pages unless a mandatory page within the same realm has already been encountered.
Pre-Mozilla
Netscape behaved the same way. Truly this was the de facto standard for a long period on the Web, and the varied approaches we see today are the anomaly. That’s a strange
observation to make, considering how much the Web of the 1990s was dominated by incompatible implementations of different Web features (I’ve
written about the <blink> and <marquee> tags before, which was perhaps the most-visible division between the Microsoft and Netscape camps, but
there were many, many more).
Interestingly: by Netscape 7.2 the browser’s behaviour had evolved to be the same as modern Firefox’s, except that it still displayed the credentials in the
address bar for all to see.
Now here’s a real
gem: pre-Chromium Opera. It would send credentials to “mandatory” pages and remember them for the duration of the browsing session, which is great. But it would also send
credentials when passed in a web address to “optional” pages. However, it wouldn’t remember them on optional pages unless they remained in the address bar: this feels to me
like an optimum balance of features for power users. Plus, it’s one of very few browsers that permitted you to change credentials mid-session: just by changing them in
the address bar! Most other browsers, even to this day, ignore changes to HTTP Authentication credentials, which was
sometimes be a source of frustration back in the day.
Finally, classic Opera was the only browser I’ve seen to mask the password in the address bar, turning it into a series of asterisks. This ensures the user knows that a password was used, but does not leak any sensitive information to shoulder-surfers (the length of the “masked” password was always the same length, too, so it didn’t even leak the length of the password). Altogether a spectacular design and a great example of why classic Opera was way ahead of its time.
The Command-Line
Most people using web addresses with credentials embedded within them nowadays are probably working with code, APIs, or the command line, so it’s unsurprising to see that this is where the most “traditional” standards-compliance is found.
I was unsurprised to discover that giving curl a username and password in the URL meant that username and password was sent to the server (using Basic authentication, of course, if no authentication was requested):
$ curl http://alpha:beta@localhost/optional
Header: Basic YWxwaGE6YmV0YQ==
$ curl http://alpha:beta@localhost/mandatory
Header: Basic YWxwaGE6YmV0YQ==
However, wget did catch me out. Hitting the same addresses with wget didn’t result in the credentials being sent
except where it was mandatory (i.e. where a HTTP 401 response and a WWW-Authenticate: header was received on the initial attempt). To force wget to
send credentials when they haven’t been asked-for requires the use of the --http-user and --http-password switches:
$ wget http://alpha:beta@localhost/optional -qO-
Header:
$ wget http://alpha:beta@localhost/mandatory -qO-
Header: Basic YWxwaGE6YmV0YQ==
lynx does a cute and clever thing. Like most modern browsers, it does not submit credentials unless specifically requested, but if they’re in the address bar when they become mandatory (e.g. because of following relative hyperlinks or hyperlinks containing credentials) it prompts for the username and password, but pre-fills the form with the details from the URL. Nice.
What’s the status of HTTP (Basic) Authentication?
HTTP Basic Authentication and its close cousin Digest Authentication (which overcomes some of the security limitations of running Basic Authentication over an
unencrypted connection) is very much alive, but its use in hyperlinks can’t be relied upon: some browsers (e.g. IE, Safari)
completely munge such links while others don’t behave as you might expect. Other mechanisms like Bearer see widespread use in APIs, but nowhere else.
The WWW-Authenticate: and Authorization: headers are, in some ways, an example of the best possible way to implement authentication on the Web: as an
underlying standard independent of support for forms (and, increasingly, Javascript), cookies, and complex multi-part conversations. It’s easy to imagine an alternative
timeline where these standards continued to be collaboratively developed and maintained and their shortfalls – e.g. not being able to easily log out when using most graphical browsers!
– were overcome. A timeline in which one might write a login form like this, knowing that your e.g. “authenticate” attributes would instruct the browser to send credentials using an
Authorization: header:
<form method="get" action="/" authenticate="Basic">
<label for="username">Username:</label> <input type="text" id="username" authenticate="username">
<label for="password">Password:</label> <input type="text" id="password" authenticate="password">
<input type="submit" value="Log In">
</form>
In such a world, more-complex authentication strategies (e.g. multi-factor authentication) could involve encoding forms as JSON. And single-sign-on systems would simply involve the browser collecting a token from the authentication provider and passing it on to the
third-party service, directly through browser headers, with no need for backwards-and-forwards redirects with stacks of information in GET parameters as is the case today.
Client-side certificates – long a powerful but neglected authentication mechanism in their own right – could act as first class citizens directly alongside such a system, providing
transparent second-factor authentication wherever it was required. You wouldn’t have to accept a tracking cookie from a site in order to log in (or stay logged in), and if your
browser-integrated password safe supported it you could log on and off from any site simply by toggling that account’s “switch”, without even visiting the site: all you’d be changing is
whether or not your credentials would be sent when the time came.
The Web has long been on a constant push for the next new shiny thing, and that’s sometimes meant that established standards have been neglected prematurely or have failed to evolve for
longer than we’d have liked. Consider how long it took us to get the <video> and <audio> elements because the “new shiny” Flash came to dominate,
how the Web Payments API is only just beginning to mature despite over 25 years of ecommerce on the Web, or how we still can’t
use Link: headers for all the things we can use <link> elements for despite them being semantically-equivalent!
The new model for Web features seems to be that new features first come from a popular JavaScript implementation, and then eventually it evolves into a native browser feature: for example HTML form validations, which for the longest time could only be done client-side using scripting languages. I’d love to see somebody re-think HTTP Authentication in this way, but sadly we’ll never get a 100% solution in JavaScript alone: (distributed SSO is almost certainly off the table, for example, owing to cross-domain limitations).
Or maybe it’s just a problem that’s waiting for somebody cleverer than I to come and solve it. Want to give it a go?
The Cursed Computer Iceberg Meme
This is a repost promoting content originally published elsewhere. See more things Dan's reposted.
More awesome from Blackle Mori, whose praises I sung recently over The Basilisk Collection. This time we’re treated to a curated list of 182 articles demonstrating the “peculiarities and weirdness” of computers. Starting from relatively well-known memes like little Bobby Tables, the year 2038 problem, and how all web browsers pretend to be each other, we descend through the fast inverse square root (made famous by Quake III), falsehoods programmers believe about time (personally I’m more of a fan of …names, but then you might expect that), the EICAR test file, the “thank you for playing Wing Commander” EMM386 in-memory hack, The Basilisk Collection itself, and the GIF MD5 hashquine (which I’ve shared previously) before eventually reaching the esoteric depths of posuto and the nightmare that is Japanese postcodes…
Plus many, many things that were new to me and that I’ve loved learning about these last few days.
It’s definitely not a competition; it’s a learning opportunity wrapped up in the weirdest bits of the field. Have an explore and feed your inner computer science geek.
The Coolest Thing About GPS
I’m currently doing a course, through work, delivered by BetterOn Video. The aim of the course is to improve my video presentation skills, in particular my engagement with the camera and the audience.
I made this video based on the week 2 prompt “make a video 60-90 seconds long about something you’re an expert on”. The idea came from a talk I used to give at the University of Oxford.
**
Despite the fact that it’s existed for 11 years, TIL about
globstar in Bash. Set shopt -s globstar and ** globs become fully recursive, making that bulk transcode operation much simpler. 🤯
Watching Films Together… Apart
This weekend I announced and then hosted Homa Night II, an effort to use technology to help bridge the chasms that’ve formed between my diaspora of friends as a result mostly of COVID. To a lesser extent we’ve been made to feel distant from one another for a while as a result of our very diverse locations and lifestyles, but the resulting isolation was certainly compounded by lockdowns and quarantines.
Back in the day we used to have a regular weekly film night called Troma Night, named after the studio who dominated our early events and whose… genre… influenced many of our choices thereafter. We had over 300 such film nights, by my count, before I eventually left our shared hometown of Aberystwyth ten years ago. I wasn’t the last one of the Troma Night regulars to leave town, but more left before me than after.
Earlier this year I hosted Sour Grapes, a murder mystery party (an irregular highlight of our Aberystwyth social calendar, with thanks to Ruth) run entirely online using a mixture of video chat and “second screen” technologies. In some ways that could be seen as the predecessor to Homa Night, although I’d come up with most of the underlying technology to make Homa Night possible on a whim much earlier in the year!
How best to make such a thing happen? When I first started thinking about it, during the first of the UK’s lockdowns, I considered a few options:
-
Streaming video over a telemeeting service (Zoom, Google Meet, etc.)
Very simple to set up, but the quality – as anybody who’s tried this before will attest – is appalling. Being optimised for speech rather than music and sound effects gives the audio a flat, scratchy sound, video compression artefacts that are tolerable when you’re chatting to your boss are really annoying when they stop you reading a crucial subtitle, audio and video often get desynchronised in a way that’s frankly infuriating, and everybody’s download speed is limited by the upload speed of the host, among other issues. The major benefit of these platforms – full-duplex audio – is destroyed by feedback so everybody needs to stay muted while watching anyway. No thanks! -
Teleparty or a similar tool
Teleparty (formerly Netflix Party, but it now supports more services) is a pretty clever way to get almost exactly what I want: synchronised video streaming plus chat alongside. But it only works on Chrome (and some related browsers) and doesn’t work on tablets, web-enabled TVs, etc., which would exclude some of my friends. Everybody requires an account on the service you’re streaming from, potentially further limiting usability, and that also means you’re strictly limited to the media available on those platforms (and further limited again if your party spans multiple geographic distribution regions for that service). There’s definitely things I can learn from Teleparty, but it’s not the right tool for Homa Night. -
“Press play… now!”
The relatively low-tech solution might have been to distribute video files in advance, have people download them, and get everybody to press “play” at the same time! That’s at least slightly less-convenient because people can’t just “turn up”, they have to plan their attendance and set up in advance, but it would certainly have worked and I seriously considered it. There are other downsides, though: if anybody has a technical issue and needs to e.g. restart their player then they’re basically doomed in any attempt to get back in-sync again. We can do better… - A custom-made synchronised streaming service…?
So obviously I ended up implementing my own streaming service. It wasn’t even that hard. In case you want to try your own, here’s how I did it:
Media preparation
First, I used Adobe Premiere to create a video file containing both of the night’s films, bookended and separated by “filler” content to provide an introduction/lobby, an intermission, and a closing “you should have stopped watching by now” message. I made sure that the “intro” was a nice round duration (90s) and suitable for looping because I planned to hold people there until we were all ready to start the film. Thanks to Boris & Oliver for the background music!
Next, I ran the output through Handbrake to produce “web optimized” versions in 1080p and 720p output sizes. “Web optimized” in this case means that metadata gets added to the start of the file to allow it to start playing without downloading the entire file (streaming) and to allow the calculation of what-part-of-the-file corresponds to what-part-of-the-timeline: the latter, when coupled with a suitable webserver, allows browsers to “skip” to any point in the video without having to watch the intervening part. Naturally I’m encoding with H.264 for the widest possible compatibility.
Real-Time Synchronisation
To keep everybody’s viewing experience in-sync, I set up a Firebase account for the application: Firebase provides an easy-to-use Websockets
platform with built-in data synchronisation. Ignoring the authentication and chat features, there wasn’t much
shared here: just the currentTime of the video in seconds, whether or not introMode was engaged (i.e. everybody should loop the first 90 seconds, for now), and
whether or not the video was paused:
To reduce development effort, I never got around to implementing an administrative front-end; I just manually went into the Firebase database and acknowledged “my” computer as being an
administrator, after I’d connected to it, and then ran a little Javascript in my browser’s debugger to tell it to start pushing my video’s currentTime to the server every
few seconds. Anything else I needed to edit I just edited directly from the Firebase interface.
Other web clients’ had Javascript to instruct them to monitor these variables from the Firebase database and, if they were desynchronised by more than 5 seconds, “jump” to the correct point in the video file. The hard part of the code… wasn’t really that hard:
// Rewind if we're passed the end of the intro loop function introModeLoopCheck() { if (!introMode) return; if (video.currentTime > introDuration) video.currentTime = 0; } function fixPlayStatus() { // Handle "intro loop" mode if (remotelyControlled && introMode) { if (video.paused) video.play(); // always play introModeLoopCheck(); return; // don't look at the rest } // Fix current time const desync = Math.abs(lastCurrentTime - video.currentTime); if ( (video.paused && desync > DESYNC_TOLERANCE_WHEN_PAUSED) || (!video.paused && desync > DESYNC_TOLERANCE_WHEN_PLAYING) ) { video.currentTime = lastCurrentTime; } // Fix play status if (remotelyControlled) { if (lastPaused && !video.paused) { video.pause(); } else if (!lastPaused && video.paused) { video.play(); } } // Show/hide paused notification updatePausedNotification(); }
Web front-end
Finally, there needed to be a web page everybody could go to to get access to this. As I was hosting the video on S3+CloudFront anyway, I put the HTML/CSS/JS there too.
I tested in Firefox, Edge, Chrome, and Safari on desktop, and (slightly less) on Firefox, Chrome and Safari on mobile. There were a few quirks to work around, mostly to do with browsers not letting videos make sound until the page has been interacted with after the video element has been rendered, which I carefully worked-around by putting a popup “over” the video to “enable sync”, but mostly it “just worked”.
Delivery
On the night I shared the web address and we kicked off! There were a few hiccups as some people’s browsers got disconnected early on and tried to start playing the film before it was time, and one of these even when fixed ran about a minute behind the others, leading to minor spoilers leaking via the rest of us riffing about them! But on the whole, it worked. I’ve had lots of useful feedback to improve on it for the next version, and I might even try to tidy up my code a bit and open-source the results if this kind of thing might be useful to anybody else.
Endpoint Encabulator
(This video is also available on YouTube.)
I’ve been working as part of the team working on the new application framework called the Endpoint Encabulator and wanted to share with you what I think makes our project so exciting: I promise it’ll make for two minutes of your time you won’t seen forget!
Naturally, this project wouldn’t have been possible without the pioneering work that preceded it by John Hellins Quick, Bud Haggart, and others. Nothing’s invented in a vacuum. However, my fellow developers and I think that our work is the first viable encabulator implementation to provide inverse reactive data binding suitable for deployment in front of a blockchain-driven backend cache. I’m not saying that all digital content will one day be delivered through Endpoint Encabulator, but… well; maybe it will.
If the technical aspects go over your head, pass it on to a geeky friend who might be able to make use of my work. Sharing is caring!
Note #17836
Accidental Geohashing
Over the last six years I’ve been on a handful of geohashing expeditions, setting out to functionally-random GPS coordinates to see if I can get there, and documenting what I find when I do. The comic that inspired the sport was already six years old by the time I embarked on my first outing, and I’m far from the most-active member of the ‘hasher community, but I’ve a certain closeness to them as a result of my work to resurrect and host the “official” website. Either way: I love the sport.
But even when I’ve not been ‘hashing, it occurs to me that I’ve been tracking my location a lot. Three mechanisms in particular dominate:
- Google’s somewhat-invasive monitoring of my phones’ locations (which can be exported via Google Takeout)
- My personal GPSr logs (I carry the device moderately often, and it provides excellent precision)
- The personal μlogger server I’ve been running for the last few years (it’s like Google’s system, but – y’know – self-hosted, tweakable, and less-creepy)
If I could mine all of that data, I might be able to answer the question… have I ever have accidentally visited a geohashpoint?
Let’s find out.
Data mining my own movements
To begin with, I needed to get all of my data into μLogger. The Android app syncs to it automatically and uploading from my GPSr was simple. The data from Google Takeout was a little harder.
I found a setting in Google Takeout to export past location data in KML, rather than JSON, format. KML is understood by GPSBabel which can convert it into GPX. I can “cut up” the resulting GPX file using a little grep-fu (relevant xkcd?) to get month-long files and import them into μLogger. Easy!
Well.. μLogger’s web interface sometimes times-out if you upload enormous files like a whole month of Google Takeout logs. So instead I wrote a Nokogiri script to convert the GPX into SQL to inject directly into μLogger’s database.
Next, I got a set of hashpoint offsets. I only had personal positional data going back to around 2010, so I didn’t need to accommodate for the pre-2008 absence of the 30W time zone rule. I’ve had only one trip to the Southern hemisphere in that period, and I checked that manually. A little rounding and grouping in SQL gave me each graticule I’d been in on every date. Unsurprisingly, I spend most of my time in the 51 -1 graticule. Adding (or subtracting, for the Western hemisphere) the offset provided the coordinates for each graticule that I visited for the date that I was in that graticule. Nice.
The correct way to find the proximity of my positions to each geohashpoint is, of course, to use WGS84. That’s an easy thing to do if you’re using a database that supports it. My database… doesn’t. So I just used Pythagoras’ theorem to find positions I’d visited that were within 0.15° of a that day’s hashpoint.
Using Pythagoras for geopositional geometry is, of course, wrong. Why? Because the physical length of a “degree” varies dependent on latitude, and – more importantly – a degree of latitude is not the same distance as a degree of longitude. The ratio varies by latitude: only an idealised equatorial graticule would be square!
But for this case, I don’t care: the data’s going to be fuzzy and require some interpretation anyway. Not least because Google’s positioning has the tendency to, for example, spot a passing train’s WiFi and assume I’ve briefly teleported to Euston Station, which is apparently where Google thinks that hotspot “lives”.
I assumed that my algorithm would detect all of my actual geohash finds, and yes: all of these appeared as-expected in my results. This was a good confirmation that my approach worked.
And, crucially: about a dozen additional candidate points showed up in my search. Most of these – listed at the end of this post – were 50m+ away from the hashpoint and involved me driving or cycling past on a nearby road… but one hashpoint stuck out.
Hashing by accident
In August 2015 we took a trip up to Edinburgh to see a play of Ruth‘s brother Robin‘s. I don’t remember much about the play because I was on keeping-the-toddler-entertained duty and so had to excuse myself pretty early on. After the play we drove South, dropping Tom off at Lanark station.
We exited Lanark via the Hyndford Bridge… which is – according to the map – tantalisingly-close to the 2015-08-22 55 -3 hashpoint: only about 23 metres away!
That doesn’t feel quite close enough to justify retroactively claiming the geohash, tempting though it would be to use it as a vehicle to my easy geohash ribbon. Google doesn’t provide error bars for their exported location data so I can’t draw a circle of uncertainty, but it seems unlikely that I passed through this very close hashpoint.
Pity. But a fun exercise. This was the nearest of my near misses, but plenty more turned up in my search, too:
-
2013-09-28 54 -2 (9,000m)
Near a campsite on the River Eden. I drove past on the M6 with Ruth on the way to Loch Lomond for a mini-break to celebrate our sixth anniversary. I was never more than 9,000 metres from the hashpoint, but Google clearly had a moment when it couldn’t get good satellite signal and tries to trilaterate my position from cell masts and coincidentally guessed, for a few seconds, that I was much closer. There are a few such erroneous points in my data but they’re pretty obvious and easy to spot, so my manual filtering process caught them. -
2019-09-13 52 -0 (719m)
A600, near Cardington Airstrip, south of Bedford. I drove past on the A421 on my way to Three Rings‘ “GDPR Camp”, which was more fun than it sounds, I promise. -
2014-03-29 53 -1 (630m)
Spen Farm, near Bramham Interchange on the A1(M). I drove past while heading to the Nightline Association Conference to talk about Three Rings. Curiously, I came much closer to the hashpoint the previous week when I drove a neighbouring road on my way to York for my friend Matt’s wedding. -
2020-05-06 51 -1 (346m)
Inside Kidlington Police Station! Short of getting arrested, I can’t imagine how I’d easily have gotten to this one, but it’s moot anyway because I didn’t try! I’d taken the day off work to help with child-wrangling (as our normal childcare provisions had been scrambled by COVID-19), and at some point during the day we took a walk and came somewhat near to the hashpoint. -
2016-02-05 51 -1 (340m)
Garden of a house on The Moors, Kidlington. I drove past (twice) on my way to and from the kids’ old nursery. Bonus fact: the house directly opposite the one whose garden contained the hashpoint is a house that I looked at buying (and visited), once, but didn’t think it was worth the asking price. -
2017-08-30 51 -1 (318m)
St. Frieswide Farm, between Oxford and Kidlington. I cycled past on Banbury Road twice – once on my way to and once on my way from work. -
2015-01-25 51 -1 (314m)
Templar Road, Cutteslowe, Oxford. I’ve cycled and driven along this road many times, but on the day in question the closest I came was cycling past on nearby Banbury Road while on the way to work. -
2018-01-28 51 -1 (198m)
Stratfield Brake, Kidlington. I took our youngest by bike trailer this morning to his Monkey Music class: normally at this point in history Ruth would have been the one to take him, but she had a work-related event that she couldn’t miss in the morning. I cycled right by the entrance to this nature reserve: it could have been an ideal location for a geohash! -
2014-01-24 51 -1 (114m)
On the Marston Cyclepath. I used to cycle along this route on the way to and from work most days back when I lived in Marston, but by 2014 I lived in Kidlington and so I’d only cycle past the end of it. So it was that I cycled past the Linacre College of the path, around 114m away from the hashpoint, on this day. -
2015-06-10 51 -1 (112m)
Meadow near Peartree Interchange, Oxford. I stopped at the filling station on the opposite side of the roundabout, presumably to refuel a car. -
2020-02-27 51 -1 (70m)
This was a genuine attempt at a hashpoint that I failed to reach and was so sad about that I never bothered to finish writing up. The hashpoint was very close (but just out of sight of, it turns out) a geocache I’d hidden in the vicinity, and I was hopeful that I might be able to score the most-epic/demonstrable déjà vu/hash collision achievement ever, not least because I had pre-existing video evidence that I’d been at the coordinates before! Unfortunately it wasn’t to be: I had inadequate footwear for the heavy rains that had fallen in the days that preceded the expedition and I was in a hurry to get home, get changed, and go catch a train to go and see the Goo Goo Dolls in concert. So I gave up and quit the expedition. This turned out to be the right decision: going to the concert one of the last “normal” activities I got to do before the COVID-19 lockdown made everybody’s lives weird. -
2014-05-23 51 -1 (61m)
White Way, Kidlington, near the Bicester Road to Green Road footpath. I passed close by while cycling to work, but I’ve since walked through this hashpoint many times: it’s on a route that our eldest sometimes used to take when walking home from her school! With the exception only of the very-near-miss in Lanark, this was my nearest “near miss”. -
2015-08-22 55 -3 (23m)
So near, yet so far. 🙄
Winamp Skin Museum
This is a repost promoting content originally published elsewhere. See more things Dan's reposted.
Remember Winamp (especially Winamp 1/2, back when it was awesome)? Remember Winamp skins?
Webamp does.
This is not only a gallery of skins, they’re all interactive! Click into one and try it out. It really whips the llama’s ass.
Note #17583
Needed new UPS batteries.
Almost bought from @insight_uk but they require registration to checkout.
Purchased from @SourceUPSLtd instead.
Moral: having no “guest checkout” costs you business.
Syncthing
This last month or so, my digital life has been dramatically improved by Syncthing. So much so that I want to tell you about it.
I started using it last month. Basically, what it does is keeps a pair of directories on remote systems “in sync” with one another. So far, it’s like your favourite cloud storage service, albeit self-hosted and much-more customisable. But it’s got a handful of killer features that make it nothing short of a dream to work with:
- The unique identifier for a computer can be derived from its public key. Encryption comes free as part of the verification of a computer’s identity.
- You can share any number of folders with any number of other computers, point-to-point or via an intermediate proxy, and it “just works”.
- It’s super transparent: you can always see what it’s up to, you can tweak the configuration to match your priorities, and it’s open source so you can look at the engine if you like.
Here are some of the ways I’m using it:
Keeping my phone camera synced to my PC
I’ve tried a lot of different solutions for this over the years. Back in the way-back-when, like everybody else in those dark times, I used to plug my phone in using a cable to copy pictures off and sort them. Since then, I’ve tried cloud solutions from Google, Amazon, and Flickr and never found any that really “worked” for me. Their web interfaces and apps tend to be equally terrible for organising or downloading files, and I’m rarely able to simply drag-and-drop images from them into a blog post like I can from Explorer/Finder/etc.
At first, I set this up as a one-way sync, “pushing” photos and videos from my phone to my desktop PC whenever I was on an unmetered WiFi network. But then I switched it to a two-way sync, enabling me to more-easily tidy up my phone of old photos too, by just dragging them from the folder that’s synced with my phone to my regular picture storage.
Centralising my backups
Now I’ve got a fancy NAS device with tonnes of storage, it makes sense to use it as a central point for backups to run fom. Instead of having many separate backup processes running on different computers, I can just have each of them sync to the NAS, and the NAS can back everything up. Computers don’t need to be “on” at a particular time because the NAS runs all the time, so backups can use the Internet connection when it’s quietest. And in the event of a hardware failure, there’s an up-to-date on-site backup in the first instance: the cloud backup’s only needed in the event of accidental data deletion (which could be sync’ed already, of course!). Plus, integrating the sync with ownCloud running on the NAS gives easy access to my files wherever in the world I am without having to fire up a VPN or otherwise remote-in to my house.
Plus: because Syncthing can share a folder between any number of devices, the same sharing mechanism that puts my phone’s photos onto my main desktop can simultaneously be pushing them to the NAS, providing redundant connections. And it was a doddle to set up.
Maintaining my media centre’s screensaver
Since the NAS, running Jellyfin, took on most of the media management jobs previously shared between desktop computers and the media centre computer, the household media centre’s had less to do. But one thing that it does, and that gets neglected, is showing a screensaver of family photos (when it’s not being used for anything else). Historically, we’ve maintained the photos in that collection via a shared network folder, but then you’ve got credential management and firewall issues to deal with, not to mention different file naming conventions by different people (and their devices).
But simply sharing the screensaver’s photo folder with the computer of anybody who wants to contribute photos means that it’s as easy as copying the picture to a particular place. It works on whatever device they care to (computer, tablet, mobile) on any operating system, and it’s quick and seamless. I’m just using it myself, for now, but I’ll be offering it to the rest of the family soon. It’s a trivial use-case, but once you’ve got it installed it just makes sense.
In short: this month, I’m in love with Syncthing. And maybe you should be, too.
Why $FAMOUS_COMPANY switched to $HYPED_TECHNOLOGY
This is a repost promoting content originally published elsewhere. See more things Dan's reposted.
…
As we’ve mentioned in previous blog posts, the $FAMOUS_COMPANY backend has historically been developed in $UNREMARKABLE_LANGUAGE and architected on top of $PRACTICAL_OPEN_SOURCE_FRAMEWORK. To suit our unique needs, we designed and open-sourced $AN_ENGINEER_TOOK_A_MYTHOLOGY_CLASS, a highly-available, just-in-time compiler for $UNREMARKABLE_LANGUAGE.
…
Saagar Jha tells the now-familiar story of how a bunch of techbros solved their scaling problems by reinventing the wheel. And then, when that didn’t work out, moved the goalposts of success. It’s a story as old as time; or at least as old as the modern Web.
(Should’a strangled the code. Or better yet, just refactored what they had.)
Mackerelmedia Fish
Normally this kind of thing would go into the ballooning dump of “things I’ve enjoyed on the Internet” that is my reposts archive. But sometimes something is so perfect that you have to try to help it see the widest audience it can, right? And today, that thing is: Mackerelmedia Fish.
What is Mackerelmedia Fish? I’ve had a thorough and pretty complete experience of it, now, and I’m still not sure. It’s one or more (or none) of these, for sure, maybe:
- A point-and-click, text-based, or hypertext adventure?
- An homage to the fun and weird Web of yesteryear?
- A statement about the fragility of proprietary technologies on the Internet?
- An ARG set in a parallel universe in which the 1990s never ended?
- A series of surrealist art pieces connected by a loose narrative?
Rock Paper Shotgun’s article about it opens with “I don’t know where to begin with this—literally, figuratively, existentially?” That sounds about right.
What I can tell you with confident is what playing feels like. And what it feels like is the moment when you’ve gotten bored waiting for page 20 of Argon Zark to finish appear so you decide to reread your already-downloaded copy of the 1997 a.r.k bestof book, and for a moment you think to yourself: “Whoah; this must be what living in the future feels like!”
Because back then you didn’t yet have any concept that “living in the future” will involve scavenging for toilet paper while complaining that you can’t stream your favourite shows in 4K on your pocket-sized supercomputer until the weekend.
Mackerelmedia Fish is a mess of half-baked puns, retro graphics, outdated browsing paradigms and broken links. And that’s just part of what makes it great.
It’s also “a short story that’s about the loss of digital history”, its creator Nathalie Lawhead says. If that was her goal, I think she managed it admirably.
If I wasn’t already in love with the game already I would have been when I got to the bit where you navigate through the directory indexes of a series of deepening folders, choose-your-own-adventure style. Nathalie writes, of it:
One thing that I think is also unique about it is using an open directory as a choose your own adventure. The directories are branching. You explore them, and there’s text at the bottom (an htaccess header) that describes the folder you’re in, treating each directory as a landscape. You interact with the files that are in each of these folders, and uncover the story that way.
Back in the naughties I experimented with making choose-your-own-adventure games in exactly this way. I was experimenting with different media by which this kind of branching-choice game could be presented. I envisaged a project in which I’d showcase the same (or a set of related) stories through different approaches. One was “print” (or at least “printable”): came up with a Twee1-to-PDF converter to make “printable” gamebooks. A second was Web hypertext. A third – and this is the one which was most-similar to what Nathalie has now so expertly made real – was FTP! My thinking was that this would be an adventure game that could be played in a browser or even from the command line on any (then-contemporary: FTP clients aren’t so commonplace nowadays) computer. And then, like so many of my projects, the half-made version got put aside “for later” and forgotten about. My solution involved abusing the FTP protocol terribly, but it worked.
(I also looked into ways to make Gopher-powered hypertext fiction and toyed with the idea of using YouTube annotations to make an interactive story web [subsequently done amazingly by Wheezy Waiter, though the death of YouTube annotations in 2017 killed it]. And I’ve still got a prototype I’d like to get back to, someday, of a text-based adventure played entirely through your web browser’s debug console…! But time is not my friend… Maybe I ought to collaborate with somebody else to keep me on-course.)
In any case: Mackerelmedia Fish is fun, weird, nostalgic, inspiring, and surreal, and you should give it a go. You’ll need to be on a Windows or OS X computer to get everything you can out of it, but there’s nothing to stop you starting out on your mobile, I imagine.
Sso long as you’re capable of at least 800 × 600 at 256 colours and have 4MB of RAM, if you know what I mean.




