PHP 8.4 on Caddy on Debian 13… in Three Minutes

I just needed to spin up a new PHP webserver and was amazed by how fast and easy it is nowadays. I mean: Caddy already makes it pretty easy, but I was delighted to see that, since the last time I did this, the default package repositories had 100% of what I needed!

Apart from setting the hostname, creating myself a user and adding them to the sudo group, and reconfiguring sshd to my preference, I’d done nothing on this new server. And then to set up a fully-functioning PHP-powered webserver, all I needed to run (for a domain “example.com”) was:

sudo apt update && sudo apt upgrade -y
sudo apt install -y caddy php8.4-fpm
sudo mkdir -p /var/www/example.com
printf "example.com {\n"                               | sudo tee    /etc/caddy/Caddyfile
printf "  root * /var/www/example.com\n"               | sudo tee -a /etc/caddy/Caddyfile
printf "  encode zstd gzip\n"                          | sudo tee -a /etc/caddy/Caddyfile
printf "  php_fastcgi unix//run/php/php8.4-fpm.sock\n" | sudo tee -a /etc/caddy/Caddyfile
printf "  file_server\n"                               | sudo tee -a /etc/caddy/Caddyfile
printf "}\n"                                           | sudo tee -a /etc/caddy/Caddyfile
sudo service caddy restart

After that, I was able to put an index.php file into /var/www/example.com and it just worked.
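
If you want to sanity-check that Caddy really is handing requests to PHP-FPM, a quick optional smoke test (my suggestion, not part of the original setup – delete the file afterwards, since phpinfo() discloses server details):

echo '<?php phpinfo();' | sudo tee /var/www/example.com/index.php

Loading https://example.com/ should then show the familiar PHP info page.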

And when I say “just worked”, I mean with all the bells and whistles you ought to expect from Caddy. HTTPS came as standard (with a solid Qualys grade). HTTP/3 was supported with a 0-RTT handshake.

Mind blown.

Bee

As part of my efforts to reclaim the living room from the children, I’m building a new gaming PC for the playroom. She’s called Bee, and – thanks to the absolute insanity that is The Tower 300 from Thermaltake – she’s housed in one of the most bonkers PC cases I’ve ever worked in.

On a desk cluttered with computer parts, a partially-built PC stands in an irregular-hexagonal prism shaped case with vented yellow sides and a three-pane angled glass front.


A small collection of text-only websites

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

I’m not saying the plain-text is the best web experience. But it is an experience. Perfect if you like your browsing fast, simple, and readable. There are no cookie banners, pop-ups, permission prompts, autoplaying videos, or garish colour schemes.

I’m certainly not the first person to do this, so I thought it might be fun to gather a list of websites which you browse in text-only mode.

Terence Eden’s maintaining a list of websites that are presented as, or are wholly or partially available via, plain text. Obviously my own text/plain blog is among them, and is as far as I’m aware the only one to be entirely presented as text/plain.

Anyway, this inspired me to write a post of my own (on my text/plain blog, of course!), in which I ask the question: what do we consider plain text? Based on the sites in the list, Markdown is permissible as plain text (for the purposes of Terence’s list), but this implies that “plain text” is a spectrum of human-readability.

If Markdown’s fine, then presumably Gemtext would be too? How about BBCode? HTML and RTF are explicitly excluded by Terence’s rules, but I’d argue that HTML 1.0 could be more human-readable than some of the more-sophisticated dialects of BBCode (or than any Markdown that contains tables, unless those tables are laid-out in a way that specifically facilitates human-readability).

As I say in my post:

<-- More human-readable                                 Less human-readable -->
|------------|------------|------------|------------|------------|------------|
Plain text Gemtext    Markdown      BBCode      HTML 1.0    Modern HTML     RTF

This provocation is only intended to get you to think about “what does it mean for a markup language to be ‘human readable’?” Where do you draw the line?

Roomscale VR Still Rocks

Over the Christmas break I dug out my old HTC Vive VR gear, which I got way back in the Spring of 2016. Graphics card technology having come a long way1, it was now relatively simple to set up a fully-working “holodeck” in our living room with only a slight risk to the baubles on the Christmas tree.

For our younger child, this was his first experience of “roomscale VR”, which I maintain is the most magical thing about this specific kind of virtual reality. Six degrees of freedom for your head and each of your hands provides the critical level of immersion, for me.

And you know what: this ten-year-old hardware of mine still holds up and is still awesome!2

The kids and I have spent a few days dipping in and out of classics like theBlu, Beat Saber, Job Simulator, Vacation Simulator, Raw Data, and (in my case3) Half-Life: Alyx.

A tweenage girl in a black 'Hazbin Hotel' hoodie wears a VR headset; the screen behind her shows that she's drawn a picture featuring a rainbow background and the word 'CAR', while playing Job Simulator.
It doesn’t feel too heavy, but this first edition Vive sure is a big beast, isn’t it?

I’m moderately excited by the upcoming Steam Frame with its skinny headset, balanced weight, high-bandwidth wireless connectivity, foveated streaming, and built-in PC for basic gaming… but what’s with those controllers? Using AA batteries instead of built-in rechargeable ones feels like a step backwards, and the lack of a thumb “trackpad” seems a little limiting too. I’ll be waiting to see the reviews, thanks.

When I looked back at my blog to double-check that my Vive really is a decade old, I was reminded that I got it in the same month as Three Rings’ 2016 hackathon, then called “DevCamp”, near Tintern4. This amused me, because I’m returning to Tintern this year, too, although on a family holiday rather than Three Rings business. Maybe I’ll visit on a third occasion in another decade’s time, following another round of VR gaming?

Footnotes

1 The then-high-end graphics card I used to use to drive this rig got replaced many years ago… and then that replacement card in turn got replaced recently, at which point it became a hand-me-down for our media centre PC in the living room.

2 I’ve had the Vive hooked-up in the office since our house move in 2020, but there’s rarely been space for roomscale play there: just an occasional bit of Elite: Dangerous at my desk… which is still a good application of VR, but not remotely the same thing as being able to stand up and move around!

3 I figure Alyx might be a little scary/intense for the kids, but I could be wrong. I think the biggest demonstration of how immersive the game can be in VR is that somebody can watch it played on the big screen and be fine, but as soon as they’re in the headset and a Combine zombie has them pinned-down in a railway carriage, it’s suddenly way too much!

4 Where, while doing a little geocaching, I messed-up a bonus cache’s coordinate calculation, realised my mistake, brute-forced the possible answers, narrowed it down to two… and then picked the wrong one and fell off a cliff.


HTTP is not simple

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

HTTP/1 may appear simple because of several reasons: it is readable text, the most simple use case is not overly complicated and existing tools like curl and browsers help making HTTP easy to play with.

The HTTP idea and concept can perhaps still be considered simple and even somewhat ingenious, but the actual machinery is not.

[goes on to describe several specific characteristics of HTTP that make it un-simple, under the headings:

  • newlines
  • whitespace
  • end of body
  • parsing numbers
  • folding headers
  • never-implemented
  • so many headers
  • not all methods are alike
  • not all headers are alike
  • spineless browsers
  • size of the specs

]

I discovered this post late, while catching up on posts in the comp.infosystems.gemini newsgroup, but I’m glad I did because it’s excellent. Daniel Stenberg is, of course, the creator of cURL and so probably knows more about the intricacies of HTTP than virtually anybody (including, most-likely, some of the earliest contributors to its standards), and in this post he does a fantastic job of dissecting the oft-made claim that HTTP/1 is a “simple” protocol, usually based upon the argument that “if a human can speak it over telnet/netcat/etc., it’s simple”.

This argument, of course, glosses over the facts that (a) humans are not simple, and the things that we find “easy”… like reading a string of ASCII representations of digits and converting it into a representation of a number… are not necessarily easy for computers, and (b) the ways in which a human might use HTTP 0.9 through 1.1 are rarely representative of the complexities inherent in more-realistic “real world” use.

Obviously Daniel’s written about Gemini, too, and I agree with some of his points there (especially the fact that the specification intermingles the transfer protocol and the recommended markup language; ick!). There’s a reasonable rebuttal here (although it has its faults too, like how it conflates the volume of data involved in the encryption handshake with the processing overhead of repeated handshakes). But now we’re going way down the rabbithole and you didn’t come here to listen to me dissect arguments and counter-arguments about the complexities of Internet specifications that you might never use, right? (Although maybe you should: you could have been reading this blog post via Gemini, for instance…)

But if you’ve ever telnet’ted into an HTTP server and been surprised at how “simple” it was, or just have an interest in the HTTP specifications, Daniel’s post is worth a read.

The perils of doors in gamedev

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Recent discussion about the perils of doors in gamedev reminded me of a bug caused by a door in a game you may have heard of called “Half Life 2”. Are you sitting comfortably? Then I shall begin.

A Combine soldier threatening with a baton, in front of a door. Which one is a greater menace in gamedev?

What is meant to happen is a guard (spoiler alert – it’s actually Barney in disguise) bangs on a door, the door opens, he says “get in”, and then the game waits for you to enter the room before the script proceeds.

But in this case the door sort of rattled, but didn’t open, and then locked shut again. So you can’t get in the room, and the gate closed behind you, so you can’t go do anything else. The guard waits forever, pointing at the locked door, and you’re stuck.

If you watch the video, when the door unlocks and then opens, there’s a second guard standing inside the room to the left of the opening door. That guard is actually standing very slightly too close – the very corner of his bounding box intersects the door’s path as it opens. So what’s happening is the door starts to open, slightly nudges into the guard’s toe, bounces back, closes, and then automatically locks. And because there’s no script to deal with this and re-open the door, you’re stuck.

So this kicked off an even longer bug-hunt. The answer was (as with so many of my stories) good old floating point. Half Life 2 was originally shipped in 2004, and although the SSE instruction set existed, it wasn’t yet ubiquitous, so most of HL2 was compiled to use the older 8087 or x87 maths instruction set. That has a wacky grab-bag of precisions – some things are 32-bit, some are 64-bit, some are 80-bit, and exactly which precision you get in which bits of code is somewhat arcane.

Amazing thread from Tom Forsyth, reflecting on his time working at Valve. The tl;dr is that after their compiler was upgraded (to support the SSE instruction sets that had now become common in processors), subsequent builds of Half-Life 2 became unwinnable. The reason was knock-on effects from a series of precision roundings, which meant that a Combine security guard’s toe was in a slightly wrong place and the physics engine would bounce a door off him.

A proper 500-mile-email grade story, in terms of unusual bugs.

SVGs that feel like GIFs

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

The moving image below is only 49Kb and has an incredibly high resolution.

It’s similar to a GIF but instead of showing moving images, it shows moving SVGs! The best part: Github supports these in their README.md files!

Vincent D. Warmerdam, in SVGs that feel like GIFs

Got to admit, this is really cool and something I can see myself using a lot. So I installed the prerequisites:

brew install asciinema
npm install -g svg-term-cli

Then I gave it a go. I needed to use asciinema rec -f asciicast-v2 myfile.cast to record my screen into Asciinema‘s version 2 format, because the new version 3 format isn’t yet supported by svg-term-cli (but there is at least an asciinema convert command if you record in the “wrong” one).

Then I ran cat myfile.cast | svg-term --out myfile.cast.svg to convert that terminal recording into an SVG: I happened to be in a directory containing the source code of FreeDeedPoll.org.uk, so recorded myself running an 11ty development server with npm start:

Animation showing a user running npm start at a terminal and seeing an 11ty development server compile assets and run.

Amazing stuff.


A chat with 19-year-old me

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

I bumped into my 19-year-old self the other day. It was horrifying, in the same way that looking in the mirror every morning is horrifying, but with added horror on top.

I stopped him mid-stride, he wasn’t even looking at me. His attention was elsewhere. Daydreaming. I remember, I used to do a lot of that. I tapped his shoulder.

“Hey. Hi. Hello. It’s me! I mean: you.”

I wanted to pick two parts of this piece to quote, but I couldn’t.  The whole thing is great. And it’s concise – only about 1,700 words – so you should just go read it.

I wonder what conversations I’d have with my 19-year-old self. Certainly technology would come up, as it was already a huge part of my life (and, indeed, I was already publishing on the Web and even blogging), but younger-me would still certainly have been surprised by and interested in some of the changes that have happened since. High-speed, always-on cellular Internet access… cheap capacitive touchscreens… universal media streaming… the complete disappearance of CRT screens… high-speed wireless networking…

Giles tells his younger self to hold onto his vinyl collection: to retain a collection of physical media for when times get strange and ephemeral, like now. What would I say to 19-year-old me? It’s easy to fantasise about the advice you’d give your younger self, but would I even listen to myself? Possibly not! I was a stubborn young know-it-all!

Anyway, go read Giles’ post because it’s excellent.

Best Viewed at: Your Resolution!

Way back in the day, websites sometimes had banners or buttons (often 88×31 pixels, for complicated historical reasons) to indicate what screen resolution would be the optimal way to view the site. Just occasionally, you still see these today.

Four example buttons: “Best: 1024 x 768”, “Best viewed on desktop”, “Best Viewed 800 x 600”, “Best Viewed In Landscape”.

Folks who were ahead of the curve on what we’d now call “responsive design” would sometimes proudly show off that you could use any resolution, in the same way as they’d proudly state that you could use any browser1!

More example buttons: “Best viewed with eyes”, “This page is best viewed with: a computer and a monitor”, “Best viewed with YOUR browser”, “Best viewed with A COMPUTER”.

I saw a “best viewed at any size” 88×31 button recently, and it got me thinking: could we have a dynamic button that always shows the user’s current resolution as the “best” resolution? It would be like a “best viewed at any size” button… except even more so, because it says “whatever resolution you’re at… that’s perfect; nice one!”

Turns out, yes2:

Looks best at: any resolution!

Anyway, I’ve made a website: best-resolution.danq.dev. If you want a “Looks best at [whatever my visitor’s screen resolution is]” button, you can get one there.

It’s a good job I’ve already done so many stupid things on the Web, or this would make me look silly.

Footnotes

1 I was usually in the camp that felt that you ought to be able to access my site with any browser, at any resolution and colour depth, and get an acceptable and satisfactory experience. I guess I still am.

2 If you’re reading this via RSS or have JavaScript disabled then you’ll probably see an “any size” button, but if you view it on the original page with JavaScript enabled then you should see your current browser inner width and height shown on the button.

The rise of Whatever

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

A freaking excellent longread by Eevee (Evelyn Woods), lamenting the direction of popular technological progress and general enshittification of creator culture. It’s ultimately uplifting, I feel, but it’s full of bitterness until it gets there. I’ve pulled out a couple of highlights to try to get you interested, but you should just go and read the entire thing:

And so the entire Web sort of congealed around a tiny handful of gigantic platforms that everyone on the fucking planet is on at once. Sometimes there is some sort of partitioning, like Reddit. Sometimes there is not, like Twitter.

That’s… fine, I guess. Things centralize. It happens. You don’t get tubgirl spam raids so much any more, at least.

But the centralization poses a problem. See, the Web is free to look at (by default), but costs money to host. There are free hosts, yes, but those are for static things getting like a thousand visitors a day, not interactive platforms serving a hundred million. That starts to cost a bit. Picture logs being shoveled into a steam engine’s firebox, except it’s bundles of cash being shoveled into… the… uh… website hole.

I don’t want to help someone who opens with “I don’t know how to do this so I asked ChatGPT and it gave me these 200 lines but it doesn’t work”. I don’t want to know how much code wasn’t actually written by anyone. I don’t want to hear how many of my colleagues think Whatever is equivalent to their own output.

I glimpsed someone on Twitter a few days ago, also scoffing at the idea that anyone would decide not to use the Whatever machine. I can’t remember exactly what they said, but it was something like: “I created a whole album, complete with album art, in 3.5 hours. Why wouldn’t I use the make it easier machine?”

This is kind of darkly fascinating to me, because it gives rise to such an obvious question: if anyone can do that, then why listen to your music? It takes a significant chunk of 3.5 hours just to listen to an album, so how much manual work was even done here? Apparently I can just go generate an endless stream of stuff of the same quality! Why would I want your particular brand of Whatever?

Nobody seems to appreciate that if you can make a computer do something entirely on its own, then that becomes the baseline.

Do things. Make things. And then put them on your website so I can see them.

Clearly this all ties in to stuff that I’ve been thinking, lately. Expect more posts and reposts in this vein, I guess?

Internet Services^H Provider

Do you remember when your domestic ISP – Internet Service Provider – used to be an Internet Services Provider? They were only sometimes actually called that, but what I mean is: when ISPs provided more than one Internet service? Not just connectivity, but… more.

Web page listing 'Standard Services' for dial-up and leased line connections, including: user homepages, FTP, email, usenet, IRC, email-to-fax, and fax-to-email services.
One of the first ISPs I subscribed to had a “standard services” list longer than most modern ISPs’ complete services lists!

ISPs twenty years ago

It used to just be expected that your ISP would provide you with not only an Internet connection, but also some or all of:

  • A handful of email inboxes, plus SMTP relaying
  • Shared or private FTP storage1
  • Hosting for small Websites/homepages
  • Usenet access
  • Email-to-fax and/or fax-to-email services
  • Caching forward proxies (this was so-commonplace that it isn’t even listed in the “standard services” screenshot above)
  • One or more local nodes to IRC networks
  • Sometimes, licenses for useful Internet software
  • For leased-line (technically “broadband”, by the original definition) connections: a static IP address or IP pool

Stylish (for circa 2000) webpage for HoTMetaL Pro 6.0, advertising its 'unrivaled [sic] editing, site management and publishing tools'.
I don’t remember which of my early ISPs gave me a free license for HoTMetaL Pro, but I was very appreciative of it at the time.

ISPs today

The ISP I hinted at above doesn’t exist any more, after being bought out and bought out and bought out by a series of owners. But I checked the Website of the current owner to see what their “standard services” are, and discovered that they are:

  • A pretty-shit router2
  • Optional 4G backup connectivity (for an extra fee)
  • A voucher for 3 months access to a streaming service3

The connection is faster, which is something, but we’re still talking about the “baseline” for home Internet access then-versus-now. Which feels a bit galling, considering that (a) you’re clearly, objectively, getting fewer services, and (b) you’re paying more for them – a cheap basic home Internet subscription today, after accounting for inflation, seems to cost about 25% more than it did in 2000.4

Are we getting a bum deal?

An external 33.6kbps serial port dial-up modem.
Not every BBS nor ISP would ever come to support the blazing speeds of a 33.6kbps modem… but when you heard the distinctive scream of its negotiation at close to the Shannon Limit of the piece of copper dangling outside your house… it felt like you were living in the future.

Would you even want those services?

Some of them were great conveniences at the time, but perhaps not-so-much now: a caching server, FTP site, or IRC node in the building right at the end of my dial-up connection? That’s a speed boost that was welcome over a slow connection to an unencrypted service, but is redundant and ineffectual today. And if you’re still using a fax-to-email service for any purpose, then I think you have bigger problems than your ISP’s feature list!

Some of them were things I wouldn’t have recommended that you depend on, even then: tying your email and Web hosting to your connectivity provider traded one set of problems for another. A particular joy of an email address, as opposed to a postal address (or, back in the day, a phone number), is that it isn’t tied to where you live. You can move to a different town or even to a different country and still have the same email address, and that’s a great thing! But it’s not something you can guarantee if your email address is tied to the company you dial-up to from the family computer at home. A similar issue applies to Web hosting, although for a true traditional “personal home page” – a little information about yourself, and your bookmarks – it would be fine.

But some of them were things that were actually useful and I miss: honestly, it’s a pain to have to use a third-party service for newsgroup access, which used to be so-commonplace that you’d turn your nose up at an ISP that didn’t offer it as standard. A static IP being non-standard on fixed connections is a sad reminder that the ‘net continues to become less-participatory, more-centralised, and just generally more watered-down and shit: instead of your connection making you “part of” the Internet, nowadays it lets you “connect to” the Internet, which is a very different experience.5

But the Web hosting, for example, wasn’t useless. In fact, it served an important purpose in lowering the barrier to entry for people to publish their first homepage! The magical experience of being able to just FTP some files into a directory and have them be on the Web, as just a standard part of the “package” you bought-into, was a gateway to a participatory Web that’s nowadays sadly lacking.

'Setting Up your Web Site, Step by Step Instructions' page, describing use of an FTP client to upload web pages.
A page like this used to be absolutely standard on the Website6 of any ISP worth its salt.

Yeah, sure, you can set up a static site (unencumbered by any opinionated stack) for free on GitHub Pages, Neocities, or wherever, but the barrier to entry has been raised by just enough that, doubtless, there are literally millions of people who would have taken that first step… but didn’t.

And that makes me sad.

Footnotes

1 ISP-provided shared FTP servers would also frequently provide locally-available copies of Internet software essentials for a variety of platforms. This wasn’t just a time-saver – downloading Netscape Navigator from your ISP rather than from half-way across the world was much faster! – it was also a way to discover new software, curated by people like you: a smidgen of the feel of a well-managed BBS, from the comfort of your local ISP!

2 ISP-provided routers are, in my experience, pretty crap 50% of the time… although they’ve been improving over the last decade as consumers have started demanding that their WiFi works well, rather than just works.

3 These streaming services vouchers are probably just a loss-leader for the streaming service, who know that you’ll likely renew at full price afterwards.

4 Okay, in 2000 you’d also have had to pay per-minute for the dial-up call… but that money went to BT (or perhaps Mercury or KCOM), not to your ISP. But my point still stands: in a world where technology has in general gotten cheaper and backhaul capacity has become underutilised, why has the basic domestic Internet connection gotten less feature-rich and more-expensive? And often with worse customer service, to boot.

5 The problem of your connection not making you “part of” the Internet is multiplied if you suffer behind carrier-grade NAT, of course. But it feels like if we actually cared enough to commit to rolling out IPv6 everywhere we could obviate the need for that particular turd entirely. And yet… I’ll bet that the ISPs who currently use it will continue to do so, even as they offer IPv6 addresses as-standard, because they buy into their own idea that it’s what their customers want.

6 I think we can all be glad that we no longer write “Web Site” as two separate words, but you’ll note that I still usually correctly capitalise Web (it’s a proper noun: it’s the Web, innit!).


Part of the Internet, or connecting to the Internet?

Some time in the last 25 years, ISPs stopped saying they made you “part of” the Internet, just that they’d help you “connect to” the Internet.

Most people don’t need a static IP, sure. But when ISPs stopped offering FTP and WWW hosting as a standard feature (shit though it often was), they became part of the tragic process by which the Internet became centralised, and commoditised, and corporate, and just generally watered-down.

The amount of effort to “put something online” didn’t increase by a lot, but it increased by enough that millions probably missed-out on the opportunity to create their first homepage.

Lock All The Computers

I wanted a way to simultaneously lock all of the computers – a mixture of Linux, MacOS and Windows boxen – on my desk, when I’m going to step away. Here’s what I came up with:

There’s optional audio in this video, if you want it.

One button. And everything locks. Nice!

Here’s how it works:

  1. The mini keyboard is just 10 cheap mechanical keys wired up to a CH552 chip. It’s configured to send CTRL+ALT+F13 through CTRL+ALT+F221 when one of its keys is pressed.
  2. The “lock” key is captured by my KVM tool Deskflow (which I migrated to when Barrier became neglected, which in turn I migrated to when I fell out of love with Synergy). It then relays this hotkey across to all currently-connected machines2.
  3. That shortcut is captured by each recipient machine in different ways:
    • The Linux computers run LXDE, so I added a line to /etc/xdg/openbox/rc.xml to set a <keybind> that executes xscreensaver-command -lock (see the sketch after this list).
    • For the Macs, I created a Quick Action in Automator that runs pmset displaysleepnow as a shell script3, and then connected that via Keyboard Shortcuts > Services.
    • On the Windows box, I’ve got AutoHotKey running anyway, so I just have it run { DllCall("LockWorkStation") } when it hears the keypress.
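
For illustration, that Openbox <keybind> looks roughly like this – a minimal sketch, assuming your X server reports the key as F13 (Openbox spells Ctrl+Alt as “C-A”) – it goes inside the <keyboard> section of rc.xml:

<!-- Run the xscreensaver lock command when Ctrl+Alt+F13 is pressed -->
<keybind key="C-A-F13">
  <action name="Execute">
    <command>xscreensaver-command -lock</command>
  </action>
</keybind>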

That’s all there is to it! A magic “lock all my computers, I’m stepping away” button, that’s much faster and more-convenient than locking two to five computers individually.

Footnotes

1 F13 through F24 are absolutely valid “standard” key assignments, of course: it’s just that the vast majority of keyboards don’t have keys for them! This makes them excellent candidates for non-clashing personal-use function keys, but I like to append one or more modifier keys to them as well, to be absolutely certain that I don’t interact with things I didn’t intend to!

2 Some of the other buttons on my mini keyboard are mapped to “jumping” my cursor to particular computers (if I lose it, which happens more often than I’d like to admit), and “locking” my cursor to the system it’s on.

3 These boxes are configured to lock as soon as the screen blanks; if yours don’t then you might need a more-sophisticated script.

Zip It – Finding File Similarity Using Compression Utilities

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

This was an enjoyable video. Nothing cutting-edge, but a description of an imaginative use of an everyday algorithm – DEFLATE, which is what powers most of the things you consider “ZIP files” – to do pattern-matching and comparison between two files. The tl;dr is pretty simple:

  • Lossless compression works by looking for repetition, and replacing the longest/most-repeated content with references to a lookup table.
  • Therefore, the reduction-in-size from compressing a file is an indicator of the amount of repetition within it.
  • Therefore, the difference between the size-reduction from compressing a single file and the size-reduction from compressing a pair of files together is indicative of their similarity, because the greatest compression gains come from repetition of data that is shared across both files (see the sketch after this list).
  • This can be used, for example, to compare the same document written in two languages as an indication of the similarity of the languages to one another, or to compare the genomes of two organisms as an indication of their genetic similarity (and therefore how closely-related they are).
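
As a minimal sketch of that pairwise comparison – the file names here are hypothetical, with gzip standing in for your preferred DEFLATE implementation – this is roughly the “Normalised Compression Distance” calculation, done at a bash shell:

# Compressed size of each file alone, then of the two concatenated
a=$(gzip -c english.txt | wc -c)
b=$(gzip -c french.txt | wc -c)
ab=$(cat english.txt french.txt | gzip -c | wc -c)
# NCD = (C(xy) - min(C(x),C(y))) / max(C(x),C(y)): nearer 0 = more similar
min=$(( a < b ? a : b )); max=$(( a > b ? a : b ))
echo "scale=3; ($ab - $min) / $max" | bc

A result near 0 suggests the two files share a lot of repeated content; near 1, almost none.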

I love it when somebody finds a clever and novel use for an everyday tool.

Dan Has Too Many Monitors

My new employer sent me a laptop and a monitor, which I immediately added to my already pretty-heavily-loaded desk. Wanna see?

It’s a video. Click it to play it, of course.