Accidental Geohashing

Over the last six years I’ve been on a handful of geohashing expeditions, setting out to functionally-random GPS coordinates to see if I can get there, and documenting what I find when I do. The comic that inspired the sport was already six years old by the time I embarked on my first outing, and I’m far from the most-active member of the ‘hasher community, but I’ve a certain closeness to them as a result of my work to resurrect and host the “official” website. Either way: I love the sport.

Dan, Ruth, and baby Annabel at geohashpoint 2014-04-21 51 -1
I even managed to drag-along Ruth and Annabel to a hashpoint (2014-04-21 51 -1) once.

But even when I’ve not been ‘hashing, it occurs to me that I’ve been tracking my location a lot. Three mechanisms in particular dominate:

  • Google’s somewhat-invasive monitoring of my phones’ locations (which can be exported via Google Takeout)
  • My personal GPSr logs (I carry the device moderately often, and it provides excellent precision)
  • The personal μlogger server I’ve been running for the last few years (it’s like Google’s system, but – y’know – self-hosted, tweakable, and less-creepy)

If I could mine all of that data, I might be able to answer the question… have I ever have accidentally visited a geohashpoint?

Let’s find out.

KML from Google is coverted into GPX where it joins GPX from my GPSr and real-time position data from uLogger in a MySQL database. This is queried against historic hashpoints to produce a list of candidate accidental hashpoints.
There’s a lot to my process, but it’s technically quite simple.

Data mining my own movements

To begin with, I needed to get all of my data into μLogger. The Android app syncs to it automatically and uploading from my GPSr was simple. The data from Google Takeout was a little harder.

I found a setting in Google Takeout to export past location data in KML, rather than JSON, format. KML is understood by GPSBabel which can convert it into GPX. I can “cut up” the resulting GPX file using a little grep-fu (relevant xkcd?) to get month-long files and import them into μLogger. Easy!

Requesting KML rather than JSON from Google Takeout
It’s slightly hidden, but Google Takeout choose your geoposition output format (from a limited selection).

Well.. μLogger’s web interface sometimes times-out if you upload enormous files like a whole month of Google Takeout logs. So instead I wrote a Nokogiri script to convert the GPX into SQL to inject directly into μLogger’s database.

Next, I got a set of hashpoint offsets. I only had personal positional data going back to around 2010, so I didn’t need to accommodate for the pre-2008 absence of the 30W time zone rule. I’ve had only one trip to the Southern hemisphere in that period, and I checked that manually. A little rounding and grouping in SQL gave me each graticule I’d been in on every date. Unsurprisingly, I spend most of my time in the 51 -1 graticule. Adding (or subtracting, for the Western hemisphere) the offset provided the coordinates for each graticule that I visited for the date that I was in that graticule. Nice.

SQL retreiving hashpoints for every graticule I've been to in the last 10 years, grouped by date.
Preloading the offsets into a temporary table made light work of listing all the hashpoints in all the graticules I’d visited, by date. Note that some dates (e.g. 2011-08-04, above) saw me visit multiple graticules.

The correct way to find the proximity of my positions to each geohashpoint is, of course, to use WGS84. That’s an easy thing to do if you’re using a database that supports it. My database… doesn’t. So I just used Pythagoras’ theorem to find positions I’d visited that were within 0.15° of a that day’s hashpoint.

Using Pythagoras for geopositional geometry is, of course, wrong. Why? Because the physical length of a “degree” varies dependent on latitude, and – more importantly – a degree of latitude is not the same distance as a degree of longitude. The ratio varies by latitude: only an idealised equatorial graticule would be square!

But for this case, I don’t care: the data’s going to be fuzzy and require some interpretation anyway. Not least because Google’s positioning has the tendency to, for example, spot a passing train’s WiFi and assume I’ve briefly teleported to Euston Station, which is apparently where Google thinks that hotspot “lives”.

GIF animation comparing routes recorded by Google My Location with those recorded by my GPSr: they're almost identical
I overlaid randomly-selected Google My Location and GPSr routes to ensure that they coincided, as an accuracy-test. It’s interesting to note that my GPSr points cluster when I was moving slower, suggesting it polls on a timer. Conversely Google’s points cluster when I was using data (can you see the bit where I used a chat app), suggesting that Google Location Services ramps up the accuracy and poll frequency when you’re actively using your device.

I assumed that my algorithm would detect all of my actual geohash finds, and yes: all of these appeared as-expected in my results. This was a good confirmation that my approach worked.

And, crucially: about a dozen additional candidate points showed up in my search. Most of these – listed at the end of this post – were 50m+ away from the hashpoint and involved me driving or cycling past on a nearby road… but one hashpoint stuck out.

Hashing by accident

Annabel riding on Tom's shoulders in Edinburgh.
We all had our roles to play in our trip to Edinburgh. Tom… was our pack mule.

In August 2015 we took a trip up to Edinburgh to see a play of Ruth‘s brother Robin‘s. I don’t remember much about the play because I was on keeping-the-toddler-entertained duty and so had to excuse myself pretty early on. After the play we drove South, dropping Tom off at Lanark station.

We exited Lanark via the Hyndford Bridge… which is – according to the map – tantalisingly-close to the 2015-08-22 55 -3 hashpoint: only about 23 metres away!

The 2015-08-22 55 -3
Google puts the centre of the road I drove down only 23m from the 2015-08-22 55 -3 hashpoint (of course, I was actually driving on the near side of the road and may have been closer still).

That doesn’t feel quite close enough to justify retroactively claiming the geohash, tempting though it would be to use it as a vehicle to my easy geohash ribbon. Google doesn’t provide error bars for their exported location data so I can’t draw a circle of uncertainty, but it seems unlikely that I passed through this very close hashpoint.

Pity. But a fun exercise. This was the nearest of my near misses, but plenty more turned up in my search, too:

  1. 2013-09-28 54 -2 (9,000m)
    Near a campsite on the River Eden. I drove past on the M6 with Ruth on the way to Loch Lomond for a mini-break to celebrate our sixth anniversary. I was never more than 9,000 metres from the hashpoint, but Google clearly had a moment when it couldn’t get good satellite signal and tries to trilaterate my position from cell masts and coincidentally guessed, for a few seconds, that I was much closer. There are a few such erroneous points in my data but they’re pretty obvious and easy to spot, so my manual filtering process caught them.
  2. 2019-09-13 52 -0 (719m)
    A600, near Cardington Airstrip, south of Bedford. I drove past on the A421 on my way to Three Rings‘ “GDPR Camp”, which was more fun than it sounds, I promise.
  3. 2014-03-29 53 -1 (630m)
    Spen Farm, near Bramham Interchange on the A1(M). I drove past while heading to the Nightline Association Conference to talk about Three Rings. Curiously, I came much closer to the hashpoint the previous week when I drove a neighbouring road on my way to York for my friend Matt’s wedding.
  4. 2020-05-06 51 -1 (346m)
    Inside Kidlington Police Station! Short of getting arrested, I can’t imagine how I’d easily have gotten to this one, but it’s moot anyway because I didn’t try! I’d taken the day off work to help with child-wrangling (as our normal childcare provisions had been scrambled by COVID-19), and at some point during the day we took a walk and came somewhat near to the hashpoint.
  5. 2016-02-05 51 -1 (340m)
    Garden of a house on The Moors, Kidlington. I drove past (twice) on my way to and from the kids’ old nursery. Bonus fact: the house directly opposite the one whose garden contained the hashpoint is a house that I looked at buying (and visited), once, but didn’t think it was worth the asking price.
  6. 2017-08-30 51 -1 (318m)
    St. Frieswide Farm, between Oxford and Kidlington. I cycled past on Banbury Road twice – once on my way to and once on my way from work.
  7. 2015-01-25 51 -1 (314m)
    Templar Road, Cutteslowe, Oxford. I’ve cycled and driven along this road many times, but on the day in question the closest I came was cycling past on nearby Banbury Road while on the way to work.
  8. 2018-01-28 51 -1 (198m)
    Stratfield Brake, Kidlington. I took our youngest by bike trailer this morning to his Monkey Music class: normally at this point in history Ruth would have been the one to take him, but she had a work-related event that she couldn’t miss in the morning. I cycled right by the entrance to this nature reserve: it could have been an ideal location for a geohash!
  9. 2014-01-24 51 -1 (114m)
    On the Marston Cyclepath. I used to cycle along this route on the way to and from work most days back when I lived in Marston, but by 2014 I lived in Kidlington and so I’d only cycle past the end of it. So it was that I cycled past the Linacre College of the path, around 114m away from the hashpoint, on this day.
  10. 2015-06-10 51 -1 (112m)
    Meadow near Peartree Interchange, Oxford. I stopped at the filling station on the opposite side of the roundabout, presumably to refuel a car.
  11. 2020-02-27 51 -1 (70m)
    This was a genuine attempt at a hashpoint that I failed to reach and was so sad about that I never bothered to finish writing up. The hashpoint was very close (but just out of sight of, it turns out) a geocache I’d hidden in the vicinity, and I was hopeful that I might be able to score the most-epic/demonstrable déjà vu/hash collision achievement ever, not least because I had pre-existing video evidence that I’d been at the coordinates before! Unfortunately it wasn’t to be: I had inadequate footwear for the heavy rains that had fallen in the days that preceded the expedition and I was in a hurry to get home, get changed, and go catch a train to go and see the Goo Goo Dolls in concert. So I gave up and quit the expedition. This turned out to be the right decision: going to the concert one of the last “normal” activities I got to do before the COVID-19 lockdown made everybody’s lives weird.
  12. 2014-05-23 51 -1 (61m)
    White Way, Kidlington, near the Bicester Road to Green Road footpath. I passed close by while cycling to work, but I’ve since walked through this hashpoint many times: it’s on a route that our eldest sometimes used to take when walking home from her school! With the exception only of the very-near-miss in Lanark, this was my nearest “near miss”.
  13. 2015-08-22 55 -3 (23m)
    So near, yet so far. 🙄
Rain in Lanark, seen through a car window
No silly grin, but coincidentally – perhaps by accident – I took a picture out of the car window shortly after we passed the hashpoint. This is what Lanark looks like when you drive through it in the rain.

Syncthing

This last month or so, my digital life has been dramatically improved by Syncthing. So much so that I want to tell you about it.

Syncthing interface via Synctrayzor on Windows, showing Dan's syncs.
1.25TiB of data is automatically kept in sync between (depending on the data in question) a desktop PC, NAS, media centre, and phone. This computer’s using the Synctrayzor system tray app.

I started using it last month. Basically, what it does is keeps a pair of directories on remote systems “in sync” with one another. So far, it’s like your favourite cloud storage service, albeit self-hosted and much-more customisable. But it’s got a handful of killer features that make it nothing short of a dream to work with:

  • The unique identifier for a computer can be derived from its public key. Encryption comes free as part of the verification of a computer’s identity.
  • You can share any number of folders with any number of other computers, point-to-point or via an intermediate proxy, and it “just works”.
  • It’s super transparent: you can always see what it’s up to, you can tweak the configuration to match your priorities, and it’s open source so you can look at the engine if you like.

Here are some of the ways I’m using it:

Keeping my phone camera synced to my PC

Phone syncing with PC

I’ve tried a lot of different solutions for this over the years. Back in the way-back-when, like everybody else in those dark times, I used to plug my phone in using a cable to copy pictures off and sort them. Since then, I’ve tried cloud solutions from Google, Amazon, and Flickr and never found any that really “worked” for me. Their web interfaces and apps tend to be equally terrible for organising or downloading files, and I’m rarely able to simply drag-and-drop images from them into a blog post like I can from Explorer/Finder/etc.

At first, I set this up as a one-way sync, “pushing” photos and videos from my phone to my desktop PC whenever I was on an unmetered WiFi network. But then I switched it to a two-way sync, enabling me to more-easily tidy up my phone of old photos too, by just dragging them from the folder that’s synced with my phone to my regular picture storage.

Centralising my backups

Phone and desktop backups centralised through the NAS

Now I’ve got a fancy NAS device with tonnes of storage, it makes sense to use it as a central point for backups to run fom. Instead of having many separate backup processes running on different computers, I can just have each of them sync to the NAS, and the NAS can back everything up. Computers don’t need to be “on” at a particular time because the NAS runs all the time, so backups can use the Internet connection when it’s quietest. And in the event of a hardware failure, there’s an up-to-date on-site backup in the first instance: the cloud backup’s only needed in the event of accidental data deletion (which could be sync’ed already, of course!). Plus, integrating the sync with ownCloud running on the NAS gives easy access to my files wherever in the world I am without having to fire up a VPN or otherwise remote-in to my house.

Plus: because Syncthing can share a folder between any number of devices, the same sharing mechanism that puts my phone’s photos onto my main desktop can simultaneously be pushing them to the NAS, providing redundant connections. And it was a doddle to set up.

Maintaining my media centre’s screensaver

PC photos syncing to the media centre.

Since the NAS, running Jellyfin, took on most of the media management jobs previously shared between desktop computers and the media centre computer, the household media centre’s had less to do. But one thing that it does, and that gets neglected, is showing a screensaver of family photos (when it’s not being used for anything else). Historically, we’ve maintained the photos in that collection via a shared network folder, but then you’ve got credential management and firewall issues to deal with, not to mention different file naming conventions by different people (and their devices).

But simply sharing the screensaver’s photo folder with the computer of anybody who wants to contribute photos means that it’s as easy as copying the picture to a particular place. It works on whatever device they care to (computer, tablet, mobile) on any operating system, and it’s quick and seamless. I’m just using it myself, for now, but I’ll be offering it to the rest of the family soon. It’s a trivial use-case, but once you’ve got it installed it just makes sense.

In short: this month, I’m in love with Syncthing. And maybe you should be, too.

Why $FAMOUS_COMPANY switched to $HYPED_TECHNOLOGY

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

( )

As we’ve mentioned in previous blog posts, the $FAMOUS_COMPANY backend has historically been developed in $UNREMARKABLE_LANGUAGE and architected on top of $PRACTICAL_OPEN_SOURCE_FRAMEWORK. To suit our unique needs, we designed and open-sourced $AN_ENGINEER_TOOK_A_MYTHOLOGY_CLASS, a highly-available, just-in-time compiler for $UNREMARKABLE_LANGUAGE.

Saagar Jha tells the now-familiar story of how a bunch of techbros solved their scaling problems by reinventing the wheel. And then, when that didn’t work out, moved the goalposts of success. It’s a story as old as time; or at least as old as the modern Web.

(Should’a strangled the code. Or better yet, just refactored what they had.)

Mackerelmedia Fish

Normally this kind of thing would go into the ballooning dump of “things I’ve enjoyed on the Internet” that is my reposts archive. But sometimes something is so perfect that you have to try to help it see the widest audience it can, right? And today, that thing is: Mackerelmedia Fish.

Mackerelmedia Fish reports: WARNING! Your Fish have escaped!
Historical fact: escaped fish was one of the primary reasons for websites failing in 1996.

What is Mackerelmedia Fish? I’ve had a thorough and pretty complete experience of it, now, and I’m still not sure. It’s one or more (or none) of these, for sure, maybe:

  • A point-and-click, text-based, or hypertext adventure?
  • An homage to the fun and weird Web of yesteryear?
  • A statement about the fragility of proprietary technologies on the Internet?
  • An ARG set in a parallel universe in which the 1990s never ended?
  • A series of surrealist art pieces connected by a loose narrative?

Rock Paper Shotgun’s article about it opens with “I don’t know where to begin with this—literally, figuratively, existentially?” That sounds about right.

I stared into THE VOID and am OK!
This isn’t the reward for “winning” the “game”. But I was proud of it anyway.

What I can tell you with confident is what playing feels like. And what it feels like is the moment when you’ve gotten bored waiting for page 20 of Argon Zark to finish appear so you decide to reread your already-downloaded copy of the 1997 a.r.k bestof book, and for a moment you think to yourself: “Whoah; this must be what living in the future feels like!”

Because back then you didn’t yet have any concept that “living in the future” will involve scavenging for toilet paper while complaining that you can’t stream your favourite shows in 4K on your pocket-sized supercomputer until the weekend.

Dancing... thing?
I was always more of a Bouncing Blocks than a Hamster Dance guy, anyway.

Mackerelmedia Fish is a mess of half-baked puns, retro graphics, outdated browsing paradigms and broken links. And that’s just part of what makes it great.

It’s also “a short story that’s about the loss of digital history”, its creator Nathalie Lawhead says. If that was her goal, I think she managed it admirably.

An ASCII art wizard on a faux Apache directory listing page.
Everything about this, right down to the server signature (Artichoke), is perfect.

If I wasn’t already in love with the game already I would have been when I got to the bit where you navigate through the directory indexes of a series of deepening folders, choose-your-own-adventure style. Nathalie writes, of it:

One thing that I think is also unique about it is using an open directory as a choose your own adventure. The directories are branching. You explore them, and there’s text at the bottom (an htaccess header) that describes the folder you’re in, treating each directory as a landscape. You interact with the files that are in each of these folders, and uncover the story that way.

Back in the naughties I experimented with making choose-your-own-adventure games in exactly this way. I was experimenting with different media by which this kind of branching-choice game could be presented. I envisaged a project in which I’d showcase the same (or a set of related) stories through different approaches. One was “print” (or at least “printable”): came up with a Twee1-to-PDF converter to make “printable” gamebooks. A second was Web hypertext. A third – and this is the one which was most-similar to what Nathalie has now so expertly made real – was FTP! My thinking was that this would be an adventure game that could be played in a browser or even from the command line on any (then-contemporary: FTP clients aren’t so commonplace nowadays) computer. And then, like so many of my projects, the half-made version got put aside “for later” and forgotten about. My solution involved abusing the FTP protocol terribly, but it worked.

(I also looked into ways to make Gopher-powered hypertext fiction and toyed with the idea of using YouTube annotations to make an interactive story web [subsequently done amazingly by Wheezy Waiter, though the death of YouTube annotations in 2017 killed it]. And I’ve still got a prototype I’d like to get back to, someday, of a text-based adventure played entirely through your web browser’s debug console…! But time is not my friend… Maybe I ought to collaborate with somebody else to keep me on-course.)

Three virtual frogs. One needs a hug.
My first batch of pet frogs died quite quickly, but these ones did okay.

In any case: Mackerelmedia Fish is fun, weird, nostalgic, inspiring, and surreal, and you should give it a go. You’ll need to be on a Windows or OS X computer to get everything you can out of it, but there’s nothing to stop you starting out on your mobile, I imagine.

Sso long as you’re capable of at least 800 × 600 at 256 colours and have 4MB of RAM, if you know what I mean.

Avoid rewriting a legacy system from scratch, by strangling it

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

( )

Sometimes, code is risky to change and expensive to refactor.

In such a situation, a seemingly good idea would be to rewrite it.

From scratch.

Here’s how it goes:

  1. You discuss with management about the strategy of stopping new features for some time, while you rewrite the existing app.
  2. You estimate the rewrite will take 6 months to cover what the existing app does.
  3. A few months in, a nasty bug is discovered and ABSOLUTELY needs to be fixed in the old code. So you patch the old code and the new one too.
  4. A few months later, a new feature has been sold to the client. It HAS TO BE implemented in the old code—the new version is not ready yet! You need to go back to the old code but also add a TODO to implement this in the new version.
  5. After 5 months, you realize the project will be late. The old app was doing way more things than expected. You start hustling more.
  6. After 7 months, you start testing the new version. QA raises up a lot of things that should be fixed.
  7. After 9 months, the business can’t stand “not developing features” anymore. Leadership is not happy with the situation, you are tired. You start making changes to the old, painful code while trying to keep up with the rewrite.
  8. Eventually, you end up with the 2 systems in production. The long-term goal is to get rid of the old one, but the new one is not ready yet. Every feature needs to be implemented twice.

Sounds fictional? Or familiar?

Don’t be shamed, it’s a very common mistake.

I’ve rewritten legacy systems from scratch before. Sometimes it’s all worked out, and sometimes it hasn’t, but either way: it’s always been a lot more work than I could have possibly estimated. I’ve learned now to try to avoid doing so: at least, to avoid replacing a single monolithic (living) system in a monolithic way. Nicholas gives an even-better description of the true horror of legacy reimplementation, and promotes progressive strangulation as a candidate solution.

Pay Up, Or We’ll Make Google Ban Your Ads

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

( )

A new email-based extortion scheme apparently is making the rounds, targeting Web site owners serving banner ads through Google’s AdSense program. In this scam, the fraudsters demand bitcoin in exchange for a promise not to flood the publisher’s ads with so much bot and junk traffic that Google’s automated anti-fraud systems suspend the user’s AdSense account for suspicious traffic.

The shape of our digital world grows increasingly strange. As anti-DoS techniques grow better and more and more uptime-critical websites hide behind edge caches, zombie network operators remain one step ahead and find new and imaginative ways to extort money from their victims. In this new attack, the criminal demands payment (in cryptocurrency) under threat that, if it’s not delivered, they’ll unleash an army of bots to act like the victim trying to scam their advertising network, thereby getting the victim’s site demonetised.

Difference Between VR and AR

This weekend, my sister Sarah challenged me to define the difference between Virtual Reality and Augmented Reality. And the more I talked about the differences between them, the more I realised that I don’t have a concrete definition, and I don’t think that anybody else does either.

A man wearing a VR headset while writing on a whiteboard.
AR: the man sees simulated reality and, probably, the whiteboard.
VR: the man sees simulated reality only, which may or may not include a whiteboard.
Either way: what the hell is he doing?

After all: from a technical perspective, any fully-immersive AR system – for example a hypothetical future version of the Microsoft Hololens that solves the current edition’s FOV problems – exists in a theoretical superset of any current-generation VR system. That AR augments the reality you can genuinely see, rather than replacing it entirely, becomes irrelevant if that AR system could superimpose a virtual environment covering your entire view. So the argument that compared to VR, AR only covers part of your vision is not a reliable definition of the difference.

Spectrum showing AR view of the Louvre on the left (with 5% of the view occluded by UI) and VR view of a 3D model of the Louvre on the right (with 100% of the view occluded). The middle of the spectrum remains undefined.
The difference by how much of your view is occluded by machine-generated images fails to define where the boundary between AR and VR lies. 5%? 50%? 99%?

This isn’t a new conundrum. Way back in 1994 back when the Sega VR-1 was our idea of cutting edge, Milgram et al. developed a series of metaphorical spectra to describe the relationship between different kinds of “mixed reality” systems. The core difference, they argue, is whether or not the computer-generated content represents a “world” in itself (VR) is just an “overlay” (AR).

But that’s unsatisfying for the same reason as above. The HTC Vive headset can be configured to use its front-facing camera(s) to fade seamlessly from the game world to the real world as the player gets close to the boundaries of their play space. This is a safety feature, but it doesn’t have to be: there’s no reason that a HTC Vive couldn’t be adapted to function as what Milgram would describe as a “class 4” device, which is functionally the same as a headset-mounted AR device. So what’s the difference?

You might argue that the difference between AR and VR is content-based: that is, it’s the thing that you’re expected to focus on that dictates which is which. If you’re expected to look at the “real world”, it’s an augmentation, and if not then it’s a virtualisation. But that approach fails to describe Google’s tech demo of putting artefacts in your living room via augmented reality (which I’ve written about before), because your focus is expected to be on the artefact rather than the “real world” around it. The real world only exists to help with the interpretation of scale: it’s not what the experience is about and your countertop is as valid a real world target as the Louvre: Google doesn’t care.

Milgram et al. (1994)'s Table 1: Some major differences between classes of Mixed Reality (MR) displays, showing seven classes: 1. monitor-based video with CG overlays, 2. headset-based video with CG overlays, 3. headset-based see-through display with CG overlays, 4. headset-based see-through video with CG overlays, 5. monitor/CG world with video overlays, 6. headset-based CG world with video overlays, and 7. CG-based world with real object intervention.
Categories 3 and 4 would probably be used to describe most contemporary AR; categories 6 and 7 to describe contemporary VR.

Many researchers echo Milgram’s idea that what turns AR into VR is when the computer-generated content completely covers your vision.

But even if we accept this explanation, the definition gets muddied by the wider field of “extended reality” (XR). Originally an umbrella term to cover both AR and VR (and “MR“, if you believe that’s a separate and independent thing), XR gets used to describe interactive experiences that cover other senses, too. If I play a VR game with real-world “props” that I can pick up and move around, but that appear differently in my vision, am I not “augmenting” reality? Is my experience, therefore, more or less “VR” than if the interactive objects exist only on my screen? What about if – as in a recent VR escape room I attended – the experience is enhanced by fans to simulate the movement of air around you? What about smell? (You know already that somebody’s working on bridging virtual reality with Smell-O-Vision.)

Dan and Robin concluding a VR Escape Room.
Not sure if you’re real or if you’re dreaming? Don’t ask an XR researcher; they don’t know either.

Increasingly, then, I’m beginning to feel that XR itself is a spectrum, and a pretty woolly one. Just as it’s hard to specify in a concrete way where the boundary exists between being asleep and being awake, it’s hard to mark where “our” reality gives way to the virtual and vice-versa.

It’s based upon the addition of information to our senses, by a computer, and there can be more (as in fully-immersive VR) or less (as in the subtle application of AR) of it… but the edges are very fuzzy. I guess that the spectrum of the visual experience of XR might look a little like this:

Comic showing different levels of "XR-ness", from the most to the least: The Matrix has you now, fully immersive VR, headset AR, "Magic Window" AR, conventional videogaming, HUDs, using google maps while walking, imagining playing Tetris, reality.

Honestly, I don’t know any more. But I don’t think my sister does either.

Wacom drawing tablets track the name of every application that you open

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

( )

I don’t care whether anything materially bad will or won’t happen as a consequence of Wacom taking this data from me. I simply resent the fact that they’re doing it.

The second is that we can also come up with scenarios that involve real harms. Maybe the very existence of a program is secret or sensitive information. What if a Wacom employee suddenly starts seeing entries spring up for “Half Life 3 Test Build”? Obviously I don’t care about the secrecy of Valve’s new games, but I assume that Valve does.

We can get more subtle. I personally use Google Analytics to track visitors to my website. I do feel bad about this, but I’ve got to get my self-esteem from somewhere. Google Analytics has a “User Explorer” tool, in which you can zoom in on the activity of a specific user. Suppose that someone at Wacom “fingerprints” a target person that they knew in real life by seeing that this person uses a very particular combination of applications. The Wacom employee then uses this fingerprint to find the person in the “User Explorer” tool. Finally the Wacom employee sees that their target also uses “LivingWith: Cancer Support”.

Remember, this information is coming from a device that is essentially a mouse.

Interesting deep-dive investigation into the (immoral, grey-area illegal) data mining being done by Wacom when you install the drivers for their tablets. Horrifying, but you’ve got to remember that Wacom are unlikely to be a unique case. I had a falling out with Razer the other year when they started bundling spyware into the drivers for their keyboards and locking-out existing and new customers from advanced features unless they consented to data harvesting.

I’m becoming increasingly concerned by the normalisation of surveillance capitalism: between modern peripherals and the Internet of Things, we’re “willingly” surrendering more of our personal lives than ever before. If you haven’t seen it, I’d also thoroughly recommend Data, the latest video from Philosophy Tube (of which I’ve sung the praises before).

Spoiled by the Web

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

( )

Back in 2016, I made an iMessage app called Overreactions. Actually, the term “app” is probably generous: It’s a collection of static and animated silly faces you can goof around with in iMessage. Its “development” involved many PNGs but zero lines of code.

Just before the 2019 holidays, I received an email from Apple notifying me that the app “does not follow one or more of the App Store Review Guidelines.” I signed in to Apple’s Resource Center, where it elaborated that the app had gone too long without an update. There were no greater specifics, no broken rules or deprecated dependencies, they just wanted some sort of update to prove that it was still being maintained or they’d pull the app from the store in December.

Here’s what it took to keep that project up and running…

There’s always a fresh argument about Web vs. native (alongside all the rehashed ones, of course). But here’s one you might not have heard before: nobody ever wrote a Web page that met all the open standards only to be told that they had to re-compile it a few years later for no reason other than that the browser manufacturers wanted to check that the author was still alive.

But that’s basically what happened here. The author of an app which had been (and still did) work fine was required to re-install the development environment and toolchain, recompile, and re-submit a functionally-identical version of their app (which every user of the app then had to re-download along with their other updates)… just because Apple think that an app shouldn’t ever go more than 3 years between updates.

Fox (Household NAS)

Last week I built Fox, the newest addition to our home network. Fox, whose specification called for not one, not two, not three but four 12 terabyte hard disk drives was built principally as a souped-up NAS device – a central place for us all to safely hold and control access to important files rather than having them spread across our various devices – but she’s got a lot more going on that that, too.

A black computer "cube" nestled under a desk, amongst cables.
Right now, Fox lives under my desk along with most of our network cables.

Fox has:

  • Enough hard drive space to give us 36TB of storage capacity plus 12TB of parity, allowing any one of the drives to fail without losing any data.
  • “Headroom” sufficient to double its capacity in the future without significant effort.
  • A mediumweight graphics card to assist with real-time transcoding, helping her to convert and stream audio and videos to our devices in whatever format they prefer.
  • A beefy processor and sufficient RAM to run a dozen virtual machines supporting a variety of functions like software development, media ripping and cataloguing, photo rescaling, reverse-proxying, and document scanning (a planned future purpose for Fox is to have a network-enabled scanner near our “in-trays” so that we can digitise and OCR all of our post and paperwork into a searchable, accessible, space-saving collection).
QFlix (media selection) menu showing on a TV
“QFlix” is a lot like Netflix, except geared mostly towards saving us from having to walk over to the DVD shelf or remember which disc we were up to when watching a long-running series. #firstworldproblems

The last time I filmed myself building a PC was when I built Cosmo, a couple of desktops ago. He turned out to be a bit of a nightmare: he was my first fully-watercooled computer and he leaked everywhere: by the time I’d done all the fixing and re-fixing to make him behave nicely, I wasn’t happy with the video footage and I never uploaded it. I’d been wary, almost-superstitious, about filming a build since then, but I shot a timelapse of Fox’s construction and it turned out pretty well: you can watch it below or on YouTube or QTube.

The timelapse slows to real-time, about a minute in, to illustrate a point about the component test I did with only a CPU (and cooler), PSU, and RAM attached. Something I routinely do when building computers but which I only recently discovered isn’t commonly practised is shown: that the easiest way to power on a computer without attaching a power switch is just to bridge the power switch pins using your screwdriver!

Fox is running Unraid, an operating system basically designed for exactly these kinds of purposes. I’ve been super-impressed by the ease-of-use and versatility of Unraid and I’d recommend it if you’ve got a similar NAS project in your future! I’d also like to sing the praises of the Fractal Design Node 804 case: it’s not got quite as many bells-and-whistles as some cases, but its dual-chamber design is spot-on for a multipurpose NAS, giving ample room for both full-sized expansion cards and heatsinks and lots of hard drives in a relatively compact space.

Evolving Computer Words: “Hacker”

This is part of a series of posts on computer terminology whose popular meaning – determined by surveying my friends – has significantly diverged from its original/technical one. Read more evolving words…

Anticipatory note: based on the traffic I already get to my blog and the keywords people search for, I imagine that some people will end up here looking to learn “how to become a hacker”. If that’s your goal, you’re probably already asking the wrong question, but I direct you to Eric S. Raymond’s Guide/FAQ on the subject. Good luck.

Few words have seen such mutation of meaning over their lifetimes as the word “silly”. The earliest references, found in Old English, Proto-Germanic, and Old Norse and presumably having an original root even earlier, meant “happy”. By the end of the 12th century it meant “pious”; by the end of the 13th, “pitiable” or “weak”; only by the late 16th coming to mean “foolish”; its evolution continues in the present day.

Right, stop that! It's too silly.
The Monty Python crew were certainly the experts on the contemporary use of the word.

But there’s little so silly as the media-driven evolution of the word “hacker” into something that’s at least a little offensive those of us who probably would be described as hackers. Let’s take a look.

Hacker

What people think it means

Computer criminal with access to either knowledge or tools which are (or should be) illegal.

What it originally meant

Expert, creative computer programmer; often politically inclined towards information transparency, egalitarianism, anti-authoritarianism, anarchy, and/or decentralisation of power.

The Past

The earliest recorded uses of the word “hack” had a meaning that is unchanged to this day: to chop or cut, as you might describe hacking down an unruly bramble. There are clear links between this and the contemporary definition, “to plod away at a repetitive task”. However, it’s less certain how the word came to be associated with the meaning it would come to take on in the computer labs of 1960s university campuses (the earliest references seem to come from around April 1955).

There, the word hacker came to describe computer experts who were developing a culture of:

  • sharing computer resources and code (even to the extent, in extreme cases, breaking into systems to establish more equal opportunity of access),
  • learning everything possible about humankind’s new digital frontiers (hacking to learn, not learning to hack)
  • judging others only by their contributions and not by their claims or credentials, and
  • discovering and advancing the limits of computers: it’s been said that the difference between a non-hacker and a hacker is that a non-hacker asks of a new gadget “what does it do?”, while a hacker asks “what can I make it do?”
Venn-Euler-style diagram showing crackers as a subset of security hackers, who in turn are a subset of hackers. Script kiddies are a group of their own, off to the side where nobody has to talk to them (this is probably for the best).
What the media generally refers to as “hackers” would be more-accurately, within the hacker community, be called crackers; a subset of security hackers, in turn a subset of hackers as a whole. Script kiddies – people who use hacking tools exclusively for mischief without fully understanding what they’re doing – are a separate subset on their own.

It is absolutely possible for hacking, then, to involve no lawbreaking whatsoever. Plenty of hacking involves writing (and sharing) code, reverse-engineering technology and systems you own or to which you have legitimate access, and pushing the boundaries of what’s possible in terms of software, art, and human-computer interaction. Even among hackers with a specific interest in computer security, there’s plenty of scope for the legal pursuit of their interests: penetration testing, security research, defensive security, auditing, vulnerability assessment, developer education… (I didn’t say cyberwarfare because 90% of its application is of questionable legality, but it is of course a big growth area.)

Getty Images search for "Hacker".
Hackers have a serious image problem, and the best way to see it is to search on your favourite stock photo site for “hacker”. If you don’t use a laptop in a darkened room, wearing a hoodie and optionally mask and gloves, you’re not a real hacker. Also, 50% of all text should be green, 40% blue, 10% red.

So what changed? Hackers got famous, and not for the best reasons. A big tipping point came in the early 1980s when hacking group The 414s broke into a number of high-profile computer systems, mostly by using the default password which had never been changed. The six teenagers responsible were arrested by the FBI but few were charged, and those that were were charged only with minor offences. This was at least in part because there weren’t yet solid laws under which to prosecute them but also because they were cooperative, apologetic, and for the most part hadn’t caused any real harm. Mostly they’d just been curious about what they could get access to, and were interested in exploring the systems to which they’d logged-in, and seeing how long they could remain there undetected. These remain common motivations for many hackers to this day.

"Hacker" Dan Q
Hoodie: check. Face-concealing mask: check. Green/blue code: check. Is I a l33t hacker yet?

News media though – after being excited by “hacker” ideas introduced by WarGames – rightly realised that a hacker with the same elementary resources as these teens but with malicious intent could cause significant real-world damage. Bruce Schneier argued last year that the danger of this may be higher today than ever before. The press ran news stories strongly associating the word “hacker” specifically with the focus on the illegal activities in which some hackers engage. The release of Neuromancer the following year, coupled with an increasing awareness of and organisation by hacker groups and a number of arrests on both sides of the Atlantic only fuelled things further. By the end of the decade it was essentially impossible for a layperson to see the word “hacker” in anything other than a negative light. Counter-arguments like The Conscience of a Hacker (Hacker’s Manifesto) didn’t reach remotely the same audiences: and even if they had, the points they made remain hard to sympathise with for those outside of hacker communities.

"Glider" Hacker Emblem
‘Nuff said.

A lack of understanding about what hackers did and what motivated them made them seem mysterious and otherworldly. People came to make the same assumptions about hackers that they do about magicians – that their abilities are the result of being privy to tightly-guarded knowledge rather than years of practice – and this elevated them to a mythical level of threat. By the time that Kevin Mitnick was jailed in the mid-1990s, prosecutors were able to successfully persuade a judge that this “most dangerous hacker in the world” must be kept in solitary confinement and with no access to telephones to ensure that he couldn’t, for example, “start a nuclear war by whistling into a pay phone”. Yes, really.

Four hands on one keyboard, from CSI: Cyber
Whistling into a phone to start a nuclear war? That makes CSI: Cyber seem realistic [watch].

The Future

Every decade’s hackers have debated whether or not the next decade’s have correctly interpreted their idea of “hacker ethics”. For me, Steven Levy’s tenets encompass them best:

  1. Access to computers – and anything which might teach you something about the way the world works – should be unlimited and total.
  2. All information should be free.
  3. Mistrust authority – promote decentralization.
  4. Hackers should be judged by their hacking, not bogus criteria such as degrees, age, race, or position.
  5. You can create art and beauty on a computer.
  6. Computers can change your life for the better.

Given these concepts as representative of hacker ethics, I’m convinced that hacking remains alive and well today. Hackers continue to be responsible for many of the coolest and most-important innovations in computing, and are likely to continue to do so. Unlike many other sciences, where progress over the ages has gradually pushed innovators away from backrooms and garages and into labs to take advantage of increasingly-precise generations of equipment, the tools of computer science are increasingly available to individuals. More than ever before, bedroom-based hackers are able to get started on their journey with nothing more than a basic laptop or desktop computer and a stack of freely-available open-source software and documentation. That progress may be threatened by the growth in popularity of easy-to-use (but highly locked-down) tablets and smartphones, but the barrier to entry is still low enough that most people can pass it, and the new generation of ultra-lightweight computers like the Raspberry Pi are doing their part to inspire the next generation of hackers, too.

That said, and as much as I personally love and identify with the term “hacker”, the hacker community has never been less in-need of this overarching label. The diverse variety of types of technologist nowadays coupled with the infiltration of pop culture by geek culture has inevitably diluted only to be replaced with a multitude of others each describing a narrow but understandable part of the hacker mindset. You can describe yourself today as a coder, gamer, maker, biohacker, upcycler, cracker, blogger, reverse-engineer, social engineer, unconferencer, or one of dozens of other terms that more-specifically ties you to your community. You’ll be understood and you’ll be elegantly sidestepping the implications of criminality associated with the word “hacker”.

The original meaning of “hacker” has also been soiled from within its community: its biggest and perhaps most-famous advocate‘s insistence upon linguistic prescriptivism came under fire just this year after he pushed for a dogmatic interpretation of the term “sexual assault” in spite of a victim’s experience. This seems to be absolutely representative of his general attitudes towards sex, consent, women, and appropriate professional relationships. Perhaps distancing ourselves from the old definition of the word “hacker” can go hand-in-hand with distancing ourselves from some of the toxicity in the field of computer science?

(I’m aware that I linked at the top of this blog post to the venerable but also-problematic Eric S. Raymond; if anybody can suggest an equivalent resource by another author I’d love to swap out the link.)

Verdict: The word “hacker” has become so broad in scope that we’ll never be able to rein it back in. It’s tainted by its associations with both criminality, on one side, and unpleasant individuals on the other, and it’s time to accept that the popular contemporary meaning has won. Let’s find new words to define ourselves, instead.

Evolving Computer Words: “GIF”

This is part of a series of posts on computer terminology whose popular meaning – determined by surveying my friends – has significantly diverged from its original/technical one. Read more evolving words…

The language we use is always changing, like how the word “cute” was originally a truncation of the word “acute”, which you’d use to describe somebody who was sharp-witted, as in “don’t get cute with me”. Nowadays, we use it when describing adorable things, like the subject of this GIF:

[Animated GIF] Puppy flumps onto a human.
Cute, but not acute.
But hang on a minute: that’s another word that’s changed meaning: GIF. Want to see how?

GIF

What people think it means

File format (or the files themselves) designed for animations and transparency. Or: any animation without sound.

What it originally meant

File format designed for efficient colour images. Animation was secondary; transparency was an afterthought.

The Past

Back in the 1980s cyberspace was in its infancy. Sir Tim hadn’t yet dreamed up the Web, and the Internet wasn’t something that most people could connect to, and bulletin board systems (BBSes) – dial-up services, often local or regional, sometimes connected to one another in one of a variety of ways – dominated the scene. Larger services like CompuServe acted a little like huge BBSes but with dial-up nodes in multiple countries, helping to bridge the international gaps and provide a lower learning curve than the smaller boards (albeit for a hefty monthly fee in addition to the costs of the calls). These services would later go on to double as, and eventually become exclusively, Internet Service Providers, but for the time being they were a force unto themselves.

CompuServe ad circa 1983
My favourite bit of this 1983 magazine ad for CompuServe is the fact that it claims a trademark on the word “email”. They didn’t try very hard to cling on to that claim, unlike their controversial patent on the GIF format…

In 1987, CompuServe were about to start rolling out colour graphics as a new feature, but needed a new graphics format to support that. Their engineer Steve Wilhite had the idea for a bitmap image format backed by LZW compression and called it GIF, for Graphics Interchange Format. Each image could be composed of multiple frames each having up to 256 distinct colours (hence the common mistaken belief that a GIF can only have 256 colours). The nature of the palette system and compression algorithm made GIF a particularly efficient format for (still) images with solid contiguous blocks of colour, like logos and diagrams, but generally underperformed against cosine-transfer-based algorithms like JPEG/JFIF for images with gradients (like most photos).

GIF with more than 256 colours.
This animated GIF (of course) shows how it’s possible to have more than 256 colours in a GIF by separating it into multiple non-temporal frames.

GIF would go on to become most famous for two things, neither of which it was capable of upon its initial release: binary transparency (having “see through” bits, which made it an excellent choice for use on Web pages with background images or non-static background colours; these would become popular in the mid-1990s) and animation. Animation involves adding a series of frames which overlay one another in sequence: extensions to the format in 1989 allowed the creator to specify the duration of each frame, making the feature useful (prior to this, they would be displayed as fast as they could be downloaded and interpreted!). In 1995, Netscape added a custom extension to GIF to allow them to loop (either a specified number of times or indefinitely) and this proved so popular that virtually all other software followed suit, but it’s worth noting that “looping” GIFs have never been part of the official standard!

Hex editor view of a GIF file's metadata section, showing Netscape headers.
Open almost any animated GIF file in a hex editor and you’ll see the word NETSCAPE2.0; evidence of Netscape’s role in making animated GIFs what they are today.

Compatibility was an issue. For a period during the mid-nineties it was quite possible that among the visitors to your website there would be a mixture of:

  1. people who wouldn’t see your GIFs at all, owing to browser, bandwidth, preference, or accessibility limitations,
  2. people who would only see the first frame of your animated GIFs, because their browser didn’t support animation,
  3. people who would see your animation play once, because their browser didn’t support looping, and
  4. people who would see your GIFs as you intended, fully looping

This made it hard to depend upon GIFs without carefully considering their use. But people still did, and they just stuck a Netscape Now button on to warn people, as if that made up for it. All of this has happened before, etc.

In any case: as better, newer standards like PNG came to dominate the Web’s need for lossless static (optionally transparent) image transmission, the only thing GIFs remained good for was animation. Standards like APNG/MNG failed to get off the ground, and so GIFs remained the dominant animated-image standard. As Internet connections became faster and faster in the 2000s, they experienced a resurgence in popularity. The Web didn’t yet have the <video> element and so embedding videos on pages required a mixture of at least two of <object>, <embed>, Flash, and black magic… but animated GIFs just worked and soon appeared everywhere.

Magic.
How animation online really works.

The Future

Nowadays, when people talk about GIFs, they often don’t actually mean GIFs! If you see a GIF on Giphy or WhatsApp, you’re probably actually seeing an MPEG-4 video file with no audio track! Now that Web video is widely-supported, service providers know that they can save on bandwidth by delivering you actual videos even when you expect a GIF. More than ever before, GIF has become a byword for short, often-looping Internet animations without sound… even though that’s got little to do with the underlying file format that the name implies.

What's a web page? Something ducks walk on?
What’s a web page? What’s anything?

Verdict: We still can’t agree on whether to pronounce it with a soft-G (“jif”), as Wilhite intended, or with a hard-G, as any sane person would, but it seems that GIFs are here to stay in name even if not in form. And that’s okay. I guess.