I swear that I used to be good at Mastermind when I was a kid. But now, when it’s my turn to break
the code that one of our kids has chosen, I fail more often than I succeed. That’s no good!
If you didn’t have me pegged as a board gamer… where the hell have you been?
Mastermind and me
Maybe it’s because I’m distracted; multitasking doesn’t help problem-solving. Or it’s because we’re playing “Super” Mastermind, which differs from the one I had as a child in that
eight (not six) peg colours are available and secret codes are permitted to have duplicate peg colours. These changes increase the possible permutations from 360 to 4,096, but the
number of guesses allowed only goes up from 8 to 10. That’s hard.
The set I had as a kid was like this, I think. Photo courtesy ZeroOne; CC-BY-SA license.
Hey, that’s an idea. Let’s crack the code… by writing some code!
This online edition plays a lot like the version our kids play, although the peg colours are different. Next guess should be an
easy solve!
Representing a search space
The search space for Super Mastermind isn’t enormous, and it lends itself to some highly-efficient computerised storage.
There are 8 different colours of peg. We can express these colours as a number between 0 and 7, in three bits of binary, like this:
Decimal  Binary  Colour
0        000     Red
1        001     Orange
2        010     Yellow
3        011     Green
4        100     Blue
5        101     Pink
6        110     Purple
7        111     White
There are four pegs in a row, so we can express any given combination of coloured pegs as a 12-bit binary number. E.g. 100 110 111 010 would represent the
permutation blue (100), purple (110), white (111), yellow (010). The total search space, therefore, is the range of numbers from
000000000000 through 111111111111… that is: decimal 0 through 4,095:
Decimal  Binary        Colours
0        000000000000  Red, red, red, red
1        000000000001  Red, red, red, orange
2        000000000010  Red, red, red, yellow
…        …             …
4092     111111111100  White, white, white, blue
4093     111111111101  White, white, white, pink
4094     111111111110  White, white, white, purple
4095     111111111111  White, white, white, white
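To make the packing concrete, here is a minimal sketch of the mapping in the tables above. The names `COLOURS`, `pegsToNumber`, `numberToPegs` and `describe` are made up for illustration, and it uses shifts and masks rather than any particular implementation.

```javascript
// Colour names indexed by peg value (0-7), matching the colour table above.
const COLOURS = ['red', 'orange', 'yellow', 'green', 'blue', 'pink', 'purple', 'white'];

// Pack four peg values (each 0-7) into one 12-bit number, first peg in the highest bits.
function pegsToNumber(pegs) {
  return pegs.reduce((packed, peg) => (packed << 3) | peg, 0);
}

// Unpack a 12-bit number into four peg values by shifting and masking three bits at a time.
function numberToPegs(packed) {
  return [9, 6, 3, 0].map(shift => (packed >> shift) & 0b111);
}

// Describe a packed code using colour names.
function describe(packed) {
  return numberToPegs(packed).map(peg => COLOURS[peg]).join(', ');
}

// Blue, purple, white, yellow: the worked example from the text.
console.log(pegsToNumber([4, 6, 7, 2]).toString(2)); // 100110111010
console.log(describe(4093)); // white, white, white, pink
```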
Whenever we make a guess, we get feedback in the form of two variables: each peg that is in the right place is a bull; each that represents a peg in the secret code but
isn’t in the right place is a cow (the names come from Mastermind’s precursor, Bulls & Cows). Four bulls
would be an immediate win (lucky!), but any other combination of bulls and cows is still valuable information. Even a zero-score guess is valuable – potentially very valuable! – because it
tells the player that none of the pegs they’ve guessed appear in the secret code.
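As a rough sketch of that scoring rule: the helper below (a hypothetical `scoreGuess`, taking two arrays of four peg values) counts bulls first and then counts cows from whatever is left over, so that duplicate colours aren't counted twice. It's one common way to do it, and I'm assuming that's how your code-maker scores duplicates too.

```javascript
// Hypothetical helper: score a guess against a solution.
// Both arguments are arrays of four peg values (0-7). Returns [bulls, cows].
function scoreGuess(solution, guess) {
  let bulls = 0;
  // Tally the colours left over on each side once exact matches are removed.
  const solutionLeft = new Array(8).fill(0);
  const guessLeft = new Array(8).fill(0);
  for (let i = 0; i < solution.length; i++) {
    if (solution[i] === guess[i]) {
      bulls++;
    } else {
      solutionLeft[solution[i]]++;
      guessLeft[guess[i]]++;
    }
  }
  // Each colour scores as many cows as it has unmatched pegs on both sides.
  let cows = 0;
  for (let colour = 0; colour < 8; colour++) {
    cows += Math.min(solutionLeft[colour], guessLeft[colour]);
  }
  return [bulls, cows];
}

// Secret: blue, purple, white, yellow. Guess: blue, white, orange, orange.
console.log(scoreGuess([4, 6, 7, 2], [4, 7, 1, 1])); // [ 1, 1 ]
```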
If one of Wordle‘s parents was Scrabble, then this was the other. Just ask its Auntie Twitter.
Solving with Javascript
The latest versions of Javascript support binary literals and bitwise operations, so we can encode and decode between arrays of four coloured pegs (numbers 0-7) and the number 0-4,095
representing the guess as shown below. Decoding uses an AND bitmask to filter to the requisite digits then divides by the order of magnitude. Encoding is just a reduce
function that bitshift-concatenates the numbers together.
/**
 * Decode a candidate into four peg values by using binary bitwise operations.
 */
function decodeCandidate(candidate) {
  return [
    (candidate & 0b111000000000) / 0b001000000000,
    (candidate & 0b000111000000) / 0b000001000000,
    (candidate & 0b000000111000) / 0b000000001000,
    (candidate & 0b000000000111) / 0b000000000001
  ];
}

/**
 * Given an array of four integers (0-7) to represent the pegs, in order, returns a single-number
 * candidate representation.
 */
function encodeCandidate(pegs) {
  return pegs.reduce((a, b) => (a << 3) + b);
}
With this, we can simply:
1. Produce a list of candidate solutions (an array containing numbers 0 through 4,095).
2. Choose one candidate, use it as a guess, and ask the code-maker how it scores.
3. Eliminate from the candidate solutions list all solutions that would not score the same number of bulls and cows for the guess that was made.
4. Repeat from step #2 until you win.
Step 3’s the most important one there. Given a function getScore( solution, guess ) which returns an array of [ bulls, cows ] a given guess would
score if faced with a specific solution, that code would look like this (I’m convinced there must be a more-performant way to eliminate candidates from the list with XOR
bitmasks, but I haven’t worked out what it is yet):
/**
 * Given a guess (array of four integers from 0-7 to represent the pegs, in order) and the number
 * of bulls (number of pegs in the guess that are in the right place) and cows (number of pegs in the
 * guess that are correct but in the wrong place), eliminates from the candidates array all guesses
 * invalidated by this result. Return true if successful, false otherwise.
 */
function eliminateCandidates(guess, bulls, cows) {
  // Keep only the candidates that would have scored exactly the same against this guess
  const newCandidatesList = data.candidates.filter(candidate => {
    const score = getScore(candidate, guess);
    return (score[0] == bulls) && (score[1] == cows);
  });
  // An empty list means some earlier response must have been wrong
  if (newCandidatesList.length == 0) {
    alert('That response would reduce the candidate list to zero.');
    return false;
  }
  data.candidates = newCandidatesList;
  chooseNextGuess();
  return true;
}
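Putting the four steps together, here is a self-contained sketch of a whole game. It isn't the code from the full solution: `unpack`, `score` and `solve` are names I've made up, the "code-maker" is just a known secret, and the guess chosen each turn is simply the first remaining candidate.

```javascript
// Unpack a candidate number (0-4095) into four peg values (0-7).
function unpack(candidate) {
  return [9, 6, 3, 0].map(shift => (candidate >> shift) & 0b111);
}

// Score a guess against a solution (both peg arrays); returns [bulls, cows].
function score(solution, guess) {
  let bulls = 0;
  const left = new Array(8).fill(0);
  const right = new Array(8).fill(0);
  solution.forEach((peg, i) => {
    if (peg === guess[i]) bulls++;
    else { left[peg]++; right[guess[i]]++; }
  });
  const cows = left.reduce((sum, n, colour) => sum + Math.min(n, right[colour]), 0);
  return [bulls, cows];
}

// Play a whole game against a known secret, which stands in for the code-maker.
// Returns the list of guesses made, ending with the secret itself.
function solve(secret) {
  // Step 1: every possible code starts out as a candidate.
  let candidates = Array.from({ length: 4096 }, (_, i) => i);
  const guesses = [];
  while (true) {
    // Step 2: pick a candidate (here, simply the first) and have it scored.
    const guess = unpack(candidates[0]);
    guesses.push(guess);
    const [bulls, cows] = score(secret, guess);
    if (bulls === 4) return guesses;
    // Step 3: keep only candidates that would have produced the same score.
    candidates = candidates.filter(candidate => {
      const [b, c] = score(unpack(candidate), guess);
      return b === bulls && c === cows;
    });
    // Step 4: loop round and guess again.
  }
}

const game = solve([4, 6, 7, 2]);
console.log(`Solved in ${game.length} guesses; last guess: ${game[game.length - 1]}`);
```

The loop always finishes: the real secret is never eliminated (it scores the same as itself), and each wrong guess eliminates at least itself.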
I continued in this fashion to write a full solution (source code). It uses ReefJS for
component rendering and state management, and you can try it for yourself right in your web browser. If you play against the online version I mentioned you’ll need to transpose the colours in your head: the physical version I play with the kids has pink and
purple pegs, but the online one replaces these with brown and black.
Testing the solution
Let’s try it out against the online version:
As expected, my code works well-enough to win the game every time I’ve tried, both against computerised and in-person opponents. So – unless you’ve been actively thinking about the
specifics of the algorithm I’ve employed – it might surprise you to discover that… my solution is very-much a suboptimal one!
My code has only failed to win a single game… and that turned out to be because my opponent, playing overexcitedly, cheated in the third turn. To be fair, my code didn’t lose
either, though: it identified that a mistake must have been made and we declared the round void when we identified the problem.
My solution is suboptimal
A couple of games in, the suboptimality of my solution became pretty visible. Sure, it still won every game, but it was a blunt instrument, and anybody who’s seriously thought about
games like this can tell you why. You know how when you play e.g. Wordle (but not in “hard mode”) you sometimes want to type in a word that can’t possibly be the
solution because it’s the best way to rule in (or out) certain key letters? This kind of strategic search space bisection reduces the mean number of guesses you need to solve the
puzzle, and the same’s true in Mastermind. But because my solver will only propose guesses from the list of candidate solutions, it can’t make this kind of improvement.
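For the curious, here is a sketch of what that improvement could look like: consider every possible code as a guess (not only the remaining candidates) and pick the one whose worst-case feedback leaves the fewest candidates standing. The function names are hypothetical, and the demo uses a deliberately tiny game (3 colours, 2 pegs) so that it runs instantly; on the full 4,096-code game this brute-force approach is noticeably slower.

```javascript
// Score a guess against a solution; returns a key like "1,2" meaning 1 bull, 2 cows.
function scoreKey(solution, guess, colours) {
  let bulls = 0;
  const left = new Array(colours).fill(0);
  const right = new Array(colours).fill(0);
  solution.forEach((peg, i) => {
    if (peg === guess[i]) bulls++;
    else { left[peg]++; right[guess[i]]++; }
  });
  const cows = left.reduce((sum, n, colour) => sum + Math.min(n, right[colour]), 0);
  return `${bulls},${cows}`;
}

// Size of the largest group of candidates that would all give the same feedback for this guess.
function worstCase(guess, candidates, colours) {
  const groups = new Map();
  for (const candidate of candidates) {
    const key = scoreKey(candidate, guess, colours);
    groups.set(key, (groups.get(key) || 0) + 1);
  }
  return Math.max(...groups.values());
}

// From every possible code, pick the guess whose worst case is smallest.
function pickGuess(allCodes, candidates, colours) {
  let best = null;
  let bestWorst = Infinity;
  for (const guess of allCodes) {
    const worst = worstCase(guess, candidates, colours);
    if (worst < bestWorst) { best = guess; bestWorst = worst; }
  }
  return { guess: best, worst: bestWorst };
}

// Every code for a given number of colours and pegs.
function allCodesFor(colours, pegs) {
  let codes = [[]];
  for (let i = 0; i < pegs; i++) {
    codes = codes.flatMap(code => Array.from({ length: colours }, (_, c) => [...code, c]));
  }
  return codes;
}

const codes = allCodesFor(3, 2);
const opening = pickGuess(codes, codes, 3);
console.log(opening.guess, 'leaves at most', opening.worst, 'candidates');
```

Ties are broken by whichever guess comes first; a more careful version might prefer guesses that are still candidates, since those can win outright.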
My blog post about Break Into Us used a series of visual metaphors to show search space dissection, including this one. If you missed
it, it might be worth reading.
Search space bisection is also used in my adversarial hangman game, but in this case the aim is to split the search space in such a way that no
matter what guess a player makes, they always find themselves in the larger remaining portion of the search space, to maximise the number of guesses they have to make. Y’know, because
it’s evil.
A great first guess, assuming you’re playing against a random code and your rules permit the code to have repeated colours, is a “1122” pattern.
There are mathematically-derived heuristics to optimise Mastermind strategy. The first
of these came from none other than Donald Knuth (legend of computer science, mathematics, and pipe organs) back in 1977. His solution,
published at probably the height of the game’s popularity in the amazingly-named Journal of Recreational Mathematics, guarantees a solution to the six-colour version of the
game within five guesses. Ville [2013] found an
optimal solution for a seven-colour variant, but demonstrated how rapidly the tree of possible moves grows and the need for early pruning – even with powerful modern computers – to
conserve memory. It’s a very enjoyable and readable paper.
But for my purposes, it’s unnecessary. My solver routinely wins within six, maybe seven guesses, and by nonchalantly glancing at my phone in-between my guesses I can now reliably guess
our children’s codes quickly and easily. In the end, that’s what this was all about.
Different games in the same style (absurdle plays adversarially like my cheating hangman
game, crosswordle involves reverse-engineering a wordle colour grid into a crossword, heardle
is like Wordle but sounding out words using the IPA…)
I’m sure that by now all your social feeds are full of people playing Wordle. But the cool nerds are playing something new…
Now, a Wordle clone for D&D players!
But you know what hasn’t been seen before today? A Wordle clone where you have to guess a creature from the Dungeons & Dragons (5e) Monster Manual by putting numeric values into a
character sheet (STR, DEX, CON, INT, WIS, CHA):
Just because nobody’s asking for a game doesn’t mean you shouldn’t make it anyway.
What are you waiting for: go give DNDle a try (I pronounce it “dindle”, but you can pronounce it however you like). A new monster
appears at 10:00 UTC each day.
And because it’s me, of course it’s open source and works offline.
The boring techy bit
Like Wordle, everything happens in your browser: this is a “backendless” web application.
I’ve used ReefJS for state management, because I wanted something I could throw together quickly but I didn’t want to drown myself (or my players)
in a heavyweight monster library. If you’ve not used Reef before, you should give it a go: it’s basically like React but a tenth of the footprint.
A cache-first/background-updating service worker means that it can run completely offline: you can install it to your homescreen in the
same way as Wordle, but once you’ve visited it once it can work indefinitely even if you never go online again.
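If you've not met the pattern before, a cache-first, background-updating strategy can be sketched like this. This isn't DNDle's actual worker: `CACHE_NAME` and `cacheFirst` are names I've invented, and the cache and fetch function are passed in as parameters purely so the logic can be tried outside a service worker.

```javascript
// Hypothetical cache name; any string will do.
const CACHE_NAME = 'offline-v1';

// Answer from the cache when possible, while refreshing the cached copy in the background.
// `cache` needs match() and put() like the browser's Cache; `fetchFn` behaves like fetch().
async function cacheFirst(request, cache, fetchFn) {
  const cached = await cache.match(request);
  // Always start a network request so the cache is brought up to date for next time.
  const refreshed = fetchFn(request).then(response => {
    if (response.ok) cache.put(request, response.clone());
    return response;
  });
  if (cached) {
    // Being offline is fine here: ignore a failed refresh and serve what we have.
    refreshed.catch(() => {});
    return cached;
  }
  // Nothing cached yet (e.g. the very first visit): fall back to the network.
  return refreshed;
}

// Only wire up the listener when actually running as a service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined' && self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('fetch', event => {
    event.respondWith(
      caches.open(CACHE_NAME).then(cache => cacheFirst(event.request, cache, fetch))
    );
  });
}
```

A production worker would probably also pass the background refresh to `event.waitUntil()` so the browser keeps the worker alive until the cache has been updated.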
I don’t like to use a buildchain that’s any more-complicated than is absolutely necessary, so the only development dependency is rollup. It
resolves my import statements and bundles a single JS file for the browser.
A not-entirely-theoretical question about open source software licensing came up at work the other day. I thought it was interesting
enough to warrant a quick dive into the philosophy of minification, and how it relates to copyleft open source licenses. Specifically: does distributing (only) minified
source code violate the GPL?
If you’ve come here looking for a legally-justifiable answer to that question, you’re out of luck. But what I can give you is a (fictional) story:
TheseusJS is slow
TheseusJS is a (fictional) Javascript library designed to be run in a browser. It’s released under the GPLv3 license. This license allows you to download and use TheseusJS for any purpose you like, including making money off it, modifying
it, or redistributing it to others… but it requires that if you redistribute it you have to do so under the same license and include the source code. As such, it forces you to
share with others the same freedoms you enjoy for yourself, which is highly representative of some schools of open-source thinking.
It’s a cool project, but it really needs some maintenance this side of 2010.
It’s a great library and it’s used on many websites, but its performance isn’t great. It’s become infamous for the impact it has on the speed of the websites it’s used on, and it’s
often the butt of jokes by developers: “Man, this website’s slow. Must be running Theseus!”
The original developer has moved onto his new project, Moralia, and seems uninterested in handling the growing number of requests for improvements. So I’ve decided to fork it
and make my own version, FastTheseusJS, and work on improving its speed.
FastTheseusJS is fast
I do some analysis and discover the single biggest problem with TheseusJS is that the Javascript file itself is enormous. The original developer kept all of the
copious documentation in comments in the file itself, and for some reason it doesn’t even compress well. When you use TheseusJS on a website it takes a painfully long time for
a browser to download it, if it’s not precached.
Nobody even uses the documentation in the comments: there’s a website with a fully-documented API.
My first release of FastTheseusJS, then, removes virtually all of the comments, replacing them with a single comment at the top pointing developers to a website where the
API is fully documented. While I’m in there anyway, I also fix a minor bug that’s been annoying me for a while.
v1.1.0 changes
Forked from TheseusJS v1.0.4
Fixed issue #1071 (running mazeSolver() without first connecting <String> component results in endless loop)
Removed all comments: improves performance considerably
I discover another interesting fact: the developer of TheseusJS used a really random mixture of tabs and spaces for indentation, sometimes in the same line! It looks…
okay if you set your editor up just right, but it’s pretty hideous otherwise. That whitespace is unnecessary anyway: the codebase is sprawling but it seldom goes more than two
levels deep, so indentation levels don’t add much readability. For my second release of FastTheseusJS, then, I remove this extraneous whitespace, as well as removing
the in-line whitespace inside parameter lists and the components of for loops. Every little helps, right?
v1.1.1 changes
Standardised whitespace usage
Removed unnecessary whitespace
Some of the simpler functions now fit onto just a single line, and it doesn’t even inconvenience me to see them this way: I know the codebase well enough by now that it’s no
disadvantage for me to edit it in this condensed format.
Personally, I’ve given up on the tabs-vs-spaces debate and now I indent my code using semicolons. (That’s clearly a joke. Don’t flame me.)
In the next version, I shorten the names of variables and functions in the code.
For some reason, the original developer used epic rambling strings for function names, like the well-known function
dedicateIslandTempleToTheImageOfAGodBeforeOrAfterMakingASacrificeWithOrWithoutDancing( boolBeforeMakingASacrifice, objectImageOfGodToDedicateIslandTempleTo,
stringNmeOfPersonMakingDedication, stringOrNullNameOfLocalIslanderDancedWith). That one gets called all the time internally and isn’t exposed via the external
API so it might as well be shortened to d=(i,j,k,l,m)=>. Now all the internal workings of the library
are each represented with just one or two letters.
v1.1.2 changes
Shortened/standardised non-API variable and function names – improves performance
I’ve shaved several kilobytes off the monstrous size of TheseusJS and I’m very proud. The original developer says nice things about my fork on social media, resulting in a
torrent of downloads and attention. Within a certain archipelago of developers, I’m slightly famous.
But did I violate the license?
But then a developer says to me: you’re violating the license of the original project because you’re not making the source code available!
This happens every day. Probably not to this same guy every time though, but you never know. Original photo by Andrea Piacquadio.
They claim that my bugfix in the first version of FastTheseusJS represents a material change to the software, and that the changes I’ve made since then are
obfuscation: efforts short of binary compilation that aim to reduce the accessibility of the source code. This fails to meet the GPL‘s definition of source code as “the preferred form of the work for making modifications to
it”. I counter that this condensed view of the source code is my “preferred” way of working with it, and moreover that my output is not the result of some build step that
makes the code harder to read: the code is just hard to read as a result of the optimisations I’ve made. In ambiguous cases, whose “preference” wins?
Did I violate the license? My gut feeling is that no, all of my changes were within the spirit and the letter of the GPL (they’re a
terrible way to write code, but that’s not what’s in question here). Because I manually condensed the code, did so with the intention that this condensing was a feature, and
continue to work directly with the code after condensing it because I prefer it that way… that feels like it’s “okay”.
But if I’d just run the code through a minification tool, my opinion changes. Suppose I’d run minify --output fasttheseus.js theseus.js and then deleted my copy of
theseus.js. Then, making changes to fasttheseus.js and redistributing it feels like a violation to me… even if the resulting code is the same as I’d have
gotten via the “manual” method!
I don’t know the answer (IANAL), but I’ll tell you this: I feel hypocritical for saying one piece of code would not violate
the license but another identical piece of code would, based only on the process the developer followed to produce it. If I replace one piece of code at a time with
less-readable versions the license remains intact, but if I replace them all at once it doesn’t? That doesn’t feel concrete nor satisfying.
Sure, I can write a blog post in just one line of code. It’ll just be a really, really, really long line… (Still perfectly readable, though!)
This isn’t an entirely contrived example
This example might seem highly contrived, and that’s because it is. But the grey area between the extremes is where the real questions are. If you agree that redistribution of (only)
minified source code violates the GPL, you’re left asking: at what point does the change occur? Code isn’t necessarily minified or
not-minified: there are many intermediate steps.
If I use a correcting linter to standardise indentation and whitespace – switching multiple spaces for the appropriate number of tabs, removing excess line breaks etc. (or do the same
tasks manually) I’m sure you’d agree that’s fine. If I have it replace whole-function if-blocks with hoisted return statements, that’s probably fine too. If I replace if blocks with
ternary operators or remove or shorten comments… that might be fine, but probably depends upon context. At some point though, some way along the process, minification goes “too
far” and feels like it’s no longer within the limitations of the license. And I can’t tell you where that point is!
This issue’s even more-complicated with some other licenses, e.g. the AGPL, which extends the requirement to share source code to hosted applications. Suppose I implement a web application that uses an AGPL-licensed library. The person who redistributed it to me only gave me the minified version, but they gave me a web address from which
to acquire the full source code, so they’re in the clear. I need to make a small patch to the library to support my service, so I edit it right into the minified version I’ve already
got. A user of my hosted application asks for a copy of the source code, so I provide it, including the edited minified library… am I violating the license for not providing the full,
unminified version, even though I’ve never even seen it? It seems absurd to say that I would be, but it could still be argued to be the case.
I love diagrams like this, which show license compatibility of different open source licenses. Adapted from a diagram by Carlo Daffara,
in turn adapted from a diagram by David E. Wheeler, used under a CC-BY-SA license.
99% of the time, though, the answer’s clear, and the ambiguities shown above shouldn’t stop anybody from choosing to open-source their work
under GPL, AGPL (or any other open source license depending on their
preference and their community). Perhaps the question of whether minification violates the letter of a copyleft license is one of those Potter Stewart “I know it when I see it” things. It certainly goes against the spirit of the thing to do so deliberately or
unnecessarily, though, and perhaps it’s that softer, more-altruistic goal we should be aiming for.
I’ve been using Synergy for a long, long time. By the time I wrote about my
admiration of its notification icon back in 2010 I’d already been using it for some years. But this long love affair ended this week when I made the switch to its competitor,
Barrier.
I’m not certain exactly when I took this screenshot (which I shared with Kit while praising Synergy), but it’s clearly a pre-1.4 version
and those look distinctly like Windows Vista’s ugly rounded corners, so I’m thinking no later than 2009?
If you’ve not come across it before: Synergy was possibly the first multiplatform tool to provide seamless “edge-to-edge” sharing of a keyboard and mouse between multiple
computers. Right now, for example, I’m sitting in front of Cornet, a Debian 11 desktop, Idiophone, a Macbook Pro docked to a desktop monitor, and Renegade, a
Windows desktop. And I can move my mouse cursor from one, to the other, to the next, interacting with them all as if I were connected directly to each.
There have long been similar technologies. KVM switches can do this, as
can some modern wireless mice (I own at least two such mice!). But none of them are as seamless as what Synergy does: moving from computer to computer as fast as you can move your mouse
and sharing a clipboard between multiple devices. I also love that I can configure my set-up around how I work, e.g. when I undock my Macbook it switches from ethernet to wifi, this
gets detected and it’s automatically removed from the cluster. So when I pick up my laptop, it magically stops being controlled by my Windows PC’s mouse and keyboard until I dock it
again.
Synergy’s published under a hybrid model: open-source components, with paid-for extra features. It used to provide more in the open-source offering: you could download a
fully-working copy of the software and use it without limitation, losing out only on a handful of features that for many users were unnecessary. Nonetheless, early on I wanted to support
the development of this tool that I used so much, and so I donated money towards funding its development. In exchange, I gained access to Synergy Premium, and then when their business
model changed I got grandfathered-in to a lifetime subscription to Synergy Pro.
I continued using Synergy all the while. When their problem-stricken 2.x branch went into beta, I was among the
testers: despite the stability issues and limitations, I loved the fact that I could have what was functionally multiple co-equal “host” computers, and – when it worked – I liked the
slick new configuration interface it sported. I’ve been following with bated breath announcements about the next generation – Synergy 3 – and I’ve registered as an alpha tester for when the time comes.
If it sounds like I’m a fanboy… that’d probably be an accurate assessment of the situation. So why, after all these years, have I jumped ship?
Dear Future Dan. If you ever need a practical example of where open-source thinking provides a better user experience than arbitrarily closed-source products, please see above.
Yours, Past Dan.
I’ve been aware of Barrier since the project started, as a fork of the last open-source version of the core Synergy program. Initially, I didn’t consider Barrier to be a
suitable alternative for me, because it lacked features I cared about that were only available in the premium version of Synergy. As time went on and these features were implemented, I
continued to stick with Synergy and didn’t bother to try out Barrier… mostly out of inertia: Synergy worked fine, and the only thing Barrier seemed to offer would be a simpler set-up
(because I wouldn’t need to insert my registration details!).
This week, though, as part of a side project, I needed to add an extra computer to my cluster. For reasons that are boring and irrelevant and so I’ll spare you the details, the new
computer’s running the 32-bit version of Debian 11.
I went to the Symless download pages and discovered… there isn’t a Debian 11 package. Ah well, I think: the Debian 10 one can probably be made to work. But then I discover… there’s only
a 64-bit version of the Debian 10 binary. I’ll note that this isn’t a fundamental limitation – there are 32-bit versions of Synergy available for Windows and for ARMhf
Raspberry Pi devices – but a decision by the developers not to support that platform. In order to protect their business model, Synergy is only available as closed-source binaries, and
that means that it’s only available for the platforms for which the developers choose to make it available.
So I thought: well, I’ll try Barrier then. Now’s as good a time as any.
Setting up Barrier in place of Synergy was pretty familiar and painless.
Barrier and Synergy aren’t cross-compatible, so first I had to disable Synergy on each machine in my cluster. Then I installed Barrier. Like most popular open-source software, this was
trivially easy compared to Synergy: I just used an appropriate package manager by running choco install barrier, brew install barrier, and apt install barrier to install on each of the Windows, Mac, and Debian computers, respectively.
Configuring Barrier was basically identical to configuring Synergy: set up the machine names, nominate one as the server, and tell the server what the relative positions are of each of the
others’ screens. I usually bind the “scroll lock” key to the “lock my cursor to the current screen” function but I wasn’t permitted to do this in Barrier for some reason, so I remapped
my scroll lock key to some random high unicode character and bound that instead.
Getting Barrier to auto-run on MacOS was a little bit of a drag – in the end I had to use Automator to set up a shortcut that ran it and loaded the configuration, and set that to run on
login. These little touches are mostly solved in Synergy, but given its technical audience I don’t imagine that anybody is hugely inconvenienced by them. Nonetheless, Synergy clearly
retains a slightly more-polished experience.
Altogether, switching from Synergy to Barrier took me under 15 minutes and has so far offered me a functionally-identical experience, except that it works on more devices, can be
installed via my favourite package managers, and doesn’t ask me for registration details before it functions. Synergy 3’s going to have to be a big leap forward to beat that!
As I mentioned last year, for several years I’ve collected pretty complete historic location data from GPSr devices I carry with me everywhere, which I collate in a personal μlogger server.
Going back further, I’ve got somewhat-spotty data going back a decade, thanks mostly to the fact that I didn’t get around to opting-out of Google’s location tracking until only a few years ago (this data is now
also housed in μlogger). More-recently, I now also get tracklogs from my smartwatch, so I’m managing to collate more personal
location data than ever before.
The blob around my house, plus some of the most common routes I take to e.g. walk or cycle the children to school.
A handful of my favourite local walking and cycling routes, some of which stand out very well: e.g. the “loop” just below the big blob represents a walk around the lake at Dix Pit;
the blob on its right is the Devils Quoits, a stone circle and henge that I thought were sufficiently interesting that
I made a virtual geocache out of them.
The most common highways I spend time on: two roads into Witney, the road into and around Eynsham, and routes to places in Woodstock and North Oxford where the kids have often had
classes/activities.
I’ve unsurprisingly spent very little time in Oxford City Centre, but when I have it’s most often been at the Westgate Shopping Centre,
on the roof of which is one of the kids’ favourite restaurants (and which we’ve been able to go to again as Covid restrictions have lifted, not least thanks to their outdoor seating!).
One to eight years ago
Let’s go back to the 7 years prior, when I lived in Kidlington. This paints a different picture:
For the seven years I lived in Kidlington I moved around a lot more than I have since: each hotspot tells a story, and some tell a few.
This heatmap highlights some of the ways in which my life was quite different. For example:
Most of my time was spent in my village, but it was a lot larger than the hamlet I live in now and this shows in the size of my local “blob”. It’s also possible to pick out common
destinations like the kids’ nursery and (later) school, the parks, and the routes to e.g. ballet classes, music classes, and other kid-focussed hotspots.
I worked at the Bodleian from early 2011 until late in 2019, and so I spent a lot of time in
Oxford City Centre and cycling up and down the roads connecting my home to my workplace: Banbury Road glows the brightest, but I spent some time on Woodstock Road too.
For some of this period I still volunteered with Samaritans in Oxford, and their branch – among other volunteering hotspots
– show up among my movements. Even without zooming in it’s also possible to make out individual venues I visited: pubs, a cinema, woodland and riverside walks, swimming pools etc.
Less-happily, it’s also obvious from the map that I spent a significant amount of time at the John Radcliffe Hospital, an unpleasant reminder of some challenging times from that
chapter of our lives.
The data’s visibly “spottier” here, mostly because I built the heatmap only out of the spatial data over the time period, and not over the full tracklogs (i.e. the map doesn’t
concern itself with the movement between two sampled points, even where that movement is very-guessable), and some of the data comes from less-frequently-sampled sources like Google.
Eight to ten years ago
Let’s go back further:
Back when I lived in Kennington I moved around a lot less than I would come to later on (although again, the spottiness of the data makes that look more-significant than it is).
Before 2011, and before we bought our first house, I spent a couple of years living in Kennington, to the South of Oxford. Looking at
this heatmap, you’ll see:
I travelled a lot less. At the time, I didn’t have easy access to a car and – not having started my counselling qualification yet – I
didn’t even rent one to drive around very often. You can see my commute up the cyclepath through Hinksey into the City Centre, and you can even make out the outline of Oxford’s Covered
Market (where I’d often take my lunch) and a building in Osney Mead where I’d often deliver training courses.
Sometimes I’d commute along Abingdon Road, for a change; it’s a thinner line.
My volunteering at Samaritans stands out more-clearly, as do specific venues inside Oxford: bars, theatres, and cinemas – it’s the kind of heatmap that screams “this person doesn’t
have kids; they can do whatever they like!”
Every map tells a story
I really love maps, and I love the fact that these heatmaps are capable of painting a picture of me and what my life was like in each of these three distinct chapters of my life over
the last decade. I also really love that I’m able to collect and use all of the personal data that makes this possible, because it’s also proven useful in answering questions like “How
many times did I visit Preston in 2012?”, “Where was this photo taken?”, or “What was the name of that place we had lunch when we got lost during our holiday in Devon?”.
There’s so much value in personal geodata (that’s why unscrupulous companies will try so hard to steal it from you!), but sometimes all you want to do is use it to draw pretty heatmaps.
And that’s cool, too.
How these maps were generated
I have a μlogger instance with the relevant positional data in. I’ve automated my process, but the essence of it if you’d like to try it yourself is as follows:
First, write some SQL to extract all of the position data you need. I round off the latitude and longitude to 5 decimal places to help “cluster” dots for frequency-summing, and I raise
the frequency to the power of 3 to help make a clear gradient in my heatmap by making hotspots disproportionately brighter the more popular they are:
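As a rough illustration of that rounding-and-cubing step, here's a minimal JavaScript sketch of the same aggregation done outside the database. It isn't the SQL described above: the function name `buildHeatData`, the `latitude`/`longitude` input fields, and the `lat`/`lng`/`value` output fields are all assumptions made for the example.

```javascript
// Sketch only: cluster positions by rounding to 5 decimal places, then cube
// the per-cluster frequency so that popular spots stand out more strongly.
// The input and output field names here are assumed, not taken from μlogger.
function buildHeatData(positions) {
  const counts = new Map();
  for (const { latitude, longitude } of positions) {
    // Positions that round to the same 5-decimal-place point share a key.
    const key = `${latitude.toFixed(5)},${longitude.toFixed(5)}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].map(([key, freq]) => {
    const [lat, lng] = key.split(',').map(Number);
    return { lat, lng, value: freq ** 3 };
  });
}
```

Two nearby fixes that round to the same point become a single dot with a value of 8 (2 cubed), while a place visited only once keeps a value of 1.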
This data needs converting to JSON. I was using Ruby’s mysql2 gem to
fetch the data, so I only needed a .to_json call to do the conversion – like this:
require 'json'
require 'mysql2'
db = Mysql2::Client.new(host: ENV['DB_HOST'], username: ENV['DB_USERNAME'], password: ENV['DB_PASSWORD'], database: ENV['DB_DATABASE'])
db.query(sql).to_a.to_json
Approximately following this guide and leveraging my Mapbox
subscription for the base map, I then just needed to include leaflet.js, heatmap.js, and leaflet-heatmap.js before writing some JavaScript code
like this:
body.innerHTML = '<div id="map"></div>';
let map = L.map('map').setView([51.76, -1.40], 10);

// add the base layer to the map
L.tileLayer('https://api.mapbox.com/styles/v1/{id}/tiles/{z}/{x}/{y}?access_token={accessToken}', {
  maxZoom: 18,
  id: 'itsdanq/ckslkmiid8q7j17ocziio7t46', // this is the style I defined for my map, using Mapbox
  tileSize: 512,
  zoomOffset: -1,
  accessToken: '...' // put your access token here if you need one!
}).addTo(map);

// fetch the heatmap JSON and render the heatmap
fetch('heat.json').then(r => r.json()).then(json => {
  let heatmapLayer = new HeatmapOverlay({
    "radius": parseFloat(document.querySelector('#radius').value),
    "scaleRadius": true,
    "useLocalExtrema": true,
  });
  heatmapLayer.setData({ data: json });
  heatmapLayer.addTo(map);
});
When I was a kid of about 10, one of my favourite books was Usborne’s Spy’s Guidebook. (I also liked its sister the Detective’s Handbook, but the Spy’s
Guidebook always seemed a smidge cooler to me.)
I imagine that a younger version of me would approve of our 7-year-old’s bookshelf, too.
So I was pleased when our eldest, now 7, took an interest in the book too. This morning, for example, she came to breakfast with an encrypted message for me (along with the relevant
page in the book that contained the cipher I’d need to decode it).
Decryption efforts were hampered by sender’s inability to get her letter “Z”s the right damn way around.
Later, as we used the experience to talk about some of the easier practical attacks against this simple substitution cipher (letter frequency analysis, and known-plaintext attacks… I
haven’t gotten on to the issue of its minuscule keyspace yet!), she asked me to make a pocket version of the code card as described in the book.
A three-bit key doesn’t make a simple substitution cipher significantly safer, but it does serve as a vehicle to teach elementary cryptanalysis!
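Letter frequency analysis starts with nothing more than a tally of the ciphertext. Here's a minimal sketch of that first step in JavaScript; the function name `letterFrequencies` is made up for illustration, and it says nothing about the particular cipher in the book.

```javascript
// Sketch only: count how often each letter appears in a ciphertext and list
// the letters most-common first (ties broken alphabetically). In English
// plaintext, letters like E, T and A usually dominate, so the most common
// ciphertext letters are good first guesses for those.
function letterFrequencies(ciphertext) {
  const counts = {};
  for (const ch of ciphertext.toUpperCase()) {
    if (ch >= 'A' && ch <= 'Z') counts[ch] = (counts[ch] || 0) + 1;
  }
  return Object.entries(counts)
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]));
}
```

With a message of any reasonable length, comparing the top few entries against typical English letter frequencies is often enough to start unpicking a simple substitution.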
While I was eating leftover curry for lunch with one hand and producing a nice printable, foldable pocket card for her (which you can download here if you like) with the other, I realised something. There are likely to be a lot more messages in my
future that are protected by this substitution cipher, so I might as well preempt them by implementing a computerised encoder/decoder right away.
If you’ve got kids of the right kind of age, I highly recommend picking up a copy of the Spy’s Guidebook (and possibly the Detective’s Handbook). Either use it as a
vehicle to talk about codes and maths, like I have… or let them believe it’s secure while you know you can break it, like we did with Enigma machines after WWII. Either way, they eventually learn a valuable lesson about cryptography.
Wix, the website builder company you may remember from stealing WordPress code and lying about it, has now decided the best way
to gain relevance is attacking the open source WordPress community in a bizarre
set of ads. They can’t even come up with original concepts for attack ads, and have tried to rip-off of Apple’s Mac vs PC ads, but tastelessly personify the WordPress community
as an absent, drunken father in a therapy session.
I have a lot of empathy for whoever was forced to work on these ads, including the actors, it must have felt bad working on something that’s like Encyclopedia Britannica
attacking Wikipedia. WordPress is a global movement of hundreds of thousands of volunteers and community members, coming together to make the web a better place. The code,
and everything you put into it, belongs to you, and its open source license ensures that you’re in complete control, now and forever. WordPress is free, and also gives you freedom.
…
For those that haven’t been following the relevant bits of tech social media this last week, here’s the insanity you’ve missed:
Wix’s Twitter and YouTube responses suddenly swing from their usual “why is your customer service so slow to respond to me?” level of negative to outright hostile. LOL.
Sure, I’m not the target audience. I’ve been a WordPress user for 15 years, and every time I visit a Wix site it annoys me when I have to permit a stack of third-party JavaScript just
to load images like they’ve never heard of the <img> tag or something. Hell, I like WordPress enough that I used it as a
vehicle to get a job with Automattic, a company most-famous for its WordPress hosting provision. But even putting all of that aside: this
advertising campaign stinks.
Today we reinstated youtube-dl, a popular project on GitHub, after we received additional information about the project that enabled us to reverse a Digital
Millennium Copyright Act (DMCA) takedown.
…
This is a Big Deal. For two reasons:
Firstly, youtube-dl is a spectacularly useful project. I’ve used it for many years to help me archive my own content, to improve my access to content that’s freely
available on the platform, and to help centralise (freely available) metadata to keep my subscriptions on video-sharing sites. Others have even more-important uses for the tool. I love youtube-dl, and I’d never considered the possibility
that it could be used to circumvent digital restrictions (apparently it’s got some kind of geofence-evading features you can optionally enable, for people who don’t have a
multi-endpoint VPN I guess?… I note that it definitely doesn’t break DRM…) until its GitHub repo got taken down the other week.
Which was a bleeding stupid thing to use a DMCA request on, because, y’know: Barbra Streisand Effect. Lampshading that a free, open-source tool could be used for people’s convenience is likely to
increase awareness and adoption, not decrease it! Huge thanks to the EFF for
stepping up and telling GitHub that they’d got it wrong (this letter is
great reading, by the way).
But secondly, GitHub’s response is admirable and – assuming they honour their new stance –
effective. They acknowledge their mistake, then go on to set out a new process by which they’ll review takedown requests. That new process includes technical and legal review, erring on
the side of the developer rather than the claimant (i.e. “innocent until proven guilty”), multiparty negotiation, and limiting the scope of takedowns by allowing violators to export
their non-infringing content after the fact.
I was concerned that the youtube-dl takedown might create a FOSS “chilling effect” on GitHub. It still
might: in the light of it, I for one have started backing up my repositories and those of projects I care about to a different Git server! But with this response, I’d still be
confident hosting the main copy of an open-source project on GitHub, even if that project was one which was at risk of being mistaken for copyright violation.
Note that the original claim came not from Google/YouTube as you might have expected (if you’ve just tuned in) but from the RIAA, based on the fact that
youtube-dl could be used to download copyrighted music videos for enjoyment offline. If you’re reminded of Sony v. Universal City Studios (1984) – the case behind the “Betamax standard” – you’re not
alone.
I scratched an itch of mine this week and wanted to share the results with you, in case you happen to be one of the few dozen other people on Earth who will cry “finally!” to discover
that this is now a thing.
In the top right corner of this email, you can see that it was sent with end-to-end encryption from another ProtonMail user.
I’ve used ProtonMail as my primary personal email provider for about four years, and I love it. Seamless PGP/GPG for proper end-to-end encryption, privacy as standard, etc. At first, I used their web and mobile app interfaces but over time
I’ve come to rediscover my love affair with “proper” email clients, and I’ve been mostly using Thunderbird for my desktop mail. It’s been
great: lightning-fast search, offline capabilities, and thanks to IMAP (provided by ProtonMail Bridge) my mail’s still just as accessible when I fall back on the web or mobile clients because I’m out and about.
But the one thing this set-up lacked was the ability to easily see which emails had been delivered encrypted versus those which had merely been delivered “in the clear” (like
most emails) and then encrypted for storage on ProtonMail’s servers. So I fixed it.
There are fundamentally four states a Thunderbird+ProtonMail Bridge email can be in, and here’s how I represent them.
I’ve just released my first ever Thunderbird plugin. If you’re using
ProtonMail Bridge, it adds a notification to the corner of every email to say whether it was encrypted in transit or not. That’s all.
And of course it’s open source with a permissive license (and a doddle to compile using your
standard operating system tools, if you want to build it yourself). If you’re using Thunderbird and ProtonMail Bridge you should give it a whirl. And if you’re not then… maybe you
should consider it?
This last month or so, my digital life has been dramatically improved by Syncthing. So much so that I want to tell you about it.
1.25TiB of data is automatically kept in sync between (depending on the data in question) a desktop PC, NAS, media centre, and phone. This computer’s using the Synctrayzor system tray app.
I started using it last month. Basically, what it does is keep a pair of directories on remote systems “in sync” with one another. So far, it’s like your favourite cloud
storage service, albeit self-hosted and much-more customisable. But it’s got a handful of killer features that make it nothing short of a dream to work with:
The unique identifier for a computer can be derived from its public key. Encryption comes free as part of the verification of a computer’s identity.
You can share any number of folders with any number of other computers, point-to-point or via an intermediate proxy, and it “just works”.
It’s super transparent: you can always see what it’s up to, you can tweak the configuration to match your priorities, and it’s open source so you can look at the engine if you like.
Here are some of the ways I’m using it:
Keeping my phone camera synced to my PC
I’ve tried a lot of different solutions for this over the years. Back in the way-back-when, like everybody else in those dark times, I used to plug my phone in using a cable to copy
pictures off and sort them. Since then, I’ve tried cloud solutions from Google, Amazon, and Flickr and never found any that really “worked” for me. Their web interfaces and apps tend to
be equally terrible for organising or downloading files, and I’m rarely able to simply drag-and-drop images from them into a blog post like I can from Explorer/Finder/etc.
At first, I set this up as a one-way sync, “pushing” photos and videos from my phone to my desktop PC whenever I was on an unmetered WiFi network. But then I switched it to a two-way
sync, enabling me to more-easily tidy up my phone of old photos too, by just dragging them from the folder that’s synced with my phone to my regular picture storage.
Centralising my backups
Now I’ve got a fancy NAS device with tonnes of storage, it makes sense to use it as a central
point for backups to run from. Instead of having many separate backup processes running on different computers, I can just have each of them sync to the NAS, and the NAS can back everything up. Computers don’t need to be “on” at a particular
time because the NAS runs all the time, so backups can use the Internet connection when it’s quietest. And in the event of a
hardware failure, there’s an up-to-date on-site backup in the first instance: the cloud backup’s only needed in the event of accidental data deletion (which could be sync’ed already, of
course!). Plus, integrating the sync with ownCloud running on the NAS gives easy access to
my files wherever in the world I am without having to fire up a VPN or otherwise remote-in to my house.
Plus: because Syncthing can share a folder between any number of devices, the same sharing mechanism that puts my phone’s photos onto my main desktop can simultaneously be
pushing them to the NAS, providing redundant connections. And it was a doddle to set up.
Maintaining my media centre’s screensaver
Since the NAS, running Jellyfin, took on most of the media management jobs previously
shared between desktop computers and the media centre computer, the household media centre’s had less to do. But one thing that it does, and that gets neglected, is showing a
screensaver of family photos (when it’s not being used for anything else). Historically, we’ve maintained the photos in that collection via a shared network folder, but then you’ve got
credential management and firewall issues to deal with, not to mention different file naming conventions by different people (and their devices).
But simply sharing the screensaver’s photo folder with the computer of anybody who wants to contribute photos means that it’s as easy as copying the picture to a particular place. It
works on whatever device they care to (computer, tablet, mobile) on any operating system, and it’s quick and seamless. I’m just using it myself, for now, but I’ll be offering it to the
rest of the family soon. It’s a trivial use-case, but once you’ve got it installed it just makes sense.
In short: this month, I’m in love with Syncthing. And maybe you should be, too.
I use the Post Kinds plugin to streamline the management of the different types of posts I make on my blog, based on the
IndieWeb post types list: articles, like this one, are “conventional” blog posts, but I also publish
notes (which are analogous to “tweets”), reposts (“shares” of things I’ve found online, sometimes with commentary), checkins (mostly chronicling my geocaching/geohashing), and others: I’ve extended Post Kinds to facilitate comics and
reviews, for example.
But for people who subscribe (either directly or indirectly) to everything I post, I imagine it must be a little frustrating to sometimes be
unable to identify the type of a post before clicking-through. So I’ve added the following code, which I’m sharing here and on GitHub in case it’s of any use to anybody else, to my theme’s functions.php:
// Make titles in RSS feed be prefixed by the Kind of the post.
function add_kind_to_rss_post_title() {
  $kinds = wp_get_post_terms( get_the_ID(), 'kind' );
  if ( ! isset( $kinds ) || empty( $kinds ) ) return get_the_title(); // sanity-check.
  $kind = $kinds[0]->name;
  $title = get_the_title();
  return trim( "[{$kind}] {$title}" );
}
add_filter( 'the_title_rss', 'add_kind_to_rss_post_title', 4 ); // priority 4 to ensure it happens BEFORE default escaping filters.
This decorates the titles of my posts, but only in my feeds, so it’s easier for people to tell at-a-glance what’s going on:
Down the line I might expand this so that it doesn’t show if the subscriber is, for example, asking only for articles (e.g. via this
feed); I’m coming up with a huge list of things I’d like to do at IndieWebCamp London! But for now, this feels like a nice simple
improvement to a plugin I love that helps it to fit my specific needs.
Cute open source project that produces on-demand SVG and PNG maps,
like the one above, based on the roads in OpenStreetMap data. It takes a somewhat liberal view of what a “road” is: I found it momentarily
challenging to get my bearings in the map above, which includes where I live, because the towpath and cycle paths are included which I hadn’t expected. Still a beautiful bit of output
and the source code could be adapted for any number of interesting cartographic projects.
A long while ago, inspired by Nick Berry’s analysis of optimal Hangman strategy, I worked it backwards to find the
hardest words to guess when playing Hangman. This week, I showed these to my colleague Grace – who turns out to be a fan of word puzzles – and our conversation inspired me to go a little deeper. Is it possible, I
thought, for me to make a Hangman game that cheats by changing the word it’s thinking of based on the guesses you make in order to make it as difficult as possible for you to
win?
The principle is this: every time the player picks a letter, but before declaring whether or not it’s found in the word –
Make a list of all possible words that would fit into the boxes from the current game state.
If there are lots of them, still, that’s fine: let the player’s guess go ahead.
But if the player’s managing to narrow down the possibilities, attempt to change the word that they’re trying to guess! The new word must be:
Legitimate: it must still be the same length, have correctly-guessed letters in the same places, and contain no letters that have been declared to be incorrect
guesses.
Harder: after resolving the player’s current guess, the number of possible words must be larger than the number of possible words that would have
resulted otherwise.
Yeah, you’re screwed now.
You might think that this strategy would just involve changing the target word so that you can say “nope” to the player’s current guess. That happens a lot, but it’s not always the
case: sometimes, it’ll mean changing to a different word in which the guessed letter also appears. Occasionally, it can even involve changing from a word in which the guessed
letter didn’t appear to one in which it does: that is, giving the player a “freebie”. This may seem counterintuitive as a strategy, but it sometimes makes sense: if
saying “yeah, there’s an E at the end” increases the number of possible words that it might be, compared to saying “no, there are no Es”, then this is the right move for a
cheating hangman.
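One way to picture that decision is to sort the remaining candidate words into "families" according to what each would reveal for the guessed letter, then keep the biggest family. The sketch below is a simplified illustration of that idea rather than the game's actual code: the `cheat` function is hypothetical, and it assumes `candidates` has already been filtered down to words that are legitimate for the current game state.

```javascript
// Sketch only: group candidate words by the pattern that revealing `letter`
// would produce, then keep whichever group leaves the most words possible.
function cheat(candidates, letter) {
  const families = new Map();
  for (const word of candidates) {
    // e.g. for letter "e": "geese" -> ".ee.e", "lunch" -> "....."
    const pattern = [...word].map(c => (c === letter ? letter : '.')).join('');
    if (!families.has(pattern)) families.set(pattern, []);
    families.get(pattern).push(word);
  }
  // Pick the largest family; ties go to whichever pattern was seen first.
  let best = null;
  for (const [pattern, words] of families) {
    if (best === null || words.length > best.words.length) best = { pattern, words };
  }
  return best; // null if there were no candidates at all
}
```

For example, with the candidates cake, care, cave, cart and cost and a guess of "e", the largest family is the one ending in E (cake, care, cave), so the "freebie" really is the hangman's best answer there. Take those three words away and the biggest family becomes the one with no E in it at all.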
Playing against a cheating hangman lends itself to devising new strategies as a player, too, although I haven’t yet looked deeply into this. But logically, it seems that the
optimal strategy against a cheating hangman might involve making guesses that force the hangman to bisect the search space: knowing that they’re always going to adapt towards the
largest set of candidate words, a perfect player might be able to make guesses to narrow down the possibilities as fast as possible, early on, only making guesses that they actually
expect to be in the word later (before their guess limit runs out!).
The game is brutally-difficult, but surprisingly fun, and you can have it tell you when and how it cheats so you can begin to understand its strategy.
I also find myself wondering how easily I could adapt this into a “helpful hangman”: a game which would always change the word that you’re trying to guess in order to try to make you
win. This raises the possibility of a whole new game, “suicide hangman”, in which the player is trying to get themselves killed and so is trying to pick letters that can’t
possibly be in the word and the hangman is trying to pick words in which those letters can be found, except where doing so makes it obvious which letters the player must avoid next.
Maybe another day.
In the meantime, you’re welcome to go play the game (and let me know what you think, below!) and, if you’re of such an inclination, read the source code. I’ve used some seriously ugly techniques to make this work, including regular expression metaprogramming (using
regular expressions to write regular expressions), but the code should broadly make sense if you want to adapt it. Have fun!
Update 26 September 2019, 16:23: I’ve now added “helpful mode”, where the computer tries to cheat on your behalf
rather than against you, but it’s not as helpful as you’d think because it assumes you’re playing optimally and have already memorised the dictionary!