Ireland and the UK Aren’t In The Same Timezone!

This weekend, while investigating a bug in some code that generates iCalendar (ICS) feeds, I learned about a weird quirk in the Republic of Ireland’s timezone. It’s such a strange thing (and has so little impact on everyday life) that I imagine that even most Irish people don’t even know about it, but it’s important enough that it can easily introduce bugs into the way that computer calendars communicate:

Most of Europe put their clocks forward in Summer, but the Republic of Ireland instead put their clocks backward in Winter.

If that sounds to you like the same thing said two different ways – or the set-up to a joke! – read on:

Map showing timezones of Europe. The UK and Ireland are grouped (along with Iceland) in a zone labelled as being UTC+0.
The timezones of Europe look pretty simple compared to some parts of the world, but the illustration of the British Isles hides an interesting eccentricity.

A Brief History of Time (in Ireland)

Poster titled "Time (Ireland) Act 1916", advising that "On and after Sunday 1st October 1916 Western European Time will be ovserved throughout Ireland" asking people to set their clocks and watches back 35 minutes.
Spring forward, fall back… just a little bit back, though. Not too much.

After high-speed (rail) travel made mean solar timekeeping problematic, Great Britain in 1880 standardised on Greenwich Mean Time (UTC+0) as the time throughout the island, and Ireland standardised on Dublin Mean Time (UTC-00:25:21). If you took a ferry from Liverpool to Dublin towards the end of the 19th century you’d have to put your watch back by about 25 minutes. With air travel not yet being a thing, countries didn’t yet feel the need to fixate on nice round offsets in the region of one-hour (today, only a handful of regions retain UTC-offsets of half or quarter hours).

That’s all fine in peacetime, but by the First World War and especially following the Easter Rising, the British government decided that it was getting too tricky for their telegraph operators (many of whom operated out of Ireland, which provided an important junction for transatlantic traffic) to be on a different time to London.

1885 GPO telegraph instrument from the Porthcurno Telegraph Museum, which Dan almost visited the other week but it was closed.
It’s widely believed that the world’s first “U UP? [STOP]” message never got a response as a direct result of Anglo-Irish timezone confusion.
So the Time (Ireland) Act 1916 was passed, putting Ireland on Greenwich Mean Time. Ireland put her clocks back by 35 minutes and synched-up with the rest of the British Isles. And from then on, everything was simple and because nothing ever went wrong in Ireland as a result of the way it was governed by by Britain, nobody ever had to think about the question of timezones on the island again.

Ah. Hmm.

December 1920 photograph showing St Patrick's Street, Cork, following the burning of the city by British forces.
“Those Irish people want to govern their own country, do they? After we so kindly shared our king with them? Right-ho: let’s set fire to their cities and see how they feel then.”

Following Irish independence, the keeping of time carried on in much the same way for a long while, which will doubtless have been convenient for families spread across the Northern Irish border. But then came the Second World War.

Summers in the 1940s saw Churchill introduce Double Summer Time which he believed would give the UK more daylight, saving energy that might otherwise be used for lighting and increasing production of war materiel.

Ireland considered using the emergency powers they’d put in place to do the same, as a fuel saving measure… but ultimately didn’t. This was possibly because aligning her time with Britain might be seen as undermining her neutrality, but was more likely because the government saw that such a measure wouldn’t actually have much impact on fuel use (it certainly didn’t in Britain). Whatever the reason, though, Britain and Northern Ireland were again out-of-sync with one another until the war ended.

Newspaper clipping advising that "Double Summer Time comes to an end on Saturday night, August 8-9, when all clocks and watches should be put back one hour, thus reverting to British Summer Time, which will probably be maintained throughout the winter."
I like to imagine that the development of powerful computers by the folks at Bletchley Park was a result of needing to keep track of timezones across the British Isles.

From 1968 to 1971 Britain experimented with “British Standard Time” – putting the clocks forward in Summer once, to UTC+1, and then leaving them there for three years. This worked pretty well except if you were Scottish in which case you’ll have found winter mornings to be even gloomier than you were used to, which was already pretty gloomy. Conveniently: during much of this period Ireland was also on UTC+1, but in their case it was part of a different experiment. Ireland were working on joining the European Economic Community, and aligning themselves with “Paris time” year-round was an unnecessary concession but an interesting idea.

But here’s where the quirk appears: the Standard Time Act 1968, which made UTC+1 the “standard” timezone for the Republic of Ireland, was not repealed and is still in effect. Ireland could have started over in 1971 with a new rule that made UTC+0 the standard and added a “Summer Time” alternative during which the clocks are put forward… but instead the Standard Time (Amendment) Act 1971 left UTC+1 as Ireland’s standard timezone and added a “Winter Time” alternative during which the clocks are put back.

Two clocks, both showing the same time. One has a sign reading "LONDON", the other "DUBLIN, I GUESS?"
It all seems so simple until you actually think about it.

(For a deeper look at the legal history of time in the UK and Ireland, see this timeline. Certainly don’t get all your history lessons from me.)

So what?

You might rightly be thinking: so what! Having a standard time of UTC+0 and going forward for the Summer (like the UK), is functionally-equivalent to having a standard time of UTC+1 and going backwards in the Winter, like Ireland, right? It’s certainly true that, at any given moment, a clock in London and a clock in Dublin should show the same time. So why would anybody care?

Perl Data::ICal::TimeZone implementation of Dublin timezone, incorrectly showing summer DST at +1 rather than winter DST of -1.
This code for Europe/Dublin, from the Perl module Data::ICal::TimeZone, is technically-incorrect because it states that the winter time is the standard and daylight savings of +1 hour apply in the summer, rather than the opposite.

But declaring which is “standard” is important when you’re dealing with computers. If, for example, you run a volunteer rota management system that supports a helpline charity that has branches in both the UK and Ireland, then it might really matter that the computer systems involved know what each other mean when they talk about specific times.

The author of an iCalendar file can choose to embed timezone information to explain what, in that file, a particular timezone means. That timezone information might say, for example, “When I say ‘Europe/Dublin’, I mean UTC+1, or UTC+0 in the winter.” Or it might say – like the code above! – “When I say ‘Europe/Dublin’, I mean UTC+0, or UTC+1 in the summer.” Both of these declarations would be technically-valid and could be made to work, although only the first one would be strictly correct in accordance with the law.

Stressed programmer hunched over a MacBook. Photo by Anna Shvets from Pexels.
Clients who need solid timezone support represent 50% of a programmer’s production of stress hormones. See also Falsehoods Programmers Believe About Time.

But if you don’t include timezone information in your iCalendar file, you’re relying  on the feed subscriber’s computer (e.g. their calendar software) to make a sensible interpretation.. And that’s where you run into trouble. Because in cases like Ireland, for which the standard is one thing but is commonly-understood to be something different, there’s a real risk that the way your system interprets and encodes time won’t necessarily be the same as the way somebody else’s does.

If I say I’ll meet you at 12:00 on 1 January, in Ireland, you rightly need to know whether I’m talking about 12:00 in Irish “standard” time (i.e. 11:00, because daylight savings are in effect) or 12:00 in local-time-at-the-time-of-the-meeting (i.e. 12:00). Humans usually mean the latter because we think in terms of local time, but when your international computer system needs to make sure that people are on a shift at the same time, but in different timezones, it needs to be very clear what exactly it means!

And when your daylight savings works “backwards” compared to everybody else’s… that’s sure to make a developer somewhere cry. And, possibly, blog about your weird legislation.

× × × × × × × ×

Big List of Naughty Strings

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

# Reserved Strings
#
# Strings which may be used elsewhere in code
undefined
undef
null
NULL

then
constructor
\
\\

# Numeric Strings
#
# Strings which can be interpreted as numeric
0
1
1.00
$1.00
1/2
1E2

Max Woolf

Max has produced a list of “naughty strings”: things you might try injecting into your systems along with any fuzz testing you’re doing to check for common errors in escaping, processing, casting, interpreting, parsing, etc. The copy above is heavily truncated: the list is long!

It’s got a lot of the things in it that you’d expect to find: reserved keywords and filenames, unusual or invalid unicode codepoints, tests for the Scunthorpe Problem, and so on. But perhaps my favourite entry is this one, a test for “human injection”:

# Human injection
#
# Strings which may cause human to reinterpret worldview
If you're reading this, you've been in a coma for almost 20 years now. We're trying a new technique. We don't know where this message will end up in your dream, but we hope it works. Please wake up, we miss you.

Beautiful.

OpenAI-powered Linux shell uses AI to Do What You Mean

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

It’s like Alexa/Siri/Cortana for your terminal!

This is a basic Python shell (really, it’s a fancy wrapper over the system shell) that takes a task and asks OpenAI for what Linux bash command to run based on your description. For safety reasons, you can look at the command and cancel before actually running it.

Of all the stupid uses of OpenAI’s GPT-3, this might be the most-amusing. It’s really interesting to see how close – sometimes spot-on – the algorithm comes to writing the right command when you “say what you mean”. Also, how terribly, terribly ill-advised it would be to actually use this for real.

The most important feature of Sublime Text

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

The most important feature of Sublime Text is that it doesn’t change. In the modern world, everything changes at a crazy pace. We get new OSes and new phones every year, Google opens and closes its products monthly, many physical devices get announced, produced, and disappear in an interval shorter than the Sublime Text release cycle. I have two problems with that.

I love Sublime Text. It was the editor for which I finally broke my long, long emacs habit (another editor that “doesn’t change”). Like emacs, Sublime is simple but powerful. Unlike Atom, it doesn’t eat all the RAM in the universe. And unlike VS Code, I can rely on it being fundamentally the same today, tomorrow, and next year.

Break Into . Us (lock puzzle game)

I’ve made a puzzle game about breaking open padlocks. If you just want to play the game, go play the game. Or read on for the how-and-why of its creation.

About three months ago, my friend Claire, in a WhatsApp group we both frequent, shared a brainteaser:

WhatsApp message from Claire, challenging people to solve her puzzle.
Was this way back at the beginning of April? Thank heavens for WhatsApp scrollback.

The puzzle was to be interpreted as follows: you have a three-digit combination lock with numbers 0-9; so 1,000 possible combinations in total. Bulls and Cows-style, a series of clues indicate how “close” each of several pre-established “guesses” are. In “bulls and cows” nomenclature, a “bull” is a correctly-guessed digit in the correct location and a “cow” is a correctly-guessed digit in the wrong location, so the puzzle’s clues are:

  • 682 – one bull
  • 614 – one cow
  • 206 – two cows
  • 738 – no bulls, no cows
  • 380 – one cow
"Can you open the lock using these clues?" puzzle
Feel free to stop scrolling at this point and solve it for yourself. Or carry on; there are no spoilers in this post.

By the time I’d solved her puzzle the conventional way I was already interested in the possibility of implementing a general-case computerised solver for this kind of puzzle, so I did. My solver uses a simple “brute force” technique, as follows:

  1. Put all possible combinations into a search space.
  2. For each clue, remove from the search space all invalid combinations.
  3. Whatever combination is left is the correct answer.
Animation showing how the first three clues alone are sufficient to derive a unique answer from the search space of Claire's puzzle.
The first three clues of Claire’s puzzle are sufficient alone to reduce the search space to a single answer, although a human is likely to need more.

Visualising the solver as a series of bisections of a search space got me thinking about something else: wouldn’t this be a perfectly reasonable way to programatically generate puzzles of this type, too? Something like this:

  1. Put all possible combinations into a search space.
  2. Randomly generate a clue such that the search space is bisected (within given parameters to ensure that neither too many nor too few clues are needed)
  3. Repeat until only one combination is left

Interestingly, this approach is almost the opposite of what a human would probably do. A human, tasked with creating a puzzle of this sort, would probably choose the answer first and then come up with clues that describe it. Instead, though, my solution would come up with clues, apply them, and then see what’s left-over at the end.

Sample output of the puzzle generator for an alphabet of 0-9 and a combination length of 3.
Sometimes it comes up with inelegant or unchallenging suggestions, but for the most part my generator produces adequate puzzles.

I expanded my generator to go beyond simple bulls-or-cows clues: it’s also capable of generating clues that make reference to the balance of odd and even digits (in a numeric lock), the number of different digits used in the combination, the sum of the digits of the combination, and whether or not the correct combination “ascends” or “descends”. I’ve ideas for other possible clue types too, which could be valuable to make even tougher combination locks: e.g. specifying how many numbers in the combination are adjacent to a consecutive number, specifying the types of number that the sum of the digits adds to (e.g. “the sum of the digits is a prime number”) and so on.

A single solution in a search space derived in multiple ways.
Like the original puzzle, puzzles produced by my generator might have redundancies. In the picture above, the black square can be defined by the light blue, dark blue, and green bisections only: the yellow bisection is rendered redundant by the light blue one. I’ve left this as a deliberate feature.

Next up, I wanted to make a based interface so that people could have a go at the puzzles in their web browser, track their progress through the levels, get a “score” based on the number and difficulty of the locks that they’d cracked (so they can compare it to their friends), and save their progress to carry on next time.

I implemented in pure vanilla HTML, CSS, SVG and JS, with no dependencies. Compressed, it delivers to your browser and is ready-to-play in a little under 10kB, most of which is the puzzles themselves (which are pregenerated and stored in a JSON file). Naturally, it lends itself well to running offline, so it’s PWA-enhanced with a service worker so it can be “installed” onto your device, too, and it’ll check for bonus puzzles and other updates periodically.

The original puzzle shown via BreakInto.Us.
Naturally, the original puzzle appears in the web-based game, too.

Honestly, the hardest bit of implementing the frontend was the “spinnable” digits: depending on your browser, these are an endless-scrolling <ul> implemented mostly in CSS and with snap points set, and then some JS to work out “what you meant” based on where you span to. Which feels like the right way to implement such a thing, but was a lot more work than putting together my own control, not least because of browser inconsistencies in the implementation of snap points.

Anyway: you should go and play the game, now, and let me know what you think. Is it worth expanding and improving? Should I leave it as it is? I’m open to ideas (and if you don’t like that I’m not implementing your suggestions, you can always fork a copy of the code and change it yourself)!

Or if you’d like to see some of the other JavaScript experiments I’ve done, you might enjoy my “cheating” hangman game, my recreation of the lunar lander game I wrote in college, or rediscover that time I was ill and came up the worst conceivable tool to calculate Pi.

× × × × ×

Third-party libraries and security issues

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Earlier this week, I wrote about why you should still use vanilla JS when so many amazing third-party libraries exist.

A few folks wrote to me to mention something I missed: security.

When you use code you didn’t author, you’re taking a risk. You’re trusting that the third-party code does not have security issues, that the author has good intent.

Chris makes a very good point, especially for those developers of the npm install every-damn-thing persuasion: getting an enormous framework that you don’t completely understand just because you need  a small portion of its features is bad security practice. And the target is a juicy one: a bad actor who finds (or introduces) a vulnerability in a big and widely-used library has a whole lot of power. Security concerns are a major part of why I go vanilla/stdlib where possible.

But as always with security the answer isn’t so clear-cut and simple, and I’d argue that it’s dangerous to encourage people to write their own solutions as a matter of course, for security reasons. For a start, you should never roll your own cryptographic libraries because you’re almost certainly going to fuck it up: an undetectable and easy-to-make mistake in your crypto implementation can lead to a catastrophic cascade and completely undermine the value of your cryptography. If you’re smart enough about crypto to implement crypto properly, you should contribute towards one of the major libraries. And if you’re not smart enough about crypto (and if you’re not sure, then you’re not), you should use one of those libraries. And even then you should take care to integrate and use it properly: people have been tripped over before by badly initialised keys or the use of the wrong kind of cipher for their use-case. Crypto is hard enough that even experts fuck it up and important enough that you can’t afford to get it wrong.

The same rule applies to a much lesser extent to other parts of your application, and especially for beginner developers. Implementing an authentication/authorisation system isn’t hard, but it’s another thing where getting it wrong can have disastrous consequences. Beginner (and even intermediate) developers routinely make mistakes with this kind of feature: unhashed, reversibly-encrypted, or incorrectly-hashed (wrong algorithm, no salt, etc.) passwords, badly-thought-out password reset strategies, incompletely applied access controls, etc. I’m confident that Chris and I would be in agreement that the best approach is for a developer to learn to implement these things properly and then do so. But if having to learn to implement them properly is a barrier to getting started, I’d rather than a beginner developer instead use a tried-and-tested off-the-shelf like Devise/Warden.

Other examples of things that beginner/intermediate developers sometimes get wrong might be XSS protection and SQL parameter escaping. And again, for languages that don’t have safety features built in, a framework can fill the gap. Rolling your own DOM whitelisting code for a social application is possible, but using a solution like DOMPurify is almost-certainly going to be more-secure for most developers because, you guessed it, this is another area where it’s easy to make a mess of things.

My inclination is to adapt Chris’s advice on this issue, to instead say that for the best security:

  1. Ideally: understand what all your code does, for example because you wrote it yourself.
  2. However: if you’re not confident in your ability to implement something securely (and especially with cryptography), use an off-the-shelf library.
  3. If you use a library: use the usual rules (popularity, maintenance cycle, etc.) to filter the list, but be sure to use the library with the smallest possible footprint – the best library should (a) do only the one specific task you need done, and no more, and (b) be written in a way that lends itself to you learning from it, understanding it, and hopefully being able to maintain it yourself.

Just my tuppence worth.

The 7 Types Of StackOverflow Answers

StackOverflow‘s one of the most-popular and widely-used resources for software developers. It dominates the search results when you’re looking for answers to techy questions. If you know how to read it, it can be invaluable.

But… I’m not sure what it is about the platform or the culture surrounding it that creates a certain… pattern to the answers that you can expect to receive on StackOverflow. To illustrate, let’s suppose we have a question:

SnackOverflow question: Let's say I'm camping and I need to make toast. I have a loaf of bread and a campfire. What's the best way to make toast?

Here are the answers you might see:

The Golden Hammer

The top answer is often somebody answering not the question you asked, but the question they’d like to think you asked.

Answer: Just plug a toaster in. You can do this with: npm install toaster

Never mind that you specifically said that you were using a campfire, the answer suggests that you use a toaster. Look back a few years and you’ll see countless examples of people asking for solutions using “vanilla” JavaScript and being told to use some heavyweight, everything-but-the-kitchen sink jQuery plugin. Now we’re in a more enlightened time, those same people are being told to use some heavyweight, everything-but-the-kitchen-sink npm module. How far we’ve come.

The Belligerent

Far often than you might expect, a perfectly reasonable “how do I do this?” question is met with an aggressive response of “why would you want to do that?”

Answer: Why would you want to make toast on a campfire? When you're camping you should be eating beans, soup, and spit-roasted meats and fish. Every time I've tried to toast bread over a campfire I've ended up unsatisfied. Uneven toasting, burnt bits, even the whole slice catching fire. I can't imagine why anybody would ever want toast like that! If you want toast you should stay at home. It's still pretty pointless, though: toast isn't a very good meal. It's basically empty calories with no protein, no vitamins, no minerals. I mean, it'd be okay as a snack but that's clearly not what you're asking about. There's a reason that the Chef's Guide To Camping doesn't include a recipe for toast. Just don't do it!

These are particularly infuriating to read when you come to a closed thread and you know that you do want to be doing the “forbidden” thing. You’ve considered the other options, you’ve assessed the situation… and now some arrogant bugger’s telling you that you’re wrong!

This kind of response is among the most annoying, second only to…

The Kindred Spirit

You’re getting a strange and inexplicable error message. You search for it and get exactly one result. Reading the thread, after hours of tearing your hair out, you suddenly feel a sense of relief: you’ve found another soul in this crazy world that’s suffering in precisely the same way as you are. Every word you read reconfirms for you that you and they have the same issue. At last, a solution is in reach!

Answer: I'm having almost exactly the same issue. I've brought bagels to my campfire, though. If anybody knows how to toast either bread or bagels on a campfire please let me know how! Edit: NM, I've worked it out.Nope.

Not only have you not got a solution, but the saviour you thought you’d found? They do have a solution, but they were thinking only about themselves when they got it, so they didn’t share it.

I get it: when you’re deep in focus on a problem you forget that the forum you’re on will receive search traffic indefinitely. But “NM, I’ve worked it out” is the most infuriating sentence on the Internet. When you solve a tough problem that you’d talked about online, for the love of God put the solution online too.

The Expert

There’s always somebody who answers the question but in a way you’d need a PhD to comprehend.

Answer: What you're looking to do is increase the ratio of 6-Acetyl-2,3,4,5-tetrahydropyridine on the surface of the bread, as described by Louis-Camille Maillard. Aim to maximise the surface area exposed to heat to accelerate the reaction of the carbonyls with the nucleophilic amino acids, without increasing the temperature enough to produce significant amounts of benzopyrene nor acrylamide.

StackOverflow is often used by beginners. Make your answer beginner-friendly if possible.

The Hero We Don’t Need

Like the Golden Hammer, the Hero We Don’t Need answers the question that they know the answer to rather than the question you actually asked. Unlike the Golden Hammer, the question they answer isn’t even remotely related to the question you asked.

Answer: Place the loaf down on a broad flat surface. Use a serrated blade in a moderately-rapid back-and-forth motion to cut through it. Now the bread will be sliced and ready to use. Don't cut any more than you need at once: sliced bread goes stale much faster.

Perhaps some future site visitor who chose their search terms badly might benefit from this out-of-the-box look at a completely different problem. But I wouldn’t count on it.

The Correct Answer

Eventually, if you’re lucky, somebody will provide the actual answer to the question. You’ll often have to scroll about this far down the page to find it.

Answer: There are two approaches. Both are equally valid - choose the one that's right for you. Method #1: place flat rocks near to your campfire and allow them to heat up. Slice your bread, and lay each slice on a hot rock, being careful not to touch the rock. Turn it over when it's done on one side. Method #2: use a long fork, skewer, or stick to impale a slice of bread lengthways (here's a diagram) and suspend it over the fire either by holding the utensil or by poking the other end into the ground. If holding it, be sure to keep your hand lower than the bread as heat will travel up metal implements. Happy camping!Still, at least there’s an answer. And it only took four hours between posting the question and it appearing. Sometimes that’s what it takes, and at least the answer will be there for the next person, assuming that they, too, scroll down far enough.

Unfortunately hundreds of novice developers will have no way to tell that this alone is the correct answer amongst the endless stream of bullshit in which it resides.

The Echo

And finally, there’s always some idiot who repeats one of the same (useless) answers from before. Just to keep the noise-to-signal ratio up, I guess.

Answer: Just install toaster from NPM. Comment 1: @KISS DRY already said this. This is the correct answer. Comment 2: Can toaster slice bread, too?

StackOverflow’s given me so many useful answers to so many questions, over the years. But it’s also been a great source of frustration for me at the hands of six of these seven archetypes. Did I miss any?

× × × × × × × ×

An app can be a home-cooked meal

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Have you heard about this new app called BoopSnoop?

It launched in the first week of 2020, and almost immediately, it was downloaded by four people in three different time zones. In the months since, it has remained steady at four daily active users, with zero churn: a resounding success, exceeding every one of its creator’s expectations.

:)

I made a messaging app for, and with, my family. It is ruthlessly simple; we love it; no one else will ever use it. I wanted to jot down some notes about how and why I made it, both to (a) offer a nudge to anyone else out there considering a similar project and (b) suggest something a little larger about software.

Robin Sloan (yes, this one) talks about an app that he wrote exclusively for his family. He likens the experience to a making a home-cooked meal. And I totally get it.

I do this kind of thing all the time. Our new home NAS device, Fox, performs a handful of functions (and I plan to expand it to many more) based on a mixture of open-source and homegrown code, just for my immediate family. Our “family wiki” does the same thing. And the spreadsheet we use for our finances. I’ve written apps for small groups of friends before, too (e.g. 1, 2, 3, 4, 5, 6, 7, 8…). And that’s not to mention the countless “meals for one” I’ve cooked: small applications written entirely for my own benefit – I’m using one right now to pull this article from the list of “things I’ve read and enjoyed recently” into my blog.

A home-cooked meal benefits from being tailored to its audience (if the recipe calls for mustard, I might use less or omit it because it makes my nose feel funny). It benefits from being tailored to its purpose. And it benefits from the love that goes into it. My only superstition – that I’m aware of – is that I believe that food tastes better if the chef smiled during its production… I’m beginning to think that the same might be true for software, too.

First among the reasons I think that learning the basics of programming should be in the school curriculum is that it teaches people how computers work and so, by proxy, what they are (and are not) capable of. The most digitally-literate non-programmers I know are people who have the strongest understanding about how and why computers do what they do. But a close second among my reasons is that those with an inclination can go a step further and, without even necessarily pushing their skills to a level at which they could or would want to work as software developers, build their own tools to “scratch their own itches”. Solving a problem for yourself is enormously empowering, and the versatility of software lends itself to solving a huge array of relatively-tiny problems: problems that affect individuals, families, or small communities but that aren’t big enough to warrant commercial attention.

(Sometimes these projects explode into something bigger, but usually they remain just as they are: a tool for the benefit of oneself and one’s immediate tribe. And that’s just great.)

Identifying Post Kinds in WordPress RSS Feeds

I use the Post Kinds plugin to streamline the management of the different types of posts I make on my blog, based on the IndieWeb post types list: articles, like this one, are “conventional” blog posts, but I also publish notes (which are analogous to “tweets”), reposts (“shares” of things I’ve found online, sometimes with commentary), checkins (mostly chronicling my geocaching/geohashing), and others: I’ve extended Post Kinds to facilitate comics and reviews, for example.

But for people who subscribe (either directly or indirectly) to everything I post, I imagine it must be a little frustrating to sometimes be unable to identify the type of a post before clicking-through. So I’ve added the following code, which I’m sharing here and on GitHub in case it’s of any use to anybody else, to my theme’s functions.php:

// Make titles in RSS feed be prefixed by the Kind of the post.
function add_kind_to_rss_post_title(){
        $kinds = wp_get_post_terms( get_the_ID(), 'kind' );
        if( ! isset( $kinds ) || empty( $kinds ) ) return get_the_title(); // sanity-check.
        $kind = $kinds[0]->name;
        $title = get_the_title();
        return trim( "[{$kind}] {$title}" );
}
add_filter( 'the_title_rss', 'add_kind_to_rss_post_title', 4 ); // priority 4 to ensure it happens BEFORE default escaping filters.

This decorates the titles of my posts, but only in my feeds, so it’s easier for people to tell at-a-glance what’s going on:

Rendered RSS feed showing Post Kinds prefixes

Down the line I might expand this so that it doesn’t show if the subscriber is, for example, asking only for articles (e.g. via this feed); I’m coming up with a huge list of things I’d like to do at IndieWebCamp London! But for now, this feels like a nice simple improvement to a plugin I love that helps it to fit my specific needs.

×

Avoid rewriting a legacy system from scratch, by strangling it

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Sometimes, code is risky to change and expensive to refactor.

In such a situation, a seemingly good idea would be to rewrite it.

From scratch.

Here’s how it goes:

  1. You discuss with management about the strategy of stopping new features for some time, while you rewrite the existing app.
  2. You estimate the rewrite will take 6 months to cover what the existing app does.
  3. A few months in, a nasty bug is discovered and ABSOLUTELY needs to be fixed in the old code. So you patch the old code and the new one too.
  4. A few months later, a new feature has been sold to the client. It HAS TO BE implemented in the old code—the new version is not ready yet! You need to go back to the old code but also add a TODO to implement this in the new version.
  5. After 5 months, you realize the project will be late. The old app was doing way more things than expected. You start hustling more.
  6. After 7 months, you start testing the new version. QA raises up a lot of things that should be fixed.
  7. After 9 months, the business can’t stand “not developing features” anymore. Leadership is not happy with the situation, you are tired. You start making changes to the old, painful code while trying to keep up with the rewrite.
  8. Eventually, you end up with the 2 systems in production. The long-term goal is to get rid of the old one, but the new one is not ready yet. Every feature needs to be implemented twice.

Sounds fictional? Or familiar?

Don’t be shamed, it’s a very common mistake.

I’ve rewritten legacy systems from scratch before. Sometimes it’s all worked out, and sometimes it hasn’t, but either way: it’s always been a lot more work than I could have possibly estimated. I’ve learned now to try to avoid doing so: at least, to avoid replacing a single monolithic (living) system in a monolithic way. Nicholas gives an even-better description of the true horror of legacy reimplementation, and promotes progressive strangulation as a candidate solution.

CSS4 is here!

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

I think that CSS would be greatly helped if we solemnly state that “CSS4 is here!” In this post I’ll try to convince you of my viewpoint.

I am proposing that we web developers, supported by the W3C CSS WG, start saying “CSS4 is here!” and excitedly chatter about how it will hit the market any moment now and transform the practice of CSS.

Of course “CSS4” has no technical meaning whatsoever. All current CSS specifications have their own specific versions ranging from 1 to 4, but CSS as a whole does not have a version, and it doesn’t need one, either.

Regardless of what we say or do, CSS 4 will not hit the market and will not transform anything. It also does not describe any technical reality.

Then why do it? For the marketing effect.

Hurrah! CSS4 is here!

I’m sure that, like me, you’re excited to start using the latest CSS technologies, like paged media, hyphen control, the zero-specificity :where() selector, and new accessibility selectors like the ‘prefers-reduced-motion’ @media query. The browser support might not be “there” yet, but so long as you’ve got a suitable commitment to progressive enhancement then you can be using all of these and many more today (I am!). Fantastic!

But if you’ve got more than a little web savvy you might still be surprised to hear me say that CSS4 is here, or even that it’s a “thing” at all. Welll… that’s because it isn’t. Not officially. Just like JavaScript’s versioning has gone all evergreen these last few years, CSS has gone the same way, with different “modules” each making their way through the standards and implementation processes independently. Which is great, in general, by the way – we’re seeing faster development of long-overdue features now than we have through most of the Web’s history – but it does make it hard to keep track of what’s “current” unless you follow along watching very closely. Who’s got time for that?

When CSS2 gained prominence at around the turn of the millennium it was revolutionary, and part of the reason for that – aside from the fact that it gave us all some features we’d wanted for a long time – was that it gave us a term to rally behind. This browser already supports it, that browser’s getting there, this other browser supports it but has a f**ked-up box model (you all know the one I’m talking about)… we at last had an overarching term to discuss what was supported, what was new, what was ready for people to write articles and books about. Nobody’s going to buy a book that promises to teach them “CSS3 Selectors Level 3, Fonts Level 3, Writing Modes Level 3, and Containment Level 1”: that title’s not even going to fit on the cover. But if we wrapped up a snapshot of what’s current and called it CSS4… now that’s going to sell.

Can we show the CSS WG that there’s mileage in this idea and make it happen? Oh, I hope so. Because while the modular approach to CSS is beautiful and elegant and progressive… I’m afraid that we can’t use it to inspire junior developers.

Also: I don’t want this joke to forever remain among the top results when searching for CSS4

Reinventing Git interface

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

I also ask not to discard all this nonsense right away, but at least give it a fair round of thought. My recommendations, if applied in their entirety, can radically change Git experience. They can move Git from advanced, hacky league to the mainstream and enable adoption of VCS by very wide groups of people. While this is still a concept and obviosly requires a hell of a job to become reality, my hope is that somebody working on a Git interface can borrow from these ideas and we all will get better Git client someday.

I love Git. But I love it more conceptually than I do practically. Everything about its technical design is elegant and clever. But most of how you have to act when you’re using it just screams “I was developed by lots of engineers and by exactly zero UX developers.” I can’t imagine why.

Nikita proposes ways in which it can be “fixed” while retaining 100% backwards-compatibility, and that’s bloody awesome. I hope the Git core team are paying attention, because these ideas are gold.

Sandwichware: Machines Talking to Machines About Humans

A recent observation by Phil Gyford reminded me of a recurring thought I’ve had. He wrote:

While being driven around England it struck me that humans are currently like the filling in a sandwich between one slice of machine — the satnav — and another — the car. Before the invention of sandwiches the vehicle was simply a slice of machine with a human topping. But now it’s a sandwich, and the two machine slices are slowly squeezing out the human filling and will eventually be stuck directly together with nothing but a thin layer of API butter. Then the human will be a superfluous thing, perhaps a little gherkin on the side of the plate.

While we were driving I was reading the directions from a mapping app on my phone, with the sound off, checking the upcoming turns, and giving verbal directions to Mary, the driver. I was an extra layer of human garnish — perhaps some chutney or a sliced tomato — between the satnav slice and the driver filling.

What Phil’s describing is probably familiar to you: the experience of one or more humans acting as the go-between to allow two machines to communicate. If you’ve ever re-typed a document that was visible on another screen, read somebody a password over the phone, given directions from a digital map, used a pendrive to carry files between computers that weren’t talking to one another properly then you’ve done it: you’ve been the soft wet meaty middleware that bridged two already semi-automated (but not quite automated enough) systems.

Galaxy Quest: Tawny Madison says "Gosh, I'm doing it. I'm repeating the damn computer."
Sigourney Weaver as Gwen DeMarco as Tawny Madison realised what she was doing back in 1999. Should I be alarmed that a science fiction spoof is a better indicator of the future that the science fiction it parodies?

This generally happens because of the lack of a common API (a communications protocol) between two systems. If your phone and your car could just talk it out then the car would know where to go all by itself! Or, until we get self-driving cars, it could at least provide the directions in a way that was appropriately-accessible to the driver: heads-up display, context-relative directions, or whatever.

It also sometimes happens when the computer-to-human interface isn’t good enough; for example I’ve often offered to navigate for a driver (and used my phone for the purpose) because I can add a layer of common sense. There’s no need for me to tell my buddy to take the second exit from every roundabout in Milton Keynes (did you know that the town has 930 of them?) – I can just tell them that I’ll let them know when they have to change road and trust that they’ll just keep going straight ahead until then.

Finally, we also sometimes find ourselves acting as a go-between to filter and improve information flow when the computers don’t have enough information to do better by themselves. I’ll use the fact that I can see the road conditions and the lane markings and the proposed route ahead to tell a driver to get into the right lane with an appropriate amount of warning. Or if the driver says “I can see signs to our destination now, I’ll just keep following them,” I can shut up unless something goes awry. Your in-car SatNav can’t do that because it can’t see and interpret the road ahead of you… at least not yet!

Oxbotica Driven self-driving car in Oxford.
I was certainly glad that this prototype self-driving car could “see” me when it overtook my bike the other day.

But here’s my thought: claims of an upcoming AI winter aside, it feels to me like we’re making faster progress in technologies related to human-computer interaction – voice and natural languages interfaces, popularised by virtual assistants like Siri and Alexa and by chatbots – than we are in technologies related to universal computer interoperability. Voice-controller computers are hip and exciting and attract a lot of investment but interoperable systems are hampered by two major things. The first thing holding back interoperability is business interests: for the longest while, for example, you couldn’t use Amazon Prime Video on a Google Chromecast for a long while because the two companies couldn’t play nice. The second thing is a lack of interest by manufacturers in developing open standards: every smart home appliance manufacturer wants you to use their app, and so your smart speaker manufacturer needs to implement code to talk to each and every one of them, and when they stop supporting one… well, suddenly your thermostat switches jumps permanently from smart mode to dumb mode.

A thing that annoys me is that from a technical perspective making an open standard should be a much easier task than making an AI that can understand what a human is asking for or drive a car safely or whatever we’re using them for this week. That’s not to say that technical standards aren’t difficult to get right – they absolutely are! – but we’ve been practising doing it for many, many decades! The very existence of the Internet over which you’ve been delivered this article is proof that computer interoperability is a solvable problem. For anybody who thinks that the interoperability brought about by the Internet was inevitable or didn’t take lots of hard work, I direct you to Darius Kazemi’s re-reading of the early standards discussions, which I first plugged a year ago; but the important thing is that people were working on it. That’s something we’re not really seeing in the Internet of Things space.

XKCD 927: Standards
Engineers: “Standards are good. Let’s have lots of them.”
Everybody else: “…?”

On our current trajectory, it’s absolutely possible that our virtual assistants will reach a point of becoming perfectly “human” communicators long before we can reach agreements about how they should communicate with one another. If that’s the case, those virtual assistants will probably fall back on using English-language voice communication as their lingua franca. In that case, it’s not unbelievable that ten to twenty years from now, the following series of events might occur:

  1. You want to go to your friends’ house, so you say out loud “Alexa, drive me to Bob’s house in five minutes.” Alexa responds “I’m on it; I’ll let you know more in a few minutes.”
  2. Alexa doesn’t know where Bob’s house is, but it knows it can get it from your netbook. It opens a voice channel over your wireless network (so you don’t have to “hear” it) and says “Hey Google, it’s Alexa [and here’s my credentials]; can you give me the address that [your name] means when they say ‘Bob’s house’?” And your netbook responds by reading out the address details, which Alexa then understands.
  3. Alexa doesn’t know where your self-driving car is right now and whether anybody’s using it, but it has a voice control system and a cellular network connection, so Alexa phones up your car and says: “Hey SmartCar, it’s Alexa [and here’s my credentials]; where are you and when were you last used?”. The car replies “I’m on the driveway, I’m fully-charged, and I was last used three hours ago by [your name].” So Alexa says “Okay, boot up, turn on climate control, and prepare to make a journey to [Bob’s address].” In this future world, most voice communication over telephones is done by robots: your virtual assistant calls your doctor’s virtual assistant to make you an appointment, and you and your doctor just get events in your calendars, for example, because nobody manages to come up with a universal API for medical appointments.
  4. Alexa responds “Okay, your SmartCar is ready to take you to Bob’s house.” And you have no idea about the conversations that your robots have been having behind your back

I’m not saying that this is a desirable state of affairs. I’m not even convinced that it’s likely. But it’s certainly possible if IoT development keeps focussing on shiny friendly conversational interfaces at the expense of practical, powerful technical standards. Our already topsy-turvy technologies might get weirder before they get saner.

But if English does become the “universal API” for robot-to-robot communication, despite all engineering common sense, I suggest that we call it “sandwichware”.

× ×

Text Rendering Hates You

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Rendering text, how hard could it be? As it turns out, incredibly hard! To my knowledge, literally no system renders text “perfectly”. It’s all best-effort, although some efforts are more important than others.

Just so you have an idea for how a typical text-rendering pipeline works, here’s a quick sketch:

  1. Styling (parse markup, query system for fonts)
  2. Layout (break text into lines)
  3. Shaping (compute the glyphs in a line and their positions)
  4. Rasterization (rasterize needed glyphs into an atlas/cache)
  5. Composition (copy glyphs from the atlas to their desired positions)

Unfortunately, these steps aren’t as clean as they might seem.

Delightful dive into the variety of issues that face developers who have to implement text rendering. Turns out this is, and might always remain, an unsolved issue.

Time-Efficient Shuffling

Some years ago, a friend of mine told me about an interview they’d had for a junior programming position. Their interviewer was one of that particular breed who was attached to programming-test questions: if you’re in the field of computer science, you already know that these questions exist. In any case: my friend was asked to write pseudocode to shuffle a deck of cards: a classic programming problem that pretty much any first-year computer science undergraduate is likely to have considered, if not done.

Interview with paper visible.
Let’s play at writing software. Rather than a computer, we’ll use paper. But to make it sound techy, we’ll call it “pseudocode”.

There are lots of wrong ways to programmatically shuffle a deck of cards, such as the classic “swap the card in each position with the card in a randomly-selected position”, which results in biased results. In fact, the more that you think in terms of how humans shuffle cards, the less-likely you are to come up with a good answer!

Chart showing the probability bias for an incorrectly-implemented Fisher-Yates shuffle, for a 6-card deck.
If we shuffled a deck of six cards with this ‘broken’ algorithm, for example, we’d be more-likely to find the card that was originally in second place at the top of the deck than in any other position. This kind of thing REALLY matters if, for example, you’re running an online casino.

The simplest valid solution is to take a deck of cards and move each card, choosing each at random, into a fresh deck (you can do this as a human, if you like, but it takes a while)… and that’s exactly what my friend suggested.

The interviewer was ready for this answer, though, and asked my friend if they could think of a “more-efficient” way to do the shuffle. And this is where my friend had a brain fart and couldn’t think of one. That’s not a big problem in the real world: so long as you can conceive that there exists a more-efficient shuffle, know what to search for, and can comprehend the explanation you get, then you can still be a perfectly awesome programmer. Demanding that people already know the answer to problems in an interview setting doesn’t actually tell you anything about their qualities as a programmer, only how well they can memorise answers to stock interview questions (this interviewer should have stopped this line of inquiry one question sooner).

Riffle shuffling
Writing a program to shuffle a deck takes longer than just shuffling it, but that’s hardly the point, is it?

The interviewer was probably looking for an explanation of the modern form of the Fisher-Yates shuffle algorithm, which does the same thing as my friend suggested but without needing to start a “separate” deck: here’s a video demonstrating it. When they asked for greater efficiency, the interviewer was probably looking for a more memory-efficient solution. But that’s not what they said, and it’s certainly not the only way to measure efficiency.

When people ask ineffective interview questions, it annoys me a little. When people ask ineffective interview questions and phrase them ambiguously to boot, that’s just makes me want to contrive a deliberately-awkward answer.

So: another way to answer the shuffling efficiency question would be to optimise for time-efficiency. If, like my friend, you get a question about improving the efficiency of a shuffling algorithm and they don’t specify what kind of efficiency (and you’re feeling sarcastic), you’re likely to borrow either of the following algorithms. You won’t find them any computer science textbook!

Complexity/time-efficiency optimised shuffling

  1. Precompute and store an array of all 52! permutations of a deck of cards. I think you can store a permutation in no more than 226 bits, so I calculate that 2.3 quattuordecillion yottabytes would be plenty sufficient to store such an array. That’s about 25 sexdecillion times more data than is believed to exist on the Web, so you’re going to need to upgrade your hard drive.
  2. To shuffle a deck, simply select a random number x such that 0 <= x < 52! and retrieve the deck stored at that location.

This converts the O(n) problem that is Fisher-Yates to an O(1) problem, an entire complexity class of improvement. Sure, you need storage space valued at a few hundred orders of magnitude greater than the world GDP, but if you didn’t specify cost-efficiency, then that’s not what you get.

Stack of shuffled cards
If you’ve got a thousand galaxies worth of free space you can just fill them with actual decks of cards – one for each permutation – and physically pick one at random. That sounds convenient, right?

You’re also going to need a really, really good PRNG to ensure that the 226-bit binary number you generate has sufficient entropy. You could always use a real physical deck of cards to seed it, Solitaire/Pontifex-style, and go full meta, but I worry that doing so might cause this particular simulation of the Universe to implode, sooo… do it at your own risk?

Perhaps we can do one better, if we’re willing to be a little sillier…

(Everett interpretation) Quantum optimised shuffling

Quantum ungulations
If you live in a universe in which quantum optimised shuffling isn’t possible, the technique below can be adapted to create a universe in which it is.

Assuming the many-worlds interpretation of quantum mechanics is applicable to reality, there’s a yet-more-efficient way to shuffle a deck of cards, inspired by the excellent (and hilarious) quantum bogosort algorithm:

  1. Create a superposition of all possible states of a deck of cards. This divides the universe into 52! universes; however, the division has no cost, as it happens constantly anyway.
  2. Collapse the waveform by observing your shuffled deck of cards.

The unneeded universes can be destroyed or retained as you see fit.

Let me know if you manage to implement either of these.

× × × × ×