Bypassing WordPress / Jetpack’s “Prove your humanity:” CAPTCHA

One of the most-popular WordPress plugins is Jetpack, a product of Automattic (best-known for providing the widely-used WordPress hosting service “WordPress.com“). Among Jetpack’s features (many of which are very good) is Jetpack Protect which adds – among other things – the possibility for a CAPTCHA to appear on your login pages. This feature is slightly worse than pointless as it makes it harder for humans to log in but has no significant impact upon automated robots; at best, it provides a false sense of security and merely frustrates and slows down legitimate human editors.

WordPress/Jetpack's CAPTCHA, asking for the solution to "9+10="
Thanks, WordPress, for slowing me down with a CAPTCHA that a robot can solve more-easily than a human.

“Proving your humanity”, as you’re asked to do, is a task that’s significantly easier for a robot to perform than a human. Eventually, of course, all tests of this nature seem likely to fail as robots become smarter than humans (especially as the most-popular system is specifically geared towards training robots), but that’s hardly an excuse for inventing a system that was a failure from its inception. Jetpack’s approach is fundamentally flawed because it makes absolutely no effort to disguise the challenge in a way that humans are able to read any-differently than robots. I’ll demonstrate that in a moment.

Jetpack security settings: "Protect" switch
Don’t just disable this, though! Other “Protect” features make sense. If only you could disable just the one that doesn’t…

A while back, a colleague of mine network-enabled Jetpack Protect across a handful of websites that I occasionally need to log into, and it bugged me that it ‘broke’ my password safe’s ability to automatically log me in. So to streamline my workflow – as well as to demonstrate quite how broken Jetpack Protect’s CAPTCHA is, I’ve written a userscript that you can install into your web browser that will completely circumvent it, solving the maths problems on your behalf so that you don’t have to. Here’s how to use it:

  1. Install a userscript manager into your browser if you don’t have one already: I use Tampermonkey, but it ought to work with almost any of them.
  2. Install Jetpack Maths Solver.

From now on, whenever you go to a page whose web path begins with “/wp-login.php” that contains a Jetpack Protect maths problem, the answer will be automatically calculated and filled-in on your behalf. The usual userscript rules apply: if you don’t trust me, read the source code (there are really only five lines to check) and disable automatic updates for it (especially as it operates across all domains), and feel free to adapt/improve however you see fit. Maybe if we can get enough people using it Automattic will fix this half-hearted CAPTCHA – or at least give us a switch to disable it in the first place.

Update: 15 October 2018 – the latest version of Jetpack makes an insignificant change to this CAPTCHA; version 1.2 of this script (linked above) works around the change.

Squiz CMS Easter Eggs (or: why do I keep seeing Greg’s name in my CAPTCHA?)

Anybody who has, like me, come into contact with the Squiz Matrix CMS for any length of time will have come across the reasonably easy-to-read but remarkably long CAPTCHA that it shows. These are especially-noticeable in its administrative interface, where it uses them as an exaggerated and somewhat painful “are you sure?” – restarting the CMS’s internal crontab manager, for example, requires that the administrator types a massive 25-letter CAPTCHA.

Four long CAPTCHA from the Squiz Matrix CMS.
Four long CAPTCHA from the Squiz Matrix CMS.

But there’s another interesting phenomenon that one begins to notice after seeing enough of the back-end CAPTCHA that appear. Strange patterns of letters that appear in sequence more-often than would be expected by chance. If you’re a fan of wordsearches, take a look at the composite screenshot above: can you find a person’s name in each of the four lines?

Four long CAPTCHA from the Squiz Matrix CMS, with the names Greg, Dom, Blair and Marc highlighted.
Four long CAPTCHA from the Squiz Matrix CMS, with the names Greg, Dom, Blair and Marc highlighted.

There are four names – GregDomBlair and Marc – which routinely appear in these CAPTCHA. Blair, being the longest name, was the first that I noticed, and at first I thought that it might represent a fault in the pseudorandom number generation being used that was resulting in a higher-than-normal frequency of this combination of letters. Another idea I toyed with was that the CAPTCHA text might be being entirely generated from a set of pronounceable syllables (which is a reasonable way to generate one-time passwords that resist entry errors resulting from reading difficulties: in fact, we do this at Three Rings), in which these four names also appear, but by now I’d have thought that I’d have noticed this in other patterns, and I hadn’t.

Instead, then, I had to conclude that these names were some variety of Easter Egg.

In software (and other media), "Easter Eggs" are undocumented hidden features, often in the form of inside jokes.
Smiley decorated eggs. Picture courtesy Kate Ter Haar.

I was curious about where they were coming from, so I searched the source code, but while I found plenty of references to Greg Sherwood, Marc McIntyre, and Blair Robertson. I couldn’t find Dom, but I’ve since come to discover that he must be Dominic Wong – these four were, according to Greg’s blog – developers with Squiz in the early 2000s, and seemingly saw themselves as a dynamic foursome responsible for the majority of the CMS’s code (which, if the comment headers are to be believed, remains true).

Greg, Marc, Blair and Dom, as depicted in Greg's 2007 blog post.
Greg, Marc, Blair and Dom, as depicted in Greg’s 2007 blog post.

That still didn’t answer for me why searching for their names in the source didn’t find the responsible code. I started digging through the CMS’s source code, where I eventually found fudge/general/general.inc (a lot of Squiz CMS code is buried in a folder called “fudge”, and web addresses used internally sometimes contain this word, too: I’d like to believe that it’s being used as a noun and that the developers were just fans of the buttery sweet, but I have a horrible feeling that it was used in its popular verb form). In that file, I found this function definition:

/**
 * Generates a string to be used for a security key
 *
 * @param int            $key_len                the length of the random string to display in the image
 * @param boolean        $include_uppercase      include uppercase characters in the generated password
 * @param boolean        $include_numbers        include numbers in the generated password
 *
 * @return string
 * @access public
 */
function generate_security_key($key_len, $include_uppercase = FALSE, $include_numbers = FALSE) {
  $k = random_password($key_len, $include_uppercase, $include_numbers);
  if ($key_len > 10) {
    $gl = Array('YmxhaXI=', 'Z3JlZw==', 'bWFyYw==', 'ZG9t');
    $g = base64_decode($gl[rand(0, (count($gl) - 1)) ]);
    $pos = rand(1, ($key_len - strlen($g)));
    $k = substr($k, 0, $pos) . $g . substr($k, ($pos + strlen($g)));
  }
  return $k;
} //end generate_security_key()

For the benefit of those of you who don’t speak PHP, especially PHP that’s been made deliberately hard to decipher, here’s what’s happening when “generate_security_key” is being called:

  • A random password is being generated.
  • If that password is longer than 10 characters, a random part of it is being replaced with either “blair”, “greg”, “marc”, or “dom”. The reason that you can’t see these words in the code is that they’re trivially-encoded using a scheme called Base64 – YmxhaXI=Z3JlZw==, bWFyYw==, and ZG9t are Base64 representations of the four names.

This seems like a strange choice of Easter Egg: immortalising the names of your developers in CAPTCHA. It seems like a strange choice especially because this somewhat weakens the (already-weak) CAPTCHA, because an attacking robot can quickly be configured to know that a 11+-letter codeword will always consist of letters and exactly one instance of one of these four names: in fact, knowing that a CAPTCHA will always contain one of these four and that I can refresh until I get one that I like, I can quickly turn an 11-letter CAPTCHA into a 6-letter one by simply refreshing until I get one with the longest name – Blair – in it!

A lot has been written about how Easter Eggs undermine software security (in exchange for a small boost to developer morale) – that’s a major part of why Microsoft has banned them from its operating systems (and, for the most part, Apple has too). Given that these particular CAPTCHA in Squiz CMS are often nothing more than awkward-looking “are you sure?” dialogs, I’m not concerned about the direct security implications, but it does make me worry a little about the developer culture that produced them.

I know that this Easter Egg might be harmless, but there’s no way for me to know (short of auditing the entire system) what other Easter Eggs might be hiding under the surface and what they do, especially if the developers have, as in this case, worked to cover their tracks! It’s certainly the kind of thing I’d worry about if I were, I don’t know, a major government who use Squiz software, especially their cloud-hosted variants which are harder to effectively audit. Just a thought.

Four long CAPTCHA from the Squiz Matrix CMS.× Four long CAPTCHA from the Squiz Matrix CMS, with the names Greg, Dom, Blair and Marc highlighted.× Greg, Marc, Blair and Dom, as depicted in Greg's 2007 blog post.×