Anybody who has, like me, come into contact with the Squiz Matrix CMS for any length of time will have come across the reasonably easy-to-read but remarkably long CAPTCHA that it shows. These are especially noticeable in its administrative interface, where it uses them as an exaggerated and somewhat painful “are you sure?” – restarting the CMS’s internal crontab manager, for example, requires that the administrator type a massive 25-letter CAPTCHA.
But there’s another interesting phenomenon that one begins to notice after seeing enough of the back-end CAPTCHA that appear: strange patterns of letters that turn up in sequence more often than would be expected by chance. If you’re a fan of wordsearches, take a look at the composite screenshot above: can you find a person’s name in each of the four lines?
There are four names – Greg, Dom, Blair and Marc – which routinely appear in these CAPTCHA. Blair, being the longest name, was the first that I noticed, and at first I thought that it might represent a fault in the pseudorandom number generation being used that was resulting in a higher-than-normal frequency of this combination of letters. Another idea I toyed with was that the CAPTCHA text might be being entirely generated from a set of pronounceable syllables (which is a reasonable way to generate one-time passwords that resist entry errors resulting from reading difficulties: in fact, we do this at Three Rings), in which these four names also appear, but by now I’d have thought that I’d have noticed this in other patterns, and I hadn’t.
Instead, then, I had to conclude that these names were some variety of Easter Egg.
I was curious about where they were coming from, so I searched the source code. I found plenty of references to Greg Sherwood, Marc McIntyre, and Blair Robertson, but I couldn’t find Dom; I’ve since come to discover that he must be Dominic Wong. These four were, according to Greg’s blog, developers with Squiz in the early 2000s, and seemingly saw themselves as a dynamic foursome responsible for the majority of the CMS’s code (which, if the comment headers are to be believed, remains true).
That still didn’t answer for me why searching for their names in the source didn’t find the responsible code. I started digging through the CMS’s source code, where I eventually found fudge/general/general.inc (a lot of Squiz CMS code is buried in a folder called “fudge”, and web addresses used internally sometimes contain this word, too: I’d like to believe that it’s being used as a noun and that the developers were just fans of the buttery sweet, but I have a horrible feeling that it was used in its popular verb form). In that file, I found this function definition:
/**
 * Generates a string to be used for a security key
 *
 * @param int     $key_len           the length of the random string to display in the image
 * @param boolean $include_uppercase include uppercase characters in the generated password
 * @param boolean $include_numbers   include numbers in the generated password
 *
 * @return string
 * @access public
 */
function generate_security_key($key_len, $include_uppercase = FALSE, $include_numbers = FALSE)
{
    $k = random_password($key_len, $include_uppercase, $include_numbers);
    if ($key_len > 10) {
        $gl = Array('YmxhaXI=', 'Z3JlZw==', 'bWFyYw==', 'ZG9t');
        $g = base64_decode($gl[rand(0, (count($gl) - 1)) ]);
        $pos = rand(1, ($key_len - strlen($g)));
        $k = substr($k, 0, $pos) . $g . substr($k, ($pos + strlen($g)));
    }
    return $k;
} //end generate_security_key()
For the benefit of those of you who don’t speak PHP, especially PHP that’s been made deliberately hard to decipher, here’s what’s happening when “generate_security_key” is being called:
- A random password is being generated.
- If that password is longer than 10 characters, a random part of it is being replaced with either “blair”, “greg”, “marc”, or “dom”. The reason that you can’t see these words in the code is that they’re trivially encoded using a scheme called Base64 – YmxhaXI=, Z3JlZw==, bWFyYw==, and ZG9t are Base64 representations of the four names.
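If you’d rather see those two steps in something other than PHP, here’s a rough Python sketch of the same logic. It is a port for illustration, not Squiz’s code: `random_lowercase` is a hypothetical stand-in for Matrix’s `random_password()` (whose body isn’t shown above), and I’ve assumed lowercase letters only, as the default arguments suggest.

```python
import base64
import random
import string

# The four Base64 strings, copied from generate_security_key() above.
ENCODED_NAMES = ["YmxhaXI=", "Z3JlZw==", "bWFyYw==", "ZG9t"]


def decode_names(encoded=ENCODED_NAMES):
    """Decode the Base64 strings back into the plain names."""
    return [base64.b64decode(item).decode("ascii") for item in encoded]


def random_lowercase(length, rng):
    """Hypothetical stand-in for random_password(): lowercase letters only."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def generate_security_key(key_len, rng=None):
    """Build a random key; for keys over 10 characters, overwrite part with a name."""
    rng = rng or random.Random()
    key = random_lowercase(key_len, rng)
    if key_len > 10:
        name = rng.choice(decode_names())
        # Like PHP's rand(), randint() includes both endpoints.
        pos = rng.randint(1, key_len - len(name))
        key = key[:pos] + name + key[pos + len(name):]
    return key


print(decode_names())
print(generate_security_key(25))
```

The first line printed is the list of decoded names; the second is a 25-letter key with one of them buried somewhere after the first character.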
This seems like a strange choice of Easter Egg: immortalising the names of your developers in CAPTCHA. It seems like a strange choice especially because this somewhat weakens the (already-weak) CAPTCHA, because an attacking robot can quickly be configured to know that an 11+-letter codeword will always consist of letters and exactly one instance of one of these four names: in fact, knowing that a CAPTCHA will always contain one of these four and that I can refresh until I get one that I like, I can quickly turn an 11-letter CAPTCHA into a 6-letter one by simply refreshing until I get one with the longest name – Blair – in it!
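To put rough numbers on that, here’s a back-of-the-envelope Python sketch. It assumes the random part is lowercase letters only (26 possibilities per position), which the defaults suggest but which I haven’t confirmed, and the function names are mine. The candidate count is an upper bound: it multiplies the possible positions of the name by the possible fillings of the remaining letters, and so may count a few keys twice.

```python
NAMES = ["blair", "greg", "marc", "dom"]
ALPHABET_SIZE = 26  # assumption: lowercase letters only


def letters_left_to_read(key_len, name):
    """Letters an attacker still has to recognise once the name is known."""
    return key_len - len(name)


def max_candidate_keys(key_len, name):
    """Upper bound on distinct keys of this length that contain this name."""
    unknown = key_len - len(name)
    positions = key_len - len(name)  # $pos runs from 1 to key_len - strlen(name)
    return positions * ALPHABET_SIZE ** unknown


for name in NAMES:
    print(name, letters_left_to_read(11, name), max_candidate_keys(11, name))

# For comparison: every possible 11-letter key if no name were inserted.
print(ALPHABET_SIZE ** 11)
```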
A lot has been written about how Easter Eggs undermine software security (in exchange for a small boost to developer morale) – that’s a major part of why Microsoft has banned them from its operating systems (and, for the most part, Apple has too). Given that these particular CAPTCHA in Squiz CMS are often nothing more than awkward-looking “are you sure?” dialogs, I’m not concerned about the direct security implications, but it does make me worry a little about the developer culture that produced them.
I know that this Easter Egg might be harmless, but there’s no way for me to know (short of auditing the entire system) what other Easter Eggs might be hiding under the surface and what they do, especially if the developers have, as in this case, worked to cover their tracks! It’s certainly the kind of thing I’d worry about if I were, I don’t know, a major government who use Squiz software, especially their cloud-hosted variants which are harder to effectively audit. Just a thought.
“PHP that’s been made deliberately hard to decipher”
Isn’t that more commonly known as simply “PHP”?
It’s not Perl, you know.
It’s almost Perl, but not quite.
But still; there’s a difference between a programming language that’s hard to follow and deliberately rendering obvious search terms encoded in Base64 so that they don’t show up, as I’m sure you’ll agree!
Good to hear from you, buddy!
Yes, you found the one and only EE in Matrix. As you say, CAPTCHAs are not used extensively within Matrix, and never for actual security, so few people ever notice. This is why we specifically targeted this bit of code.
It would be nice to not spread FUD about the security of the product though. Matrix code has been routinely audited by security companies for more than 5 years. A lot of our clients, particularly government clients, like to have these independent audits done before they launch, and then periodically thereafter. We also have a full-time security engineer embedded with the development team to do internal audits and keep people’s security knowledge current.
And finally, I honestly can’t remember why we called that directory “fudge”, but I’m pretty sure it had nothing to do with another f-word :)
Ah! Didn’t mean to imply that it had anything to do with that other f-word (it’s okay, you can say “fuck”: we’re all adults here!): that wasn’t the alternate meaning I was looking at. The one I was looking at was the first one after the link – “fudging” something being like taking a shortcut rather than doing something the established ‘right’ way. I’m pretty sure I’ve had a “fudge” directory in at least one non-Squiz-related project I’ve worked on, too! Usually for that late-spec-change piece of functionality that can be properly integrated in the next major release but for now just needs jamming into the infrastructure so that it works. That was my assumption when I saw the “fudge” directory: that some piece of functionality, probably in the long-distant past, had been thusly “fudged” and the directory name had stuck.
The CMS itself doesn’t rely on CAPTCHAs as more than an “are you sure?”, you’re right (and man, typing that 25-character one to restart the crontab manager is a drag: I mean – do I have to be that sure?). But it does encourage their use on e.g. Custom Form Assets, such that backend users are likely to try to use them to protect their forms from robots, so I’m glad that this Easter Egg doesn’t kick in on shorter key lengths!
Ahunga! Ahunga!
Squids CMS?
PCP?
Um? I’m lost…