Anybody who has, like me, come into contact with the Squiz Matrix CMS for any length of time will
have come across the reasonably easy-to-read but remarkably long CAPTCHA that it
shows. These are especially-noticeable in its administrative interface, where it uses them as an exaggerated and somewhat painful “are you sure?” – restarting the CMS’s internal
crontab manager, for example, requires that the administrator types a massive 25-letter CAPTCHA.
But there’s another interesting phenomenon that one begins to notice after seeing enough of the back-end CAPTCHA that appear. Strange patterns of letters that appear in sequence
more-often than would be expected by chance. If you’re a fan of wordsearches, take a look at the composite screenshot above: can you find a person’s name in each of the four lines?
There are four names – Greg, Dom, Blair and Marc – which routinely appear in these CAPTCHA.
Blair, being the longest name, was the first that I noticed, and at first I thought that it might represent a fault in the pseudorandom number generation being used that was resulting
in a higher-than-normal frequency of this combination of letters. Another idea I toyed with was that the CAPTCHA text might be being entirely generated from a set of pronounceable
syllables (which is a reasonable way to generate one-time passwords that resist entry errors resulting from reading difficulties: in fact, we do this at Three Rings), in which these four names also appear, but by now I’d have
thought that I’d have noticed this in other patterns, and I hadn’t.
Instead, then, I had to conclude that these names were some variety of Easter Egg.
I was curious about where they were coming from, so I searched the source code, but while I found plenty of references to Greg Sherwood, Marc McIntyre, and Blair Robertson. I
couldn’t find Dom, but I’ve since come to discover that he must be Dominic Wong – these four were, according to Greg’s blog – developers with Squiz in the early 2000s, and seemingly saw themselves as a dynamic
foursome responsible for the majority of the CMS’s code (which, if the comment headers are to be believed, remains true).
That still didn’t answer for me why searching for their names in the source didn’t find the responsible code. I started digging through the CMS’s source code, where I eventually
found fudge/general/general.inc (a lot of Squiz CMS code is buried in a folder called “fudge”, and web addresses used internally sometimes contain this word, too: I’d like to
believe that it’s being used as a noun and that the developers were just fans of the buttery sweet, but I have a horrible feeling that it was used in its popular verb form). In that file, I found
this function definition:
/**
* Generates a string to be used for a security key
*
* @param int $key_len the length of the random string to display in the image
* @param boolean $include_uppercase include uppercase characters in the generated password
* @param boolean $include_numbers include numbers in the generated password
*
* @return string
* @access public
*/
function generate_security_key($key_len, $include_uppercase = FALSE, $include_numbers = FALSE) {
$k = random_password($key_len, $include_uppercase, $include_numbers);
if ($key_len > 10) {
$gl = Array('YmxhaXI=', 'Z3JlZw==', 'bWFyYw==', 'ZG9t');
$g = base64_decode($gl[rand(0, (count($gl) - 1)) ]);
$pos = rand(1, ($key_len - strlen($g)));
$k = substr($k, 0, $pos) . $g . substr($k, ($pos + strlen($g)));
}
return $k;
} //end generate_security_key()
For the benefit of those of you who don’t speak PHP, especially PHP that’s been made deliberately hard to decipher, here’s what’s happening when “generate_security_key” is being called:
A random password is being generated.
If that password is longer than 10 characters, a random part of it is being replaced with either “blair”, “greg”, “marc”, or “dom”. The reason that you can’t see these words in the
code is that they’re trivially-encoded using a scheme called Base64 – YmxhaXI=, Z3JlZw==, bWFyYw==, and ZG9t are Base64 representations of the four
names.
This seems like a strange choice of Easter Egg: immortalising the names of your developers in CAPTCHA. It seems like a strange choice especially because this somewhat weakens the
(already-weak) CAPTCHA, because an attacking robot can quickly be configured to know that a 11+-letter codeword will always consist of letters and exactly one instance of one of these
four names: in fact, knowing that a CAPTCHA will always contain one of these four and that I can refresh until I get one that I like, I can quickly turn an
11-letter CAPTCHA into a 6-letter one by simply refreshing until I get one with the longest name – Blair – in it!
A lot has been written about how Easter Eggs undermine software security (in exchange for a
small boost to developer morale) – that’s a major part of why Microsoft has banned them from its operating systems (and, for the most part, Apple has too). Given that these
particular CAPTCHA in Squiz CMS are often nothing more than awkward-looking “are you sure?” dialogs, I’m not concerned about the direct security implications, but it does make me worry
a little about the developer culture that produced them.
I know that this Easter Egg might be harmless, but there’s no way for me to know (short of auditing the entire system) what other Easter Eggs might be hiding under the
surface and what they do, especially if the developers have, as in this case, worked to cover their tracks! It’s certainly the kind of thing I’d worry about if I were, I don’t
know, a major government who use Squiz software, especially their cloud-hosted variants which are harder to
effectively audit. Just a thought.
Personal flying machines will be a reality, home computer and electric car pioneer Sir Clive Sinclair has said.
He told BBC Radio 4’s iPM programme that soon it would be “economically and technically possible” to create flying cars for individuals.
Sir Clive is best-known for the Spectrum computer and his failed electric car effort, the C5.
“I’m sure it will happen and I am sure it will change the world dramatically,” he predicted.
Despite his pioneering work in the field of computers, Sir Clive told BBC Radio 4 he was not an internet user.
“I don’t use it myself directly,” he said, explaining that as an inventor he tried to avoid “mechanical and technical things around me so they don’t blur the mind”.
As web developers, we’re used to working around the bugs in Microsoft Internet Explorer. The older versions are worst, and I’m certainly glad to not have to write code that
works in Internet Explorer 6 (or, increasingly, Internet Explorer 7) any more: even Microsoft are glad to see Internet Explorer 6 dying out, but even IE8 is pretty ropey too. And despite what Microsoft claim, I’m afraid IE9 isn’t really a “modern” browser either (although it is a huge step forwards over its
predecessors).
But imagine my surprise when I this week found what I suspect might be a previously undiscovered bug in Internet Explorer 8 and below. Surely they’ve all been found (and some of them
even fixed), but now? But no. It takes a very specific set of circumstances for the bug to manifest itself, but it’s not completely unbelievable – I ran into it by accident while
refactoring parts of Three Rings.
Here’s the crux of it: if you’re –
Using Internet Explorer 8 or lower, and
You’re on a HTTPS (secure) website, and
You’re downloding one of a specific set of file types: Bitmap files, for example, are a problem, but JPEG files aren’t (Content-Type: image/bmp), and
The web server indicates that the file you’re downloading should be treated as something to be “saved”, rather than something to be viewed in your browser
(Content-Disposition: attachment), and
The web server passes a particular header to ask that Internet Explorer does not cache a copy of the file (Cache-Control: no-cache),
Then you’ll see a dialog box like the one shown above. Switching any of the prerequisites in that list out makes the problem go away: even switching the header from a strict “no-cache”
to a more-permissive “private” makes all the difference.
I’ve set up a test environment where you can see this for yourself: HTTP version; HTTPS version. The source code of my experiment (PHP) is also available. Of course, if you try it in a functional, normal web browser, it’ll all work fine. But if
you’ve got access to a copy of Internet Explorer 8 on some old Windows XP box somewhere (IE8 is the last version of the browser made available for XP), then try it in that and see for
yourself what a strange error you get.
Now that the list of new top-level domain applications for 2013 has been revealed, geeks around the world can
start planning for the domain hacks of the future. Please.do.not.disturb.me was fun, and all, but the if many or all of these new
registries are willing to sell their domains to anybody, there’s a lot of potential for new and unusual domain names.
I suspect we’ll soon be typing in addresses like:
jack.and/jill – the .and TLD is clearly supposed to be for the Andalusian community in Spain, but I doubt that’s going to stop people from coming up with imaginative uses for
domain names where you can just “put your own suffix” after the .and/, like we used to do before .isgay.com before it got taken over by domain squatters. (note
that .gay will soon be a TLD, so there’s probably going to be a whole raft of these new sites soon…)
crow.bar – or as we’ll say at the time, “.bar – it’s not just for bars any more!”
I quite like the idea of sugar.beats, but I think a far more popular use will be “put your own suffix” sites, again,
like rock.beats/scissors.
ro.bot- .bot is one of the many TLDs that Amazon is going for, and it seems likely to me that they’re going to try to resell domains underneath it. I’m just not sure
whether sex.bot or ro.bot will be first to be snatched up.
not.just.broke.but.broker – perhaps you have to be in my head to find this amusing.
fizz.buzz. This web site would have the best hit counter ever on it (why?).
s.care, s.cars, s.expert, s.tab, and dozens of other domain names that are only a letter away from
meaning something completely different – and that letter is often “s”.
mon.day, sun.day, dooms.day, birth.day – etc. etc. I’d buy birth.day if the
price was right, and then run a basic site spanning happy.birth.day, first.birth.day, and the like, with automatically-generated content on
each. It’d be fun.
yo.dog – a complete abuse of the .dog TLD, no matter what its purpose is supposed to be. Better still, I’d put a page at
http://yo.dog.yo.dog/yo.dog, containing the message “I heard you like domain names in your domain names, so I put a domain name in a domain name.” (why?)
electric.fan – the website that Koreans will set as one another’s home page, as a cruel prank against the superstitious.
jelly.fish would be an awesome domain name! Who wouldn’t want to have the email address throw.stones@jelly.fish?
mtee.ggee- the future domain name of Hungry Horse pubs? (get it? “empty gee-gee”?)
a.boy.named.goo, after the Goo Goo Dolls album. But then, I don’t object to domain names with possibly-excessive numbers of dots in them, as the Summer Party On Earth website probably gives away. Hell: I could possibly be using
a.home.called.earth as the domain name for our house, in 2013.
fag.got – I’ll bet that homosexual sex blogger Dan
Savage, who’s been trying to reclaim the word “faggot”, would love to have the email address hey@fag.got!
bl.ink – I’ve got an idea for a webcam-based site, like ChatRoulette, but with facial recognition software
that watches your eye movements. You get paired up with a random stranger and the pair of you have a staring contest, right over the Internet. If you win, you get a point. It’ll be
awesome.
commun.ist, rac.ist, anarch.ist, etc. – I’m sure that Istanbul, for whom the .ist TLD is intended, won’t mind if
we borrow their new domain name for a few amusing addresses. Like the email address shoot@the.rac.ist, for example.
bob.lob.law/law/blog – with apologies to those who don’t follow Arrested Development.
bi.ngo – sure,.bingo is likely to exist anyway, but this way’s more fun.
fuck.off – I have no idea what anybody else expected the.off TLD to be used for, if not this.
child.ren – I quite like this, because it makes not only a full word, but the first part is a word, too.
im.off.ski – faux Russian is never going to go out of style.
tube.tube.tube – if I can, I’m totally setting this site up in 2013. All that there’ll be is the picture, below, which makes me smile every time I see it.
Honestly, though: it feels like all of these new top-level domain name opportunities take a lot of the fun out of domain hacks. The more TLDs we have, the easier it is to put together
words and phrases with the opportunities given.
Scrabble wouldn’t be so enjoyable if each player had a rack of, say, 30 tiles, rather than just 7. The restriction (and working around them) is what makes
domain-name-based jokes so funny, in my mind. What are we supposed to do in a world where anybody with a spare $185,000 USD can have anything he wants?
When I realise that the era of funny domain hacks is coming to an end, it makes me a little sad. But then I look at that picture of a polar bear and everything’s okay again.
Tuuuuuuube!
RBS Group this week
rolled out a service to all of
its customers, allowing them to withdraw cash from an ATM without using their bank card. The service is based upon the same technologies that’s used to provide emergency access to
cash by people who’ve had their cards stolen, but integrates directly into the mobile banking apps of the group’s constituent banks. I decided to give it a go.
The first step is to use the mobile app to request a withdrawal. There’s an icon for this, but it’s a bit of a mystery that it’s there unless you already know what you’re looking for.
You can’t make a request from online banking without using the mobile app, which seems to be an oversight (in case you can’t think of a reason that you’d want to do this, read on:
there’s one at the end). I opted to withdraw £50.
Next, it’s off to find a cash machine. I struck out, without my wallet, to try to find the nearest Royal Bank of Scotland, NatWest, or Tesco cashpoint. The mobile app features a GPS
tool to help you find these, although it didn’t seem to think that my local Tesco cashpoint existed, walking me on to a branch of NatWest.
As instructed by the app, I pressed the Enter key on the keypad of the cash machine. This bypasses the usual “Insert card” prompt and asks, “Do you wish to carry out a
Get Cash or Emergency Cash transaction?” I pressed Yes.
The ATM asked for the PIN I’d been given by the mobile app: a 6-digit code. Each code is only valid for a window of 3 hours and can only be used once.
I’m not sure why, but the ATM asks that the PIN is confirmed by being entered a second time. This doesn’t make a lot of sense to me – if it was mistyped, it’d surely fail anyway (unless
I happened to guess another valid code, within its window), and I’d simply be able to try again. And if I were an attacker, trying to guess numbers, then there’s no difficulty in typing
the same number twice.
It’s possible that this is an attempt at human-tarpitting,
but that wouldn’t be the best way to do it. If the aim is to stop a hacker from attempting many codes in quick succession, simply imposing a delay would be far more effective (this is
commonplace with cash machines anyway: ever notice that you can’t put a card in right after the last transaction has finished?). Strange.
Finally, the ATM asks what value of cash was agreed to be withdrawn. I haven’t tried putting in an incorrect value, but I assume that it would refuse to dispense any cash if the wrong
number was entered – this is presumably a final check that you really are who you claim to be.
It worked. I got my money. The mobile app quickly updated to reflect the change to my balance and invalidated the code: the system was a success.
The banks claim that this will be useful for times that you’ve not got your card with you. Personally, I don’t think I ever take my phone outdoors without also taking my wallet with me,
so the chance of that it pretty slim. If my card were stolen, I’d be phoning the bank to cancel the card anyway, so it wouldn’t save me a call, either, if I needed emergency cash. But
there are a couple of situations in which I’d consider using this neat little feature:
If I was suspicious of a possible card-skimming device on a cash machine, but I needed to withdraw money and there wasn’t an un-tampered ATM in the vicinity. It’d be nice to know
that you can avoid having your card scanned by some kid with a skimmer just by using your phone to do the authentication rather than a valuable piece of plastic.
To send money to somebody else. Using this tool is cheaper than a money order and faster than a bank transfer: it’s an instantaneous way to get small sums of cash
directly into the hands of a distant friend. “Sure, I’ll lend you £50: just go to a cash machine and type in this code.” I’m not sure whether or not this is a legitimate
use of the service, but I can almost guarantee that it’ll be the most-popular. It’ll probably be reassuring to parents of teenagers, for example, who know that they can help their
offspring get a taxi home when they’ve got themselves stranded somewhere.
What do you think? If you’re with RBS, NatWest or Tesco, have you tried this new mobile banking feature? Do you think there’s mileage in it as an idea, or is it a solution in need of a
problem?
This blog post is about password security. If you don’t run a website and you just want to know what you should do to protect yourself, jump to the
end.
I’d like to tell you a story about a place called Internetland. Internetland is a little bit like the town or country that you live in, but there’s one really important difference: in
Internetland, everybody is afflicted with an unusual disorder called prosopagnosia, or “face-blindness”. This means that, no matter how hard they try, the inhabitants of Internetland
can’t recognise each other by looking at one another: it’s almost as if everybody was wearing masks, all the time.
Denied the ability to recognise one another on sight, the people of Internetland have to say out loud who they are when they want to be identified. As I’m sure you can imagine, it’d be
very easy for people to pretend to be one another, if they wanted. There are a few different ways that the inhabitants get around that problem, but the most-common way is that people
agree on and remember passwords to show that they really are who they claim to be.
Alice’s Antiques
Alice runs an antiques store in Internetland. She likes to be able to give each customer a personalised service, so she invites her visitors to identify themselves, if they like, when
they come up to the checkout. Having them on file means that she can contact them about special offers that might interest them, and she can keep a record of their address so that the
customer doesn’t have to tell her every time that they want a piece of furniture delivered to their house.
One day, Bob came by. He found a nice desk and went to the checkout to pay for it.
“Hi,” said Alice, “Have you shopped here before?” Remember that even if he’d visited just yesterday, she wouldn’t remember him, so crippling is her face-blindness.
“No,” replied Bob, “First time.”
“Okay then,” Alice went on, “Would you like to check out ‘as a guest’, or would you like to set up an account so that I’ll remember you next time?”
Bob opted to set up an account: it’d only take a few minutes, Alice promised, and would allow him to check out faster in future. Alice gave Bob a form to fill in:
Alice took the form and put it into her filing cabinet.
The following week, Bob came by Alice’s Antiques again. When he got to the checkout, Alice again asked him if he’d shopped there before.
“Yes, I’ve been here before,” said Bob, “It’s me: Bob!”
Alice turned to her filing cabinet and pulled out Bob’s file. This might sound like a lot of work, but the people of Internetland are very fast at sorting through filing cabinets, and
can usually find what they’re looking for in less than a second. Alice found Bob’s file and, looking at it, challenged Bob to prove his identity:
“If you’re really Bob – tell me your password!”
“It’s swordfish1,” came the reply.
Alice checked the form and, sure, that was the password that Bob chose when he registered, so now she knew that it really was him. When he asked for a set of
chairs he’d found to be delivered, Alice was able to simply ask, “You want that delivered to 1 Fisherman’s Wharf, right?”, and Bob just nodded. Simple!
Evil Eve
That night, a burglar called Eve broke into Alice’s shop by picking the lock on the door (Alice never left money in the till, so she didn’t think it was worthwhile buying a very good
lock). Creeping through the shadows, Eve opened up the filing cabinet and copied out all of the information on all of the files. Then, she slipped back out, locking the door behind her.
Alice’s shop has CCTV – virtually all shops in Internetland do – but because it wasn’t obvious that there had been a break-in, Alice didn’t bother to check the recording.
Now Eve has lots of names and passwords, so it’s easy for her to pretend to be other Internetlanders. You see: most people living in Internetland use the same password at most or all of
the places they visit. So Eve can go to any of the other shops that Bob buys from, or the clubs he’s part of, or even to his bank… and they’ll believe that she’s really him.
One of Eve’s favourite tricks is to impersonate her victim and send letters to their friends. Eve might pretend to be Bob, for example, and send a letter to his friend Charlie. The
letter might say that Bob’s short on cash, and ask if Charlie can lend him some: and if Charlie follows the instructions (after all, Charlie trusts Bob!), he’ll end up having his money
stolen by Eve! That dirty little rotter.
So it’s not just Bob who suffers for Alice’s break-in, but Charlie, too.
Bob Thinks He’s Clever
Bob thinks he’s cleverer than most people, though. Rather than use the same password everywhere he goes, he has three different passwords. The first one is his “really secure” one: it’s
a good password, and he’s proud of it. He only uses it when he talks to his bank, the tax man, and his credit card company – the stuff he thinks is really important. Then
he’s got a second password that he uses when he goes shopping, and for the clubs he joins. A third password, which he’s been using for years, he reserves for places that demand that he
chooses a password, but where he doesn’t expect to go back to: sometimes he joins in with Internetland debates and uses this password to identify himself.
Bob’s approach was cleverer than most of the inhabitants of Internetland, but it wasn’t as clever as he thought. Eve had gotten his medium-security password, and this was enough to
persuade the Post Office to let her read Bob’s mail. Once she was able to do this, she went on to tell Bob’s credit card company that Bob had forgotten his password, so they sent him a
new one… which she was able to read. She was then able to use this new password to tell the credit card company that Bob had moved house, and that he’d lost his card. The credit card
company promptly sent out a new card… to Eve’s address. Now Eve was able to steal all of Bob’s money. “Muhahaha!” chortled Eve, evilly.
But even if Bob hadn’t made the mistake of using his “medium-security” password at the Post Office, Eve could have tried a different approach: Eve would have pretended to be Alice, and
asked Bob for his password. Bob would of course have responded, saying “It’s ‘swordfish1’.”
Then Eve would have done something sneaky: she’d have lied and said that was wrong. Bob would be confused, but he’d probably just think to himself, “Oh, I must have given Alice a
different password.”
“It must be ‘haddock’, then,” Bob would say.
“Nope; wrong again,” Eve would say, all the while pretending to be Alice.
“Surely it’s not ‘h@mm3rHead!’, is it?” Bob would try, one last time. And now Eve would have all of Bob’s passwords, and Bob would just be left confused.
Good Versus Eve
What went wrong in Internetland this week? Well, a few things did:
Alice didn’t look after her filing cabinet
For starters, Alice should have realised that the value of the information in her filing cabinet was worth at least as much as money would be, to the right kind of burglar. It was easy
for her to be complacent, because it wasn’t her identity that was most at risk, but that of her customers. Alice should have planned her security in line with that realisation:
there’s no 100% certain way of stopping Eve from breaking in, but Alice should have done more to make it harder for Eve (a proper lock, and perhaps a separate, second lock on the filing
cabinet), and should have made it so that Eve’s break-in was likely to be noticed (perhaps skimming through the security tapes every morning, or installing motion sensors).
But the bigger mistake that Alice made was that she kept Bob’s password in a format that Eve could read. Alice knew perfectly well that Bob would probably be using the same password in
other places, and so to protect him she ought to have kept his password encrypted in a way that would make it virtually impossible for Eve to read it. This, in combination with an
effort to insist that her customers used good, strong passwords, could have completely foiled Eve’s efforts, even if she had managed to get past the locks and CCTV un-noticed.
Here in the real world: Some of Alice’s mistakes are not too dissimilar to the recently-publicised mistakes made by LinkedIn, eHarmony, and LastFM. While
these three giants did encrypt the passwords of their users, they did so inadequately (using mechanisms not designed for passwords, by using outdated
and insecure mechanisms, and by failing to protect stolen passwords from bulk-decryption). By the way: if you have an account with any
of these providers, you ought to change your password, and also change your password anywhere else that uses the same password… and if this includes your email, change it everywhere
else, too.
Bob should have used different passwords everywhere he went
Good passwords should be long (8 characters should be an absolute minimum, now, and Bob really ought to start leaning towards 12), complex (not based on a word in any dictionary, and
made of a mixture of numbers, letters, and other characters), and not related to you (dates of birth, names of children, and the like are way out). Bob had probably heard
all of that a hundred times.
But good passwords should also be unique. You shouldn’t ever use the same password in two different places. This was Bob’s mistake, and it’s the mistake of almost everybody
else in Internetland, too. What Bob probably didn’t know was that there are tools that could have helped him to have a different password for everybody he talked to, yet still
been easier than remembering the three passwords he already remembered.
Here in the real world: There are some really useful tools to help you, too. Here are some of them:
LastPass helps you generate secure passwords, then stores encrypted versions of them on the Internet so that you can get at them
from anywhere. After a short learning curve, it’s ludicrously easy to use. It’s free for most users, or there are advanced options for paid subscribers.
KeePass does a similar thing, but it’s open source. However, it doesn’t store your encrypted passwords online (which you might
consider to be an advantage), so you have to carry a pen drive around or use a plugin to add this functionality.
SuperGenPass provides a super-lightweight approach to web browser password generation/storing. It’s easy to understand and
makes it simple to generate different passwords for every site you use, without having to remember all of those different passwords!
One approach for folks who like to “roll their own” is simply to put a spreadsheet or a text file into a TrueCrypt (or
similar) encrypted volume, which you can carry around on your pendrive. Just decrypt and read, wherever you are.
Another “manual” approach is simply to use a “master password” everywhere, prefixed or suffixed with a (say) 4-5 character modifier, that you vary from site to site. Keep your
modifiers on a Post-It note in your wallet, and back it up by taking a picture of it with your mobile phone. So maybe your Skype suffix is “8Am2%”, so when you log into Skype you type
in your master password, plus that suffix. Easy enough that you can do it even without a computer, and secure enough for most people.
Last week I was talking to Alexander Dutton about an idea that we had to implement cookie-like behaviour using browser caching. As I first mentioned last year, new laws are coming into force across Europe that will require
websites to ask for your consent before they store cookies on your computer.
Regardless of their necessity, these laws are badly-defined and ill thought-out, and there’s been a significant lack of information to support web managers in understanding and
implementing the required changes.
To illustrate one of the ambiguities in the law, I’ve implemented a tool which tracks site visitors almost as effectively as cookies (or similar technologies such as Flash Objects or
Local Storage), but which must necessarily fall into one of the larger grey areas. My tool abuses the way that “permanent” (301) HTTP redirects are cached by web browsers.
[callout][button link=”http://c301.scatmania.org/” align=”right” size=”medium” color=”green”]See Demo Site[/button]You can try out my implementation for yourself. Click on the button to
see the sample site, then close down all of your browser windows (or even restart your computer) and come back and try again: the site will recognise you and show you the same random
number as it did the first time around, as well as identifying when your first visit was.[/callout]
Here’s how it works, in brief:
A user visits the website.
The website contains a <script> tag, pointing at a URL where the user’s browser will find some Javascript.
The user’s browser requests the Javascript file.
The server generates a random unique identifier for this user.
The server uses a HTTP 301 response to tell the browser “this Javascript can be found at a different web address,” and provides an address that contains the new unique identifier.
The user’s browser requests the new document (e.g. /javascripts/tracking/123456789.js, if the user’s unique ID was 123456789).
The resulting Javascript is generated dynamically to automatically contain the ID in a variable, which can then be used for tracking purposes.
Subsequent requests to the server, even after closing the browser, skip steps 3 through 5, because the user’s browser will cache the 301 and re-use the unique web
address associated with that individual user.
Compared to conventional cookie-based tracking (e.g. Google Analytics), this approach:
Is more-fragile (clearing the cache is a more-common user operation than clearing cookies, and a “force refresh” may, in some browsers, result in a new tracking ID
being issued).
Is less-blockable using contemporary privacy tools, including the W3C’s
proposed one: it won’t be spotted by any cookie-cleaners or privacy filters that I’m aware of: it won’t penetrate incognito mode or other browser “privacy modes”, though.
Moreover, this technique falls into a slight legal grey area. It would certainly be against the spirit of the law to use this technique for tracking purposes (although it
would be trivial to implement even an advanced solution which “proxied” requests, using a database to associate conventional cookies with unique IDs, through to Google Analytics or a
similar solution). However, it’s hard to legislate against the use of HTTP 301s, which are an even more-fundamental and required part of the web than cookies are. Also, and for the same
reasons, it’s significantly harder to detect and block this technique than it is conventional tracking cookies. However, the technique is somewhat brittle and it would be necessary to
put up with a reduced “cookie lifespan” if you used it for real.
[callout][button link=”http://c301.scatmania.org/” align=”right” size=”medium” color=”green”]See Demo Site[/button] [button link=”https://gist.github.com/avapoet/5318224″ align=”right”
size=”medium” color=”orange”]Download Code[/button] Please try out the demo, or download the source code (Ruby/Sinatra) and see for yourself how this technique works.[/callout]
Note that I am not a lawyer, so I can’t make a statement about the legality (or not) of this approach to tracking. I would suspect that if you were somehow caught doing
it without the consent of your users, you’d be just as guilty as if you used a conventional approach. However, it’s certainly a technically-interesting approach that might have
applications in areas of legitimate tracking, too.
Update: The demo site is down, but I’ve update the download code link so that it still works.
Well, it’s been over a year since I last updated
the look-and-feel of my blog, so it felt like it was time for a redesign. The last theme was made during a period that I was just recovering from a gloomy
patch, and that was reflected the design: full of heavy, dark reds, blacks, and greys, and it’s well-overdue a new look!
I was also keen to update the site to in line with the ideas and technologies that are becoming more commonplace in web design, nowadays… as well as using it as a playground for some of
the more-interesting CSS3 features!
Key features of the new look include:
A theme that uses strong colours in the footer and header, to “frame” the rest of the page content.
A responsive design that rescales dynamically all the way from a mobile phone screen through tablets, small 4:3 monitors, and widescreen ratios (try
resizing your browser window!).
CSS transitions to produce Javascript-less dynamic effects: hover your cursor over the picture of me in the header to make me “hide”.
CSS “spriting” to reduce the number of concurrent downloads your browser has to make in order to see the content. All of the social media icons, for
example, are one file, split back up again using background positioning. They’re like image maps, but a million times less 1990s.
Front page “feature” blocks to direct people to particular (tagged) areas of the site, dynamically-generated (from pre-made templates) based on what’s
popular at any given time.
A re-arrangement of the controls and sections based on the most-popular use-cases of the site, according to visitor usage trends. For example, search
has been made more-prominent, especially on the front page, the “next post”/”previous post” controls have been removed, and the “AddToAny” sharing tool has been tucked away at the
very bottom.
[spb_message color=”alert-warning” width=”1/1″ el_position=”first last”]Note that some of these features will only work in modern browsers, so Internet Explorer users might be out of
luck![/spb_message]
As always, I’m keen to hear your feedback (yes, even from those of you who subscribe by
RSS). So let me know what you think!
I’m going to assume that you’re aware of the issues and have already taken action appropriate to your place – if you’re in the US, you’ve written to your representatives; if you’re in
the rest of the English-speaking world, you’ve donated to the EFF (this issue affects all of us), etc. But if you’re in need of Wikipedia, here’s the simplest way to view it, today:
Accessing Wikipedia during the blackout
Go to the English-language Wikipedia as normal. You’ll see the “SOPA blackout” page after a second or so.
Copy-paste the following code into the address bar of the browser:
That’s all. You don’t even have to turn off Javascript in your browser, as others are suggesting: just surf away.
If you get sick of copy-pasting on every single Wikipedia page you visit… you can drag this link to your
bookmarks toolbar (or right click it and select “add to bookmarks”) and then just click it from your bookmarks whenever you want to remove the blackout.
And if you just came here for the shortcut without making yourself aware of the issues, shame on you.
My Name Is Me. I choose to participate on much of the Internet by my full name. I say “full name”, rather than “real name”, because the term “real name” is full of loaded
connotations. For example, I (still) periodically have people insist that Dan Q isn’t my real name, because it’s not the name I was born with. It doesn’t matter to them
that it’s the name I’m known by to pretty much everybody (except my mother, who still calls me Daniel). It doesn’t matter that it’s the name on my passport or driving license. To them,
it’s not “real” because to them, real names are either those acquired by birth or marriage, and somehow nothing else is valid. And that’s without even looking at the number of times
I’ve been discriminated against because my name is “too short” for ill-designed computer systems.
That doesn’t bother me. What does bother me is that sites like Facebook and – in the
news recently on this very topic – Google+ demand that full “real” names are used on the profiles of their site users. If you
don’t use the name that appears on your government-issued documentation (if you have such a thing), then your accounts on these sites are liable to be closed. By the way: the same is
theoretically true of your Google Profile, too, so even if you’re not on the Google+ bandwagon and you, say, use a nickname in your Google Profile, your account is still at risk.
Now, I can see the point that these policies are trying to make. In fact, there was a time that I’d have naively agreed with them. They’re trying to make the Internet a safer,
more-accountable place. But in actual fact, there’s a real risk that they’ll make the Internet a lot more-treacherous for some people. I shan’t bother listing folks who are affected,
because others have done it far
more-thoroughly than I ever could.
But I shall point you in the direction of my.nameis.me, where you can read a little more about these issues.
Thanks.
When I relaunched Abnibthe other week (which
I swear I didn’t expect to have to do, until people started complaining that I was
going to let it die – this genuinely wasn’t some “marketing” stunt!), I simultaneously brought back Abnib Chat (#abnib),
the IRC channel.
I blame Jen for this. She told me that she missed the long-dead #rockmonkey chat room, and wanted it (or something
similar) back, so I decided to provide one. Hell; if Jen wanted it, maybe other people wanted it to? And it’s an easy thing to set up, I thought.
Personally, I thought that the chat room would be a flop. I’d give it a go, of course, but I didn’t hold up much hope for its survival. When Abnib first launched, back in 2003, the Abnibbers were all students first and foremost.
Now, they’ve all got jobs, and many of those jobs aren’t of a variety compatible with sitting on an IRC channel all day. And at night? We’ve got money, nowadays, and homes, and
spice, and all kinds of activities that consume our lives on an evening. Many
of us get what our younger student selves would call an “early night” every day of the week, and there’s always so much to do that shooting the breeze over a laborious IRC
channel simply isn’t compatible with our lives any more.
Looks like I was right. Here’s the channel activity for the first fortnight of the new Abnib Chat:
Sure, the 1st of the month was busy, but not very busy: in actual fact, many of the people who were “around” were only around briefly, and one of those – Guest1332 – didn’t
even identify themselves.
We’ve all got new ways of communicating now. Some folks are using Twitter (I occasionally read the feeds of those who write in a way that I’m permitted to see, but I don’t “tweet”
myself). Others use Facebook (for a given definition of “use”, anyway). Others still continue to blog (that’s the medium for me: I think I’m just a little too wordy for
anything less). In any case; we’re like Abnib: The Next Generation, and we’ve got reliable transporters and replicators and all kinds of cool shit, and hanging around in
an IRC channel just feels kind of… backwards.
Perhaps I’ve been watching too much Star Trek recently.
Anyway – unless people object to that, too (seriously?), I’ll be turning off Iggy later this month: so if you’ve got something important to say to him, say it soon! I’ll leave the
“Chat” button on Abnib because it’s lazier than removing it, and you never know if somebody might find a use for it, but I think it’s time to declare the channel “dead”.
Web developers have tried to compensate for [the IPv4 address shortage] by creating IPv6 — a system that recognizes six-digit IP addresses rather than four-digit ones.
I can’t even begin to get my head in line with the level of investigative failure that’s behind this sloppy reporting. I’m not even looking at the fact that apparently it’s “web
developers” who are responsible for fixing the Internet’s backbone; just the 4/6-digits thing is problematic enough.
Given that Wikipedia can get this right, you’d hope that a news agency could manage. Even the Daily Mail did slightly better (although they did
call IPv4 addresses 16-bit and then call them 32-bit in the very next sentence).
Oh; wait: Fox News. Right.
For the benefit of those who genuinely want to know, one of the most significant changes between IPv4 and IPv6 is the change from 32-bit addresses to 128-bit
addresses: that’s the difference between about 4 billion addresses and 340 undecillion addresses (that’s 34 followed by thirty-eight zeros). Conversely, adding “two digits” to a
four-digit number (assuming we’re talking about decimal numbers), as Fox News suggest, is the difference between a thousand addresses and a hundred thousand. And it’s not web developers
who are responsible for it: this change has nothing to do with the web but with the more fundamental architecture of the underlying Internet itself.
Oh yeah: I changed the look-and-feel of scatmania.org the other week, in case you hadn’t noticed. It’s become a
sort-of-traditional January activity for me, these years, to redesign the theme of my blog at this point in the year.
This year’s colours are black, white, greys, and red, and you’ll note also that serifed fonts are centre-stage again, appearing pretty-much-universally throughout the site for the first
time since 2004. Yes, I know that it’s heavier and darker than previous versions of the site: but it’s been getting fluffier and lighter year on year for ages, now, and I thought it was
time to take a turn. You know: like the economy did.
Aside from other cosmetic changes, it’s also now written using several of the new technologies of HTML5 (I may put the shiny new logo on it, at some point). So apologies to those of you running archaic and non-standards-compliant browsers (I’m looking at you, Internet
Explorer 6 users) if it doesn’t look quite right, but really: when your browser is more than half as old as the web itself, it’s time to upgrade.
I’ve also got my site running over IPv6 – the next generation Internet protocol – for those of you who care about those sorts of things. If you don’t know why IPv6 is important and “a
big thing”, then here’s a simple explanation.
Right now you’re probably viewing the IPv4 version: but if you’re using an IPv6-capable Internet connection, you might be viewing the IPv6 version. You’re not missing out, either way:
the site looks identical: but this is just my tiny contribution towards building the Internet of tomorrow.
(if you really want to, you can go to ipv6.scatmania.org to see the IPv6 version – but it’ll only work if your Internet Service Provider is on the ball and has set you up with an IPv6
address!)