Anatomy of Cookie XSS

A cross-site scripting vulnerability (shortened to XSS, because CSS already means other things) occurs when a website can be tricked into showing a visitor unsafe content that came from another site visitor. Typically when we talk about an XSS attack, we’re talking about tricking a website into sending Javascript code to the user: that Javascript code can then be used to steal cookies and credentials, vandalise content, and more.

Good web developers know to sanitise input – making anything given to their pages by a user safe before ever displaying it on a page – but even the best can forget quite how many things really are “user input”.

"Who Am I?" page provided by University of Oxford IT Services.
This page outputs a variety of your inputs right back at you.

Recently, I reported a vulnerability in a the University of Oxford’s IT Services‘ web pages that’s a great example of this.  The page (which isn’t accessible from the public Internet, and now fixed) is designed to help network users diagnose problems. When you connect to it, it tells you a lot of information about your connection: what browser you’re using, your reverse DNS lookup and IP address, etc.. The developer clearly understood that XSS was a risk, because if you pass a query string to the page, it’s escaped before it’s returned back to you. But unfortunately, the developer didn’t consider the fact that virtually anything given to you by the browser can’t be trusted.

My Perl program, injecting XSS code into the user's cookie and then redirecting them.
To demonstrate this vulnerability, I had the option of writing Perl or Javascript. For some reason, I chose Perl.

In this case, I noticed that the page would output any cookies that you had from the .ox.ac.uk domain, without escaping them. .ox.ac.uk cookies can be manipulated by anybody who has access to write pages on the domain, which – thanks to the users.ox.ac.uk webspace – means any staff or students at the University (or, in an escalation attack, anybody’s who’s already compromised the account of any staff member or student). The attacker can then set up a web page that sets up such a “poisoned” cookie and then redirects the user to the affected page and from there, do whatever they want. In my case, I experimented with showing a fake single sign-on login page, almost indistinguishable from the real thing (it even has a legitimate-looking .ox.ac.uk domain name served over a HTTPS connection, padlock and all). At this stage, a real attacker could use a spear phishing scam to trick users into clicking a link to their page and start stealing credentials.

A fake SSO login page, delivered from a legitimate-looking https URL.
The padlock, the HTTPS url, and the convincing form make this page look legitimate. But it’s actually spoofed.

I’m sure that I didn’t need to explain why XSS vulnerabilities are dangerous. But I wanted to remind you all that truly anything that comes from the user’s web browser, even if you think that you probably put it there yourself, can’t be trusted. When you’re defending against XSS attacks, your aim isn’t just to sanitise obvious user input like GET and POST parameters but also anything that comes from a browser header including cookies and referer headers, especially if your domain name carries websites managed by many different people. In an ideal world, Content Security Policy would mitigate all these kinds of attacks: but in our real world – sanitise those inputs!

DRY: Do Repeat Yourself

I am increasingly of the opinion that the general software engineering adage “Don’t Repeat Yourself” does not always apply to web development. Also, I found that web development classes in CS academia are not very realistic. These two problems turn out to have the same root cause: a lack of appreciation of what browsers do…

Non-Rails Frameworks in Ruby: Cuba, Sinatra, Padrino, Lotus

It’s common for a Ruby developer to describe themselves as a Rails developer. It’s also common for someone’s entire Ruby experience to be through Rails. Rails began in 2003 by David Heinemeier Hansson, quickly becoming the most popular web framework and also serving as an introduction to Ruby for many developers.

Rails is great. I use it every day for my job, it was my introduction to Ruby, and this is the most fun I’ve ever had as a developer. That being said, there are a number of great framework options out there that aren’t Rails.

This article intends to highlight the differences between Cuba, Sinatra, Padrino, Lotus, and how they compare to or differ from Rails. Let’s have a look at some Non-Rails Frameworks in Ruby.

Squiz CMS Easter Eggs (or: why do I keep seeing Greg’s name in my CAPTCHA?)

Anybody who has, like me, come into contact with the Squiz Matrix CMS for any length of time will have come across the reasonably easy-to-read but remarkably long CAPTCHA that it shows. These are especially-noticeable in its administrative interface, where it uses them as an exaggerated and somewhat painful “are you sure?” – restarting the CMS’s internal crontab manager, for example, requires that the administrator types a massive 25-letter CAPTCHA.

Four long CAPTCHA from the Squiz Matrix CMS.
Four long CAPTCHA from the Squiz Matrix CMS.

But there’s another interesting phenomenon that one begins to notice after seeing enough of the back-end CAPTCHA that appear. Strange patterns of letters that appear in sequence more-often than would be expected by chance. If you’re a fan of wordsearches, take a look at the composite screenshot above: can you find a person’s name in each of the four lines?

Four long CAPTCHA from the Squiz Matrix CMS, with the names Greg, Dom, Blair and Marc highlighted.
Four long CAPTCHA from the Squiz Matrix CMS, with the names Greg, Dom, Blair and Marc highlighted.

There are four names – GregDomBlair and Marc – which routinely appear in these CAPTCHA. Blair, being the longest name, was the first that I noticed, and at first I thought that it might represent a fault in the pseudorandom number generation being used that was resulting in a higher-than-normal frequency of this combination of letters. Another idea I toyed with was that the CAPTCHA text might be being entirely generated from a set of pronounceable syllables (which is a reasonable way to generate one-time passwords that resist entry errors resulting from reading difficulties: in fact, we do this at Three Rings), in which these four names also appear, but by now I’d have thought that I’d have noticed this in other patterns, and I hadn’t.

Instead, then, I had to conclude that these names were some variety of Easter Egg.

In software (and other media), "Easter Eggs" are undocumented  hidden features, often in the form of inside jokes.
Smiley decorated eggs. Picture courtesy Kate Ter Haar.

I was curious about where they were coming from, so I searched the source code, but while I found plenty of references to Greg Sherwood, Marc McIntyre, and Blair Robertson. I couldn’t find Dom, but I’ve since come to discover that he must be Dominic Wong – these four were, according to Greg’s blog – developers with Squiz in the early 2000s, and seemingly saw themselves as a dynamic foursome responsible for the majority of the CMS’s code (which, if the comment headers are to be believed, remains true).

Greg, Marc, Blair and Dom, as depicted in Greg's 2007 blog post.
Greg, Marc, Blair and Dom, as depicted in Greg’s 2007 blog post.

That still didn’t answer for me why searching for their names in the source didn’t find the responsible code. I started digging through the CMS’s source code, where I eventually found fudge/general/general.inc (a lot of Squiz CMS code is buried in a folder called “fudge”, and web addresses used internally sometimes contain this word, too: I’d like to believe that it’s being used as a noun and that the developers were just fans of the buttery sweet, but I have a horrible feeling that it was used in its popular verb form). In that file, I found this function definition:

/**
 * Generates a string to be used for a security key
 *
 * @param int            $key_len                the length of the random string to display in the image
 * @param boolean        $include_uppercase      include uppercase characters in the generated password
 * @param boolean        $include_numbers        include numbers in the generated password
 *
 * @return string
 * @access public
 */
function generate_security_key($key_len, $include_uppercase = FALSE, $include_numbers = FALSE) {
  $k = random_password($key_len, $include_uppercase, $include_numbers);
  if ($key_len > 10) {
    $gl = Array('YmxhaXI=', 'Z3JlZw==', 'bWFyYw==', 'ZG9t');
    $g = base64_decode($gl[rand(0, (count($gl) - 1)) ]);
    $pos = rand(1, ($key_len - strlen($g)));
    $k = substr($k, 0, $pos) . $g . substr($k, ($pos + strlen($g)));
  }
  return $k;
} //end generate_security_key()

For the benefit of those of you who don’t speak PHP, especially PHP that’s been made deliberately hard to decipher, here’s what’s happening when “generate_security_key” is being called:

  • A random password is being generated.
  • If that password is longer than 10 characters, a random part of it is being replaced with either “blair”, “greg”, “marc”, or “dom”. The reason that you can’t see these words in the code is that they’re trivially-encoded using a scheme called Base64 – YmxhaXI=Z3JlZw==, bWFyYw==, and ZG9t are Base64 representations of the four names.

This seems like a strange choice of Easter Egg: immortalising the names of your developers in CAPTCHA. It seems like a strange choice especially because this somewhat weakens the (already-weak) CAPTCHA, because an attacking robot can quickly be configured to know that a 11+-letter codeword will always consist of letters and exactly one instance of one of these four names: in fact, knowing that a CAPTCHA will always contain one of these four and that I can refresh until I get one that I like, I can quickly turn an 11-letter CAPTCHA into a 6-letter one by simply refreshing until I get one with the longest name – Blair – in it!

A lot has been written about how Easter Eggs undermine software security (in exchange for a small boost to developer morale) – that’s a major part of why Microsoft has banned them from its operating systems (and, for the most part, Apple has too). Given that these particular CAPTCHA in Squiz CMS are often nothing more than awkward-looking “are you sure?” dialogs, I’m not concerned about the direct security implications, but it does make me worry a little about the developer culture that produced them.

I know that this Easter Egg might be harmless, but there’s no way for me to know (short of auditing the entire system) what other Easter Eggs might be hiding under the surface and what they do, especially if the developers have, as in this case, worked to cover their tracks! It’s certainly the kind of thing I’d worry about if I were, I don’t know, a major government who use Squiz software, especially their cloud-hosted variants which are harder to effectively audit. Just a thought.

Craziest Internet Explorer Bug Ever?

As web developers, we’re used to working around the bugs in Microsoft Internet Explorer. The older versions are worst, and I’m certainly glad to not have to write code that works in Internet Explorer 6 (or, increasingly, Internet Explorer 7) any more: even Microsoft are glad to see Internet Explorer 6 dying out, but even IE8 is pretty ropey too. And despite what Microsoft claim, I’m afraid IE9 isn’t really a “modern” browser either (although it is a huge step forwards over its predecessors).

But imagine my surprise when I this week found what I suspect might be a previously undiscovered bug in Internet Explorer 8 and below. Surely they’ve all been found (and some of them even fixed), but now? But no. It takes a very specific set of circumstances for the bug to manifest itself, but it’s not completely unbelievable – I ran into it by accident while refactoring parts of Three Rings.

A completely useless Internet Explorer error message.
A completely useless Internet Explorer error message. Thanks, IE.

Here’s the crux of it: if you’re –

  • Using Internet Explorer 8 or lower, and
  • You’re on a HTTPS (secure) website, and
  • You’re downloding one of a specific set of file types: Bitmap files, for example, are a problem, but JPEG files aren’t (Content-Type: image/bmp), and
  • The web server indicates that the file you’re downloading should be treated as something to be “saved”, rather than something to be viewed in your browser (Content-Disposition: attachment), and
  • The web server passes a particular header to ask that Internet Explorer does not cache a copy of the file (Cache-Control: no-cache),

Then you’ll see a dialog box like the one shown above. Switching any of the prerequistes in that list out makes the problem go away: even switching the header from a strict “no-cache” to a more-permissive “private” makes all the difference.

I’ve set up a test environment where you can see this for yourself: HTTP version; HTTPS version. The source code of my experiment (PHP) is also available. Of course, if you try it in a functional, normal web browser, it’ll all work fine. But if you’ve got access to a copy of Internet Explorer 8 on some old Windows XP box somewhere (IE8 is the last version of the browser made available for XP), then try it in that and see for yourself what a strange error you get.

KeePass for Opera

Opera 12 has been released, and brought with it a handful of new features. But there’s also been a feature removed – a little-known feature that allowed power users to have the web address appear in the title bar of the browser. I guess that the development team decided that, because the title bar is rarely seen nowadays (the space in which a title once occupied has for a long while now been used as a tab strip, in the style that Google Chrome and Mozilla Firefox eventually copied), this feature wasn’t needed.

But for users of the KeePass Password Safe, this has the knock-on effect of crippling the ability for this security tool to automatically type passwords and other form data into web pages, forcing users to take the long-winded route of manually copy-pasting them each time.

KeePass for Opera Plugin

To fix this problem, I’ve released the KeePass for Opera browser extension. It’s ludicrously simple: it injects a bit of Javascript (originally by Jean François) into every page you visit, which then appends the URL of the page to the title bar. This allows KeePass to detect what site you’re on, so the usual Global Auto-Type command (typically Left Ctrl + Alt + A) will work as normal.

[button link=”https://addons.opera.com/en-gb/extensions/details/keepass-auto-type/” align=”right” size=”medium” caption=”KeePass for Opera”]Install[/button]

Download KeePass for Opera (browser extension)

Open in Opera to install.

Further reading:

I’ve mentioned KeePass a few times before. See:

Internetland

[spb_message color=”alert-warning” width=”1/1″ el_position=”first last”]This blog post is about password security. If you don’t run a website and you just want to know what you should do to protect yourself, jump to the end.[/spb_message]

I’d like to tell you a story about a place called Internetland. Internetland is a little bit like the town or country that you live in, but there’s one really important difference: in Internetland, everybody is afflicted with an unusual disorder called prosopagnosia, or “face-blindness”. This means that, no matter how hard they try, the inhabitants of Internetland can’t recognise each other by looking at one another: it’s almost as if everybody was wearing masks, all the time.

Denied the ability to recognise one another on sight, the people of Internetland have to say out loud who they are when they want to be identified. As I’m sure you can imagine, it’d be very easy for people to pretend to be one another, if they wanted. There are a few different ways that the inhabitants get around that problem, but the most-common way is that people agree on and remember passwords to show that they really are who they claim to be.

Alice’s Antiques

Alice runs an antiques store in Internetland. She likes to be able to give each customer a personalised service, so she invites her visitors to identify themselves, if they like, when they come up to the checkout. Having them on file means that she can contact them about special offers that might interest them, and she can keep a record of their address so that the customer doesn’t have to tell her every time that they want a piece of furniture delivered to their house.

An antique desk and chair.
Some of Alice’s Antiques’ antiques.

One day, Bob came by. He found a nice desk and went to the checkout to pay for it.

“Hi,” said Alice, “Have you shopped here before?” Remember that even if he’d visited just yesterday, she wouldn’t remember him, so crippling is her face-blindness.

“No,” replied Bob, “First time.”

“Okay then,” Alice went on, “Would you like to check out ‘as a guest’, or would you like to set up an account so that I’ll remember you next time?”

Bob opted to set up an account: it’d only take a few minutes, Alice promised, and would allow him to check out faster in future. Alice gave Bob a form to fill in:

A form filled in with name - Bob, password - swordfish1, address - 1, Fisherman's Wharf, Internetland, and with a box ticked to allow a catalogue to be posted.
Bob filled in the form with his name, a password, and his address. He ticked the box to agree that Alice could send him a copy of her catalogue.

Alice took the form and put it into her filing cabinet.

The following week, Bob came by Alice’s Antiques again. When he got to the checkout, Alice again asked him if he’d shopped there before.

“Yes, I’ve been here before,” said Bob, “It’s me: Bob!”

Alice turned to her filing cabinet and pulled out Bob’s file. This might sound like a lot of work, but the people of Internetland are very fast at sorting through filing cabinets, and can usually find what they’re looking for in less than a second. Alice found Bob’s file and, looking at it, challenged Bob to prove his identity:

“If you’re really Bob – tell me your password!”

“It’s swordfish1,” came the reply.

Alice checked the form and, sure, that was the password that Bob chose when he registered, so now she knew that it really was him. When he asked for a set of chairs he’d found to be delivered, Alice was able to simply ask, “You want that delivered to 1 Fisherman’s Wharf, right?”, and Bob just nodded. Simple!

Evil Eve

That night, a burglar called Eve broke into Alice’s shop by picking the lock on the door (Alice never left money in the till, so she didn’t think it was worthwhile buying a very good lock). Creeping through the shadows, Eve opened up the filing cabinet and copied out all of the information on all of the files. Then, she slipped back out, locking the door behind her.

Alice’s shop has CCTV – virtually all shops in Internetland do – but because it wasn’t obvious that there had been a break-in, Alice didn’t bother to check the recording.

CCTV camera.
Alice has CCTV, but she only checks the recording if it’s obvious that something has happened.

Now Eve has lots of names and passwords, so it’s easy for her to pretend to be other Internetlanders. You see: most people living in Internetland use the same password at most or all of the places they visit. So Eve can go to any of the other shops that Bob buys from, or the clubs he’s part of, or even to his bank… and they’ll believe that she’s really him.

One of Eve’s favourite tricks is to impersonate her victim and send letters to their friends. Eve might pretend to be Bob, for example, and send a letter to his friend Charlie. The letter might say that Bob’s short on cash, and ask if Charlie can lend him some: and if Charlie follows the instructions (after all, Charlie trusts Bob!), he’ll end up having his money stolen by Eve! That dirty little rotter.

So it’s not just Bob who suffers for Alice’s break-in, but Charlie, too.

Bob Thinks He’s Clever

Bob thinks he’s cleverer than most people, though. Rather than use the same password everywhere he goes, he has three different passwords. The first one is his “really secure” one: it’s a good password, and he’s proud of it. He only uses it when he talks to his bank, the tax man, and his credit card company – the stuff he thinks is really important. Then he’s got a second password that he uses when he goes shopping, and for the clubs he joins. A third password, which he’s been using for years, he reserves for places that demand that he chooses a password, but where he doesn’t expect to go back to: sometimes he joins in with Internetland debates and uses this password to identify himself.

Bob's password list - his high-security password is "h@mm3rHead!", his medium-security one is "swordfish1", and his low-security one is "haddock".
Bob’s password list. Don’t tell anybody I showed you it: Bob’ll kill me.

Bob’s approach was cleverer than most of the inhabitants of Internetland, but it wasn’t as clever as he thought. Eve had gotten his medium-security password, and this was enough to persuade the Post Office to let her read Bob’s mail. Once she was able to do this, she went on to tell Bob’s credit card company that Bob had forgotten his password, so they sent him a new one… which she was able to read. She was then able to use this new password to tell the credit card company that Bob had moved house, and that he’d lost his card. The credit card company promptly sent out a new card… to Eve’s address. Now Eve was able to steal all of Bob’s money. “Muhahaha!” chortled Eve, evilly.

But even if Bob hadn’t made the mistake of using his “medium-security” password at the Post Office, Eve could have tried a different approach: Eve would have pretended to be Alice, and asked Bob for his password. Bob would of course have responded, saying “It’s ‘swordfish1’.”

Then Eve would have done something sneaky: she’d have lied and said that was wrong. Bob would be confused, but he’d probably just think to himself, “Oh, I must have given Alice a different password.”

“It must be ‘haddock’, then,” Bob would say.

“Nope; wrong again,” Eve would say, all the while pretending to be Alice.

“Surely it’s not ‘h@mm3rHead!’, is it?” Bob would try, one last time. And now Eve would have all of Bob’s passwords, and Bob would just be left confused.

Good Versus Eve

What went wrong in Internetland this week? Well, a few things did:

Alice didn’t look after her filing cabinet

For starters, Alice should have realised that the value of the information in her filing cabinet was worth at least as much as money would be, to the right kind of burglar. It was easy for her to be complacent, because it wasn’t her identity that was most at risk, but that of her customers. Alice should have planned her security in line with that realisation: there’s no 100% certain way of stopping Eve from breaking in, but Alice should have done more to make it harder for Eve (a proper lock, and perhaps a separate, second lock on the filing cabinet), and should have made it so that Eve’s break-in was likely to be noticed (perhaps skimming through the security tapes every morning, or installing motion sensors).

But the bigger mistake that Alice made was that she kept Bob’s password in a format that Eve could read. Alice knew perfectly well that Bob would probably be using the same password in other places, and so to protect him she ought to have kept his password encrypted in a way that would make it virtually impossible for Eve to read it. This, in combination with an effort to insist that her customers used good, strong passwords, could have completely foiled Eve’s efforts, even if she had managed to get past the locks and CCTV un-noticed.

Here in the real world: Some of Alice’s mistakes are not too dissimilar to the recently-publicised mistakes made by LinkedIn, eHarmony, and LastFM. While these three giants did encrypt the passwords of their users, they did so inadequately (using mechanisms not designed for passwords, by using outdated and insecure mechanisms, and by failing to protect stolen passwords from bulk-decryption). By the way: if you have an account with any of these providers, you ought to change your password, and also change your password anywhere else that uses the same password… and if this includes your email, change it everywhere else, too.

Bob should have used different passwords everywhere he went

Good passwords should be long (8 characters should be an absolute minimum, now, and Bob really ought to start leaning towards 12), complex (not based on a word in any dictionary, and made of a mixture of numbers, letters, and other characters), and not related to you (dates of birth, names of children, and the like are way out). Bob had probably heard all of that a hundred times.

But good passwords should also be unique. You shouldn’t ever use the same password in two different places. This was Bob’s mistake, and it’s the mistake of almost everybody else in Internetland, too. What Bob probably didn’t know was that there are tools that could have helped him to have a different password for everybody he talked to, yet still been easier than remembering the three passwords he already remembered.

Here in the real world: There are some really useful tools to help you, too. Here are some of them:

  • LastPass helps you generate secure passwords, then stores encrypted versions of them on the Internet so that you can get at them from anywhere. After a short learning curve, it’s ludicrously easy to use. It’s free for most users, or there are advanced options for paid subscribers.
  • KeePass does a similar thing, but it’s open source. However, it doesn’t store your encrypted passwords online (which you might consider to be an advantage), so you have to carry a pen drive around or use a plugin to add this functionality.
  • SuperGenPass provides a super-lightweight approach to web browser password generation/storing. It’s easy to understand and makes it simple to generate different passwords for every site you use, without having to remember all of those different passwords!
  • One approach for folks who like to “roll their own” is simply to put a spreadsheet or a text file into a TrueCrypt (or similar) encrypted volume, which you can carry around on your pendrive. Just decrypt and read, wherever you are.
  • Another “manual” approach is simply to use a “master password” everywhere, prefixed or suffixed with a (say) 4-5 character modifier, that you vary from site to site. Keep your modifiers on a Post-It note in your wallet, and back it up by taking a picture of it with your mobile phone. So maybe your Skype suffix is “8Am2%”, so when you log into Skype you type in your master password, plus that suffix. Easy enough that you can do it even without a computer, and secure enough for most people.

Pay To Post

I see that Facebook is experimenting with allowing you to pay a nominal fee to make sure that your posts end up “highlighted” over those of your friends’ other friends. That’s a whole new level of crazy… or is it?

A screenshot of Facebook's new "Highlight" feature.
A screenshot of Facebook's new "Highlight" feature. For about a quid, you can push your wall posts to the top of everybody's list.

I’m not on Facebook, but I think that this is a really interesting piece of news. The biggest thing that makes Facebook unusable (and which also affects Twitter) is that people will post every little banal thing that comes to their mind. I don’t care what you’re eating for your lunch. I don’t want to read the lyrics of some song that must have been written for you. I really can’t stand your chain messages (for a while there, after I hadn’t received any by email for a few years, I hoped that they’d died out… but it turns out that they just moved to Facebook instead). If you’re among my friends, I know that you have some pretty smart and interesting things to say… but unless I’m willing to spend hours sifting through the detritus it’s buried in, I’ll never find it.

Social Media Citation. The littering fine tickets of the digital generation.
Social Media Citation. The littering fine tickets of the digital generation.

But this might work. If the price sweet spot can be found, and it’s marketed right, then this kind of feature might make services like Facebook more tolerable. When you’re writing about a cute picture of the cat you’ve seen, that’s fine. And when you write something I might care about, you can tick the “this is actually relevant” box. You’ll have to pay a few pence, but at least you know I’ll see it. And if I want to churn through reams of “X likes Chocolate” (who doesn’t?) and “Y is… in a queue for the bus” then I can turn off the “only relevant things” mode and waste some time.

The problem is that the sweet spot will vary from person to person, and there’s no way to work around that. Big Bucks Bob can probably afford to pay a couple of pounds every time he wants to push some meme photo to the top of your feed, but Poor Penniless Penny can’t even justify ten pence to make sure that all of her friends hear about her birthday party.

Google+ tries to use heuristics to show you "top" content you might be interested in.
Google+ tries to use heuristics to show you "top" content you might be interested in. It feels less insidious than charging you, as Facebook will, but it still doesn't quite work.

It’s a pity that it won’t work, because a part of me is drawn to the idea that economic theory can help to improve the signal-to-noise ratio in our information-saturated lives. Turning my attention to email: of all the cost-based anti-spam systems, I was always quite impressed with Hashcash (which Microsoft seem to be reinventing with their Penny Black project). The idea is that your computer does some hard-to-do (but easy-to-verify) computational work for each and every email that it sends. But in its own way, Hashcash has a similar problem to Facebook’s new system: the ability to pay of a sender is not directly proportional to their relevance to the recipient. If my mother wants to send me an email from her aging smartphone, should she have to wait for several minutes while it processes and generates an “e-stamp”, just because – if it were made any faster – spammers with zombie networks of computers could do so too easily?

Yes, I just equated your social network status, about what you ate for your lunch, with spam. If you don’t like it, don’t share this blog post with your friends.

hashcash token: 1:20:120511:https://danq.me/2012/05/11/pay-to-post/::UVHo081pj6bSDWkI:00000000000001sxI

On This Day In 2004

Looking Back

On this day in 2004 I handed in my dissertation, contributing towards my BEng in Software Engineering. The topic of my dissertation was the Three Rings project, then in its first incarnation, a web application originally designed to help university Nightlines to run their services.

An early Three Rings Directory page. If you remember when Three Rings used to look like this, then you're very old.

I’d originally started developing the project early in the previous academic year, before I’d re-arranged how I was going to finish my course: Three Rings celebrates its tenth birthday this year. This might be considered to have given me a head start over my peers, but in actual fact it just meant that I had even more to write-up at the end. Alongside my work at SmartData a few days a week (and sometimes at weekends), that meant that I’d been pretty damn busy.

A page from my dissertation, covering browser detection and HTTPS support (then, amazingly, still not-quite-universal in contemporary browsers).

I’d celebrated hitting 10,000 words – half of the amount that I estimated that I’d need – but little did I know that my work would eventually weigh in at over 30,000 words, and well over the word limit! In the final days, I scrambled to cut back on text and shunt entire chapters into the appendices (A through J), where they’d be exempt, while a team of volunteers helped to proofread everything I’d done so far.

Go on then; have another screenshot of an ancient web application to gawk at.

Finally, I was done, and I could relax. Well: right up until I discovered that I was supposed to have printed and bound two copies, and I had to run around a busy and crowded campus to get another copy run off at short notice.

Looking Forward

Three Rings went from strength to strength, as I discussed in an earlier “on this day”. When Bryn came on board and offered to write programs to convert Three Rings 1 data into Three Rings 2 data, in 2006, he borrowed my dissertation as a reference. After he forgot that he still had it, he finally returned it last month.

The inside front cover of my dissertation, along with a note from Bryn.

Later still in 2009, Ruth expanded Three Rings as part of her Masters dissertation, in a monumental effort to add much-needed features at the same time as getting herself a degree. After handing it in and undergoing her defense (which went better than she expected), she got a first.

My dissertation (left) back on my bookshelf, where it belongs.

Today, Three Rings continues to eat a lot of my time, and now supports tens of thousands of volunteers at hundreds of different helplines and other charities, including virtually every Nightline and the majority of all Samaritans branches.

It’s grown even larger than I ever imagined, back in those early days. I often tell people that it started as a dissertation project, because it’s simpler than the truth: that it started a year or two before that, and provided a lot of benefit to a few Nightlines, and it was just convenient that I was able to use it as a part of my degree because otherwise I probably wouldn’t have had time to make it into what it became. Just like I’m fortunate now to have the input of such talented people as I have, over the last few years, because I couldn’t alone make it into the world-class service that it’s becoming.

This blog post is part of the On This Day series, in which Dan periodically looks back on years gone by.

Visitor Tracking Without Cookies (or How To Abuse HTTP 301s)

Last week I was talking to Alexander Dutton about an idea that we had to implement cookie-like behaviour using browser caching. As I first mentioned last year, new laws are coming into force across Europe that will require websites to ask for your consent before they store cookies on your computer. Regardless of their necessity, these laws are badly-defined and ill thought-out, and there’s been a significant lack of information to support web managers in understanding and implementing the required changes.

British Telecom's implementation of the new cookie laws. Curiously, if you visit their site using the Opera web browser, it assumes that you've given consent, even if you click the button to not do so.
British Telecom’s implementation of the new cookie laws. Curiously, if you visit their site using the Opera web browser, it assumes that you’ve given consent, even if you click the button to not do so.

To illustrate one of the ambiguities in the law, I’ve implemented a tool which tracks site visitors almost as effectively as cookies (or similar technologies such as Flash Objects or Local Storage), but which must necessarily fall into one of the larger grey areas. My tool abuses the way that “permanent” (301) HTTP redirects are cached by web browsers.

[callout][button link=”http://c301.scatmania.org/” align=”right” size=”medium” color=”green”]See Demo Site[/button]You can try out my implementation for yourself. Click on the button to see the sample site, then close down all of your browser windows (or even restart your computer) and come back and try again: the site will recognise you and show you the same random number as it did the first time around, as well as identifying when your first visit was.[/callout]

Here’s how it works, in brief:

  1. A user visits the website.
  2. The website contains a <script> tag, pointing at a URL where the user’s browser will find some Javascript.
  3. The user’s browser requests the Javascript file.
  4. The server generates a random unique identifier for this user.
  5. The server uses a HTTP 301 response to tell the browser “this Javascript can be found at a different web address,” and provides an address that contains the new unique identifier.
  6. The user’s browser requests the new document (e.g. /javascripts/tracking/123456789.js, if the user’s unique ID was 123456789).
  7. The resulting Javascript is generated dynamically to automatically contain the ID in a variable, which can then be used for tracking purposes.
  8. Subsequent requests to the server, even after closing the browser, skip steps 3 through 5, because the user’s browser will cache the 301 and re-use the unique web address associated with that individual user.
How my "301-powered 'cookies'" work.
How my “301-powered ‘cookies'” work.

Compared to conventional cookie-based tracking (e.g. Google Analytics), this approach:

  • Is more-fragile (clearing the cache is a more-common user operation than clearing cookies, and a “force refresh” may, in some browsers, result in a new tracking ID being issued).
  • Is less-blockable using contemporary privacy tools, including the W3C’s proposed one: it won’t be spotted by any cookie-cleaners or privacy filters that I’m aware of: it won’t penetrate incognito mode or other browser “privacy modes”, though.

Moreover, this technique falls into a slight legal grey area. It would certainly be against the spirit of the law to use this technique for tracking purposes (although it would be trivial to implement even an advanced solution which “proxied” requests, using a database to associate conventional cookies with unique IDs, through to Google Analytics or a similar solution). However, it’s hard to legislate against the use of HTTP 301s, which are an even more-fundamental and required part of the web than cookies are. Also, and for the same reasons, it’s significantly harder to detect and block this technique than it is conventional tracking cookies. However, the technique is somewhat brittle and it would be necessary to put up with a reduced “cookie lifespan” if you used it for real.

[callout][button link=”http://c301.scatmania.org/” align=”right” size=”medium” color=”green”]See Demo Site[/button] [button link=”https://gist.github.com/avapoet/5318224″ align=”right” size=”medium” color=”orange”]Download Code[/button] Please try out the demo, or download the source code (Ruby/Sinatra) and see for yourself how this technique works.[/callout]

Note that I am not a lawyer, so I can’t make a statement about the legality (or not) of this approach to tracking. I would suspect that if you were somehow caught doing it without the consent of your users, you’d be just as guilty as if you used a conventional approach. However, it’s certainly a technically-interesting approach that might have applications in areas of legitimate tracking, too.

Update: The demo site is down, but I’ve update the download code link so that it still works.

Moving Things Around With CSS Transitions

As I indicated in my last blog post, my new blog theme has a “pop up” Dan in the upper-left corner. Assuming that you’re not using Internet Explorer, then when you move your mouse cursor over it, my head will “duck” back behind the bar below it.

My head "pops up" in the top-left hand corner of the site, and hides when you hover your mouse cursor over it.
My head "pops up" in the top-left hand corner of the site, and hides when you hover your mouse cursor over it.

This is all done without any Javascript whatsoever: it’s pure CSS. Here’s how it’s done:

<div class="sixteen columns">
  <div id="dans-creepy-head"></div>
  <h1 id="site-title" class="graphic">
    <a href="https://danq.me/" title="Scatmania">Scatmania</a>
  </h1>
  <span class="site-desc graphic">
    The adventures and thoughts of &quot;Scatman&quot; Dan Q
  </span>
</div>

The HTML for the header itself is pretty simple: there’s a container (the big blue bar) which contains, among other things, a <div> with the id "dans-creepy-head". That’s what we’ll be working with. Here’s the main CSS:

#dans-creepy-head {
  position: absolute;
  top: -24px;
  left: 15px;
  width: 123px;
  height: 133px;
  background: url(/dans-creepy-head.png) top left no-repeat;
  transition: all 800ms;
  -o-transition: all 800ms;
  -webkit-transition: all 800ms;
  -moz-transition: all 800ms;
}
#dans-creepy-head:hover {
  top: 100px;
  height: 60px;
}

The CSS sets a size, position, and background image to the <div>, in what is probably a familiar way. A :hover selector changes the style to increase the distance from the top of the container (from -24px to 100px) and to decrease the height, cropping the image (from 133px to 60px – this was necessary in this case to prevent the bottom of the image from escaping out from underneath the masking bar that it’s supposed to be “hiding behind”). With just that code, you’d have a perfectly workable “duck”, but with a jerky, one-step animation.

The transition directive (and browser-specific prefix versions -o-transition, -webkit-transition, and -moz-transition, for compatability) are what makes the magic happen. This element specifies that any ("all") style is changed on this element (whether via CSS directives, as in this case, or by a change of class or properties by a Javascript function), that a transition effect will be applied to those changes. My use of  "all" is a lazy catch-all – I could have specified the individual properties ( top and  height) that I was interested in changing, and even put different periods on each, but I’ll leave it to you to learn about CSS3 transition options for yourself. The  800ms is the duration of the transition: in my case, 0.8 seconds.

html.ie #dans-creepy-head:hover {
  top: -24px;
  height: 133px;
}
@media (max-width: 780px) {
  #dans-creepy-head {
    display: none;
  }
}

I apply some CSS to prevent the :hover effect from taking place in Internet Explorer, which doesn’t support transitions. The "ie" class is applied to the  <html> tag using Paul Irish’s technique, so it’s easy to detect and handle IE users without loading separate stylesheet files for them. And finally, in order to fit with my newly-responsive design, I make the pop-up head disappear when the window is under 780px wide (at which point there’d be a risk of it colliding with the title).

That’s all there is to it! A few lines of CSS, and you’ve got an animation that degrades gracefully. You could equally-well apply transformations to links (how about making them fade in or out, or change the position of their background image?) or, with a little Javascript, to your tabstrips and drop-down menus.

A New Look

Well, it’s been over a year since I last updated the look-and-feel of my blog, so it felt like it was time for a redesign. The last theme was made during a period that I was just recovering from a gloomy patch, and that was reflected the design: full of heavy, dark reds, blacks, and greys, and it’s well-overdue a new look!

The old Scatmania design: very serious-looking, and with dark, moody colours.
The old Scatmania design: very serious-looking, and with dark, moody colours.

I was also keen to update the site to in line with the ideas and technologies that are becoming more commonplace in web design, nowadays… as well as using it as a playground for some of the more-interesting CSS3 features!

The theme before last, with which this new design has elements in common: a big blue header, an off-white background, and sans-serif faces.
The theme before last, with which this new design has elements in common: a big blue header, an off-white background, and sans-serif faces.

Key features of the new look include:

  • A theme that uses strong colours in the footer and header, to “frame” the rest of the page content.
  • A responsive design that rescales dynamically all the way from a mobile phone screen through tablets, small 4:3 monitors, and widescreen ratios (try resizing your browser window!).
  • CSS transitions to produce Javascript-less dynamic effects: hover your cursor over the picture of me in the header to make me “hide”.
  • CSS “spriting” to reduce the number of concurrent downloads your browser has to make in order to see the content. All of the social media icons, for example, are one file, split back up again using background positioning. They’re like image maps, but a million times less 1990s.
  • Front page “feature” blocks to direct people to particular (tagged) areas of the site, dynamically-generated (from pre-made templates) based on what’s popular at any given time.
  • A re-arrangement of the controls and sections based on the most-popular use-cases of the site, according to visitor usage trends. For example, search has been made more-prominent, especially on the front page, the “next post”/”previous post” controls have been removed, and the “AddToAny” sharing tool has been tucked away at the very bottom.

[spb_message color=”alert-warning” width=”1/1″ el_position=”first last”]Note that some of these features will only work in modern browsers, so Internet Explorer users might be out of luck![/spb_message]

As always, I’m keen to hear your feedback (yes, even from those of you who subscribe by RSS). So let me know what you think!