Last week I was talking to Alexander Dutton about an idea that we had to implement cookie-like behaviour using browser caching. As I first mentioned last year, new laws are coming into force across Europe that will require websites to ask for your consent before they store cookies on your computer. Regardless of their necessity, these laws are badly-defined and ill thought-out, and there’s been a significant lack of information to support web managers in understanding and implementing the required changes.
To illustrate one of the ambiguities in the law, I’ve implemented a tool which tracks site visitors almost as effectively as cookies (or similar technologies such as Flash Objects or Local Storage), but which must necessarily fall into one of the larger grey areas. My tool abuses the way that “permanent” (301) HTTP redirects are cached by web browsers.
You can try out my implementation for yourself on the demo site (http://c301.scatmania.org/). Visit the sample site, then close down all of your browser windows (or even restart your computer) and come back and try again: the site will recognise you and show you the same random number as it did the first time around, as well as identifying when your first visit was.
Here’s how it works, in brief:
1. A user visits the website.
2. The website contains a <script> tag, pointing at a URL where the user’s browser will find some Javascript.
3. The user’s browser requests the Javascript file.
4. The server generates a random unique identifier for this user.
5. The server uses a HTTP 301 response to tell the browser “this Javascript can be found at a different web address,” and provides an address that contains the new unique identifier.
6. The user’s browser requests the new document (e.g. /javascripts/tracking/123456789.js, if the user’s unique ID was 123456789).
7. The resulting Javascript is generated dynamically to automatically contain the ID in a variable, which can then be used for tracking purposes.
8. Subsequent requests to the server, even after closing the browser, skip steps 3 through 5, because the user’s browser will cache the 301 and re-use the unique web address associated with that individual user.
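The steps above can be sketched roughly as follows. This is a minimal, framework-free Ruby sketch rather than the code from the demo: the `handle` function, the `/javascripts/tracking.js` entry path, the `trackingId` variable name and the one-year `Cache-Control` lifetime are all assumptions made for illustration.

```ruby
require 'securerandom'

# Hypothetical URL layout, loosely following the example path in the post.
ENTRY_PATH = '/javascripts/tracking.js'
ID_PATTERN = %r{\A/javascripts/tracking/(\d+)\.js\z}

# Returns a Rack-style [status, headers, body] triple for a request path.
def handle(path)
  if path == ENTRY_PATH
    # Steps 4-5: mint an ID and issue a cacheable "permanent" redirect to it.
    id = SecureRandom.random_number(10**9).to_s
    headers = {
      'Location'      => "/javascripts/tracking/#{id}.js",
      # Assumed lifetime: ask the browser to remember the redirect for a year.
      'Cache-Control' => 'public, max-age=31536000'
    }
    [301, headers, '']
  elsif (match = ID_PATTERN.match(path))
    # Steps 6-7: generate Javascript that carries the ID in a variable.
    # 'no-store' is an assumption, so that the server sees every visit.
    headers = {
      'Content-Type'  => 'application/javascript',
      'Cache-Control' => 'no-store'
    }
    [200, headers, "var trackingId = \"#{match[1]}\";"]
  else
    [404, {}, '']
  end
end
```

A browser that has cached the 301 goes straight to the ID-specific address on later visits, so the server only ever sees the second branch for a returning visitor.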
Compared to conventional cookie-based tracking (e.g. Google Analytics), this approach:
- Is more-fragile (clearing the cache is a more-common user operation than clearing cookies, and a “force refresh” may, in some browsers, result in a new tracking ID being issued).
- Is less-blockable using contemporary privacy tools, including the W3C’s proposed one: it won’t be spotted by any cookie-cleaners or privacy filters that I’m aware of. It won’t penetrate incognito mode or other browser “privacy modes”, though.
Moreover, this technique falls into a slight legal grey area. It would certainly be against the spirit of the law to use this technique for tracking purposes (although it would be trivial to implement even an advanced solution which “proxied” requests, using a database to associate conventional cookies with unique IDs, through to Google Analytics or a similar solution). However, it’s hard to legislate against the use of HTTP 301s, which are an even more-fundamental and required part of the web than cookies are. Also, and for the same reasons, it’s significantly harder to detect and block this technique than it is conventional tracking cookies. However, the technique is somewhat brittle and it would be necessary to put up with a reduced “cookie lifespan” if you used it for real.
Please try out the demo (http://c301.scatmania.org/), or download the source code (Ruby/Sinatra, https://gist.github.com/avapoet/5318224) and see for yourself how this technique works.
Note that I am not a lawyer, so I can’t make a statement about the legality (or not) of this approach to tracking. I would suspect that if you were somehow caught doing it without the consent of your users, you’d be just as guilty as if you used a conventional approach. However, it’s certainly a technically-interesting approach that might have applications in areas of legitimate tracking, too.
Update: The demo site is down, but I’ve updated the download code link so that it still works.
It doesn’t seem to work in Firefox, which is (as far as I can tell) refetching the /c201.js each time. It mostly works in Chrome, but a page refresh causes the JS to be refetched. (I think) this could be worked around by having the JS add the script tag after the page has loaded (unless browsers make a note to re-perform all subsequent requests for a page).
Whether or not it works in Firefox seems to depend upon your caching settings in the browser. Chrome and Opera seem to respect the Expires: header that’s being passed with the 301. You’re right about page refreshes, and I’ve been trying to think of a solution – I like your suggestion about loading the Javascript dynamically (using another Javascript): I’ll give that a go at some point.
In short: some tweaks are required, but it’s getting there.
Neat. But stop giving the legislators ideas of more fundamental things to ban!
Ah, I’d only find another way, and you know it. I had a thought about delivering a (unique, stamped) image to the user with a long Expires: time, with a unique ID number encoded into the pixels. Then use Javascript and a HTML5 canvas to decode and “read” the number back out again and act upon it accordingly. That’d be even harder to detect and act upon than the approach given above. And what’re you going to do: ban image caching?
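To make the pixel idea concrete, here is a rough sketch of the encoding arithmetic only. Everything in it is an assumption for illustration: the function names, the 48-bit limit, and the choice of two RGB pixels. In practice the decoding half would run in the browser against the canvas pixel data, and the image would need to be in a lossless format so that the byte values survive.

```ruby
# Encode a non-negative integer ID (below 2**48) as two [r, g, b] pixels,
# most significant byte first.
def id_to_pixels(id)
  raise ArgumentError, 'ID out of range' unless (0...2**48).cover?(id)
  bytes = 5.downto(0).map { |i| (id >> (8 * i)) & 0xFF }
  bytes.each_slice(3).to_a
end

# Reverse the encoding: fold the six colour bytes back into one integer.
def pixels_to_id(pixels)
  pixels.flatten.reduce(0) { |acc, byte| (acc << 8) | byte }
end

pixels = id_to_pixels(123456789)
# pixels is [[0, 0, 7], [91, 205, 21]]
id = pixels_to_id(pixels)
# id is 123456789 again
```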
Or a third option: embedding the cookie data into an ETag: header on a resource (e.g. a Javascript file). The browser will cache the ETag and will pass it back with the subsequent requests, and the server will lie and say that it’s *always* out-of-date, delivering back a new file with a unique ID (based on the ETag) to the user, along with a fresh ETag. The ETags would be associated with the ID by a hashing algorithm, so that – to the browser – it just looks like a trying-to-be-cached-but-often-changing Javascript file, while in actual fact there’s a unique identifying mark embedded within the caching data.
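A very rough sketch of that ETag idea, under stated assumptions: the `handle_etag_request` and `fresh_etag` names are made up, the ETag-to-ID association is kept in an in-memory table (one possible reading of “associated with the ID by a hashing algorithm”), and `Cache-Control: no-cache` is used to make the browser revalidate on every visit.

```ruby
require 'digest'
require 'securerandom'

# In-memory lookup from issued ETag to visitor ID (assumption: a real
# deployment would need persistent storage).
ETAG_TO_ID = {}

# Issue a new, opaque-looking ETag for this ID and remember the association.
def fresh_etag(id)
  etag = Digest::SHA256.hexdigest("#{id}:#{SecureRandom.hex(8)}")[0, 16]
  ETAG_TO_ID[etag] = id
  etag
end

# Takes the value of the If-None-Match request header (or nil) and returns
# a Rack-style [status, headers, body] triple.
def handle_etag_request(if_none_match)
  presented = if_none_match.to_s.delete('"')
  # A known ETag identifies a returning visitor; anything else gets a new ID.
  id = ETAG_TO_ID.delete(presented) || SecureRandom.random_number(10**9).to_s
  headers = {
    'ETag'          => "\"#{fresh_etag(id)}\"",
    'Cache-Control' => 'no-cache', # cache it, but revalidate every time
    'Content-Type'  => 'application/javascript'
  }
  # Never answer 304: always claim the file has changed and send it again.
  [200, headers, "var trackingId = \"#{id}\";"]
end
```

Each response hands out a different ETag, but feeding the previous one back in yields the same ID in the generated Javascript.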
Why yes, yes I am evil: why do you ask?
It doesn’t seem to work in Chrome when tested here. I tried it both in normal mode and incognito, hitting refresh or even just closing/reopening. I got a different value each and every time.
Curious. Wonder if you’ve got your caching settings set aggressively-against.
Would be interested to see the results of your Debugger->Network output, or to dive into your cache. For me, it works both in regular and incognito windows (although it doesn’t span them, of course, and the incognito session is lost by closing the incognito windows, just as it would if it were a conventional cookie).
I actually use something similar to this for tracking RSS/Atom subscribers. The feed URLs given on the page go to a script that returns a HTTP 301 with a unique ID. When a new ID is formed I know it’s a new subscriber. When the ID is present it is a refresh.
It’s not the cookies the law wants to ban but the ability to track users, so one fine day even this idea will be banned. Sooner or later the law will see amendments implying “not to track a person online or even offline”.
Demo site doesn’t work. Big ruby dump error.
Same…
Nice try. I’m not sure that it works from a legal perspective.
The law states:
… a person shall not store or gain access to information stored, in the terminal equipment of a subscriber or user unless the requirements of paragraph (2) are met.
(2) The requirements are that the subscriber or user of that terminal equipment-
(a) is provided with clear and comprehensive information about the purposes of the storage of, or access to, that information; and
(b) has given his or her consent. …
In your example, you are deliberately sending your javascript to request the 301 and storing in the user’s cache (or, in the arcane language of this law – in the terminal equipment of a subscriber) the result of your 301 (which in itself contains a unique ID code for the purposes of tracking – although actually that is irrelevant here). It doesn’t matter that you are not explicitly reading it back.
Before anyone points out the obvious flaw here, there are exceptions… the most relevant being “where it is strictly necessary for delivering a service requested by the user” (i.e. sending the web page they have asked for).
The test is then:
IF CONSTRUE_NARROWLY(“Am I sending this data because the user has asked for it”) = YES THEN SEND DATA
ELSEIF CONSTRUE_NARROWLY(“Am I sending this data because the user has asked for it”) = NO AND CONSTRUE_IN_LINE_WITH_ICO_GUIDANCE(“Has the user given their consent to receive it”) = YES THEN SEND DATA
ELSE DON’T SEND DATA
The javascript code you send to request the 301, and the result of the 301 fail this test. Unless you get consent, of course. But if you’re going to get consent – you may as well just use a cookie.
As for your comment “there’s been a significant lack of information to support web managers in understanding and implementing the required changes” – that is what your lawyers are for :) If your lawyer is not up to scratch on this topic you could contact the ICO helpline directly for guidance. Or you could change lawyer.
CONSTRUE_IN_LINE_WITH_ICO_GUIDANCE()
{
http://www.ico.gov.uk/news/blog/2012/updated-ico-advice-guidance-e-privacy-directive-eu-cookie-law.aspx
http://www.ico.gov.uk/news/blog/2012/~/media/documents/library/Privacy_and_electronic/Practical_application/cookies_guidance_v3.ashx
}
CONSTRUE_NARROWLY()
{
import #commonsense
// assume user only wants directly what they’ve asked for – e.g. a train timetable or a recipe, not a google analytics cookie.
}
the techniques of samy kamkar’s evercookie will be interesting to you, scatman dan
The new EU ePrivacy law is not about cookies, but about storing and re-accessing data you set on users’ computers. It never mentions “cookies”. This approach does not get around that law.
IANAL but I’d presume this approach is *more* illegal than un-consented cookies, because it’s hidden away much more and there’s very little way to get around it.
What Rory says is true: the new cookie law is not just about cookies, it’s meant for any technique which could essentially track a person. So I suspect this approach would not be legal either, just harder to notice.
Demo site doesn’t work. Big ruby dump error???
#WebSecurity Article: “Visitor Tracking Without Cookies (Abusing HTTP 301s)” by @scatmandan — http://scatmania.org/2012/04/24/visitor-tracking-without-cookies
Ahhhh. After going through the post and some of the comments, I can say only one thing: glad I’m not a web developer anymore :)
Interesting technique, but I agree with the others in this thread that it is the tracking of users which has been legislated against in the EU, not just the use of cookies.