Debugging WorldWideWeb

Earlier this week, I mentioned the exciting hackathon that produced a moderately-faithful reimagining of the world’s first Web browser. I was sufficiently excited about it that I not only blogged here but I also posted about it to MetaFilter. Of course, the very first thing that everybody there did was try to load MetaFilter in it, which… didn’t work.

MetaFilter failing to load on the reimagined WorldWideWeb.
500? Really?

People were quick to point this out and assume that it was something to do with the modernity of MetaFilter:

honestly, the disheartening thing is that many metafilter pages don’t seem to work. Oh, the modern web.

Some even went so far as to speculate that the reason related to MetaFilter’s use of CSS and JS:

CSS and JS. They do things. Important things.

This is, of course, complete baloney, and it’s easy to prove to oneself. Firstly, simply using the View Source tool in your browser on a MetaFilter page reveals source code that’s quite comprehensible, even human-readable, without going anywhere near any CSS or JavaScript.

MetaFilter in Lynx: perfectly usable browing experience
As late as the early 2000s I’d occasionally use Lynx for serious browsing, but any time I’ve used it since it’s been by necessity.

Secondly, it’s pretty simple to try browsing MetaFilter without CSS or JavaScript enabled! I tried in two ways: first, by using Lynx, a text-based browser that’s never supported either of those technologies. I also tried by using Firefox but with them disabled (honestly, I slightly miss when the Web used to look like this):

MetaFilter in Firefox (with CSS and JS disabled)
It only took me three clicks to disable stylesheets and JavaScript in my copy of Firefox… but I’ll be the first to admit that I don’t keep my browser configured like “normal people” probably do.

And thirdly: the error code being returned by the simulated WorldWideWeb browser is a HTTP code 500. Even if you don’t know your HTTP codes (I mean, what kind of weirdo would take the time to memorise them all anyway <ahem>), it’s worth learning this: the first digit of a HTTP response code tells you what happened:

  • 1xx means “everything’s fine, keep going”;
  • 2xx means “everything’s fine and we’re done”;
  • 3xx means “try over there”;
  • 4xx means “you did something wrong” (the infamous 404, for example, means you asked for a page that doesn’t exist);
  • 5xx means “the server did something wrong”.

Simple! The fact that the error code begins with a 5 strongly implies that the problem isn’t in the (client-side) reimplementation of WorldWideWeb: if this had have been a CSS/JS problem, I’d expect to see a blank page, scrambled content, “filler” content, or incomplete content.

So I found myself wondering what the real problem was. This is, of course, where my geek flag becomes most-visible: what we’re talking about, let’s not forget, is a fringe problem in an incomplete simulation of an ancient computer program that nobody uses. Odds are incredibly good that nobody on Earth cares about this except, right now, for me.

Dan's proposed "Geek Flag"
I searched for a “Geek Flag” and didn’t like anything I saw, so I came up with this one based on… well, if you recognise what it’s based on, good for you, you’re certainly allowed to fly it. If not… well, you can too: there’s no geek-gatekeeping here.

Luckily, I spotted Jeremy’s note that the source code for the WorldWideWeb simulator was now available, so I downloaded a copy to take a look. Here’s what’s happening:

  1. The (simulated) copy of WorldWideWeb is asked to open a document by reference, e.g. “https://www.metafilter.com/”.
  2. To work around same-origin policy restrictions, the request is sent to an API which acts as a proxy server.
  3. The API makes a request using the Node package “request” with this line of code: request(url, (error, response, body) => { ... }).  When the first parameter to request is a (string) URL, the module uses its default settings for all of the other options, which means that it doesn’t set the User-Agent header (an optional part of a Web request where the computer making the request identifies the software that’s asking).
  4. MetaFilter, for some reason, blocks requests whose User-Agent isn’t set. This is weird! And nonstandard: while web browsers should – in RFC2119 terms – set their User-Agent: header, web servers shouldn’t require that they do so. MetaFilter returns a 403 and a message to say “Forbidden”; usually a message you only see if you’re trying to access a resource that requires session authentication and you haven’t logged-in yet.
  5. The API is programmed to handle response codes 200 (okay!) and 404 (not found), but if it gets anything else back it’s supposed to throw a 400 (bad request). Except there’s a bug: when trying to throw a 400, it requires that an error message has been set by the request module and if there hasn’t… it instead throws a 500 with the message “Internal Server Fangle” and  no clue what actually went wrong. So MetaFilter’s 403 gets translated by the proxy into a 400 which it fails to render because a 403 doesn’t actually produce an error message and so it gets translated again into the 500 that you eventually see. What a knock-on effect!
Illustration showing conversation between simulated WorldWideWeb and MetaFilter via an API that ultimately sends requests without a User-Agent, gets a 403 in response, and can't handle the 403 and so returns a confusing 500.
If you’re having difficulty visualising the process, this diagram might help you to continue your struggle with that visualisation.

The fix is simple: simply change the line:

request(url, (error, response, body) => { ... })

to:

request({ url: url, headers: { 'User-Agent': 'WorldWideWeb' } }, (error, response, body) => { ... })

This then sets a User-Agent header and makes servers that require one, such as MetaFilter, respond appropriately. I don’t know whether WorldWideWeb originally set a User-Agent header (CERN’s source file archive seems to be missing the relevant C sources so I can’t check) but I suspect that it did, so this change actually improves the fidelity of the emulation as a bonus. A better fix would also add support for and appropriate handling of other HTTP response codes, but that’s a story for another day, I guess.

I know the hackathon’s over, but I wonder if they’re taking pull requests…

Post-It Minesweeper

Remember Minesweeper? It’s probably been forever since you played, so go have a game online now. And there went your afternoon.

A game of Microsoft Minesweeper in progress.
This is actually a pretty tough move.

My geek-crush Ben Foxall posted on Twitter on Monday morning to share that he’d had a moment of fun nostalgia when he’d come into the office to discover that somebody in his team had covered his monitor with two layers of Post-It notes. The bottom layer contained numbers – and bombs! – to represent the result of a Minesweeper board, and the upper layer ‘covered’ them so that individual Post-Its could be removed to reveal what lay beneath. Awesome.

Ben Foxall discovers Post-It Minesweeper
Unlike most computerised implementations of Minesweeper, the first move isn’t guaranteed to be safe. Tread carefully…

Not to be outdone, I hunted around my office and found some mini-Post-Its. Being smaller meant that I could fit more of them onto a monitor and thus make a more-sophisticated (and more-challenging!) play space. But how to generate the board? Sure: I could do it by hand, but that doesn’t seem very elegant at all – plus, humans make really bad random number generators! I didn’t need quantum-tunnelling-seeded Minesweeper (yes, that’s a thing) levels of entropy, sure, but it’d still be nice to outsource the heavy lifting to a computer, right?

Screenshot of my Post-It Minesweeper board generator.
Yes, I’m quite aware of the irony of using a computer to generate a paper-based version of a computer game, why do you ask?

So naturally, I wrote a program to do it for me. Want to see? It’s at danq.me/minesweeper. Just line up some Post-Its on a co-worker’s monitor to work out how many you can fit across it in each dimension (I found that I could get 6 × 4 standard-sized Post-Its but 7 × 5 or even 8 × 5 mini-sized Post-Its very comfortably onto one of the typical widescreen monitors in my office), decide how many mines you want, and click Generate. Don’t like the board you get? Click it again!

Liz McCarthy tweets about her experience of being given a Post-It Minesweeper game to play.
I set up the first game on my colleague Liz’s computer, before she came in this morning.

And because I was looking for a fresh excuse to play with Periscope, I broadcast the first game I set up live to the Internet. In the end, 66 people ended up watching some or all of a paper-based game of Minesweeper played by my colleague Liz, including moments of cheering her on and, in one weird moment, dispair at the revelation that she was married. The internet’s strange, yo.

Anyway: in case you missed the Periscope broadcast, I’ve put it on YouTube. Sorry about the portrait-orientation filming: I think it’s awful, too, but it’s a Periscope thing and I haven’t installed the new update that fixes it yet.

Now go set up a game of Post-It Minesweeper for a friend or co-worker.

Interview Sarah Palin

Remember about four-and-a-bit years ago, I downloaded Dadadodo, which I described at the time as a “word disassociator?” The program itself is a Markov chain generator/randomiser that works on sentence structures: in other words, given some text (speeches, poetry, blog posts, whatever – other kinds have been demonstrated to work on things like music) it will learn the frequencies in which words and punctuation follow other words and punctuation and use that to build resulting sentences.

Imagine the fun you could have if you took the combined speeches of any politician particularly famous for waffling through their answers. Like, say, US presidential election Republican party running mate Sarah Palin

Well, imagine no more – Interview Sarah Palin has you covered. Kick-starting paragraphs (“winding her up”) with particular topics (e.g. “Iraq and Afghanistan,” “John McCain,” etc.) sets off this fabulous little Markov-chain-speechbot. Even if you don’t understand even the theories of the mathematics, you can enjoy this site so long as you’ve got a suitable sense of humour around political waffling.

Super Munchkin

There’s been quite a lot said recently on abnib about class. JTA opened up the debate; Claire followed up by listing some of her least favourite things about the stereotypes of the middle class, and attracted a lot of debate in her comments; Matt P argued that the class system doesn’t exist (or, at least, isn’t relevant) in the UK any more anyway; and even Beth weighed in with her opinions on the whole thing, although it did take me prodding her with a virtual stick before she did so.

I thought it was about time that I rode in like a knight in slimy armour (wearing my helm of peripheral vision, of course) and closed the argument once and for all:

Who says I can’t be a half-middle-class, half-lower-class half-Elf, half-Orc?

(with insincere apologies to those who don’t play Munchkin)

Incidentally, Geek Night this week will be on Friday at Ele and Penny‘s house.