Debugging WorldWideWeb

Earlier this week, I mentioned the exciting hackathon that produced a moderately-faithful reimagining of the world’s first Web browser. I was sufficiently excited about it that I not only blogged here but I also posted about it to MetaFilter. Of course, the very first thing that everybody there did was try to load MetaFilter in it, which… didn’t work.

MetaFilter failing to load on the reimagined WorldWideWeb.
500? Really?

People were quick to point this out and assume that it was something to do with the modernity of MetaFilter:

honestly, the disheartening thing is that many metafilter pages don’t seem to work. Oh, the modern web.

Some even went so far as to speculate that the reason related to MetaFilter’s use of CSS and JS:

CSS and JS. They do things. Important things.

This is, of course, complete baloney, and it’s easy to prove to oneself. Firstly, simply using the View Source tool in your browser on a MetaFilter page reveals source code that’s quite comprehensible, even human-readable, without going anywhere near any CSS or JavaScript.

MetaFilter in Lynx: perfectly usable browing experience
As late as the early 2000s I’d occasionally use Lynx for serious browsing, but any time I’ve used it since it’s been by necessity.

Secondly, it’s pretty simple to try browsing MetaFilter without CSS or JavaScript enabled! I tried in two ways: first, by using Lynx, a text-based browser that’s never supported either of those technologies. I also tried by using Firefox but with them disabled (honestly, I slightly miss when the Web used to look like this):

MetaFilter in Firefox (with CSS and JS disabled)
It only took me three clicks to disable stylesheets and JavaScript in my copy of Firefox… but I’ll be the first to admit that I don’t keep my browser configured like “normal people” probably do.

And thirdly: the error code being returned by the simulated WorldWideWeb browser is a HTTP code 500. Even if you don’t know your HTTP codes (I mean, what kind of weirdo would take the time to memorise them all anyway <ahem>), it’s worth learning this: the first digit of a HTTP response code tells you what happened:

  • 1xx means “everything’s fine, keep going”;
  • 2xx means “everything’s fine and we’re done”;
  • 3xx means “try over there”;
  • 4xx means “you did something wrong” (the infamous 404, for example, means you asked for a page that doesn’t exist);
  • 5xx means “the server did something wrong”.

Simple! The fact that the error code begins with a 5 strongly implies that the problem isn’t in the (client-side) reimplementation of WorldWideWeb: if this had have been a CSS/JS problem, I’d expect to see a blank page, scrambled content, “filler” content, or incomplete content.

So I found myself wondering what the real problem was. This is, of course, where my geek flag becomes most-visible: what we’re talking about, let’s not forget, is a fringe problem in an incomplete simulation of an ancient computer program that nobody uses. Odds are incredibly good that nobody on Earth cares about this except, right now, for me.

Dan's proposed "Geek Flag"
I searched for a “Geek Flag” and didn’t like anything I saw, so I came up with this one based on… well, if you recognise what it’s based on, good for you, you’re certainly allowed to fly it. If not… well, you can too: there’s no geek-gatekeeping here.

Luckily, I spotted Jeremy’s note that the source code for the WorldWideWeb simulator was now available, so I downloaded a copy to take a look. Here’s what’s happening:

  1. The (simulated) copy of WorldWideWeb is asked to open a document by reference, e.g. “https://www.metafilter.com/”.
  2. To work around same-origin policy restrictions, the request is sent to an API which acts as a proxy server.
  3. The API makes a request using the Node package “request” with this line of code: request(url, (error, response, body) => { ... }).  When the first parameter to request is a (string) URL, the module uses its default settings for all of the other options, which means that it doesn’t set the User-Agent header (an optional part of a Web request where the computer making the request identifies the software that’s asking).
  4. MetaFilter, for some reason, blocks requests whose User-Agent isn’t set. This is weird! And nonstandard: while web browsers should – in RFC2119 terms – set their User-Agent: header, web servers shouldn’t require that they do so. MetaFilter returns a 403 and a message to say “Forbidden”; usually a message you only see if you’re trying to access a resource that requires session authentication and you haven’t logged-in yet.
  5. The API is programmed to handle response codes 200 (okay!) and 404 (not found), but if it gets anything else back it’s supposed to throw a 400 (bad request). Except there’s a bug: when trying to throw a 400, it requires that an error message has been set by the request module and if there hasn’t… it instead throws a 500 with the message “Internal Server Fangle” and  no clue what actually went wrong. So MetaFilter’s 403 gets translated by the proxy into a 400 which it fails to render because a 403 doesn’t actually produce an error message and so it gets translated again into the 500 that you eventually see. What a knock-on effect!
Illustration showing conversation between simulated WorldWideWeb and MetaFilter via an API that ultimately sends requests without a User-Agent, gets a 403 in response, and can't handle the 403 and so returns a confusing 500.
If you’re having difficulty visualising the process, this diagram might help you to continue your struggle with that visualisation.

The fix is simple: simply change the line:

request(url, (error, response, body) => { ... })

to:

request({ url: url, headers: { 'User-Agent': 'WorldWideWeb' } }, (error, response, body) => { ... })

This then sets a User-Agent header and makes servers that require one, such as MetaFilter, respond appropriately. I don’t know whether WorldWideWeb originally set a User-Agent header (CERN’s source file archive seems to be missing the relevant C sources so I can’t check) but I suspect that it did, so this change actually improves the fidelity of the emulation as a bonus. A better fix would also add support for and appropriate handling of other HTTP response codes, but that’s a story for another day, I guess.

I know the hackathon’s over, but I wonder if they’re taking pull requests…

CSS-driven console graphics

If you’re reading this post via my blog and using a desktop computer, try opening your browser’s debug console (don’t worry; I’ll wait). If you don’t know how, here’s instructions for Firefox and instructions for Chrome. Other browsers may vary. You ought to see something like this in your debugger:

Debug console on DanQ.me showing Dan's head and a speech bubble.
I’m in your console, eating your commands!

What sorcery is this?

The debug console is designed to be used by web developers so that they can write Javascript code right in their browser as well as to investigate any problems with the code run by a web page. The web page itself can also output to the console, which is usually used for what I call “hello-based debugging”: printing out messages throughout a process so that the flow and progress can be monitored by the developer without having to do “proper” debugging. And it gets used by some web pages to deliver secret messages to any of the site users who open their debugger.

Facebook console messaging advising against the use of the console.
Facebook writes to the console a “stop” message, advising against using the console unless you know what you’re doing in an attempt to stop people making themselves victims of console-based social engineering attacks.

Principally, though, the console is designed for textual content and nothing else. That said, both Firefox and Chrome’s consoles permit the use of CSS to style blocks of debug output by using the %c escape sequence. For example, I could style some of a message with italic text:

>> console.log('I have some %citalic %ctext', 'font-style: italic;', '');
   I have some italic text

Using CSS directives like background, then, it’s easy to see how one could embed an image into the console, and that’s been done before. Instead, though, I wanted to use the lessons I’d learned developing PicInHTML 8¾ years ago to use text and CSS (only) to render a colour picture to the console. First, I created my template image – a hackergotchi of me and an accompanying speech bubble, shrunk to a tiny size and posterised to reduce the number of colours used and saved as a PNG.

Hackergotchi of Dan with a speech bubble, "squashed".
The image appears “squashed” to compensate for console monospace letters not being “square”.

Next, I wrote a quick Ruby program, consolepic.rb, to do the hard work. It analyses each pixel of the image and for each distinct colour assigns to a variable the CSS code used to set the background colour to that colour. It looks for “strings” of like pixels and combines them into one, and then outputs the Javascript necessary to write out all of the above. Finally, I made a few hand-tweaks to insert the text into the speech bubble.

The resulting output weighs in at 31.6kB – about a quarter of the size of the custom Javascript on the frontend of my site and so quite a bit larger than I’d have liked and significantly less-efficient than the image itself, even base64-encoded for embedding directly into the code, but that really wasn’t the point of the exercise, was it? (I’m pretty sure there’s significant room for improvement from a performance perspective…)

Scatmania.org in 2012
I’ll be first to admit it’s not as cool as the “pop-up Dan” in the corner of my 2012 design. You might enjoy my blog post about my 20 years of blogging or the one about how “pop-up Dan” worked.

What it achieved was an interesting experiment into what can be achieved with Javascript, CSS, the browser console, and a little imagination. An experiment that can live here on my site, for anybody who looks in the direction of their debugger, for the foreseeable future (or until I get bored of it). Anybody with any more-exotic/silly ideas about what this technique could be used for is welcome to let me know!

Update: 17 April 2019 – fun though this was, it wasn’t worth continuing to deliver an additional 25% Javascript payload to every visitor just for this, so I’ve stopped it for now. You can still read the source code (and even manually run it in the console) if you like. And I have other ideas for fun things to do with the console, so keep an eye out for that…

Things you probably didn’t know you could do with Chrome’s Developer Console

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Chrome comes with built-in developer tools. This comes with a wide variety of features, such as Elements, Network, and Security. Today, we’ll focus 100% on its JavaScript console.

When I started coding, I only used the JavaScript console for logging values like responses from the server, or the value of variables. But over time, and with the help of tutorials, I discovered that the console can do way more than I ever imagined…