Blog

Debugging WorldWideWeb

Earlier this week, I mentioned the exciting hackathon that produced a moderately-faithful reimagining of the world’s first Web browser. I was sufficiently excited about it that I not only blogged here but I also posted about it to MetaFilter. Of course, the very first thing that everybody there did was try to load MetaFilter in it, which… didn’t work.

MetaFilter failing to load on the reimagined WorldWideWeb.
500? Really?

People were quick to point this out and assume that it was something to do with the modernity of MetaFilter:

honestly, the disheartening thing is that many metafilter pages don’t seem to work. Oh, the modern web.

Some even went so far as to speculate that the reason related to MetaFilter’s use of CSS and JS:

CSS and JS. They do things. Important things.

This is, of course, complete baloney, and it’s easy to prove to oneself. Firstly, simply using the View Source tool in your browser on a MetaFilter page reveals source code that’s quite comprehensible, even human-readable, without going anywhere near any CSS or JavaScript.

MetaFilter in Lynx: perfectly usable browsing experience
As late as the early 2000s I’d occasionally use Lynx for serious browsing, but any time I’ve used it since it’s been by necessity.

Secondly, it’s pretty simple to try browsing MetaFilter without CSS or JavaScript enabled! I tried in two ways: first, by using Lynx, a text-based browser that’s never supported either of those technologies. I also tried by using Firefox but with them disabled (honestly, I slightly miss when the Web used to look like this):

MetaFilter in Firefox (with CSS and JS disabled)
It only took me three clicks to disable stylesheets and JavaScript in my copy of Firefox… but I’ll be the first to admit that I don’t keep my browser configured like “normal people” probably do.

And thirdly: the error code being returned by the simulated WorldWideWeb browser is a HTTP code 500. Even if you don’t know your HTTP codes (I mean, what kind of weirdo would take the time to memorise them all anyway <ahem>), it’s worth learning this: the first digit of a HTTP response code tells you what happened:

  • 1xx means “everything’s fine, keep going”;
  • 2xx means “everything’s fine and we’re done”;
  • 3xx means “try over there”;
  • 4xx means “you did something wrong” (the infamous 404, for example, means you asked for a page that doesn’t exist);
  • 5xx means “the server did something wrong”.

Simple! The fact that the error code begins with a 5 strongly implies that the problem isn’t in the (client-side) reimplementation of WorldWideWeb: if this had been a CSS/JS problem, I’d expect to see a blank page, scrambled content, “filler” content, or incomplete content.
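
The first-digit rule above is easy to express in code. Here's a minimal sketch; the function name `statusMeaning` and the exact wording of the returned strings are my own illustration, not part of any library:

```javascript
// Map an HTTP status code to a rough meaning based on its first digit.
// `statusMeaning` is a hypothetical helper name used only for illustration.
function statusMeaning(code) {
  // Anything outside 100-599 isn't a valid HTTP status code.
  if (!Number.isInteger(code) || code < 100 || code > 599) {
    return 'unknown';
  }
  const meanings = {
    1: "everything's fine, keep going",
    2: "everything's fine and we're done",
    3: 'try over there',
    4: 'you did something wrong',
    5: 'the server did something wrong',
  };
  // Integer-divide by 100 to get the first digit.
  return meanings[Math.floor(code / 100)];
}

console.log(statusMeaning(404)); // "you did something wrong"
console.log(statusMeaning(500)); // "the server did something wrong"
```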

So I found myself wondering what the real problem was. This is, of course, where my geek flag becomes most-visible: what we’re talking about, let’s not forget, is a fringe problem in an incomplete simulation of an ancient computer program that nobody uses. Odds are incredibly good that nobody on Earth cares about this except, right now, for me.

Dan's proposed "Geek Flag"
I searched for a “Geek Flag” and didn’t like anything I saw, so I came up with this one based on… well, if you recognise what it’s based on, good for you, you’re certainly allowed to fly it. If not… well, you can too: there’s no geek-gatekeeping here.

Luckily, I spotted Jeremy’s note that the source code for the WorldWideWeb simulator was now available, so I downloaded a copy to take a look. Here’s what’s happening:

  1. The (simulated) copy of WorldWideWeb is asked to open a document by reference, e.g. “https://www.metafilter.com/”.
  2. To work around same-origin policy restrictions, the request is sent to an API which acts as a proxy server.
  3. The API makes a request using the Node package “request” with this line of code: request(url, (error, response, body) => { ... }). When the first parameter to request is a (string) URL, the module uses its default settings for all of the other options, which means that it doesn’t set the User-Agent header (an optional part of a Web request where the computer making the request identifies the software that’s asking).
  4. MetaFilter, for some reason, blocks requests whose User-Agent isn’t set. This is weird! And nonstandard: while web browsers should – in RFC2119 terms – set their User-Agent: header, web servers shouldn’t require that they do so. MetaFilter returns a 403 and a message to say “Forbidden”; usually a message you only see if you’re trying to access a resource that requires session authentication and you haven’t logged in yet.
  5. The API is programmed to handle response codes 200 (okay!) and 404 (not found), but if it gets anything else back it’s supposed to throw a 400 (bad request). Except there’s a bug: when trying to throw a 400, it requires that an error message has been set by the request module and if there hasn’t… it instead throws a 500 with the message “Internal Server Fangle” and no clue what actually went wrong. So MetaFilter’s 403 gets translated by the proxy into a 400, which it fails to render because a 403 doesn’t actually produce an error message, and so it gets translated again into the 500 that you eventually see. What a knock-on effect!
Illustration showing conversation between simulated WorldWideWeb and MetaFilter via an API that ultimately sends requests without a User-Agent, gets a 403 in response, and can't handle the 403 and so returns a confusing 500.
If you’re having difficulty visualising the process, this diagram might help you to continue your struggle with that visualisation.
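
To make the knock-on effect concrete, here's a small model of the status translation described in the steps above. It is a sketch of the behaviour as I've described it, not the simulator's actual source: the function name `translateStatus` and the shape of its return value are assumptions made for illustration. The one detail it relies on is that the request module passes `null` as the error when the HTTP exchange itself succeeds, even if the response is a 403.

```javascript
// A hypothetical model of the proxy's status handling (not the real source).
// upstreamStatus: the status code the remote site sent back.
// requestError:   the `error` argument from request's callback; it is null
//                 whenever an HTTP response was received, whatever its code.
function translateStatus(upstreamStatus, requestError) {
  if (upstreamStatus === 200) return { status: 200 };
  if (upstreamStatus === 404) return { status: 404 };
  try {
    // Intended behaviour: report anything else as a 400, using the
    // request module's error message. For a 403 there is no error
    // object, so reading `.message` from null throws a TypeError...
    return { status: 400, message: requestError.message };
  } catch (e) {
    // ...which ends up reported as an unhelpful generic 500.
    return { status: 500, message: 'Internal Server Fangle' };
  }
}

console.log(translateStatus(403, null).status); // 500
```

Note that the 400 path only works when the request failed at the network level, which is exactly the case where there's no upstream status code to translate in the first place.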

The fix is simple: change the line:

request(url, (error, response, body) => { ... })

to:

request({ url: url, headers: { 'User-Agent': 'WorldWideWeb' } }, (error, response, body) => { ... })

This then sets a User-Agent header and makes servers that require one, such as MetaFilter, respond appropriately. I don’t know whether WorldWideWeb originally set a User-Agent header (CERN’s source file archive seems to be missing the relevant C sources so I can’t check) but I suspect that it did, so this change actually improves the fidelity of the emulation as a bonus. A better fix would also add support for and appropriate handling of other HTTP response codes, but that’s a story for another day, I guess.
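
For what it's worth, here's a rough sketch of what that "better fix" might look like. Everything here is an assumption for illustration: the helper names `buildRequestOptions` and `describeUpstream` are made up, and reporting an unreachable upstream server as a 502 (bad gateway) is just one reasonable design choice for a proxy, not something the project does.

```javascript
// Hypothetical helper: build the options object for the request module,
// always including a User-Agent header.
function buildRequestOptions(url) {
  return { url: url, headers: { 'User-Agent': 'WorldWideWeb' } };
}

// Hypothetical helper: decide what the proxy should report, given the
// `error` and `response` arguments from request's callback.
function describeUpstream(error, response) {
  if (error) {
    // Network-level failure: we never got an HTTP response at all.
    return { status: 502, message: error.message };
  }
  const code = response.statusCode;
  if (code >= 200 && code < 300) {
    return { status: code, message: 'OK' };
  }
  // Pass other codes (403, 500, ...) through with a readable message
  // instead of collapsing them into a generic error.
  return { status: code, message: 'Upstream server responded with ' + code };
}

// Intended usage with the request module (not run here):
// request(buildRequestOptions(url), (error, response, body) => {
//   const result = describeUpstream(error, response);
//   ...
// });
```

Passing the upstream code through means a MetaFilter-style 403 would reach the user as a 403, which at least points at the right culprit.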

I know the hackathon’s over, but I wonder if they’re taking pull requests…


“Ammo can” style geocache – a guide for UK cachers

“Ammo can” style cache containers are commonplace in the USA but very rare in the UK. As a result, British cachers coming across them for the first time sometimes report difficulty in opening or closing the containers or accidentally removing the lid and being unable to reattach it. This video quickly examines an ammo can cache so that you might know your way around it.

WorldWideWeb, 30 years on

This month, a collection of some of my favourite geeks got invited to CERN in Geneva to participate in a week-long hackathon with the aim of reimplementing WorldWideWeb – the first web browser, circa 1990-1994 – as a web application. I’m super jealous, but I’m also really pleased with what they managed to produce.

DanQ.me as displayed by the reimagined WorldWideWeb browser circa 1990
With the exception of a few character entity quirks, this site remains perfectly usable in the simulated WorldWideWeb browser. Clearly I wasn’t the only person to try this vanity-check…

This represents a huge leap forward from their last similar project, which aimed to recreate the line mode browser: the first web browser that didn’t require a NeXT computer to run it and so a leap forward in mainstream appeal. In some ways, you might expect reimplementing WorldWideWeb to be easier, because its functionality is more-similar to that of a modern browser, but there were doubtless some challenges too: this early browser predated the concept of the DOM and so there are distinct processing differences that must be considered to get a truly authentic experience.

Geeks hacking on WorldWideWeb reborn
It’s just like any other hackathon, if you ignore the enormous particle collider underneath it.

Among their outputs, the team also produced a cool timeline of the Web, which – thanks to some careful authorship – is as legible in WorldWideWeb as it is in a modern browser (if, admittedly, a little less pretty).

WorldWideWeb screenshot by Sir Tim Berners-Lee
When Sir Tim took this screenshot, he could never have predicted the way the Web would change, technically, over the next 25-30 years. But I’m almost more-interested in how it’s stayed the same.

In an age of increasing Single Page Applications and API-driven sites and “apps”, it’s nice to be reminded that if you develop right for the Web, your content will be visible (sort-of; I’m aware that there are some liberties taken here in memory and processing limitations, protocols and negotiation) on machines 30 years old, and that gives me hope that adherence to the same solid standards gives us a chance of writing pages today that look just as good in 30 years to come. Compare that to a proprietary technology like Flash whose heyday 15 years ago is overshadowed by its imminent death (not to mention Java applets or ActiveX <shudders>), iOS apps which stopped working when the operating system went 64-bit, and websites which only work in specific browsers (traditionally Internet Explorer, though as I’ve complained before we’re getting more and more Chrome-only sites).

The Web is a success story in open standards, natural and by-design progressive enhancement, and the future-proof archivability of human-readable code. Long live the Web.

Update 24 February 2019: After I submitted news of the browser to MetaFilter, I (and others) spotted a bug. So I came up with a fix…


Minimal Google Analytics Snippet

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

<script>
(function(a,b,c){var d=a.history,e=document,f=navigator||{},g=localStorage,
h=encodeURIComponent,i=d.pushState,k=function(){return Math.random().toString(36)},
l=function(){return g.cid||(g.cid=k()),g.cid},m=function(r){var s=[];for(var t in r)
r.hasOwnProperty(t)&&void 0!==r[t]&&s.push(h(t)+"="+h(r[t]));return s.join("&")},
n=function(r,s,t,u,v,w,x){var z="https://www.google-analytics.com/collect",
A=m({v:"1",ds:"web",aip:c.anonymizeIp?1:void 0,tid:b,cid:l(),t:r||"pageview",
sd:c.colorDepth&&screen.colorDepth?screen.colorDepth+"-bits":void 0,dr:e.referrer||
void 0,dt:e.title,dl:e.location.origin+e.location.pathname+e.location.search,ul:c.language?
(f.language||"").toLowerCase():void 0,de:c.characterSet?e.characterSet:void 0,
sr:c.screenSize?(a.screen||{}).width+"x"+(a.screen||{}).height:void 0,vp:c.screenSize&&
a.visualViewport?(a.visualViewport||{}).width+"x"+(a.visualViewport||{}).height:void 0,
ec:s||void 0,ea:t||void 0,el:u||void 0,ev:v||void 0,exd:w||void 0,exf:"undefined"!=typeof x&&
!1==!!x?0:void 0});if(f.sendBeacon)f.sendBeacon(z,A);else{var y=new XMLHttpRequest;
y.open("POST",z,!0),y.send(A)}};d.pushState=function(r){return"function"==typeof d.onpushstate&&
d.onpushstate({state:r}),setTimeout(n,c.delay||10),i.apply(d,arguments)},n(),
a.ma={trackEvent:function o(r,s,t,u){return n("event",r,s,t,u)},
trackException:function q(r,s){return n("exception",null,null,null,null,r,s)}}})
(window,"XX-XXXXXXXXX-X",{anonymizeIp:true,colorDepth:true,characterSet:true,screenSize:true,language:true});
</script>

This is cute: a Google Analytics code snippet that results in a payload about a fiftieth of the size of the one provided by Google but still provides most of the important features.

You probably don’t need a single-page application

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

The meteoric rise of front-end frameworks like React, Angular, Vue.js, Elm, etc. has made single-page applications ubiquitous on the web. For many developers, these have become part of their ‘default’ toolset. When they start a new project, they grab the tools they know already: a REST API on the backend, and a React/Angular/Vue/Elm frontend.

Is there something wrong with these tools? Absolutely not. In fact, I love working with them. However, I would only choose this architecture when an actual requirement is pushing me in that direction. If there are no specific reasons to build a single-page application, I will go with a traditional server-rendered architecture every day of the week. It is simpler and allows you to move faster.

There’s been an increasing trend towards delivering web applications as SPAs backed by an API. I can see the attraction: disposing of the browser’s navigation cycle lets you develop that coveted “app-like” interaction experience, pushing only data around lets you implement multiple clients backed by the same single middleware, and it results in a development workflow that fits tightly with many of the hippest frameworks (go jamstack, backendless, Node-backed, or whatever). I love REST and all, but I feel that it works best when it’s used to deliver multiformat results (whether by content negotiation or whatever): web pages for the humans, JSON or whatever for the computers.

For an increasing number of developers, SPAs are a golden hammer. Let’s fix that.

Post-it Note Affirmations and the Amazon Dash

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

The amazon dash is a pinnacle of modern web design. It’s one of the most intrusive, complex, and resource-dependent devices we’ve introduced into our homes, yet it appears as a simple oval with a single button for a single use. The use is absurdly narrow: the button will have a picture of Tide detergent, and when you press the button, Tide detergent is sent to your door.

Barely a week goes by between the times that I discover some horrifically over-engineered “solution” on the Internet. Amazon’s Dash buttons are terrible: disposable (plastic) single-purpose computers that could so easily have been made into something “more” – more-versatile, more-open, more-configurable, more-flexible. Indeed: people have been doing exactly that kind of thing! But the vanilla Dash button remains little more than selling you convenience (and not much convenience, if we’re honest) in exchange for more and more of your feeling of digital freedom. Yet another example of what replaced the Web we lost…

By hiding the technical processes, and simplifying the onboarding and engagement of their services, Amazon can continually reinforce your depression for a profit— and you can get name-brand laundry detergent faster.

Also, can I just take a moment to point out how awesome Zach’s website is? Not only is it the perfect example of how fun and weird the Internet can be, with a mixture of fascinating and curious content, it’s also available via dat:// for those of you who’ve got some love for the datbaseiverse.

Note #12968

Blood on a pillow

Woke up this morning bleeding from the neck. Surprise #vampire attack?


Newtown bypass in Powys opens after 70-year wait

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Town’s bypass opens after 70-year wait (BBC News)

It was 1949 when highways officials started to look at traffic issues affecting Newtown.

A multi-million pound bypass that has been 70 years in the planning officially opened in Powys on Thursday.

One haulier said Newtown bypass will make a “big difference” due to 45-minute hold-ups in the town, while the local AM said it was a “momentous” day.

The Welsh Government said the road will ease congestion by about 40% in the town centre.

A public notice printed in 1949 shows a bypass was being considered by the former Montgomeryshire County Council.

The four-mile (6.4km) road runs to the south of the town with two lanes in one direction and one in the opposite direction, to provide overtaking points.

Never thought I’d see the day. Back when she used to work in Newtown, Claire would routinely be delayed on her journey home by traffic passing through the town that could quite-justifiably have gone around it were it not for the lack of a decent trunk road, and she’d bemoan the continuing absence of the long-promised bypass. That was like 15 years ago… I can’t imagine what it’s been like for the people who’ve lived in Newtown, waiting for the bypass to be built, for their entire life.

In the time it’s taken to build this bypass, people who’ve been too young to drive have heard about it, grown up, had children of their own, and those people have had children who are now old enough to drive. The mind boggles.

Programming is just solving puzzles

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

‘Programming is just solving puzzles’ – Nominet (Nominet)

Ruth Trevor-Allen

As a child, I wanted to be a botanical researcher. I loved being outdoors and used to visit the botanical gardens near my house all the time. My grandma inspired me to change my mind and helped me get interested in science. She lived in the country and we would look at the stars together, which led to an early fascination in astronomy.

Unusually for the era, both my grandmothers had worked in science: one as a lab technician and one as a researcher in speech therapy. I have two brothers, but neither went into technology as a career. My mum was a vicar and my dad looked after us kids, although he had been a maths teacher.

My aptitude for science and maths led me to study physics at university, but I didn’t enjoy it, and switched to software engineering after the first year. As soon as I did my first bit of programming, I knew this was what I had been looking for. I like solving problems and building stuff that works, and programming gave me the opportunity to do both. It was my little eureka moment.

Wise words from my partner on her workplace’s blog as part of a series of pieces they’re doing on women in technology. Plus, a nice plug for Three Rings there (thanks, love!).

Shouldn’t We All Have Seamless Micropayments By Now?

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Shouldn’t We All Have Seamless Micropayments By Now? (WIRED)

The web’s founders fully expected some form of digital payment to be integral to its functioning. But nearly three decades later, we’re still waiting.

Back in the 1990s, when Tim Berners-Lee and his team were creating the infrastructure of the World Wide Web, they made a list of the error codes that would pop up when something went wrong. You’ve surely encountered many of them: “404 Not Found,” which pops up if you click on a dead link; “401 Unauthorized” when you hit a page that needs a password; and so on.

Here’s one you probably haven’t seen—and its absence from your life speaks to why the promise of the early web seems increasingly out of reach: “402 Payment Required.”

That’s right: The web’s founders fully expected some form of digital payment to be integral to its functioning, just as integral as links, web pages, and passwords. After all, without a way to quickly and smoothly exchange money, how would a new economy be able to flourish online? Of course there ought to be a way to integrate digital cash into browsing and other activities. Of course.

Yet after almost three decades, that 402 error code is still “reserved for future use.” So I still have to ask: Where are my digital micropayments? Where are those frictionless, integrated ways of exchanging money online—cryptographically protected to allow commerce but not surveillance?

In response to this article being discussed on MetaFilter, I wrote:

The Web Payments Working Group published a specification for a standardised mechanism for the collection of card payment details online, a couple of years ago. It’s not quite the same thing because it’s done in the page application rather than at the HTTP(S) level, but it goes a long way towards solving a lot of the problems with our existing approach to payment processing online.

It’s already seeing adoption in browsers, but merchants and payment processors are unlikely to start rolling it out until later because (a) they want critical mass and (b) they’re wary of change. But within a few years, you’ll probably see it for the first time, and you might not even notice.

The idea is that instead of asking you to fill out an (arbitrary) form, a web page will ask your browser to get payment details from you in a standardised format. That might mean entering your card details if that’s how you prefer to work (but even if you choose to do this, the form you fill in will look the same every time) but it would instead allow you to use a payment tool built in to your browser, operating system, or password safe to do it for you. I know that browsers and password safes will offer to try to do this today, but standardising the format means that they’ll always be able to achieve it.

Once this technology’s in place, there’s nothing to stop HTTP 402’s implementation being completed: all the infrastructure will exist.

The thing about the future is that when it arrives, you don’t even notice. It’s never jetpacks and flying cars: it’s a series of iterative changes, each one predictable after the completion of the last but the entire ensemble seeming innovative and surprising when taken as a whole.