Browser diversity starts with us

This article is a repost promoting content originally published elsewhere.

Jeffrey Zeldman

Even if you love Chrome, adore Gmail, and live in Google Docs or Analytics, no single company, let alone a user-tracking advertising giant, should control the internet.

Diversity is as good for the web as it is for society. And it starts with us.

Yet more fallout from the Microsoft announcement that Edge will switch to Chromium, which I discussed earlier. This one’s pretty inspirational, and gives a good reminder about what our responsibilities are to the Web, as its developers.

When to use CSS vs. JavaScript

This article is a repost promoting content originally published elsewhere.

by an author

CSS before JS

My general rule of thumb is…

If something I want to do with JavaScript can be done with CSS instead, use CSS.

CSS parses and renders faster.

For things like animations, it more easily hooks into the browser’s refresh rate cycle to provide silky smooth animations (this can be done in JS, too, but CSS just makes it so damn easy).

And it fails gracefully.

A JavaScript error can bring all of the JS on a page to a screeching halt. Mistype a CSS property or miss a semicolon? The browser just skips the property and moves on. Use an unsupported feature? Same thing.
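To see that graceful failure in action, consider a stylesheet with a mistyped property (a generic illustration, not taken from the original article): the browser drops only the bad declaration and applies the rest.

    p {
      colr: red;          /* mistyped property: the browser skips this declaration… */
      font-style: italic; /* …and carries on applying everything else */
    }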

This exactly! If you want progressive enhancement (and you should), performance, and the cleanest separation of behaviour and presentation, the pages you deliver to your users (regardless of what technology you use on your server) should consist of:

  • HTML, written in such a way that they’re complete and comprehensible alone – from an information science perspective, your pages shouldn’t “need” any more than this (although it’s okay if they’re pretty ugly without any more)
  • CSS, adding design, theme, look-and-feel to your web page
  • Javascript, using progressive enhancement to add functionality in-the-browser (e.g. validation on the client-side in addition to the server-side validation, for speed and ease of user experience) and, where absolutely necessary, to add functionality not possible any other way (e.g. if you’re looking to tap into the geolocation API, you’re going to need Javascript… but it’s still desirable to provide as much of the experience as possible without) – see the sketch after this list
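As a sketch of that progressive-enhancement layer (illustrative only – the selector and message are invented for this example), a little client-side validation might look like this, with the server still validating everything on submission:

    // Enhance any form marked data-validate; without Javascript the form still
    // submits normally and the server-side validation catches any problems.
    document.querySelectorAll('form[data-validate]').forEach(form => {
      form.addEventListener('submit', event => {
        const email = form.querySelector('input[type="email"]');
        if (email && !email.value.includes('@')) {
          event.preventDefault(); // save a round-trip the server would reject anyway
          alert('Please enter a valid email address.');
        }
      });
    });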

When developers fail to follow this principle, they make the Web more fragile and harder to archive. It’s not hard to do things “right”: we just need to make sure that developers learn what “right” is and why it’s important.

Incidentally, I just made some enhancements to the header of this site, including some CSS animations on the logo and menu (none of them necessary, but all useful) and some Javascript to help ensure that users of touch-capable devices have an easier time. Note that neither Javascript nor CSS is required to use this site; they just add value… just the way the Web ought to be (where possible).

The CSS Working Group At TPAC: What’s New In CSS?

This article is a repost promoting content originally published elsewhere.

by an author

Last week, I attended W3C TPAC as well as the CSS Working Group meeting there. Various changes were made to specifications, and discussions had which I feel are of interest to web designers and developers. In this article, I’ll explain a little bit about what happens at TPAC, and show some examples and demos of the things we discussed at TPAC for CSS in particular.

This article describes proposals for the future of CSS, some of which are really interesting. It includes mention of:

  • CSS scrollbars – defining the look and feel of scrollbars. If that sounds familiar, it’s because it’s not actually new: Internet Explorer 5.5 (and contemporaneous versions of Opera) supported a proprietary CSS extension that did the same thing back in 2000!
  • Aspect ratio units – this long-needed feature would make it possible to e.g. state that a box is square (or 4:3, or whatever), which has huge value for CSS grid layouts: I’m excited by this one.
  • :where() – although I’ll be steering clear until they decide whether the related :matches() becomes :is(), I can see a million uses for this (and its widespread existence would dramatically reduce how often I feel the need to use a preprocessor!) – see the sketch after this list
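To give a flavour of these proposals, here’s a sketch using the syntax from the then-current drafts (which may well change before anything ships):

    /* CSS scrollbars: style the thumb and track without proprietary extensions */
    .panel {
      scrollbar-color: rebeccapurple #eee; /* thumb colour, then track colour */
      scrollbar-width: thin;
    }

    /* Aspect ratio: keep grid cells square however wide they get */
    .grid-cell {
      aspect-ratio: 1 / 1;
    }

    /* :where() matches like :matches()/:is(), but contributes no specificity */
    :where(article, section, aside) h2 {
      margin-top: 1.5em;
    }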

Rehabilitating Google AMP: My failed attempt

This article is a repost promoting content originally published elsewhere.

by an author

This article is a follow-up to my article “Why Google AMP is a threat to the Open Web”. In the comments of that article I promised I’d soon provide a follow-up, and for reasons I’ll get into, that has not been possible until now – but now I’m finally providing it.

Back in February I wrote an article explaining how I believed Google AMP had been imposed on the web by Google as a ‘standard’ for developing fast webpages, and expressing my dismay about that. Google apparently developed this as an internal project, without any open collaboration and bypassing the W3C standardization processes. Google made implementation of Google AMP a requirement for appearing at the top of the search results for common news searches.

To many of us open web folk, Google’s AMP violated the widely held principle of search engines not putting bias into search results, and/or the principle of web standards (take your pick – it would not be bias if it was a standardized approach that the wider web community had agreed upon).

You know how I feel about AMP. I’m not alone, and others are doing a pretty good job of talking to Google about our concerns. Unfortunately, Google aren’t listening.

IndieWebCamp Oxford

This weekend, I attended part of Oxford’s first ever IndieWebCamp! As a long (long, long) time proponent of IndieWeb philosophy (since long before anybody said “IndieWeb”, at least) I’ve got my personal web presence pretty-well sorted out. Still, I loved the idea of attending and pushing some of my own tools even further: after all, a personal website isn’t “finished” until its owner says it is! One of the things I ended up hacking on was pretty-predictable: enhancements to my recently-open-sourced geocaching PESOS tools… but the other’s worth sharing too, I think.

Hacking and learning at IndieWebCamp Oxford
Some of IndieWebCamp Oxford’s attendees share knowledge and hack code together.

I’ve recently been playing with WebVR – for my day job at the Bodleian, I swear! – and I was looking for an excuse to try to expand some of what I’d learned into my personal blog, too. Given that I’ve recently acquired a Ricoh Theta V I thought that this’d be the perfect opportunity to add WebVR-powered panoramas to this site. My goals were:

  • Entirely self-hosted; no external third-party dependencies
  • Must degrade gracefully: even if you’re using an older browser, don’t have Javascript enabled, etc., it should at least show the original image
  • In plain-old browsers, support mouse (or touch) control to pan the scene
  • Where accelerometers are available (e.g. mobiles), “magic window” support to allow twist-to-explore
  • And where “true” VR hardware (Cardboard, Vive, Rift etc.) with WebVR support is available, allow one-click use of that
IndieWebCamp Oxford attendees at the pub
It wouldn’t be a geeky hacky camp thingy if it didn’t finish at a bar.

Hopefully the images above are working for you and are “interactive”. Try click-and-dragging on them (or tilt your device), try fullscreen mode, and/or try WebVR mode if you’ve got hardware that supports it. The mechanism of operation is slightly hacky but pretty simple: here’s how it works:

  1. The image is inserted into the page as normal but with an extra CSS class of “vr360” and a data attribute pointing to the full-resolution image, e.g.:
    <img class="vr360" src="/uploads/2018/09/R0010005_20180922182210-1024x512.jpg" alt="IndieWebCamp Oxford attendees at the pub" width="640" height="320" data-vr360="/uploads/2018/09/R0010005_20180922182210.jpg" />
  2. Some Javascript swaps-out images with this class for an iframe of the same size, showing a special page and passing the image filename after the hash, e.g.:
    for(const vr360 of document.querySelectorAll('.vr360')){
      let width = parseInt(vr360.width);
      let height = parseInt(vr360.height);
      if(width == 0) width = '100%'; // Fallback for where width/height not specified,
      if(height == 0) height = '100%'; // needed because of some quirks with Dan's lazy-loader
      vr360.outerHTML = `<iframe src="/wp-content/themes/q18/vr360/#${vr360.dataset.vr360}" width="${width}" height="${height}" class="aligncenter vr360-frame" style="min-width: 340px; min-height: 340px;"></iframe>`;
    }
  3. The iframe page loads this Javascript file. This loads three.js (to make 3D things easy) and WebVR-polyfill (to fix browser quirks). Finally (scroll to the bottom of the code), it creates a camera in the centre of a sphere, loads the image specified in the hash, flips it, paints it onto the inside surface of the sphere, sets up controls, and turns the user loose on it. That’s all there is to it! (There’s a simplified sketch of this final step below.)
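If you’re curious about that final step, its heart looks something like this – a simplified sketch rather than my production code (the real version also wires up the WebVR-polyfill and the controls):

    // Build a sphere around the camera and paint the panorama onto its inside.
    const scene = new THREE.Scene();
    const camera = new THREE.PerspectiveCamera(75, innerWidth / innerHeight, 0.1, 1000);
    const renderer = new THREE.WebGLRenderer();
    renderer.setSize(innerWidth, innerHeight);
    document.body.appendChild(renderer.domElement);

    const geometry = new THREE.SphereGeometry(500, 60, 40);
    geometry.scale(-1, 1, 1); // flip the sphere inside-out so its texture faces the camera

    // The image to display is passed in the URL after the hash.
    const texture = new THREE.TextureLoader().load(location.hash.substring(1));
    scene.add(new THREE.Mesh(geometry, new THREE.MeshBasicMaterial({ map: texture })));

    (function animate() {
      requestAnimationFrame(animate);
      renderer.render(scene, camera);
    })();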

You’re welcome to any of my code if you’d like a drop-in approach to hosting panoramic photographs on your own personal site. My solution’s pretty extensible if you want e.g. interactive hotspots or contextual overlays – in fact, that – plus an easy route to editing the content for less-technical users – is pretty-much exactly what I’m working on for my day job at the moment.

The Bullshit Web

This article is a repost promoting content originally published elsewhere.

The Bullshit Web (pxlnv.com)

My home computer in 1998 had a 56K modem connected to our telephone line; we were allowed a maximum of thirty minutes of computer usage a day, because my parents — quite reasonably — did not want to have their telephone shut off for an evening at a time. I remember webpages loading slowly: ten to twenty seconds for a basic news article.

At the time, a few of my friends were getting cable internet. It was remarkable seeing the same pages load in just a few seconds, and I remember thinking about the kinds of possibilities that would open up as the web kept getting faster.

And faster it got, of course. When I moved into my own apartment several years ago, I got to pick my plan and chose a massive fifty megabit per second broadband connection, which I have since upgraded.

So, with an internet connection faster than I could have thought possible in the late 1990s, what’s the score now? A story at the Hill took over nine seconds to load; at Politico, seventeen seconds; at CNN, over thirty seconds. This is the bullshit web.

But first, a short parenthetical: I’ve been writing posts in both long- and short-form about this stuff for a while, but I wanted to bring many threads together into a single document that may pretentiously be described as a theory of or, more practically, a guide to the bullshit web.

A second parenthetical: when I use the word “bullshit” in this article, it isn’t in a profane sense. It is much closer to Harry Frankfurt’s definition in “On Bullshit”:

It is just this lack of connection to a concern with truth — this indifference to how things really are — that I regard as of the essence of bullshit.

I also intend it to be used in much the same sense as the way it is used in David Graeber’s “On the Phenomenon of Bullshit Jobs”:

In the year 1930, John Maynard Keynes predicted that, by century’s end, technology would have advanced sufficiently that countries like Great Britain or the United States would have achieved a 15-hour work week. There’s every reason to believe he was right. In technological terms, we are quite capable of this. And yet it didn’t happen. Instead, technology has been marshaled, if anything, to figure out ways to make us all work more. In order to achieve this, jobs have had to be created that are, effectively, pointless. Huge swathes of people, in Europe and North America in particular, spend their entire working lives performing tasks they secretly believe do not really need to be performed. The moral and spiritual damage that comes from this situation is profound. It is a scar across our collective soul. Yet virtually no one talks about it.

[…]

These are what I propose to call ‘bullshit jobs’.

What is the equivalent on the web, then?

This, this, a thousand times this. As somebody who’s watched the Web grow both in complexity and delivery speed over the last quarter century, it appals me that somewhere along the way complexity has started to win. I don’t want to have to download two dozen stylesheets and scripts before your page begins to render – doubly-so if those additional files serve no purpose, or at least no purpose discernible to the reader. Personally, the combination of uMatrix and Ghostery is all the adblocker I need (and I’m more-than-willing to add a little userscript to “fix” your site if it tries to sabotage my use of these technologies), but when for whatever reason I turn these plugins off I feel like the Web has taken a step backwards while I wasn’t looking.

Leak in Comic Chameleon (app API hacking)

I recently discovered a minor security vulnerability in mobile webcomic reading app Comic Chameleon, and I thought that it was interesting (and tame) enough to share as a learning example of (a) how to find security vulnerabilities in an app like this, and (b) more importantly, how to write an app like this without this kind of security vulnerability.

The nature of the vulnerability is that, for webcomics pushed directly into the platform by their authors, it’s possible to read comics (long) before they’re published. By way of proof, here’s a copy of the top-right 200 × 120 pixels of episode 54 of the (excellent) Forward Comic, which Imgur will confirm was uploaded on 2 July 2018: over three months ahead of its planned publication date.

Forward Comic 0054, due for publication in October
I’m not going to spoil this comic for you, but if you follow it then when October comes I think you’ll be pleased.

How to hack a web-backed app

Just to be clear, I didn’t set out to hack this app, but once I stumbled upon the vulnerability I wanted to make sure that I was able to collect enough information that I’d be able to explain to its author what was wrong and how to fix it. You’d be amazed how many systems I find security holes in almost-completely by accident. In fact, I’d just noticed that the application supported some webcomics that I follow but for which I hadn’t been able to find RSS feeds (and so I was selfdogfooding my own tool, RSSey, to “produce” RSS feeds for my reader by screen-scraping: not the most-elegant solution). But if this app could produce a list of issues of the comic, it must have some way of doing what I was trying to do, and I wanted to know what it was.

Comic Chameleon running on Android
Comic Chameleon brings a lot of comics into a single slick Android/iOS app. Some of them you’ll even have heard of!

The app, I figured, must “phone home” to some website – probably the app’s official website itself – to get the list of comics that it supports and details of where to get their feeds from, so I grabbed a copy of the app and started investigating. Because I figured I was probably looking for a URL, the first thing I did was to download the raw APK file (your favourite search engine can tell you how to do this), decompress it (APK files are just ZIP files, really), and run strings on it to search for likely-looking URLs:

Running strings on the Comic Chameleon APK contents
As predicted, there are several hard-coded addresses. And all over unencrypted HTTP, eww!
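If you’d like to replicate this step, it amounts to little more than the following at the command line (filenames illustrative):

    # An APK is just a ZIP archive in disguise:
    unzip comic-chameleon.apk -d apk-contents/
    # Hunt through the compiled code for hard-coded web addresses:
    strings apk-contents/classes.dex | grep -i 'http'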

I tried visiting a few of the addresses but many of them seemed to be API endpoints that were expecting additional parameters. Probably, I figured, the strings I’d extracted were prefixes to which those parameters were attached. Rather than fuzz for the right parameters, I decided to watch what the app did: I spun up a simulated Android device using the official emulator (I could have used my own on a wireless network that I control, of course, but this was lazier) and ran my favourite packet sniffer to see what the application was requesting.

Wireshark output showing Comic Chameleon traffic.
The web addresses are even clearer, here, and include all of the parameters I need.

Now I had full web addresses with parameters. Comparing the parameters that appeared when I clicked different comics revealed that each comic in the “full list” was assigned a numeric ID which was used when requesting issues of that comic (along with an intermediate stage where the year of publication is requested).

Comic Chameleon comic list XML
Each comic is assigned an ID number, probably sequentially.

Interestingly, a number of comics were listed with the attribute s="no-show" and did not appear in the app. It looked like comics that weren’t yet being made available via the app were already being indexed and collected by its web component, and for some reason were being exposed via the XML API. Presumably the developer had never considered that anybody but their app would look at the XML itself – but the thing about the Web is that if you put it on the Web, anybody can see it.
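The shape of the feed was something like this (an illustrative mock-up rather than the actual output – only the s="no-show" attribute is taken from the real feed):

    <!-- Illustrative reconstruction of the comic list, not actual API output -->
    <comics>
      <comic id="1" s="show" title="A Published Comic" />
      <comic id="2" s="no-show" title="A Comic Not Yet Available In The App" />
    </comics>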

Still: at this point I assumed that I was about to find what I was looking for – some kind of machine-readable source (an RSS feed or something like one) for a webcomic or two. But when I looked at the XML API for one of those webcomics I discovered quite a bit more than I’d bargained on finding:

no-shows in the episode list produced by the web component of Comic Chameleon
Hey, what’s this? This feed includes titles for webcomics that haven’t been published yet, marked as ‘no-show’…

The first webcomic I looked at included the “official” web addresses and titles of each published comic… but also several not yet published ones. The unpublished ones were marked with s="no-show" to indicate to the app that they weren’t to be shown, but I could now see them. The “official” web addresses didn’t work for me, as I’d expected, but when I tried Comic Chameleon’s versions of the addresses, I found that I could see entire episodes of comics, up to three and a half months ahead of their expected publication date.

Whoops.

Naturally, I compiled all of my findings into an email and contacted the app developer with all of the details they’d need to fix it – in hacker terms, I’m one of the “good guys”! – but I wanted to share this particular example with you because (a) it’s not a very dangerous leak of data (a few webcomics a few weeks early and/or a way to evade a few ads isn’t going to kill anybody) and (b) it’s very illustrative of the kinds of mistakes that app developers commonly make these days, and it’s important to understand why so that you’re not among them. On to that in a moment.

Responsible disclosure

Because (I’d like to think) I’m one of the “good guys” in the security world, the first thing I did after the research above was to contact the author of the software. They didn’t seem to have a security.txt file, a disclosure policy, or a profile on any of the major disclosure management sites, so I sent an email. Were the security issue more-severe, I’d have sent a preliminary email suggesting (and agreeing on a mechanism for) encrypted email, but given the low impact of this particular issue, I just explained the entire issue in the initial email: basically what you’ve read above, plus some tips on fixing the issue and an offer to help out.

"Hacking", apparently
This is what stock photo sites think “hacking” is. Well… this, pages full of green code, or hoodies.

I subscribe to the doctrine of responsible disclosure, which – in the event of more-significant vulnerabilities – means that after first contacting the developer of an insecure system and giving them time to fix it, it’s acceptable (in fact: in significant cases, it’s socially-responsible) to publish the details of the vulnerability. In this case, though, I think the whole experience makes an interesting learning example about ways in which you might begin to “black box” test an app for data leaks like this and – below – how to think about software development in a way that limits the risk of such vulnerabilities appearing in the first place.

The author of this software hasn’t given any answer to any of the emails I’ve sent over the last couple of weeks, so I’m assuming that they just plan to leave this particular leak in place. I did, though, contact the author of Forward Comic – which turns out (coincidentally) to be probably the most-severely affected publication on the platform – so that he had the option of taking action before I published this blog post.

Lessons to learn

When developing an “app” (whether for the web or a desktop or mobile platform) that connects to an Internet service to collect data, here are the important things you really, really ought to do:

  1. Don’t publish any data that you don’t want the user to see.
  2. If the data isn’t for everybody, remember to authenticate the user.
  3. And for heaven’s sake use SSL, it’s not the 1990s any more.
Website message asking visitor to confirm that they're old enough.
It’s a good job that nobody on the Web would ever try to view something easily-available but which they shouldn’t, right? That’s why screens like this have always worked so well.

That first lesson’s the big one of course: if you don’t want something to be on the public Internet, don’t put it on the public Internet! The feeds I found simply shouldn’t have contained the “secret” information that they did, and the unpublished comics shouldn’t have been online at real web addresses. But aside from (or in addition to) not including these unpublished items in the data feeds, what else might our app developer have considered?

  • Encryption. There’s no excuse for not using HTTPS these days. This alone wouldn’t have prevented a deliberate effort to read the secret data, but it would help prevent it from happening accidentally (which is a risk right now), e.g. on a proxy server or while debugging something else on the same network link. It also protects the user from exposing their browsing habits (do you want everybody at that coffee shop to know what weird comics you read?) and from having content ‘injected’ (do you want the person at the next table of the coffee shop to be able to choose what you see when you ask for a comic?).
  • Authentication (app). The app could work harder to prove that it’s genuinely the app when it contacts the website. No mechanism for doing this can ever be perfect, because the user has access to the app and can theoretically reverse-engineer it to fish the entire authentication strategy out of it, but some approaches are better than others. Sending a password (e.g. over Basic Authentication) is barely better than just using a complex web address, but using a client-side certificate or an OTP algorithm would (in conjunction with encryption) foil many attackers. For a flavour of how some of these mitigations might fit together, see the sketch after this list.
  • Authentication (user). It’s a very-different model to the one currently used by the app, but requiring users to “sign up” to the service would reduce the risks and provide better mechanisms for tracking/blocking misusers, though the relative anonymity of the Internet doesn’t give this much strength and introduces various additional burdens both technical and legal upon the developer.
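To make some of those ideas concrete, here’s a minimal sketch of how a server might apply them – illustrative Node/Express code rather than a fix for the actual app, with loadEpisodes standing in for whatever data-access layer exists:

    // Filter unpublished items on the server: the client should never receive them.
    const express = require('express');
    const app = express();

    function requireApiKey(req, res, next) {
      // A shared secret is weak (the app can be reverse-engineered to extract it),
      // but combined with encryption it raises the bar considerably.
      if (req.get('X-Api-Key') !== process.env.API_KEY) return res.sendStatus(401);
      next();
    }

    app.get('/api/comics/:id/episodes', requireApiKey, (req, res) => {
      const episodes = loadEpisodes(req.params.id) // hypothetical data-access helper
        .filter(episode => episode.publishedAt <= Date.now()); // no unpublished episodes leave the server
      res.json(episodes);
    });

    // …and serve it over HTTPS only, e.g. behind a TLS-terminating proxy.
    app.listen(3000);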

Fundamentally, of course, there’s nothing that an app developer can do to perfectly protect the data that is published to that app, because the app runs on a device that the user controls! That’s why the first lesson is the most important: if it shouldn’t be on the public Internet (yet), don’t put it on the public Internet.

Hopefully there’s a lesson for you somewhere too: about how to think about app security so that you don’t make a similar mistake, or about some of the ways in which you might test the security of an application (for example, as part of an internal audit), or, if nothing else, that you should go and read Forward, because it’s pretty cool.

Further reading

7 August 2018: I’ve now written a quick explanation about how to intercept HTTPS traffic from Android apps, for those that asked.