The Bullshit Web

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

The Bullshit Web (pxlnv.com)

My home computer in 1998 had a 56K modem connected to our telephone line; we were allowed a maximum of thirty minutes of computer usage a day, because my parents — quite reasonably — did not want to have their telephone shut off for an evening at a time. I remember webpages loading slowly: ten to twenty seconds for a basic news article.

At the time, a few of my friends were getting cable internet. It was remarkable seeing the same pages load in just a few seconds, and I remember thinking about the kinds of possibilities that would open up as the web kept getting faster.

And faster it got, of course. When I moved into my own apartment several years ago, I got to pick my plan and chose a massive fifty megabit per second broadband connection, which I have since upgraded.

So, with an internet connection faster than I could have thought possible in the late 1990s, what’s the score now? A story at the Hill took over nine seconds to load; at Politico, seventeen seconds; at CNN, over thirty seconds. This is the bullshit web.

But first, a short parenthetical: I’ve been writing posts in both long- and short-form about this stuff for a while, but I wanted to bring many threads together into a single document that may pretentiously be described as a theory of or, more practically, a guide to the bullshit web.

A second parenthetical: when I use the word “bullshit” in this article, it isn’t in a profane sense. It is much closer to Harry Frankfurt’s definition in “On Bullshit”:

It is just this lack of connection to a concern with truth — this indifference to how things really are — that I regard as of the essence of bullshit.

I also intend it to be used in much the same sense as the way it is used in David Graeber’s “On the Phenomenon of Bullshit Jobs”:

In the year 1930, John Maynard Keynes predicted that, by century’s end, technology would have advanced sufficiently that countries like Great Britain or the United States would have achieved a 15-hour work week. There’s every reason to believe he was right. In technological terms, we are quite capable of this. And yet it didn’t happen. Instead, technology has been marshaled, if anything, to figure out ways to make us all work more. In order to achieve this, jobs have had to be created that are, effectively, pointless. Huge swathes of people, in Europe and North America in particular, spend their entire working lives performing tasks they secretly believe do not really need to be performed. The moral and spiritual damage that comes from this situation is profound. It is a scar across our collective soul. Yet virtually no one talks about it.

[…]

These are what I propose to call ‘bullshit jobs’.

What is the equivalent on the web, then?

This, this, a thousand times this. As somebody who’s watched the Web grow both in complexity and delivery speed over the last quarter century, it appals me that somewhere along the way complexity has started to win. I don’t want to have to download two dozen stylesheets and scripts before your page begins to render – doubly-so if those additional files serve no purpose, or at least no purpose discernible to the reader. Personally, the combination of uMatrix and Ghostery is all the adblocker I need (and I’m more-than-willing to add a little userscript to “fix” your site if it tries to sabotage my use of these technologies), but when for whatever reason I turn these plugins off I feel like the Web has taken a step backwards while I wasn’t looking.

Leak in Comic Chameleon (app API hacking)

I recently discovered a minor security vulnerability in mobile webcomic reading app Comic Chameleon, and I thought that it was interesting (and tame) enough to share as a learning example of (a) how to find security vulnerabilities in an app like this, and (b) more importantly, how to write an app like this without this kind of security vulnerability.

The nature of the vulnerability is that, for webcomics pushed directly into the platform by their authors, it’s possible to read comics (long) before they’re published. By way of proof, here’s a copy of the top-right 200 × 120 pixels of episode 54 of the (excellent) Forward Comic, which Imgur will confirm was uploaded on 2 July 2018: over three months ahead of its planned publication date.

Forward Comic 0054, due for publication in October
I’m not going to spoil this comic for you, but if you follow it then when October comes I think you’ll be pleased.

How to hack a web-backed app

Just to be clear, I didn’t set out to hack this app, but once I stumbled upon the vulnerability I wanted to make sure that I was able to collect enough information that I’d be able to explain to its author what was wrong and how to fix it. You’d be amazed how many systems I find security holes in almost-completely by accident. In fact, I’d just noticed that the application supported some webcomics that I follow but for which I hadn’t been able to find RSS feeds (and so I was selfdogfooding my own tool, RSSey, to “produce” RSS feeds for my reader by screen-scraping: not the most-elegant solution). But if this app could produce a list of issues of the comic, it must have some way of doing what I was trying to do, and I wanted to know what it was.

Comic Chameleon running on Android
Comic Chameleon brings a lot of comics into a single slick Android/iOS app. Some of them you’ll even have heard of!

The app, I figured, must “phone home” to some website – probably the app’s official website itself – to get the list of comics that it supports and details of where to get their feeds from, so I grabbed a copy of the app and started investigating. Because I figured I was probably looking for a URL, the first thing I did was download the raw APK file (your favourite search engine can tell you how to do this), decompress it (APK files are just ZIP files, really) and run strings on it to search for likely-looking URLs:

Running strings on the Comic Chameleon APK contents
As predicted, there are several hard-coded addresses. And all over unencrypted HTTP, eww!
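
If you want to try something similar yourself, here’s a minimal sketch in Python that does roughly the same job as strings: treat the APK as the ZIP file it is and pull out anything that looks like a URL. The filename and the regular expression are illustrative, not taken from the real app:

  import re
  import zipfile

  # An APK is just a ZIP archive, so its contents can be read directly.
  URL_PATTERN = re.compile(rb'https?://[\x21-\x7e]+')  # printable, non-whitespace characters

  def find_urls(apk_path):
      """Yield (member filename, URL) for every URL-looking string in the APK."""
      with zipfile.ZipFile(apk_path) as apk:
          for name in apk.namelist():
              for match in URL_PATTERN.finditer(apk.read(name)):
                  yield name, match.group().decode('ascii', errors='replace')

  if __name__ == '__main__':
      # 'comic-app.apk' is a placeholder filename, not the app's real package.
      for member, url in find_urls('comic-app.apk'):
          print(member, url)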

I tried visiting a few of the addresses but many of them seemed to be API endpoints that were expecting additional parameters. Probably, I figured, the strings I’d extracted were prefixes to which those parameters were attached. Rather than fuzz for the right parameters, I decided to watch what the app did: I spun up a simulated Android device using the official emulator (I could have used my own on a wireless network that I control, of course, but this was lazier) and ran my favourite packet sniffer to see what the application was requesting.

Wireshark output showing Comic Chameleon traffic.
The web addresses are even clearer, here, and include all of the parameters I need.

Now I had full web addresses with parameters. Comparing the parameters that appeared when I clicked different comics revealed that each comic in the “full list” was assigned a numeric ID which was used when requesting issues of that comic (along with an intermediate stage where the year of publication is requested).

Comic Chameleon comic list XML
Each comic is assigned an ID number, probably sequentially.

Interestingly, a number of comics were listed with the attribute s="no-show" and did not appear in the app: it looked like comics that weren’t yet being made available via the app were already being indexed and collected by its web component, and for some reason were being exposed via the XML API. Presumably the developer had never considered that anybody but their app would look at the XML itself, but the thing about the Web is that if you put it on the Web, anybody can see it.
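
To illustrate just how exposed that is, here’s a rough sketch of the sort of script anybody with the endpoint address could use to list the “hidden” comics for themselves. The URL, element name and attribute names are guesses based on the screenshots above rather than the app’s real schema:

  import urllib.request
  import xml.etree.ElementTree as ET

  # Hypothetical endpoint; the real (unencrypted HTTP) address isn't reproduced here.
  COMIC_LIST_URL = 'http://example.com/api/comics.xml'

  def list_comics(url=COMIC_LIST_URL):
      """Yield (id, title, hidden?) for each comic in the XML list."""
      with urllib.request.urlopen(url) as response:
          root = ET.parse(response).getroot()
      for comic in root.iter('comic'):           # element name is assumed
          yield comic.get('id'), comic.get('title'), comic.get('s') == 'no-show'

  if __name__ == '__main__':
      for comic_id, title, hidden in list_comics():
          print(comic_id, 'HIDDEN' if hidden else 'visible', title)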

Still: at this point I assumed that I was about to find what I was looking for – some kind of machine-readable source (an RSS feed or something like one) for a webcomic or two. But when I looked at the XML API for one of those webcomics I discovered quite a bit more than I’d bargained on finding:

no-shows in the episode list produced by the web component of Comic Chameleon
Hey, what’s this? This feed includes titles for webcomics that haven’t been published yet, marked as ‘no-show’…

The first webcomic I looked at included the “official” web addresses and titles of each published comic… but also several not yet published ones. The unpublished ones were marked with s="no-show" to indicate to the app that they weren’t to be shown, but I could now see them. The “official” web addresses didn’t work for me, as I’d expected, but when I tried Comic Chameleon’s versions of the addresses, I found that I could see entire episodes of comics, up to three and a half months ahead of their expected publication date.

Whoops.

Naturally, I compiled all of my findings into an email and contacted the app developer with all of the details they’d need to fix it – in hacker terms, I’m one of the “good guys”! – but I wanted to share this particular example with you because (a) it’s not a very dangerous leak of data (a few webcomics a few weeks early and/or a way to evade a few ads isn’t going to kill anybody) and (b) it’s very illustrative of the kinds of mistakes that app developers make a lot these days, and it’s important to understand why so that you’re not among them. On to that in a moment.

Responsible disclosure

Because (I’d like to think) I’m one of the “good guys” in the security world, the first thing I did after the research above was to contact the author of the software. They didn’t seem to have a security.txt file, a disclosure policy, nor a profile on any of the major disclosure management sites, so I sent an email. Were the security issue more-severe, I’d have sent a preliminary email suggesting (and agreeing on a mechanism for) encrypted email, but given the low impact of this particular issue, I just explained the entire issue in the initial email: basically what you’ve read above, plus some tips on fixing the issue and an offer to help out.

"Hacking", apparently
This is what stock photo sites think “hacking” is. Well… this, pages full of green code, or hoodies.

I subscribe to the doctrine of responsible disclosure, which – in the event of more-significant vulnerabilities – means that after first contacting the developer of an insecure system and giving them time to fix it, it’s acceptable (in fact: in significant cases, it’s socially-responsible) to publish the details of the vulnerability. In this case, though, I think the whole experience makes an interesting learning example about ways in which you might begin to “black box” test an app for data leaks like this and – below – how to think about software development in a way that limits the risk of such vulnerabilities appearing in the first place.

The author of this software hasn’t responded to any of the emails I’ve sent over the last couple of weeks, so I’m assuming that they just plan to leave this particular leak in place. I did reach out to the author of Forward Comic, though, which turns out (coincidentally) to be probably the most-severely affected publication on the platform, so that he had the option of taking action before I published this blog post.

Lessons to learn

When developing an “app” (whether for the web or a desktop or mobile platform) that connects to an Internet service to collect data, here are the important things you really, really ought to do:

  1. Don’t publish any data that you don’t want the user to see.
  2. If the data isn’t for everybody, remember to authenticate the user.
  3. And for heaven’s sake use SSL, it’s not the 1990s any more.
Website message asking visitor to confirm that they're old enough.
It’s a good job that nobody on the Web would ever try to view something easily-available but which they shouldn’t, right? That’s why screens like this have always worked so well.

That first lesson’s the big one of course: if you don’t want something to be on the public Internet, don’t put it on the public Internet! The feeds I found simply shouldn’t have contained the “secret” information that they did, and the unpublished comics shouldn’t have been online at real web addresses. But aside from (or in addition to) not including these unpublished items in the data feeds, what else might our app developer have considered?

  • Encryption. There’s no excuse for not using HTTPS these days. This alone wouldn’t have prevented a deliberate effort to read the secret data, but it would help prevent it from happening accidentally (which is a risk right now), e.g. on a proxy server or while debugging something else on the same network link. It also protects the user from exposing their browsing habits (do you want everybody at that coffee shop to know what weird comics you read?) and from having content ‘injected’ (do you want the person at the next table of the coffee shop to be able to choose what you see when you ask for a comic?)
  • Authentication (app). The app could work harder to prove that it’s genuinely the app when it contacts the website. No mechanism for doing this can ever be perfect, because the user has access to the app and can theoretically reverse-engineer it to fish the entire authentication strategy out of it, but some approaches are better than others. Sending a password (e.g. over Basic Authentication) is barely better than just using a complex web address, but using a client-side certificate or an OTP algorithm would (in conjunction with encryption) foil many attackers; there’s a minimal sketch of the latter approach below this list.
  • Authentication (user). It’s a very-different model to the one currently used by the app, but requiring users to “sign up” to the service would reduce the risks and provide better mechanisms for tracking/blocking misusers, though the relative anonymity of the Internet doesn’t give this much strength and introduces various additional burdens both technical and legal upon the developer.
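
To make that middle point more concrete, here’s a minimal sketch of the sort of time-based token an app and its API could share. It isn’t anything Comic Chameleon actually does, and because the secret ships inside the app it only raises the bar rather than providing real security, but combined with HTTPS it defeats casual URL-guessing:

  import hashlib
  import hmac
  import struct
  import time

  # Shared secret baked into the app and known to the server. Anyone who
  # decompiles the app can recover it, which is why this only raises the bar
  # rather than providing real security.
  SHARED_SECRET = b'example-secret-not-from-the-real-app'

  def one_time_token(secret=SHARED_SECRET, interval=30):
      """Return an HMAC-based token that changes every `interval` seconds."""
      counter = int(time.time() // interval)
      digest = hmac.new(secret, struct.pack('>Q', counter), hashlib.sha256)
      return digest.hexdigest()[:16]  # truncated to keep the request parameter short

  def token_is_valid(token, secret=SHARED_SECRET, interval=30, drift=1):
      """Server-side check, allowing a little clock drift in either direction."""
      now = int(time.time() // interval)
      for counter in range(now - drift, now + drift + 1):
          expected = hmac.new(secret, struct.pack('>Q', counter), hashlib.sha256).hexdigest()[:16]
          if hmac.compare_digest(expected, token):
              return True
      return False

  if __name__ == '__main__':
      token = one_time_token()
      print('token:', token, 'valid?', token_is_valid(token))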

Fundamentally, of course, there’s nothing that an app developer can do to perfectly protect the data that is published to that app, because the app runs on a device that the user controls! That’s why the first lesson is the most important: if it shouldn’t be on the public Internet (yet), don’t put it on the public Internet.

Hopefully there’s a lesson for you somewhere too: about how to think about app security so that you don’t make a similar mistake, or about some of the ways in which you might test the security of an application (for example, as part of an internal audit), or, if nothing else, that you should go and read Forward, because it’s pretty cool.

Further reading

7 August 2018: I’ve now written a quick explanation about how to intercept HTTPS traffic from Android apps, for those that asked.


JS Oxford Indieweb presentation – Polytechnic

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

JS Oxford Indieweb presentation by Garrett Coakley (polytechnic.co.uk)

Last night I attended the always excellent JS Oxford, and as well as having my mind expanded by both Jo and Ruth’s talks (Lemmings make an excellent analogy for multi-threading, who knew!), I gave a brief talk on the Indieweb movement.

If you’ve not heard of the Indieweb movement before, it’s a push to encourage people to claim their own bit of the web, for their identity and content, free from corporate platforms. It’s not about abandoning those platforms, but ensuring that you have control of your content if something goes wrong.

From the Indieweb site:

Your content is yours

When you post something on the web, it should belong to you, not a corporation. Too many companies have gone out of business and lost all of their users’ data. By joining the IndieWeb, your content stays yours and in your control.

You are better connected

Your articles and status messages can go to all services, not just one, allowing you to engage with everyone. Even replies and likes on other services can come back to your site so they’re all in one place.

I’ve been interested in the Indieweb for a while, after attending IndieWebCamp Brighton in 2016, and I’ve been slowly implementing Indieweb features on here ever since.

So far I’ve added rel="me" attributes to allow distributed verification, and to enable Indieauth support, h-card to establish identity, and h-entry for information discovery. Behind the scenes I’m looking at webmentions (Thanks to Perch’s first class support), and there’s the ever-eternal photo management thing I keep picking up and then running away from.

The great thing about the Indieweb is that you can implement as much or as little as you want, and it always gives you something to work on. It doesn’t matter where you start. The act of getting your own domain is the first step on a longer journey.

To that end I’m interested in organising an IndieWebCamp Oxford this year. If this sounds like something that interests you, then come find me in the Digital Oxford Slack, or on Twitter.

I’m so excited to see that there are others in Oxford who care about IndieWeb things! I’ve honestly fantasised about running an IndieWebCamp or Homebrew Website Club here myself, but let’s face it: that fantasy is more one of a world in which I had the free time for such a venture. So imagine my delight when somebody else offers to do the hard work!

AMPstinction

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

AMPstinction (adactio.com)

I’ve come to believe that the goal of any good framework should be to make itself unnecessary.
Brian said it explicitly of his PhoneGap project:
The ultimate purpose of PhoneGap is to cease to exist.
That makes total sense, especially if your code is a polyfill—those solutions are temporary by d…

When Google first unveiled AMP, its intentions weren’t clear to me. I hoped that it existed purely to make itself redundant:

As well as publishers creating AMP versions of their pages in order to appease Google, perhaps they will start to ask “Why can’t our regular pages be this fast?” By showing that there is life beyond big bloated invasive web pages, perhaps the AMP project will work as a demo of what the whole web could be.

Alas, as time has passed, that hope shows no signs of being fulfilled. If anything, I’ve noticed publishers using the existence of their AMP pages as a justification for just letting their “regular” pages put on weight.

Worse yet, the messaging from Google around AMP has shifted. Instead of pitching it as a format for creating parallel versions of your web pages, they’re now also extolling the virtues of having your AMP pages be the only version you publish:

In fact, AMP’s evolution has made it a viable solution to build entire websites.

On an episode of the Dev Mode podcast a while back, AMP was a hotly-debated topic. But even those defending AMP were doing so on the understanding that it was more a proof-of-concept than a long-term solution (and also that AMP is just for news stories—something else that Google are keen to change).

But now it’s clear that the Google AMP Project is being marketed more like a framework for the future: a collection of web components that prioritise performance.

You all know my feelings on AMP already, I’m sure. As Jeremy points out, our optimistic ideas that these problems might go away as AMP “made itself redundant” are turning out not to be true, and Google continues to abuse its monopoly on search to push its walled-garden further into the mainstream. Read his full article…

On This Day In 2005

I normally reserve my “on this day” posts to look back at my own archived content, but once in a while I get a moment of nostalgia for something of somebody else’s that “fell off the web”. And so I bring you something you probably haven’t seen in over a decade: Paul and Jon‘s Birmingham Egg.

Paul's lunch on this day 13 years ago: Birmingham Egg with Naan Bread
Is this honestly so different from the kind of crap that most of our circle of friends ate in 2005?

It was a simpler time: a time when YouTube was a new “fringe” site (which is probably why I don’t have a surviving copy of the original video) and not yet owned by Google, before Facebook was universally-available, and when original Web content remained decentralised (maybe we’re moving back in that direction, but I wouldn’t count on it…). And only a few days after issue 175 of the b3ta newsletter wrote:

* BIRMINGHAM EGG - Take 5 scotch eggs, cut in
  half and cover in masala sauce. Place in
  Balti dish and serve with naan and/or chips. 
  We'll send a b3ta t-shirt to anyone who cooks
  this up, eats it and makes a lovely little
  photo log / write up of their adventure.
Birmingham Egg preparation
Sure, this looks like the kind of thing that seems like a good idea when you’re a student.

Clearly-inspired, Paul said “Guess what we’re doing on Sunday?” and sure enough, he delivered. On this day 13 years ago and with the help of Jon, Liz, Siân, and Andy R, Paul whipped up the dish and presented his findings to the Internet: the original page is long-gone, but I’ve resurrected it for posterity. I don’t know if he ever got his promised free t-shirt, but he earned it: the page went briefly viral and brought joy to the world before being forgotten the following week when we all started arguing about whether 9 Songs was a good film or not.

It was a simpler time, when, having fewer responsibilities, we were able to do things like this “for the lulz”. But more than that, it was still at the tail-end of the era in which individuals putting absurd shit online was still a legitimate art form on the Web. Somewhere along the way, the Web got serious and siloed. It’s not all a bad thing, but it does mean that we’re publishing less weirdness than we were back then.


Web! What is it good for?

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

https://adactio.com/journal/9016 (adactio.com)

You can listen to an audio version of Web! What is it good for?

I have a blind spot. It’s the web.

I just can’t get excited about the prospect of building something for any particular operating system, be it desktop or mobile. I think about the potential lifespan of what would be built and end up asking myself “why bother?” If something isn’t on the web—and of the web—I find it hard to get excited about it. I’m somewhat jealous of people who can get equally excited about the web, native, hardware, print …in my mind, if it hasn’t got a URL, it’s missing some vital spark.

I know that this is a problem, but I can’t help it. At the very least, I have enough presence of mind to recognise it as being my problem.

My problem, too. There are worse problems to have.

My Blog Now Has a Content Security Policy – Here’s How I’ve Done It

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

My Blog Now Has a Content Security Policy – Here's How I've Done It (Troy Hunt)

I’ve long been a proponent of Content Security Policies (CSPs). I’ve used them to fix mixed content warnings on this blog after Disqus made a little mistake, you’ll see one adorning Have I Been Pwned (HIBP) and I even wrote a dedicated Pluralsight course on browser security headers. I’m a fan (which is why I also recently joined Report URI), and if you’re running a website, you should be too.

But it’s not all roses with CSPs and that’s partly due to what browsers will and will not let you do and partly due to what the platforms running our websites will and will not let you do. For example, this blog runs on Ghost Pro which is a managed SaaS platform. I can upload whatever theme I like, but I can’t control many aspects of how the platform actually executes, including how it handles response headers which is how a CSP is normally served by a site. Now I’m enormously supportive of running on managed platforms, but this is one of the limitations of doing so. I also can’t add custom headers via Cloudflare at “the edge”; I’m serving the HSTS header from there because there’s first class support for that in the GUI, but not for CSP either specifically in the GUI or via custom response headers. This will be achievable in the future via Cloudflare workers but for now, they have to come from the origin site.

However, you can add a CSP via meta tag and indeed that’s what I originally did with the upgrade-insecure-requests implementation I mentioned earlier when I fixed the Disqus issue. However – and this is where we start getting into browser limitations – you can’t use the report-uri directive in a meta tag. Now that doesn’t matter if all the CSP is doing is upgrading requests, but it matters a lot if you’re actually blocking content. That’s where the real value proposition of a CSP lies too; in its ability to block things that may have been maliciously inserted into a site. I’ve had enough experience with breaking the CSP on HIBP to know that reporting is absolutely invaluable and indeed when I’ve not paid attention to reports in the past, it’s literally cost me money.
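
For anyone who does control their own server, this is roughly what the header-based approach looks like; the policy below is a simplified placeholder rather than Troy’s actual policy, but it shows the report-uri directive that a meta tag can’t carry:

  from http.server import HTTPServer, SimpleHTTPRequestHandler

  # Simplified placeholder policy; a real one would enumerate script, style and image sources.
  CSP = ("default-src 'self'; "
         "upgrade-insecure-requests; "
         "report-uri https://example.report-uri.com/r/d/csp/enforce")

  class CSPHandler(SimpleHTTPRequestHandler):
      def end_headers(self):
          # A response header can carry directives (like report-uri) that a
          # <meta http-equiv="Content-Security-Policy"> tag cannot.
          self.send_header('Content-Security-Policy', CSP)
          super().end_headers()

  if __name__ == '__main__':
      HTTPServer(('localhost', 8000), CSPHandler).serve_forever()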

Improving URLs for AMP pages

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Improving URLs for AMP pages (Accelerated Mobile Pages Project)

TL;DR: We are making changes to how AMP works in platforms such as Google Search that will enable linked pages to appear under publishers’ URLs instead of the google.com/amp URL space while maintaining the performance and privacy benefits of AMP Cache serving.

When we first launched AMP in Google Search we made a big trade-off: to achieve the user experience that users were telling us that they wanted, instant loading, we needed to start loading the page before the user clicked. As we detailed in a deep-dive blog post last year,  privacy reasons make it basically impossible to load the page from the publisher’s server. Publishers shouldn’t know what people are interested in until they actively go to their pages. Instead, AMP pages are loaded from the Google AMP Cache but with that behavior the URLs changed to include the google.com/amp/ URL prefix.

We are huge fans of meaningful URLs ourselves and recognize that this isn’t ideal. Many of y’all agree. It is certainly the #1 piece of feedback we hear about AMP. We sought to ensure that these URLs show up in as few places as possible. Over time our Google Search native apps on Android and iOS started defaulting to showing the publishers URLs and we worked with browser vendors to share the publisher’s URL of an article where possible. We couldn’t, however, fix the state of URLs where it matters most: on the web and the browser URL bar.

Regular readers may recall that I’ve complained about AMP. This latest announcement by the project lead of the AMP team at Google goes some way to solving the worst of the problems with the AMP project, but it still leaves a lot to be desired: for example, while Google still favours AMP pages in search results they’re building a walled garden and penalising people who don’t choose to be inside it, and it’s a walled garden with fewer features than the rest of the web and a lock-in effect once you’re there. We’ve seen this before with “app culture” and with Facebook, but Google have the power to do a huge amount more damage.

Enabling IPv6 Support in nginx

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

This is going to be a really short post, but it could save someone an hour of their life.

So, you’ve nothing to do and you’ve decided to play around with IPv6, or maybe you happen to be an administrator of a web service that needs to support IPv6 connectivity and you need to make your nginx server work nicely with this protocol.

First thing you need to do is to enable IPv6 in nginx by recompiling it with --with-ipv6 configure option and reinstalling it. If you use some pre-built package, check if your nginx already has this key enabled by running nginx -V.

The Web began dying in 2014, here’s how

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Before the year 2014, there were many people using Google, Facebook, and Amazon. Today, there are still many people using services from those three tech giants (respectively, GOOG, FB, AMZN). Not much has changed, and quite literally the user interface and features on those sites have remained mostly untouched. However, the underlying dynamics of power on the Web have drastically changed, and those three companies are at the center of a fundamental transformation of the Web.

It looks like nothing changed since 2014, but GOOG and FB now have direct influence over 70%+ of internet traffic.

Internet activity itself hasn’t slowed down. It maintains a steady growth, both in amount of users and amount of websites…

Fnorders

I’ve not posted much recently: I’ve had a lot of Complicated Life Stuff going on, sorry.

But I did make a thing: fnorders.com. You’re welcome.

Fnord.

Sub-Pixel Problems in CSS

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Something that jumped out at me, recently, was a rendering dilemma that browsers have to encounter, and gracefully handle, on a day-by-day basis with little to no standardization.

Take the following page for example. You have 4 floated divs, each with a width of 25%, contained within a parent div of width 50px. Here’s the question: How wide are each of the divs?

The problem lies in the fact that each div should be, approximately, 12.5px wide and since technology isn’t at a level where we can start rendering at the sub-pixel level we tend to have to round off the number. The problem then becomes: Which way do you round the number? Up, down, or a mixture of the two? I think the results will surprise you, as they did me…
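
As a back-of-the-envelope illustration (this isn’t any particular browser engine’s algorithm): each of the four divs “wants” 12.5px, and the obvious rounding strategies give noticeably different results:

  import math

  PARENT_WIDTH = 50      # px
  CHILD_FRACTION = 0.25  # each of four floated divs is 25% wide
  ideal = PARENT_WIDTH * CHILD_FRACTION  # 12.5px per child

  def round_down():
      return [math.floor(ideal)] * 4   # 12 * 4 = 48px: a 2px gap at the end

  def round_up():
      return [math.ceil(ideal)] * 4    # 13 * 4 = 52px: overflow, so the last div wraps

  def distribute_remainder():
      """Round down, then hand out the leftover whole pixels one at a time."""
      widths = [math.floor(ideal)] * 4
      leftover = PARENT_WIDTH - sum(widths)
      for i in range(leftover):
          widths[i] += 1
      return widths                    # [13, 13, 12, 12] = exactly 50px

  if __name__ == '__main__':
      for strategy in (round_down, round_up, distribute_remainder):
          widths = strategy()
          print(strategy.__name__, widths, '-> total', sum(widths), 'px')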