Prior to his retirement in 1995 I managed to amass a collection of almost all of Gary Larson’s The Far Side books, as well as a couple of calendars and other thingamabobs. After 24 years of silence I didn’t expect to hear anything more from him, and so I was as surprised as most of the Internet when he re-emerged last year with a brand new website: his first ever. Woah.
Larson’s hinted that there might be new and original content there someday, but for the time being I’m just loving that I can read The Far Side comics (legitimately) via the Web for the first time! The site’s currently publishing a “Daily Dose” of classic strips, which is awesome. But… I don’t want to have to go to a website to get comics every day. Nor do I want to have to remember which days’ comics I’ve already caught up on. That’s a job for computers, right? And it’s a solved problem: RSS (which has been around for almost as long as Larson hasn’t) and similar technologies allow a website to publicise that it’s got updates available in a way that people can “subscribe” to, so I should just use that, right?
Except… the new The Far Side website doesn’t have an RSS feed. Boo! Luckily, I’m not above automating the creation of feeds for websites that I wish had them, even (or perhaps especially) where that involves a little reverse-engineering of online comics. So with a little thanks to my RSS middleware RSSey… I can now read daily The Far Side comics in the way that’s most-convenient to me: right alongside my other subscriptions in my feed reader.
I’m afraid I’m not going to publicly* share a ready-to-go feed URL for this one, unlike my BBC News Without The Sport feed, because a necessary side-effect of the way it works is that the ads are removed. And if I were to republish a feed containing The Far Side website cartoons but with the ads stripped I’d be guilty of, like, all the ethical and legal faults that Larson was trying to mitigate by putting his new website up in the first place! I love The Far Side and I certainly don’t want to violate its copyright!
But – at least until Larson’s web developer puts up a proper feed (with or without ads) – for those of us who like our comics delivered fresh to us every morning, here’s the source code (as an RSSey feed definition) you could use to run your own personal-use-only “give me The Far Side Daily Dose as an RSS feed” middleware.
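For the curious, the general shape of such a middleware is simple enough to sketch. Below is a minimal Python approximation of the idea – this isn’t RSSey’s actual feed-definition format, and the URL and CSS selector are assumptions rather than the real ones – which scrapes the Daily Dose page and wraps it in a one-item RSS feed:

```python
import datetime
import xml.etree.ElementTree as ET

import requests
from bs4 import BeautifulSoup

# Assumptions: the page URL and CSS selector below are illustrative placeholders.
PAGE = "https://www.thefarside.com/"
soup = BeautifulSoup(requests.get(PAGE).text, "html.parser")
comic = soup.select_one("img.daily-dose")  # hypothetical selector

rss = ET.Element("rss", version="2.0")
channel = ET.SubElement(rss, "channel")
ET.SubElement(channel, "title").text = "The Far Side – Daily Dose (personal use only)"
ET.SubElement(channel, "link").text = PAGE
ET.SubElement(channel, "description").text = "Unofficial personal-use feed."

item = ET.SubElement(channel, "item")
today = datetime.date.today().isoformat()
ET.SubElement(item, "title").text = f"Daily Dose for {today}"
ET.SubElement(item, "link").text = PAGE
ET.SubElement(item, "guid").text = f"{PAGE}#{today}"  # one new GUID per day
ET.SubElement(item, "description").text = str(comic)  # the scraped <img> markup (None-check omitted for brevity)

ET.ElementTree(rss).write("farside.rss", encoding="utf-8", xml_declaration=True)
```

Run that daily (e.g. from cron), point your feed reader at the output, and you’re done.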
Thanks for deciding to join us on the Internet, Gary. I hear it’s going to be a big thing, someday!
* Friends are welcome to contact me off-blog for an address, so long as they promise to be nice and ethical about it.
For the last few months, I’ve been running an alpha test of an email-based subscription to DanQ.me with a handful of handpicked testers. Now, I’d like to open it up to a slightly larger beta test group. If you’d like to get the latest from this site directly in your inbox, just provide your email address below:
Subscribe by email!
Who’s this for?
Some people prefer to use their email inbox to subscribe to things. If that’s you: great!
What will I receive?
You’ll get a “daily digest”, no more than once per day, summarising everything I’ve published within the last 24 hours. It usually works; occasionally (but not often) it misses things. You can unsubscribe with one click at any time.
How else can I subscribe?
You can still subscribe in a variety of other ways. Personally, I recommend using a feed reader which lets you choose exactly which kinds of content you’re interested in, but there are plenty of options including Facebook and Twitter (for those of such an inclination).
Didn’t you do this before?
Yes, I ran a “subscribe by email” system back in 2007 but didn’t maintain it. Things might be better this time around. Maybe.
Yesterday I recommended that you go read Aaron Uglum‘s webcomic LABS, which had just published its final strip. I’m a big fan of “completed” webcomics – they feel binge-able in the same way as a complete Netflix series does! – but Spencer quickly pointed out that it’s annoying for us enlightened modern RSS users who hook RSS up to everything to have to binge completed comics in a different way to reading ongoing ones: what he wanted was an RSS feed covering the entire history of LABS.
So naturally (after the intense heatwave woke me early this morning anyway) I made one: complete RSS feed of LABS. And, of course, I open-sourced the code I used to generate it so that others can jumpstart their projects to make static RSS feeds from completed webcomics, too.
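The real generator is in the open-sourced repository linked above; as a flavour of the technique, here’s a hedged Python sketch of the core idea – turn a crawled list of strips into a single static RSS file that never needs regenerating. The domain and strip data below are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Invented sample data: a real generator would crawl these from the archive pages.
strips = [
    ("LABS #1", "https://example.com/labs/archive/1", "Mon, 06 Mar 2017 00:00:00 GMT"),
    ("LABS #2", "https://example.com/labs/archive/2", "Thu, 09 Mar 2017 00:00:00 GMT"),
]

rss = ET.Element("rss", version="2.0")
channel = ET.SubElement(rss, "channel")
ET.SubElement(channel, "title").text = "LABS (complete archive)"
ET.SubElement(channel, "link").text = "https://example.com/labs/"
ET.SubElement(channel, "description").text = "Every strip, oldest first, as a static feed."

for title, url, pubdate in strips:
    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = url
    ET.SubElement(item, "guid").text = url  # stable GUID so readers never re-show old strips
    ET.SubElement(item, "pubDate").text = pubdate

# Because the comic is complete, this file can be uploaded to any static host and left alone forever.
ET.ElementTree(rss).write("labs-complete.rss", encoding="utf-8", xml_declaration=True)
```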
Even if you’re not going to read it via this medium, you should go read LABS.
With IndieWebCamp Oxford 2019 scheduled to take place during the Summer of Hacks, I drew a diagram (click to embiggen) of the current ecosystem that powers and propagates the content on DanQ.me. It’s mostly for my own benefit – to be able to get a big-picture view of the ways my website talks to the world and plan for what improvements I might be able to make in the future… but it also works as a vehicle to explain what my personal corner of the IndieWeb does and how it does it. Here’s a summary:
DanQ.me has been powered by a self-hosted WordPress installation for fifteen years, as of today. I know that WordPress isn’t “hip” on the IndieWeb this week and that if you’re not on the JAMstack you’re yesterday’s news, but at 15 years and counting my love affair with WordPress has lasted longer than any romantic relationship I’ve ever had with another human being, so I’m sticking with it. What’s cool in Web technologies comes and goes, but what’s important is solid, dependable tools that do what you need them to, and between WordPress, half a dozen off-the-shelf plugins and about a dozen homemade ones I’ve got everything I need right here.
I write articles (long posts like this) and notes (short, “tweet-like” updates) directly into the site, and just occasionally other kinds of content. But for the most part, different kinds of content come from different parts of the ecosystem, as described below.
DanQ.me sits at the centre of the diagram, but it’s worth remembering that the diagram is deliberately incomplete: it only contains information flows directly relevant to my blog (and it doesn’t even contain all of those!). The last time I tried to draw a diagram like this that described my online life in general, my RSS reader found its way to the centre. Which figures: my RSS reader is usually the first and often the last place I visit on the Internet, and I’ve worked hard to funnel everything through it.
Right now I’m using FreshRSS – plus a handful of plugins, including some homemade ones – as my RSS reader: I switched from Tiny Tiny RSS about a year ago to take advantage of FreshRSS’s excellent responsive themes, among other features. Because some websites don’t have RSS feeds, even where they ought to, I use my own tool RSSey to retroactively “fix” people’s websites for them, dynamically adding feeds for my consumption. It’s also a nice reminder that open source and remixability were cornerstones of the original Web. My RSS reader collates information from a variety of sources and additionally gives me a one-click mechanism to push content I enjoy to my blog as a repost.
QTube is my video hosting platform; it’s a PeerTube node. If you haven’t seen it, that’s fine: most content on it is consumed indirectly either through my YouTube channel or directly on my blog as posts of the “video” kind. Also, I don’t actually vlog very often. When I do publish videos onto QTube, their republication onto YouTube or DanQ.me is optional: sometimes I plan to use a video inside an article post, for example, and so don’t need to republish it by itself.
I’m gradually exporting or re-uploading my backlog of YouTube videos from my current and previous channels to QTube in an effort to recentralise and regain control over their hosting, but I’m in no real hurry. PeerTube certainly makes it easy, though!
I operate a private link shortener which I mostly use for the expected purpose: to make links shorter and so easier to read out and memorise, or else to make them take up less space in a chat window. But soon after I set it up, many years ago, I realised that it could also act as a mechanism to push content to my RSS reader to “read later”. And since I was using it for that anyway, I figured, I might as well also be using it to repost content to my blog from sources that my RSS reader doesn’t subscribe to. This leads to a process that’s perhaps unnecessarily complex: if I want to share a link with you as a repost, I’ll push it into my link shortener and mark it as going “to me”, then I’ll tell my RSS reader to push it to my blog and there it’ll be published to the world! But it works and it’s fast enough: I’m not in the habit of reposting things that are time-critical anyway.
I’ve been involved in brainstorming ways in which the act of finding (or failing to find, etc.) a geocache or reaching (or failing to reach) a geohashpoint could best be represented as a “checkin“, and last year I open-sourced my plugin for pulling logs (with as much automation as is permitted by the terms of service of some of the silos involved) from geocaching websites and posting them to WordPress blogs: effectively PESOS-for-geocaching. I’d prefer to be publishing on my own blog in the first instance, but syndicating my adventures from various silos into my blog is “good enough”.
New notes get pushed out to my Twitter account, for the benefit of my Twitter-using friends. Articles get advertised on Facebook, Twitter and LiveJournal (yes, really) in teaser form, for the benefit of friends who prefer to get notifications via those platforms. Facebook have been fucking around with their APIs and terms of service lately and this is now less-automatic than it used to be, which is a bit of an annoyance. My RSS feeds carry copies of content out to people who prefer to subscribe via that medium, and I’ve also been using this to power an experimental MailChimp “daily digest” mailing list of “what Dan’s been up to” to a small number of friends, right in their email inboxes: I’ve not made it available to everybody yet, but if you’re happy to help test it then give me a shout and I’ll hook you up.
Finally, a couple of IFTTT recipes push my articles and my reposts to Reddit communities: I don’t really use Reddit myself, any more, but I’ve got friends in a few places there who prefer to keep up-to-date with what I’m up to via that medium. For historical reasons, my reposts to Reddit don’t go directly via my blog’s RSS feeds but “shortcut” directly from my RSS reader: this is suboptimal because I don’t get to tweak post titles for Reddit but it’s not a big deal.
I used to syndicate content to Google+ (before it joined the long list of Things Google Have Killed) and to Ello (but it never got much traction there). I’ve probably historically syndicated to other places too: I’ve certainly manually-republished content to other blogs, from time to time, too.
I use Ryan Barrett‘s excellent Brid.gy to convert Twitter replies and likes back into Webmentions for publication as comments on my blog. This used to work for Facebook, too, but again: Facebook fucked it over. I’ve occasionally manually backfed significant Facebook comments, but it’s not ideal: I might like to look at using similar technologies to RSSey to subvert Facebook’s limitations.
I’ve routinely retroactively reintegrated content that I’ve produced elsewhere on the Web. This includes my previous blogs (which is why you can browse my archives, right here on this site, all the way back to some of the cringeworthy angsty-teenager posts I made in the 1990s) but also some Reddit posts, some replies originally posted directly to other people’s blogs, all my old del.icio.us bookmarks, long-form forum posts, posts I made to mailing lists and newsgroups, and more. As a result, there’s a lot of backdated content on this site, nowadays: almost a million words, and significantly more than the 600,000 or so I counted a few years ago, before my biggest push for reintegration!
Why do I do this? Because I really, really like owning my identity online! I’ve tried the “big” silo alternatives like Facebook, Twitter, Medium, Instagram etc., and they’ve eventually always led to disappointment, either because they get shut down or otherwise made-unusable, because of inappropriately-applied “real names” policies, because they give too much power to untrustworthy companies, because they impose arbitrary limitations on my content, because they manipulate which content gets promoted (and exacerbate filter bubbles), or because they make the walls of their walled gardens taller and stop you integrating with them how you used to.
A handful of silos have shown themselves to be more-trustworthy than the average – in particular, eschewing techniques that promote “lock-in” – and I’d love to tell you more about them and what I think you should look for in a silo, another time. But for now: suffice to say that just like I don’t use YouTube like most people do, I elect not to use Facebook or Twitter in the conventional ways either. And it’s awesome, thanks.
There are plenty of reasons that people choose to take control of their own Web presence – and everybody who puts content online ought to consider it – but I imagine that few individuals have such a complicated publishing ecosystem as I do! Now you’ve got a picture of how my digital content production workflow works; perhaps it’ll inspire you to start owning your online identity, too.
I was watching a recent YouTube video by Derek Muller (Veritasium), My Video Went Viral. Here’s Why, and I came to a realisation: I don’t watch YouTube like most people – probably including you! – watch YouTube. And as a result, my perspective on what YouTube is and does is fundamentally biased from the way that others probably think about it.
The magic moment came for me when his video explained that the “subscribe” button doesn’t do what I’d assumed it does. I’m probably not alone in my assumptions: I’ll bet that people who use the “subscribe” button the way YouTube intends don’t all realise that it works the way that it does.
Like many, I’d assumed the “subscribe” button says “I want to know about everything this creator publishes”. But that’s not what actually happens. YouTube wrangles your subscription list and (especially) your recommendations based on their own metrics using an opaque algorithm. I knew, of course, that they used such a thing to manage the list of recommended next-watches… but I didn’t realise how big an influence it was having on the way that most YouTube users choose what they’ll watch!
YouTube’s metrics for “what to show to you” are, of course, biased by your subscriptions. But they’re also biased by what’s “trending” (which in turn is based on watch time and click-through-rate), what people-who-watch-the-things-you-watch watch, subscription commonalities, regional trends, what your contacts are interested in, and… who knows what else! AAA YouTubers try to “game” it, but the goalposts are moving. And the struggle to stay on-top, especially after a fluke viral hit, leads to the application of increasingly desperate and clickbaity measures.
This is a battle to which I’ve been mostly oblivious, until now, because I don’t watch YouTube like you watch YouTube.
Tom Scott produced an underappreciated sci-fi short last year describing a theoretical AI which, in 2028, caused problems as a result of its single-minded focus. What we’re seeing in YouTube right now is a simpler example, but illustrates the problem well: optimising YouTube’s algorithm for any factor or combination of factors other than a user’s stated preference (subscriptions) will necessarily result in the promotion of videos to a user other than, and at the expense of, the ones by creators that they’ve subscribed to. And there are so many things that YouTube could use as influencing factors. Off the top of my head, there’s:
Number of views
Number of likes
Ratio of likes to dislikes
Number of tracked shares
Number of saves
Length of view
Click-through rate on advertisements
Popularity amongst your friends
Popularity amongst your demographic
Etc., etc., etc.
But this is all alien to me. Why? Well: here’s how I use YouTube:
Subscription: I subscribe to creators via RSS. My RSS reader doesn’t implement YouTube’s algorithm, of course, so it just gives me exactly what I subscribe to – no more, no less. It’s not perfect (for example, it pisses me off every time it tells me about an upcoming “premiere”, a YouTube feature I don’t care about even a little), but apart from that it’s great! If I’m on-the-move and can’t watch something as long and involved as TheraminTrees‘ latest deep-thinker, my RSS reader remembers so I can watch it later at my convenience. I can have National Geographic‘s videos “expire” if I don’t watch them within a week but Dr. Doe‘s wait for me forever. And I can implement my own filters if a feed isn’t showing exactly what I’m looking for (like I did to strip the sport section from BBC News’ RSS feed). I’m in control.
Discovery: I don’t rely on YouTube’s algorithm to find me new content. I don’t mind being a day or two behind on what’s trending: I’m not sure I care at all? I’m far more-interested in recommendations curated by a human. If I discover and subscribe to a channel on YouTube, it was usually (a) mentioned by another YouTuber or (b) linked from a blog or community blog. I’m receiving recommendations from people I already respect, and they have a way higher hit-rate than YouTube’s recommendations. (I also sometimes discover content because it’s exactly what I searched for, e.g. I’m looking for that tutorial on how to install a fiddly damn kiddy seat into the car, but this is unlikely to result in a subscription.)
This isn’t yet-another-argument that you should use RSS because it’s awesome. (Okay, so it is. RSS isn’t dead, and its killer feature is that its users get to choose how it works. But there’s more I want to say.)
What I wanted to share was this reminder, for me, that the way you use a technology can totally blind you to the way others use it. I had no idea that many YouTube creators and some YouTube subscribers felt increasingly like they were fighting YouTube’s algorithms, whose goals are different from their own, to get what they want. Now I can see it everywhere! Why do schmoyoho always encourage me to press the notification bell and not just the subscribe button? Because for a typical YouTube user, that’s the only way that they can be sure that their latest content will be seen!
Of course, the business needs of YouTube mean that we’re not likely to see any change from them. So until either we have mainstream content-curating AIs that answer to their human owners rather than to commercial networks (robot butler, anybody?) or else the video fediverse catches on – and I don’t know which of those two is less-likely! – I guess I’ll stick to my algorithm-lite subscription model for YouTube.
But at least now I’ll have a better understanding of why some of the channels I follow are changing the way they produce and market their content…
I love RSS, but it’s a minor niggle for me that if I subscribe to any of the BBC News RSS feeds I invariably get all the sports news, too. Which’d be fine if I gave even the slightest care about the world of sports, but I don’t. So I run a little script (sketched below) that:
Fetches the original feed, then filters it to remove all entries whose GUID matches a particular regular expression (removing all of those from the “sport” section of the site)
Outputs the resulting feed into a temporary file
Uploads the temporary file to a bucket in Backblaze‘s “B2” repository (think: a better-value competitor to S3); the bucket I’m using is publicly-accessible so anybody’s RSS reader can subscribe to the feed
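In case it’s useful as a starting point, here’s a rough Python equivalent of that pipeline. The regular expression, key placeholders, and bucket name are assumptions, not the real script’s values:

```python
import re
import tempfile
import xml.etree.ElementTree as ET

import requests
from b2sdk.v2 import B2Api, InMemoryAccountInfo

FEED = "https://feeds.bbci.co.uk/news/rss.xml"
SPORT = re.compile(r"/sport/")  # assumption: sport items' GUIDs contain this path segment

# Fetch and parse the original feed, then prune matching items.
root = ET.fromstring(requests.get(FEED).content)
channel = root.find("channel")
for item in list(channel.findall("item")):
    if SPORT.search(item.findtext("guid", default="")):
        channel.remove(item)  # drop everything from the "sport" section

# Write the filtered feed to a temporary file.
with tempfile.NamedTemporaryFile(suffix=".rss", delete=False) as tmp:
    ET.ElementTree(root).write(tmp, encoding="utf-8", xml_declaration=True)

# Push it to a publicly-readable B2 bucket for feed readers to poll.
b2 = B2Api(InMemoryAccountInfo())
b2.authorize_account("production", "YOUR_KEY_ID", "YOUR_APP_KEY")
b2.get_bucket_by_name("my-feeds").upload_local_file(
    local_file=tmp.name, file_name="bbc-news-without-the-sport.rss"
)
```

Schedule it to run every few minutes and any feed reader pointed at the bucket’s public URL gets sport-free news.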
When Firefox 64 arrives in December, support for RSS, the once celebrated content syndication scheme, and its sibling, Atom, will be missing.
“After considering the maintenance, performance and security costs of the feed preview and subscription features in Firefox, we’ve concluded that it is no longer sustainable to keep feed support in the core of the product,” said Gijs Kruitbosch, a software engineer who works on Firefox at Mozilla, in a blog post on Thursday.
Not a great sign, but understandable. Live Bookmarks was never strong enough to be a full-featured RSS reader, and I don’t know about you but I haven’t really made use of bookmarks for a good few years, let alone “live” bookmarks, but the media are likely to see this (as El Reg does, in the article) as another nail in the coffin of one of the best syndication mechanisms the Web ever came up with.
There are two stories here. The first is a story about a vision of the web’s future that never quite came to fruition. The second is a story about how a collaborative effort to improve a popular standard devolved into one of the most contentious forks in the history of open-source software development.
In the late 1990s, in the go-go years between Netscape’s IPO and the Dot-com crash, everyone could see that the web was going to be an even bigger deal than it already was, even if they didn’t know exactly how it was going to get there. One theory was that the web was about to be revolutionized by syndication. The web, originally built to enable a simple transaction between two parties—a client fetching a document from a single host server—would be broken open by new standards that could be used to repackage and redistribute entire websites through a variety of channels. Kevin Werbach, writing for Release 1.0, a newsletter influential among investors in the 1990s, predicted that syndication “would evolve into the core model for the Internet economy, allowing businesses and individuals to retain control over their online personae while enjoying the benefits of massive scale and scope.”1 He invited his readers to imagine a future in which fencing aficionados, rather than going directly to an “online sporting goods site” or “fencing equipment retailer,” could buy a new épée directly through e-commerce widgets embedded into their favorite website about fencing.2 Just like in the television world, where big networks syndicate their shows to smaller local stations, syndication on the web would allow businesses and publications to reach consumers through a multitude of intermediary sites. This would mean, as a corollary, that consumers would gain significant control over where and how they interacted with any given business or publication on the web.
RSS was one of the standards that promised to deliver this syndicated future. To Werbach, RSS was “the leading example of a lightweight syndication protocol.”3 Another contemporaneous article called RSS the first protocol to realize the potential of XML.4 It was going to be a way for both users and content aggregators to create their own customized channels out of everything the web had to offer. And yet, two decades later, RSS appears to be a dying technology, now used chiefly by podcasters and programmers with tech blogs. Moreover, among that latter group, RSS is perhaps used as much for its political symbolism as its actual utility. Though of course some people really do have RSS readers, stubbornly adding an RSS feed to your blog, even in 2018, is a reactionary statement. That little tangerine bubble has become a wistful symbol of defiance against a centralized web increasingly controlled by a handful of corporations, a web that hardly resembles the syndicated web of Werbach’s imagining.
The future once looked so bright for RSS. What happened? Was its downfall inevitable, or was it precipitated by the bitter infighting that thwarted the development of a single RSS standard?
I’ve always been a huge fan of RSS, and I use it for just about everything (I’ll even hack-it-in to services that don’t supply it natively, just to make them fit around my workflow). But even I’ve got to admit that – outside of podcasts – it’s not done well at retaining mainstream appeal, especially after the death of Google Reader. Right now, most people seem content to get their updates from their social media circles, and take a manual approach (ugh) to reading content in the few other places that matter to them. That’s problematic for all kinds of reasons, and I’m perfectly happy to be one of those old fuddy-duddies who likes his web standards open and independent!
As of next week, I’ll have been blogging for 20 years, or about 54% of my life. How did that happen?
The mid-1990s were a very different time for the World Wide Web (yes, we still called it that, and sometimes we even described its use as “surfing”). Going “on the Internet” was a calculated and deliberate action requiring tying up your phone line, minutes of “connecting” along with all of the associated screeching sounds if you hadn’t turned off your modem’s loudspeaker, and you’d typically be paying twice for the experience: both a monthly fee to your ISP for the service and a per-minute charge to your phone company for the call.
It was into this environment that in 1994 I published my first web pages: as far as I know, nothing remains of them now. It wasn’t until 1998 that I signed up for an account with UserActive (whose website looks almost the same today as it did then), who offered economical subdomain hosting with shell and CGI support, and launched “Castle of the Four Winds”, a set of vanity pages that included my first blog.
Except I didn’t call it a “blog”, of course, because it wasn’t until the following year that Peter Merholz invented the word (he also commemorated 20 years of blogging, this year). I didn’t even call it a “weblog”, because that word was still relatively new and I wasn’t hip enough to be around people who said it, yet. It was self-described as an “online diary”, a name which only served to reinforce the notion that I was writing principally for myself. In fact, it wasn’t until mid-1999 that I discovered that it was being more-widely read than just by me and my circle of friends when I attracted a stalker who travelled across the UK to try to “surprise” me by turning up at places she expected to find me, based on what I’d written online… which was exactly as creepy as it sounds.
While the world began to panic that the coming millennium was going to break all of the computers, I migrated Castle of the Four Winds’ content into AvAngel.com, a joint vanity site venture with my friend Andy. Aside from its additional content (purity tests, funny stuff, risqué e-cards), what we hosted was mostly the same old stuff, and I continued to write snippets about my life in what was now quite-clearly a “blog-like” format, with the most-recent posts at the top and separate pages for content too old for the front page. Looking back, there’s still a certain naivety to these posts which exemplify the youth of the Web. For example, posts routinely referenced my friends by their email addresses, because spam was yet to become a big enough problem that people didn’t much mind if you put their email address on a public webpage somewhere, and because email addresses still carried with them a feeling of anonymity that ceased to be the case when we started using them for important things.
Meanwhile, during my initial months as a student in Aberystwyth, I wrote a series of emails to friends back home entitled “Cool And Interesting Thing Of The Day To Do At The University Of Wales, Aberystwyth”, and put copies of each onto my student webspace; I’ve since recovered these and integrated them into my unified blog.
In 2002 I’d bought the domain name scatmania.org – a reference to my university halls of residence nickname “Scatman Dan”; I genuinely didn’t consider the possibility that the name might be considered scatalogical until later on. As I wanted to continue my blogging at an address that felt like it was solely mine (AvAngel.com having been originally shared with a friend, although in practice over time it became associated only with me), this seemed like a good domain upon which to relaunch. And so, in mid-2003 and powered by a short-lived and ill-fated blogging engine called Flip I did exactly that. WordPress, to which I’d subsequently migrate, hadn’t been invented yet and it wasn’t clear whether its predecessor, b2/cafelog, would survive the troubles its author was experiencing.
From this point on, any web address for any post made to my blog still works to this day, despite multiple technological and infrastructural changes to my blog (and some domain name shenanigans!) in the meantime. I’d come to be a big believer in the mantra that cool URIs don’t change: something that, as far as possible, I’ve committed to trying to uphold in my blogging, my archiving, and my paid work since then. I’m moderately confident that all extant links on the web that point to my earlier posts are under my control, so they can be (and in most cases have been) fixed already: I’m pretty close to having all my permalink URIs be “cool”, for now. You might hit a short chain of redirects, but you’ll get to where you’re going.
And everything was fine, until one day in 2004 when it wasn’t. The server hosting scatmania.org died in a very bad way, and because my backup strategy was woefully inadequate, I lost a lot of content. I’ve recovered quite a lot of it and put it back in-place, but some is probably gone forever.
The resurrected site was powered by WordPress, and this was the first time that live database queries had been used to power my blog. Occasionally, these days, when talking to younger, cooler developers, I’m tempted to follow the hip trend of reimplementing my blog as a static site, compiling a stack of host-anywhere HTML files based upon whatever-structure-I-like at the “backend”… but then I remember that I basically did that already for six years and I’m far happier with my web presence today. I’ve nothing against static site systems (I’m quite partial to Middleman, myself, although I’m also fond of Hugo) but they’re not right for this site, right now.
IndieAuth hadn’t been invented yet, but I was quite keen on the ideals of OpenID (I still am, really), and so I implemented what was probably the first viable “install-anywhere” implementation of OpenID for WordPress – you can see part of it functioning in the top-right of the screenshot above, where my (copious, at that time) LiveJournal-using friends were encouraged to sign in to my blog using their LiveJournal identity. Nowadays, the majority of the WordPress plugins I use are ones I’ve written myself: my blog is powered by a CMS that’s more “mine” than not!
Over the course of the first decade of my blogging, a few trends had become apparent in my technical choices. For example:
I’ve preferred an approach of storing the “master” copy of my content on my own site and then (sometimes) syndicating it elsewhere: for example, for the benefit of my friends who during their University years maintained a LiveJournal, for many years I had my blog cross-post to a LiveJournal account (and backfeed copies of comments back to my site).
These were deliberate choices, but they didn’t require much consideration: growing up with a Web far less-sophisticated than today’s (e.g. truly stateless prior to the advent of HTTP cookies) and seeing the chaos caused during the first browser war and the period of stagnation that followed, these choices seemed intuitive.
As you’d expect from a blog covering a period from somebody’s teen years through to their late thirties, there’ve been significant changes in the kinds of content I’ve posted (and the tone with which I’ve done so) over the years, too. If you dip into 2003, for example, you’ll see the results of quiz memes and unqualified daily minutiae alongside actual considered content. Go back further, to early 1999, and it is (at best) meaningless wittering about the day-to-day life of a teenage student. It took until around 2009/2010 before I actually started focussing on writing content that specifically might be enjoyable for others to read (even where that content was frankly silly) and only far more-recently-still that I’ve committed to the “mostly technical stuff, occasional bits of ‘life’ stuff” focus that I have today.
I say “committed”, but of course I’m fully aware that whatever this blog is now, it’ll doubtless be something somewhat different if I’m still writing it in another two decades…
Once I reached the 2010s I started actually taking the time to think about the design of my blog and its meaning. Conceptually, all of my content is data-driven: database tables full of different “kinds” of content and associated metadata, and that’s pretty-much ideal – it provides a strong separation between content and presentation and makes it possible to make significant design changes with less work than might otherwise be expected. I’ve also always generally favoured a separation of concerns in web development and so I’m not a fan of CSS design methodologies that encourage class names describing how things should appear, like Atomic CSS. Even where it results in a performance hit, I’d far rather use CSS classes to describe what things are or represent. The single biggest problem with this approach, to my mind, is that it violates the DRY principle… but that’s something that your CSS preprocessor’s there to fix for you, isn’t it?
But despite this philosophical outlook on the appropriate gap between content and presentation, it took until about 2010 before I actually attached any real significance to the presentation at all! Until this point, I’d considered myself to have been more of a back-end than a front-end engineer, and felt that the most-important thing was to get the content out there via an appropriate medium. After all, a site without content isn’t a site at all, but a site without design is (or at least should be) still intelligible thanks to browser defaults! Remember, again, that I started web development at a time when stylesheets didn’t exist at all.
My previous implementations of my blog design had used simple designs, often adapted from open-source templates, in an effort to get them deployed as quickly as possible and move on to the next task, but now, I felt, it was time to do a little more.
For a few years, I was producing a new theme once per year. I experimented with different colours, fonts, and layouts, and decided (after some ad-hoc A/B testing) that my audience was better-served by a “front” page than by being dropped directly into my blog archives as had previously been the case. Highlighting the latest few – and especially the very-latest – post and other recent content increased the number of posts that a visitor would be likely to engage with in a single visit. I’ve always presumed that the reason for this is that regular (but non-subscribing) readers are more-likely to be able to work out what they have and haven’t read already from summary text than from trying to decipher an entire post: possibly because my blogging had (has!) become rather verbose.
I went through a bit of a lull in blogging: I’ve joked that I spent more time on my 2010 and 2011 designs than I did on the sum total of the content that was published in between the pair of them (which isn’t true… at least, not quite!). In the month I left Aberystwyth for Oxford, for example, I was doing all kinds of exciting and new things… and yet I only wrote a total of two blog posts.
With RSS waning in popularity – which I can’t understand: RSS is amazing! – I began to crosspost to social networks like Twitter and Google+ (although no longer to Google+, following the news of its imminent demise) to help those readers who prefer to get their content via these media, but because I wasn’t producing much content, it probably didn’t make a significant difference anyway: the chance of a regular reader “missing” something must have been remarkably slim.
Nobody calls me “Scatman Dan” any more, and hasn’t for a long, long time. Given that my name is already awesome and unique all by itself (having changed to be so during the era in which scatmania.org was my primary personal domain name), it felt like I had the opportunity to rebrand.
I moved my blog to a new domain, DanQ.me (which is nice and short, too) and came up with a new collection of colours, fonts, and layout choices that I felt better-reflected my identity… and the fact that my blog was becoming less a place to record the mundane details of my daily life and more a place where I talk about (principally-web) technology, security, and GPS games… and just occasionally about other topics like breadmaking and books. Also, it gave me a chance to get on top of the current trend in web design for big, clean, empty spaces, square corners, and using pictures as the hook to a story.
Particular areas in which I produce content elsewhere but would like to at-least maintain a copy here, and would ideally publish here first and syndicate elsewhere, although I appreciate that this is difficult, are:
Reddit, where I’ve written tens of thousands of words under a variety of accounts, but I don’t really pay attention to the site any more
I left Facebook in 2011 but I still have a backup of what was on my “Wall” at that point, which I could look into reintegrating into my blog
I share a lot of the source code I write via my GitHub account, but I’m painfully aware that this is yet-another-silo that I ought to learn not to depend upon (and it ought to be simple enough to mirror my repos on my own site!)
I’ve got a reasonable number of videos on two YouTube channels which are online by Google’s good graces (and potential for advertising revenue); for a handful of technical reasons they’re a bit of a pain to self-host, but perhaps my blog could act as a secondary source to my own video content
I write business reviews on Google Maps which I should probably look into recovering from the hivemind and hosting here… in fact, I’ve probably written plenty of reviews on other sites, too, like Amazon for example…
On two previous occasions I’ve maintained an online photo gallery; I might someday resurrect the concept, at least for the photos that used to be published on them
I’ve dabbled on a handful of other, often weirder, social networks before like Scuttlebutt (which has a genius concept, by the way) and Ello, and ought to check if there’s anything “original” on there I should reintegrate
Going way, way back, there are a good number of usenet postings I’ve made over the last twenty-something years that I could reclaim, if I can find them…
(if you’re asking why I’m inclined to do all of these things: here’s why)
20 years and around 717,000 words worth of blogging down, it’s interesting to look back and see how things have changed: in my life, on the Web, and in the world in general. I’ve seen many friends’ blogs come and go: they move into a new phase of their life and don’t feel like what they wrote before reflects them today, most often, and so they delete them… which is fine, of course: it’s their content! But for me it’s always felt wrong to do so, for two reasons: firstly, it feels false to do so given that once something’s been put on the Web, it might well be online forever – you can’t put the genie back in the bottle! And secondly: for me, it’s valuable to own everything I wrote before. Even the cringeworthy things I wrote as a teenager who thought they knew everything and the antagonistic stuff I wrote in my early 20s but that I clearly wouldn’t stand by today is part of my history, and hiding that would be a disservice to myself.
The 17-year-old who wrote my first blog posts two decades ago this month fully expected that the things he wrote would be online forever, and I don’t intend to take that away from him. I’m sure that when I write a post in October 2038 looking back on the next two decades, I’ll roll my eyes at myself today, too, but for me: that’s part of the joy of a long-running personal blog. It’s like a diary, but with a sense of accountability. It’s a space on the web that’s “mine” into which I can dump pretty-much whatever I like.
I love it: I’ve been blogging for over half of my life, and if I can get back to you in 2031 and tell you that I’ve by-then been doing so for two-thirds of my life, that would be a win.
@adactio: The ampersand in https://adactio.com/notes/14395 isn’t being properly escaped in your RSS feed, breaking the XML – see e.g. https://validator.w3.org/feed/check.cgi?url=https%3A%2F%2Fadactio.com%2Frss
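For anyone who’s hit the same snag: ampersands (and a few other reserved characters) must be entity-encoded before they can appear in XML text, or the whole feed fails to parse. In Python, for example, the standard library handles it:

```python
from xml.sax.saxutils import escape

title = "Fish & Chips <tonight>"
print(escape(title))  # Fish &amp; Chips &lt;tonight&gt; – safe to embed in a feed
```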
I recently discovered a minor security vulnerability in mobile webcomic reading app Comic Chameleon, and I thought that it was interesting (and tame) enough to share as a learning example of (a) how to find security vulnerabilities in an app like this, and (b) more importantly, how to write an app like this without this kind of security vulnerability.
The nature of the vulnerability is that, for webcomics pushed directly into the platform by their authors, it’s possible to read comics (long) before they’re published. By way of proof, here’s a copy of the top-right 200 × 120 pixels of episode 54 of the (excellent) Forward Comic, which Imgur will confirm was uploaded on 2 July 2018: over three months ahead of its planned publication date.
How to hack a web-backed app
Just to be clear, I didn’t set out to hack this app, but once I stumbled upon the vulnerability I wanted to make sure that I was able to collect enough information that I’d be able to explain to its author what was wrong and how to fix it. You’d be amazed how many systems I find security holes in almost-completely by accident. In fact, I’d just noticed that the application supported some webcomics that I follow but for which I hadn’t been able to find RSS feeds (and so I was selfdogfooding my own tool, RSSey, to “produce” RSS feeds for my reader by screen-scraping: not the most-elegant solution). But if this app could produce a list of issues of the comic, it must have some way of doing what I was trying to do, and I wanted to know what it was.
The app, I figured, must “phone home” to some website – probably the app’s official website itself – to get the list of comics that it supports and details of where to get their feeds from, so I grabbed a copy of the app and started investigating. Because I figured I was probably looking for a URL, the first thing I did was to download the raw APK file (your favourite search engine can tell you how to do this), decompressed it (APK files are just ZIP files, really) and ran strings on it to search for likely-looking URLs:
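If you don’t have a Unix strings binary to hand, a few lines of Python do the same job. The URL pattern here is deliberately rough, and the filename is a placeholder:

```python
import re
import zipfile

URL_RE = re.compile(rb"https?://[\x21-\x7e]{4,}")  # crude: any printable run after http(s)://

def urls_in_apk(path: str) -> set[str]:
    found = set()
    with zipfile.ZipFile(path) as apk:  # APK files are just ZIP files, really
        for name in apk.namelist():
            for match in URL_RE.findall(apk.read(name)):
                found.add(match.decode("ascii", "replace"))
    return found

for url in sorted(urls_in_apk("comic-chameleon.apk")):
    print(url)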
I tried visiting a few of the addresses but many of them seemed to be API endpoints that were expecting additional parameters. Probably, I figured, the strings I’d extracted were prefixes to which those parameters were attached. Rather than fuzz for the right parameters, I decided to watch what the app did: I spun up a simulated Android device using the official emulator (I could have used my own on a wireless network that I control, of course, but this was lazier) and ran my favourite packet sniffer to see what the application was requesting.
Now I had full web addresses with parameters. Comparing the parameters that appeared when I clicked different comics revealed that each comic in the “full list” was assigned a numeric ID which was used when requesting issues of that comic (along with an intermediate stage where the year of publication is requested).
Interestingly, a number of comics were listed with the attribute s="no-show" and did not appear in the app: it looked like comics that weren’t yet being made available via the app were already being indexed and collected by its web component, and for some reason were being exposed via the XML API. Presumably the developer had never considered that anybody but their app would look at the XML itself, but the thing about the Web is that if you put something on the Web, anybody can see it.
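To illustrate quite how little effort “seeing it” takes – the endpoint and element names below are invented, not the app’s real ones – spotting the hidden entries is a one-liner once you’ve parsed the XML:

```python
import xml.etree.ElementTree as ET

import requests

# Hypothetical endpoint and element names, for illustration only; s="no-show" is the real attribute.
xml_text = requests.get("https://example.com/api/comics.xml").text
root = ET.fromstring(xml_text)

hidden = [comic.get("title") for comic in root.iter("comic") if comic.get("s") == "no-show"]
print(hidden)  # everything the app was told not to display, plainly visible to anybody
```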
Still: at this point I assumed that I was about to find what I was looking for – some kind of machine-readable source (an RSS feed or something like one) for a webcomic or two. But when I looked at the XML API for one of those webcomics I discovered quite a bit more than I’d bargained on finding:
The first webcomic I looked at included the “official” web addresses and titles of each published comic… but also several not yet published ones. The unpublished ones were marked with s="no-show" to indicate to the app that they weren’t to be shown, but I could now see them. The “official” web addresses didn’t work for me, as I’d expected, but when I tried Comic Chameleon’s versions of the addresses, I found that I could see entire episodes of comics, up to three and a half months ahead of their expected publication date.
Naturally, I compiled all of my findings into an email and contacted the app developer with all of the details they’d need to fix it – in hacker terms, I’m one of the “good guys”! – but I wanted to share this particular example with you because (a) it’s not a very dangerous leak of data (a few webcomics a few weeks early and/or a way to evade a few ads isn’t going to kill anybody) and (b) it’s very illustrative of the kinds of mistakes that app developers are making a lot, these days, and it’s important to understand why so that you’re not among them. On to that in a moment.
Because (I’d like to think) I’m one of the “good guys” in the security world, the first thing I did after the research above was to contact the author of the software. They didn’t seem to have a security.txt file, a disclosure policy, nor a profile on any of the major disclosure management sites, so I sent an email. Were the security issue more-severe, I’d have sent a preliminary email suggesting (and agreeing on a mechanism for) encrypted email, but given the low impact of this particular issue, I just explained the entire issue in the initial email: basically what you’ve read above, plus some tips on fixing the issue and an offer to help out.
I subscribe to the doctrine of responsible disclosure, which – in the event of more-significant vulnerabilities – means that after first contacting the developer of an insecure system and giving them time to fix it, it’s acceptable (in fact: in significant cases, it’s socially-responsible) to publish the details of the vulnerability. In this case, though, I think the whole experience makes an interesting learning example about ways in which you might begin to “black box” test an app for data leaks like this and – below – how to think about software development in a way that limits the risk of such vulnerabilities appearing in the first place.
The author of this software hasn’t given any answer to any of the emails I’ve sent over the last couple of weeks, so I’m assuming that they just plan to leave this particular leak in place. I reached out and contacted the author of Forward Comic, though, which turns out (coincidentally) to be probably the most-severely affected publication on the platform, so that he had the option of taking action before I published this blog post.
Lessons to learn
When developing an “app” (whether for the web or a desktop or mobile platform) that connects to an Internet service to collect data, here are the important things you really, really ought to do:
Don’t publish any data that you don’t want the user to see.
If the data isn’t for everybody, remember to authenticate the user.
And for heaven’s sake use SSL, it’s not the 1990s any more.
That first lesson’s the big one of course: if you don’t want something to be on the public Internet, don’t put it on the public Internet! The feeds I found simply shouldn’t have contained the “secret” information that they did, and the unpublished comics shouldn’t have been online at real web addresses. But aside from (or in addition to) not including these unpublished items in the data feeds, what else might our app developer have considered?
Encryption. There’s no excuse for not using HTTPS these days. This alone wouldn’t have prevented a deliberate effort to read the secret data, but it would help prevent it from happening accidentally (which is a risk right now), e.g. on a proxy server or while debugging something else on the same network link. It also protects the user from exposing their browsing habits (do you want everybody at that coffee shop to know what weird comics you read?) and from having content ‘injected’ (do you want the person at the next table of the coffee shop to be able to choose what you see when you ask for a comic?).
Authentication (app). The app could work harder to prove that it’s genuinely the app when it contacts the website. No mechanism for doing this can ever be perfect, because the user has access to the app and can theoretically reverse-engineer it to fish the entire authentication strategy out of it, but some approaches are better than others. Sending a password (e.g. over Basic Authentication) is barely better than just using a complex web address, but using a client-side certificate or an OTP algorithm would (in conjunction with encryption) foil many attackers; there’s a sketch of one such approach after this list.
Authentication (user). It’s a very-different model to the one currently used by the app, but requiring users to “sign up” to the service would reduce the risks and provide better mechanisms for tracking/blocking misusers, though the relative anonymity of the Internet doesn’t give this much strength and introduces various additional burdens both technical and legal upon the developer.
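As promised above, here’s what the “authentication (app)” idea might look like in practice: a shared-secret HMAC over the request path plus a timestamp. This is a minimal illustrative sketch, not the app’s actual mechanism, and (as noted) a determined user can still extract the secret from the app binary:

```python
import hashlib
import hmac
import time

SECRET = b"baked-into-the-app"  # recoverable by anyone who reverse-engineers the app

def sign_request(path: str) -> dict[str, str]:
    """Headers the app would attach; the server recomputes the HMAC and compares."""
    timestamp = str(int(time.time()))
    signature = hmac.new(SECRET, f"{path}|{timestamp}".encode(), hashlib.sha256).hexdigest()
    return {"X-Timestamp": timestamp, "X-Signature": signature}

print(sign_request("/api/comics.xml"))
```

The server rejects requests whose signature doesn’t verify or whose timestamp is stale (limiting replay attacks), which raises the bar well above a guessable web address – but only in conjunction with HTTPS, because otherwise the headers themselves can be sniffed.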
Fundamentally, of course, there’s nothing that an app developer can do to perfectly protect the data that is published to that app, because the app runs on a device that the user controls! That’s why the first lesson is the most important: if it shouldn’t be on the public Internet (yet), don’t put it on the public Internet.
Hopefully there’s a lesson for you somewhere too: about how to think about app security so that you don’t make a similar mistake, or about some of the ways in which you might test the security of an application (for example, as part of an internal audit), or, if nothing else, that you should go and read Forward, because it’s pretty cool.
I’m aware that many of my friends use Google Reader to subscribe to their favourite blogs, comics, and so on, so – if you’re among them – I thought I’d better make you aware of some of your alternatives. Google are dropping Google Reader on 1st July (here’s the announcement on the Google Reader blog), so it’s time to move on.
Getting your data out of Google Reader
The good news is that it’s pretty easy to get all of your feeds out of Google Reader, and import them into your new feed reader. You can export everything from your Reader account, but the most important thing in your export is probably the OPML file (called ‘subscriptions.xml’ in your download), which is what your new reader will use to give you the continuity that you’re looking for. OPML files describe a list of subscriptions: for example, this OPML file describes all of the blogs that used to feature on Abnib (when it worked reliably).
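If you’re curious what’s inside that file: OPML is just XML, and each subscription is an outline element whose xmlUrl attribute holds the feed address. A few lines of Python will list them – a minimal sketch:

```python
import xml.etree.ElementTree as ET

tree = ET.parse("subscriptions.xml")  # the OPML file from your Google Reader export
for outline in tree.iter("outline"):
    feed_url = outline.get("xmlUrl")  # folders are outlines without an xmlUrl
    if feed_url:
        print(outline.get("title", "(untitled)"), "->", feed_url)
```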
Choosing a new RSS reader
You’ve got a few different choices for your new RSS reader. Here are a few of my favourites:
Tiny Tiny RSS – if you’re happy to host your own web-based RSS reader, and you’re enough of a geek to enjoy tweaking it the way that you want, then this tool is simply awesome. Install it on your server, configure it the way you want, and then access it via the web or the Android app. I’ve been using Tiny Tiny RSS for a few years, and I’ve made a few minor tweaks to add URL-shortening and sharing features: that’s what powers the “Dan is reading…” (subscribe) list in the sidebar of my blog. It’s also one of the few web-based RSS readers that offers feed authentication options, which is incredibly useful if you follow “friends only” blogs on LiveJournal or similar platforms.
NewsBlur – this is the closest thing you’ll find to a like-for-like replacement for Google Reader, and it’s actually really good: a slick, simple interface, apps for all of the major mobile platforms, and a damn smart tagging system. They’re a little swamped with Reader refugees right now, but you can work around the traffic by signing up and logging in at their alternative web address of dev.newsblur.com.
Feedly – or, if you’re happy to step away from the centralised, web-based reader solutions, here’s a great option: available as a browser plugin or a mobile app, it has the fringe benefit that you can use it to read your pre-cached subscriptions while you’re away from an Internet connection, if that’s a concern to you.
Blogtrottr – If you only subscribe to a handful of feeds, you might want to look at Blogtrottr: it’s an RSS-to-email service, so it delivers your favourite blogs right to your Inbox, which is great for those of you that use your Inbox as a to-do list (and pretty damn good if you set up some filters to put your RSS feeds into a suitable tag or folder, so that you can read them at your leisure).
Finally, don’t forget that if you’re using Opera as your primary web browser, it has a great RSS reader baked right into it! As an Opera fan, I couldn’t help but plug that.
Or if you only care about my posts…
Of course, if mine’s the only blog you’re concerned with, you might like to follow me on Google+ or on Twitter: all of my blog posts get publicly pushed to both of those social networks as soon as they’re published, so if you’re a social network fiend, that’s probably the easiest answer for you!