I was watching a recent YouTube video by Derek Muller (Veritasium), My Video Went Viral. Here’s Why, and I came to a realisation: I don’t watch YouTube like most people – probably including you! – watch YouTube. And as a result, my perspective on what YouTube is and does is fundamentally biased from the way that others probably think about it.
The magic moment came for me when his video explained that the “subscribe” button doesn’t do what I’d assumed it does. I’m probably not alone in my assumptions: I’ll bet that people who use the “subscribe” button as YouTube intend don’t all realise that it works the way that it does.
Like many, I’d assumed the “subscribe” buttons says “I want to know about everything this creator publishes”. But that’s not what actually happens. YouTube wrangles your subscription list and (especially) your recommendations based on their own metrics using an opaque algorithm. I knew, of course, that they used such a thing to manage the list of recommended next-watches… but I didn’t realise how big an influence it was having on the way that most YouTube users choose what they’ll watch!
YouTube’s metrics for “what to show to you” is, of course, biased by your subscriptions. But it’s also biased by what’s “trending” (which in turn is based on watch time and click-through-rate), what people-who-watch-the-things-you-watch watch, subscription commonalities, regional trends, what your contacts are interested in, and… who knows what else! AAA YouTubers try to “game” it, but the goalposts are moving. And the struggle to stay on-top, especially after a fluke viral hit, leads to the application of increasingly desperate and clickbaity measures.
This is a battle to which I’ve been mostly oblivious, until now, because I don’t watch YouTube like you watch YouTube.
Tom Scott produced an underappreciated sci-fi short last year describing a theoretical AI which, in 2028, caused problems as a result of its single-minded focus. What we’re seeing in YouTube right now is a simpler example, but illustrates the problem well: optimising YouTube’s algorithm for any factor or combination of factors other than a user’s stated preference (subscriptions) will necessarily result in the promotion of videos to a user other than, and at the expense of, the ones by creators that they’ve subscribed to. And there are so many things that YouTube could use as influencing factors. Off the top of my head, there’s:
Number of views
Number of likes
Ratio of likes to dislikes
Number of tracked shares
Number of saves
Length of view
Click-through rate on advertisements
Popularity amongst your friends
Popularity amongst your demographic
Etc., etc., etc.
But this is all alien to me. Why? Well: here’s how I use YouTube:
Subscription: I subscribe to creators via RSS. My RSS reader doesn’t implement YouTube’s algorithm, of course, so it just gives me exactly what I subscribe to – no more, no less.It’s not perfect (for example, it pisses me off every time it tells me about an upcoming “premiere”, a YouTube feature I don’t care about even a little), but apart from that it’s great! If I’m on-the-move and can’t watch something as long as involved TheraminTrees‘ latest deep-thinker, my RSS reader remembers so I can watch it later at my convenience. I can have National Geographic‘s videos “expire” if I don’t watch them within a week but Dr. Doe‘s wait for me forever. And I can implement my own filters if a feed isn’t showing exactly what I’m looking for (like I did to strip the sport section from BBC News’ RSS feed). I’m in control.
Discovery: I don’t rely on YouTube’s algorithm to find me new content. I don’t mind being a day or two behind on what’s trending: I’m not sure I care at all? I’m far more-interested in recommendations curated by a human. If I discover and subscribe to a channel on YouTube, it was usually (a) mentioned by another YouTuber or (b) linked from a blog or community blog. I’m receiving recommendations from people I already respect, and they have a way higher hit-rate than YouTube’s recommendations.(I also sometimes discover content because it’s exactly what I searched for, e.g. I’m looking for that tutorial on how to install a fiddly damn kiddy seat into the car, but this is unlikely to result in a subscription.)
This isn’t yet-another-argument that you should use RSS because it’s awesome. (Okay, so it is. RSS isn’t dead, and its killer feature is that its users get to choose how it works. But there’s more I want to say.)
What I wanted to share was this reminder, for me, that the way you use a technology can totally blind you to the way others use it. I had no idea that many YouTube creators and some YouTube subscribers felt increasingly like they were fighting YouTube’s algorithms, whose goals are different from their own, to get what they want. Now I can see it everywhere! Why do schmoyoho always encourage me to press the notification bell and not just the subscribe button? Because for a typical YouTube user, that’s the only way that they can be sure that their latest content will be seen!
Of course, the business needs of YouTube mean that we’re not likely to see any change from them. So until either we have mainstream content-curating AIs that answer to their human owners rather than to commercial networks (robot butler, anybody?) or else the video fediverse catches on – and I don’t know which of those two are least-likely! – I guess I’ll stick to my algorithm-lite subscription model for YouTube.
But at least now I’ll have a better understanding of why some of the channels I follow are changing the way they produce and market their content…
I love RSS, but it’s a minor niggle for me that if I subscribe to any of the BBC News RSS feeds I invariably get all the sports news, too. Which’d be fine if I gave even the slightest care about the world of sports, but I don’t.
Filters it to remove all entries whose GUID matches a particular regular expression (removing all of those from the “sport” section of the site)
Outputs the resulting feed into a temporary file
Uploads the temporary file to a bucket in Backblaze‘s “B2” repository (think: a better-value competitor S3); the bucket I’m using is publicly-accessible so anybody’s RSS reader can subscribe to the feed
When Firefox 64 arrives in December, support for RSS, the once celebrated content syndication scheme, and its sibling, Atom, will be missing.
“After considering the maintenance, performance and security costs of the feed preview and subscription features in Firefox, we’ve concluded that it is no longer sustainable to keep feed support in the core of the product,” said Gijs Kruitbosch, a software engineer who works on Firefox at Mozilla, in a blog post on Thursday.
Not a great sign, but understandable. Live Bookmarks was never strong enough to be a full-featured RSS reader, and I don’t know about you but I haven’t really made use of bookmarks for a good few years, let alone “live” bookmarks, but the media are likely to see this (as El Reg does, in the article) as another nail in the coffin of one of the best syndication mechanisms the Web ever came up with.
There are two stories here. The first is a story about a vision of the web’s future that never quite came to fruition. The second is a story about how a collaborative effort to improve a popular standard devolved into one of the most contentious forks in the history of open-source software development.
In the late 1990s, in the go-go years between Netscape’s IPO and the Dot-com crash, everyone could see that the web was going to be an even bigger deal than it already was, even if they didn’t know exactly how it was going to get there. One theory was that the web was about to be revolutionized by syndication. The web, originally built to enable a simple transaction between two parties—a client fetching a document from a single host server—would be broken open by new standards that could be used to repackage and redistribute entire websites through a variety of channels. Kevin Werbach, writing for Release 1.0, a newsletter influential among investors in the 1990s, predicted that syndication “would evolve into the core model for the Internet economy, allowing businesses and individuals to retain control over their online personae while enjoying the benefits of massive scale and scope.”1 He invited his readers to imagine a future in which fencing aficionados, rather than going directly to an “online sporting goods site” or “fencing equipment retailer,” could buy a new épée directly through e-commerce widgets embedded into their favorite website about fencing.2 Just like in the television world, where big networks syndicate their shows to smaller local stations, syndication on the web would allow businesses and publications to reach consumers through a multitude of intermediary sites. This would mean, as a corollary, that consumers would gain significant control over where and how they interacted with any given business or publication on the web.
RSS was one of the standards that promised to deliver this syndicated future. To Werbach, RSS was “the leading example of a lightweight syndication protocol.”3 Another contemporaneous article called RSS the first protocol to realize the potential of XML.4 It was going to be a way for both users and content aggregators to create their own customized channels out of everything the web had to offer. And yet, two decades later, RSS appears to be a dying technology, now used chiefly by podcasters and programmers with tech blogs. Moreover, among that latter group, RSS is perhaps used as much for its political symbolism as its actual utility. Though of course some people really do have RSS readers, stubbornly adding an RSS feed to your blog, even in 2018, is a reactionary statement. That little tangerine bubble has become a wistful symbol of defiance against a centralized web increasingly controlled by a handful of corporations, a web that hardly resembles the syndicated web of Werbach’s imagining.
The future once looked so bright for RSS. What happened? Was its downfall inevitable, or was it precipitated by the bitter infighting that thwarted the development of a single RSS standard?
I’ve always been a huge fan of RSS, and I use it for just about everything (I’ll even hack-it-in to services that don’t supply it natively, just to make them fit around my workflow). But even I’ve got to admit that – outside of podcasts – it’s not done well at retaining mainstream appeal, especially after the death of Google Reader. Right now, most people seem content to get their updates from their social media circles, and take a manual approach (ugh) to reading content in the few other places that matter to them. That’s problematic for all kinds of reasons, and I’m perfectly happy to be one of those old fuddy-duddies who likes his web standards open and independent!
As of next week, I’ll have been blogging for 20 years, or about 54% of my life. How did that happen?
The mid-1990s were a very different time for the World Wide Web (yes, we still called it that, and sometimes we even described its use as “surfing”). Going “on the Internet” was a calculated and deliberate action requiring tying up your phone line, minutes of “connecting” along with all of the associated screeching sounds if you hadn’t turned off your modem’s loudspeaker, and you’d typically be paying twice for the experience: both a monthly fee to your ISP for the service and a per-minute charge to your phone company for the call.
It was into this environment that in 1994 I published my first web pages: as far as I know, nothing remains of them now. It wasn’t until 1998 that I signed up an account with UserActive (whose website looks almost the same today as it did then) who offered economical subdomain hosting with shell and CGI support and launched “Castle of the Four Winds”, a set of vanity pages that included my first blog.
Except I didn’t call it a “blog”, of course, because it wasn’t until the following year that Peter Merholz invented the word (he also commemorated 20 years of blogging, this year). I didn’t even call it a “weblog”, because that word was still relatively new and I wasn’t hip enough to be around people who said it, yet. It was self-described as an “online diary”, a name which only served to reinforce the notion that I was writing principally for myself. In fact, it wasn’t until mid-1999 that I discovered that it was being more-widely read than just by me and my circle of friends when I attracted a stalker who travelled across the UK to try to “surprise” me by turning up at places she expected to find me, based on what I’d written online… which was exactly as creepy as it sounds.
While the world began to panic that the coming millennium was going to break all of the computers, I migrated Castle of the Four Winds’ content into AvAngel.com, a joint vanity site venture with my friend Andy. Aside from its additional content (purity tests, funny stuff, risqué e-cards), what we hosted was mostly the same old stuff, and I continued to write snippets about my life in what was now quite-clearly a “blog-like” format, with the most-recent posts at the top and separate pages for content too old for the front page. Looking back, there’s still a certain naivety to these posts which exemplify the youth of the Web. For example, posts routinely referenced my friends by their email addresses, because spam was yet to become a big enough problem that people didn’t much mind if you put their email address on a public webpage somewhere, and because email addresses still carried with them a feeling of anonymity that ceased to be the case when we started using them for important things.
Meanwhile, during my initial months as a student in Aberystwyth, I wrote a series of emails to friends back home entitled “Cool And Interesting Thing Of The Day To Do At The University Of Wales, Aberystwyth”, and put copies of each onto my student webspace; I’ve since recovered these and integrated them into my unified blog.
In 2002 I’d bought the domain name scatmania.org – a reference to my university halls of residence nickname “Scatman Dan”; I genuinely didn’t consider the possibility that the name might be considered scatalogical until later on. As I wanted to continue my blogging at an address that felt like it was solely mine (AvAngel.com having been originally shared with a friend, although in practice over time it became associated only with me), this seemed like a good domain upon which to relaunch. And so, in mid-2003 and powered by a short-lived and ill-fated blogging engine called Flip I did exactly that. WordPress, to which I’d subsequently migrate, hadn’t been invented yet and it wasn’t clear whether its predecessor, b2/cafelog, would survive the troubles its author was experiencing.
From this point on, any web address for any post made to my blog still works to this day, despite multiple technological and infrastructural changes to my blog (and some domain name shenanigans!) in the meantime. I’d come to be a big believer in the mantra that cool URIs don’t change: something that as far as possible I’ve committed to trying to upload in my blogging, my archiving, and my paid work since then. I’m moderately confident that all extant links on the web that point to earlier posts are all under my control so they can (and in most cases have) been fixed already, so I’m pretty close to having all my permalink URIs be “cool”, for now. You might hit a short chain of redirects, but you’ll get to where you’re going.
And everything was fine, until one day in 2004 when it wasn’t. The server hosting scatmania.org died in a very bad way, and because my backup strategy was woefully inadequate, I lost a lot of content. I’ve recovered quite a lot of it and put it back in-place, but some is probably gone forever.
The resurrected site was powered by WordPress, and this was the first time that live database queries had been used to power my blog. Occasionally, these days, when talking to younger, cooler developers, I’m tempted to follow the hip trend of reimplementing my blog as a static site, compiling a stack of host-anywhere HTML files based upon whatever-structure-I-like at the “backend”… but then I remember that I basically did that already for six years and I’m far happier with my web presence today. I’ve nothing against static site systems (I’m quite partial to Middleman, myself, although I’m also fond of Hugo) but they’re not right for this site, right now.
IndieAuth hadn’t been invented yet, but I was quite keen on the ideals of OpenID (I still am, really), and so I implemented what was probably the first viable “install-anywhere” implementation of OpenID for WordPress – you can see part of it functioning in the top-right of the screenshot above, where my (copious, at that time) LiveJournal-using friends were encouraged to sign in to my blog using their LiveJournal identity. Nowadays, the majority of the WordPress plugins I use are ones I’ve written myself: my blog is powered by a CMS that’s more “mine” than not!
Over the course of the first decade of my blogging, a few trends had become apparent in my technical choices. For example:
I’ve preferred an approach of storing the “master” copy of my content on my own site and then (sometimes) syndicating it elsewhere: for example, for the benefit of my friends who during their University years maintained a LiveJournal, for many years I had my blog cross-post to a LiveJournal account (and backfeed copies of comments back to my site).
These were deliberate choices, but they didn’t require much consideration: growing up with a Web far less-sophisticated than today’s (e.g. truly stateless prior to the advent of HTTP cookies) and seeing the chaos caused during the first browser war and the period of stagnation that followed, these choices seemed intuitive.
As you’d expect from a blog covering a period from somebody’s teen years through to their late thirties, there’ve been significant changes in the kinds of content I’ve posted (and the tone with which I’ve done so) over the years, too. If you dip into 2003, for example, you’ll see the results of quiz memes and unqualified daily minutiae alongside actual considered content. Go back further, to early 1999, and it is (at best) meaningless wittering about the day-to-day life of a teenage student. It took until around 2009/2010 before I actually started focussing on writing content that specifically might be enjoyable for others to read (even where that content was frankly silly) and only far more-recently-still that I’ve committed to the “mostly technical stuff, ocassional bits of ‘life’ stuff” focus that I have today.
I say “committed”, but of course I’m fully aware that whatever this blog is now, it’ll doubtless be something somewhat different if I’m still writing it in another two decades…
Once I reached the 2010s I started actually taking the time to think about the design of my blog and its meaning. Conceptually, all of my content is data-driven: database tables full of different “kinds” of content and associated metadata, and that’s pretty-much ideal – it provides a strong separation between content and presentation and makes it possible to make significant design changes with less work than might otherwise be expected. I’ve also always generally favoured a separation of concerns in web development and so I’m not a fan of CSS design methodologies that encourage class names describing how things should appear, like Atomic CSS. Even where it results in a performance hit, I’d far rather use CSS classes to describe what things are or represent. The single biggest problem with this approach, to my mind, is that it violates the DRY principle… but that’s something that your CSS preprocessor’s there to fix for you, isn’t it?
But despite this philosophical outlook on the appropriate gap between content and presentation, it took until about 2010 before I actually attached any real significance to the presentation at all! Until this point, I’d considered myself to have been more of a back-end than a front-end engineer, and felt that the most-important thing was to get the content out there via an appropriate medium. After all, a site without content isn’t a site at all, but a site without design is (or at least should be) still intelligible thanks to browser defaults! Remember, again, that I started web development at a time when stylesheets didn’t exist at all.
My previous implementations of my blog design had used simple designs, often adapted from open-source templates, in an effort to get them deployed as quickly as possible and move on to the next task, but now, I felt, it was time to do a little more.
For a few years, I was producing a new theme once per year. I experimented with different colours, fonts, and layouts, and decided (after some ad-hoc A/B testing) that my audience was better-served by a “front” page than by being dropped directly into my blog archives as had previously been the case. Highlighting the latest few – and especially the very-latest – post and other recent content increased the number of posts that a visitor would be likely to engage with in a single visit. I’ve always presumed that the reason for this is that regular (but non-subscribing) readers are more-likely to be able to work out what they have and haven’t read already from summary text than from trying to decipher an entire post: possibly because my blogging had (has!) become rather verbose.
I went through a bit of a lull in blogging: I’ve joked that I spent more time on my 2010 and 2011 designs than I did on the sum total of the content that was published in between the pair of them (which isn’t true… at least, not quite!). In the month I left Aberystwyth for Oxford, for example, I was doing all kinds of exciting and new things… and yet I only wrote a total of two blog posts.
With RSS waning in popularity – which I can’t understand: RSS is amazing! – I began to crosspost to social networks like Twitter and Google+ (although no longer to Google+, following the news of its imminent demise) to help those readers who prefer to get their content via these media, but because I wasn’t producing much content, it probably didn’t make a significant difference anyway: the chance of a regular reader “missing” something must have been remarkably slim.
Nobody calls me “Scatman Dan” any more, and hadn’t for a long, long time. Given that my name is already awesome and unique all by itself (having changed to be so during the era in which scatmania.org was my primary personal domain name), it felt like I had the opportunity to rebrand.
I moved my blog to a new domain, DanQ.me (which is nice and short, too) and came up with a new collection of colours, fonts, and layout choices that I felt better-reflected my identity… and the fact that my blog was becoming less a place to record the mundane details of my daily life and more a place where I talk about (principally-web) technology, security, and GPS games… and just occasionally about other topics like breadmaking and books. Also, it gave me a chance to get on top of the current trend in web design for big, clean, empty spaces, square corners, and using pictures as the hook to a story.
Particular areas in which I produce content elsewhere but would like to at-least maintain a copy here, and would ideally publish here first and syndicate elsewhere, although I appreciate that this is difficult, are:
Reddit, where I’ve written tens of thousands of words under a variety of accounts, but I don’t really pay attention to the site any more
I left Facebook in 2011 but I still have a backup of what was on my “Wall” at that point, which I could look into reintegrating into my blog
I share a lot of the source code I write via my GitHub account, but I’m painfully aware that this is yet-another-silo that I ought to learn not to depend upon (and it ought to be simple enough to mirror my repos on my own site!)
I’ve got a reasonable number of videos on two YouTube channels which are online by Google’s good graces (and potential for advertising revenue); for a handful of technical reasons they’re a bit of a pain to self-host, but perhaps my blog could act as a secondary source to my own video content
I write business reviews on Google Maps which I should probably look into recovering from the hivemind and hosting here… in fact, I’ve probably written plenty of reviews on other sites, too, like Amazon for example…
On two previous occasions I’ve maintained an online photo gallery; I might someday resurrect the concept, at least for the photos that used to be published on them
I’ve dabbled on a handful of other, often weirder, social networks before like Scuttlebutt (which has a genius concept, by the way) and Ello, and ought to check if there’s anything “original” on there I should reintegrate
Going way, way back, there are a good number of usenet postings I’ve made over the last twenty-something years that I could reclaim, if I can find them…
(if you’re asking why I’m inclined to do all of these things: here’s why)
20 years and around 717,000 words worth of blogging down, it’s interesting to look back and see how things have changed: in my life, on the Web, and in the world in general. I’ve seen many friends’ blogs come and go: they move into a new phase of their life and don’t feel like what they wrote before reflects them today, most often, and so they delete them… which is fine, of course: it’s their content! But for me it’s always felt wrong to do so, for two reasons: firstly, it feels false to do so given that once something’s been put on the Web, it might well be online forever – you can’t put the genie back in the bottle! And secondly: for me, it’s valuable to own everything I wrote before. Even the cringeworthy things I wrote as a teenager who thought they knew everything and the antagonistic stuff I wrote in my early 20s but that I clearly wouldn’t stand by today is part of my history, and hiding that would be a disservice to myself.
The 17-year-old who wrote my first blog posts two decades ago this month fully expected that the things he wrote would be online forever, and I don’t intend to take that away from him. I’m sure that when I write a post in October 2038 looking back on the next two decades, I’ll roll my eyes at myself today, too, but for me: that’s part of the joy of a long-running personal blog. It’s like a diary, but with a sense of accountability. It’s a space on the web that’s “mine” into which I can dump pretty-much whatever I like.
I love it: I’ve been blogging for over half of my life, and if I can get back to you in 2031 and tell you that I’ve by-then been doing so for two-thirds of my life, that would be a win.
@adactio: The ampersand in https://adactio.com/notes/14395 isn’t being properly escaped in your RSS feed, breaking the XML – see e.g. https://validator.w3.org/feed/check.cgi?url=https%3A%2F%2Fadactio.com%2Frss
I recently discovered a minor security vulnerability in mobile webcomic reading app Comic Chameleon, and I thought that it was interesting (and tame) enough to share as a learning example of (a) how to find security vulnerabilities in an app like this, and (b) more importantly, how to write an app like this without this kind of security vulnerability.
The nature of the vulnerability is that, for webcomics pushed directly into the platform by their authors, it’s possible to read comics (long) before they’re published. By way of proof, here’s a copy of the top-right 200 × 120 pixels of episode 54 of the (excellent) Forward Comic, which Imgur will confirm was uploaded on 2 July 2018: over three months ahead of its planned publication date.
How to hack a web-backed app
Just to be clear, I didn’t set out to hack this app, but once I stumbled upon the vulnerability I wanted to make sure that I was able to collect enough information that I’d be able to explain to its author what was wrong and how to fix it. You’d be amazed how many systems I find security holes in almost-completely by accident. In fact, I’d just noticed that the application supported some webcomics that I follow but for which I hadn’t been able to find RSS feeds (and so I was selfdogfooding my own tool, RSSey, to “produce” RSS feeds for my reader by screen-scraping: not the most-elegant solution). But if this app could produce a list of issues of the comic, it must have some way of doing what I was trying to do, and I wanted to know what it was.
The app, I figured, must “phone home” to some website – probably the app’s official website itself – to get the list of comics that it supports and details of where to get their feeds from, so I grabbed a copy of the app and started investigating. Because I figured I was probably looking for a URL, the first thing I did was to download the raw APK file (your favourite search engine can tell you how to do this), decompressed it (APK files are just ZIP files, really) and ran strings on it to search for likely-looking URLs:
I tried visiting a few of the addresses but many of them seemed to be API endpoints that were expecting additional parameters. Probably, I figured, the strings I’d extracted were prefixes to which those parameters were attached. Rather than fuzz for the right parameters, I decided to watch what the app did: I spun up a simulated Android device using the official emulator (I could have used my own on a wireless network that I control, of course, but this was lazier) and ran my favourite packet sniffer to see what the application was requesting.
Now I had full web addresses with parameters. Comparing the parameters that appeared when I clicked different comics revealed that each comic in the “full list” was assigned a numeric ID which was used when requesting issues of that comic (along with an intermediate stage where the year of publication is requested).
Interestingly, a number of comics were listed with the attribute s="no-show" and did not appear in the app: it looked like comics that weren’t yet being made available via the app were already being indexed and collected by its web component, and for some reason were being exposed via the XML API: presumably the developer had never considered that anybody but their app would look at the XML itself, but the thing about the Web is that if you put it on the Web, anybody can see it.
Still: at this point I assumed that I was about to find what I was looking for – some kind of machine-readable source (an RSS feed or something like one) for a webcomic or two. But when I looked at the XML API for one of those webcomics I discovered quite a bit more than I’d bargained on finding:
The first webcomic I looked at included the “official” web addresses and titles of each published comic… but also several not yet published ones. The unpublished ones were marked with s="no-show" to indicate to the app that they weren’t to be shown, but I could now see them. The “official” web addresses didn’t work for me, as I’d expected, but when I tried Comic Chameleon’s versions of the addresses, I found that I could see entire episodes of comics, up to three and a half months ahead of their expected publication date.
Naturally, I compiled all of my findings into an email and contacted the app developer with all of the details they’d need to fix it – in hacker terms, I’m one of the “good guys”! – but I wanted to share this particular example with you because (a) it’s not a very dangerous leak of data (a few webcomics a few weeks early and/or a way to evade a few ads isn’t going to kill anybody) and (b) it’s very illustrative of the kinds of mistakes that app developers are making a lot, these days, and it’s important to understand why so that you’re not among them. On to that in a moment.
Because (I’d like to think) I’m one of the “good guys” in the security world, the first thing I did after the research above was to contact the author of the software. They didn’t seem to have a security.txt file, a disclosure policy, nor a profile on any of the major disclosure management sites, so I sent an email. Were the security issue more-severe, I’d have sent a preliminary email suggesting (and agreeing on a mechanism for) encrypted email, but given the low impact of this particular issue, I just explained the entire issue in the initial email: basically what you’ve read above, plus some tips on fixing the issue and an offer to help out.
I subscribe to the doctrine of responsible disclosure, which – in the event of more-significant vulnerabilities – means that after first contacting the developer of an insecure system and giving them time to fix it, it’s acceptable (in fact: in significant cases, it’s socially-responsible) to publish the details of the vulnerability. In this case, though, I think the whole experience makes an interesting learning example about ways in which you might begin to “black box” test an app for data leaks like this and – below – how to think about software development in a way that limits the risk of such vulnerabilities appearing in the first place.
The author of this software hasn’t given any answer to any of the emails I’ve sent over the last couple of weeks, so I’m assuming that they just plan to leave this particular leak in place. I reached out and contacted the author of Forward Comic, though, which turns out (coincidentally) to be probably the most-severely affected publication on the platform, so that he had the option of taking action before I published this blog post.
Lessons to learn
When developing an “app” (whether for the web or a desktop or mobile platform) that connects to an Internet service to collect data, here are the important things you really, really ought to do:
Don’t publish any data that you don’t want the user to see.
If the data isn’t for everybody, remember to authenticate the user.
And for heaven’s sake use SSL, it’s not the 1990s any more.
That first lesson’s the big one of course: if you don’t want something to be on the public Internet, don’t put it on the public Internet! The feeds I found simply shouldn’t have contained the “secret” information that they did, and the unpublished comics shouldn’t have been online at real web addresses. But aside from (or in addition to) not including these unpublished items in the data feeds, what else might our app developer have considered?
Encryption. There’s no excuse for not using HTTPS these days. This alone wouldn’t have prevented a deliberate effort to read the secret data, but it would help prevent it from happening accidentally (which is a risk right now), e.g. on a proxy server or while debugging something else on the same network link. It also protects the user from exposing their browsing habits (do you want everybody at that coffee shop to know what weird comics you read?) and from having content ‘injected’ (do you want the person at the next table of the coffee shop to be able to choose what you see when you ask for a comic?
Authentication (app). The app could work harder to prove that it’s genuinely the app when it contacts the website. No mechanism for doing this can ever be perfect, because the user hasa access to the app and can theoretically reverse-engineer it to fish the entire authentication strategy out of it, but some approaches are better than others. Sending a password (e.g. over Basic Authentication) is barely better than just using a complex web address, but using a client-side certiciate or an OTP algorithm would (in conjunction with encryption) foil many attackers.
Authentication (user). It’s a very-different model to the one currently used by the app, but requiring users to “sign up” to the service would reduce the risks and provide better mechanisms for tracking/blocking misusers, though the relative anonymity of the Internet doesn’t give this much strength and introduces various additional burdens both technical and legal upon the developer.
Fundamentally, of course, there’s nothing that an app developer can do to perfectly protect the data that is published to that app, because the app runs on a device that the user controls! That’s why the first lesson is the most important: if it shouldn’t be on the public Internet (yet), don’t put it on the public Internet.
Hopefully there’s a lesson for you somewhere too: about how to think about app security so that you don’t make a similar mistake, or about some of the ways in which you might test the security of an application (for example, as part of an internal audit), or, if nothing else, that you should go and read Forward, because it’s pretty cool.
I’m aware that many of my friends use Google Reader to subscribe to their favourite blogs, comics, and so on, so – if you’re among them – I thought I’d better make you aware of some of your alternatives. Google are dropping Google Reader on 1st July (here’s the announcement on the Google Reader blog), so it’s time to move on.
Getting your data out of Google Reader
The good news is that it’s pretty easy to get all of your feeds out of Google Reader, and import them into your new feed reader. You can export everything from your Reader account, but the most important thing in your export is probably the OPML file (called ‘subscriptions.xml’ in your download), which is what your new reader will use to give you the continuity that you’re looking for. OPML files describe a list of subscriptions: for example, this OPML file describes all of the blogs that used to feature on Abnib (when it worked reliably).
Choosing a new RSS reader
You’ve got a few different choices for your new RSS reader. Here are a few of my favourites:
Tiny Tiny RSS – if you’re happy to host your own web-based RSS reader, and you’re enough of a geek to enjoy tweaking it the way that you want, then this tool is simply awesome. Install it on your server, configure it the way you want, and then access it via the web or the Android app. I’ve been using Tiny Tiny RSS for a few years, and I’ve made a few minor tweaks to add URL-shortening and sharing features: that’s what powers the “Dan is reading…” (subscribe) list in the sidebar of my blog. It’s also one of the few web-based RSS readers that offers feed authentication options, which is incredibly useful if you follow “friends only” blogs on LiveJournal or similar platforms.
NewsBlur – this is the closest thing you’ll find to a like-for-like replacement for Google Reader, and it’s actually really good: a slick, simple interface, apps for all of the major mobile platforms, and a damn smart tagging system. They’re a little swamped with Reader refugees right now, but you can work around the traffic by signing up and logging in at their alternative web address of dev.newsblur.com.
Feedly – or, if you’re happy to step away from the centralised, web-based reader solutions, here’s a great option: available as a browser plugin or a mobile app, it has the fringe benefit that you can use it to read your pre-cached subscriptions while you’re away from an Internet connection, if that’s a concern to you.
Blogtrottr – If you only subscribe to a handful of feeds, you might want to look at Blogtrottr: it’s an RSS-to-email service, so it delivers your favourite blogs right to your Inbox, which is great for those of you that use your Inbox as a to-do list (and pretty damn good if you set up some filters to put your RSS feeds into a suitable tag or folder, so that you can read them at your leisure).
Finally, don’t forget that if you’re using Opera as your primary web browser, that it has a great RSS reader baked right into it! As an Opera fan, I couldn’t help but plug that.
Or if you only care about my posts…
Of course, if mine’s the only blog you’re concerned with, you might like to follow me on Google+ on on Twitter: all of my blog posts get publicly pushed to both of those social networks as soon as they’re published, so if you’re a social network fiend, that’s probably the easiest answer for you!
Long ago, I used desktop RSS readers. I was only subscribed to my friends’ blogs back then anyway, so it didn’t matter that I could only read them from my home computer. But then RSS feeds started appearing on news sites, and tech blogs started appearing about things related to my work. And smartphones took over the world, and I wanted to be able to synchronise my reading list everywhere. There were a few different services that competed for my attention, but Google Reader was the best. It was simple, and fast, and easy, and it Just Worked in that way that Google products often do.
I put up with the occasional changes to the user interface. Hey, it’s a beta, and it’s still the best thing out there. Hey, it’s free, what can you say? I put up with the fact that from time to time, they changed the site in ways that were sometimes quite hostile to Opera, my web browser of choice. I put up with the fact that it had difficulty with unsigned HTTPS certificates (it’s fine now) and that it didn’t provide a mechanism to authenticate against services like LiveJournal (it still doesn’t). I even worked around the latter, releasing my own tool and updatingit a few times until LiveJournal blocked it (twice) and I had to instead recommend that people switched to rival service FreeMyFeed.
I know that they’re ever-so-proud of the Google+ user interface, but rebranding all of the other services to look like it just isn’t working. It’s great for Google+, not-bad for Search, bad for GMail (but at least you can turn it off!), and fucking awful for Reader. I like distinct borders between my items. I don’t like big white spaces and buttons that eat up half the screen.
The sharing interface is completely broken. After a little while, I worked out that I still can share things with other people, but I can’t any longer see what other people are sharing without clicking over to Google+. This sucks a lot. No longer can I keep track of which shared items I have and haven’t read, and no longer can I read the interesting RSS feeds my friends have shared in the same place as I read (and share) my own.
So that’s the last straw. Today, I switched everything over to Tiny Tiny RSS.
Originally I felt that I was being pushed “away” from Google Reader, but the more I’ve played with it, the more I’ve realised that I’m being drawn “towards” Tiny Tiny, and wishing that I’d made the switch further. The things that have really appealed are:
It’s self-hosted. Tiny Tiny RSS is a free, open-source solution that you host for yourself (or I suppose you can use a shared host; there are a few around). I know that this is a downside to most people, but to me, it’s a serious selling point: now, I’m in control of what updates are applied, when, and if I don’t like the functionality of a part of the system, I can change it – I’m in control.
It’s simple and clean. It’s got a great user interface, in an understated and simplistic way. It’s somewhat reminiscent of desktop email clients, replacing the “stream of feeds” idea with a two- or three-pane view (your choice). That sounds like it’d be a downside, until you realise…
…with great keyboard controls. Tiny Tiny RSS is great for keyboard lovers like me. The default key-commands (which are of course customisable) are based on Emacs, so if that’s your background then it’s easy to be right at home in minutes and browsing feeds faster than ever.
Plus: it’s got a stack of nice features. I’m loving the “fresh” filter, that helps me differentiate between the stuff I’ve “saved for later” reading and the stuff that’s actually new and interesting. I’m also impressed by the integrated authentication, which removes my dependency on FreeMyFeed-like services and (because it’s self-hosted) lets me keep my credentials securely under my own control. It supports authentication using SSL certificates, a beautiful and underused technology. It allows you to customise the update frequency of your feeds, so I can stalk by friends’ blogs at lightning-quick rates and stall my weekly update subscriptions so they don’t get checked so frequently. And unlike Google Reader, it actually tells me when feeds break, so I don’t just “get no updates” for a while before I think to check the site (and it’ll even let me change the URLs when this happens, rather than unsubscribing and resubscribing).
Put simply: all of my major gripes with Google Reader over the last few years have been answered all at once in this wonderful little program. If people are interested in how I set up Tiny Tiny RSS and and made the switchover as simple and painless as possible, I’ll write a blog post to talk you through it.
I’ve had just one problem: it’s not quite so tolerant of badly-formed XML as Google Reader. There’s one feed in my list which, it turns out, has (very) invalid XML in it’s feed, that Google Reader managed to ignore and breeze over, but Tiny Tiny RSS chokes on. I’ve contacted the site owner to try to get it fixed, but if they don’t, I might have to hack some code to try to make a workaround. Not ideal, and not something that everybody would necessarily want to deal with, so be aware!
If, like me, you’ve become dissatisfied by Google Reader this week, you might also like to look at rssLounge, the other worthy candidate I considered as a replacement. I had a quick play but didn’t find it quite as suitable for my needs, but it might be to your taste: take a look.
Oh, and one more thing: if you used to “follow” me on Google Reader (or even if you didn’t) and you want to continue to subscribe to the stuff I “share”, then you’ll want to subscribe to this new RSS feed of “my shared stuff”, instead: it can also be found syndicated in the right-hand column of my blog.
Update: this guy’s made a bookmarklet that makes the new Google Reader theme slightly less hideous. Doesn’t fix the other problems, though, but if you’re not quite pissed-off enough to jump ship, it might make your experience more-bearable.
Update 2: others in the blogosphere are saying good things about Reader rival NewsBlur, which recently turned one year old. If you’re looking for a hosted service, rather than something “roll-your-own” like Tiny Tiny RSS, perhaps it’s the tool for you?
It’s been unmaintained for several years now, just ticking along under its own steam and miraculously not falling over. Nowadays, everybody seems to understand (or ought to understand) RSS and can operate their own aggregator, so there doesn’t really seem to be any point in carrying on running the service. So when the domain name comes up for renewal next month, I shan’t be renewing it. If somebody else wants to do so, I’ll happily tell them the settings that they need, but it’ll be them that’s paying for it, not me.
“But I still use Abnib!” I hear you cry. Well, here’s what you can do about it:
Option 1 (the simple-but-good option): switch to something better, easily
RSS aggregators nowadays are (usually) free and (generally) easy to use. If you don’t have a clue, here’s the Really Simple Guide to getting started:
Download the Abnib OPML file (https://danq.me/abnib.opml) and save it to your computer. This file describes in a computer-readable format who all the Abnibbers are.
Go to Google Reader and log in with your Google Account, if you haven’t already.
Click Settings, then Reader Settings.
Click Browse… and select the file you downloaded in step #1.
Ta-da! You can now continue to read your favourite Abnib blogs through Google Reader. You’ve also got more features, like being able to not-subscribe to particular blogs, or (on some blogs) to subscribe to comments or other resources.
You don’t have to use Google Reader, of course: there are plenty of good RSS readers out there. And most of the good ones are capable of importing that OPML file, so you can quickly get up-and-running with all of your favourite Abnib blogs, right off the bat.
Option 2: switch to something better, manually
As above, but instead of downloading and uploading an OPML file, manually re-subscribe to each blog. This takes a lot longer, but makes it easy to choose not to subscribe to particular blogs. It also gives you the option to use a third-party service like FreeMyFeed to allow you to subscribe to LiveJournal “friends only” posts (which you were never able to do with Abnib), for example.
Option 3: continue to use Abnib (wait, what?)
Okay, so the domain name is expiring, but technically you’ll still be able to use Abnib for a while, at least, so long as you use the address http://abnib.appspot.com/. That won’t last forever, and it will be completely unmaintained, so when it breaks, it’s broken for good. It also won’t be updated with new blog addresses, so if somebody changes where their blog is hosted, you’ll never get the new one.
It’s been fun, Abnib, but you’ve served your purpose. Now it’s time for you to go the way of the Troma Night website and the RockMonkey wiki, and die a peaceful little death.
As those of you who use my Feed Proxy service to get your LiveJournal friends’ blogs (including friends-only posts) into Google Reader or a similar service know, the service hasn’t been working for the last few days.
I made all of the changes that LiveJournal’s bot policy required of me. I e-mailed them; no response. I e-mailed again; no response. I e-mailed to ask were they receiving my e-mails – yes, they were, but the person responsible for unblocking the bot “wasn’t in” at the moment.
I e-mailed again: yet again, no response.
I’ve been finding it harder to keep up with my LiveJournal friends because of this, and I know that a lot of you are pissed off, too. But it looks like LiveJournal aren’t going to be cooperative any time soon. So it’s time to switch services.
I’m moving my authenticated feeds over to FreeMyFeed. FreeMyFeed provides many of the same services at Feed Proxy did, although it also works for a wider variety of web applications (for example, you can also use it for Twitter, if you’re one of the dozen or so people who still uses Twitter).
If you’re already a Feed Proxy user:
Within the next few hours, each LiveJournal friend you’re subscribed to through Feed Proxy will produce a post explaining how you can convert their feed over to FreeMyFeed with about two clicks. I suggest that you mark that post as “read” and then click the link, and the rest of the work is mostly done for you. You’ll see some “read” posts all over again (boo!) and FreeMyFeed doesn’t convert LJ “moods” and “comment counts” for you automatically, but apart from that it should serve you well.
If you’re not using Feed Proxy or FreeMyFeed yet, or you’ve deleted your Feed Proxy-powered feeds from Google Reader:
Google Reader’s a great way to keep up-to-date with all your friends’ blogs – as well as with news, comics, and more – both in and out of LiveJournal. To subscribe to a LiveJournal blog in Google Reader or a similar service, friends-only posts and all, go to the FreeMyFeed website and enter into the boxes:
(replace username with the LiveJournal username of the person whose LiveJournal you’re subscribing to)
user: your LiveJournal username
pass: your LiveJournal password
Thanks for all of the support you LiveJournalers have given me over the years, both for Feed Proxy and for it’s predecessor, LiveJournal-To-Google Reader. It’s been fun.
BREAKING NEWS: On 1st October 2009, LiveJournal blocked the Feed Proxy bot. I don’t know when they’ll unblock it and it’ll come back up: see the latest here.
I’ve fixed a handful of bugs in the popular Feed Proxy tool (which, as you probably know, allows you to read LiveJournal and Dreamwidth “friends-only” posts in Google Reader or your favourite RSS reader tool, even where that RSS reader doesn’t support the necessary authentication systems to normally be able to pick up these posts). These include:
A number of users identified a problem relating to some mixed-case LiveJournal usernames having to be entered into Feed Proxy in lowercase to work. These usernames are now automatically corrected to lowercase as necessary.
Feed Proxy now automatically detects those passwords whose characters may cause problems with the cURL library, which is used to fetch the feeds from LiveJournal/Dreamwidth, and produces a warning message, rather than the previous unfriendly error message. A better solution will be investigated in the future.
Downloading an OPML package of some or all of your feeds now works correctly in Google Chrome. I didn’t know so many of you used it!
The FAQ has been expanded with a few more common questions, including the (very) frequently-asked question about multiple source accounts of the same type (which will be properly supported at some future point).
It’s now possible to read the FAQ without having an account or logging in. Sorry I forgot that – whoops!
I’ve finally gotten around to responding to all of the e-mails I’ve received so far from users: sorry about the delay, folks, but a lot of you had questions to ask!
To those that have asked about open-sourcing it: yes, I still fully intend to open-source the project (as I did with it’s predecessor, LJ-To-Google Reader) so you can run it on your own server if you like, but only once it’s reached a point of stability. Follow this RSS feed if you want to hear about updates to Feed Proxy, including when the source code becomes available.
BREAKING NEWS: On 1st October 2009, LiveJournal blocked the Feed Proxy bot. I don’t know when they’ll unblock it and it’ll come back up: see the latest here.
The LiveJournal-To-Google Reader service is back up again, rebranded as Feed Proxy. It’s pretty much bare-bones right now, but I’ve got a meaningful framework that I can add to in the future, and I’ll try to keep it up-to-date by adding all of the features that everybody requested back when it was LiveJournal-To-Google Reader (I’ve already added a few, as described below).
My sincere apologies to everybody affected by the day and a half of downtime that was involved in this change-over.
Here’s what you need to know:
If you already use LiveJournal-To-Google Reader
All of your feed links have now broken. Sorry, but this was necessary! You’ll probably want to delete your subscriptions to all of the old links, because they won’t work any more. You’ll also need to set yourself up with a new account on the new service, Feed Proxy. Choose yourself a username and password, log in, and associate your account with your LiveJournal account. Then you can click “show feeds” and start subscribing to your LiveJournal friends’ feeds using Google Reader.
Where possible, shows how many comments, link to comments, poster’s “mood”, and security status (public or private [i.e. “friends only”]) of each post.
OPML export, so you can easily get all of your feeds back into Google Reader (or whatever RSS reader you prefer) again.
Links that don’t change for no reason
Better support for communities
If you don’t have a clue what this is all about…
Feed Proxy is a tool that I originally wrote because I didn’t like having to go to my LiveJournal “friends page” to catch up on all the “friends-only” posts being made by people I knew. I already used Google Reader for every other blog in the world; why should I have to go to another site? I also didn’t like that I couldn’t “group” my friends on my friends page, so I could see which ones were related to my different interests and just focus on those at once. I also wanted to be able to easily mark which posts I’d already read. Google Reader already does all of this.
But if you subscribe to a LiveJournal account using Google Reader, you don’t get the “friends only” posts. It’s just not possible.
Feed Proxy makes it possible. And now, it adds a lot of other nice features, too.
If you use LiveJournal (or your friends use LiveJournal) and you’d rather have the slicker interface of Google Reader at your disposal, give it a go.