Identifying Post Kinds in WordPress RSS Feeds

I use the Post Kinds plugin to streamline the management of the different types of posts I make on my blog, based on the IndieWeb post types list: articles, like this one, are “conventional” blog posts, but I also publish notes (which are analogous to “tweets”), reposts (“shares” of things I’ve found online, sometimes with commentary), checkins (mostly chronicling my geocaching/geohashing), and others: I’ve extended Post Kinds to facilitate comics and reviews, for example.

But for people who subscribe (either directly or indirectly) to everything I post, I imagine it must be a little frustrating to sometimes be unable to identify the type of a post before clicking-through. So I’ve added the following code, which I’m sharing here and on GitHub in case it’s of any use to anybody else, to my theme’s functions.php:

// Make titles in RSS feed be prefixed by the Kind of the post.
function add_kind_to_rss_post_title(){
	$kinds = wp_get_post_terms( get_the_ID(), 'kind' );
	if( ! isset( $kinds ) || empty( $kinds ) ) return get_the_title(); // sanity-check.
	$kind = $kinds[0]->name;
	$title = get_the_title();
	return trim( "[{$kind}] {$title}" );
}
add_filter( 'the_title_rss', 'add_kind_to_rss_post_title', 4 ); // priority 4 to ensure it happens BEFORE default escaping filters.

This decorates the titles of my posts, but only in my feeds, so it’s easier for people to tell at-a-glance what’s going on:

Rendered RSS feed showing Post Kinds prefixes

Down the line I might expand this so that it doesn’t show if the subscriber is, for example, asking only for articles (e.g. via this feed); I’m coming up with a huge list of things I’d like to do at IndieWebCamp London! But for now, this feels like a nice simple improvement to a plugin I love that helps it to fit my specific needs.

Subscribing to The Far Side (by RSS!)

Prior to his retirement in 1995 I managed to amass a collection of almost all of Gary Larson’s The Far Side books as well as a couple of calendars and other thingamabobs. After 24 years of silence I didn’t expect to hear anything more from him and so I was as surprised as most of the Internet was when he re-emerged last year with a brand new on his first ever website. Woah.

The Far Side by Gary Larson

Larson’s hinted that there might be new and original content there someday, but for the time being I’m just loving that I can read The Far Side comments (legitimately) via the Web for the first time! The site’s currently publishing a “Daily Dose” of classic strips, which is awesome. But… I don’t want to have to go to a website to get comics every day. Nor do I want to have to remember which days I’ve caught-up with, yet. That’s a job for computers, right? And it’s a solved problem: RSS (which has been around for almost as long as Larson hasn’t) and similar technologies allow a website to publicise that it’s got updates available in a way that people can “subscribe” to, so I should just use that, right?

Except… the new The Far Side website doesn’t have an RSS feed. Boo! Luckily, I’m not above automating the creation of feeds for websites that I wish had them, even (or perhaps especially) where that involves a little reverse-engineering of online comics. So with a little thanks to my RSS middleware RSSey… I can now read daily The Far Side comics in the way that’s most-convenient to me: right alongside my other subscriptions in my feed reader.

The Far Side comics in Dan's RSS reader.
How screen scrapers are made.

I’m afraid I’m not going to publicly*-share a ready-to-go feed URL for this one, unlike my BBC News Without The Sport feed, because a necessary side-effect of the way it works is that the ads are removed. And if I were to republish a feed containing The Far Side website cartoons but with the ads stripped I’d be guilty of, like, all the ethical and legal faults that Larson was trying to mitigate by putting his new website up in the first place! I love The Far Side and I certainly don’t want to violate its copyright!

But – at least until Larson’s web developer puts up a proper feed (with or without ads) – for those of us who like our comics delivered fresh to us every morning, here’s the source code (as an RSSey feed definition) you could use to run your own personal-use-only “give me The Far Side Daily Dose as an RSS feed” middleware.

Thanks for deciding to join us on the Internet, Gary. I hear it’s going to be a big thing, someday!

* Friends are welcome to contact me off-blog for an address if they like, if they promise to be nice and ethical about it.

Subscribe by Email

For the last few months, I’ve been running an alpha test of an email-based subscription to DanQ.me with a handful of handpicked testers. Now, I’d like to open it up to a slightly larger beta test group. If you’d like to get the latest from this site directly in your inbox, just provide your email address below:

Subscribe by email!

Who’s this for?

Some people prefer to use their email inbox to subscribe to things. If that’s you: great!

What will I receive?

You’ll get a “daily digest”, no more than once per day, summarising everything I’ve published within the last 24 hours. It usually works: occasionally but not often it misses things. You can unsubscribe with one click at any time.

How else can I subscribe?

You can still subscribe in a variety of other ways. Personally, I recommend using a feed reader which lets you choose exactly which kinds of content you’re interested in, but there are plenty of options including Facebook and Twitter (for those of such an inclination).

Didn’t you do this before?

Yes, I ran a “subscribe by email” system back in 2007 but didn’t maintain it. Things might be better this time around. Maybe.

LABS Comic RSS Archive

Yesterday I recommended that you go read Aaron Uglum‘s webcomic LABS which had just completed its final strip. I’m a big fan of “completed” webcomics – they feel binge-able in the same way as a complete Netflix series does! – but Spencer quickly pointed out that it’s annoying for we enlightened modern RSS users who hook RSS up to everything to have to binge completed comics in a different way to reading ongoing ones: what he wanted was an RSS feed covering the entire history of LABS.

LABS comic adapted to show The Robot literally "feeding" RSS
With apologies to Aaron Uglum who I hope won’t mind me adapting his comic in this way.

So naturally (after the intense heatwave woke me early this morning anyway) I made one: complete RSS feed of LABS. And, of course, I open-sourced the code I used to generate it so that others can jumpstart their projects to make static RSS feeds from completed webcomics, too.

Even if you’re not going to read it via this medium, you should go read LABS.

DanQ.me Ecosystem

Diagram illustrating the relationships between DanQ.me and the satellite services with which it interacts.

With IndieWebCamp Oxford 2019 scheduled to take place during the Summer of Hacks, I drew a diagram (click to embiggen) of the current ecosystem that powers and propogates the content on DanQ.me. It’s mostly for my own benefit – to be able to get a big-picture view of the ways my website talks to the world and plan for what improvements I might be able to make in the future… but it also works as a vehicle to explain what my personal corner of the IndieWeb does and how it does it. Here’s a summary:

DanQ.me

Since fifteen years ago today, DanQ.me has been powered by a self-hosted WordPress installation. I know that WordPress isn’t “hip” on the IndieWeb this week and that if you’re not on the JAMstack you’re yesterday’s news, but at 15 years and counting my love affair with WordPress has lasted longer than any romantic relationship I’ve ever had with another human being, so I’m sticking with it. What’s cool in Web technologies comes and goes, but what’s important is solid, dependable tools that do what you need them to, and between WordPress, half a dozen off-the-shelf plugins and about a dozen homemade ones I’ve got everything I need right here.

Castle of the Four Winds, launched in 1998, with a then-fashionable black background.
I’d been “blogging” – not that we called it that, yet – since late 1998, but my original collection of content-mangling Perl scripts wasn’t all that. More history…

I write articles (long posts like this) and notes (short, “tweet-like” updates) directly into the site, and just occasionally other kinds of content. But for the most part, different kinds of content come from different parts of the ecosystem, as described below.

RSS reader

DanQ.me sits at the centre of the diagram, but it’s worth remembering that the diagram is deliberately incomplete: it only contains information flows directly relevant to my blog (and it doesn’t even contain all of those!). The last time I tried to draw a diagram like this that described my online life in general, then my RSS reader found its way to the centre. Which figures: my RSS reader is usually the first and often the last place I visit on the Internet, and I’ve worked hard to funnel everything through it.

FreshRSS with 129 unread items
129 unread items is a reasonable-sized queue: I try to process to “RSS zero”, but there are invariably things I want to return to on a second-pass and I’ve not yet reimplemented the “snooze button” I added to my previous RSS reader.

Right now I’m using FreshRSS – plus a handful of plugins, including some homemade ones – as my RSS reader: I switched from Tiny Tiny RSS about a year ago to take advantage of FreshRSS’s excellent responsive themes, among other features. Because some websites don’t have RSS feeds, even where they ought to, I use my own tool RSSey to retroactively “fix” people’s websites for them, dynamically adding feeds for my consumption. It’s also a nice reminder that open source and remixability were cornerstones of the original Web. My RSS reader collates information from a variety of sources and additionally gives me a one-click mechanism to push content I enjoy to my blog as a repost.

QTube

QTube is my video hosting platform; it’s a PeerTube node. If you haven’t seen it, that’s fine: most content on it is consumed indirectly either through my YouTube channel or directly on my blog as posts of the “video” kind. Also, I don’t actually vlog very often. When I do publish videos onto QTube, their republication onto YouTube or DanQ.me is optional: sometimes I plan to use a video inside an article post, for example, and so don’t need to republish it by itself.

QTube homepage showing Dan's videos
I recently changed the blue of my “brand colours” to improve accessibility, but this hasn’t carried over to QTube yet.

I’m gradually exporting or re-uploading my backlog of YouTube videos from my current and previous channels to QTube in an effort to recentralise and regain control over their hosting, but I’m in no real hurry. PeerTube certainly makes it easy, though!

Link Shortener

I operate a private link shortener which I mostly use for the expected purpose: to make links shorter and so easier to read out and memorise or else to make them take up less space in a chat window. But soon after I set it up, many years ago, I realised that it could also act as a mechanism to push content to my RSS reader to “read later”. And by the time I’m using it for that, I figured, I might as well also be using it to repost content to my blog from sources that aren’t things my RSS reader subscribes to. This leads to a process that’s perhaps unnecessarily complex: if I want to share a link with you as a repost, I’ll push it into my link shortener and mark it as going “to me”, then I’ll tell my RSS reader to push it to my blog and there it’ll be published to the world! But it works and it’s fast enough: I’m not in the habit of reposting things that are time-critical anyway.

Checkins

Dan geohashing
You know your sport is fringe when you need to reference another fringe sport to describe it. “Geohashing? It’s… a little like geocaching, but…”

I’ve been involved in brainstorming ways in which the act of finding (or failing to find, etc.) a geocache or reaching (or failing to reach) a geohashpoint could best be represented as a “checkin“, and last year I open-sourced my plugin for pulling logs (with as much automation as is permitted by the terms of service of some of the silos involved) from geocaching websites and posting them to WordPress blogs: effectively PESOS-for-geocaching. I’d prefer to be publishing on my own blog in the first instance, but syndicating my adventures from various silos into my blog is “good enough”.

Syndication

New notes get pushed out to my Twitter account, for the benefit of my Twitter-using friends. Articles get advertised on Facebook, Twitter and LiveJournal (yes, really) in teaser form, for the benefit of friends who prefer to get notifications via those platforms. Facebook have been fucking around with their APIs and terms of service lately and this is now less-automatic than it used to be, which is a bit of an annoyance. My RSS feeds carry copies of content out to people who prefer to subscribe via that medium, and I’ve also been using this to power an experimental MailChimp “daily digest” mailing list of “what Dan’s been up to” to a small number of friends, right in their email inboxes: I’ve not made it available to everybody yet, but if you’re happy to help test it then give me a shout and I’ll hook you up.

DanQ.me email newsletter
Most days don’t see an email sent or see an email with only one item, but some days – like this one – are busier. I still need to update the brand colours here, too!

Finally, a couple of IFTTT recipes push my articles and my reposts to Reddit communities: I don’t really use Reddit myself, any more, but I’ve got friends in a few places there who prefer to keep up-to-date with what I’m up to via that medium. For historical reasons, my reposts to Reddit don’t go directly via my blog’s RSS feeds but “shortcut” directly from my RSS reader: this is suboptimal because I don’t get to tweak post titles for Reddit but it’s not a big deal.

IFTTT recipe pushing articles to Reddit
What IFTTT does isn’t magic, but it’s often indistinguishable from it.

I used to syndicate content to Google+ (before it joined the long list of Things Google Have Killed) and to Ello (but it never got much traction there). I’ve probably historically syndicated to other places too: I’ve certainly manually-republished content to other blogs, from time to time, too.

Backfeed

I use Ryan Barrett‘s excellent Brid.gy to convert Twitter replies and likes back into Webmentions for publication as comments on my blog. This used to work for Facebook, too, but again: Facebook fucked it over. I’ve occasionally manually backfed significant Facebook comments, but it’s not ideal: I might like to look at using similar technologies to RSSey to subvert Facebook’s limitations.

Brid.gy's management of my Twitter backfeed
I’ve never had a need for Brid.gy’s “publishing” (i.e. POSSE) features, but its backfeed features “just work”, and it’s awesome.

Reintegration

I’ve routinely retroactively reintegrated content that I’ve produced elsewhere on the Web. This includes my previous blogs (which is why you can browse my archives, right here on this site, all the way back to some of the cringeworthy angsty-teenager posts I made in the 1990s) but also some Reddit posts, some replies originally posted directly to other people’s blogs, all my old del.icio.us bookmarks, long-form forum posts, posts I made to mailing lists and newsgroups, and more. As a result, there’s a lot of backdated content on this site, nowadays: almost a million words, and significantly more than the 600,000 or so I counted a few years ago, before my biggest push for reintegration!

Cumulative wordcount per day, by content type.
Cumulative wordcount per day, by content type. The lion’s share has always been articles, but reposts are creeping up as I’ve been writing more about the things I reshare, lately. It’d be interesting to graph the differentiation of this chart to see the periods of my life that I was writing the most: I have a hypothesis, and centralising my own content under my control makes it easier

Why do I do this? Because I really, really like owning my identity online! I’ve tried the “big” silo alternatives like Facebook, Twitter, Medium, Instagram etc., and they’ve eventually always lead to disappointment, either because they get shut down or otherwise made-unusable, because of inappropriately-applied “real names” policies, because they give too much power to untrustworthy companies, because they impose arbitrary limitations on my content, because they manipulate output promotion (and exacerbate filter bubbles), or because they make the walls of their walled gardens taller and stop you integrating with them how you used to.

A handful of silos have shown themselves to be more-trustworthy than the average – in particular, eschewing techniques that promote “lock-in” – and I’d love to tell you more about them and what I think you should look for in a silo, another time. But for now: suffice to say that just like I don’t use YouTube like most people do, I elect not to use Facebook or Twitter in the conventional ways either. And it’s awesome, thanks.

There are plenty of reasons that people choose to take control of their own Web presence – and everybody who puts content online ought to consider it – but I imagine that few individuals have such a complicated publishing ecosystem as I do! Now you’ve got a picture of how my digital content production workflow works, and perhaps start owning your online identity, too.

I Don’t Watch YouTube (Like You Watch YouTube)

I was watching a recent YouTube video by Derek Muller (Veritasium), My Video Went Viral. Here’s Why, and I came to a realisation: I don’t watch YouTube like most people – probably including you! – watch YouTube. And as a result, my perspective on what YouTube is and does is fundamentally biased from the way that others probably think about it.

The Veritasium video My Video Went Viral. Here’s Why is really good and you should definitely watch at least 7 minutes of it in order to influence the algorithm.

The magic moment came for me when his video explained that the “subscribe” button doesn’t do what I’d assumed it does. I’m probably not alone in my assumptions: I’ll bet that people who use the “subscribe” button as YouTube intend don’t all realise that it works the way that it does.

Like many, I’d assumed the “subscribe” buttons says “I want to know about everything this creator publishes”. But that’s not what actually happens. YouTube wrangles your subscription list and (especially) your recommendations based on their own metrics using an opaque algorithm. I knew, of course, that they used such a thing to manage the list of recommended next-watches… but I didn’t realise how big an influence it was having on the way that most YouTube users choose what they’ll watch!

Veritasium explains how the YouTube subscriber model has changed over time
“YouTube started doing some experiments… where they would change what was recommended to your subscribers. No longer was a subscription like ‘I want to see every video by this person’; it was more of a suggestion…”

YouTube’s metrics for “what to show to you” is, of course, biased by your subscriptions. But it’s also biased by what’s “trending” (which in turn is based on watch time and click-through-rate), what people-who-watch-the-things-you-watch watch, subscription commonalities, regional trends, what your contacts are interested in, and… who knows what else! AAA YouTubers try to “game” it, but the goalposts are moving. And the struggle to stay on-top, especially after a fluke viral hit, leads to the application of increasingly desperate and clickbaity measures.

This is a battle to which I’ve been mostly oblivious, until now, because I don’t watch YouTube like you watch YouTube.

Veritasium explains the YouTube "frontpage" algorithm.
“You could be a little bit disappointed in the way the game is working right now… I challenge you to think of a better way.”
Hold my beer.

Tom Scott produced an underappreciated sci-fi short last year describing a theoretical AI which, in 2028, caused problems as a result of its single-minded focus. What we’re seeing in YouTube right now is a simpler example, but illustrates the problem well: optimising YouTube’s algorithm for any factor or combination of factors other than a user’s stated preference (subscriptions) will necessarily result in the promotion of videos to a user other than, and at the expense of, the ones by creators that they’ve subscribed to. And there are so many things that YouTube could use as influencing factors. Off the top of my head, there’s:

  • Number of views
  • Number of likes
  • Ratio of likes to dislikes
  • Number of tracked shares
  • Number of saves
  • Length of view
  • Click-through rate on advertisements
  • Recency
  • Subscriber count
  • Subscriber engagement
  • Popularity amongst your friends
  • Popularity amongst your demographic
  • Click-through-ratio
  • Etc., etc., etc.
Veritasium videos in my RSS reader
A Veritasium video I haven’t watched yet? Thanks, RSS reader.

But this is all alien to me. Why? Well: here’s how I use YouTube:

  1. Subscription: I subscribe to creators via RSS. My RSS reader doesn’t implement YouTube’s algorithm, of course, so it just gives me exactly what I subscribe to – no more, no less.It’s not perfect (for example, it pisses me off every time it tells me about an upcoming “premiere”, a YouTube feature I don’t care about even a little), but apart from that it’s great! If I’m on-the-move and can’t watch something as long as involved TheraminTrees‘ latest deep-thinker, my RSS reader remembers so I can watch it later at my convenience. I can have National Geographic‘s videos “expire” if I don’t watch them within a week but Dr. Doe‘s wait for me forever. And I can implement my own filters if a feed isn’t showing exactly what I’m looking for (like I did to strip the sport section from BBC News’ RSS feed). I’m in control.
  2. Discovery: I don’t rely on YouTube’s algorithm to find me new content. I don’t mind being a day or two behind on what’s trending: I’m not sure I care at all? I’m far more-interested in recommendations curated by a human. If I discover and subscribe to a channel on YouTube, it was usually (a) mentioned by another YouTuber or (b)  linked from a blog or community blog. I’m receiving recommendations from people I already respect, and they have a way higher hit-rate than YouTube’s recommendations.(I also sometimes discover content because it’s exactly what I searched for, e.g. I’m looking for that tutorial on how to install a fiddly damn kiddy seat into the car, but this is unlikely to result in a subscription.)
Robot with a computer.
I for one welcome our content-recommending robot overlords. (So long as their biases can be configured by their users, not the networks that create them…)

This isn’t yet-another-argument that you should use RSS because it’s awesome. (Okay, so it is. RSS isn’t dead, and its killer feature is that its users get to choose how it works. But there’s more I want to say.)

What I wanted to share was this reminder, for me, that the way you use a technology can totally blind you to the way others use it. I had no idea that many YouTube creators and some YouTube subscribers felt increasingly like they were fighting YouTube’s algorithms, whose goals are different from their own, to get what they want. Now I can see it everywhere! Why do schmoyoho always encourage me to press the notification bell and not just the subscribe button? Because for a typical YouTube user, that’s the only way that they can be sure that their latest content will be seen!

Veritasium encourages us to "ring that bell".
“There is one way… to short-circuit this effect… ring that bell.”
If I may channel Yoda for a moment: No… there is another!

Of course, the business needs of YouTube mean that we’re not likely to see any change from them. So until either we have mainstream content-curating AIs that answer to their human owners rather than to commercial networks (robot butler, anybody?) or else the video fediverse catches on – and I don’t know which of those two are least-likely! – I guess I’ll stick to my algorithm-lite subscription model for YouTube.

But at least now I’ll have a better understanding of why some of the channels I follow are changing the way they produce and market their content…

BBC News… without the sport

I love RSS, but it’s a minor niggle for me that if I subscribe to any of the BBC News RSS feeds I invariably get all the sports news, too. Which’d be fine if I gave even the slightest care about the world of sports, but I don’t.

Sports on the BBC News site
Down with Things Like This!

It only takes a couple of seconds to skim past the sports stories that clog up my feed reader, but because I like to scratch my own itches, I came up with a solution. It’s more-heavyweight perhaps than it needs to be, but it does the job. If you’re just looking for a BBC News (UK) feed but with sports filtered out you’re welcome to share mine: https://f001.backblazeb2.com/file/Dan–Q–Public/bbc-news-nosport.rss.

If you’d like to see how I did it so you can host it yourself or adapt it for some similar purpose, the code’s below or on GitHub:

#!/usr/bin/env ruby

# # Sample crontab:
# # At 41 minutes past each hour, run the script and log the results
# 41 * * * * ~/bbc-news-rss-filter-sport-out.rb > ~/bbc-news-rss-filter-sport-out.log 2>&1

# Dependencies:
# * open-uri - load remote URL content easily
# * nokogiri - parse/filter XML
# * b2       - command line tools, described below
require 'bundler/inline'
gemfile do
  source 'https://rubygems.org'
  gem 'nokogiri'
end
require 'open-uri'

# Regular expression describing the GUIDs to reject from the resulting RSS feed
# We want to drop everything from the "sport" section of the website
REJECT_GUIDS_MATCHING = /^https:\/\/www\.bbc\.co\.uk\/sport\//

# Assumption: you're set up with a Backblaze B2 account with a bucket to which
# you'd like to upload the resulting RSS file, and you've configured the 'b2'
# command-line tool (https://www.backblaze.com/b2/docs/b2_authorize_account.html)
B2_BUCKET = 'YOUR-BUCKET-NAME-GOES-HERE'
B2_FILENAME = 'bbc-news-nosport.rss'

# Load and filter the original RSS
rss = Nokogiri::XML(open('https://feeds.bbci.co.uk/news/rss.xml?edition=uk'))
rss.css('item').select{|item| item.css('guid').text =~ REJECT_GUIDS_MATCHING }.each(&:unlink)

begin
  # Output resulting filtered RSS into a temporary file
  temp_file = Tempfile.new
  temp_file.write(rss.to_s)
  temp_file.close

  # Upload filtered RSS to a Backblaze B2 bucket
  result = `b2 upload_file --noProgress --contentType application/rss+xml #{B2_BUCKET} #{temp_file.path} #{B2_FILENAME}`
  puts Time.now
  puts result.split("\n").select{|line| line =~ /^URL by file name:/}.join("\n")
ensure
  # Tidy up after ourselves by ensuring we delete the temporary file
  temp_file.close
  temp_file.unlink
end

bbc-news-rss-filter-sport-out.rb

When executed, this Ruby code:

  1. Fetches the original BBC news (UK) RSS feed and parses it as XML using Nokogiri
  2. Filters it to remove all entries whose GUID matches a particular regular expression (removing all of those from the “sport” section of the site)
  3. Outputs the resulting feed into a temporary file
  4. Uploads the temporary file to a bucket in Backblaze‘s “B2” repository (think: a better-value competitor S3); the bucket I’m using is publicly-accessible so anybody’s RSS reader can subscribe to the feed

I like the versatility of the approach I’ve used here and its ability to perform arbitrary mutations on the feed. And I’m a big fan of Nokogiri. In some ways, this could be considered a lower-impact, less real-time version of my tool RSSey. Aside from the fact that it won’t (easily) handle websites that require Javascript, this approach could probably be used in exactly the same ways as RSSey, and with significantly less set-up: I might look into whether its functionality can be made more-generic so I can start using it in more places.

Your RSS is grass: Mozilla euthanizes feed reader, Atom code in Firefox browser, claims it’s old and unloved

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

( )

When Firefox 64 arrives in December, support for RSS, the once celebrated content syndication scheme, and its sibling, Atom, will be missing.

“After considering the maintenance, performance and security costs of the feed preview and subscription features in Firefox, we’ve concluded that it is no longer sustainable to keep feed support in the core of the product,” said Gijs Kruitbosch, a software engineer who works on Firefox at Mozilla, in a blog post on Thursday.

Not a great sign, but understandable. Live Bookmarks was never strong enough to be a full-featured RSS reader, and I don’t know about you but I haven’t really made use of bookmarks for a good few years, let alone “live” bookmarks, but the media are likely to see this (as El Reg does, in the article) as another nail in the coffin of one of the best syndication mechanisms the Web ever came up with.

The Rise and Demise of RSS

This article is a repost promoting content originally published elsewhere. See more things Dan's reposted.

( )

There are two stories here. The first is a story about a vision of the web’s future that never quite came to fruition. The second is a story about how a collaborative effort to improve a popular standard devolved into one of the most contentious forks in the history of open-source software development.

In the late 1990s, in the go-go years between Netscape’s IPO and the Dot-com crash, everyone could see that the web was going to be an even bigger deal than it already was, even if they didn’t know exactly how it was going to get there. One theory was that the web was about to be revolutionized by syndication. The web, originally built to enable a simple transaction between two parties—a client fetching a document from a single host server—would be broken open by new standards that could be used to repackage and redistribute entire websites through a variety of channels. Kevin Werbach, writing for Release 1.0, a newsletter influential among investors in the 1990s, predicted that syndication “would evolve into the core model for the Internet economy, allowing businesses and individuals to retain control over their online personae while enjoying the benefits of massive scale and scope.”1 He invited his readers to imagine a future in which fencing aficionados, rather than going directly to an “online sporting goods site” or “fencing equipment retailer,” could buy a new épée directly through e-commerce widgets embedded into their favorite website about fencing.2 Just like in the television world, where big networks syndicate their shows to smaller local stations, syndication on the web would allow businesses and publications to reach consumers through a multitude of intermediary sites. This would mean, as a corollary, that consumers would gain significant control over where and how they interacted with any given business or publication on the web.

RSS was one of the standards that promised to deliver this syndicated future. To Werbach, RSS was “the leading example of a lightweight syndication protocol.”3 Another contemporaneous article called RSS the first protocol to realize the potential of XML.4 It was going to be a way for both users and content aggregators to create their own customized channels out of everything the web had to offer. And yet, two decades later, RSS appears to be a dying technology, now used chiefly by podcasters and programmers with tech blogs. Moreover, among that latter group, RSS is perhaps used as much for its political symbolism as its actual utility. Though of course some people really do have RSS readers, stubbornly adding an RSS feed to your blog, even in 2018, is a reactionary statement. That little tangerine bubble has become a wistful symbol of defiance against a centralized web increasingly controlled by a handful of corporations, a web that hardly resembles the syndicated web of Werbach’s imagining.

The future once looked so bright for RSS. What happened? Was its downfall inevitable, or was it precipitated by the bitter infighting that thwarted the development of a single RSS standard?

I’ve always been a huge fan of RSS, and I use it for just about everything (I’ll even hack-it-in to services that don’t supply it natively, just to make them fit around my workflow). But even I’ve got to admit that – outside of podcasts – it’s not done well at retaining mainstream appeal, especially after the death of Google Reader. Right now, most people seem content to get their updates from their social media circles, and take a manual approach (ugh) to reading content in the few other places that matter to them. That’s problematic for all kinds of reasons, and I’m perfectly happy to be one of those old fuddy-duddies who likes his web standards open and independent!

20 Years Of Blogging

As of next week, I’ll have been blogging for 20 years, or about 54% of my life. How did that happen?

Castle of the Four Winds in early 1999.
I’d been “blogging” – not that we called it that, yet – since late 1998, but my original collection of content-mangling Perl scripts wasn’t all that. More history…

The mid-1990s were a very different time for the World Wide Web (yes, we still called it that, and sometimes we even described its use as “surfing”). Going “on the Internet” was a calculated and deliberate action requiring tying up your phone line, minutes of “connecting” along with all of the associated screeching sounds if you hadn’t turned off your modem’s loudspeaker, and you’d typically be paying twice for the experience: both a monthly fee to your ISP for the service and a per-minute charge to your phone company for the call.

It was into this environment that in 1994 I published my first web pages: as far as I know, nothing remains of them now. It wasn’t until 1998 that I signed up an account with UserActive (whose website looks almost the same today as it did then) who offered economical subdomain hosting with shell and CGI support and launched “Castle of the Four Winds”, a set of vanity pages that included my first blog.

Except I didn’t call it a “blog”, of course, because it wasn’t until the following year that Peter Merholz invented the word (he also commemorated 20 years of blogging, this year). I didn’t even call it a “weblog”, because that word was still relatively new and I wasn’t hip enough to be around people who said it, yet. It was self-described as an “online diary”, a name which only served to reinforce the notion that I was writing principally for myself. In fact, it wasn’t until mid-1999 that I discovered that it was being more-widely read than just by me and my circle of friends when I attracted a stalker who travelled across the UK to try to “surprise” me by turning up at places she expected to find me, based on what I’d written online… which was exactly as creepy as it sounds.

AvAngel.com, my second vanity site, as seen in 2001
AvAngel.com

While the world began to panic that the coming millennium was going to break all of the computers, I migrated Castle of the Four Winds’ content into AvAngel.com, a joint vanity site venture with my friend Andy. Aside from its additional content (purity tests, funny stuff, risqué e-cards), what we hosted was mostly the same old stuff, and I continued to write snippets about my life in what was now quite-clearly a “blog-like” format, with the most-recent posts at the top and separate pages for content too old for the front page. Looking back, there’s still a certain naivety to these posts which exemplify the youth of the Web. For example, posts routinely referenced my friends by their email addresses, because spam was yet to become a big enough problem that people didn’t much mind if you put their email address on a public webpage somewhere, and because email addresses still carried with them a feeling of anonymity that ceased to be the case when we started using them for important things.

Technologically-speaking, too, this was a simpler time. Neither Javascript nor CSS support was widespread (nor consistently-standardised) enough to rely upon for anything other than the simplest progressive enhancement unless you were willing to “pick a side” in what we’d subsequently call the first browser war and put one of those apalling “best viewed in Internet Explorer” or “best viewed in Netscape Navigator” banners on your site. I’ve always been a believer in a universal web (and my primary browser at the time was Opera, anyway, as it mostly-remained until Opera went wrong in 2013), and I didn’t have the energy to write everything twice, so our cool/dynamic functionality came mostly from back-end (e.g. Perl, PHP) technologies.

Meanwhile, during my initial months as a student in Aberystwyth, I wrote a series of emails to friends back home entitled “Cool And Interesting Thing Of The Day To Do At The University Of Wales, Aberystwyth”, and put copies of each onto my student webspace; I’ve since recovered these and integrated them into my unified blog.

The first version of Scatmania.org.
Scatmania.org

In 2002 I’d bought the domain name scatmania.org – a reference to my university halls of residence nickname “Scatman Dan”; I genuinely didn’t consider the possibility that the name might be considered scatalogical until later on. As I wanted to continue my blogging at an address that felt like it was solely mine (AvAngel.com having been originally shared with a friend, although in practice over time it became associated only with me), this seemed like a good domain upon which to relaunch. And so, in mid-2003 and powered by a short-lived and ill-fated blogging engine called Flip I did exactly that. WordPress, to which I’d subsequently migrate, hadn’t been invented yet and it wasn’t clear whether its predecessor, b2/cafelog, would survive the troubles its author was experiencing.

From this point on, any web address for any post made to my blog still works to this day, despite multiple technological and infrastructural changes to my blog (and some domain name shenanigans!) in the meantime. I’d come to be a big believer in the mantra that cool URIs don’t change: something that as far as possible I’ve committed to trying to upload in my blogging, my archiving, and my paid work since then. I’m moderately confident that all extant links on the web that point to earlier posts are all under my control so they can (and in most cases have) been fixed already, so I’m pretty close to having all my permalink URIs be “cool”, for now. You might hit a short chain of redirects, but you’ll get to where you’re going.

And everything was fine, until one day in 2004 when it wasn’t. The server hosting scatmania.org died in a very bad way, and because my backup strategy was woefully inadequate, I lost a lot of content. I’ve recovered quite a lot of it and put it back in-place, but some is probably gone forever.

Scatmana.org version 2 - now with actual web design
One of the longest-lived web designs for scatmania.org paid homage to the original, but with more “blue” and a WordPress backing.

The resurrected site was powered by WordPress, and this was the first time that live database queries had been used to power my blog. Occasionally, these days, when talking to younger, cooler developers, I’m tempted to follow the hip trend of reimplementing my blog as a static site, compiling a stack of host-anywhere HTML files based upon whatever-structure-I-like at the “backend”… but then I remember that I basically did that already for six years and I’m far happier with my web presence today. I’ve nothing against static site systems (I’m quite partial to Middleman, myself, although I’m also fond of Hugo) but they’re not right for this site, right now.

IndieAuth hadn’t been invented yet, but I was quite keen on the ideals of OpenID (I still am, really), and so I implemented what was probably the first viable “install-anywhere” implementation of OpenID for WordPress – you can see part of it functioning in the top-right of the screenshot above, where my (copious, at that time) LiveJournal-using friends were encouraged to sign in to my blog using their LiveJournal identity. Nowadays, the majority of the WordPress plugins I use are ones I’ve written myself: my blog is powered by a CMS that’s more “mine” than not!

Scatmania.org in 2006
I no longer have the images that made my 2006 redesign look even remotely attractive, so here it is mocked-up with block colours instead.

Over the course of the first decade of my blogging, a few trends had become apparent in my technical choices. For example:

  • I’ve always self-hosted my blog, rather than relying on a “blog as a service” or siloed social media platform like WordPress.com, Blogger, or LiveJournal.
  • I’ve preferred an approach of storing the “master” copy of my content on my own site and then (sometimes) syndicating it elsewhere: for example, for the benefit of my friends who during their University years maintained a LiveJournal, for many years I had my blog cross-post to a LiveJournal account (and backfeed copies of comments back to my site).
  • I’ve favoured web standards that provided maximum interoperability (e.g. RSS with full content) and longevity (serving HTML pages from permanent URLs, adding “extra” functionality via progressive enhancement so as to ensure that content functioned e.g. without Javascript, with CSS disabled or the specification evolved, etc.).

These were deliberate choices, but they didn’t require much consideration: growing up with a Web far less-sophisticated than today’s (e.g. truly stateless prior to the advent of HTTP cookies) and seeing the chaos caused during the first browser war and the period of stagnation that followed, these choices seemed intuitive.

(Perhaps it’s not so much of a coincidence that I’ve found myself working at a library: maybe I’ve secretly been a hobbyist archivist all along!)

Third major design reboot of scatmania.org
That body font is plain old Verdana, you know: I’ve always felt that it (plus full justification) was the right choice for this particular design, even though I regret other parts of it (like the brightness!).

As you’d expect from a blog covering a period from somebody’s teen years through to their late thirties, there’ve been significant changes in the kinds of content I’ve posted (and the tone with which I’ve done so) over the years, too. If you dip into 2003, for example, you’ll see the results of quiz memes and unqualified daily minutiae alongside actual considered content. Go back further, to early 1999, and it is (at best) meaningless wittering about the day-to-day life of a teenage student. It took until around 2009/2010 before I actually started focussing on writing content that specifically might be enjoyable for others to read (even where that content was frankly silly) and only far more-recently-still that I’ve committed to the “mostly technical stuff, ocassional bits of ‘life’ stuff” focus that I have today.

I say “committed”, but of course I’m fully aware that whatever this blog is now, it’ll doubtless be something somewhat different if I’m still writing it in another two decades…

Graph showing my blog posts per month
2014 may have included my most-prolific month of blogging, but 2003-2005 saw the most-consistent high-volume of content.

Once I reached the 2010s I started actually taking the time to think about the design of my blog and its meaning. Conceptually, all of my content is data-driven: database tables full of different “kinds” of content and associated metadata, and that’s pretty-much ideal – it provides a strong separation between content and presentation and makes it possible to make significant design changes with less work than might otherwise be expected. I’ve also always generally favoured a separation of concerns in web development and so I’m not a fan of CSS design methodologies that encourage class names describing how things should appear, like Atomic CSS. Even where it results in a performance hit, I’d far rather use CSS classes to describe what things are or represent. The single biggest problem with this approach, to my mind, is that it violates the DRY principle… but that’s something that your CSS preprocessor’s there to fix for you, isn’t it?

But despite this philosophical outlook on the appropriate gap between content and presentation, it took until about 2010 before I actually attached any real significance to the presentation at all! Until this point, I’d considered myself to have been more of a back-end than a front-end engineer, and felt that the most-important thing was to get the content out there via an appropriate medium. After all, a site without content isn’t a site at all, but a site without design is (or at least should be) still intelligible thanks to browser defaults! Remember, again, that I started web development at a time when stylesheets didn’t exist at all.

My previous implementations of my blog design had used simple designs, often adapted from open-source templates, in an effort to get them deployed as quickly as possible and move on to the next task, but now, I felt, it was time to do a little more.

Scatmania.org in 2010
My 2010 relaunch put far more focus on the graphical design elements of my blog as well as providing a fully responsive design based on (then-new) CSS media queries. Alongside my focus on separation of concerns in web development, I’m also quite opinionated on the idea that a responsive design has almost always been a superior solution to having a separate “mobile site”.

For a few years, I was producing a new theme once per year. I experimented with different colours, fonts, and layouts, and decided (after some ad-hoc A/B testing) that my audience was better-served by a “front” page than by being dropped directly into my blog archives as had previously been the case. Highlighting the latest few – and especially the very-latest – post and other recent content increased the number of posts that a visitor would be likely to engage with in a single visit. I’ve always presumed that the reason for this is that regular (but non-subscribing) readers are more-likely to be able to work out what they have and haven’t read already from summary text than from trying to decipher an entire post: possibly because my blogging had (has!) become rather verbose.

Scatmania.org until early 2012
My 2011 design, in hindsight, said more about my mood and state-of-mind at the time than it did about artistic choices: what’s with all the black backgrounds and seriffed fonts? Is this a funeral parlour?

I went through a bit of a lull in blogging: I’ve joked that I spent more time on my 2010 and 2011 designs than I did on the sum total of the content that was published in between the pair of them (which isn’t true… at least, not quite!). In the month I left Aberystwyth for Oxford, for example, I was doing all kinds of exciting and new things… and yet I only wrote a total of two blog posts.

With RSS waning in popularity – which I can’t understand: RSS is amazing! – I began to crosspost to social networks like Twitter and Google+ (although no longer to Google+, following the news of its imminent demise) to help those readers who prefer to get their content via these media, but because I wasn’t producing much content, it probably didn’t make a significant difference anyway: the chance of a regular reader “missing” something must have been remarkably slim.

Scatmania.org in 2012
The 2012 design featured “CSS peekaboo”: a transformation that caused my head to “hide” from you behind the search bar if your cursor got too close. Ruth, I hear, spent far too long playing with just this feature.

Nobody calls me “Scatman Dan” any more, and hadn’t for a long, long time. Given that my name is already awesome and unique all by itself (having changed to be so during the era in which scatmania.org was my primary personal domain name), it felt like I had the opportunity to rebrand.

I moved my blog to a new domain, DanQ.me (which is nice and short, too) and came up with a new collection of colours, fonts, and layout choices that I felt better-reflected my identity… and the fact that my blog was becoming less a place to record the mundane details of my daily life and more a place where I talk about (principally-web) technology, security, and GPS games… and just occasionally about other topics like breadmaking and books. Also, it gave me a chance to get on top of the current trend in web design for big, clean, empty spaces, square corners, and using pictures as the hook to a story.

Second design of DanQ.me, 2016
The second design of my blog after moving to DanQ.me showed-off posts with big pictures, framed by lots of white-space.

I’ve been working harder this last year or two to re-integrate (in a PESOS-like way) into my blog content that I’ve published elsewhere, mostly geocaching logs and geohashing expedition records, and I’ve also done so retroactively, so in addition to my first blog article on the subject of geocaching, you can read my first ever cache log without switching to a different site nor relying upon the continued existence and accessibility of that site. I’ve been working at being increasingly mindful of where my content is siloed outside of my control and reclaiming it by hosting it here, on my blog.

Particular areas in which I produce content elsewhere but would like to at-least maintain a copy here, and would ideally publish here first and syndicate elsewhere, although I appreciate that this is difficult, are:

  • GPS games like geocaching and geohashing – I’ve mostly got this under control, but could enjoy streamlining the process or pushing towards POSSE
  • Reddit, where I’ve written tens of thousands of words under a variety of accounts, but I don’t really pay attention to the site any more
  • I left Facebook in 2011 but I still have a backup of what was on my “Wall” at that point, which I could look into reintegrating into my blog
  • I share a lot of the source code I write via my GitHub account, but I’m painfully aware that this is yet-another-silo that I ought to learn not to depend upon (and it ought to be simple enough to mirror my repos on my own site!)
  • I’ve got a reasonable number of videos on two YouTube channels which are online by Google’s good graces (and potential for advertising revenue); for a handful of technical reasons they’re a bit of a pain to self-host, but perhaps my blog could act as a secondary source to my own video content
  • I write business reviews on Google Maps which I should probably look into recovering from the hivemind and hosting here… in fact, I’ve probably written plenty of reviews on other sites, too, like Amazon for example…
  • On two previous occasions I’ve maintained an online photo gallery; I might someday resurrect the concept, at least for the photos that used to be published on them
  • I’ve dabbled on a handful of other, often weirder, social networks before like Scuttlebutt (which has a genius concept, by the way) and Ello, and ought to check if there’s anything “original” on there I should reintegrate
  • Going way, way back, there are a good number of usenet postings I’ve made over the last twenty-something years that I could reclaim, if I can find them…

(if you’re asking why I’m inclined to do all of these things: here’s why)

Current iteration of DanQ.me
This looks familiar.

20 years and around 717,000 words worth of blogging down, it’s interesting to look back and see how things have changed: in my life, on the Web, and in the world in general. I’ve seen many friends’ blogs come and go: they move into a new phase of their life and don’t feel like what they wrote before reflects them today, most often, and so they delete them… which is fine, of course: it’s their content! But for me it’s always felt wrong to do so, for two reasons: firstly, it feels false to do so given that once something’s been put on the Web, it might well be online forever – you can’t put the genie back in the bottle! And secondly: for me, it’s valuable to own everything I wrote before. Even the cringeworthy things I wrote as a teenager who thought they knew everything and the antagonistic stuff I wrote in my early 20s but that I clearly wouldn’t stand by today is part of my history, and hiding that would be a disservice to myself.

The 17-year-old who wrote my first blog posts two decades ago this month fully expected that the things he wrote would be online forever, and I don’t intend to take that away from him. I’m sure that when I write a post in October 2038 looking back on the next two decades, I’ll roll my eyes at myself today, too, but for me: that’s part of the joy of a long-running personal blog. It’s like a diary, but with a sense of accountability. It’s a space on the web that’s “mine” into which I can dump pretty-much whatever I like.

I love it: I’ve been blogging for over half of my life, and if I can get back to you in 2031 and tell you that I’ve by-then been doing so for two-thirds of my life, that would be a win.