Making an RSS feed of YOURLS shortlinks

As you might know if you were paying close attention in Summer 2019, I run a “URL shortener” for my personal use. You may be familiar with public URL shorteners like TinyURL and Bit.ly: my personal URL shortener is basically the same thing, except that only I am able to make short-links with it. Compared to public ones, this means I’ve got a larger corpus of especially-short (e.g. 2/3 letter) codes available for my personal use. It also means that I’m not dependent on the goodwill of a free siloed service and I can add exactly the features I want to it.

Diagram showing the relationships of the DanQ.me ecosystem. Highlighted is the injection of links into the "S.2" link shortener and the export of these shortened links by RSS into FreshRSS.
Little wonder then that my link shortener sat so close to me on my ecosystem diagram the other year.

For the last nine years my link shortener has been S.2, a tool I threw together in Ruby. It stores URLs in a sequentially-numbered database table and then uses the Base62-encoding of the primary key as the “code” part of the short URL. Aside from the fact that when I create a short link it shows me a QR code to I can easily “push” a page to my phone, it doesn’t really have any “special” features. It replaced S.1, from which it primarily differed by putting the code at the end of the URL rather than as part of the domain name, e.g. s.danq.me/a0 rather than a0.s.danq.me: I made the switch because S.1 made HTTPS a real pain as well as only supporting Base36 (owing to the case-insensitivity of domain names).

But S.2’s gotten a little long in the tooth and as I’ve gotten busier/lazier, I’ve leant into using or adapting open source tools more-often than writing my own from scratch. So this week I switched my URL shortener from S.2 to YOURLS.

Screenshot of YOURLS interface showing Dan Q's list of shortened links. Six are shown of 1,939 total.
YOURLs isn’t the prettiest tool in the world, but then it doesn’t have to be: only I ever see the interface pictured above!

One of the things that attracted to me to YOURLS was that it had a ready-to-go Docker image. I’m not the biggest fan of Docker in general, but I do love the convenience of being able to deploy applications super-quickly to my household NAS. This makes installing and maintaining my personal URL shortener much easier than it used to be (and it was pretty easy before!).

Another thing I liked about YOURLS is that it, like S.2, uses Base62 encoding. This meant that migrating my links from S.2 into YOURLS could be done with a simple cross-database INSERT... SELECT statement:

INSERT INTO yourls.yourls_url(keyword, url, title, `timestamp`, clicks)
  SELECT shortcode, url, title, created_at, 0 FROM danq_short.links

But do you know what’s a bigger deal for my lifestack than my URL shortener? My RSS reader! I’ve written about it a lot, but I use RSS for just about everything and my feed reader is my first, last, and sometimes only point of contact with the Web! I’m so hooked-in to my RSS ecosystem that I’ll use my own middleware to add feeds to sites that don’t have them, or for which I’m not happy with the feed they provide, e.g. stripping sports out of BBC News, subscribing to webcomics that don’t provide such an option (sometimes accidentally hacking into sites on the way), and generating “complete” archives of series’ of posts so I can use my reader to track my progress.

One of S.1/S.2’s features was that it exposed an RSS feed at a secret URL for my reader to ingest. This was great, because it meant I could “push” something to my RSS reader to read or repost to my blog later. YOURLS doesn’t have such a feature, and I couldn’t find anything in the (extensive) list of plugins that would do it for me. I needed to write my own.

Partial list of Dan's RSS feed subscriptions, including Jeremy Keith, Jim Nielson, Natalie Lawhead, Bruce Schneier, Scott O'Hara, "Yahtzee", BBC News, and several podcasts, as well as (highlighted) "Dan's Short Links", which has 5 unread items.
In some ways, subscribing “to yourself” is a strange thing to do. In other ways… shut up, I’ll do what I like.

I could have written a YOURLS plugin. Or I could have written a stack of code in Ruby, PHP, Javascript or some other language to bridge these systems. But as I switched over my shortlink subdomain s.danq.me to its new home at danq.link, another idea came to me. I have direct database access to YOURLS (and the table schema is super simple) and the command-line MariaDB client can output XML… could I simply write an XML Transformation to convert database output directly into a valid RSS feed? Let’s give it a go!

I wrote a script like this and put it in my crontab:

mysql --xml yourls -e                                                                                                                     \
      "SELECT keyword, url, title, DATE_FORMAT(timestamp, '%a, %d %b %Y %T') AS pubdate FROM yourls_url ORDER BY timestamp DESC LIMIT 30" \
    | xsltproc template.xslt -                                                                                                            \
    | xmllint --format -                                                                                                                  \
    > output.rss.xml

The first part of that command connects to the yourls database, sets the output format to XML, and executes an SQL statement to extract the most-recent 30 shortlinks. The DATE_FORMAT function is used to mould the datetime into something approximating the RFC-822 standard for datetimes as required by RSS. The output produced looks something like this:

<?xml version="1.0"?>
<resultset statement="SELECT keyword, url, title, timestamp FROM yourls_url ORDER BY timestamp DESC LIMIT 30" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <row>
        <field name="keyword">VV</field>
        <field name="url">https://webdevbev.co.uk/blog/06-2021/perfect-is-the-enemy-of-good.html</field>
        <field name="title"> Perfect is the enemy of good || Web Dev Bev</field>
        <field name="timestamp">2021-09-26 17:38:32</field>
  </row>
  <row>
        <field name="keyword">VU</field>
        <field name="url">https://webdevlaw.uk/2021/01/30/why-generation-x-will-save-the-web/</field>
        <field name="title">Why Generation X will save the web  Hi, Im Heather Burns</field>
        <field name="timestamp">2021-09-26 17:38:26</field>
  </row>

  <!-- ... etc. ... -->
  
</resultset>

We don’t see this, though. It’s piped directly into the second part of the command, which  uses xsltproc to apply an XSLT to it. I was concerned that my XSLT experience would be super rusty as I haven’t actually written any since working for my former employer SmartData back in around 2005! Back then, my coworker Alex and I spent many hours doing XML backflips to implement a system that converted complex data outputs into PDF files via an XSL-FO intermediary.

I needn’t have worried, though. Firstly: it turns out I remember a lot more than I thought from that project a decade and a half ago! But secondly, this conversion from MySQL/MariaDB XML output to RSS turned out to be pretty painless. Here’s the template.xslt I ended up making:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="resultset">
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
      <channel>
        <title>Dan's Short Links</title>
        <description>Links shortened by Dan using danq.link</description>
        <link> [ MY RSS FEED URL ] </link>
        <atom:link href=" [ MY RSS FEED URL ] " rel="self" type="application/rss+xml" />
        <lastBuildDate><xsl:value-of select="row/field[@name='pubdate']" /> UTC</lastBuildDate>
        <pubDate><xsl:value-of select="row/field[@name='pubdate']" /> UTC</pubDate>
        <ttl>1800</ttl>
        <xsl:for-each select="row">
          <item>
            <title><xsl:value-of select="field[@name='title']" /></title>
            <link><xsl:value-of select="field[@name='url']" /></link>
            <guid>https://danq.link/<xsl:value-of select="field[@name='keyword']" /></guid>
            <pubDate><xsl:value-of select="field[@name='pubdate']" /> UTC</pubDate>
          </item>
        </xsl:for-each>
      </channel>
    </rss>
  </xsl:template>
</xsl:stylesheet>

That uses the first (i.e. most-recent) shortlink’s timestamp as the feed’s pubDate, which makes sense: unless you’re going back and modifying links there’s no more-recent changes than the creation date of the most-recent shortlink. Then it loops through the returned rows and creates an <item> for each; simple!

The final step in my command runs the output through xmllint to prettify it. That’s not strictly necessary, but it was useful while debugging and as the whole command takes milliseconds to run once every quarter hour or so I’m not concerned about the overhead. Using these native binaries (plus a little configuration), chained together with pipes, had already resulted in way faster performance (with less code) than if I’d implemented something using a scripting language, and the result is a reasonably elegant “scratch your own itch”-type solution to the only outstanding barrier that was keeping me on S.2.

All that remained for me to do was set up a symlink so that the resulting output.rss.xml was accessible, over the web, to my RSS reader. I hope that next time I’m tempted to write a script to solve a problem like this I’ll remember that sometimes a chain of piped *nix utilities can provide me a slicker, cleaner, and faster solution.

Update: Right as I finished writing this blog post I discovered that somebody had already solved this problem using PHP code added to YOURLS; it’s just not packaged as a plugin so I didn’t see it earlier! Whether or not I use this alternate approach or stick to what I’ve got, the process of implementing this YOURLS-database ➡ XML ➡  XSLTRSS chain was fun and informative.

Goodbye Reader

Goodbye, Google Reader. It was fun while it lasted.

Long ago, I used desktop RSS readers. I was only subscribed to my friends’ blogs back then anyway, so it didn’t matter that I could only read them from my home computer. But then RSS feeds started appearing on news sites, and tech blogs started appearing about things related to my work. And smartphones took over the world, and I wanted to be able to synchronise my reading list everywhere. There were a few different services that competed for my attention, but Google Reader was the best. It was simple, and fast, and easy, and it Just Worked in that way that Google products often do.

I put up with the occasional changes to the user interface. Hey, it’s a beta, and it’s still the best thing out there. Hey, it’s free, what can you say? I put up with the fact that from time to time, they changed the site in ways that were sometimes quite hostile to Opera, my web browser of choice. I put up with the fact that it had difficulty with unsigned HTTPS certificates (it’s fine now) and that it didn’t provide a mechanism to authenticate against services like LiveJournal (it still doesn’t). I even worked around the latter, releasing my own tool and updating it a few times until LiveJournal blocked it (twice) and I had to instead recommend that people switched to rival service FreeMyFeed.

The new Google Reader (with my annotations - click to embiggen). It sucks quite a lot.

But the final straw came this week when Google “updated” Reader once again, with two awful new changes:

  1. I know that they’re ever-so-proud of the Google+ user interface, but rebranding all of the other services to look like it just isn’t working. It’s great for Google+, not-bad for Search, bad for GMail (but at least you can turn it off!), and fucking awful for Reader. I like distinct borders between my items. I don’t like big white spaces and buttons that eat up half the screen.
  2. The sharing interface is completely broken. After a little while, I worked out that I still can share things with other people, but I can’t any longer see what other people are sharing without clicking over to Google+. This sucks a lot. No longer can I keep track of which shared items I have and haven’t read, and no longer can I read the interesting RSS feeds my friends have shared in the same place as I read (and share) my own.

So that’s the last straw. Today, I switched everything over to Tiny Tiny RSS.

Tiny Tiny RSS - it's simple, clean, and (in an understated way) beautiful.

Originally I felt that I was being pushed “away” from Google Reader, but the more I’ve played with it, the more I’ve realised that I’m being drawn “towards” Tiny Tiny, and wishing that I’d made the switch further. The things that have really appealed are:

  • It’s self-hosted. Tiny Tiny RSS is a free, open-source solution that you host for yourself (or I suppose you can use a shared host; there are a few around). I know that this is a downside to most people, but to me, it’s a serious selling point: now, I’m in control of what updates are applied, when, and if I don’t like the functionality of a part of the system, I can change it – I’m in control.
  • It’s simple and clean. It’s got a great user interface, in an understated and simplistic way. It’s somewhat reminiscent of desktop email clients, replacing the “stream of feeds” idea with a two- or three-pane view (your choice). That sounds like it’d be a downside, until you realise…
  • …with great keyboard controls. Tiny Tiny RSS is great for keyboard lovers like me. The default key-commands (which are of course customisable) are based on Emacs, so if that’s your background then it’s easy to be right at home in minutes and browsing feeds faster than ever.
  • Plus: it’s got a stack of nice features. I’m loving the “fresh” filter, that helps me differentiate between the stuff I’ve “saved for later” reading and the stuff that’s actually new and interesting. I’m also impressed by the integrated authentication, which removes my dependency on FreeMyFeed-like services and (because it’s self-hosted) lets me keep my credentials securely under my own control. It supports authentication using SSL certificates, a beautiful and underused technology. It allows you to customise the update frequency of your feeds, so I can stalk by friends’ blogs at lightning-quick rates and stall my weekly update subscriptions so they don’t get checked so frequently. And unlike Google Reader, it actually tells me when feeds break, so I don’t just “get no updates” for a while before I think to check the site (and it’ll even let me change the URLs when this happens, rather than unsubscribing and resubscribing).

Put simply: all of my major gripes with Google Reader over the last few years have been answered all at once in this wonderful little program. If people are interested in how I set up Tiny Tiny RSS and and made the switchover as simple and painless as possible, I’ll write a blog post to talk you through it.

I’ve had just one problem: it’s not quite so tolerant of badly-formed XML as Google Reader. There’s one feed in my list which, it turns out, has (very) invalid XML in it’s feed, that Google Reader managed to ignore and breeze over, but Tiny Tiny RSS chokes on. I’ve contacted the site owner to try to get it fixed, but if they don’t, I might have to hack some code to try to make a workaround. Not ideal, and not something that everybody would necessarily want to deal with, so be aware!

If, like me, you’ve become dissatisfied by Google Reader this week, you might also like to look at rssLounge, the other worthy candidate I considered as a replacement. I had a quick play but didn’t find it quite as suitable for my needs, but it might be to your taste: take a look.

The new sidebar, showing what I'm reading in my RSS reader lately.

Oh, and one more thing: if you used to “follow” me on Google Reader (or even if you didn’t) and you want to continue to subscribe to the stuff I “share”, then you’ll want to subscribe to this new RSS feed of “my shared stuff”, instead: it can also be found syndicated in the right-hand column of my blog.

Update: this guy’s made a bookmarklet that makes the new Google Reader theme slightly less hideous. Doesn’t fix the other problems, though, but if you’re not quite pissed-off enough to jump ship, it might make your experience more-bearable.

Update 2: others in the blogosphere are saying good things about Reader rival NewsBlur, which recently turned one year old. If you’re looking for a hosted service, rather than something “roll-your-own” like Tiny Tiny RSS, perhaps it’s the tool for you?

Parsing XML as JSON

This morning, I got an instant message from a programmer who’s getting deeply into their Ajax recently. The conversation went something like this (I paraphrase and dramatise at least a little):

Morning! I need to manipulate a JSON feed so that [this JSON parser] will recognise it.

Here’s what I get out of the JSON feed right now:

<?xml version="1.0" encoding="UTF-8"?>
<module-slots type="array">
  <module-slot>
    <title>Module3</title>
    ...

“Umm…” I began, not quite sure how to break this news, “That’s XML, not JSON.”

“Is that a problem?” comes the reply.