Making an RSS feed of YOURLS shortlinks

As you might know if you were paying close attention in Summer 2019, I run a “URL shortener” for my personal use. You may be familiar with public URL shorteners like TinyURL and Bit.ly: my personal URL shortener is basically the same thing, except that only I am able to make short-links with it. Compared to public ones, this means I’ve got a larger corpus of especially-short (e.g. 2/3 letter) codes available for my personal use. It also means that I’m not dependent on the goodwill of a free siloed service and I can add exactly the features I want to it.

Diagram showing the relationships of the DanQ.me ecosystem. Highlighted is the injection of links into the "S.2" link shortener and the export of these shortened links by RSS into FreshRSS.
Little wonder then that my link shortener sat so close to me on my ecosystem diagram the other year.

For the last nine years my link shortener has been S.2, a tool I threw together in Ruby. It stores URLs in a sequentially-numbered database table and then uses the Base62-encoding of the primary key as the “code” part of the short URL. Aside from the fact that when I create a short link it shows me a QR code to I can easily “push” a page to my phone, it doesn’t really have any “special” features. It replaced S.1, from which it primarily differed by putting the code at the end of the URL rather than as part of the domain name, e.g. s.danq.me/a0 rather than a0.s.danq.me: I made the switch because S.1 made HTTPS a real pain as well as only supporting Base36 (owing to the case-insensitivity of domain names).

But S.2’s gotten a little long in the tooth and as I’ve gotten busier/lazier, I’ve leant into using or adapting open source tools more-often than writing my own from scratch. So this week I switched my URL shortener from S.2 to YOURLS.

Screenshot of YOURLS interface showing Dan Q's list of shortened links. Six are shown of 1,939 total.
YOURLs isn’t the prettiest tool in the world, but then it doesn’t have to be: only I ever see the interface pictured above!

One of the things that attracted to me to YOURLS was that it had a ready-to-go Docker image. I’m not the biggest fan of Docker in general, but I do love the convenience of being able to deploy applications super-quickly to my household NAS. This makes installing and maintaining my personal URL shortener much easier than it used to be (and it was pretty easy before!).

Another thing I liked about YOURLS is that it, like S.2, uses Base62 encoding. This meant that migrating my links from S.2 into YOURLS could be done with a simple cross-database INSERT... SELECT statement:

INSERT INTO yourls.yourls_url(keyword, url, title, `timestamp`, clicks)
  SELECT shortcode, url, title, created_at, 0 FROM danq_short.links

But do you know what’s a bigger deal for my lifestack than my URL shortener? My RSS reader! I’ve written about it a lot, but I use RSS for just about everything and my feed reader is my first, last, and sometimes only point of contact with the Web! I’m so hooked-in to my RSS ecosystem that I’ll use my own middleware to add feeds to sites that don’t have them, or for which I’m not happy with the feed they provide, e.g. stripping sports out of BBC News, subscribing to webcomics that don’t provide such an option (sometimes accidentally hacking into sites on the way), and generating “complete” archives of series’ of posts so I can use my reader to track my progress.

One of S.1/S.2’s features was that it exposed an RSS feed at a secret URL for my reader to ingest. This was great, because it meant I could “push” something to my RSS reader to read or repost to my blog later. YOURLS doesn’t have such a feature, and I couldn’t find anything in the (extensive) list of plugins that would do it for me. I needed to write my own.

Partial list of Dan's RSS feed subscriptions, including Jeremy Keith, Jim Nielson, Natalie Lawhead, Bruce Schneier, Scott O'Hara, "Yahtzee", BBC News, and several podcasts, as well as (highlighted) "Dan's Short Links", which has 5 unread items.
In some ways, subscribing “to yourself” is a strange thing to do. In other ways… shut up, I’ll do what I like.

I could have written a YOURLS plugin. Or I could have written a stack of code in Ruby, PHP, Javascript or some other language to bridge these systems. But as I switched over my shortlink subdomain s.danq.me to its new home at danq.link, another idea came to me. I have direct database access to YOURLS (and the table schema is super simple) and the command-line MariaDB client can output XML… could I simply write an XML Transformation to convert database output directly into a valid RSS feed? Let’s give it a go!

I wrote a script like this and put it in my crontab:

mysql --xml yourls -e                                                                                                                     \
      "SELECT keyword, url, title, DATE_FORMAT(timestamp, '%a, %d %b %Y %T') AS pubdate FROM yourls_url ORDER BY timestamp DESC LIMIT 30" \
    | xsltproc template.xslt -                                                                                                            \
    | xmllint --format -                                                                                                                  \
    > output.rss.xml

The first part of that command connects to the yourls database, sets the output format to XML, and executes an SQL statement to extract the most-recent 30 shortlinks. The DATE_FORMAT function is used to mould the datetime into something approximating the RFC-822 standard for datetimes as required by RSS. The output produced looks something like this:

<?xml version="1.0"?>
<resultset statement="SELECT keyword, url, title, timestamp FROM yourls_url ORDER BY timestamp DESC LIMIT 30" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <row>
        <field name="keyword">VV</field>
        <field name="url">https://webdevbev.co.uk/blog/06-2021/perfect-is-the-enemy-of-good.html</field>
        <field name="title"> Perfect is the enemy of good || Web Dev Bev</field>
        <field name="timestamp">2021-09-26 17:38:32</field>
  </row>
  <row>
        <field name="keyword">VU</field>
        <field name="url">https://webdevlaw.uk/2021/01/30/why-generation-x-will-save-the-web/</field>
        <field name="title">Why Generation X will save the web  Hi, Im Heather Burns</field>
        <field name="timestamp">2021-09-26 17:38:26</field>
  </row>

  <!-- ... etc. ... -->
  
</resultset>

We don’t see this, though. It’s piped directly into the second part of the command, which  uses xsltproc to apply an XSLT to it. I was concerned that my XSLT experience would be super rusty as I haven’t actually written any since working for my former employer SmartData back in around 2005! Back then, my coworker Alex and I spent many hours doing XML backflips to implement a system that converted complex data outputs into PDF files via an XSL-FO intermediary.

I needn’t have worried, though. Firstly: it turns out I remember a lot more than I thought from that project a decade and a half ago! But secondly, this conversion from MySQL/MariaDB XML output to RSS turned out to be pretty painless. Here’s the template.xslt I ended up making:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="resultset">
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
      <channel>
        <title>Dan's Short Links</title>
        <description>Links shortened by Dan using danq.link</description>
        <link> [ MY RSS FEED URL ] </link>
        <atom:link href=" [ MY RSS FEED URL ] " rel="self" type="application/rss+xml" />
        <lastBuildDate><xsl:value-of select="row/field[@name='pubdate']" /> UTC</lastBuildDate>
        <pubDate><xsl:value-of select="row/field[@name='pubdate']" /> UTC</pubDate>
        <ttl>1800</ttl>
        <xsl:for-each select="row">
          <item>
            <title><xsl:value-of select="field[@name='title']" /></title>
            <link><xsl:value-of select="field[@name='url']" /></link>
            <guid>https://danq.link/<xsl:value-of select="field[@name='keyword']" /></guid>
            <pubDate><xsl:value-of select="field[@name='pubdate']" /> UTC</pubDate>
          </item>
        </xsl:for-each>
      </channel>
    </rss>
  </xsl:template>
</xsl:stylesheet>

That uses the first (i.e. most-recent) shortlink’s timestamp as the feed’s pubDate, which makes sense: unless you’re going back and modifying links there’s no more-recent changes than the creation date of the most-recent shortlink. Then it loops through the returned rows and creates an <item> for each; simple!

The final step in my command runs the output through xmllint to prettify it. That’s not strictly necessary, but it was useful while debugging and as the whole command takes milliseconds to run once every quarter hour or so I’m not concerned about the overhead. Using these native binaries (plus a little configuration), chained together with pipes, had already resulted in way faster performance (with less code) than if I’d implemented something using a scripting language, and the result is a reasonably elegant “scratch your own itch”-type solution to the only outstanding barrier that was keeping me on S.2.

All that remained for me to do was set up a symlink so that the resulting output.rss.xml was accessible, over the web, to my RSS reader. I hope that next time I’m tempted to write a script to solve a problem like this I’ll remember that sometimes a chain of piped *nix utilities can provide me a slicker, cleaner, and faster solution.

Update: Right as I finished writing this blog post I discovered that somebody had already solved this problem using PHP code added to YOURLS; it’s just not packaged as a plugin so I didn’t see it earlier! Whether or not I use this alternate approach or stick to what I’ve got, the process of implementing this YOURLS-database ➡ XML ➡  XSLTRSS chain was fun and informative.

4 replies to Making an RSS feed of YOURLS shortlinks

Leave a Reply

Your email address will not be published. Required fields are marked *