Better WordPress RSS Feeds

I’ve made a handful of tweaks to my RSS feed which I feel improves upon WordPress’s default implementation, at least in my use-case.1 In case any of these improvements help you, too, here’s a list of them:

Post Kinds in Titles

Since 2020, I’ve decorated post titles by prefixing them with the “kind” of post they are (courtesy of the Post Kinds plugin). I’ve already written about how I do it, if you’re interested.

Screenshot showing a Weekly Digest email from DanQ.me, with two Notes and a Repost clearly-identified.
Identifying post kinds is particularly useful for people who subscribe by email (the emails are generated off the RSS feed either daily or weekly: subscriber’s choice), who might want to see articles and videos but not care about for example checkins and reposts.

RSS Only posts

A minority of my posts are – initially, at least – publicised only via my RSS feed (and places that are directly fed by it, like email subscribers). I use a tag to identify posts to be hidden in this way. I’ve written about my implementation before, but I’ve since made a couple of additional improvements:

  1. Suppressing the tag from tag clouds, to make it harder to accidentally discover these posts by tag-surfing,
  2. Tweaking the title of such posts when they appear in feeds (using the same technique as above), so that readers know when they’re seeing “exclusive” content, and
  3. Setting a X-Robots-Tag: noindex, nofollow HTTP header when viewing such tag or a post, to discourage search engines (code for this not shown below because it’s so very specific to my theme that it’s probably no use to anybody else!).
// 1. Suppress the "rss club" tag from tag clouds/the full tag list
function rss_club_suppress_tags_from_display( string $tag_list, string $before, string $sep, string $after, int $post_id ): string {
  foreach(['rss-club'] as $tag_to_suppress){
    $regex = sprintf( '/<li>[^<]*?<a [^>]*?href="[^"]*?\/%s\/"[^>]*?>.*?<\/a>[^<]*?<\/li>/', $tag_to_suppress );
    $tag_list = preg_replace( $regex, '', $tag_list );
  }
  return $tag_list;
}
add_filter( 'the_tags', 'rss_club_suppress_tags_from_display', 10, 5 );

// 2. In feeds, tweak title if it's an RSS exclusive
function rss_club_add_rss_only_to_rss_post_title( $title ){
  $post_tag_slugs = array_map(function($tag){ return $tag->slug; }, wp_get_post_tags( get_the_ID() ));
  if ( ! in_array( 'rss-club', $post_tag_slugs ) ) return $title; // if we don't have an rss-club tag, drop out here
  return trim( "{$title} [RSS Exclusive!]" );
  return $title;
}
add_filter( 'the_title_rss', 'rss_club_add_rss_only_to_rss_post_title', 6 );

Adding a stylesheet

Adding a stylesheet to your feeds can make them much friendlier to beginner users (which helps drive adoption) without making them much less-convenient for people who know how to use feeds already. Darek Kay and Terence Eden both wrote great articles about this just earlier this year, but I think my implementation goes a step further.

Screenshot of DanQ.me's RSS feed as viewed in Firefox, showing a "Q" logo and three recent posts.
I started with Matt Webb‘s pretty-feed-v3.xsl by Matt Webb (as popularised by AboutFeeds.com) and built from there.

In addition to adding some “Q” branding, I made tweaks to make it work seamlessly with both my RSS and Atom feeds by using two <xsl:for-each> blocks and exploiting the fact that the two standards don’t overlap in their root namespaces. Here’s my full XSLT; you need to override your feed template as Terence describes to use it, but mine can be applied to both RSS and Atom.2

I’ve still got more I’d like to do with this, for example to take advantage of the thumbnail images I attach to posts. On which note…

Thumbnail images

When I first started offering email subscription options I used Mailchimp’s RSS-to-email service, which was… okay, but not great, and I didn’t like the privacy implications that came along with it. Mailchimp support adding thumbnails to your email template from your feed, but WordPress themes don’t by-default provide the appropriate metadata to allow them to do that. So I installed Jordy Meow‘s RSS Featured Image plugin which did it for me.

<item>
        <title>[Checkin] Geohashing expedition 2023-07-27 51 -1</title>
        <link>https://danq.me/2023/07/27/geohashing-expedition-2023-07-27-51-1/</link>

        ...

        <media:content url="/_q23u/2023/07/20230727_141710-1024x576.jpg" medium="image" />
        <media:description>Dan, wearing a grey Three Rings hoodie, carrying French Bulldog Demmy, standing on a path with trees in the background.</media:description>
</item>
Media attachments for RSS feeds are perhaps most-popular for podcasts, but they’re also great for post thumbnail images.

During my little redesign earlier this year I decided to go two steps further: (1) ditching the plugin and implementing the functionality directly into my theme (it’s really not very much code!), and (2) adding not only a <media:content medium="image" url="..." /> element but also a <media:description> providing the default alt-text for that image. I don’t know if any feed readers (correctly) handle this accessibility-improving feature, but my stylesheet above will, some day!

Here’s how that’s done:

function rss_insert_namespace_for_featured_image() {
  echo "xmlns:media=\"http://search.yahoo.com/mrss/\"\n";
}

function rss_insert_featured_image( $comments ) {
  global $post;
  $image_id = get_post_thumbnail_id( $post->ID );
  if( ! $image_id ) return;
  $image = get_the_post_thumbnail_url( $post->ID, 'large' );
  $image_url = esc_url( $image );
  $image_alt = esc_html( get_post_meta( $image_id, '_wp_attachment_image_alt', true ) );
  $image_title = esc_html( get_the_title( $image_id ) );
  $image_description = empty( $image_alt ) ? $image_title : $image_alt;
  if ( !empty( $image ) ) {
    echo <<<EOF
      <media:content url="{$image_url}" medium="image" />
      <media:description>{$image_description}</media:description>
    EOF;
  }
}

add_action( 'rss2_ns', 'rss_insert_namespace_for_featured_image' );
add_action( 'rss2_item', 'rss_insert_featured_image' );

So there we have it: a little digital gardening, and four improvements to WordPress’s default feeds.

RSS may not be as hip as it once was, but little improvements can help new users find their way into this (enlightened?) way to consume the Web.

If you’re using RSS to follow my blog, great! If it’s not for you, perhaps pick your favourite alternative way to get updates, from options including email, Telegram, the Fediverse (e.g. Mastodon), and more…

Update 4 September 2023: More-recently, I’ve improved WordPress RSS feeds by preventing them from automatically converting emoji into images.

Footnotes

1 The changes apply to the Atom feed too, for anybody of such an inclination. Just assume that if I say RSS I’m including Atom, okay?

2 The experience of writing this transformation/stylesheet also gave me yet another opportunity to remember how much I hate working with XSLTs. This time around, in addition to the normal namespace issues and headscratching syntax, I had to deal with the fact that I initially tried to use a feature from XSLT version 2.0 (a 22-year-old version) only to discover that all major web browsers still only support version 1.0 (specified last millenium)!

× ×

Short-Term Blogging

There’s a perception that a blog is a long-lived, ongoing thing. That it lives with and alongside its author.1

But that doesn’t have to be true, and I think a lot of people could benefit from “short-term” blogging. Consider:

  • Photoblogging your holiday, rather than posting snaps to social media
    You gain the ability to add context, crosslinking, and have permanent addresses (rather than losing eveything to the depths of a feed). You can crosspost/syndicate to your favourite socials if that’s your poison..
Photo showing a mobile phone, held in a hand, being used to take a photograph of a rugged coastline landscape.
Photoblog your holiday and I might follow it, and I’ll do so at my convenience. Put your snaps on Facebook and I almost certainly won’t bother. Photo courtesy ArtHouse Studio.
  • Blogging your studies, rather than keeping your notes to yourself
    Writing what you learn helps you remember it; writing what you learn in a public space helps others learn too and makes it easy to search for your discoveries later.2
  • Recording your roleplaying, rather than just summarising each session to your fellow players
    My D&D group does this at levellers.blog! That site won’t continue to be updated forever – the party will someday retire or, more-likely, come to a glorious but horrific end – but it’ll always live on as a reminder of what we achieved.

One of my favourite examples of such a blog was 52 Reflect3 (now integrated into its successor The Improbable Blog). For 52 consecutive weeks my partner‘s brother Robin blogged about adventures that took him out of his home in London and it was amazing. The project’s finished, but a blog was absolutely the right medium for it because now it’s got a “forever home” on the Web (imagine if he’d posted instead to Twitter, only for that platform to turn into a flaming turd).

I don’t often shill for my employer, but I genuinely believe that the free tier on WordPress.com is an excellent way to give a forever home to your short-term blog4. Did you know that you can type new.blog (or blog.new; both work!) into your browser to start one?

What are you going to write about?

Footnotes

1 This blog is, of course, an example of a long-term blog. It’s been going in some form or another for over half my life, and I don’t see that changing. But it’s not the only kind of blog.

2 Personally, I really love the serendipity of asking a web search engine for the solution to a problem and finding a result that turns out to be something that I myself wrote, long ago!

3 My previous posts about 52 Reflect: Challenge Robin, Twatt, Brixton to Brighton by Boris Bike, Ending on a High (and associated photo/note)

4 One of my favourite features of WordPress.com is the fact that it’s built atop the world’s most-popular blogging software and you can export all your data at any time, so there’s absolutely no lock-in: if you want to migrate to a competitor or even host your own blog, it’s really easy to do so!

×

AI is lying to you

I’ve resisted writing about the current trends in AI because, well, others are already doing it better.1 But I was inspired by Garrett‘s observation that – according to the Washington Post – the C4 dataset has tokenised his personal website.

Screenshot from the Washington Post, showing a search for 'danq.me' returning 300k tokens (0.0002% of the dataset). The site is ranked 61,724.
My website is in the dataset too, but with a massive 300,000 tokens. Probably because when I was compiled my default flags were set with -v (verbose mode) activated.

Much has been said about how ChatGPT and her friends will hallucinate and mislead. Let’s take an example.

Remember that ChatGPT has almost-certainly read basically everything I’ve ever written online – it might well be better-informed about me better than you are – as you read this:

Dan asks: "Tell me about the blogger who runs DanQ.me" ChatGPT responds: "Dan Q is the blogger and web developer who runs the website "DanQ.me". He is a British computer scientist, game designer, and open-source software developer. DanQ.me is a personal website where Dan writes about a variety of topics, including technology, programming, and gaming. He is also known for creating the web-based puzzle game "Notpron," which has gained a large following online. In addition to his work on "DanQ.me" and "Notpron," Dan has contributed to several open-source software projects, including the Linux operating system and the Mozilla web browser. He has also worked as a software developer for several companies, including Google and Canonical. Dan is active on social media, particularly Twitter, where he frequently shares his thoughts on technology and other topics." The screenshot is marked up with the bits about gaming and being active on social media marked "stretching/exaggerating", the bits about Notpron, Linux, Mozilla, and Canonical being "believable lies", and the bits about Twitter and Google identified as "complete bollocks"
Given that ChatGPT has all the information it needs to talk about me accurately, it comes up with a surprising amount of crap.

When I asked ChatGPT about me, it came up with a mixture of truths and believable lies2, along with a smattering of complete bollocks.

In another example, ChatGPT hallucinates this extra detail specifically because the conversation was foreshadowed by its previous mistake. At this point, it digs its heels in and commits to its claim, like the stubborn guy in the corner of the pub who doubles-down on his bullshit.

If you were to ask at the outset who wrote Notpron, ChatGPT would have gotten it right, but because it already mis-spoke, it’s now trapped itself in a lie, incapable of reconsidering what it said previously as having been anything but the truth:

Asked to elaborate on "Notpron", ChatGPT says - Dan Q's custom "Notpron" challenge can be found at the following URL: https://www.danq.me/notpron/. This challenge features unique levels and puzzles that were created by Dan Q himself, and it is known for being particularly difficult and challenging. It has gained a large following among "Notpron" enthusiasts and puzzle game fans. If you decide to take on the challenge, be prepared to think outside the box and use your problem-solving skills to their fullest!"
Notpron is great and all, but it was written by David Münnich, not me. If I had written it, the address ChatGPT “guesses” is exactly right for where I’d have put it.

Simon Willison says that we should call this behaviour “lying”. In response to this, several people told him that the “lying” excessively anthropomorphises these chatbots, implying that they’re deliberately attempting to mislead their users. Simon retorts:

I completely agree that anthropomorphism is bad: these models are fancy matrix arithmetic, not entities with intent and opinions.

But in this case, I think the visceral clarity of being able to say “ChatGPT will lie to you” is a worthwhile trade.

I agree with Simon. ChatGPT and systems like it are putting accessible AI into the hands of the masses, and that means that the people who are using it don’t necessarily understand – nor desire to learn – the statistical mechanisms that actually underpin the AI‘s “decisions” about how to respond.

Trying to explain how and why their new toy will get things horribly wrong is hard, and it takes a critical eye, time, and practice to begin to discover how to use these tools effectively and safely.3 It’s simpler just to say “Here’s a tool; by the way, it’s a really convincing liar and you can’t trust it even a little.”

Giving people tools that will lie to them. What an interesting time to be alive!

Footnotes

1 I’m tempted to blog about my experience of using Stable Diffusion and GPT-3 as assistants while DMing my regular Dungeons & Dragons game, but haven’t worked out exactly what I’m saying yet.

2 That ChatGPT lies won’t be a surprise to anybody who’s used the system nor anybody who understands the fundamentals of how it works, but as AIs get integrated into more and more things, we’re going to need to teach a level of technical literacy about what that means, just like we do should about, say, Wikipedia.

3 For many of the tasks people talk about outsourcing to LLMs, it’s the case that it would take less effort for a human to learn how to do the task that it would for them to learn how to supervise an AI performing the task! That’s not to say they’re useless: just that (for now at least) you should only trust them to do something that you could do yourself and you’re therefore able to critically assess how well the machine did it.

× × ×

Thoughts on Editing Posts

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

I edit all my posts before I publish them – I look for poor grammar, me rambling on (something which I’m terrible for) and things like typos, although some do still get through. I think that kind of editing is fine.

But when it comes to opinion pieces, I don’t think they should be edited. Yes, you should (in my opinion) check the spelling/grammar before posting, but I don’t think you should go back and edit your opinions retrospectively if they change.

Kev speaks my mind.

At almost 25 years ago, my blog’s ancient, and covering more than half my life it inevitably includes posts that I feel don’t accurately reflect me any more (or, perhaps, didn’t reflect me well even when I wrote them!). My approach has long been that it’s okay to go back and modify a post to:

  • Correct spelling, punctuation, and grammar, or improve readability without changing the meaning.
  • Add content (in a clearly-marked way) to improve context, update information, or prepend/append hyperlinks to updated information.
  • Make changes that protect an individual (e.g. removing the name or photo of somebody who doesn’t want to be identified).

But like Kev, to me it just doesn’t seem right to change opinion pieces after your opinion changes. I’m happy to write a retraction and link to it from the original, but making-out like I never said those things in the first place seems disingenuous.

Kev links a disclaimer from his older posts; that’s an interesting idea that I might adopt.

Automattic Acquires ActivityPub Plugin for WordPress

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Automattic has acquired the ActivityPub plugin for WordPress from German developer Matthias Pfefferle, who will be joining the company to continue improving support for federated platforms. Pfefferle, who is also the author of the Webmention plugin, said his new role is to see how Automattic’s products can benefit from open protocols like ActivityPub.

This is so exciting I might burst. Want to know why?

  1. Matt Mullenweg‘s commitment to ActivityPub makes me happy. WordPress made Pingback and Trackback take off, back in the day, and I believe that – in the same way – Automattic can help make ActivityPub more accessible and mainstream too.
  2. Matthias Pfefferle is both an IndieWeb and an ActivityPub star; I use (and I’ve extented upon) a lot of code he’s written every day and I sponsor him on Github! The chance that we get to work directly together is pretty slim, but it’s a chance right?

Susan A. Kitchens expressed concern that this could increase the level of ActivityPub spam out there (which right now is very low). I worry about that too. But I’m still optimistic that we can make something awesome off the back of this acquisition and keep the interpersonal Web federated, the way it ought to be.

Kev Quirk: Ten Years of Blogging

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

If you read a lot of the “how to start a blog in 2023” type posts (please don’t ever use that title in a post) the advice will often boil down to something like:

  • Choose a niche
  • Start collecting email addresses immediately (newsletter)
  • Write on a regular schedule

I don’t do any of those things. 😂

Kev writes about what he’s learned from ten years of blogging. As a fellow long-term blogger1, I was especially pleased with his observation that, for some (many?) of us old hands, all the tips on starting a blog nowadays are things that we just don’t do, sometimes deliberately.

Like Kev, I don’t have a “niche” (I write about the Web, life, geo*ing, technology, childwrangling, gaming, work…). I’ve experimented with email subscription but only as a convenience to people who prefer to get updates that way – the same reason I push articles to Facebook – and it certainly didn’t take off (and that’s fine!). And as for writing on a regular schedule? Hah! I don’t even manage to be uniform throughout the year, even after averaging over my blog’s quarter-century2 of history.

Also like Kev, and I think this is the reason that we ignore these kinds of guides to blogging, I blog for me first and foremost. Creation is a good thing, and I take my “permission to write” and just create stuff. Not having a “niche” means that I can write about what interests me, variable as that is. In my opinion the only guide to starting a blog that anybody needs to read is Andrew Stephens“So You Want To Start An Unpopular Blog”.

And if that’s not enough inspiration for you to jump back in your time machine and party like the Web’s still in 2005, I don’t know what is.

Footnotes

1 I celebrated 20 years… a while back?

2 Fuck me, describing a blog as having about a quarter-century of history makes it sound real old.

Finger Portal to WordPress Blog

Finger Primer

The finger protocol, first standardised way back in 1977, is a lightweight directory system for querying resources on a local or remote shared system. Despite barely being used today, it’s so well-established that virtually every modern desktop operating system – Windows, MacOS, Linux etc. – comes with a copy of finger, giving it a similar ubiquity to web browsers! (If you haven’t yet, give it a go.)

If you were using a shared UNIX-like system in the 1970s through 1990s, you might run finger to see who else was logged on at the same time as you, finger chris to get more information about Chris, or finger alice@example.net to look up the details of Alice on the server example.net. Its ability to transcend the boundaries of different systems meant that it was, after a fashion, an example of an early decentralised social network!

I first actively used finger when I was a student at Aberystwyth University. The shared central computers osfa and osfb supported it in what was a pretty typical way: users could add a .plan and/or .project file to their home directory and the contents of these would be output to anybody using finger to look up that user, along with other information like what department they belonged to. I’m simulating from memory so this won’t be remotely accurate, but broadly speaking it looked a little like this –

$ finger dlq9@aber.ac.uk
Login: dlq9                           Name: Dan Q
Directory: /users/9/d/dlq9      Department: Computer Science

Project:
Working on my BEng Software Engineering.

Plan:
    _______
---'   ____)____
          ______)  Finger me!
       _____)
      (____)
---.__(___)

It’s not just about a directory of people, though: you could finger printers to see what their queues were like, finger a time server to ask what time it was, finger a vending machine to see what drinks it had available… even finger for a weather forecast where you are (this one still works as shown below; try it for your own location!) –

$ finger oxford@graph.no
        -= Meteogram for Oxford, Oxfordshire, England, United Kingdom =-
 'C                                                                   Rain (mm)
 12
 11
 10                                                         ^^^=--=--
  9^^^                                                   ===
  8   ^^^===      ======                              ^^^
  7         ======      ===============^^^         =--
  6                                       =--=-----
  5
  4
  3        |  |  |  |  |  |  |                                        1 mm
    17 18 19 20 21 22 23 18/11 02 03 04 05 06 07_08_09_10_11_12_13_14 Hour

     W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W  W Wind dir.
     6  6  7  7  7  7  7  7  6  6  6  5  5  4  4  4  4  5  6  6  5  5 Wind(m/s)

Legend left axis:   - Sunny   ^ Scattered   = Clouded   =V= Thunder   # Fog
Legend right axis:  | Rain    ! Sleet       * Snow

If you’d just like to play with finger, then finger.farm is a great starting point. They provide free finger hosting and they’re easy to use (try finger dan@finger.farm to find me!). But I had something bigger in mind…

Fingering WordPress

What if you could finger my blog. I.e. if you ran finger blog@danq.me you’d see a summary of some of my recent posts, along with additional addresses you could finger to read the full content of each. This could be the world’s first finger-to-WordPress gateway; y’know, for if you thought the world needed such a thing. Here’s how I did it:

  1. Installed efingerd; I’m using the Debian binaries.
  2. Opened a hole in the firewall on port 79 so the outside world could access it (ufw allow 1965; utf reload).
  3. The default configuration for efingerd acts like a “typical” finger server, but it’s highly programmable to make it “smarter”. I:
    1. Blanked /etc/efingerd/list to prevent any output from “listing” the server (finger @danq.me).
    2. Replaced the contents of /etc/efingerd/list and /etc/efingerd/nouser(which are run when a request matches, or doesn’t match, a user account name) with a call to my script: /usr/local/bin/finger-to-wordpress "$3". $3 holds the username that was requested, so we can act on it.
    3. Created /usr/local/bin/finger-to-wordpressa Ruby program that either (a) lists a selection of posts or (b) returns a specific post (stripping the HTML tags)

Screenshot showing this blog rendered as plain text as the result of running finger blog@danq.me at a Linux terminal.

In future, I might use some extra tags or metadata to enhance finger-friendly WordPress posts. The infrastructure’s in place already (I already have tags that I use to make certain kinds of content available only via certain media – shh!). You might rightly as what the point is of this entire enterprise, of course, and you’d be well within your rights to ask such a question. But I think the best answer available is “because Dan”.

Screenshot showing this blog post rendered as plain text as the result of running finger wp-finger@danq.me at a Linux terminal.

If you want to see my blog in a whole new way, give it a go: run finger blog@danq.me on your computer and follow the instructions.

× ×

Reply to An Accidentally ‘Anti-January’ January

Siobhan said:

I fell off the blogging bandwagon just as everyone else was hopping onto it.

I was just thinking the same! It felt like everybody and their dog did Bloganuary last month, meanwhile I went in the opposite direction!

Historically, there’s been an annual dip in my blogging around February/March (followed by a summer surge!), but in recent years it feels like that hiatus has shifted to January (I haven’t run the numbers yet to be sure, though).

I don’t think that’s necessarily a problem, for me at least. I write for myself first, others afterwards, and so it follows that if I blog when it feels right then an ebb-and-flow to my frequency ought to be a natural consequence. But it still interests me that I have this regular dip, and I wonder if it affects the quality of my writing in any way. I feel the pressure, for example, for post-hiatus blogging to have more impact the longer it’s been since I last posted! Like: “it’s been so long, the next thing I publish has to be awesome, right?”, as if my half-dozen regular readers are under the assumption that I’m always cooking-up something and the longer it’s been, the better it’s going to be.

Making an RSS feed of YOURLS shortlinks

As you might know if you were paying close attention in Summer 2019, I run a “URL shortener” for my personal use. You may be familiar with public URL shorteners like TinyURL and Bit.ly: my personal URL shortener is basically the same thing, except that only I am able to make short-links with it. Compared to public ones, this means I’ve got a larger corpus of especially-short (e.g. 2/3 letter) codes available for my personal use. It also means that I’m not dependent on the goodwill of a free siloed service and I can add exactly the features I want to it.

Diagram showing the relationships of the DanQ.me ecosystem. Highlighted is the injection of links into the "S.2" link shortener and the export of these shortened links by RSS into FreshRSS.
Little wonder then that my link shortener sat so close to me on my ecosystem diagram the other year.

For the last nine years my link shortener has been S.2, a tool I threw together in Ruby. It stores URLs in a sequentially-numbered database table and then uses the Base62-encoding of the primary key as the “code” part of the short URL. Aside from the fact that when I create a short link it shows me a QR code to I can easily “push” a page to my phone, it doesn’t really have any “special” features. It replaced S.1, from which it primarily differed by putting the code at the end of the URL rather than as part of the domain name, e.g. s.danq.me/a0 rather than a0.s.danq.me: I made the switch because S.1 made HTTPS a real pain as well as only supporting Base36 (owing to the case-insensitivity of domain names).

But S.2’s gotten a little long in the tooth and as I’ve gotten busier/lazier, I’ve leant into using or adapting open source tools more-often than writing my own from scratch. So this week I switched my URL shortener from S.2 to YOURLS.

Screenshot of YOURLS interface showing Dan Q's list of shortened links. Six are shown of 1,939 total.
YOURLs isn’t the prettiest tool in the world, but then it doesn’t have to be: only I ever see the interface pictured above!

One of the things that attracted to me to YOURLS was that it had a ready-to-go Docker image. I’m not the biggest fan of Docker in general, but I do love the convenience of being able to deploy applications super-quickly to my household NAS. This makes installing and maintaining my personal URL shortener much easier than it used to be (and it was pretty easy before!).

Another thing I liked about YOURLS is that it, like S.2, uses Base62 encoding. This meant that migrating my links from S.2 into YOURLS could be done with a simple cross-database INSERT... SELECT statement:

INSERT INTO yourls.yourls_url(keyword, url, title, `timestamp`, clicks)
  SELECT shortcode, url, title, created_at, 0 FROM danq_short.links

But do you know what’s a bigger deal for my lifestack than my URL shortener? My RSS reader! I’ve written about it a lot, but I use RSS for just about everything and my feed reader is my first, last, and sometimes only point of contact with the Web! I’m so hooked-in to my RSS ecosystem that I’ll use my own middleware to add feeds to sites that don’t have them, or for which I’m not happy with the feed they provide, e.g. stripping sports out of BBC News, subscribing to webcomics that don’t provide such an option (sometimes accidentally hacking into sites on the way), and generating “complete” archives of series’ of posts so I can use my reader to track my progress.

One of S.1/S.2’s features was that it exposed an RSS feed at a secret URL for my reader to ingest. This was great, because it meant I could “push” something to my RSS reader to read or repost to my blog later. YOURLS doesn’t have such a feature, and I couldn’t find anything in the (extensive) list of plugins that would do it for me. I needed to write my own.

Partial list of Dan's RSS feed subscriptions, including Jeremy Keith, Jim Nielson, Natalie Lawhead, Bruce Schneier, Scott O'Hara, "Yahtzee", BBC News, and several podcasts, as well as (highlighted) "Dan's Short Links", which has 5 unread items.
In some ways, subscribing “to yourself” is a strange thing to do. In other ways… shut up, I’ll do what I like.

I could have written a YOURLS plugin. Or I could have written a stack of code in Ruby, PHP, Javascript or some other language to bridge these systems. But as I switched over my shortlink subdomain s.danq.me to its new home at danq.link, another idea came to me. I have direct database access to YOURLS (and the table schema is super simple) and the command-line MariaDB client can output XML… could I simply write an XML Transformation to convert database output directly into a valid RSS feed? Let’s give it a go!

I wrote a script like this and put it in my crontab:

mysql --xml yourls -e                                                                                                                     \
      "SELECT keyword, url, title, DATE_FORMAT(timestamp, '%a, %d %b %Y %T') AS pubdate FROM yourls_url ORDER BY timestamp DESC LIMIT 30" \
    | xsltproc template.xslt -                                                                                                            \
    | xmllint --format -                                                                                                                  \
    > output.rss.xml

The first part of that command connects to the yourls database, sets the output format to XML, and executes an SQL statement to extract the most-recent 30 shortlinks. The DATE_FORMAT function is used to mould the datetime into something approximating the RFC-822 standard for datetimes as required by RSS. The output produced looks something like this:

<?xml version="1.0"?>
<resultset statement="SELECT keyword, url, title, timestamp FROM yourls_url ORDER BY timestamp DESC LIMIT 30" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <row>
        <field name="keyword">VV</field>
        <field name="url">https://webdevbev.co.uk/blog/06-2021/perfect-is-the-enemy-of-good.html</field>
        <field name="title"> Perfect is the enemy of good || Web Dev Bev</field>
        <field name="timestamp">2021-09-26 17:38:32</field>
  </row>
  <row>
        <field name="keyword">VU</field>
        <field name="url">https://webdevlaw.uk/2021/01/30/why-generation-x-will-save-the-web/</field>
        <field name="title">Why Generation X will save the web  Hi, Im Heather Burns</field>
        <field name="timestamp">2021-09-26 17:38:26</field>
  </row>

  <!-- ... etc. ... -->
  
</resultset>

We don’t see this, though. It’s piped directly into the second part of the command, which  uses xsltproc to apply an XSLT to it. I was concerned that my XSLT experience would be super rusty as I haven’t actually written any since working for my former employer SmartData back in around 2005! Back then, my coworker Alex and I spent many hours doing XML backflips to implement a system that converted complex data outputs into PDF files via an XSL-FO intermediary.

I needn’t have worried, though. Firstly: it turns out I remember a lot more than I thought from that project a decade and a half ago! But secondly, this conversion from MySQL/MariaDB XML output to RSS turned out to be pretty painless. Here’s the template.xslt I ended up making:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="resultset">
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
      <channel>
        <title>Dan's Short Links</title>
        <description>Links shortened by Dan using danq.link</description>
        <link> [ MY RSS FEED URL ] </link>
        <atom:link href=" [ MY RSS FEED URL ] " rel="self" type="application/rss+xml" />
        <lastBuildDate><xsl:value-of select="row/field[@name='pubdate']" /> UTC</lastBuildDate>
        <pubDate><xsl:value-of select="row/field[@name='pubdate']" /> UTC</pubDate>
        <ttl>1800</ttl>
        <xsl:for-each select="row">
          <item>
            <title><xsl:value-of select="field[@name='title']" /></title>
            <link><xsl:value-of select="field[@name='url']" /></link>
            <guid>https://danq.link/<xsl:value-of select="field[@name='keyword']" /></guid>
            <pubDate><xsl:value-of select="field[@name='pubdate']" /> UTC</pubDate>
          </item>
        </xsl:for-each>
      </channel>
    </rss>
  </xsl:template>
</xsl:stylesheet>

That uses the first (i.e. most-recent) shortlink’s timestamp as the feed’s pubDate, which makes sense: unless you’re going back and modifying links there’s no more-recent changes than the creation date of the most-recent shortlink. Then it loops through the returned rows and creates an <item> for each; simple!

The final step in my command runs the output through xmllint to prettify it. That’s not strictly necessary, but it was useful while debugging and as the whole command takes milliseconds to run once every quarter hour or so I’m not concerned about the overhead. Using these native binaries (plus a little configuration), chained together with pipes, had already resulted in way faster performance (with less code) than if I’d implemented something using a scripting language, and the result is a reasonably elegant “scratch your own itch”-type solution to the only outstanding barrier that was keeping me on S.2.

All that remained for me to do was set up a symlink so that the resulting output.rss.xml was accessible, over the web, to my RSS reader. I hope that next time I’m tempted to write a script to solve a problem like this I’ll remember that sometimes a chain of piped *nix utilities can provide me a slicker, cleaner, and faster solution.

Update: Right as I finished writing this blog post I discovered that somebody had already solved this problem using PHP code added to YOURLS; it’s just not packaged as a plugin so I didn’t see it earlier! Whether or not I use this alternate approach or stick to what I’ve got, the process of implementing this YOURLS-database ➡ XML ➡  XSLTRSS chain was fun and informative.

× × ×

Note #19508

When do I #blog? A breakdown of my top four #indieweb content types – articles, reposts, checkins and notes – by month.

Blog posts by type and month, a graph showing how articles barely change throughout the year but reposts are highest in the first half of the year, checkins peak in the spring and summer, and notes get a big boost in November.

Unsurprisingly my checkins, which represent #geocaching/#geohashing activity, grow in the spring and peak in the summer when the weather’s better!

At first I assumed the notes peak in November might have been thrown off by a single conference, e.g. musetech, but it turns out I’ve just done more note-friendly things in Novembers, like Challenge Robin II and my Cape Town meetup, which are enough to throw the numbers off.

×

Reply to “I was permanently banned by Facebook”

Donncha Ó Caoimh said:

My Facebook account was permanently banned on Wednesday along with all the people who take care of the Cork Skeptics page. We’re still not sure why but it might have something to do with the Facebook algorithm used to detect far-right conspiracy groups.

If you have a Facebook account you should download your information too because it could happen to you too, even though you did nothing wrong. Go here and click the “Create File” button now.

Yeah, I know you won’t do it but you really should.

Great advice.

After I got banned from Facebook in 2011 (for using a “fake name”, which is actually my real name) I took a similar line of thinking: I can’t trust Facebook (or Twitter, or Instagram, or whoever else) to be responsible custodians of my content, so I shan’t. Now, virtually all content I create is hosted on my WordPress-powered blog, at my own domain, first and foremost… and syndicated copies may appear on various social media.

In a very few instances I go the other way around, producing content in silos and then copying it back to my blog: e.g. my geocaching/geohashing expeditions are posted first to their respective sites (because it’s easiest and most-practical to do that using their apps, especially “in the field”), but then they get imported into my blog using a custom plugin. If any of these sites closes, deletes my data, adds paid tiers I’m not happy with, or just bans me from my own account… I’m still set.

Backing up all your social content is a good strategy. Owning it all to begin with is an even better one, IMHO. See also: Indieweb.

Thames Path 2

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

On our first day‘s walking along the Thames Path, Robin and I had trouble finding any evidence of water for some time. On our second day, we did not have this problem.

After weeks of sustained rain, the fields we walked over as we left Cricklade behind were extremely soggy. On our way out of town we passed Cricklade Millennium Wood, I took a picture for the purpose of mocking it for being very small but later discovered it’s too small to appear on Google Maps and became oddly defensive of it – it’s trying, damn it, we should at least acknowledge its existence.

Ruth and her brother Robin (of Challenge Robin/Challenge Robin II fame on this blog, among many other crazy adventures) have taken it upon themselves to walk the entirety of the Thames Path from the source of the river (or rather, one of the many symbolic sources) to the sea, over the course of a series of separate one-day walks. I’ve mostly been acting as backup-driver so far, but I might join them for a leg or two later on.

In any case, Ruth’s used it as a welcome excuse to dust off her blog and write about the experience, and it’s fun and delightful and you should follow along and give her a digital cheer. The first part is here; the second part landed yesterday.

Celebrate Seventeen

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

May 27th, 17 years ago, the first release of WordPress was put into the world by Mike Little and myself. It did not have an installer, upgrades, WYSIWYG editor (or hardly any Javascript), comment spam protection, clean permalinks, caching, widgets, themes, plugins, business model, or any funding.

Seventeen years ago, WordPress was first released.

Sixteen years, eleven months ago, I relaunched I relaunched my then-dormant blog. I considered WordPress/b2/cafelog, but went with a now-dead engine called Flip instead.

Fifteen years, ten months ago, in response to a technical failure on the server I was using, I lost it all and had to recover my posts from backups. Immediately afterwards, I took the opportunity to redesign my blog and switch to WordPress. On the same day, I attended the graduation ceremony for my first degree (but somehow didn’t think this was worth blogging about).

Fifteen years, nine months ago, Automattic Inc. was founded to provide managed WordPress hosting services. Some time later, I thought to myself: hey, they seem like a cool company, and I like everything Matt’s done so far. I should perhaps work there someday.

Lots of time passed.

Seven months ago, I got around to doing that.

Happy birthday, WordPress!

Thursday

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Today is Thursday. YOU MUST NOT GO OUTSIDE!

  1. Ordnance Survey, the national mapping agency for Great Britain, is set to publish revised maps for the whole of the British isles in the wake of social distancing measures. The new 1:50,000 scale maps are simply a revision of the older 1:25,000 scale map but all geographical features have been moved further apart.

Gareth’s providing a daily briefing including all the important things that the government wants you to know about the coronavirus crisis… and a few things that they didn’t think to tell you, but perhaps they should’ve. Significantly more light-hearted than wherever you’re getting your news from right now.

Happy Birthday, You’re Fired

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

I walked the dog, bought myself a wax jacket from a local garden centre – so I could feel more ‘Country’ – and in the late afternoon my boss called me to say… you’re fired.

So begins Robin‘s latest blog (you may remember I’ve shared, talked about, and contributed to the success of his previous blogging adventure, 52 Reflect). This time around, it sort-of reads like a contemporaneously-written “what I did on my holidays” report, except that it’s pretty-much about the opposite of a holiday: it chronicles Robin’s adventures in the North of England during these strange times.

I’ve no doubt that this represents the start of another riotous series of posts and perhaps exactly what you need to lift your spirits in these trying times. A second chapter is already online.