BBC News RSS… your way!

It turns out my series of efforts to improve the BBC News RSS feeds are more-popular than I thought. People keep asking for variants of them, and it’s probably time I stopped hosting the resulting feeds on my NAS (which does a good job, but it’s in a highly-kickable place right under my desk).

Screenshot of BBC News RSS Feeds (that don't suck!).
The new site isn’t pretty. But it works.

So I’ve launched BBC-Feeds.DanQ.dev. On a 20-minute schedule, it generates both UK and World editions of the BBC News feeds, filtered to remove iPlayer, Sounds, app “nudges”, duplicates, and other junk, and optionally with the sports news filtered out too.

The entire thing is open source under an ultra-permissive license, so you can run your own copy if you don’t want to use mine.

Enjoy!

Blog Questions Challenge

Since Kev Quirk made an adaptation of Ava‘s Blog Questions Challenge I’ve been seeing it everywhere in my blogosphere circle. I’ve gotta be the last person left on Earth to do it, but it has that old-school pass-it-along meme feel, like that 2006 one about describing your friends. I’ve not been tagged by name, but both Jeremy and Garrett did a broad “you” tag, so I’m taking it.

Why did you start blogging in the first place?

It felt like a natural evolution of my second vanity-site. It was 1998, and my site – Castle of the Four Winds – was home to a selection of the same kinds of random crap that everybody put on their homepages at the time. I figured I’d start keeping an online diary: the word “blog” hadn’t been coined yet, and its predecessor “weblog” had only been around for a year and I hadn’t come across it.

So I experimentally started posting a few times a week.

Castle of the Four Winds in early 1999: a very-90s website of white and red text on black, with Times New Roman text, a flaming hit counter, and a blue ribbon campaign button.
I don’t have many of my posts from 1998, but I know from other records that my first deliberately “blog”-like post was on 27 September. But the posts shown in this screenshot, from January 1999, survive and can still be read here1.
By the way, if you liked how my site looked back in the 1990s, you can wind the clock back! Give it a go!

What platform are you using to manage your blog and why did you choose it? Have you blogged on other platforms before?

1998: Static HTML and a bit of Perl

When I started blogging my site was almost entirely plain HTML2. So my original “platform” was probably Emacs.

2000: Static files indexed by PHP

In the Summer of 2000 I registered avangel.com and moved my diary there. I was still storing posts in static files, but used PHP wrappers to share the structure and menus across the pages. It was a massive improvement.

Later, I moved everything to the (ill-advised?) domain name scatmania.org and reimplemented in pretty-much the same way. Until…

2003: Flip

The first real “blogging engine” I used was Flip.

The first version of Scatmania.org: a Flip-powered weblog.
Flip was a bit of a pain to theme, which is why my Flip-powered blog looked quite a bit like most other Flip-powered blogs.

I liked Flip3: it had a raw simplicity that I’d later come to love in young versions of WordPress. And being able to edit from the Web was a huge improvement over having to edit files, especially when I was out and about: I managed to post from my dad’s BlackBerry while cycling across the Outer Hebrides, for example.

2004: WordPress

I’d have outgrown Flip eventually, but I got a nudge in that direction in July 2004. At the time, I was sharing a server with some friends and operated by Gareth, and something went wrong and the server went completely offline. The co-located server disappeared back to Gareth’s house, eventually, and while I’d recovered many of the posts from my own backups, 61 posts remain partially-incomplete to this day (if you happen to have a copy of any of them I’d love to see it!).

I brought my blog back online using WordPress, whose then-new release version 1.2 included an RSS-powered importer: this allowed me to write a little code to convert my entire previous archive into a fat RSS file and then import it wholesale. WordPress was, as remains, pretty magical – a universal blogging platform that evolved into a universal CMS – and I back in the day I occasionally argued online with Matt about technical aspects of the future direction of the project4.

Scatmana.org version 2 - now with actual web design
Those drop-shadows! Those gradients! Those naked hyperlinks differentiated only by being a slightly different colour! That aggressive use of sans-serif fonts with expanded line-heights! Those RSS links, front-and-centre! The only thing that could make this more-obviously “Web 2.0” would be the addition of a wonky “beta” star in the corner.

Incidentally, if you’d like to see more of my blog’s design history over the last 26+ years, I shared a lot of screenshots back in 2018.

If you didn’t know better, you might well not know I’m running WordPress. My theme and custom plugins are… well, they’re an ecosystem all by themselves. And that’s before you even get to things like CapsulePress, my WordPress-to-Gopher/Gemini/Spartan/Nex bridge, the pile of scripts I use to sync-up with the Fediverse, the PWA I use to post notes while I’m on the move, and so on.

2025: ClassicPress

Earlier this year I experimentally switched to ClassicPress; a fork of WordPress. There’ll doubtless be lots more to say about that, down the line5, but here’s the skinny: I don’t use Gutenberg on my blog anyway6, I appreciate having my backend be almost as high-performance as I’ve worked to make my frontend, and I enjoy most of the feature differences7.

How do you write your posts? For example, in a local editing tool, or in a panel/dashboard that’s part of your blog?

With the exception of notes (most of which are written in a tool of my own creation and then pushed to one or both of my Mastodon and my blog simultaneously), I mostly write right into the WordPress/ClassicPress post editor.

I often write ideas, concepts, and first drafts into my Obsidian notebook and then copy/paste out when the time comes.

When do you feel most inspired to write?

There’s no particular pattern, though it feels like I’m most-inspired to write exactly when I should be prioritising something else! That’s why it’s so helpful to be able to write three sentences into Obsidian and then come back to it later!

I’ve been on a bit of a blogging kick these last few years, though. Last year I wrote a massive 436 posts, although that admittedly includes PESOS‘d checkins from geocaching and geohashing expeditions. I’m a fan of Kev’s #100DaysToOffload challenge, and I’m on course to achieve it earlier than ever before, this year (my sixth consecutive year: I do the challenge strictly by calendar years!), as this post is already by 48th… all within the first 38 days of this year8.

Do you publish immediately after writing, or do you let it simmer a bit as a draft?

A mixture of both. Probably most of my posts are written in a single sitting… or, at least, are written in a tab that stays open for the entire time during which it’s written.

But others spend a long time in-progress. You remember how almost a year ago I gave a talk about why Oxford’s area code is 01865? And I promised that there’d be a blog/vlog/maybe-podcast version of that talk later? Yeah: that’s been 90%-there and sitting in a draft pretty-much since then, just waiting for me to make the finishing touches (and record the vlog/podcast variants, if that’s the direction I decide to go in).

And I’ve dusted off drafts that’ve been much older than that, before, too. So it really is a mixture.

What’s your favourite post on your blog?

I couldn’t pick out a favourite that I wouldn’t change my mind about five minutes later. But a recent favourite might have been last Spring’s “Let Your Players Lead The Way”, which aimed to impart some of the things I’ve learned about gamemastering (especially) while being the dungeon master for The Levellers these last few years9.

Not only was it a post that had been a long time coming, and based on months of drafts and re-drafts, but also I really enjoyed writing some post-specific CSS to give it just a slightly more-magical feel.

Screenshot from Let Your Players Lead The Way, showcasing its design in the style of the D&D Players' Handbook.
The downside is that I’ve now got one more thing to try not to break the next time I re-write my blog’s stylesheet.

Any future plans for your blog? Maybe a redesign, a move to another platform, or adding a new feature?

I want to redesign the homepage to be simpler, less-graphical, and more-informational. I’m not sure how that’s going to look, yet.

I’ve been wondering about integrating some of my personal-geotracking into the design (Aaron Parecki does an amazing job of this with his dynamic site background image, for example).

I’m playing with the idea of adding a guestbook, like it’s 1998 again or something.

I’d like to tidy up my tagging taxonomy, and I’m not convinced AI is up to the task.

I need to decide how I feel about the emoji reactions feature I added in 2023. I’m still undecided. What do you think? 👍? 👎?

And as I mentioned: I’m experimenting with ClassicPress. It’s working out mostly-okay so far, but that’s a story for another post.

Next?

I feel like I’m the last person in the universe to do this quiz. But if you haven’t – and you have anything approximating a blog – then you should go next.

Footnotes

1 I wouldn’t recommend actually reading my older posts, though. I was a teenager, and it shows.

2 I had a slightly-fancier kind of hosting, by this point, that gave me a cgi-bin directory into which I could compile binaries (in C) or write scripts (in Perl). My hit counter? That was a Perl script I adapted from Matt Wright’s counter.pl and “enhanced” with some flaming text using Corel Photo-Paint.

3 While writing this post, I hunted down the original developer of Flip. He seems cool.

4 A year later he launched WordPress.com, which then evolved into the foundation of Automattic, and there soon came a point where I thought “I should work there, someday!” It took me a further 14 years before I applied for such a job, though.

5 Right off the bat, though, let me stress that trying ClassicPress is absolutely nothing to do with the drama in the WordPress space right now: in fact I’ve been planning to give it a try ever since the project got its shit together, re-forked WordPress, and released ClassicPress 2.0 a year ago.

6 I don’t have anything against Gutenberg – I use it on other blogs, and every day at work! – and Block Themes are magical… but I’ve never found any benefit to them here: I’ve no need for it, and I’ve got plugins I’ve written for my own use that I’ve never bothered to make Gutenberg-compatible.

7 My biggest gripe with ClassicPress so far is that in removing the jQuery dependency on the post editor’s tag selector they’ve only replaced it with a <datalist>, which is neat and all but kills the ability to autocomplete multiple comma-separated tags at once. But it looks like that’s getting fixed, so I’m going to hang in there for a bit before I decide whether I’m sticking with ClassicPress or not.

8 I’ll save you from doing the maths: if I complete 48 posts in 38 days, I’d expect to complete 100 posts on my 80th day: as it’s not a leap year, that would be Friday 21 March 2025. Let’s see how I get on!

9 Although I’ve been horribly neglecting them for the last couple of months, for various reasons.

× × ×

God’s Adviser

Duration

Podcast Version

This post is also available as a podcast. Listen here, download for later, or subscribe wherever you consume podcasts.

Of all the discussions I’ve ever been involved with on the subject of religion, the one I’m proudest of was perhaps also one of the earliest.

Let me tell you about a time that, as an infant, I got sent out of my classroom because I wouldn’t stop questioning the theological ramifications of our school nativity play.

I’m aware that I’ve got readers from around the world, and Christmas traditions vary, so let’s start with a primer. Here in the UK, it’s common1 at the end of the school term before Christmas for primary schools to put on a “nativity play”. A group of infant pupils act out an interpretation of the biblical story of the birth of Jesus: a handful of 5/6-year-olds playing the key parts of, for example, Mary, Joseph, an innkeeper, some angels, maybe a donkey, some wise men, some shepherds, and what-have-you.

A group of children dressed as Mary, Joseph, shepherds, kings, angels, and a variety of barn animals crowd a stage.
Maybe they’re just higher-budget nowadays, or maybe I grew up in a more-deprived area, but I’m pretty sure than when I was a child a costume consisted mostly of a bedsheet if you were an angel, a tea-towel secured with an elastic band if you were a shepherd, a cardboard crown if you were a king, and so on. Photo courtesy Ian Turk.

As with all theatre performed by young children, a nativity play straddles the line between adorable and unbearable. Somehow, the innkeeper – who only has one line – forgets to say “there is no room at the inn” and so it looks like Mary and Joseph just elect to stay in the barn, one of the angels wets herself in the middle of a chorus, and Mary, bored of sitting in the background having run out of things to do, idly swings the saviour of mankind round and around, holding him by his toe. It’s beautiful2.

I was definitely in a couple of different nativity plays as a young child, but one in particular stands out in my memory.

Dan, as a young child, superimposed against a background with a star, with a speech bubble reading 'Hark! A star! In the East!'
“Let us go now to Bethlehem. The son of God is born today.”

In order to put a different spin on the story of the first Christmas3, one year my school decided to tell a different, adjacent story. Here’s a summary of the key beats of the plot, as I remember it:

  • God is going to send His only son to Earth and wants to advertise His coming.
  • “What kind of marker can he put in the sky to lead people to the holy infant’s birthplace?”, He wonders.
  • So He auditions a series of different natural phenomena:
    • The first candidate is a cloud, but its pitch is rejected because… I don’t remember: it’ll blow away or something.
    • Another candidate was a rainbow, but it was clearly derivative of an earlier story, perhaps.
    • After a few options, eventually God settles on a star. Hurrah!
  • Some angels go put the star in the right place, shepherds and wise men go visit Mary and her family, and all that jazz.

So far, totally on-brand for a primary school nativity play but with 50% more imagination than the average. Nice.

Edited watercolour of magi crossing a desert on camels, with a large meteor inserted into the sky.
What the Meteor Strike of Bethlehem lacked in longevity, it made up for in earth-shattering destruction.

I was cast as Adviser #1, and that’s where things started to go wrong.

The part of God was played by my friend Daniel, but clearly our teacher figured that he wouldn’t be able to remember all of his lines4 and expanded his role into three: God, Adviser #1, and Adviser #2. After each natural phenomenon explained why it would be the best, Adviser #1 and Adviser #2 would each say a few words about the candidate’s pros and cons, providing God with the information He needed to make a decision.

To my young brain, this seemed theologically absurd. Why would God need an adviser?5

“If He’s supposed to be omniscient, why does God need an adviser, let alone two?” I asked my teacher6.

The answer was, of course, that while God might be capable of anything… if the kid playing Him managed to remember all of his lines then that’d really be a miracle. But I’d interrupted rehearsals for my question and my teacher Mrs. Doyle clearly didn’t want to explain that in front of the class.

But I wouldn’t let it go:

  • “But Miss, are we saying that God could make mistakes?”
  • “Couldn’t God try out the cloud and the rainbow and just go back in time when He knows which one works?”
  • “Why does God send an angel to tell the shepherds where to go but won’t do that for the kings?”
  • “Miss, don’t the stars move across the sky each night? Wouldn’t everybody be asking questions about the bright one that doesn’t?”
  • “Hang on, what’s supposed to have happened to the Star of Bethlehem after God was done with it? Did it have planets? Did those planets… have life?”

In the end I had to be thrown out of class. I spent the rest of that rehearsal standing in the corridor.

And it was totally worth it for this anecdote.

Footnotes

1 I looked around to see if the primary school nativity play was still common, or if the continuing practice at my kids’ school shows that I’m living in a bubble, but the only source I could find was a 2007 news story that claims that nativity plays are “under threat”… by The Telegraph, who I’d expect to write such a story after, I don’t know, the editor’s kids decided to put on a slightly-more-secular play one year. Let’s just continue to say that the school nativity play is common in the UK, because I can’t find any reliable evidence to the contrary.

2 I’ve worked onstage and backstage on a variety of productions, and I have nothing but respect for any teacher who, on top of their regular workload and despite being unjustifiably underpaid, volunteers to put on a nativity play. I genuinely believe that the kids get a huge amount out of it, but man it looks like a monumental amount of work.

3 And, presumably, spare the poor parents who by now had potentially seen children’s amateur dramatics interpretations of the same story several times already.

4 Our teacher was probably correct.

5 In hindsight, my objection to this scripting decision might actually have been masking an objection to the casting decision. I wanted to play God!

6 I might not have used the word “omniscient”, because I probably didn’t know the word yet. But I knew the concept, and I certainly knew that my teacher was on spiritually-shaky ground to claim both that God knew everything and God needed an advisor.

× × ×

Sabbatical Lesson #2: Burnout

If the most-important lesson I learned from my sabbatical was about boundaries and my work/life balance, then the second most-important was about burnout.

A matchstick, burned almost to the end.
Once all the matches have been burned, you can’t use them to light any more fires. It’s not the best metaphor, but it’s the one you’re getting.

If I were anybody else, you might reasonably expect me to talk about work-related burnout and how a sabbatical helped me to recover from it. But in a surprise twist1, my recent brush with burnout came during my sabbatical.

Somehow, I stopped working at my day job… and instead decided to do so much more voluntary work during my newly-empty daytimes – on top of the evening and weekend volunteering I was already doing – that just turned out to be… too much. I wrote a little about it at the time in a post for RSS subscribers only, mostly as a form of self-recognition: patting myself on the back for spotting the problem and course-correcting before it got worse!

When I got back to work2, I collared my coach to talk about this experience. It was one of those broadening “oh, so that’s why I’m like this” experiences:

The why of how I, y’know, got off course at the end of last year and drove myself towards an unhealthy work attitude… is irrelevant, really. But the actual lesson here that I took from my sabbatical is: just because you’re not working in a conventional sense doesn’t make you immune from burnout. Burnout happens when you do too much, for too long, without compassion for yourself and your needs

I dodged it at the end of November, but that doesn’t mean I’ll always be able to, so this is exactly the kind of thing a coach is there to help with!

Footnotes

1 Except to people who know me well at all, to whom this post might not be even remotely surprising.

2 Among the many delightful benefits to my job is a monthly session with my choice of coach. I’ve written a little about it before, but the short of it is that it’s an excellent perk.

×

BBC News RSS… with the sport?

Earlier today, somebody called Allan commented on the latest in my series of several blog posts about how I mutilate manipulate the RSS feeds of BBC News to work around their (many, and increasingly so) various shortcomings, specifically:

  1. Their inclusion of non-news content such as plugs for iPlayer and their apps,
  2. Their repeating of identical news stories with marginally-different GUIDs, and
  3. All of the sports news, which I don’t care about one jot.

Well, it turns out that some people want #3: the sport. But still don’t want the other two.

FreshRSS screenshot with many unread items, but focussing on a feed called "BBC News (with sport)" and showing a story titled: 'How England Golf's yellow cards are tackling blight of slow play'
Some people actually want to read this crap, apparently.

I shan’t be subscribing to this RSS feed, and I can’t promise I’ll fix it if it gets broken. But if “without the crap, but with the sports” is the way you like your BBC News RSS feed, I’ve got you covered:

So there you go, Allan, and anybody in a similar position. I hope that fulfils your need for sports news… without the crap.

×

Cafe Proximity Principles

Possible future presentation concept: using a cafe/dining metaphor to help explain the proximity principle in user interface design (possibly with a “live waitstaffing” demo?).

Great idea? Or stupid idea?

Photo looking down on a square wooden table in a cafe environment. Far from the photographer, a plate containing a couple of crumbs is pushed far to the other side of the table. Closer to him is a cup of coffee, two folded napkins, and an open MacBook. Arrows and captions draw attention to the relative distances between the components of the scene, labelling them as follows - "I am finished with this plate..." (refering to its relative distance), "...but I'm keeping this napkin" (which is closer to the coffee than the discarded plate), "I'm keeping the napkin while I finish my drink" (the two are close together), and "I'm working" (based on the relative position of the photographer and his laptop).

×

Yr Wyddfa’s First Email

On Wednesday, Vodafone announced that they’d made the first ever satellite video call from a stock mobile phone in an area with no terrestrial signal. They used a mountain in Wales for their experiment.

It reminded me of an experiment of my own, way back in around 1999, which I probably should have made a bigger deal of. I believe that I was the first person to ever send an email from the top of Yr Wyddfa/Snowdon.

Nowadays, that’s an easy thing to do. You pull your phone out and send it. But back then, I needed to use a Psion 5mx palmtop, communicating over an infared link using a custom driver (if you ever wondered why I know my AT-commands by heart… well, this isn’t exactly why, but it’s a better story than the truth) to a Nokia 7110 (fortunately it was cloudy enough to not interfere with the 9,600 baud IrDA connection while I positioned the devices atop the trig point), which engaged a GSM 2G connection, over which I was able to send an email to myself, cc:’d to a few friends.

It’s not an exciting story. It’s not even much of a claim to fame. But there you have it: I was (probably) the first person to send an email from the summit of Yr Wyddfa. (If you beat me to it, let me know!)

Trump’s Strategy

Portrait of Donald Trump in the style of the DOS version of 1991 strategy video game Sid Meier's Civilization, saying: "The Gulf of Mexico shall now be known as the Gulf of America."

Pixel art portrait of Claudia Sheinbaum in Civilization-style, against a background reminiscent of Mexico City, saying: "You're fucking kidding, right?"

Pixel-art of convicted felon Donald Trump, now saying: "We're also renaming the State of Canada, the State of Greenland, and the State of Panama. They all belong to me now."

Newspaper in the style of Sid Meier's Civilization, titled The Final Broadsheet, published February 1 2025 AD. The headline is "Donald Trump claims domination victory", with the subheadline "Google Maps now shows 100% of world is part of America". Below is the pixel-art portrait of Trump and a pixel-art world map with a stars-and-stripes pattern applied to the entire world. A pullquote reads "That's not how any of this works! - rest of world". A secondary story has the headline "Putin, Netanyahu, demand do-over: 'We didn't realise you could invade places by just renaming them yours!'".

What do you reckon? Is he trying to go for a domination victory without ever saying “MY THREATS ARE BACKED BY NUCLEAR WEAPONS!”? His track record shows that he’s arrogant enough to think that the strategy of simply renaming things until they’re yours is actually viable!

After I saw Mexico’s response to Google following Trump’s lead in renaming the Gulf of Mexico, this stupid comic literally came to me in a dream.

Adapts screenshots from Sid Meier’s Civilization (1991 DOS version), public domain assets from OpenGameArt.org, and AI-assisted images of world leaders on account of the fact that if I drew pixel-art world leaders without assistance then you’d be even less-likely to be able to recognise them.

Can AI retroactively fix WordPress tags?

I’ve a notion that during 2025 I might put some effort into tidying up the tagging taxonomy on my blog. There’s a few tags that are duplicates (e.g. ai and artificial intelligence) or that exhibit significant overlap (e.g. dog and dogs), or that were clearly created when I speculated I’d write more on the topic than I eventually did (e.g. homa night, escalators1, or nintendo) or that are just confusing and weird (e.g. not that bacon sandwich picture).

Cloud-shaped wordcloud of tags used on DanQ.me, sized by frequency. The words "geocaching" and "cache log" dominate the centre of the picture, with terms like "fun", "funny", "geeky", "news", "video games", "technology", "video", and "review" a close second.
Not an entirely surprising word cloud of my tag frequency, given that all of my cache logs are tagged “cache log” and “geocaching”, and relatively-generic terms like “technology”, “fun”, and “funny” appear all over the place on my blog.

Retro-tagging with AI

One part of such an effort might be to go back and retroactively add tags where they ought to be. For about the first decade of my blog, i.e. prior to around 2008, I rarely used tags to categorise posts. And as more tags have been added it’s apparent that many old posts even after that point might be lacking tags that perhaps they ought to have2.

I remain sceptical about many uses of (what we’re today calling) “AI”, but one thing at which LLMs seem to do moderately well is summarisation3. And isn’t tagging and categorisation only a stone’s throw away from summarisation? So maybe, I figured, AI could help me to tidy up my tagging. Here’s what I was thinking:

  1. Tell an LLM what tags I use, along with an explanation of some of the quirkier ones.
  2. Train the LLM with examples of recent posts and lists of the tags that were (correctly, one assumes) applied.
  3. Give it the content of blog posts and ask what tags should be applied to it from that list.
  4. Script the extraction of the content from old posts with few tags and run it through the above, presenting to me a report of what tags are recommended (which could then be coupled with a basic UI that showed me the post and suggested tags, and “approve”/”reject” buttons or similar.

Extracting training data

First, I needed to extract and curate my tag list, for which I used the following SQL4:

SELECT COUNT(wp_term_relationships.object_id) num, wp_terms.slug FROM wp_term_taxonomy
LEFT JOIN wp_terms ON wp_term_taxonomy.term_id = wp_terms.term_id
LEFT JOIN wp_term_relationships ON wp_term_taxonomy.term_taxonomy_id = wp_term_relationships.term_taxonomy_id
WHERE wp_term_taxonomy.taxonomy = 'post_tag'
AND wp_terms.slug NOT IN (
  -- filter out e.g. 'rss-club', 'published-on-gemini', 'dancast' etc.
  -- these are tags that have internal meaning only or are already accurately applied
  'long', 'list', 'of', 'tags', 'the', 'ai', 'should', 'never', 'apply'
)
GROUP BY wp_terms.slug
HAVING num > 2 -- filter down to tags I actually routinely use
ORDER BY wp_terms.slug
Many of my tags are used for internal purposes; e.g. I tag posts published on gemini if they’re to appear on gemini://danq.me/ and dancast if they embed an episode of my podcast. I filtered these out because I never want the AI to suggest applying them.

I took my output and dumped it into a list, and skimmed through to add some clarity to some tags whose purpose might be considered ambiguous, writing my explanation of each in parentheses afterwards. Here’s a part of the list, for example:

Prompt derivation

I used that list as the basis for the system message of my initial prompt:

Suggest topical tags from a predefined list that appropriately apply to the content of a given blog post.

# Steps

1. **Read the Blog Post**: Carefully read through the provided content of the blog post to identify its main themes and topics.
2. **Analyse Key Aspects**: Identify key topics, themes, or subjects discussed in the blog post.
3. **Match with Tags**: Compare these identified topics against the list of available tags.
4. **Select Appropriate Tags**: Choose tags that best represent the main topics and themes of the blog post.

# Output Format

Provide a list of suggested tags. Each tag should be presented as a single string. Multiple tags should be separated by commas.

# Allowed Tags

Tags that can be suggested are as follows. Text in parentheses are not part of the tag but are a description of the kinds of content to which the tag ought to be applied:

- aberdyfi
- aberystwyth
- ...
- youtube
- zoos

# Examples

**Input:**
The rapid advancement of AI technology has had a significant impact on my industry, even on the ways in which I write my blog posts. This post, for example, used AI to help with tagging.

**Output:**
ai, technology, blogging, meta, work

...(other examples)...

# Notes

- Ensure that all suggested tags are relevant to the key themes of the blog post.
- Tags should be selected based on their contextual relevance and not just keyword matching.

This system prompt is somewhat truncated, but you get the idea.

Now I was ready to give it a go with some real data. As an initial simple and short (and therefore also computationally cheap) experiment, I tried feeding it a note I wrote last week about the interrobang’s place in the Spanish language, and in Unicode.

That post already has the following tags (but this wasn’t disclosed to the AI in its training set; it had to work from scratch): , , (a bit of a redundancy there!), , and .

Testing it out

Let’s see what the AI suggests:

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_TOKEN" \
  -d '{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "system",
      "content": [
        {
          "type": "text",
          "text": "[PROMPT AS DESCRIBED ABOVE]"
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "My 8-year-old asked me \"In Spanish, I need to use an upside-down interrobang at the start of the sentence‽\" I assume the answer is yes A little while later, I thought to check whether Unicode defines a codepoint for an inverted interrobang. Yup: ‽ = U+203D, ⸘ = U+2E18. Nice. And yet we dont have codepoints to differentiate between single-bar and double-bar \"cifrão\" dollar signs..."
        }
      ]
    }
  ],
  "response_format": {
    "type": "text"
  },
  "temperature": 1,
  "max_completion_tokens": 2048,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0
}'
Running this via command-line curl meant I quickly ran up against some Bash escaping issues, but set +H and a little massaging of the blog post content seemed to fix it.

GPT-4o-mini

When I ran this query against the gpt-4o-mini model, I got back: unicode, language, education, children, symbols.

That’s… not ideal. I agree with the tags unicode, language, and children, but this isn’t really about education. If I tagged everything vaguely educational on my blog with education, it’d be an even-more-predominant tag than geocaching is! I reserve that tag for things that relate specifically to formal education: but that’s possibly something I could correct for with a parenthetical in my approved tags list.

symbols, though, is way out. Sure, the post could be argued to be something to do with symbols… but symbols isn’t on the approved tag list in the first place! This is a clear hallucination, and that’s pretty suboptimal!

Maybe a beefier model will fare better…

GPT-4o

I switched gpt-4o-mini for gpt-4o in the command above and ran it again. It didn’t take noticeably longer to run, which was pleasing.

The model returned: children, language, unicode, typography. That’s a big improvement. It no longer suggests education, which was off-base, nor symbols, which was a hallucination. But it did suggest typography, which is a… not-unreasonable suggestion.

Neither model suggested spain, and strictly-speaking they were probably right not to. My post isn’t about Spain so much as it’s about Spanish. I don’t have a specific tag for the latter, but I’ve subbed in the former to “connect” the post to ones which are about Spain, but that might not be ideal. Either way: if this is how I’m using the tag then I probably ought to clarify as such in my tag list, or else add a note to the system prompt to explain that I use place names as the tags for posts about the language of those places. (Or else maybe I need to be more-consistent in my tagging).

I experimented with a handful of other well-tagged posts and was moderately-satisfied with the results. Time for a more-challenging trial.

This time, with feeling…

Next, I decided to run the code against a few blog posts that are in need of tags. At this point, I wasn’t quite ready to implement a UI, so I just adapted my little hacky Bash script and copy-pasted HTML-stripped post contents directly into it.

Hand-drawn wireframe application with a blog post shown on the left (with 'previous' and 'next' buttons) and proposed tags on the right (with 'accept' and 'reject' buttons), alongside conventional tag management tools.
If it worked, I decided, I could make a UI. Until then, the command line was plenty sufficient.

I ran against three old posts:

Hospitals (June 2006)

In this post, I shared that my grandmother and my coworker had (independently) been taken into hospital. It had no tags whatsoever.

The AI suggested the tags hospital, family, injury, work, weddings, pub, humour. Which at a glance, is probably a superset of the tags that I’d have considered, but there’s a clear logic to them all.

It clearly picked out weddings based on a throwaway comment I made about a cousin’s wedding, so I disagree with that one: the post isn’t strictly about weddings just because it mentions one.

pub could go either way. It turns out my coworker’s injury occurred at or after a trip to the pub the previous night, and so its relevance is somewhat unknowable from this post in isolation. I think that’s a reasonable suggestion, and a great example of why I’d want any such auto-tagging system to be a human assistant (suggesting candidate tags) and not a fully-automated system. Interesting!

Finally, you might think of humour as being a little bit sarcastic, or maybe overly-laden with schadenfreude. But the blog post explicitly states that my coworker “carefully avoided saying how he’d managed to hurt himself, which implies that it’s something particularly stupid or embarrassing”, before encouraging my friends to speculate on it. However, it turns out that humour isn’t one of my existing tags at all! Boo, hallucinating AI!

I ended up applying all of the AI’s suggestions except weddings and humour. I also applied smartdata, because that’s where I worked (the AI couldn’t have been expected to guess that without context, though!).

Catch-Up: Concerts (June 2005)

This post talked about Ash and I’s travels around the UK to see REM and Green Day in concert5 and to the National Science Museum in London where I discovered that Ash was prejudiced towards… carrot cake.

The AI suggested: concerts, travel, music, preston, london, science museum, blogging.

Those all seemed pretty good at a first glance. Personally, I’d forgotten that we swung by Preston during that particular grand tour until the AI suggested the tag, and then I had to look back at the post more-carefully to double-check! blogging initially seemed like a stretch given that I was only blogging about not having blogged much, but on reflection I think I agree with the robot on this one, because I did explicitly link to a 2002 page that fell off the Internet only a few years ago about the pointlessness of blogging. So I think it counts.

Dan, in his 20s, crouches awkwardly in front of a TV and a Nintendo Wii in a wood-panelled room as he attempts to headbutt a falling blue balloon.
I was able to verify that I’d been in Preston with thanks to this contemporaneous photo. I have no further explanation for the content of the photo, though.

science museum is a big fail though. I don’t use that tag, but I do use the tag museum. So close, but not quite there, AI!

I applied all of its suggestions, after switching museum in place of science museum.

Geeky Winnage With Bluetooth (September 2004)

I wrote this blog post in celebration of having managed to hack together some stuff to help me remote-control my PC from my phone via Bluetooth, which back then used to be a challenge, in the hope that this would streamline pausing, playing, etc. at pizza-distribution-time at Troma Night, a weekly film night I hosted back then.

Four young people, smiling in laughing, sit in a cluttered and messy flat.
If you were sat on that sofa, fighting your way past other people and a mango-chutney-barrel-cum-table to get to a keyboard was genuinely challenging!

It already had the tag technology, which it inherited from a pre-tagging evolution of my blog which used something akin to categories (of which only one could be assigned to a post). In addition to suggesting this, the AI also picked out the following options: bluetooth, geeky, mobile, troma night, dvd, technology, and software.

The big failure here was dvd, which isn’t remotely one of my tags (and probably wouldn’t apply here if it were: this post isn’t about DVDs; it barely even mentions them). Possibly some prompt engineering is required to help ensure that the AI doesn’t make a habit of this “include one tag not from the approved list, every time” trend.

Apart from that it’s a pretty solid list. Annoyingly the AI suggested mobile, which isn’t an approved tag, instead of mobiles, which is. That’s probably a tokenisation fault, but it’s still annoying and a reminder of why even a semi-automated “human-checked” system would need a safety-check to ensure that no absent tags are allowed through to the final stage of approval.

This post!

As a bonus experiment, I tried running my code against a version of this post, but with the information about the AI’s own prompt and the examples removed (to reduce the risk of confusion). It came up with: ai, wordpress, blogging, tags, technology, automation.

All reasonable-sounding choices, and among those I’d made myself… except for tags and automation which, yet again, aren’t among tags that I use. Unless this tendency to hallucinate can be reined-in, I’m guessing that this tool’s going to continue to have some challenges when used on longer posts like this one.

Conclusion and next steps

The bottom line is: yes, this is a job that an AI can assist with, but no, it’s not one that it can do without supervision. The laser-focus with which gpt-4o was able to pick out taggable concepts, faster than I’d have been able to do for the same quantity of text, shows that there’s potential here, but it’s not yet proven itself enough of a time-saver to justify me writing a fluffy UI for it.

However, I might expand on the command-line tools I’ve been using in order to produce a non-interactive list of tagging suggestions, and use that to help inform my work as I tidy up the tags throughout my blog.

You still won’t see any “AI-authored” content on this site (except where it’s for the purpose of talking about AI-generated content, and it’ll always be clearly labelled), and I can’t see that changing any time soon. But I’ll admit that there might be some value in AI-assisted curation and administration, so long as there’s an informed human in the loop at all times.

Footnotes

1 Based on my tagging, I’ve apparently only written about escalators once, while playing Pub Jenga at Robin‘s 21st birthday party. I can’t imagine why I thought it deserved a tag.

2 There are, of course, various other people trying similar approaches to this and similar problems. I might have tried one of them, were it not for the fact that I’m not quite as interested in solving the problem as I am in understanding how one might use an AI to solve the problem. It’s similar to how I don’t enjoy doing puzzles like e.g. sudoku as much as I enjoy writing software that optimises for solving such puzzles. See also, for example, how I beat my children at Mastermind or what the hardest word in Hangman is or my various attempts to avoid doing online jigsaws.

3 Let’s ignore for a moment the farce that was Apple’s attempt to summarise news headlines, shall we?

4 Essentially the same SQL, plus WordClouds.com, was used to produce the word cloud grapic!

5 Two separate concerts, but can you imagine‽ 🤣

× × × ×

Note #25595

Back at work after a three-month break, and it turns out the first thing that I’d missed is my team’s quirky morning ritual.

Slack screenshot of Dan saying, at 08:57am: 'You know how many people pika-waved at me on a morning while I was on sabbatical? None. None at all. I missed you guys.' The quote is bookended by an animated GIF of Pikachu waving and another of Pikachu dancing around a heart. Five people have reacted with a mixture of Pikachu dances and waves.

×

Dan Q found GCA1K70 Oak Protection

This checkin to GCA1K70 Oak Protection reflects a geocaching.com log entry. See more of Dan's cache logs.

After claiming the FTF on the new cache to the North East, the geohound and I continued our walk with a wander through thy woods, eventually finding ourselves near this gate. I’d nearly attempted this cache during a previous visit to these woods but IIRC was dragged right past it by impatient dogs, children, or both. 😂

Today, though, it was a QEF for the doggo and I. Fun container and a good size too, FP awarded. TFTC!

Dan Q found GCB2FZ4 Family Fun #2

This checkin to GCB2FZ4 Family Fun #2 reflects a geocaching.com log entry. See more of Dan's cache logs.

FTF! It’s been a while since I’ve typed that!

Woke this morning slightly hungover and figured a nice walk with the geopup might help me feel better. And what an opportunity: a brand new cache only a short way from home!

Jumped in the car and zipped up to Church Hanborough, parking near the cemetery/allotments because the church bells were ringing and all the parking spots nearer to the centre of thy village were occupied by churchgoers. Walked up the paths to the GZ and had sight of the cache’s shiny container before we were even there. Retrieval was quick and easy, but we had to wait a while before we could return it to it’s hiding place because a large group of dog walkers (one of whom was holding court on how he was confident that Donald Trump would soon “dismantle the Deep State” 🙄) were passing.

Dan, wearing a grey jacket at the edge of a wet and wintry forest, waves to the camera with a hand that also holds a dog lead, the other end of which is connected to a French Bulldog wearing a teal jumper.

Soon signed vs returned and on our way. TFTC!

×

Note #25585

After a night that alternated between raining and freezing winds here, at the edge of Storm Éowyn, this morning my skylight has ice patterns on it that look beautiful and almost organic.

Skylight window viewed from the inside. The outside has a layer of ice punctuated by thinner patches of whirling, seaweed-like patterns that crisscross around it as if formed by a living thing.

×