I love RSS, but it’s a minor niggle for me that if I subscribe to any of the BBC News RSS feeds I invariably get all the sports news, too. Which’d be fine if I gave even the slightest care about the world of sports, but I don’t.
It only takes a couple of seconds to skim past the sports stories that clog up my feed reader, but because I like to scratch my own itches, I came up with a solution. It’s perhaps more heavyweight than it needs to be, but it does the job. If you’re just looking for a BBC News (UK) feed with the sports stories filtered out, you’re welcome to share mine: https://f001.backblazeb2.com/file/Dan–Q–Public/bbc-news-nosport.rss https://fox.q-t-a.uk/bbc-news-no-sport.xml.
If you’d like to see how I did it so you can host it yourself or adapt it for some similar purpose, the code’s below or on GitHub:
When executed, this Ruby code:
- Fetches the original BBC news (UK) RSS feed and parses it as XML using Nokogiri
- Filters it to remove all entries whose GUID matches a particular regular expression (removing all of those from the “sport” section of the site)
- Outputs the resulting feed into a temporary file
- Uploads the temporary file to a bucket in Backblaze’s “B2” repository (think: a better-value competitor to S3); the bucket I’m using is publicly accessible, so anybody’s RSS reader can subscribe to the feed
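
For illustration, here’s a minimal sketch of the first three steps above. It isn’t the original script: to stay dependency-free it swaps Nokogiri for REXML (which ships with Ruby), and the feed URL, the `/sport/` GUID pattern, and the method names `fetch_feed`, `strip_items` and `write_temp_feed` are all assumptions made for the example. The upload step is left out because it depends on your B2 credentials and whichever client you choose.

```ruby
require 'net/http'
require 'rexml/document'
require 'tempfile'
require 'uri'

# Assumed feed address and GUID pattern; adjust both to taste.
FEED_URL   = 'https://feeds.bbci.co.uk/news/uk/rss.xml'
SPORT_GUID = %r{/sport/}

# Downloads the feed and returns its body as a string.
def fetch_feed(url = FEED_URL)
  Net::HTTP.get(URI(url))
end

# Returns the feed XML with every <item> whose <guid> matches `pattern` removed.
def strip_items(feed_xml, pattern = SPORT_GUID)
  doc = REXML::Document.new(feed_xml)
  # XPath.match returns an array, so removing nodes while looping is safe.
  REXML::XPath.match(doc, '//item').each do |item|
    guid = item.elements['guid']
    item.remove if guid && guid.text.to_s.match?(pattern)
  end
  doc.to_s
end

# Writes the filtered feed to a temporary file and returns its path.
# The file is not deleted automatically, so it can be uploaded afterwards.
def write_temp_feed(xml)
  file = Tempfile.create(['bbc-news-no-sport', '.xml'])
  file.write(xml)
  file.close
  file.path
end
```

Calling `write_temp_feed(strip_items(fetch_feed))` chains the three steps and hands back the path of a file that’s ready to upload. Because `strip_items` takes the pattern as an argument, the same method could filter on any other part of the GUID.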
I like the versatility of the approach I’ve used here and its ability to perform arbitrary mutations on the feed. And I’m a big fan of Nokogiri. In some ways, this could be considered a lower-impact, less real-time version of my tool RSSey. Aside from the fact that it won’t (easily) handle websites that require JavaScript, this approach could probably be used in exactly the same ways as RSSey, and with significantly less set-up: I might look into whether its functionality can be made more generic so I can start using it in more places.