I Wrote the Same Code Six Times!

I was updating my CV earlier this week in anticipation of applying for a handful of interesting-looking roles1 and I was considering quite how many different tech stacks I claim significant experience in, nowadays.

There are languages I’ve been writing in every single week for the last 15+ years, of course, like PHP, Ruby, and JavaScript. And my underlying fundamentals are solid.

But is it really fair for me to be able to claim that I can code in Java, Go, or Python: languages that I’ve not used commercially within the last 5-10 years?

Animated GIF showing Sublime Text editor flicking through six different languages, with equivalent code showing in each.
What kind of developer writes the same program six times… for a tech test they haven’t even been asked to do? If you guessed “Dan”, you’d be correct!

Obviously, I couldn’t just let that question lie2. Let’s find out!

Contents

The Test

I fished around on Glassdoor for a bit to find a medium-sized single-sitting tech test, and found a couple of different briefs that I mashed together to create this:

In an object-oriented manner, implement an LRU (Least-Recently Used) cache:

  • The size of the cache is specified at instantiation.
  • Arbitrary objects can be put into the cache, along with a retrieval key in the form of a string. Using the same string, you can get the objects back.
  • If a put operation would increase the number of objects in the cache beyond the size limit, the cached object that was least-recently accessed (by either a put or get operation) is removed to make room for it.
  • putting a duplicate key into the cache should update the associated object (and make this item most-recently accessed).
  • Both the get and put operations should resolve within constant (O(1)) time.
  • Add automated tests to support the functionality.

My plan was to implement a solution to this challenge, in as many of the languages mentioned on my CV as possible in a single sitting.

But first, a little Data Structures & Algorithms theory:

The Theory

Simple case with O(n) complexity

The simplest way to implement such a cache might be as follows:

  • Use a linear data structure like an array or linked list to store cached items.
  • On get, iterate through the list to try to find the matching item.
    • If found: move it to the head of the list, then return it.
  • On put, first check if it already exists in the list as with get:
    • If it already exists, update it and move it to the head of the list.
    • Otherwise, insert it as a new item at the head of the list.
      • If this would increase the size of the list beyond the permitted limit, pop and discard the item at the tail of the list.
Illustration of a simple Cache class with a linked list data structure containing three items, referenced from the class by its head item. It has 'get' and 'put' methods.
It’s simple, elegant and totally the kind of thing I’d accept if I were recruiting for a junior or graduate developer. But we can do better.

The problem with this approach is that it fails the requirement that the methods “should resolve within constant (O(1)) time”3.

Of particular concern is the fact that any operation which might need to re-sort the list to put the just-accessed item at the top 4. Let’s try another design:

Achieving O(1) time complexity

Here’s another way to implement the cache:

  • Retain cache items in a doubly-linked list, with a pointer to both the head and tail
  • Add a hash map (or similar language-specific structure) for fast lookups by cache key
  • On get, check the hash map to see if the item exists.
    • If so, return it and promote it to the head (as described below).
  • On put, check the hash map to see if the item exists.
    • If so, promote it to the head (as described below).
    • If not, insert it at the head by:
      • Updating the prev of the current head item and then pointing the head to the new item (which will have the old head item as its next), and
      • Adding it to the hash map.
      • If the number of items in the hash map would exceed the limit, remove the tail item from the hash map, point the tail at the tail item’s prev, and unlink the expired tail item from the new tail item’s next.
  • To promote an item to the head of the list:
    1. Follow the item’s prev and next to find its siblings and link them to one another (removes the item from the list).
    2. Point the promoted item’s next to the current head, and the current head‘s prev to the promoted item.
    3. Point the head of the list at the promoted item.
Illustration of a more-sophisticated Cache class with a doubly-linked list data structure containing three items, referenced from the class by its head and tail item. It has 'get' and 'put' methods. It also has a separate hash map indexing each cached item by its key.
Looking at a plate of pointer-spaghetti makes me strangely hungry.

It’s important to realise that this alternative implementation isn’t better. It’s just different: the “right” solution depends on the use-case5.

The Implementation

That’s enough analysis and design. Time to write some code.

GitHub repo 'Languages' panel, showing a project that includes Java (17.2%), Go (16.8%), Python (16.4%), Ruby (15.7%), TypeScript (13.7%), PHP (11.5%), and 'Other' (8.7%).
Turns out that if you use enough different languages in your project, GitHub begins to look like itwants to draw a rainbow.

Picking a handful of the more-useful languages on my CV6, I opted to implement in:

  • Ruby (with RSpec for testing and Rubocop for linting)
  • PHP (with PHPUnit for testing)
  • TypeScript (running on Node, with Jest for testing)
  • Java (with JUnit for testing)
  • Go (which isn’t really an object-oriented language but acts a bit like one, amirite?)
  • Python (probably my weakest language in this set, but which actually ended up with quite a tidy solution)

Naturally, I open-sourced everything if you’d like to see for yourself. It all works, although if you’re actually in need of such a cache for your project you’ll probably find an alternative that’s at least as good (and more-likely to be maintained!) in a third-party library somewhere!

What did I learn?

This was actually pretty fun! I might continue to expand my repo by doing the same challenge with a few of the other languages I’ve used professionally at some point or another7.

And there’s a few takeaways I got from this experience –

Lesson #1: programming more languages can make you better at all of them

As I went along, one language at a time, I ended up realising improvements that I could make to earlier iterations.

For example, when I came to the TypeScript implementation, I decided to use generics so that the developer can specify what kind of objects they want to store in the cache, rather than just a generic Object, and better benefit type-safety. That’s when I remembered that Java supports generics, too, so I went back and used them there as well.

In the same way as speaking multiple (human) languages or studying linguistics can help unlock new ways of thinking about your communication, being able to think in terms of multiple different programming languages helps you spot new opportunities. When in 2020 PHP 8 added nullsafe operators, union types, and named arguments, I remember feeling confident using them from day one because those features were already familiar to me from Ruby8, TypeScript9, and Python10, respectively.

Lesson #2: even when I’m rusty, I can rely on my fundamentals

I’ve applied for a handful of jobs now, but if one of them had invited me to a pairing session on a language I’m rusty on (like Java!) I might’ve felt intimidated.

But it turns out I shouldn’t need to be! With my solid fundamentals and a handful of other languages under my belt, I understand when I need to step away from the code editor and hit the API documentation. Turns out, I’m in a good position to demo any of my language skills.

I remember when I was first learning Go, I wanted to make use of a particular language feature that I didn’t know whether it had. But because I’d used that feature in Ruby, I knew what to search for in Go’s documentation to see if it was supported (it wasn’t) and if so, what the syntax was11.

Lesson #3: structural rules are harder to gearshift than syntactic ones

Switching between six different languages while writing the same application was occasionally challenging, but not in the ways I expected.

I’ve had plenty of experience switching programming languages mid-train-of-thought before. Sometimes you just have to flit between the frontend and backend of your application!

But this time around I discovered: changes in structure are apparently harder for my brain than changes in syntax. E.g.:

  • Switching in and out of Python’s indentation caught me out at least once (might’ve been better if I took the time to install the language’s tools into my text editor first!).
  • Switching from a language without enforced semicolon line ends (e.g. Ruby, Go) to one with them (e.g. Java, PHP) had me make the compiler sad several times.
  • This gets even tougher when not writing the language but writing about the language: my first pass at the documentation for the Go version somehow ended up with Ruby/Python-style #-comments instead of Go/Java/TypeScript-style //-comments; whoops!

I’m guessing that the part of my memory that looks after a language’s keywords, how a method header is structured, and which equals sign to use for assignment versus comparison… are stored in a different part of my brain than the bit that keeps track of how a language is laid-out?12

Okay, time for a new job

I reckon it’s time I got back into work, so I’m going to have a look around and see if there’s any roles out there that look exciting to me.

If you know anybody who’s looking for a UK-based, remote-first, senior+, full-stack web developer with 25+ years experience and more languages than you can shake a stick at… point them at my CV, would you?

Footnotes

1 I suspect that when most software engineers look for a new job, they filter to the languages, frameworks, they feel they’re strongest at. I do a little of that, I suppose, but I’m far more-motivated by culture, sector, product and environment than I am by the shape of your stack, and I’m versatile enough that technology specifics can almost come second. So long as you’re not asking me to write VB.NET.

2 It’s sort-of a parallel to how I decided to check the other week that my Gutenberg experience was sufficiently strong that I could write standard ReactJS, too.

3 I was pleased to find a tech test that actually called for an understanding of algorithm growth/scaling rates, so I could steal this requirement for my own experiment! I fear that sometimes, in their drive to be pragmatic and representative of “real work”, the value of a comprehension of computer science fundamentals is overlooked by recruiters.

4 Even if an algorithm takes the approach of creating a new list with the inserted/modified item at the top, that’s still just a very-specific case of insertion sort when you think about it, right?

5 The second design will be slower at writing but faster at reading, and will scale better as the cache gets larger. That sounds great for a read-often/write-rarely cache, but your situation may differ.

6 Okay, my language selection was pretty arbitrary. But if I’d have also come up with implementations in Perl, and C#, and Elixir, and whatever else… I’d have been writing code all day!

7 So long as I’m willing to be flexible about the “object-oriented” requirement, there are even more options available to me. Probably the language that I last wrote longest ago would be Pascal: I wonder how much of that I remember?

8 Ruby’s safe navigation/”lonely” operator did the same thing as PHP’s nullsafe operator since 2015.

9 TypeScript got union types back in 2015, and apart from them being more-strictly-enforced they’re basically identical to PHP’s.

10 Did you know that Python had keyword arguments since its very first public release way back in 1994! How did it take so many other interpreted languages so long to catch up?

11 The feature was the three-way comparison or “spaceship operator”, in case you were wondering.

12 I wonder if anybody’s ever laid a programmer in an MRI machine while they code? I’d be really interested to see if different bits of the brain light up when coding in functional programming languages than in procedural ones, for example!

× × × ×

BBC News RSS… your way!

It turns out my series of efforts to improve the BBC News RSS feeds are more-popular than I thought. People keep asking for variants of them, and it’s probably time I stopped hosting the resulting feeds on my NAS (which does a good job, but it’s in a highly-kickable place right under my desk).

Screenshot of BBC News RSS Feeds (that don't suck!).
The new site isn’t pretty. But it works.

So I’ve launched BBC-Feeds.DanQ.dev. On a 20-minute schedule, it generates both UK and World editions of the BBC News feeds, filtered to remove iPlayer, Sounds, app “nudges”, duplicates, and other junk, and optionally with the sports news filtered out too.

The entire thing is open source under an ultra-permissive license, so you can run your own copy if you don’t want to use mine.

Enjoy!

BBC News RSS… with the sport?

There’s now a much, much better version of this. Go use that instead: bbc-feeds.danq.dev.

Earlier today, somebody called Allan commented on the latest in my series of several blog posts about how I mutilate manipulate the RSS feeds of BBC News to work around their (many, and increasingly so) various shortcomings, specifically:

  1. Their inclusion of non-news content such as plugs for iPlayer and their apps,
  2. Their repeating of identical news stories with marginally-different GUIDs, and
  3. All of the sports news, which I don’t care about one jot.

Well, it turns out that some people want #3: the sport. But still don’t want the other two.

FreshRSS screenshot with many unread items, but focussing on a feed called "BBC News (with sport)" and showing a story titled: 'How England Golf's yellow cards are tackling blight of slow play'
Some people actually want to read this crap, apparently.

I shan’t be subscribing to this RSS feed, and I can’t promise I’ll fix it if it gets broken. But if “without the crap, but with the sports” is the way you like your BBC News RSS feed, I’ve got you covered:

So there you go, Allan, and anybody in a similar position. I hope that fulfils your need for sports news… without the crap.

×

BBC News… without the crap

Update: nowadays, the best place to get this feed and more like it is at bbc-feeds.danq.dev.

Did I mention recently that I love RSS? That it brings me great joy? That I start and finish almost every day in my feed reader? Probably.

I used to have a single minor niggle with the BBC News RSS feed: that it included sports news, which I didn’t care about. So I wrote a script that downloaded it, stripped sports news, and re-exported the feed for me to subscribe to. Magic.

RSS reader showing duplicate copies of the news story "Barbie 2? 'We'd love to,' says Warner Bros boss", and an entry from BBC Sounds.
Lately my BBC News feed has caused me some annoyance and frustration.

But lately – presumably as a result of technical changes at the Beeb’s side – this feed has found two fresh ways to annoy me:

  1. The feed now re-publishes a story if it gets re-promoted to the front pagebut with a different <guid> (it appears to get a #0 after it when first published, a #1 the second time, and so on). In a typical day the feed reader might scoop up new stories about once an hour, any by the time I get to reading them the same exact story might appear in my reader multiple times. Ugh.
  2. They’ve started adding iPlayer and BBC Sounds content to the BBC News feed. I don’t follow BBC News in my feed reader because I want to watch or listen to things. If you do, that’s fine, but I don’t, and I’d rather filter this content out.

Luckily, I already have a recipe for improving this feed, thanks to my prior work. Let’s look at my newly-revised script (also available on GitHub):

#!/usr/bin/env ruby
require 'bundler/inline'
# # Sample crontab:
# # At 41 minutes past each hour, run the script and log the results
# */20 * * * * ~/bbc-news-rss-filter-sport-out.rb > ~/bbc-news-rss-filter-sport-out.log 2>>&1
# Dependencies:
# * open-uri - load remote URL content easily
# * nokogiri - parse/filter XML
gemfile do
  source 'https://rubygems.org'
  gem 'nokogiri'
end
require 'open-uri'
# Regular expression describing the GUIDs to reject from the resulting RSS feed
# We want to drop everything from the "sport" section of the website, also any iPlayer/Sounds links
REJECT_GUIDS_MATCHING = /^https:\/\/www\.bbc\.co\.uk\/(sport|iplayer|sounds)\//
# Load and filter the original RSS
rss = Nokogiri::XML(open('https://feeds.bbci.co.uk/news/rss.xml?edition=uk'))
rss.css('item').select{|item| item.css('guid').text =~ REJECT_GUIDS_MATCHING }.each(&:unlink)

# Strip the anchors off the <guid>s: BBC News "republishes" stories by using guids with #0, #1, #2 etc, which results in duplicates in feed readers
rss.css('guid').each{|g|g.content=g.content.gsub(/#.*$/,'')}

File.open( '/www/bbc-news-no-sport.xml', 'w' ){ |f| f.puts(rss.to_s) }
It’s amazing what you can do with Nokogiri and a half dozen lines of Ruby.

That revised script removes from the feed anything whose <guid> suggests it’s sports news or from BBC Sounds or iPlayer, and also strips any “anchor” part of the <guid> before re-exporting the feed. Much better. (Strictly speaking, this can result in a technically-invalid feed by introducing duplicates, but your feed reader oughta be smart enough to compensate for and ignore that: mine certainly is!)

You’re free to take and adapt the script to your own needs, or – if you don’t mind being tied to my opinions about what should be in BBC News’ RSS feed – just subscribe to my copy at: https://fox.q-t-a.uk/bbc-news-no-sport.xml

Update: nowadays, the best place to get this feed and more like it is at bbc-feeds.danq.dev.

×

UK Strikes in .ics Format

My work colleague Simon was looking for a way to add all of the upcoming UK strike action to their calendar, presumably so they know when not to try to catch a bus or require an ambulance or maybe just so they’d know to whom they should be giving support on any particular day. Thom was able to suggest a few places to see lists of strikes, such as this BBC News page and the comprehensive strikecalendar.co.uk, but neither provided a handy machine-readable feed.

Screenshot showing a Thunderbird calendar popularted with strikes on every day in February.
Gosh, there’s a lot of strikes going on. ✊

If only they knew somebody who loves an excuse to throw a screen-scraper together. Oh wait, that’s me!

I threw together a 36-line Ruby program that extracts all the data from strikecalendar.co.uk and outputs an .ics file. I guess if you wanted you could set it up to automatically update the file a couple of times a day and host it at a URL that people can subscribe to; that’s an exercise left for the reader.

If you just want a one-off import based on the state-of-play right now, though, you can save this .ics file to your computer and import it to your calendar. Simple.

×

Nginx Caching for Passenger Applications

Suppose you’re running an application on a Passenger + Nginx powered server and you want to add caching.

Perhaps your application has a dynamic, public endpoint but the contents don’t change super-frequently or it isn’t critically-important that the user always gets up-to-the-second accuracy, and you’d like to improve performance with microcaching. How would you do that?

Where you’re at

Diagram showing the Internet connecting to an Nginx+Passenger webserver, connecting to an application written for Ruby, Python, or NodeJS.
Not pictured: the rest of the Internet.

Your configuration might look something like this:

1
2
3
4
5
6
7
server {
  # listen, server_name, ssl, logging etc. directives go here
  # ...

  root               /your/application;
  passenger_enabled  on;
}

What you’re looking for is proxy_cache and its sister directives, but you can’t just insert them here because while Passenger does act act like an upstream proxy (with parallels like e.g. passenger_pass_header which mirrors the behaviour of proxy_pass_header), it doesn’t provide any of the functions you need to implement proxy caching of non-static files.

Where you need to be

Instead, what you need to to is define a second server, mount Passenger in that, and then proxy to that second server. E.g.:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
# Set up a cache
proxy_cache_path /tmp/cache/my-app-cache keys_zone=MyAppCache:10m levels=1:2 inactive=600s max_size=100m;

# Define the actual webserver that listens for Internet traffic:
server {
  # listen, server_name, ssl, logging etc. directives go here
  # ...

  # You can configure different rules by location etc., but here's a simple microcache:
  location / {
    proxy_pass http://127.0.0.1:4863; # Proxy all traffic to the application server defined below
    proxy_cache           MyAppCache; # Use the cache defined above
    proxy_cache_valid     200 3s;     # Treat HTTP 200 responses as valid; cache them for 3 seconds
    proxy_cache_use_stale updating;   # (Optional) send outdated response while background-updating cache
    proxy_cache_lock      on;         # (Optional) only allow one process to update cache at once
  }
}

# (Local-only) application server on an arbitrary port number to act as the upstream proxy:
server {
  listen 127.0.0.1:4863;

  root               /your/application;
  passenger_enabled  on;
}

The two key changes are:

  • Passenger moves to a second server block, localhost-only, on an arbitrary port number (doesn’t need HTTPS, of course, but if your application detects/”expects” HTTPS you might need to tweak your headers).
  • Your main server block proxies to the second as its upstream, and you can add whatever caching directives you like.

Obviously you’ll need to be smarter if you host a mixture of public and private content (e.g. send Vary: headers from your application) and if you want different cache durations on different addresses or types of content, but there are already great guides to help with that. I only wrote this post because I spent some time searching for (nonexistent!) passenger_cache_ etc. rules and wanted to save the next person from the same trouble!

×

LABS Comic RSS Archive

Yesterday I recommended that you go read Aaron Uglum‘s webcomic LABS which had just completed its final strip. I’m a big fan of “completed” webcomics – they feel binge-able in the same way as a complete Netflix series does! – but Spencer quickly pointed out that it’s annoying for we enlightened modern RSS users who hook RSS up to everything to have to binge completed comics in a different way to reading ongoing ones: what he wanted was an RSS feed covering the entire history of LABS.

LABS comic adapted to show The Robot literally "feeding" RSS
With apologies to Aaron Uglum who I hope won’t mind me adapting his comic in this way.

So naturally (after the intense heatwave woke me early this morning anyway) I made one: complete RSS feed of LABS. And, of course, I open-sourced the code I used to generate it so that others can jumpstart their projects to make static RSS feeds from completed webcomics, too.

Even if you’re not going to read it via this medium, you should go read LABS.

BBC News… without the sport

There’s now a much, much better version of this. Go use that instead: bbc-feeds.danq.dev.

I love RSS, but it’s a minor niggle for me that if I subscribe to any of the BBC News RSS feeds I invariably get all the sports news, too. Which’d be fine if I gave even the slightest care about the world of sports, but I don’t.

Sports on the BBC News site
Down with Things Like This!

It only takes a couple of seconds to skim past the sports stories that clog up my feed reader, but because I like to scratch my own itches, I came up with a solution. It’s more-heavyweight perhaps than it needs to be, but it does the job. If you’re just looking for a BBC News (UK) feed but with sports filtered out you’re welcome to share mine: https://f001.backblazeb2.com/file/Dan–Q–Public/bbc-news-nosport.rss https://fox.q-t-a.uk/bbc-news-no-sport.xml.

If you’d like to see how I did it so you can host it yourself or adapt it for some similar purpose, the code’s below or on GitHub:

#!/usr/bin/env ruby
# # Sample crontab:
# # At 41 minutes past each hour, run the script and log the results
# 41 * * * * ~/bbc-news-rss-filter-sport-out.rb > ~/bbc-news-rss-filter-sport-out.log 2>&1
# Dependencies:
# * open-uri - load remote URL content easily
# * nokogiri - parse/filter XML
# * b2       - command line tools, described below
require 'bundler/inline'
gemfile do
  source 'https://rubygems.org'
  gem 'nokogiri'
end
require 'open-uri'
# Regular expression describing the GUIDs to reject from the resulting RSS feed
# We want to drop everything from the "sport" section of the website
REJECT_GUIDS_MATCHING = /^https:\/\/www\.bbc\.co\.uk\/sport\//
# Assumption: you're set up with a Backblaze B2 account with a bucket to which
# you'd like to upload the resulting RSS file, and you've configured the 'b2'
# command-line tool (https://www.backblaze.com/b2/docs/b2_authorize_account.html)
B2_BUCKET = 'YOUR-BUCKET-NAME-GOES-HERE'
B2_FILENAME = 'bbc-news-nosport.rss'
# Load and filter the original RSS
rss = Nokogiri::XML(open('https://feeds.bbci.co.uk/news/rss.xml?edition=uk'))
rss.css('item').select{|item| item.css('guid').text =~ REJECT_GUIDS_MATCHING }.each(&:unlink)
begin
  # Output resulting filtered RSS into a temporary file
  temp_file = Tempfile.new
  temp_file.write(rss.to_s)
  temp_file.close

  # Upload filtered RSS to a Backblaze B2 bucket
  result = `b2 upload_file --noProgress --contentType application/rss+xml #{B2_BUCKET} #{temp_file.path} #{B2_FILENAME}`
  puts Time.now
  puts result.split("\n").select{|line| line =~ /^URL by file name:/}.join("\n")
ensure
  # Tidy up after ourselves by ensuring we delete the temporary file
  temp_file.close
  temp_file.unlink
end

bbc-news-rss-filter-sport-out.rb

When executed, this Ruby code:

  1. Fetches the original BBC news (UK) RSS feed and parses it as XML using Nokogiri
  2. Filters it to remove all entries whose GUID matches a particular regular expression (removing all of those from the “sport” section of the site)
  3. Outputs the resulting feed into a temporary file
  4. Uploads the temporary file to a bucket in Backblaze‘s “B2” repository (think: a better-value competitor S3); the bucket I’m using is publicly-accessible so anybody’s RSS reader can subscribe to the feed

I like the versatility of the approach I’ve used here and its ability to perform arbitrary mutations on the feed. And I’m a big fan of Nokogiri. In some ways, this could be considered a lower-impact, less real-time version of my tool RSSey. Aside from the fact that it won’t (easily) handle websites that require Javascript, this approach could probably be used in exactly the same ways as RSSey, and with significantly less set-up: I might look into whether its functionality can be made more-generic so I can start using it in more places.

×

Linda Liukas, Hello Ruby and the magic of coding

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Linda Liukas’s best-selling Hello Ruby books teach children that computers are fun and coding can be a magical experience.

See the original article to watch a great video interview with Linda Liukas. Linda is the founder of Rails Girls and author of a number of books encouraging children to learn computer programming (which I’m hoping to show copies of to ours, when they’re a tiny bit older). I’ve mentioned before how important I feel an elementary understanding of programming concepts is to children.

The Ruby Story

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

By 2005, Ruby had become more popular, but it was still not a mainstream programming language. That changed with the release of Ruby on Rails. Ruby on Rails was the “killer app” for Ruby, and it did more than any other project to popularize Ruby. After the release of Ruby on Rails, interest in Ruby shot up across the board, as measured by the TIOBE language index:

It’s sometimes joked that the only programs anybody writes in Ruby are Ruby-on-Rails web applications. That makes it sound as if Ruby on Rails completely took over the Ruby community, which is only partly true. While Ruby has certainly come to be known as that language people write Rails apps in, Rails owes as much to Ruby as Ruby owes to Rails.

As an early adopter of Ruby (and Rails, when it later came along) I’ve always found that it brings me a level of joy I’ve experienced in very few other languages (and never as much). Every time I write Ruby, it takes me back to being six years old and hacking BASIC on my family’s microcomputer. Ruby, more than any other language I’ve come across, achieves the combination of instant satisfaction, minimal surprises, and solid-but-flexible object orientation. There’s so much to love about Ruby from a technical perspective, but for me: my love of it is emotional.

×

Why it is just lazy to bad-mouth Ruby on Rails

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

It’s inevitable these days: we will see an article proclaiming the demise of Ruby on Rails every once in a while. It’s the easiest click bait, like this one from TNW.Now, you may say “another Ruby fanboy.” That’s fair, but a terrible argument, as it’s a poor and common argumentum ad hominem. And on the subject of fallacies, the click-bait article above is wrong exactly because it falls for a blatantly Post hoc ergo propter hoc fallacy plus some more confirmation bias which we are all guilty of falling for all the time.

I’m not saying that the author wrote fallacies on purpose. Unfortunately, it’s just too easy to fall for fallacies. Especially when everybody has an intrinsic desire to confirm one’s biases. Even trying to be careful, I end up doing that as well…

Rails is f*cking boring! I love it.

This is a repost promoting content originally published elsewhere. See more things Dan's reposted.

Together with a friend I recently built Dropshare Cloud. We offer online storage for the file and screenshot sharing app Dropshare for macOS/iOS. After trying out Django for getting started (we both had some experience using Django) I decided to rewrite the codebase in Rails. My past experience developing in Rails made the process quick — and boring…

What Does Jack FM Sound Like?

Those who know me well know that I’m a bit of a data nerd. Even when I don’t yet know what I’m going to do with some data yet, it feels sensible to start collecting it in a nice machine-readable format from the word go. Because you never know, right? That’s how I’m able to tell you how much gas and electricity our house used on average on any day in the last two and a half years (and how much off that was offset by our solar panels).

Daily energy usage at Dan's house for the last few years. Look at the gas peaks in the winters, when the central heating ramps up!
The red lumps are winters, when the central heating comes on and starts burning a stack of gas.

So it should perhaps come as no huge surprise that for the last six months I’ve been recording the identity of every piece of music played by my favourite local radio station, Jack FM (don’t worry: I didn’t do this by hand – I wrote a program to do it). At the time, I wasn’t sure whether there was any point to the exercise… in fact, I’m still not sure. But hey: I’ve got a log of the last 45,000 songs that the radio station played: I might as well do something with it. The Discogs API proved invaluable in automating the discovery of metadata relating to each song, such as the year of its release (I wasn’t going to do that by hand either!), and that gave me enough data to, for example, do this (click on any image to see a bigger version):

Jack FM: Decade Frequency by Hour
Decade frequency by hour: you’ve got a good chance of 80s music at any time, but lunchtime’s your best bet (or perhaps just after midnight). Note that times are in UTC+2 in this graph.

I almost expected a bigger variance by hour-of-day, but I guess that Jack isn’t in the habit of pandering to its demographics too heavily. I spotted the post-midnight point at which you get almost a plurality of music from 1990 or later, though: perhaps that’s when the young ‘uns who can still stay up that late are mostly listening to the radio? What about by day-of-week, then:

Jack FM: Decade Frequency by Day of Week
Even less in it by day of week… although 70s music fans should consider tuning in on Fridays, apparently, and 80s fans will be happiest on Sundays.

The chunks of “bonus 80s” shouldn’t be surprising, I suppose, given that the radio station advertises that that’s exactly what it does at those times. But still: it’s reassuring to know that when a radio station claims to play 80s music, you don’t just have to take their word for it (so long as their listeners include somebody as geeky as me).

It feels to me like every time I tune in they’re playing an INXS song. That can’t be a coincidence, right? Let’s find out:

Jack FM: Artist Frequency
One in every ten songs are by just ten artists (including INXS). One in every four are by just 34 artists.

Yup, there’s a heavy bias towards Guns ‘n’ Roses, Michael Jackson, Prince, Oasis, Bryan Adams, Madonna, INXS, Bon Jovi, Queen, and U2 (who collectively are responsible for over a tenth of all music played on Jack FM), and – to a lesser extent – towards Robert Palmer, Meatloaf, Blondie, Green Day, Texas, Whitesnake, the Pet Shop Boys, Billy Idol, Madness, Rainbow, Elton John, Bruce Springsteen, Aerosmith, Fleetwood Mac, Phil Collins, ZZ Top, AC/DC, Duran Duran, the Police, Simple Minds, Blur, David Bowie, Def Leppard, and REM: taken together, one in every four songs played on Jack FM is by one of these 34 artists.

Jack FM: Top 20
Amazingly, the most-played song on Jack FM (Alice Cooper’s “Poison”) is not by one of the most-played 34 artists.

I was interested to see that the “top 20 songs” played on Jack FM these last six months include several songs by artists who otherwise aren’t represented at all on the station. The most-played song is Alice Cooper’s Poison, but I’ve never recorded them playing any other Alice Cooper songs (boo!). The fifth-most-played song is Fight For Your Right, by the Beastie Boys, but that’s the only Beastie Boys song I’ve caught them playing. And the seventh-most-played – Roachford’s Cuddly Toy – is similarly the only Roachford song they ever put on.

Next I tried a Markov chain analysis. Markov chains are a mathematical tool that examines a sequence (in this case, a sequence of songs) and builds a map of “chains” of sequential songs, recording the frequency with which they follow one another – here’s a great explanation and playground. The same technique is used by “predictive text” features on your smartphone: it knows what word to suggest you type next based on the patterns of words you most-often type in sequence. And running some Markov chain analysis helped me find some really… interesting patterns in the playlists. For example, look at the similarities between what was played early in the afternoon of Wednesday 19 October and what was played 12 hours later, early in the morning of Thursday 20 October:

19 October 2016 20 October 2016
12:06:33 Kool & The Gang – Fresh Kool & The Gang – Fresh 00:13:56
12:10:35 Bruce Springsteen – Dancing In The Dark Bruce Springsteen – Dancing In The Dark 00:17:57
12:14:36 Maxi Priest – Close To You Maxi Priest – Close To You 00:21:59
12:22:38 Van Halen – Why Can’t This Be Love Van Halen – Why Can’t This Be Love 00:25:00
12:25:39 Beats International / Lindy – Dub Be Good To Me Beats International / Lindy – Dub Be Good To Me 00:29:01
12:29:40 Kasabian – Fire Kasabian – Fire 00:33:02
12:33:42 Talk Talk – It’s My Life Talk Talk – It’s My Life 00:38:04
12:41:44 Lenny Kravitz – Are You Gonna Go My Way Lenny Kravitz – Are You Gonna Go My Way 00:42:05
12:45:45 Shalamar – I Can Make You Feel Good Shalamar – I Can Make You Feel Good 00:45:06
12:49:47 4 Non Blondes – What’s Up 4 Non Blondes – What’s Up 00:50:07
12:55:49 Madness – Baggy Trousers Madness – Baggy Trousers 00:54:09
    Eagle Eye Cherry – Save Tonight 00:56:09
    Feeling – Love It When You Call 01:04:12
13:02:51 Fine Young Cannibals – Good Thing Fine Young Cannibals – Good Thing 01:10:14
13:06:54 Blur – There’s No Other Way Blur – There’s No Other Way 01:14:15
13:09:55 Pet Shop Boys – It’s A Sin Pet Shop Boys – It’s A Sin 01:17:16
13:14:56 Zutons – Valerie Zutons – Valerie 01:22:18
13:22:59 Cure – The Love Cats Cure – The Love Cats 01:26:19
13:27:01 Bryan Adams / Mel C – When You’re Gone Bryan Adams / Mel C – When You’re Gone 01:30:20
13:30:02 Depeche Mode – Personal Jesus Depeche Mode – Personal Jesus 01:33:21
13:34:03 Queen – Another One Bites The Dust Queen – Another One Bites The Dust 01:38:22
13:42:06 Shania Twain – That Don’t Impress Me Much Shania Twain – That Don’t Impress Me Much 01:42:23
13:45:07 ZZ Top – Gimme All Your Lovin’ ZZ Top – Gimme All Your Lovin’ 01:46:25
13:49:09 Abba – Mamma Mia Abba – Mamma Mia 01:50:26
13:53:10 Survivor – Eye Of The Tiger Survivor – Eye Of The Tiger 01:53:27
    Scouting For Girls – Elvis Aint Dead 01:57:28
    Verve – Lucky Man 02:00:29
    Fleetwood Mac – Say You Love Me 02:05:30
14:03:13 Kiss – Crazy Crazy Nights Kiss – Crazy Crazy Nights 02:10:31
14:07:15 Lightning Seeds – Sense Lightning Seeds – Sense 02:14:33
14:11:16 Pretenders – Brass In Pocket Pretenders – Brass In Pocket 02:18:34
14:14:17 Elvis Presley / JXL – A Little Less Conversation Elvis Presley / JXL – A Little Less Conversation 02:21:35
14:22:19 U2 – Angel Of Harlem U2 – Angel Of Harlem 02:24:36
14:25:20 Trammps – Disco Inferno Trammps – Disco Inferno 02:28:37
14:29:22 Cast – Guiding Star Cast – Guiding Star 02:31:38
14:33:23 New Order – Blue Monday New Order – Blue Monday 02:36:39
14:41:26 Def Leppard – Let’s Get Rocked Def Leppard – Let’s Get Rocked 02:40:41
14:46:28 Phil Collins – Sussudio Phil Collins – Sussudio 02:45:42
14:50:30 Shawn Mullins – Lullaby Shawn Mullins – Lullaby 02:49:43
14:55:31 Stars On 45 – Stars On 45 Stars On 45 – Stars On 45 02:53:45
16:06:35 Dead Or Alive – You Spin Me Round Like A Record Dead Or Alive – You Spin Me Round Like A Record 03:00:47
16:09:36 Dire Straits – Walk Of Life Dire Straits – Walk Of Life 03:03:48
16:13:37 Keane – Everybody’s Changing Keane – Everybody’s Changing 03:07:49
16:17:39 Billy Idol – Rebel Yell Billy Idol – Rebel Yell 03:10:50
16:25:41 Stealers Wheel – Stuck In The Middle Stealers Wheel – Stuck In The Middle 03:14:51
16:28:42 Green Day – American Idiot Green Day – American Idiot 03:18:52
16:33:44 A-Ha – Take On Me A-Ha – Take On Me 03:21:53
16:36:45 Cranberries – Dreams Cranberries – Dreams 03:26:54
    Elton John – Philadelphia Freedom 03:30:56
    Inxs – Disappear 03:36:57
    Kim Wilde – You Keep Me Hanging On 03:40:59
16:44:47 Living In A Box – Living In A Box    
16:47:48 Status Quo – Rockin’ All Over The World Status Quo – Rockin’ All Over The World 03:45:00

The similarities between those playlists (which include a 20-songs-in-a-row streak!) surely can’t be coincidence… but they do go some way to explaining why listening to Jack FM sometimes gives me a feeling of déjà vu (along with, perhaps, the no-talk, all-jukebox format). Looking elsewhere in the data I found dozens of other similar occurances, though none that were both such long chains and in such close proximity to one another. What does it mean?

There are several possible explanations, including:

  • The exotic, e.g. they’re using Markov chains to control an auto-DJ, and so just sometimes it randomly chooses to follow a long chain that it “learned” from a real DJ.
  • The silly, e.g. Jack FM somehow knew that I was monitoring them in this way and are trying to troll me.
  • My favourite: these two are actually the same playlist, but with breaks interspersed differently. During the daytime, the breaks in the list are more-frequent and longer, which suggests: ad breaks! Advertisers are far more-likely to pay for spots during the mid-afternoon than they are in the middle of the night (the gap in the overnight playlist could well be a short ad or a jingle), which would explain why the two are different from one another!

But the question remains: why reuse playlists in close proximity at all? Even when the station operates autonomously, as it clearly does most of the time, it’d surely be easy enough to set up an auto-DJ using “smart random” (because truly random shuffles don’t sound random to humans) to get the same or a better effect.

Jack FM Style Guide
One of the things I love about Jack FM is how little they take seriously. Like their style guide.

Which leads to another interesting observation: Jack FM’s sister stations in Surrey and Hampshire also maintain a similar playlist most of the time… which means that they’re either synchronising their ad breaks (including their duration – I suspect this is the case) or else using filler jingles to line-up content with the beginnings and ends of songs. It’s a clever operation, clearly, but it’s not beyond black-box comprehension. More research is clearly needed. (And yes, I’m sure I could just call up and ask – they call me “Newcastle Dan” on the breakfast show – but that wouldn’t be even half as fun as the data mining is…)

× × × × × ×