The Hardest Hangman

What’s the hardest word to guess, when playing hangman? I’ll come back to that.

A game of hangman; D_N; wrong guesses are R, E, S, O, T and I.
Whatever could the missing letter be?

Last year, Nick Berry wrote a fantastic blog post about the optimal strategy for Hangman. He showed that the best guesses to make to get your first “hit” in a game of hangman are not the most-commonly occurring letters in written English, because these aren’t the most commonly-occurring letters in individual words. He also showed that the first guesses should be adjusted based on the length of the word (the most common letter in 5-letter words is ‘S’, but the most common letter in 6-letter words is ‘E’). In short: hangman’s a more-complex game than you probably thought it was! I’d like to take his work a step further, and work out which word is the hardest word: that is – assuming you’re playing an optimal strategy, what word takes the most-guesses?

A hanged man being quartered by his executioner.
The rules of hangman used to be a lot more brutal. Nowadays, very few people die as a result of the game.

First, though, we need to understand how hangman is perfectly played. Based on the assumption that the “executioner” player is choosing words randomly, and that no clue is given as to the nature of the word, we can determine the best possible move for all possible states of the game by using a data structure known as a tree. Suppose our opponent has chosen a three-letter word, and has drawn three dashes to indicate this. We know from Nick’s article that the best letter to guess is A. And then, if our guess is wrong, the next best letter to guess is E. But what if our first guess is right? Well, then we’ve got an “A” in one or more positions on the board, and we need to work out the next best move: it’s unlikely to be “E” – very few three-letter words have both an “A” and an “E” – and of course what letter we should guess next depends entirely on what positions the letters are in.

A tiny fragment of possible states for the tree of three-letter words, in hangman.
There are billions of possible states of game play, but you can narrow them down quickly with strategic guessing.

What we’re actually doing here is a filtering exercise: of all of the possible letters we could choose, we’re considering what possible results that could have. Then for each of those results, we’re considering what guesses we could make next, and so on. At each stage, we compare all of the possible moves to a dictionary of all possible words, and filter out all of the words it can’t be: after our first guess in the diagram above, if we guess “A” and the board now shows “_ A _”, then we know that of the 600+ three-letter words in the English language, we’re dealing with one of only about 134. We further refine our guess by playing the odds: of those words, more of them have a “C” in than any other letter, so that’s our second guess. If it has a C in, that limits the options further, and we can plan the next guess accordingly. If it doesn’t have a C in, that still provides us with valuable information: we’re now looking for a three-letter word with an A in the second position and no letter C: that cuts it down to 124 words (and our next guess should be ‘T’). This tree-based mechanism for working out the best moves is comparable to that used by other game-playing computers. Hangman is simple enough that it can be “solved” by contemporary computers (like draughts – solved in 2007 – but unlike chess: while modern chess-playing computers can beat humans, it’s still theoretically possible to build future computers that will beat today’s computers).

Pile of stones, and the text "the best word to choose in hangman is the word that your opponent will not guess"
Zen Hangman asks the really important questions. If a man has one guess left and refuses to pick a letter, does he live forever, or not at all?

Now that we can simulate the way that a perfect player would play against a truly-random executioner, we can use this to simulate games of hangman for every possible word (I’m using version 0.7 of this British-English dictionary). In other words, we set up two computer players: the first chooses a word from the dictionary, the second plays “perfectly” to try to guess the word, and we record how many guesses it took. So that’s what I did. Here’s the Ruby code I used. It’s heavily-commented and probably pretty understandable/good learning material, if you’re into that kind of thing. Or if you fancy optimising it, there’s plenty of scope for that too (I knocked it out on a lunch break; don’t expect too much!). Or you could use it as the basis to make a playable hangman game. Go wild.

The hardest three-letter hangman words: xxv, xxx, wiz, oak, vex, vox, aux, fox, yuk, www...
The hardest three-letter hangman words. “Sly” is particularly… well, sly.

Running the program, we can see that the hardest three-letter word is “xxv”, which would take 22 guesses (20 of them wrong!) to get. But aside from the roman numeral for 25, I don’t think that “xxv” is actually a word. Perhaps my dictionary’s not very good. “Oak”, though, is definitely a word, and at 20 guesses (17 wrong), it’s easily enough to hang your opponent no matter how many strokes it takes to complete the gallows.

The hardest four-letter hangman words: xxxv, quiz, jazz, zinc, faux, foxy, hazy, jibe, quay, buzz, gibe, guan, huge...
Interestingly, “oaks” is an easier word than “oak” (although it’s still very difficult): the addition of an extra letter to a word does not make it harder, especially when that letter is common.

There are more tougher words in the four-letter set, like the devious “quiz”, “jazz”, “zinc”, and “faux”. Pick one of those and your opponent – unless they’ve seen this blog post! – is incredibly unlikely to guess it before they’re swinging from a rope.

The hardest five, six, and seven-letter hangman words, including jazzy, quaff, foxed, foxing, queued, favour, vaquero, jazzier, quizzed...
“Hazing foxes, fucking cockily” is not only the title of a highly-inappropriate animated film, but also a series of very challenging Hangman words.

As we get into the 5, 6, and 7-letter words you’ll begin to notice a pattern: that the hardest words with any given number of letters get easier the longer they are. That’s kind of what you’d expect, I suppose: if there were a hypothetical word that contained every letter in the alphabet, then nobody would ever fail to (eventually) get it.

The hardest eight, nine, and ten-letter hangman words, including quizzing, puppyish, picklock, jazziness, pollywogs, cufflinks, humbuggery, juxtaposed, bucketfuls...
Some of the longer words are wonderful, like: dysprosium, semivowel, harrumph, and googolplex.

When we make a graph of each word length, showing which proportion of the words require a given number of “wrong” guesses (by an optimised player), we discover a “sweet spot” window in which we’ll find all of the words that an optimised player will always fail to guess (assuming that we permit up to 10 incorrect guesses before they’re disqualified). The window seems small for the number of times I remember seeing people actually lose at hangman, which implies to me that human players consistently play sub-optimally, and do not adequately counteract that failing by applying an equal level of “smart”, intuitive play (knowing one’s opponent and their vocabulary, looking for hints in the way the game is presented, etc.).

Graph showing the proportion of each word of a given length that take a given number of "wrong" guesses to optimally solve.
The “sweet spot” in the bottom right is the set of words which you would expect a perfect player to fail to guess, assuming that they’re given a limit of 10 “wrong” guesses.

In case you’re interested, then, here are the theoretically-hardest words to throw at your hangman opponent. While many of the words there feel like they would quite-rightly be difficult, others feel like they’d be easier than their ranking would imply: this is probably because they contain unusual numbers of vowels or vowels in unusual-but-telling positions, which humans (with their habit, inefficient under normal circumstances, of guessing an extended series of vowels to begin with) might be faster to guess than a computer.

Word Guesses taken “Wrong” guesses needed
quiz 24 20
jazz 22 19
jazzy 22 18
quaff 22 18
zinc 21 17
oak 20 17
vex 20 17
vox 20 17
foxing 22 16
foxed 21 16
queued 20 16
fuzzy 20 16
quay 20 16
pinup 20 16
fox 19 16
yuk 19 16
vaquero 22 15
jazzier 21 15
quizzed 21 15
hazing 21 15
favour 21 15
yoking 21 15
quays 20 15
quark 20 15
joked 20 15
guyed 20 15
foyer 20 15
bumph 20 15
huge 19 15
quip 19 15
gibe 19 15
rump 19 15
guan 19 15
quizzed 19 15
oaks 19 15
murk 19 15
fezzes 19 15
yuck 19 15
keno 19 15
kazoo 19 15
Download a longer list
(there’s plenty more which you’d expect to “win” with)

If you use this to give you an edge in your next game, let me know how it works out for you!

Update – 8 March 2019: fixed a broken link and improved the layout of the page.

× × × × × × × × ×

Too Ruby

Ruby, a programming language of which I’m quite fond, is well-known for it’s readability and ease of comprehension, among about thirty-seven other wonderful features.

I rediscovered quite how readable the language is when I genuinely ended up writing the following method last week:

# On saving, updates the #Shift counters if the #ExperienceLevel of this
# #Volunteer has been changed
def update_counters_if_experience_level_changed
  update_counters if experience_level_changed?
end

For the benefit of those of you who aren’t programmers, I’ll point out that which is obvious to those of us who are: the body of the method (that’s the line that’s indented) is almost identical to the method name (the line that starts with “def”).

This is the equivalent of going to WikiHow and looking up the article on, say, How to Make a Tie Dyed Cake, only to discover that the text of the article simply says, “Choose what colours you want, and then make a cake in those colours”… and you understand perfectly and go and make the cake, because you’ve got that good an understanding. In this metaphor, you’re the Ruby interpreter, by the way. And the cake is delicious.

Okay, I cheated a little: the experience_level_changed? method was provided for me by the Rails framework. And I had to write the update_counters method myself (although it, too, contains only one line of code in its body). But the point is still the same: writing Ruby, and thinking in a Rubyish way, produces beautifully readable, logical code.

×

Kitty

~$ sudo gem install kitty
Successfully installed kitty-0.0.2
1 gem installed
Installing ri documentation for kitty-0.0.2...
Installing RDoc documentation for kitty-0.0.2...

~$ kitty
     ____
    (.   \
      \  |
       \ |___(\--/)
     __/    (  . . )
    "'._.    '-.O.'
         '-.  \ "|\
            '.,,/'.,,

~$ kitty
         /\_/\
    /\  / o o \
   //\\ \~(*)~/
   `  \/   ^ /
      | \|| ||
      \ '|| ||
       \)()-())

Kitty. That is all.

Hiding In Plain “Site”

I’ve written a program called PicInHTML, which makes web pages with concealed images which are shown when text on the page is selected. What’s clever about these page are how they work: they’re a single file, with no dependence on images nor Javascript, and they work by leveraging the little-used ::selected CSS selector. Each individual letter on the page is given a CSS class to associate it with the colour of a corresponding pixel in the source image, and selecting the text changes the background colour to that pixel colour.

That’s a wordy way of putting it. Let’s try an example:

An example of a special page - selecting the text in this page reveals the Reddit alien. Click on the image to see the discussion about this example on Reddit.

Give it a go on any of the following pages. You’ll need to not be using Microsoft Internet Explorer, I’m afraid, as it doesn’t support the ::selected CSS selector. All you have to do is select the text on the page to reveal the secret image!

If you’re interested in the mechanics of how it works, or you’d like to get a copy of the source code and have a play yourself, see my project page on PicInHTML. You could also try looking at the source code of any of the pages, above: they’re not too-hard to read, especially for machine-generated code.

!
An example of a special page - selecting the text in this page reveals the Reddit alien. Click on the image to see the discussion about this example on Reddit.

T

×

SSL Client Certificate Authentication In Ruby On Rails

I’ve been playing with using client-side SSL certificates (installed into your web browser) as a means to authenticate against a Ruby on Rails-powered application. This subject is geeky and of limited interest even to the people who read this blog (with the possible exception of Ruth, who may find herself doing exactly this as part of her Masters dissertation), so rather than write about it all here, I’ve written a howto/article: SSL Client Certificate Authentication In Ruby On Rails. If you’re at all interested in the topic, you’re welcome to have a read and give me any feedback.

Writing A Calendar App In Rails Vs. PHP

Some time ago, I wrote a web-based calendar application in PHP, one of my favourite programming languages. This tool would produce a HTML tabular calendar for a four week period, Monday to Sunday, in which the current date (or a user-specified date) fell in the second week (so you’re looking at this week, last week, and two weeks in the future). The user-specified date, for various reasons, would be provided as the number of seconds since the epoch (1970). In addition, the user must be able to flick forwards and backwards through the calendar, “shifting” by one or four weeks each time.

Part of this algorithm, of course, was responsible for finding the timestamp (seconds since the epoch) of the beginning of “a week last Monday”, GMT. It went something like this (pseudocode):

1. Get a handle on the beginning of "today" with [specified time] modulus [number of seconds in day]
2. Go back in time a week by deducting [number of seconds in day] multiplied by [number of days in week] (you can see I'm a real programmer, because I set "number of days in week" as a constant, in case it ever gets changed)
3. Find the previous Monday by determining what day of the week this date is on (clever functions in PHP do this for me), then take [number of seconds in day] multiplied by [number of days after Monday we are] from this to get "a week last Monday"
4. Jump forwards or backwards a number of weeks specified by the user, if necessary. Easy.
5. Of course, this isn't perfect, because this "shift backwards a week and a few days" might have put us in to "last month", in which case the calendar needs to know to deduct one month and add [number of days in last month]
6. And if we just went "back in time" beyond January, we also need to deduct a year and add 11 months. Joy.

So; not the nicest bit of code in the world.

I’ve recently been learning to program in Ruby On Rails. Ruby is a comparatively young language which has become quite popular in Japan but has only had reasonable amounts of Westernised documentation for the last four years or so. I started looking into it early this year after reading an article that compared it to Python. Rails is a web application development framework that sits on top of Ruby and promises to be “quick and structured”, becoming the “best of both worlds” between web engineering in PHP (quick and sloppy) and in Java (slow and structured). Ruby is a properly object-oriented language – even your literals are objects – and Rails takes full advantage of this.

For example, here’s my interpretation in Rails of the same bit of code as above:

@week_last_monday = 7.days.ago.gmtime.monday + params[:weeks].to_i.weeks

An explanation:

  • @week_last_monday is just a variable in which I’m keeping the result of my operation.
  • 7.days might fool you. Yes, what I’m doing there is instantiating an Integer (7, actually a Fixint, but who cares), then calling the “days” function on it, which returns me an instance of Time which represents 7 days of time.
  • Calling the ago method on my Time object, which returns me another Time object, this time one which is equal to Time.now (the time right now) minus the amount of Time I already had (7 days). Basically, I now have a handle on “7 days ago”.
  • The only thing PHP had up on me here is that it’s gmdate() function had ensured I already had my date/time in GMT; here, I have to explicitly call gmtime to do the same thing.
  • And then I simply call monday on my resulting Time object to get a handle on the beginning of the previous Monday. That simple. 24 characters of fun.
  • + params[:weeks].to_i.weeks simply increments (or decrements) the Time I have by a number of weeks specified by the user (params[:weeks] gets the number of weeks specified, to_i converts it to an integer, and weeks, like days, creates a Time object from this. In Ruby, object definitions can even override operators like +, -, <, >, etc., as if they were methods (because they are), and so the author of the Time class made it simple to perform arithmetic upon times and dates.

This was the very point at which I feel in love with Ruby on Rails.