If you already appreciate why that’s cool, go play with it. If you don’t, allow me to explain.
What is a One-Time Pad and why is it awesome?
One-time pads are a form of cryptography which are simple enough to do by hand (you don’t need a computer, but it helps), versatile enough to transport any message, and – this is the clever part – completely unbreakable.
Yes, completely unbreakable. It doesn’t matter if you have a billion supercomputers and a billion years, a one-time pad is mathematically sound. So long as it’s used properly, it’s unbreakable, but it’s the difficulty and discipline required in using them properly – as well as difficulties in finding secure ways to share keys over long distances – that makes them impractical for widespread use.
They did, however, see a lot of use in espionage during the Second World War and the Cold War, and continue to be used today for some diplomatic messages, as well as occasionally by particularly paranoid civilians.
So what’s the story?
You’re probably familiar with the concept of a Caesar Cipher – you may have even played with them as a child – which is perhaps most-often seen nowadays in the form of ROT13. Put simply, a caesar cipher “rotates” letters through the alphabet, so perhaps A becomes B, B becomes C, C becomes D, and so on (in this example, Z would become A). So my message “IF YOU READ THIS YOU ARE GAY” becomes “JG ZPV SFBE UIJT ZPV BSF HBZ”. I can send that message to you, having already agreed with you the code, and you can roll each letter back by one (so A becomes Z, B becomes A, etc.), to get back the original message.
This is fundamentally flawed and offer no real security at all, of course. But suppose we made a couple of enhancements to our plain old Caesar Cipher. First, let’s add some punctuation to our alphabet (space, full stop, comma – we’ll treat these as letters in their own right which come after ‘Z’). Then, instead of rotating each letter in our message the same number of steps around, we’ll vary it. So let’s agree that the first letter will rotate 3 places, the second by 18, and the third by 11: then the fourth by 3 again, the fifth by 18, the sixth by 11, and so on. If we encode the same message now, we get:
- I becomes L (rotated by 3)
- F becomes X (rotated by 18)
- [space] becomes I (rotated by 11)
- Y becomes a comma (,)
And so on. Suddenly that’s a lot more secure than our plain old Caesar Cipher! Congratulations: you just invented the Vigenère Cipher. Unfortunately for you, it’s almost 500 years old already. Even more unfortunately, it’s still not very secure. It’s fine for passing notes in class, but it won’t do for sending orders to your agent on the other side of the Iron Curtain!
How is a One-Time Pad different?
The “key” to the cipher we used above is 3, 18, 11, and the problem is that the key ends up being re-used (repeated) throughout the course of the message. If the message was the word “ELF” (encrypted to “HAQ”), and we agreed never to use that same key again, then anybody who intercepted the message – even if they knew we were using a Vigenère Cipher – wouldn’t know what we’d said, except to say that it had three or fewer letters. We could equally have said “MAN” (using the key 8, 17, 8), “EAT” (using the key 0, 17, 14), or “EGG” (using the key 0, 23, 1). If we ever used the same key – 3, 18, 11 – again, our code would become vulnerable to frequency analysis, which is a technique for working out what the key might be based on the likelyhood of particular letters or words (especially common ones) being used in combination.
It’s pretty easy to see how to fix this: all you have to do is to choose a key that is at least as long as the message you want to encrypt, and never reuse the key.
This is how a one-time pad works. Suppose you and I agree a series of numbers, like this: 64191 25746 89891 93406 33604 89879. You keep a copy, and I keep a copy, and we never tell anybody else those numbers, or the order in which they appear.
When I want to send you a message, I first convert that message into a series of numbers, using a codebook or codetable. In the example codetable below – which has been optimised for the English language – the most-commonly used letters are represented by one digit each, while less-frequently used numbers are represented by two digits. So the message “STEAL THE PANTIES” becomes 82832 17890 83752 80148 33282. It’s important to remember that this still isn’t encrypted; it’s just encoded: turned into a format suitable for encryption.
If we often talk about “panties” in our messages (and who doesn’t?), we might add that word to our codebook to make it faster to write: for example, we might assign it the code “11” – in the table above, the prefix “99” means “look it up in the codebook”, so instead of writing “panties” as “80148 33282”, we’d write it as “9911” – cold war spies had whole dictionaries of most-common words assigned to numbers to make them shorter to write out! That makes our message: 82832 17890 83752 99110. In this particular implementation, we add a padding zero to make it up to a nice round block of five digits.
Next, we encrypt the message using our pre-arranged secret key, 64191 25746 89891 93406 33604 89879. To do this, we just take each digit in the message and add it to each digit in the key, ignoring any “tens” column. So 8 plus 6 is (1)4, 2 plus 4 is 6, 8 plus 1 is 9, and so on, to get our encrypted message.
All you have to do to decode it is run the whole thing backwards. From each digit in the message, deduct the corresponding value in the key – if you get any negative numbers, just add 10 to them so that they’re not negative any more. Then run the resulting encoded number through your codebook to get back the secret message.
In practice, using a codebook is optional, but very-highly recommended. In the basic codebook I’ve provided with my implementation, the word “condition” goes down from being “71547 23833 54” to just “99114 7”. A well-designed codebook will contain not only common words in your language, but anticipated words for the things that you expect to talk about in your messages (like “MISSION”, “CAPTURED”, and – of course – “PANTIES”).
Messages encrypted using one-time pads are so secure that it’s safe to send the message itself completely in the clear, which is exactly what we used to do. Especially during the cold war, but still today (and increasingly), governments have been able to communicate with spies in foreign countries simply by broadcasting strings of numbers over conventional radio, from what are called numbers stations by radio enthusiasts (and also by conspiracy theorists, of course). Of course, nowadays it’s perhaps more-feasible to send many kinds of messages by e-mail – and there are a number of one-time pad systems optimised for fully-computerised use, although there exists a greater risk of being traced online than by simply tuning in a radio.
Now: go have a play!
However, you certainly shouldn’t actually use it for passing secret messages around: read the caveats below if you can’t work out why for yourself!
- The first challenge with using one-time pads is finding a good secret key. People have used all kinds of things – patterns in music, entire text of books – that are all flawed and imperfect. The only secret key good enough for use in a one time pad is a cryptographically-random set of number. The random numbers generated by a conventional computer are not good enough: I suggest you get yourself five ten-sided dice and roll them all simultaneously, writing down the numbers which come up as they appear in front of you from left to right. Repeatedly. Yes, this is a boring process. For convenience, my implementation will generate random numbers for you, if you like, but they’re not good enough for actual use. The United States broke a German one-time pad in 1944 because the machine they used to generate the random numbers was not sufficiently random.
- The second challenge is getting your secret key to the friend to whom you want to send secret messages. This must be done in person. If you transmit it by any other medium, it could already have been compromised. Even if you encrypt it, the system can only be considered to be as good as that encryption, which defeats the point entirely. During the cold war, KGB spies were issued with tiny keybooks like the one shown on the right. A book this small can be hidden in any number of places, as anybody who’s been geocaching knows! After receiving and decoding a message, the page used to provide the key could easily be burned, eaten, or otherwise destroyed.
- A third challenge comes from the fact that no key must ever be re-used. As soon as a key is re-used, the code is no longer unbreakable. A combined U.S effort broke a 1945 Soviet one-time pad after the same key was used several times: once the U.S. knew something about the contents of some of the messages (they contained leaked British intelligence), they were able to partially break the key.
- There must be no way for an unauthorised party to observe the plaintext before it has been encrypted or after it has been decrypted. Your desktop PC won’t do, because your enemy can read your screen through the wall, install a keylogger, or just peep through your window!
- And, of course, as with all cryptography, your system is only as secure as the people involved. If your friend can be bribed, blackmailed, tricked or tortured into giving up information, the system fails. Obviously to maximise your ability to protect your system you should issue different keybooks to each of your trusted friends – this also helps to prevent them from talking to one another and organising a coup against you!
- Wikipedia article on one-time pads.
- A great explanation of one-time pads, also explains why books and music aren’t random enough to use as keys
- Simple implementations of one-time pads in other programming languages
- A good command-line one-time pad encrypter/decrypter written in Python
- Shortwave Espionage / Secret Signals – a great (free, online) resource and e-book about numbers stations
Cool little tool. The fact you haven’t provided a mapping for angle brackets is presumably deliberate, but means I couldn’t use it to transmit top secret mathematical formulas… Not that I have any.
I liked your description of one time pads (although I already knew all that stuff), it was very entertaining. And now I wanna do some cryptanalysis. It’s been years since I did a proper frequency analysis.
Dan Q mentioned this article on danq.me.