Magnus Ekhall from Sweden has solved my Playfair challenge from April 2019. With only 30 letters, this ciphertext is the shortest of its kind ever broken.

The Playfair cipher, invented by Charles Wheatstone in the 19th century, can be broken with hill climbing (enhanced with simulated annealing), provided that the ciphertext is long enough. Last year, George Lasry, …

Lasry-01

Source: Schmeh

solved a Playfair cryptogram consisting of only 50 letters I had introduced on this blog. As far as I know, this success set a new world record in breaking a Playfair with a random matrix (US cryptanalyst Alf Monge broke a 30 letter Playfair in the 1930s, but this one was based on a matrix created with a keyword, which made things easier).

In order to check if it is possible to do better, I published a Playfair challenge based on a ciphertext with only 40 letters. This time, Nils Kopal solved it and improved the world record.

Source: Bonavoglia

 

How the Playfair works

The Playfair cipher substitutes letter pairs. So, the cleartext needs to be written as a sequence of letter pairs (the following cleartext is a Shakespeare quote taken from Robert Thouless’ life-after-death experiment):

BA LM OF HU RT MI ND SG RE AT NA TU RE SS EC ON DC OU RS EC HI EF NO UR IS HE RI NL IF ES FE AS T

The Playfair cipher requires that no letter pair consist of two equal letters. Therefore, we add an X between the two Ss:

BA LM OF HU RT MI ND SG RE AT NA TU RE SX SE CO ND CO UR SE CH IE FN OU RI SH ER IN LI FE SF EA ST

If the number of letters in the cleartext is odd, another X needs to be added at the last position, but this is not the case here. Next, we set up a 5×5 matrix containing the letters of the alphabet in a random order (we identify the J with the I, in order to get a 25 letter alphabet):

S U R P I
E A B C D
F G H K L
M N O Q T
V W X Y Z

As you might have noticed, the letter order in the matrix has been derived with the keyword SURPRISE. It would be more secure to use a completely random order of the letters.

Now, we replace the cleartext letter pairs (BA, LM, OF, HU, …) according to the three Playfair rules. Here are the rules in a diagram:

Playfair-diagram

Here are the same rules in text form (I refer to the letter pair to be replaced as XY):

  1. If X and Y are not in the same column and not in the same row (this is the most frequent case), form a rectangle and replace the two letters by the other two corner letters (the upper cleartext letter is replaced by the other upper letter in the rectangle, the lower cleartext letter by the lower one). For instance, LM becomes FT.
  2. If the two letters stand in the same row, each one is replaced by its right neighbor. Here, BA becomes CB.
  3. If the two letters stand in the same column, each one is replaced by its lower neighbor. In our example, AN becomes GW.

When we apply the Playfair rules on our cleartext with our 5×5 matrix, we get the following ciphertext:

CB FT MH GR IO TS TA UF SB DN WG NI SB RV EF BQ TA BQ RP EF BK SD GM NR PS RF BS UT TD MF EM AB IM

 

A new Playfair challenge

After Nils had successfully attacked my 40-letters cryptogram, I decided to take a 30-letters plaintext for my next Playfair chalenge. Here’s the ciphertext CrypTool 2 delivered:

SXCREDBQUGVZRSMNDSIKRKWRSGNSNXVM

This ciphertext has 32 letters, which means that two additinal letters were added to avoid doublings and/or a single letter at the end. As I didn’t use a keyword for the matrix, a dictionary attack did’t work.

 

Challenge solved

Two days ago, I received an email from Magnus Ehall, whom I know from the HistoCrypt conferences in Mons and Uppsala. Magnus is a Swedish computer engineer, working in software development. He has always had an interest for codes and ciphers, and breaking them. At the moment, his main interest is with the Turing bombes and the US Navy bombes. Magnus provided me a some great pictures of the Rök runstone, which I published in a blog post.

Source: Ekhall

In his mail, Magnus told me that he had solved the 30 letters Playfair challenge. The solution he found was:

TAKETHELASTXTRAINTOYORKONSUNDAYX

This was correct. The first X was included to avoid a doubled T, the second one to make the number of letters even. The original plaintext is: TAKE THE LAST TRAIN TO YORK ON SUNDAY

Thankfully, Magnus provided me a report on how he had broken the Playfair challenge. Here it is:

I wrote a simulated annealing programme, first in Python but later in C. I did not get very good results at first, but after listening to George Lasry’s presentation (and discussing the subject at the conference dinner) on how he solved the 40 letter Playfair challenge at the excellent HistoCrypt conference in Mons, I got motivated to give it another try.

In short I would like to describe the algorithm as a simulated annealer, with restarts and pruning. As the algorithm goes along to find local and global (?) maximums, I store the already tried maximums in a hash table and not allow the algorithm to go to that state again. This uses quite a lot of memory.

I log all states that produce a score over a certain limit and then sift through this output both manually and with small helper scripts that for example highlights words that exist in a dictionary.

Letting the programme run repeatedly for a week or so, I found the solution amongst the quite large number of possible solutions logged. It should be noted that the solution did not have an extremely good score, so there was probably quite a bit of luck involved as well.

I can only congratulate Magnus on this great success. Once again, I’m overwhelmed by my reader’s skills in breaking tough cryptograms. I’m proud that my challenges stimulate so much valuable research. Thanks to Magnus and all other codebreakers among my readers.


Further reading: A mird in the hand is worth two in the mush: Solving ciphers with Hill Climbing

Linkedin: https://www.linkedin.com/groups/13501820
Facebook: https://www.facebook.com/groups/763282653806483/

Subscribe to Blog via Email

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

Kommentare (15)

  1. #1 Gerd
    5. September 2019

    Amazing that this could be solved!
    Klaus used a 25 letter key that looks random to me. So a ciphertext with 25 letters and a random 25 letter key should be unsolvable as it is the same complexity as a one time pad, or am I missing something? (Even if the coding is Playfair, not Vernam)
    Magnus could solve the 30 letters text – which is only 5 letters more than “totally random”…

  2. #2 Thomas
    5. September 2019

    @ Magnus

    Congratulations!

  3. #3 Klaus Schmeh
    5. September 2019

    Bill Ricker via Facebook:
    Playfair compromised Wheatstone’s design when failing to promote it to Whitehall, from which compromise it has never recovered.

    Wheatstone the designer knew raw keyword order was inadequate and specified matrix derangement be by keyword/phrase followed by a keyed transposition (of any width coprime with 5, possibly keyed via same keyword, but even better by another).

  4. #4 Magnus Ekhall
    Borensberg
    6. September 2019

    Thank you for the kind words.
    I’ll answer any questions if there are any.

  5. #5 Thomas
    6. September 2019

    @ Magnus

    “I store the already tried maximums in a hash table”: Could you explain this efficient approach? How do you convert those maximums to hashs? Thanks in advance.

  6. #6 Magnus Ekhall
    Borensberg
    6. September 2019

    @Thomas:
    After each round of looking for a better state (testing to swap all pairs of letters in the key, test swapping all pairs of rows/columns in the key) I store the resulting clear text in a hash table.

    If I later encounter a key that would produce a clear text that exists in my hash table, I then refuse that key immediately. With this I hope to slowly force the algorithm into looking at worse and worse maximums instead of sticking to the best ones (which were not the correct answer).

    I eventually wrote my programme in C. But only for this hash table, I switched to C++ and use an std::unordered_set which is implemented as a hash table. I do not have to care about how the hash is computed, I trust the C++ standard library here.
    std::unordered_set is the only C++ construct I used.

  7. #7 Kaligule
    6. September 2019

    @Magnus:
    Could you explain what you used as a fittness function? Did you do a dictionary search on the “deciphered” texts? Or a frequency analysis on the characters?

    Solving a 30 character text with a 25 character key sounds impossible to me, congratulations.

  8. #8 Marc
    6. September 2019

    @Gerd,
    Playfair isn’t a polyalphabetic substitution, so this comparison with the one time pad does not work.

  9. #9 David Oranchak
    http://zodiackillerciphers.com
    7. September 2019

    Sounds like a tabu search: https://en.wikipedia.org/wiki/Tabu_search

    Thanks for the explanation, Magnus. And congrats!

  10. #10 George Lasry
    7. September 2019

    Congrats! This is a significant algorithmic breakthrough and this can surely be applied to more ciphers. Well done.
    George

  11. #11 George Lasry
    7. September 2019

    The unicity distance for Playfair and English is 22. Would be interesting to see if a 22-letter message without x can be solved.

  12. #12 Klaus Schmeh
    7. September 2019

    Nils Kopal via Facebook:
    Very cool – Well done

  13. #13 Gerd
    7. September 2019

    Hello Marc & George,

    I still try to understand, how Playfair can reduce the required ciphertext length below the length of the key. Here is a calculation that gives a unicity distance of 27 for playfair, but maybe this is wrong?
    http://practicalcryptography.com/cryptanalysis/text-characterisation/statistics/

  14. #14 Narga
    7. September 2019

    @Gerd,

    if you follow the argument on the website you linked, you can see that the number of different keys, the language and the alphabet are going into the calculation, not the length of the key.

    James Lyons (of practicalcryptography.com) however forgot to take into account 1) that for Playfair there are identical keys when you rotate the keysquare in any direction, leaving only 24! different keys and 2) that the alphabet is different with only 25 instead of 26 letters (i=j).

    This paper here says 22.69 for Playfair and interestingly quotes on page 51 a solve of a 30 letter message from back in 1936!

    C.A. Deavours: UNICITY POINTS IN CRYPTANALYSIS
    https://www.tandfonline.com/doi/pdf/10.1080/0161-117791832797

  15. #15 Magnus Ekhall
    Borensberg
    9. September 2019

    @Kaligule:
    My fitness function was based on 5-grams, but I also removed all (!) occurrences of the letter X in the string that I evaluated (and compensated the score if any character was removed).

    @David Oranchak:
    I had not heard about Tabu search, but after reading up on that I agree, what I used was close to Tabu search.