Playfair cipher: Is it breakable, if the message has only 40 letters?

Von Klaus Schmeh / 8. Dezember 2018 / 11 Kommentare

My readers have shown that a Playfair cryptogram consisting of only 50 letters can be broken. Here’s a Playfair challenge with only 40 letters. Can you break it, too?

The Playfair cipher, invented by Charles Wheatstone in the 19th century, is a simple encryption method. It is more secure than most other manual ciphers. However, the Playfair can be broken with Hill Climbing (which works even better if Simulated Annealing is included), provided that the ciphertext is long enough.

Earlier this year, George Lasry, …

… who is one of the world’s leading experts in breaking historical ciphers, solved a Playfair cryptogram consisting of only 50 letters I had introduced on this blog. As far as I know, this is the current world record in breaking a Playfair with a random matrix (I know that US cryptanalyst Alf Monge broke a 30 letter Playfair in the 1930s, but this one was based on a matrix created with a keyword, which made things easier).

In order to check if a 50 letters Playfair is the shortest that can be broken, I decided to create a Playfair challenge based on a ciphertext with only 40 letters.

How the Playfair works

The Playfair cipher substitutes letter pairs. So, the cleartext needs to be written as a sequence of letter pairs (the following cleartext is a Shakespeare quote taken from Robert Thouless’ life-after-death experiment):

BA LM OF HU RT MI ND SG RE AT NA TU RE SS EC ON DC OU RS EC HI EF NO UR IS HE RI NL IF ES FE AS T

The Playfair cipher requires that no letter pair consist of two equal letters. Therefore, we add an X between the two Ss:

BA LM OF HU RT MI ND SG RE AT NA TU RE SX SE CO ND CO UR SE CH IE FN OU RI SH ER IN LI FE SF EA ST

If the number of letters in the cleartext is odd, another X needs to be added at the last position, but this is not the case here. Next, we set up a 5×5 matrix containing the letters of the alphabet in a random order (we identify the J with the I, in order to get a 25 letter alphabet):

S U R P I
E A B C D
F G H K L
M N O Q T
V W X Y Z

As you might have noticed, the letter order in the matrix has been derived with the keyword SURPRISE. It would be more secure to use a completely random order of the letters.

Now, we replace the cleartext letter pairs (BA, LM, OF, HU, …) according to the three Playfair rules. Here are the rules in a diagram:

Here are the same rules in text form (I refer to the letter pair to be replaced as XY):

If X and Y are not in the same column and not in the same row (this is the most frequent case), form a rectangle and replace the two letters by the other two corner letters (the upper cleartext letter is replaced by the other upper letter in the rectangle, the lower cleartext letter by the lower one). For instance, LM becomes FT.
If the two letters stand in the same row, each one is replaced by its right neighbor. Here, BA becomes CB.
If the two letters stand in the same column, each one is replaced by its lower neighbor. In our example, AN becomes GW.

When we apply the Playfair rules on our cleartext with our 5×5 matrix, we get the following ciphertext:

CB FT MH GR IO TS TA UF SB DN WG NI SB RV EF BQ TA BQ RP EF BK SD GM NR PS RF BS UT TD MF EM AB IM

How to break a Playfair encryption

Using a computer, there are two ways to attack a Playfair cipher:

Dictionary attack: If a keyword has been used for creating the matrix, it is possible to break a Playfair cipher by key word guessing. For instance, the word SURPRISE is contained in virtually every English dictionary, so a computer that tests one keyword candidate after the other will sooner or later find it.
Hill Climbing: Hill Climbing is the current super-algorithm of historical codebreaking. As George Lasry has proven, Hill Climbing (enhanced with Simmulated Annealing) can break a Playfair cryptogram of 50 letters.

There are several books that explain how a Playfair can be broken without computer support – for instance Helen Fouché Gaines’ Cryptanalysis, André Langie’s Cryptography, and William Friedman’s Military Cryptanalysis. The concept is to guess a few words in the plaintext and to derive the 5×5 matrix based on the peculiarities of the Playfair cipher (for instance, if AB->XY then BA->YX).

However, breaking a Playfair cryptogram this way is pretty difficult, especially if the ciphertext is short. To make things not too complicated, the books mentioned above use messages containing several hundred letters for demonstration. In addition, these books assume that a few words of the plaintext are known or can be guessed. None of these books describes the complete Playfair codebreaking procedure, as this would require too much space – instead, they simply omit most of the trial-and-error reckoning involved in breaking a Playfair cryptogram.

A new Playfair challenge

Here is a Playfair-encrypted text I created:

OFFCERVUMWMOOMRUFIWCMAOGFVZYFXYBHGUXZVEH

Can a reader break this challenge? Here are the details:

The ciphertext has exactly 40 letters (spaces not included). The plaintext is in English.
I used the software CrypTool 2 for encryption. As far as I know, CrypTool implements the Playfair cipher exactly the way it is explained above.

The keyword is a random transposition of the alphabet. No keyword is used. So, a dictionary attack won’t work.

As mentioned, solving this challenge might set a new record. Good luck!
Follow @KlausSchmeh

Further reading: A mird in the hand is worth two in the mush: Solving ciphers with Hill Climbing

Linkedin: https://www.linkedin.com/groups/13501820
Facebook: https://www.facebook.com/groups/763282653806483/

Subscribe to Blog via Email

Kommentare (11)

#1 Michael Schroeder
Potential Alternate 'Proof' Method
8. Dezember 2018

I propose a more direct method of proving a 40 letter Playfair can not be unambiguously broken. That is, produce a pair of 40 letter messages and their corresponding key squares that produce the same cipher text. If someone could produce such an example, the proof would be quite convincing.
#2 Narga
8. Dezember 2018

I think it is possible to construct such a pair for any length of ciphertext.
#3 Klaus Schmeh
8. Dezember 2018

The question is if the second plaintext found makes sense. Finding a sequence of random letters that encrypts to a certain ciphertext is simple, no matter which cipher is used.
#4 Thomas
8. Dezember 2018

Since the unicity distance of standard playfair is 27 (https://practicalcryptography.com/cryptanalysis/text-characterisation/statistics/), 40 characters could be sufficient for a unique decryption.
#5 Nils Kopal
8. Dezember 2018

“MEET YOU TOMORROW AT FOUR TWENTY AT MARKETPLACE”

We broke it with CrypTool 2 and George Lasry’s Playfair analysis algorithm (based on simulated annealing). The plaintext popped up during the analysis but disappeared since others had a higher score. After we applied the beginning, which we saw popping up, as a crib, it always finds the solution.

We plan to publish the Playfair cryptanalysis in CrypTool 2 in the near future.
#6 Klaus Schmeh
8. Dezember 2018

@Nils: Congratulations! Great job!
As it seems, I need to create a 30 letters Playfair challenge now.
#7 Nils Kopal
8. Dezember 2018

The key is:
UVACT
GNYEF
KLZHI
WDROM
PSBQX
#8 Klaus Schmeh
8. Dezember 2018

Here’s a screenshot Nils provided me:
#9 Thomas
8. Dezember 2018

@Nils Kopal
Congratulations! What task did CrypTool 2 perform in breaking this cipher?
#10 George Lasry
9. Dezember 2018

This Cryptool 2 attack (to be soon released) is a ciphertext-only attack based on simulated annealing, using hexagrams (6-grams) for the scoring function.

Lower n-grams (e.g. trigrams, and even 4-grams and 5-grams) are not effective with short ciphertexts. The reason is that garbage decryptions get scores higher than for the correct one, from ‘spurious’ keys obtained by simulated annealing.

The problem is that the ciphertext is very short, and even with hexagrams, there are still too many high-score spurious decryptions. By chance, we spotted a solution starting with ‘meetyoutomorrow’ and within a few seconds, it was replaced in the program display by new spurious decryptions with higher scores.

When using ‘meetyou’ as a crib, the full solution could be found easily, as shown in the screenshot, using a partially known-plaintext attack (also implemented in Cryptool 2).
#11 Bruce Kallick
10. Dezember 2018

It seems to me there may be some confusion between “breaking (or solving) a crytogram” and “unambiguously breaking a cryptogram” or “finding a unique decryption.” In the text linked by Thomas one reads:

“Note that just because you have 30 characters of ciphertext for a substitution cipher, this does not mean you will be able to break it. The Unicity distance is a theoretical minimum. In general you may need 4 times more characters (100 or more) to reliably break substitution ciphers. The same problem exists with the other ciphers.”

In Section 5 of my Handycipher paper https://eprint.iacr.org/2014/257.pdf I introduce the notion of complementary keys, whereby any two plaintexts can be encrypted into the same ciphertext.

Inspired by Gil Hayward’s recalling his devising a set of wheel patterns to test the Tunny (Lorenz SZ40) cipher machines by encrypting “Now is the time for all good men to come to the aid of the party” (the standard test sentence used by telegraph people) into the opening lines of Wordsworth’s lyric poem “I Wandered Lonely as a Cloud,” I illustrated the technique by devising two keys one of which encrypts the Wordsworth text and the other encrypts the opening lines of Longfellow’s most famous work “Evangeline,” each into the same ciphertext.

I extracted this idea in a short paper https://rudegnu.com/xfiles/unbounded_unicity.pdf whose abstract reads:

“The classic Simple Substitution Cipher is modified by randomly inserting key-defined noise characters into the ciphertext in encryption which are ignored in decryption. Interestingly, this yields a finite-key cipher system with unbounded unicity.”

For this modified cipher (called a Cloaked Substitution Cipher) calculating the ratio of the number of bits required to express the key divided by the redundancy of English in bits per character, the unicity distance should be 43. Yet, it is shown that for any two plaintext messages M1 and M2 there are keys K1 and K2 and a ciphertext C such that C is decrypted with K1 to M1, and decrypted with K2 to M2. The Wordsworth and Longfellow texts together with a 364-character common ciphertext are offered as an example.