# A homophonic Polybius challenge

The homophonic Polybius is a simple but hard to break manual cipher. Can a reader break a message I have encrypted in this scheme?

Designing a purely manual cipher (i.e., one that can be executed without a computer, cipher machine, or cipher tool) has proven a difficult problem. Most designs are either too complicated for practical use or insecure (some are even both). Almost all manual ciphers that were developed in the pre-computer era can today be broken with a computer program.

Although manual encryption algorithms lost importance with the advent of computer technology, they are still an active field of research.

### Homophonic Polybius

Today, I’m going to present a manual cipher I call “homophonic Polybius”. It is based on the Polybius cipher, a system invented by the ancient Greeks, and enhanced by the use of homophones. I have never seen this system in the literature, but I can imagine that similar concepts have been used before.

The homophonic Polybius is simple, but might be hard to break. It replaces every letter of the plaintext with a letter pair (digraph), which means that the ciphertext is twice as long as the plaintext. Such a property is not desirable, but can be tolerated if one is dealing with a plaintext of only a few hundred letters or less.

In the following, we work with a 25-letter alphabet. There is no Q. If a Q appears in the plaintext, we take a K instead.

For the homophonic Polybius cipher we need two keywords. As an example, we take PASTA and NOODLE. Each keyword is used to rearrange the alphabet in the following way: Write the keyword first, omit repeated letters, then append the remaining letters of the alphabet. For PASTA we get:

PASTBCDEFGHIJKLMNORUVWXYZ

With NOODLE, the alphabet is written as follows:

NODLEABCFGHIJKMPRSTUVWXYZ
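The keyword rule can be sketched in a few lines of Python. This is only an illustration; the function name `keyed_alphabet` is my own and not part of the original description.

```python
import string

def keyed_alphabet(keyword: str) -> str:
    """Return the 25-letter alphabet (no Q) rearranged by a keyword."""
    # 25-letter alphabet: Q is dropped, and a Q in the keyword becomes K.
    base = [c for c in string.ascii_uppercase if c != "Q"]
    result = []
    # Keyword letters come first, then the rest of the alphabet;
    # a letter is only taken the first time it appears.
    for ch in keyword.upper().replace("Q", "K") + "".join(base):
        if ch in base and ch not in result:
            result.append(ch)
    return "".join(result)

print(keyed_alphabet("PASTA"))   # PASTBCDEFGHIJKLMNORUVWXYZ
print(keyed_alphabet("NOODLE"))  # NODLEABCFGHIJKMPRSTUVWXYZ
```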

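Now we construct a substitution table. In the first step, we use the PASTA alphabet for the labels: the first 15 letters form three lines of column labels on the top (PASTB, CDEFG, HIJKL), and the remaining ten letters are written row by row into two columns of row labels on the left (MN, OR, UV, WX, YZ).

In the second step, the NOODLE alphabet is written row by row into the five-by-five square (NODLE, ABCFG, HIJKM, PRSTU, VWXYZ).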

With this Polybius-type substitution table, we have several options to encode each letter: one of the two row labels followed by one of the three column labels. For instance, the A can be encrypted as OP, OC, OH, RP, RC, or RH. In other words, we have six homophones (two times three) for each letter of the alphabet. Let’s now encrypt the plaintext TO BE OR NOT TO BE:
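The table and its homophones can be reproduced with a short Python sketch. The layout used here (15 column labels in three lines, ten row labels filled row by row) is inferred from the example digraphs given in the text, and the names `table_lines` and `homophones` are hypothetical helpers of mine.

```python
LABELS = "PASTBCDEFGHIJKLMNORUVWXYZ"  # PASTA alphabet, used for the labels
SQUARE = "NODLEABCFGHIJKMPRSTUVWXYZ"  # NOODLE alphabet, fills the 5x5 square

def table_lines(labels=LABELS, square=SQUARE):
    """Return the substitution table as a list of text lines."""
    # Three lines of column labels on top (first 15 label letters).
    lines = ["   " + " ".join(labels[i:i + 5]) for i in (0, 5, 10)]
    # Each square row is preceded by its two row labels (last 10 label letters).
    for r in range(5):
        row_labels = labels[15 + 2 * r:17 + 2 * r]
        lines.append(row_labels + " " + " ".join(square[5 * r:5 * r + 5]))
    return lines

def homophones(letter, labels=LABELS, square=SQUARE):
    """List all digraphs (row label + column label) for one plaintext letter."""
    row, col = divmod(square.index(letter), 5)
    col_labels = [labels[col + 5 * k] for k in range(3)]
    row_labels = [labels[15 + 2 * row + k] for k in range(2)]
    return [r + c for r in row_labels for c in col_labels]

print("\n".join(table_lines()))
print(homophones("A"))  # ['OP', 'OC', 'OH', 'RP', 'RC', 'RH']
```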

```
T  O  B  E  O  R  N  O  T  T  O  B  E  X
XK ND OI ML ND WA MH MI WF XF MA RI NB ZJ
```

The ciphertext is: XK ND OI ML ND WA MH MI WF XF MA RI NB ZJ

Each letter pair in the ciphertext was randomly chosen from the six homophones available for the respective letter. For instance, the T at the beginning was encrypted as XK, but could also have been encoded as WT, WF, WK, XT, or XF. When using this scheme, one should switch between the homophones of a letter in an irregular way.
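As a rough sketch, encryption and decryption with the PASTA/NOODLE table might look like this in Python. The label layout is inferred from the example digraphs, the function names are my own, and the use of `secrets.choice` for picking homophones is my choice, not something the scheme prescribes.

```python
import secrets

LABELS = "PASTBCDEFGHIJKLMNORUVWXYZ"  # PASTA alphabet (labels)
SQUARE = "NODLEABCFGHIJKMPRSTUVWXYZ"  # NOODLE alphabet (5x5 square)
COLS = [LABELS[i:i + 5] for i in (0, 5, 10)]              # three lines of column labels
ROWS = [LABELS[15 + 2 * r:17 + 2 * r] for r in range(5)]  # two labels per row

def encrypt(plaintext):
    """Replace each letter by a randomly chosen homophone (row label + column label)."""
    out = []
    for ch in plaintext.upper().replace("Q", "K"):
        if ch not in SQUARE:
            continue  # skip spaces and punctuation
        row, col = divmod(SQUARE.index(ch), 5)
        out.append(secrets.choice(ROWS[row]) + secrets.choice([line[col] for line in COLS]))
    return " ".join(out)

def decrypt(ciphertext):
    """Look up each digraph in the table; row and column labels never overlap."""
    out = []
    for pair in ciphertext.split():
        row = next(r for r, labels in enumerate(ROWS) if pair[0] in labels)
        col = next(line.index(pair[1]) for line in COLS if pair[1] in line)
        out.append(SQUARE[5 * row + col])
    return "".join(out)

print(decrypt("XK ND OI ML ND WA MH MI WF XF MA RI NB ZJ"))  # TOBEORNOTTOBEX
print(decrypt(encrypt("TO BE OR NOT TO BE")))                # TOBEORNOTTOBE
```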

### A challenge

The following ciphertext has been encrypted in a homophonic Polybius cipher:

`MG VI VH PG JG KE KB KB VR LU JG ZG JE VN ZD VC KA ME TB VR JS TU PG KA JS`
`VX VB VH PE KE VS KB PB KA MA PF ZX KA MB PA ME PI VE KI VF KB VA PD MG ZC`
`VB KU KS VH PE KA VC YX VE KA TS TC KA VB KI KE VB ME PR MA KH KB VC TI TU`
`PG KA VB VF ZF VH PE KG VN KB PC KA ME PX ZR KA MD PE MG PI VG KU VR KN VD`
`VR MU MG VS KG KH PF PE VU KE JA KN JN VF VH PE JA VB KN PD KG ME PR ZR KG`
`VI KG MG PI VD PG MG ZG KG ZB VR KN VD KU MR VU ZD VF MU VU PE KG KE KI PR`
`PG VU ZD VB ZF VG KG VB VD KG VN KI KE VD VG PD MG ZD KE KD VD KU KB MC KG`
`KB KI KN KA TI KG KA KS PG ZR KN KD VB KG KD MG KB KD MI KU LI VH KG KE KB`

The plaintext is in English, consisting of 200 letters. Both keywords are misspelled expressions, which means that a dictionary attack won’t work. Can a reader solve this challenge?

Further reading: Solve the Bigram 600 challenge and set a new world record

## Comments (23)

1. #1 David Oranchak
https://zodiackillerciphers.com
May 6, 2020

I think I got it but with some garbles:

AT THE END OF EVERY SEASON THE NORTHERN LEAGUE CHAMPION (FLAIRIN?) THE SUPER SERIES AGAINST THE SOUTHERN LEAGUE CHAMPION SO FAR EIGHTEEN (NORTHERN? SOUTHERN?) LEAGUE TEAMS HAVE CONSISTED OF THE EIGHTY SUPER SERIES PLAYED SINCE NINETEEN HUNDRED AND (SIXTY)?

2. #2 Nils Kopal
Krefeld & University of Siegen
May 6, 2020

David, there are two KBs at positions 7 and 8 that do not fit your solution, right?

3. #3 Nils Kopal
Krefeld & University of Siegen
May 6, 2020

But my solver gives me similar word/sentence parts so far:

ATTHEENNOFEVPSESEASONTHENORTHERNEEAGLEDCARRIDNAMAIRINTHESUREXPERIERAGAINSTTHERDATHESNOEAGUEWHARTIONSOOAREIGHTEENTDTHERNMEAGUETEARSHAVECONSISTEDOTHEEIGHTERATERSESIESTMAEEDSINVENINETEENHUNDREDANDFIFTE

4. #4 Klaus Schmeh
May 6, 2020

@David, Nils: Great job! How did you find the solution?

5. #5 Klaus Schmeh
May 6, 2020

Here’s the plaintext:
A T T H E E N D O F E V E R Y S E A S O N T H E N
O R T H E R N L E A G U E C H A M P I O N P L A Y
S I N T H E S U P E R S E R I E S A G A I N S T T
H E S O U T H E R N L E A G U E C H A M P I O N S
O F A R E I G H T E E N N O T H E R N L E A G U E
T E A M S H A V E W O N S I X T Y O F T H E E I G
H T Y S U P E R S E R I E S P L A Y E D S I N C E
N I N E T E E N H U N D R E D A N D F I F T E E N

6. #6 Nils Kopal
Krefeld & University of Siegen
May 6, 2020

The best I get with the help of the Homophonic Substitution Analyzer in CrypTool 2 (@Klaus, I think there are some errors in the ciphertext):

AT THE ENN OF EVERY SEASON THE NORTHERN LEAGUE CHAMPION SLAOR IN THE SUPER SERIER AGAINST THE ROUTHERN LEAGUE CHAMPION SO FAR EIGHTEEN NOTHERN LEAGUE TEAMS HAVE CONSISTY OF THE EIGHTY RUPER SERIES PLAYED SINCE NINETEEN HUNDRED AND SIXTE

7. #7 David Oranchak
https://zodiackillerciphers.com
May 6, 2020

It reduces to a simple homophonic substitution so I converted the digraphic alphabet to a unigraphic one:

I fed that into the AZDecrypt homophonic solver which yielded much of the solution. Then I tried to make manual corrections.

I haven’t been able to recover the keywords yet though, likely because the solution is still a bit off.
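The digraph-to-symbol conversion mentioned in this comment can be sketched in Python as follows. This is an illustration only: the function name `to_single_symbols` and the order of the symbol pool beyond the letters are assumptions, not the exact procedure that was used.

```python
import string

def to_single_symbols(ciphertext):
    """Map each distinct digraph to one symbol, in order of first appearance."""
    # Assumed symbol pool: uppercase, lowercase, digits, then punctuation (94 symbols).
    pool = (string.ascii_uppercase + string.ascii_lowercase
            + string.digits + string.punctuation)
    mapping = {}
    out = []
    for pair in ciphertext.split():
        if pair not in mapping:
            if len(mapping) >= len(pool):
                raise ValueError("more distinct digraphs than available symbols")
            mapping[pair] = pool[len(mapping)]
        out.append(mapping[pair])
    return "".join(out), mapping

# First ten digraphs of the challenge ciphertext.
text, mapping = to_single_symbols("MG VI VH PG JG KE KB KB VR LU")
print(text)  # ABCDEFGGHI
```

The result is an ordinary homophonic substitution ciphertext over single symbols, which is the form that general-purpose homophonic solvers expect.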

8. #8 Nils Kopal
Krefeld & University of Siegen
May 6, 2020

I used CrypTool 2’s Homophonic Substitution Analyzer. I saw some words here and there, but after I entered the beginning of David’s text, I could manually correct it until I had the previously shown text.

9. #9 David Oranchak
https://zodiackillerciphers.com
May 6, 2020

@Klaus,

Does one of the keywords have the letters ENDIX in it?

10. #10 David Oranchak
https://zodiackillerciphers.com
May 6, 2020

@Nils,

Yes, it appears there are some encoding errors.
Based on Klaus’ given plaintext, KB is encoded to two different plaintext letters (d and n). The same is true of VB (r and s).
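Such encoding conflicts can be found mechanically by aligning the ciphertext digraphs with the plaintext letters. The following Python sketch (with a hypothetical helper name, `find_conflicts`) reports every digraph that stands for more than one plaintext letter; it is shown here on the first ten digraphs of the challenge only.

```python
from collections import defaultdict

def find_conflicts(ciphertext, plaintext):
    """Return digraphs that are aligned with more than one plaintext letter."""
    seen = defaultdict(set)
    letters = plaintext.replace(" ", "")
    for pair, letter in zip(ciphertext.split(), letters):
        seen[pair].add(letter)
    # Several digraphs per letter are expected (homophones);
    # several letters per digraph indicate an encoding error.
    return {pair: sorted(ls) for pair, ls in seen.items() if len(ls) > 1}

cipher = "MG VI VH PG JG KE KB KB VR LU"
plain = "A T T H E E N D O F"
print(find_conflicts(cipher, plain))  # {'KB': ['D', 'N']}
```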

11. #11 Thomas
May 7, 2020

@David

I wonder why your method (replacing the ciphertext digraphs with one uppercase/lowercase letter each) worked. Did Klaus use fewer than 52 distinct digraphs (out of max. 6 x 25) in his ciphertext?

@Nils

So CrypTool accepts entering ciphertext digraphs that represent single plaintext letters?

@Klaus
Why are letters in Nils’ #3 (solution) and David’s #7 (replacement) cut off?

12. #12 David Oranchak
https://zodiackillerciphers.com
May 7, 2020

@Thomas
There are numbers in my transcription too. My and Nils’ posts got cut off because of the way the comments are formatted. The text is all there, but is just not displayed (if you highlight all and copy and paste somewhere else, you can see the missing text.)

Here is Nils’ solution with line breaks so you see the whole thing:

ATTHEENNOFEVPSESEASONT
HENORTHERNEEAGLEDCARRI
DNAMAIRINTHESUREXPERIE
RAGAINSTTHERDATHESNOEA
GUEWHARTIONSOOAREIGHTE
ENTDTHERNMEAGUETEARSHA
VECONSISTEDOTHEEIGHTER
ATERSESIESTMAEEDSINVEN
INETEENHUNDREDANDFIFTE

And here’s my reformatted transcription:

ABCDEFGGHIEJKLMNOPQHRS
DORTUCVFWGXOYZaObcPdef
gGhiAjUklCVONmeOnoOUfF
UPpYqGNrSDOUgsCVtLGuOP
Jt5Hz0k62Mg12VtFfpD2MU
sytU0tLfF0yiAMF70kG8tG
fzOrtOlDwz7Ut7AG79k!Ct

13. #13 Zachery VanderGraaff
May 7, 2020

This is a newbie question, but how does one begin to solve polybius ciphertext without the key phrase to begin with?

14. #14 Klaus Schmeh
May 7, 2020

@Zachery:
If it’s an ordinary Polybius, counting letter frequencies and guessing words will do the job. The E is the most frequent letter in the English language, followed by T and A. The homophonic Polybius is a lot more difficult to solve. My blog readers David Oranchak and Nils Kopal used the technique of hill climbing to break it. Hill climbing chooses a key (in this case a table) at random and changes it in small steps until a plausible solution finally comes out.

15. #15 Knox
May 8, 2020

Why not use the whole alphabet for column labels and the whole alphabet for line labels?

16. #16 Klaus Schmeh
May 8, 2020

Obviously, the homophonic one is much more difficult. Lol.

17. #17 Zachery VanderGraaff
May 8, 2020

Klaus Schmeh is it just a method of substitution then with the ordinary one? Thanks!

18. #18 Klaus Schmeh
May 8, 2020

@Zachery VanderGraaff:
>is it just a method of substitution
>then with the ordinary one? Thanks!
Yes, the Polybius is an ordinary monoalphabetic substitution.

19. #19 Klaus Schmeh
May 8, 2020

@Knox:
>Why not use the whole alphabet for column
>labels and the whole alphabet for line labels?
Might be a good idea. This would increase the number of homophones.

20. #20 David Oranchak
https://zodiackillerciphers.com
May 9, 2020

@Knox
That would increase the number of available substitutions for a single letter from 6 to 676 🙂

Which means the resulting ciphertext is much more likely to have very few repeating pairs, making it potentially as hard to crack as a one-time pad.

21. #21 Knox
May 9, 2020

@David
I should have said five letters for each column and five letters for each line.

22. #22 David Oranchak
https://zodiackillerciphers.com
May 9, 2020

@Knox
That would definitely make it more secure, until the enciphered plaintext is so long that pairs of letters from the key have to be reused for occurrences of plaintext letters. For example, if your plaintext only has 25 times where “e” appears, then you can use a completely different two-letter substitution each time. But as soon as you hit 26, you have to reuse one of them. Those reused pairs gradually reduce the security of the cipher.

23. #23 Nils Kopal
Krefeld & University of Siegen
May 10, 2020

Hi Thomas,
yes, CT2 (or the homophonic substitution analyzer) accepts multiple letters/digits as a “single symbol”. I am currently also working on improving it. Maybe, in a little while, after I finished the improvements, I will also make a video on the CT2 YouTube channel showing what I did 🙂