Christoph’s Chaotic Caesar Challenge

Blog reader Christoph Tenzer has sent me a nice crypto challenge, based on an enhanced Caesar cipher. Can a reader solve it?

The homophonic Polybius challenge I introduced a few days ago was solved immediately by my readers Dave Oranchak and Nils Kopal. Both used hill climbing software designed to attack homophonic ciphers. Of course, I knew tools of this kind, but – to be honest – I didn’t think that they were powerful enough to break a 200-letter plaintext encrypted with a method that provides six homophones for every character of the alphabet.

Blog reader Knox suggested that instead of one alphabet (15 letters above the matrix, 10 left of it), two (25 letters above the matrix, 25 left of it) should be used to create homophones. This would result in 25 homophones per character of the alphabet. I think I will come back to this concept in the near future.

The chaotic Caesar cipher

My article about the homophonc Polybius triggered long-time blog reader Christoph Tenzer

Source: Christoph Tenzer

… to send me the following description of a cipher he calls “chaotic Cesar”. It is based on the Caesar cipher, a simple letter substitution using a shifted alphabet, known for more than 2000 years. He then borrows two additional concepts from other manual ciphers.

In the following, we work with the standard English 26-letter alphabet. We need a keyword and a number n. As an example, we take the keyword CHALLENGE and n=3. The keyword is used to rearrange the alphabet in the usual way: Write the keyword first, omit repeating letters, append the remaining letters. For CHALLENGE, we receive:

CHALENGBDFIJKMOPQRSTUVWXYZ.

To encrypt a plaintext, one starts with the first letter and performs the following steps (in the first step, the number n comes into play):

1. Find the position of the plaintext letter in the cipher alphabet. The ciphertext letter is the one n steps to the right (continue counting from the left if you reach the end).
2. Switch the position of plaintext and ciphertext letter in the cipher alphabet.
3. Continue with the next plaintext letter.

Using the keyword CHALLENGE and n=3, the plaintext TO BE OR NOT TO BE is encrypted as follows:

1. T returns W and then the positions of T and W are switched, so the alphabet is now: CHALENGBDFIJKMOPQRSWUVTXYZ
2. O returns R and then the positions of O and R are switched, and the alphabet is now: CHALENGBDFIJKMRPQOSWUVTXYZ
3. B returns I and then the positions of B and I are switched, and the alphabet is now: CHALENGIDFBJKMRPQOSWUVTXYZ
4. E returns I and so on…

Here’s the ciphertext: WRIIUUDXZACMM

Decryption works almost the same way, starting with the first ciphertext letter and the original cipher alphabet:

1. Find the position of the ciphertext letter in the cipher alphabet and the cleartext letter is the one n steps to the LEFT (continue from the right if you reach the beginning).
2. Switch the position of cleartext and ciphertext letter in the cipher alphabet.
3. Continue with the next ciphertext letter.

In our example, the original cipher alphabet was: CHALENGBDF IJKMOPQRS TUVWXYZ and n=3. Thus, W returns T, R returns O and so on… until we receive TOBEORNOTTOBE.

Christoph’s Chaotic Caeasar Cipher Challenge

Christoph has encrypted the following ciphertext with the chaotic Cesar cipher:

```YWXHZVFKTLJIPQPUHUSIEUNPXWTXHA
TZITGQQRLXXERRJCJQPXICKJECPDZD
MLUIDAHZRDFWUZPMXLSHUNJIXMDUEN
TZIIDSESYHBVEXSIUMPTZIJPDJNISE
DERHHTYAQHEMTIRAICENTZMJXNDNVD
BZLXTTLITCCDOOJSJESIIWPXJJNDUN
PBBSLPSIAFRCRPAFIXRTKLQIRBIXSI
WSRBEDANUUXAIDCTCEHYZOZARZRUXU
HMCISUWEABIPDEJOZLERYTAERATETL
ZRZXAAOZKXIUWNPSYDZSIGPZJBZRKQ
LXSTSITYKKXHGXDWIXPRXOVRQTQFXX
IVJOSBZDNOPWLFLNIOP```

The plaintext is in English, consisting of 379 letters. The keyword is not a single word and has more than 10 different letters, which means that a dictionary or brute force attack probably won’t work. The number n was chosen randomly between 1 and 25.

Thank you very much, Christoph, for this crypto challenge. Can a reader solve it?

Further reading: Rubik’s Cube encryption challenge solved

Kommentare (30)

1. #1 Gert Brantner
Berlin
10. Mai 2020

Combining two or more weak ciphers results in a super cipher. Al-Kindi, ca. 850 AD

2. #2 farmerjohn
10. Mai 2020

Quick thoughts for optimization (sorry if wrong or trivial):

– Case n is equivalent to (26-n) with the alphabet reversed.
– if n is odd (GCD(26, n)=1) the alphabet is equivalent to some other alphabet with n=1
– if n is even (GCD(26, n)=2) the alphabet is equivalent to two alphabets of length 13 with n=1
– if n=13 then we have 13 cycles of length 2 which can be probably quickly detected

3. #3 Narga
10. Mai 2020

@Gert: do you know what kind of ciphers he combined?
I actually worked the other way round: I started from known difficult ciphers and tried to remove or exchange complex elements while keeping the level high. Trying to reduce to a minimum of necessary elements and explanation steps in this case.

4. #4 Bill Briere
Wyoming, USA
10. Mai 2020

Christoph,

Your cipher is neat, compact, simple, and secure. I like it! By using 52 Scrabble tiles, one can even do manual encryption/decryption quickly and without eraser rubbings.

Periodic “reset” points might help reduce sweeping errors in sloppy field use and in longer messages.

This system appears to be loosely based on the Chaocipher (“chaos cipher”), which also uses dynamic substitution. In its manually operated form, the Chaocipher is much more secure, but is also clunky and error-prone, since both the plaintext and ciphertext alphabets are in motion.

5. #5 Norbert
Berlin Homeoffice
10. Mai 2020

I think farmerjohn made some very good points.

To put it in my own words:
Every key (i. e., combination of keyword and n≠0) can be transformed into an equivalent key in which n equals 1, 2 or 13.

In fact, if n=13 then each letter will always be encrypted the same, making the cipher a simple MASC!

(Another point: the key alphabet can be shifted/rotated without changing the way the encryption works, so the cryptographer may arbitrarily set the first letter of the key alphabet to, say, A.)

n=13 can be excluded by analyzing the IC. A typical English text has an IC of about 0.065. The challenge’s IC equals 0.043. This is not compliant with a MASC, so n=13 can be ruled out here.

Hence only n=1 and n=2 are still in the race…

I was wondering if the bigram IC can help to decide if n=1 or n=2. But the challenge text appears to be too short for this. Some experiments with a long text (50K letters, bigram IC of the plaintext = 0.00702) showed the following bigram ICs for the resulting ciphertexts:

n=1: bigram IC = 0.00149
n=2: bigram IC = 0.00155

However, these values vary a lot in short texts.
The challenge ciphertext has a bigram IC of 0.00201.

I think I’ll do some Montecarlo test runs do get some probabilities for n=1 and n=2 at least.

6. #6 George Lasry
10. Mai 2020

This is very close to the famous Chaocipher, probably simpler because Chaocipher used 2 dynamic alphabets. Chaocipher was found to be susceptible to two types of attacks:
– Known-plaintext attack. This attack required tens of plaintext letters known.
– Ciphertext-only attack on messages in-depth (encrypted with the same key) – this attack required 50 ciphertexts in-depth.

No attack on a single Chaocipher ciphertext was ever published. While the cipher described here seems slightly less secure, I would be surprised to learn that such an attack is possible on the Chaotic Caesar.

7. #7 Norbert
Berlin Homeoffice
11. Mai 2020

I have to correct my data…

IC of challenge cipher: 0.0439
bigram IC : 0.00211

However, this doesn’t change the statements made.

Here is the result of my Monte Carlo experiment:

10,000,000 randomly chosen key alphabets enciphering a 379-letter excerpt, each time randomly chosen from a 2.8 million letter corpus, yielded a ciphertext with both IC and bigram IC matching the values given above …

n = 1:
272 times (up to 1% deviation accepted)
10559 times (5% deviation)
134753 times (10%)

n = 2:
9192 times (1%)
199173 times (5%)
710658 times (10%)

This lets n = 2 (or an even number in Christoph’s initial key) appear much more probable, yet doesn’t rule out n = 1 (or an odd number).

8. #8 Rossignol
Paris, France
12. Mai 2020

With the alphabet EQPCIHSMLVZNAKJYXFRGDWUOTB and n = 24 (i.e. -2) one get the clear text:

for many years scientists had been puzzled by the levels of ionizing radiation measured in the atmosphere the assumption at the time was that the radiation would decrease as the distance from the earth the source of the radiation increased the electroscopes previously used gave an approximate measurement of the radiation but indicated that at greater altitude in the atmosphere the level of radiation might actually be higher than that on the ground

9. #9 Norbert
Berlin Homeoffice
12. Mai 2020

@Rossignol: Congratulations! Extremely good job! Nomen est omen 🙂 How did you solve it?

For the record: The equivalent key which was used by Christoph, apparently was
VICTORFANSHEBDGJKLMPQUWXYZ, n = 18

10. #10 Narga
12. Mai 2020

Congratulations! Pretty amazing! What technique did you use to solve it?
To encode, I used n=18 and the alphabet VICTORFANSHEBDGJKLMPQUWXYZ derived from the keyword VICTORFRANCISHESS (https://en.wikipedia.org/wiki/Victor_Francis_Hess)

I assumed that the correct observations made by farmerjohn and Norbert would not decrease the keyspace down to a fully searchable level. I.e. one would not be in a much better position to solve by knowing if n is odd or even.

11. #11 George Lasry
12. Mai 2020

Rossignol:

Outstanding! Congratulations!.

12. #12 Rossignol
Paris, France
12. Mai 2020

@Norbert, @Narga
Obviously, the remarks of farmerjohn and Norbert were decisive for me.
The letter frequency histogram for the crypto convinced me that Norbert was right: n = 2 mod(26).
So the alphabet breaks down into two independent parts.
Since the most frequent letter in English is E, the other 12 letters in the orbit of E must have a high frequency.
I started from the alphabet

E A D I J P R S T U X Z L part0 (high frequencies)
F B C G K M O Q V W Y H N part1 (low frequencies)

and I alternately optimized its two parts.
Then I made all the permutations of two letters between the two parts while keeping only the one which gave a better score after optimization.

It’s always exciting to see the clear text appear gradually.
My neurons were all ionized 🙂

13. #13 Norbert
Berlin Homeoffice
12. Mai 2020

@Rossignol: Awesome! I am very impressed with the speed with which you have put this divide-and-conquer approach into practice!
I, too, thought about identifying the two 13-letter cycles as a first step, but I probably would have needed weeks …

I had also considered dictionary attacks, namely to combine three words each from the 10,000 most common ones. This would have been successful by pure chance: VICTOR + FAN + SHE (Victor is at position 9545 in my word list). But also for that I would have needed much more time 🙂

14. #14 Norbert
Berlin Homeoffice
12. Mai 2020

@Narga: Thus, n=1 mod(26) appears to be more secure because the alphabet doesn’t break down into two indepent parts then. Another challenge on this base?

As a further increase in security, one could use two different numbers m and n alternately. At least one of these should be odd, and of course, 13 should be avoided. Of course, this is associated with a higher susceptibility to errors in the encryption/decryption process.

15. #15 Norbert
Berlin Homeoffice
12. Mai 2020

Sorry, I meant n=1 mod(2).

16. #16 farmerjohn
12. Mai 2020

Congratulations, Rossignol!

Another property of this cipher is that it’s very easy to check if plaintext is compatible with ciphertext (it takes constant time to check each next plaintext letter). So brute force attack might be possible here.

17. #17 David Oranchak
https://zodiackillerciphers.com
12. Mai 2020

Awesome job, Rossignol! This cipher is fascinating.

I was curious to what extent substrings of the keyword could recover partial solutions. So I tested all substrings of “VICTORFANSHE” and computed the % correctness of the resulting plaintexts. Here are the top 20:

100.0% VICTORFANSHE
16.62% ICTORFANSHE
16.09% VICTORFANSH
15.30% VICTORFANS
10.55% ICTORFANSH
9.50% CTORFANSHE
8.71% RFANSHE
8.18% ORFANSHE
7.92% ICTORFANS
7.92% H
7.65% CT
7.39% SH
7.39% RFANSH
7.39% ANSH
7.12% E
7.12% CTORFA
7.12% CTORF
6.86% HE
6.60% TORFANSHE
6.33% V

So it seems even getting the key mostly right gives only a small portion of the correct plaintext. Does this mean hillclimbing on the space of possible keywords at the character level is not feasible?

18. #18 Narga
12. Mai 2020

@David Oranchak: When Klaus said he liked the challenge and would publish it on his blog, I tried a quick hillclimbing check (switching 2 letter positions in the alphabet for the n=18 case) and it didn’t work for me. I think there just is not much of a “hill” around the correct solution.

19. #19 David Oranchak
https://zodiackillerciphers.com
13. Mai 2020

@Narga

For fun, I made a quick greedy random search using the “% match with the known plaintext” as the fitness function. Best result was about 33% with this alphabet: BRGFUPHINEWLYAMVCTOJZKQSXD

pt:

ZOHXYSLLDKRSAUKQMYISDZCLMBSHXKTPPLEDBK
HMEUEUELYOVZOXIAHNGRFDIUTIOYUEAYUREDIN
TZEZTMASPHERXTHFXSSKMQTILNQOTHQFIMOWVS
OHVZTHEUVDIVTIQNWIPUDVXCEEGDKRIAHDKTIR
ENCEOJOMTHEDPFKHTHEWOFUCKOKIHHEGPRPDKO
PVPPYEVNAZPJZLTMAFEMHZOJPEMRNDZQHXUPTW
FWOZWNBLKOHJLCKLJLDUUELDGFKPJIDKJPOPXZ
SVZOENSOOXRIHJXVVHLNDQMLVBRNARFALOAIMG
HZJUKCELBYATHRGSHGSKDUIHCJMWAORJTCSWK

correct pt:

THEATMOSPHERETHEASSUMPTIONATTHETIMEWAS
NINCREASEDTHEELECTROSCOPESPREVIOUSLYUS
IATIONBUTINDICATEDTHATATGREATERALTITUD
HTACTUALLYBEHIGHERTHANTHATONTHEGROUND

matches:

.O……..RS……IS…..B…….LEDB.
..E.E.EL.O..O.I..NGR.DI.TIO..EA.UREDIN
T.E.TM.SPHER.TH..SS.M.TI.N..TH..IM.W.S
.H..THE..DI.TI.NW…D..C.E……H…..
.NCE..OMTHE….HTHE.O..C.O..H……..O
NINC…..D.H….C..O.CO……..O……
…..E.NA.P….MA.EM…..EM.N………
…..NB……C……….G………….
…………H….H…..L..R……O…G
H……L.Y..H.G……..H………….

I wonder if it’s possible to escape that local minima to climb closer to the real key or if the search space is simply too… chaotic!

20. #20 Narga
15. Mai 2020

Again congratulations to Rossignol for solving this challenge much faster than we had expected! Thanks to all who commented and invested their time in search of the solution and of course to Klaus for publishing the challenge in the first place.

Following up to the comment #14 by Norbert, here is the bonus challenge (english plaintext, same length as ciphertext) with the “true to the name” n=3.
``` UWCJLJIGOOACBGTRBWCOZTCROVAGEREGRFREU JLGKGFROQJZRZLUVEERMUCYKALXRQRBJYGMCX RBEYRRSGIGEAFTFGIVPTCZYNZOBOVILAECFOH OCCEPBEYPQOHILSELDFNUHRBZXJYOWFIWLMHN NRVPQDRLRGUDTJWWZBLUWYUDOFNHQSULDCZUM NWTWVJYDEFPSOJDIAYXJQQLQMONSZUOEUKYSG MSUZJZPGAGLXNSUNBHKRXCJFZSYCCNWDQHXUY TTCDLMAUHJOGOCYRUTKRXHMVHHUMOBVJOYNEZ CHYTWAOGIPAPKMERYOKXZVLORKJJRHLHZQLOH NRMMPMVMPGRWYJLCTVLZOPYWRYPTJBWMWMPQM FGOVEOJXHRTUKCRLOVZIAIGOCGJJYEMCVRBLY LSRLQRFYLLUYYSXCHMCRKDLIUNPCHRRBDVEHC MMIRZUGBMRQRHRIFVLMXEOALISJNPI```

21. #21 George Lasry
15. Mai 2020

DITOUTORDERINGITTOMAKENEEDLESTHENNANKEENSANDNEGLIGEESWHICHITDIDTHENNAILTHELOTTONARGHILESFILLEDWITHNE
PENTHEANDNUMEROUSOTHERNARCOTICSTHEMACHINECARRIEDOUTHISINSTRUCTIONSTOTHELETTERSTILLNOTCOMPLETELYSUREO
RIUMTHISLASTITCOULDNOTDOANDTRURLCONSIDERABLYIRRITATEDDEMANDEDANEXPLANATION

I do not have the key because I attacked it by cutting to pieces of less than 100, guessing the state of the key at the middle, and moving left and right (the algo is much less effective when starting from the start).

Using enhanced simulated annealing, fixed temperature, log hexagram scoring.

I can simulate other cases with other n values, which are co-prime with 26, and the program can solve all of them. Generally, I need about 60-70 letters to produce fragments of plaintext, and 80-100 to produce almost correct decryptions.

22. #22 George Lasry
15. Mai 2020

So, unfortunately, the verdict is that the Chaotic Caesar cipher is not very secure.

23. #23 David Oranchak
https://zodiackillerciphers.com
15. Mai 2020

Congrats, George! What an interesting approach. I sure hope you develop a paper and/or talk about the algorithm. 🙂

24. #24 David Oranchak
https://zodiackillerciphers.com
15. Mai 2020

Here’s the quote and source:

One day Trurl the constructor put together a machine that could create anything starting with n. When it was ready, he tried it out, ordering it to make needles, then nankeens and negligees, which it did, then nail the lot to narghiles filled with nepenthe and numerous other narcotics. The machine carried out his instructions to the letter. Still not completely sure of its ability, he had it produce, one after the other, nimbuses, noodles, nuclei, neutrons, naphtha, noses, nymphs, naiads, and natrium. This last it could not do, and Trurl, considerably irritated, demanded an explanation.

from “The Cyberiad” by Stanislaw Lem

25. #25 George Lasry
15. Mai 2020

Thx David!

I don’t think this is worth a new paper but here are some additional details:

1) The simulated annealing variant is described here:
https://www.ep.liu.se/ecp/article.asp?issue=158&article=010&volume=
2) By starting in the middle of the ciphertext segment (guessing/improving the key state at this middle point) and moving right and left (as opposed to decrypting from the start with the starting key), and processing shorter segments (< 100), the effect of correct entries in the key is enhanced, as less key changes are involved (50 changes instead of 100). When working with the initial key, the effect of correct entries is mostly lost by the time you reach the end of the ciphertext (unless you have found a 90-100% correct key).
3) The transformations on the key involve simple swaps of two elements
4) Scoring the plaintext is performed with log hexagram stats.

26. #26 George Lasry
15. Mai 2020

I think I got the (initial) key:

STANILWEMHCYBRDFGJKOPQUVXZ

which is based on the name of the author

27. #27 Rossignol
Paris, France
15. Mai 2020

Bravo George! The case with n odd is still more difficult than that with n even.
And thanks to Christoph for this fascinating challenge.

28. #28 Narga
15. Mai 2020

Great work, George! So it lasted even shorter than the original challenge… plaintext and key are correct. I derived the alphabet key from the words STANISLAW LEM THE CYBERIAD.

29. #29 George Lasry
15. Mai 2020

Narga: Thank you very much for this very interesting challenge. When Rossignol solved the first part, I was surprised to find out a single ciphertext is enough, but I think that the main weakness of the cipher is that most of its alphabet is still stable after a few encryptions, unlike the Choacipher which shifts several parts after each encryption.

Maybe if instead of swapping the p and c positions, we rotate left (or right) the segment between them (the segment includes p and c), the result would be more chaotic and difficult to solve. We would also need n which is not too small.

Merci beaucoup, Rossignol, and congratulations for showing an attack is possible.

30. #30 Klaus Schmeh
16. Mai 2020

@ farmerjohn, Bill, Rossignol, Narga, David, George, Norbert, Gert:
Thank you for your great contributions and for solving this challenge! Great job!