The keywords used for the Double Columnar Transposition (DCT) should have relatively prime lengths. But why?
David Kahn, author of the crypto history classic The Codebreakers, once told me: One of the things he liked about his job was to get in touch with high school students who asked him for support with school projects about cryptology.
I can understand David. It is always great to see that there are young people occupying themselves with this fascinating topic, and I’m sure that some of these will stick with crypto and produce interesting work in the future.
A question about the DCT
Earlier this week, I received a mail from Marc Hofman, an 11th-grade high-school student from the Max-Planck-Gymnasium in Munich, Germany (11th-grade students are typically around 17 years old in this country). Marc is currently working on a school project about the Double Columnar Transposition (DCT) cipher, also known as the “Doppelwürfel” (“double cube”) in Germany.
The DCT was used by German intelligence organisations in both the Second World War and the Cold War. It is one of the most secure manual systems known and even hard to break with computer support, if used properly. The DCT and its cryptanalysis are well understood since George Lasry, a reader of this blog, and others have published great research works in this area in recenty years.
Marc asked me for literature sources about the DCT. Among other things, I recommended him my books Nicht zu knacken and Codeknacker gegen Codemacher, which both give an introduction to the topic in German.
In addition, Marc asked me a question: Why is it recommended that the lengths of the two keywords used for a DCT encryption be relatively prime? For instance, why are HOUSE/STREET (5/6 letters) considered more secure than HOME/MOTHER (4/6 letters)?
Intuitively, the answer is clear. Numbers that are not relatively prime have a low least common multiple, which might lead to regularities in the encryption process; and regularities usually make a cipher easier to attack.
However, this explanation is not really satisfactory. It would be nice to have a better understanding of how using or not using relatively prime numbers influences the security of a DCT encryption. Marc hadn’t found such an explanation, so he asked me; but I had to admit that I can’t answer this question properly, either.
So, I decided to ask my blog readers.
The Double Columnar Transposition explained
Before we come back to the “relatively prime” question, let’s look at a how the DCT works. The DCT is a cipher that can be carried out with paper and pencil – no machine or computer program is needed. To explain it let’s encrypt the sentence TO BE OR NOT TO BE. We use RAIN as the first keyword. We write the plaintext below the password:
RAIN ---- TOBE ORNO TTOB E
Now, we sort the columns of the table such that the letters of the password stand in alphabetical order:
AINR ---- OBET RNOO TOBT E
Next, we read out the message column-wise: ORTBNOEOBTOTE. What we have done so far, is a single columnar transposition. If we apply the same procedure again (with a different keyword), we get a DCT.
Two DCT challenges
To learn more about the effect of non-relatively-prime keyword lengths used for the DCT, I have created two challenges. Both are DCT-encrypted English messages with the keywords being random letter sequences, which means that a dictionary attack won’t work here. The keywords consist of roughly ten letters each, so the challenges should be solvable. There’s one important difference between the two: The first challenge is based on two keywords with relatively prime lengths, while this is not the case for the second one.
Let’s start with challenge #1, the one encrypted with the “relatively prime” DCT variant:
NUSCPFEUTHUDASEEYMEAYATOEPAUESOODHATUIDFRCHVILARNI EEESSAEATVDOSHHIIENIATNCPOOICPITRDNAEHONDEISEYNFDE SDPANGCTTEOKNRDLUWPANHWENUIRRARRFIEPCTGFTHNNNIELTE CYLNNTTARAIEEIAOSDINIRHTIURNTHESTROSONWNAEOOCTOARS EDLXHOTRVYDROPBENTEDFDYANLHOEDCLAFUAETIANGHTSEUEOA IAELWSDESEISESINUCIEFRODFVCASAAVEXDONTLOIRTTXTTTWT OEAONARLSOYERSUPERECTEESATMGWLCERTDREIWIIFUPSHANHN RBTEACETIDOOHEATHHSOINRALSYTRFCLAPCMLOBTIPSSECDCAC DHGFTLAGOIIVMNSAGSRTITLTIEYODTINIEAYITNLTOSSLEYNEI GOCFRI
And here’s challenge #2, the “not relatively prime” message:
LSGTWSECIUTSOSSYOEPYAWTRLRAOEDHFSTBCPEILIDFETDALNR GWTLVORWDAITNTTDISGAGNAOHTRAIIPTDANRTSTANCEEHSVTOA AAROTWUEIMTERENEHAWLNXLPAAOABUOAEEIEKAELTOIIUAATAE OHLNUOSAFDTDDOATESOFEIIRYNCETETADCRESPDRINBSPDOFOG AMTEHAEESRLNFOTTENMITARFGNNEENNYTXERNNLWGYRADPNOBR SEHNLRTSLLNIAWCFINTIAEOOYEOEEMSHTIENYLTETWTREUNOMZ AHICNEIEDAIDDEMFMHNHATEWWEGNSFAONCWADDHWDSASAHSHMP RACPCMSBCCHHOEERINCHIYDODOEENLOTENIRRRAESTRNPDAHGE RDONNBDTPOEYSSDEDKIEIEISWIANIFLOFOFMEFDDTISOBIHREA VERNVEARHHRHOOAMETOERHTREVSAEN
Can a reader decipher these messages? And even more, I would like to know: Is the second one easier to break than the first one? And why is it, in general, not a good idea to choose keyword lengths that have a common divisor?