The keywords used for the Double Columnar Transposition (DCT) should have relatively prime lengths. But why?

David Kahn, author of the crypto history classic The Codebreakers, once told me: One of the things he liked about his job was to get in touch with high school students who asked him for support with school projects about cryptology.

Source: Schmeh

I can understand David. It is always great to see that there are young people occupying themselves with this fascinating topic, and I’m sure that some of these will stick with crypto and produce interesting work in the future.

 

A question about the DCT

Earlier this week, I received a mail from Marc Hofman, an 11th-grade high-school student from the Max-Planck-Gymnasium in Munich, Germany (11th-grade students are typically around 17 years old in this country). Marc is currently working on a school project about the Double Columnar Transposition (DCT) cipher, also known as the “Doppelwürfel” (“double cube”) in Germany.

The DCT was used by German intelligence organisations in both the Second World War and the Cold War. It is one of the most secure manual systems known and even hard to break with computer support, if used properly. The DCT and its cryptanalysis are well understood since George Lasry, a reader of this blog, and others have published great research works in this area in recenty years.

Marc asked me for literature sources about the DCT. Among other things, I recommended him my books Nicht zu knacken and Codeknacker gegen Codemacher, which both give an introduction to the topic in German.

 

In addition, Marc asked me a question: Why is it recommended that the lengths of the two keywords used for a DCT encryption be relatively prime? For instance, why are HOUSE/STREET (5/6 letters) considered more secure than HOME/MOTHER (4/6 letters)?

Intuitively, the answer is clear. Numbers that are not relatively prime have a low least common multiple, which might lead to regularities in the encryption process; and regularities usually make a cipher easier to attack.

However, this explanation is not really satisfactory. It would be nice to have a better understanding of how using or not using relatively prime numbers influences the security of a DCT encryption. Marc hadn’t found such an explanation, so he asked me; but I had to admit that I can’t answer this question properly, either.

So, I decided to ask my blog readers.

 

The Double Columnar Transposition explained

Before we come back to the “relatively prime” question, let’s look at a how the DCT works. The DCT is a cipher that can be carried out with paper and pencil – no machine or computer program is needed. To explain it let’s encrypt the sentence TO BE OR NOT TO BE. We use RAIN as the first keyword. We write the plaintext below the password:

RAIN
----
TOBE
ORNO
TTOB
E

Now, we sort the columns of the table such that the letters of the password stand in alphabetical order:

AINR
----
OBET
RNOO
TOBT
   E

Next, we read out the message column-wise: ORTBNOEOBTOTE. What we have done so far, is a single columnar transposition. If we apply the same procedure again (with a different keyword), we get a DCT.

 

Two DCT challenges

To learn more about the effect of non-relatively-prime keyword lengths used for the DCT, I have created two challenges. Both are DCT-encrypted English messages with the keywords being random letter sequences, which means that a dictionary attack won’t work here. The keywords consist of roughly ten letters each, so the challenges should be solvable. There’s one important difference between the two: The first challenge is based on two keywords with relatively prime lengths, while this is not the case for the second one.

Let’s start with challenge #1, the one encrypted with the “relatively prime” DCT variant:

NUSCPFEUTHUDASEEYMEAYATOEPAUESOODHATUIDFRCHVILARNI
EEESSAEATVDOSHHIIENIATNCPOOICPITRDNAEHONDEISEYNFDE
SDPANGCTTEOKNRDLUWPANHWENUIRRARRFIEPCTGFTHNNNIELTE
CYLNNTTARAIEEIAOSDINIRHTIURNTHESTROSONWNAEOOCTOARS
EDLXHOTRVYDROPBENTEDFDYANLHOEDCLAFUAETIANGHTSEUEOA
IAELWSDESEISESINUCIEFRODFVCASAAVEXDONTLOIRTTXTTTWT
OEAONARLSOYERSUPERECTEESATMGWLCERTDREIWIIFUPSHANHN
RBTEACETIDOOHEATHHSOINRALSYTRFCLAPCMLOBTIPSSECDCAC
DHGFTLAGOIIVMNSAGSRTITLTIEYODTINIEAYITNLTOSSLEYNEI
GOCFRI

And here’s challenge #2, the “not relatively prime” message:

LSGTWSECIUTSOSSYOEPYAWTRLRAOEDHFSTBCPEILIDFETDALNR
GWTLVORWDAITNTTDISGAGNAOHTRAIIPTDANRTSTANCEEHSVTOA
AAROTWUEIMTERENEHAWLNXLPAAOABUOAEEIEKAELTOIIUAATAE
OHLNUOSAFDTDDOATESOFEIIRYNCETETADCRESPDRINBSPDOFOG
AMTEHAEESRLNFOTTENMITARFGNNEENNYTXERNNLWGYRADPNOBR
SEHNLRTSLLNIAWCFINTIAEOOYEOEEMSHTIENYLTETWTREUNOMZ
AHICNEIEDAIDDEMFMHNHATEWWEGNSFAONCWADDHWDSASAHSHMP
RACPCMSBCCHHOEERINCHIYDODOEENLOTENIRRRAESTRNPDAHGE
RDONNBDTPOEYSSDEDKIEIEISWIANIFLOFOFMEFDDTISOBIHREA
VERNVEARHHRHOOAMETOERHTREVSAEN

Can a reader decipher these messages? And even more, I would like to know: Is the second one easier to break than the first one? And why is it, in general, not a good idea to choose keyword lengths that have a common divisor?

Marc and I are looking forward to interesting answers.


Further reading: George Lasry’s new attacks on the Geheimschreiber

Linkedin: https://www.linkedin.com/groups/13501820
Facebook: https://www.facebook.com/groups/763282653806483/

Subscribe to Blog via Email

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

Kommentare (18)

  1. #1 Detlev
    Leverkusen
    5. April 2020

    Guillaume ist enttarnt worden, weil die Funksprüche an ihn bis 1958 mit dem Doppeltranspositionsverfahren verschlüsselt waren, das mit Hiolfe von Großrechnern geknackt worden ist.

  2. #2 Detlev
    Leverkusen
    5. April 2020

    Hier wird das Verfahren “Granit” ausführlich dargestellt: https://scz.bplaced.net/m.html#dwa1

  3. #3 George Lasry
    6. April 2020

    Hi Klaus – could you be more specific about ‘roughly 10’ and provide a range (e.g. 10+/-3) even if bigger than in reality.

    Can you also confirm the correct lengths of the cryptograms to make sure I have them right: 456 and 480?

    Note that I get fragments of plaintext (‘commercially available’) for the first under the assumption 11-10, but I am not sure why I cannot get the correct plaintext.

    To you original question: The only strong reason I can think of is that co-prime lengths guarantee that if one of the rectangle is complete, the second will not be complete. Two complete rectanges totally break the DT cipher security even with long keys.

  4. #4 Mihail
    6. April 2020

    Hello, Klaus. In the example you provided there seems to be an extra O as if the plaintext is tobeornotOtobe.

  5. #5 Narga
    7. April 2020

    So this is a wild guess at the message #1 with 11-10 keys. But still many open words 🙂
    AS EDLCBE DISCUSSED IN CHCAMER, A DICTIONARY CODE IS USUALLY ACKDENND NOT A CIPHER BUT A PSEIT ENCFYATSIORDS NOTLETFOR SECURITY YENORNOO TYPICALLY USES A COMMERCIALLY AVAILABLE DICTIONARY AND TAKES THE PAGE AND THE POSITION OF A WORD AS ITS CIPHERTEXT REPRESENTATION. IN OUR EXAMPLE AHTNWLISH STATE SERVICE USED THE ENGLISH FRENCH HALF OF CLIFFORD NOUVEAU DICTIONNAIRE FRANCAIS. THE FIVE DIGIT GROUP FOUR THREE EIGHT TWO SEVEN FOR EXAMPLE ITS FOUND TO START FODEHE TWENTY SEVENTH WORDLIST IS ON PAGE FOUR HUNDRED THIRTY EIGHT OF THIS DICTIONARY
    I’ve replaced some letters by others so it is no longer the original letter composition.

  6. #6 Klaus Schmeh
    7. April 2020

    @Narga: This looks good!!

  7. #7 ShadowWolf
    Relatively Prime Key Lengths
    8. April 2020

    It avoids a number of problems. There should also be a requirement that the message text be a length that is relatively prime to the keys which forces both encrypts to be incomplete rather than one complete. I wrote up a short PDF with examples, but you might need to translate the English. It is one thing to say “don’t do this”. It is yet another to demonstrate why it is a bad thing. The illustration is worth a thousand words. To sum it up, the word would be “patterns”. It can really be that bad.

    email if you want the PDF or I can send ODT or DOC.

  8. #8 George Lasry
    8. April 2020

    Narga well done! I got some of the fragments, but you got much more. Still it is very strange we cannot recover the exact key and plaintext. My program can solve problems with longer keys/shorter cryptograms, so this is quite unclear.

  9. #9 Narga
    8. April 2020

    Now I got almost everything recovered for challenge #1:
    A DICTIONARY CODE IS USUALLY A CODE AND NOT A CIPHER BECAUSE IT ENCRYPTS WORDS NOT LETTERS. A DICTIONARY CODE TYPICALLY USES A COMMERCIALLY AVAILABLE DICTIONARY AND TAKES THE PAGE AND THE POSITION OF A WORD AS ITS CIPHERTEXT REPRESENTATION. IN ONE EXAMPLE AN ENGLISH STATE SERVICE USED THE ENGLISH FRENCH HALF OF CLIFTONS NOUVEAU DICTIONNAIRE FRANCAIS. THE FIVE DIGIT GROUP FOUR THREE EIGHT TWO SEVEN FOR EXAMPLE WAS FOUND TO STAND FOR THE TWENTY SEVENTH WORD LISTED ON PAGE FOUR HUNDRED THIRTY EIGHT OF THIS DICTIONARY AS ?????? DISCUSSED IN ???????
    “Leftover” letters are WBPHAATTLLNER, I believe. I’m also not sure where the text actually starts and therefore whether the “as discussed in” belongs to the first sentence or the last.

  10. #10 ShadowWolf
    Programs
    8. April 2020

    I put together a hill climber that gets partials for 1 and 2. Considering it is barely 24 hrs old and I’m still working out a bug or two, I’m actually happy with that. Many of the longer words and some of the short words stand out even in the partial decrypt for number 1. I’m getting a good scoring spike around key lengths 12-10 for number 2 and some words appear but it isn’t what I’d call anything for certain. At best the keys are kind of close.

    I’m with George Lasry on this. It is odd that with this much text that the program won’t settle on a proper key set. I’ve even tried 2, 3, 4, and 5 grams along with a few different scoring files. For some reason it always climbs the wrong hill. If I come up with anything better I’ll post it somewhere.

  11. #11 George Lasry
    8. April 2020

    Klaus – would you be ready to either verify that the encryption process is correct or maybe just post the keys so that we can also check.

    The fact that the full text cannot be recovered for an easy combination of key length and ciphertext length is puzzling 🙂

  12. #12 George Lasry
    8. April 2020

    I tested with tens of simulations, and the program throws the solution in less than a second per case.

  13. #13 Klaus Schmeh
    8. April 2020

    Narga’s solution is correct. Congratulations!

  14. #14 Klaus Schmeh
    8. April 2020

    Message #2 (480 letters) is encrypted with the following keywords:
    DDAFWKQKL
    DHDGKJHGHOWJ
    I hope I didn’t make a mistake.

  15. #15 George Lasry
    9. April 2020

    Sorry, Klaus but those are probably the wrong keys, or most likely the text was wrongly encrypted 🙂

    Keywords

    DDAFWKQKL
    DHDGKJHGHOWJ

    Numerical keys

    BCADIEHFG
    AEBCJHFDGKLI

    Ciphertext

    LSGTWSECIUTSOSSYOEPYAWTRLRAOEDHFSTBCPEILIDFETDALNRGWTLVORWDAITNTTDISGAGNAOHTRAIIPTDANRTSTANCEEHSVTOAAAROTWUEIMTERENEHAWLNXLPAAOABUOAEEIEKAELTOIIUAATAEOHLNUOSAFDTDDOATESOFEIIRYNCETETADCRESPDRINBSPDOFOGAMTEHAEESRLNFOTTENMITARFGNNEENNYTXERNNLWGYRADPNOBRSEHNLRTSLLNIAWCFINTIAEOOYEOEEMSHTIENYLTETWTREUNOMZAHICNEIEDAIDDEMFMHNHATEWWEGNSFAONCWADDHWDSASAHSHMPRACPCMSBCCHHOEERINCHIYDODOEENLOTENIRRRAESTRNPDAHGERDONNBDTPOEYSSDEDKIEIEISWIANIFLOFOFMEFDDTISOBIHREAVERNVEARHHRHOOAMETOERHTREVSAEN

    Plaintext

    HULOODNEAAFTEEPEIEDRINMTTRSNAPSVDDIOBHHNMOYAPWESCPNHISSRASOMEOETUNYRMOARDRGVHEHRERORHALACERATDMISTTNFAENHBONATSTSHANNAEDIOYRCTPGDLEERFHBNTDSADIEIIHVSATWDETHWELLSGELMYFZOOIEOXNAIFSAASYOALTLDTEDSHLCREIVIOHETTAYATUNHSTTESGTAWOIIOIDEEAENONWFRFTNGMDCDEAIEUDHYOEBAXCRGWTAIONWACNTOOAOAMSALCITOAELERRWTRRASYOVGNPENNESENIBDELOENLRPWIEOTEFDHRFAIOKEWNEEOTETDAPNPTEADETMAHRDEACAEFINTIDLTUIGIFEEGHYMTOEPEIAFHSNHAWLOEAITSNTEICSNCROEHECIOTNWOWHDUASSWNDSTRLBRAAMTEIRBLTSDDDNRPENKNNFAFEOERCRRINSRE

  16. #16 Arctus
    14. April 2020

    @ ShadowWolf:
    Thanks for your offer!
    It would be great if you could email me your examples (pdf or doc) to dct-mail@t-online.de

  17. #17 ShadowWolf
    #2 keys
    14. April 2020

    I get the same decrypt using those keys. Even switching the order doesn’t help. Somehow the text was messed up.

    Even knowing the key lengths, I looked at all the various HC run saves. Nothing more than a few 4-5 letter words pop out and I doubt any of them are correct. Typical of false positives. One of my original runs tested key lengths from 6 to 14 excluding duplicate lengths. (10 +/- 4).

    I’m now curious about the actual keys for #1.

    Maybe post the original plaintext for both?

    @Arctus, file was sent.

  18. #18 Marc
    5. Mai 2020

    Nice … what’s about challenge #2