Jarl Van Eycke and Louie Helm recently solved a bigram substitution ciphertext consisting of 1000 letters – the shortest one ever broken. Now I have created a 750-letter challenge of the same kind.

The bigram substitution is a manual encryption method with a history of over 500 years. A bigram (also known as a digraph) is a pair of letters, such as CG, HE, JS or QW. The number of bigrams in the Latin alphabet is 26×26=676, ranging from AA to ZZ. A bigram substitution replaces each letter pair with another one (or with a symbol or with a number between 1 and 676). In order to use a bigram substitution, we need a substitution table with 676 entries.
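To make the mechanics concrete, here is a minimal Python sketch of a bigram substitution that maps bigrams to other bigrams. The function names (`make_table`, `substitute`), the random table and the padding with X for odd-length texts are assumptions made for illustration; they are not taken from any of the historical systems described in this article.

```python
import random
import string
from itertools import product

ALPHABET = string.ascii_uppercase

def make_table(seed=None):
    """Build a random substitution table with 26*26 = 676 entries."""
    rng = random.Random(seed)
    bigrams = ["".join(p) for p in product(ALPHABET, repeat=2)]  # AA .. ZZ
    shuffled = bigrams[:]
    rng.shuffle(shuffled)
    return dict(zip(bigrams, shuffled))

def substitute(text, table):
    """Replace each letter pair of the text according to the table."""
    letters = [c for c in text.upper() if c in ALPHABET]  # drop spaces etc.
    if len(letters) % 2:
        letters.append("X")  # assumption: pad odd-length texts with X
    return "".join(table[a + b] for a, b in zip(letters[::2], letters[1::2]))

table = make_table(seed=1)
inverse = {cipher: plain for plain, cipher in table.items()}

ciphertext = substitute("HELLO WORLD", table)
print(ciphertext)
print(substitute(ciphertext, inverse))  # the plaintext without the space
```

Decryption uses the same function with the inverted table, which works because the table is a one-to-one mapping between the 676 bigrams.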

 

Examples

The oldest bigram substitution I am aware of is described in the book De Furtivis Literarum Notis, written by the 16th-century cryptologist Giambattista della Porta. Porta uses a 20-letter alphabet and therefore needs a substitution table with 20×20=400 entries.

[Figure: Porta’s bigram substitution table]

As can be seen, Porta substituted each letter pair with a symbol. He had to be quite inventive to come up with 400 different symbols. For instance, the bigram IA is replaced with a symbol that looks like an X. The bigram VO is substituted with something resembling an O.

Blaise de Vigenère invented a bigram substitution, too:

[Figure: Vigenère’s bigram substitution table]

Vigenère replaces each bigram with a single letter or a letter followed by a dot, colon or semicolon. E.g., LM is substituted with “r.”.

The following bigram substitution, which is described in David Kahn’s book The Codebreakers, was used by the Nazi authority Reichssicherheitshauptamt (RSHA):

[Figure: RSHA bigram substitution table]

 

Two challenges

As far as I can tell, hill climbing is the best approach to attacking a bigram substitution. However, this technique only succeeds if there is enough material to analyze, i.e., if the ciphertext is long enough. But how long is long enough? Little has been published on this question. As a first step towards an answer, I decided to create a challenge. I took two messages – one with 2500 and one with 5000 letters – and encrypted them with a bigram substitution. Subsequently, I published them on my blog.
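For readers unfamiliar with hill climbing, the basic loop can be sketched in a few lines of Python: start with a random table, swap two entries, and keep the swap only if the decryption scores at least as well as before. This is a bare-bones illustration, not the method used by any of the solvers mentioned in this article. In particular, `toy_score` is a placeholder that merely counts frequent English letters; a serious attack would score candidates with n-gram statistics from a large text corpus (the comments below mention 8-grams) and would add restarts and other refinements. All names are hypothetical.

```python
import random
import string
from itertools import product

# All 676 bigrams AA .. ZZ in alphabetical order.
BIGRAMS = ["".join(p) for p in product(string.ascii_uppercase, repeat=2)]

def decrypt(cipher_bigrams, key):
    """Apply a candidate key (ciphertext bigram -> plaintext bigram)."""
    return "".join(key[b] for b in cipher_bigrams)

def toy_score(text):
    """Placeholder fitness: number of frequent English letters.
    Far too weak to break a real ciphertext; it only drives the demo."""
    return sum(c in "ETAOINSHR" for c in text)

def hill_climb(cipher_bigrams, score, iterations=5000, seed=0):
    rng = random.Random(seed)
    plain = BIGRAMS[:]
    rng.shuffle(plain)
    key = dict(zip(BIGRAMS, plain))          # random starting key
    used = sorted(set(cipher_bigrams))       # only these entries affect the score
    best = score(decrypt(cipher_bigrams, key))
    for _ in range(iterations):
        a, b = rng.choice(used), rng.choice(BIGRAMS)
        key[a], key[b] = key[b], key[a]      # propose: swap two table entries
        candidate = score(decrypt(cipher_bigrams, key))
        if candidate >= best:
            best = candidate                 # keep the swap
        else:
            key[a], key[b] = key[b], key[a]  # undo the swap
    return key, best

# First line of the Bigram 750 ciphertext, split into letter pairs.
excerpt = "YYXFTVUJKXMYWODAWFZPSAPPVDWNEXAJXFPPRXKCMFBZIXDLTC"
cipher_bigrams = [excerpt[i:i + 2] for i in range(0, len(excerpt), 2)]
key, best = hill_climb(cipher_bigrams, toy_score)
print(best, decrypt(cipher_bigrams, key))
```

With the placeholder score the output is of course not meaningful English. The sketch only shows why ciphertext length matters: the shorter the message, the fewer table entries are constrained by the score, and the more wrong keys look just as good as the right one.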

Within a few days, blog reader Norbert Biermann found the solution of the 5000 letters version – still with a few mistakes – using hill climbing. Thomas Ernst published a few interesting word pattern considerations. Then Norbert provided a second, more sophisticated hill-climbing result, which was almost error-free. Finally, Armin Krauß published the correct solution.

After the solution of the 5000 letter challenge had proven quite difficult, I expected that the 2500 letter ciphertext would not be solved so soon. I was wrong. Only a few days later, Norbert Biermann published the correct solution of the 2500 letter challenge, which he had again found with his hill climber. To my knowledge, this success represented the world record in breaking bigram substitutions.

 

The Bigram 1346 challenge

Two years after Norbert’s record, I published another bigram challenge on this blog. This time, I took an English text consisting of 1346 letters as plaintext and called the result the Bigram 1346 challenge. Unlike last time, I replaced the bigrams not with numbers but with other bigrams. The ciphertext therefore consists of letters, which have to be read pairwise. Here’s the challenge:

UNGOZIHIJGSLGVWPIVGJSOKEFMAHSDBDGLUBUNZIWPIEBIUNKFVOUN
BDSLPPHELVAQBAHEBIFJMHKVFLHXQQEFSLQQBDAQRIBVBIBYGJMOSOZ
BSDUXZINXUNEQVKUGYHUNVOWPSGSMGEFLFKRUHELFPHGVUXFGHRJ
YFUHIPBMHUNVOWPSGHXVKRSSGPHPWQXPLKCXGUNFGBICJFGJGCLLC
FPNXUNTUUKKIZBKFABEQNHRFWLKCYHDJHJOPBZRLAHQVFTHETGRQRJ
TDAYDTXTVDBDKFEFZKSDHETUFVIQBIYABDEXZIKCHXRUKQRLGECJAQA
OBKZIOBTEFMFRZNZACLWDWAUNEBBISLMREQKWRJRCUGHERGJMXON
WGJHIPBEYGDZOHXIXKFOXFLKVRUDWAOBIDLSRRICSICJGKFZBBUMRFM
GQKFYBXOHETGHEOAMUEMWLAYRJWPKIGXUOSKZIHIJGSLHIGIBLFUXX
UKUQPHGEGWHIOPZIBDLVBUKQRJSMUGFPWLWPSGPHFIKVYXCJLVULK
VQSZIBDLVBUUISRRUGJAIYXXGLKQXFRPBUOJIBTGHGDTGRCICUNVOWP
SGDTLIEMTAVOUNVUQQKCRTLHBDAIUXFGOAKPBTKVFLAHVWRHWAUG
KCXGUNFGBIXAGJHEHXGLBIRNGDOEBPUQSGBDIKACVORUBLKVVLZIHIJ
GSLWPIEUNCEPHFOMKFVMHUGNOPKBGLKCGCLHEXFAYUOMKTAGDZO
HEKFLVBWKVPLGXBPHEGIXOTJWURUUNCLSOFMKVWGFMFPIKGJLLJYO
GDWFRGLFQQEYDFVCAHYZPJGKFBIRQHEBAHELFHEFSVUCLWDBWIGJG
DRAYVKFPWLZNLQFGGJQAKFBLFUWQPWGJOBSLRIVVBXBDVUMHYIZY
BZKFGQLWROZIOBMACTPHSMGECLFGVOSGUNTUJYOGBIHECERCUGW
DEMFGKPAYSQKCBWONQVGEKVDHBIDWPHEMGEAQAONOEXZIOBMA
UNTUPHFYNXXGUNFGBIFNFMMFOXJBBDCLBIBIFJMHKVFLFJMHXNHXV
KUNKFZBTFFMHMFVLVWLYHHEMFOFICOJVUYXMFZNWLLWICLVSDZIFS
EBUNNXVUFIHARCXOZKMMFPKVRUUNVOWPSGLCQWUGCECJTGHIVK
WAFLPHAQVKPHHJSGMFHMRLDDHJZUBPTDBOFGVOSGUNTUJYOGXG
UNFGBIXAFMWAAQFPAIEQQQKCFIHAQWFIBPGLUBZNNNSOWDXMXG
UNFGBIEAHAKCAHAQSGRLMSKFWDBAHEFVMHWGPHFYBIUNTUJYOG
AIYXWAZIFLYNKCRUSOKVLKBOWARIBIHJ
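Reading such a ciphertext pairwise is a natural first step for any analysis. The following Python sketch (the helper name `to_bigrams` is made up for this example) removes the line breaks, splits the letters into pairs and counts how often each pair occurs. The line breaks are removed before pairing because they need not fall on bigram boundaries. Only the first line of the ciphertext above is used here; in it, the bigram UN already appears four times.

```python
from collections import Counter

def to_bigrams(ciphertext):
    """Strip whitespace and split the remaining letters into pairs."""
    letters = "".join(ciphertext.split())
    if len(letters) % 2:
        raise ValueError("odd number of letters, cannot split into bigrams")
    return [letters[i:i + 2] for i in range(0, len(letters), 2)]

# First line of the Bigram 1346 ciphertext.
first_line = "UNGOZIHIJGSLGVWPIVGJSOKEFMAHSDBDGLUBUNZIWPIEBIUNKFVOUN"

bigrams = to_bigrams(first_line)
counts = Counter(bigrams)
print(len(bigrams), "bigrams,", len(counts), "distinct")
print(counts.most_common(3))
```

Applied to the complete ciphertext, the same count gives the bigram frequency distribution that a solver starts from.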

In August 2019, Norbert Biermann published the solution of the Bigram 1346 challenge as a comment on my blog. With this success, Norbert set a new world record for the shortest bigram ciphertext ever broken.

 

The Bigram 1000 Challenge

After Norbert had broken the Bigram 1346 message, I decided to create a new, even shorter challenge. This time, I took a plaintext with exactly 1000 letters. I encrypted it in the same way as the Bigram 1346 plaintext, calling it Bigram 1000 Challenge. Here’s the ciphertext:

IBFWNUNFEBVGDQBVTTMHLHUDVVZYBSKCWCUJNSCQYCXNEBSVFD
IYWCKZHKCDSUQBKBBBCYSIYYWMOVDLQXSIQXUGEEKBVOEEVJXYSE
MCUURBLXOVMSEBBFIBTBYFJMMERNVBRWQBIPUGEKJNZJPEBTVWW
KVVIBMEXIVHMERZXHCWWKIPACKHJNTEHKSVBNPYSOXYQOUUQMA
GVWSVLPBFBFHHXHLFJNVJKYTUSIBVBBMEJIDLFPYFRQCGDAEQZJVUZ
TWKEDEHYOKKWRBJVWNVZJUUBBBFXHBBEQHMRSRJDVLJSVMHQXB
FMKKEVRACDXGXVIBOXAMCAIOBQLIBFWEHYOVAXHVJRBDBHKSEHEU
URNIBPVFQKEKDNXDZBSUDBFKHYYTEHKKIIPUGVWWKVVBNIBFWXHV
JRBDBHKDHBXUUSVXHEQNZEFAIKHHIWKZERXKFBQYQZMTUNXOSXA
QXHRJJRQSUIBLINXWUKCUGEKJNZJPEBTVWWKVVMEFGBBVRBTXVGB
BXGFVRCWBBWCBSBJBITDRQJIJIRVZEKBWMUGRTTOHRQEVRACIBBUQ
XXZXFXYQOUUSEJAEOZJNBXHNLACZJTUSIBVBNWRIQLFKBVIKJTDHBZ
JOOIAONWAEKKEBOUGEKJNZJPEBTVWWKVVBFMSRTDUAIBZATCYEQ
HNUGPEBFUNRJHKSVTCRQCYBBTDRJHKJNXYXHMDZJUUBBMEXYKUU
SGBWRFGWMFQPYUGVIXHCWBBZNUGJYBUVVSIVBZSUGBSBJBITDRQ
GXTUBFAGWAPQRVAMACIBGKXZXVKJBNIBPVNSGWTCSUVGORYOTDI
BGKXPQXMUGKIWVRBJQXUGEKJNZJPEBTVWWKVVCDQXKZNCACHKL
DUUIBNITJCGYCXNDLVIVOQXBXLFPEAGKBUGEHWUKBBSHKSVZEQXU
GRQCGKHQXBZYYWMTRRXIBTUBQNVNCRTJORTVSNXTUSIBVTRVVFD
BHADTEUWIWAWCERJYFHKDL

Again, the challenge was broken and a new world record was set. This time, the solution came from Jarl Van Eycke and Louie Helm, who used highly sophisticated hill-climbing techniques.

 

The Bigram 750 Challenge

After Bigram 1000 had been solved, I couldn’t help creating a new, even tougher challenge. Here it is:

YYXFTVUJKXMYWODAWFZPSAPPVDWNEXAJXFPPRXKCMFBZIXDLTC
VIBSKLZOXIUKPEMUXFEMDUOGPCRRMWZSVBNMYYSHLWCIAJJWOR
CFCHKYRXYYJVUPAGJHBZAJZPCJSEWZSEWZCJLFWOFHSAEMXZZU
JHLNGNNMYYIXUVNMYYIXBWAOKYJRYCHUBMNOQTXAPCRRMWPPWZ
AMLLPCXFEMWFITKYPGISZEKJMOMUXAEREKWGQTEOXILBUGGNTC
YOYAHUUQZNKYBJADXIAFICRWCRFPPGZIEEBZHUIWKRKERRLZWF
GQNAJRJQNTPYKBPEKBDLNGDYXPVAZSSKUVHUBBDLXAWFZUPNHZ
CCRXGOLFZUHUGNVWDYRRSAJHTRZUXAXPKMYYYCHRXZDUQSLFDY
KJIAZIDLGQNAQXBVWRSWGXPPAJMPDUPPVWAVNAORHUUWNBLNFM
BSSAPPDVGCGCWFWYDYZEWOPETHDLMUZURXKJHMKJYUBVOJWYDY
UGCYZPZIDLXFLWPCFSEXZRWFERWFIXDTYYWUVJPNAJZURXTFHZ
OAXALZXITHDLBSKLZOXIUKJPYYSHLWCIAJXFZURLVWUNPCHUPT
XZHCAJANBWLPKMHUVCWRKXKMBVCXCTHUHMNCQXVBTCNGADRHPC
KWUGRRKBRQXFPGWAMUDYIXDLKJJSDUOGQTRRKBXILBUGBBIPDL
XZZUWAOSDLYYZPYAZSVBKBGCJPUJXLLHDYIAKBVBZENMVCRWFA

This ciphertext has 750 letters. I call it the Bigram 750 challenge. Can a reader break it? If so, he or she will set a new world record.


Further reading: Can you solve this Cold War encryption challenge?

Linkedin: https://www.linkedin.com/groups/13501820
Facebook: https://www.facebook.com/groups/763282653806483/


Comments (7)

  1. #1 Louie Helm and Jarl Van Eycke
    14 December 2019

    IN THE FOLLOWING YEAR FOSTER ACQUIRED THE RIGHT TO USE THE
    NAME MARLBOROUGH AND THE MODEL DESIGNATION NINE HUNDRED
    WHICH HAD ORIGINALLY BEEN USED FOR AN OPEN OPERATING SYSTEM
    PRESENTED IN NINETEEN NINETY BY NORWEGIAN SOFTWARE DESIGNER
    PETER IDE THE MARLBOROUGH NINE HUNDRED WHICH WAS PRODUCED IN
    A GERMAN FACTORY IN PROBABLY OVER FOUR THOUSAND STYLES IS
    FAR LESS WELL KNOWN THAN THE LION GAMMA THREE AND THERE ARE SOME
    AMBIGUITIES AND INCONSISTENCIES REGARDING ITS PRODUCTION
    NEVERTHELESS IT IS CONSIDERED A MODERN CLASSIC AND STATE OF
    MASTERPIECE CARS CONNINGHAM THEN DESIGNED A NEW DISCUS
    CONCEPT FOR THE THUNDERBIRD HARDWARE TRYING TO SOME DESIGN
    FEATURES FROM THE MARLBOROUGH NINE HUNDRED THESE INCLUDE AN
    IMPROVED KEYBOARD A NEW COLOR DISPLAY AND ADDITIONAL INPUT
    DEVICES IN TWO THOUSAND ONE THE NEW MODEL WAS INTRODUCED TO THE
    PRESS AT THE INFORMATION TECHNOLOGY CONVENTION IN NEW YORK

    Note: Perhaps Line 11 meant to say “TRYING TO [INCLUDE] SOME DESIGN FEATURES”. The omitted word added to both the difficulty and enjoyment of solving this cipher.

  2. #2 TWO
    Warzawa
    17 December 2019

    Well done Louie and Jarl!

    Very good job.

    Did you use decagrams this time?

  3. #3 TWO
    Warzawa
    17 December 2019

    @Klaus

    I find it impossible to find the article you used.

    Could you be so kind to point it out please?

  4. #4 Jarl
    Belgium
    17 December 2019

    @TWO, thanks, 8-grams.

  5. #5 Gerd
    17 December 2019

    >I find it impossible to find the article you used.

    I think Klaus intentionally uses a text in which all the keywords are replaced by other words, to make sure that the text cannot be found on the internet.
    Gerd

  6. #6 Klaus Schmeh
    18 December 2019

    @TWO:
    >I find it impossible to find the article you used.
    I took a German Wikipedia article, translated it into English using DeepL and changed all names, places, technical expressions, and numbers.

  7. #7 TWO
    Warzawa
    19 December 2019

    @Klaus

    Good job!

    Is it about a computer? The Xerox Star maybe?