US crypto expert Jim Gillogly broke over 1,000 transposition cryptograms created by IRA activists in the 1920s. Only one of these messages remained unsolved. Now, it has been broken by Richard Bean from Brisbane, Australia.
Decoding the IRA is the title of a book published in 2008 by codebreaking expert Jim Gillogly and historian Tom Mahon. The book deals with some 300 documents containing encrypted messages that Mahon discovered in a Dublin archive. These documents were left behind by Irish activist Moss Twomey (1897-1978), who was the leader of the Irish Republican Army (IRA) from 1926 to 1936.
Decoding the IRA
Most of the encrypted texts covered in the book are dispatches sent between the IRA headquarters in Dublin and IRA activists in the British Isles and in the USA. The rest of the collection consists of cipher messages exchanged between detained IRA members and Twomey or his comrades. In all, the corpus consists of about 1,300 individual cryptograms.
Tom Mahon, who is a specialist in the history of the IRA, lacked the expertise to decipher the cryptograms. He therefore asked the American Cryptogram Association (ACA) for help. When six of the cryptograms were published on the ACA mailing list, ACA member Jim Gillogly immediately became interested. He analyzed the cryptograms and quickly solved all six of them. This was the start of a fruitful partnership between Jim and Tom Mahon. Over the next months, Jim managed to decipher almost all of the cryptograms Mahon had found. The plaintexts, which provided insights into the work of the IRA in the 1920s, proved extremely valuable for Mahon’s research work.
The first chapter of Decoding the IRA, in which Jim explains his deciphering work, is a fascinating read for everybody interested in codebreaking. The rest of the book, which is based on information gained from the deciphered messages, does not contain much information about cryptography, but it is worth reading for everybody interested in Irish history.
An example message
The following IRA message, dated 1927, is one of the six that Tom Mahon sent to the ACA and that were subsequently shared on the ACA mailing list (it consists of 151 letters):
AEOOA IIIEO AEAEW LFRRD ELBAP RAEEA EIIIE AAAHO IFMFN COUMA
FSOSG NEGHS YPITT WUSYA ORDOO ERHNQ EEEVR TTRDI SOSDR ISIEE ISUTI
ERRAS TTKAH LFSUG RDLKP UEYDM ERNEO RULDC ERWTE ICNIA T
When analyzing this cryptogram, Jim Gillogly saw that E was by far the most common letter with 23 appearances, followed by A, R and I. The letters Q, B and V turned out to be very rare. These frequencies are consistent with the English language, although the ratio of vowels (47 percent) seemed a little high (40 percent is usual). Since a transposition only rearranges the letters of a message and leaves their frequencies unchanged, Jim assumed that he was dealing with a transposition cipher. His guess was that a column-based transposition had been used.
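A frequency count of this kind is easy to reproduce. The following Python sketch (the function name letter_stats is just an illustrative choice) counts the letters of the ciphertext as transcribed above and computes the share of the vowels A, E, I, O and U. The exact figures depend on the transcription and on whether Y is treated as a vowel, so they may deviate slightly from the numbers quoted above.

```python
from collections import Counter

# Ciphertext of the 1927 message as transcribed above (spaces only group letters).
CIPHERTEXT = (
    "AEOOA IIIEO AEAEW LFRRD ELBAP RAEEA EIIIE AAAHO IFMFN COUMA "
    "FSOSG NEGHS YPITT WUSYA ORDOO ERHNQ EEEVR TTRDI SOSDR ISIEE ISUTI "
    "ERRAS TTKAH LFSUG RDLKP UEYDM ERNEO RULDC ERWTE ICNIA T"
)


def letter_stats(text, vowels="AEIOU"):
    """Return (letter counts, vowel ratio) for the letters in text."""
    counts = Counter(c for c in text.upper() if c.isalpha())
    total = sum(counts.values())
    vowel_ratio = sum(counts[v] for v in vowels) / total
    return counts, vowel_ratio


counts, vowel_ratio = letter_stats(CIPHERTEXT)
print(sum(counts.values()), "letters")
print("most common:", counts.most_common(4))
print(f"vowel ratio: {vowel_ratio:.0%}")
```

A substitution cipher would flatten or shuffle this profile, whereas a transposition leaves it untouched, which is why the count is a useful first test.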
In his 40 years as a codebreaker, Jim Gillogly had written a considerable amount of cryptanalysis program code for his personal use. Among other things, he was one of the first to use hill climbing for cryptanalysis. In this case, he used a hill climbing program tailored to break a column-based transposition. He assumed a line length between 8 and 15 and ran a separate attempt for each length. When he tested a line length of 12, he obtained the following cleartext candidate:
This string contains many words from the English language. It is even possible to read a meaningful sentence from it: THE ADDRESS TO WHICH YOU WILL SEND STUFF …. However, there are quite a few letters that don’t make sense.
Jim restarted his program about 100 times with different starting keys, but he didn’t get a better result. The keyword his software found was FDBJALHCGKEI (this was certainly not the original one used by the IRA, but it was equivalent). For further analysis, he looked at the transposition table his program created:
FDBJALHCGKEI
------------
THEAADDARESS
TOWHECIEHYOU
WILLOESENDST
UFFFOROAQMGI
SMRSAWSEEENE
YFRUITDIERER
ANDGIERIENGR
OCERIIFIVEHA
ROLDECSEROSS
DUBLONIATRYT
OMAKAIEATUET
OAPPEAEARLLK
EFRHATI
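To make the mechanics concrete, here is a minimal Python sketch of how a table like this can be rebuilt from the ciphertext and how a hill climber searches for a key. It is only an illustration, not Jim's actual program: the function names (decrypt_rows, score, hill_climb), the representation of the key as a list of ranks and the tiny trigram list used for scoring are all assumptions made for this example. A serious solver would score candidates with large n-gram tables and restart many times.

```python
import random

CIPHERTEXT = "".join("""
AEOOA IIIEO AEAEW LFRRD ELBAP RAEEA EIIIE AAAHO IFMFN COUMA
FSOSG NEGHS YPITT WUSYA ORDOO ERHNQ EEEVR TTRDI SOSDR ISIEE ISUTI
ERRAS TTKAH LFSUG RDLKP UEYDM ERNEO RULDC ERWTE ICNIA T
""".split())

# Toy scoring table (an assumption for this sketch): a few common English
# trigrams. Real solvers use large quadgram or hexagram statistics.
COMMON_TRIGRAMS = ("THE", "AND", "ING", "ION", "ENT", "HER", "FOR", "ERE", "TER", "YOU")


def decrypt_rows(ciphertext, key):
    """Rebuild the transposition table; key[i] is the read-off rank of column i."""
    width = len(key)
    full_rows, extra = divmod(len(ciphertext), width)
    columns = [""] * width
    start = 0
    for pos in sorted(range(width), key=lambda p: key[p]):
        # The leftmost `extra` columns are one letter longer than the rest.
        length = full_rows + (1 if pos < extra else 0)
        columns[pos] = ciphertext[start:start + length]
        start += length
    return [
        "".join(col[r] for col in columns if r < len(col))
        for r in range(full_rows + (1 if extra else 0))
    ]


def score(rows):
    """Count how often the common trigrams occur in the row-wise reading."""
    text = "".join(rows)
    return sum(text.count(t) for t in COMMON_TRIGRAMS)


def hill_climb(ciphertext, width, rng, steps=2000):
    """Swap two key entries at a time; keep swaps that do not lower the score."""
    key = list(range(width))
    rng.shuffle(key)
    best = score(decrypt_rows(ciphertext, key))
    for _ in range(steps):
        i, j = rng.sample(range(width), 2)
        key[i], key[j] = key[j], key[i]
        candidate = score(decrypt_rows(ciphertext, key))
        if candidate >= best:
            best = candidate
        else:
            key[i], key[j] = key[j], key[i]  # undo the swap
    return key, best


# The key reported in the text, turned into ranks (A=0, B=1, ...).
reported_key = [ord(c) - ord("A") for c in "FDBJALHCGKEI"]
for row in decrypt_rows(CIPHERTEXT, reported_key):
    print(row)

found_key, found_score = hill_climb(CIPHERTEXT, 12, random.Random(1))
print(found_key, found_score)
```

Since the result depends on the exact transcription of the ciphertext, individual letters of the printed rows can differ from the table above. With such a small scoring table, a single run of hill_climb will not necessarily land on the reported key.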
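Now Jim recognized how this result had come about: the encipherer had inserted two columns of meaningless vowels (nulls) into the table, the ones under the key letters A and C. In addition, Jim’s program had switched the L and the H column. With the null columns blanked out, the table looks as follows (the L and H columns still have to be swapped when reading the rows):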
FDBJALHCGKEI
------------
THEA DD RESS
TOWH CI HYOU
WILL ES NDST
UFFF RO QMGI
SMRS WS EENE
YFRU TD ERER
ANDG ER ENGR
OCER IF VEHA
ROLD CS ROSS
DUBL NI TRYT
OMAK IE TUET
OAPP AE RLLK
EFRH TI
Here’s the plaintext:
THE ADDRESS TO WHICH YOU WILL SEND STUFF FOR QMG IS MRS SWEENEY FRUITERER AND GREENGROCER FIVE HAROLD’S CROSS DUBLIN TRY TO MAKE IT UP TO APPEAR LIKE FRUIT.
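The two clean-up steps, dropping the null columns and exchanging the L and H columns, can be sketched in a few lines of Python. The rows are copied from the transposition table printed above; the function name clean_rows and its parameters are illustrative assumptions. The sketch works cell by cell and does not correct transcription slips, so a few letters of its output deviate from the plaintext given above.

```python
# Rows of the transposition table as printed above (key order FDBJALHCGKEI).
ROWS = [
    "THEAADDARESS", "TOWHECIEHYOU", "WILLOESENDST", "UFFFOROAQMGI",
    "SMRSAWSEEENE", "YFRUITDIERER", "ANDGIERIENGR", "OCERIIFIVEHA",
    "ROLDECSEROSS", "DUBLONIATRYT", "OMAKAIEATUET", "OAPPEAEARLLK",
    "EFRHATI",
]
KEY = "FDBJALHCGKEI"


def clean_rows(rows, key, null_columns, swap):
    """Swap two columns and drop the null columns, all named by key letter."""
    a, b = (key.index(letter) for letter in swap)
    nulls = {key.index(letter) for letter in null_columns}
    cleaned = []
    for row in rows:
        cells = list(row)
        # The last row is shorter, so only swap when both cells exist.
        if a < len(cells) and b < len(cells):
            cells[a], cells[b] = cells[b], cells[a]
        cleaned.append("".join(c for i, c in enumerate(cells) if i not in nulls))
    return cleaned


plaintext = "".join(clean_rows(ROWS, KEY, null_columns="AC", swap="LH"))
print(plaintext)
```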
An unsolved cryptogram
Jim solved the other five cryptograms he received via the mailing list in a similar way. As it turned out, most of the 1,300 IRA cryptograms were encrypted with a column-based transposition. Jim could solve all of them, with only one exception. The one unsolved IRA cryptogram is reproduced below:
GTHOO RCSNM EOTDE TAEDI NRAHE EBFNS INSGD AILLA YTTSE AOITDE.
In his book, Jim wrote about this unsolved ciphertext: “This message is identified as having fifty-two letters [the IRA files list the number of letters for each ciphertext] but only fifty-one appear in the cryptogram itself.” So, it was clear that a mistake had occurred somewhere. George Lasry commented: “I worked on this [cryptogram] unsuccessfully, for example, by inserting a letter at every possible position (so did Jim as he wrote to me).”
On August 6, 2019, blog reader Richard Bean published the solution of the IRA cryptogram as a comment on my blog. Richard had determined the following plaintext:
REGELIGNITSCOTLANDSTAESTHEYRAIDEANDOBTAINEDOMEOFTHLS
In a more readable form, this message reads (note that five letters were omitted): “Re Gelignit[e] Scotland sta[t]es they raide[d] and obtained [s]ome of th[i]s”
Gelignite is an explosive material. Apparently, the IRA used it in the 1920s.
According to Richard, this message was encrypted with the key BCAEHDGFKJI (again, this is certainly not the original key used by the IRA, but it’s equivalent) as follows:
BCAEHDGFKJI
-----------
REGELIGNITS
COTLANDSTAE
STHEYRAIDEA
NDOBTAINEDO
MEOFTHLS
Now we change the order of the columns:
ABCDEFGHIJK
-----------
GREIENGLSTI
TCONLSDAEAT
HSTREIAYAED
ONDABNITODE
OMEHFSLT
Read out column-wise, we receive:
GTHOO RCSNM EOTDE TAEDI NRAHE LEBFNS INSGD AILLA YTTSE AOITDE
Apparently, the L in LEBFNS got lost, and so the following message resulted:
GTHOO RCSNM EOTDE TAEDI NRAHE EBFNS INSGD AILLA YTTSE AOITDE
This is exactly the IRA cryptogram that remained unsolved for so many years.
There are two reasons why this message was so hard to break: first, five letters in the plaintext are missing (probably due to mistakes of the sender); second, a letter in the ciphertext got lost. Richard Bean deciphered this message anyway. In the following section, you will learn how he did it.
A report from the successful codebreaker
After I had read Richard Bean’s comment, I sent him an email, asking for some background information and a photograph of him. Thankfully, he replied immediately. Here’s what he wrote:
I am a mathematician, and my background is in combinatorics and statistics.
I was a regular reader of sci.crypt for many years, starting in about first year uni, 1994. I studied a general crypto subject at uni, and looked at GOST for my project, and also studied elliptic curve cryptography (1997). My combinatorics work in Latin squares in my PhD (2001) is applicable to secret sharing schemes, and of course you can see a Latin square in the Kryptos sculpture (the tabula recta). I read books like Sinkov and F. L. Bauer but I never got involved in the “puzzle” or “classical” side of crypto – just the “academic”, “modern” and “theoretical” aspect.
I remember reading about Jim Gillogly’s solution of the first three parts of Kryptos in 1999, on sci.crypt, and I thought that was interesting, but around then I got obsessively involved in analysis of the chess game Kasparov versus the World, so I didn’t look at K4 much myself.
I got interested in Kryptos again in June 2017 (via Google Scholar). I looked at a few classical methods proposed for breaking it, like the Hill cipher, Trifid, Gromark and four-square, and wrote some custom and hill climbing programs to help me.
After reading more about Kryptos I bought and read a few books, Kahn’s “The Code Breakers” (unabridged), “Code Warriors”, “Decoding the IRA” and “Atomic Time”. I joined the American Cryptogram Association and read lots of back issues about computer solving.
In April 2018 I bought and read the “Decoding the IRA” book and decided immediately to modify my C hill climbing programs to try to break the unsolved cipher. (How hard could it be?!)
I used the “quadgram” data files for scoring plaintext from the Practical Cryptography website. I solved some of the other transposition ciphers given in the book.
But for the unsolved cipher, no matter what I tried, I was just seeing random text in the output. I started to wonder if the suggestions on Klaus’s blog (in the 2013 post) about whether it was Gaelic were true because the output made about as much sense as Gaelic to me. I also wondered if my program was just terrible, or if the text contained one or more dummy columns, so that it was impossible to distinguish between the “right” solution and one which was just “good”.
This was a bit disappointing. I decided to use Google to see if the 12-letter keys in Figure 8 of the book were on the net somewhere, and found that the phrases “and the wonder-worker was sent” and “stalest tricks” were in a 1919 book that had been scanned by Google Books in 2018.
After some playing with the snippet view I knew I had the right book. I told Jim about it, and it was news to him.
A few months later, I decided to write a program to brute force the answer for small widths and see if I could find anything, but didn’t get anywhere. I looked at George Lasry’s PhD thesis and saw that hexagram statistics seemed to be helping him, so several months after that, I grabbed all of the Project Gutenberg books from their website and began laboriously stripping the headers and footers. Lots of people have tried to write programs to do it properly; none successfully!
I thought it would help me with my Kryptos ideas, too.
I generated the statistics (26 million numbers for each 6-gram text) and ran my program again with all different widths. I noticed that width 11 was producing better scores than width 12 which seemed like it couldn’t be a coincidence, and also that the best scoring answers had “LIGNIT” in them. The IRA was of course one of the first users of gelignite, so I thought this also wasn’t a coincidence.
I focussed on width 11, trying to add a letter in all different places of the ciphertext, and got the best score improvements inserting a letter between the two Es (letter 25 and 26).
I then decided to investigate all letters in that position and force “GELIGNIT” in the output, and lots of other text became clear like “THEYRAID” and “ANDOBTAINED”. In the middle was “SCOT?AND” which of course worked best when an L was inserted.
I sent it to Jim and he confirmed the answer, explaining the kind of mistakes that were made in the enciphering process, and also pointing out that the first “L” in “AILLA” had been typed as both “I” and “L”. So it was actually harder to solve not because of extra “dummy” columns, but because of a missing column.
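The insertion step Richard describes, trying an extra letter at every position of the 51-letter ciphertext, can be sketched in Python as follows. This is only an illustration of the idea, not Richard's C program; the function name insertion_candidates is a hypothetical choice. In a real attack, each candidate would be handed to a transposition solver and scored.

```python
import string

# The unsolved cryptogram as printed above: 51 letters, one short of the 52 on file.
CIPHERTEXT = "".join(
    "GTHOO RCSNM EOTDE TAEDI NRAHE EBFNS INSGD AILLA YTTSE AOITDE".split()
)


def insertion_candidates(ciphertext, alphabet=string.ascii_uppercase):
    """Yield (position, letter, text) for every possible single-letter insertion."""
    for position in range(len(ciphertext) + 1):
        for letter in alphabet:
            yield position, letter, ciphertext[:position] + letter + ciphertext[position:]


candidates = list(insertion_candidates(CIPHERTEXT))
print(len(candidates), "candidates")  # 52 positions times 26 letters

# The repair reported in the text: an L between letters 25 and 26.
repaired = CIPHERTEXT[:25] + "L" + CIPHERTEXT[25:]
print(repaired)
```

The search space is small (52 positions times 26 letters), so the hard part is not generating the candidates but telling a good decryption from a bad one.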
It goes without saying that I am deeply impressed by Richard’s deciphering success. I am proud to have such excellent codebreakers as him among my readers. Thanks, Richard and congratulations on this great work!
Further reading: The Top 50 unsolved encrypted messages: 24. The Erba murder cryptogram