Pietro Giannone (1791–1872), an Italian poet and patriot, left behind an encrypted poem filling a whole book. After nearly 150 years, mathematics teacher Paolo Bonavoglia broke the encryption.
On Saturday I reported on what seemed to be an encrypted book from Italy. Thanks to comments from my readers, I now know that this alleged crypto mystery is a hoax made by three Italian students. Today, I’m going to cover another encrypted book from Italy, but this time I’m sure it is authentic.
The book in question was written by Pietro Celestino Giannone, an Italian poet and patriot (not to be confused with the 18th-century historian Pietro Giannone, who has an English Wikipedia entry, while Pietro Celestino Giannone hasn’t). Giannone’s most famous poem is titled “L’esule” (“The Exile”). It was first published in Paris in 1829. Another poem written by Giannone is encrypted. It fills a whole book of 80 pages, all handwritten. Here’s a page of it (courtesy of the “Museo del Risorgimento” of Modena):
Consolato Pellegrino, former professor of Elementary Mathematics at the University of Modena and Reggio Emilia, brought this cryptogram to the attention of Paolo Bonavoglia, a teacher of Mathematics at a school in Venice. Paolo is an expert in cryptology. In 2014, he edited and expanded the Manuale di Crittografia by Luigi Sacco, his grandfather. After Giannone’s poem had remained unsolved for almost 150 years, Paolo tried to break it. This blog post is based on Paolo’s publication in Cryptologia (paywalled). He has also published some information on his website (in Italian).
How to break a nomenclator
The alphabet Giannone used for his poem consists of about 400 characters. Besides ordinary letters and digits, it includes numerous symbols. The most likely explanation for such a large number of characters is a nomenclator (a nomenclator provides characters for both letters and frequent words or syllables). Here’s a nomenclator I recently received from Karl de Leeuw (in this case, the alphabet consists of numbers):
A well-constructed nomenclator is hard to break. However, many nomenclator designers made serious mistakes. For example, many nomenclators make an obvious distinction between letters and words (e.g., letters are represented by two-digit numbers, while three-digit numbers stand for words). In addition, many nomenclators were not used properly. Instead of using a word symbol, users often encrypted a word letter by letter (this is easier if you know the letter symbols by heart).
Paolo hoped that Giannone had made mistakes like these, too. He conjectured that the digits and letters represented single letters, while the symbols stood for syllables or words. This hypothesis was supported by the fact that the symbols had low frequencies, while the letters and digits appeared a lot more often.
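The frequency argument can be illustrated with a few lines of Python. This is only a minimal sketch, not Paolo's actual procedure: the sample string is invented for illustration (it is not a transcription of Giannone's text), and the function names split_by_class and mean_frequency are hypothetical.

```python
from collections import Counter

def split_by_class(ciphertext):
    """Count letters/digits and all other symbols in two separate tallies."""
    alnum = Counter()
    symbols = Counter()
    for ch in ciphertext:
        if ch.isspace():
            continue  # word separators are not cipher characters
        if ch.isalnum():
            alnum[ch] += 1
        else:
            symbols[ch] += 1
    return alnum, symbols

def mean_frequency(counter):
    """Average number of occurrences per distinct character."""
    return sum(counter.values()) / len(counter) if counter else 0.0

# Invented toy ciphertext (an assumption, not taken from the manuscript).
sample = "1 p4c4 1 p4c4 § 62 c131 † 2 82662"
alnum, symbols = split_by_class(sample)
print("letters/digits:", mean_frequency(alnum))
print("symbols:       ", mean_frequency(symbols))
```

If the letters and digits come out with a clearly higher average count than the symbols, that is consistent with letters and digits standing for single letters and the rarer symbols standing for words or syllables.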
As it turned out, the digits 1, 2, 3, and 4 often stood alone (i.e., they evidently stood for one-letter words). As “a”, “e”, “i”, and “o” are frequent one-letter words in Italian, there was a simple explanation for this observation: the digits from 1 to 5 represented the five Italian vowels A, E, I, O, and U. But was this hypothesis correct? When Paolo looked for word patterns, he discovered the following (courtesy of the “Museo del Risorgimento” of Modena):
The character sequence “1 p4c. 1p4c.” fit well with “a poco a poco”, a commonly used Italian idiom meaning “bit by bit.” This meant that the letters P and C were not even enciphered! And it meant that Giannone had used two different characters (“4” and “.”) to encode the O.
Paolo’s hypothesis proved correct. Knowing the characters that stood for A, C, E, I, O, and P, he could reconstruct the other letters of the nomenclator. As expected, Giannone had often spelled words out letter by letter, even when a word symbol was available. The fancier symbols stood, as expected, for words or syllables. Paolo could guess many of them. Here are a few:
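A guessed plaintext like this (a "crib") can be checked mechanically. The sketch below is only an illustration under one assumption: every cipher character stands for exactly one plaintext letter, while a plaintext letter may have several cipher characters. The function name mapping_from_crib is hypothetical.

```python
def mapping_from_crib(cipher, crib):
    """Return {cipher character: plaintext letter} if the crib fits, else None."""
    cipher = cipher.replace(" ", "")
    crib = crib.replace(" ", "")
    if len(cipher) != len(crib):
        return None
    mapping = {}
    for c, p in zip(cipher, crib):
        # A cipher character must always decode to the same letter.
        if mapping.setdefault(c, p) != p:
            return None
    return mapping

mapping = mapping_from_crib("1 p4c. 1 p4c.", "a poco a poco")
print(mapping)
```

For this crib the check succeeds and yields 1 for a, 4 and "." both for o, and p and c standing for themselves. A guess that does not fit the pattern, such as "a poco a poca", is rejected because "." would have to mean both o and a.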
All in all, Giannone’s nomenclator is relatively simple. The first five digits from 1 to 5 represent the five Italian vowels A, E, I, O, U. The next digits represent a few common consonants: L, M, N, R. Other consonants are enciphered as themselves. The exception is Z, which is enciphered as A, and vice versa. The digrams AN, EN, IN, ON, and UN were encoded as an uppercase T in various rotations. As a whole, the nomenclator has 160 symbols plus 20 upper- and lowercase consonants, which were not encrypted.
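The letter part of such a table is easy to express in code. The following is a toy sketch, not the real nomenclator. The vowel digits and the "." for O come from the description above; the assignment of 6, 7, 8, 9 to L, M, N, R is purely an assumption made for illustration, since the exact digits are not stated here. The Z/A exchange and the word and syllable symbols are left out, and anything unknown is shown as "?".

```python
VOWELS = dict(zip("12345", "AEIOU"))
# Assumption: the article only says "the next digits" stand for L, M, N, R.
ASSUMED_CONSONANTS = dict(zip("6789", "LMNR"))
EXTRA = {".": "O"}  # second character for O, as seen in "p4c."

def decode(ciphertext):
    """Decode the letter part of the toy table; unknown symbols become '?'."""
    table = {**VOWELS, **ASSUMED_CONSONANTS, **EXTRA}
    out = []
    for ch in ciphertext:
        if ch in table:
            out.append(table[ch])
        elif ch.isalpha():
            out.append(ch.upper())  # consonants stand for themselves
        elif ch.isspace():
            out.append(ch)
        else:
            out.append("?")  # word/syllable symbols are not modelled
    return "".join(out)

print(decode("1 p4c. 1 p4c."))  # A POCO A POCO
```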
When Paolo wrote his Cryptologia article, about 200 lines of the book had been completely decrypted. The poem seemed to be an anthology of different tales. Here’s an excerpt with the cleartext (courtesy of the “Museo del Risorgimento” of Modena):
Paolo was surprised to find erotic content with explicit language in the poem. Maybe this was the reason why Giannone encrypted it.
Further reading: How Gary Klivans solved an encrypted letter from a prison inmate