The year 2018 has its first alleged Voynich Manuscript solution. This time, two researchers say that Hebrew is the language the enigmatic book was written in. What’s behind this new hypothesis?
A new solution?
According to reports by Fox News, The Daily Mail and others, yet another Voynich Manuscript solution (or at least a solution approach) has been put forward recently (thanks to blog reader George Keller for the hint). Here are the most important facts about it:
- Who? The new alleged solution stems from Professor Greg Kondrak and graduate student Bradley Hauer from the University of Alberta, Canada. Both are into computer science with a focus on NLP (no, this is not Neuro-linguistic Programming, but Natural Language Processing). This background gives me hope that their work is not complete crap.
- What? The two researchers say that the manuscript was written in Hebrew. I don’t know if this is a new hypothesis. Others have claimed that the language underlying this mysterious text is Latin, Greek, English, German, Italian, Armenian or Arabic – just to name a few.
- Where was it published? As mentioned above, there are a number of press reports about Kondrak’s and Hauer’s solution. Luckily, there’s also a scientific publication. The two presented their research at the Association for Computational Linguistics Conference 2017. Their paper “Decoding Anagrammed Texts Written in an Unknown Language and Script” appeared in Transactions of the Association for Computational Linguistics (Volume 4, Issue 1).
What Kondrak and Hauer really did
To be fair, Kondrak and Hauer don’t claim to have solved the Voynich Manuscript (the Fox News headline “15th-century manuscript with ‘alien’ characters finally decoded” is therefore nonsense). What they did is well described in the abstract of their paper:
Algorithmic decipherment is a prime example of a truly unsupervised problem. The first step in the decipherment process is the identification of the encrypted language. We propose three methods for determining the source language of a document enciphered with a monoalphabetic substitution cipher. The best method achieves 97% accuracy on 380 languages. We then present an approach to decoding anagrammed substitution ciphers, in which the letters within words have been arbitrarily transposed. It obtains the average decryption word accuracy of 93% on a set of 50 ciphertexts in 5 languages. Finally, we report the results on the Voynich manuscript, an unsolved fifteenth century cipher, which suggest Hebrew as the language of the document.
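To make the two terms from the abstract concrete, here is a small Python sketch of an "anagrammed substitution cipher": every letter is replaced according to a fixed key (the monoalphabetic substitution), and then the letters inside each word are shuffled. The function names, the seed and the sample sentence are my own assumptions for illustration; this is not code from the paper.

```python
import random
import string

def make_key(rng):
    """Return a random monoalphabetic substitution key as a dict."""
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    rng.shuffle(shuffled)
    return dict(zip(letters, shuffled))

def substitute(text, key):
    """Replace each letter via the key; leave other characters untouched."""
    return "".join(key.get(ch, ch) for ch in text)

def anagram_words(text, rng):
    """Shuffle the letters inside each space-separated word."""
    out = []
    for word in text.split(" "):
        chars = list(word)
        rng.shuffle(chars)
        out.append("".join(chars))
    return " ".join(out)

rng = random.Random(2018)  # fixed seed so the run is repeatable
key = make_key(rng)
plain = "the priest and the people"  # hypothetical sample text
cipher = anagram_words(substitute(plain, key), rng)
print(cipher)
```

Word boundaries and word lengths survive this kind of encryption, while the order of the letters within a word does not. That is why a solver for such a cipher cannot rely on letter order inside words.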
In fact, algorithmic decipherment (i.e., letting a computer break an encrypted text without a human interfering) is a very interesting topic. A number of articles about it have been published in the scientific journal Cryptologia (where it is referred to as “automated cryptanalysis”). As described on this blog before, hill climbing has been used for this purpose with great success.
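For readers who have never seen hill climbing applied to a substitution cipher, the following Python sketch shows the basic loop: start with a random key, swap two letters, keep the swap if the decrypted text looks more like the target language, undo it otherwise. Everything here is an assumption made for illustration, in particular the tiny REFERENCE text that stands in for a real language model and the bigram-based score function. It is not the method from the paper, and with such a small model it will usually not recover the plaintext.

```python
import math
import random
import string
from collections import Counter

ALPHABET = string.ascii_lowercase

# Tiny stand-in for a real language model (assumption); a serious solver
# would use n-gram statistics taken from a large corpus.
REFERENCE = (
    "the quick brown fox jumps over the lazy dog and then the dog "
    "chases the fox through the old forest near the river"
)

def bigram_logprobs(text):
    """Log-probabilities of character bigrams plus a floor for unseen ones."""
    counts = Counter(zip(text, text[1:]))
    total = sum(counts.values())
    floor = math.log(0.01 / total)  # penalty for bigrams never seen
    return {bg: math.log(c / total) for bg, c in counts.items()}, floor

LOGP, FLOOR = bigram_logprobs(REFERENCE)

def score(text):
    """Higher means the text looks more like the reference language."""
    return sum(LOGP.get(bg, FLOOR) for bg in zip(text, text[1:]))

def decrypt(cipher, key):
    """key[i] is the plaintext letter assumed for cipher letter ALPHABET[i]."""
    return cipher.translate(str.maketrans(ALPHABET, "".join(key)))

def hill_climb(cipher, rng, steps=5000):
    key = list(ALPHABET)
    rng.shuffle(key)                      # random starting key
    start = best = score(decrypt(cipher, key))
    for _ in range(steps):
        i, j = rng.sample(range(26), 2)
        key[i], key[j] = key[j], key[i]   # try swapping two letters
        s = score(decrypt(cipher, key))
        if s > best:
            best = s                      # improvement: keep the swap
        else:
            key[i], key[j] = key[j], key[i]  # no improvement: undo
    return key, start, best

rng = random.Random(1)  # fixed seed so the run is repeatable
shift = str.maketrans(ALPHABET, ALPHABET[3:] + ALPHABET[:3])
cipher = "the old dog chases the quick fox".translate(shift)
key, start, best = hill_climb(cipher, rng)
print(start, best, decrypt(cipher, key))
```

The score can only go up or stay the same, which is also the weakness of the method: it can get stuck in a local maximum. Practical solvers therefore restart from several random keys and use much better language statistics.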
Before Kondrak and Hauer published the paper mentioned above, they co-authored a scientific article about the algorithmic decipherment of monoalphabetic substitution ciphers (MASCs). I haven’t read it yet, but it looks quite interesting. As the abstract above shows, their current paper extends their algorithmic decipherment techniques with new methods for determining the cleartext language.
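How can one guess the language of a text without being able to read a single letter? One simple idea is that a monoalphabetic substitution only renames the letters, so the sorted list of letter frequencies stays exactly the same. The Python sketch below compares this sorted profile with profiles of known languages. It is my own simplified illustration, not the authors' implementation. The three one-sentence samples and all function names are assumptions, and the ciphertext is simply an enciphered copy of the English sample, chosen to show the invariance. A real experiment would need large corpora and unseen text.

```python
import string
from collections import Counter

ALPHABET = string.ascii_lowercase

# One-sentence samples (assumption); far too small for real use.
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog near the river",
    "german": "der schnelle braune fuchs springt ueber den faulen hund",
    "latin": "gallia est omnis divisa in partes tres quarum unam incolunt belgae",
}

def profile(text):
    """Sorted relative letter frequencies, highest first (always 26 values)."""
    counts = Counter(ch for ch in text.lower() if ch in ALPHABET)
    total = sum(counts.values())
    return sorted((counts.get(ch, 0) / total for ch in ALPHABET), reverse=True)

def distance(p, q):
    """Sum of absolute differences between two profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))

def identify(cipher, samples):
    """Return the sample language whose profile is closest to the cipher's."""
    target = profile(cipher)
    return min(samples, key=lambda lang: distance(target, profile(samples[lang])))

# Encipher the English sample with a reversed alphabet (an Atbash-style key).
atbash = str.maketrans(ALPHABET, ALPHABET[::-1])
cipher = SAMPLES["english"].translate(atbash)

print(cipher)
print(identify(cipher, SAMPLES))
```

Note that this trick would also survive the shuffling of letters within words, because it only counts letters and ignores their order.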
In the last chapter of their paper, Kondrak and Hauer apply their solution method to the Voynich Manuscript. This experiment can only be successful if the Voynich Manuscript was encrypted with a MASC – which is far from clear. At least, Kondrak’s and Hauer’s method delivers a result: Hebrew is the language that fits best. The first sentence of the manuscript might be:
She made recommendations to the priest, man of the house and me and people.
Serious research, but not a solution
Sub-chapter 5.4 of Kondrak’s and Hauer’s paper is titled “Decipherment Experiments”. This headline describes exactly what is going on here. Two computational linguists asked themselves what happens if the text in the Voynich Manuscript is treated as a MASC encryption in an unknown language and fed to a MASC solving program. One of the conclusions given in the paper reads as follows: “[Our work] can only be a starting point for scholars that are well-versed in the given language and historical period.” In other words: Don’t trust this “solution”, it’s only experimental.
All in all, it should be clear: Kondrak’s and Hauer’s work should not be confused with the dozens of useless Voynich Manuscript solutions that have been proposed in the past. Instead, it is a piece of serious research on algorithmic decipherment, enhanced with a nice experiment, which should not be misunderstood as the definitive way to decipher the manuscript.
I hope we will see Greg Kondrak and Bradley Hauer at crypto history conferences in the near future.
Further reading: A test for checking whether a Voynich Manuscript solution is correct