An alleged cryptanalyst has published a new hypothesis about the Somerton Man cryptogram on Wikipedia. Does it make sense?

Reports about the Mystery of the Somerton Man (also known as the Tamam Shud case) often start with a sentence like the following: “On 30 November 1948, an unknown man was seen by witnesses on Somerton Beach in Adelaide, Australia, …”. As you might have noticed, this means that last Friday marked exactly 70 years since the Somerton Man was spotted. Yesterday was the 70th anniversary of the unknown man’s death.

Such an anniversary would be a good occasion to remember that the dead man who was found 70 years ago …


… is still unidentified to date and that the encrypted (?) note he carried with him …


… has never been deciphered. However, this is not the purpose of this post.

It is also not the purpose of this post to tell you that the German Wikipedia linked to the Somerton Man entry on its main page last Friday (though I am a little proud that a book and an article of mine are referenced in this entry).


The actual reason for my post is that I learned something new when I read the German Somerton Man entry on Wikipedia last Friday. Here is the paragraph that caught my eye:


Here’s a translation:

In October 2018, for the first time the consideration arose that the message could have been created on the basis of a previously learned, greatly reduced dictionary. If this dictionary is known to both the sender and the recipient, it is possible to substitute each individual word of the cipher with a different key alphabet. By means of the arrangement of letters (‘word patterns’), the recipient can nevertheless identify the plaintext on the basis of the word list available to him. In the case of English, there are approximately the following (few) possible solutions:






It can be taken as a confirmation of this method that there are very few alternatives to the words ‘A’, ‘TRAINING’ or ‘DEMOGRAPHER’ (these alternatives are not necessarily on the word list). Another special feature is that the last two words are always combined for tactical reasons, for example to make it more difficult for outsiders to recognize this relatively easy-to-understand encryption method. However, it is conceivable that the cipher was written in a language other than English.
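
The word-pattern idea in the quoted paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `word_pattern` and `candidates`, the six-word list, and the ciphertext word `XQZMKMKB` are all made up for the example and do not come from the Wikipedia entry or from the Somerton Man cryptogram.

```python
def word_pattern(word):
    """Return the repetition pattern of a word, e.g. 'HELLO' -> (0, 1, 2, 2, 3)."""
    first_seen = {}
    pattern = []
    for letter in word.upper():
        # Each new letter gets the next free number; repeats reuse their number.
        if letter not in first_seen:
            first_seen[letter] = len(first_seen)
        pattern.append(first_seen[letter])
    return tuple(pattern)


def candidates(cipher_word, word_list):
    """Return all words in word_list that share the pattern of cipher_word."""
    target = word_pattern(cipher_word)
    return [w for w in word_list if word_pattern(w) == target]


# Hypothetical reduced dictionary, for illustration only.
WORD_LIST = ["A", "ENEMY", "TRAINING", "DEMOGRAPHER", "LONDON", "HELLO"]

# Hypothetical ciphertext word with the same letter pattern as TRAINING.
print(candidates("XQZMKMKB", WORD_LIST))  # ['TRAINING']
```

With a list this small the match is unique. Whether that still holds in practice depends entirely on how restricted the shared word list is, because the pattern alone carries no information about which letters were used.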

The source for this hypothesis is described on Wikipedia as follows: “First discussed on October 3, 2018 by cryptanalyst Christof Rieber (‘sequential polyalphabetic substitution’)”.

It goes without saying that, after reading this, I asked myself a few questions:

  • Who is Christof Rieber? Rieber is described on Wikipedia as a “cryptanalyst”. However, I have never heard of him and I can’t find anything he has published. I don’t know if he is the same person who wrote a German Einstein biography.
  • Does this hypothesis make sense? It is not impossible that the author of the Somerton Man cryptogram used a different substitution table for every word. However, in my view, there are two reasons why the Rieber hypothesis is most likely wrong. First, it would be unusual for somebody to use such a complicated cipher for such a simple note. And second, the letter frequencies clearly indicate that the letters of the cryptogram are starting letters of English words, which contradicts Rieber’s conjecture that we are dealing with a (polyalphabetic) letter substitution.
  • Who wrote these paragraphs on Wikipedia? This question is easy to answer. I’m sure it was Rieber himself.
  • Does this hypothesis belong in a Wikipedia article? In my view, it doesn’t. Publishing “original research” (i.e., previously unpublished research results) on Wikipedia clearly violates the Wikipedia rules. The remark “First discussed on October 3 …” is not a literature reference. In my view, these paragraphs should either be deleted or complemented with a proper reference.
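
The alternative reading mentioned in the list above, that the letters are the starting letters of English words, suggests a simple mechanical check: reduce a candidate text to its word initials and search the result for a given letter sequence. The sketch below assumes that approach. The function names, the sample sentence, and the search string are placeholders chosen for the example; they are not the Somerton Man letters.

```python
import re


def initials(text):
    """Return the upper-cased first letters of all words in text as one string."""
    # A "word" is taken to be a run of ASCII letters (apostrophes allowed inside).
    words = re.findall(r"[A-Za-z][A-Za-z']*", text)
    return "".join(word[0].upper() for word in words)


def find_initials(sequence, text):
    """Return the 0-based word index where sequence occurs in the initials, or -1."""
    return initials(text).find(sequence.upper())


# Placeholder text and search string, for illustration only.
sample = "The quick brown fox jumps over the lazy dog."
print(initials(sample))              # TQBFJOTLD
print(find_initials("FJO", sample))  # 3
```

A hit found this way would only be a lead, since a short run of initials can occur in many texts by chance.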

If you have a different opinion about this new Somerton Man hypothesis, please let me know.

Further reading: A German spy message from World War 2


Comments (20)

  1. #1 Richard SantaColoma
    2 December 2018

    I personally think there are too many variables to come to any conclusion as to the merits of this idea.

    I will take the opportunity to link to another idea related to the Somerton Man, which I hadn’t seen before (happened to come across it only last night):

    I have always felt, as others have, that there must have been some connection to the woman he may have met, near where he died. And the daughter of this woman had many interesting points to make about her mother’s mysterious past.

    She may be “The Spy who Loved Him”. In my opinion, learning more about this woman could possibly be the key to narrowing down the possible system used in his code.

  2. #2 Thomas
    2 December 2018

    According to Wiki, the paragraph you cited was written by an “Unbekannter Benutzer” (unknown user) on 3 Oct. His IP address is still stored by Wiki Checkuser (90 days). According to Anexia Internetdienstleistungen GmbH, this address is located in Vienna. Maybe he’s the one who posted on your blog. I don’t think he is the same person as the author of the Einstein biography, who is a historian living in Ulm (according to Wiki).

  3. #3 Jerry McCarthy
    England, Europe
    2 December 2018


  4. #4 Jim Gillogly
    2 December 2018

    As Richard says, this “method” gives the cryptanalyst too many subjective variables to adjust. The first variable is picking amongst the different candidates for each word. For example, the word TRAINING has the same pattern as STOWAWAY, as well as the less relevant-looking CATERERS, DEFINING, REFINING, DRAINING and CHAINING (among the 56 options in my unabridged). ENEMY (also with only one option picked by the proposer) could as easily be ALARM, AGAIN, AWAKE, AWARE, and USUAL, just to pick five of the hundreds of choices. Dealing with the long bits is even more problematical, because if there’s no word that matches you could break it up many ways into two or more words… or even if there *is* a matching word that doesn’t fit your preconception, you can break it up into smaller words that aren’t as constrained. Without a credible description of how the deciphering alphabet is related among words (as it almost must be if it’s to be memorized) I give this method a big thumbs down. Even providing keyed alphabets that would produce those “identified” words would be an improvement.

    I agree with whoever said it looks like the initial letters of words, as with the Masonic cribs used for remembering the various litanies in their ceremonies.

  5. #5 MBq
    2 December 2018

    I’ve deleted the above-mentioned text from the de.wp article, since it clearly violated our guideline “No original research”. Thanks for pointing that out. Best regards, –User:MBq

  6. #6 Bill Briere
    Wyoming, USA
    3 December 2018

    Clearly, the method described on Wikipedia (and now removed) isn’t a viable means of communication. But it’s possible that the real system employed in the Somerton mystery message isn’t any better.

    I think most cryptanalysts would agree that the note (or, rather, the indentations from writing the note) found in that old copy of The Rubaiyat and connected to the Somerton Man consists of word initials meant to trigger the memory of the person who jotted them down. Maybe privacy was an additional motivation, but a system like this wouldn’t be very well-equipped to pass information from one individual to another.

    This type of “code” isn’t unique to the Somerton Man, either. Similar systems (in my opinion) are found in a verse written in an underground tunnel at the University of Texas and in the Ricky McCormick notes.

    My short-term memory doesn’t work anymore, so I’m always using the McCormick/Somerton system to get me through the day. It’s great for conveying information from “Myself Now” to “Myself an Hour from Now.” A single letter, in context with other initials, is often all I need to recall my shopping list or a group of people’s names. But how would someone else know if the “ABC” written on my hand means “Adam, Barbara, Chris” or “apples, bread, chocolate”? Even with contextual clues, you might make an educated guess of “apples, bananas, candy,” which would be a partially correct assumption and yet still not be a solution or even a partial solution.

    Anyone, including code breakers, can speculate about the Somerton note’s meaning (and given more information, we could even make educated guesses), but we certainly shouldn’t call such demonstrably unverifiable products “solutions.”

  7. #7 Richard SantaColoma
    3 December 2018

    I agree, Bill. It is just an intuitive guess, for what that is worth (!), but it does look like a mnemonic to me. Perhaps to help remember a poem, an essay, a pledge of some kind (and “like minds”, I’ve felt that way about the McCormick notes, too… license numbers? Addresses? bookie stuff…).

    I was actually looking at programs, for something that could take a block of text and strip all the first letters from each word. If this was done for an entire book, then the string of the Somerton Man’s letters could easily be searched for among those results.

    Like probably many before me, and as you point out, I also did this manually from a downloaded copy of The Rubaiyat. Obviously no one has found anything like this there. But if he had an interest in classical poetry? That might narrow it down. A program which could prepare texts like this would make a pretty fast tool to check works.

  8. #8 Christof Rieber
    3 December 2018

    Hello Cryptocrowd,

    thank you all for your comments/messages. First of all: I did not write any book about ‘German Einstein’. Regarding the main points:

    “I personally think there are too many variables to come to any conclusion as to the merits of this idea.”

    I DID mention that such word pattern encryption would rely on a ‘greatly reduced dictionary’. Even referred to that one twice (‘these alternatives are not necessarily on the word list’). A dictionary with e.g. 500 words would not necessarily produce too many similar word pattern variations to understand a full message (consisting of both longer and shorter words). Also, I still believe every one of us is able to decrypt a cipher such as ‘GUDDO GUPFUP’ as long as the only words matching these patterns on such a list were ‘HELLO’ and ‘LONDON’. Other words, such as ‘CAMBRIDGE’, simply wouldn’t fit into such a pattern (prove me wrong).

    Regarding e.g. Ricky McCormick: Clearly a different encryption method (if at all). Easy to figure this out by e.g. counting the number of ‘SE’ bigrams, which is unusually high for any English text at all.

    “you can break it up into smaller words that aren’t as constrained”

    Never claimed this method ‘allowed’ to do so. Only the last words to be combined, as mentioned.

    Also, the word ‘consideration’ was clearly expressed (‘Überlegung’). So far, there was no better solution for the cipher.

    IF you go for any kind of substitution cipher, you will soon figure out that the last sequence of the Somerton cipher is not solvable in most common languages (from English through Indonesian or Italian to Swahili and even Persian). Been there, done that, too.

    Regarding “our guideline – No original research”: There DOES exist enough scientific work regarding both polyalphabetic encryption and word pattern matching. Nothing else has been used (although in combination with each other). No need to refer to such papers (e.g. Friedman, Military Cryptanalysis, 1952). Fully complies with the Wikipedia regulations (“attributable, even if not attributed”).

    Final comments on me being a cryptanalyst or not: Into crypto for approximately 8 years, read Klaus Schmeh’s book as well as plenty of others (Singh etc.). Not working for any university and glad about it. Focus is on Z340 and related as well as Dorabella. Somerton cipher for me is ‘closed’ as I am more interested in the previous ones. Also it is too short. Actually DO programming for cipher encryption, mostly in Python using Aho-Corasick and other algorithms, word pattern matching, n-gram pattern matching etc. Even created my own dictionary with ~4,500 words based on word roots only. Currently running an attack on line 17 of the Z340, computing billions of letter combinations in a few minutes only (when the friendly message of Gerry reached me). So decide for yourselves. Personally I have no need to be called a ‘cryptanalyst’ or whatever as I am more into doing my Python programming, which is neither too difficult nor too easy.

    @MBq: You may delete anything on Wiki without attribution to scientific papers but it could be even more work than creating better content (if possible at all).

    I keep it friendly (as my approach was only to share the idea) and go on with my stuff.

    Best regards

  9. #9 Christof Rieber
    3 December 2018

    Oh, btw…I also found “STOWAWAY”, “DINING” etc. when doing the word pattern analysis.

  10. #10 MBq
    3 December 2018

    @Christof Rieber: Thank you for your comment. I acknowledge there are loads of books on the decryption methodology you were using, but what we need is a peer-reviewed publication which explicitly supports your conclusion. See . This is a formal requirement; I’m far from questioning your expertise. – If you like we can continue this on . Regards, –MBq

    • #11 Christof Rieber
      4 December 2018

      Thank you very much for your feedback, also the hint about HistoCrypt 2019, which sounds very interesting. There is a paper in the works, if not almost ‘ready’, but I will only publish it after ‘cracking’ one of the two ciphers.

      Historical ciphers rule (because modern ones are safe).


  11. #12 Klaus Schmeh
    3 December 2018

    @Christof Rieber: Thanks for your comment. I’m sorry that my article led to your hypothesis being deleted from Wikipedia. However, like MBq, I consider this as original research.

    Your work on Z340 and Dorabella sounds interesting. It would be great to learn more details about it. Perhaps, you want to hand in a paper for HistoCrypt 2019. The Call for Papers will be published soon.

  12. #13 Klaus Schmeh
    7 December 2018

    Sam Taylor via Facebook:
    The last time I looked at this one, there was a theory about micro writing. I take it that didn’t pan out.

  13. #14 Klaus Schmeh
    8 December 2018

    Bart Wenmeckers via Facebook:
    The ciphertext is so short that it is difficult, if not impossible, to verify a solution, as many plausible solutions can be made. If the Somerton Man was a foreign spy (for which there is some supporting evidence), then it is unlikely the plaintext is English.

    This is an interesting case in Australia and New Zealand as this is the only such potential murder spy crypto espionage case in this corner of the world that is publicly available.

    • #15 Christof Rieber
      11 December 2018

      In his suitcase, a name tag with “T. Keane” written on it was found. A “Tom Keane” was indeed missing, but some ‘relatives’ of this Tom could not identify the Somerton Man as their relative.


  14. #16 Christof Rieber
    11 December 2018

    Also, the Tamam Shud book was in English, too.


  15. #17 Gray Goods
    23 January 2019

    Am I really the only guy on this German blog who reads German words in every second line (ignore the mistaken, stricken one) of this message? Afaics, “Paket” and “Samst[a]g” stand out. Maybe the Somerton Man had already decyphered the message, directly under the coded lines? Then it would read like this:
    NTB [or WTB, MTB] im Paket Pitt MT Samst[a]g Ab[end]
    NTB [or WTB, MTB] in package Pitt MT saturd[a]y ev[ening]

    NTB could stand for Notizbuch (notebook) or may be a codeword in itself. MT could be a decyphering or transmitting error. There is a Pitt Street in Adelaide. There’s also a Mount Pitt on Norfolk Island, but imho that’s literally a remote possibility for leaving a package.

    Anyway, if anyone of you cypher experts has time on his hands, pls check if the coded lines may actually be decyphered to the clear text.

  16. #18 Gray Goods
    23 January 2019

    One more idea: I am aware that the coded message has 20 letters, while the clear text shows 24. But maybe the day of the week had been shortened to 2 letters, but the Somerton Man wrote down 6, for clarity? Samstg instead of Sa? Then it could be a rather simple letter replacement code. Pls check if the differences between corresponding letters make any sense, maybe result in a quote from a book (omit the first letter, which can’t be clearly read).

  17. #19 Christof Rieber
    23 January 2019

    I see no ‘Pitt’ but only ‘ITT’, also PANET instead of PAKET.

    But indeed SAMSTGAB is similar to Samstag Ab(end)


  18. #20 Gray Goods
    24 January 2019

    Please take a look at the photo, the “P” is in the second line: “IM PAKET P”. Of course, before decyphering it, Mr. “Somerton” couldn’t know where the words would be separated.
    Also take a look at the “N” in what you read as “PANET”. Would you write that letter with such a curved last line? It’s a sloppily written “K” instead, imho.
    Btw, do you know it’s not a photo of the real slip of paper with the message? It’s one of the traces of the letters in the book (the “Rubaiyat”) that was used as a support when writing. That’s probably the reason why the letters don’t line up neatly. Their imprints had been made visible in the police laboratory. The real paper with the message has never been found.