Vor zwei Monaten hat Christoph Tenzer die Challenge von Armin Krauß gelöst und ist damit der aktuelle Träger des Friedman-Rings. Heute stelle ich Christophs neue Challenge vor. Wer sie löst, wird sein Nachfolger.

English version (translated with DeepL)

Wer das aktuelle Krypto-Rätsel als erstes löst, erstellt das nächste. Nach diesem einfachen Prinzip funktioniert der Friedman-Ring. Dabei handelt es sich um ein Spiel, das nach den Kryptologen William und Elizebeth Friedman benannt ist. Vorbild ist der Iffland-Ring, der bereits seit Jahrhunderten an Schauspieler vergeben wird.

Vor zwei Monaten endete die dritte Runde des Friedman-Ring-Spiels. Das bisher letzte Rätsel kam von Armin Krauß. Christoph Tenzer hat es als erster gelöst und wurde damit der neue Träger des Friedman-Rings.

Quelle/Source: Tenzer

Damit ergibt sich folgende Liste von Trägern:

  1. Frank Schwellinger
  2. Anna Salpingidis und Christoph Tenzer
  3. Armin Krauß
  4. Christoph Tenzer

Die Infrastruktur zum Friedman-Ring ist noch im Aufbau. Es gibt eine Webseite dazu, die sicherlich noch ausführlicher werden wird.

Der jeweilige Empfänger des Friedman-Rings verpflichtet sich, ein Krypto-Rätsel zu entwickeln und mir zur Verfügung zu stellen. Wer dieses Rätsel als erstes löst, ist der neue Träger.

 

Christoph Tenzer’s Kryptogramm

Und hier kommt die neue Challenge. Christoph hat sie mir einerseits als Video zur Verfügung gestellt:

Außerdem gibt es das gleiche Rätsel als Bild:

Quelle/Source: Tenzer

Christoph schrieb mir: “Ich wollte eine Aufgabe erstellen, bei der die Verschlüsselungsmethode von Anfang an bekannt und noch dazu eher einfach ist. Der Klartext ist auf Englisch. Ich denke die Schwierigkeit ist ziemlich hoch. Aber mit Hints kann man das ja wieder vereinfachen.”

Christoph hat sogar (obwohl das in den Regeln nicht vorgeschrieben ist) einen Preis ausgesetzt. Es gibt 50 Euro (per Paypal) für die richtige Lösung – bei Bedarf aufgeteilt nach Beiträgen in den Kommentaren.

Wie schon öfters erwähnt, gehört es zu den Nachteilen einer solchen Challenge, dass jeder Leser-Hinweis, der als Kommentar veröffentlicht wird, einem anderen Codeknacker zum Gewinn verhelfen kann. Ich bitte mein Leser daher, nicht ganz so egoistisch zu sein. Wenn im Forum über Lösungswege diskutiert wird, macht das die Sache für alle spannender. Natürlich werde ich jeden Kommentar, der zur Lösung beigetragen hat, entsprechend würdigen.

Das Rennen ist eröffnet!


Further reading: Christoph’s Chaotic Caesar Challenge

Linkedin: https://www.linkedin.com/groups/13501820
Facebook: https://www.facebook.com/groups/763282653806483/

Subscribe to Blog via Email

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

Kommentare (84)

  1. #1 Bill Briere
    Wyoming, USA
    16. Oktober 2021

    Since the first “paragraph” has a prime number of characters, then the individual letters (likely) consist of strings of dots and dashes of variable lengths, just like in real Morse code.

    Also since there must be some sort of “signal” embedded in each string to indicate that it’s ending, and since that signal would have to be at least 2 characters long for disambiguation, it seems that any string representing a letter must consist of a bare minimum of at least 3 dots and/or dashes.

    Further–and this could be the twist–the “ending signal” would not necessarily have to be at the end of the string. It could be positioned some other way, where its appearance might mean, say, “this string ends AFTER the next dot or dash.”

    It seems that the rest of a letter, aside from its “end signal,” would require up to 4 dots or dashes, just like Morse, to be able to make at least 26 combinations.

    So, it seems that a letter’s length must be 3 dots and/or dashes or greater, and the maximum length must be 6 or greater. Hmm, but the longest strings might always need to be longer, to prevent parts of letters being mistaken for end signals, especially if there are multiple end signals available for use.

  2. #2 Magnus Ekhall
    Borensberg
    16. Oktober 2021

    This reminds me of Huffman coding. Time to bring out the old books from the back of the bookshelf!

  3. #3 Tobias
    16. Oktober 2021

    Yes, Magnus. This was also my first thought, when I read „prefix free“. And instead of 1 and 0 there are dots and minuses.

  4. #4 Thomas
    17. Oktober 2021

    Under the assumption that the smallest average code length is needed: If we assume the frequency order of the most frequent letters as: etaoinshr, my first try would be to assign the prefix- free 3- and 4-bit-groups based on a binary tree as follows: 000/100 to the e/t – group and 0101/0110/0111/1010/1011/1101/1110 to a/o/i/n/s/h/r.

  5. #5 George Lasry
    17. Oktober 2021

    There is a similar challenge in MysteryTwisterC3

    https://www.mysterytwisterc3.org/en/challenges/level-3/substitution-cipher-with-non-prefix-codes

    Over the last 10 years, it has been solved by only one solver. It has 150 letters in the plaintext, with no punctuations or spaces.

    There is even a Master thesis, just on the problem of deciphering a message if you know the key.

    https://scholarworks.sjsu.edu/etd_projects/176/

    So this is a very tough problem.

    1) How many letters are there in the plaintext here?
    2) Are spaces and punctuation also encoded, or just the letters from the alphabet?

  6. #6 Bill Briere
    Wyoming, USA
    17. Oktober 2021

    Looking at this again, I see that my earlier observations (Comment #1) weren’t entirely correct.

    For one thing, we can’t depend on the minimum required length being 3. Say all 26 letters ended with “dot-dot” (not the case here, of course, but only as an example). This doubled dot would signal the end of a letter. So, just to illustrate, the letter A might be dot-dash-dot-dot, and the letter B might be dot-dash-dot-dash-dash-dot-dot–and the letter E could even be as simple as dot-dot. So, BABE would be written as: dot-dash-dot-dash-dash-dot-dot-dot-dash-dot-dot-dot-dash-dot-dash-dash-dot-dot-dot-dot. In this hypothetical case, the minimum would be 2 and the maximum could be no less than 8.

  7. #7 Bill Briere
    Wyoming, USA
    17. Oktober 2021

    Since we’re dealing with a simple substitution cipher, we can most certainly rule out combining and compressing pairs or groups of letters.

  8. #8 Bill Briere
    Wyoming, USA
    17. Oktober 2021

    Here’s a transcript. Line breaks are arbitrary. Dots are zeroes and dashes are ones.

    Paragraph 1:
    0011001010101001010011000110101010001001010
    01111100100100001010010010000110001001011011
    01010100100001101100100101111001101100101101
    101000

    Paragraph 2:
    01111000110001010000111011001000011010101001
    0010010110001001010001110101010001010101101
    01100000110111001010010100101011100101000011
    10110010000110101010000110001111000110001101
    00111010110101101001100001110110010000101000
    110010110100110000111010011110011011010101

    Paragraph 3:
    00111011001000011000001110101010010010000011
    11001010110000011000101000110000111001110011
    00001111000110111001010010100010001101001101
    10110101010010000110101111000110011010100010
    00110000011101011001100100001101000

    Paragraph 4:
    01110001101110100101010100001110110010000110
    10101000011101011110001010110110010010110110
    0101010010100101000100101001101110010010001
    0010100101001010111001010010010100011010011
    00001100101101

  9. #9 Bill Briere
    Wyoming, USA
    17. Oktober 2021

    There are lots of potentially useful strings of repeated digits.

    The longest repeat I found with a visual scan was 35 (01010000111011001000011010101000011 in Paragraphs 2 and 4). The first 31 characters of that string also appear earlier in Paragraph 2.

    Other possibly useful chunks are a 24-digit string (101101101010100100001101 in Paragraphs 1 and 3) and a 27-digit string (100101001010010101110010100 in Paragraphs 2 and 4; the part in P2 overlaps with the 35-digit string).

  10. #10 TWO
    17. Oktober 2021

    Somehow it reminds me of octal as in base 8.

  11. #11 TWO
    17. Oktober 2021

    Could it be octal?

  12. #12 Thomas
    17. Oktober 2021

    @Bill Briere

    I think your observation in comment 1 that the code groups must contain at least three digits is correct. If each code had an ending with the same two digits, one out of the pairs 00, 01, 10, 11 would occur far more frequently than the others. Moreover, that wouldn’t be an actual prefix-free code (but rather some kind of delimiter integrated in each code group).

  13. #13 Thomas
    17. Oktober 2021

    A Huffman encoding based on constructing the binary tree according to letter frequencies seems only necessary when a minimum average code length is needed in order to keep the ciphertext as short as possible. But the challenge instructions don’t say anything about that. So it appears to me that letter frequencies don’t play a role here. Hence a prefix-free code that is needed to split up the binary string into code groups can be created with any binary tree containing 26 leaves representing the letters of the alphabet where the path to each leaf (left = 0 and right = 1 or the other way round) yields the code group. (Hopefully I haven’t overlooked anything here.) I’m still musing on the question how many ways there are to compose a binary tree with 26 leaves.

  14. #14 Gerd
    17. Oktober 2021

    @Thomas, I agree with your analysis.
    The challenge says it is a prefix-free code, so a code with a delimiter sequence is not suitable. We should expect something like Huffman. Also it doesn’t require an optimized code in terms of minimum length. So something like Huffman + a MASC might work.

  15. #15 George Lasry
    17. Oktober 2021

    Non-prefix means that this is not a Huffman encoding.

  16. #16 Gerry
    17. Oktober 2021

    My first question is the size of the alphabet – are there more than only a-z (e.g. comma)? Is there a word delimiter symbol?

  17. #17 Thomas
    17. Oktober 2021

    @George Lasry, Gerd

    Maybe I didn’t put it right what I meant, please correct me if I’m wrong: It should be possible to create a prefix-free code (so that no delimiter is needed for splitting up the entire 1-0-sequences into code groups assigned to single characters) with any binary tree provided it contains a number of leaves that equals the number of distinct plaintext characters. What is special about the Huffman algorithm seems to be that it yields a particular kind of binary tree built according to letter frequencies, so that short codes (i.e. paths) are assigned to frequent letters – like in the non prefix-free Morse code. So the length of the codetext gets minimized. But: Since the challenge doesn’t say anything about this point, I think the Huffman part based on letter frequencies isn’t required here, because it isn’t relevant whether the most frequent letter “e” has a short or a long code group. For creating a prefix-free code any binary tree with the plaintext characters as leaves seems to be appropriate whereas the letter frequencies don’t play any roll in building the binary tree here. Long story short: It might be a Huffington coding, but not necessarily, since different kinds of binary trees can do the prefix-free part as well. So either Huffman + MASC or non-Huffman binary tree + MASC.

  18. #18 TWO
    18. Oktober 2021

    I substituted the zeros with A and the ones with B.

    Does that count as a simple substitution?

  19. #19 George Lasry
    18. Oktober 2021

    Thomas: You are right and I was wrong. I misunderstood prefix-free to be non-prefix.

  20. #20 Thomas
    18. Oktober 2021

    The parsing must meet at least three conditions:
    First: There must not be more than 26 distinct code groups. Second: The frequency distribution of the code groups matches the general frequency distribution of the letters in English texts (because of the MASC). Third: The code groups are prefix-free.

    If the code groups had a unique length, this would be 5 bits (sufficient for 32 characters). But since the length of the first paragraph is prime, as observed by Bill Briere, this can be ruled out. But it should be expected that the maximum code group length doesn’t exceed 6 bits by far.

    Since the code groups are prefix-free, their length must satisfy the Kraft inequality (https://www.sciencedirect.com/topics/mathematics/kraft-inequality). Could that help narrow down the key space?

  21. #21 Magnus Ekhall
    Borensberg
    18. Oktober 2021

    Do we know for sure that there are no more than 26 distinct code groups? There could be numbers in there as well I suppose…

  22. #22 Narga
    18. Oktober 2021

    I just wanted to quickly say thanks to Klaus and everyone active here for the interest shown in this challenge and sharing their thoughts with the others! I’m checking regularly for new comments but I think I will wait a bit with answering questions or giving hints.

  23. #23 George Lasry
    18. Oktober 2021

    Narga: One thing that is needed right away, I am afraid. I think I have an idea for an algorithm, but it requires a very precise transcription. It is possible to post it here, so we are not at the mercy of transcription errors?

    Thanks!

  24. #24 Narga
    18. Oktober 2021

    @George: The transcript provided by Bill Briere in #8 is correct.

  25. #25 Thomas
    18. Oktober 2021

    Magnus Ekhall: No, we don’t. It was only my unfounded assumption that the plaintext doesn’t contain numerals.

    Narga: Is your challenge solvable with pencil and paper? Or is an algorithm needed that checks thousands, millions … of combinations even if not brute forcing?

  26. #26 George Lasry
    18. Oktober 2021

    OK, I have an algorithm that works perfectly under the following assumptions:

    1) Only A-Z symbols.
    2) No more than 5 symbols per letter.

    Under those conditions, it is possible to brute-force the possible encodings, then hillclimb the substitution.

    But since my program isn’t producing any result for the challenge, we have 4 possibilities:
    1) There are more than just A-Z symbols – low probability because the substitution part of my program is robust enough, and would work even if a code is wrongly interpreted as a letter. Also, the cartoon says that there is no need for spacing, so I guess there is no need either for punctuation. Still, numbers could be included, but numbers are not solvable analytically, so I don’t think they would have been included in the challenge.
    2) The transcription is wrong. Also low probability, since Bill’s transcription has been tested by Narga.
    3) I have a bug in my program, that I did not detect yet 🙂 – always possible, but I did run quite a few tests, so I give this a low to medium probability.
    4) There are letters encoded with 6 or more symbols. I give that a medium to high probability.

    Narga: Would you be ready to confirm or deny that there are some codes longer than 6 symbols. If there are, then my algorithm won’t work, as there are two many encoding options, and the only alternative method I can think of is with a generous crib. If there aren’t, I will further dig up the code for possible bugs.

  27. #27 Gerd
    18. Oktober 2021

    @George
    Are you sure you can get 26 symbols prefix-free in 5 Bits? I fear you might need 7 bits for that, but maybe I am wrong.

  28. #28 George Lasry
    18. Oktober 2021

    Gerd: Yes, it is possible, there are a few tens of millions of such keys.

    Here is an example:

    000 1000 1101 1111 00100 00101 00110 00111 01000 01001 01010 01011 01100 01101 01110 01111 10010 10011 10100 10101 10110 10111 11000 11001 11100 11101

    This example actually ‘parses’ correctly the 4 parts of the ciphertext, but there is no solution for the alleged substitution.

  29. #29 Narga
    18. Oktober 2021

    @George: Great work! But there are indeed some codes longer than 6 symbols.

    @Thomas: I don’t know what to say. I think with a tool that can quickly highlight all the occurrences of a sequence in the challenge (like MS Word) it’s a lot easier. How about if you would know one of the more frequent letter codes, e.g. N?

  30. #30 George Lasry
    18. Oktober 2021

    Thanks, Narga. In this case, I will have to wait for the hints. And unless there is a long crib, I am doubtful this will ever be solvable.

    1) Would you consider another challenge, with no more than 5 symbols per letter?

    2) Or, if the number of 6+ symbol letters is not too high, if you provide those as hints (without the substitution key, of course), that might also be solvable, although this would require substantial additional work.

  31. #31 Thomas
    18. Oktober 2021

    @George

    Since the number of binary trees containing n leaves equals the (n – 1)th Catalan number, there should be 4861946401452 (the 25th Catalan number) different prefix-free codes (i.e. different sets of root – leaf paths) possible (without restrictions of the code group lengths). That seems to be a lot of stuff for brute forcing. Do you see a way of hillclimbing all possible sets of paths (I have no idea about the swapping part, the scoring might be possible by comparing the Index of Coincidence of the results to the IoC of English)?

    But since Narga doesn’t basically rule out a pencil and paper approach, there must be a different way ( reminds me of some Renaissance Italian number ciphers without spaces), maybe based on the position of sequences at the beginning and the end of each paragraph. Furthermore, two code groups in the ciphertext mustn’t have a sequence between them that is to short to form a code group (I’d estimate of length 1 and 2). If the repeating parts made out by Bill can help, no idea yet.

  32. #32 George Lasry
    18. Oktober 2021

    I counted about 15M options if you have only 5 or fewer symbols per code. With 6+, this semi-brute-force approach is not practical.

    I have been playing a little bit with hillclimbing cipher variable-length codes, but to get something significant, you would need maybe tens of thousands of symbols. And the Vatican ciphers, while homophonic, only allow 1, 2, or 3 digits.

    It might be solvable to pen-and-paper only, but I guess you would need a nice crib.

  33. #33 Gerd
    19. Oktober 2021

    George, thank you for the example. I see that when the shortest symbol is 3 bits, you can exactly choose one 3-bit, three 4-bit symbols, and the remaining 22 symbols are 5-bit.
    For being prefix-free, the 3-bit symbol blocks 4 of the 32, each 4-bit symbol blocks 2 of the 32, so it comes out with exactly 26 characters.
    Counting the possibilities for the choice of symbols, I get 32 x 28 x 26 x 24 /3! = 93k.
    Is 15M including the MASC possibilities?

  34. #34 YeS
    19. Oktober 2021

    Has anyone tried to translate the Morse in the video? Also, besides Morse I can hear woman speaking – may be some kind of a clue.

  35. #35 Narga
    19. Oktober 2021

    @YeS: I couldn’t make up my mind between using the image and the video for the challenge, so I gave both to Klaus to see his opinion. And of course he used both 🙂
    The audible (and visible) Morse code in the video is the same as in the image, the background noise including the spoken words is only there to create atmosphere.

  36. #36 YeS
    19. Oktober 2021

    The Morse seems to be the one from the image.

  37. #37 YeS
    19. Oktober 2021

    @Narga: Ah. OK. Thanks.

  38. #38 George Lasry
    19. Oktober 2021

    Gerd:

    The count ignores the MASC (26!).

    Assuming there are no more than 5 symbols per letter (which is not the case here), you have the following options:
    – With no 3-symbol codes, you can have 0 to 6 codes with 4 symbols.
    – With just one 3-symbol code, you can have 0 to 3 codes with 4 symbols
    – With two 3-symbol codes, you cannot have any 4-symbol code.
    – You cannot have 3 or more 4-symbol codes.

    This gives 7 + 4 + 1 = 12 options. It is easy to compute how many cases you have per option.

  39. #39 George Lasry
    19. Oktober 2021

    I meant:
    – You cannot have 3 or more 3-symbol codes.

  40. #40 farmerjohn
    Riga
    19. Oktober 2021

    Some thoughts on tree generation for Huffman code.
    Let’s denote:
    n0 – number of tree nodes without child nodes (these are leaves),
    n1 – number of nodes with exactly one child node
    n2 – number of nodes with two children (=n0-1).

    For Huffman tree n1 is strictly 0. This means that if, for example, 010 is code for some letter, then 011 is prefix for other valid code(s).
    However for general prefix-free code n1 can also be positive, which makes exhaustive search much harder.

    Can “unfinished” tree, where codes at leaves are final, be augmented into Huffman tree for English language? Can, iff
    1) n0 + n1 0, or n1 = 0 and n0 = 26

  41. #41 George Lasry
    19. Oktober 2021

    Narga: What about a ‘light’ hint, as follows:

    1) How many codes with 6 symbols?
    2) Are there any codes with 7+ symbols?

  42. #42 Thomas
    19. Oktober 2021

    Probably the plaintext contains several times the word “the”, the t and the e being the most frequent letters in English. Provided a repeated sequence (for instance found by Bill Briere in #9), stands for “the”, the first part representing the t (of unknown length, but in an estimated range of 3 through 8 bits) and the third part standing for the e (of unknown length as well) should very frequently occur in the whole codetext. But since we don’t know the plaintext’s length, the number of occurences cannot be estimated, but only the relative frequency. Unfortunately, I have no algorithm for making out repeated sequences of variable length in a string. Using the MS word search function is like looking for a needle in a haystack.

  43. #43 Johannes
    19. Oktober 2021

    @Thomas
    The Sliding Window Algorithm is a good candidate to perform this task. Numpy for instance has a built-in function to do exactly this.

    As Bill Briere already mentioned there are sequences with 35 and 27 matching bits which repeat twice.

    Here are a couple more interesting ones including those from Bill.

    [0 1 0 1 0 0 0 0 1 1 1 0 1 1 0 0 1 0 0 0 0 1 1 0 1 0 1 0 1 0 0 0 0 1 1], 35 bits, 2x

    [0 0 1 0 1 0 0 0 0 1 1 1 0 1 1 0 0 1 0 0 0 0 1 1 0 1 0 1 0 1 0 0], 32 bits, 2x

    [1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 1 1 1 0 0 1 0 1 0 0], 27bits, 2x

    [1 0 1 1 0 1 1 0 1 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1], 24bits, 2x

    [0 0 0 1 1 0 1 1 1 0 0 1 0 1 0 0 1 0 1 0 0], 21 bits, 2x

    [0 1 0 1 1 0 1 0 0 1 1 0 0 0 0 1 1 1 0 1], 20 bits, 2x

    [1 0 0 0 0 1 1 1 0 1 1 0 0 1 0 0 0 0 1], 19bits, largest one which appears 4x

    [0 0 1 1 1 0 1 1 0 0 1 0 0 0 0 1], 16 bits, largest one which appears 5x

    There are loads more but unfortunatly i have not found a way yet to filter the interesting ones and variations of those.

  44. #44 Narga
    19. Oktober 2021

    @George: There are 8 codes with 6 symbols and yes, there are codes with 7+ symbols.

  45. #45 Gerd
    19. Oktober 2021

    Is it a reasonable assumption that at start and end of the repeating sequences that Johannes has found are also start and end of character-codes? Could we try to break the message up in the codes using the repeating sequences?
    Compare the 16 bit sequence and the 19 bit sequence in #43. So 100 is a code?

  46. #46 George Lasry
    19. Oktober 2021

    Narga: thanks for the additional information.

    The key entropy is way too high, based on this. I don’t have a proper formula, but I am not even sure that the length of the message (probably between 100 to 150 letters) is above the unicity distance.

    I am bailing out at this stage. If there is a sequel or side challenge with no more than 5 symbols per code, I am happy to take it, though.

    Meanwhile, many thanks for this great exercise, I enjoyed very much working on it.

  47. #47 Thomas
    20. Oktober 2021

    @Johannes

    Much appreciated! I’ll check that out.

  48. #48 Gerry
    20. Oktober 2021

    @Narga: I repeat my question, because otherwise there are too many possibilities: My first question is the size of the alphabet – are there more than only a-z (e.g. commas, dashes)? Is there a word delimiter symbol (space) or are the words glued together (e.g. “you are” versus “youare”?

  49. #49 Narga
    20. Oktober 2021

    @Gerry: I saw that question, sorry for not answering that right away. I think that’s THE break-through question for pen-and-paper solvers who already found some needles in the haystack.

  50. #50 Hias
    20. Oktober 2021

    Sind die 8 Stück 6-Bit Codes möglicherweise:
    101000
    101001
    101010
    101011
    101100
    101101
    101110
    101111

    Dazu noch die100 und 000.
    Tatsächlich nur mit „pen-and-paper“ versucht, daher mehr ein Schuss ins Blaue.

  51. #51 m
    20. Oktober 2021

    The text probably contains double letters (ll, ee), and some of the immediate repetitions like 00110-00110 will be caused by double letters. They can also occur by accident, but 00110 is at least a first candidate letter.

    It is even a good letter candidate, because paragraph 1 starts with it. It could also be the second letter of paragraphs 2 and 4 giving two more candidate letters 01110 and 011110. There are three other instances of 011110-00110, which makes this more plausible – and gives us a lot of possible letter-breaks to continue looking for more letters

  52. #52 Narga
    20. Oktober 2021

    @Hias: Leider nicht.
    Sorry, this is not correct.

  53. #53 Hias
    21. Oktober 2021

    Das Rätsel soll ja lösbar sein. Da keine offensichtlichen Hinweise eingearbeitet sind ist es denkbar, dass einer der vier Abschnitte das Lösungs-Alphabet darstellt.

  54. #54 Basic
    Linkoping
    21. Oktober 2021

    I’ve tried to find “letter spaces”. Ie, points in the code string between letters.
    My approach was to merge the four phrases into one string. And keep an array of indices to letter spaces in the code.
    The array starts as: [0, 137, 398, 609, 797]
    Where 0 is before the first bit, and 797 is after the last bit. The three other numbers are between the phrases.
    Then search for the longest string starting/ending at those spaces, and transfer the letter space to the newly found places.
    Repeat the process accepting shorter and shorter substrings, until no new matches are found.

    Here’s the result:

    ls: [0, 137, 398, 609, 797]
    398 +18 [155, 264, 635]
    264 -27 [766]
    155 +16 [335]
    137 +14 [295]
    155 -13 [467]
    335 -13 [372]
    797 -13 [366]
    366 +13 [329]
    ls: [0, 137, 155, 264, 295, 329, 335, 366, 372, 398, 467, 609, 635, 766, 797]

    Description of the result:
    ls: …
    Initial and final “letter space”-vector.

    398 +18 [155, 264, 635]
    For index 398 and 18 bits after that, there are matches that starts at index 155, 264, and 635.

    264 -27 [766]
    For index 264 and 27 bits before that, there’s a match that ends at index 766.

    With the new spaces, there’s two short strings of 001100, so that could be a potential letter.

    I stoped the search at 13 bits matching, because at 12 bits it pretty much broke down with too many accidental matches.
    I’m aware that this is only good for giving hints at possible letter spaces, and it might very well be some false positives above.
    The biggest unknown for me is how prone a Huffman coding is to “find sync” with the correct spaces if it starts out of sync.

  55. #55 Johannes
    22. Oktober 2021

    I found a nice article which was released last month regarding ciphers based on prefix codes and their security. I am not sure wether it is allowed to post links so i will just post the name of the paper. It is available for free and can be found with Google

    It is called “A Cipher Based on Prefix Codes” by Otokar Grošek, Viliam Hromada and Peter Horák.

    Unfortunatly the papers raises serious concerns if this cipher is actually solveable in its current form. I would love to hear your opions on it.

  56. #56 Johannes
    22. Oktober 2021

    Additionally i found a paper dated back to 1994 which indicates that we might face an NP-Complete problem here.

    It’s called “Complexity aspects of guessing prefix codes” by A. S. Fraenkel and S. T. Klein.

  57. #57 Narga
    22. Oktober 2021

    After a week now (instead of a first hint) I will simply affirm some findings by the commenters here:
    – The assumption by Thomas in #42 that the repeating sequences found by Bill in #9 could be connected somehow to the word “the” is spot on.
    – The repeating sequences found by Johannes in #43 are essential for solving the challenge
    – The conclusion by Basic in #54 that “001100” or “. . – – . .” is a letter is also correct.

  58. #58 Thomas
    22. Oktober 2021

    Thus 011110 and 01010 are code groups (either letter or space) as well.

  59. #59 Thomas
    22. Oktober 2021

    A long shot provided that the code contains at least one group for the word space:
    The substring 0101000011101100100001101010 in the string found by Bill could be parsed as 01010 – 000111011001000011 – 01010 = space – THE – space. The group 1000011 occurs very frequently and might stand for E. More evidence is needed.

  60. #60 Bernd Adameit
    Deutschland
    22. Oktober 2021

    As far as I can see, the sequence 00111011, which occurs five times (once at the beginning of paragraph 3), always expands by eight more characters to the sequence 0011101100100001.

    This could be an indication that 00111011 stands for “T, followed by a somehow compelling prefix of H”. Or, to put it in other words, 0011101100100001 could stand for TH, with T being encoded with a proper prefix of the sequence 00111011 (perhaps 00111).

  61. #61 Narga
    22. Oktober 2021

    @Thomas #58: Even though 001100 is a letter code, not every occurrence of 001100 is substitutable by that letter. It can also be segments of other letters. But you’re almost correct.

  62. #62 two
    24. Oktober 2021

    I think I see a pattern.

    “@Thomas #58: Even though 001100 is a letter code”

    001100 =7 bits.

    So it could start as a 7, 2, 4, 4, 8 bits string.

    If I counted correct that is.

  63. #63 TWO
    24. Oktober 2021

    correcton

    7, 2, 4, 4, 10 bits.

    I can’t count but also have flu.

  64. #64 mjw
    thisBlog
    24. Oktober 2021

    @all Hi, all. Could it be that the first signal of a character and the last signal of a character
    is everytime of the same value? – Like: “aABAABAa” or “bBAAABAb”
    These “Character-Determination-Singals” would then switch like:
    aXa-bXb-aXa-bXb-aXa-bXb-… | …just an idea.

    All “First-Letters”
    IOTIISSI
    All “Last-Letters”
    EYRESYRR
    Maybe there is a hint hidden after a rearrangement.

    @Narga
    thx for your riddle. Unfortunately I don’t have much time to take part-, but time to peek-in.

  65. #65 George Lasry
    24. Oktober 2021

    I spend nearly the whole weekend on it, but finally solved it with the hints:

    No one would have believed
    In the last years of the nineteenth century
    That human affairs were being watched
    From the timeless worlds of space

    That was a really tough one! The hints were quite useful,
    but some of them must be used with caution, as not all repetitions
    indicate that just before and after there is a letter separation.

    Eventually, I used 0101000011101100100001101010100 which appears 3 times.

    I was inspired by Thomas’s idea, but the space turned to be 010100, and
    0011101 T
    1001000 H
    01101 E

    It was also useful to know that 001100 is a symbol, and based on THE and this one
    I also guessed that 011110 was a valid one.

    The second line starts with:

    011110 001100 010100 0011101 1001000 01101 010100
    which decodes to ??_THE_
    which I guessed to be IN_THE_

    then later in that line, we have 0101000011101100100001101010100001100011110001100011010011101011010110100110000111011001000
    which decodes as _THE_NINETEENTH_
    which was a strong indication that the solution was correct,
    and the continuation must be CENTURY

    I was able to recover a few more letters by guessing other places (LAST_YEARS, HUMAN_AFFAIRS),
    then I had enough fragments to look for the complete text in Google.

    A really exciting and fun challenge!

    Huge thanks for creating it, and giving only the right amount of hints 🙂

  66. #66 George Lasry
    24. Oktober 2021

    Many thanks also to everyone who contributed ideas.

    In particular, those that were eventually given as hints:

    1) Bill – the significant repetitions
    2) Johaness – realizing that 001100 is a code.
    3) Thomas – the idea that there is a symbol for space,
    and that one of the repetitions represents _THE_

    PS: Still hoping to get a sequel challenge with no more than 5 symbols per code,
    and only A-Z, to test my original algorithm, which I did not use here

  67. #67 George Lasry
    24. Oktober 2021

    Sorry for misspelling Johannes “-(

  68. #68 George Lasry
    24. Oktober 2021

    The full key (in lexicographic order):

    000 D
    00101 M
    001100 N
    0011010 P
    0011011 R
    0011101 T
    001111 U
    0100 W
    010100 _
    010101 Y
    011000 A
    011001 C
    01101 E
    01110 F
    011110 I
    1000 B
    1001000 H
    1001001 L
    1001010 S
    1001011 V
    101 O
    11 G

    Five letters are not used: J, K, Q, X, Z

  69. #69 Narga
    24. Oktober 2021

    Amazing! Congratulations, George!
    Your solution is correct. So you solved it mostly by deduction and trial&error?

    Thanks to everyone else who gave it a try and contributed ideas!

  70. #70 George Lasry
    24. Oktober 2021

    Yes, no hill climbing etc… Just deductions, and a piece of code that shows the possible positions of known or guessed codes. The spaces are very useful. It would have been much harder without the spaces.

  71. #71 Klaus Schmeh
    24. Oktober 2021

    George:
    Congratulations, you are the new bearer of the Friedman Ring!!!

    Perhaps, you can send me some information on how you found the solution. Then I will write a blog post about your success.

  72. #72 Thomas
    24. Oktober 2021

    What an awesome success! Congratulations, George! If Klaus will write a new blog post on your solution, maybe you could cover several questions: How did you figure out the code group standing for the space? How did you find out whether a certain sequence is a code group or consists of the last digits of a code group and the first digits of the next one?

    Narga: A great challenge, it was much fun! I’ve learned a lot about prefix free codes and their construction.

  73. #73 Bill Briere
    Wyoming, USA
    24. Oktober 2021

    @George, congratulations!

    I’m a pen-and-paper, classical-ciphers kind of guy, so I appreciate what you accomplished the old-fashioned way. I had been aware of the concept of prefix-free codes, which Christoph announced in the puzzle itself, but have never worked with them before. I instinctively looked at the problem from a linguist’s perspective, rather than as a mathematician (which I’m not), so I got off to a misguided start.

    In your recoveries, it looks like there might be a nonrandom mapping of letter assignments. There seem to be a few instances of alphabetic progression peeking through that might hint at a keyword.

  74. #74 Bill Briere
    Wyoming, USA
    24. Oktober 2021

    @Narga, nice puzzle!

    Did you intend/expect this to be pen-and-paper solvable? I fully expected that the solution would be found by a programmer (which I’m not). It turns out, though, that a semi-linguistic, low-tech approach was used by (the incredibly talented) George to finish it off.

  75. #75 Narga
    24. Oktober 2021

    @Bill: Well, I aimed to make it solvable for programmers and classical cipher enthusiasts alike, but one needs some kind of visualisation of where a chosen string appears in the ciphertext, or a way to find repeating sequences. So only pen-and-paper without at least minor computer aid seemed impossible to me.

    I had actually sent a first version without the space-characters to Klaus and then later introduced them to make it harder for programmers and a lot easier for the pen-and-paper community.

  76. #76 George Lasry
    24. Oktober 2021

    Many thanks for the congratulations!

    I have sent a detailed description of the solution steps to Klaus so that he can publish it in a readble blog entry.

    Narga: Huge thanks again, great challenge, and enormous fun!

    I now have an even harder task to come up with a interesting, enjoyable, hard enough but not unsolvable challenge 🙂

  77. #77 Magnus Ekhall
    Borensberg
    25. Oktober 2021

    Well done George! I’m not surprised that you came up with the solution!

  78. #78 mjw
    25. Oktober 2021

    @George
    Congratulations to You!

    @All
    Nice, Team. 🙂

    @Narga
    Thx, again, for Your Challenge.

    @Klaus
    Danke, für Deine Arbeit rund um das Alles.

  79. #79 Johannes
    25. Oktober 2021

    @George Lasry
    Congratulations for a job well done!

    I only wanted to add that the hint with the 001100 letter came from Basic. So he deserves credit. 🙂

    Good job everybody!

  80. #80 Johannes
    25. Oktober 2021

    @George
    Oh, and one more thing. In post #66 you stated that you were working on an algorithm. Maybe you can share it on Github or similar? I am very much interested in taking a look at it. Which language did you write it in?

  81. #81 George Lasry
    25. Oktober 2021

    Johannes: Thanks for the congrats and for the clarification on the credit for the hint.

    As for software code, I am afraid I am not always disciplined enough to write readable code that can be published. I can give you more details on the specific algorithm over email (but note that it works only with no more than 5 symbols per code, and with only A-Z).

  82. #82 Nils Kopal
    Krefeld
    26. Oktober 2021

    Hi George,
    Well done 🙂
    Greetings,
    Nils

  83. #83 Nils Kopal
    Krefeld
    26. Oktober 2021

    @all others: Cool. I really like that the whole contributions lead finally to solving the riddle
    @Narga: Really cool challenge 🙂 — maybe you want to create a challenge as George mentioned for MTC3?

  84. #84 Narga
    26. Oktober 2021

    @Nils: that’s a good idea, I’ll do that in the next weeks.