The bigram 2500 challenge

The following ciphertext encrypts an English cleartext with 2500 letters. Each number (between 0 and 675) represents a bigram:

205 211 230 674 450 545 445 541 221 216 323 434 274 611 46 212 387 201 83 450 134 257 674 545 496 209 608 394 90 255 103 151 191 650 151 266 565 646 189 7 255 299 674 290 626 7 193 249 281 231 486 179 302 450 107 674 151 368 483 230 461 631 568 665 302 597 205 577 462 312 214 21 290 626 7 193 249 281 577 503 545 674 316 302 106 434 445 305 626 642 648 282 170 368 671 642 264 445 580 383 97 179 65 306 7 266 70 365 179 191 450 107 595 54 396 189 603 364 650 580 205 230 322 517 295 234 496 238 255 364 196 545 503 628 229 381 148 478 316 153 122 580 248 381 445 602 214 385 478 205 368 630 384 674 264 597 29 96 11 600 257 631 580 189 473 189 597 205 324 103 222 221 189 26 70 292 375 381 562 42 431 302 106 434 445 305 617 299 29 423 205 67 642 580 265 374 214 272 260 405 26 264 600 11 450 562 42 32 238 234 388 1 46 229 208 229 66 153 657 198 122 580 577 70 292 375 381 4 373 523 205 381 212 122 387 124 131 523 205 368 66 122 299 602 214 248 234 189 170 368 70 292 375 602 580 282 122 248 396 450 362 481 230 461 202 312 597 14 302 106 434 445 305 539 299 119 214 373 205 324 457 368 381 212 218 153 502 65 247 545 595 602 355 463 177 582 179 485 36 450 290 626 7 193 249 46 169 404 119 7 151 447 105 375 416 153 5 211 648 116 368 202 312 597 66 322 517 295 11 265 393 248 21 205 634 665 151 238 218 97 447 452 580 577 545 302 597 381 674 151 597 566 221 478 450 42 577 565 85 499 229 7 413 221 365 211 490 342 317 530 177 135 302 106 434 445 305 481 70 373 327 323 170 368 205 577 261 6 463 110 178 292 211 205 577 266 565 646 189 39 665 151 238 122 90 131 396 370 257 603 496 46 611 368 439 191 675 374 251 207 565 450 396 205 577 194 265 595 151 327 615 85 606 24 191 290 626 7 193 249 46 169 221 216 10 665 151 342 138 84 230 255 103 450 107 674 205 11 221 302 106 434 445 305 481 381 212 122 235 11 387 201 384 233 103 384 597 7 205 674 287 642 514 299 131 191 496 481 209 238 218 154 191 212 663 385 42 478 205 375 602 387 602 214 396 205 179 211 501 449 475 122 608 461 450 107 674 364 608 572 212 189 29 87 214 288 191 650 151 266 565 646 189 490 302 106 434 445 305 481 7 323 45 450 435 368 450 107 595 580 603 306 447 179 191 214 11 205 209 447 83 450 435 368 478 205 375 602 387 602 580 205 221 447 316 46 631 521 530 557 46 194 131 191 496 481 209 238 218 646 191 503 496 375 7 154 191 387 159 212 663 503 496 375 608 131 191 496 481 209 238 218 445 151 209 324 375 608 196 131 205 366 103 151 209 324 375 608 131 191 496 481 209 238 496 191 450 7 323 299 674 45 333 447 577 384 385 282 445 131 191 496 481 209 238 218 154 191 212 663 385 42 478 205 375 602 387 602 214 396 205 221 46 261 122 490 189 70 292 375 490 486 512 179 174 445 123 549 214 597 66 461 248 381 445 602 214 42 387 201 230 212 274 545 562 42 32 373 342 215 333 179 595 387 461 215 333 179 595 127 302 106 434 445 305 422 46 194 131 191 496 481 209 238 122 299 127 486 512 577 191 562 42 342 212 248 229 580 211 11 565 57 289 642 7 503 541 191 381 94 597 248 385 42 450 461 299 212 248 229 580 230 461 205 221 46 70 289 672 11 523 205 447 11 565 57 221 46 194 131 191 496 481 209 238 135 189 70 292 375 490 486 512 179 174 445 123 189 316 153 122 580 36 490 189 218 251 674 545 496 248 381 445 602 214 205 221 342 167 375 29 674 355 214 450 70 373 327 323 577 265 391 368 234 562 42 191 42 191 445 450 290 179 368 249 595 225 381 552 539 201 373 431 59 11 205 619 28 141 151 26 151 368 384 214 205 221 46 631 449 450 107 674 301 539 449 353 42 238 496 191 212 663 141 151 26 387 194 582 383 557 486 261 602 580 517 342 167 375 29 260 7 455 674 355 214 450 176 481 559 230 57 205 209 87 324 368 422 46 631 664 373 45 42 671 229 597 373 191 386 368 306 87 481 642 503 496 642 167 211 577 255 116 85 447 539 496 191 215 333 179 595 127 59 179 191 212 663 342 85 7 674 364 450 290 179 552 640 66 449 11 552 1 46 212 577 413 384 113 646 191 212 248 229 580 211 11 565 57 289 642 7 503 124 373 16 191 387 159 189 134 191 545 595 675 211 255 384 7 344 178 205 577 552 591 52 234 388 481 387 159 478 205 375 602 387 602 214 577 202 597 21 674 503 333 375 364 582 603 255 491 27 214 234 496 611 191 674 205 211 230 674 214 21 584 567 211 501 449 475 122 521 597 387 450 396 615 237 375 647 26 642 557 191 394 597 108 122 521 205 324 189 483 324 375 7 154 191 450 290 179 552 591 29 521 503 628 672 387 381 4 292 211 205 530 11 151 191 674 447 221 447 306 591 381 164 176 211 230 615 237 375 647 26 642 557 447 306 189 316 191 394 597 108 122 521 214 21 413 11 52 234 388 481 552 591 287 532 307 387 159 299 674 396 381 494 230 672 450 134 46 611 377 597 387 450 396 59 52 234 388 481 212 663 478 205 375 602 387 602 214 205 221 46 261 450 299 70 233 209 619 28 287 374 84 28 113 646 191 212 248 229 580 211 11 565 57 289 642 7 503 541 191 381 94 597 248 385 42 255 384 151 191 261 582 481 552 617 45 248 21 70 365 103 384 59 264 331 42 248

1 / 2 / 3

Kommentare (20)

  1. #1 George Lasry
    Israel
    14. Februar 2017

    A hill climber I wrote can solve simulated texts with at least 8000 bigrams (16000 plaintext monograms). It can find some barely readable fragments at 5000 (bigrams). For this challenge (2500 bigrams), I don’t think hill climbing would work, at least not the naive version. Another approach is needed.

  2. #2 Thomas Ernst
    Latrobe
    14. Februar 2017

    The “675” brings back memories. On the assumption that “q” doesn’t occur in your two cipher texts – although it certainly could – “quote”, “question”, et alia – you appear to be the first cryptographer to put into action Trithemius’ “Steganographia” III: improvedly so, i. e. by bigrams. In S III, Trithemius goes up to 700 / 725, and does not use the bottom rung 100 – 1; a peculiar steganographic obsession of his to start up late with numbers; idem in his numerical ciphers of the “Polygraphia”. – Am assuming that you gave the most frequent bigrams “th” and “he” two extra lines of substitution, i. e. additional homophones. – One of the computer gurus will have figured it out soon. But thank you for bringing JT S III back to life. Where he belongs!

  3. #3 Thomas
    14. Februar 2017

    A count of number pairs would be more reliable than bigram frequencies. A simulated annealing algorithm swapping numbers combined with a tetragraph fitness function might be a good approach.

  4. #4 Norbert
    14. Februar 2017

    I wrote a simulated annealing algorithm for the second challenge which apparently cracked it overnight, but … there remains much work of correcting and complementing “by hand” … Unless I’m very much mistaken the plaintext starts like this:

    every time you login to a computer the operating system checks whether you

  5. #5 Thomas
    14. Februar 2017

    @Norbert
    Great!

    Did you swap parts of the bigram table/rows/columns/single numbers?

  6. #6 Norbert
    14. Februar 2017

    Must go to work now … I tried to complete the plaintext in a text editor, but in my hurry I made a mess out of it. I will come back to it tonight (unless someone has posted the complete plaintext by then).

    @Thomas: Single numbers. I didn’t rely on random swaps (as there are too many possibilities) but made sure that every possible swap is tried several times per round. That actually made the difference, I assume.

  7. #7 Thomas Ernst
    Latrobe
    14. Februar 2017

    The sequence “131 191 496 481 209 238 218” occurs four times, the sequence “131 191 496 481 209 238” twice. Of English words that exist both in variations of 14 and 12 characters, unfortunately, there are many: “substitutional”, “substitution”; “intertextually”, “intertextual”. At least we may assume that 131 begins a word, and 218 denotes a nominal, adverbial or adjectival suffix. Alas – not enough for an algorithm …

  8. #8 Thomas Ernst
    Latrobe
    14. Februar 2017

    A few more repetitions in the “2500”: “205 221” (6), “266 565 646 189” (3), “602 387 602 214” (4), “131 191 496 481 209 238 218” (4), “131 191 496 481 209 238” (2), “290 626 7 193 249 46 169” (2), “290 626 7 193 249 281 (1). Which – should have mentioned it above – does not imply that the first bigram[m] of theses sequences constitutes the beginning of a word, since it can be an overlap of the last and first letter of two words, like “an[d t]he”. Still, it might be time to just head into the situation room …

  9. #9 Norbert
    14. Februar 2017

    My head hurts … I decided that I won’t reach perfection regarding challenge #2 anyway, so this is my state of knowledge still written as bigrams. May a native speaker (and IT expert) correct my mistakes, fill the gaps and turn it into readable text 🙂

    ev er yt im ey ou lo gi nt oa co mp ut er th eo pe ra ti ng sy st em ch ec ks wh et he ry ou ar ea na ut ho ri ze du se rw he ny ou wi th dr aw ca sh fr om an at am ac hi ne th em ac hi ne ch ec ks wh et he ry ou ar ea le gi ti ma te ac co un to wn er if yo ue nt er af ac to ry th eg at ek ee pe rc he ck sw he th er yo ua re an ap pr ov ed vi si to ro ft he fa ci li ty in ea ch of th es ec as es th er ei sc he ck to de te rm in et he au th en ti ci ty of th ec la im an da ut ho ri za ti on of th ea ct io nh ow ev er it is te ch ni ca ll ys uf fi ci en tt oc he ck th ev al id it yo ft he ri gh tt ol og in wi th dr aw ca sh or ac ce ss fa ci li ty wi th ou tr ev ea li ng yo ur id en ti ty te ch ni ca ll ys ta te di 095 an yo ft od ay sa ut he nt ic at io np ro ce ss es an id en ti ty of ap er so no rm ac hi ne is va li da te da lt ho ug hi ti sa ls op os si bl ec he ck on ly ap ar ti cu la ra tt ri bu te of an id en ti ty wi th ou td is cl os in gt he fu ll id en ti ty da ta va lu et yp ic al at tr ib ut es th at ca nb ev al id at ed ar ea ge ad dr es sd at aa dr es sp er mi ss io ns an dp ro cu ra ti on at tr ib ut eb as ed au th en ti ca ti on ha ss ev er al ad va nt ag es on ly pa rt ia li de nt it yd at ai sd is cl os ed an dl im it ed to sp ec if ic at tr ib ut es da ta pr ot ec ti on an at tr ib ut em ay ha ve av al id it yp er io dt ha ti sd if fe re nt fr om th eo ne of th ei de nt it ya na tt ri bu te ca nb ea ss er te dt oo rc al cu la te df ro ma 323 ea 173 ex is ti ng id en ti ty da ta an at tr ib ut ec an be ch an ge dw it ho ut ch an gi ng th ei de nt it yi ns om ec as es at tr ib ut es ca nt 478 ev al id at ed wi th th eu se of at tr ib ut es ta te me nt sf or ex am pl et he pr oc es so fa ge ve ri fi ca ti on us ua ll yd oe sn ot re qu ir et he sp ec if ic bi rt hd at ei ts el 207 ut ra th er ac al cu la te dv al ue ab ou tt he at tr ib ut es ta te me nt su ch as pe rs on is ov er ei gh tt ee na tt ri bu te sa re we ll su it ed fo rs ma rt ca rd au th en ti ca ti on es pe ci al ly in th ew 456 ld of ei dd oc um en ts fo ri ns ta nc et he pr oo fo fp er mi ss io no re nt it le me nt fo rd ri vi ng or pu rs ui ng ap ro fe ss io nd oc to rl aw ye ro rs im il ar ca nb ea ut he nt ic at ed pr oo fo fr eg is tr at io nf or pr oo fo fi ns ur an ce pr of es si on al or ga ni za ti on so rs im il ar ar ea dd it io na la sp ec ts of at tr ib ut ec al cu la ti on at tr ib ut es ha ve be en we ll kn ow nf or de ca de sa sa pa rt of at tr ib ut ec er ti fi ca te sa na tt ri bu te ce rt if ic at ea ss ta nd ar di ze di nt he 125 iv eo ni ne fr am ew or 244 an be th ou gh to fa sa di gi ta lc er ti fi ca te wi th ou ta pu bl ic ke yt he si gn at ur ec om es fr om ac er ti fi ca ti on au th or it yc ah ow ev er th is ki nd of at tr ib ut eh an dl in gd oe sn tp la ya ro le in mo de rn ei dc ar ds ys te ms on er ea so nf or th is is th at it is to oc om pl ic at ed to in 521 330 ea ca in ev er yc er ti fi ca te ge ne ra ti on in ad di ti on at tr ib ut ec er ti fi ca te sh av et 478 es to re do nt he ca rd du ri ng pe rs on al iz at io nw hi ch pr ev en ts th ep os si bi li ty of ch an gi ng th em du ri ng li fe ti me of th ed oc um en tf ur th er if ap ar ty is al lo we dt or ea dt he at tr ib ut ec er ti fi ca te al ls to re da tt ri bu te sa re di sc lo se dd ur in gt hi sp ro ce ss an dc an be ut il iz ed ev en ou ts id et he ca rd fo rt he se re as on sa tt ri bu te st od ay ar et yp ic al ly pr ov id ed by th ec ar di ts el ft he ym ay ei th er be si gn ed by th ec ar do rp ro vi de dt oe xt er na ls ys te ms vi aa na ut he nt ic at ed ch an ne la na tt ri bu te th at is pr ot ec te di ns uc ha wa yc an be re fe rr ed to as ac re de nt ia lb ec au se it so ri gi na li ty an da ut he nt ic it yc an be pr ov en cr ed en ti al sc re at ed in su ch aw ay in cl 006 eo 375 yt he ne ce ss ar ya tt ri bu te so ft he ca rd ho ld er an dw hi ch ex pl ic it pe rm is si on sa re 291 an te da ni nn ov at iv ew ay to ha nd le at tr ib ut es an dt oa ss ig np er mi ss io ns to th em ha sb ee ni mp le me nt ed in th eg er ma ni dc ar db ym ea ns of ad ed ic at ed ri gh ts ma na ge me nt an de nc od ed in se 154 de sc ri bi ng ce rt if ic at es pr es en to nt he ca rd ea ch at tr ib ut ec an on ly be re ad ou tb ya se rv ic ep ro vi de rw it ht he ex pl ic it co ns en to ft he ho ld er an dw it hd ed ic at ed pe rm is si on gi ve nb ya fe de ra lc af or th is pe rm is si on th es er vi ce pr ov id er ha st oa pp ly fo ra nd th ep er mi ss io nf or th eb us in es sc as ei sh an de do ve ri nf or mo fa ca rd ve ri fi ab le ce rt if ic at et ha ti sp ro ve nb yt he id ca rd du ri ng au th en ti ca ti on of th es er vi ce pr ov id er it is cl ea rt ha tc re de nt ia ls si gn ed by th ec ar da re mo re se cu re th an th os eo 375 yp ro vi de dv ia an au th en ti ca te dc ha nn el ho 247 he wr or 045 nd ho wn ew at tr ib ut es ar ec re at ed du ri ng th ev al id it yp er io do ft he do cu me nt is ex pl ai ne db el ow ev en wh en an at tr ib ut ei sp ro vi de db ya sm ar tc ar di ti so fc ou rs ei nd ep en de nt fr om it it is th er ew or ep os si bl et ha to ne se rv ic ep ro vi de rp as se sa na tt ri bu te to an ot he rs er vi ce wh en th is at tr ib ut ei sd ig it al ly si gn ed th er ec ei ve rc an be su re th at th ec on te nt is co rr ec ti ft he re is no di gi ta ls ig na tu re th er ec ei vi ng se rv ic ep ro vi de rh as to tr us tt he so ur ce of th ea tt ri bu te an dt he co am un ic at io nc ha nn el ho we ve rh an dl in ga tt ri bu te si nd ep en de nt ly fr om ac ar di so ft en no td es ir ab le as it ma ke st he ca rd ho ld er ca nn ot di re ct ly co nt ro la pi ec eo fp er so na li nf or ma ti on wh ic hm ay be sh ar ed wi th ot he rs er vi ce st he re fo re mo de rn at tr ib ut em an ag em en ti so rg an iz ed in au se rc en tr ic wa yt hi si sb as ed on pr oc es se si nw hi ch at tr ib ut es or as se rt ed st at em en ts ar ep ro vi de db yt he ca rd it se 154 of co ur se th is ap pr oa ch is mo re co mp le xt ha nt he co nc ep to 369 re el 480 lo at in ga tt ri bu te sb ut it re nd er sm or ed at ap ro te ct io na nd ah ig he rd eg re eo fp ri va cy wh il ea na tt ri bu te ca nb ew ri tt en to as ma rt ca rd du ri ng th ep ro du ct io np ro ce ss th ew 005 616 ot en ti al of th is te ch no lo 166 is on ly av ai la bl ei fe 198 st in 464 ar ds ca nb eu 054 at ed at tr ib ut es af te 633 ei ng is su ed fo rs uc ha po st is su an ce pr ov is io ni ng as pe ci al pr oc es si sn ec es sa ry th is pr oc es sm us te ns ur et ha to 375 ya tr us te da tt ri bu te pr ov id er ma wr ri te an at tr ib ut eo nt oa ca rd ch ip an at tr ib ut ep ro vi de ri st yp ic al ly tr ig ge re db ya se rv ic ep ro vi de rf or in st an ce an eg ov er 095 en tp or ta lo ra no 375 in es ho pi no rd er to ac hi ev ea us er ce nt ri ca tt ri bu te ma na ge me nt it is us ua ll yd es ir ed to se pa ra te se rv ic ep ro vi de rs fr om at tr ib ut ep ro vi de rs an 463 or eo ve rt om 255 et he am ut ua ll yi sv is ib le in or de rt os up po rt su ch ap ro ce ss th ee nh an ce dr ol ea ut he nt ic at io np ro to co le ra wa sd es ig ne de ra is at 094 ee pa rt yp ro to co lt ha ti sa pp li ed be tw ee na ch ip ca rd un de rt he co nt ro lo fi ts ho ld er vi as ma rt ca rd mi dd le wa re as er vi ce pr ov id er an da na tt ri bu te pr ov id er ge ne ra ll yt hi sp ro to co lc om bi ne sa us er ce nt er ed ap pr oa ch wi th th es er vi ce pr ov id er an dt he op er at or ha vi ng no di re ct co nn ec ti on wi th th es ys te ms

  10. #10 George Lasry
    14. Februar 2017

    Very impressive, Norbert. You are so fast and efficient!. I am still stuck with solutions only for 5000 bigrams (twice the length). A couple of questions:
    1) What was your target function? Quads or tris or pentas?
    2) Did you normalize using a log function?
    3) How did you build your database?
    4) How exactly do you evaluate after each change? It is not needed to redecrypt the whole thing I guess.
    5) What is your programming language?

    Thx
    George

  11. #11 Armin
    14. Februar 2017

    Great job, Norbert!

    Using your crib I found the following plaintext, that completes your solution:

    EVERY TIME YOU LOG INTO A COMPUTER THE OPERATING SYSTEM CHECKS WHETHER YOU ARE AN AUTHORIZED USER WHEN YOU WITHDRAW CASH FROM AN ATM MACHINE THE MACHINE CHECKS WHETHER YOU ARE A LEGITIMATE ACCOUNT OWNER IF YOU ENTER A FACTORY THE GATEKEEPER CHECKS WHETHER YOU ARE AN APPROVED VISITOR OF THE FACILITY IN EACH OF THESE CASES THERE IS CHECK TO DETERMINE THE AUTHENTICITY OF THE CLAIM AND AUTHORIZATION OF THE ACTION HOWEVER IT IS TECHNICALLY SUFFICIENT TO CHECK THE VALIDITY OF THE RIGHT TO LOG IN WITHDRAW CASH OR ACCESS FACILITY WITHOUT REVEALING YOUR IDENTITY TECHNICALLY STATED IN MANY OF TODAYS AUTHENTICATION PROCESSES AN IDENTITY OF A PERSON OR MACHINE IS VALIDATED ALTHOUGH IT IS ALSO POSSIBLE CHECK ONLY A PARTICULAR ATTRIBUTE OF AN IDENTITY WITHOUT DISCLOSING THE FULL IDENTITY DATA VALUE TYPICAL ATTRIBUTES THAT CAN BE VALIDATED ARE AGE ADDRESS DATA ACCESS PERMISSIONS AND PROCURATION ATTRIBUTE BASED AUTHENTICATION HAS SEVERAL ADVANTAGES ONLY PARTIAL IDENTITY DATA IS DISCLOSED AND LIMITED TO SPECIFIC ATTRIBUTES DATA PROTECTION AN ATTRIBUTE MAY HAVE A VALIDITY PERIOD THAT IS DIFFERENT FROM THE ONE OF THE IDENTITY AN ATTRIBUTE CAN BE ASSERTED TO OR CALCULATED FROM ALREADY EXISTING IDENTITY DATA AN ATTRIBUTE CAN BE CHANGED WITHOUT CHANGING THE IDENTITY IN SOME CASES ATTRIBUTES CAN TO BE VALIDATED WITH THE USE OF ATTRIBUTE STATEMENTS FOR EXAMPLE THE PROCESS OF AGE VERIFICATION USUALLY DOES NOT REQUIRE THE SPECIFIC BIRTHDATE ITSELF BUT RATHER A CALCULATED VALUE ABOUT THE ATTRIBUTE STATEMENT SUCH AS PERSON IS OVER EIGHTTEEN ATTRIBUTES ARE WELL SUITED FOR SMARTCARD AUTHENTICATION ESPECIALLY IN THE FIELD OF EID DOCUMENTS FOR INSTANCE THE PROOF OF PERMISSION OR ENTITLEMENT FOR DRIVING OR PURSUING A PROFESSION DOCTOR LAWYER OR SIMILAR CAN BE AUTHENTICATED PROOF OF REGISTRATION FOR PROOF OF INSURANCE PROFESSIONAL ORGANIZATIONS OR SIMILAR ARE ADDITIONAL ASPECTS OF ATTRIBUTE CALCULATION ATTRIBUTES HAVE BEEN WELL KNOWN FOR DECADES AS A PART OF ATTRIBUTE CERTIFICATES AN ATTRIBUTE CERTIFICATE AS STANDARDIZED IN THE X FIVE O NINE FRAMEWORK CAN BE THOUGHT OF AS A DIGITAL CERTIFICATE WITHOUT A PUBLIC KEY THE SIGNATURE COMES FROM A CERTIFICATION AUTHORITY CA HOWEVER THIS KIND OF ATTRIBUTE HANDLING DOESNT PLAY A ROLE IN MODERN EID CARD SYSTEMS ONE REASON FOR THIS IS THAT IT IS TOO COMPLICATED TO INVOLVE A CA IN EVERY CERTIFICATE GENERATION IN ADDITION ATTRIBUTE CERTIFICATES HAVE TO BE STORED ON THE CARD DURING PERSONALIZATION WHICH PREVENTS THE POSSIBILITY OF CHANGING THEM DURING LIFETIME OF THE DOCUMENT FURTHER IF A PARTY IS ALLOWED TO READ THE ATTRIBUTE CERTIFICATE ALL STORED ATTRIBUTES ARE DISCLOSED DURING THIS PROCESS AND CAN BE UTILIZED EVEN OUTSIDE THE CARD FOR THESE REASONS ATTRIBUTES TODAY ARE TYPICALLY PROVIDED BY THE CARD ITSELF THEY MAY EITHER BE SIGNED BY THE CARD OR PROVIDED TO EXTERNAL SYSTEMS VIA AN AUTHENTICATED CHANNEL AN ATTRIBUTE THAT IS PROTECTED IN SUCH A WAY CAN BE REFERRED TO AS A CREDENTIAL BECAUSE ITS ORIGINALITY AND AUTHENTICITY CAN BE PROVEN CREDENTIALS CREATED IN SUCH A WAY INCLUDE ONLY THE NECESSARY ATTRIBUTES OF THE CARD HOLDER AND WHICH EXPLICIT PERMISSIONS ARE GRANTED AN INNOVATIVE WAY TO HANDLE ATTRIBUTES AND TO ASSIGN PERMISSIONS TO THEM HAS BEEN IMPLEMENTED IN THE GERMAN ID CARD BY MEANS OF A DEDICATED RIGHTS MANAGEMENT AND ENCODED IN SELF DESCRIBING CERTIFICATES PRESENT ON THE CARD EACH ATTRIBUTE CAN ONLY BE READ OUT BY A SERVICE PROVIDER WITH THE EXPLICIT CONSENT OF THE HOLDER AND WITH DEDICATED PERMISSION GIVEN BY A FEDERAL CA FOR THIS PERMISSION THE SERVICE PROVIDER HAS TO APPLY FOR AND THE PERMISSION FOR THEBUSINESS CASE IS HANDED OVER IN FORM OF A CARD VERIFIABLE CERTIFICATE THAT IS PROVEN BY THE ID CARD DURING AUTHENTICATION OF THE SERVICE PROVIDER IT IS CLEAR THAT CREDENTIALS SIGNED BY THE CARD ARE MORE SECURE THAN THOSE ONLY PROVIDED VIA AN AUTHENTICATED CHANNEL HOW THEY WORK AND HOW NEW ATTRIBUTES ARE CREATED DURING THEVALIDITY PERIOD OF THE DOCUMENT IS EXPLAINED BELOW EVEN WHEN AN ATTRIBUTE IS PROVIDED BY A SMARTCARD IT IS OF COURSE INDEPENDENT FROM IT IT IS THEREFORE POSSIBLE THAT ONE SERVICE PROVIDER PASSES AN ATTRIBUTE TO ANOTHER SERVICE WHEN THIS ATTRIBUTE IS DIGITALLY SIGNED THE RECEIVER CAN BE SURE THAT THE CONTENT IS CORRECT IF THERE IS NO DIGITAL SIGNATURE THE RECEIVING SERVICE PROVIDER HAS TO TRUST THE SOURCE OF THE ATTRIBUTE AND THE COMMUNICATION CHANNEL HOWEVER HANDLING ATTRIBUTES INDEPENDENTLY FROM A CARD IS OFTEN NOT DESIRABLE AS IT MAKES THE CARD HOLDER CAN NOT DIRECTLY CONTROL A PIECE OF PERSONAL INFORMATION WHICH MAY BE SHARED WITH OTHER SERVICES THEREFORE MODERN ATTRIBUTE MANAGEMENT IS ORGANIZED IN A USER CENTRIC WAY THIS IS BASED ON PROCESSES IN WHICH ATTRIBUTES OR ASSERTED STATEMENTS ARE PROVIDED BY THE CARD ITSELF OF COURSE THIS APPROACH IS MORE COMPLEX THAN THE CONCEPT OF FREELY FLOATING ATTRIBUTES BUT IT RENDERS MORE DATA PROTECTION AND A HIGHER DEGREE OF PRIVACY WHILE AN ATTRIBUTE CAN BE WRITTEN TO A SMARTCARD DURING THE PRODUCTION PROCESS THE FULL POTENTIAL OF THIS TECHNOLOGY IS ONLY AVAILABLE IF EXISTING CARDS CAN BE UPDATED ATTRIBUTES AFTER BEING ISSUED FOR SUCH A POST ISSUANCE PROVISIONING A SPECIAL PROCESS IS NECESSARY THIS PROCESS MUST ENSURE THAT ONLY A TRUSTED ATTRIBUTE PROVIDER MAY WRITE AN ATTRIBUTE ONTO A CARD CHIP AN ATTRIBUTE PROVIDER IS TYPICALLY TRIGGERED BY A SERVICE PROVIDER FOR INSTANCE AN E GOVERNMENT PORTAL OR AN ONLINE SHOP IN ORDER TO ACHIEVE A USER CENTRIC ATTRIBUTE MANAGEMENT IT IS USUALLY DESIRED TO SEPARATE SERVICE PROVIDERS FROM ATTRIBUTE PROVIDERS AND MOREOVER TO MAKE THEM MUTUALLY INVISIBLE IN ORDER TO SUPPORT SUCH A PROCESS THE ENHANCED ROLE AUTHENTICATION PROTOCOL ERA WAS DESIGNED ERA IS A THREE PARTY PROTOCOL THAT IS APPLIED BETWEEN A CHIP CARD UNDER THE CONTROL OF ITS HOLDER VIA SMARTCARD MIDDLEWARE A SERVICE PROVIDER AND AN ATTRIBUTE PROVIDER GENERALLY THIS PROTOCOL COMBINES A USER CENTERED APPROACH WITH THE SERVICE PROVIDER AND THE OPERATOR HAVING NO DIRECT CONNECTION WITH THE SYSTEMS

  12. #12 Klaus Schmeh
    14. Februar 2017

    @Norbert, Armin: Great job!

    Here’s the cleartext:

    EVERYTIMEYOULOGINTOACOMPUTERTHEOPERATINGSYSTEMCHECKSWHETHER
    YOUAREANAUTHORIZEDUSERWHENYOUWITHDRAWCASHFROMANATMMACHINETH
    EMACHINECHECKSWHETHERYOUAREALEGITIMATEACCOUNTOWNERIFYOUENTE
    RAFACTORYTHEGATEKEEPERCHECKSWHETHERYOUAREANAPPROVEDVISITORO
    FTHEFACILITYINEACHOFTHESECASESTHEREISCHECKTODETERMINETHEAUT
    HENTICITYOFTHECLAIMANDAUTHORIZATIONOFTHEACTIONHOWEVERITISTE
    CHNICALLYSUFFICIENTTOCHECKTHEVALIDITYOFTHERIGHTTOLOGINWITHD
    RAWCASHORACCESSFACILITYWITHOUTREVEALINGYOURIDENTITYTECHNICA
    LLYSTATEDINMANYOFTODAYSAUTHENTICATIONPROCESSESANIDENTITYOFA
    PERSONORMACHINEISVALIDATEDALTHOUGHITISALSOPOSSIBLECHECKONLY
    APARTICULARATTRIBUTEOFANIDENTITYWITHOUTDISCLOSINGTHEFULLIDE
    NTITYDATAVALUETYPICALATTRIBUTESTHATCANBEVALIDATEDAREAGEADDR
    ESSDATAACCESSPERMISSIONSANDPROCURATIONATTRIBUTEBASEDAUTHENT
    ICATIONHASSEVERALADVANTAGESONLYPARTIALIDENTITYDATAISDISCLOS
    EDANDLIMITEDTOSPECIFICATTRIBUTESDATAPROTECTIONANATTRIBUTEMA
    YHAVEAVALIDITYPERIODTHATISDIFFERENTFROMTHEONEOFTHEIDENTITYA
    NATTRIBUTECANBEASSERTEDTOORCALCULATEDFROMALREADYEXISTINGIDE
    NTITYDATAANATTRIBUTECANBECHANGEDWITHOUTCHANGINGTHEIDENTITYI
    NSOMECASESATTRIBUTESCANTOBEVALIDATEDWITHTHEUSEOFATTRIBUTEST
    ATEMENTSFOREXAMPLETHEPROCESSOFAGEVERIFICATIONUSUALLYDOESNOT
    REQUIRETHESPECIFICBIRTHDATEITSELFBUTRATHERACALCULATEDVALUEA
    BOUTTHEATTRIBUTESTATEMENTSUCHASPERSONISOVEREIGHTTEENATTRIBU
    TESAREWELLSUITEDFORSMARTCARDAUTHENTICATIONESPECIALLYINTHEFI
    ELDOFEIDDOCUMENTSFORINSTANCETHEPROOFOFPERMISSIONORENTITLEME
    NTFORDRIVINGORPURSUINGAPROFESSIONDOCTORLAWYERORSIMILARCANBE
    AUTHENTICATEDPROOFOFREGISTRATIONFORPROOFOFINSURANCEPROFESSI
    ONALORGANIZATIONSORSIMILARAREADDITIONALASPECTSOFATTRIBUTECA
    LCULATIONATTRIBUTESHAVEBEENWELLKNOWNFORDECADESASAPARTOFATTR
    IBUTECERTIFICATESANATTRIBUTECERTIFICATEASSTANDARDIZEDINTHEX
    FIVEONINEFRAMEWORKCANBETHOUGHTOFASADIGITALCERTIFICATEWITHOU
    TAPUBLICKEYTHESIGNATURECOMESFROMACERTIFICATIONAUTHORITYCAHO
    WEVERTHISKINDOFATTRIBUTEHANDLINGDOESNTPLAYAROLEINMODERNEIDC
    ARDSYSTEMSONEREASONFORTHISISTHATITISTOOCOMPLICATEDTOINVOLVE
    ACAINEVERYCERTIFICATEGENERATIONINADDITIONATTRIBUTECERTIFICA
    TESHAVETOBESTOREDONTHECARDDURINGPERSONALIZATIONWHICHPREVENT
    STHEPOSSIBILITYOFCHANGINGTHEMDURINGLIFETIMEOFTHEDOCUMENTFUR
    THERIFAPARTYISALLOWEDTOREADTHEATTRIBUTECERTIFICATEALLSTORED
    ATTRIBUTESAREDISCLOSEDDURINGTHISPROCESSANDCANBEUTILIZEDEVEN
    OUTSIDETHECARDFORTHESEREASONSATTRIBUTESTODAYARETYPICALLYPRO
    VIDEDBYTHECARDITSELFTHEYMAYEITHERBESIGNEDBYTHECARDORPROVIDE
    DTOEXTERNALSYSTEMSVIAANAUTHENTICATEDCHANNELANATTRIBUTETHATI
    SPROTECTEDINSUCHAWAYCANBEREFERREDTOASACREDENTIALBECAUSEITSO
    RIGINALITYANDAUTHENTICITYCANBEPROVENCREDENTIALSCREATEDINSUC
    HAWAYINCLUDEONLYTHENECESSARYATTRIBUTESOFTHECARDHOLDERANDWHI
    CHEXPLICITPERMISSIONSAREGRANTEDANINNOVATIVEWAYTOHANDLEATTRI
    BUTESANDTOASSIGNPERMISSIONSTOTHEMHASBEENIMPLEMENTEDINTHEGER
    MANIDCARDBYMEANSOFADEDICATEDRIGHTSMANAGEMENTANDENCODEDINSEL
    FDESCRIBINGCERTIFICATESPRESENTONTHECARDEACHATTRIBUTECANONLY
    BEREADOUTBYASERVICEPROVIDERWITHTHEEXPLICITCONSENTOFTHEHOLDE
    RANDWITHDEDICATEDPERMISSIONGIVENBYAFEDERALCAFORTHISPERMISSI
    ONTHESERVICEPROVIDERHASTOAPPLYFORANDTHEPERMISSIONFORTHEBUSI
    NESSCASEISHANDEDOVERINFORMOFACARDVERIFIABLECERTIFICATETHATI
    SPROVENBYTHEIDCARDDURINGAUTHENTICATIONOFTHESERVICEPROVIDERI
    TISCLEARTHATCREDENTIALSSIGNEDBYTHECARDAREMORESECURETHANTHOS
    EONLYPROVIDEDVIAANAUTHENTICATEDCHANNELHOWTHEYWORKANDHOWNEWA
    TTRIBUTESARECREATEDDURINGTHEVALIDITYPERIODOFTHEDOCUMENTISEX
    PLAINEDBELOWEVENWHENANATTRIBUTEISPROVIDEDBYASMARTCARDITISOF
    COURSEINDEPENDENTFROMITITISTHEREFOREPOSSIBLETHATONESERVICEP
    ROVIDERPASSESANATTRIBUTETOANOTHERSERVICEWHENTHISATTRIBUTEIS
    DIGITALLYSIGNEDTHERECEIVERCANBESURETHATTHECONTENTISCORRECTI
    FTHEREISNODIGITALSIGNATURETHERECEIVINGSERVICEPROVIDERHASTOT
    RUSTTHESOURCEOFTHEATTRIBUTEANDTHECOMMUNICATIONCHANNELHOWEVE
    RHANDLINGATTRIBUTESINDEPENDENTLYFROMACARDISOFTENNOTDESIRABL
    EASITMAKESTHECARDHOLDERCANNOTDIRECTLYCONTROLAPIECEOFPERSONA
    LINFORMATIONWHICHMAYBESHAREDWITHOTHERSERVICESTHEREFOREMODER
    NATTRIBUTEMANAGEMENTISORGANIZEDINAUSERCENTRICWAYTHISISBASED
    ONPROCESSESINWHICHATTRIBUTESORASSERTEDSTATEMENTSAREPROVIDED
    BYTHECARDITSELFOFCOURSETHISAPPROACHISMORECOMPLEXTHANTHECONC
    EPTOFFREELYFLOATINGATTRIBUTESBUTITRENDERSMOREDATAPROTECTION
    ANDAHIGHERDEGREEOFPRIVACYWHILEANATTRIBUTECANBEWRITTENTOASMA
    RTCARDDURINGTHEPRODUCTIONPROCESSTHEFULLPOTENTIALOFTHISTECHN
    OLOGYISONLYAVAILABLEIFEXISTINGCARDSCANBEUPDATEDATTRIBUTESAF
    TERBEINGISSUEDFORSUCHAPOSTISSUANCEPROVISIONINGASPECIALPROCE
    SSISNECESSARYTHISPROCESSMUSTENSURETHATONLYATRUSTEDATTRIBUTE
    PROVIDERMAYWRITEANATTRIBUTEONTOACARDCHIPANATTRIBUTEPROVIDER
    ISTYPICALLYTRIGGEREDBYASERVICEPROVIDERFORINSTANCEANEGOVERNM
    ENTPORTALORANONLINESHOPINORDERTOACHIEVEAUSERCENTRICATTRIBUT
    EMANAGEMENTITISUSUALLYDESIREDTOSEPARATESERVICEPROVIDERSFROM
    ATTRIBUTEPROVIDERSANDMOREOVERTOMAKETHEMMUTUALLYINVISIBLEINO
    RDERTOSUPPORTSUCHAPROCESSTHEENHANCEDROLEAUTHENTICATIONPROTO
    COLERAWASDESIGNEDERAISATHREEPARTYPROTOCOLTHATISAPPLIEDBETWE
    ENACHIPCARDUNDERTHECONTROLOFITSHOLDERVIASMARTCARDMIDDLEWARE
    ASERVICEPROVIDERANDANATTRIBUTEPROVIDERGENERALLYTHISPROTOCOL
    COMBINESAUSERCENTEREDAPPROACHWITHTHESERVICEPROVIDERANDTHEOP
    ERATORHAVINGNODIRECTCONNECTIONWITHTHESYSTEMS

  13. #13 Thomas
    15. Februar 2017

    Congratulations!

    @Norbert, George, Armin or WIMC:

    If I’m not mistaken (I’m no programmer) hill climbing and simulated annealing algorithms are “blindfolded” choosing a certain bigram substitution table (key). Wouldn’t it be even faster if the bigram frequencies of the ciphertext were taken into account from the beginning to rule out impossible and improbable substitutions? I think the problem might be to group the ciphertext and plaintext bigrams by frequencies before decrypting.

  14. #14 Norbert
    16. Februar 2017

    @George: I am afraid my hit was by mere chance! Anyway, here is my “setup”: I used the quadgram freqencies provided here. The frequencies are logarithmized, and then the values for each quadgram of the tentative plaintext summed up. For the 2500-letter challenge, I also tried quintgrams and another approach, which is to compute the bigram distribution of the plaintext and compare it to the expected distribution (chi square). Neither approach has succeeded yet (the latter seems to be the most unusable, unless I implemented it wrongly). I did not optimize my code in terms of redundant recalculating of unchanged plaintext parts. I am happy if the programming takes not too much time, rather I allow my C++ code to run all night 😉

    @Thomas: Depending on the details of the hill climbing algorithm, it is indeed often a good idea to start with a probable key, so the program may at first try to assign each cipher number to a bigram with a corresponding frequency. However, the simulated annealing algorithm will in its first stage turn any initial key into a “chaotic” state. But the programmer is free to decide that certain cipher numbers will be restricted to one or several plaintext bigrams. It’s not a big deal to integrate such a feature into the program. Maybe this is the way Armin used my plaintext suggestion for challenge #2 as a “crib”.

  15. #15 George Lasry
    16. Februar 2017

    Hi Norbert

    Many thanks for the details. I tried something very similar, but with hill climbing. Seems that SA is more appropriate. Congrats again!.

    George

  16. #16 Norbert
    17. Februar 2017

    At last, a simulated annealing approach with quintgram scoring and a guessed crib (“there are”) was successful toward challenge #1:

    there are several technologies which can be used to prevent attacks on eids the most important ones are cryptography biometry secure storage and pin query in this work only cryptography is covered cryptography provides for a broad range of methods to improve the security of an eid among the attacks mentioned above complete fraud cloning alteration unauthorized reading eavesdropping and man in the middle can be prevented with cryptography usage by the wrong persons would be addressed with biometry while illegal obtaining is prevented not with technical but with organisational means for preventing denial of service and marking cryptography is also not the major technology however it is possible to abuse cryptographic tools to start denial of service or marking attacks especially the question of marking is very interesting because it is possible to use covert channels in cryptographic protocols for this purpose however this important question is out of scope in this work the first purpose of this paper is to classify the cryptographic techniques that make an eid secure these cryptographic techniques can be divided into three groups authentication of the chip unit authentication of the terminal unit and secure data exchange among the most important cryptographic tools used for securing eids are the ones that are used for authenticating the card chip itself chip authentication or the content of the card chip content authentication or a statement about the eid statement authentication these tools are summarised under authentication of the chip unit authentication of the chip unit can prevent complete fraud cloning and alteration it can be achieved with both asymmetric and symmetric cryptography this authentication is accomplished with a challenge response protocol the terminal unit sends a challenge and the chip replies with a response chip authentication can prevent complete fraud and cloning but cannot prevent alteration the challenge response protocol is performed with either a secret or a private key in both cases the key must be stored on the chip in a secure way in addition the chip must be capable of computing a challenge out of a response which means that a memory chip is not suitable in other words a microcontroller is necessary in the symmetric case the chip has stored a secret key e g an aes key which is used for the challenge response protocol both the card and the verifier need to know this key symmetric card authentication is mainly recommendable in environments where there are only few terminal units in case of large number of checking units the management of the secret keys gets complicated nevertheless there are card systems where a large number of cards and checking units only use symmetric keys e g aes ecards are often realised this way in case of asymmetric chip authentication the chip uses a private key e g rsa key for the challenge response protocol the terminal unit needs the public key usually provided as a digital

  17. #17 Jim
    17. Februar 2017

    Well done, Norbert! I was still messing with initialization strategies without much success.

  18. #18 Hendrik
    18. Februar 2017

    @Norbert Congratulations!
    I tried it for myself with simulated annealing and quintgram scoring but was still struggling with local maximums and inefficent code 😉

  19. #19 Klaus Schmeh
    18. Februar 2017

    Bart Wenmeckers via Faccebook:
    He is too quick for me.. well done to Norbert. I am curious if my solver will break it.

  20. #20 Klaus Schmeh
    18. Februar 2017

    @Norbert: Congratulations! Great job!
    Your work proves that a bigram substitution applied on a 2500 letter cleartext can be broken. This means that this method should only be used if the cleartext is considerably shorter than 2500 letters. However, creating a table containg 676 bigrams (i. e. 1352 letters) to encrypt, say, a 1500 letter plaintext doesn’t make much sense, as one can use the One Time Pad instead. All in all, I see no reason to use the bigram substitution at all.