PGP word list

Words for conveying data bytes in speech

The PGP Word List ("Pretty Good Privacy word list", also called a biometric word list for reasons explained below) is a list of words for conveying data bytes in a clear, unambiguous way via a voice channel. It is analogous in purpose to the NATO phonetic alphabet, except that a longer list of words is used, each word corresponding to one of the 256 distinct numeric byte values.

History and structure

The PGP Word List was designed in 1995 by Patrick Juola, a computational linguist, and Philip Zimmermann, creator of PGP.[1][2] The words were carefully chosen for their phonetic distinctiveness, using genetic algorithms to select lists of words that had optimum separations in phoneme space. The candidate word lists were randomly drawn from Grady Ward's Moby Pronunciator list as raw material for the search, and were successively refined by the genetic algorithms. The automated search converged to an optimized solution in about 40 hours on a DEC Alpha, a particularly fast machine in that era.

The Zimmermann–Juola list was originally designed to be used in PGPfone, a secure VoIP application, to allow the two parties to verbally compare a short authentication string to detect a man-in-the-middle (MiTM) attack. It was called a biometric word list because the authentication depended on the two human users recognizing each other's distinct voices as they read and compared the words over the voice channel, binding the identity of the speaker with the words, which helped protect against the MiTM attack. The list can be used in many other situations where a biometric binding of identity is not needed, so calling it a biometric word list may be imprecise.

Later, it was used in PGP to compare and verify PGP public key fingerprints over a voice channel. This is known in PGP applications as the "biometric" representation. When it was applied to PGP, the list of words was further refined, with contributions by Jon Callas.
More recently, it has been used in Zfone and the ZRTP protocol, the successor to PGPfone.

The list is actually composed of two lists, each containing 256 phonetically distinct words, in which each word represents a different byte value between 0 and 255. Two lists are used because reading aloud long random sequences of human words usually risks three kinds of errors: 1) transposition of two consecutive words, 2) duplicate words, or 3) omitted words. To detect all three kinds of errors, the two lists are used alternately for the even-offset bytes and the odd-offset bytes in the byte sequence. Each byte value is therefore represented by two different words, depending on whether that byte appears at an even or an odd offset from the beginning of the byte sequence.

The two lists are readily distinguished by the number of syllables: the even list has words of two syllables, the odd list has words of three. The two lists have a maximum word length of 9 and 11 letters, respectively. Using a two-list scheme was suggested by Zhahai Stewart.
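The error detection described above can be illustrated with a minimal sketch. It assumes a tiny excerpt of each list (four words each, taken from the table in this article) rather than the full 256-word lists, and the function name misplaced_positions is hypothetical, not part of PGP or ZRTP. The sketch only checks that words from the two lists alternate, which is what exposes a transposed, duplicated or omitted word.

```python
# Excerpt only: a handful of words from each list (the full lists have 256 each).
EVEN_WORDS = {"topmost", "miser", "Pluto", "treadmill"}      # two-syllable list
ODD_WORDS = {"Istanbul", "travesty", "vagabond", "Pacific"}  # three-syllable list


def misplaced_positions(words):
    """Return the offsets whose word does not come from the expected list.

    Offsets 0, 2, 4, ... must hold even-list words and offsets
    1, 3, 5, ... must hold odd-list words.
    """
    bad = []
    for offset, word in enumerate(words):
        expected = EVEN_WORDS if offset % 2 == 0 else ODD_WORDS
        if word not in expected:
            bad.append(offset)
    return bad


correct = ["topmost", "Istanbul", "Pluto", "vagabond"]
transposed = ["Istanbul", "topmost", "Pluto", "vagabond"]  # first two swapped
duplicated = ["topmost", "topmost", "Istanbul", "Pluto", "vagabond"]
omitted = ["topmost", "Pluto", "vagabond"]                 # "Istanbul" dropped

for name, seq in [("correct", correct), ("transposed", transposed),
                  ("duplicated", duplicated), ("omitted", omitted)]:
    print(name, misplaced_positions(seq))
```

Only the correct sequence yields an empty result. In each faulty sequence, at least one word lands at an offset that expects the other list, so a listener or a program can tell that something went wrong.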
Word lists

Here are the two lists of words as presented in the PGPfone Owner's Manual.[3]

Hex  Even Word  Odd Word
---  ---------  ----------
00   aardvark   adroitness
01   absurd     adviser
02   accrue     aftermath
03   acme       aggregate
04   adrift     alkali
05   adult      almighty
06   afflict    amulet
07   ahead      amusement
08   aimless    antenna
09   Algol      applicant
0A   allow      Apollo
0B   alone      armistice
0C   ammo       article
0D   ancient    asteroid
0E   apple      Atlantic
0F   artist     atmosphere
10   assume     autopsy
11   Athens     Babylon
12   atlas      backwater
13   Aztec      barbecue
14   baboon     belowground
15   backfield  bifocals
16   backward   bodyguard
17   banjo      bookseller
18   beaming    borderline
19   bedlamp    bottomless
1A   beehive    Bradbury
1B   beeswax    bravado
1C   befriend   Brazilian
1D   Belfast    breakaway
1E   berserk    Burlington
1F   billiard   businessman
20   bison      butterfat
21   blackjack  Camelot
22   blockade   candidate
23   blowtorch  cannonball
24   bluebird   Capricorn
25   bombast    caravan
26   bookshelf  caretaker
27   brackish   celebrate
28   breadline  cellulose
29   breakup    certify
2A   brickyard  chambermaid
2B   briefcase  Cherokee
2C   Burbank    Chicago
2D   button     clergyman
2E   buzzard    coherence
2F   cement     combustion
30   chairlift  commando
31   chatter    company
32   checkup    component
33   chisel     concurrent
34   choking    confidence
35   chopper    conformist
36   Christmas  congregate
37   clamshell  consensus
38   classic    consulting
39   classroom  corporate
3A   cleanup    corrosion
3B   clockwork  councilman
3C   cobra      crossover
3D   commence   crucifix
3E   concert    cumbersome
3F   cowbell    customer
40   crackdown  Dakota
41   cranky     decadence
42   crowfoot   December
43   crucial    decimal
44   crumpled   designing
45   crusade    detector
46   cubic      detergent
47   dashboard  determine
48   deadbolt   dictator
49   deckhand   dinosaur
4A   dogsled    direction
4B   dragnet    disable
4C   drainage   disbelief
4D   dreadful   disruptive
4E   drifter    distortion
4F   dropper    document
50   drumbeat   embezzle
51   drunken    enchanting
52   Dupont     enrollment
53   dwelling   enterprise
54   eating     equation
55   edict      equipment
56   egghead    escapade
57   eightball  Eskimo
58   endorse    everyday
59   endow      examine
5A   enlist     existence
5B   erase      exodus
5C   escape     fascinate
5D   exceed     filament
5E   eyeglass   finicky
5F   eyetooth   forever
60   facial     fortitude
61   fallout    frequency
62   flagpole   gadgetry
63   flatfoot   Galveston
64   flytrap    getaway
65   fracture   glossary
66   framework  gossamer
67   freedom    graduate
68   frighten   gravity
69   gazelle    guitarist
6A   Geiger     hamburger
6B   glitter    Hamilton
6C   glucose    handiwork
6D   goggles    hazardous
6E   goldfish   headwaters
6F   gremlin    hemisphere
70   guidance   hesitate
71   hamlet     hideaway
72   highchair  holiness
73   hockey     hurricane
74   indoors    hydraulic
75   indulge    impartial
76   inverse    impetus
77   involve    inception
78   island     indigo
79   jawbone    inertia
7A   keyboard   infancy
7B   kickoff    inferno
7C   kiwi       informant
7D   klaxon     insincere
7E   locale     insurgent
7F   lockup     integrate
80   merit      intention
81   minnow     inventive
82   miser      Istanbul
83   Mohawk     Jamaica
84   mural      Jupiter
85   music      leprosy
86   necklace   letterhead
87   Neptune    liberty
88   newborn    maritime
89   nightbird  matchmaker
8A   Oakland    maverick
8B   obtuse     Medusa
8C   offload    megaton
8D   optic      microscope
8E   orca       microwave
8F   payday     midsummer
90   peachy     millionaire
91   pheasant   miracle
92   physique   misnomer
93   playhouse  molasses
94   Pluto      molecule
95   preclude   Montana
96   prefer     monument
97   preshrunk  mosquito
98   printer    narrative
99   prowler    nebula
9A   pupil      newsletter
9B   puppy      Norwegian
9C   python     October
9D   quadrant   Ohio
9E   quiver     onlooker
9F   quota      opulent
A0   ragtime    Orlando
A1   ratchet    outfielder
A2   rebirth    Pacific
A3   reform     pandemic
A4   regain     Pandora
A5   reindeer   paperweight
A6   rematch    paragon
A7   repay      paragraph
A8   retouch    paramount
A9   revenge    passenger
AA   reward     pedigree
AB   rhythm     Pegasus
AC   ribcage    penetrate
AD   ringbolt   perceptive
AE   robust     performance
AF   rocker     pharmacy
B0   ruffled    phonetic
B1   sailboat   photograph
B2   sawdust    pioneer
B3   scallion   pocketful
B4   scenic     politeness
B5   scorecard  positive
B6   Scotland   potato
B7   seabird    processor
B8   select     provincial
B9   sentence   proximate
BA   shadow     puberty
BB   shamrock   publisher
BC   showgirl   pyramid
BD   skullcap   quantity
BE   skydive    racketeer
BF   slingshot  rebellion
C0   slowdown   recipe
C1   snapline   recover
C2   snapshot   repellent
C3   snowcap    replica
C4   snowslide  reproduce
C5   solo       resistor
C6   southward  responsive
C7   soybean    retraction
C8   spaniel    retrieval
C9   spearhead  retrospect
CA   spellbind  revenue
CB   spheroid   revival
CC   spigot     revolver
CD   spindle    sandalwood
CE   spyglass   sardonic
CF   stagehand  Saturday
D0   stagnate   savagery
D1   stairway   scavenger
D2   standard   sensation
D3   stapler    sociable
D4   steamship  souvenir
D5   sterling   specialist
D6   stockman   speculate
D7   stopwatch  stethoscope
D8   stormy     stupendous
D9   sugar      supportive
DA   surmount   surrender
DB   suspense   suspicious
DC   sweatband  sympathy
DD   swelter    tambourine
DE   tactics    telephone
DF   talon      therapist
E0   tapeworm   tobacco
E1   tempest    tolerance
E2   tiger      tomorrow
E3   tissue     torpedo
E4   tonic      tradition
E5   topmost    travesty
E6   tracker    trombonist
E7   transit    truncated
E8   trauma     typewriter
E9   treadmill  ultimate
EA   Trojan     undaunted
EB   trouble    underfoot
EC   tumor      unicorn
ED   tunnel     unify
EE   tycoon     universe
EF   uncut      unravel
F0   unearth    upcoming
F1   unwind     vacancy
F2   uproot     vagabond
F3   upset      vertigo
F4   upshot     Virginia
F5   vapor      visitor
F6   village    vocalist
F7   virus      voyager
F8   Vulcan     warranty
F9   waffle     Waterloo
FA   wallet     whimsical
FB   watchword  Wichita
FC   wayside    Wilmington
FD   willow     Wyoming
FE   woodlark   yesteryear
FF   Zulu       Yucatan

Examples

Each byte in a bytestring is encoded as a single word. A sequence of bytes is rendered in network byte order, from left to right. The leftmost byte (byte 0) is considered "even" and is encoded using the PGP Even Word table. The next byte to the right (byte 1) is considered "odd" and is encoded using the PGP Odd Word table. This process repeats until all bytes are encoded. Thus, "E582" produces "topmost Istanbul", whereas "82E5" produces "miser travesty".
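The encoding rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the PGP_WORDS dictionary holds only the two table rows needed for the "E582" and "82E5" examples (a full version would carry all 256 rows), the names encode and decode are hypothetical, and matching words case-insensitively when decoding is an assumption of this sketch.

```python
# Excerpt of the table: byte value -> (even word, odd word).
# Only the rows for 0x82 and 0xE5 are included here; a complete
# implementation would list all 256 rows.
PGP_WORDS = {
    0x82: ("miser", "Istanbul"),
    0xE5: ("topmost", "travesty"),
}


def encode(data: bytes) -> str:
    """Render bytes as words, alternating between the even and odd lists."""
    words = []
    for offset, byte in enumerate(data):
        even_word, odd_word = PGP_WORDS[byte]  # KeyError for rows not in the excerpt
        words.append(even_word if offset % 2 == 0 else odd_word)
    return " ".join(words)


def decode(text: str) -> bytes:
    """Turn words back into bytes; a word from the wrong list raises KeyError."""
    even_lookup = {even.lower(): value for value, (even, _) in PGP_WORDS.items()}
    odd_lookup = {odd.lower(): value for value, (_, odd) in PGP_WORDS.items()}
    out = bytearray()
    for offset, word in enumerate(text.split()):
        lookup = even_lookup if offset % 2 == 0 else odd_lookup
        out.append(lookup[word.lower()])
    return bytes(out)


print(encode(bytes.fromhex("E582")))  # topmost Istanbul
print(encode(bytes.fromhex("82E5")))  # miser travesty
print(decode("topmost Istanbul").hex().upper())  # E582
```

Because decode looks each word up only in the list that matches its offset, a swapped pair such as "Istanbul topmost" fails instead of silently producing different bytes.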
A PGP public key fingerprint that is displayed in hexadecimal as

E582 94F2 E9A2 2748 6E8B 061B 31CC 528F D7FA 3F19

would be displayed in PGP Words (the "biometric" fingerprint) as

topmost Istanbul Pluto vagabond treadmill Pacific brackish dictator goldfish Medusa afflict bravado chatter revolver Dupont midsummer stopwatch whimsical cowbell bottomless

The order of bytes in a bytestring depends on endianness.

Other word lists for data

There are several other word lists for conveying data in a clear, unambiguous way via a voice channel:

* the NATO phonetic alphabet maps individual letters and digits to individual words
* the S/KEY system maps 64-bit numbers to 6 short words of 1 to 4 characters each from a publicly accessible 2048-word dictionary. The same dictionary is used in RFC 1760 and RFC 2289.
* the Diceware system maps five base-6 random digits (almost 13 bits of entropy) to a word from a dictionary of 7,776 distinct words.
* the Electronic Frontier Foundation has published a set of improved word lists based on the same concept as Diceware[4]
* FIPS 181: Automated Password Generator converts random numbers into somewhat pronounceable "words".
* mnemonic encoding converts 32 bits of data into 3 words from a vocabulary of 1626 words.[5]
* what3words encodes geographic coordinates in 3 dictionary words.
* the BIP39 standard permits encoding a cryptographic key of fixed size (128 or 256 bits, usually the unencrypted master key of a cryptocurrency wallet) into a short sequence of readable words known as the seed phrase, for the purpose of storing the key offline. This is used in cryptocurrencies such as Bitcoin or Monero.
* Like the PGP word list, the Bytewords standard maps each possible byte to a word. There is only one list, rather than two.
The words are uniformly four letters long and can be uniquely identified by their first and last letters.

References

This article incorporates material that is copyrighted by PGP Corporation and has been licensed under the GNU Free Documentation License. (per Jon Callas, CTO, CSO PGP Corporation, 4-Jan-2007)

1. Juola, Patrick; Zimmermann, Philip (1996). "Whole-word phonetic distances and the PGPfone alphabet (Archived)" (PDF). Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96. Vol. 1. pp. 98–101. doi:10.1109/ICSLP.1996.607046. ISBN 0-7803-3555-4. S2CID 10385500. Archived from the original (PDF) on 7 September 2006. https://web.archive.org/web/20060907131751/https://www.mathcs.duq.edu/~juola/papers.d/icslp96.pdf
2. Juola, Patrick (1996). "Isolated Word Confusion Metrics and the PGPfone Alphabet". Proceedings of New Methods in Language Processing 2. Ankara, Turkey: Oxford University, Dept. of Experimental Psychology. arXiv:cmp-lg/9608021. Bibcode:1996cmp.lg....8021J.
3. "Archived copy". web.mit.edu. Archived from the original on 26 March 2010. Retrieved 12 January 2022. https://web.archive.org/web/20100326141145/http://web.mit.edu/network/pgpfone/manual/index.html#PGP000062
4. "EFF's New Wordlists for Random Passphrases". 19 July 2016. https://www.eff.org/deeplinks/2016/07/new-wordlists-random-passphrases
5. mnemonic encoding (http://www.tothink.com/mnemonic/), archived 2008-03-02 at the Wayback Machine, and updated code (https://github.com/singpolyma/mnemonicode)