https://www.johndcook.com/blog/2022/09/25/katakana-hiragana-unicode/ John D. Cook Skip to content * MATH + PROBABILITY + SIGNAL PROCESSING + NUMERICAL COMPUTING + SEE ALL ... * STATS + EXPERT TESTIMONY + FORECASTING + RNG TESTING + SEE ALL ... * PRIVACY + HIPAA + CRYPTOGRAPHY + DIFFERENTIAL PRIVACY * WRITING + BLOG + TWITTER + ARTICLES + TECH NOTES + SUBSCRIBE + NEWSLETTER * ABOUT + CLIENTS + ENDORSEMENTS + TEAM + SERVICES (832) 422-8646 Contact Katakana, Hiragana, and Unicode Posted on 25 September 2022 by John I figured out something that I wasn't able to find by searching, so I'm posting it here in case other people have the same question and the same difficulty finding an answer. I'm sure other people have written about this, but I couldn't find it. Maybe lots of people have written about this in Japanese but not many in English. Japanese kana consists of two syllabaries, hiragana and katakana, that are like phonetic alphabets. Each has 46 basic characters, and each corresponds to a block of 96 Unicode characters. I had two simple questions: 1. How do the 46 characters map into the 90 characters? 2. Do they map the same way for both hiragana and katakana? Hiragana / katakana correspondence I'll start with the second question because it's easier. Hiragana and katakana are different ways of representing the same sounds, and they correspond one to one. For example, the full name of U+3047 (e) is HIRAGANA LETTER SMALL E and the full name of its katakana counterpart U+30A7 (e) is KATAKANA LETTER SMALL E The only difference as far as Unicode goes is that katakana has three code points whose hiragana counterpart is unused, but these are not part of the basic letters. The following Python code shows that the names of all the characters are the same except for the name of the system. from unicodedata import name unused = [0, 151, 152] # not in hiragana for i in range(0,63): if i in unused: continue h = name(chr(0x3040 + i)) k = name(chr(0x30a0 + i)) assert(h == k.replace("KATAKANA", "HIRAGANA")) print("done") Mapping 46 into 50 and 96 You'll see kana written in grid with one side labeled with 5 vowels and the other labeled with 10 consonants called a gojuon (Wu Shi Yin ). That's 50 cells, and in fact gojuon literally means 50 sounds, so how do we get 46? Five cells are empty, and one letter doesn't fit into the grid. The empty cells are unused or archaic, and the extra character doesn't fit the grid structure. In the image below, the table on the left is for hiragana and the table on the right is for katakana. HTML versions of the tables available here. [gojuon] Left out of each table is n in hiragana and n in katakana. So does each set of 46 characters map into its Unicode code block? Unicode numbers the letters consecutively if you traverse the grid increasing vowels first, then consonants, and adding the straggler at the end. But the reason 46 letters expand into more code points is that each letter can have one, two, or three variations. And there are various miscellaneous other symbols in the Unicode block. For example, there is a LETTER E as well as the SMALL LETTER E mentioned above. Other variations seem to correspond to voiced and unvoiced versions of a consonant with a phonetic marker added to the voiced version. For example, ku is U+304F, HIRAGANA LETTER KU, and gu is U+3050, HIRAGANA LETTER GU. Here is how hiragana maps into Unicode. Each cell should be U+3000 plus the characters show. a i u e o 42 44 46 48 4A k 4B 4D 4F 51 53 s 55 57 59 5B 5D t 5F 61 64 66 68 n 6A 6B 6C 6D 6E h 6F 72 75 78 7B m 7E 7F 80 81 82 y 84 86 88 r 89 8A 8B 8C 8D w 8F 92 The corresponding table for katakana is the previous table plus 0x60: a i u e o A2 A4 A6 A8 AA k AB AD AF B1 B3 s B5 B7 B9 BB BD t BF C1 C4 C6 C8 n CA CB CC CD CE h CF D2 D5 D8 DB m DE DF E0 E1 E2 y E4 E6 E8 r E9 EA EB EC ED w EF F2 In each case, the letter missing from the table is the next consecutive value after the last in the table, i.e. n is U+30F3. Related posts * Graphing Japanese prefectures * Alphabets in Unicode Categories : Uncategorized Tags : Unicode Bookmark the permalink Post navigation Previous PostRoom squares and Tournaments Next PostVisualizing correlations with graphs Leave a Reply Your email address will not be published. Required fields are marked * [ ] [ ] [ ] [ ] [ ] [ ] [ ] Comment * [ ] Name * [ ] Email * [ ] Website [ ] [Post Comment] [ ] [ ] [ ] [ ] [ ] [ ] [ ] D[ ] Search for: [ ] [Search] John D. Cook John D. Cook, PhD, President My colleagues and I have decades of consulting experience helping companies solve complex problems involving data privacy, math, statistics, and computing. Let's talk. We look forward to exploring the opportunity to help your company too. [ ] [ ] [ ] [ ] [ ] [Send] [ ] [ ] [ ] [ ] [ ] [ ] [ ] D[ ] John D. Cook (c) All rights reserved. Search for: [ ] [Search] (832) 422-8646 EMAIL