this directory contains stuff to support the mapping of han characters
from one encoding into another. currently, we do japanese (kana and
kanji) and chinese (big5 and gb). the mapping is based on kuten codes,
big5 codes or gb codes respectively; if you need to do characters that
don't have them, you are stuck.

the basic mapping sources are desc.han (unicode vol 1 tables) and
desc.cjk (encoded form of the unicode volume 2 tables).
the overall product is kuten.c; the mkfile tells you how.

the intermediate goal is to generate a jis.font which is a mapping
from kuten codes into unicodes. it is incomplete, which is why
supplementary mappings in jis.weird are added. the han.bug mappings
come from han.raw which is what the word processing pool typed in
from a pre-release copy of volume 2 of unicode.
han.bwk contains minor adjustments to the mapping. its format is obvious.

big5 and gb use a simpler structure, simply a mapping file.

the files *.bits contain the 16x16 and 24x24 bitmap fonts for jis 208.
they also contain the kuten codes for jis 208.

Descriptions of the Chinese Fonts:

    cclib16fs.bdf.Z   Simplified characters, GB encoding, 16x16 Fang Song Style.
    cclib16st.bdf.Z   Simplified characters, GB encoding, 16x16 Song Style.
    hku-ch16.bdf.Z    Traditional characters, BIG5 encoding, 16x16 unnamed style.
    cfan24.ccf.Z      Traditional characters, QW encoding, 24x24 Song Style.
    cfang24.ccf.Z     Simplified characters, QW encoding, 24x24 Fang Song Style.
    chei24.ccf.Z      Simplified characters, QW encoding, 24x24 Hei Style.
    ckai24.ccf.Z      Simplified characters, QW encoding, 24x24 Kai Style.
    csong24.ccf.Z     Simplified characters, QW encoding, 24x24 Song Style.

1) Traditional characters are the original Chinese characters. They
are used in Hong Kong, Taiwan, and most overseas Chinese communities.
Simplified characters were introduced for convenience of use. They
are used in Mainland China and Singapore. Some simplified characters
were created from their traditional counterparts by removing certain
strokes.
Others are themselves traditional characters but are also homonyms
of some more complex-shaped characters. Because homonyms were merged,
the simplified character set is smaller than the traditional set.

2) Here "GB encoding" refers to the standard "Code of Chinese Graphic
Character for Information Interchange, Primary Set (GB2312-80)",
published by the National Standards Bureau, Beijing, China in 1980.

3) Here "BIG5 encoding" refers to the HKU Big5 encoding, a variant of
the widely used de facto standard known as "Big5". It differs slightly
from the ETen encoding.

4) Here "QW encoding" refers to the QUWEI code, a variant of GB.
QW maps to GB by the following formula:

    {
        char qw[4];
        int gb;

        gb = ((qw[0] - '0')*10 + (qw[1] - '0') + 160) * 256
           + ((qw[2] - '0')*10 + (qw[3] - '0') + 160);
    }

usage:
    it is always safe to say
        mk -s all clean
    the only product is ../kuten.c, and it is only rewritten if its
    contents would change.

Attributions:
    The big5 stuff comes from HKU and is freely available by ftp
    (although we only use the font, not any of the software).
    the contact name for this is kwchan%csd@hkujnt.bitnet

    the jis 208 bitmaps come from a file included with the X distribution.

    the main mapping files come from unicode. we, and not unicode,
    are responsible for the final results. you are responsible for
    how they are used.

    we don't yet know the attribution of, or copyright on, the
    chinese fonts.
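
worked example for note 4: the QW-to-GB formula can be wrapped in a small
C function. this is only a sketch; the name qw_to_gb is made up for
illustration and is not part of this directory, and it assumes the input
is exactly four ascii digits (two for the row, two for the cell).

```c
/*
 * Hypothetical helper (not part of this directory): convert a
 * four-digit QUWEI code to a GB code using the formula in note 4.
 * Assumes qw[0..3] are ASCII digits; no validation is done.
 */
int
qw_to_gb(const char qw[4])
{
    int row = (qw[0] - '0')*10 + (qw[1] - '0');   /* first two digits */
    int cell = (qw[2] - '0')*10 + (qw[3] - '0');  /* last two digits */

    /* add 160 (0xA0) to each half, row in the high byte */
    return (row + 160)*256 + (cell + 160);
}
```

for instance, qw_to_gb("1601") gives 0xB0A1: row 16 becomes 0xB0 and
cell 01 becomes 0xA1.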