K A N J I D I C
===============

This document describes the file "kanjidic", which is a merger of the
former files "jis1detl.lst" and "jis2detl.lst". It contains 2965 + 3388
lines, one for each of the JIS1 and JIS2 kanji.

The first file was compiled initially from the file "kinfo.dat" supplied
by Stephen Chung, who in turn compiled his file from a file prepared by
Mike Erickson. I originally added about 1900 "meanings" by James Heisig,
keyed in by Kevin Moore from the book "Remembering The Kanji". I later
added the ex-Nelson meanings from Rik Smoody's files.

The second file was compiled from a complete JIS2 list with Bushu and
stroke counts kindly supplied to me by Jon Crossley, to which I added
Nelson numbers, yomikata and meanings extracted from a dictionary file
prepared by Rik Smoody at Sony.

The file is being continually updated with extra and corrected yomikata,
Nelson numbers, meanings, etc. Theresa Martin has been of great
assistance with this, particularly with tracking down and correcting
many mistranscribed yomikata (the old zu/dzu, oo/ou, ji/dji, etc.
problems).

Kanjidic is now used to build the "kinfo.dat" file which is used by JDIC
and JREADER (and JWP if Stephen Chung ever finishes it). "kinfo.dat"
contains the identical information, but in a compressed form and in a
structure suitable for fast indexed access. It is also used in the XJDIC
program.

Each line is a mixture of ASCII and EUC. It contains:

  - the kanji (in EUC)
  - the ASCII of the two bytes of JIS code (for those who like such
    things. Actually it makes the file a bit easier to edit.)
  - the Nelson index number of the kanji in the form Nnnnn (where it
    does not exist there is N0)
  - the Nelson radical (Bushu) number in the form Bnnn
  - the stroke count in the form Snnn
  - the "Grade" of the kanji in the form Gn (i.e. the school grade in
    which the kanji is taught in Japan)
  - the ON readings from Nelson in katakana
  - the KUN readings in hiragana (note that for most of the JIS2 kanji,
    all the yomikata are at present in hiragana, but this is gradually
    being corrected). For some of the kanji okurigana are included; the
    okurigana portion is separated by a "(" (for historical reasons).
  - the "meanings" in the form {meaning_1} {..} {meaning_n}

As the files were first generated, there were many duplicated Nelson
numbers, i.e. multiple kanji with the same number. Some of these were
mistakes, but many came from the structure of Nelson itself. Nelson has
many of the alternative (and generally obscure) kanji in one of the
following forms:

    KANJI_1 see KANJI_2 nnnn

or

    KANJI_A pppp see KANJI_B nnnn

In the earlier version of JISnDETL.LST, all of these kanji had a Nelson
number of nnnn. As my JDIC program, in its kanji_search, requires unique
Nelson numbers, I have rationalized the numbering so that KANJI_1 will
have a Nelson number of 0, and a cross-reference in the meanings field
to Nnnnn. Similarly, KANJI_A will have a number of pppp and a
cross-reference to nnnn.

I seem to be the custodian of these files, so feel free to send me
corrections. DO NOT send me complete files; diff out YOUR corrections
and send me just those. Otherwise I will be trying to sort out
conflicting sets of corrections.

The jisNdetl.lst files will be available on the monu6 and other
archives.

Jim Breen (jwb@capek.rdt.monash.edu.au)
Department of Robotics & Digital Technology
Monash University, Victoria, Australia
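
For anyone wanting to read the file from a program, the line layout
described above can be picked apart roughly as in the following Python
sketch. It is only an illustration, not how JDIC does it. It assumes
that the fields are separated by white space, that the line has already
been decoded from EUC into a Unicode string, and that a reading counts
as ON when it begins with a katakana character; none of these points is
spelled out above. The function name parse_kanjidic_line and the sample
line are made up for the example.

```python
import re

# Field codes described in this document: Nelson number, Bushu, strokes, grade.
CODES = {"N": "nelson", "B": "bushu", "S": "strokes", "G": "grade"}


def parse_kanjidic_line(line):
    """Split one decoded kanjidic line into a dict (hypothetical helper)."""
    # Meanings may contain spaces, so collect the {...} groups first.
    meanings = re.findall(r"\{([^}]*)\}", line)
    # Assumption: the remaining fields are separated by white space.
    fields = re.sub(r"\{[^}]*\}", " ", line).split()

    entry = {
        "kanji": fields[0],   # the kanji itself
        "jis": fields[1],     # ASCII form of the two JIS bytes
        "nelson": None, "bushu": None, "strokes": None, "grade": None,
        "on": [], "kun": [], "meanings": meanings,
    }
    for tok in fields[2:]:
        if tok[0] in CODES and tok[1:].isdigit():
            entry[CODES[tok[0]]] = int(tok[1:])   # N0 stays 0: no Nelson entry
        elif "\u30a0" <= tok[0] <= "\u30ff":
            entry["on"].append(tok)               # starts with katakana
        else:
            entry["kun"].append(tok)              # okurigana "(" is kept as-is
    return entry


# Illustrative line in the described layout, not copied from the file.
sample = "一 306C N1 B1 S1 G1 イチ イツ ひと(つ {one}"
print(parse_kanjidic_line(sample))
```

Because most JIS2 yomikata are at present all in hiragana, the ON/KUN
split in this sketch is only as good as the data; such readings will all
land in the "kun" list.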