Subj : About Unicode
To   : netscape.public.mozilla.jseng
From : Jun Kim
Date : Mon Nov 22 2004 11:37 am

Hi. I have a question about Unicode. I'm not sure whether this is the right place to ask, but I'll ask anyway; any guidance would be appreciated.

I'm working with Unicode text, specifically Korean. I thought the JS engine supported Unicode strings, but apparently it does not (or maybe I'm doing something wrong). Here is the piece of code:

js> var a = "가나다";   // the string is Korean (Unicode)
js> a.charAt(1);        // this returns a garbage character

The problem I ran into is calling charAt() on a Unicode string. I traced through the engine, and when charAt() is called, the JSString is converted to a JSDependentString. (Am I right?) The function that returns the char * from the JSString goes through all the #define macros and gets the address of the actual string memory, and the memory location it ends up pointing at is one byte before where the character actually sits, going by ASCII.

For instance, if the string is "abc", the allocated memory looks like this:

  61 00 62 00 63 00 00 00    // a.b.c..

charAt() is called with 1 as the index, and when it returns, the data points to the '00' right before 'b':

  61 00 62 00 63 00 00 00    // a.b.c..
        ^

and so 'b' prints out.

With Unicode (Korean), the memory is allocated so that four bytes make up one character, like this:

  B1 00 C1 00 8F 00 DA 00 00 00

where B1C1 is one Unicode character and 8FDA is another one. (I just made these values up, so I don't know what they actually represent. :) )

Here comes the problem. When charAt(1) is called and the conversions take place, the string ends up pointing to the '00' right before C1:

  B1 00 C1 00 8F 00 DA 00 00 00
        ^

so the output prints the C1 code, or maybe the C18F code (I don't know exactly).

So this is where I call for help. HELP!!! :) Am I doing something wrong?
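
In case it helps to see what I mean, here is a small shell snippet I can run to look at what the engine actually stores for the string (just a sketch; it assumes the shell's print() function and the same string literal as above):

js> var a = "가나다";   // same Korean string as above
js> a.length;           // 3 if each Korean character is one 16-bit unit, 6 if each byte became its own unit
js> for (var i = 0; i < a.length; i++)
      print(i + ": " + a.charCodeAt(i).toString(16));   // dump each 16-bit code unit in hex

I'm not sure what values this will print on my setup, but it at least shows exactly which units charAt() is indexing into.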