Subj : Re: About Unicode
To   : netscape.public.mozilla.jseng
From : Jens Thiele
Date : Mon Nov 22 2004 09:18 pm

Jun Kim wrote:
> Hi. I have a question about unicode.
> Well, I don't know whether I can ask this question here,
> but I'll ask anyway. Any guideline will be appreciated.
> I'm using the unicode language, preferably Korean.
> And I thought JS engine support unicode string, but apparently not.

ECMAScript (=>JavaScript) strings support Unicode. And the JS engine (I
assume you mean the one written in C, "SpiderMonkey") supports them, too.

ECMAScript Language Specification Edition 3 (24-Mar-00):
"8.4 The String Type
The String type is the set of all finite ordered sequences of zero or
more 16-bit unsigned integer values [...] When a string contains actual
textual data, each element is considered to be a single UTF-16 unit.
[...]
NOTE: The rationale behind these decisions was to keep the
implementation of Strings as simple and high-performing as possible. The
intent is that textual data coming into the execution environment from
outside (e.g., user input, text read from a file or received over the
network, etc.) be converted to Unicode Normalised Form C before the
running program sees it. Usually this would occur at the same time
incoming text is converted from its original character encoding to
Unicode (and would impose no additional overhead). Since it is
recommended that ECMAScript source code be in Normalised Form C, string
literals are guaranteed to be normalised (if source text is guaranteed
to be normalised), as long as they do not contain any Unicode escape
sequences."

> (or maybe I'm doing it wrong)
>
> Well, this is the piece of the codes:
>
> js> var a = "가나다"; // the string is unicode
> js> a.charAt(1); // this returns a trash character

This is probably the example shell included in the SpiderMonkey
distribution? It does not support Unicode, or perhaps better: it doesn't
do what is mentioned in the NOTE above.
It simply reads bytes/chars and passes them to JS_CompileScript:
http://lxr.mozilla.org/mozilla/source/js/src/js.c#357

See also:
http://www.mozilla.org/js/spidermonkey/apidoc/gen/api-JS_CompileScript.html
vs.
http://www.mozilla.org/js/spidermonkey/apidoc/gen/api-JS_CompileUCScript.html
(the byte string is treated as ISO-Latin)

Greetings
Jens
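
P.S. To make the effect concrete, here is a rough sketch of what probably happens to the string from the question. It assumes the terminal sent the text as UTF-8, and it uses TextDecoder, which is a modern API that is not available in the 2004 shell; the variable names are only for illustration. It mimics the byte-to-character inflation in plain JavaScript rather than calling the C API.

```javascript
// UTF-8 bytes of the three Korean syllables from the question
// (assumption: the terminal encodes input as UTF-8).
const utf8Bytes = [0xEA, 0xB0, 0x80, 0xEB, 0x82, 0x98, 0xEB, 0x8B, 0xA4];

// What a byte-oriented entry point effectively does: every byte becomes
// one 16-bit string element, as if the input were ISO-Latin-1.
const inflated = String.fromCharCode(...utf8Bytes);

// What should happen before the script sees the text: the bytes are
// decoded from their original encoding into UTF-16 units.
const decoded = new TextDecoder("utf-8").decode(Uint8Array.from(utf8Bytes));

console.log(inflated.length);                      // 9 elements, one per byte
console.log(inflated.charCodeAt(1).toString(16));  // b0, the "trash" character
console.log(decoded.length);                       // 3 elements, one per syllable
console.log(decoded.charAt(1) === "\uB098");       // true, the second syllable
```

So charAt(1) on the inflated string returns the second byte of the first syllable, not the second syllable. An embedder that converts the input to 16-bit units first and hands it to the UC variant of the compile function avoids this.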