Subj : Re: SpiderMonkey: JS_InitStandardClasses allways fails
To   : netscape.public.mozilla.jseng
From : =?ISO-8859-1?Q?Georg_Maa=DF?= <georg@bioshop.de>
Date : Tue May 06 2003 12:33 pm

I've now extended my Wert class with this new variant of the setValue
method. It uses std::string::assign, which might be less expensive than
constructing a temporary with
std::basic_string(const char*, size_type, const A& a = A()). Throwing
away the high bytes when calling JS_GetStringBytes(jss) is no problem,
because my application does not support multibyte characters when
working with std::string. It provides a std::wstring container type for
future use, once I know how to convert from std::string to std::wstring
and vice versa. As long as this problem is not solved, the std::wstring
container is not used and no characters greater than 255 are supported.
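
For the restricted case described here (no characters greater than 255),
the conversion in both directions can be sketched as below. This is only
a sketch under the assumption that the std::string holds ISO-8859-1
bytes; the helper names widenLatin1 and narrowLatin1 are made up for
illustration and are not part of the Wert class.

```cpp
#include <string>

// Hypothetical helper: widen an ISO-8859-1 std::string to std::wstring.
// Going through unsigned char keeps bytes 128..255 from being sign-extended
// on platforms where plain char is signed.
std::wstring widenLatin1(const std::string& narrow)
{
    std::wstring wide;
    wide.reserve(narrow.size());
    for (std::string::size_type i = 0; i < narrow.size(); ++i)
    {
        wide.push_back(static_cast<wchar_t>(static_cast<unsigned char>(narrow[i])));
    }
    return wide;
}

// Hypothetical helper: narrow a std::wstring back to ISO-8859-1.
// Returns false and leaves 'narrow' empty if any character is above 255,
// so the caller notices the loss instead of silently dropping high bytes.
bool narrowLatin1(const std::wstring& wide, std::string& narrow)
{
    narrow.clear();
    std::string result;
    result.reserve(wide.size());
    for (std::wstring::size_type i = 0; i < wide.size(); ++i)
    {
        // The cast to unsigned long also rejects negative values on
        // platforms where wchar_t is a signed type.
        if (static_cast<unsigned long>(wide[i]) > 0xFFul) return false;
        result.push_back(static_cast<char>(static_cast<unsigned char>(wide[i])));
    }
    narrow.swap(result);
    return true;
}
```

Any other narrow encoding (UTF-8, for example) would need a real decoder
instead of this byte-by-byte widening.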


#ifdef __mozilla_enabled__
void Wert::setValue ( // data type 1, string
    JSString* jss
    , const std::string& key
)
throw (
    Wert::EXCEPTION_PARAMETERERROR
    , Wert::EXCEPTION_INTERNALERROR
    , Wert::EXCEPTION_ACCESSDENIED
)
{
    if(jss)
    {
        switch(typ)
        {
            case Wert_string:
            case Wert_shortIdentifier: // This is also a string, but with different
                                       // semantics (used as an element name in DEW)
            {
                ((std::string*)value)->assign(JS_GetStringBytes(jss),
                                              JS_GetStringLength(jss));
                break;
            }
            case Wert_wstring:
            {
                //
                // jschar and wchar_t are not binary compatible: jschar has 2 bytes,
                // wchar_t has 4 bytes. The byte order within the wchar_t may also
                // have to be changed here. Since jschar is unsigned, at least the
                // sign causes no problems.
                //
                // Clear the previous content so that the jschar characters can be
                // appended one by one.
                ((std::wstring*)value)->erase();
                unsigned long ml = JS_GetStringLength(jss);
                jschar* jscs = JS_GetStringChars(jss);
                for(unsigned long l = 0; l < ml; l++)
                {
                    //### high byte and low byte may still have to be swapped here
                    wchar_t wc = jscs[l];
                    ((std::wstring*)value)->push_back(wc);
                }
                break;
            }
            case 0:
            {
                value = (void*) new std::wstring;
                unsigned long ml = JS_GetStringLength(jss);
                jschar* jscs = JS_GetStringChars(jss);
                bool isString = true;
                for(unsigned long l = 0; l < ml; l++)
                {
                    //### high byte and low byte may still have to be swapped here
                    wchar_t wc = jscs[l];
                    isString = isString && wc < 256;
                    ((std::wstring*)value)->push_back(wc);
                }
                if(isString)
                {
                    // There are no Unicode characters in it, so a normal string is
                    // sufficient, or perhaps even a representation as a number,
                    // date etc. is possible.
                    //
                    // The wstring is no longer needed: the default branch works
                    // with a normal string and does not expect value to already
                    // hold an object, so it has to be disposed of properly here.
                    delete (std::wstring*) value;
                    value = 0; // do not leave a dangling pointer behind
                    // ######################################################
                    // #                                                    #
                    // #                      WARNING                       #
                    // #  This is meant to continue in the default branch.  #
                    // #  For that reason there must be no break here,      #
                    // #  and no other branch may be inserted between this  #
                    // #  branch and the default branch.                    #
                    // #                                                    #
                    // ######################################################
                }
                else
                {
                    // There were Unicode characters in it, so it can only be
                    // a wstring.
                    typ = Wert_wstring;
                    break;
                }
            }
            default:
            {
                std::string s;
                s.assign(JS_GetStringBytes(jss), JS_GetStringLength(jss));
                Wert::setValue(s, key); // hand the string value on for further processing
            }
        }
    }
    else Wert::setValue("",key); // null is treated like an empty string
}
#endif




After this extension of my Wert class I could replace the dummy case in
evaluateJavaScript with this small piece of code, which now does the
desired stuff.


case JSTYPE_STRING:
{
    // set the type to string
    if(pa.first->second) pa.first->second->setType(Wert_string);
    // create a variable of type string and assign the pointer
    else pa.first->second = new Wert(Wert_string);
    // store the value
    pa.first->second->setValue(JS_ValueToString(global_context, rval));
    break;
}



This test code:

evaluateJavaScript("15;2*815;'test string';","test.js",4711); // TEST

now results in a Wert class instance of type Wert_string with value
'test string', which is the desired result.

How can I examine whether the JSString contains Unicode characters that
use the high byte? Is there an API call to test this, or should I look
at each character to determine whether I can use the implementation
above without information loss, or should I prefer a std::wstring as
container to prevent the high bytes from getting lost? How can I fill a
std::wstring? A jschar is only 2 bytes, whereas wchar_t is 4 bytes on my
system (its size is implementation-defined). So a jschar* is not binary
compatible with a wchar_t*. Do I have to feed a std::wstring jschar by
jschar, as done in my implementation above?
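
A more compact form of the per-character copy, as a sketch only: the
iterator-range assign of std::wstring converts each element by value, so
it does the same work as the push_back loop. Char16 is a stand-in for
jschar so that the example compiles without the engine headers, and
toWString is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for SpiderMonkey's jschar (an unsigned 16-bit type).
// This typedef is an assumption made for the sketch.
typedef unsigned short Char16;

// Hypothetical helper: copy 'length' 16-bit code units into a std::wstring.
// assign(first, last) converts every unit to wchar_t by value; no memory is
// reinterpreted, so the differing element sizes do not matter.
std::wstring toWString(const Char16* chars, std::size_t length)
{
    assert(chars != 0 || length == 0);
    std::wstring wide;
    if (length != 0)
    {
        wide.assign(chars, chars + length);
    }
    return wide;
}
```

Inside setValue this would be called with the results of
JS_GetStringChars(jss) and JS_GetStringLength(jss). Surrogate pairs are
copied as two separate elements; combining them into one 4-byte wchar_t
is not attempted here.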

I guess that there is no API function to find out whether
JS_GetStringBytes results in an information loss or not, because I see
no internal flags inside JSString which might provide this information
in a cheap way. Getting this information by looking at each jschar costs
a pass over the whole string, which seems expensive.
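
Without such a flag, the manual test could look like the sketch below.
It stops at the first character above 255, so it costs at most one
comparison per character. Char16 again stands in for jschar, and
hasHighByte is a hypothetical name, not a SpiderMonkey function.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for SpiderMonkey's jschar (an unsigned 16-bit type).
// This typedef is an assumption made for the sketch.
typedef unsigned short Char16;

// Hypothetical helper: returns true if at least one code unit does not fit
// into a single byte, i.e. JS_GetStringBytes-style narrowing would lose data.
bool hasHighByte(const Char16* chars, std::size_t length)
{
    assert(chars != 0 || length == 0);
    for (std::size_t i = 0; i < length; ++i)
    {
        if (chars[i] > 0xFF) return true; // the first hit ends the scan
    }
    return false;
}
```

With this check done first, the std::wstring only has to be allocated
when it is really needed, instead of being built and then deleted again.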

This knowledge is necessary when my Wert class instance is of type
"undefined", which means: automatically cast the next assigned value to
the type that fits best. A JSString is ambiguous in this respect. If it
contains characters larger than 255, the resulting type must be
Wert_wstring. If not, a temporary std::string has to be created and
inspected to see whether it can be represented as int, unsigned int,
long, unsigned long, bool or date; otherwise it must be stored as
std::string. This autocast might be very expensive if there is no cheap
test for whether the JSString contains characters greater than 255.
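
To make the inspection step concrete, here is a rough sketch of how the
temporary std::string could be classified. It is not the actual Wert
code: the enum GuessedType and the function guessType are hypothetical,
only bool, long and unsigned long are covered, and the date check is
left out.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical result type; the real Wert type constants are not used here.
enum GuessedType { GuessBool, GuessLong, GuessUnsignedLong, GuessString };

// Hypothetical helper: pick the narrowest type that can represent 'text'.
GuessedType guessType(const std::string& text)
{
    if (text == "true" || text == "false") return GuessBool;
    if (text.empty()) return GuessString;

    // strtol and strtoul skip leading white space; treat that as plain text.
    if (std::isspace(static_cast<unsigned char>(text[0]))) return GuessString;

    const char* begin = text.c_str();
    char* end = 0;

    // Signed first: the whole text must be consumed without overflow.
    errno = 0;
    (void) std::strtol(begin, &end, 10);
    if (errno == 0 && end == begin + text.size()) return GuessLong;

    // strtoul accepts a leading '-' and wraps around, so exclude negatives.
    if (text[0] != '-')
    {
        errno = 0;
        (void) std::strtoul(begin, &end, 10);
        if (errno == 0 && end == begin + text.size()) return GuessUnsignedLong;
    }
    return GuessString;
}
```

In this sketch guessType("42") yields GuessLong and
guessType("test string") yields GuessString.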

What about byte order? Are there any situations where I have to change
the byte order when I assign a jschar to a wchar_t, or does the byte
order of jschar always match the byte order of wchar_t? On my test
system (x86) it matches, but does it also match on other systems such as
PowerPC without changing the byte order?
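
As far as the C++ language rules go, an assignment like
wchar_t wc = jscs[l] converts a numeric value and not a memory layout,
so it should give the same result on every host. Byte order only comes
into play when raw bytes are reinterpreted, for example when reading
16-bit units from a file or a network buffer. The sketch below
illustrates both cases; Char16 stands in for jschar, and both function
names are hypothetical.

```cpp
#include <cassert>

// Stand-in for SpiderMonkey's jschar (an unsigned 16-bit type).
// This typedef is an assumption made for the sketch.
typedef unsigned short Char16;

// Value conversion: the numeric value is preserved regardless of how the
// host stores the bytes of either type.
wchar_t widenUnit(Char16 unit)
{
    return static_cast<wchar_t>(unit);
}

// Hypothetical helper: build one unit from two raw bytes that are known to
// be in big-endian order. Shifting works on values, so this is also
// independent of the host byte order.
Char16 unitFromBigEndianBytes(const unsigned char* bytes)
{
    assert(bytes != 0);
    return static_cast<Char16>((bytes[0] << 8) | bytes[1]);
}
```

Only code that copies the bytes directly, for instance with memcpy from
a buffer in a foreign byte order, would have to swap them.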
