Subj : Re: Different handling of XML literals in Spidermonkey and Rhino
To : netscape.public.mozilla.jseng
From : Brendan Eich
Date : Mon Mar 28 2005 05:45 pm
Brendan Eich wrote:
> So the bug bites only top-level PIs and comments:
>
> js> x =
>
> js> y = hi there
> hithere
> js> y.toXMLString()
>
> hi
> there
>
> js> z =
>
> js> w = hi there
> hithere
> js> w.toXMLString()
>
> hi
> there
>
>
> So we'll want tests for both cases in the suite -- Bob is on the case.
> I'll fix straight away.
Fixed (twice). My first fix yesterday was wrong, and would assign null
to x and y by default in the first two examples above. This would have
the bad effect of producing
null
in z given this example:
s = "T";
z = <{s}>{s}>;
I fixed this to return an empty-string-valued text XML node, which does
the right thing in all cases (note to bclary: e4x/XML/13.4.3.js tests 35
and 36 need to use new XML("") for the correct or expected value).
Notice that in the "computed XML literal" example just above (the 'z =
<{s}>...{s}>' one), there's no need to make any node for the ,
because it will just be converted to a string and then to XML, only to
be reparsed.
So I also beefed up the compiler's constant folder to flatten CDATA,
comments, and PIs back to strings when used in the midst of a larger XML
literal that contains computed tag or text parts. That was good, but it
still left this suboptimality:
js> function f(s){return <{s}>{s}>;}
js> dis(f)
main:
00000: string "<"
00003: getarg 0
00006: xmltagexpr
00007: add
00008: string ">"
00011: add
00012: string ""
00015: add
00016: string "</"
00019: getarg 0
00022: xmltagexpr
00023: add
00024: string ">"
00027: add
00028: add
00029: toxml
00030: return
Source notes:
Notice how the left-associative concatenation operator leaves us
building up the string to convert ToXML in more steps than needed. Say
we are evaluating f("T"). First we push
"<"
Then we compute the tag name and add, yielding
"<T". Then we push ">" and add to get
"<T>"
Then push "" and add, etc. The end tag is computed as a separate
(as if parenthesized to group "</" + {s} + ">") string concatenation
subexpression, hence the back-to-back add ops right before the toxml.
It would be better to push "<", compute the tag name to get "", add, compute the end tag, push ">", add, and
finally toxml. Fixing this is more work than I have time for right now,
but I thought I'd mention it.
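
The evaluation order described above can be traced with a small stack-machine sketch in plain JavaScript. This is only an illustration, not SpiderMonkey's interpreter. It rests on several assumptions that are not in the listing: COMMENT is a hypothetical stand-in for the literal whose text is missing, the end tag is assumed to open with "</", xmltagexpr is modeled as a plain string conversion, toxml is a no-op, and the final return op is left out.

```javascript
// Minimal stack-machine sketch of the bytecode shown by dis(f).
// Assumptions (not from the original post): COMMENT is a placeholder for
// the literal missing from the listing, the end tag opens with "</", and
// xmltagexpr is modeled as a plain string conversion.
const COMMENT = "<!-- placeholder -->";

function run(ops, args) {
  const stack = [];
  const trace = [];
  for (const [op, operand] of ops) {
    switch (op) {
      case "string":
        stack.push(operand);
        break;
      case "getarg":
        stack.push(args[operand]);
        break;
      case "xmltagexpr":
        stack.push(String(stack.pop()));
        break;
      case "add": {
        const rhs = stack.pop();
        const lhs = stack.pop();
        stack.push(lhs + rhs);
        break;
      }
      case "toxml":
        // The real op converts the string to an XML object; omitted here.
        break;
      default:
        throw new Error("unknown op " + op);
    }
    // Record the stack after each op so the intermediate strings are visible.
    trace.push(op + " -> " + JSON.stringify(stack));
  }
  return { result: stack.pop(), trace };
}

const ops = [
  ["string", "<"], ["getarg", 0], ["xmltagexpr"], ["add"],
  ["string", ">"], ["add"],
  ["string", COMMENT], ["add"],
  // End tag: built as its own subexpression on top of the prefix string.
  ["string", "</"], ["getarg", 0], ["xmltagexpr"], ["add"],
  ["string", ">"], ["add"],
  ["add"], // joins the prefix with the finished end tag
  ["toxml"],
];

const { result, trace } = run(ops, ["T"]);
console.log(trace.join("\n"));
console.log(result);
```

Printing the trace for f("T") shows the prefix string sitting on the stack while the end tag is assembled above it, which is where the two consecutive add ops come from.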
(The intermediate form SpiderMonkey uses is mostly-concrete syntax
trees, which doesn't make reassociation of this sort easy; a lower-level
intermediate would add compile time for no general gain; so I'm
considering a peephole approach on the bytecode.)
/be