Subj : Re: JavaScript performance in the real world
To   : netscape.public.mozilla.jseng
From : Prince Riley
Date : Sat Dec 04 2004 12:59 am

Pete Diemert wrote:
> I apologize in advance if this question covers well-trodden ground, but my
> search of the archive was inconclusive. We are attempting to use the
> SpiderMonkey implementation as a scripting environment for writing robust,
> data-driven Windows applications. We have exposed a set of objects that
> provide database connectivity, XML parsing, user interface, etc. So far so
> good, except that we are now beginning to write some more "meaty"
> applications that do heavy processing. Case in point: we have a script that
> iterates over some 1000 nodes of an XML document (using our XML
> document/node object model) and runs some queries for each node (using our
> database object model). We are experiencing serious performance issues that
> seem to stem from the size of the working set. After investigation, we are
> finding that, because we do not attempt a GC while a script is running, the
> working set grows VERY quickly. To be more concrete, here is a pseudo
> example:
>
> function processXML(doc)
> {
>     // Will process, say, 1000 nodes
>     for (var node = doc.childHead; node != undefined; node = node.nextSibling)
>     {
>         queryDatabase(node);
>         queryDatabase(node);
>         queryDatabase(node);
>     }
> }
>
> function queryDatabase(node)
> {
>     // Do some database work with the node
>     var rs = openRecordset("SELECT * FROM foo");
> }
>
> ...fairly typical stuff. The problem is that opening a recordset object, as
> you might expect, can be costly, and code similar to the above will chew up
> literally tens of megabytes of heap space with recordset objects in a matter
> of seconds. When a GC finally kicks in, everything is cleaned up, but we can
> see the working-set issue easily becoming a constraint in many real-world
> scenarios where we wish to use JavaScript. One other interesting note: we
> have attempted to call the GC at periodic moments throughout processing, on
> return from native function calls (e.g. every so many seconds), but we find
> that the GC is CONSIDERABLY slower when the script depth is greater than 0
> (we have scenarios with approximately 5K JS objects allocated where the GC
> takes 30 seconds). If the script depth is 0, the GC is nearly instantaneous
> (perhaps extra roots are created for objects in frames?).
>
> Understanding that JavaScript is a GC'd / "managed" environment and that,
> architecturally, we could be trying to fit a square peg into a round hole,
> are there some techniques we could use to make this system work better?
> Perhaps just solving the slow-GC problem will get us there. Any suggestions
> or insight would be greatly appreciated.
>
> pete

Pete,

An interesting problem... Based on the level of detail you provided in your
post, I'd like to suggest the following.

Create a new JS engine object for some predetermined node on each "level" in
your doc tree. Create a temporary disk file (partitioned by engine object)
where you store the JS engine object when you descend below that node beyond
a predetermined depth in your doc tree, and then restore it when you rise
back up that branch. You might consider implementing the disk store as an
in-memory stack (mapped to disk), but again, the detail in your post doesn't
let me tell whether that is practical for your application.

Hope this suggestion helps or starts you thinking in a productive direction.
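
On Pete's periodic-GC attempts: the JSAPI hook for doing work while a script
is still on the stack is the branch callback, which the engine invokes at
loop back-edges. Pairing it with JS_MaybeGC(), which collects only when the
engine's own heuristics decide it is worthwhile, tends to be gentler than
forcing a full JS_GC() on a timer. A minimal sketch, assuming a classic
(ca. 2004) JSAPI embedding; the branch-count threshold is an arbitrary tuning
choice, not a value from the thread:

#include "jsapi.h"

static uint32 branchCount = 0;

/*
 * Branch callback: the engine invokes this at loop back-edges, so it
 * runs even while processXML() is deep on the stack.  Every ~4K
 * branches (an arbitrary threshold) we give the engine a chance to
 * collect; JS_MaybeGC() is close to a no-op unless its heuristics say
 * a collection is worthwhile.
 */
static JSBool
periodic_gc_branch_callback(JSContext *cx, JSScript *script)
{
    (void) script;  /* unused */
    if ((++branchCount & 0xFFF) == 0)
        JS_MaybeGC(cx);
    return JS_TRUE;     /* returning JS_FALSE aborts the running script */
}

/* Installed once per context, e.g. right after JS_NewContext():
 *     JS_SetBranchCallback(cx, periodic_gc_branch_callback);
 */

Whether this sidesteps the 30-second collections Pete sees at script depth
greater than 0 would need measuring; it at least keeps the heap from growing
unchecked between top-level returns.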
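
On the working-set growth itself: each dead recordset in Pete's loop pins its
native buffers until the GC finally finalizes it. A classic embedding-side
remedy is to expose an explicit close() on the recordset class so scripts can
release those resources deterministically. A sketch against the same old
JSAPI; the Recordset type and rs_native_close() are hypothetical stand-ins
for Pete's embedding, and the class is assumed to have been created with
JSCLASS_HAS_PRIVATE:

#include "jsapi.h"

/* Hypothetical native handle behind Pete's recordset objects. */
typedef struct Recordset Recordset;
extern void rs_native_close(Recordset *rs);   /* hypothetical */

/*
 * close() native for the recordset class.  It releases the
 * database-side resources immediately instead of waiting for the
 * finalizer, so a script can write:
 *
 *     var rs = openRecordset("SELECT * FROM foo");
 *     // ... use rs ...
 *     rs.close();
 */
static JSBool
recordset_close(JSContext *cx, JSObject *obj, uintN argc, jsval *argv,
                jsval *rval)
{
    Recordset *rs = (Recordset *) JS_GetPrivate(cx, obj);

    if (rs) {
        rs_native_close(rs);            /* free buffers, cursors, ... */
        JS_SetPrivate(cx, obj, NULL);   /* finalizer must tolerate NULL */
    }
    *rval = JSVAL_VOID;
    return JS_TRUE;
}

/* Registered once on the recordset prototype, e.g.:
 *     JS_DefineFunction(cx, recordsetProto, "close", recordset_close, 0, 0);
 */

With something like this in place, queryDatabase() could call rs.close() just
before returning, leaving each iteration's garbage as a small JS shell for
the GC to sweep later rather than a live native recordset.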
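
Lastly, the engine performs "last ditch" collections of its own against the
runtime's heap ceiling, so the maxbytes argument to JS_NewRuntime() bounds
GC-thing memory even mid-script. One caveat: native buffers behind object
privates count against that ceiling only if they are allocated with
JS_malloc(). The 8 MB figure below is purely illustrative:

#include "jsapi.h"

int
main(void)
{
    /* 8 MB GC-heap ceiling -- purely illustrative.  When GC-thing or
     * JS_malloc() allocation approaches this cap, the engine collects
     * on its own, bounding the working set at the cost of more
     * frequent GCs. */
    JSRuntime *rt = JS_NewRuntime(8L * 1024L * 1024L);

    if (!rt)
        return 1;
    /* ... create contexts, define classes, run scripts ... */
    JS_DestroyRuntime(rt);
    JS_ShutDown();
    return 0;
}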