Subj : Re: Making a Spider in Java with Rhino
To   : Gonzalo Floría <gfloria@tcpsi.es>
From : Igor Bukanov <igor@fastmail.fm>
Date : Tue Mar 25 2003 08:43 pm

BTW, what do you mean by "web site performance" in your initial mail? If 
you only need to test download speed from the server, you may get away 
without a DOM: implementing document.write and replacing everything 
else with do-nothing proxies can be sufficient for that.
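
A rough sketch of what I mean by a do-nothing proxy, in plain Java. The
StubDocument interface and the DoNothingDocument class are names made up
for this mail, not part of Rhino or of any DOM package, and the interface
lists only a few members for illustration. Only write() does real work;
every other call is swallowed. Handing the object to the scripts is not
shown; in Rhino that would typically go through the global scope, for
example with ScriptableObject.putProperty.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

/** Hypothetical slice of the document API that page scripts might call. */
interface StubDocument {
    void write(String markup);
    Object getElementById(String id);
    boolean hasChildNodes();
}

/** Records document.write output and ignores every other call. */
final class DoNothingDocument implements InvocationHandler {
    private final StringBuilder written = new StringBuilder();

    /** Creates a StubDocument whose calls are all routed to this handler. */
    StubDocument asDocument() {
        return (StubDocument) Proxy.newProxyInstance(
                StubDocument.class.getClassLoader(),
                new Class<?>[] { StubDocument.class },
                this);
    }

    /** Everything the scripts have written so far. */
    String written() {
        return written.toString();
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) {
        if (method.getName().equals("write") && args != null && args.length == 1) {
            written.append(args[0]);
            return null;
        }
        // No-op for everything else: return a neutral value of the right type.
        // Only boolean and int are handled here; other primitive return
        // types would need their own cases.
        Class<?> type = method.getReturnType();
        if (type == boolean.class) {
            return Boolean.FALSE;
        }
        if (type == int.class) {
            return Integer.valueOf(0);
        }
        return null;
    }

    public static void main(String[] args) {
        DoNothingDocument handler = new DoNothingDocument();
        StubDocument document = handler.asDocument();
        document.write("<p>hello</p>");
        document.getElementById("menu"); // silently ignored
        System.out.println(handler.written()); // prints <p>hello</p>
    }
}
```

The captured text is what you would feed back to whatever fetches the
images and other resources referenced by the generated markup.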

Igor Bukanov wrote:
> Gonzalo Floría wrote:
> 
>> Thanks, I can see I'm greener than I thought. From your words I understand:
>> 1.- I need to have a DOM implementation: a class able to "understand" the
>> structure of the HTML I'm working with, separating HTML from JS code. This
>> I already have.
>> 2.- My DOM should create a Document object. This object should implement
>> the org.w3c.dom.html.HTMLDocument interface. This I need to do.
>> 3.- Here is the part I don't know how to do: I should "give" this document
>> object to the Rhino interpreter before processing the JS code from the
>> document. How?
> 
> 
> In general you cannot parse HTML or build the DOM separately from the 
> execution of JavaScript, since JavaScript can change the HTML source via 
> document.write or modify the DOM via any DOM mutation function. One 
> solution is to create a document object representing an empty DOM tree 
> and then build the tree there, so scripts see the current DOM tree 
> during their execution.
> 
> Regards, Igor
>
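
To make the quoted suggestion a bit more concrete, here is a toy sketch in
plain Java of building the tree while scripts run. It is only an
illustration under heavy assumptions: IncrementalDomSketch is a made-up
name, the "parser" input is a queue of already-tokenized element names
rather than real HTML, and a script is a Java callback standing in for
code that Rhino would evaluate. The point it tries to show is that the
document starts out empty, a script sees only what has been parsed so far,
and anything it writes is parsed before the rest of the page.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class IncrementalDomSketch {
    private final Document dom;
    private final Element root;
    // Pending parser input; each step either appends an element or runs a script.
    private final Deque<Consumer<IncrementalDomSketch>> input = new ArrayDeque<>();

    IncrementalDomSketch() {
        try {
            // Start from an empty document and grow it during parsing.
            dom = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        root = dom.createElement("html");
        dom.appendChild(root);
    }

    private static Consumer<IncrementalDomSketch> elementStep(String name) {
        return page -> page.root.appendChild(page.dom.createElement(name));
    }

    /** Queues an element as if it came from the page source. */
    void feedElement(String name) {
        input.addLast(elementStep(name));
    }

    /** Queues a script; the callback stands in for JavaScript run by Rhino. */
    void feedScript(Consumer<IncrementalDomSketch> script) {
        input.addLast(script);
    }

    /** Rough analogue of document.write: the new element is parsed next. */
    void write(String name) {
        input.addFirst(elementStep(name));
    }

    int elementCount() {
        return root.getChildNodes().getLength();
    }

    /** Names of the elements built so far, separated by spaces. */
    String elementNames() {
        StringBuilder names = new StringBuilder();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (i > 0) {
                names.append(' ');
            }
            names.append(children.item(i).getNodeName());
        }
        return names.toString();
    }

    /** Consumes the input in order, running scripts against the partial tree. */
    void parse() {
        while (!input.isEmpty()) {
            input.removeFirst().accept(this);
        }
    }

    public static void main(String[] args) {
        IncrementalDomSketch page = new IncrementalDomSketch();
        page.feedElement("head");
        page.feedScript(p -> {
            System.out.println("script sees " + p.elementCount() + " element(s)");
            p.write("div");
        });
        page.feedElement("p");
        page.parse();
        System.out.println(page.elementNames()); // prints head div p
    }
}
```

A real spider would replace the queue with an HTML tokenizer and the
callback with a call into the interpreter, but the ordering constraint
stays the same.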
