Subj : Re: Making a Spider in Java with Rhino
To   : gonzalo
From : Igor Bukanov
Date : Mon Mar 17 2003 06:19 pm

gonzalo wrote:
> Hello:
>
> I'm working on a project to benchmark Websites. Our program works as
> a spider following every link on the page and measuring loading times
> and service quality. So far, we have solved every aspect related with
> 100% HTML sites. The problem rise when JavaScript is in the middle.
>
> We are thinking about using Rhino as the JavaScript sections
> interpreter, but we don't know where to start. Questions are:
> * Can rhino "find" the javascript within the HTML code?
> * Can rhino get the required JavaScript libraries to solve the
>   interpretation? (the libraries stored on the Web Host).
> * How can we monitor when Rhino is processing a sentence where another
>   document is requested? (another page o a image file)

Rhino is just an engine that executes JavaScript, and it only provides
implementations of the core ECMAScript libraries and LiveConnect. It does
NOT provide a DOM implementation for JavaScript, which you will need to
execute even very trivial scripts in HTML pages. On the other hand, if
you do have a DOM implemented in Java, it is possible to connect it to
Rhino and get a DOM for JavaScript.

In your case the task is significantly simplified, since you do not need
to provide access to layout information and CSS. Still, you have to be
prepared to deal with the zillions of extensions to the standard DOM that
MSIE made available and that most Java DOM implementations do not cover
at all.

>
> thankx. If anyone knows of examples doing similar things, they would
> be very helpful too.

Look at any Java browser with JavaScript support.

Regards, Igor
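
On the first two questions above: since Rhino only executes script source that is handed to it, the spider itself would have to pull the script out of each page and fetch any external files before calling the engine. What follows is only a rough sketch of that extraction step, using nothing but java.util.regex. The class ScriptExtractor and its method names are made up for illustration and are not part of Rhino. A regular expression is not a real HTML parser, so this ignores event-handler attributes such as onclick, javascript: URLs, and malformed markup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Rhino): finds script in raw HTML text.
final class ScriptExtractor {
    // One <script ...>...</script> element: group 1 = attributes, group 2 = body.
    private static final Pattern SCRIPT = Pattern.compile(
            "<script\\b([^>]*)>(.*?)</script\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // src="...", src='...' or an unquoted src value inside the attribute text.
    private static final Pattern SRC = Pattern.compile(
            "\\bsrc\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))",
            Pattern.CASE_INSENSITIVE);

    // Returns the src value of a script tag's attribute text, or null if absent.
    private static String srcOf(String attributes) {
        Matcher m = SRC.matcher(attributes);
        if (!m.find()) {
            return null;
        }
        if (m.group(1) != null) return m.group(1);
        if (m.group(2) != null) return m.group(2);
        return m.group(3);
    }

    // URLs of external script files; the spider must download these itself.
    static List<String> externalSources(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = SCRIPT.matcher(html);
        while (m.find()) {
            String src = srcOf(m.group(1));
            if (src != null) {
                urls.add(src);
            }
        }
        return urls;
    }

    // Source text of inline scripts, i.e. script elements without a src attribute.
    static List<String> inlineBodies(String html) {
        List<String> bodies = new ArrayList<>();
        Matcher m = SCRIPT.matcher(html);
        while (m.find()) {
            String body = m.group(2).trim();
            if (srcOf(m.group(1)) == null && !body.isEmpty()) {
                bodies.add(body);
            }
        }
        return bodies;
    }
}
```

The strings returned by inlineBodies, together with the downloaded contents of the URLs from externalSources, are what would then be passed to the engine. They will still fail on the first reference to document or window unless a DOM has been connected as described above.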