Subj : Re: Compacting JS source via ParseTree or Decompile
To   : Dan Libby
From : Brendan Eich
Date : Mon Jun 07 2004 07:16 pm

Dan Libby wrote:

> For an initial release, I should probably scale things back a bit and
> just concentrate on making a good code compactor -- one that will work
> with the existing libjs. Then, if we agree on a good design, perhaps we
> can extend both to perform the more advanced pretty-printing.

Right, that's almost always a sound approach (drool, roll, crawl, ...,
run ;-).

> BTW, with what I have already, our largest JS file shrunk from 119K to
> 59K. A 50% reduction!

Cool!

> Towards the goal of code compaction, there are a couple more items:
>
> 1) I notice that even with the JS_DONT_PRETTY_PRINT flag, the decompiler
> always places a newline after "for" statements, eg:
>
>     for(...) {
>     code}more code;
>
> I think the culprit is this bit of code in jsopcode.c, which does not
> use the "pretty" flag:
>
> 1490 rval = OFF2STR(&ss->sprinter, ss->offsets[ss->top-1]);
> 1491 js_printf(jp, " in %s) {\n", rval);
> 1492 jp->indent += 4;
> 1493 DECOMPILE_CODE(pc + oplen, tail - oplen);
> 1494 jp->indent -= 4;
> 1495 js_printf(jp, "\t}\n");

Wrong culprit -- here is the bad boy:

991 /* Do the loop body. */
992 js_puts(jp, ") {\n");
993 jp->indent += 4;
994 oplen = (cond) ? js_CodeSpec[pc[cond]].length : 0;
995 DECOMPILE_CODE(pc + cond + oplen, next - cond - oplen);
996 jp->indent -= 4;
997 js_printf(jp, "\t}\n");

The problem is the js_puts usage, instead of js_printf. See the closing
brace js_printf at line 997.

> This is not a very big deal, but it would be nice to be able to strip
> all newlines, saving those extra bytes. Would it be safe to simply do a
> global string replace of "\n" to "" on the decompiled script? Or might
> that break intentional user code?

It's just a bug, fixed in the patch for
http://bugzilla.mozilla.org/show_bug.cgi?id=245795.
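
On the question of whether a global "\n" to "" replace is safe, here is a minimal sketch of when it can go wrong. It assumes nothing about the decompiler itself. The helper names (run, stripNewlines, survivesStripping) and the two sample bodies are made up for illustration. The point is only that stripping newlines is harmless when every statement is explicitly terminated, and can break source that relies on automatic semicolon insertion. Whether decompiled output always falls in the first group is an assumption you would want to check.

```javascript
// Run a function body given as source text and return its result.
function run(body) {
  return new Function(body)();
}

// Strip every newline, as in the global "\n" -> "" replace asked about above.
function stripNewlines(source) {
  return source.replace(/\n/g, "");
}

// Every statement explicitly terminated: the newlines are only layout.
const terminated = "var a = 1;\nvar b = 2;\nreturn a + b;";

// Hand-written style that relies on automatic semicolon insertion.
const unterminated = "var a = 1\nvar b = 2\nreturn a + b";

// True when the stripped text still parses and computes the same result.
function survivesStripping(source) {
  try {
    return run(stripNewlines(source)) === run(source);
  } catch (e) {
    return false; // the stripped text no longer parses
  }
}

console.log(survivesStripping(terminated));   // true
console.log(survivesStripping(unterminated)); // false
```

The second case fails because "var a = 1var b = 2return a + b" is no longer valid source. Line comments ("//") have the same problem in hand-written code, since removing the newline extends the comment over the next statement.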

> 2) A killer feature for code compaction would be the ability to
> automatically abbreviate local scope variables. Global vars, class
> vars, and function names are trickier, because they can be referenced
> externally. But a variable internal to a function should always be safe
> to rename, so long as you also change the references within that
> function. Agreed? Caveats?

Sure, why not?

> Okay, so I think that for the jscompact app to do this, it needs to be
> able to:
> - compile script (duh)
> - walk all the functions in the bytecode
> - walk the local vars in the function
> - change the variable name (store in a hash, oldname => newname )
> - walk the opcodes to determine where oldname is used
> - update the reference
> - decompile bytecode

All you need to do is to rename the function's properties that have
js_GetLocalVariable as their getter. Are you renaming arguments too?

>> You'll need to extend the front end to keep comments. Currently the
>> scanner strips them.
>
> Tricky. I suppose it would need to somehow parse the comments and then
> record some sort of "marker" for where to place the comment in the
> decompiled string. Erggg.

Rather than try to pass things through compiled script to the decompiler,
I would recommend just annotating the parse tree, using #ifdef'd code,
with white space, comments, and extra braces. Then your back end would
walk the parse tree created by js_ParseTokenStream, and you would avoid
hacking or calling js_EmitTree etc. altogether.

> When you say "back end", exactly which code component and/or set of
> APIs are you talking about? Clearly, if Decompile() and its
> sub-functions were substantially re-written, that should do the trick.

See above -- I'm proposing you use the JS_FRIEND_API exported from
jsparse.h.

> I suppose it would be possible to duplicate these in the jscompact
> frontend, but with the necessary pretty-printing modifications. That
> way, the library doesn't need to change at all.
> But it seems rather brittle/gross using internal data structures.

Don't duplicate the decompiler (don't even generate code to decompile).
Do walk the JSParseNode tree yourself.

>> What do these templates look like? Are they just for documentation,
>> or for other purposes?
>
> Yes, for documentation. I was just thinking of auto generating a
> comment block above each function that doesn't already have one. It
> could use the javadoc (or other) format, and basically just saves the
> tedium of copy/pasting the same base comment block all over the place.
> Definitely not a big deal.

Easy to do if you write your own back end to the parser.

>>> I am still interested in the "walk pn and check sanity" approach,
>>> modified] source code, or am I getting in too deep for a weekend
>>> project?
>>
>> It depends on how much you know the engine, but let's design first.
>
> I'm just taking my first look at the engine this week. (Well, except
> for a few headaches making its ancestor and the rest of Communicator
> build on OS/2 back in 97-98)

Hey, I thought I recognized your name! You're doing fine.

/be
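
To make the "walk the parse tree yourself" idea concrete, here is a rough sketch of a back end that walks a tree, renames function-local names (arguments included) through an oldname => newname map, and emits compact source. Everything in it is hypothetical: the node shapes are invented for the example and are not JSParseNode, and the helper names (shortName, collectLocals, emitExpr, emitStmt, emitFunction) do not exist in the engine. It is written in JavaScript only to keep it short and runnable. A real back end would be C code over the tree built by js_ParseTokenStream.

```javascript
// Hypothetical node shapes, invented for this sketch (not JSParseNode):
//   { type: "function", name, params: [names], body: [statements] }
//   { type: "var", name, init }      { type: "return", value }
//   { type: "binary", op, left, right }
//   { type: "name", name }           { type: "number", value }

// Generate short names in order: a, b, ..., z, aa, ab, ...
function shortName(i) {
  let s = "";
  do {
    s = String.fromCharCode(97 + (i % 26)) + s;
    i = Math.floor(i / 26) - 1;
  } while (i >= 0);
  return s;
}

// Build the oldname => newname map for one function's arguments and vars.
function collectLocals(fn) {
  const renames = new Map();
  const add = (name) => {
    if (!renames.has(name)) renames.set(name, shortName(renames.size));
  };
  fn.params.forEach(add);
  fn.body.forEach((stmt) => {
    if (stmt.type === "var") add(stmt.name);
  });
  return renames;
}

// Emit an expression, renaming locals; names not in the map are left alone.
function emitExpr(node, renames) {
  switch (node.type) {
    case "number":
      return String(node.value);
    case "name":
      return renames.get(node.name) || node.name;
    case "binary": {
      // Parenthesize nested binaries so operator precedence cannot change.
      const side = (n) =>
        n.type === "binary" ? "(" + emitExpr(n, renames) + ")" : emitExpr(n, renames);
      return side(node.left) + node.op + side(node.right);
    }
    default:
      throw new Error("unhandled expression type: " + node.type);
  }
}

// Emit one statement with no whitespace beyond what the syntax requires.
function emitStmt(node, renames) {
  switch (node.type) {
    case "var":
      return "var " + renames.get(node.name) + "=" + emitExpr(node.init, renames) + ";";
    case "return":
      return "return " + emitExpr(node.value, renames) + ";";
    default:
      throw new Error("unhandled statement type: " + node.type);
  }
}

// The function name is kept, since it can be referenced externally.
function emitFunction(fn) {
  const renames = collectLocals(fn);
  const params = fn.params.map((p) => renames.get(p)).join(",");
  const body = fn.body.map((s) => emitStmt(s, renames)).join("");
  return "function " + fn.name + "(" + params + "){" + body + "}";
}

// Tree for:
//   function area(width, height) { var total = width * height; return total + margin; }
const tree = {
  type: "function", name: "area", params: ["width", "height"],
  body: [
    { type: "var", name: "total",
      init: { type: "binary", op: "*",
              left: { type: "name", name: "width" },
              right: { type: "name", name: "height" } } },
    { type: "return",
      value: { type: "binary", op: "+",
               left: { type: "name", name: "total" },
               right: { type: "name", name: "margin" } } },
  ],
};

console.log(emitFunction(tree));
// function area(a,b){var c=a*b;return c+margin;}
```

The sketch leaves out the caveats a real tool has to handle: a generated short name must not collide with a global the function refers to or with a reserved word, and code using eval or with can see local names at run time, so such functions are probably best left unrenamed.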