Subj : Re: neat hack To : Adam Megacz From : Igor Bukanov Date : Thu Nov 06 2003 11:10 am Adam Megacz wrote: > I just checked in a patch to org.xwt.util.Preprocessor that lets you > write string switches, and automatically translates them into > Rhino-style case blocks. The code you write looks like this: > > //#switch(str) > "foo": ... > "bar": ... > //#end > > This only took me about 30 min to hack together, but I think it's > neat. I can't take credit for the idea (Rhino did it first), but it's > nice to have a preprocessor that does it automatically; I think the > Rhino guys would run some external tool and then paste in the output. The tool is included with Rhino distribution, just unzip Rhino distro and look at toolsrc/org/mozilla/javascript/tools/idswitch. But the tool is intended to map a set of strings to integers which can be used later to pass to the standard switch rather then to be a generic string switch. If it is necessary to do a switch over the same strings several times as it happens in Rhino, mapping the strings into integers once and writing several int switches would greatly reduce class file size. The original intention of the tool was to generate at runtime a class that will do fast mapping of strings to ints but it turned out that overhead of a generated class in JVM can not justify the speedup so the tools was converted into some kind of preprocessor which allows to have fast lookup and save memory in addition compared with a special-purpose hash table implementation. Regards, Igor > This is actually an interesting problem in Java. There are three ways > to implement this (since the JVM can't do it directly): > > 1. Use a huge if/else/else/else block. This is slow; time is O(n) > whre n is the number of keys to compare against. > > 2. Use a hashtable full of "helper objects". This is how > SpecialBoxProperty works. This is better, but the helper objects > don't have access to the 'this' object (unless you make a new set > of helpers for every instance, which creates tons of garbage). > This also inhibits a lot of compiler optimizations since the actual > code for the case blocks isn't inside the switch -- there's a > function call between them. > > 3. Do what Rhino did: > > switch(str.length()) { > 2: switch(str.charAt(0)) { > case 'a': if (str.equals("adam")) { /* do something */} break; > } > 3: > } > > So the top level switches on the length of the string and each > inner switch switches on a progressively greater character of the > string. Unnecessary levels are not used (for example, if we have > keys called "aaron", "aardvark", and "bob", we'll only switch on > the first and fourth chars when the switch value is "aaron"). > > This is superfast because every JVM on earth (including GCJ) > performs three simple optimizations: > > 1. inline all String methods > 2. keep the String's hashCode() precomputed, in the string > 3. check == equality first when doing .equals() > > The upshot is that all those charAt() and length() calls are just > array accesses (inlined), and the .equals() check looks at the > (precomputed) hashcode and == operator first, which means it > usually doesn't have to bother comparing the actual characters. > If you intern() your strings, it's even faster. Lastly, jumping > around a switch block is way, way, way, way faster than calling a > function (which is a big part of why this beats [2]). > > In a situation like a JS interpreter or host objects for JS, where > almost every single operation is a string-key lookup of some sort, > this makes a big difference. > > - a > .