[HN Gopher] Hash-based bisect debugging in compilers and runtimes
       ___________________________________________________________________
        
       Hash-based bisect debugging in compilers and runtimes
        
       Author : rsc
       Score  : 128 points
       Date   : 2024-07-18 14:36 UTC (8 hours ago)
        
 (HTM) web link (research.swtch.com)
 (TXT) w3m dump (research.swtch.com)
        
       | tekknolagi wrote:
       | This is very cool and seems similar to what we do/did in Cinder:
       | https://bernsteinbear.com/blog/cinder-jit-bisect/
       | 
        | EDIT: Oops, this tool is already mentioned in the post (but
        | this blog post about it is not, so here it is if you want to
        | read about it). Neat!
        
         | rsc wrote:
         | Thanks for that link. I've changed the link beneath the Cinder
         | mention to that page instead of the GitHub page.
        
       | tabbott wrote:
       | This is so cool!
       | 
        | It reminds me a bit of one of my favorite debugging techniques:
       | Running an individual test with coverage reporting enabled, and
       | then clicking around the HTML coverage report to see exactly what
       | code path the test followed, without needing to use either print
       | statements or a specialized debugger.
       | 
       | Very helpful for answering questions of the form "Why does this
       | test not exercise the code I thought it did?".
        
       | chc4 wrote:
        | When I read https://marcan.st/2017/12/debugging-an-evil-go-
        | runtime-bug/ and saw the hash bisection trick for the first
        | time, I was super impressed; it really is incredibly slick :)
        | I imagine that's how first coming across normal git bisect
        | must feel to new engineers.
        
       | carry_bit wrote:
       | A closely related technique for debugging optimization passes is
       | that of "optimization fuel". Each rewrite decreases the fuel by
       | one, and when the fuel is gone no more rewrites happen. You can
       | then perform binary search on the optimization fuel to find a
       | specific rewrite instance that breaks things.
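[A minimal sketch of the "optimization fuel" idea described above: each applied rewrite spends one unit of fuel, and a binary search over the fuel budget pinpoints the first rewrite that breaks the program. The toy strength-reduction rewrite and all names here are illustrative, not any real compiler's API.]

```python
def try_rewrite(instr):
    # Toy peephole rewrite: turn a multiply-by-2 into a shift.
    if instr.endswith("* 2"):
        return instr[:-3] + "<< 1"
    return None

def optimize(program, fuel):
    """Apply rewrites in order, stopping once the fuel budget is spent."""
    out = list(program)
    for i, instr in enumerate(out):
        if fuel <= 0:
            break
        rewritten = try_rewrite(instr)
        if rewritten is not None:
            out[i] = rewritten
            fuel -= 1  # each applied rewrite costs one unit
    return out

def find_bad_rewrite(program, is_broken, max_fuel):
    """Binary search for the smallest fuel budget that triggers the
    bug; the returned count identifies the culprit rewrite (1-based)."""
    lo, hi = 0, max_fuel  # fuel=lo is assumed passing, fuel=hi failing
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if is_broken(optimize(program, mid)):
            hi = mid
        else:
            lo = mid
    return hi
```

[With fuel 0 the program is untouched, so bisecting between a passing and a failing budget isolates the bad rewrite in O(log n) recompiles.]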
        
         | rsc wrote:
         | Yes, I believe LLVM has a flag for "run only the first N
         | optimization passes" that gets used with binary search this
         | way.
         | 
          | A global optimization fuel that worked at finer granularity
          | would be even more precise, but you'd have to run the
          | compiler single-threaded to make sure the numbering always
          | matches. At least in the Go compiler, we compile different
          | functions on different threads, so that fouls up any global
          | numbering.
        
       | camgunz wrote:
       | This is pretty close to what I built for maintaining demo
       | compatibility in Doom engines. Basically it runs a demo and dumps
       | a save game to a file every frame. As soon as there's a
       | divergence it says what the difference is (monster 17 is at (4,
       | 22); should be (4, 21)) and bails. Not a ton of difference
       | between that and diffing the stack.
       | 
       | https://github.com/camgunz/democomp
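[A sketch of the lockstep-comparison idea camgunz describes: snapshot state every frame and report the first field that diverges from a reference recording. The dict-of-fields state representation and all names are illustrative.]

```python
def first_divergence(reference, actual):
    """Compare per-frame state dicts; return (frame, key, want, got)
    for the first mismatch, or None if the runs agree throughout."""
    for frame, (want, got) in enumerate(zip(reference, actual)):
        for key in want:
            if want[key] != got.get(key):
                # Bail at the first divergent field, like democomp.
                return frame, key, want[key], got.get(key)
    return None
```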
        
         | rsc wrote:
         | What you are describing sounds like comparing traces of two
         | supposed-to-be-identical programs to see where they first
         | diverge. That's not what this is describing. There is no trace
         | at all, nor any decision about what to trace or what not to
         | trace.
         | 
         | There is only a decision about whether to use the old or new
         | implementation based on the hash of the call stack at that
         | moment. Then you binary search on hash values to identify the
         | exact call stack hash for which the new implementation causes a
         | problem. Then you run the program once more with the
         | instructions "print the stack with this hash".
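[A hedged sketch of the mechanism rsc describes: each decision point hashes its call stack, a bit-suffix pattern chooses old vs. new implementation, and growing the suffix one bit at a time homes in on the single culprit hash. Real tools hash actual stack traces and handle multiple culprits; the small integers and names here are illustrative and assume a single bad site.]

```python
def matches(h, pattern, nbits):
    """A site runs the NEW code iff the low nbits of its hash == pattern."""
    return (h & ((1 << nbits) - 1)) == pattern

def bisect_hashes(site_hashes, failing_run):
    """Grow a bit-suffix pattern until exactly one hash still matches.

    failing_run(pattern, nbits) reruns the program with the new code
    enabled only at matching sites and reports whether it still fails.
    """
    pattern, nbits = 0, 0
    while True:
        candidates = [h for h in site_hashes if matches(h, pattern, nbits)]
        if len(candidates) <= 1:
            return candidates[0] if candidates else None
        # Extend the suffix with a 0 bit; if the failure survives,
        # the culprit is on that side, otherwise it ends in a 1 bit.
        if failing_run(pattern, nbits + 1):
            nbits += 1
        else:
            pattern, nbits = pattern | (1 << nbits), nbits + 1
```

[The final run with the winning pattern is the one that prints the stack for that hash.]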
        
       | Scaevolus wrote:
       | Aside: bisecting flakes doesn't _have_ to involve repeated runs.
       | You can reformulate bisection as an information probing
       | operation, expanding the scope to support noisy benchmarks or
       | low-probability flakes. Bayesian inference narrows down the
       | probable range of the failure for each new observation, and you
       | can choose new probes to maximize information gain-- or even run
       | them in parallel to minimize the total time.
       | 
        | You _do_ have to supply a flake-rate probability to do the
        | estimates, but even roughly correct rates work fine. Running
        | bisects assuming a 5% chance of a false positive or negative
        | adds barely any extra steps and greatly improves robustness.
       | 
       | The math is pretty simple too-- my old prototype might still be
       | in Google's monorepo; I should reimplement it for the open source
       | world.
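[A sketch of the Bayesian flake-tolerant bisection Scaevolus outlines: keep a posterior over which change introduced the bug, assume each test observation flips with probability `flake`, and probe where the posterior mass splits closest to 50/50. The names and the bisect-over-changes framing are illustrative, not his prototype.]

```python
def update(posterior, probe, observed_fail, flake=0.05):
    """Bayes update: testing change `probe` truly fails iff the bug was
    introduced at index <= probe; observations flip with prob. flake."""
    new = []
    for b, pb in enumerate(posterior):
        truly_fails = b <= probe
        likelihood = (1 - flake) if truly_fails == observed_fail else flake
        new.append(pb * likelihood)
    total = sum(new)
    return [x / total for x in new]

def pick_probe(posterior):
    """Probe the index whose cumulative mass is nearest 1/2 (roughly
    the maximum-information-gain choice for this likelihood)."""
    cum, best, best_gap = 0.0, 0, 1.0
    for i, p in enumerate(posterior):
        cum += p
        if abs(cum - 0.5) < best_gap:
            best, best_gap = i, abs(cum - 0.5)
    return best
```

[Because each observation only reweights the posterior, a flaky result slows convergence instead of sending the search down the wrong half for good.]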
        
         | rsc wrote:
         | Indeed. Something like this is what I intend to do when I get
         | some time.
        
       ___________________________________________________________________
       (page generated 2024-07-18 23:01 UTC)