[HN Gopher] How fast can the RPython GC allocate?
       ___________________________________________________________________
        
       How fast can the RPython GC allocate?
        
       Author : todsacerdoti
       Score  : 17 points
       Date   : 2025-06-15 19:55 UTC (3 hours ago)
        
 (HTM) web link (pypy.org)
 (TXT) w3m dump (pypy.org)
        
       | kragen wrote:
       | My summary is that it's about one or two allocations per
       | nanosecond on CF Bolz's machine, an AMD Ryzen 7 PRO 7840U,
       | presumably on one core, and it's about 11 instructions per
       | allocation.
       | 
       | This is about 2-4x faster than my pointer-bumping arena allocator
       | for C, kmregion+, which is a similar number of instructions on
       | the (inlined) fast path. But possibly that's because I was
       | testing on slower hardware. I was also testing with 16-byte
       | initialized objects, but without a GC. kmregion is about 10x the
       | speed of malloc/free.
       | 
       | I don't know that I'd recommend using kmregion, since it's never
       | been used for anything serious, but it should at least serve as a
       | proof of concept.
       | 
       | ______
       | 
       | + http://canonical.org/~kragen/sw/dev3/kmregion.h
       | http://canonical.org/~kragen/sw/dev3/kmregion.c
       | http://canonical.org/~kragen/sw/dev3/kmregion_example.c
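For readers unfamiliar with the term, here is a minimal sketch of what a pointer-bumping arena fast path can look like in C. It is not kmregion's code and not the RPython GC's nursery. The names `arena`, `arena_init`, `arena_alloc`, and `arena_free_all` are hypothetical, as is the choice of one fixed-size chunk, 16-byte size rounding, and returning NULL where a real allocator would take a slow path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the kmregion API. */
typedef struct {
    char *base;   /* start of the chunk, kept so it can be freed */
    char *next;   /* first free byte */
    char *limit;  /* one past the last byte of the chunk */
} arena;

/* Grab one chunk up front. Returns 1 on success, 0 on failure. */
static int arena_init(arena *a, size_t capacity) {
    a->base = malloc(capacity);
    if (!a->base) return 0;
    a->next = a->base;
    a->limit = a->base + capacity;
    return 1;
}

/* Fast path: round the size up, compare against the limit, bump.
 * Sizes are rounded to 16 bytes; the returned pointers are 16-byte
 * aligned only if malloc's chunk was (true on common 64-bit ABIs,
 * assumed here). */
static inline void *arena_alloc(arena *a, size_t n) {
    size_t rounded = (n + 15) & ~(size_t)15;
    if (rounded < n) return NULL;                 /* size overflow */
    if (rounded > (size_t)(a->limit - a->next))
        return NULL;  /* slow path omitted: a real arena would get a new chunk */
    void *p = a->next;
    a->next += rounded;
    return p;
}

/* Individual objects are never freed; the whole arena goes at once. */
static void arena_free_all(arena *a) {
    free(a->base);
    a->base = a->next = a->limit = NULL;
}
```

When `arena_alloc` is inlined and `n` is a compile-time constant, the rounding folds away and what is left is a load, an add, a compare, a branch, and a store, which is why fast paths of this kind come out at around ten instructions.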
        
       | pizlonator wrote:
       | The reason their allocator is faster than Boehm isn't
       | conservative stack scanning.
       | 
       | You can move objects while using conservative stack scanning.
       | This is a common approach. JavaScriptCore used to use it.
       | 
       | You can have super fast allocation in a non-moving collector, but
       | that involves an algorithm that is vastly different from the
       | Boehm one. I think the fastest non-moving collectors have similar
       | allocation fast paths to the fastest moving collectors.
       | JavaScriptCore has a fast non-moving allocator called bump'n'pop.
       | In Fil-C, I use a different approach that I call SIMD turbosweep.
       | There's also the Immix approach. And there are many others.
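As a rough illustration of the point that a non-moving allocator can still have a very short fast path, here is a sketch of a segregated free-list pop in C. This is a generic textbook technique and is not bump'n'pop, SIMD turbosweep, Immix, or the Boehm allocator. The names `size_class`, `size_class_init`, `size_class_alloc`, and `size_class_release` are hypothetical, and the single 16-byte size class is an assumption made to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CELL_SIZE 16  /* assumed: one size class of 16-byte cells */

/* A free cell stores the free-list link in its own payload. */
typedef union cell {
    union cell *next_free;    /* valid only while the cell is free */
    char payload[CELL_SIZE];  /* valid only while the cell is live */
} cell;

typedef struct {
    cell *free_list;  /* head of the list of free cells */
    cell *block;      /* backing storage for all cells */
} size_class;

/* Carve a block into cells and thread them in address order, so the
 * first allocations walk memory sequentially, much like a bump. */
static int size_class_init(size_class *sc, size_t ncells) {
    sc->block = malloc(ncells * sizeof(cell));
    if (!sc->block) return 0;
    sc->free_list = NULL;
    for (size_t i = ncells; i-- > 0; ) {
        sc->block[i].next_free = sc->free_list;
        sc->free_list = &sc->block[i];
    }
    return 1;
}

/* Fast path: load the head, test it, store the next link. */
static inline void *size_class_alloc(size_class *sc) {
    cell *c = sc->free_list;
    if (!c) return NULL;  /* slow path omitted: sweep or fetch a new block */
    sc->free_list = c->next_free;
    return c;
}

/* Stand-in for what a sweep would do with a dead object: the cell is
 * pushed back on the list in place, so live objects never move. */
static inline void size_class_release(size_class *sc, void *p) {
    cell *c = p;
    c->next_free = sc->free_list;
    sc->free_list = c;
}
```

The allocation path here is a load, a test, and a store, comparable in length to a bump-pointer fast path; the cost that differs between designs sits in the omitted slow path and in how the collector rebuilds the free list.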
        
       ___________________________________________________________________
       (page generated 2025-06-15 23:00 UTC)