[HN Gopher] Erlang/OTP: Garbage Collector
       ___________________________________________________________________
        
       Erlang/OTP: Garbage Collector
        
       Author : vkatsuba
       Score  : 123 points
       Date   : 2023-03-31 12:19 UTC (1 days ago)
        
 (HTM) web link (medium.com)
 (TXT) w3m dump (medium.com)
        
       | travisgriggs wrote:
       | Scaling up an MQTT<->webhook relay that I wrote in Elixir to
       | thousands of long running connections, I found that I needed to
       | manually trigger periodic GCs on my long lived processes.
       | 
       | As binary strings work their way through the pipelines via
       | messages, they leave binaries on the binary heap that don't go
       | away because the ref count stays above zero. There are a number of
       | GC parameters one can tune on a per process level that might
       | cause a long lived process to collect more aggressively. But my
       | long lived processes have a natural "ratchet" point where it was
       | just easy to throw a collect in. This solved all of my slow
       | growth memory problems.
       | 
       | I've read elsewhere that Erlang's GC often benefits from the
       | fact that most Erlang processes are short lived.
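
Editor's sketch (not from the thread): a toy Python model of the behaviour
described above, where a large binary stays allocated until the long-lived
process that once held it is collected. Every name here (`RefcBinary`,
`Process`, `collect`) is hypothetical, and this is not how BEAM is
implemented. In Elixir the real manual trigger is `:erlang.garbage_collect/0`.

```python
class RefcBinary:
    """Toy stand-in for a large, reference-counted off-heap binary."""

    def __init__(self, data: bytes):
        self.data = data
        self.refcount = 0
        self.freed = False

    def incref(self):
        self.refcount += 1

    def decref(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True  # the shared binary heap can reclaim it now


class Process:
    """Toy long-lived process: its heap holds small handles to binaries."""

    def __init__(self):
        self.live = []     # handles the process still uses
        self.garbage = []  # dead handles, still counted until a collection

    def receive(self, binary):
        binary.incref()
        self.live.append(binary)

    def done_with(self, binary):
        # The handle becomes garbage on the process heap, but nothing
        # decrements the count until this process is collected.
        self.live.remove(binary)
        self.garbage.append(binary)

    def collect(self):
        # Hypothetical analogue of a manual collection at a quiet point.
        for binary in self.garbage:
            binary.decref()
        self.garbage.clear()


relay = Process()
payload = RefcBinary(b"x" * 1024)
relay.receive(payload)
relay.done_with(payload)
held_before = payload.refcount   # no collection has run yet
relay.collect()
held_after = payload.refcount    # the collection released the handle
```

In this model the payload is only reclaimed once `collect` runs, which is
why a collection at a natural quiet point in the process loop stops the
slow growth.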
        
         | Nezteb wrote:
         | Is any part of that relay open-source by chance? If not, what
         | libraries are you using?
        
         | toast0 wrote:
         | There was some work to try to make this use case work with
         | normal GC (RefC binaries are counted at their full size, rather
         | than just the size of their reference on the process heap). But
         | if you know your process should be pretty clean at some point,
         | manually triggering GC will do a better job. Off heap message
         | passing might help this case too.
        
       | vkatsuba wrote:
       | If you want to expand the examples or improve the topic - just
       | leave a comment about it.
        
       | sacnoradhq wrote:
       | ORCA (as part of the Pony compiled language) includes a more
       | performant GC than C4 or BEAM/HiPE. It does so by reducing the
       | need for global GC pauses almost to zero through sharding the
       | heap per actor, zero-copy message passing, fine-grained
       | concurrent sharing semantics, and lock-free data structures.
        
         | bitwalker wrote:
         | I mean, the BEAM doesn't have global GC pauses either, as each
         | process has its own heap - but I would expect Pony can take
         | things a step further as a result of its strong type system,
         | which IIRC is why it can support zero-copy messaging.
        
           | sacnoradhq wrote:
           | This is true. Erlang has a heap per PID. Azul's C4 and other
           | JVM GCs are moving toward no stop-the-world pauses, but
           | they're still at the mercy of the model of the JVM.
           | 
           | If one can avoid GCs altogether a-la precise (de)allocations
           | like Rust's non-reference-counted entities, this is cool but
           | often requires unnatural contortionism. RC is still necessary
           | in certain cases.
        
       | dmpk2k wrote:
       | The post glosses over the most important part of Erlang's GC: it
       | collects process heaps separately. This transforms a hard problem
       | (collecting a global heap with low latency despite concurrent
       | mutators) to a _much_ simpler problem, at the price of more
       | copying. Compare Java's G1 with Erlang's GC; the former hurts my
       | head.
       | 
       | For those problems that are amenable to Erlang's model, this is a
       | fine solution. The only real improvement here would be making
       | collection incremental.
        
         | dfox wrote:
         | Another point is that due to Erlang's immutability there cannot
         | be pointers from oldgen into nursery and thus the GC does not
         | need write barriers.
        
         | fpoling wrote:
         | Erlang also has reference counters for things like strings that
         | are immutable and can be shared between threads (processes in
         | Erlang).
         | 
         | Overall this is a good model. Use GC for small per green thread
         | heaps. Then use reference counters for shared immutable
         | structures that cannot form cycles and copy everything else.
        
           | bitwalker wrote:
           | Erlang only uses reference counting for binaries larger than
           | 64 bytes, everything else is allocated on the process heap
           | (or in heap fragments) and copied. Just that is enough to
           | have a beneficial effect though, since large binaries are
           | relatively common in practice, and are frequently passed
           | around from process-to-process.
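
Editor's sketch (not from the thread): a toy Python model of the rule
quoted above, where payloads over 64 bytes are shared by reference and
everything else is copied into the receiver. `REFC_THRESHOLD` and `send`
are hypothetical names, and Python object identity only stands in for
BEAM's real heap layout.

```python
REFC_THRESHOLD = 64  # bytes, the cutoff quoted in the comment above


def send(payload: bytes, mailbox: list) -> str:
    """Deliver payload to a mailbox and report how it was delivered."""
    if len(payload) > REFC_THRESHOLD:
        mailbox.append(payload)  # share: only a reference is passed along
        return "shared"
    # Copy: build a fresh bytes object, as if onto the receiver's heap.
    mailbox.append(bytes(bytearray(payload)))
    return "copied"


mailbox = []
big = b"x" * 4096
small = b"small body"
big_mode = send(big, mailbox)
small_mode = send(small, mailbox)
```

Sharing the 4096-byte payload avoids a copy per hop, which is where the
benefit described above comes from when large binaries pass through
several processes.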
        
         | vkatsuba wrote:
         | This is a good point, thanks! I will extend the topic, or maybe
         | it would be better to write a new post as a continuation of
         | the current one, since putting everything in one article would
         | make it longer and more difficult to read.
        
         | weatherlight wrote:
         | I thought Erlang's Garbage collector was incremental by virtue
         | of being per process. A system may have tens of thousands of
         | processes, using a gigabyte of memory overall, but if GC occurs
         | in a process with a 20K heap, then the collector only touches
         | that 20K and collection time is imperceptible. With lots of
         | small processes, you can think of this as a truly incremental
         | collector.
         | 
         | It's not incremental per process, but I'm not sure it would
         | even matter that much in practice.
        
           | dmpk2k wrote:
           | Yes, that is how it works, except (as you implicitly note)
           | that large heaps in single processes can cause problems;
           | allowing incremental collection per heap would flatten the
           | latency profile further.
        
           | ramchip wrote:
           | Large GC jobs get scheduled on dirty schedulers today (a
           | background thread pool), since it's not OK to block a normal
           | scheduler more than 1ms or so in Erlang. If they could be
           | split into smaller chunks of work, perhaps it could be done
           | on normal schedulers, making time allocation more fair.
        
         | benmmurphy wrote:
         | There are lots of foot guns for the user with this model.
         | Because transferring data between processes involves copying,
         | this can become a problem. Erlang tries to optimise the
         | handling of large binaries by using a separate reference
         | counted heap. However, this introduces another set of issues
         | where memory is 'leaked' because a smaller binary is holding a
         | reference to a larger binary or because processes that have not
         | been GC'd have not decremented the ref count of large binaries
         | in the heap that they no longer use.
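
Editor's sketch (not from the thread): the first kind of leak has a close
analogue in Python, where a small `memoryview` slice keeps the whole
underlying buffer reachable. The sizes and variable names are made up for
illustration. In Erlang the usual remedy is to copy the small part out
with `binary:copy/1` so the large binary can be released.

```python
big = bytes(10_000_000)          # stand-in for a large received binary
header = memoryview(big)[:16]    # small slice that still references big

# The slice keeps the whole buffer reachable through its .obj attribute.
slice_pins_big = header.obj is big

# Copying the bytes out yields an independent 16-byte object.
detached = bytes(header)
header.release()                 # drop the slice's claim on the big buffer
copy_size = len(detached)
```

After the copy, only the 16-byte `detached` object needs to stay alive,
so dropping `big` can actually free the 10 MB buffer.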
        
         | amelius wrote:
         | Wouldn't Erlang be much more efficient if it simply compiled to
         | the JVM?
        
           | nesarkvechnep wrote:
           | No.
        
           | lenkite wrote:
           | JVM standard does not support isolates so it won't work.
           | Java's father Gosling wanted to get isolation into the Java
           | spec but he failed.
           | 
           | The modern GraalVM does have isolates, but it's a VM-specific
           | feature and not a Java standard feature.
        
           | vcryan wrote:
           | Ha! Absolutely no
        
             | amelius wrote:
             | Why not? JVM has a highly optimized concurrent GC.
        
               | toast0 wrote:
               | Erlang has a highly optimized concurrent GC as well. It's
               | just optimized for different things. And maybe the
               | concurrency of the GC is different; Erlang has one heap
               | per process (aka green thread), and no concurrency within
               | a heap.
               | 
               | Erlang GC is also very simple and easy to understand
               | because language features only allow references in one
               | direction. Much of JVM GC complexity would be wasted as
               | there's no need to look for reference loops and such,
               | since they're not possible.
        
               | omginternets wrote:
               | It's a GC tuned for imperative languages that prefer
               | mutation over allocation, which is the exact inverse of
               | what BEAM needs.
        
               | pron wrote:
               | OpenJDK's GCs do have elaborate mechanisms to support
               | mutation, but if they're unused they impose no extra
               | overhead.
        
           | _old_dude_ wrote:
           | Almost 10 years ago, I tested Erjang [1] using a medium
           | sized application. Throughput was better than BEAM but
           | latency was terrible.
           | 
           | [1] https://github.com/trifork/erjang/
        
             | pron wrote:
             | Ten years ago was two whole technological generations ago
             | in the implementation of OpenJDK's GCs. OpenJDK now has a
             | maximum pause time of under 1ms for heaps up to 16TB.
        
               | bitwalker wrote:
               | I really strongly doubt that GC is a bottleneck for
               | Erlang programs on either the BEAM or the JVM - the
               | sophistication of the scheduler, and the way various
               | language primitives interact with it, is where the BEAM
               | is almost certainly gaining an edge over the JVM. That
               | said, I'm sure there is a subset of programs that
               | _would_ be faster on the JVM, just depends on what
               | metrics are being compared.
        
             | nickpeterson wrote:
             | As the other reply noted, I'd be shocked if it weren't much
             | better now. Seeing something like Graal being used would be
             | really interesting. I think if Elixir could target BEAM or
             | the JVM it would be an amazing language for many tasks.
        
           | jlouis wrote:
           | It likely would. But efficiency is only one factor. Many
           | Erlang applications are far more concerned with consistent
           | latency than throughput efficiency. So a switch to the JVM
           | would come at a high cost.
        
       ___________________________________________________________________
       (page generated 2023-04-01 23:00 UTC)