[HN Gopher] Writing a Simple Garbage Collector in C (2020)
       ___________________________________________________________________
        
       Writing a Simple Garbage Collector in C (2020)
        
       Author : susam
       Score  : 67 points
       Date   : 2023-04-08 17:56 UTC (5 hours ago)
        
 (HTM) web link (maplant.com)
 (TXT) w3m dump (maplant.com)
        
       | cyberax wrote:
       | Nice!
       | 
       | A couple of other classic articles that should be mentioned:
       | 
       | "Accurate garbage collection in an uncooperative environment" by
       | Fergus Henderson -
       | https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&d...
       | 
       | "Accurate Garbage Collection in Uncooperative Environments
       | Revisited" - http://www.filpizlo.com/papers/baker-
       | ccpe09-accurate.pdf
        
       | version_five wrote:
       | Stupid question, are there standard GC libraries that are used
       | with C? Alternatively, what is the usual memory management
       | practice? Is it typical just to keep track within the confines of
       | the code you're writing and explicitly free memory when you're
       | done with it?
        
         | cancerhacker wrote:
         | A common idiom that is no longer as common is to always create
         | a stack buffer up to a certain small bound and then an
         | allocation if it goes beyond:                  void
         | dwim(unsigned count) {          struct Something _buffer[10];
         | struct Something* buffer;          if (count < sizeof(_buffer)
         | / sizeof(_buffer[0]))            buffer = _buffer;
         | else {            buffer = (struct Something*) malloc(count *
         | sizeof(struct Something));          }
         | doSomething(buffer, count);          if (buffer != _buffer)
         | free(buffer);         }
         | 
         | It used to be really common to write that without considering
         | if it was a necessary optimization or not - nowadays I would
         | just malloc up front.
        
         | mandarax8 wrote:
         | Boehm GC is mentioned in the article:
         | https://github.com/ivmai/bdwgc
        
         | [deleted]
        
         | ritcgab wrote:
         | Manual malloc() + free() is the way. The language design makes
         | it hard (if not impossible) to generate compilation-time GC
         | routine. Runtime GC will be another thing, and you probably
         | will lose raw pointer access, because we need some higher-level
         | struct wrapping around the raw pointer to feed more information
         | to the garbage collector.
        
         | nu11ptr wrote:
         | The Boehm collector exists and is mentioned in the article, but
         | is not part of the standard library. Typical memory management
         | practice is to store on the stack as much as possible reclaming
         | memory as it goes out of scope. Larger chunks of memory or
         | those that outlast scope typically uses manual malloc + free
         | and these are often batched to minimize overhead. It is fairly
         | unusual to use a GC in C. This is likely due to multiple
         | reasons: history/culture, language design (C only allows for
         | conservative collectors), less precise control over runtime
         | overhead and pauses, higher memory utilization, etc.
        
         | mcguire wrote:
         | "The usual memory management practice" in C depends a lot an
         | what you are doing. Malloc/free scattered through your code
         | is...not uncommon. That does require discipline or ugliness
         | ensues. Other common options are not allocating, which is kind
         | of limiting but the only option under very tight memory
         | constraints, and doing specific domain management options.
         | 
         | One of the best of the latter is arena management.
         | (https://en.wikipedia.org/wiki/Region-based_memory_management)
        
         | sirwhinesalot wrote:
         | If you want to keep your sanity, you don't want to be manually
         | mallocing and freeing all over the place.
         | 
         | The most common patterns I often see are a "context" object of
         | some kind (typically used by libraries) which handles all
         | memory management internally and must be passed to every API
         | call. So you only ever allocate and free that one object.
         | (Internally they might be doing all sorts of crazy things, I've
         | even seen a basic tracing GC!)
         | 
         | Applications typically use some combination of bump allocators
         | (aka arenas) and pools. You put temporary allocations in a
         | dedicated temp arena and clear it out when appropriate for the
         | application (i.e. every frame)
        
       | jpfr wrote:
       | Nice and tight.
       | 
       | For comparison, this is the malloc/free implementation of the
       | original UNIX. Also nice and tight C code.
       | 
       | https://warsus.github.io/lions-/all.html#line2522
       | 
       | Although this isn't very efficient for today's memory sizes.
        
       | jakearmitage wrote:
       | What a well written article. Very didactic.
        
       | firatsarlar wrote:
       | Rust.GC.C.Self torture.Python.Endless GC. effort. Endless memory
       | safety effort. And a newborn. Welcome to my old, tired mind. How
       | many GC. effort do we need to solve simply remembering to pay our
       | bill to memory ?
        
       | webkike wrote:
       | Wow! What a blast from the past. I originally submitted this
       | article back in 2014 when it was hosted on my personal page
       | hosted by my university. You can view the thread here:
       | https://news.ycombinator.com/item?id=8222487
       | 
       | When I migrated to my own personal domain, I decided I would
       | update the article to address the comments on that thread, and I
       | think the end result was much better (but alas, I never had a
       | reason to re-submit the article here).
       | 
       | Obviously I do not recommend using a GC like this in general, but
       | as a learning resource for how memory works and some of the more
       | obscure parts of executables, I think it works pretty well! The
       | compliments I've received from this article have been pretty
       | incredible. Seeing it here, uploaded independently, really warms
       | my heart.
       | 
       | Oh, and this article is so old that I'm now a Rust guy. Crazy how
       | that works!
        
       | lamp987 wrote:
       | that code is severely broken by using unsigned int instead of
       | uintptr_t to represent addresses.
        
       ___________________________________________________________________
       (page generated 2023-04-08 23:00 UTC)