[HN Gopher] Writing a Simple Garbage Collector in C (2020)
___________________________________________________________________
Writing a Simple Garbage Collector in C (2020)
Author : susam
Score : 67 points
Date : 2023-04-08 17:56 UTC (5 hours ago)
(HTM) web link (maplant.com)
(TXT) w3m dump (maplant.com)
| cyberax wrote:
| Nice!
|
| A couple of other classic articles that should be mentioned:
|
| "Accurate garbage collection in an uncooperative environment" by
| Fergus Henderson -
| https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&d...
|
| "Accurate Garbage Collection in Uncooperative Environments
| Revisited" - http://www.filpizlo.com/papers/baker-
| ccpe09-accurate.pdf
| version_five wrote:
| Stupid question, are there standard GC libraries that are used
| with C? Alternatively, what is the usual memory management
| practice? Is it typical just to keep track within the confines of
| the code you're writing and explicitly free memory when you're
| done with it?
| cancerhacker wrote:
| A common idiom that is no longer as common is to always create
| a stack buffer up to a certain small bound and then an
| allocation if it goes beyond: void
| dwim(unsigned count) { struct Something _buffer[10];
| struct Something* buffer; if (count < sizeof(_buffer)
| / sizeof(_buffer[0])) buffer = _buffer;
| else { buffer = (struct Something*) malloc(count *
| sizeof(struct Something)); }
| doSomething(buffer, count); if (buffer != _buffer)
| free(buffer); }
|
| It used to be really common to write that without considering
| if it was a necessary optimization or not - nowadays I would
| just malloc up front.
| mandarax8 wrote:
| Boehm GC is mentioned in the article:
| https://github.com/ivmai/bdwgc
| [deleted]
| ritcgab wrote:
| Manual malloc() + free() is the way. The language design makes
| it hard (if not impossible) to generate compilation-time GC
| routine. Runtime GC will be another thing, and you probably
| will lose raw pointer access, because we need some higher-level
| struct wrapping around the raw pointer to feed more information
| to the garbage collector.
| nu11ptr wrote:
| The Boehm collector exists and is mentioned in the article, but
| is not part of the standard library. Typical memory management
| practice is to store on the stack as much as possible reclaming
| memory as it goes out of scope. Larger chunks of memory or
| those that outlast scope typically uses manual malloc + free
| and these are often batched to minimize overhead. It is fairly
| unusual to use a GC in C. This is likely due to multiple
| reasons: history/culture, language design (C only allows for
| conservative collectors), less precise control over runtime
| overhead and pauses, higher memory utilization, etc.
| mcguire wrote:
| "The usual memory management practice" in C depends a lot an
| what you are doing. Malloc/free scattered through your code
| is...not uncommon. That does require discipline or ugliness
| ensues. Other common options are not allocating, which is kind
| of limiting but the only option under very tight memory
| constraints, and doing specific domain management options.
|
| One of the best of the latter is arena management.
| (https://en.wikipedia.org/wiki/Region-based_memory_management)
| sirwhinesalot wrote:
| If you want to keep your sanity, you don't want to be manually
| mallocing and freeing all over the place.
|
| The most common patterns I often see are a "context" object of
| some kind (typically used by libraries) which handles all
| memory management internally and must be passed to every API
| call. So you only ever allocate and free that one object.
| (Internally they might be doing all sorts of crazy things, I've
| even seen a basic tracing GC!)
|
| Applications typically use some combination of bump allocators
| (aka arenas) and pools. You put temporary allocations in a
| dedicated temp arena and clear it out when appropriate for the
| application (i.e. every frame)
| jpfr wrote:
| Nice and tight.
|
| For comparison, this is the malloc/free implementation of the
| original UNIX. Also nice and tight C code.
|
| https://warsus.github.io/lions-/all.html#line2522
|
| Although this isn't very efficient for today's memory sizes.
| jakearmitage wrote:
| What a well written article. Very didactic.
| firatsarlar wrote:
| Rust.GC.C.Self torture.Python.Endless GC. effort. Endless memory
| safety effort. And a newborn. Welcome to my old, tired mind. How
| many GC. effort do we need to solve simply remembering to pay our
| bill to memory ?
| webkike wrote:
| Wow! What a blast from the past. I originally submitted this
| article back in 2014 when it was hosted on my personal page
| hosted by my university. You can view the thread here:
| https://news.ycombinator.com/item?id=8222487
|
| When I migrated to my own personal domain, I decided I would
| update the article to address the comments on that thread, and I
| think the end result was much better (but alas, I never had a
| reason to re-submit the article here).
|
| Obviously I do not recommend using a GC like this in general, but
| as a learning resource for how memory works and some of the more
| obscure parts of executables, I think it works pretty well! The
| compliments I've received from this article have been pretty
| incredible. Seeing it here, uploaded independently, really warms
| my heart.
|
| Oh, and this article is so old that I'm now a Rust guy. Crazy how
| that works!
| lamp987 wrote:
| that code is severely broken by using unsigned int instead of
| uintptr_t to represent addresses.
___________________________________________________________________
(page generated 2023-04-08 23:00 UTC)