[HN Gopher] Understanding Memory Management, Part 1: C
       ___________________________________________________________________
        
       Understanding Memory Management, Part 1: C
        
       Author : ekr____
       Score  : 156 points
       Date   : 2025-01-13 14:16 UTC (3 days ago)
        
 (HTM) web link (educatedguesswork.org)
 (TXT) w3m dump (educatedguesswork.org)
        
       | samsquire wrote:
       | Thanks for such a detailed article.
       | 
       | In my spare time working with C as a hobby I am usually in
       | "vertical mode" which is different to how I would work
       | (carefully) at work, which is just getting things done end-to-end
       | as fast as possible, not careful at every step that we have no
       | memory errors. So I am just trying to get something working end-
       | to-end so I do not actually worry about memory management when
       | writing C. So I let the operating system handle memory freeing. I
       | am trying to get the algorithm working in my hobby time.
       | 
       | And since I wrote everything in Python or Javascript initially, I
       | am usually porting from Python to C.
       | 
       | If I were using Rust, it would force me to be careful in the same
       | way, due to the borrow checker.
       | 
       | I am curious: we have reference counting and we have Profile
       | guided optimisation.
       | 
       | Could "reference counting" be compiled into a debug/profiled
       | build and then detect which regions of time we free things in
       | before or after (there is a happens before relation with dropping
       | out of scopes that reference counting needs to run) to detect
       | where to insert frees? (We Write timing metadata from the RC
       | build, that encapsulates the happens before relationships)
       | 
       | Then we could recompile with a happens-before relation file that
       | has correlations where things should be freed to be safe.
       | 
       | EDIT: Any discussion about those stack diagrams and alignment
       | should include a link to this wikipedia page;
       | 
       | https://en.wikipedia.org/wiki/Data_structure_alignment
        
         | jvanderbot wrote:
         | > which is just getting things done end-to-end as fast as
         | possible, not careful at every step that we have no memory
         | errors.
         | 
         | One horrible but fun thing a former professor of mine pointed
         | out: If your program isn't going to live long, then you never
         | have to deallocate memory. Once it exits, the OS will happily
         | clean it up for you.
         | 
         | This works in C or perhaps lazy GC languages, but for stateful
         | objects where destructors do meaningful work, like in C++, this
         | is dangerous. This is one of the reasons I hate C++ so much:
         | Unintended side effects that you have to trigger.
         | 
         | > Could "reference counting" be compiled into a debug/profiled
         | build and then detect which regions of time we free things in
         | before or after (there is a happens before relation with
         | dropping out of scopes that reference counting needs to run) to
         | detect where to insert frees?
         | 
         | This is what Rust does, kinda.
         | 
         | C++ also does this with "stack" allocated objects - it "frees"
         | (calls destructor and cleans up) when they go out of scope. And
         | in C++, heap allocated data (if you're using a smart pointer)
         | will automatically deallocate when the last reference drops,
         | but this is not done at compile time.
         | 
         | Those are the only two memory management models I'm familiar
         | with enough to comment on.
        
           | MarkSweep wrote:
           | There is this old chestnut about "null garbage collectors":
           | 
           | https://devblogs.microsoft.com/oldnewthing/20180228-00/?p=98.
           | ..
           | 
           | > This sparked an interesting memory for me. I was once
           | working with a customer who was producing on-board software
           | for a missile. In my analysis of the code, I pointed out that
           | they had a number of problems with storage leaks. Imagine my
           | surprise when the customers chief software engineer said "Of
           | course it leaks". He went on to point out that they had
           | calculated the amount of memory the application would leak in
           | the total possible flight time for the missile and then
           | doubled that number. They added this much additional memory
           | to the hardware to "support" the leaks. Since the missile
           | will explode when it hits its target or at the end of its
           | flight, the ultimate in garbage collection is performed
           | without programmer intervention.
        
             | jvanderbot wrote:
             | Rapid disassembly as GC. Love it.
             | 
             | Have you heard the related story about the patriot missile
             | system?
             | 
             | https://www.cs.unc.edu/~smp/COMP205/LECTURES/ERROR/lec23/no
             | d...
             | 
             | Not a GC issue, but fun software bug.
        
           | pjmlp wrote:
           | The wonders of corrupted data, stale advisory locks and UNIX
           | IPC leftovers, because they weren't properly flushed, or
           | closed before process termination.
        
             | jvanderbot wrote:
             | I'll narrow my scope more explicitly:
             | 
             | close(x) is not memory management - not at the user level.
             | This should be done.
             | 
             | free(p) has no O/S side effects like this in C - this can
             | be not-done if you don't malloc all your memory.
             | 
             | You can get away with not de-allocating program memory, but
             | (as mentioned), that has nothing to do with freeing Os/
             | kernel / networking resources in C.
        
               | PhilipRoman wrote:
               | Most kernel resources are fairly well behaved, as they
               | will automatically decrement their refcount when a
               | process exits. Even mutexes have a "robust" flag for this
               | exact reason. Programs which rely on destructors or any
               | other form or orderly exit are always brittle and should
               | be rewritten to use atomic operations.
        
         | caspper69 wrote:
         | Nothing is going to tell you where to put your free() calls to
         | guarantee memory safety (otherwise Rust wouldn't exist).
         | 
         | There are tools that will tell you they're missing, however.
         | Read up on Valgrind and ASAN.
         | 
         | In C, non-global variables go out of scope when the function
         | they are created in ends. So if you malloc() in a fn, free() at
         | the end.
         | 
         | If you're doing everything with globals in a short-running
         | program, let the OS do it if that suits you (makes me feel
         | dirty).
         | 
         | This whole problem doesn't get crazy until your program gets
         | more complicated. Once you have a lot of pointers among objects
         | with different lifetimes. or you decide to add some concurrency
         | (or parallelism), or when you have a lot of cooks in the
         | kitchen.
         | 
         | In the applications you say you are writing, just ask yourself
         | if you're going to use a variable again. If not, and it is
         | using dynamically-allocated memory, free() it.
         | 
         | Don't psych yourself out, it's just C.
         | 
         | And yes, there are ref-counting libraries for C. But I wouldn't
         | want to write my program twice, once to use the ref-counting
         | library in debug mode and another to use malloc/free in release
         | mode. That sounds exhausting for all but the most trivial
         | programs.
        
         | mgaunard wrote:
         | In C, not all objects need to be their own allocated entity
         | (like they are in other languages). They can be stored in-line
         | within another object, which means the lifetime of that object
         | is necessarily constrained by that of its parent.
         | 
         | You could make every object its own allocated entity, but then
         | you're losing most of the benefits of using C, which is the
         | ability to control memory layout of objects.
        
           | pjmlp wrote:
           | As any systems programming language include those that
           | predate C by a decade, and still it doesn't allow full
           | control without compiler extensions, if you really want full
           | control of memory layout of objects, Assembly is the only
           | way.
        
         | SkiFire13 wrote:
         | > I am curious: we have reference counting and we have Profile
         | guided optimisation. > > Could "reference counting" be compiled
         | into a debug/profiled build and then detect which regions of
         | time we free things in before or after (there is a happens
         | before relation with dropping out of scopes that reference
         | counting needs to run) to detect where to insert frees?
         | 
         | Profile guided optimizations can only gather informations about
         | what's most probable, but they can't give knowledge about
         | things about what will surely happen. For freeing however you
         | most often want that knowledge, because not freeing will result
         | in a memory leak (and freeing too early will result in a use-
         | aftee-free, which you definitely want to avoid so the analysis
         | needs to be conservative!). In the end this can only be an
         | _optimization_ (just like profile guided _optimization_s are
         | just optimizations!) on top of a workflows that is ok with
         | leaking everything.
        
       | jll29 wrote:
       | Great post for intermediary programmers, who started programming
       | in Python, and who should now learn what's under the hood to get
       | to the next level of their education. Sometimes (perhaps most of
       | the time), we should ignore the nitty gritty details, but the
       | moment comes where you need to know the "how": either because you
       | need more performance, sort out an issue, or do something that
       | requires low-level action.
       | 
       | There are few sources like this post targeting that intermediate
       | group of people: you get lots of beginner YouTube clips and Web
       | tutorials and on HN you get discussions about borrow checking in
       | Rust versus garbage collection in Go, how to generate the best
       | code for it and who has the best Rope implementation; but little
       | to educate yourself from the beginner level to the level where
       | you can begin to grasp what the second group are talking about,
       | so thanks for this educations piece that fills a gap.
        
       | 9999_points wrote:
       | Memory arenas should be taught to all programmers and become the
       | default method of memory management.
        
         | caspper69 wrote:
         | I agree with you 100%. I think arenas are a much lighter burden
         | for the programmer to reason about than lifetimes & access
         | patterns.
         | 
         | But arenas can have one big drawback, and that is if you do a
         | lot of allocations and deallocations, especially in long-
         | running routines, you can essentially leak memory, because
         | arenas are not usually freed until they are going out of scope.
         | This can vary depending on the language and the implementation,
         | though.
         | 
         | My thought to counteract that though is you could offer a ref-
         | counted arena just for this scenario, but I'm not sure what
         | exactly that would look like (automatic once refs hit 0? offer
         | a purge() function like a GC?). I haven't wrapped my head
         | around the ergonomics yet.
        
         | _bohm wrote:
         | They're a great fit in many situations but certainly not all.
         | Why not teach programmers a variety of allocation strategies
         | and how to recognize when each might be a good fit?
        
           | caspper69 wrote:
           | I initially read your username as boehm, and I was like wow,
           | ok, _this_ is a guy who knows his memory. :)
           | 
           | What situations would an arena allocator prove problematic or
           | non-optimal, aside from the many allocations/deallocations
           | scenario?
           | 
           | This is an area I'm very interested in, so any info would be
           | appreciated.
        
             | _bohm wrote:
             | In general, everything allocated within an arena has its
             | lifetime tied to that arena. In lots of situations this is
             | a fine or even desirable property (e.g., a single request
             | context in a server application), but can be a tough
             | restriction to work with in situations where you need fine-
             | grained deallocations and possibly want to reuse freed
             | space. The lifetime property can also be a pain to work
             | with in multithreaded scenarios, where you might have
             | multiple threads needing to access data stored in a single
             | arena. Another situation that comes to mind is large long-
             | lived allocations where you might want to have some manual
             | defragmentation in place for performance reasons.
        
       | bluetomcat wrote:
       | This isn't proper usage of realloc:                   lines =
       | realloc(lines, (num_lines + 1) * sizeof(char *));
       | 
       | In case it cannot service the reallocation and returns NULL, it
       | will overwrite "lines" with NULL, but the memory that "lines"
       | referred to is still there and needs to be either freed or used.
       | 
       | The proper way to call it would be:                   tmp =
       | realloc(lines, (num_lines + 1) * sizeof(char *));              if
       | (tmp == NULL) {             free(lines);             lines =
       | NULL;             // ... possibly exit the program (without a
       | memory leak)         } else {             lines = tmp;         }
        
         | lionkor wrote:
         | Very odd that an article trying to teach memory management
         | would miss this, this should be common knowledge to anyone who
         | used realloc, just like checking the return of any allocation
         | call.
        
           | bluetomcat wrote:
           | They treat an OOM situation as exceptional and immediately
           | call abort() in case any allocation function returns NULL.
           | The specification of these functions allows you to handle OOM
           | situations gracefully.
        
           | PhilipRoman wrote:
           | >checking the return of any allocation call
           | 
           | I would say this is pointless on many modern systems unless
           | you also disable overcommit, since otherwise any memory
           | access can result in a crash, which is impossible to check
           | for explicitly.
        
         | aa-jv wrote:
         | Actually, no. You've just committed one of the cardinal sins of
         | the *alloc()'s, which is: NULL is an acceptable return, so
         | errno != 0 is the only way to tell if things have gone awry.
         | 
         | The proper use of realloc is to check errno _always_ ...
         | because in fact it can return NULL in a case which is not
         | considered an error: lines is not NULL but requested size is
         | zero. This is not considered an error case.
         | 
         | So, in your fix, please replace all checking of tmp == NULL,
         | instead with checking errno != 0. Only then will you have
         | actually fixed the OP's unsafe, incorrect code.
        
           | cozzyd wrote:
           | In this case if (num_lines+1) _(sizeof (char_ )) is zero that
           | is certainly unintended
        
           | spiffyk wrote:
           | From `malloc(3)`:                  Nonportable behavior
           | The  behavior of these functions when the requested size is
           | zero is glibc specific; other implementations may return NULL
           | without setting errno, and portable POSIX programs should
           | tolerate such behavior.  See realloc(3p).
           | POSIX requires memory allocators to set errno upon failure.
           | However, the C standard does not require this, and
           | applications portable to non-POSIX platforms should not
           | assume this.
        
             | anymouse123456 wrote:
             | As someone writing C for POSIX _and_ embedded environments,
             | this clarification is a super helpful.
        
         | pjmlp wrote:
         | It is even flagged as such on Visual Studio analyser.
        
         | returningfory2 wrote:
         | I feel like this comment is misleading because it gives the
         | impression that the code in the article is wrong or unsafe,
         | whereas I think it's actually fine? In the article, in the case
         | when `tmp == NULL` (in your notation) the author aborts the
         | program. This means there's no memory leak or unsafety. I agree
         | that one can do better of course.
        
         | o11c wrote:
         | There's another bug, related to performance - this involves a
         | quadratic amount of memory copying unless your environment can
         | arrange for zero-copy.
        
           | Karellen wrote:
           | Surely that's only the case if realloc() actually resizes and
           | copies on every call? Which it normally doesn't?
           | 
           | I thought that most implementations of realloc() would often
           | "round up" internally to a larger size allocation, maybe
           | power-of-two, maybe page size, or something? So if you ask
           | for 20 bytes, the internal bookkeeping sets aside 32, or
           | 4096, or whatever. And then if you realloc to 24 bytes,
           | realloc will just note that the new allocation fits in the
           | amount its reserved for you and return the same buffer again
           | with no copying?
        
             | o11c wrote:
             | _Some_ implementations _might_ round up to encourage reuse:
             | 
             | * memory-checking allocators never do.
             | 
             | * purely-size-based allocators always do.
             | 
             | * extent-based allocators _try_ to, but this easily fails
             | if you 're doing two interleaving allocations.
             | 
             | * the mmap fallback does _only_ if allowing the kernel to
             | choose addresses rather than keeping virtual addresses
             | together, unless you happen to be on a kernel that allows
             | not leaving a hole
             | 
             | Given that there's approximately zero overhead to do it
             | right, just do it right (you don't need to store capacity,
             | just compute it deterministically from the size).
        
           | ekr____ wrote:
           | Author here. Quite so. See footnote
           | 3:https://educatedguesswork.org/posts/memory-
           | management-1/#fn3
           | 
           | "If you know you're going to be doing a lot of reallocation
           | like this, many people will themselves overallocate, for
           | instance by doubling the size of the buffer every time they
           | are asked for more space than is available, thus reducing the
           | number of times they need to actually reallocate. I've
           | avoided this kind of trickery to keep this example simple."
        
         | tliltocatl wrote:
         | If you exit the program (as in the process) you don't need to
         | free anything.
        
         | astrobe_ wrote:
         | The program abort()s if the reallocation fails. But indeed, for
         | an educational example, it's not good to be too smart.
         | 
         | I believe the test if(!num_lines) is unnecessary, because
         | reallocating a NULL pointer is equivalent to malloc(). This is
         | also a bit "smart", but I think it is also more correct because
         | you don't use the value of one variable (num_lines is 0) to
         | infer the value of another (lines is NULL).
         | 
         | To go further, an opened-ended structure like:
         | struct         {           unsigned count;           char*
         | lines[];         };
         | 
         | ... could also be preferable in practice. But actually writing
         | good C is not the topic of TFA.
        
         | ekr____ wrote:
         | Author here.
         | 
         | Thanks for the flag. As you have probably noticed, I just abort
         | the program a few lines below on realloc failure, so this
         | doesn't leak so much as crash. However, this is a nice example
         | of how fiddly C memory management is.
        
       | 1970-01-01 wrote:
       | This was a great (re)introduction to the fundamentals. Worthy of
       | a bookmark.
        
       | sylware wrote:
       | Avoid as much as you can the C standard lib allocator, go
       | directly to mmap system call with your own allocator if you know
       | you won't use CPU without a MMU.
       | 
       | If you write a library, let the user code install its own
       | allocator.
        
         | jeffbee wrote:
         | "malloc" is a weakly-bound symbol that can be overridden, on
         | every system I've used. I don't know if some standard defines
         | it to be weak. Anyway the point is that malloc is not
         | necessarily a call to the C standard library function. It can
         | be anything.
        
           | sylware wrote:
           | "weakly-bound symbol" implies your a using a complex runtime
           | library/binary format (like ELF).
           | 
           | A portable and clean design for a library is to allow to
           | override the internal allocator via the API (often part of
           | the init function call).
           | 
           | Look at vulkan3D which does many things right and doing this
           | very part right. On the other side, you have some parts of
           | the ALSA lib API which still requires to use the C lib free
           | (may be obsolete though).
        
             | jeffbee wrote:
             | No, it doesn't mean that, because malloc can be replaced at
             | build-time as well. But I agree that interfaces should
             | avoid doing their own allocations and should let the caller
             | dictate it.
        
               | sylware wrote:
               | Look at vulkan3D which does it right(TM).
        
         | pjmlp wrote:
         | Sure, if one doesn't care about portable code.
        
           | the-smug-c-one wrote:
           | What modern OS doesn't have the equivalent of mmap? Just som
           | #ifdefs. I didn't know I'd ever hear "use malloc, because
           | it's portable".
           | 
           | sylware is pretty much right anyway. Try to avoid malloc, or
           | write smaller allocators on top of malloc.
        
             | pjmlp wrote:
             | Why bother with some #ifdefs when the ISO C standard
             | library already does the job?
        
               | the-smug-c-one wrote:
               | Because you're probably writing a much larger program so
               | some ifdefs aren't a big deal :-).
        
       | _bohm wrote:
       | This is a fantastic post. I really feel like these concepts
       | should be introduced to programmers much earlier on in their
       | education and this article does a great job of presenting the
       | info in an approachable manner.
        
       | numeromancer wrote:
       | Just no.                   address = X         length = *X
       | address = address + 1         while length > 0 {
       | address = address + 1             print *address         }
        
         | ekr____ wrote:
         | Author here. You're quite right that this isn't the thing you
         | would normally do. I'm just trying to help people work through
         | the logic of the system with as few dependencies as possible,
         | hence this (admittedly yucky) piece of pseudocode which isn't
         | really C or Rust or Python or anything...
        
       | juanbafora wrote:
       | thanks for sharing these are core concept to better understand
       | the coding
        
       | aslihana wrote:
       | The comics at the beginning hahaha :D
        
       | the_arun wrote:
       | > If we just concatenate the values in memory, how do we know
       | where one line ends and the next begins? For instance, maybe the
       | first two names are "jim" and "bob" or maybe it's one person
       | named "jimbob", or even two people named "jimbo" and "b".
       | 
       | Don't we have a newline character? I thought we can read newline
       | as `0xA` in Rust?
        
       ___________________________________________________________________
       (page generated 2025-01-16 23:00 UTC)