[HN Gopher] Understanding Memory Management, Part 1: C
___________________________________________________________________
Understanding Memory Management, Part 1: C
Author : ekr____
Score : 156 points
Date : 2025-01-13 14:16 UTC (3 days ago)
(HTM) web link (educatedguesswork.org)
(TXT) w3m dump (educatedguesswork.org)
| samsquire wrote:
| Thanks for such a detailed article.
|
| In my spare time, working with C as a hobby, I am usually in
| "vertical mode", which is different from how I would work
| (carefully) at work: I just get things done end-to-end as fast
| as possible, without being careful at every step that there are
| no memory errors. Since I am only trying to get something
| working end-to-end, I do not actually worry about memory
| management when writing C; I let the operating system handle
| freeing memory. In my hobby time I am trying to get the
| algorithm working.
|
| And since I wrote everything in Python or Javascript initially, I
| am usually porting from Python to C.
|
| If I were using Rust, it would force me to be careful in the same
| way, due to the borrow checker.
|
| I am curious: we have reference counting and we have Profile
| guided optimisation.
|
| Could "reference counting" be compiled into a debug/profiled
| build and then detect which regions of time we free things in
| before or after (there is a happens before relation with dropping
| out of scopes that reference counting needs to run) to detect
| where to insert frees? (We Write timing metadata from the RC
| build, that encapsulates the happens before relationships)
|
| Then we could recompile with a happens-before relation file that
| has correlations where things should be freed to be safe.
|
| EDIT: Any discussion about those stack diagrams and alignment
| should include a link to this Wikipedia page:
|
| https://en.wikipedia.org/wiki/Data_structure_alignment
| jvanderbot wrote:
| > which is just getting things done end-to-end as fast as
| possible, not careful at every step that we have no memory
| errors.
|
| One horrible but fun thing a former professor of mine pointed
| out: If your program isn't going to live long, then you never
| have to deallocate memory. Once it exits, the OS will happily
| clean it up for you.
|
| This works in C or perhaps lazy GC languages, but for stateful
| objects where destructors do meaningful work, like in C++, this
| is dangerous. This is one of the reasons I hate C++ so much:
| Unintended side effects that you have to trigger.
|
| > Could "reference counting" be compiled into a debug/profiled
| build and then detect which regions of time we free things in
| before or after (there is a happens before relation with
| dropping out of scopes that reference counting needs to run) to
| detect where to insert frees?
|
| This is what Rust does, kinda.
|
| C++ also does this with "stack" allocated objects - it "frees"
| (calls destructor and cleans up) when they go out of scope. And
| in C++, heap allocated data (if you're using a smart pointer)
| will automatically deallocate when the last reference drops,
| but this is not done at compile time.
|
| Those are the only two memory management models I'm familiar
| with enough to comment on.
| MarkSweep wrote:
| There is this old chestnut about "null garbage collectors":
|
| https://devblogs.microsoft.com/oldnewthing/20180228-00/?p=98.
| ..
|
| > This sparked an interesting memory for me. I was once
| working with a customer who was producing on-board software
| for a missile. In my analysis of the code, I pointed out that
| they had a number of problems with storage leaks. Imagine my
| surprise when the customer's chief software engineer said "Of
| course it leaks". He went on to point out that they had
| calculated the amount of memory the application would leak in
| the total possible flight time for the missile and then
| doubled that number. They added this much additional memory
| to the hardware to "support" the leaks. Since the missile
| will explode when it hits its target or at the end of its
| flight, the ultimate in garbage collection is performed
| without programmer intervention.
| jvanderbot wrote:
| Rapid disassembly as GC. Love it.
|
| Have you heard the related story about the patriot missile
| system?
|
| https://www.cs.unc.edu/~smp/COMP205/LECTURES/ERROR/lec23/no
| d...
|
| Not a GC issue, but fun software bug.
| pjmlp wrote:
| The wonders of corrupted data, stale advisory locks and UNIX
| IPC leftovers, because they weren't properly flushed, or
| closed before process termination.
| jvanderbot wrote:
| I'll narrow my scope more explicitly:
|
| close(x) is not memory management - not at the user level.
| This should be done.
|
| free(p) has no OS side effects like this in C - it can be left
| undone if you don't malloc all your memory.
|
| You can get away with not deallocating program memory, but (as
| mentioned) that has nothing to do with freeing OS / kernel /
| networking resources in C.
| PhilipRoman wrote:
| Most kernel resources are fairly well behaved, as they
| will automatically decrement their refcount when a
| process exits. Even mutexes have a "robust" flag for this
| exact reason. Programs which rely on destructors or any
| other form of orderly exit are always brittle and should
| be rewritten to use atomic operations.
| caspper69 wrote:
| Nothing is going to tell you where to put your free() calls to
| guarantee memory safety (otherwise Rust wouldn't exist).
|
| There are tools that will tell you they're missing, however.
| Read up on Valgrind and ASAN.
|
| In C, non-global variables go out of scope when the function
| they are created in ends. So if you malloc() in a fn, free() at
| the end.
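|
| Something like this (hypothetical helper, just a sketch, not
| from TFA):
|
|       #include <stdlib.h>
|       #include <string.h>
|
|       static size_t count_spaces(const char *s) {
|           size_t n = strlen(s);
|           char *copy = malloc(n + 1);   /* allocated in this fn */
|           if (copy == NULL)
|               return 0;                 /* nothing to free */
|           memcpy(copy, s, n + 1);
|           size_t spaces = 0;
|           for (size_t i = 0; i < n; i++)
|               if (copy[i] == ' ')
|                   spaces++;
|           free(copy);                   /* freed before returning */
|           return spaces;
|       }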
|
| If you're doing everything with globals in a short-running
| program, let the OS do it if that suits you (makes me feel
| dirty).
|
| This whole problem doesn't get crazy until your program gets
| more complicated: once you have a lot of pointers among objects
| with different lifetimes, or you decide to add some concurrency
| (or parallelism), or you have a lot of cooks in the kitchen.
|
| In the applications you say you are writing, just ask yourself
| if you're going to use a variable again. If not, and it is
| using dynamically-allocated memory, free() it.
|
| Don't psych yourself out, it's just C.
|
| And yes, there are ref-counting libraries for C. But I wouldn't
| want to write my program twice, once to use the ref-counting
| library in debug mode and another to use malloc/free in release
| mode. That sounds exhausting for all but the most trivial
| programs.
| mgaunard wrote:
| In C, not all objects need to be their own allocated entity
| (like they are in other languages). They can be stored in-line
| within another object, which means the lifetime of that object
| is necessarily constrained by that of its parent.
|
| You could make every object its own allocated entity, but then
| you're losing most of the benefits of using C, which is the
| ability to control memory layout of objects.
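|
| A tiny sketch of that difference (made-up types, not from the
| article): "pos" lives inside the parent and cannot outlive it,
| while "target" is its own allocation with its own lifetime.
|
|       #include <stdlib.h>
|
|       struct point { double x, y; };
|
|       struct ship {
|           struct point  pos;     /* in-line: dies with the ship */
|           struct point *target;  /* separate: own lifetime */
|       };
|
|       int main(void) {
|           struct ship *s = malloc(sizeof *s); /* also covers pos */
|           if (s == NULL) return 1;
|           s->pos = (struct point){0.0, 0.0};
|           s->target = malloc(sizeof *s->target);
|           if (s->target == NULL) { free(s); return 1; }
|           *s->target = (struct point){3.0, 4.0};
|           free(s->target);  /* must be freed on its own */
|           free(s);          /* pos goes away with its parent */
|           return 0;
|       }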
| pjmlp wrote:
| As does any systems programming language, including those that
| predate C by a decade. And still C doesn't allow full control
| without compiler extensions; if you really want full control of
| the memory layout of objects, Assembly is the only way.
| SkiFire13 wrote:
| > I am curious: we have reference counting and we have Profile
| > guided optimisation.
| >
| > Could "reference counting" be compiled into a debug/profiled
| > build and then detect which regions of time we free things in
| > before or after (there is a happens before relation with
| > dropping out of scopes that reference counting needs to run)
| > to detect where to insert frees?
|
| Profile-guided optimizations can only gather information about
| what's most probable, but they can't give knowledge about what
| will surely happen. For freeing, however, you most often want
| that knowledge, because not freeing will result in a memory
| leak (and freeing too early will result in a use-after-free,
| which you definitely want to avoid, so the analysis needs to be
| conservative!). In the end this can only be an _optimization_
| (just like profile-guided _optimizations_ are just
| optimizations!) on top of a workflow that is OK with leaking
| everything.
| jll29 wrote:
| Great post for intermediate programmers who started programming
| in Python and who should now learn what's under the hood to get
| to the next level of their education. Sometimes (perhaps most
| of the time) we should ignore the nitty-gritty details, but the
| moment comes when you need to know the "how": either because
| you need more performance, need to sort out an issue, or need
| to do something that requires low-level action.
|
| There are few sources like this post targeting that
| intermediate group of people: you get lots of beginner YouTube
| clips and Web tutorials, and on HN you get discussions about
| borrow checking in Rust versus garbage collection in Go, how to
| generate the best code for it, and who has the best Rope
| implementation; but there is little to take you from the
| beginner level to the level where you can begin to grasp what
| the second group are talking about. So thanks for this
| educational piece that fills a gap.
| 9999_points wrote:
| Memory arenas should be taught to all programmers and become the
| default method of memory management.
| caspper69 wrote:
| I agree with you 100%. I think arenas are a much lighter burden
| for the programmer to reason about than lifetimes & access
| patterns.
|
| But arenas can have one big drawback: if you do a lot of
| allocations and deallocations, especially in long-running
| routines, you can essentially leak memory, because arenas are
| not usually freed until they go out of scope. This can vary
| depending on the language and the implementation, though.
|
| My thought to counteract that is that you could offer a ref-
| counted arena just for this scenario, but I'm not sure exactly
| what that would look like (automatic freeing once refs hit 0?
| offer a purge() function like a GC?). I haven't wrapped my head
| around the ergonomics yet.
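|
| For reference, a very rough sketch of a bump arena in C
| (illustrative only; no growth, nesting, or ref-counting). The
| "leak until the arena dies" drawback is visible here:
| arena_alloc never reuses space.
|
|       #include <stdlib.h>
|       #include <stddef.h>
|
|       struct arena {
|           char  *base;
|           size_t used;
|           size_t cap;
|       };
|
|       static int arena_init(struct arena *a, size_t cap) {
|           a->base = malloc(cap);
|           a->used = 0;
|           a->cap  = cap;
|           return a->base != NULL;
|       }
|
|       static void *arena_alloc(struct arena *a, size_t n) {
|           /* round up so the next allocation stays aligned */
|           size_t align = _Alignof(max_align_t);
|           size_t off = (a->used + align - 1) & ~(align - 1);
|           if (off > a->cap || n > a->cap - off)
|               return NULL;          /* arena exhausted */
|           a->used = off + n;
|           return a->base + off;
|       }
|
|       static void arena_destroy(struct arena *a) {
|           free(a->base);            /* one free for everything */
|           a->base = NULL;
|       }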
| _bohm wrote:
| They're a great fit in many situations but certainly not all.
| Why not teach programmers a variety of allocation strategies
| and how to recognize when each might be a good fit?
| caspper69 wrote:
| I initially read your username as boehm, and I was like wow,
| ok, _this_ is a guy who knows his memory. :)
|
| In what situations would an arena allocator prove problematic
| or non-optimal, aside from the many allocations/deallocations
| scenario?
|
| This is an area I'm very interested in, so any info would be
| appreciated.
| _bohm wrote:
| In general, everything allocated within an arena has its
| lifetime tied to that arena. In lots of situations this is
| a fine or even desirable property (e.g., a single request
| context in a server application), but can be a tough
| restriction to work with in situations where you need fine-
| grained deallocations and possibly want to reuse freed
| space. The lifetime property can also be a pain to work
| with in multithreaded scenarios, where you might have
| multiple threads needing to access data stored in a single
| arena. Another situation that comes to mind is large long-
| lived allocations where you might want to have some manual
| defragmentation in place for performance reasons.
| bluetomcat wrote:
| This isn't proper usage of realloc:
|
|       lines = realloc(lines, (num_lines + 1) * sizeof(char *));
|
| In case it cannot service the reallocation and returns NULL, it
| will overwrite "lines" with NULL, but the memory that "lines"
| referred to is still there and needs to be either freed or used.
|
| The proper way to call it would be:
|
|       tmp = realloc(lines, (num_lines + 1) * sizeof(char *));
|       if (tmp == NULL) {
|           free(lines);
|           lines = NULL;
|           // ... possibly exit the program (without a memory leak)
|       } else {
|           lines = tmp;
|       }
| lionkor wrote:
| Very odd that an article trying to teach memory management
| would miss this; it should be common knowledge to anyone who
| has used realloc, just like checking the return of any
| allocation call.
| bluetomcat wrote:
| They treat an OOM situation as exceptional and immediately
| call abort() in case any allocation function returns NULL.
| The specification of these functions allows you to handle OOM
| situations gracefully.
| PhilipRoman wrote:
| >checking the return of any allocation call
|
| I would say this is pointless on many modern systems unless
| you also disable overcommit, since otherwise any memory
| access can result in a crash, which is impossible to check
| for explicitly.
| aa-jv wrote:
| Actually, no. You've just committed one of the cardinal sins of
| the *alloc()'s, which is: NULL is an acceptable return, so
| errno != 0 is the only way to tell if things have gone awry.
|
| The proper use of realloc is to check errno _always_, because
| it can return NULL in a case which is not considered an error:
| lines is not NULL but the requested size is zero.
|
| So, in your fix, please replace all checking of tmp == NULL
| with checking errno != 0 instead. Only then will you have
| actually fixed the OP's unsafe, incorrect code.
| cozzyd wrote:
| In this case, if (num_lines + 1) * sizeof(char *) is zero, that
| is certainly unintended.
| spiffyk wrote:
| From `malloc(3)`:
|
|       Nonportable behavior
|         The behavior of these functions when the requested
|         size is zero is glibc specific; other implementations
|         may return NULL without setting errno, and portable
|         POSIX programs should tolerate such behavior. See
|         realloc(3p).
|
|         POSIX requires memory allocators to set errno upon
|         failure. However, the C standard does not require
|         this, and applications portable to non-POSIX platforms
|         should not assume this.
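|
| A minimal sketch consistent with that excerpt: judge failure by
| the return value for a nonzero size rather than by errno
| (hypothetical helper name; the size-zero case is deliberately
| left out of scope):
|
|       #include <stdlib.h>
|
|       static void *xrealloc(void *p, size_t size) {
|           void *tmp = realloc(p, size);
|           if (tmp == NULL && size != 0) {
|               free(p);     /* the old block is still valid */
|               return NULL; /* caller decides how to bail out */
|           }
|           return tmp;
|       }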
| anymouse123456 wrote:
| As someone writing C for POSIX _and_ embedded environments,
| this clarification is super helpful.
| pjmlp wrote:
| It is even flagged as such by the Visual Studio analyser.
| returningfory2 wrote:
| I feel like this comment is misleading because it gives the
| impression that the code in the article is wrong or unsafe,
| whereas I think it's actually fine? In the article, in the case
| when `tmp == NULL` (in your notation) the author aborts the
| program. This means there's no memory leak or unsafety. I agree
| that one can do better of course.
| o11c wrote:
| There's another bug, related to performance - this involves a
| quadratic amount of memory copying unless your environment can
| arrange for zero-copy.
| Karellen wrote:
| Surely that's only the case if realloc() actually resizes and
| copies on every call? Which it normally doesn't?
|
| I thought that most implementations of realloc() would often
| "round up" internally to a larger size allocation, maybe
| power-of-two, maybe page size, or something? So if you ask
| for 20 bytes, the internal bookkeeping sets aside 32, or
| 4096, or whatever. And then if you realloc to 24 bytes,
| realloc will just note that the new allocation fits in the
| amount it's reserved for you and return the same buffer again
| with no copying?
| o11c wrote:
| _Some_ implementations _might_ round up to encourage reuse:
|
| * memory-checking allocators never do.
|
| * purely-size-based allocators always do.
|
| * extent-based allocators _try_ to, but this easily fails
| if you're doing two interleaved allocations.
|
| * the mmap fallback does so _only_ if it allows the kernel
| to choose addresses rather than keeping virtual addresses
| together, unless you happen to be on a kernel that allows
| not leaving a hole
|
| Given that there's approximately zero overhead to do it
| right, just do it right (you don't need to store capacity,
| just compute it deterministically from the size).
| ekr____ wrote:
| Author here. Quite so. See footnote 3:
| https://educatedguesswork.org/posts/memory-management-1/#fn3
|
| "If you know you're going to be doing a lot of reallocation
| like this, many people will themselves overallocate, for
| instance by doubling the size of the buffer every time they
| are asked for more space than is available, thus reducing the
| number of times they need to actually reallocate. I've
| avoided this kind of trickery to keep this example simple."
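|
| A rough sketch of that doubling strategy, combined with the
| "compute the capacity from the size" suggestion above
| (hypothetical helpers, not the article's code):
|
|       #include <stdlib.h>
|
|       /* capacity is the next power of two at or above the
|          element count, so it never has to be stored */
|       static size_t cap_for(size_t n) {
|           size_t cap = 8;
|           while (cap < n)
|               cap *= 2;
|           return cap;
|       }
|
|       static char **grow_lines(char **lines, size_t num_lines) {
|           /* only touch realloc when crossing a capacity
|              boundary; a NULL return leaves "lines" untouched */
|           if (num_lines != 0 &&
|               cap_for(num_lines) == cap_for(num_lines + 1))
|               return lines;
|           return realloc(lines,
|                          cap_for(num_lines + 1) * sizeof(char *));
|       }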
| tliltocatl wrote:
| If you exit the program (as in the process) you don't need to
| free anything.
| astrobe_ wrote:
| The program abort()s if the reallocation fails. But indeed, for
| an educational example, it's not good to be too smart.
|
| I believe the test if(!num_lines) is unnecessary, because
| reallocating a NULL pointer is equivalent to malloc(). This is
| also a bit "smart", but I think it is also more correct because
| you don't use the value of one variable (num_lines is 0) to
| infer the value of another (lines is NULL).
|
| To go further, an open-ended structure like:
|
|       struct {
|           unsigned count;
|           char *lines[];
|       };
|
| ... could also be preferable in practice. But actually writing
| good C is not the topic of TFA.
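|
| A small sketch of allocating that flexible-array-member variant
| (illustrative names, not from TFA): one allocation holds both
| the header and the array.
|
|       #include <stdlib.h>
|
|       struct line_list {
|           unsigned count;
|           char *lines[];   /* C99 flexible array member */
|       };
|
|       static struct line_list *line_list_new(unsigned count) {
|           struct line_list *ll =
|               malloc(sizeof *ll + count * sizeof ll->lines[0]);
|           if (ll != NULL)
|               ll->count = count;
|           return ll;
|       }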
| ekr____ wrote:
| Author here.
|
| Thanks for the flag. As you have probably noticed, I just abort
| the program a few lines below on realloc failure, so this
| doesn't leak so much as crash. However, this is a nice example
| of how fiddly C memory management is.
| 1970-01-01 wrote:
| This was a great (re)introduction to the fundamentals. Worthy of
| a bookmark.
| sylware wrote:
| Avoid the C standard lib allocator as much as you can; go
| directly to the mmap system call with your own allocator if you
| know you won't target a CPU without an MMU.
|
| If you write a library, let the user code install its own
| allocator.
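|
| A sketch of what that can look like in a library API (made-up
| names, just one way to do it; Vulkan's allocation callbacks are
| a real-world example of the pattern):
|
|       #include <stdlib.h>
|
|       struct my_allocator {
|           void *(*alloc)(size_t size, void *user_data);
|           void  (*dealloc)(void *ptr, void *user_data);
|           void  *user_data;
|       };
|
|       /* defaults fall back to the C standard library */
|       static void *default_alloc(size_t size, void *ud) {
|           (void)ud;
|           return malloc(size);
|       }
|
|       static void default_dealloc(void *ptr, void *ud) {
|           (void)ud;
|           free(ptr);
|       }
|
|       struct my_lib_context {
|           struct my_allocator alloc;
|           /* ... library state ... */
|       };
|
|       /* NULL means "use the defaults"; every internal
|          allocation in the library goes through ctx->alloc */
|       struct my_lib_context *
|       my_lib_init(const struct my_allocator *a) {
|           struct my_allocator use =
|               { default_alloc, default_dealloc, NULL };
|           if (a != NULL)
|               use = *a;
|           struct my_lib_context *ctx =
|               use.alloc(sizeof *ctx, use.user_data);
|           if (ctx != NULL)
|               ctx->alloc = use;
|           return ctx;
|       }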
| jeffbee wrote:
| "malloc" is a weakly-bound symbol that can be overridden, on
| every system I've used. I don't know if some standard defines
| it to be weak. Anyway the point is that malloc is not
| necessarily a call to the C standard library function. It can
| be anything.
| sylware wrote:
| "weakly-bound symbol" implies your a using a complex runtime
| library/binary format (like ELF).
|
| A portable and clean design for a library is to allow to
| override the internal allocator via the API (often part of
| the init function call).
|
| Look at vulkan3D, which does many things right, including this
| very part. On the other side, you have some parts of the ALSA
| lib API which still require using the C lib free() (may be
| obsolete though).
| jeffbee wrote:
| No, it doesn't mean that, because malloc can be replaced at
| build-time as well. But I agree that interfaces should
| avoid doing their own allocations and should let the caller
| dictate it.
| sylware wrote:
| Look at vulkan3D which does it right(TM).
| pjmlp wrote:
| Sure, if one doesn't care about portable code.
| the-smug-c-one wrote:
| What modern OS doesn't have the equivalent of mmap? Just some
| #ifdefs. I didn't know I'd ever hear "use malloc, because it's
| portable".
|
| sylware is pretty much right anyway. Try to avoid malloc, or
| write smaller allocators on top of malloc.
| pjmlp wrote:
| Why bother with some #ifdefs when the ISO C standard
| library already does the job?
| the-smug-c-one wrote:
| Because you're probably writing a much larger program so
| some ifdefs aren't a big deal :-).
| _bohm wrote:
| This is a fantastic post. I really feel like these concepts
| should be introduced to programmers much earlier on in their
| education and this article does a great job of presenting the
| info in an approachable manner.
| numeromancer wrote:
| Just no.
|
|       address = X
|       length = *X
|       address = address + 1
|       while length > 0 {
|           address = address + 1
|           print *address
|       }
| ekr____ wrote:
| Author here. You're quite right that this isn't the thing you
| would normally do. I'm just trying to help people work through
| the logic of the system with as few dependencies as possible,
| hence this (admittedly yucky) piece of pseudocode which isn't
| really C or Rust or Python or anything...
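|
| For anyone who prefers to see it as real C, a rough equivalent
| of what the pseudocode is getting at (a length-prefixed record,
| assuming the length fits in one byte; made-up helper name):
|
|       #include <stdio.h>
|
|       static void print_record(const unsigned char *x) {
|           unsigned char length = *x;        /* length prefix */
|           const unsigned char *address = x + 1;
|           while (length > 0) {
|               putchar(*address);
|               address++;
|               length--;
|           }
|       }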
| juanbafora wrote:
| Thanks for sharing, these are core concepts for better
| understanding coding.
| aslihana wrote:
| The comics at the beginning hahaha :D
| the_arun wrote:
| > If we just concatenate the values in memory, how do we know
| where one line ends and the next begins? For instance, maybe the
| first two names are "jim" and "bob" or maybe it's one person
| named "jimbob", or even two people named "jimbo" and "b".
|
| Don't we have a newline character? I thought we can read newline
| as `0xA` in Rust?
___________________________________________________________________
(page generated 2025-01-16 23:00 UTC)