[HN Gopher] Mark-Scavenge: Waiting for Trash to Take Itself Out
___________________________________________________________________
Mark-Scavenge: Waiting for Trash to Take Itself Out
Author : vips7L
Score : 154 points
Date : 2024-11-24 17:40 UTC (1 days ago)
(HTM) web link (inside.java)
(TXT) w3m dump (inside.java)
| A4ET8a8uTh0 wrote:
| What a time to be alive, I read it the opening fully expecting to
| see an open source automated trashcan that takes itself to the
| curb each Monday. I was disappointed to find out it is about an
| actual garbage collection algorithm.
| garblegarble wrote:
| I know this is completely off-topic, but you might be
| interested in this[1] YouTuber who did something not far off
| that...
|
| 1: https://www.youtube.com/watch?v=VhYEOG9LOIk
| kibwen wrote:
| _> Modern garbage collection assumes that the weak generational
| hypothesis holds and that most objects die young_
|
| Aside, I'm curious how first-class support for value types and
| Go-style stack allocation via escape analysis changes the value
| proposition of the generational hypothesis. If we hypothesize
| that short-lived objects are local to a single function scope
| (and thus eligible for stack allocation either explicitly via a
| value type or heuristically via escape analysis) then it might
| completely upend the generational hypothesis and make it so that
| relatively more long-lived objects are getting heap-allocated.
| Surely someone's done some studies on this?
| DamonHD wrote:
| AFAIK not nearly enough stuff gets caught by escape analysis -
| and thus stack allocated - to make a difference.
| kibwen wrote:
| I'm open to believing that this is true, but some real
| numbers would be nice. Surely it wouldn't be a hugely
| invasive change to fork the Go compiler, change the stack
| allocation check to `return false`, and then measure the
| overhead of the garbage collector on real Go programs with
| stack allocation both enabled and disabled.
| DamonHD wrote:
| I may have been answering past you - I am thinking of Java
| running on the JDK here. And indeed I may be out of date
| also.
| cempaka wrote:
| Yeah in Java land specifically I think the question would
| become, "does the generational hypothesis still hold up
| once we have Valhalla and a much larger share of short-
| lived objects are stack allocated as value types?" but of
| course it may be years until the ecosystem reaches that
| point, if ever.
| neonsunset wrote:
| As shown by C#, it will generally continue to be relevant
| since both primarily use JIT compilation with ability to
| modify code at runtime which can violate inter-procedural
| escape analysis assumptions leading to heap allocations
| of the objects that are passed down to the callees (there
| is work scheduled for .NET 10 to address this, at least
| for AOT compilation where interproc analysis conclusions
| will be impossible to violate).
|
| You can craft a workload which violates the hypothesis by
| only allocating objects that live for a long time but
| both JVM and .NET GC implementations are still much
| faster designs than Go's GC which prioritizes small
| memory footprint and consistent latency on _low_
| allocation traffic (though as of .NET 9, SRV GC puts much
| more priority on this, making similar tradeoffs).
| cempaka wrote:
| > _ability to modify code at runtime_
|
| Would Java's moves towards "integrity by default" mean
| that this could be ruled out in more cases?
| neonsunset wrote:
| Reading through the JEP again it does not seem to be
| related - it is about deprecating unsafe APIs that the
| executed code itself uses. OpenJDK also has "partial
| escape analysis" where the object that only conditionally
| escapes can still be placed on the stack/scalar replaced.
|
| I'm not privy to the exact APIs that OpenJDK exposes but
| in .NET the main limitation around escape analysis that
| spans multiple methods is the fact that CoreCLR has re-
| JIT API which allows to perform a multitude of actions
| like attaching a profiler or a debugger to a live
| application and forcing the runtime to deoptimize a
| particular method for debugging, or modifying the
| implementation and re-JITting the result. Debug codegen
| is very different especially around GC liveness tracking
| and escape analysis that builds on top of it - it means
| that even debug code would have to uphold stack-allocated
| nature of such object in some way, complicating the
| design significantly. In addition to that, debuggers and
| other instrumentation may observe object references that
| would have otherwise not escaped local scope.
|
| This creates an unfortunate situation where the above
| almost never happens in production, but ignoring it and
| "just stack-allocating anyway" would lead to disastrous
| breakage for all sorts of existing instrumentation.
| Because Go does not have to deal with this constraint, it
| can perform interproc escape analysis without risk -
| whether a pointer escapes or not can be statically
| proven. For NativeAOT, .NET could approach this problem
| in the same way, but paraphrasing compiler team: "We
| would like to avoid optimizations only available for one
| of the target types be it JIT or AOT, and only supporting
| AOT would not benefit the majority of the .NET users".
|
| There is, however, recognition that more comprehensive
| interproc analysis could be very beneficial, including
| the EA which is why it is planned to work on it in .NET
| 10:
|
| - https://github.com/dotnet/runtime/issues/108931 IPA
| framework
|
| - https://github.com/dotnet/runtime/issues/104936 Stack
| allocation enhancements
| cempaka wrote:
| Yeah there's a JEP around deprecating access to
| sun.misc.Unsafe, but that's part of a larger effort
| including Jigsaw to push the Java ecosystem in the
| direction of modular builds, where more invariants are
| assumed to hold (e.g. " 'final' fields are _actually_
| final ") unless explicitly opted out for each module. I
| would assume the lack of such guarantees in the status
| quo wreaks a lot of havoc with EA.
|
| Profiling and debugging would be separate considerations
| -- I'm really not sure what limitations those impose on
| the JVM JIT.
| pjmlp wrote:
| Integrity by default is what the OpenJDK folks are
| pushing for so that any API that can break runtime
| assumptions, has to be explicitly allowed, so that they
| can actually make use of performance optimizations that
| would otherwise be too risky if anyone at any time could
| violate them.
| DamonHD wrote:
| Ahem --- JDK => JVM!
| DarkNova6 wrote:
| The reason escape analysis is not "good enough" is why we
| have project Valhalla trying to bring Value Types into the
| JVM.
|
| I don't have numbers at hand, but I remember the JDK Expert
| Group talking about this extensively in the past and why
| they deferred bringing Value Types for such a long time.
| They hoped complex enough EA can get rid of indirections
| and heap allocations but it just wasn't powerful enough,
| even with all advances throughout the years.
| fweimer wrote:
| Historically, Hotspot's escape analysis only resulted in
| avoided heap allocations (via scalar replacement) if all uses
| were inlined. I don't think this has changed.
| eikenberry wrote:
| Is there a language that makes this explicit, allocates the
| variables on the stack via compiler enforced notation?
| fanf2 wrote:
| C, C++, Rust, Zig, ...
| eikenberry wrote:
| What's Zig's notation for it?
| masklinn wrote:
| Not doing anything, same as the other 3.
|
| Heap allocation is what requires requesting memory from
| an allocator.
| eikenberry wrote:
| Right.. been using GC languages to long. Everything
| allocated to the stack unless it is specifically
| allocated to the heap. I was stuck thinking about what
| the keywords were. But I'm still curious.. are there any
| GC languages that have a way to specify if something
| should go on the stack or the heap?
| masklinn wrote:
| > are there any GC languages that have a way to specify
| if something should go on the stack or the heap?
|
| I think only in the sense that some GCd language have
| value (stack) types as a separate hierarchy from heap
| types? E.g. structs in C# or Swift are stack-allocated
| (and value-identity, and copied) whereas classes are
| heap-allocated.
|
| Adding that for java is one of the goals of Project
| Valhalla I believe.
| neonsunset wrote:
| C# (.NET in general) :)
|
| Well, variables cannot be forced to stack specifically.
| They are placed in the "local scope". And that would
| usually be either stack or CPU registers - thinking in
| stack only is a somewhat flawed mental model.
|
| Both C# and F# complicate this by supporting closures,
| iterator and async methods which capture variables placing
| them in a state machine box / display class instead which
| would be located on the heap, unless stack-allocated by
| escape analysis (unlikely because these usually cross
| method boundaries).
|
| However, .NET has `ref structs` (or, in F#, [<Struct;
| IsByRefLike>] types) which are subject to lifetime analysis
| and can never be placed on the heap directly or otherwise.
| masklinn wrote:
| Go has much more significant stack allocation capabilities,
| most notably it has no problem allocating entire structs on
| the stack so doesn't need scalar replacement, which falls
| over if you breathe on it
| (https://pkolaczk.github.io/overhead-of-optional/).
|
| According to https://github.com/golang/go/discussions/70257#d
| iscussioncom...
|
| > the weak generational hypothesis does not hold up well with
| respect to heap-allocated memory in many real-world Go
| programs (think 60-70% young object mortality vs. the 95%
| typically expected)
| DarkNova6 wrote:
| But stack allocated objects are not part of the heap and
| therefore not even part of Garbage Collection? And afaik stack
| allocation is already done for objects which don't escape a
| method.
| masklinn wrote:
| Yes, but that's the point: objects which don't escape are
| pretty much all young objects. So by this process the stack
| captures a significant fraction of the young generation, that
| young generation never reaches the heap and this is never
| under consideration by the GC.
|
| Essentially the stack is a form of younggen. It is not as
| complete (as there are things which must be heap allocated)
| but because it _is_ , it reduces the benefits of a
| generational GC... without having much impact on its costs
| and complexity.
|
| Depending on work load, that competition can be sufficient to
| make a generational GC net negative.
| DarkNova6 wrote:
| Thanks for the answer. But is this actual behaviour for the
| GCs of the JDK? I was certain that at the very least
| Hotspot makes use of stack allocation as much as possible.
|
| But perhaps the JDK GCs don't care so much about the stack
| because that is already dealt by the JVM a step prior? In
| any case, there will likely still be young objects
| allocated in the heap and this new algorithm might prove
| useful.
|
| But you can tell I am far from an expert here.
| masklinn wrote:
| > Thanks for the answer. But is this actual behaviour for
| the GCs of the JDK? I was certain that at the very least
| Hotspot makes use of stack allocation as much as
| possible.
|
| Not really, java has some escape analysis but it's very
| limited in its ability to stack allocate as it can't put
| entire structures (objects) on the stack, it only works
| if the compiler manages to scalar-replace the object
| (https://shipilev.net/jvm/anatomy-quarks/18-scalar-
| replacemen...) and that has somewhat restricted
| applicability (https://pkolaczk.github.io/overhead-of-
| optional/). The behaviour I'm talking about is mostly
| that of Go, as it is much more capable of stack
| allocating, and specifically has a non-generational GC
| because in testing they found generational GCs had a very
| variable impact depending on workload (rather than a
| universally or near universally positive impact).
| jerven wrote:
| There was a microsoft prototype for more stack allocation
| in OpenJDK (https://archive.fosdem.org/2020/schedule/even
| t/reducing_gc_t...). I recall that being put on hold
| because of how it would interact with project Loom fast
| stack copying. But I don't know the current status.
|
| GO has a non moving GC and I understand, that the cost of
| introducing safe moving GC is considered high. If one has
| a moving GC which the serious java one's are read/write
| barriers are already required, especially if they are
| concurrent like ZGC, C4 or Shenadoah. ZGc, C4 and
| Shenadoah all started out as non generational GC
| implementations, which gained them later, because in most
| cases they do increase performance/reduce overhead.
|
| Valhalla makes objects denser, and reduces overhead of
| identity which is great. Reducing the difference in
| memory layout between java objects and nested go structs.
|
| Go with arena's reduce the GC de-allocation costs.
| Something that the ZGC team is looking at in relation to
| loom/virtual threads. (but I can't find the reference for
| that right now)
| pfdietz wrote:
| I wonder if architectural support could be added to reduce
| the cost of recording modification information.
| masklinn wrote:
| > it might completely upend the generational hypothesis
|
| The generational hypothesis is about object lifetime, and that
| doesn't change.
|
| It does change the relevance of the generational hypothesis to
| garbage collection.
|
| > Surely someone's done some studies on this?
|
| The go team has, and that's why go doesn't have a generational
| GC. The complexity of adding generational support, especially
| in a mutation-based language (so needing memory barriers and
| the like) was found not to benefit when a significant fraction
| of the newborn objects don't even reach the youngen. See
| https://github.com/golang/go/discussions/70257#discussioncom...
| from the current discussion of adding opt-in ad-hoc support for
| memory regions.
| uluyol wrote:
| You might be interested in this talk:
| https://go.dev/blog/ismmkeynote
| ithkuil wrote:
| If you flip the argument on its head you can frame it as: since
| most objects die young it's very likely they will stay on the
| stack and thus it makes sense to invest in an allocation-site
| optimizer that will put the object on the heap only if static
| escape analysis says it may escape the lexical scope.
| pron wrote:
| Stack allocation doesn't make a big difference. To see this,
| consider the working set of a server application vs. the number
| of threads. Assuming a stack size circa 1MB, thousands of
| threads are needed to account for a significant portion of the
| working set. Go and Java's user-mode threads certainly make it
| more interesting, but not enough to dominate heap objects
| (arena allocation, which is particularly nice in Zig) is a
| different matter, and can have a bigger impact).
|
| The main reason for Java's Project Valhalla is not stack
| allocation, as it has too little impact, but arrays-of-structs
| on the heap. These matter not so much for their memory-
| management aspect, but the rate of cache-misses.
| fweimer wrote:
| How does this relate to Shenandoah's region selection logic?
| Doesn't it have similar behavior?
| m463 wrote:
| This is fascinating.
|
| The idea of mark/scavenge is pretty cool for page-based
| allocation and deallocation.
|
| Also real-time or maybe latency-intolerant systems, which
| basically can't block for garbage collection. You could probably
| keep scavenging and stay ahead of things without impacting things
| in a black and white way.
| AtlasBarfed wrote:
| This may be naive, but gc has now been around since the java
| days at mass scale (that is I'm calling java the first mass
| market GC computing system, just go with it for now) for about
| 30 years.
|
| The big lie of GC is that you never need to worry about memory
| management. The insidious lie is that you have no control when
| you do need it
|
| Rust went with a new extreme: no GC, but the compiler does a
| memory verification and a LOT of obscure complex notation and
| keywords
|
| So ... I'm wondering if there is a happy medium, where you can
| have keywords that strongly hint to a GC how to manage memory
| (eg arena/permanent, thread group bound, local/young, etc) that
| will be easier than rust.
|
| Maybe we could have multiple heaps defined (a big problem with
| GC heaps is the bigger they are the harder it is for the bigO
| of the management algorithms to scale, so nice they aren't
| linear bigO).
|
| I'm thinking of something like Cassandra where tables and even
| subsections of tables don't need their bloom filters, commit
| logs, atc all intermingled in a single heap, to the great
| detriment of heao performance.
|
| Instead, provide annotations to partition the heaps or run
| multiple heaps
|
| Of course this is a massive complexity rabbit hole/slippery
| slide.
| jddj wrote:
| Zig's allocators run along this line of thought
| codetrotter wrote:
| I was wondering if the .java TLD belonged to the Java island, or
| to Oracle or to someone else.
|
| https://icannwiki.org/.java
|
| It's Oracle.
___________________________________________________________________
(page generated 2024-11-25 23:01 UTC)