[HN Gopher] JVM statistics cause garbage collection pauses (2015)
       ___________________________________________________________________
        
       JVM statistics cause garbage collection pauses (2015)
        
       Author : tosh
       Score  : 65 points
       Date   : 2024-09-19 14:05 UTC (8 hours ago)
        
 (HTM) web link (www.evanjones.ca)
 (TXT) w3m dump (www.evanjones.ca)
        
       | ta988 wrote:
       | Is it still the case?
        
         | lbalazscs wrote:
         | In 2015 there was no ZGC. Today ZGC (an optional garbage
         | collector optimized for latency) guarantees that there will be
         | no GC pauses longer than a millisecond.
        
           | survivedurcode wrote:
           | I would check your answer. These are pauses due to time spent
           | writing to diagnostic outputs. These are not traditional
           | collection pauses. This affects both jstat and writes of GC
           | logs (i.e. GC log writes will block the app in just the same
           | way).
        
             | pjmlp wrote:
             | Which is why for anything serious one should be using
             | Flight Recorder instead.
        
           | hawk_ wrote:
           | ZGC doesn't remove safepoint requests on threads which is the
           | root cause. "Guarantees" here are with very heavy quotes.
        
           | kanzenryu2 wrote:
           | Sadly in many cases no; it's not magic. This nirvana is
           | restricted to cases where there is CPU bandwidth available
           | (e.g. some cores idle) and plenty of free RAM. When either
           | CPU or RAM are less plentiful... hello pauses my old friend.
        
             | sunshowers wrote:
             | This is why memory-bound services generally use languages
             | without mandatory GC. Tail latency is a killer.
             | 
             | Rust's memory management does have some issues in practice
             | (large synchronous drops) but they're relatively minor and
             | easily addressed compared to mandatory GC.
        
           | esaym wrote:
           | These modern garbage collectors are not simply free though. I
           | got bored last year and went on a deep dive with GC params
           | for Minecraft. For my needs I ended up with:
           | -XX:+UseParallelGC -XX:MaxGCPauseMillis=300 -Xmx2G -Xms768M
           | 
           | When flying around in spectator mode, you'd see 3 to 4
           | processes using 100%. Changing to more modern collectors just
           | added more load to the system. ZGC was the worst, with 16+
           | processes all using 100% cpu. With the ParallelGC, yes you'll
           | get the occasional pause but at least my laptop is not
           | burning hot fire.
        
             | namibj wrote:
             | You'll need more spare heap for ZGC.
        
               | ackfoobar wrote:
               | And using generational ZGC will probably lower CPU usage
               | a lot.
        
             | plandis wrote:
             | Yes no GC is free (well perhaps Epsilon comes close :)
             | 
             | It's a low pause GC so latencies, particularly tail
             | latencies, can be more predictable and bounded. The
             | tradeoff you make is that it uses more CPU time and memory
             | in order to operate.
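
To make the tail-latency point concrete, here is a minimal sketch with made-up numbers: 10,000 requests that normally take 1 ms, 20 of which land on a 300 ms pause. The workload and the `percentile` helper are illustrative assumptions, not measurements from any JVM.

```python
import math


def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list; p is in (0, 100]."""
    rank = math.ceil(p / 100 * len(sorted_values))
    # Clamp so that rounding can never push the index out of range.
    rank = min(max(rank, 1), len(sorted_values))
    return sorted_values[rank - 1]


# Hypothetical workload: 9,980 fast requests and 20 that hit a 300 ms pause.
latencies_ms = sorted([1.0] * 9_980 + [300.0] * 20)

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)
p999_ms = percentile(latencies_ms, 99.9)
```

With these numbers the mean, median and 99th percentile all stay close to 1 ms, while the 99.9th percentile is the full 300 ms pause. That gap is what a low-pause collector tries to close, at the price of extra CPU and memory.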
        
           | hinkley wrote:
           | The cost of statistics gathering on a GC implementation that
           | avoids ineffective GC activity is less affected by the cost
           | of telemetry (no news is good news), but it is still
           | affected.
        
         | ackfoobar wrote:
         | Probably yes.
         | 
         | https://bugs.openjdk.org/browse/JDK-8076103
         | 
         | Closed with "Won't Fix".
        
           | flykespice wrote:
           | ...With no reasoning at all?
        
             | ackfoobar wrote:
             | A bit more context in the mailing list:
             | 
             | > It's a non-issue with a pure ram-based file system. Or
             | tmpfs with no swap.
             | 
             | https://mail.openjdk.org/pipermail/hotspot-runtime-dev/2015-...
        
       | smrtinsert wrote:
       | Is this account a submission bot of some sort?
        
         | throwaway04324 wrote:
         | The account seems to be connected to a real person, but it has
         | a high number of submissions (over 350 in the past 30 days).
        
       | geodel wrote:
       | Also, spending a lot causes higher credit card bills.
        
       | cogman10 wrote:
       | > in /tmp
       | 
       | Why is `/tmp` on disk and not a tmpfs mount?
        
         | sltkr wrote:
         | There is no law that says /tmp must be on tmpfs, and
         | historically this wasn't done, because tmpfs is limited in size
         | to some fraction of the kernel's memory, while /tmp may be used
         | to store much larger files.
         | 
         | For example, GNU sort can sort arbitrarily large input files,
         | which is implemented by splitting the input into sorted chunks
         | that are written to a temporary directory, /tmp by default. But
         | this is based on the assumption that /tmp can store
         | significantly larger files than fit in memory, otherwise the
         | point is moot. So using tmpfs makes /tmp useless for this type
         | of operation.
         | 
         | In the end, it's a trade-off between performance and disk
         | space. I also prefer to mount /tmp on tmpfs for performance
         | reasons, but you should not assume that this is the case on all
         | systems.
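
The chunk-and-merge approach described above can be sketched as follows. This is not GNU sort's actual implementation, only a minimal illustration of the technique; `external_sort`, `chunk_size` and `tmp_dir` are names chosen for this example.

```python
import heapq
import os
import tempfile


def external_sort(lines, chunk_size, tmp_dir=None):
    """Yield newline-terminated strings in sorted order using bounded memory.

    At most `chunk_size` lines are buffered at once; each sorted chunk is
    spilled to a file under `tmp_dir` (the system default if None).
    """
    chunk_paths = []

    def spill(chunk):
        # Sort one chunk in memory and write it to its own temporary file.
        chunk.sort()
        fd, path = tempfile.mkstemp(dir=tmp_dir, suffix=".chunk")
        with os.fdopen(fd, "w") as f:
            f.writelines(chunk)
        chunk_paths.append(path)

    chunk = []
    for line in lines:
        chunk.append(line)
        if len(chunk) >= chunk_size:
            spill(chunk)
            chunk = []
    if chunk:
        spill(chunk)

    # Merge phase: heapq.merge pulls lines lazily from each chunk file, so
    # memory use grows with the number of chunks, not with the input size.
    files = [open(p) for p in chunk_paths]
    try:
        yield from heapq.merge(*files)
    finally:
        for f in files:
            f.close()
        for p in chunk_paths:
            os.remove(p)
```

During the run, the temporary directory holds roughly a full copy of the input, which is why a RAM-sized tmpfs defeats the purpose for inputs larger than memory. Pointing `tmp_dir` at a disk-backed directory avoids that, much like `sort -T` does.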
        
           | aidenn0 wrote:
           | While I run /tmp on disk, I should point out that tmpfs is
           | not limited to the size of RAM; contents of tmpfs can be
           | swapped out just like any other memory allocation.
        
         | aidenn0 wrote:
         | Why would I want it on tmpfs? Only advantage I see is slightly
         | improved boot times (/tmp is typically cleared on boot, which
         | is obviously not necessary for tmpfs).
        
           | hinkley wrote:
           | Slightly simpler handling for docker containers. Particularly
           | if you run multiple copies of the same image on one box
           | (blue-green deploys, process-per-cpu programming languages,
           | etc)
        
       | sltkr wrote:
       | > The pauses occur even [..] if you call mlock
       | 
       | I wonder how this is even possible. The only scenario I can think
       | of involves a page fault on the page table itself (i.e., the page
       | is locked into memory, but a page fault occurs during virtual-to-
       | physical address translation). Does anyone know the real reason?
        
         | survivedurcode wrote:
         | Probably because mapped pages, even if they are locked into
         | memory, are not allowed to stay dirty forever. Does this help?
         | https://stackoverflow.com/a/11024388 (In contrast, if you
         | mlocked but never wrote to the pages, you probably would not
         | encounter read pauses)
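
A rough way to probe this is to time plain stores into a file-backed shared mapping, which is how the JVM updates its statistics file. The sketch below assumes a Unix-like system; `make_backing_file` and `max_mapped_write_latency` are hypothetical helper names. On an idle machine the worst case will usually be tiny. The stalls discussed here are expected only when the backing disk is busy and the kernel is writing the dirty page back.

```python
import mmap
import os
import tempfile
import time


def make_backing_file(directory):
    """Create a one-page file to map; returns its path."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.ftruncate(fd, mmap.PAGESIZE)
    os.close(fd)
    return path


def max_mapped_write_latency(path, iterations=100_000):
    """Repeatedly update a counter in a shared file mapping and return the
    slowest single update, in seconds."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), mmap.PAGESIZE) as m:
            worst = 0.0
            for i in range(iterations):
                start = time.perf_counter()
                # A plain memory store: no write() system call is involved,
                # yet it dirties a page the kernel must eventually flush.
                m[0:8] = i.to_bytes(8, "little")
                worst = max(worst, time.perf_counter() - start)
            return worst
```

Running it once against a file on a disk-backed /tmp and once against a tmpfs directory, while something else generates heavy disk I/O, is one way to compare the two setups.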
        
       | pjmlp wrote:
       | For proper statistics use Visual VM or Flight Recorder, if using
       | an OpenJDK derived JVM implementation.
       | 
       | Also note that not all JVMs are made alike, and there are plenty
       | to choose from.
        
         | hashmash wrote:
         | When using the `-XX:+PerfDisableSharedMem` workaround, VisualVM
         | cannot attach to the running process anymore.
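
For context: HotSpot normally publishes its counters through a memory-mapped file at /tmp/hsperfdata_<user>/<pid>, and tools such as jps and jstat find running JVMs by looking for those files. With `-XX:+PerfDisableSharedMem` the file is never created. A minimal sketch of that discovery step, using a hypothetical helper name `jvms_with_perf_data`:

```python
import os
import tempfile


def jvms_with_perf_data(tmp_root):
    """Return {pid: path} for every hsperfdata file found under tmp_root."""
    found = {}
    for entry in os.listdir(tmp_root):
        if not entry.startswith("hsperfdata_"):
            continue
        user_dir = os.path.join(tmp_root, entry)
        if not os.path.isdir(user_dir):
            continue
        try:
            names = os.listdir(user_dir)
        except PermissionError:
            # Another user's directory may not be readable; skip it.
            continue
        for name in names:
            # HotSpot names each file after the JVM's process ID.
            if name.isdigit():
                found[int(name)] = os.path.join(user_dir, name)
    return found
```

Calling `jvms_with_perf_data("/tmp")` on a host where every JVM runs with the workaround should return an empty dict, which is the same blind spot the monitoring tools run into.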
        
       | jakewins wrote:
       | Man I remember being bit by this in migrating to AWS - this had
       | like snuck through on fast on-prem disks, but as soon as that
       | /tmp was on RDS oh boy, it was a doozy.
        
       | hinkley wrote:
       | Stuff like this is why back when I still wrote Java we only
       | wanted to turn on JVM telemetry on production boxes if they were
       | canaries. Slower you can work around by deploying more copies.
       | But jitter is not something you can do much about.
        
       | opentokix wrote:
       | Using ebpf, perf and flamegraphs would let him find this in a
       | couple of hours. That was not available for him in 2015 tho.
        
       ___________________________________________________________________
       (page generated 2024-09-19 23:01 UTC)