[HN Gopher] How we tracked down a Go 1.24 memory regression
___________________________________________________________________
How we tracked down a Go 1.24 memory regression
Author : gandem
Score : 99 points
Date : 2025-07-17 20:02 UTC (2 days ago)
(HTM) web link (www.datadoghq.com)
(TXT) w3m dump (www.datadoghq.com)
| nitinreddy88 wrote:
| I am more interested to learn about Swiss tables than bug fix :)
|
| What are the best places to learn modern implementations of
| traditional data structures. Many of these utilise SIMD for last
| mile usage of modern hardware
| skavi wrote:
| could read one of the implementations. there's the original
| abseil implementation and rust's in the hashbrown crate.
| probably many more.
| gandem wrote:
| OP here, I wrote another blog post that explains how Swiss
| Tables work, see https://news.ycombinator.com/item?id=44597562
| woadwarrior01 wrote:
| I'd recommend reading the Swiss table design notes[1] in the
| Abseil documentation. You might also like F14 maps[2] from
| Folly.
|
| [1]: https://abseil.io/about/design/swisstables
|
| [2]: https://engineering.fb.com/2019/04/25/developer-tools/f14/
| SkiFire13 wrote:
| In addition to this comment's siblings resources, I also
| suggest this really good Cppcon presentation on Swisstable
| https://www.youtube.com/watch?v=ncHmEUmJZf4
| neuroelectron wrote:
| Great write up. It almost made me miss my old DevOps job.
| dh2022 wrote:
| I am somewhat surprised to see the bucket memory layout which is:
| [k1/v1],[k2,v2],[k3/v3] etc. where k1,k2,k3 are keys and v1,v2,v3
| are values. The CPU cache will not contain more than one [k,v]
| pair - because the CPU cache line is about 64 bytes and the size
| of [k,v] pair was about 56 bytes.
|
| So iterating through the bucket looking for a key will require
| each iteration to fetch the next [k,v] pair from RAM.
|
| Compare this with the following layout: k1,k2,k3,... followed by
| v1,v2,v3. Looking up the first key in the bucket will end up
| loading at least one more key in the CPU cache-line. And this
| should make iterations faster.
|
| The downside of this approach is if the lookup almost all the
| time results in the first key in the bucket. Then
| [k1,v1],[k2,v2],k3,v3] packing is better-because the value is
| also in the CPU cache line .
|
| I am wondering if people on this forum knowvmore about this
| trade-off. Thanks!!
___________________________________________________________________
(page generated 2025-07-19 23:00 UTC)