[HN Gopher] The Evolution of Caching Libraries in Go
       ___________________________________________________________________
        
       The Evolution of Caching Libraries in Go
        
       Author : maypok86
       Score  : 45 points
       Date   : 2025-06-29 17:07 UTC (3 days ago)
        
 (HTM) web link (maypok86.github.io)
 (TXT) w3m dump (maypok86.github.io)
        
       | regecks wrote:
       | We're looking for a distributed Go cache.
       | 
       | We don't want to round trip to a network endpoint in the ideal
       | path, but we run multiple instances of our monolith and we want a
       | shared cache tier for efficiency.
       | 
       | Any architecture/library recommendations?
        
         | awenix wrote:
         | groupcache[https://github.com/golang/groupcache] has been
         | around for some time now.
        
         | maypok86 wrote:
         | To be honest, I'm not sure I can recommend anything specific
         | here.
         | 
          | 1. How much data do you have and how many entries? If you have
          | lots of data with very small records, you might need an off-
          | heap cache. The only ready-made implementation I know of is
          | Olric [1].
         | 
          | 2. If you can use an on-heap cache, you might want to look at
          | groupcache [2] (see the usage sketch below, after the links).
          | It's not "blazingly fast", but it's battle-tested. Potential
          | drawbacks include LRU eviction and the lack of generics
          | (meaning extra GC pressure from using `interface{}` for
          | keys/values). It's also barely maintained, though you can find
          | active forks on GitHub.
         | 
         | 3. You could implement your own solution, though I doubt you'd
         | want to go that route. Architecturally, segcache [3] looks
         | interesting.
         | 
         | [1]: https://github.com/olric-data/olric
         | 
         | [2]: https://github.com/golang/groupcache
         | 
         | [3]:
         | https://www.usenix.org/conference/nsdi21/presentation/yang-j...
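          | 
          | A minimal usage sketch for option 2 (groupcache). The
          | addresses, group name and cache size are illustrative, and the
          | getter signature differs slightly between the original repo
          | and some forks, so treat this as a sketch rather than
          | copy-paste code:
          | 
          |   package main
          | 
          |   import (
          |       "context"
          |       "log"
          |       "net/http"
          | 
          |       "github.com/golang/groupcache"
          |   )
          | 
          |   func main() {
          |       // Addresses are placeholders; in practice they come from
          |       // service discovery or config.
          |       self := "http://10.0.0.1:8080"
          |       pool := groupcache.NewHTTPPool(self) // registers on http.DefaultServeMux
          |       pool.Set("http://10.0.0.1:8080", "http://10.0.0.2:8080")
          | 
          |       users := groupcache.NewGroup("users", 64<<20, groupcache.GetterFunc(
          |           func(ctx context.Context, key string, dest groupcache.Sink) error {
          |               // Runs only on the instance that owns the key, and only on a miss.
          |               return dest.SetString("loaded-" + key) // load from the database here
          |           }))
          | 
          |       go func() { log.Fatal(http.ListenAndServe(":8080", nil)) }()
          | 
          |       var val string
          |       if err := users.Get(context.Background(), "user:42",
          |           groupcache.StringSink(&val)); err != nil {
          |           log.Fatal(err)
          |       }
          |       log.Println(val)
          |   }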
        
         | paulddraper wrote:
         | LRU in memory backed by shared Elasticache.
        
         | stackskipton wrote:
          | Since you mention no network endpoint, I assume it's on a
          | single server. If so, have you considered SQLite? Assuming
          | your cache is not massive, the file is likely to end up in the
          | filesystem cache, so most reads will come from memory, and
          | writes on a modern SSD will be fine as well.
          | 
          | It's an easy-to-understand system built on a well battle-tested
          | library, and getting rid of the cache is easy: delete the file.
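          | 
          | A minimal sketch of the idea; the driver, table layout and TTL
          | handling here are illustrative assumptions rather than any
          | particular library's API:
          | 
          |   package cache
          | 
          |   import (
          |       "database/sql"
          |       "time"
          | 
          |       _ "github.com/mattn/go-sqlite3" // or a cgo-free driver like modernc.org/sqlite
          |   )
          | 
          |   type SQLiteCache struct{ db *sql.DB }
          | 
          |   func Open(path string) (*SQLiteCache, error) {
          |       db, err := sql.Open("sqlite3", path)
          |       if err != nil {
          |           return nil, err
          |       }
          |       _, err = db.Exec(`CREATE TABLE IF NOT EXISTS cache (
          |           k TEXT PRIMARY KEY,
          |           v BLOB,
          |           expires_at INTEGER)`)
          |       return &SQLiteCache{db: db}, err
          |   }
          | 
          |   func (c *SQLiteCache) Set(key string, val []byte, ttl time.Duration) error {
          |       _, err := c.db.Exec(
          |           `INSERT INTO cache (k, v, expires_at) VALUES (?, ?, ?)
          |            ON CONFLICT(k) DO UPDATE SET v = excluded.v, expires_at = excluded.expires_at`,
          |           key, val, time.Now().Add(ttl).Unix())
          |       return err
          |   }
          | 
          |   func (c *SQLiteCache) Get(key string) ([]byte, bool, error) {
          |       var v []byte
          |       err := c.db.QueryRow(
          |           `SELECT v FROM cache WHERE k = ? AND expires_at > ?`,
          |           key, time.Now().Unix()).Scan(&v)
          |       if err == sql.ErrNoRows {
          |           return nil, false, nil
          |       }
          |       return v, err == nil, err
          |   }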
         | 
         | EDIT: I will say for most use cases, the database cache is
         | probably plenty. Don't add power until you really need it.
        
         | nchmy wrote:
          | Perhaps run a NATS server colocated on each monolith server
          | (or even embedded in your app, if it is written in golang,
          | meaning that all communication stays in-process) and use NATS
          | KV?
          | 
          | Or if you just want it all to be in-memory, perhaps use some
          | other non-distributed caching library and do the replication
          | via NATS? I'm sure there are lots of gotchas with something
          | like that, but Marmot is an example of doing SQLite replication
          | via NATS JetStream.
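          | 
          | Roughly, with the nats.go client (the bucket name, TTL and
          | connection URL are assumptions, and the KV API has changed
          | between nats.go versions):
          | 
          |   package main
          | 
          |   import (
          |       "fmt"
          |       "log"
          |       "time"
          | 
          |       "github.com/nats-io/nats.go"
          |   )
          | 
          |   func main() {
          |       nc, err := nats.Connect(nats.DefaultURL) // colocated or embedded server
          |       if err != nil {
          |           log.Fatal(err)
          |       }
          |       defer nc.Close()
          | 
          |       js, err := nc.JetStream()
          |       if err != nil {
          |           log.Fatal(err)
          |       }
          | 
          |       // Replicated bucket acting as the shared cache tier.
          |       kv, err := js.CreateKeyValue(&nats.KeyValueConfig{
          |           Bucket: "cache",
          |           TTL:    time.Minute, // bucket-wide expiry
          |       })
          |       if err != nil {
          |           log.Fatal(err)
          |       }
          | 
          |       if _, err := kv.Put("user.42", []byte(`{"name":"alice"}`)); err != nil {
          |           log.Fatal(err)
          |       }
          |       entry, err := kv.Get("user.42")
          |       if err != nil {
          |           log.Fatal(err)
          |       }
          |       fmt.Println(string(entry.Value()))
          |   }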
        
       | yumenoandy wrote:
       | on S3-FIFO being problematic, have you looked into TinyUFO? (part
       | of cloudflare/pingora)
        
         | maypok86 wrote:
         | No, I haven't looked into it, but the combination of "lock-
         | free" and "S3-FIFO" raises some red flags for me :)
         | 
         | I don't quite understand the specific rationale for replacing
         | segmented LRU with S3-FIFO. If I remember correctly, even the
         | original authors stated it doesn't provide significant benefits
         | [1].
         | 
          | Regarding TinyUFO: are you using lock-free queues? Has the
          | algorithmic complexity changed? (In the base version, S3-FIFO
          | eviction is O(n) in the worst case.) How easy is it to add new
          | features? With lock-free queues, even implementing a decent
          | expiration policy becomes a major challenge.
         | 
         | [1]: https://github.com/Yiling-J/theine/issues/21
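          | 
          | For context on the O(n) point: in plain S3-FIFO, evicting from
          | the main queue reinserts every recently touched entry, so a
          | single eviction can scan a large part of the queue in the worst
          | case. A simplified sketch (not TinyUFO's actual code):
          | 
          |   type entry struct {
          |       key  string
          |       freq uint8 // small counter, bumped on hit
          |   }
          | 
          |   // evictFromMain pops entries from the head of the main FIFO;
          |   // anything with freq > 0 gets another pass at the tail.
          |   func evictFromMain(main []*entry) (victim *entry, rest []*entry) {
          |       for len(main) > 0 {
          |           head := main[0]
          |           main = main[1:]
          |           if head.freq > 0 {
          |               head.freq--               // decrement and give it another chance
          |               main = append(main, head) // reinsert at the tail, keep scanning
          |               continue
          |           }
          |           return head, main // first untouched entry is the victim
          |       }
          |       return nil, main
          |   }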
        
       | jasonthorsness wrote:
        | Much of the complexity of caching comes from trying to work well
        | across all workloads. If the workload is known, I think in many
        | cases a simpler, specialized cache can outperform some of these
        | libraries.
        
         | maypok86 wrote:
          | What exactly do you mean by a "more specialized simple cache"?
          | Just a map, a mutex, and LRU/LFU/ARC as the eviction policy?
         | 
         | 1. Even using sync.RWMutex and specialized policies won't
         | really help you outperform a well-implemented BP-Wrapper in
         | terms of latency/throughput.
         | 
         | 2. I've never seen cases where W-TinyLFU loses more than 2-3%
         | hit rate compared to simpler eviction policies. But most simple
         | policies are vulnerable to attacks and can drop your hit rate
         | by dozens of percentage points under workload variations. Even
         | ignoring adversarial workloads, you'd still need to guess which
         | specific policy gives you those extra few percentage points. I
         | question the very premise of this approach.
         | 
         | 3. When it comes to loading and refreshing, writing a correct
         | implementation is non-trivial. After implementing it, I'm not
         | sure the cache could still be called "simple". And at the very
         | least, refreshing can reduce end-to-end latency by orders of
         | magnitude.
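          | 
          | As an illustration of just the loading part (using
          | golang.org/x/sync/singleflight to collapse concurrent misses;
          | expiry, eviction and refreshing are still missing, which is
          | exactly the point):
          | 
          |   package cache
          | 
          |   import (
          |       "sync"
          | 
          |       "golang.org/x/sync/singleflight"
          |   )
          | 
          |   type Loading struct {
          |       mu    sync.RWMutex
          |       data  map[string]string
          |       group singleflight.Group
          |       load  func(key string) (string, error) // e.g. a database query
          |   }
          | 
          |   func NewLoading(load func(string) (string, error)) *Loading {
          |       return &Loading{data: make(map[string]string), load: load}
          |   }
          | 
          |   func (c *Loading) Get(key string) (string, error) {
          |       c.mu.RLock()
          |       v, ok := c.data[key]
          |       c.mu.RUnlock()
          |       if ok {
          |           return v, nil
          |       }
          |       // One goroutine per key runs the loader; the rest wait for
          |       // its result instead of hammering the backend.
          |       res, err, _ := c.group.Do(key, func() (interface{}, error) {
          |           val, err := c.load(key)
          |           if err != nil {
          |               return "", err
          |           }
          |           c.mu.Lock()
          |           c.data[key] = val
          |           c.mu.Unlock()
          |           return val, nil
          |       })
          |       if err != nil {
          |           return "", err
          |       }
          |       return res.(string), nil
          |   }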
        
           | jasonthorsness wrote:
            | You're correct on all points. I should not have used the
            | word "outperform" and should have said a simple cache could
            | be sufficient. If, for example, you know you have more than
            | enough memory to cache all the items you receive in 60
            | seconds, and items strictly expire after 60 seconds, then a
            | sync.RWMutex with optional lock striping is going to work
            | just fine. You don't need to reach for one of these libraries
            | in that case (I have seen developers do that, and at that
            | point the risk becomes misconfiguration or misuse of a
            | complex library).
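            | 
            | For that narrow case, a sketch like this (the shard count and
            | hash are arbitrary choices, and expired items are only
            | dropped on overwrite, so it leans on the "more than enough
            | memory" assumption) is about all you need:
            | 
            |   package cache
            | 
            |   import (
            |       "hash/fnv"
            |       "sync"
            |       "time"
            |   )
            | 
            |   const shards = 16 // arbitrary; tune for your concurrency
            | 
            |   type item struct {
            |       value     string
            |       expiresAt time.Time
            |   }
            | 
            |   type shard struct {
            |       mu sync.RWMutex
            |       m  map[string]item
            |   }
            | 
            |   type TTLCache struct {
            |       shards [shards]shard
            |       ttl    time.Duration
            |   }
            | 
            |   func New(ttl time.Duration) *TTLCache {
            |       c := &TTLCache{ttl: ttl}
            |       for i := range c.shards {
            |           c.shards[i].m = make(map[string]item)
            |       }
            |       return c
            |   }
            | 
            |   func (c *TTLCache) shardFor(key string) *shard {
            |       h := fnv.New32a()
            |       h.Write([]byte(key))
            |       return &c.shards[h.Sum32()%shards]
            |   }
            | 
            |   func (c *TTLCache) Set(key, value string) {
            |       s := c.shardFor(key)
            |       s.mu.Lock()
            |       s.m[key] = item{value: value, expiresAt: time.Now().Add(c.ttl)}
            |       s.mu.Unlock()
            |   }
            | 
            |   func (c *TTLCache) Get(key string) (string, bool) {
            |       s := c.shardFor(key)
            |       s.mu.RLock()
            |       it, ok := s.m[key]
            |       s.mu.RUnlock()
            |       if !ok || time.Now().After(it.expiresAt) {
            |           return "", false
            |       }
            |       return it.value, true
            |   }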
        
             | maypok86 wrote:
             | Yeah, I basically agree with that.
        
       ___________________________________________________________________
       (page generated 2025-07-02 23:00 UTC)