[HN Gopher] Show HN: Cachegrand - a fast OSS Key-Value store bui...
       ___________________________________________________________________
        
       Show HN: Cachegrand - a fast OSS Key-Value store built for modern
       hardware
        
       I am the author of the platform, happy to answer any questions you
       might have!  It scales up really nicely thanks to a year of
       research and development on the hashtable implemented in
       cachegrand. On the hardware used for benchmarking, an AMD EPYC
       7502P, it reached up to 5mln GET QPS and 4.5mln SET QPS, and with
       batching up to 60mln GET QPS and up to 26mln SET QPS.
       cachegrand is fast and fully Open Source under a BSD 3-clause
       license, so it can easily be used as a standalone platform or
       incorporated into other ones without any licensing issues. We are
       working to expand the supported Redis functionality and to
       implement tiered storage to cache more data than the available
       memory. Longer term, our goal is to support additional platforms
       (e.g. memcache, kafka, etc.), add webassembly support for
       user-defined functions and server-side events, and of course add
       a network bypass (combining XDP and a lockless FreeBSD tcp/ip
       stack) and a storage bypass.  Although it can easily be used via
       docker, here is a direct link to the latest release:
       https://github.com/danielealbano/cachegrand/releases/tag/v0....
       Currently we are focused on supporting Redis; here is the list of
       commands currently implemented:
       https://github.com/danielealbano/cachegrand/blob/main/docs/a...
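       The "batching" figures above presumably refer to sending several
       Redis commands per network round trip (pipelining); that reading
       is an assumption, as the post does not say how batching was done.
       A minimal sketch of what a pipelined payload looks like on the
       wire, using the standard Redis protocol (RESP) encoding. The
       helper names encode_command and encode_pipeline are hypothetical
       and not part of cachegrand or any client library:

```python
from __future__ import annotations


def encode_command(*args: str) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        # Each argument is a bulk string: $<byte length>\r\n<bytes>\r\n
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


def encode_pipeline(commands: list[tuple[str, ...]]) -> bytes:
    """Concatenate encoded commands so they can go out in a single write."""
    return b"".join(encode_command(*cmd) for cmd in commands)


if __name__ == "__main__":
    payload = encode_pipeline([("SET", "k1", "v1"), ("GET", "k1"), ("GET", "k2")])
    print(payload)
```

       Sending the whole payload with one socket write and then reading
       the three replies back amortizes the per-request network cost,
       which is why batched numbers are typically much higher than
       one-command-per-round-trip numbers.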
        
       Author : daniele_dll
       Score  : 56 points
       Date   : 2022-09-13 12:31 UTC (1 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | nerpderp82 wrote:
       | The benchmarks should be against Aerospike or something similar,
       | not Redis, as comparing against Redis cooks the results relative
       | to the current highest-performing OSS KV store.
        
         | pclmulqdq wrote:
         | Memcached is the usual benchmark target for multi-core Redis-
         | like systems.
        
           | daniele_dll wrote:
           | Very good point, but currently cachegrand doesn't have an
           | interface for memcache, although it would probably take about
           | 1 week to implement it.
           | 
           | In general I am pretty sure cachegrand would easily be faster
           | than memcache, but I will add it to the todo list as it makes
           | sense.
           | 
           | Thanks for pointing it out!
        
         | justinholmes wrote:
         | Trust me as a committer it smokes aerospike and dragonflydb.
         | 
         | Dragonflydb also breaks the Redis license big time.
        
           | eatonphil wrote:
           | Ambiguous use of "it" there. I don't understand what you
           | mean.
        
             | 0x457 wrote:
             | Well, we have 3 things it could mean: Cachegrand, Aerospike
             | and dragonflydb.
             | 
             | Since the comment says "Trust me as a committer it smokes
             | aerospike and dragonflydb." we can rule out aerospike and
             | dragonflydb, which leaves us with "it" being Cachegrand.
             | Ain't that deep.
        
               | eatonphil wrote:
               | Why would it be obvious we should implicitly trust (since
               | all they said is "trust me") a committer to a project
               | that their project is better (in any dimension) than
               | competitors to their project?
               | 
               | That didn't make sense to me so I assumed they must not
               | be a contributor to cachegrand.
        
               | [deleted]
        
         | nvartolomei wrote:
         | Or against https://github.com/dragonflydb/dragonfly.
        
           | justinholmes wrote:
           | We posted benchmarks
           | 
           | https://www.linkedin.com/posts/danielesalvatorealbano_cacheg...
        
         | otabdeveloper4 wrote:
         | Aerospike will happily ignore your requests if it feels like it.
         | Use it only if you don't care about correctness or losing data.
        
           | nvartolomei wrote:
           | "In hundreds of tests of SC mode through network partitions,
           | 3.99.1.5 and higher versions have not shown any sign of
           | nonlinearizable histories, lost increments to counters, or
           | lost updates to sets." - Kyle Kingsbury, Aerospike 3.99.0.3,
           | 12-27-2017
           | 
           | - https://aerospike.com/blog/aerospike-4-strong-consistency-an...
           | 
           | - https://jepsen.io/analyses/aerospike-3-99-0-3
        
             | otabdeveloper4 wrote:
             | Good for them. We got rid of Aerospike because using it in
             | the "right" way to get data in and out correctly is a pain
             | in the ass and makes everything slow.
        
         | 5d8767c68926 wrote:
         | Redis is the product everyone knows? Pick a tool and it is
         | almost a guarantee there is a less well known implementation
         | that is better on dimension X.
        
       | NiekvdMaas wrote:
       | The redis compatibility layer seems to be very early stage, see
       | https://github.com/danielealbano/cachegrand/blob/main/docs/a...
       | 
       | Compare with e.g. dragonflydb:
       | https://github.com/dragonflydb/dragonfly/blob/main/docs/api_...
       | 
       | Interesting to see how this will develop over time.
        
         | justinholmes wrote:
         | Dragonfly also breaks the Redis license and is slower:
         | https://www.linkedin.com/posts/danielesalvatorealbano_cacheg...
        
       | [deleted]
        
       | mnutt wrote:
       | Are there any plans to implement key eviction to use it as an LRU
       | cache?
        
       | [deleted]
        
       | Andys wrote:
       | Any plans to make it distributed? You generally need reliability
       | and fail-over more than the straight-line speed of using all the
       | cores of a machine.
        
       | kaiser81 wrote:
       | I'm following your project Daniele! Can't wait to have a stable
       | release
        
         | daniele_dll wrote:
         | Thanks!
         | 
         | My goal is to make it usable enough from 0.3.0 onwards, where I
         | want to have enough Redis commands supported (hashsets, sets,
         | lists, multiple dbs, hyperloglog, etc.), more memory control and
         | an MVP of the on-disk database.
         | 
         | Implementing the data replication, which will come right after,
         | should be fairly quick thanks to all the logic already built for
         | the on-disk database.
         | 
         | Also, I have recently bought 2 x AMD EPYC 7551, so I will set up
         | better performance regression tracking and a proper benchmark
         | suite to track everything (a contributor is working on
         | leveraging the tests and using them as a base for the two things
         | mentioned above).
        
       | danudey wrote:
       | These are extremely impressive numbers, and while what is
       | accomplished here is extremely impressive, I just find it funny
       | how rarely Redis or memcached is the bottleneck for an
       | application's scalability.
       | 
       | (Obviously it does happen, but statistically speaking it's almost
       | never)
        
         | daniele_dll wrote:
         | (I am the main author of cachegrand) I definitely agree, that's
         | why cachegrand puts the focus on functionalities like an on-
         | disk db, which will also be a timeseries db, active-active
         | replication and support for webassembly.
         | 
         | In terms of "just performance", Redis can easily chew 200k GET
         | RPS on an average low-core-count VM. Even if an application
         | does 10 Redis queries per request on average, it would still
         | take 20k requests per second to saturate it. If we leave a 15%
         | margin for peak traffic / issues / surprises / etc., it would
         | still take an application handling 17.5k RPS, which is a HUGE
         | amount if we think that this would easily require between 50
         | and 100 beefy machines!
         | 
         | I think the biggest limitation nowadays is instead the cost of
         | using "only" memory for the cache and having to use a bunch of
         | different systems to process your data.
         | 
         | Try to imagine what you would be able to do if cachegrand could
         | ingest your stream as a kafka-compatible server, run your
         | webassembly-compiled script and/or your ML/AI models
         | (leveraging webassembly), and then let you push data to other
         | databases / systems and/or access your processed data via the
         | Redis / Memcache / GraphQL interface!
         | 
         | And on top of this, imagine that all these modules (Kafka,
         | Redis, Memcache, GraphQL, etc.) can leverage a network bypass
         | and an nvme bypass to perform super fast I/O.
         | 
         | It's a lot of stuff, but that's my long term goal / vision.
         | 
         | Of course, to achieve all of this you need a blazing fast and
         | very flexible base! We are currently focusing on the Redis
         | support because it needs many different bits and pieces and
         | would allow us to have people start using cachegrand, which is
         | key to understanding whether the grand plan makes sense :)
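
         The back-of-the-envelope estimate in the comment above can be
         written out as a small calculation. This is a minimal sketch
         using only the inputs stated there (200k GET RPS, 10 queries
         per request, 15% margin); the function name max_app_rps is
         hypothetical. With exactly these inputs the result is 17,000
         requests per second; the 17.5k quoted in the comment would
         correspond to a 12.5% margin.

```python
def max_app_rps(cache_qps: int, queries_per_request: int, margin_percent: int) -> int:
    """Application requests/s one cache instance can sustain with headroom."""
    # Reserve part of the cache throughput for peaks and surprises.
    usable_qps = cache_qps * (100 - margin_percent) // 100
    # Each application request consumes several cache queries.
    return usable_qps // queries_per_request


if __name__ == "__main__":
    print(max_app_rps(200_000, 10, 0))   # 20000, no headroom
    print(max_app_rps(200_000, 10, 15))  # 17000, with a 15% margin
```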
        
           | morelisp wrote:
           | I roughly agree that get throughput is not generally a
           | bottleneck, but
           | 
           | > 17.5k RPS, which is a HUGE amount if we think that this
           | would easily require between 50 and 100 beefy machines!
           | 
           | Maybe we have different definitions of beefy, but in terms of
           | HTTP, we serve 2-4x this on less than half that.
        
         | miohtama wrote:
         | Here is what antirez (Salvatore Sanfilippo), the author of
         | Redis, said when DragonflyDB was posting similar benchmarks a
         | few months ago:
         | 
         | https://news.ycombinator.com/item?id=31563641
         | 
         | TL;DR Unlikely to be apples to apples comparison and you can
         | get much more performance out of Redis easily.
        
       | gpderetta wrote:
       | The uncommon name and edit-distance of 1 from cachegrind tripped
       | me up for a few moments.
        
         | daniele_dll wrote:
         | ahahahahha, that's definitely true, quite similar to Valgrind
         | tooling :D
        
       | eatonphil wrote:
       | It's a key-value store that's also a time series database? I'd
       | like to understand better (both why and how) but the linked time-
       | series documentation is mostly TODOs.
        
         | Octoth0rpe wrote:
         | to be fair, it's very clearly marked as `work in progress` in
         | their readme.
        
           | eatonphil wrote:
           | Sure, I'm not criticizing it. I'd just like to learn more. :)
        
             | daniele_dll wrote:
             | I would love to sit and write more documentation, but I am
             | doing this work at night and/or over the weekends, so
             | sorry for not having it. I promise I will slowly start to
             | put together something more than a "TODO", even if it's
             | just a general intro.
             | 
             | The timeseriesdb is "half" a consequence of having a
             | Write-Ahead-Log that is split into chunks and chained. I
             | am saying "half" because the other half is the future
             | addition of a secondary index to make it possible and
             | easier to query the internal db properly.
             | 
             | I know that we are talking mainly about Redis right now,
             | but that's just the tip of the iceberg. My long-term
             | vision is to build a much more complete and flexible
             | platform which can easily handle streams (e.g. via a Kafka
             | interface) and/or allow you to run more advanced data
             | processing via WASM (e.g. I want to make it possible to
             | calculate rolling window averages in a time-aware
             | fashion :)).
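
             As an illustration of the rolling window averages mentioned
             above, here is a minimal sketch of an average over a
             time-based window rather than over a fixed number of
             samples. It is not cachegrand code; the class name
             RollingAverage and its interface are hypothetical, and it
             assumes samples arrive with non-decreasing timestamps.

```python
from collections import deque


class RollingAverage:
    """Average of the samples seen within the last `window_seconds`."""

    def __init__(self, window_seconds: float) -> None:
        self.window = window_seconds
        self.samples = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def add(self, timestamp: float, value: float) -> float:
        """Record a sample and return the average over the current window."""
        self.samples.append((timestamp, value))
        self.total += value
        # Evict samples that fell out of the window ending at `timestamp`.
        while self.samples[0][0] <= timestamp - self.window:
            _, old_value = self.samples.popleft()
            self.total -= old_value
        return self.total / len(self.samples)


if __name__ == "__main__":
    avg = RollingAverage(window_seconds=10)
    print(avg.add(0, 10.0))   # 10.0
    print(avg.add(5, 20.0))   # 15.0
    print(avg.add(12, 30.0))  # 25.0, the sample at t=0 has expired
```

             Keeping a running total makes each update cheap: every
             sample is appended once and evicted at most once.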
        
               | jonhohle wrote:
               | So the TSDB is being recorded, it's just not queryable,
               | or it's not binary compatible with whatever the future
               | on-disk format will look like?
               | 
               | Is the time series replicated and merged between nodes or
               | does each have its own log for the keys it manages?
        
       ___________________________________________________________________
       (page generated 2022-09-14 23:00 UTC)