[HN Gopher] KeyDB CEO Interview: Getting into YC with a Fork of ...
       ___________________________________________________________________
        
       KeyDB CEO Interview: Getting into YC with a Fork of Redis
        
       Author : dmytton
       Score  : 98 points
       Date   : 2021-04-27 15:01 UTC (7 hours ago)
        
 (HTM) web link (console.dev)
 (TXT) w3m dump (console.dev)
        
       | medium_burrito wrote:
       | The real problem is the Redis Cluster protocol, which pushes all
       | the smarts onto the client. This means each client implementation
       | behaves differently and fails differently.
       | 
       | I would encourage Redis users to use Envoy Proxy as the Redis
       | client (ie use vanilla client, and use Envoy as the cluster
       | client). You get all the HA and usefulness of Redis Cluster, but
       | way less of the headache. Also, strongly encourage people to
       | check out Elasticache, which is really good.
        
         | stingraycharles wrote:
         | But the Envoy proxy only supports a subset of the commands. For
         | example, it doesn't support RPOPLPUSH, precisely because the
         | keys could be stored on multiple nodes, and they don't support
         | that part of the clustering protocol.
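         | 
         | As a rough sketch of why multi-key commands are awkward here
         | (this follows the Redis Cluster key-hashing rule, not Envoy's
         | code; the helper names hash_tag, key_slot and same_slot are
         | made up for illustration): each key maps to one of 16384 hash
         | slots via CRC16, and a two-key command can only be served by
         | one node when both keys land in the same slot.

```python
import binascii

HASH_SLOTS = 16384  # number of hash slots in Redis Cluster


def hash_tag(key: bytes) -> bytes:
    """Return the part of the key that gets hashed (honours {...} hash tags)."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag counts; "{}" means the whole key is hashed.
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: bytes) -> int:
    # crc_hqx with an initial value of 0 is the CRC16/XMODEM variant
    # named in the Redis Cluster specification.
    return binascii.crc_hqx(hash_tag(key), 0) % HASH_SLOTS


def same_slot(*keys: bytes) -> bool:
    """True if all keys map to one slot, so a single node owns them all."""
    return len({key_slot(k) for k in keys}) == 1
```

         | Keys that share a {hash tag}, such as {user1}:inbox and
         | {user1}:outbox, hash to the same slot, which is the usual way
         | to keep a two-key command on one node.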
        
         | echelon wrote:
         | 100% this.
         | 
         | And to your point about client libraries: for any shared
         | behavior that needs to be local to the app (eg. session
         | handling, permissions, etc.), it's more appropriate to spin up
         | a sidecar that you can talk to over sockets than to try to
         | build client libraries in each and every language. Your client
         | libraries _will_ differ in behavior and it is an incredible
         | pain keeping them all up to date, patching every app, etc.
         | 
         | Envoy for S2S and traffic mesh + sidecars for shared behavior
         | is better than building client smarts.
        
       | Thaxll wrote:
       | It's multithreaded, but what about running single-threaded Redis
       | instances in a cluster? Like running 8 Redis instances on a single
       | 8-core CPU. I don't really understand the reason to run Redis on
       | multiple cores, since you can run multiple Redis instances on a
       | single CPU with clustering, which will have better performance
       | than running KeyDB with the same number of cores.
        
         | jdsully wrote:
          | Take the scenario where you have a 9 node redis cluster. With
          | KeyDB you could shrink that down to 3 nodes with 3 threads
          | while getting the same throughput. This reduces the maintenance
          | burden while also reducing latency. There's overhead to
          | connecting to a different cluster server each time you access a
          | separate shard.
         | 
         | Even better you might be able to avoid having a cluster at all.
         | For many that's the biggest win with KeyDB.
        
           | mayank wrote:
            | > There's overhead to connecting to a different cluster
            | server each time you access a separate shard.
           | 
           | While I agree with your point, this concern is easily
           | addressed with connection pools.
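            | 
            | A minimal sketch of the idea, assuming a hypothetical
            | connect_to(addr) factory that returns an object with a
            | close() method (NodePool and ClusterPools are illustrative
            | names, not a real client API): keep a few idle connections
            | per node so the connect cost is paid once rather than on
            | every shard access.

```python
import queue
from contextlib import contextmanager


class NodePool:
    """Keeps idle connections to one cluster node so they can be reused."""

    def __init__(self, connect, max_idle=4):
        self._connect = connect                 # hypothetical factory: () -> connection
        self._idle = queue.LifoQueue(maxsize=max_idle)

    @contextmanager
    def connection(self):
        try:
            conn = self._idle.get_nowait()      # reuse an idle connection
        except queue.Empty:
            conn = self._connect()              # connect cost is paid only here
        try:
            yield conn
        finally:
            # Simplification: the connection is returned even after an error.
            try:
                self._idle.put_nowait(conn)     # hand it back for the next caller
            except queue.Full:
                conn.close()                    # pool is full, drop the extra


class ClusterPools:
    """One pool per node address, created on first use (not thread-safe)."""

    def __init__(self, connect_to):
        self._connect_to = connect_to           # hypothetical: (addr) -> connection
        self._pools = {}

    def connection(self, addr):
        if addr not in self._pools:
            self._pools[addr] = NodePool(lambda: self._connect_to(addr))
        return self._pools[addr].connection()
```

            | Usage would look like "with pools.connection(addr) as conn:",
            | with one pool per shard address. This hides the connection
            | setup cost, though not the extra network hop itself.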
        
       | throwaway823882 wrote:
       | Redis is one of those things where you see a car with an
       | aftermarket door on it, but the door is actually a house door,
       | cut with a sawzall into the shape of a car door.
        
       | unmole wrote:
       | Probably off-topic but I'm curious about the HPE and Huawei logos
       | on KeyDB's website [0]. What is their involvement in the
       | project?
       | 
       | 0: https://keydb.dev/
        
         | jdsully wrote:
         | They're users. We were pretty bad at asking for logos and just
         | started down that road. Honestly if I had one regret it's not
         | doing more user outreach.
        
       | welder wrote:
       | There's a post about another Redis-clone on the front page today
       | called SSDB. Seems to be a common occurrence:
       | 
       | https://news.ycombinator.com/item?id=19213261
        
         | jdsully wrote:
         | There are a few, some predate us like Threadis. Most only offer
         | a subset of the Redis feature set where we really wanted to be
         | a drop in replacement.
        
       | klohto wrote:
       | Any experience with KeyDB in production? Is the Flash support in
       | KeyDB still only in "Pro" version? Cannot find anything in the
       | docs.
        
         | jdsully wrote:
         | Yea FLASH is only in the Enterprise version. At a rough level
         | the open source is the best cache we can make while
         | pro/enterprise is the best full featured database. It's not
         | fully there yet but the end goal is you can get caching and
         | persistence in one place.
        
           | Scarbutt wrote:
           | What persistence options does KeyDB have that Redis doesn't?
        
             | jdsully wrote:
             | The FLASH feature in Enterprise can send all writes to disk
             | before replying +OK, however notably we don't fsync (AOF is
             | still recommended if you need that). In these deployments
             | it's not necessary to use RDB or AOF except for
             | replication.
             | 
             | And of course we have all the existing options already in
             | Redis.
        
         | thegreatpeter wrote:
         | Draftbit uses KeyDB in production. It's fast
        
         | secondcoming wrote:
         | Someone on my team looked at it, and for our workload (lots of
         | MGETs per query) it wasn't that much better than redis. We may
         | still go with it though, our GCP Memorystore instance is pretty
         | much at 100% CPU constantly.
         | 
         | We have still yet to evaluate Redis6.
        
           | jdsully wrote:
           | Long running commands are still a problem. We have
           | infrastructure in the Enterprise version to alleviate this
           | but so far we've enabled it only for KEYS and SCAN. MGET does
           | come up frequently so it's on our radar.
        
       | tammerk wrote:
       | I can't find it in the docs, but does multi-threading affect
       | consistency somehow? Is there a chance that I wouldn't read what
       | I just wrote? I'm talking about a single node, not about
       | replication, cluster etc.
       | 
       | If it provides the same consistency, is the threading like:
       | 
       | sock_read();
       | 
       | lock(datastructures);
       | 
       | set x=3;
       | 
       | unlock(datastructures);
       | 
       | sock_write();
        
         | jdsully wrote:
         | Yea that was roughly KeyDB's original design. There's some
         | nuance in the locking like ensuring it's both fair and fast so
         | P99 latency doesn't get out of control.
         | 
         | In the Enterprise codebase we can take snapshots which lets us
         | do reads without the lock but it's a bit of work to enable for
         | commands so it only applies to KEYS and SCAN at the moment.
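         | 
         | A rough sketch of that flow, not KeyDB's actual code
         | (LockedStore and its handle method are illustrative names,
         | and Python's threading.Lock makes no fairness guarantee, so
         | the fair-and-fast locking mentioned above is not modelled):
         | socket work runs in parallel, while every command touches
         | the keyspace under one lock. A write is applied before its
         | reply is produced, so a later read on the same node sees it.

```python
import threading


class LockedStore:
    """Many threads do I/O in parallel; one lock guards the keyspace."""

    def __init__(self):
        self._lock = threading.Lock()   # plain mutex; fairness is not modelled
        self._data = {}

    def handle(self, request):
        # Parsing (the sock_read() step) happens outside the lock.
        op, key, *rest = request
        with self._lock:                # lock(datastructures)
            if op == "SET":
                self._data[key] = rest[0]
                reply = "OK"
            elif op == "GET":
                reply = self._data.get(key)
            else:
                reply = "ERR unknown command"
        # The reply is returned (the sock_write() step) after the unlock.
        return reply
```

         | Because the SET has already been applied when "OK" is
         | returned, a client that then issues a GET for the same key
         | reads its own write in this model.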
        
       | atonse wrote:
       | This is totally going to be a Hacker News Bingo type of question.
       | 
       | But has anyone tried to do a clean room implementation of Redis
       | using Rust, but speaks the same wire protocol? You would get the
       | zero-cost multi-threading, memory safety, etc, and it would be a
       | drop in replacement.
        
         | injinj wrote:
         | I've done a C clean room version and I will say that the
         | networking part is as important as the multi-threading the data
         | structures: https://github.com/raitechnology/raids/.
         | 
         | If you go to the landing page of the above, scroll down to the
         | bottom, there is a TCP bypass solution graphed, using
         | Solarflare Open Onload and it is capable of running several
         | times as fast as the Linux Kernel TCP. I didn't test Redis with
         | Open Onload, but I'm pretty sure you'll get similar results
         | since TCP is a major performance bottleneck in Redis as well.
        
         | ddorian43 wrote:
         | > You would get the zero-cost multi-threading,
         | 
         | You kinda have to look at how things really work underneath
         | before you can apply buzzwords to a database.
        
         | eloff wrote:
         | > zero-cost multi-threading
         | 
         | I think you mean zero cost abstractions. Which aren't usually
         | zero cost, but just zero additional cost over doing it
         | yourself.
         | 
         | There's no such thing as zero cost multi threading. Just
         | tradeoffs. Rust actually doesn't help with performance here (it
         | gets in the way often) but it definitely does help with
         | correctness - which is truly hard with multi threaded programs.
        
         | yannikyeo wrote:
         | Tokio async runtime for Rust has a tutorial in its user guide
         | https://tokio.rs/tokio/tutorial on writing a mini-redis
         | (https://github.com/tokio-rs/mini-redis).
        
         | nemothekid wrote:
         | It's been a long time since I've looked at KeyDB, but IIRC
         | KeyDB is just Redis plus a spinlock. It's actually still very
         | performant. There are other "toy" reimplementations of Redis in
         | Rust that take the same approach and aren't even as performant
         | as single threaded Redis.
         | 
         | The next approach you could take is using something like
         | Glommio and take a thread-per-core design to Redis. I think
         | that approach has a lot of potential, but the design becomes
         | more complex (you now need something like distributed
         | transactions for "cross-core" Redis transactions and
         | multi-gets).
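         | 
         | To illustrate where that complexity comes from, here is a
         | toy model (ShardedStore is a made-up name, plain dicts stand
         | in for per-core shards, and crc32 routing is an arbitrary
         | choice): single-key commands touch exactly one shard, while
         | a multi-get has to fan out to several shards and stitch the
         | results back together.

```python
import zlib


class ShardedStore:
    """Each shard stands in for one core that owns a slice of the keyspace."""

    def __init__(self, shards=4):
        self._shards = [{} for _ in range(shards)]

    def _owner(self, key: str) -> int:
        # Deterministic routing; crc32 is an arbitrary choice for this sketch.
        return zlib.crc32(key.encode()) % len(self._shards)

    def set(self, key, value):
        self._shards[self._owner(key)][key] = value   # touches one shard only

    def get(self, key):
        return self._shards[self._owner(key)].get(key)

    def mget(self, keys):
        # Group keys by owning shard, query each shard once, restore the order.
        by_shard = {}
        for key in keys:
            by_shard.setdefault(self._owner(key), []).append(key)
        found = {}
        for shard_id, shard_keys in by_shard.items():
            shard = self._shards[shard_id]
            found.update({k: shard.get(k) for k in shard_keys})
        return [found[k] for k in keys]

    def shards_touched(self, keys):
        """How many shards a multi-key command over these keys would involve."""
        return len({self._owner(k) for k in keys})
```

         | In a real thread-per-core server each shard would be owned
         | by a different thread, so the per-shard reads in mget would
         | not form one atomic snapshot without extra coordination.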
        
           | ddorian43 wrote:
           | RonDB (NDB Cluster) takes a different approach to threading
           | and claims it's faster than scylladb-style sharding
           | http://mikaelronstrom.blogspot.com/2021/03/designing-
           | thread-...
        
       | tyingq wrote:
       | I'm curious about the name. Putting "DB" in the name sort of
       | suggests it might support persistence, more data than fits in
       | memory, write-through, etc. Is that the case? Or is the "DB" some
       | nod that clustering means that "in-memory" doesn't have to be
       | ephemeral?
       | 
       | Or in short, where is KeyDB headed, longer term?
        
         | slaymaker1907 wrote:
         | I don't think more data than fits in memory is a requirement.
         | An important innovation in databases has been the realization
         | that there are important performance benefits from assuming all
         | data can fit in memory even if you still provide persistence.
         | 
         | Even persistence isn't really required. A pure analytics view
         | of another database is still a database by itself, but it
         | doesn't need to actually persist anything. It seems like
         | querying is more important to the concept of a database rather
         | than actual storage.
        
           | tyingq wrote:
           | I wasn't suggesting they had to, or should happen. Just
           | trying to understand the future plans since the "DB" term is
           | there.
        
         | jdsully wrote:
         | We have our FLASH feature which gives us a baseline persistence
         | story, you don't need RDBs or AOFs anymore. However there's a
         | lot of work left to do in this area.
         | 
         | The long term goal of KeyDB is to let you balance your dataset
         | across memory and disk in one database. In the future I think
         | caches will just be a feature of a more full featured database
         | and that's where we're heading with KeyDB.
        
       | solosoyokaze wrote:
       | _" My thought process was simply that there is a big need here
       | and Redis had for some reason decided not to serve it. If they
       | won't then we will."_
       | 
       | I feel like this and the general tone of the article are
       | needlessly antagonistic toward Redis. KeyDB is building their
       | entire business off of it after all.
       | 
       | There may very well be a need for multi-threaded Redis, but Redis
       | as it stands today is an amazing project and there's something to
       | keeping it simple along the lines of the project philosophy.
        
         | eloff wrote:
         | I don't read that as antagonistic. I think you're just being
         | too sensitive, or reading a tone into the text that isn't
         | actually there.
        
           | solosoyokaze wrote:
           | It just doesn't sound very friendly to the open source
           | community.
           | 
           | "If they won't then we will." sounds harsh imho.
        
             | hitekker wrote:
             | You should read about the concept of "forking":
             | https://en.wikipedia.org/wiki/Fork_(software_development)#Fo...
             | 
             | It's fundamental to the health and growth of the open
             | source community.
        
               | solosoyokaze wrote:
               | I am aware. Community management is different from what
               | is legally possible. Were a project owner to start acting
               | in a self-interested or malicious manner, then I think
               | proudly and aggressively forking a project is a great
               | idea. That's not Redis. Like I said, it may be a good
               | idea to have a multi-threaded Redis, but Redis users tend
               | to love Redis. I would probably lean into that goodwill
               | instead of against it.
        
               | eloff wrote:
               | I don't see forking as being aggressive. It's the natural
               | thing to do if you want to take the project in a
               | different direction to its stewards. Often lessons are
               | learned from that process and the learnings integrated
               | back into the main project.
        
               | hitekker wrote:
               | Forking is not violence you perpetrate against people you
               | dislike.
               | 
               | It's a freedom; a person can disagree and go their own
               | way instead of being hounded by zealots demanding
               | compliance in the guise of friendliness.
        
               | solosoyokaze wrote:
               | I'm simply offering the advice that the tone seems
               | unfriendly and is likely to be taken as such by Redis
               | users (aka potential customers).
               | 
               | > _a person can disagree and go their own way instead of
               | being hounded by zealots demanding compliance in the
               | guise of friendliness_
               | 
               | This is exactly the kind of aggressive and nebulously
               | political tone that would not help a project gain
               | adoption. Why be hostile?
        
               | hitekker wrote:
               | I was going to ask about where you see hostility. But
               | judging by your comment history, it looks like you're
               | just trolling and starting flame wars :\
        
         | jdsully wrote:
         | I've learned to appreciate Salvatore's stance on simplicity the
         | longer we've gone on with KeyDB. But when I first made KeyDB
         | two years ago I was really perplexed at the decisions he made
         | with respect to threading.
         | 
         | It's not my intention to be antagonistic. I've had a lot of
         | projects over the years that went nowhere and a part of me is
         | sad that the one with the most traction is a fork.
        
       | troquerre wrote:
       | Are you concerned about AWS starting a competitor to keydb
       | cloud/have you considered modifying your license to prevent that
       | from happening? I'd imagine that'd be important in ensuring the
       | long term sustainability of keydb development
        
         | jdsully wrote:
         | They have Elasticache which is always a concern. In terms of
         | open source projects my read is they really don't want to run
         | their own and are much more comfortable operating open source
         | as a service. They "forked" Elasticsearch 2 years ago and
         | basically did nothing with it until it was re-licensed a few
         | months ago. Now that they are investing more into it I'm
         | interested to see how they handle it long term.
        
           | hilbertseries wrote:
           | This seems somewhat orthogonal to the question asked. They
           | have Kinesis which competes with Kafka, SQS which competes
           | with a variety of message queues, Redshift, DynamoDB, etc.,
           | and it's my understanding that Athena was originally a fork
           | of Presto. So, they could definitely move more heavily into
           | the caching space if they see there's demand.
        
             | jdsully wrote:
             | Elasticache is already that competitor. They are scary to
             | be sure but they have some weaknesses. The main one is
             | branding: Elasticache is a white label Redis, not its own
             | thing, to the point where their own docs use the two
             | interchangeably.
        
       | Weryj wrote:
       | Would this mean that in a single-core or low-end environment, it
       | would be better to use Redis? I'd assume cutting out multi-core
       | complexities would be beneficial.
        
       | Rafuino wrote:
       | How does KeyDB compare to memKeyDB?
       | 
       | https://github.com/memKeyDB/memKeyDB
        
       | chadd wrote:
       | We (me and some folks at my old consultancy) wrote an Erlang
       | version of Redis (https://github.com/cbd/edis) for some of the
       | same reasons - multithreading changes some of the scaling
       | semantics in interesting ways. It was mostly for fun but ended up
       | in some real projects as a simple Redis protocol implementation
       | front-end where the backend could be replaced with whatever the
       | implementor wants.
        
       ___________________________________________________________________
       (page generated 2021-04-27 23:01 UTC)