[HN Gopher] Dragonflydb - A modern replacement for Redis and Mem...
___________________________________________________________________
Dragonflydb - A modern replacement for Redis and Memcached
Author : avielb
Score : 489 points
Date : 2022-05-30 16:18 UTC (6 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| maxpert wrote:
| How does this compare to other multithreaded redis protocol
| compatibles? KeyDB is one key player https://docs.keydb.dev/
| jitl wrote:
| The benchmark graph in README.md shows KeyDB. This thing is
| faster according to the graphs.
| https://raw.githubusercontent.com/dragonflydb/dragonfly/main...
| judofyr wrote:
| KeyDB does network I/O and parsing on separate threads, but
| still has one big lock around the data structures themselves.
| DragonflyDB does full processing of the transaction on separate
| threads.
|
| [1] https://docs.keydb.dev/blog/2019/10/07/blog-post/
| bushbaba wrote:
| Also curious on how it compares to Aerospike.
| romange wrote:
      | I do not even know how to do that comparison. Redis and DF
      | share the same protocol and the same API, so it's easy to
      | compare them with memtier_benchmark.
| thesuperbigfrog wrote:
| So, per the license only non-production use is allowed until June
| 2027 when this release changes to an Apache license?
|
| https://github.com/dragonflydb/dragonfly/blob/main/LICENSE.m...
|
| I understand wanting to protect your work from someone else
| turning into a service, but I will need to get our org's legal
| team to review it first.
| OPoncz wrote:
    | The license is the standard BSL 1.1, which means you can use
    | it in production for your own workloads as long as you do not
    | provide DF as a managed service.
| [deleted]
| [deleted]
| throwaway888abc wrote:
  | Impressive. Will give it a try for internal benchmarks.
|
| Homepage: https://dragonflydb.io/
|
| Benchmark:
| https://raw.githubusercontent.com/dragonflydb/dragonfly/main...
| metadat wrote:
| C++? I was expecting Rust!
|
| I am spoiled.
| chimen wrote:
    | Rust folks, like vegans, put the "Rust" in the title straight
    | away.
| camdenlock wrote:
| There's a reason there's significant overlap between those
| two groups... ;)
| abhi12_ayalur wrote:
| Would this be able to eventually handle JSON/deep data similar to
| RedisJSON? For my team's use case, this is crucial and what we're
| using currently.
| romange wrote:
    | We will be able to. It's just a matter of time and priorities.
| Xeoncross wrote:
| I want to take a minute to appreciate and recognize the
| https://github.com/dragonflydb/dragonfly#background section.
|
| A lot of projects say "faster" without giving some hint of the
| things they did to achieve this. "A novel fork-less snapshotting
| algorithm", "each thread would manage its own slice of dictionary
| data", and "core hashtable structure" are all important
| information that other projects often leave out.
| staticassertion wrote:
| Nothing gets me excited for a project like a bunch of cited
| papers.
| tiffanyh wrote:
    | Sounds similar to DragonflyBSD's unique "virtual kernels"
    | (lockless SMP with per-core hash tables):
| https://www.dragonflybsd.org/
| romange wrote:
| Thank you. And if you are curious to learn more - we would love
| to share! And we will.
| [deleted]
| staticassertion wrote:
| Reminds me a bit of Scylladb with the focus on 'shard per core'.
| I've considered using Scylla as a cache as well, might try this
| out instead.
| OPoncz wrote:
    | DF also has a novel eviction algorithm that combines LRU and
    | LFU, which could be great for caching use cases.
| staticassertion wrote:
| Indeed, I have a use case that would benefit from that
| (theoretically). I'll have to dig into the papers.
| romange wrote:
    | ScyllaDB and Seastar were my inspiration when I thought about
    | the architecture for Dragonfly.
| dragosbulugean wrote:
| would love to hear why c++
| romange wrote:
    | If I were to choose another language it would be Rust. Why did
    | I not choose Rust?
    |
    | 1. I speak C++ fluently, and learning Rust would take me
    | years. 2. There is a food chain of libraries that I am
    | intimately familiar with in C++ and am not familiar with in
    | Rust. Take Rust's Tokio, for example. It is the de facto
    | standard for how to build I/O backends. However, if you
    | benchmark Tokio's mini-redis with memtier_benchmark you will
    | see it has much lower throughput than helio and much higher
    | latency. (At least this is what I observed a year ago.) Tokio
    | is a combination of myriad design decisions that the authors
    | of the framework had to make to serve the mainstream of
    | use-cases. helio is opinionated. DF is opinionated.
    | Shared-nothing architecture is not for everyone. But if you
    | master it - it's invincible. (And yeah - there is zero chance
    | I could write something like helio in Rust)...
| jen20 wrote:
| Tokio is not a shared-nothing model - you'd be looking at
| Glommio [1] (from one of the contributors to Seastar) for
| that.
|
| [1]: https://github.com/DataDog/glommio
| infamouscow wrote:
| The deal breaker for me was stable Rust doesn't have fast
| thread locals[1] nor stack probes for arm64[2].
|
| [1]: https://github.com/rust-lang/rust/issues/29594
|
| [2]: https://github.com/rust-lang/rust/issues/77071
| ed25519FUUU wrote:
| To me the focus on speed is a wash now. They're all fast. I'd
| like to hear about easy cross-region replication and failover as
| well as effortless snapshot and restoring of backups.
| OPoncz wrote:
    | Actually, snapshotting is done in the background and does not
    | use fork like Redis does. You can see it here:
| https://github.com/dragonflydb/dragonfly#memory-efficiency-b...
| welder wrote:
| > Probably, the fastest in-memory store in the universe!
|
| Redis is fast enough. Read/write speed isn't usually the
| bottleneck, it's limiting your data set to RAM. I've long ago
| switched to a disk-backed Redis clone (called SSDB) that solved
| all my scaling problems.
| FpUser wrote:
    | A Google search for SSDB gives "Saskatchewan Sheep Development
    | Board" ;) Well, to its credit, it did find and put the proper
    | one on the front page as well.
| shin_lao wrote:
| Fast enough, for the universe of applications you are aware of.
| welder wrote:
| Yea, faster is always better. I just run into storage
| bottlenecks first in my experience at big and small
| companies.
| redman25 wrote:
| He seems to mention memory efficiency as well.
| sontek wrote:
| Haha, I thought the same thing until it wasn't! It turns out
| there are a lot of humans in the world and if you are
| unfortunate enough to get a large portion of them to start
| utilizing the software you write you'll find some bottlenecks
| in almost every system you thought was fast enough.
| sudarshnachakra wrote:
| I like the redis protocol compatibility and the HTTP
| compatibility, but from the initial skim through I guess you are
| using abseil-cpp and the home-grown helio
| (https://github.com/romange/helio) library.
|
  | Could you give me a one-liner on the helio library: is it used
  | as a fiber wrapper around the io_uring facility in the kernel?
  | Can it be used as a standalone library for implementing fibers
  | in application code?
|
  | Also, it seems that the spinlock has become a de facto standard
  | in the DB world today; thanks for not falling into the trap
  | (because 90% of the users of any DB do not need spinlocks).
  |
  | Another curious question would be: why not implement it with
  | Seastar (since you're not speaking to disk often enough)?
| romange wrote:
    | Yes, helio is a library that allows you to build C++ backends
    | easily, similar to Seastar. Unlike Seastar, which is designed
    | as a futures-and-continuations library, helio uses fibers,
    | which I think are simpler to use and reason about. I wrote a
    | few blog posts a while ago about fibers and Seastar;
    | https://www.romange.com/2018/07/12/seastar-asynchronous-c-fr...
    | is one of them. You will see there a typical Seastar flow with
    | continuations. I just do not like this style and I think C++
    | is not a good fit for it. Having said that, I do think Seastar
    | is a 5-star framework and the team behind it are all
    | superstars. I learned about shared-nothing architecture from
    | Seastar.
|
    | Re helio: you will find an examples folder inside the project
    | with sample backends: echo_server and pingpong_server. Both
    | are similar, but the latter speaks RESP. I also implemented a
    | toy midi-redis project, https://github.com/romange/midi-redis,
    | which is also based on helio.
|
    | In fact, dragonfly evolved from it. Another interesting point
    | about Seastar - I decided to adopt io_uring as my only polling
    | API, and Seastar did not use io_uring at that time.
| sudarshnachakra wrote:
      | Thanks for taking the time to reply - yes, in fact Seastar
      | does not use io_uring, but its Rust equivalent glommio does
      | use it (IIRC it is based on io_uring). Any reasons for using
      | C++ instead of Rust (are you more familiar with it? does the
      | learning curve hinder time to market? or is it the Rc/Arc
      | fatigue with Rust async? I guess Rust should be a fairly
      | easy language to pick up for good C++ programmers like you)
| romange wrote:
        | If I were to choose another language it would be Rust. Why
        | did I not choose Rust?
        |
        | 1. I speak C++ fluently, and learning Rust would take me
        | years. 2. There is a food chain of libraries that I am
        | intimately familiar with in C++ and am not familiar with
        | in Rust. Take Rust's Tokio, for example. It is the de
        | facto standard for how to build I/O backends. However, if
        | you benchmark Tokio's mini-redis with memtier_benchmark
        | you will see it has much lower throughput than helio and
        | much higher latency. (At least this is what I observed a
        | year ago.) Tokio is a combination of myriad design
        | decisions that the authors of the framework had to make to
        | serve the mainstream of use-cases. helio is opinionated.
        | DF is opinionated. Shared-nothing architecture is not for
        | everyone. But if you master it - it's invincible.
| mamcx wrote:
  | An aside nitpick: I think it is dangerous to call anything a
  | "db" if data is not safely stored with ACID guarantees.
  |
  | People neither read the docs nor know the consequences of words
  | like "eventual" or "in memory", and start using this kind of
  | software as a primary data store instead of as a
  | cache/ephemeral one...
| vorpalhex wrote:
    | I think that's an issue with people who don't read/comprehend
    | the docs.
| romange wrote:
    | Ok, you got us. We chose dragonflydb and not dragonflystore
    | just because the former rolls off the tongue better :)
    |
    | Having said that, we carefully chose to write everywhere in
    | the docs that we are an in-memory store (and not a database).
    |
    | Btw, I reserve the full right to provide full durability
    | guarantees for DF and to claim the database title in the
    | future.
| vvern wrote:
| dragonflycache sounds reasonable.
| wutwutwutwut wrote:
| Haha, what. If you run a database without reading the
| documentation then you're the dangerous part, not the ACID-
| compliance aspects.
|
| For _any_ database there will be important information only
| available in the documentation.
| mamcx wrote:
| > If you run a database without reading the documentation
| then you're the dangerous part
|
      | I think that covers almost the whole dev population, from
      | what I see in relation to RDBMSs. Luckily for us, most
      | RDBMSs shield us from the mistakes in their usage, a lot.
      |
      | That is why I see it as "dangerous" to call
      | ephemeral/eventual stores "dbs". Marketing/positioning has
      | impact...
| wutwutwutwut wrote:
        | All databases are ephemeral if the person running them
        | doesn't read the docs, so your comment is redundant: it
        | applies just as well to the default single-node install of
        | any DBMS.
| staticassertion wrote:
| So Cassandra isn't a database? I'd say "thing that manages
| data" is a database, which is to say, a _lot_ of things are
| databases.
| cormacrelf wrote:
| Everything is either a database or a compiler.
| Xeoncross wrote:
| We've identified the final hacker project
| Sebb767 wrote:
| Technically, a traditional optimizer is an SQL compiler.
| mrkurt wrote:
| Or a proxy.
| qeternity wrote:
| A proxy is just a really fast dynamic linker.
| c0l0 wrote:
| Actually, everything is a routing problem.
| morelisp wrote:
| All of A, C, and I only make sense defined relative to a
| particular transaction vocabulary. Redis is perfectly ACID, as
| long as your transactions are those supported by Redis's
| commands.
|
| Conversely, plenty of DBs with programmable transactions (e.g.
| SQL) are considered work-a-day "ACID" enough, despite some
| massive gaps in their transactional model (no DDL in
| transactions, no nested transactions, atomic only when below a
| certain size, etc.)
| antirez wrote:
  | I'm no longer involved in Redis, but I would love for people to
  | receive clear information: from what the author of Dragonflydb
  | is saying here, that is, that memcached and Dragonflydb have
  | similar performance, I imagine that the numbers provided in the
  | comparison with Redis are obtained with Redis on a single core
  | and Dragonflydb running on all the cores of the machine. Now,
  | it is true that Redis uses a core per instance, but it is also
  | true that this is comparing apples to motorbikes. Multi-key
  | operations are possible even with multiple instances (via key
  | tags), so the author should compare N Redis instances (one per
  | core) and report the numbers. Then they should say: "but our
  | advantage is that you can run a single instance with this and
  | that good thing". Moreover, I believe it would be fair to
  | memcached to clearly state they have the same performance.
|
  | EDIT: another quick note: copy-on-write implementations in user
  | space, algorithmically, are cool in certain situations, but it
  | must be checked what happens in the _worst_ case. Because the
  | good thing about kernel copy-on-write is that, whatever it
  | costs, it is easy to predict. Imagine an instance composed of
  | just very large sorted sets: snapshotting starts, but there are
  | a lot of writes, and all the sorted sets end up being
  | duplicated in the process. When instead the sorted sets are
  | able to remember their version because the data structure
  | _itself_ is versioned, you get two things: 1. more memory
  | usage, 2. a lot more complexity in the implementation. I don't
  | know what dragonflydb is using as algorithmic copy-on-write,
  | but I would make sure to understand what the failure modes of
  | those algorithms are, because it's a bit of a matter of
  | physics: if you want to capture a snapshot of a database at a
  | given time T0, changes must somehow be accumulated, either at
  | the page level or at some other level.
|
  | EDIT 2: fun fact, I hadn't commented on anything Redis-related
  | for two years!
| [deleted]
| romange wrote:
    | Thanks for providing your feedback. As the Redis Manifesto
    | states - our goal is to fight against complexity. antirez -
    | you are our inspiration and I seriously take your manifesto
    | close to heart.
    |
    | Please allow the possibility that Redis can be improved and
    | should be improved. Otherwise other systems will eventually
    | take its market apart.
    |
    | I appreciate your comments very much. I have written about
    | you in my blog. I am an engineer, I disagree with some of the
    | design decisions that were made in Redis, and I decided to do
    | something about it :) To your points:
|
    | 1. DF provides _full_ compatibility with single-node Redis
    | while running on all cores, whereas Redis cluster cannot
    | provide multi-key operations across slots.
    |
    | 2. A much stronger point - we provide a much simpler system:
    | you do not need to manage k processes, you do not need to
    | *provision* k capacities that are managed independently
    | within each process, and you do not need to monitor those
    | processes, load/save k snapshots, etc. Our snapshotting is
    | point-in-time across all cores.
    |
    | 3. Due to pooling of resources, DF is more cost efficient.
    | It's more versatile. We have a design partner that could
    | reduce its costs by a factor of 3 just because they could use
    | an x2gd machine with an extra-high memory configuration.
    |
    | Regarding your note about memcached - while we provide
    | similar performance to memcached, our product proposition is
    | nothing like memcached's; it's much more similar to Redis.
    | Having said that, I will add a comparison to memcached. I do
    | believe that memcached is as performant as DF because
    | essentially it's just an epoll loop over multiple threads.
|
    | Re your comment about snapshotting: we also push the data
    | into the serialization sink upon write, hence we do not need
    | to aggregate changes until the snapshot completes. The
    | complex part is to ensure that no key is written twice, and
    | that we ensure with versioning. I do agree that there can be
    | extreme cases where we need to duplicate memory usage for
    | some entries, but only for the entries in flight - those that
    | are being processed for serialization.
    |
    | Update: re versioning and memory efficiency. We use
    | DashTable, which is more memory efficient than the Redis
    | dict. In addition, DashTable has a concept of a bucket that
    | is comprised of multiple slots (14 in our implementation). We
    | maintain a single 64-bit version per bucket and we serialize
    | all the entries in the bucket at once. Naturally, this
    | reduces the overhead of keeping versions. Overall, for
    | small-value workloads we are 30-40% more memory efficient
    | than Redis.
| manigandham wrote:
| Interesting project. Very similar to KeyDB [1] which also
| developed a multi-threaded scale-up approach to Redis. It's since
| been acquired by Snapchat. There's also Aerospike [2] which has
| developed a lot around low-latency performance.
|
| 1. https://docs.keydb.dev/
|
| 2. https://aerospike.com/
| romange wrote:
    | True. KeyDB tackled the same problems as us, but we chose
    | different paths. We decided to go for a complete redesign,
    | feeling that there is a critical mass of innovation out there
    | that can be applied to an in-memory store. KeyDB wanted to
    | stay close to the Redis source and be able to integrate
    | quickly with recent developments in Redis. Both paths have
    | their own pros and cons.
| manigandham wrote:
| I see from the blog posts that you looked at KeyDB and
| Scylla/Seastar for background. I agree with both approaches -
| fewer but bigger instances and shared-nothing thread-per-core
| architecture - and it was a major reason for switching to
| ScyllaDB in my previous startup.
|
| Will definitely follow this to see how it develops. Good
| luck.
| romange wrote:
| Thanks!
| wnzl wrote:
  | This looks really good! I gave it a thorough look and it is
  | definitely something I'd consider using, maybe as a canary in
  | one of our systems.
  |
  | The product's youth makes it a bit scary to use fully in
  | mission-critical systems, given how many problems with Redis
  | only started to show up under proper load. But it is definitely
  | on my watch list.
| romange wrote:
  | Guys, I am the author of the project. I would love to answer
  | any questions you have. Meanwhile, I will try to do that by
  | replying to the comments below.
| reilly3000 wrote:
    | This looks like awesome work! I appreciate you
    | operationalizing some of the best things to come out of
    | computer science in the past decade or so.
|
| Out of curiosity, are you discovering any new bottlenecks to
| performance outside of the software, given Dragonfly is able to
| process far more qps than most systems? I imagine the network
| and disk I/O could become stressed, but also I wonder if it
| breaks any assumptions of cross-core performance, hypervisors,
| etc. I know that cloud offerings typically mean that you can
| attach ginormous disk IOPS and NICs, but surely there are
| limits.
| [deleted]
| staticassertion wrote:
| Have any thoughts on Anna? They take a similar kind of shared-
| nothing approach, with CALM communication primitives.
| romange wrote:
      | I remember that I read about Anna. Very interesting paper.
      | From what I remember, it requires that the operations be
      | conflict-free.
      |
      | I think I came to the conclusion that it could be
      | interesting as an independent (novel) store, but not
      | something that can implement Redis with its complicated
      | multi-model API, transactions, and blocking commands. I do
      | not remember all the details though...
| ignoramous wrote:
| Anna is a KV store, for those out of loop:
| https://muratbuffalo.blogspot.com/2022/04/anna-key-value-
| sto...
| frankdejonge wrote:
| Any intention to implement Redis Streams?
| kamranjon wrote:
| +1 to this question
| romange wrote:
| Guys, please open a bug in the repo and plus one it there
| :)
|
| We plan to implement everything but your votes can affect
| the priority of the tasks.
| cultofmetatron wrote:
    | I gotta say, I looked through some of the files, and I wish
    | my code was this clean. It's some of the prettiest C++ I've
    | ever seen.
| pierrebai wrote:
| I dunno... I took a look at a single file and found:
| static constexpr unsigned NUM_SLOTS = Policy::kSlotNum;
| static constexpr unsigned BUCKET_CNT = Policy::kBucketNum;
| static constexpr unsigned STASH_BUCKET_NUM =
| Policy::kStashBucketNum;
|
| NUM_, _CNT, _NUM, three different prefix/suffix for what
| seems to me like the same concept. That just tickled my inner
| nit-picker.
| romange wrote:
| Everyone has his dirty laundry :(
| romange wrote:
        | You are welcome to send a PR with the fix :)
| romange wrote:
| omg, can I marry you?
| romange wrote:
| btw, it's not just pretty - it has brains too. take it for
| a spin.
| cultofmetatron wrote:
| is there an elixir client library?
| rkunde wrote:
| Do you know how DFDB compares, architecturally and performance-
| wise, to Pelikan and its different engines?
| romange wrote:
      | Not really, but I have read the Pelikan posts by the
      | Twitter team.
      |
      | One thing we have in common: we both thought that caching
      | heuristics could be largely improved compared to the
      | memcached/redis implementations. We did it differently,
      | though. I think our cache design has academic novelty - I
      | will write a separate post about it.
| nextaccountic wrote:
| Have you thought about also offering dragonflydb as a library,
| like sqlite? This would avoid context switching when running
| everything on the same machine.
| romange wrote:
      | It could be done technically; I am not sure how widespread
      | that use case would be.
      |
      | dragonfly is a library linked with a main file,
      | dfly_main.cc, so without this file you will have the lib.
      |
      | Redis is a "Remote Dictionary Server". You are gonna lose
      | the remote part :)
| obert wrote:
        | For development and demos it would help a lot, so one
        | just runs the project at hand instead of managing
        | multiple procs.
| oneepic wrote:
| Apologies if this was already asked -- I saw you guys already
| showed benchmarks in AWS itself, but would it make sense to
| have some benchmarks outside of AWS (or run benchmarks in a VM
| with Redis/DF/Keydb installed within)?
|
| I ask this because I'm unsure if AWS Redis has any
| modifications on top of the Redis software itself, which would
| affect the speed, or even make it a bit slower. For example I
| know MS Azure's version of Redis restricted certain commands,
| and from a quick search AWS does something similar:
| https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/...
|
| (edit: added "affect the speed" for clarity)
| romange wrote:
      | We compared DF with Redis OSS and KeyDB OSS; we did not use
      | any of the managed services for the comparison. AWS was
      | used only as a platform, for EC2 and a cloud environment.
| kodbraker wrote:
    | Hey, great work. I couldn't find the specifics of the
    | benchmark. Is it, by any chance, that you compare one
    | instance of single-threaded Redis running on 64 vCPUs to a
    | multithreaded key-value store?
    |
    | Would we see such a disparity in the benchmark if we ran N
    | instances of Redis in parallel, one per core?
| thatwasunusual wrote:
| Two things:
|
| Where's the benchmark compared to memcached?
|
| Several years ago there was memcachedb, which could flush stuff
| to disk. While this operation was expensive, it was also
| useful, because you could restart instances without being
| overwhelmed by missing keys (data).
|
| For the latter: your application quickly grinds to a halt if
| you need to build your cache from ground up after some kind of
| crash. This is a deal-breaker for many.
| treis wrote:
| Is there any catch or gotcha for using this as a drop in
| replacement for Redis?
| jkmcf wrote:
| It's only at release 0.1 so I'd let them bake a little
| longer!
|
| Looks awesome so far, though!
| tlarkworthy wrote:
| I can see it's missing some important commands like WATCH,
| and maybe the WAL for high durability contexts.
| Strom wrote:
| A lot less features. If you don't care about the missing
| features, then it might be a completely reasonable
| replacement.
| romange wrote:
| I've been working on DF for the last 6 months and I
| implemented ~130 commands. It's still missing about 40% of
| the commands to reach parity with v5/v6.
|
| I will continue working on DF. Primary/Secondary
| replication is my next milestone.
| treis wrote:
| How apples to apples is the performance comparison given
| these missing commands?
| [deleted]
| simonw wrote:
| Is this a project of https://en.wikipedia.org/wiki/Atos ? How
| big is the team, how is it funded and suchlike?
| romange wrote:
      | No, that's not us. We are two guys right now, bootstrapped,
      | with hopes that the community will embrace the project and
      | we will be able to grow the team.
| simonw wrote:
| What an impressive start to the project! Good luck, I hope
| you get to grow it in the ways you have planned.
| thecleaner wrote:
| What prompted you to write DragonflyDB ? How did you know that
| Redis would consume more memory ?
| romange wrote:
      | Sorry, I missed your second question. Redis uses a
      | fork-based save for snapshotting and full replica sync.
      | This means that memory pages are duplicated on write. That
      | consumes more memory than DF's implementation, which does
      | algorithmic point-in-time snapshotting. In addition, DF was
      | designed to use less overhead memory in its key table. As
      | for the why, you can read about it here:
      | https://github.com/dragonflydb/dragonfly#background
| romange wrote:
    | I wrote about it in the background section of the README.
|
    | Basically, I worked at a cloud company, in a team that
    | provided a managed service for Redis and Memcached. I
    | witnessed lots of problems that our customers experienced due
    | to the scaling problems of Redis. I knew that these problems
    | were solvable, but only if the whole system were redesigned
    | from scratch. At some point I decided to challenge the status
    | quo, so I left the company and... and here we are.
| tdullien wrote:
| Good stuff :-) I would also love to see a comparison against
| Scylla and Aerospike, even if the use-case seems slightly
| different?
| romange wrote:
      | I do not know about Aerospike; I have never run their
      | software. Scylla are the champions at what they are doing.
      | Having said that, Scylla provides full durability
      | guarantees, which means they need to touch a disk upon each
      | write or update. Disk I/O is relatively expensive in the
      | cloud and requires either large instances or special
      | storage-optimized family types like i3, etc.
      |
      | Long story short, I do think Dragonfly is the fastest
      | in-memory database in terms of throughput and latency
      | today. We will see if we manage to stay that way when we
      | extend our capabilities with SSD tiering.
| soheil wrote:
| I find the Redis benchmark suspect. Did you disable write to
| disk and live snapshotting? In production Redis shouldn't be
| configured with write to disk.
| romange wrote:
      | Of course I disabled it. Why do you think it's suspect?
      | P.S. You can easily reproduce it - I provided the command
      | line and the instance type.
| OPoncz wrote:
| Yes
| ignoramous wrote:
| Thanks for your work.
|
| Curious: Why BSL? Why not open core [0] (or xGPLv3) like what
| most other commercial OSS projects seem to be doing?
|
| [0] https://archive.is/nT9DF#selection-410.0-410.1
| felixg3 wrote:
| They probably don't want to end up being replaced by an
| AWS/GCP/Azure service. In my opinion, the BSL is a fair
| license model, especially if it is limited in duration (let's
| say 2-3 years BSL then automatically changing to Apache/BSD).
| ethbr0 wrote:
| A duration limit in the license, after which it becomes a
| permissive license, seems the critical point.
|
| Accomplishes the goal of preventing a cloud provider from
| stealing customers, but also ensures customers don't get
| caught in an "always tomorrow" trap when the deadline comes
| and the company realizes it only hurts them to fully share
| it.
|
| Seems to align all interests pretty nicely.
|
| (I'm as big of an OSS supporter as anyone, but we can't
| pretend we still live in a time where Google / Amazon /
| modern-Microsoft don't exist)
| oofbey wrote:
| This looks quite cool from a technical perspective, but the
| unusual license definitely gives me pause. Largely because
| I'm not familiar with BSL. Others say it's increasingly
| popular, but with a little googling the acronym is still
| somewhat ambiguous - opensource.org lists BSL as the "Boost
| Software License" which looks more like BSD. This kind of
| confusion doesn't support the idea that this is a solid
| trustworthy OSS license.
|
| Still, I really appreciate that you didn't choose a copy-left
| license.
|
| On the license front, what is the "change license" clause
| listed? It says something about changing in 5 years. Does
| this mean it will become Apache licensed in 2027? Why would
| you put that in there?
| romange wrote:
          | It's exactly that. It gives us a little chance to fight
          | against the giants.
          |
          | In 5 years the initial version becomes Apache 2.0, then
          | the next version, and so on and so forth. CockroachDB
          | uses a similar license; MariaDB and Redpanda Data use
          | it, and others too. You are right that the acronym is
          | confusing - it's not the Boost license, it's the
          | Business Source License. Every major technological
          | startup turned away from BSD/Apache 2.0 licenses due to
          | inability to compete with cloud providers without
          | technological edge.
| stinkiephish wrote:
            | Bravo for picking the BSL and writing your license
            | the way you have. It shows maturity: you and your
            | company know what you are doing and intend to be
            | there for the long haul.
| benatkin wrote:
| > Every major technological startup turned away from
| BSD/Apache 2.0 licenses due to inability to compete with
| cloud providers without technological edge.
|
| No, there are plenty that still use permissive licenses.
|
| GitLab uses MIT and a custom license for EE:
| https://docs.gitlab.com/ee/development/licensing.html
|
| Deno uses an MIT license and has some secret sauce that
| is currently just in hosted services AFAIK:
| https://github.com/denoland/deno/blob/main/LICENSE.md
|
| PlanetScale has hosted services and an open source tool
| called Vitess which is Apache licensed:
| https://planetscale.com/
| https://github.com/vitessio/vitess
|
| Finally Redis has a BSD licensed core, a source available
| license for additional modules, and a closed source
| license for enterprise. https://redis.com/legal/licenses/
| JoshTriplett wrote:
| I certainly don't think you should pick a permissive
| license, and find yourself under competition from
| companies running your own software.
|
| However, I'm sad that instead of going with an Open
| Source license that protects against that, you're using a
| proprietary license. That alone is a nonstarter for many
| users, not because they want to compete with you but
| because they want to protect themselves and make sure
| they have a firm foundation to build on.
|
| Much of the software you're citing as examples moved from
| Open Source to proprietary, harming their users in the
| process, and causing many users to seek alternatives.
| romange wrote:
              | We did not choose AGPL so that we _would not_ harm
              | our users. Let's not confuse the means and the goal
              | here. AGPL is copyleft and restrictive; fair users
              | refuse to use it. OSI does not help technological
              | companies find a fair solution and forces us to go
              | outside of "OSS".
| arinlen wrote:
| > _We do not choose AGPL so we would not harm our users._
|
| This is a first. If anything, it seems clear that a FLOSS
| license such as AGPL achieves exactly the opposite: it is
| highly protective of the users' best interests, both in
| the short term and long term.
|
| Could you elaborate on why you think that users are
| harmed by standard, run-of-the-mill FLOSS licenses?
| riquito wrote:
| I don't want to enter in the merit of your company's
| choice, it's definitely up to you what license to use,
| but I'm confused about that statement around AGPL. How
| would AGPL harm users?
| jitl wrote:
| Do you think AGPL offers such protection? I think even
| AGPL is fine for Amazon et al, they are happy to dump the
| code somewhere while selling a managed service with
| enough wrapper layers. But AGPL is toxic also for self-
| hosters who _aren't_ trying to build a managed hosting
| service, since it applies equally against all parties.
| [deleted]
| rglullis wrote:
| AGPL does not require source distribution of your
| application, unless the AGPL code is accessed directly by
| your users.
|
| So, if this were AGPL and I had a closed webservice
| that used it, I would be in the clear.
| goodpoint wrote:
| > AGPL is toxic also for self-hosters who aren't trying
| to build a managed hosting service
|
| Absolutely not. First of all, you can always use any
| software under AGPL as released without any limitation.
|
| Secondly, if you were to make changes and release them
| you can still run it any way you want.
|
| Thirdly, if you are making changes for internal use, and
| for some reason you really want to keep them secret,
| you can STILL run the database to your heart's content
| as long as you don't provide it as a service to the
| public.
|
| AGPL is much more permissive than people think. It's just
| providing a degree of protection to developers and
| users from patent trolls and other uncooperative
| entities.
| sideeffffect wrote:
| This BSL Business Source License is endorsed by neither the
| Free Software Foundation nor OSI
|
| https://www.gnu.org/licenses/license-list.en.html
|
| https://opensource.org/licenses/alphabetical
|
| That's a big red flag.
| OPoncz wrote:
| GPL is copyleft and more restrictive in some elements. BSL
| 1.1 is actually quite popular nowadays.
| AtNightWeCode wrote:
| From my experience, Redis performance seems to be all over the
| place depending on the circumstances. In what cases does this
| solution perform well and where does it fail? Most caches I
| worked with really love to have everything close by at low
| latency and works best when the consumers have about as much
| memory as the cache in the first place.
| ddorian43 wrote:
| Did you think about re-using redis open source and "just"
| changing storage/replication? (like yugabytedb does with
| postgresql).
|
| Why not reuse seastar framework?
|
| Can you describe your distributed log thing? Is it like
| facebook-logdevice or apache-bookkeeper?
| romange wrote:
| I honestly think it's impossible to reuse Redis OSS to make
| it multi-threaded beyond what KeyDB did. Btw, I did reuse
| some of the code - but it's around single-value data
| structures like zset, hset, etc.
|
| With multi-threading you need to think about all things
| holistically. How you handle backpressure, how you do
| snapshotting. How you implement multi-key operations or
| blocking transactions. So you need special algorithms to
| provide atomicity, you need fibers/coroutines to be able to
| block your calling context yet unblock the cpu for other
| tasks etc. All this was designed bottom up from scratch.
| Seastar could work theoretically but I am not a fan of coding
| style with futures and continuations - they are pretty
| confusing, especially in C++. My choice was fibers, which
| provide a more natural way of writing code.
|
| I have not designed the distributed log thingy yet. Will do
| it in the next 2 months.
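romange's preference for fibers over futures and continuations can be illustrated with a toy cooperative scheduler. This is not Dragonfly's implementation (which is C++); it's a minimal Python sketch using generators, where each `yield` stands in for a blocking point such as waiting on an io_uring completion:

```python
# Toy sketch of the fiber idea: a "blocking" call suspends only the
# calling fiber, while the scheduler keeps the CPU busy with other
# fibers. Generators play the role of fibers here.
from collections import deque

def run_fibers(fibers):
    """Round-robin scheduler: each yield is a 'blocking point'."""
    ready = deque(fibers)
    log = []
    while ready:
        fiber = ready.popleft()
        try:
            step = next(fiber)      # run until the next blocking point
            log.append(step)
            ready.append(fiber)     # re-schedule; only this fiber waited
        except StopIteration:
            pass                    # fiber finished
    return log

def worker(name, steps):
    for i in range(steps):
        # e.g. waiting on an io_uring completion would yield here
        yield f"{name}:{i}"

order = run_fibers([worker("A", 2), worker("B", 2)])
print(order)  # fibers interleave: ['A:0', 'B:0', 'A:1', 'B:1']
```

The code reads top-to-bottom like blocking code, which is the readability argument for fibers over chained futures.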
| kristianpaul wrote:
| Do you have a saas cloud offering like redis cloud?
| romange wrote:
| not yet. We are focused on building the community at this
| point.
| baobob wrote:
| Benchmarks done on only one metric are often misleading (and in
| commercial circumstances usually intentionally!). Would love to
| see visually what trade offs Dragonfly is making to achieve the
| numbers from the chart at the top of the README. If excellent
| technical work means there really is no trade off, that would
| also be a great reason to chart it visually.
|
| Also, as a Redis replacement, it's not clear what durability
| is offered, and for most Redis use cases this is close to
| the first question.
| YarickR2 wrote:
| Tradeoff is inability to run on anything besides recent Linux
| kernels
| miohtama wrote:
| Does anyone run production Redis/memcached outside Linux
| anyway?
| tyingq wrote:
| I suppose because of io_uring?
|
| It does sound like extendible hashing might have downsides
| in some scenarios also.
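For readers unfamiliar with the extendible hashing mentioned above, here is a minimal illustrative sketch (this is not Dragonfly's actual dashtable). The key property: a full bucket splits locally and the directory doubles by copying pointers, so growth never requires rehashing the whole table at once:

```python
# Minimal extendible-hashing sketch, purely illustrative.
BUCKET_CAP = 2  # tiny on purpose, to force splits

class Bucket:
    def __init__(self, depth):
        self.depth = depth          # local depth: bits this bucket "owns"
        self.items = {}

class ExtendibleHash:
    def __init__(self):
        self.global_depth = 1
        self.dir = [Bucket(1), Bucket(1)]   # indexed by low hash bits

    def _bucket(self, key):
        return self.dir[hash(key) & ((1 << self.global_depth) - 1)]

    def get(self, key):
        return self._bucket(key).items.get(key)

    def set(self, key, value):
        b = self._bucket(key)
        if key in b.items or len(b.items) < BUCKET_CAP:
            b.items[key] = value
        else:
            self._split(b)          # split only the overflowing bucket
            self.set(key, value)    # retry; directory may have doubled

    def _split(self, b):
        if b.depth == self.global_depth:
            self.dir = self.dir + self.dir  # double: pointer copy only
            self.global_depth += 1
        d = b.depth
        new = Bucket(d + 1)
        b.depth = d + 1
        for k in list(b.items):     # redistribute on hash bit d
            if (hash(k) >> d) & 1:
                new.items[k] = b.items.pop(k)
        for i, slot in enumerate(self.dir):
            if slot is b and (i >> d) & 1:
                self.dir[i] = new   # repoint aliased directory slots

h = ExtendibleHash()
for i in range(100):
    h.set(i, i * i)
print(h.get(7))  # -> 49
```

The downside tyingq hints at is visible here: worst-case a single insert can trigger a directory doubling, and skewed hash distributions split more than they'd need to.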
| romange wrote:
| How would you suggest demonstrating visually that there
| are no trade-offs?
|
| The trade-off, the way I see it: one needs to implement 200
| Redis commands from scratch. Besides, I think DF has a
| marginally higher 50th percentile latency. Say, if Redis
| has 0.3ms for the 50th percentile, DF can have 0.4ms
| because it uses message passing for inter-thread
| communication. 99th percentiles are better in DF for the
| same throughput because DF uses more cpu power, which
| reduces variance under load.
|
| Re durability - what durability is offered by Redis? AOF?
| We will provide similar durability guarantees with better
| performance than AOF. We already provide snapshotting that
| can be 30-50x faster than Redis's.
| pcthrowaway wrote:
| Does Dragonfly have a timeseries extension? Or does it
| support extensions?
| thayne wrote:
| Presumably, most of the performance benefit comes from using
| linux io_uring, which limits it to using recent linux
| kernels.
|
| Of course, it is also possible there are situations where it
| doesn't perform as well.
| tiffanyh wrote:
| FoundationDB should have been included in their perf comparison.
| It's ACID compliant and a distributed key/value store.
|
| For SSD based storage, it's getting 50k reads/sec PER core and
| scales linearly with # of cores you have in your cluster. (They
| achieved 8MM reads/sec with 384 cores)
|
| https://apple.github.io/foundationdb/performance.html
| atombender wrote:
| FoundationDB does have impressive performance, but implementing
| compound operations like INCR, APPEND, etc. would require at
| least an additional network round trip between the client and
| the server.
|
| For example, INCR would require one read followed by one write
| of the new value, and of course this will result in very
| inefficient mutation range conflicts (which must be retried for
| another couple of round trips) if you have frequent updates of
| the same keys in multiple concurrent transactions.
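atombender's point can be made concrete with a toy simulation of optimistic concurrency. `VersionedStore` and its methods are invented for illustration - this is NOT FoundationDB client code - but it shows why INCR becomes read-then-commit, and how a concurrent writer forces a conflict and retry:

```python
# Toy simulation: without a server-side atomic add, INCR is a read
# followed by a version-checked commit, and conflicts must retry.
class VersionedStore:
    def __init__(self):
        self.data = {}              # key -> (version, value)

    def read(self, key):
        return self.data.get(key, (0, 0))

    def commit(self, key, seen_version, new_value):
        cur_version, _ = self.data.get(key, (0, 0))
        if cur_version != seen_version:
            return False            # someone wrote since we read
        self.data[key] = (cur_version + 1, new_value)
        return True

def incr(store, key):
    while True:                     # optimistic retry loop
        version, value = store.read(key)           # round trip 1
        if store.commit(key, version, value + 1):  # round trip 2
            return value + 1

store = VersionedStore()
first = incr(store, "counter")      # -> 1

# A conflicting writer sneaks in between our read and our commit:
version, value = store.read("counter")
incr(store, "counter")              # another client increments
ok = store.commit("counter", version, value + 1)
print(ok)  # -> False: stale version, we must retry
```

Each failed commit costs another read/commit round-trip pair, which is the inefficiency under contention described above.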
| sorenbs wrote:
| FDB supports atomic operations that you can use to implement
| at least some of this very easily in a way that avoids the
| extra round trip and conflicts:
| https://apple.github.io/foundationdb/developer-
| guide.html#at...
| spullara wrote:
| FDB also supports some atomic operations that may decrease
| the number of roundtrips and remove conflicts:
|
| https://apple.github.io/foundationdb/api-python.html#api-
| pyt...
|
| That said, I still don't think that it is necessarily the
| perfect match for implementing some of the Redis data
| structures.
| atombender wrote:
| I had completely forgotten about those built-in ones.
| Right, so some operations can be made fully atomic server-
| side, but there are still a bunch of ops (append, pubsub,
| set commands such as SADD, etc.) that would need round
| trips.
| spullara wrote:
| SADD wouldn't need a round trip since it is idempotent.
| You would just do a blind set which doesn't even require
| a transaction in FDB. Likewise, most of the other ones
| can be reframed / designed to not require round trips. I
| would still not use it as the backend unless I already
| had FDB in my solution and this was just an additional
| database type that I needed without ultimate performance
| as the constraint.
| atombender wrote:
| SADD needs to return a value indicating whether the
| member was added or not, so I don't think you could avoid
| a round trip, since FDB doesn't have a "set if not
| exists".
| wutwutwutwut wrote:
| > scales linearly with # of cores you have in your cluster.
|
| So if I have 1 machine and increase from 2 to 256 cores, the
| throughput will scale linearly without the SSD ever being a
| bottleneck?
| gigatexal wrote:
| "Distributed" is the keyword here. You scale cores and
| storage with machines...
| sirsinsalot wrote:
| What about it makes it a "modern" replacement rather than just a
| replacement? Is there something about Redis and Memcached that is
| "outdated" in the (relatively) short time span they've existed
| (compared to something like C)?
| robertlagrant wrote:
| Modern and outdated are not the only options to classify things
| with. Even the example you give, C, is both not modern and not
| outdated.
| [deleted]
| spullara wrote:
| I'd say using io_uring is an example of modern, since it has
| only been available since kernel 5.1. They also sit on
| some more recent research on data structures that perform well
| in multithreaded systems.
|
| https://github.com/dragonflydb/dragonfly#background
| jandrewrogers wrote:
| It is "modern" in the sense that the design and architecture is
| idiomatic for high-performance software on recent computing
| hardware. The main advantage of modern (in this sense)
| implementations is that they use the hardware _much_ more
| efficiently and completely for current workloads, enabling much
| higher throughput on the same hardware.
| sirsinsalot wrote:
| Which means the "modern" usage in the title is only true if
| Redis and Memcache fail to do those things.
| jasonwatkinspdx wrote:
| Mainline memcached is very well tuned, so it's sort of the
| odd one out here.
|
| But yes, Redis is very much designed against the grain of
| modern hardware. It also is a very minimalist design, that
| works well within its limits, but falls down hard when you
| push those limits, particularly with snapshotting and
| replication.
| jandrewrogers wrote:
| Yes, exactly. When people talk about "modern" in the
| context of software performance, it usually denotes
| software architecture choices. If someone did a forklift
| upgrade to the software architecture of Redis etc, then
| they would be modern too.
|
| In practice, software architecture is forever. It is nearly
| impossible to change the fundamental architecture of
| software people use because applications become adapted to
| the specific behavioral quirks of the architecture i.e.
| Hyrum's Law, which would not be preserved if the
| architecture was re-designed wholesale.
| AtNightWeCode wrote:
| "modern" is a bs detector for sure
| romange wrote:
| The fact that we based our core hashtable implementation on
| a paper from 2020 does not justify it?
| akie wrote:
| I think the key takeaway here is that people are allergic
| to that word.
| sirsinsalot wrote:
| Not really, it just implies that the competition is not
| modern, without qualification. I think asking for
| qualification in this case is fair if we are to conclude
| Redis and Memcache have aged to the point of needing a
| replacement.
|
| Modern is used here as a selling point
| AtNightWeCode wrote:
| The word is often tagged on anything new somebody tries to
| sell. Better to be specific. The problem is that most
| "modern" things are very old things sold as new ideas.
| Cause biz. So, nothing specific against this proj.
| lionkor wrote:
| Have Redis and Memcached aged so much we need a modern
| replacement? Or is this a webdevy 'modern' which just means the
| first commit is newer than redis' first commit?
| [deleted]
| romange wrote:
| memcached was born in 2003. Redis was born in 2008. Both have
| strengths and weaknesses. DF tries to unite them into a single
| backend and keep the strengths of each one of them.
| robertlagrant wrote:
| > Have Redis and Memcached aged so much we need a modern
| replacement? Or is this a webdevy 'modern' which just means the
| first commit is newer than redis' first commit?
|
| The docs make some of the differences clear. Worth reading the
| GitHub repo readme.
| OPoncz wrote:
| Modern as the entire architecture is based on papers from the
| last few years. But also the first commit :)
| marmada wrote:
| Is io_uring the reason this is faster? I'm curious because redis
| is in memory right? And io_uring is mostly for disk ops, I
| assume?
| romange wrote:
| we use io_uring for everything: network and disk. Each thread
| maintains its own polling loop that dispatches completions for
| I/O events, schedules fibers etc. Everything is done via
| io_uring API. All socket writes are done via ring buffer etc.
| If you run strace on DF you won't see almost any system calls
| besides io_uring_enter.
| romange wrote:
| And no, it's not just io_uring that makes it faster. It's
| also because it's multi-threaded, has an absolutely different
| hashtable design, uses a different memory allocator, and many
| other reasons (i.e. design decisions we took on the way).
| etaioinshrdlu wrote:
| It looks like it drops the single-threaded limitation of redis to
| achieve much better performance.
|
| Could this architecture be extended to scale across multiple
| machines? What would be the benefits and costs of this?
| jayd16 wrote:
| It kind of sounds like simple sharding but across all cores
| in a machine. Anyone know if this is an accurate take?
| romange wrote:
| It's not simple sharding. It's actually a share-nothing
| architecture, and doing multi-key operations atomically is a
| pretty complicated thing. I used a paper from 2014 to solve
| this problem.
| jayd16 wrote:
| Maybe simple is the wrong word (perhaps I should have said
| familiar) but Redis sharding is similarly shared-nothing,
| is it not?
| OPoncz wrote:
| In Redis Cluster the client needs to be connected to all
| shards and manage those connections. Multi-key operations
| on different slots are not supported, etc. Maintaining a
| cluster is not a fun responsibility. DF saves you in most
| cases from the need to grow horizontally, which should be
| much simpler to maintain and work with.
| jayd16 wrote:
| So is it architecturally different or does DF just have a
| much better UX?
| romange wrote:
| Unfortunately unlikely. A shared-nothing architecture works
| with messaging, so threads send each other messages like
| "give me the contents of that key" or "update this key".
| Operations like SET/GET will require a single message hop.
| Operations like RENAME require 2 hops. Transactions and Lua
| scripts will require as many hops as there are operations
| in the transaction. When it's in the same process, the
| latency penalty is negligible. But between different
| machines you will feel it. But who knows - AWS already has
| a very cool UDP-like protocol with very low latency... if
| this becomes the standard inside cloud environments, maybe
| we can create a truly distributed memory store that spans
| multiple machines transparently.
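The hop accounting romange describes can be sketched as follows. This is purely illustrative: shards are modeled as in-process dicts, and the `hops` counter stands in for cross-thread (or cross-machine) messages:

```python
# Every key lives on exactly one shard; each operation costs one
# message per shard it touches.
NUM_SHARDS = 4

class ShardedStore:
    def __init__(self):
        self.shards = [{} for _ in range(NUM_SHARDS)]
        self.hops = 0

    def _send(self, key):
        """One message hop to the shard that owns `key`."""
        self.hops += 1
        return self.shards[hash(key) % NUM_SHARDS]

    def set(self, key, value):      # 1 hop
        self._send(key)[key] = value

    def get(self, key):             # 1 hop
        return self._send(key).get(key)

    def rename(self, src, dst):     # 2 hops: pop src, write dst
        value = self._send(src).pop(src)
        self._send(dst)[dst] = value

store = ShardedStore()
store.set("a", 1)                   # 1 hop
store.rename("a", "b")              # 2 hops
assert store.get("b") == 1          # 1 hop
print(store.hops)  # -> 4
```

In-process, each "hop" is a cheap cross-thread message; across machines the same hop count translates into network round trips, which is where the latency penalty appears.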
| etaioinshrdlu wrote:
| The latency would be worse, but would the throughput still
| scale pretty much linearly? It might still be super useful.
| romange wrote:
| Yes, the VLL algorithm kinda solves the throughput problem.
| You will still have some issues with fault-tolerance - i.e.
| what you do if a machine does not answer you.
|
| For an intra-process framework it's not an issue (as long as
| we do not have deadlocks).
| jimnotgym wrote:
| Modern? Eeek I think of Redis as modern (2009). I'm feeling old.
| mattbaker wrote:
| Yeah, I'd like to see a bit more justification on what makes it
| "modern." The use of io_uring maybe?
| pluc wrote:
| See the "Background" section at the bottom
| staticassertion wrote:
| I think the Background section[0] is pretty helpful. One
| paper they cite is from 2014, another from 2020. The use of
| io_uring as well is also somewhat novel.
|
| [0] https://github.com/dragonflydb/dragonfly#background
| soheil wrote:
| Redis and memcache are in memory key value storage. Writing
| to disk is not a primary function of those systems, it's only
| used for taking snapshots or backup of the data. io_uring
| isn't used as the core functionality and thus that alone
| wouldn't make DF "modern".
| maffydub wrote:
| Is it possible they use io_uring for networking?
| romange wrote:
| Yes, we use io_uring for networking and for disk. io_uring
| provides a unified interface on linux to poll for all I/O
| events. Re disk - we use it for storing snapshots. We will
| use it for writing WALs.
|
| And we have more plans for using io_uring in DF in the
| future.
| jitl wrote:
| There are a lot of benchmarks against Redis, but where is the
| comparison to Memcached? Redis is quite slow for cache use-case
| already.
| romange wrote:
| Yes, I can confirm that Memcached can reach similar
| performance to DF. However, one of the goals of DF was to
| combine the performance of Memcached with the versatility
| of Redis. I implemented an engine that provides atomicity
| guarantees for all its operations plus transparent
| snapshotting under write-heavy traffic, and all this
| without reducing performance compared to memcached.
|
| Having said that, DF also has a novel caching algorithm that
| should provide better hit rate with less memory consumption.
| xiphias2 wrote:
| Do the benchmarks stress test those atomicity guarantees?
|
| Get/set operations look like they don't need it.
| romange wrote:
| You are correct - GET/SET do not require any locking as
| long as they do not contend and they do not in those
| benchmarks. You are right that for MSET/MGET you will see
| lower numbers. But still it will be much higher than with
| REDIS.
|
| This is our initial release and we just did not have
| resources to showcase everything under different scenarios.
| Having said that, if you open an issue with a suggestion of
| a benchmark that you would like to see I will try to run
| soon...
| xiphias2 wrote:
| No worries, I trust you to do the right thing, just
| measuring something that Memcached can do fast anyways
| can be a misleading benchmark.
| OPoncz wrote:
| I assume DF has the same performance as Memcached. It would
| be great if someone ran this benchmark and shared the
| results.
| girfan wrote:
| This looks like a cool project. Is there any support (or plan to
| support) I/O through kernel bypass technologies like RDMA? For
| example, the client reads the objects using 1-sided reads from
| the server, given it knows which address the object lives
| at. This could be really beneficial for reducing latency
| and CPU load.
| romange wrote:
| I do not know much about RDMA. Our goal is to provide a memory
| store that is fully compatible with Redis/Memcached protocols
| so that all the existing frameworks could work as before. I am
| not sure how RDMA fits this goal.
| throwaway787544 wrote:
| Controversial opinion: we don't need more databases. We need
| better application design.
|
| Why do you _need_ a redis /memcache? Because you want to look up
| a shit-ton of random data quickly.
|
| Why does it have to be random? Do you really need to look up any
| and all data? Is there not another more standard (and not
| dependent on a single db cluster) data storage and retrieval
| method you could use?
|
| If you have a bunch of nodes with high memory just to store and
| retrieve data, and you have a bunch of applications with a tiny
| amount of memory.... Why not just split the difference? Deploy
| your apps to nodes with high amounts of memory, add parallel
| processing so they scale efficiently, store the data in memory
| closer to the applications, process in queues to prevent swamping
| the system and more reliable scaling. Or use an SSD array and
| skip storing it in memory, let the kernel VM take care of it.
|
| If you're trying to "share" this memory between a bunch of
| different applications, consider if a microservice architecture
| would be better, or a traditional RDBMS with more efficient
| database design. (And fwiw, organic networks (as in biological)
| do not share one big pot of global state, they keep state local
| and pass messages through a distributed network)
| zackmorris wrote:
| One of the main uses in PHP/Laravel (probably borrowed from
| Ruby on Rails) is caching views for Russian doll caching and
| caching SQL query results:
|
| https://laracasts.com/series/russian-doll-caching-in-laravel
| (subscription only)
|
| Unfortunately, it looks like they retired their database
| caching lesson, which was a mistake on their part since it was
| so good:
|
| https://laracasts.com/discuss/channels/laravel/cache-tutoria...
| (links to defunct https://laracasts.com/lessons/caching-
| essentials)
|
| Laracasts are the best tutorials I've ever seen, regardless of
| language, outside of how php.net/<keyword> search and
| commenting was structured 20 years ago. They would be the best
| if they got outside funding to provide all lessons for free.
|
| Anyway, one HTTP request can easily generate 100+ SQL queries
| under an ORM. Which sounds bad, but is trivially fixable
| globally without side effects via global "select" query
| listeners and memoization. I've applied it and seen 7 second
| responses drop to 100 ms or less simply by associating query
| strings and responses with Redis via the Cache::remember()
| function.
|
| I also feel that there's a deeper problem in how web
| development began as hacking but devolved into application-
| heavy bike shedding. We have a generation of programmers taught
| to apply decorators by hand repeatedly, rather than take an
| aspect-oriented (glorified monkey patching) approach that fixes
| recurring problems at the framework or language level. I feel
| that this is mainly due to the loss of macros without runtimes
| providing alternative ways of overriding methods or even doing
| basic reflection.
|
| So code today often isn't future-proof (requires indefinite
| custodianship of conventions) and has side-effect-inducing or
| conceptually-incorrect changes made in the name of secondary
| concerns like performance. The antidote to this is to never
| save state in classes, but instead pipe data through side-
| effect-free class methods, then profile the code and apply
| memoization to the 20% of methods that cost 80+% of execution
| time.
|
| Also microservices are definitely NOT the way to go for rapid
| application development or determinism. Or I should say, it's
| unwise to adopt microservices until there are industry-standard
| approaches for reverting back to monoliths.
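The Cache::remember() pattern zackmorris describes above - keying a cache on the query string and running the query only on a miss - can be sketched in a few lines of Python. Here `run_query` is a stand-in for the real database call; TTL handling is crude and invalidation is omitted:

```python
# Language-agnostic sketch of Laravel's Cache::remember() idiom.
import time

cache = {}

def cache_remember(key, ttl_seconds, compute):
    now = time.monotonic()
    hit = cache.get(key)
    if hit is not None and hit[0] > now:
        return hit[1]                    # cache hit: skip the query
    value = compute()                    # cache miss: run the query
    cache[key] = (now + ttl_seconds, value)
    return value

calls = 0
def run_query():
    global calls
    calls += 1                           # pretend this is a slow SELECT
    return [{"id": 1, "name": "ada"}]

sql = "SELECT id, name FROM users WHERE active = 1"
a = cache_remember(sql, 60, run_query)
b = cache_remember(sql, 60, run_query)  # served from cache
assert a == b and calls == 1
```

Wiring this around a global "select" listener, as the comment suggests, turns the per-request 100+ ORM queries into at most one execution per distinct query string per TTL window.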
| oofbey wrote:
| Yes! We just need to get our users to visit the website in
| order of their user_id, at their assigned time, and then we
| wouldn't have these random access patterns. /s
|
| Unfortunately, reality.
| [deleted]
| gxt wrote:
| Or you need the best of both worlds. A tool that can store
| data, serve data, convert code to data, interpret data, display
| data. Some kind of tool that has at its core a kind of
| phpmyadmin but that can be distributed on an app store, but for
| end-users. Something that will handle authentication,
| authorization, audit, provide confidentiality, scalability,
| remove the need for third party tools for end users and
| developers, just one big ecosystem that lets you host entire
| enterprise software, extend it, customize it for your end
| users and your power user. Just one big ass tool that will
| replace all other information management services.
| dvt wrote:
| Weird hill to die on. There's like a million valid reasons for
| using an in-memory data store. Off the top of my head: chat
| apps, real-time video games, caching, (some) data pipelines,
| etc.
| [deleted]
| paulgb wrote:
| As I understand it, high-performance real-time video games do
| tend to use the architecture that throwaway proposed, i.e.
| state is in the game server itself rather than a separate
| caching layer.
| AtNightWeCode wrote:
| Caching is for poor designs. Most "modern" designs only
| rely on cache for data distribution speed. Cache is an
| optimization.
| hinkley wrote:
| > Caching is for poor designs.
|
| Caching is the death of design.
|
| People who like to make decisions and then studiously
| avoid ever looking at the consequences of their actions
| don't seem to notice that once caching is introduced, the
| data model in the system either stops improving or begins
| to unravel.
|
| It becomes very difficult to benchmark the system looking
| for design issues because the real cost of a cold cache
| is very difficult to measure. And the usual argument that
| we can turn off the cache for performance analysis
| doesn't hold water for more than a few months.
|
| Pulling the same data twice, three times, ten times in a
| given transaction are all about the same cost, and it
| gets me out of trying to figure out how to plumb new data
| through the sequence diagrams. I can just snatch values
| out of global state wherever, whenever, and as often as I
| need them. Once people give up and start doing that, you
| can't turn off the caches anymore. When you started it
| cut the cost of the lookup in half, or a third. Now it's
| four, or five, or sixteen, and that call is now over-
| represented in the analysis, hiding other, more tractable
| problems.
|
| You've settled for your broken design, but also cut the
| legs out from under any of the tools that can dig you out
| of the hole you've dug.
|
| Caches are just global state under a different name, but
| slathered in RPC issues.
| Cwizard wrote:
| Could share some more on the data pipelines you mention? What
| kind of data store do you use for this purpose? We are
| reading a lot of the same data from disk to do transformation
| (joins) with new data, I've been looking for something that
| might speed this process up but would like to keep the setup
| small and simple.
| dvt wrote:
| I've used Apache Arrow before[1]; in-memory columnar
| storage. We did some AI/ML stuff with data gathered from
| social network APIs, but you can probably do a ton of
| things.
|
| [1] https://arrow.apache.org/
| judofyr wrote:
| Wow, this looks very nice!
|
| I've seen the VLL paper before and I've wondered how well it
| would work in practice (and for what use cases). Does anyone know
| how they handle blocked transactions across threads? Is the
| locking done per-thread? If so, how do you detect/resolve
| deadlocks?
|
| It would also be good to see a benchmark comparing single-
| thread performance between DragonflyDB and Redis. How much
| of the performance increase is due to being able to use all
| threads?
| And how does it handle contention? In Redis it's easy to reason
| about because everything is done sequentially. How does
| DragonflyDB handle cases where (1) 95% of the traffic is GET/SET
| a single key or (2) 90% of the traffic involves all shards
| (multi-key transaction)?
| romange wrote:
| These are really good questions. I invite you to try it
| yourself using memtier_benchmark :) If you pass
| `--key-maximum=0` you will get single-key contention during
| the loadtest. Spoiler alert - it's still much faster than
| Redis.
| [deleted]
| mfontani wrote:
| Aw, no hyperloglog support. So close for my redis use-case
| romange wrote:
| Dude, open an issue :) I was waiting for someone to ask for
| that because I do not know anyone who uses it :)
| Thaxll wrote:
| In the picture Redis tops out at 200k/second on an instance
| with 64 cores (r6g) and Dragonfly at 1400k/second. Redis is
| single-threaded and DF is not, but it only got 7.7x faster -
| how come?
|
| If you run, let's say, 32 instances of Redis (not using HT)
| with CPU pinning, it will be much faster than DF, assuming
| the data is sharded/clustered.
| jitl wrote:
| I assume this is because data is stored in memory, not in a CPU
| core.
| edmcnulty101 wrote:
| Doesn't it use the CPU core to put the data to memory though?
| jng wrote:
| Each CPU core doesn't have its own independent channel to
| memory; there are usually 2-3 channels to DDR memory shared
| by all cores via multiple intermediate caches (usually
| shared hierarchically).
| lrem wrote:
| I'd guess multi-key operations with keys landing in separate
| shards.
| 867-5309 wrote:
| core count probably transferred the bottleneck to e.g. memory
| speed, disk io, .. ?
| ledauphin wrote:
| your conclusion does not follow from the available evidence.
| Comparing MT with shared memory to ST performance is comparing
| apples and oranges.
| romange wrote:
| The reason for this is the networking limitations of each
| instance type. DF consistently reaches the networking limits
| for each instance type in the cloud.
|
| On r6g it's 1.4M qps, and then it's saturated on interrupts
| due to the ping-pong nature of the protocol. This is why
| pipelining mode can reach several times higher throughput -
| the messages are bigger. c6gn is a network-enhanced instance
| with 32 network queues! It's the most network-capable
| instance in AWS. This is why DF can reach > 3.8M qps there.
| staticassertion wrote:
| This is really cool. Love a section on how things are designed
| with links to papers, always makes me feel way better about a
| project - especially one that has benchmarks.
|
| Might try this out.
| docmechanic wrote:
| Agreed. Really great piece of technical writing. Probably
| written by one of the rare developers who write better than I
| do. I want to clone him/her/they and work with them.
| avinassh wrote:
| This is excellent!
|
| I see only throughput benchmarks. Redis is single threaded,
| beating it at latency would have been far more impressive.
|
| Do you have latency benchmarks at peak throughput?
| romange wrote:
| If you look below the throughput table you will see the 99th
| percentile latency of Dragonfly in the table.
|
| Now, please take into account that DF maintains a 99th
| percentile of 1ms at 3M(!) qps, and not at 200K.
___________________________________________________________________
(page generated 2022-05-30 23:00 UTC)