[HN Gopher] Scylla - Real-Time Big Data Database
___________________________________________________________________
Scylla - Real-Time Big Data Database
Author : andrewstuart
Score : 56 points
Date : 2021-08-24 20:24 UTC (2 hours ago)
(HTM) web link (www.scylladb.com)
(TXT) w3m dump (www.scylladb.com)
| criticaltinker wrote:
| The benchmarks against DynamoDB, Bigtable, & CockroachDB [1]
| appear quite impressive - anyone have real world experience that
| can attest to these claims of improved performance and reduced
| cost?
|
| > Scylla vs DynamoDB - Database Benchmark
|
| _> 20x better throughput in the hot-partition test_
|
| _> Scylla Cloud is 1/7 the expense of DynamoDB when running
| equivalent workloads_
|
| _> Scylla Cloud: Average replication latency of 82ms. DynamoDB:
| Average latency of 370ms._
|
| > Scylla vs Bigtable - Database Benchmark
|
| _> Scylla Cloud performs 26X better than Google Cloud Bigtable
| when applied with real-world, unoptimized data distribution_
|
| _> Google BigTable requires 10X as many nodes to accept the same
| workload as Scylla Cloud_
|
| _> Scylla Cloud was able to sustain 26x the throughput, and with
| read latencies 1/800th and write latencies less than 1/100th of
| Cloud Bigtable_
|
| > Scylla vs CockroachDB - Database Benchmark
|
| _> Loading 10x the data into Scylla took less than half the time
| it took for CockroachDB to load the much lesser dataset._
|
| _> Scylla handled 10x the amount of data._
|
| _> Scylla achieved 9.3x the throughput of CockroachDB at 1/4th
| the latency._
|
| [1] https://www.scylladb.com/product/benchmarks/
| kasey_junk wrote:
| I was unwilling to sign up to read the actual benchmark report
| for the comparison to cockroachdb but it jumped out at me as
| odd. They solve completely different kinds of problems in my
| experience so I'm not surprised Scylla did better in raw
| throughput. That's not interesting though. It would be just as
| weird for cockroach to put up a benchmark showing it
| outperforms in distributed sql queries.
|
| That said I've seen the value Scylla brings in its core value
| prop, replacing Cassandra. It's real good at that.
| biggestdummy wrote:
| Full report is posted here with no registration wall:
| https://www.scylladb.com/2021/01/21/cockroachdb-vs-scylla-
| be... And they admit that it's an odd comparison. "Obviously,
| the comparison is of the apples and oranges type..."
| mianos wrote:
| It is interesting that this is on the front page at the same
| time as an old article about Discord moving to Cassandra,
| considering Discord went from Cassandra to Scylla, I believe.
| andrewstuart wrote:
| I posted this because I'm interested to hear from anyone using it
| - how has it worked out for you?
|
| I note it's written in C++ which is a bit of a surprise - I'd
| expected Rust or Golang.
|
| Interesting as well is that it is AGPL - licensing is always contentious:
|
| https://github.com/scylladb/scylla/blob/master/LICENSE.AGPL
| zinclozenge wrote:
| I think the main reason it's in C++ is because of its async
| executor, Seastar. There's a similar Rust project called
| Glommio, but it still seems very early.
| biggestdummy wrote:
| Glommio was created by Glauber Costa, one of the early
| contributors to Seastar (and Scylla). The resemblance between
| the two is not coincidence.
| https://glaubercosta-11125.medium.com/c-vs-rust-an-async-
| thr...
| krapht wrote:
| Only on Hackernews would somebody be surprised that high-
| performance system software would be written in C++...
| masterof0 wrote:
| You read my mind. LOL. "Mr. Developer, can you please write
| your project in Rust, or __insert_your_meme_language_here__,
| or Javascript?"
| ethelward wrote:
| From the mouth of CockroachDB's CTO: "So if we were
| starting at this point in time, I would take a hard look at
| Rust, and I imagine that we would pick it instead of C++."
| jandrewrogers wrote:
| If you are building a database engine that strongly prioritizes
| performance, and Scylla does position itself that way, then C++
| is the only practical choice today for many people, depending
| on the details. It isn't that C++ is great, though modern
| versions are pretty nice, but that it wins by default.
|
| Garbage collected languages like Golang and high-performance
| database kernels are incompatible because the GC interferes
| with core design elements of high-performance database kernels.
| In addition to a significant loss of performance, it introduces
| operational edge cases you don't have to deal with in non-GC
| languages.
|
| Rust has an issue unique to Rust in the specific case of high-
| performance database kernels. The internals of high-performance
| databases are full of structures, behaviors, and safety
| semantics that Rust's safety checking infrastructure is not
| designed to reason about. Consequently, to use Rust in a way
| that produces equivalent performance requires marking most of
| the address space as "unsafe". And while you could do this,
| Rust is currently less expressive than modern C++ for this type
| of code anyway, so it isn't ergonomic either.
|
| C++ is just exceptionally ergonomic for writing high-
| performance database kernels compared to the alternatives at
| the moment.
| nhourcard wrote:
| At QuestDB we chose zero-GC Java for 80% of the code base,
| which resulted in superior performance on ingestion compared
| to the alternatives.
| dralley wrote:
| Zig might be a good option -- eventually, once it's past 1.0.
| enedil wrote:
| Quoting the interview with ScyllaDB CTO, Avi Kivity (
| https://www.scylladb.com/2020/06/30/ask-me-anything-with-avi...
| )
|
| > Q: Would you implement Scylla in Go, Rust or Javascript if
| you could?
|
| > Avi: Good question. I wouldn't implement Scylla in
| Javascript. It's not really a high-performance language, but I
| will note that Node.js and Seastar share many characteristics.
| Both are using a reactor pattern and designed for high
| concurrency. Of course the performance is going to be very
| different between the two, but writing code for Node.js and
| writing code for Seastar is quite similar.
|
| > Go also has an interesting take on concurrency. I still
| wouldn't use it for something like Scylla. It is a garbage-
| collected language so you lose a lot of predictability, and you
| lose some performance. The concurrency model is great. The
| language lacks generics. I like generics a lot and I think they
| are required for complex software. I also hear that Go is
| getting generics in the next iteration. Go is actually quite
| close to being useful for writing a high-performance database.
| It still has the downside of having a garbage collector, so
| from that point-of-view I wouldn't pick it.
|
| > If you are familiar with how Scylla uses the direct I/O and
| asynchronous I/O, this is not something that Go is great at
| right now. I imagine that it will evolve. So I wouldn't pick
| Javascript or Go.
|
| > However, the other language you mentioned, Rust, does have
| all of the correct characteristics that Scylla requires.
| Precise control over what happens. It doesn't have a garbage
| collector so it means that you have predictability over how
| much time your things take, like allocation. You don't have
| pause times. And it is a well-designed language. I think it is
| better than C++ which we are currently using. So if we were
| starting at this point in time, I would take a hard look at
| Rust, and I imagine that we would pick it instead of C++. Of
| course, when we started Rust didn't have the maturity that it
| has now, but it has progressed a long time since then and I'm
| following it with great interest. I think it's a well-done
| language.
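
The reactor pattern that the quoted answer attributes to both Node.js and Seastar can be illustrated with a minimal sketch. This is an assumption-laden toy in Python built on the standard-library `selectors` module, not how Seastar or Node.js is actually implemented; `run_reactor` and `on_readable` are hypothetical names chosen for the example. It shows one thread waiting for readiness events and dispatching each one to a registered handler.

```python
import selectors
import socket


def run_reactor(messages):
    """Deliver each message to a callback from one single-threaded event loop."""
    sel = selectors.DefaultSelector()
    received = []

    def on_readable(conn):
        # Handler: runs only when the reactor reports the socket as readable.
        received.append(conn.recv(1024).decode())
        sel.unregister(conn)
        conn.close()

    for msg in messages:
        writer, reader = socket.socketpair()
        reader.setblocking(False)
        # Register interest in read readiness and attach the handler as data.
        sel.register(reader, selectors.EVENT_READ, on_readable)
        writer.sendall(msg.encode())
        writer.close()

    # The reactor loop: wait for readiness events, then dispatch to handlers.
    while sel.get_map():
        for key, _mask in sel.select():
            key.data(key.fileobj)

    sel.close()
    return sorted(received)
```

Calling `run_reactor(["get k1", "put k2", "get k3"])` services all three connections on one thread without blocking on any single socket. The result is sorted because the order in which readiness events arrive is not guaranteed.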
| milesward wrote:
| We're using it with several customers: fast, reliable,
| straightforward.
___________________________________________________________________
(page generated 2021-08-24 23:00 UTC)