[HN Gopher] SlateDB - An embedded database built on object storage
___________________________________________________________________
SlateDB - An embedded database built on object storage
Author : notamy
Score : 149 points
Date : 2024-10-01 22:18 UTC (1 day ago)
(HTM) web link (slatedb.io)
(TXT) w3m dump (slatedb.io)
| anon291 wrote:
| This seems to be a key value store built atop object storage.
| Which is to say, it seems completely redundant. Not sure if
| there's some feature I'm missing, but all of the six features
| mentioned on the front page are things you'd have if you used the
| key value store directly (actually, you get more because then you
| get multiple writers).
|
| I was excited at first and thought this was SQL atop S3 et al.
| I've jury-rigged a solution to this using SQLite with a
| customized VFS backend, and would suggest that as an alternative
| to this particular project. You get the benefit of ACID
| transactions across multiple tables and a distributed backend.
| abound wrote:
| If you want SQLite backed by S3, maybe something like SQLite in
| :memory: mode with Litestream would work?
|
| Edit: actually not sure if you can use :memory: mode since
| Litestream uses the WAL (IIRC), so maybe a ramfs instead
| anon291 wrote:
| There are many solutions. The particular example I was using
| was SQLite via WebAssembly, resorting to HTTP's fetch API
| for a read-only solution.
| candiddevmike wrote:
| In my experience, SQLite on S3 is ridiculously slow. The
| round trip for writes is horrendous, so you end up doing
| batch saves, but you need a WAL, which has the same problem
| as the main DB file.
| iudqnolq wrote:
| Using an s3 object per key would be too expensive for many use
| cases.
|
| The website is a bit fancy but the readme seems to pretty
| straightforwardly explain why you might want this. It seems to
| me like a nice little (13k loc) project that doesn't fit my
| needs but might come in handy for someone else?
| necubi wrote:
| This is a low-level embedded db that would be used by sql
| databases/query engines/streaming engines/etc rather than
| something that's likely to make sense for you to use as an
| application developer. It sits in a similar space to RocksDB
| and LevelDB.
|
| You generally can't use object storage directly for this stuff;
| if you have a high volume of writes, it's incredibly slow (and
| expensive) to write them individually to s3. Similarly, on the
| read side you want to be able to cache data on local disk &
| memory to reduce query latency and cost.
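The batching and caching idea described above can be sketched in a few lines of Python. This is a minimal illustration, not SlateDB's implementation: `FakeObjectStore` and `BatchingStore` are hypothetical names, and the "object store" is just an in-memory dict that counts requests so the savings are visible.

```python
from typing import Dict, List, Optional


class FakeObjectStore:
    """In-memory stand-in for S3 (hypothetical); counts requests made."""

    def __init__(self) -> None:
        self.objects: Dict[str, Dict[str, str]] = {}
        self.put_count = 0
        self.get_count = 0

    def put(self, name: str, batch: Dict[str, str]) -> None:
        self.put_count += 1
        self.objects[name] = dict(batch)

    def get(self, name: str) -> Dict[str, str]:
        self.get_count += 1
        return dict(self.objects[name])


class BatchingStore:
    """Buffers writes and uploads them as one object per batch."""

    def __init__(self, store: FakeObjectStore, batch_size: int = 100) -> None:
        self.store = store
        self.batch_size = batch_size
        self.buffer: Dict[str, str] = {}            # writes not yet uploaded
        self.batch_names: List[str] = []            # uploaded batches, oldest first
        self.cache: Dict[str, Dict[str, str]] = {}  # local copies of fetched batches

    def put(self, key: str, value: str) -> None:
        self.buffer[key] = value
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if not self.buffer:
            return
        name = f"batch-{len(self.batch_names):06d}"
        self.store.put(name, self.buffer)  # one request for the whole batch
        self.batch_names.append(name)
        self.buffer = {}

    def get(self, key: str) -> Optional[str]:
        if key in self.buffer:
            return self.buffer[key]
        for name in reversed(self.batch_names):  # newest batch wins
            if name not in self.cache:
                self.cache[name] = self.store.get(name)  # fetched at most once
            if key in self.cache[name]:
                return self.cache[name][key]
        return None
```

With a batch size of 100, writing 1,000 distinct keys costs 10 uploads instead of 1,000, and repeated reads are served from the local cache after the first fetch.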
| vineyardmike wrote:
| > I was excited at first and thought this was SQL atop S3 et
| al.
|
| You can check out Neon.tech, who make an open-source
| Postgres-on-S3, and DuckDB, who make an embedded DB with
| transaction support that can operate over S3.
| aseipp wrote:
| People want object storage as the backend because in practice
| it means that you can decouple compute and storage entirely, it
| has no requirement to provision space up front, and robust
| object storage systems with (de facto) standardized APIs like
| S3's are widely available for all kinds of deployments and from
| many providers, in many forms. In other words: it works with
| what people already have and want.
|
| Essentially every standalone or embedded key-value storage
| solution treats the KV store and its operation like a database,
| from what I can tell -- which is sensible because that's what
| they are! But people use object stores exactly because they
| _don't_ operate like traditional databases.
|
| Now there are problems with object stores (they are very coarse
| grained and have high per-object overhead, necessitating some
| design that can reconcile the round hole and the square peg) --
| but this is just the reality of what people are working with.
| If there is some other key-value store server/implementation
| you know of, one that performs and offers APIs like an actual
| database (e.g. multi writer, range scans, atomic writes) but
| with unlimited storage, no provisioning, and it's got 10+
| different widespread implementations across every major compute
| and cloud provider -- I'm interested in what that project is.
| epolanski wrote:
| Not a db guy, just asking: what does "embedded" database mean?
|
| I'm confused here, because Google says it's a db bundled with the
| application, but that's not really what I get from the landing
| page.
|
| What problem does it solve?
| leetrout wrote:
| Embedded means it runs in your application process not a
| standalone server / service.
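As a familiar illustration of "embedded" (using SQLite rather than SlateDB, which has no Python bindings according to this thread), Python's standard-library `sqlite3` module runs the whole database engine inside the calling process:

```python
import sqlite3

# Opening the database is a library call, not a network connection: there is
# no server process to start and no host or port to configure.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO kv VALUES (?, ?)", ("greeting", "hello"))
conn.commit()

row = conn.execute("SELECT value FROM kv WHERE key = ?", ("greeting",)).fetchone()
print(row[0])  # prints "hello"
```

By contrast, a client for a standalone server such as MySQL would open a socket to a separate process.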
| yawnxyz wrote:
| Is this an easier way to do the "store parquet on s3 > stream to
| duckdb" pattern that's popping up more and more?
| vineyardmike wrote:
| > MemTables are flushed periodically to object storage as a
| string-sorted table (SST). The flush interval is configurable.
|
| Looks like it has a pretty similar structure under the hood,
| but DuckDB would get you more powerful queries.
|
| FYI duckdb directly supports writes (and transactions) so you
| don't necessarily even need the separate store step.
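The MemTable-to-SST flush quoted above can be sketched as a toy in Python. Everything here is an assumption for illustration: `TinyLSM` is a hypothetical class, the bucket is a plain dict, the SST is a JSON list rather than a real binary format, and `flush()` is called by hand where a real engine would run it on a timer.

```python
import json
from typing import Dict, Optional


class TinyLSM:
    """Toy LSM: an in-memory memtable flushed to immutable sorted objects."""

    def __init__(self, object_store: Dict[str, bytes]) -> None:
        self.object_store = object_store   # object name -> bytes (stand-in bucket)
        self.memtable: Dict[str, str] = {}
        self.next_sst_id = 0

    def put(self, key: str, value: str) -> None:
        self.memtable[key] = value

    def flush(self) -> Optional[str]:
        """Write the memtable out as one key-sorted object, then clear it."""
        if not self.memtable:
            return None
        entries = sorted(self.memtable.items())       # sort by key
        name = f"sst/{self.next_sst_id:08d}.json"     # zero-padded so names sort by age
        self.object_store[name] = json.dumps(entries).encode("utf-8")
        self.next_sst_id += 1
        self.memtable = {}
        return name

    def get(self, key: str) -> Optional[str]:
        if key in self.memtable:                      # newest data first
            return self.memtable[key]
        for name in sorted(self.object_store, reverse=True):  # then newest SST
            for k, v in json.loads(self.object_store[name]):
                if k == key:
                    return v
        return None
```

Because each flush produces a new immutable object, an overwritten key simply appears again in a newer SST, and reads check the newest data first.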
| jitl wrote:
| This is more targeted at OLTP style workloads with mutable data
| and potentially multiple writers
| kosmozaut wrote:
| Do you know any resources/examples about the setup you mean? It
| sounds interesting but from a quick search I didn't find
| anything straightforward.
| atombender wrote:
| Check out Apache Iceberg. It's a format for storing Parquet
| data in object storage, for both read and write. Not sure if
| DuckDB does Iceberg (I know ClickHouse does), but it's a
| similar principle, disaggregating data from compute.
| loxias wrote:
| Can I please, please, please, have C++ or at least C bindings? :)
| Or the desired way to call Rust from another runtime? I don't
| know any Rust.
| jitl wrote:
| Rust is just another programming language that's quite similar
| to C++. The main difference is there's like 4 types for String
| (some are references and some are owned) and methods for a
| struct go in a `impl StructName` block after the struct
| definition instead of inside it.
|
| I don't really know rust either but I'm currently writing some
| bindings to expose Rust libraries to NodeJS and not having too
| much trouble.
|
| For rust -> c++ I googled one time and found this tool which
| Mozilla seems to use to call Rust from C++ in their web
| browser, maybe it would "just work":
| https://github.com/mozilla/cbindgen?tab=readme-ov-file
| sebastianconcpt wrote:
| Although the borrowing rules will make it feel like quite a
| different language from the others.
| jitl wrote:
| From the docs https://slatedb.io/docs/introduction/
|
| > NOTE
|
| > Snapshot isolation and transactions are planned but not yet
| implemented.
| quadrature wrote:
| Might have been older docs. They now say that transactions are
| supported
|
| " Snapshot isolation: SlateDB supports snapshot isolation,
| which allows readers and writers to see a consistent view of
| the database. Transactions: Transactional writes are
| supported."
| jitl wrote:
| I don't see any evidence this is implemented in the source
| code, and the README on Github also marks it as not-yet-
| implemented. There is an open issue for "design doc for
| transaction" here:
| https://github.com/slatedb/slatedb/issues/248 and an open
| issue for "Add range queries" here:
| https://github.com/slatedb/slatedb/issues/8
| nmca wrote:
| > Object storage is an amazing technology. It provides highly-
| durable, highly-scalable, highly-available storage at a great
| cost.
|
| I don't know if this was intentionally funny, but
| there is a little ambiguity in the expression "great cost",
| typically great cost means very expensive.
|
| Very cool and useful shim otherwise :)
| unshavedyak wrote:
| Is there an alternate meaning that you first took it as?
| Monetary cost was my take as well hah.
| raybb wrote:
| The other meaning it could have is that it's a good
| price/deal.
| paulgb wrote:
| Monetary cost in both cases, but it's the two meanings of
| "great", which can either mean "large" or "good".
| OJFord wrote:
| That's funny actually - 'great cost', great takes meaning of
| large; 'great price', great takes meaning of very good (i.e.
| small in this context).
|
| Always that way around, ESL's a minefield!
| notthistime12 wrote:
| Native English speaker here. "At a great cost" means "at a good
| price". "At great cost" would mean "expensive".
| skrtskrt wrote:
| You're 100% correct; not sure why this is downvoted away.
| hantusk wrote:
| Since writes to object storage are going to be slow anyway, why
| not double down on read-optimized B-trees rather than
| write-optimized LSMs?
| chipdart wrote:
| I think slow writes are not a major concern, as most databases
| already use some fast log-type data structure to persist
| writes, and then merge/save these logs to a higher-capacity and
| slower medium on specific events.
| goodpoint wrote:
| Despite the name this is not a database.
| mtndew4brkfst wrote:
| What definition/criteria do you feel it does not satisfy?
| goodpoint wrote:
| Pretty much the usual definition.
| https://en.wikipedia.org/wiki/Database
| jitl wrote:
| > Formally, a "database" refers to a set of related data
| accessed through the use of a "database management system"
| (DBMS), which is an integrated set of computer software
| that allows users to interact with one or more databases
| and provides access to all of the data contained in the
| database (although restrictions may exist that limit access
| to particular data). The DBMS provides various functions
| that allow entry, storage and retrieval of large quantities
| of information and provides ways to manage how that
| information is organized.
|
| What makes SlateDB not qualify for this definition? It
| seems to qualify for me.
| mtndew4brkfst wrote:
| Do you feel that e.g. Redis fails to satisfy the same
| definition in basically the same ways? If it does fulfill
| the criteria, what do you see as the distinction?
| notthistime12 wrote:
| Redis is a key-value store.
| jitl wrote:
| A key-value store is a type of database.
| rehevkor5 wrote:
| Calling Redis a database is a generous generalization.
| For example, Redis does not necessarily provide the same
| kind of durability as a database does, nor the
| capabilities one would expect from an RDBMS. In many
| cases, depending on configuration, it might be more
| appropriate to instead refer to Redis as a cache, an in-
| memory database, or a NoSQL database.
| tgdn wrote:
| "It doesn't currently ship with any language bindings"
|
| Rust is needed to use SlateDB at the moment
| demarq wrote:
| Embed cloud
|
| Sounds like they just cancel each other out. Not sure what
| advantage embedding will yield here
| remon wrote:
| I've read the introduction and descriptions two times now and I
| still don't understand what this adds to the proceedings. It
| appears to be an extremely thin abstraction over object storage
| solutions rather than an actual DB which the name and their texts
| imply.
| shenli3514 wrote:
| Went through the documentation:
| https://slatedb.io/docs/introduction/#use-cases I cannot
| understand why they are targeting the following use cases with
| this architecture:
| * Stream processing
| * Serverless functions
| * Durable execution
| * Workflow orchestration
| * Durable caches
| * Data lakes
| drodgers wrote:
| It looks like writes are buffered in an in-memory write ahead log
| before being written to object storage, which means that if the
| writer box dies, then you lose acknowledged writes.
|
| I've built something similar for low-cost storage of infrequently
| accessed data, but it uses our DBMS (MySQL) for the WAL (+ cache
| of hot reads), so you get proper durability guarantees.
|
| The other cool trick is to use Bε-trees (a relatively
| recent innovation from Microsoft Research) for the object storage
| compaction to minimise the number of write operations needed when
| flushing the WAL.
| quadrature wrote:
| You have the ability to choose your durability guarantee. You
| can choose to have synchronous writes, in which case the client
| blocks until the write is acknowledged.
|
| https://docs.rs/slatedb/latest/slatedb/config/struct.WriteOp...
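The trade-off discussed in this subthread can be illustrated with a toy model. This is a sketch under stated assumptions, not SlateDB's API: `ToyStore` and the `wait_for_flush` flag are hypothetical names, and "object storage" is a dict that survives a simulated crash while the in-memory buffer does not.

```python
from typing import Dict, Optional


class ToyStore:
    """Toy model of buffered versus synchronous (durable) writes."""

    def __init__(self) -> None:
        self.pending: Dict[str, str] = {}   # in memory only; lost on a crash
        self.durable: Dict[str, str] = {}   # stand-in for object storage

    def put(self, key: str, value: str, wait_for_flush: bool) -> None:
        self.pending[key] = value
        if wait_for_flush:
            # Synchronous mode: do not return until the data is durable.
            self.flush()

    def flush(self) -> None:
        self.durable.update(self.pending)
        self.pending.clear()

    def get(self, key: str) -> Optional[str]:
        if key in self.pending:
            return self.pending[key]
        return self.durable.get(key)

    def crash_and_recover(self) -> "ToyStore":
        """Simulate the writer dying: only durable data survives."""
        recovered = ToyStore()
        recovered.durable = dict(self.durable)
        return recovered


store = ToyStore()
store.put("safe", "1", wait_for_flush=True)    # slower, survives a crash
store.put("risky", "2", wait_for_flush=False)  # faster, may be lost
recovered = store.crash_and_recover()
```

After the simulated crash, `recovered.get("safe")` still returns the value while `recovered.get("risky")` returns `None`, which is the lost-acknowledged-write case raised above.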
| 0x1ceb00da wrote:
| Is there something similar that caches recent changes locally
| if the device is offline and uploads them when it comes online?
| rehevkor5 wrote:
| I don't see how it's embedded if it relies on nonlocal
| services... on the contrary it says specifically, "no local
| state". It appears to be more analogous to a "lakehouse
| architecture" implementation (similar to, for example, Apache
| Iceberg), where your app includes a library that knows how to
| interact with the data in cloud object storage.
| indrora wrote:
| The general definition of "Embedded" is that the engine runs in
| your application space, as opposed to the more traditional DBMS
| (MariaDB, Valkey, etc) being a Full Fat Process just for
| itself. [1] This can reduce RTT to the database itself because
| you're already there: You've got a whole DB at your fingertips.
| There's very little worry of cross-application data stink
| because _each application has its own database_ , alleviating a
| lot of the authN/Z that comes with a network attached DBMS.
|
| 1: https://en.wikipedia.org/wiki/Embedded_database
___________________________________________________________________
(page generated 2024-10-02 23:01 UTC)