[HN Gopher] Litestream live replication has been moved to the Li...
___________________________________________________________________
Litestream live replication has been moved to the LiteFS project
Author : hardwaresofton
Score : 78 points
Date : 2022-10-14 14:53 UTC (8 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| losfair wrote:
| Shameless plug of my mvSQLite [1] project here! It's basically
| another distributed SQLite (that is API-compatible with
| libsqlite3), but with support for everything expected from a
| proper distributed database: synchronous replication, strictly
| serializable transactions, + scalable reads _and writes_ w/
| multiple concurrent writers.
|
| [1] https://github.com/losfair/mvsqlite
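|
| As a rough illustration (the database name below is made up),
| client code can stay plain SQLite precisely because mvsqlite is
| API-compatible with libsqlite3 -- once the library is swapped
| in, e.g. Python's sqlite3 module works unchanged:
|
|     # Hypothetical sketch: assumes mvsqlite is loaded in place
|     # of the system libsqlite3 and a cluster is reachable.
|     import sqlite3
|
|     conn = sqlite3.connect("my-distributed-db")
|     conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT, v TEXT)")
|     with conn:  # one transaction
|         conn.execute("INSERT INTO kv VALUES (?, ?)",
|                      ("greeting", "hi"))
|     print(conn.execute("SELECT v FROM kv").fetchall())
|     conn.close()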
| ok_dad wrote:
| > Distributed, MVCC SQLite that runs on top of FoundationDB.
|
| FYI to anyone here, FoundationDB is fucking awesome for
| something like this.
|
| Question @losfair: Did you find the Rust bindings for FDB to be
| very good? The Go bindings are OK, but are pretty out-of-date,
| missing some cool new features on the HEAD of the FDB source repo.
| endisneigh wrote:
| mvSQLite looks great, though I'm curious how you'd implement a
| schema migration given the locking properties of SQLite
| combined with the transaction limits of fdb.
|
| I imagine you'd get an FDB transaction time limit error
| preventing any schema migrations with non-trivial amounts of
| data.
| darthShadow wrote:
| I really wanted to give this a try but the lack of WAL support
| has prevented me from using it. With the recent addition of WAL
| support[1] in litefs, would it be possible to add the same to
| mvsqlite too?
|
| [1] https://github.com/superfly/litefs/pull/120
| mildbyte wrote:
| The live replication (as it used to work in Litestream before the
| LiteFS move, without Consul) would have been perfect for our use
| case with Seafowl (I played around with Litestream before that
| but had to settle on PostgreSQL for the sample multi-node
| deployment [0]):
|
| - rare writes that get directed to a single instance (e.g. using
| Fly.io's replay header), frequent reads (potentially at edge
| locations)
|
| - no need to deploy a PostgreSQL cluster and set up logical
| replication
|
| - SQLite database stored in object storage, reader replicas can
| boot up using the object storage copy and then get kept in sync
| by pulling data from the writer
|
| - delay in replication is fine
|
| LiteFS is probably going to be a great solution here since we're
| mainly using Fly.io and it has built-in support for it [1], but
| are there any alternatives that don't require Consul, still look
| like an SQLite database to the client and can work off an HTTP
| connection to the primary, so that we don't have to require our
| users to deploy to Fly?
|
| [0] https://seafowl.io/docs/guides/scaling-multiple-nodes
|
| [1] https://fly.io/docs/litefs/getting-started/
| benbjohnson wrote:
| LiteFS author here. You can also set up LiteFS to have a
| single, static leader instead of using Consul if you don't want
| that dependency. I need to write up some docs but there's some
| info on the PR itself:
| https://github.com/superfly/litefs/pull/47
| mildbyte wrote:
| Ah, sweet (and thanks for building Litestream/LiteFS)! This
| should work great for us, will definitely try to get a PoC
| going with this.
| pkhuong wrote:
| https://github.com/backtrace-labs/verneuil doesn't even need an
| HTTP connection between the writer and readers: readers only
| interact with the object store.
| detaro wrote:
| Did anyone see an explanation of why (i.e. what does the
| FUSE-based approach gain that the Litestream one doesn't)?
| simonw wrote:
| I'm sure Ben will pop in with an explanation soon, but my
| understanding is that this is mainly about safety.
|
| The way Litestream was doing replication required programmers
| to be extremely careful not to accidentally attempt a write to
| a replicated database copy - doing so would corrupt that copy,
| potentially in non-obvious ways. There was no practical
| mechanism for protecting people from making this mistake.
|
| The FUSE approach for LiteFS has more control, so can do a
| better job of protecting people from this kind of mistake.
| benbjohnson wrote:
| LiteFS author here. That's a good explanation, Simon.
| Preventing writes on the replica is a nice benefit to FUSE
| versus an external process.
|
| Another benefit of the control is that we can maintain a
| rolling checksum of the entire database at each transaction
| so we're able to verify integrity when replicating. That's
| also what allows us to do asynchronous replication across a
| loose membership of nodes since we can easily detect if a
| node diverges.
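|
| To make that concrete, here's a toy sketch (not the actual
| LiteFS scheme) of one way such a database-wide checksum can be
| kept cheap to update: XOR per-page checksums together and swap
| out a page's old contribution whenever it changes:
|
|     # Toy sketch only. XOR of per-page checksums lets the
|     # whole-database checksum be updated in O(1) per changed
|     # page instead of rehashing the entire file.
|     import zlib
|
|     class RollingChecksum:
|         def __init__(self):
|             self.value = 0
|             self.pages = {}  # page no -> checksum of last contents
|
|         def apply(self, page_no, data):
|             new = zlib.crc32(page_no.to_bytes(4, "big") + data)
|             self.value ^= self.pages.get(page_no, 0) ^ new
|             self.pages[page_no] = new
|             return self.value  # checksum after this page write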
| dotnwat wrote:
| What is the status of FUSE on macOS?
| benbjohnson wrote:
| It's not great. tv42 (who maintains the FUSE implementation we
| use) commented on it recently[1]. macOS support will be via a
| VFS extension in the future.
|
| [1]:
| https://github.com/superfly/litefs/issues/119#issuecomment-1...
| maxpert wrote:
| A shameless plug here: I've been working on a project of my own
| called [Marmot](https://maxpert.github.io/marmot/) with the
| intention of making SQLite replication masterless and eventually
| consistent. The project is in its very early stages, and I've
| been using it on one of my sites to replicate a cache based on
| SQLite. It builds on top of NATS (I love NATS), and distributes
| itself over JetStreams.
| alin23 wrote:
| And this is why choosing a good explicit name for a project
| doesn't matter too much. Litestream was SQ_Lite_ _Stream_ing
| replication and now that's exactly what it doesn't do.
|
| Requirements and features change with time, don't fret about
| names too much.
| adamckay wrote:
| It's still streaming replication, but to S3/other-storage for
| backups, not high availability.
| tekstar wrote:
| .. which was the only feature set when litestream was
| released. Replication was added later and now removed.
| dastbe wrote:
| litefs and litestream are interesting, but they continue not to
| support confirming that a transaction is durably replicated OR
| guaranteeing that failover won't cause data loss. until that
| point it just seems like a sequence of experiments.
| LunaSea wrote:
| I wonder if fly.io & co have built home-grown solutions for
| this?
| plugin-baby wrote:
| Fly.io bought Litestream
| hardwaresofton wrote:
| well they're in the best/only(?) spot to do it -- owning the
| platform the end-programmer's code is running on. SQLite is
| well built and extensible but hard to extend without the
| cooperation of the end-programmer, so to speak.
|
| SQLite is unfortunately (?) kind of hard to modify from
| external processes, and while it's built to be very extensible,
| that's often not something you can kind of just... turn on, if
| that makes sense, since SQLite lives in the address space of
| the executing program.
|
| You end up with stuff like hacking LD_PRELOAD[0].
|
| Note: Litestream (Ben) was essentially acquihired by Fly.io
| (so that should explain all their recent SQLite content!).
|
| [0]: https://github.com/cventers/sqlite3-preload
| benbjohnson wrote:
| Litestream/LiteFS author here. I agree that synchronous
| replication is important and we have plans to implement it in
| LiteFS. Because LiteFS supports a loose membership model,
| quorum-based acknowledgement doesn't really work as well since
| the quorum can change. We have some other pieces to put into
| place before synchronous replication can work well.
|
| However, I disagree that it's just a "sequence of experiments".
| There are a lot of applications that can benefit from
| asynchronous replication. Synchronous acknowledgement of writes
| can impose a high cost on throughput so many folks use async
| replication, even in systems like Postgres.
| losfair wrote:
| Maybe the inherent overhead of synchronous replication is
| more on _latency_ than on _throughput_.
| dastbe wrote:
| synchronous replication to replicas is not necessary for data
| durability and doesn't have to be a huge drag on throughput.
| for example, you can achieve high throughput by pipelining
| uncommitted transactions and then tracking when they are
| durably committed to the backing store (like s3) to decide
| when to ack back to clients. and when dealing with failover,
| you can use the central store to determine the place in the
| ledger rather than whatever is on the replica that happens to
| get leadership.
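|
| a bare-bones sketch of that pattern (the upload call is just a
| placeholder for a real object-store write): transactions pile
| up in a queue, get flushed in batches, and each client is only
| acknowledged once the batch holding its transaction is durable:
|
|     # Sketch only; `upload` stands in for e.g. an S3 PUT.
|     import queue, threading
|
|     pending = queue.Queue()      # (txn, ack) awaiting durability
|
|     def upload(batch):           # placeholder object-store write
|         pass
|
|     def submit(txn):
|         ack = threading.Event()
|         pending.put((txn, ack))
|         return ack               # reply OK only after ack.wait()
|
|     def flusher():
|         while True:
|             batch = [pending.get()]
|             while not pending.empty():   # drain the backlog
|                 batch.append(pending.get_nowait())
|             upload([txn for txn, _ in batch])
|             for _, ack in batch:
|                 ack.set()                # now safe to ack clients
|
|     threading.Thread(target=flusher, daemon=True).start()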
| rubenv wrote:
| Any chance this works together with Kubernetes?
| benbjohnson wrote:
| LiteFS author here. I haven't tested it on Kubernetes yet but
| it is meant to be deployed anywhere. The only dependency is
| Consul, although you could get around that by using a static
| leader[1].
|
| [1]: https://github.com/superfly/litefs/pull/47
| openthc wrote:
| Why not use https://github.com/rqlite/rqlite ?
|
| You have to write to SQLite via the rqlite HTTP API, but it
| will replicate the data to N nodes (at least 20) via Raft, and
| then others can read the SQLite replica files directly in
| read-only mode; file permissions prevent accidental writes.
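|
| For example, a write goes through the HTTP API rather than the
| SQLite file (host, port and table here are placeholders):
|
|     # Illustrative write via rqlite's /db/execute endpoint.
|     import json, urllib.request
|
|     stmts = [["INSERT INTO foo(name) VALUES(?)", "fiona"]]
|     req = urllib.request.Request(
|         "http://localhost:4001/db/execute",
|         data=json.dumps(stmts).encode(),
|         headers={"Content-Type": "application/json"},
|     )
|     print(urllib.request.urlopen(req).read().decode())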
| hardwaresofton wrote:
| rqlite is a nice project, but you kind of covered it:
|
| > You have to write to the SQLite via the rqlite HTTP API
|
| Also requiring _determinism_ (i.e. no RANDOM()) is something I
| don't think I really want to worry about. There are a few
| tradeoffs for rqlite (and dqlite too, to be fair) that just
| don't seem to be quite worth it (especially compared to just
| running Postgres).
|
| I think people are realizing that having one far-away writer is
| actually _fine_ -- 90%+ of the traffic you're trying to serve
| fast is read queries.
| otoolep wrote:
| Actually rqlite release 7.7.0[1] adds support for RANDOM().
| Support for timestamp functions will be added in an upcoming
| release. It does this by statement-rewriting[2] before
| sending the SQL to the Raft log.
|
| [1] https://www.philipotoole.com/rqlite-7-7-0-released/
|
| [2] https://github.com/rqlite/rqlite/blob/master/DOC/NON_DETERMI...
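|
| A toy illustration of the rewriting idea (rqlite uses a real
| SQL parser, not a regex): evaluate the non-deterministic call
| once on the leader and ship a deterministic statement through
| the log, so every node applies the same value:
|
|     # Toy only -- shows the concept, not rqlite's implementation.
|     import random, re
|
|     def rewrite(sql):
|         return re.sub(r"RANDOM\(\)",
|                       lambda _: str(random.getrandbits(63)),
|                       sql, flags=re.IGNORECASE)
|
|     print(rewrite("INSERT INTO t(id) VALUES (RANDOM())"))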
| otoolep wrote:
| And benbjohnson wrote the SQL parser[1] rqlite uses to do
| all this. So you see, the man is everywhere. :-)
|
| [1] https://github.com/rqlite/sql
| hardwaresofton wrote:
| TIL, thank you for pointing this out -- maybe it's time to
| re-evaluate my stance on rqlite & dqlite
| zdw wrote:
| What's the overhead of using FUSE to implement LiteFS?
|
| Is there an issue with OS compatibility? FUSE tends to require OS
| hooks, last I checked, and that can be somewhat hairy to deal
| with.
| benbjohnson wrote:
| > What's the overhead of using FUSE to implement LiteFS?
|
| It's a tricky question to answer. Most of the noticeable
| overhead is on the write side. Initial benchmarks of overhead
| that I've seen locally are about 250us for the write(2) and
| fsync(2) calls. It's closer to 100us for read(2) calls. There
| are also additional writes made behind the scenes to store data
| in a replication format for the other nodes.
|
| However, on the read side some of that is moot. For many
| databases, most reads will be in the OS page cache and a fetch
| from there seems to be closer to 4us. If you're running a
| moderately sized database (e.g. 1GB) on even a modest VM (e.g.
| 256MB of RAM) then most of your hot pages will be in the OS page
| cache so you shouldn't notice much overhead on the read side.
|
| LiteFS is targeted at read-heavy workloads. If you need high
| write throughput of thousands of writes per second then LiteFS
| probably isn't a good fit.
|
| > Is there an issue with OS compatibility?
|
| LiteFS is Linux-only right now. We'll be supporting other operating
| systems via a SQLite VFS extension in the future. macOS has
| poor FUSE support right now and I'm not sure where Windows and
| BSD stand with their support for FUSE or FUSE-like systems.
| yellowapple wrote:
| > We'll be supporting other operating systems via a SQLite
| VFS extension in the future.
|
| Are there advantages to a FUSE-based approach over a VFS-
| based approach?
| sharps_xp wrote:
| How do you turn a local-first, file-system-based database into
| a cloud software product with vendor lock-in?
| benbjohnson wrote:
| How is there vendor lock-in? The LiteFS & Litestream code is
| all open source under Apache 2.
| sharps_xp wrote:
| i think both fly.io and litestream were projects that spoke
| sweet words to the average developer wanting to build
| features without the infra headache, and what made sqlite so
| appealing was its simplicity. litestream kept that simplicity
| too. but i don't think the avg developer wants to spin up
| their own LiteFS. At a glance at the repo, i have no idea how
| to deploy this thing. It would have served fly and litestream
| users better to have kept replication within litestream.
|
| just as no one using postgres on RDS will ever leave RDS, not
| b/c RDS is so much better than its competitors but because the
| hurdle is too great and the migration too risky. right now, fly
| is the only one that lessens the burden of using LiteFS, and as
| long as they're the only one, the average developer is
| essentially locked in.
| infogulch wrote:
| In a previous post about LiteFS [1], the creator Ben commented on
| how clients could maintain a monotonically consistent view of the
| database [2] even in the presence of replication lag and updates
| made by the client. I think this is a pretty good (TM) strategy
| that should work well for a majority of applications.
|
| > > To improve latency, we're aiming at a scale-out model that
| works similarly to Fly Postgres. That's to say: writes get
| forwarded to the primary and all read requests get served from
| their local copies.
|
| > How can you ensure that a client that just performed a
| forwarded write will be able to read that back on their local
| replica on subsequent reads?
|
| > LiteFS provides a transaction ID that applications can use to
| determine replication lag. If the replica is behind the TXID, it
| can either wait or it can forward to the primary to ensure
| consistency.
|
| [1]: https://news.ycombinator.com/item?id=32925734#32928974
|
| [2]: I think this is a reasonable statement, but may not be
| industry standard terminology.
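|
| A hand-wavy sketch of that read path (the helpers here are
| hypothetical stand-ins, not LiteFS's actual API): the client
| remembers the TXID its forwarded write produced, and a replica
| either waits to catch up to it or forwards the read to the
| primary:
|
|     # Hypothetical helpers -- not LiteFS's API. `seen_txid` is
|     # the TXID returned by the client's last forwarded write.
|     import time
|
|     def read(sql, seen_txid, replica, primary, timeout=1.0):
|         deadline = time.monotonic() + timeout
|         while replica.txid() < seen_txid:
|             if time.monotonic() >= deadline:
|                 return primary.query(sql)  # give up waiting
|             time.sleep(0.01)
|         return replica.query(sql)  # replica has caught up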
___________________________________________________________________
(page generated 2022-10-14 23:02 UTC)