[HN Gopher] Litestream live replication has been moved to the LiteFS project
       ___________________________________________________________________
        
       Litestream live replication has been moved to the LiteFS project
        
       Author : hardwaresofton
       Score  : 78 points
       Date   : 2022-10-14 14:53 UTC (8 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | losfair wrote:
       | Shameless plug of my mvSQLite [1] project here! It's basically
       | another distributed SQLite (that is API-compatible with
       | libsqlite3), but with support for everything expected from a
       | proper distributed database: synchronous replication, strictly
       | serializable transactions, + scalable reads _and writes_ w/
       | multiple concurrent writers.
       | 
       | [1] https://github.com/losfair/mvsqlite
        
         | ok_dad wrote:
         | > Distributed, MVCC SQLite that runs on top of FoundationDB.
         | 
         | FYI to anyone here, FoundationDB is fucking awesome for
         | something like this.
         | 
         | Question @losfair: Did you find the Rust bindings for FDB to be
         | very good? The Go bindings are OK, but are pretty out-of-date
         | with some cool new features on the HEAD of the FDB source repo.
        
         | endisneigh wrote:
         | mvSQLite looks great, though I'm curious how you'd implement a
         | schema migration given the locking properties of SQLite
         | combined with the transaction limits of fdb.
         | 
         | I imagine you'd get an FDB transaction time limit error
         | preventing any schema migrations with non trivial amounts of
         | data.
        
         | darthShadow wrote:
         | I really wanted to give this a try but the lack of WAL support
         | has prevented me from using it. With the recent addition of WAL
         | support[1] in litefs, would it be possible to add the same to
         | mvsqlite too?
         | 
         | [1] https://github.com/superfly/litefs/pull/120
        
       | mildbyte wrote:
       | The live replication (as it used to work in Litestream before the
       | LiteFS move, without Consul) would have been perfect for our use
       | case with Seafowl (I played around with Litestream before that
       | but had to settle on PostgreSQL for the sample multi-node
       | deployment [0]):
       | 
       | - rare writes that get directed to a single instance (e.g. using
       | Fly.io's replay header), frequent reads (potentially at edge
       | locations)
       | 
       | - no need to deploy a PostgreSQL cluster and set up logical
       | replication
       | 
       | - SQLite database stored in object storage, reader replicas can
       | boot up using the object storage copy and then get kept in sync
       | by pulling data from the writer
       | 
       | - delay in replication is fine
       | 
       | LiteFS is probably going to be a great solution here since we're
       | mainly using Fly.io and it has built-in support for it [1], but
       | are there any alternatives that don't require Consul, still look
       | like an SQLite database to the client and can work off an HTTP
       | connection to the primary, so that we don't have to require our
       | users to deploy to Fly?
       | 
       | [0] https://seafowl.io/docs/guides/scaling-multiple-nodes
       | 
       | [1] https://fly.io/docs/litefs/getting-started/
        
         | benbjohnson wrote:
         | LiteFS author here. You can also set up LiteFS to have a
         | single, static leader instead of using Consul if you don't want
         | that dependency. I need to write up some docs but there's some
         | info on the PR itself:
         | https://github.com/superfly/litefs/pull/47
        
           | mildbyte wrote:
           | Ah, sweet (and thanks for building Litestream/LiteFS)! This
           | should work great for us, will definitely try to get a PoC
           | going with this.
        
         | pkhuong wrote:
         | https://github.com/backtrace-labs/verneuil doesn't even need an
         | HTTP connection between the writer and readers: readers only
         | interact with the object store.
        
       | detaro wrote:
       | Did anyone see a "why" (i.e. what does the FUSE-based approach
       | gain that the Litestream one doesn't)?
        
         | simonw wrote:
         | I'm sure Ben will pop in with an explanation soon, but my
         | understanding is that this is mainly about safety.
         | 
         | The way Litestream was doing replication required programmers
         | to be extremely careful not to accidentally attempt a write to
         | a replicated database copy - doing so would corrupt that copy,
         | potentially in non-obvious ways. There was no practical
         | mechanism for protecting people from making this mistake.
         | 
         | The FUSE approach for LiteFS has more control, so can do a
         | better job of protecting people from this kind of mistake.
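         |
         | As a rough illustration of the kind of guard you'd want at the
         | application level (not something Litestream itself provided):
         | if the app only ever opens the replica copy through a read-
         | only connection, a stray write fails fast instead of silently
         | corrupting the copy. This assumes Go with the mattn/go-sqlite3
         | driver and a made-up replica path:
         |
         |   package main
         |
         |   import (
         |       "database/sql"
         |       "log"
         |
         |       _ "github.com/mattn/go-sqlite3" // SQLite driver
         |   )
         |
         |   func main() {
         |       // mode=ro makes SQLite reject writes on this
         |       // connection, so an accidental INSERT/UPDATE errors
         |       // out instead of corrupting the replicated copy.
         |       db, err := sql.Open("sqlite3",
         |           "file:/data/replica.db?mode=ro")
         |       if err != nil {
         |           log.Fatal(err)
         |       }
         |       defer db.Close()
         |
         |       _, err = db.Exec("INSERT INTO t (x) VALUES (1)")
         |       if err != nil {
         |           log.Printf("write rejected as expected: %v", err)
         |       }
         |   }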
        
           | benbjohnson wrote:
           | LiteFS author here. That's a good explanation, Simon.
           | Preventing writes on the replica is a nice benefit to FUSE
           | versus an external process.
           | 
           | Another benefit of the control is that we can maintain a
           | rolling checksum of the entire database at each transaction
           | so we're able to verify integrity when replicating. That's
           | also what allows us to do asynchronous replication across a
           | loose membership of nodes since we can easily detect if a
           | node diverges.
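           |
           | A rough sketch of that kind of rolling checksum (purely
           | illustrative, not LiteFS's actual scheme): hash each page
           | individually and XOR the page hashes together, so a page
           | update only XORs out the old page's hash and XORs in the
           | new one instead of rehashing the whole database.
           |
           |   package main
           |
           |   import (
           |       "encoding/binary"
           |       "fmt"
           |       "hash/crc64"
           |   )
           |
           |   var table = crc64.MakeTable(crc64.ISO)
           |
           |   // pageHash mixes the page number into the page's
           |   // checksum so identical pages at different offsets
           |   // hash differently.
           |   func pageHash(pgno uint32, data []byte) uint64 {
           |       var buf [4]byte
           |       binary.BigEndian.PutUint32(buf[:], pgno)
           |       h := crc64.Update(0, table, buf[:])
           |       return crc64.Update(h, table, data)
           |   }
           |
           |   // Rolling database checksum: XOR of all page hashes.
           |   // update removes the old page's contribution and adds
           |   // the new page's.
           |   func update(sum uint64, pgno uint32, oldPg, newPg []byte) uint64 {
           |       return sum ^ pageHash(pgno, oldPg) ^ pageHash(pgno, newPg)
           |   }
           |
           |   func main() {
           |       oldPg := make([]byte, 4096) // page 3 before a txn
           |       newPg := make([]byte, 4096) // page 3 after the txn
           |       newPg[0] = 0xFF
           |
           |       sum := update(0, 3, oldPg, newPg)
           |       fmt.Printf("checksum after txn: %016x\n", sum)
           |   }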
        
       | dotnwat wrote:
       | What is the status of FUSE on macOS?
        
         | benbjohnson wrote:
         | It's not great. tv42 (who maintains the FUSE implementation we
         | use) commented on it recently[1]. macOS support will be via a
         | VFS extension in the future.
         | 
         | [1]:
         | https://github.com/superfly/litefs/issues/119#issuecomment-1...
        
       | maxpert wrote:
       | A shameless plug here: I've been working on a project of my own
       | called [Marmot](https://maxpert.github.io/marmot/) with the
       | intention of making SQLite replication masterless and eventually
       | consistent. The project is in very early stages, and I've been
       | using it on one of my sites, replicating a cache based on
       | SQLite. It builds on top of NATS (I love NATS), and distributes
       | itself over JetStream.
        
       | alin23 wrote:
       | And this is why choosing a good explicit name for a project
       | doesn't matter too much. Litestream was SQ_Lite_ _Stream_ing
       | replication and now that's exactly what it doesn't do.
       | 
       | Requirements and features change with time, don't fret about
       | names too much.
        
         | adamckay wrote:
         | It's still streaming replication, but to S3/other-storage for
         | backups, not high availability.
        
           | tekstar wrote:
           | .. which was the only feature set when litestream was
           | released. Replication was added later and now removed.
        
       | dastbe wrote:
       | litefs and litestream are interesting, but they all continue to
       | not support confirmation that a transaction is durably replicated
       | OR that failover won't cause data loss. until that point it just
       | seems like a sequence of experiments.
        
         | LunaSea wrote:
         | I wonder if fly.io & co have built home grown solutions for
         | this?
        
           | plugin-baby wrote:
           | Fly.io bought Litestream
        
           | hardwaresofton wrote:
           | well they're in the best/only(?) spot to do it -- owning the
           | platform that the end-programmer's code runs on. SQLite is
           | well built and extensible, but hard to extend without the
           | cooperation of the end-programmer, so to speak.
           | 
           | SQLite is unfortunately (?) kind of hard to modify from an
           | external process -- it's built to be very extensible, but
           | that extensibility often isn't something you can just...
           | turn on, if that makes sense, since SQLite lives in the
           | address space of the executing program.
           | 
           | You end up with stuff like hacking LD_PRELOAD[0].
           | 
           | Note: Litestream (Ben) was essentially acqui-hired by Fly.io
           | (so that should explain all their recent SQLite content!).
           | 
           | [0]: https://github.com/cventers/sqlite3-preload
        
         | benbjohnson wrote:
         | Litestream/LiteFS author here. I agree that synchronous
         | replication is important and we have plans to implement it in
         | LiteFS. Because LiteFS supports a loose membership model,
         | quorum-based acknowledgement doesn't really work as well since
         | the quorum can change. We have some other pieces to put into
         | place before synchronous replication can work well.
         | 
         | However, I disagree that it's just a "sequence of experiments".
         | There are a lot of applications that can benefit from
         | asynchronous replication. Synchronous acknowledgement of writes
         | can impose a high cost on throughput so many folks use async
         | replication, even in systems like Postgres.
        
           | losfair wrote:
           | Maybe the inherent overhead of synchronous replication is
           | more on _latency_ than on _throughput_.
        
           | dastbe wrote:
           | synchronous replication to replicas is not necessary for data
           | durability and doesn't have to be a huge drag on throughput.
           | for example, you can achieve high throughput by pipelining
           | uncommitted transactions and tracking when they are durably
           | committed to the backing store (like s3) to decide when to
           | ack back to clients. and when dealing with failover, you can
           | use the central store to determine the place in the ledger,
           | rather than whatever happens to be on the replica that gets
           | leadership.
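           |
           | a toy sketch of that pipelining idea (hypothetical names,
           | nothing to do with litefs internals): writes get sequence
           | numbers, and clients are acked only once a background
           | uploader reports that the durable watermark in the backing
           | store has passed their sequence.
           |
           |   package main
           |
           |   import (
           |       "fmt"
           |       "sync"
           |   )
           |
           |   // Acker releases client acks once the durable watermark
           |   // (highest seq confirmed in the backing store, e.g. s3)
           |   // has passed each write's sequence number.
           |   type Acker struct {
           |       mu      sync.Mutex
           |       durable uint64
           |       waiters map[uint64][]chan struct{}
           |   }
           |
           |   func NewAcker() *Acker {
           |       return &Acker{waiters: map[uint64][]chan struct{}{}}
           |   }
           |
           |   // Wait returns a channel that fires once seq is durable.
           |   func (a *Acker) Wait(seq uint64) <-chan struct{} {
           |       ch := make(chan struct{}, 1)
           |       a.mu.Lock()
           |       defer a.mu.Unlock()
           |       if seq <= a.durable {
           |           ch <- struct{}{}
           |           return ch
           |       }
           |       a.waiters[seq] = append(a.waiters[seq], ch)
           |       return ch
           |   }
           |
           |   // Advance is called by the uploader after everything up
           |   // to seq has been durably written to the backing store.
           |   func (a *Acker) Advance(seq uint64) {
           |       a.mu.Lock()
           |       defer a.mu.Unlock()
           |       for s := a.durable + 1; s <= seq; s++ {
           |           for _, ch := range a.waiters[s] {
           |               ch <- struct{}{}
           |           }
           |           delete(a.waiters, s)
           |       }
           |       a.durable = seq
           |   }
           |
           |   func main() {
           |       a := NewAcker()
           |       done := a.Wait(3) // client issued write #3
           |       a.Advance(5)      // uploader: #1..#5 now durable
           |       <-done
           |       fmt.Println("write #3 acked to client")
           |   }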
        
       | rubenv wrote:
       | Any chance this works together with Kubernetes?
        
         | benbjohnson wrote:
         | LiteFS author here. I haven't tested it on Kubernetes yet but
         | it is meant to be deployed anywhere. The only dependency is
         | Consul although you could get around that by using a static
         | leader[1].
         | 
         | [1]: https://github.com/superfly/litefs/pull/47
        
       | openthc wrote:
       | Why not use https://github.com/rqlite/rqlite ?
       | 
       | You have to write to the SQLite database via the rqlite HTTP API,
       | but it will replicate the data to N nodes (at least 20) via Raft,
       | and then others can read the SQLite replica files directly in
       | read-only mode; file permissions prevent accidental writes.
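       |
       | For anyone who hasn't used it, a write through rqlite's HTTP API
       | looks roughly like this (the address and table are made up; the
       | endpoint and JSON shape are from rqlite's docs):
       |
       |   package main
       |
       |   import (
       |       "bytes"
       |       "fmt"
       |       "io"
       |       "log"
       |       "net/http"
       |   )
       |
       |   func main() {
       |       // SQL statements go to the leader's /db/execute endpoint
       |       // as a JSON array; rqlite replicates them via Raft.
       |       stmts := []byte(`["INSERT INTO foo(name) VALUES('fiona')"]`)
       |
       |       resp, err := http.Post("http://localhost:4001/db/execute",
       |           "application/json", bytes.NewReader(stmts))
       |       if err != nil {
       |           log.Fatal(err)
       |       }
       |       defer resp.Body.Close()
       |
       |       body, _ := io.ReadAll(resp.Body)
       |       fmt.Println(string(body)) // JSON results / errors
       |   }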
        
         | hardwaresofton wrote:
         | rqlite is a nice project, but you kind of covered it:
         | 
         | > You have to write to the SQLite via the rqlite HTTP API
         | 
         | Also requiring _determinism_ (i.e. no RANDOM()) is
         | something I don't think I really want to worry about. There
         | are a few tradeoffs for rqlite (and dqlite too, to be fair)
         | that just don't seem to be quite worth it (especially compared
         | to just running Postgres).
         | 
         | I think people are realizing that having one far-away writer is
         | actually _fine_ -- 90%+ of the traffic you're trying to serve
         | fast is read queries.
        
           | otoolep wrote:
           | Actually rqlite release 7.7.0[1] adds support for RANDOM().
           | Support for timestamp functions will be added in an upcoming
           | release. It does this by statement rewriting[2] before
           | sending the SQL to the Raft log.
           | 
           | [1] https://www.philipotoole.com/rqlite-7-7-0-released/
           | 
           | [2] https://github.com/rqlite/rqlite/blob/master/DOC/NON_DETE
           | RMI...
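           |
           | The gist of the rewrite, very roughly (a toy regex
           | illustration; rqlite uses a real SQL parser for this): the
           | leader evaluates the non-deterministic function once and
           | substitutes a literal, so every node applying the Raft log
           | sees the same value.
           |
           |   package main
           |
           |   import (
           |       "fmt"
           |       "math/rand"
           |       "regexp"
           |       "strconv"
           |   )
           |
           |   var randomCall = regexp.MustCompile(`(?i)\bRANDOM\(\)`)
           |
           |   // rewrite replaces RANDOM() with a literal chosen on the
           |   // leader, making the statement deterministic before it
           |   // enters the Raft log.
           |   func rewrite(sql string) string {
           |       return randomCall.ReplaceAllStringFunc(sql,
           |           func(string) string {
           |               return strconv.FormatInt(rand.Int63(), 10)
           |           })
           |   }
           |
           |   func main() {
           |       fmt.Println(rewrite("INSERT INTO t (id) VALUES (RANDOM())"))
           |   }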
        
             | otoolep wrote:
             | And benbjohnson wrote the SQL parser[1] rqlite uses to do
             | all this. So you see, the man is everywhere. :-)
             | 
             | [1] https://github.com/rqlite/sql
        
             | hardwaresofton wrote:
             | TIL, thank you for pointing this out -- maybe it's time to
             | re-evaluate my stance on rqlite & dqlite
        
       | zdw wrote:
       | What's the overhead of using FUSE to implement LiteFS?
       | 
       | Is there an issue with OS compatibility? FUSE tends to require OS
       | hooks, last I checked, and that can be somewhat hairy to deal
       | with.
        
         | benbjohnson wrote:
         | > What's the overhead of using FUSE to implement LiteFS?
         | 
         | It's a tricky question to answer. Most of the noticeable
         | overhead is on the write side. Initial benchmarks of overhead
         | that I've seen locally are about 250us for the write(2) and
         | fsync(2) calls. It's closer to 100us for read(2) calls. There
         | are also additional writes made behind the scenes to store the
         | data in a replication format for the other nodes.
         | 
         | However, on the read side some of that is moot. For many
         | databases, most reads will be in the OS page cache and a fetch
         | from there seems to be closer to 4us. If you're running a
         | moderately sized database (e.g. 1GB) on even a modest VM (e.g.
         | 256MB RAM) then most of your hot pages will be in the OS page
         | cache so you shouldn't notice much overhead on the read side.
         | 
         | LiteFS is targeted at read-heavy workloads. If you need high
         | write throughput of thousands of writes per second then LiteFS
         | probably isn't a good fit.
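         |
         | (Back-of-the-envelope: at ~250us of added latency per fsynced
         | write, a single serial writer tops out somewhere around 4,000
         | transactions per second from the FUSE overhead alone, before
         | counting any SQLite or disk cost.)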
         | 
         | > Is there an issue with OS compatibility?
         | 
         | LiteFS is Linux right now. We'll be supporting other operating
         | systems via a SQLite VFS extension in the future. macOS has
         | poor FUSE support right now and I'm not sure where Windows and
         | BSD stand with their support for FUSE or FUSE-like systems.
        
           | yellowapple wrote:
           | > We'll be supporting other operating systems via a SQLite
           | VFS extension in the future.
           | 
           | Are there advantages to a FUSE-based approach over a VFS-
           | based approach?
        
       | sharps_xp wrote:
       | How do you turn a local-first, file-system-based database into a
       | cloud software product with vendor lock-in?
        
         | benbjohnson wrote:
         | How is there vendor lock-in? The LiteFS & Litestream code is
         | all open source under Apache 2.
        
           | sharps_xp wrote:
           | i think both fly.io and litestream were projects that spoke
           | sweet words to the average developer wanting to build
           | features without the infra headache, and what made sqlite so
           | appealing was its simplicity. litestream kept that simplicity
           | too. but i don't think the avg developer wants to spin up
           | their own LiteFS. Glancing at the repo, i have no idea how
           | to deploy this thing. It would have served fly and litestream
           | users better to have kept replication within litestream.
           | 
           | just as no one using postgres on RDS will ever leave RDS not
           | b/c RDS is so much better than its competitors but because
           | the hurdle is too great and the migration so risky. right
           | now, fly is the only one who lessens the burden to use LiteFS
           | and as long as they're the only one, the average developer is
           | essentially locked in.
        
       | infogulch wrote:
       | In a previous post about LiteFS [1], the creator Ben commented on
       | how clients could maintain a monotonically consistent view of the
       | database [2] even in the presence of replication lag and updates
       | made by the client. I think this is a pretty good (TM) strategy
       | that should work well for a majority of applications.
       | 
       | > > To improve latency, we're aiming at a scale-out model that
       | works similarly to Fly Postgres. That's to say: writes get
       | forwarded to the primary and all read requests get served from
       | their local copies.
       | 
       | > How can you ensure that a client that just performed a
       | forwarded write will be able to read that back on their local
       | replica on subsequent reads?
       | 
       | > LiteFS provides a transaction ID that applications can use to
       | determine replication lag. If the replica is behind the TXID, it
       | can either wait or it can forward to the primary to ensure
       | consistency.
       | 
       | [1]: https://news.ycombinator.com/item?id=32925734#32928974
       | 
       | [2]: I think this is a reasonable statement, but may not be
       | industry standard terminology.
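       |
       | A sketch of how an application might act on that TXID (helper
       | names are hypothetical; it assumes the replica's current TXID
       | can be read locally somehow, since per the quote above LiteFS
       | provides one): the client carries the TXID of its last write,
       | and the replica either waits to catch up or forwards the read
       | to the primary.
       |
       |   package main
       |
       |   import (
       |       "errors"
       |       "time"
       |   )
       |
       |   // currentTXID would return the replica's replication
       |   // position (hypothetical; e.g. read from wherever the
       |   // replication layer exposes it).
       |   func currentTXID() uint64 { return 0 }
       |
       |   var errForward = errors.New("replica behind; forward to primary")
       |
       |   // waitForTXID blocks until the replica has applied at least
       |   // txid (the TXID the client saw after its last write), or
       |   // gives up so the caller can forward the read instead.
       |   func waitForTXID(txid uint64, timeout time.Duration) error {
       |       deadline := time.Now().Add(timeout)
       |       for time.Now().Before(deadline) {
       |           if currentTXID() >= txid {
       |               return nil // caught up: serve the read locally
       |           }
       |           time.Sleep(5 * time.Millisecond)
       |       }
       |       return errForward
       |   }
       |
       |   func main() {
       |       // e.g. the client sent back the TXID from its last write
       |       if err := waitForTXID(42, 200*time.Millisecond); err != nil {
       |           // forward the request to the primary node instead
       |       }
       |   }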
        
       ___________________________________________________________________
       (page generated 2022-10-14 23:02 UTC)