[HN Gopher] SQLedge: Replicate Postgres to SQLite on the Edge
___________________________________________________________________
SQLedge: Replicate Postgres to SQLite on the Edge
Author : clessg
Score : 257 points
Date : 2023-08-09 14:29 UTC (8 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| maxpert wrote:
| Very interesting! I have a question (out of my experience in
| https://github.com/maxpert/marmot): how do you get around the
| boot time, especially when the change log of a table is pretty
| large in Postgres? I've implemented a snapshotting mechanism in
| Marmot as part of quickly getting up to speed. At some level I
| wonder if we can just feed this PG replication log into a NATS
| cluster and Marmot can just replicate it across the board.
| maxfurman wrote:
| This is pretty neat! One question: if all your queries have to be
| SQLite-compatible, doesn't that defeat the purpose of using PG in
| the first place? Maybe SQLite supports more PG features than I
| thought, but if for example your app uses pgvector or pgcrypto
| you might have issues here.
| zknill wrote:
| Yes, absolutely, and this is going to be one of the hardest
| tech challenges to solve. I've thought a little about it, and
| it's probably unrealistic to think that we can translate every
| single PG statement into a SQLite one, especially when PG has
| extensions. So we're probably destined to use the local SQLite
| database for queries we can parse and understand, and forward
| all the others (both reads and writes) to the upstream PG
| server.
|
| This slightly breaks the model of having a local copy serve
| data faster, but if only a minority of queries use a format
| that we don't understand in SQLite, then only that minority of
| queries will suffer the full latency to the main PG server.
| mvaliente2001 wrote:
| For one moment I thought this was the equivalent of SQLite for
| EdgeDB and the idea made me very happy.
| hrdwdmrbl wrote:
| This space is starting to get crowded. Can anyone compare this
| with some of the other solutions coming out recently?
| DANmode wrote:
| List a couple?
| arve0 wrote:
| A few I know: rqlite, dqlite, SQLite wasm, litestream.
| otoolep wrote:
| rqlite[1] creator here, happy to answer any questions.
|
| [1] https://www.rqlite.io
| westurner wrote:
| #. SQLite WAL mode
|
| From https://www.sqlite.org/isolation.html
| https://news.ycombinator.com/item?id=32247085 :
|
| > [sqlite] _WAL mode permits simultaneous readers and
| writers. It can do this because changes do not overwrite the
| original database file, but rather go into the separate
| write-ahead log file. That means that readers can continue to
| read the old, original, unaltered content from the original
| database file at the same time that the writer is appending
| to the write-ahead log_
|
| #. superfly/litefs: a FUSE-based file system for replicating
| SQLite https://github.com/superfly/litefs
|
| #. sqldiff: https://www.sqlite.org/sqldiff.html
| https://news.ycombinator.com/item?id=31265005
|
| #. dolthub/dolt: https://github.com/dolthub/dolt :
|
| > _Dolt is a SQL database that you can fork, clone, branch,
| merge, push and pull just like a Git repository._ [...]
|
| > _Dolt can be set up as a replica of your existing MySQL or
| MariaDB database using standard MySQL binlog replication.
| Every write becomes a Dolt commit. This is a great way to get
| the version control benefits of Dolt and keep an existing
| MySQL or MariaDB database._
|
| #. github/gh-ost: https://github.com/github/gh-ost :
|
| > _Instead, gh-ost uses the binary log stream to capture
| table changes, and asynchronously applies them onto the ghost
| table. gh-ost takes upon itself some tasks that other tools
| leave for the database to perform. As result, gh-ost has
| greater control over the migration process; can truly suspend
| it; can truly decouple the migration's write load from the
| master's workload._
|
| #. vlcn-io/cr-sqlite: https://github.com/vlcn-io/cr-sqlite :
|
| > _Convergent, Replicated SQLite. Multi-writer and CRDT
| support for SQLite_
|
| > _CR-SQLite is a run-time loadable extension for SQLite and
| libSQL. It allows merging different SQLite databases together
| that have taken independent writes._
|
| > _In other words, you can write to your SQLite database
| while offline. I can write to mine while offline. We can then
| both come online and merge our databases together, without
| conflict._
|
| > _In technical terms: cr-sqlite adds multi-master
| replication and partition tolerance to SQLite via conflict
| free replicated data types (CRDTs) and/or causally ordered
| event logs._
|
| yjs also does CRDTs (e.g. Jupyter RTC)
|
| #. pganalyze/libpg_query:
| https://github.com/pganalyze/libpg_query :
|
| > _C library for accessing the PostgreSQL parser outside of
| the server environment_
|
| #. Ibis + Substrait [ + DuckDB ]
| https://ibis-project.org/blog/ibis_substrait_to_duckdb/ :
|
| > _ibis strives to provide a consistent interface for
| interacting with a multitude of different analytical
| execution engines, most of which (but not all) speak some
| dialect of SQL._
|
| > _Today, Ibis accomplishes this with a lot of help from
| `sqlalchemy` and `sqlglot` to handle differences in dialect,
| or we interact directly with available Python bindings (for
| instance with the pandas, datafusion, and polars backends)._
|
| > [...] _`Substrait` is a new cross-language serialization
| format for communicating (among other things) query plans.
| It's still in its early days, but there is already nascent
| support for Substrait in Apache Arrow, DuckDB, and Velox._
|
| #. ibis-project/ibis-substrait:
| https://github.com/ibis-project/ibis-substrait
|
| #. tobymao/sqlglot: https://github.com/tobymao/sqlglot :
|
| > _SQLGlot is a no-dependency SQL parser, transpiler,
| optimizer, and engine. It can be used to format SQL or
| translate between 19 different dialects like DuckDB, Presto,
| Spark, Snowflake, and BigQuery. It aims to read a wide
| variety of SQL inputs and output syntactically and
| semantically correct SQL in the targeted dialects._
|
| > _It is a very comprehensive generic SQL parser with a
| robust test suite. It is also quite performant, while being
| written purely in Python._
|
| > _You can easily customize the parser, analyze queries,
| traverse expression trees, and programmatically build SQL._
|
| > _Syntax errors are highlighted and dialect
| incompatibilities can warn or raise depending on
| configurations. However, it should be noted that SQL
| validation is not SQLGlot's goal, so some syntax errors may
| go unnoticed._
|
| #. benbjohnson/postlite:
| https://github.com/benbjohnson/postlite :
|
| > _postlite is a network proxy to allow access to remote
| SQLite databases over the Postgres wire protocol. This allows
| GUI tools to be used on remote SQLite databases which can
| make administration easier._
|
| > _The proxy works by translating Postgres frontend wire
| messages into SQLite transactions and converting results back
| into Postgres response wire messages. Many Postgres clients
| also inspect the pg_catalog to determine system information
| so Postlite mirrors this catalog by using an attached in-
| memory database with virtual tables. The proxy also performs
| minor rewriting on these system queries to convert them to
| usable SQLite syntax._
|
| > _Note: This software is in alpha. Please report bugs.
| Postlite doesn't alter your database unless you issue
| INSERT, UPDATE, DELETE commands so it's probably safe. If
| anything, the Postlite process may die but it shouldn't
| affect your database._
|
| #. > "Hosting SQLite Databases on GitHub Pages" (2021) re:
| sql.js-httpvfs, DuckDB
| https://news.ycombinator.com/item?id=28021766
|
| #. >> - bittorrent/sqltorrent
| https://github.com/bittorrent/sqltorrent
|
| >> _Sqltorrent is a custom VFS for sqlite which allows
| applications to query an sqlite database contained within a
| torrent. Queries can be processed immediately after the
| database has been opened, even though the database file is
| still being downloaded. Pieces of the file which are required
| to complete a query are prioritized so that queries complete
| reasonably quickly even if only a small fraction of the whole
| database has been downloaded._
|
| #. simonw/datasette-lite:
| https://github.com/simonw/datasette-lite
| (datasette, *-to-sqlite, dogsheep)
|
| "Loading SQLite databases" [w/ datasette]
| https://github.com/simonw/datasette-lite#loading-sqlite-
| data...
|
| #. awesome-db-tools:
| https://github.com/mgramin/awesome-db-tools
|
| Lots of neat SQLite/vtable/pg/replication things
| markhalonen wrote:
| We're currently grappling with building a system similar to
| https://stripe.com/sigma, with a single multi-tenant Postgres
| db as the source and a SQLite read replica per tenant.
| Currently we re-generate the world every night. We need fine-
| grained control over the SQLite schema (it's a public API we
| let users write SQL against). Pointers to any related projects
| would be great!
| kobieps wrote:
| [dead]
| hinkley wrote:
| We need a standardized WAL format. Too many people trying to
| write software to simulate it.
| hamandcheese wrote:
| I'm a bit confused by this. Since SQLedge is a proxy, you lose
| all the IO benefits of running an embedded in-process DB.
|
| At that point, why not replicate to a real Postgres on the edge?
|
| Maybe the expectation is that the application also opens the
| SQLite file directly? (But in that case, what is the point of the
| SQLedge proxy?)
| e12e wrote:
| This looks neat. Any support for postgres schemas? It would be
| cool if this supported mapping each SQLite shard to a separate
| schema - giving each "shard" a single-tenant "view" while
| keeping access to all data in the postgres instance.
|
| Or does this handle only the public schema, or all schemas?
| packetlost wrote:
| Why SQLite instead of standard hub-and-spoke replication? What
| benefit does this provide? Being able to run your database on
| the client? That seems like it would be risky.
| tptacek wrote:
| Because it's running in the app's address space, SQLite is
| obscenely fast for reads, so much so that you can often just
| write N+1 queries with it.
| hamandcheese wrote:
| In this case though it appears that SQLedge is running as its
| own process, distinct from the app, and the app sends all
| queries to the SQLedge proxy.
| tptacek wrote:
| That makes less sense to me then, though the IPC cost for
| those reads is still much lower than a typical n-tier
| Postgres setup.
| luckystarr wrote:
| How would mutually incompatible upstream changes from multiple
| SQLite edge instances be resolved? You'd need user input for
| that, right?
| zknill wrote:
| The writes via SQLedge are synchronous - that is, we wait for
| the write to be processed on the upstream Postgres server. So
| it operates as if SQLedge wasn't in the request path from
| application to Postgres. The writes to Postgres are only
| reflected in SQLite when the data is received back from the
| Postgres server on the replication slot.
|
| This means writes are eventually consistent, currently, but I
| intend to add a feature that waits for a write to be reflected
| back in SQLite, which would satisfy the 'read your own writes'
| property.
|
| SQLedge will never be in a situation where the SQLite database
| thinks it has a write that is yet to be applied to the
| upstream Postgres server.
|
| Basically, the Postgres server 'owns' the writes, and can
| handle them just like it would if SQLedge didn't exist.
| luckystarr wrote:
| So, the advertised performance is just for the read-part, not
| the write-part?
|
| As far as I understand: writing would still exhibit the same
| characteristics as before, while reads would never be
| affected by other users.
| zffr wrote:
| > SQLedge serves reads from its local sqlite database, and
| forwards writes to the upstream postgres server that it's
| replicating from.
|
| I don't think SQLite data is ever written back to the postgres
| DB so this shouldn't be an issue
| simonw wrote:
| I don't think that's a problem here, because the write queries
| aren't handled by the local SQLite databases - instead, the
| proxy forwards them directly to the PostgreSQL instance, so
| presumably the resulting changes show up in SQLite when they
| are replicated back down again.
| sgt wrote:
| A commercial offering (although also open source):
| https://www.powersync.co/
| kobieps wrote:
| [dead]
| kiitos wrote:
| And how do you manage conflicts?
|
| _edit_
|
| > The writes via SQLedge are sync, that is we wait for the write
| to be processed on the upstream Postgres server
|
| OK, so, it's a SQLite read replica of a Postgres primary DB.
|
| Of course, this does mean that it's possible for clients to fail
| the read-your-writes consistency check.
| singingwolfboy wrote:
| Does anyone know of a tool that will export a Postgres database
| to a SQLite database file? Seems like a handy way of exporting
| and passing around smallish DBs. I feel like this tool must
| exist, but I haven't found it yet. (Supporting imports and data
| transformations would be even better!)
| simonw wrote:
| I wrote a tool to do that:
| https://datasette.io/tools/db-to-sqlite
|
| You can see an example of it in use here:
| https://github.com/simonw/simonwillisonblog-backup/blob/main...
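|
| A typical invocation looks something like this (connection
| string and output path are placeholders):
|
|     db-to-sqlite "postgresql://localhost/myblog" blog.db --all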
| samanator wrote:
| How does this work with custom postgres types?
| karakanb wrote:
| This seems neat, I'll definitely give it a look. It seems like
| a very suitable trade-off for a lot of applications I have
| worked on.
|
| Does anyone know if there is a Postgres-to-Postgres version of
| this that is easy to run in ephemeral environments? Ideally
| I'd like to run Postgres sidecars alongside my application
| containers and eliminate the network roundtrip by using the
| sidecar as a read replica, but I haven't seen this done
| anywhere. Maybe it wouldn't be fast enough for such scenarios?
| djbusby wrote:
| With PG alone one can use logical or streaming replication.
| Then all reads go to the replica and writes to the main. The
| app needs two connections - one for reads, one for writes.
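|
| A minimal sketch of the two-connection setup in Go (DSNs are
| placeholders):
|
|     package db
|
|     import (
|         "database/sql"
|
|         _ "github.com/lib/pq" // registers the "postgres" driver
|     )
|
|     // DB routes reads to a nearby replica, writes to the primary.
|     type DB struct {
|         read  *sql.DB // replica: SELECTs only
|         write *sql.DB // primary: INSERT/UPDATE/DELETE/DDL
|     }
|
|     func Open() (*DB, error) {
|         r, err := sql.Open("postgres",
|             "host=replica.local dbname=app")
|         if err != nil {
|             return nil, err
|         }
|         w, err := sql.Open("postgres",
|             "host=primary.internal dbname=app")
|         if err != nil {
|             return nil, err
|         }
|         return &DB{read: r, write: w}, nil
|     }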
| durkie wrote:
| I'm super excited for this -- it seems perfect as an app-local
| cache that can be a drop-in replacement for some high-cost
| queries.
|
| Are there any plans to support choosing which tables get
| copied over? The main postgres database is too big to
| replicate everywhere, but some key "summary" tables would be
| really nice to have locally.
| simonw wrote:
| Following the PostgreSQL logical replication stream to update a
| local SQLite database copy is definitely a neat trick, and feels
| very safe to me (especially since you track the Log Sequence
| Number in a postgres_pos table).
|
| The bit that surprised me was that this thing supports writes as
| well!
|
| It does it by acting as a PostgreSQL proxy. You connect to that
| proxy with a regular PostgreSQL client, then any read queries you
| issue run against the local SQLite copy and any writes are
| forwarded on to "real" PostgreSQL.
|
| The downside is that now your SELECT statements all need to be in
| the subset of SQL that is supported by both SQLite and
| PostgreSQL. This can be pretty limiting, mainly because
| PostgreSQL SQL is a much, much richer dialect than SQLite.
|
| Should work fine for basic SELECT queries though.
|
| I'd find this project useful even without the PostgreSQL
| connection/write support though.
|
| I worked with a very high-scale feature flag system a while ago -
| thousands of flag checks a second. This scaled using a local
| memcached cache of checks on each machine, despite the check
| logic itself consulting a MySQL database.
|
| I had an idea to improve that system by running a local SQLite
| cache of the full flag logic on every frontend machine instead.
| That way flag checks could use full SQL logic, but would still
| run incredibly fast.
|
| The challenge would be keeping that local SQLite database copy
| synced with the centralized source-of-truth database. A system
| like SQLedge could make short work of that problem.
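|
| For a flavour of what a check against that local copy could
| look like (a sketch only - the table layout and names here are
| invented, not from the real system):
|
|     package flags
|
|     import (
|         "database/sql"
|
|         _ "github.com/mattn/go-sqlite3"
|     )
|
|     // Enabled runs the full targeting logic in SQL, in-process,
|     // against the local SQLite replica of the flag tables.
|     func Enabled(db *sql.DB, flag, userID string) (bool, error) {
|         var on bool
|         err := db.QueryRow(`
|             SELECT COUNT(*) > 0
|             FROM flags f
|             LEFT JOIN cohort_members c
|               ON c.cohort_id = f.cohort_id AND c.user_id = ?
|             WHERE f.name = ?
|               AND (f.enabled_globally OR c.user_id IS NOT NULL)`,
|             userID, flag).Scan(&on)
|         return on, err
|     }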
| ericraio wrote:
| One use case I can see this being valuable for is for a client
| based application and Postgres being a centralized database.
| The client would just query SQLite and not need to write
| Postgres SQL.
| Omnipresent wrote:
| Honest question: why is SQLite needed for local? Why would you
| not have PG at the edge that replicates from the central PG?
| That way the SQL dialect problem you mentioned wouldn't exist.
| DaiPlusPlus wrote:
| > I worked with a very high-scale feature flag system a while
| ago - thousands of flag checks a second.
|
| May I ask why the flags are checked that frequently? Couldn't
| they be cached for at least a minute?
|
| > It does it by acting as a PostgreSQL proxy. [...] and any
| writes are forwarded on to "real" PostgreSQL.
|
| What happens if there's a multi-statement transaction with a
| bunch of writes sent off to the mothership - which then get
| returned to the client via logical replication, but _then_
| there's a ROLLBACK? How would that situation be handled such
| that both the SQLite edge DBs and the mothership DB are able
| to roll back okay - would this impact other clients?
| rockostrich wrote:
| Feature flag systems are usually based on a set of rules that
| could be serialized and evaluated locally (this is how pretty
| much every open source feature flag system and feature flag
| SaaS works). Usually it's based on some kind of UUID being
| hashed with a per-flag seed and bucketed after some set of
| targeting rules are applied to other properties passed in for
| that user. There are added features where you can store
| large cohorts to do specific targeting and usually there's
| some kind of local cache added to make that look-up faster
| for recent users.
|
| I'm not sure what the original commenter was doing but it
| sounds like they had some kind of targeting that was almost
| entirely based on cohorts or maybe they needed to have
| stability over time which would require a database. We did
| something similar recently except we just store a "session
| ID" with a blob for look-up and the evaluation only happens
| on the first request for a given session ID.
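|
| The hash-and-bucket part is only a few lines. A sketch in Go
| (the names and flag structure are illustrative, not any
| particular vendor's scheme):
|
|     package flags
|
|     import (
|         "fmt"
|         "hash/fnv"
|     )
|
|     type Flag struct {
|         // Seed is per-flag, so a user lands in a different
|         // bucket for each flag.
|         Seed    string
|         Percent uint32 // rollout percentage, 0-100
|     }
|
|     // Enabled deterministically buckets a user into 0..99, so
|     // the same user always gets the same answer for a flag.
|     func Enabled(f Flag, userID string) bool {
|         h := fnv.New32a()
|         fmt.Fprintf(h, "%s:%s", f.Seed, userID)
|         return h.Sum32()%100 < f.Percent
|     }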
| paulddraper wrote:
| > Couldn't they be cached for at least a minute?
|
| Only per feature+per user. (Though 1000s per second does seem
| high unless your scale is gigantic.)
|
| > What happens if there's a multi-statement transaction with
| a bunch of writes sent-off to the mothership - which then get
| returned to the client via logical replication, but then
| there's a ROLLBACK
|
| Nothing makes it into the replication stream until it is
| committed.
| simonw wrote:
| They were being cached for at least a minute (maybe even more
| than that, I can't remember the details) - that's what the
| local memcached instance was for.
|
| This was problematic though because changing a feature flag
| and then waiting for a minute plus to see if the change
| actually worked can be frustrating, especially if it relates
| to an outage of some sort.
| vasco wrote:
| > May I ask why the flags are checked that frequently?
| Couldn't they be cached for at least a minute?
|
| Not in that project but feature flags don't have to be all or
| nothing. You can apply flags to specific cohorts of your
| users for example, so if you have a large user base, even if
| you cache them per-user, it still translates into many checks
| a second for large systems.
| mikepurvis wrote:
| I guess the cost of doing it precisely isn't terribly high,
| but if the majority of the flags were "off" (eg, subsets of
| users being opted into a beta or something), I wonder if you
| could cut a bunch of the queries by sending down a bloom
| filter.
|
| So basically you check the filter first, and if that says
| the feature is enabled, only then do you actually ask the
| real DB.
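|
| Something like this, roughly (the filter layout and the lookup
| hook are made up for illustration):
|
|     package flags
|
|     import "hash/fnv"
|
|     // Bloom is a small filter of enabled (user, flag) pairs
|     // shipped to every host.
|     type Bloom struct{ bits []byte }
|
|     func (b *Bloom) has(key string) bool {
|         h := fnv.New64a()
|         h.Write([]byte(key))
|         sum := h.Sum64()
|         n := uint64(len(b.bits)) * 8
|         i, j := sum%n, (sum>>32)%n // two probes from one hash
|         return b.bits[i/8]&(1<<(i%8)) != 0 &&
|             b.bits[j/8]&(1<<(j%8)) != 0
|     }
|
|     // Enabled skips the database when the filter says the pair
|     // is definitely absent; false positives just cost a lookup.
|     func Enabled(b *Bloom, lookup func(string) bool,
|         user, flag string) bool {
|         key := user + ":" + flag
|         if !b.has(key) {
|             return false // definitely off: no query needed
|         }
|         return lookup(key) // maybe on: confirm in the real DB
|     }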
| runeks wrote:
| My limited understanding of logical replication is that
| writes only happen at COMMIT. Ie. nothing is replicated until
| it's committed.
| zknill wrote:
| The logical replication protocol sends a series of messages
| that essentially follow the flow that a database transaction
| would.
|
| i.e. a stream of messages like: "BEGIN", "[the data]",
| ["COMMIT" or "ROLLBACK"].
|
| So any application that listens to the Postgres replication
| protocol can handle the transaction in the same way that
| Postgres does. Concretely you might choose to open a SQLite
| transaction on BEGIN, apply the statements, and then COMMIT
| or ROLLBACK based on the next messages received on the
| replication stream.
|
| The data sent on the replication protocol includes the state
| of the row after the write query has completed. This means
| you don't need to worry about getting out of sync on queries
| like "UPDATE field = field + 1" because you have access to
| the exact resulting value as stored by Postgres.
|
| TL;DR - you can follow the same begin/change/commit flow that
| the original transaction did on the upstream Postgres server,
| and you have access to the exact underlying data after the
| write was committed.
|
| It's also true (as other commenters have pointed out) that
| for not-huge transactions (i.e. not streaming transactions, a
| feature added in Postgres 14) the BEGIN message will only be
| sent if the transaction was committed. It's pretty unlikely
| that you will ever process a ROLLBACK message from the
| protocol (although possible).
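|
| A sketch of that apply loop (the message type here is a
| simplified stand-in for the real logical replication messages,
| with each row change pre-rendered as a SQLite statement, and a
| well-formed BEGIN-first stream assumed):
|
|     package replicate
|
|     import (
|         "database/sql"
|
|         _ "github.com/mattn/go-sqlite3"
|     )
|
|     type Msg struct {
|         Kind string // "BEGIN", "ROW", "COMMIT" or "ROLLBACK"
|         SQL  string // for ROW: the row's final values
|     }
|
|     // apply mirrors upstream transaction boundaries onto SQLite.
|     func apply(db *sql.DB, stream <-chan Msg) error {
|         var tx *sql.Tx
|         var err error
|         for m := range stream {
|             switch m.Kind {
|             case "BEGIN":
|                 if tx, err = db.Begin(); err != nil {
|                     return err
|                 }
|             case "ROW":
|                 // Postgres sends the row as stored, so
|                 // "UPDATE field = field + 1" arrives as the
|                 // resulting value, not the expression.
|                 if _, err = tx.Exec(m.SQL); err != nil {
|                     return err
|                 }
|             case "COMMIT":
|                 if err = tx.Commit(); err != nil {
|                     return err
|                 }
|             case "ROLLBACK":
|                 if err = tx.Rollback(); err != nil {
|                     return err
|                 }
|             }
|         }
|         return nil
|     }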
| [deleted]
| paulddraper wrote:
| Logical replication never includes uncommitted data, unless
| you have written a custom output plugin.
|
| EDIT:
| https://stackoverflow.com/questions/52202534/postgresql-does...
| zknill wrote:
| It's true for protocol version 1 that only the committed
| data is sent on the replication connection.
|
| In protocol version 2 (introduced in postgres 14) large
| in-progress transactions can appear in the new "Stream"
| messages sent on the replication connection: StreamStart,
| StreamStop, StreamAbort, StreamCommit, etc. In the case
| of large in-progress transactions uncommitted data might
| end up in the replication connection after a StreamStart
| message. But you would also receive a StreamCommit or
| StreamAbort message to tell you what happened to that
| transaction.
|
| I've not worked out what qualifies as a "large"
| transaction though. But it is _possible_ to get
| uncommitted data in the replication connection, although
| unlikely.
|
| https://www.postgresql.org/docs/15/protocol-logicalrep-messa...
| jmull wrote:
| > May I ask why the flags are checked that frequently?
| Couldn't they be cached for at least a minute?
|
| Not the previous poster, but it appears in the scenario, the
| SQLite database _is_ the cache.
| aeyes wrote:
| How many flags are we talking here? I implemented a similar
| system and we just replace the whole sqlite DB file by
| downloading it from the centralized storage whenever it
| changes.
|
| Even with 1M flags it's still only a few 100 kB compressed.
|
| I wouldn't replicate per-user flags to the edge, to keep size
| under control.
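|
| The swap itself can be atomic. A sketch (URL and path are
| placeholders; readers re-open the file after a swap):
|
|     package flags
|
|     import (
|         "io"
|         "net/http"
|         "os"
|         "path/filepath"
|     )
|
|     // swapDB downloads the new database next to the old one,
|     // then renames it into place so readers never observe a
|     // partially written file.
|     func swapDB(url, path string) error {
|         resp, err := http.Get(url)
|         if err != nil {
|             return err
|         }
|         defer resp.Body.Close()
|
|         // Same directory, so the rename stays on one
|         // filesystem (atomic on POSIX).
|         tmp, err := os.CreateTemp(filepath.Dir(path), ".flags-*")
|         if err != nil {
|             return err
|         }
|         defer os.Remove(tmp.Name()) // no-op after the rename
|
|         if _, err := io.Copy(tmp, resp.Body); err != nil {
|             return err
|         }
|         if err := tmp.Sync(); err != nil { // flush to disk first
|             return err
|         }
|         if err := tmp.Close(); err != nil {
|             return err
|         }
|         return os.Rename(tmp.Name(), path)
|     }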
| hinkley wrote:
| Does it though? If it's a proxy it can support the SQLite read
| and the Postgres write syntax. If reads only ever go to SQLite
| they don't need to work on Postgres.
| rbranson wrote:
| We replicated our MySQL database to a SQLite edge at Segment in
| ctlstore: https://github.com/segmentio/ctlstore
|
| We considered tailing binlogs directly but there's so much cruft
| and complexity involved trying to translate between types and
| such at that end, once you even just get past properly parsing
| the binlogs and maintaining the replication connection. Then you
| have to deal with schema management across both systems too.
| There's a similar set of problems using PostgreSQL as a source
| of truth.
|
| In the end we decided just to wrap the whole thing up and
| abstract away the schema with a common set of types and a limited
| set of read APIs. Biggest missing piece I regret not getting in
| was support for secondary indexes.
| eddd-ddde wrote:
| Is there a reason you didn't add them when making the shared
| API?
| rbranson wrote:
| If you're asking about secondary indexes, it was just seen as
| a "later" feature we'd implement as a follow-up. It was
| definitely asked for, just never prioritized before I moved
| off the project.
___________________________________________________________________
(page generated 2023-08-09 23:00 UTC)