[HN Gopher] How SQLite scales read concurrency
       ___________________________________________________________________
        
       How SQLite scales read concurrency
        
       Author : pkilgore
       Score  : 151 points
       Date   : 2022-08-24 14:24 UTC (8 hours ago)
        
 (HTM) web link (fly.io)
 (TXT) w3m dump (fly.io)
        
       | ithrow wrote:
       | SQLite needs some 'put your money where your mouth is'
       | benchmarks.
       | 
       | EDIT:Lots of devs have (unfounded) doubts about performance
       | regarding SQLite for web apps, a single machine benchmark against
       | postgres (with postgres and the app in the same server) would
       | clear many doubts and create awareness that SQLite is going to be
       | more than enough for many apps.. The app doesn't have to be a web
       | app (we are not testing web servers) but maybe some code with
       | some semi-complex domain models.
        
         | simlevesque wrote:
         | Needs for what ? It is the most deployed database ever.
         | 
         | https://www.sqlite.org/mostdeployed.html
        
         | chasil wrote:
         | The big benchmark that I know is TPC-C.
         | 
         | This benchmark requires concurrent writes in a massive OLTP
         | client-server model.
         | 
         | SQLite shines in SQL92 compliance (without grant/revoke or
         | other permission-based aspects), but it was simply not designed
         | to compete in any of the TPC tests.
         | 
         | The current holder of the top TPC-C score is OceanBase, which
         | defeated Oracle's previous record with 11g/Solaris/SPARC.
         | 
         | https://www.tpc.org/tpcc/results/tpcc_perf_results5.asp?resu...
        
         | kevingadd wrote:
         | As someone who had to replace SQLite with a custom key value
         | store in an app for throughput (needed to be able to write
         | structured data at 4GB/s from multiple threads) I assure you
         | that the claims about it being fast are not false - it wasn't
         | fast enough for me but it's _very_ fast
        
           | ithrow wrote:
           | Yeah but in every thread about SQLite many mention the fear
           | of having to migrate to postgres as they reach some quantity
           | of concurrent users (web apps), a benchmark would calm these
           | fears for some. (edited my original comment). Most won't go
           | through the trouble to see if SQLite is good enough, they'll
           | just go with the safe choice.
        
           | randito wrote:
           | 4GB/s is a pretty high requirement. What did you end up
           | using?
           | 
           | Redis looks like it would struggle with this requirement also
           | without some serious hardware, tweaks, or configuration.
           | https://redis.io/docs/reference/optimization/benchmarks/
        
             | kevingadd wrote:
             | The app was a C# frontend
             | (https://github.com/kg/HeapProfiler) that drove windows OS
             | services to take heap snapshots and capture stack traces,
             | so I ended up writing a custom key/value store in C# to
             | avoid having to do cross-language interop, marshaling, etc
             | (the cost of sending blobs to SQLite and running queries
             | was adding up.). It's hard to beat the best-in-class
             | optimized databases on their own turf but if you can just
             | grab a spot to dump your data into, you end up being a lot
             | faster.
             | 
             | By the end it ran fast enough that it was able to saturate
             | the kernel's paging infrastructure and make my mouse cursor
             | stutter, and I was able to take 1-2 snapshots per second of
             | a full running Firefox process with real webpages in it, so
             | it was satisfactory. SQLite couldn't process the amount of
             | data I was pumping in at that rate (but it still performed
             | pretty well - maybe a few snapshots per minute)
             | 
             | At the time I did investigate other data stores and the
             | only good candidates I ran across used incompatible open
             | source licenses, so I was stuck doing it myself. Fun excuse
             | to learn how to write and optimize btrees for throughput
             | :-)
        
               | liuliu wrote:
               | Yeah, most databases probably will have pathological
               | behaviors against your requirements (especially on tail-
               | latency, which you would care about). Many implement
               | similar tools would put a lightweight compression on top
               | and just dump these snapshots to disk and then run a
               | post-processing for queries. Dumping snapshots is also
               | preferred because you can insert checksums and
               | checkpoints for partial data recovery if there are
               | failures.
        
         | marginalia_nu wrote:
         | Dunno if it's still the case, but for a long time, it was a
         | license violation to publish benchmarks of Oracle and (IIRC)
         | Microsoft databases.
        
           | 0x457 wrote:
           | Well, luckily we have another two major RDBMS that allow it.
        
           | queuebert wrote:
           | That's not suspicious at all. /s
        
       | simonw wrote:
       | "The database header has a read & write version at bytes 18 & 19,
       | respectively, that are used to determine the journal mode.
       | They're set to 0x01 for rollback journal and 0x02 for write-ahead
       | log. They're typically both set to the same value."
       | 
       | Are there any useful situations where they wouldn't be set to the
       | same value?
        
       | Sytten wrote:
       | Also worth noting that Sqlite released a WAL2 journal mode
       | recently that eliminates the "stop of the world" we had with WAL
       | when checkpointing. Basically it maintains two wal files and
       | switched between them when one needs to be checkpointed. It is
       | quite neat!
        
         | simonw wrote:
         | WAL2 mode is still only available in a branch - I've been
         | hoping they'll merge it to main at some point, but I've not
         | been following things closely enough to know if there are
         | reasons that they wouldn't eventually do that.
         | 
         | https://www.sqlite.org/cgi/src/doc/wal2/doc/wal2.md
        
           | alberth wrote:
           | Not only WAL2, but there is yet another branch with BEGIN
           | CONCURRENT which also helps with scaling concurrency.
           | 
           | https://www.sqlite.org/cgi/src/doc/begin-
           | concurrent/doc/begi...
           | 
           | I really hope these two branches get merged into the mainline
           | sometime soon. I'm not sure what's the blocker, since both
           | branches have existed for literally years and are refreshed
           | frequently.
           | 
           | WAL2 + BEGIN CONCURRENT would solve probably 99.9% of
           | developer scaling needs.
        
             | chasil wrote:
             | SQLite runs critical systems on the A350 in flight.
             | 
             | Stability is vastly more important than new functionality
             | in that context.
        
         | ignoramous wrote:
         | > _Basically it maintains two wal files and switched between
         | them when one needs to be checkpointed. It is quite neat!_
         | 
         | See also: eglSwapBuffers,
         | https://katatunix.wordpress.com/2014/09/17/lets-talk-about-e...
        
       | christophilus wrote:
       | SQLite is such an excellent little tool.
       | 
       | I recently ran an experiment with Node and was able to take
       | writes from around 750 / second to around 24K / second just by
       | coordinating them using Node IPC. That is, I had the main thread
       | own the sole write connection and all other threads sent their
       | writes operations to it, thread-local connections were read-only.
       | 
       | It's pretty cool how far you can push SQLite and it just keeps
       | humming right along.
        
         | zebnyc wrote:
         | Can you clarify "main thread"? Do you have a code example?
         | Asking as Node is single threaded.
         | 
         | Thanks
        
           | numlock86 wrote:
           | Most likely using the cluster module and your favorite IPC
           | module.
           | 
           | https://nodejs.org/api/cluster.html
        
             | ksbrooksjr wrote:
             | My go-to method of inter-process communication in Node
             | without using any third party modules is to just use
             | process.send[1] to a send a message to the main thread, and
             | then have it forward messages to the workers in the
             | cluster, which you can listen for using the message event
             | [2].
             | 
             | [1] https://nodejs.org/api/process.html#processsendmessage-
             | sendh...
             | 
             | [2] https://nodejs.org/api/process.html#event-message
        
             | yrgulation wrote:
             | That or https://nodejs.org/api/worker_threads.html#worker-
             | threads
        
           | christophilus wrote:
           | Using cluster and worker threads. The code is here, but
           | completely undocumented, as I've been side tracked by a few
           | other things[0]. I'm currently looking into porting that
           | project to Bun, so it's possibly dead in its current form.
           | 
           | I was planning on turning that into a library, but Bun nerd
           | sniped me.
           | 
           | You need the "hey" tool to run the benchmarks[1] the way that
           | I was running them.
           | 
           | [0] https://github.com/chrisdavies/sql3
           | 
           | [1] https://github.com/rakyll/hey
        
       | etaioinshrdlu wrote:
       | It's become popular to talk about how scalable SQLite is lately,
       | but let's not forget the elephant in the room, it only allows a
       | single writer at a time.
       | 
       | It's obviously easier to manage and maintain due to it being an
       | embedded database, but that seems to be a very separate issue
       | from the data structure involved, which definitely has
       | disadvantages compared to a typical SQL system.
        
         | sicp-enjoyer wrote:
         | While most databases have more granular concurrency control,
         | almost all application parallelism is handled by locking on
         | critical resources, and having others wait for those resources
         | become available. That's exactly what SQLite does, and it's not
         | a big deal. If you have multiple processes, one might
         | occasionally have to wait a few ms for the other to finish.
        
         | bob1029 wrote:
         | > it only allows a single writer at a time.
         | 
         | This is true. However, in my testing with a relatively old NVMe
         | device, I can push well over a gigabyte/second to the database
         | with that single connection. One trick I employ is to only ever
         | maintain 1 connection and to use an inter-thread messaging
         | framework like Disruptor to serialize everything beforehand to
         | a single batched writer. Another, much simpler option, would be
         | to just lean on the fact that SQLite serializes all writes by
         | default and to take a little bit of a performance hit on the
         | lock contention (assuming a small # of writers).
         | 
         | I know for a fact that I can get more write throughput out of
         | SQLite in an in-process setting than I could if I had to go
         | across the network to a hosted SQL instance. In many cases this
         | difference can be measured as more than 1 order of magnitude,
         | both in latency and throughput figures.
         | 
         | Going vertical on hardware in 2022 is an extremely viable
         | strategy.
        
         | chasil wrote:
         | It appears that a development branch for a new concurrent write
         | mode has been launched.
         | 
         | These appear to be page-level locks, and row-level locks should
         | not be expected for this effort.
         | 
         | https://www.sqlite.org/cgi/src/doc/begin-concurrent/doc/begi...
        
       | posharma wrote:
       | What's with so many SQLite posts on HN these days? Never saw so
       | much love for sqlite before.
        
         | probotect0r wrote:
         | Most of them are by the same author, who created the
         | litestream.io project and has since been hired by fly.io to
         | continue working on the project.
        
           | benbjohnson wrote:
           | Author here. I have had a few posts on HN but there's a lot
           | of other authors doing great work as well. In the two weeks
           | there's been several SQLite posts from other folks on the
           | front page:
           | 
           | - SQLite has pretty limited builtin functions:
           | https://news.ycombinator.com/item?id=32557672
           | 
           | - Turning SQLite into a Distributed Database:
           | https://news.ycombinator.com/item?id=32539360
           | 
           | - SQLite is not a toy database:
           | https://news.ycombinator.com/item?id=32478907
           | 
           | - SQLite: Wal2 Mode Notes:
           | https://news.ycombinator.com/item?id=32435601
           | 
           | - SQLite-HTTP: A SQLite extension for making HTTP requests:
           | https://news.ycombinator.com/item?id=32417410
        
         | hnaccy wrote:
         | seems like it's become trendy to talk about how "underrated"
         | sqlite
         | 
         | in same vein as all those articles comparing time to process
         | data on single laptop vs hadoop cluster or whatever
        
           | benbjohnson wrote:
           | Author here. I hope the article didn't come off as me
           | claiming SQLite is underrated. I find that it's a great piece
           | of software for understanding database internals. There's not
           | a lot of moving parts or abstractions in between the API and
           | the raw bytes on disk -- at least not compared to other
           | databases.
           | 
           | Technologies have waves on HN. A couple years ago there were
           | a ton of Go articles. Lately there's always a Rust article on
           | the front page. Right now, I feel like SQLite is a refreshing
           | escape from many of the complex deployments that have been in
           | vogue lately.
        
           | bob1029 wrote:
           | > it's become trendy
           | 
           | We were using SQLite in production way before it was cool on
           | HN. I remember back in 2017-2018 describing how we use SQLite
           | as the principal database engine for our multi-user product,
           | and was basically tarred and feathered by the hosted SQL
           | crowd.
           | 
           | I think the most intoxicating thing about SQLite is that you
           | don't have to install or configure even one goddamn thing.
           | It's a lot more work and unknowns to go down this path, but
           | you can wind up with a far more robust product as a result.
           | 
           | I hope this trend continues aggressively.
        
       | NeutralForest wrote:
       | Fly.io constantly has good articles about databases, really cool
       | to see it.
        
       | jef_leppard wrote:
       | Someone recently observed that SQLite would be used a lot more in
       | production if it didn't have the word "lite" in its name. I've
       | personally been amazed by what it can do. Really great piece of
       | tech with an unfortunate name.
        
         | efdb wrote:
         | Not so unfortunate according to Socrates: 'the hardest task
         | needs the lightest hand..' ;-)
        
         | aaaaaaaaata wrote:
         | Why dilute your community with people who don't do the reading?
        
         | srcreigh wrote:
         | The name SQLite isn't Sequel-lite. It's S.Q.L.-ite akin to
         | graphite or Levite. It means a thing of SQLy stuff.
         | 
         | The creator says it like Escue Ell-ite emphasis on the Es- and
         | -ite. [0]
         | 
         | https://youtu.be/Jib2AmRb_rk?t=99
        
           | giraffe_lady wrote:
           | Yes but you don't get the namer's intent beamed into your
           | brain when you see or hear a word for the first time. To me
           | and a lot of other people it's pretty clearly sql-lite
           | regardless of what the namer wants it to be. Maybe this makes
           | me stupid or whatever but when naming things you also have to
           | account for my stupidity because I share it with a lot of
           | other people.
        
             | nordsieck wrote:
             | This is exactly the reason the Nimrod language was renamed
             | to Nim.
        
           | nordsieck wrote:
           | That may be so. But just because someone is the creator
           | doesn't mean they have control over how people use the name.
           | 
           | Case in point:
           | 
           | Blue Origin wants people to refer to it by the nickname
           | "Blue". But you kind of have to be in the industry to know
           | that. Everyone else just uses the initialism instead, which
           | is most commonly used to abbreviate "body odor".
        
           | 83457 wrote:
           | TIL, but strange choice.
           | 
           | On a related topic, anyone have an idea of the breakdown of
           | people that say Sequel vs S.Q.L. and are there people that
           | look down on others for their usage? Of course ignoring the
           | opinion of anyone that says Jif.
        
         | benbjohnson wrote:
         | I would be funny if they suddenly rebranded to "MegaSQL" :)
        
           | jef_leppard wrote:
           | SQLAwesome? :)
        
             | 83457 wrote:
             | SQLit!
        
               | jraph wrote:
               | sql.io
        
               | jjoonathan wrote:
               | Nah, this time zoomers win. SQLit > sql.io
        
               | seestem wrote:
               | SQLitio
        
             | [deleted]
        
           | sbierwagen wrote:
           | SQLheavy
        
         | tmpz22 wrote:
         | To blame unconscious bias over a suffix as a non-trivial force
         | on adoption is silly when there are plenty of valid technical
         | constraints for adoption of SQLite in many projects.
         | 
         | IMO database names are incredibly bland, outside of
         | Cockroachdb, and not a driving factor to adoption given the
         | critical nature of the decision.
        
       | simonw wrote:
       | Something I found non-obvious about WAL mode in SQLite is that
       | it's actually a property of the database file itself.
       | 
       | When you run "PRAGMA journal_mode=wal;" against a database file
       | the mode is permanently changed for that file - and the .db-wal
       | and .db-shm files for that database will appear in the same
       | directory as it.
       | 
       | Any future connections to that database will use it in WAL mode -
       | until you switch the mode on it back, at which point it will go
       | back to journal mode.
       | 
       | It makes sense when you think about it - of course a database can
       | only be in one or the other modes, not both, so the setting must
       | be at the database file level. But it took me a while to
       | understand.
       | 
       | I wrote some notes on this here:
       | https://til.simonwillison.net/sqlite/enabling-wal-mode
        
         | lancehasson wrote:
         | I was reading up on WAL mode this week and came across a
         | mailing list where you asked for clarification on this same
         | topic. That post helped me understand, thanks for asking the
         | question! The confusing thing here for me is that the majority
         | of other PRAGMAs (that I know of) are per-request, so this one
         | differs from that pattern
        
         | benbjohnson wrote:
         | The other odd thing is that the journal_mode is only persistent
         | for WAL, I believe. The DELETE, TRUNCATE, & PERSIST modes are
         | per-connection. It makes sense though since those 3 modes deal
         | with the rollback journal and are compatible with each other
         | while the WAL is totally separate.
         | https://www.sqlite.org/pragma.html#pragma_journal_mode
        
         | chasil wrote:
         | There is a _significant_ warning about the use of WAL mode,
         | very plain in the documentation:
         | 
         | "Transactions involving multiple attached databases are atomic,
         | assuming that the main database is not ":memory:" and the
         | journal_mode is not WAL. If the main database is ":memory:" or
         | if the journal_mode is WAL, then transactions continue to be
         | atomic within each individual database file. But if the host
         | computer crashes in the middle of a COMMIT where two or more
         | database files are updated, some of those files might get the
         | changes where others might not."
         | 
         | https://sqlite.org/lang_attach.html
         | 
         | Systems designed for DML against multiple database files should
         | eschew WAL mode.
        
       ___________________________________________________________________
       (page generated 2022-08-24 23:01 UTC)