[HN Gopher] How SQLite scales read concurrency
___________________________________________________________________
How SQLite scales read concurrency
Author : pkilgore
Score : 151 points
Date : 2022-08-24 14:24 UTC (8 hours ago)
(HTM) web link (fly.io)
(TXT) w3m dump (fly.io)
| ithrow wrote:
| SQLite needs some 'put your money where your mouth is'
| benchmarks.
|
| EDIT:Lots of devs have (unfounded) doubts about performance
| regarding SQLite for web apps, a single machine benchmark against
| postgres (with postgres and the app in the same server) would
| clear many doubts and create awareness that SQLite is going to be
| more than enough for many apps.. The app doesn't have to be a web
| app (we are not testing web servers) but maybe some code with
| some semi-complex domain models.
| simlevesque wrote:
| Needs for what ? It is the most deployed database ever.
|
| https://www.sqlite.org/mostdeployed.html
| chasil wrote:
| The big benchmark that I know is TPC-C.
|
| This benchmark requires concurrent writes in a massive OLTP
| client-server model.
|
| SQLite shines in SQL92 compliance (without grant/revoke or
| other permission-based aspects), but it was simply not designed
| to compete in any of the TPC tests.
|
| The current holder of the top TPC-C score is OceanBase, which
| defeated Oracle's previous record with 11g/Solaris/SPARC.
|
| https://www.tpc.org/tpcc/results/tpcc_perf_results5.asp?resu...
| kevingadd wrote:
| As someone who had to replace SQLite with a custom key value
| store in an app for throughput (needed to be able to write
| structured data at 4GB/s from multiple threads) I assure you
| that the claims about it being fast are not false - it wasn't
| fast enough for me but it's _very_ fast
| ithrow wrote:
| Yeah but in every thread about SQLite many mention the fear
| of having to migrate to postgres as they reach some quantity
| of concurrent users (web apps), a benchmark would calm these
| fears for some. (edited my original comment). Most won't go
| through the trouble to see if SQLite is good enough, they'll
| just go with the safe choice.
| randito wrote:
| 4GB/s is a pretty high requirement. What did you end up
| using?
|
| Redis looks like it would struggle with this requirement also
| without some serious hardware, tweaks, or configuration.
| https://redis.io/docs/reference/optimization/benchmarks/
| kevingadd wrote:
| The app was a C# frontend
| (https://github.com/kg/HeapProfiler) that drove windows OS
| services to take heap snapshots and capture stack traces,
| so I ended up writing a custom key/value store in C# to
| avoid having to do cross-language interop, marshaling, etc
| (the cost of sending blobs to SQLite and running queries
| was adding up.). It's hard to beat the best-in-class
| optimized databases on their own turf but if you can just
| grab a spot to dump your data into, you end up being a lot
| faster.
|
| By the end it ran fast enough that it was able to saturate
| the kernel's paging infrastructure and make my mouse cursor
| stutter, and I was able to take 1-2 snapshots per second of
| a full running Firefox process with real webpages in it, so
| it was satisfactory. SQLite couldn't process the amount of
| data I was pumping in at that rate (but it still performed
| pretty well - maybe a few snapshots per minute)
|
| At the time I did investigate other data stores and the
| only good candidates I ran across used incompatible open
| source licenses, so I was stuck doing it myself. Fun excuse
| to learn how to write and optimize btrees for throughput
| :-)
| liuliu wrote:
| Yeah, most databases probably will have pathological
| behaviors against your requirements (especially on tail-
| latency, which you would care about). Many implement
| similar tools would put a lightweight compression on top
| and just dump these snapshots to disk and then run a
| post-processing for queries. Dumping snapshots is also
| preferred because you can insert checksums and
| checkpoints for partial data recovery if there are
| failures.
| marginalia_nu wrote:
| Dunno if it's still the case, but for a long time, it was a
| license violation to publish benchmarks of Oracle and (IIRC)
| Microsoft databases.
| 0x457 wrote:
| Well, luckily we have another two major RDBMS that allow it.
| queuebert wrote:
| That's not suspicious at all. /s
| simonw wrote:
| "The database header has a read & write version at bytes 18 & 19,
| respectively, that are used to determine the journal mode.
| They're set to 0x01 for rollback journal and 0x02 for write-ahead
| log. They're typically both set to the same value."
|
| Are there any useful situations where they wouldn't be set to the
| same value?
| Sytten wrote:
| Also worth noting that Sqlite released a WAL2 journal mode
| recently that eliminates the "stop of the world" we had with WAL
| when checkpointing. Basically it maintains two wal files and
| switched between them when one needs to be checkpointed. It is
| quite neat!
| simonw wrote:
| WAL2 mode is still only available in a branch - I've been
| hoping they'll merge it to main at some point, but I've not
| been following things closely enough to know if there are
| reasons that they wouldn't eventually do that.
|
| https://www.sqlite.org/cgi/src/doc/wal2/doc/wal2.md
| alberth wrote:
| Not only WAL2, but there is yet another branch with BEGIN
| CONCURRENT which also helps with scaling concurrency.
|
| https://www.sqlite.org/cgi/src/doc/begin-
| concurrent/doc/begi...
|
| I really hope these two branches get merged into the mainline
| sometime soon. I'm not sure what's the blocker, since both
| branches have existed for literally years and are refreshed
| frequently.
|
| WAL2 + BEGIN CONCURRENT would solve probably 99.9% of
| developer scaling needs.
| chasil wrote:
| SQLite runs critical systems on the A350 in flight.
|
| Stability is vastly more important than new functionality
| in that context.
| ignoramous wrote:
| > _Basically it maintains two wal files and switched between
| them when one needs to be checkpointed. It is quite neat!_
|
| See also: eglSwapBuffers,
| https://katatunix.wordpress.com/2014/09/17/lets-talk-about-e...
| christophilus wrote:
| SQLite is such an excellent little tool.
|
| I recently ran an experiment with Node and was able to take
| writes from around 750 / second to around 24K / second just by
| coordinating them using Node IPC. That is, I had the main thread
| own the sole write connection and all other threads sent their
| writes operations to it, thread-local connections were read-only.
|
| It's pretty cool how far you can push SQLite and it just keeps
| humming right along.
| zebnyc wrote:
| Can you clarify "main thread"? Do you have a code example?
| Asking as Node is single threaded.
|
| Thanks
| numlock86 wrote:
| Most likely using the cluster module and your favorite IPC
| module.
|
| https://nodejs.org/api/cluster.html
| ksbrooksjr wrote:
| My go-to method of inter-process communication in Node
| without using any third party modules is to just use
| process.send[1] to a send a message to the main thread, and
| then have it forward messages to the workers in the
| cluster, which you can listen for using the message event
| [2].
|
| [1] https://nodejs.org/api/process.html#processsendmessage-
| sendh...
|
| [2] https://nodejs.org/api/process.html#event-message
| yrgulation wrote:
| That or https://nodejs.org/api/worker_threads.html#worker-
| threads
| christophilus wrote:
| Using cluster and worker threads. The code is here, but
| completely undocumented, as I've been side tracked by a few
| other things[0]. I'm currently looking into porting that
| project to Bun, so it's possibly dead in its current form.
|
| I was planning on turning that into a library, but Bun nerd
| sniped me.
|
| You need the "hey" tool to run the benchmarks[1] the way that
| I was running them.
|
| [0] https://github.com/chrisdavies/sql3
|
| [1] https://github.com/rakyll/hey
| etaioinshrdlu wrote:
| It's become popular to talk about how scalable SQLite is lately,
| but let's not forget the elephant in the room, it only allows a
| single writer at a time.
|
| It's obviously easier to manage and maintain due to it being an
| embedded database, but that seems to be a very separate issue
| from the data structure involved, which definitely has
| disadvantages compared to a typical SQL system.
| sicp-enjoyer wrote:
| While most databases have more granular concurrency control,
| almost all application parallelism is handled by locking on
| critical resources, and having others wait for those resources
| become available. That's exactly what SQLite does, and it's not
| a big deal. If you have multiple processes, one might
| occasionally have to wait a few ms for the other to finish.
| bob1029 wrote:
| > it only allows a single writer at a time.
|
| This is true. However, in my testing with a relatively old NVMe
| device, I can push well over a gigabyte/second to the database
| with that single connection. One trick I employ is to only ever
| maintain 1 connection and to use an inter-thread messaging
| framework like Disruptor to serialize everything beforehand to
| a single batched writer. Another, much simpler option, would be
| to just lean on the fact that SQLite serializes all writes by
| default and to take a little bit of a performance hit on the
| lock contention (assuming a small # of writers).
|
| I know for a fact that I can get more write throughput out of
| SQLite in an in-process setting than I could if I had to go
| across the network to a hosted SQL instance. In many cases this
| difference can be measured as more than 1 order of magnitude,
| both in latency and throughput figures.
|
| Going vertical on hardware in 2022 is an extremely viable
| strategy.
| chasil wrote:
| It appears that a development branch for a new concurrent write
| mode has been launched.
|
| These appear to be page-level locks, and row-level locks should
| not be expected for this effort.
|
| https://www.sqlite.org/cgi/src/doc/begin-concurrent/doc/begi...
| posharma wrote:
| What's with so many SQLite posts on HN these days? Never saw so
| much love for sqlite before.
| probotect0r wrote:
| Most of them are by the same author, who created the
| litestream.io project and has since been hired by fly.io to
| continue working on the project.
| benbjohnson wrote:
| Author here. I have had a few posts on HN but there's a lot
| of other authors doing great work as well. In the two weeks
| there's been several SQLite posts from other folks on the
| front page:
|
| - SQLite has pretty limited builtin functions:
| https://news.ycombinator.com/item?id=32557672
|
| - Turning SQLite into a Distributed Database:
| https://news.ycombinator.com/item?id=32539360
|
| - SQLite is not a toy database:
| https://news.ycombinator.com/item?id=32478907
|
| - SQLite: Wal2 Mode Notes:
| https://news.ycombinator.com/item?id=32435601
|
| - SQLite-HTTP: A SQLite extension for making HTTP requests:
| https://news.ycombinator.com/item?id=32417410
| hnaccy wrote:
| seems like it's become trendy to talk about how "underrated"
| sqlite
|
| in same vein as all those articles comparing time to process
| data on single laptop vs hadoop cluster or whatever
| benbjohnson wrote:
| Author here. I hope the article didn't come off as me
| claiming SQLite is underrated. I find that it's a great piece
| of software for understanding database internals. There's not
| a lot of moving parts or abstractions in between the API and
| the raw bytes on disk -- at least not compared to other
| databases.
|
| Technologies have waves on HN. A couple years ago there were
| a ton of Go articles. Lately there's always a Rust article on
| the front page. Right now, I feel like SQLite is a refreshing
| escape from many of the complex deployments that have been in
| vogue lately.
| bob1029 wrote:
| > it's become trendy
|
| We were using SQLite in production way before it was cool on
| HN. I remember back in 2017-2018 describing how we use SQLite
| as the principal database engine for our multi-user product,
| and was basically tarred and feathered by the hosted SQL
| crowd.
|
| I think the most intoxicating thing about SQLite is that you
| don't have to install or configure even one goddamn thing.
| It's a lot more work and unknowns to go down this path, but
| you can wind up with a far more robust product as a result.
|
| I hope this trend continues aggressively.
| NeutralForest wrote:
| Fly.io constantly has good articles about databases, really cool
| to see it.
| jef_leppard wrote:
| Someone recently observed that SQLite would be used a lot more in
| production if it didn't have the word "lite" in its name. I've
| personally been amazed by what it can do. Really great piece of
| tech with an unfortunate name.
| efdb wrote:
| Not so unfortunate according to Socrates: 'the hardest task
| needs the lightest hand..' ;-)
| aaaaaaaaata wrote:
| Why dilute your community with people who don't do the reading?
| srcreigh wrote:
| The name SQLite isn't Sequel-lite. It's S.Q.L.-ite akin to
| graphite or Levite. It means a thing of SQLy stuff.
|
| The creator says it like Escue Ell-ite emphasis on the Es- and
| -ite. [0]
|
| https://youtu.be/Jib2AmRb_rk?t=99
| giraffe_lady wrote:
| Yes but you don't get the namer's intent beamed into your
| brain when you see or hear a word for the first time. To me
| and a lot of other people it's pretty clearly sql-lite
| regardless of what the namer wants it to be. Maybe this makes
| me stupid or whatever but when naming things you also have to
| account for my stupidity because I share it with a lot of
| other people.
| nordsieck wrote:
| This is exactly the reason the Nimrod language was renamed
| to Nim.
| nordsieck wrote:
| That may be so. But just because someone is the creator
| doesn't mean they have control over how people use the name.
|
| Case in point:
|
| Blue Origin wants people to refer to it by the nickname
| "Blue". But you kind of have to be in the industry to know
| that. Everyone else just uses the initialism instead, which
| is most commonly used to abbreviate "body odor".
| 83457 wrote:
| TIL, but strange choice.
|
| On a related topic, anyone have an idea of the breakdown of
| people that say Sequel vs S.Q.L. and are there people that
| look down on others for their usage? Of course ignoring the
| opinion of anyone that says Jif.
| benbjohnson wrote:
| I would be funny if they suddenly rebranded to "MegaSQL" :)
| jef_leppard wrote:
| SQLAwesome? :)
| 83457 wrote:
| SQLit!
| jraph wrote:
| sql.io
| jjoonathan wrote:
| Nah, this time zoomers win. SQLit > sql.io
| seestem wrote:
| SQLitio
| [deleted]
| sbierwagen wrote:
| SQLheavy
| tmpz22 wrote:
| To blame unconscious bias over a suffix as a non-trivial force
| on adoption is silly when there are plenty of valid technical
| constraints for adoption of SQLite in many projects.
|
| IMO database names are incredibly bland, outside of
| Cockroachdb, and not a driving factor to adoption given the
| critical nature of the decision.
| simonw wrote:
| Something I found non-obvious about WAL mode in SQLite is that
| it's actually a property of the database file itself.
|
| When you run "PRAGMA journal_mode=wal;" against a database file
| the mode is permanently changed for that file - and the .db-wal
| and .db-shm files for that database will appear in the same
| directory as it.
|
| Any future connections to that database will use it in WAL mode -
| until you switch the mode on it back, at which point it will go
| back to journal mode.
|
| It makes sense when you think about it - of course a database can
| only be in one or the other modes, not both, so the setting must
| be at the database file level. But it took me a while to
| understand.
|
| I wrote some notes on this here:
| https://til.simonwillison.net/sqlite/enabling-wal-mode
| lancehasson wrote:
| I was reading up on WAL mode this week and came across a
| mailing list where you asked for clarification on this same
| topic. That post helped me understand, thanks for asking the
| question! The confusing thing here for me is that the majority
| of other PRAGMAs (that I know of) are per-request, so this one
| differs from that pattern
| benbjohnson wrote:
| The other odd thing is that the journal_mode is only persistent
| for WAL, I believe. The DELETE, TRUNCATE, & PERSIST modes are
| per-connection. It makes sense though since those 3 modes deal
| with the rollback journal and are compatible with each other
| while the WAL is totally separate.
| https://www.sqlite.org/pragma.html#pragma_journal_mode
| chasil wrote:
| There is a _significant_ warning about the use of WAL mode,
| very plain in the documentation:
|
| "Transactions involving multiple attached databases are atomic,
| assuming that the main database is not ":memory:" and the
| journal_mode is not WAL. If the main database is ":memory:" or
| if the journal_mode is WAL, then transactions continue to be
| atomic within each individual database file. But if the host
| computer crashes in the middle of a COMMIT where two or more
| database files are updated, some of those files might get the
| changes where others might not."
|
| https://sqlite.org/lang_attach.html
|
| Systems designed for DML against multiple database files should
| eschew WAL mode.
___________________________________________________________________
(page generated 2022-08-24 23:01 UTC)