[HN Gopher] SQLite concurrency and why you should care about it
       ___________________________________________________________________
        
       SQLite concurrency and why you should care about it
        
       Author : HunOL
       Score  : 232 points
       Date   : 2025-11-01 12:59 UTC (10 hours ago)
        
 (HTM) web link (jellyfin.org)
 (TXT) w3m dump (jellyfin.org)
        
       | asa400 wrote:
       | In SQLite, transactions by default start in "deferred" mode. This
       | means they do not take a write lock until they attempt to perform
       | a write.
       | 
       | You get SQLITE_BUSY when transaction #1 starts in read mode,
       | transaction #2 starts in write mode, and then transaction #1
       | attempts to upgrade from read to write mode while transaction #2
       | still holds the write lock.
       | 
       | The fix is to set a busy_timeout and to begin any transaction
       | that does a write (any write, even if it is not the first
       | operation in the transaction) in "immediate" mode rather than
       | "deferred" mode.
       | 
       | https://zeroclarkthirty.com/2024-10-19-sqlite-database-is-lo...
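A minimal Python sketch of the fix described above, using the stdlib sqlite3 module (table and function names are illustrative, not from the article):

```python
import sqlite3

def open_conn(path="app.db"):
    # isolation_level=None -> autocommit mode; we issue BEGIN ourselves
    conn = sqlite3.connect(path, isolation_level=None)
    # Wait up to 5s for a lock instead of failing with SQLITE_BUSY immediately
    conn.execute("PRAGMA busy_timeout=5000")
    conn.execute("PRAGMA journal_mode=WAL")
    return conn

def transfer(conn, amount):
    # BEGIN IMMEDIATE takes the write lock up front, so the transaction
    # can never fail midway on a deferred read->write lock upgrade.
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = 1", (amount,))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = 2", (amount,))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

The key point is that the write lock is acquired at BEGIN, before any reads inside the transaction, so there is no read-to-write upgrade left to fail.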
        
         | tlaverdure wrote:
         | Yes, these are both important points. I didn't see any mention
         | of SQLITE_BUSY in the blog post and wonder if that was never
         | configured. Something that people miss quite often.
        
         | BobbyTables2 wrote:
          | That's the best explanation I've seen of this issue.
         | 
         | However, it screams of a broken implementation.
         | 
         | Imagine if Linux PAM logins randomly failed if someone else was
         | concurrently changing their password or vice versa.
         | 
         | In no other application would random failures due to
         | concurrency be tolerated.
         | 
         | SQLite is broken by design; the world shouldn't give them a
         | free pass.
        
           | asa400 wrote:
           | SQLite is a truly remarkable piece of software that is a
           | victim both of its own success and its unwavering commitment
           | to backward compatibility. It has its quirks. There are
           | definitely things we can learn from it.
        
         | mickeyp wrote:
         | Indeed. Everyone who uses sqlite _will_ get burnt by this one
         | day and spend a lot of time chasing down errant write-upgraded
         | transactions that cling on for a little bit longer than
         | intended.
        
         | summarity wrote:
         | I've always tried to avoid situations that could lead to
         | SQLITE_BUSY. SQLITE_BUSY is an architecture smell. For standard
         | SQLite in WAL, I usually structure an app with a read
         | "connection" pool, and a single-entry write connection pool.
         | Making the application aware of who _actually_ holds the write
         | lock gives you the ability to proactively design access
         | patterns, not try to react in the moment, and to get
         | observability into lock contention, etc.
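A sketch of the pool structure described above (my own illustrative Python, not summarity's Nim code): several read connections plus a single-entry write "pool", so the application always knows who holds the write connection.

```python
import queue
import sqlite3

class SQLitePools:
    """N read connections, exactly one write connection.

    Time spent blocked in self.writer.get() is your lock contention,
    now visible at the application level.
    """
    def __init__(self, path, readers=4):
        self.readers = queue.Queue()
        for _ in range(readers):
            self.readers.put(self._connect(path))
        self.writer = queue.Queue(maxsize=1)   # single-entry write pool
        self.writer.put(self._connect(path))

    @staticmethod
    def _connect(path):
        conn = sqlite3.connect(path, isolation_level=None, check_same_thread=False)
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute("PRAGMA busy_timeout=5000")
        return conn

    def read(self, sql, params=()):
        conn = self.readers.get()
        try:
            return conn.execute(sql, params).fetchall()
        finally:
            self.readers.put(conn)

    def write(self, sql, params=()):
        conn = self.writer.get()   # blocks until the writer is free
        try:
            conn.execute("BEGIN IMMEDIATE")
            conn.execute(sql, params)
            conn.execute("COMMIT")
        finally:
            self.writer.put(conn)
```

In WAL mode the readers never block on the writer, so only `write()` calls queue up behind each other.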
        
           | mickeyp wrote:
           | I mean, you're not wrong, and that is one way to solve it,
           | but the whole point of a sensibly-designed WAL -- never mind
           | database engine -- is that you do not need to commit to some
           | sort of actor model to get your db to serialise writes.
        
             | sethev wrote:
             | These are performance optimizations. SQLite does serialize
             | writes. Avoiding concurrent writes to begin with just
             | avoids some overhead on locking.
        
               | mickeyp wrote:
               | "performance optimisation" --- yeees, well, if you don't
               | care about data integrity between your reads and writes.
               | Who knows when those writes you scheduled really get
               | written. And what of rollbacks due to constraint
                | violations? There's a reason we co-locate transactions
                | with code: they are intertwined. A queue-writer is fine
                | for a wide range of tasks, but _not_ everything.
               | 
               | It's that we need to contort our software to make sqlite
               | not suck at writes that is the problem.
        
               | sethev wrote:
               | This is just FUD. The reason SQLite does locking to begin
               | with is to avoid data corruption. Almost every statement
               | this blog post makes about concurrency in SQLite is
               | wrong, so it's little surprise that their application
               | doesn't do what they expect.
               | 
               | >Who knows when those writes you scheduled really get
               | written
               | 
               | When a commit completes for a transaction, that
               | transaction has been durably written. No mystery. That's
               | true whether you decide to restrict writes to a single
               | thread in your application or not.
        
               | mickeyp wrote:
                | You are talking about low-level stuff like syncing to the
                | filesystem -- journalling the data and ensuring atomicity
                | is maintained -- and I am in actual fact not.
               | 
               | Dislocating DML from the code that triggers it creates
               | many problems around ensuring proper data integrity and
               | it divorces consistent reads of uncommitted data that you
               | may want to tightly control before committing. By punting
               | it to a dedicated writer you're removing the ability to
               | ensure serialised modification of your data and the
               | ability to cleanly react to integrity errors that may
                | arise. If you don't need that? Go ahead. But it's not
                | FUD. We build relational, ACID-compliant databases this
                | way for a reason.
        
               | sethev wrote:
               | Oh, I think you're picturing executing your transaction
               | logic and then sending writes off to a background queue.
               | I agree, that's not a general strategy - it only works
               | for certain cases.
               | 
               | I just meant that if you can structure your application
               | to run write transactions in a single thread (the whole
                | transaction and its associated logic, not just deferring
               | writing the end result to a separate thread) then you
               | minimize contention at the SQLite level.
        
               | catlifeonmars wrote:
               | > When a commit completes for a transaction, that
               | transaction has been durably written. No mystery. That's
               | true whether you decide to restrict writes to a single
               | thread in your application or not.
               | 
               | Usually this is true but there are edge cases for certain
               | journaled file systems. IIRC sqlite.org has a discussion
               | on this.
        
               | zimpenfish wrote:
               | > there are edge cases for certain journaled file
               | systems. IIRC sqlite.org has a discussion on this.
               | 
               | Can't currently find it but I guess it comes under the
               | "if the OS or hardware lies to SQLite, what can it do?"
               | banner?
        
               | catlifeonmars wrote:
               | That might have been it. Overall the whole "How to
               | corrupt your database article" was quite a good read:
               | 
               | https://sqlite.org/howtocorrupt.html
        
               | jitl wrote:
               | > Who knows when those writes you scheduled really get
               | written
               | 
               | I await the write to complete before my next read in my
               | application logic, same as any other bit of code that
               | interacts with a database or does other IO. Just because
               | another thread handles interacting with the writer
               | connection, doesn't mean my logic thread just walks away
               | pretending the write finished successfully in 0ms.
        
               | ncruces wrote:
               | SQLite, for the most part, uses polling locks. That means
               | it checks if a lock is available to be taken, and if it's
               | not, it sleeps for a bit, then checks again, until this
               | times out.
               | 
               | This becomes increasingly inefficient as contention
               | increases, as you can easily get into a situation where
               | _everyone_ is sleeping, waiting for others, for a few
               | milliseconds.
               | 
               | Ensuring all, or most, writes are serialized, improves
               | this.
        
           | simonw wrote:
           | Even with that pattern (which I use too) you still need to
           | ensure those write operations always start a transaction at
           | the beginning in order to avoid SQLITE_BUSY.
        
             | summarity wrote:
             | Yes, indeed. In my apps, which are mostly Nim, my pool
             | manager ensures this always happens - along with a host of
             | other optimizations. I often start with barebones SQLite
             | and then later switch to LiteSync (distributed SQLite with
             | multi-master replication), so I keep the lock management at
             | the app level to adapt to whatever backend I'm using.
        
               | probst wrote:
               | I am really curious about LiteSync. Any chance you could
               | share a bit on your experiences with it (recognising it's
               | somewhat off-topic...). Do you run with multiple
               | primaries? What sort of use cases do you reach to it for?
               | Conflict resolution seems a bit simplistic at first
               | glance (from the perspective of someone very into CRDTs),
               | have you experienced any issues as a result of that?
        
         | simonw wrote:
         | Yeah I read the OP and my first instinct was that this is
         | SQLITE_BUSY. I've been collecting posts about that here:
         | https://simonwillison.net/tags/sqlite-busy/
        
           | gwking wrote:
           | One tidbit that I don't see mentioned here yet is that ATTACH
           | requires a lock. I just went looking for the documentation
           | about this and couldn't find it, especially for WAL mode
           | (https://www.sqlite.org/lockingv3.html mentions the super-
           | journal, but the WAL docs do not mention ATTACH at all).
           | 
           | I have a python web app that creates a DB connection per
           | request (not ideal I know) and immediately attaches 3
           | auxiliary DBs. This is a low traffic site but we have a
           | serious reliability problem when load increases: the ATTACH
           | calls occasionally fail with "database is locked". I don't
           | know if this is because the ATTACH fails immediately without
           | respecting the normal 5 second database timeout or what. To
           | be honest I haven't implemented connection pooling yet
           | because I want to understand what exactly causes this
           | problem.
        
         | kijin wrote:
         | Wouldn't that "fix" make the problem worse on the whole, by
         | making transactions hold onto write locks longer than
         | necessary? (Not trying to disagree, just curious about
         | potential downsides.)
        
           | asa400 wrote:
           | It's a reasonable question!
           | 
           | In WAL mode, writers and readers don't interfere with each
           | other, so you can still do pure read queries in parallel.
           | 
           | Only one writer is allowed at a time no matter what, so
           | writers queue up and you have to take the write lock at some
           | point anyway.
           | 
           | In general, it's hard to say without benchmarking your own
           | application. This will get rid of SQLITE_BUSY errors firing
           | immediately in the situation of read/write/upgrade-read-to-
           | write scenario I described, however. You'd be retrying the
           | transactions that fail from SQLITE_BUSY anyway, so that
           | retrying is what you'd need to benchmark against.
           | 
           | It's a subtle problem, but I'd rather queue up writes than
           | have to write the code that retries failed transactions that
           | shouldn't really be failing.
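For comparison, the retry code being avoided looks roughly like this (a hedged sketch; the exponential-backoff parameters are arbitrary choices, not from the thread):

```python
import random
import sqlite3
import time

def with_busy_retry(fn, attempts=5, base_delay=0.05):
    """Re-run a transaction that fails with SQLITE_BUSY.

    fn must be safe to re-run from the start, since the failed
    transaction was rolled back. Backs off with jitter between tries.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except sqlite3.OperationalError as exc:
            # Python's sqlite3 surfaces SQLITE_BUSY as "database is locked"
            if "database is locked" not in str(exc) or attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

Queuing writes with BEGIN IMMEDIATE plus busy_timeout lets SQLite do this waiting for you instead.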
        
         | chasil wrote:
         | In an Oracle database, there is only one process that is
         | allowed to write to tablespace datafiles, the DBWR (or its
         | slaves). Running transactions can write to ram buffers and the
         | redo logs only.
         | 
         | A similar design for SQLite would design for only one writer,
         | with all other processes passing their SQL to it.
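A minimal sketch of that single-writer design in Python (illustrative only; a real system would also pass results and errors back to callers):

```python
import queue
import sqlite3
import threading

def writer_thread(path, jobs):
    """Dedicated writer in the spirit of the DBWR analogy above.

    Other threads submit (sql, params, done_event) tuples; only this
    thread ever touches the database with a write.
    """
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("PRAGMA journal_mode=WAL")
    while True:
        job = jobs.get()
        if job is None:          # shutdown sentinel
            break
        sql, params, done = job
        conn.execute("BEGIN IMMEDIATE")
        conn.execute(sql, params)
        conn.execute("COMMIT")
        done.set()               # caller may now observe the committed write
    conn.close()
```

Callers block on their `done` event if they need read-your-writes consistency, and otherwise fire and forget.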
        
       | mickeyp wrote:
       | SQLite is a cracking database -- I love it -- that is let down by
       | its awful defaults in service of 'backwards compatibility.'
       | 
       | You need a brace of PRAGMAs to get it to behave reasonably sanely
       | if you do anything serious with it.
        
         | tejinderss wrote:
         | Do you know any good default PRAGMAs that one should enable?
        
           | mickeyp wrote:
           | These are my PRAGMAs and not _your_ PRAGMAs. Be very careful
           | about blindly copying something that may or may not match
            | your needs.
            | 
            |     PRAGMA foreign_keys=ON
            |     PRAGMA recursive_triggers=ON
            |     PRAGMA journal_mode=WAL
            |     PRAGMA busy_timeout=30000
            |     PRAGMA synchronous=NORMAL
            |     PRAGMA cache_size=10000
            |     PRAGMA temp_store=MEMORY
            |     PRAGMA wal_autocheckpoint=1000
            |     PRAGMA optimize  <- run on tx start
           | 
            | Note that I do not use auto_vacuum, as DELETEs are uncommon
            | in my workflows and I am fine with the trade-off; if I do
            | need it I can always PRAGMA it.
           | 
           | defer_foreign_keys is useful if you understand the pros and
           | cons of enabling it.
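Applying a PRAGMA list like this on every new connection might look as follows in Python (the values mirror the list above and are one person's defaults, not universal recommendations):

```python
import sqlite3

# Mirrors the PRAGMA list above; tune these to your own workload.
CONNECTION_PRAGMAS = [
    "PRAGMA foreign_keys=ON",
    "PRAGMA recursive_triggers=ON",
    "PRAGMA journal_mode=WAL",
    "PRAGMA busy_timeout=30000",
    "PRAGMA synchronous=NORMAL",
    "PRAGMA cache_size=10000",
    "PRAGMA temp_store=MEMORY",
    "PRAGMA wal_autocheckpoint=1000",
]

def connect(path):
    conn = sqlite3.connect(path)
    for pragma in CONNECTION_PRAGMAS:
        conn.execute(pragma)
    return conn
```

Running them per connection is harmless even for the sticky ones, as noted further down the thread.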
        
             | adzm wrote:
             | Really, no mmap?
        
               | metrix wrote:
                | I'm curious what your suggested mmap PRAGMA would be.
        
             | mikeocool wrote:
             | Using strict tables is also a good thing to do, if you
             | value your sanity.
        
             | porridgeraisin wrote:
              | You should run PRAGMA optimize before tx end, not at tx start.
             | 
             | Except for long lived connections where you do it
             | periodically.
             | 
             | https://www.sqlite.org/lang_analyze.html#periodically_run_p
             | r...
        
               | masklinn wrote:
               | Also foreign_keys has to be set per connection but
               | journal_mode is sticky (it changes the database itself).
        
               | porridgeraisin wrote:
               | Yes, if journal_mode was not sticky, a new process
               | opening the db would not know to look for the wal and shm
               | files and read the unflushed latest data from there. On
               | the other hand, foreign key enforcement has nothing to do
               | with the file itself, it's a transaction level thing.
               | 
               | In any case, there is no harm in setting sticky pragmas
               | every connection.
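The sticky/per-connection distinction is easy to demonstrate with the stdlib sqlite3 module:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "sticky.db")

first = sqlite3.connect(path)
first.execute("PRAGMA journal_mode=WAL")   # persisted in the database file
first.execute("PRAGMA foreign_keys=ON")    # connection-local only
first.close()

# A brand-new connection sees WAL mode, but foreign_keys is back to its
# default of 0 and must be re-enabled.
second = sqlite3.connect(path)
print(second.execute("PRAGMA journal_mode").fetchone()[0])   # wal
print(second.execute("PRAGMA foreign_keys").fetchone()[0])   # 0
```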
        
           | leetrout wrote:
           | Explanation of sqlite performance PRAGMAs
           | 
           | https://kerkour.com/sqlite-for-servers
        
           | e2le wrote:
           | Although not what you asked for, the SQLite authors maintain
           | a list of recommended compilation options that should be used
           | where applicable.
           | 
           | https://sqlite.org/compile.html#recommended_compile_time_opt.
           | ..
        
         | mkoubaa wrote:
         | Seems like it's asking to be forked
        
           | justin66 wrote:
           | It has been forked at least once:
           | 
           | https://docs.turso.tech/libsql
        
           | kbolino wrote:
           | SQLite is fairly fork-resistant due to much of its test suite
           | being proprietary: https://www.sqlite.org/testing.html
        
           | pstuart wrote:
           | The real fork is DuckDB in a way, it has SQLite compatibility
           | and _so much more_.
           | 
           | The SQLite team also has 2 branches that address concurrency
           | that may someday merge to trunk, but by their very nature
           | they are quite conservative and it may never happen unless
           | they feel it passes muster.
           | 
           | https://www.sqlite.org/src/doc/begin-
           | concurrent/doc/begin_co...
           | https://sqlite.org/hctree/doc/hctree/doc/hctree/index.html
           | 
           | As to the problem that prompted the article, there's another
           | way of addressing the problem that is kind of a kludge but is
           | guaranteed to work in scenarios like theirs: Have each thread
            | in the parallel scan write to its own temporary database and
           | then bulk import them once the scan is done.
           | 
           | It's easy to get hung up on having "a database" but sharding
           | to different files by use is trivial to do.
           | 
           | Another thing to bear in mind with a _lot_ of SQLite use
           | cases is that the data is effectively read only save for
           | occasional updates. Read only databases are a lot easier to
           | deal with regarding locking.
        
             | jitl wrote:
             | DuckDB is similar as an in process SQL database, but
             | lacking btree-style ordered indexes makes it a poor
             | performer in key lookups and order-by / range scans if your
             | table is any size larger than trivial.
             | 
             | It's the classic OLAP (DuckDB) vs OLTP (SQLite) trade off
             | between the two. DuckDB is very good at many things but
             | most applications that need a traditional SQL DB will
             | probably not perform well if you swap it over to DuckDB.
        
               | Kinrany wrote:
               | That's surprising, surely OLAP use cases also need key
               | lookups?
        
               | geysersam wrote:
               | Duckdb has optional adaptive radix tree indexing
               | (https://duckdb.org/docs/stable/sql/indexes.html)
        
               | jitl wrote:
               | Oops, I stand corrected!
               | 
               | What I remember about our evaluation of DuckDB in 2024
               | concluded that (1) the major limitations were lack of
               | range-scan and index-lookup performance (maybe w/ joins?
               | or update where?), and (2) the DuckDB Node.js module
                | segfaulted too much. Perhaps the engineers somehow missed
                | the ART index; it could also be the restriction that data
                | fit in memory to create an index on it (our test dataset
                | was about 50gb).
        
             | Kinrany wrote:
             | > Read only databases are a lot easier to deal with
             | regarding locking.
             | 
             | "A lot easier" sounds like an understatement. What's there
             | to lock when the data is read only?
        
       | stefanos82 wrote:
       | When hctree [1] becomes stable in SQLite, it will be the only
       | database I will be using lol!
       | 
       | I presume the `hc` part in project's code name should be High
       | Concurrency.
       | 
       | [1] https://sqlite.org/hctree/doc/hctree/doc/hctree/index.html
        
       | dv35z wrote:
       | Curious if anyone has strategies on how to perform parallel
       | writes to an SQLite database using Python's `multiprocessing`
       | Pool.
       | 
       | I am using it to loop through a database of 11,000 words, hit an
       | HTTP API for each (ChatGPT) and generate example sentences for
       | the word. I would love to be able to asynchronously launch these
       | API calls and have them come back and update the database row
       | when ready, but not sure how to handle the database getting hit
       | by all these writes from (as I understand it) multiple instances
       | of the same Python program/function.
        
         | mickeyp wrote:
         | Edit: disregard. I read it as he'd done it and had contention
         | problems.
         | 
         | You can't. You have a single writer - it's one of the many
         | reasons sqlite is terrible for serious work.
         | 
            | You'll need a multiprocessing Queue and a writer that picks
            | off sentences one by one and commits them.
        
           | hruk wrote:
           | This is just untrue - the naive implementation (make the API
           | call, write a single row to the db) will work fine, as
           | transactions are quite fast on modern hardware.
           | 
           | What do you consider "serious" work? We've served a SaaS
           | product from SQLite (roughly 300-500 queries per second at
           | peak) for several years without much pain. Plus, it's not
           | like PG and MySQL are pain-free, either - they all have their
           | quirks.
        
             | mickeyp wrote:
             | Edit: disregard. I read it as he'd done it and had
             | contention problems.
             | 
             | I mean it's not if he's got lock contention from BUSY
             | signals, now is it, as he implies. Much of his issues will
             | stem from transactions blocking each other; maybe they are
                | long-lived, maybe they are not. And those 300-500 queries
                | --- are they writes or reads? Because reads are not a
                | problem.
        
               | hruk wrote:
               | Roughly 80/20 read to write. On the instance's gp3 EBS
               | volume (which is pretty slow), we've pushed ~700 write
               | transactions per second without much problem.
        
               | mickeyp wrote:
               | For small oltp workloads the locking is not going to be a
               | problem. But stuff that holds the write lock for some
               | measurable fraction of a second even will gum things up
               | real fast. Transactions that need it for many seconds?
               | You'll quickly be dead in the water.
        
         | zie wrote:
          | Technically SQLite can only have 1 writer at any given moment,
          | but it can appear to work across multiple writers and will
          | serialize the calls for you.
         | 
          | By default SQLite will not do what you want out of the box. You
          | have to turn on some feature flags (PRAGMA) to get it to behave
          | for you. You need WAL mode, etc. Read:
         | 
          | * https://kerkour.com/sqlite-for-servers
          | * https://zeroclarkthirty.com/2024-10-19-sqlite-database-is-lo...
         | 
          | My larger question is: why multiprocessing? This looks like an
          | IO-heavy workload, not CPU-bound, so python asyncio or python
          | threads would probably do you better.
         | 
          | multiprocessing is for when your resource hog is CPU (probably
          | 1 python process per CPU), not IO.
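For the word/sentence workload in the parent question, a threads-based sketch could look like this (`fetch_sentence` is a stand-in for the real ChatGPT call; the schema is assumed):

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor

def fetch_sentence(word):
    # Placeholder for the slow HTTP/ChatGPT call.
    return f"Example sentence using {word}."

write_lock = threading.Lock()

def process_words(path, words, workers=10):
    conn = sqlite3.connect(path, isolation_level=None, check_same_thread=False)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA busy_timeout=5000")

    def job(word):
        sentence = fetch_sentence(word)   # network calls run in parallel
        with write_lock:                  # the short DB write is serialized
            conn.execute("UPDATE words SET sentence=? WHERE word=?",
                         (sentence, word))
        return word

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(job, words))
    conn.close()
```

The API calls dominate the runtime, so serializing the millisecond-scale writes costs essentially nothing, as sethev notes below.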
        
           | dv35z wrote:
           | I will check into `asyncio` and Python threads. I used
           | multiprocessing as my first project into asynchronous
           | programming. The previous use-case was using Python +
           | multiprocessing to run MacOS `say` (using Python subprocess)
           | - so I could invoke it 10-20 times simultaneously on my
           | computer, rather than waiting for each to complete. I
           | experimented a bit with how many concurrent processes to run
           | (using `time` to clock how long the runs were).
        
         | sethev wrote:
         | Have you tried it?
         | 
         | What you're describing sounds like it would work fine to me.
         | The blog post is misleading imho - it implies that SQLite
         | doesn't handle concurrency at all. In reality, you can perform
         | a bunch of writes in parallel and SQLite will handle running
         | them one after the other internally. This works across
         | applications and processes, you just need to use SQLite to
         | interact with the database. The blog post is also misleading
         | when it implies that the application has to manage access to
         | the database file in some way.
         | 
         | Yes, it's correct that only one of those writes will execute at
         | a time but it's not like you have to account for that in your
         | code, especially in a batch-style process like you're
         | describing. In your Python code, you'll just update a row and
         | it will look like that happens concurrently with other updates.
         | 
         | I'll bet that your call to ChatGPT will take far longer than
         | updating the row, even accounting for time when the write is
         | waiting for its turn in SQLite.
         | 
         | Use WAL-mode for the best performance (and to reduce
         | SQLITE_BUSY errors).
        
           | dv35z wrote:
           | I haven't tried it yet - async processing (and even using
           | SQLite) is new to me, so I'm trying to figure out solution
           | patterns which work for the now, and also I can continue to
           | invest my knowledge in to solve future problems.
           | 
           | I will look into WAL mode. I am enjoying using SQLite (and
           | aware that its not the solution for everything), and have
           | several upcoming tasks which I'm planning to use async stuff
           | - and yes, trying to find the balance between how to handle
           | those async tasks (Networky HTTP calls being different than
           | running `ffmpeg` locally).
        
       | porridgeraisin wrote:
       | > So an application that wants to use SQLite as its database
       | needs to be the only one accessing it.
       | 
        | No. It uses OS level locks (fcntl()). You can access it from
        | however many processes you like. The only rule is: single writer
        | (at a time).
       | 
       | > When another part of the application wants to read data, it
       | reads from the actual database, then scans the WAL for
       | modifications and applies them on the fly.
       | 
       | Also wrong. WAL does not contain modifications, it contains the
       | full pages. A reader checks the WAL, and if it finds the page it
       | won't even read the DB. It's a bit like a cache in this sense,
       | that's why shared cache mode was discouraged in favour of WAL (in
       | addition to its other benefits). Multiple versions of a page can
       | exist in the WAL (from different transactions), but each reader
       | sees a consistent snapshot which is the newest version of each
       | page up to its snapshot point.
       | 
       | > For some reason on some systems that run Jellyfin when a
       | transaction takes place the SQLite engine reports the database is
       | locked and instead of waiting for the transaction to be resolved
       | the engine refuses to wait and just crashes
       | 
       | You can set a timeout for this - busy_timeout.
       | 
       | > Reproducible
       | 
       | There's nothing unreliable here. It will fail every single time.
       | If it doesn't, then the write finished too fast for the read to
        | notice and return SQLITE_BUSY. Not sure what they are seeing.
       | 
       | > The solution
       | 
        | So they've reimplemented SQLite's serialisation, as well as
        | SQLite's busy_timeout, in C#?
       | 
       | > "engine", "crash"
       | 
        | SQLite is not an engine. It's literally functions you link into
       | your app. It also doesn't crash, it returns sqlite_busy. Maybe EF
       | throws an exception on top of that.
       | 
       | I have to say, this article betrays a lack of fundamental DB
       | knowledge and only knowing ORMs. Understand the DB and then use
        | the ORM on top of it. Or at least, don't flame the DB (context:
       | blame-y tone of article) if you haven't bothered to understand
       | it. Speaking of ORMs ...
       | 
       | > EF Core
       | 
        | You're telling me that Burj Khalifa of abstractions doesn't have
       | room to tune SQLite to what web devs expect?
        
         | yellow_lead wrote:
         | C# devs*
        
           | porridgeraisin wrote:
           | Didn't mean to belittle any 'X' developer. By "what web devs
           | expect", I meant the settings that are usually used for
           | databases in web apps.
        
       | ignoramous wrote:
        | So, I decided on three locking strategies:
        | 
        |     No-Lock
        |     Optimistic locking
        |     Pessimistic locking
        | 
        | As a default, the no-lock behavior does exactly what the name
        | implies. Nothing. This is the default because my research shows
        | that for 99% all of this is not an issue and every interaction at
        | this level will slow down the whole application.
       | 
       | Aren't the mutexes in the more modern implementations (like Cosmo
       | [0]) & runtimes (like Go [1]) already optimized so applications
       | can use mutexes _fearlessly_?
       | 
       | [0] https://justine.lol/mutex/
       | 
       | [1] https://victoriametrics.com/blog/go-sync-mutex/
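For reference, the "optimistic locking" strategy the quoted article names is commonly done with a version column, along these lines (assumed schema, not Jellyfin's actual implementation):

```python
import sqlite3

def optimistic_update(conn, item_id, new_name, expected_version):
    """Update only if nobody else changed the row since we read it."""
    cur = conn.execute(
        "UPDATE items SET name=?, version=version+1 "
        "WHERE id=? AND version=?",
        (new_name, item_id, expected_version),
    )
    conn.commit()
    # rowcount 0 means another writer bumped the version first:
    # the caller must re-read and retry, or report a conflict.
    return cur.rowcount == 1
```

No lock is held between read and write; conflicts are detected at write time instead.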
        
       | mangecoeur wrote:
       | Sqlite is a great bit of technology but sometimes I read articles
        | like this and think, maybe they should have used postgres. If you
        | don't specifically need the "one file portability" aspect of
        | sqlite, or it's not embedded (in which case you shouldn't have
       | concurrency issues), Postgres is easy to get running and solves
       | these problems.
        
         | eduction wrote:
         | 100%. I specifically clicked for the "why you should care" and
         | was disappointed I could not find it.
         | 
         | I certainly don't mind if someone is pushing the limits of what
         | SQLite is designed for but personally I'd just rather invest
         | the (rather small) overhead of setting up a db server if I need
         | a lot of concurrency.
        
         | abound wrote:
         | Jellyfin is a self-hostable media server. If they "used
         | Postgres", that means anyone who runs it needs Postgres. I
         | think SQLite is the better choice for this kind of application,
         | if one is going to choose a single database instead of some
         | pluggable layer
        
           | morshu9001 wrote:
           | Exactly, there are use cases where SQLite makes sense but you
           | also want to make it faster. I really don't get why there
           | isn't a more portable Postgres.
        
             | zie wrote:
             | There is, you can even run PG under wasm if you are
             | desperate. :)
             | 
             | SQLite is probably the better option here and in most
             | places where you want portability though.
        
           | tombert wrote:
           | I share my Jellyfin with about a dozen people, and it's not
           | weird to have several people streaming at the same time. I
           | have a two-gigabit connection so bandwidth isn't generally an
           | issue, but I've had issues when three people are all streaming
           | VC-1 encoded video transcoded to H264 in software.
           | 
           | This is something that I think I could fairly easily
           | ameliorate if I could simply load-balance the application
           | server by user, but historically (with Emby), I've not been
           | able to do that due to SQLite locking not allowing me to run
           | multiple instances pointing to the same config instance.
           | 
           | There's almost certainly ways to do this correctly with
           | SQLite but if they allowed for using almost literally any
           | other database this would be a total non-issue.
           | 
           | ETA:
           | 
           | For clarification if anyone is reading this, all this media
           | LEGALLY OBTAINED with PERMISSION FROM THE COPYRIGHT
           | HOLDER(S).
        
             | reddalo wrote:
             | Yeah, I'm sure those twelve people love watching your
             | vacation clips all the time ;)
        
           | reddalo wrote:
           | They're actually planning on migrating to Postgres in a
           | future release:
           | 
           | >[...] it also opens up new possibilities - not officially
           | yet, but soon - for running Jellyfin backed by "real"
           | database systems like PostgreSQL, providing new options for
           | redundancy, load-balancing, and easier maintenance and
           | administration. The future looks very bright!
           | 
           | https://jellyfin.org/posts/jellyfin-release-10.11.0/
        
         | bambax wrote:
         | Jellyfin is a media server app that gets installed on a great
         | variety of platforms and while it would certainly be possible
         | to add a postgres server to the install, the choice of sqlite
         | is more than justified here IMHO.
        
         | throwaway894345 wrote:
         | As a user of Jellyfin, I'm very sad that it doesn't just use
         | Postgres. I basically have to run an NFS system just for
         | Jellyfin so that its data can be available to it no matter
         | which node it gets scheduled on and also that there are never
         | multiple instances running at the same time, even during
         | deployments (e.g., I need to take care that deployments
         | completely stop the first Jellyfin instance before starting the
         | subsequent instance). There are so many unnecessary single
         | points of failure, and Postgres would make a pretty big one go
         | away (never mind addressing the parallelism problems that
         | plague the developers).
         | 
         | Jellyfin is by far the least reliable application I run, but it
         | also seems to be best in class.
        
           | KingMob wrote:
           | I gave up on Jellyfin after media library updates kept
           | hanging on certain video files, and switched to the original
           | Emby it was forked from (iiuc).
           | 
           | Emby has a scarily-ancient install process, but it's been
           | working just fine with less hassle.
        
         | thayne wrote:
         | Using postgres would make it significantly more complicated for
         | Jellyfin users to install and set up Jellyfin. And then users
         | would need to worry about migrating the databases when
         | PostgreSQL has a major version upgrade. An embedded database
         | like sqlite is a much better fit for something like Jellyfin.
        
           | throwaway894345 wrote:
           | As a Jellyfin user, this hasn't been my experience. I needed
           | to do a fair bit of work to make sure Jellyfin could access
           | its database no matter which node it was scheduled onto and
           | that no more than one instance ever accessed the database at
           | the same time. Jellyfin _by far_ required more work to set up
           | _maintainably_ than any of the other applications I run, and
           | it is also easily the least reliable application. This isn't
           | all down to SQLite, but it's all down to a similar set of
           | assumptions (exactly one application instance interacting
           | with state over a filesystem interface).
        
             | thayne wrote:
             | Is running multiple nodes a typical way to run Jellyfin,
             | though? I would expect that most Jellyfin users only run a
             | single instance at a time.
        
               | throwaway894345 wrote:
               | Yes, but you have to go out of your way when writing
               | software to make it so the software _can only_ run on one
               | node at a time. Or rather, well-architected software
               | should require minimal, isolated edits to run in a
               | distributed configuration (for example, replacing SQLite
               | with a distributed SQLite).
        
             | stormbeard wrote:
             | Jellyfin isn't meant to be some highly available
             | distributed system, so of course this happens when you try
             | to operate it like one. The typical user is not someone
             | trying to run it via K8s.
        
               | throwaway894345 wrote:
               | Yeah, I agree, though making software that can run in a
               | distributed configuration is a matter of following a few
               | basic principles, and would be far less work than what
               | the developers have spent trying to make SQLite work for
               | their application.
               | 
               | The effort required to put an application on Kubernetes
               | is a pretty good indicator of software quality. In other
               | words, I can get a pretty good idea of how difficult a
               | piece of software is to maintain in a single-instance
               | configuration by trying to port it to Kubernetes.
        
             | FrinkleFrankle wrote:
             | Care to share your setup?
        
         | amaccuish wrote:
         | Their whole recent rewrite of the DB code (to Entity Framework)
         | is to allow the user choice of DB in future.
        
         | petters wrote:
         | Jellyfin is mostly for a single household, right? Sqlite should
         | be much more than sufficient for Jellyfin (if used correctly).
         | Unfortunately, reading this article you get the impression that
         | they are not using it optimally.
        
           | nick_ wrote:
           | Agreed. How can a media file sharing app possibly saturate
           | Sqlite's write limit? I would use an app-level global lock on
           | all writes to Sqlite.
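
An app-level global write lock like the one suggested above can be sketched in Python with the stdlib `sqlite3` module. This is a hypothetical illustration, not Jellyfin's actual code; the table name and row counts are made up:

```python
import sqlite3
import threading

# One process-wide mutex serializes all writes, so SQLite never sees
# two concurrent write transactions from this process.
write_lock = threading.Lock()
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE media (id INTEGER PRIMARY KEY, title TEXT)")

def insert_title(title):
    with write_lock:   # only one writer at a time
        with conn:     # commits on success, rolls back on error
            conn.execute("INSERT INTO media (title) VALUES (?)", (title,))

threads = [threading.Thread(target=insert_title, args=(f"ep{i}",))
           for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(conn.execute("SELECT COUNT(*) FROM media").fetchone()[0])  # 8
```

Reads can bypass the lock entirely, since SQLite allows them to proceed concurrently with a writer (in WAL mode) without risking SQLITE_BUSY.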
        
         | o11c wrote:
         | Even with postgres, you don't _have_ to use the system
         | instance; there's nothing stopping you from running the server
         | as a child process.
         | 
         | You probably need to support this for your testsuite anyway.
        
           | hamandcheese wrote:
           | Maybe in theory. In practice, most people who need Postgres
           | for their test suite will boot an instance in a docker
           | container in CI, and maybe just assume a system version is
           | available for local dev.
        
       | ricardobeat wrote:
       | Articles like this leave me with an uneasy feeling that the
       | "solutions" are just blind workarounds - more debugging/research
       | should be able to expose exactly what the problem is, now that
       | would be something worth sharing.
        
         | kccqzy wrote:
         | Articles like this give me the feeling that the author did a
         | little bit of research and shared a suboptimal solution, and
         | was hoping that experts on HN would present better solutions.
         | Wasn't there a saying about how the best way to get correct
         | answers is to post not just the question but the wrong answers
         | to it?
        
       | Leherenn wrote:
       | A bit off topic, but there seems to be quite a few SQLite experts
       | here.
       | 
       | We're having trouble with memory usage when using SQLite in-
       | memory DBs with "a lot" of inserts and deletes. Like maybe
       | inserting up to 100k rows in 5 minutes, deleting them all after
       | 5 minutes, and doing this for days on end. We see memory usage
       | slowly creeping up over hours/days when doing that.
       | 
       | Any settings that would help with that? It's particularly bad on
       | macOS, we've had instances where we reached 1GB of memory usage
       | according to Activity Monitor after a week or so.
        
         | asa400 wrote:
         | Are you running vacuums at all? auto_vacuum enabled at all?
         | 
         | https://sqlite.org/lang_vacuum.html
        
           | porridgeraisin wrote:
           | In memory DBs don't have anything to vacuum.
           | 
           | However... what you (and OP) are looking for might be pragma
           | shrink_memory [1].
           | 
           | [1] https://sqlite.org/pragma.html#pragma_shrink_memory
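
A minimal Python sketch of the insert/delete churn described above, followed by `PRAGMA shrink_memory`; the table name and sizes are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")

# The churn pattern: bulk insert, then delete everything.
conn.executemany("INSERT INTO t (payload) VALUES (?)",
                 [("x" * 100,) for _ in range(100_000)])
conn.execute("DELETE FROM t")
conn.commit()

# Ask this connection to hand freed heap memory back to the allocator.
conn.execute("PRAGMA shrink_memory")

# Optionally cap the page cache so it cannot grow unbounded
# (a negative value is a size in KiB, so this is roughly 2 MB).
conn.execute("PRAGMA cache_size = -2000")

print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 0
```

`shrink_memory` is per-connection; for long-lived connections it would need to be run periodically, e.g. right after the bulk delete.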
        
             | asa400 wrote:
             | Ah, you're correct. I read too fast and missed that it was
             | in-memory databases specifically!
        
         | kachapopopow wrote:
         | sounds like normal behavior of adjusting buffers to better fit
         | the use case; not sure if it applies to sqlite or if sqlite
         | even implements dynamic buffers.
        
         | pstuart wrote:
         | If you're deleting all rows you can also just drop the table
         | and recreate it.
        
       | ddtaylor wrote:
       | I have encountered this problem on Jellyfin before. It works like
       | a dream, but there are some very strange circumstances that can
       | cause the database to become locked and then just not work until
       | I restart the docker container. If I check the logs it just says
       | stuff about the database being locked. It happens quite rarely
       | and seems to be when we fidget in the menus on the smart TV,
       | like starting to watch a show, realizing it's the wrong episode
       | as you click the button, then spamming the back button, etc.
        
       | thayne wrote:
       | There seem to be some misunderstandings in this:
       | 
       | > If your application fully manages this file, the assumption
       | must be made that your application is the sole owner of this
       | file, and nobody else will tinker with it while you are writing
       | data to it.
       | 
       | Kind of, but sqlite does locking for you, so you don't have to do
       | anything to ensure your process is the only one writing to the db
       | file.
       | 
       | > [The WAL] allows multiple parallel writes to take place and get
       | enqueued into the WAL.
       | 
       | The WAL doesn't allow multiple parallel writes. It just allows
       | reads to be concurrent with a single write transaction.
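
The single-writer behavior is easy to demonstrate with three connections to one WAL-mode database; a Python sketch (file path and table are arbitrary):

```python
import sqlite3, os, tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE t (v INTEGER)")
writer.commit()

reader = sqlite3.connect(path)
other = sqlite3.connect(path)
other.execute("PRAGMA busy_timeout=0")   # fail immediately, don't wait

writer.execute("BEGIN IMMEDIATE")        # take the single write lock
writer.execute("INSERT INTO t VALUES (1)")

# Readers are NOT blocked by the open write transaction in WAL mode;
# they just see the last committed snapshot (still empty here).
print(reader.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 0

# But a second write transaction cannot start in parallel:
try:
    other.execute("BEGIN IMMEDIATE")
except sqlite3.OperationalError as e:
    print(e)                             # database is locked

writer.commit()                          # now 'other' could write
```

With a nonzero `busy_timeout`, the second writer would block and retry for that long instead of failing immediately, which is the usual fix for spurious SQLITE_BUSY errors.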
        
         | Sammi wrote:
         | Yeah... I adore Sqlite and upvote anything about it, but I
         | couldn't upvote this article because it was just so poorly
         | informed. It gets the very basics on sqlite concurrency wrong.
        
       | yread wrote:
       | I'm a bit confused. The point of this article is that the author
       | used .NET Interceptors and TagWith to somehow tag his EF Core
       | operations so that they set their own busy_timeout (which the EF
       | Core devs think is not necessary
       | https://github.com/dotnet/efcore/issues/28135 ) or take a
       | horrible global lock? No data is presented on how it improved
       | things, if it did. Nor is it described which operations were
       | tagged with what.
       | The only interesting thing about it are the interceptors but
       | that's somehow not discussed in HN's comments at all.
        
       | fitsumbelay wrote:
       | Very helpful and a model for how technical posts should be
       | written: clarity, concision, anchor links that summarize the top
       | lines. It was a pleasure to read.
        
       | tombert wrote:
       | Does this mean I can finally load-balance with multiple Jellyfin
       | instances?
       | 
       | A million years ago, back when I still used Emby, I was annoyed
       | that I couldn't use it across multiple nodes in Docker Swarm due
       | to SQLite's locking. It annoyed me enough that I
       | started (but never completed) a driver to change the DB to
       | postgres [1]. I ended up moving everything over to a single
       | server, which is mostly fine unless I have multiple people
       | transcoding at the same time.
       | 
       | If this is actually fixed then I might have an excuse to
       | rearchitect my home server setup again.
       | 
       | [1] https://github.com/Tombert/embypostgres
        
       | EionRobb wrote:
       | One of the biggest contributors I've had in the past for SQLite
       | blocking was disk fragmentation.
       | 
       | We had some old Android tablets using our app 8 hours a day for
       | 3-4 years. They'd complain of locking errors and slowness, but
       | every time they'd copy their data to send to us, we couldn't
       | replicate, even on the same hardware. It wasn't until we bought
       | one user a new device and got them to send us the old one that
       | we could check it out. We thought maybe the SSD had worn out over
       | the few years of continual use but installing a dev copy of our
       | app was super fast. In the end what did work was to "defrag" the
       | db file by copying it to a new location, deleting the original,
       | then moving it back to the same name. Boom, no more "unable to
       | open database" errors, no more slow downs.
       | 
       | I tried this on Jellyfin dbs a few months ago after running it
       | for years and then suddenly running into performance issues, it
       | made a big difference there too.
        
         | Multicomp wrote:
         | Would the SQLite vacuum function help with that?
        
           | mceachen wrote:
           | You can VACUUM INTO, ~~but standard vacuum won't rewrite the
           | whole db~~ (vacuum rewrites the whole db)
           | 
           | https://sqlite.org/lang_vacuum.html
           | 
           | (Edit: if multiple processes are concurrently reading and
           | writing, and one process vacuums, verify that the right
           | things happen: specifically, that concurrent writes from
           | other processes during a vacuum don't get erased by the other
           | processes' vacuum. You may need an external advisory lock to
           | avoid data loss).
        
             | return_to_monke wrote:
             | > You can VACUUM INTO, but standard vacuum won't rewrite
             | the whole db.
             | 
             | This is not true. From the link you posted:
             | 
             | > The VACUUM command works by copying the contents of the
             | database into a temporary database file and then
             | overwriting the original with the contents of the temporary
             | file.
        
               | mceachen wrote:
               | Ugh, you're totally right.
               | 
               | I always get optimize and vacuum mixed up.
               | 
               | https://sqlite.org/pragma.html#pragma_optimize
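
The "defrag by copy" trick from the parent comment can also be done with VACUUM INTO (SQLite 3.27+). A hypothetical Python sketch, with made-up paths and schema, assuming the application is stopped during the swap:

```python
import sqlite3, os, tempfile

src = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(src)
conn.execute("CREATE TABLE t (v TEXT)")
conn.executemany("INSERT INTO t VALUES (?)",
                 [("x" * 500,) for _ in range(1000)])
conn.execute("DELETE FROM t WHERE rowid % 2 = 0")  # leave free pages behind
conn.commit()

# Rebuild the database into a fresh, compact file...
compacted = src + ".compact"
conn.execute("VACUUM INTO ?", (compacted,))
conn.close()

# ...then atomically swap it into place (the app must be stopped, and
# any leftover -wal/-shm files removed, before doing this).
os.replace(compacted, src)

conn = sqlite3.connect(src)
print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 500
```

Unlike plain VACUUM, this leaves the original file untouched until the copy succeeds, which matches the backup-then-swap approach described above.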
        
         | didip wrote:
         | This is fascinating. What would be the solution for this? You
         | can't ask users to defrag.
        
           | DANmode wrote:
           | Perform the file operation, after zipping the existing db as
           | a backup, and leaving the original where it sits.
           | 
           | Success, performance increase.
           | 
           | Failure, no change.
        
       | slashdave wrote:
       | I am a little confused, but maybe I am missing some context?
       | Wouldn't using a proper database be a lot easier than all of this
       | transaction hacking? I mean, is Postgres that hard to use?
        
       ___________________________________________________________________
       (page generated 2025-11-01 23:00 UTC)