[HN Gopher] I'm all-in on server-side SQLite (2022)
___________________________________________________________________
I'm all-in on server-side SQLite (2022)
Author : rrampage
Score : 190 points
Date : 2023-09-22 16:02 UTC (6 hours ago)
(HTM) web link (fly.io)
(TXT) w3m dump (fly.io)
| sneak wrote:
| > _We're beginning to hit theoretical limits. In a vacuum, light
| travels about 186 miles in 1 millisecond. That's the distance
| from Philadelphia to New York City and back. Add in layers of
| network switches, firewalls, and application protocols and the
| latency increases further._
|
| > _The per-query latency overhead for a Postgres query within a
| single AWS region can be up to a millisecond. That's not Postgres
| being slow--it's you hitting the limits of how fast data can
| travel._
|
| No. An AWS region has a radius of less than a few dozen
| kilometers, more likely around 5km. Lightspeed doesn't factor
| into it at those small distances. That millisecond is indeed
| Postgres being "slow" in these terms. (Most of it is the
| networking stack, as noted.)
|
| This basic error makes me question the validity of the document.
| I stopped reading here.
|
| I agree that "networks are slow" but this sort of false
| justification is not the way to sell it. Is this an attempt to
| make the author seem like he knows what he is doing because he
| knows the speed of light?
| SantaCruz11 wrote:
| If he really understood what he's talking about, he would at
| least say "half the speed of lights" because that's the max
| speed a fiber cable will ever go.
| blagie wrote:
| I don't need to be sold on the virtues of applications running on
| systems like SQLite. The nineties had a lot of servers which were
| very simple (and performant) compared to LAMP, and I like systems
| like that.
|
| What I would like is a good primer about the layers on top of
| SQLite. What does Litestream do for me? How does it compare to
| competitors? Why not just use SQLite directly? A more in-depth
| technical discussion would be nice. I'd also like to understand
| wrappers and ORMs for migration to other systems, should SQLite
| stop scaling.
| fiedzia wrote:
| > Why not just use SQLite directly?
|
| SQLite does not provide replication, so there is no way to use
| it directly (other than copy whole file). If you mean it as
| "Why not use it as a database" than sure, you can use it
| directly, though the article states reasons for not doing so
| (resiliency and concurrency). Postgres is a lot better in those
| areas, and so is the tooling.
|
| >I'd also like to understand wrappers and ORMs for migration to
| other systems, should SQLite stop scaling
|
| 1. It heavily depends on the orms. For example Django provides
| good abstraction layer and many things works with any database
| with no change needed, but many other don't bother about that.
| However just because a query runs, doesn't mean it will return
| the same results. Any non-trivial app will rely on numerous
| accidental details and you can't switch db and expect
| everything will be fine. SQL is not really portable even in the
| parts it does cover, and there are many it doesn't.
| blagie wrote:
| I've definitely build portable systems in Django, back in the
| day. The trick was to have decent test coverage and run over
| both (at the time) MySQL and SQLite.
|
| (And yes, I should have used postgres).
|
| I'll mention: The internet, in the nineties, was powered by
| 486-grade computers, and things were perfectly performant to
| a pretty decent scale. If you can get rid of issues like
| network latency (from e.g. a database on a different machine
| than your main computer), and similar 2020-era bottlenecks, a
| lot of web apps can serve millions of users from a single
| machine. That's doubly true with gigabytes of RAM and modern
| SSDs.
|
| With RAID and regular backups, it can even be pretty robust.
|
| It's even easier to do now that you can write static client
| apps that just need to push and pull little bits of data to
| and from the server.
|
| That's not an architecture that's used often, but it keeps
| things very simple and can work quite well.
| benbjohnson wrote:
| Author here. Cool to see the post make it up on HN again. I'm
| still as excited as ever about the SQLite space. So much great
| work going on from rqlite, cr-sqlite, & Turso, and we're still
| plugging away on LiteFS. I'm happy to answer any questions about
| the post.
| otoolep wrote:
| Thanks Ben!
|
| rqlite[1] creator, happy to answer questions too.
|
| [1] https://www.rqlite.io
| jjtheblunt wrote:
| Is there a recommendable way to feign a graph database within
| SQLite? (because read only replication would be fantastic on
| fly.io for us.)
| benbjohnson wrote:
| SQLite has very little per-query overhead (as opposed to a
| database connection over a network) so I would think you
| could traverse a graph using multiple small queries rather
| than using a graph query language.
| jjtheblunt wrote:
| yep that's what i am doing
| rubenv wrote:
| What's the status on litestream? Does that have a future as
| well or is it LiteFS all the way?
| benbjohnson wrote:
| Litestream definitely has a future. Our goal is to keep it as
| a simple single-node disaster recovery tool though so it
| won't see as much feature development as something like
| LiteFS. We've been focused a lot on LiteFS & LiteFS Cloud to
| get them in a good place but I'm looking forward to going
| back and updating Litestream more regularly.
| srameshc wrote:
| I recently saw the launch post of Electric SQL which syncs to
| SQlite, I like the pattern on how keeping the data close to the
| frontend can solve many problems, if synced with the main DB. I
| hate to run another docker or manage service to manage this layer
| but if somehow a part of data from the database like Postgres can
| be synced using something simple like litestream and can be
| placed either on edge or client can be a solution to many of the
| problems.
| kijin wrote:
| > _When you put your data right next to your application, you can
| see per-query latency drop to 10-20 microseconds. That's micro,
| with a m. A 50-100x improvement over an intra-region Postgres
| query._
|
| Why compare the latency of a remote Postgres database with a
| local SQLite database? If your app is so simple and self-
| contained that it runs on a single EC2 instance using local
| files, nothing prevents you from installing Postgres on the same
| machine, whether inside a container or not.
|
| I have some simple apps on EC2 with MariaDB on localhost, and
| well-tuned queries rarely take more than 100-200 microseconds.
| That's total query execution time, not just communication
| latency. RDS just sucks for this kind of use case. It's not a
| useful comparison.
|
| > _As much as I love tuning SQL queries, it's becoming a dying
| art for most application developers. Even poorly tuned queries
| can execute in under a second for ordinary databases._
|
| Didn't you just say that milliseconds matter?
| giantrobot wrote:
| > Why compare the latency of a remote Postgres database with a
| local SQLite database?
|
| The SQLite DB is just a flat file that can be packaged in a
| lambda or whatever cloud's object store. It makes sense for
| fast access to read-heavy or read-only data. Even a shitty
| query can return rows in less than a millisecond where a remote
| DB that same amount of data is tens of milliseconds away.
| benbjohnson wrote:
| Author here. The comparison was meant to be about how Postgres
| (or any client/server RDBMS) is typically deployed. Yes, you
| can deploy Postgres on the same machine but I wouldn't say it's
| common. Maybe I could have expanded more on that point or
| simply referenced client/server architecture rather than
| Postgres so it didn't seem like a straw man argument.
| marcosdumay wrote:
| If I had to guess, I'd say that single-machine (with cold
| backups) is the most common way to use Postgres with a web
| server.
| morelisp wrote:
| But why take on the operational overhead of a separate DB
| server (not to mention 200 more microseconds), plus the EC2
| surcharge? I would rather run app+SQLite + dumb object storage,
| than app + MySQL + MySQL incremental backup and restore.
| kijin wrote:
| What separate DB server? I was talking about installing the
| RDBMS on localhost, right inside the server where your
| application runs. No other EC2 instance, no extra charges.
| Preferably connect to it over a Unix domain socket instead of
| TCP. That's the only way to compare SQLite performance with
| an RDBMS in an apples-to-apples way.
|
| The point about operational overhead makes sense, though, and
| IMO it's the only point in this unnecessarily long article
| that's actually worth considering. I do have a couple of
| other apps running on SQLite, so I appreciate the simplicity.
| giantrobot wrote:
| > No other EC2 instance, no extra charges.
|
| Think Lambda (or equivalent) instead of EC2.
| Dan42 wrote:
| Thank you for saying that, it save me the trouble of
| writing it.
|
| This apples-to-oranges comparison (strawman? motivated
| reasoning?) unfortunately makes me mistrust this entire
| article.
| morelisp wrote:
| > What separate DB server?
|
| It's still normal to refer to e.g. "database server",
| "application server", etc. processes even when running on a
| single machine. Re. EC2, I'm referring to the surcharge of
| having a dedicated instance at all, vs. working at the
| container / object storage level of abstraction.
| ak39 wrote:
| SQLite not supporting "stored procedures" is a deal-breaker for
| me. The idea for stored procs is not to "put the process as close
| to the data" but simply that we have a single place for language-
| agnostic encapsulation of data procedures.
| hahn-kev wrote:
| But man the maintenance and debug nightmare never seems worth
| it for that tradeoff. Not to mention vendor lockin
| chungy wrote:
| SQLite is an in-process database. If you need language-agnostic
| encapsulation of data procedures, SQLite is not for you. I
| would suggest you consider PostgreSQL.
| ketralnis wrote:
| I don't think I've ever needed language-agnostic procedures in
| a project where sqlite is also a fit. I like them both but at
| different times. I'd love to hear your use case though. Do you
| have microservices in different languages running on the same
| machine that share a db file? Or maybe a web + command line
| interface?
|
| Sqlite's internals actually _could_ support something like
| this: it has a bytecode engine
| https://www.sqlite.org/opcode.html that's more oriented around
| executing query plans and it's missing some pieces (e.g. it has
| no stack, only registers) but much of the machinery is there to
| expand it to stored procedures
| chasil wrote:
| The language for triggers in SQLite appears to be a fragment of
| SQL/PSM and Oracle PL/SQL.
|
| Perhaps this will grow into a more thorough implementation.
| pstuart wrote:
| What language(s) would the stored procedures be, and how would
| that look in keeping with the ethos of the project?
|
| Their reasoning for not doing this is not unreasonable, but it
| certainly would be cool if such functionality existed.
| bob1029 wrote:
| SQLite supports the _best_ version of "stored procedures",
| IMO:
|
| https://www.sqlite.org/appfunc.html
| andrewstuart wrote:
| I can't see any valid reason not to use Postgres at the back end,
| unless you are in some sort of environment such as embedded or
| cloudflare workers that requires it. Or if you need a graph
| database there are better choices than Postgres.
|
| Postgres is good on multi core, incredibly feature rich, multi
| user, supported by everything, lightweight and has all the tools
| for production workload and management. All stuff that is
| important.
|
| Most important difference to me being SQLite I understand lacks
| flexibility in modifying table structures.
| kentonv wrote:
| Postgres is great at what it does, but it is extremely
| inefficient for storing a small amount of data, e.g. kilobytes
| or a few megabytes. sqlite, on the other hand, scales nicely
| all the way down to a few kb.
|
| This matters for cloud in that it means with Postgres you
| cannot take a "lots of small databases" strategy, e.g. database
| per user or database per document. You pretty much have to
| group a lot of data into one big database.
|
| Many apps want to do that anyway! For them, Postgres makes
| sense. But in the growing world of global deployments and edge
| compute, the lots-of-small-databases approach is getting
| popular because it means you can store than data out on
| hundreds or thousands of edge locations, rather than a single
| central location. And many (not all) applications actually fit
| pretty well into a database-per-user or database-per-document
| model. Centralizing their storage only hurts performance for no
| benefit.
|
| As a bonus, if you are able to run sqlite compiled directly
| into your app, not making any kind of network connection, it
| can be _much_ faster than Postgres, especially in "N+1 select"
| situations (which are well-known to be a problem with most SQL
| databases, but are not a problem when using local sqlite).
| Postgres does not support running as a library like this.
| candiddevmike wrote:
| Now your app is stateful, you need storage for it, a backup
| strategy, a free space monitoring strategy, a way to have the
| storage follow the app, etc. Depending on your situation,
| that could be harder than just getting a Postgres database.
| kentonv wrote:
| Edge compute platforms aim to take care of all that for
| you.
| ngrilly wrote:
| I used to be in the stateless camp. But I think we pushed
| that argument too far. Stateless apps are not very useful.
| Most useful apps are stateful. What we are doing with
| promoting stateless services is just delegating the
| necessary stateful complexity to someone else, sweeping it
| under the carpet. By doing that, we are losing so many
| opportunities to do smarter and better things where the
| complexity really lies, which is where the state is. That's
| why I really appreciate SQLite and other approaches like
| using KV embedded databases like RocksDB making a come
| back. That's the job of the infrastructure providers, AWS,
| GCP, DO, and others, to provide the tools solving the
| problems you mentioned: block storage that is synchronously
| replicated across data centers (GCP does this), snapshots
| and backups on block storage, possibility to quickly
| reattach the block storage to a new computing node if the
| previous one died, etc.
| [deleted]
| fauigerzigerk wrote:
| _> This matters for cloud in that it means with Postgres you
| cannot take a "lots of small databases" strategy, e.g.
| database per user or database per document. You pretty much
| have to group a lot of data into one big database._
|
| You're right, it's a very interesting strategy.
| Unfortunately, last time I looked Cloudflare's D1 didn't
| support this approach as there was no API to create a
| database at runtime from Worker code. Has that changed?
| srcreigh wrote:
| Postgres doesn't support b-trees for primary storage, so data
| can't be automatically clustered, which reduces efficiency for
| joins by 50x. MySQL and SQLite don't have this issue.
| adzm wrote:
| this fact still blows me away
| bajsejohannes wrote:
| > I can't see any valid reason not to use Postgres at the back
| end
|
| I'm using sqlite in production for a backend service. There are
| plenty of downsides, but to focus on the positives:
|
| - I can run all tests in a database in memory. It's incredibly
| fast to "spin up" and I can use a separate database per test
|
| - Related, I find that I write more tests against the database
| instead of mocking a database, which cuts down on time writing
| tests.
|
| - I don't need to start a database to run the backend
|
| - I can have snapshots of the databases in a single files for
| various scenarios
|
| All in all, the development process feels a lot faster. When
| something takes a millisecond instead of seconds, you do things
| differently.
| sumtechguy wrote:
| SQLite fills a need in the market that postgres does not. Local
| single process data storage and retrieval from that with some
| structure. Postgres and all of the DB's like it kick in when
| you want more than one process involved. MS used to have Access
| that filled this need very nicely. Once you go multi
| user/process you probably do not want sqlite you will want
| something that can do ACID on a multi user level. Once you go
| more than one process to store data you can make your
| installation a larger burden than you really need to deal with.
| On the other hand I can think of maybe one or two projects
| where sqllite fit very nicely for a data store. Most of the
| other cases I had involved a 'real' db.
|
| For this case I guess they could try to bend sqlite to do this
| but the pain in doing so will probably not be worth the long
| term trouble than just using mysql or postgres or something
| like those.
| bambax wrote:
| > _I can't see any valid reason not to use Postgres at the back
| end_
|
| From TFA:
|
| > _if you don't need the Postgres features, they're a
| liability. For example, even if you don't use multiple user
| accounts, you'll still need to configure and debug host-based
| authentication. You have to firewall off your Postgres server.
| And more features mean more documentation, which makes it
| difficult to understand the software you're running. The
| documentation for Postgres 14 is nearly 3,000 pages._
| voganmother42 wrote:
| them citing the (excellent) Postgres documentation page count
| is hilarious. There is a really useful comparison to be made
| re: features / scope -- but approximating that and punishing
| a project for how comprehensive their docs are...feels like
| lines of code as a productivity measure but like 10x less
| accurate or useful lol
| benbjohnson wrote:
| Author here. My goal in the comparison was only in terms of
| scope, not that Postgres folks should be penalized for
| having good documentation. I think Postgres is great and it
| makes sense to use it when it's called for. But I think it
| can be overkill for many projects.
| voganmother42 wrote:
| Makes sense and I enjoyed the article.
|
| Estimating the complexity of using a project can be
| really...complex. I think about systems I have used which
| make it easy to use a minimal set of features and where I
| don't have to reason about or be negatively impacted by
| aspects I do not benefit from, and other systems where
| things are less easily isolated and more challenging to
| reason about.
|
| I do think the Postgres docs in particular seek to be a
| reference in addition to an operating manual and I for
| one really enjoy them. I think the point is well made
| that Postgres can be too much (or too much right now) for
| many projects.
| andrewstuart wrote:
| >> you'll still need to configure and debug host-based
| authentication.
|
| False
|
| >> you have to firewall off your server
|
| Well yes. Are you saying SQLite servers don't need a
| firewall?
|
| >> more features/more documentation/hard to understand your
| software
|
| Features lead to powerful software, documentation leads to
| understanding.
| bambax wrote:
| > _Are you saying SQLite servers don't need a firewall_
|
| I'm not the author. But I can tell you there are no "SQLite
| servers". That's kind of the whole point.
| dimgl wrote:
| > any valid reason
|
| Well, cost, right? Cost is a reason why someone may not want to
| use a traditional RDBMS. AWS RDS and GCP Cloud SQL aren't
| exactly the cheapest solutions out there.
| andrewstuart wrote:
| Postgres is free. Put it on a computer. Cost is certainly not
| an argument against Postgres cause it can run on any back end
| that runs Linux.
| arcatech wrote:
| Who pays for the computer and bandwidth?
| zaphar wrote:
| Out of not cheaper to run a postgres server than an
| embedded sqlite DB. There is more to the cost than just the
| software license.
| meitham wrote:
| I recently wrote a production system that uses SQLite as the main
| backend. SQLite is in memory in this case and its entire state
| gets rebuilt from Kafka on start. The DB receives about 2 updates
| a second, wrapped with rest api aiohttp and odata filters. It has
| been able to handle close to 9k requests/second ands it's a
| primary system in a financial institution. So yes SQLite is fully
| capable prod db.
| paulryanrogers wrote:
| So your source of truth is ... Kafka?
| jerrygenser wrote:
| If you don't need SQL (relational data), but maybe have a
| schema per topic, I've used rocksdb as a cache for latest in
| tombstones topic. It has high write throughput for rebuilding
| state when playing forward a stream
| endisneigh wrote:
| You're using SQLite _and_ Kafka? Very ironic.
| meitham wrote:
| In large organisations you often have no choice of the type
| of queue between your team and other teams. That being said
| there's nothing wrong with Kafka and the ability to seek back
| to the earliest timestamp since midnight and being able to
| rebuild our state from that is a godsend feature, in
| comparison to other queues. This means we can make our
| application stateless or at least afford to lose the state
| and be able to build it quickly from Kafka.
| endisneigh wrote:
| Kafka is just fine. I just thought it was funny that SQLite
| would be involved at all.
| morelisp wrote:
| Kafka for consistency/durability and local storage for
| speed/structure is a common architecture, and a really good
| one any time you can tolerate async writes.
| endisneigh wrote:
| Use Postgres. Or if you insist on this type of architecture use
| CouchDB. I shudder thinking about a SQLite schema migration
| across clients with potentially unknown versions.
|
| Seems like a disaster waiting to happen unless you have a bunch
| of logic centralized somewhere to keep track of last know schemas
| per user client database. And if you're going to do all that,
| unless you desperately need low latency (in which case you could
| use a multi region database like cockroach), why not just
| centralize?
| simonw wrote:
| "unless you have a bunch of logic centralized somewhere to keep
| track of last know schemas per user client database"
|
| I've been building exactly that here:
| https://github.com/simonw/sqlite-migrate
| Dwedit wrote:
| If you're running a small message board where the whole database
| fits in under 32MB, SQLite makes perfect sense.
| ricardobeat wrote:
| You might be thinking of 32 bits (2GB)? Even that limit has
| been lifted and it can easily handle multi-GB databases.
| imhoguy wrote:
| Did anybody try something like that: read/write to SQLite
| database file on backend, but also allow the database file to be
| downloaded at any time by rich JS frontend for read-only
| querying. I just wonder if the file is going to be (eventually-)
| consistent and not corrupted.
| simonw wrote:
| My hunch is that if you want to do that the safe way would be
| to have a mechanism that creates a snapshot of the SQLite
| database for the client to download when they request it.
|
| One way to do that is with VACUUM INTO, e.g. how I use it in
| this TIL: https://til.simonwillison.net/sqlite/python-sqlite-
| memory-to...
|
| If your database is less than 100MB or so I imagine this would
| easily be fast enough that the performance overhead wouldn't be
| worth worrying about.
| maxmcd wrote:
| Been feeling a little miffed about this recently. Litestream is
| excellent but if you have multiple writers your db gets
| corrupted. Quite easy to do with rolling deploys.
|
| LifeFS was announced and is intended to help this. Now seems like
| (https://fly.io/docs/litefs/getting-started-fly/) it requires an
| HTTP proxy so that the application can guess about sqlite
| write/read usage by reading the HTTP request method. This
| seems... to introduce a different (maybe better?) set of gotchas
| to navigate.
|
| There are now SQLite cloud offerings but you pay the network
| overhead and avoiding that was so much of the appeal of using
| SQLite.
|
| Are people successfully using SQLite in a work or production
| setting with a replication and consistency strategy that they
| like? I've had trouble getting a setup to the point where I can
| recommend it for use at my jarb.
| liveoneggs wrote:
| Do you use https://www.sqlite.org/cgi/src/doc/begin-
| concurrent/doc/begi... ?
| benbjohnson wrote:
| Author here. The single-node restriction for Litestream was one
| of the main reasons we started LiteFS. There isn't a way to
| handle streaming backup from multiple nodes with Litestream &
| S3 as SQLite is a single-writer system and there aren't any
| coordination primitives available with S3.
|
| I agree that many of the SQLite cloud offerings introduce the
| same network overhead. With LiteFS, the goal is to have the
| data on the application node so you can avoid the network
| latency for most requests. Writes still need to go to the
| primary so that's unavoidable but read requests can be served
| directly from the replica. The LiteFS HTTP proxy was introduced
| as an easy way to have LiteFS manage consistency transparently
| so you can get read-your-writes consistency on replicas and
| strict serializability on the primary. That level of
| consistency works for a lot of applications but if you need
| stronger guarantees then there's usually trade-offs to be made.
| bob1029 wrote:
| > if you have multiple writers
|
| Our strategy is to not attempt replication at the level of
| SQLite. We use a single binary for our SaaS product which
| shares 1 SQLiteConnection instance for the lifetime of the
| whole ordeal. Remember - every single SQLite connection
| instance is a _file system abstraction_ , not some in-
| memory/networking clever optimized thing that Postgres or SQL
| Server is managing on your behalf. Every time you open a new
| connection to SQLite you are doing some pretty heavy-duty OS
| calls, relative to just reusing a prior connection. SQLite
| itself is typically built with serialization on by default,
| which deals with multiple threads on one connection. In my
| experience, this is the most stable & performant arrangement
| (with WAL, et. al. also enabled).
|
| Our backup solution is to snapshot the entire VM (or block
| storage device) that SQLite is running on. Replication is not a
| concern because our restore strategy is to just bring back a
| snapshot if required. Our customers are ultimately responsible
| for this and typically handle it with a few clicks through AWS,
| Azure or a quick email to their private cloud provider. RPO and
| RTO is entirely in their court and all parties prefer it this
| way - them being highly-regulated banks and us being a small
| startup operating at the edge of the abyss.
|
| To this day, we have not once had to support recovery of a
| SQLite database from snapshot due to corruption or other
| weirdness. We've been at it for half a decade now.
| capableweb wrote:
| I've had success in a production capacity with using rqlite
| before. There are also a bunch of other alternatives that still
| seem to be actively maintained, although I've only used rqlite
| myself before:
|
| - https://github.com/canonical/dqlite
|
| - https://github.com/rqlite/rqlite
|
| - https://github.com/Expensify/Bedrock
| morelisp wrote:
| > Litestream is excellent but if you have multiple writers your
| db gets corrupted.
|
| Isn't this not only well-documented, but (restricting to a
| single writer to avoid distributed systems issues while still
| making it easy to move that single writer around) sort of the
| whole point?
| aaviator42 wrote:
| My org's apps heavily use this simple key-value interface built
| on sqlite: https://github.com/aaviator42/StorX
|
| Handles tens of thousands of requests a day very smoothly! :)
| k_vi wrote:
| I've been using turso.tech for my current side project project
| and happy with it so far. iirc, their sqlite is deployed using
| fly.io too.
| declan_roberts wrote:
| I really worry about split-brain with sqlite. These replication
| features just seem too immature for me.
|
| That being said I love sqlite and it should be the DEFAULT
| database with any application until something is demanded
| otherwise.
| sergioisidoro wrote:
| A few years I made a decision to ship a SQLite database in an
| (internal) ruby on rails package. Why? Because there was a large
| set of (static) data that was required for the package to work,
| and it made no sense to make an API to query it from external
| sources (It wasn't that big, something like 5-10Mb if I recall).
| At the time it felt like a super dirty hack, but time seems to
| have validated that decision :)
| jmull wrote:
| I'm bullish on SQLite, and this is mostly a great article, but
| this kind of stuff is flat-out misleading:
|
| > When you put your data right next to your application, you can
| see per-query latency drop to 10-20 microseconds.
|
| As if postgres and others don't have a way to run application
| logic at the database. I like the SQLite way of doing it -- you
| pretty much freely choose your own host language -- anything with
| a decent SQLite client will work. While in postgres, for example,
| you'll probably end up with pgplsql (there are others, but there
| are constraints). So this isn't about latency, as the whole
| section of the article suggests.
|
| There's actually a relative weakness in SQLite here, since it
| doesn't include a built-in protocol to support running
| application logic separate from the database. That's also
| architecturally useful, and so you may have to find/build a
| solution for this.
|
| Just adding replicas isn't a general solution either, because
| each replica has an inherent cost: changes have to somehow get to
| every replica.
|
| E.g., systems can grow to have a lot of database clients. In
| traditional setups you begin to struggle with the number of
| connections. You might think with SQLite, "hey, no connections,
| to problems!" but now, instead of 1000 connections you've got
| 1000 replicas. That's something you're going to have to deal
| with... that's 1000x write load, 1000x write bandwidth.
|
| Perhaps fly.io has a solution for this, but I suspect it's going
| to cost you.
| sodapopcan wrote:
| > As if postgres and others don't have a way to run application
| logic at the database.
|
| I think it's reasonably fair of them not to specify this. The
| target audience of this article is people who are writing their
| applications in languages like Elixir, JS, Ruby, Python, and
| are not going to be interested in pushing all of their business
| logic to the db.
| JohnBooty wrote:
| As if postgres and others don't have a way to run
| application logic at the database.
|
| I mean...
|
| This is probably the least popular possible thing you can
| possibly suggest as an engineer in 2023.
|
| Me? I actually think pushing app logic to the DB is a solid,
| underrated, and possibly even optimal solution for a lot of
| scenarios.
|
| But don't tell anybody I said that. I might get beaten up.
|
| That's probably why fly.io sort of glosses over it as a
| possibility. Almost nobody is even considering it as an option
| in 2023.
| vsareto wrote:
| >Me? I actually think pushing app logic to the DB is a solid,
| underrated, and possibly even optimal solution for a lot of
| scenarios.
|
| The languages for writing it are not as comfy as traditional
| programming languages, which affects how expressive and
| maintainable your code will be. The tools for debugging a
| regular language might also be better than debugging
| application logic in SQL. Having done some of this in T-SQL,
| it's very tedious day-to-day compared to writing C#.
| marcosdumay wrote:
| > Having done some of this in T-SQL
|
| Don't expect your experience with any other DBMS to give
| you an idea about how nice it is to program in Postgres.
|
| It's still not as nice as creating some independent code.
| But Postgres is quite nice to program in.
| dangets wrote:
| Genuinely curious what your experiences of postgres
| programming you are fond of. Are you talking about
| functions & procedures in pgsql or are you using an
| extension to enable a different language?
|
| Do you have any interesting blog links?
| sodapopcan wrote:
| I run a bunch of business logic in the database but have no
| interest in writing my whole application as triggers or
| whathaveyou. It's a bit of a middle ground, perhaps?
| JohnBooty wrote:
| How did you decide which bits of business logic would live
| in the database as opposed to at the application layer(s)?
| no interest in writing my whole application as triggers
| or whathaveyou. It's a bit of a middle ground, perhaps?
|
| Yeah, I think our position is often misunderstood as "put
| _everything_ in the database layer. "
|
| That is definitely not how I think. I just think moving
| stuff to the database layer is one possible tool in the
| toolshed.
|
| One thing I like to use it for is when it's a "low level"
| database concern, like generating an audit trail whenever a
| particular table is changed. To me, populating that audit
| trail is clearly a database concern and not an application
| concern. (And if there are heavy writes to that table, the
| performance difference may be large)
| jmull wrote:
| > How did you decide which bits of business logic would
| live in the database as opposed to at the application
| layer(s)?
|
| Not the previous poster but...
|
| I would think of it as an application layer that is
| running at the database.
|
| It would make the most sense for logic that is closely
| coupled to data access... exactly what that is depends on
| your app. Low-level access control policies... maybe you
| need to dig data out of a bunch of tables and turn it in
| to a hierarchy based on complicated user prefs also
| stored in the database... Or the opposite where you have
| complex data coming in that needs to be written to a
| bunch of tables, especially when there's back and forth.
|
| I don't think very many should try to put their whole app
| in there. Database compute tends to get expensive and
| complicated to scale, so stuffing things in there just
| because you can might run in to trouble. Not to mention
| (except for SQLite) the runtime environment is seriously
| constrained in all kinds of ways.
|
| The previous post mentions triggers but I don't know what
| those have to do with this. The chance that triggers are
| right solution to any given problem is approximately 0%
| in my experience.
| sodapopcan wrote:
| I think this is essentially what I meant? In any event, I
| responded a little clearer but ya, shaping data and
| moving it around is a big thing I feel should be done in
| the DB. Having spent time in the rails world for several
| years, I worked with a lot of people who didn't want to
| do ANYTHING at the DB level. Like, obviously they'd do
| joins and things ActiveRecord could do easily, but they'd
| be fine pulling in a bunch of rows and reducing them in
| Ruby. That drove me nuts. I'm not an optimization junky
| at all but I do not like that in the slightest. Not only
| is it wasteful and slower, in my experience it's also
| more error prone.
| sodapopcan wrote:
| All aggregating and all math I do in the db. Stuff like
| taxes, royalties, any reporting stuff (that should
| probably be in a column store). It may seem like a no-
| brainer to do that but you'd be surprised. I worked at a
| fintech briefly and ALL the math was done in Elixir which
| surprised me. I also usually like to do permissions in
| SQL, ie, selecting a record based on a permission as
| opposed to grabbing the record then checking (I go back
| and forth on this). I do like to have everything that is
| going on represented in the application code, though, so
| I've never actually written a trigger in my life... I did
| write a stored procedure once for generating unique
| product codes... I never said I was a particularly good
| engineer, lol. But ya, even in the case of a cascading
| delete or something I still like to spell that out in
| application code so there are no surprises. Of course,
| I've never gotten to work on a team where everyone felt
| this way.
|
| Audit trails seems like an interesting case I have
| thought about before (I have been a part of building
| audit trails at the app level before). It's something
| I've been curious about but never looked into. I think it
| depends on how detailed they have to be. Like, just
| creating a history of every table seems a little wasteful
| to me, but you're probably talking more nuanced than
| that. I did talk to someone who worked somewhere where
| that was the strategy.
| Micoloth wrote:
| Honest question.. Why?
|
| I'm always thinking that a db that can also run business
| logic would be the ultimate backend solution for crud apps.
|
| I know there are several options to do it, but I always
| assumed Postgres did not support it.
|
| What do people have against it?
| xboxnolifes wrote:
| The tooling around maintaining logic-in-db is worse than
| the tooling for logic-out-of-db.
| __jem wrote:
| Friction with modern devops practices is a big one.
| marcosdumay wrote:
| Postgres has plugins for running the entire backend.
|
| People don't like it for a lot of reasons. Making privilege
| escalation harder is a big one, but also, all the CPU (and
| memory) load on serializing that data is CPU that could be
| used managing the distributed processes problems that only
| the DBMS can solve.
|
| Personally, I think access management on those tools needs
| to improve a lot before they get usable. But also, the
| data-oriented languages have some issues, and the non data-
| oriented ones don't gain much by running inside the
| database.
|
| IMO, we are missing a really good data-oriented language.
| But I don't see any gain from running it inside the
| database.
| JohnBooty wrote:
| After several decades in the business, I have learned not
| to underestimate the massive power of trendiness.
|
| A lot of the argument against it is: _SQL is stodgy and
| uncool, and I want to use Cool Language XYZ._ And honestly,
| I don 't entirely blame people for that. If you want to
| work in this industry you need to do things the "cool" way
| or you'll never even be considered for roles. Gotta keep
| that resume looking good.
|
| People also don't appreciate how performant it can be.
| Depending on what you're doing it can be orders of
| magnitude more performant to do things in-database versus
| shuffling things back and forth. (It can also be _less_
| performant...)
|
| There are definitely cons to that approach.
|
| One is scalability. It's easier to scale your app layer
| horizontally than it is to scale your database server
| vertically. This isn't necessarily an issue: a modern beefy
| server CPU with 64-128 cores and a TB or two of RAM is more
| than 99.9% of companies need, is really not that expensive,
| and is probably a lot cheaper than more complicated setups
| and extra devops headcount. But that's not cool either.
|
| Two is the language/skills mismatch. You've got an app
| layer in one language, a frontend in another, and now
| potentially a data storage layer written in a third. This
| is a valid concern, but also nobody seems to use it as an
| argument against Javascript frontends, so apparently
| sometimes it's cool and sometimes it isn't.
|
| Debugging stored procedures sort of sucks. That's fair.
| (But also, nobody is saying to rewrite your entire app, or
| even most of it, in the DB layer)
|
| Common migration tools often don't really have _explicit_
| support for stored procs and things like that, but AFAIK
| they do let you run arbitrary SQL DDL stuff, so I don 't
| think this is a hard barrier.
| simonw wrote:
| I think the main thing is that most people still aren't
| working with a good migrations system to manage changes to
| their schema... which means logic held in database triggers
| and stored procedures quickly becomes a non-version-
| controlled not-properly-tested mess.
|
| Good migration systems exist, and people should use them!
|
| I held off on doing interesting things with triggers for
| more than a decade. In the past year I've started leaning
| into them much more heavily (actually using them in SQLite)
| because I have confidence that I can both write good tests
| for them and have good migrations automation in place for
| version-controlling my schema.
| [deleted]
| candiddevmike wrote:
| Few reasons:
|
| - Enforce application logic with constraints so you don't
| have to duplicate it
|
| - Use triggers and get access to things like old and new
| without having to create a bunch of transactions
|
| - If you have multiple apps sharing a DB, you either keep
| your business logic consistent across them vs just doing it
| in the database
|
| - You can version your business logic with your schema
| TylerE wrote:
| That whole series of blog posts is an ad for fly.io.
| [deleted]
| erulabs wrote:
| I hope fly is able to make it. I'm rooting for them - however -
| I'm starting to wonder if the SQLite push isn't more "this is fun
| and interesting to build" and less "customers want this".
|
| Don't get me wrong - this is neat - but I'd never suggest anyone
| to actually use this outside of a fun experiment. The problem
| with existing SQL dbs isn't really the architecture - its the
| awful queries that do in memory sorting or make temporary tables
| for no reason or read-after-write, etc, not network latency.
| SQLite won't fix your current production problems.
|
| If it turns out they're building this for customers throwing cash
| at them, awesome. I just somehow doubt it. I think Planetscale
| has the better approach: a drop in replacement for MySQL/RDS with
| a smarter query planner. As a production engineer that's what I
| want to pay for!
| laurencerowe wrote:
| > The problem with existing SQL dbs isn't really the
| architecture - its the awful queries that do in memory sorting
| or make temporary tables for no reason or read-after-write,
| etc, not network latency. SQLite won't fix your current
| production problems.
|
| In my experience SQL databases are pretty poor at executing the
| kinds of deeply nested joins needed to return all of the data
| needed to render more complex UIs. It almost always ends up
| being faster to simply make nested selects in batches. So
| latency ends up mattering in these cases.
|
| You can work around this by denormalizing the data, but this
| often explodes your data size.
| ngrilly wrote:
| Which is exactly where SQLite shines, but I guess that was
| your point: https://www.sqlite.org/np1queryprob.html
| Xeoncross wrote:
| Replace SQLite with Excel and read it again.
| teaearlgraycold wrote:
| If you want a read only database that's too large to simply be
| a JSON file hosted on a CDN it makes sense. That's kind of
| niche but not unheard of.
| jjtheblunt wrote:
| I think you're skipping the replicated read only use case,
| which is our use case, and it's super handy there. but i
| understand this is a restricted scenario where little could
| really go wrong, and it could be done other ways.
| sodapopcan wrote:
| Lots of smaller businesses could do fine with this if they
| don't have a write-heavy workload. Like an ecomm shop, for
| instance.
| hinkley wrote:
| Lots of small businesses invent write heavy workloads and
| don't realize how many thousands of dollars they are spending
| a month on being nosey.
| nitwit005 wrote:
| If you're a smaller e-commerce business, your whole site can
| probably be cached aside from auth, checkout, and order
| history.
| sodapopcan wrote:
| I can say with authority that this is true! We used to
| store our whole massive catalog in Varnish.
| lib-dev wrote:
| Yeah it seems to make a lot of sense in ecomm. Product search
| and filtering on tables in the 1000s rather than the
| millions.
| endisneigh wrote:
| Any self respecting e-commerce site would want fault
| tolerance and strong consistency even with potential network
| partitions, so definitely not SQLite as described in article
| sodapopcan wrote:
| They aren't necessarily going to have all this. Lots of
| smaller ecom shops can run on a single server per region.
| If you're a North American company selling a few hundred
| t-shirts per day in NA and EU, it could probably be fine,
| no? I'll admit I'm not speaking from experience. Rather, I
| have experience in everything I just said only I was using
| pg, not sqlite. But I've been very interested in sqlite
| recently.
| endisneigh wrote:
| It's not about the size. Issues result in lost sales
| which means lost revenue. You would want fault tolerance
| regardless of how big you are.
| bootsmann wrote:
| There is a cost to fault tolerance as well, if you lose
| $100 per month due to fault tolerance thats worth about
| 30 developer minutes. Do you really get good fault
| tolerance for 30m of monthly work?
| erulabs wrote:
| You do with RDS and/or other hosted database services,
| hence their popularity.
| sodapopcan wrote:
| That's something for me to look into. Like what would it
| would look like in each? I've never been the dba-type in
| past jobs. I'm relatively well-versed in SQL but not
| administration.
| [deleted]
| [deleted]
| emodendroket wrote:
| Why even build your own e-commerce Web site at all in
| that case? It's undifferentiated work.
| bcrosby95 wrote:
| I don't know about nowadays.
|
| 8 years ago we built our own ecommerce site + warehouse
| app (inventory tracking, fulfillment, receiving) for a
| few hundred orders per day. The goal was to be able to
| better see profit margins by product, track where the
| money was going/coming from in detail, along with
| cleaning up the inventory management part of the
| operation.
|
| The warehouse people loved the change because it really
| streamlined the fulfillment process and it basically
| reduced their errors to zero. Owners loved it because of
| the all the cost/price tracking. It took about 6 months
| of work by 2 devs.
|
| Before this they were on woocommerce. It was slow as
| shit. We looked at shopify but the integrations with 3rd
| party warehouse stuff was bad. The other solutions we
| looked at were overly complicated swiss army knives.
| sodapopcan wrote:
| This is very close to my experience as well.
| Unfortunately, it was in reverse for me :( We had a
| custom solution and people were happy. A new tech lead
| came in and didn't like that the custom solution was PHP
| that still had some legacy spaghetti in it, so we
| switched to an off the shelf solution. It was very
| painful. People were unhappy.
| evantbyrne wrote:
| As with anything, the typical business can use something
| off-the-shelf, but a certain percentage are doing things
| differently enough that custom development becomes
| practical. I had to custom build the billing system for
| Beaker Studio to properly meter customers. Even Stripe's
| metering API wasn't flexible enough to handle per-hour
| metering.
| emodendroket wrote:
| I find it hard to imagine a reason a shop selling 100 tee
| shirts in a day would need any custom functionality since
| this is basically the exact use case all these OOB
| e-commerce tools are built for.
| sodapopcan wrote:
| I've worked for such a place. Off-the-shelf solutions are
| trying to be everything to everyone and you can end up
| customizing the crap out of them to the point where it
| can become more onerous than just building your own. The
| place I worked was also print-on-demand service and had
| hundreds of thousands of SKUs as well as allowed
| customers to make custom products and we also hosted some
| peoples' shops. Shoehorning that into a custom solution
| was painful.
|
| I work at a very similar place now that uses Shopify.
| Managing that many SKUs on Shopify is crazy painful.
|
| The thing is is that custom ecom solutions really aren't
| that hard. The off-the-shelf ones are complex because, as
| stated above, they are trying to be everything to
| everybody.
| evantbyrne wrote:
| Yeah, you're probably right about that. Main thing that
| would likely be custom would be the design.
| sarchertech wrote:
| One of the first commercial projects I worked on nearly
| 20 years ago was a tshirt shop. And the precious company
| I worked for was a comparatively huge logistics startup.
|
| I'm fairly confident I could spend a month or so writing
| a custom solution for a shop selling and shipping a few
| hundred tshirts a day that would save them enough money
| to break even on the software in a few years (compared to
| off the shelf solutions).
|
| If I was starting my own tshirt company, I'd definitely
| do it.
| MyneOutside wrote:
| In our B2B space there are tons of custom business rules.
| For example, you place orders per manufacturer and each
| manufacturer has its own minimum and reorder amounts and
| reqs on an order, promotions, business rules, etc. This
| is for business that have been around 25-30 years.
|
| We did an evaluation on several out of the box ecommerce
| solutions and none of them were able to meet the
| requirements that absolutely had to be there. Shopify
| flat out said no, they can't help us, etc.
| parineum wrote:
| > the typical business can use something off-the-shelf
|
| I think we generally have this problem. It's not the
| typical business that can use that stuff, it's the
| average business. Unfortunately no business is the
| average business.
| oooyay wrote:
| I recently implemented SQLite for a metadata application at
| work. My constraints in design were pretty concise; the
| application will only ever need to scale vertically, my data
| persistence story is a matter of speed and accessibility over
| longevity, I have plenty of options for scaling disk IO, and my
| read performance is much more paramount than write performance.
|
| The outcome is that my persistence later is not adding $200/m
| immediate service overhead cost and my deployment was easy to
| manage, which is a strong promise made to the rest of the team.
| I think there's a place for SQLite, but just like any tool you
| need to know that your design constraints match the constraints
| of the tool.
| Varriount wrote:
| > It's the awful queries that do in memory sorting or make
| temporary tables for no reason or read-after-write, etc.
|
| Are there any viable alternatives though? I often wonder what
| an SQL-like language built from the ground up would look like.
| erulabs wrote:
| The alternative is to write good queries! Alternatively, a
| better query planner that has a more liberal approach to "I
| know what you want, not what you're asking for". A great
| article on this re: Planetscale's approach is at
| https://vitess.io/blog/2021-11-02-why-write-new-planner/
| mattgreenrocks wrote:
| I'm excited about SQLite for web apps if only because it is one
| less moving piece in my stack, which focuses on prototyping and
| finding product market fit.
|
| If I start hitting hundreds of writes per sec, then, thats
| either awesome or I wrote some horrible code.
| erulabs wrote:
| This is exactly where I think SQLite shines! Unfortunately, I
| don't get to do much green-fielding in my career.
| eek2121 wrote:
| The first iteration of my website used sqlite. I only switched
| away because the growing audience made me nervous.
| WuxiFingerHold wrote:
| I can't imagine many suitable production use cases too.
| Alternatives like Planetscale, Supabase or Neon are even easier
| to use and at the same time much more powerful. If latencies
| around 50 ms are too much for the specific use case, SQLite
| with Litestream could be a great solution. Otherwise I'd go
| with managed Postgres/MySQL solutions.
| nik736 wrote:
| Why would I use SQLite over PostgreSQL for regular CRUD apps?
| simonw wrote:
| https://www.sqlite.org/np1queryprob.html is one of my favourite
| answers to that question.
| PKop wrote:
| 2022
| tuukkah wrote:
| Latest development on this line of work was a week ago:
| https://fly.io/blog/skip-the-api/
|
| Discussion: https://news.ycombinator.com/item?id=37497345
| NoahKAndrews wrote:
| Thanks, I was thinking that the real-time replication features
| of Litestream had been dropped in favor of LiteFS.
| sigmonsays wrote:
| I dont want this to be taken the wrong way but I read about
| fly.io and sqlite atleast once a week.
|
| Who is using this and why is it such a hot topic on HN?
| matlin wrote:
| If you need multiple writers and can handle eventual correctness,
| you should really be using cr-sqlite[1]. It'll allow you to have
| any number of workers/clients that can write locally within the
| same process (so no network overhead) but still guarantee
| converge to the same state.
|
| [1] https://github.com/vlcn-io/cr-sqlite
| eternityforest wrote:
| I don't see any timestamps in the data. If two peers write to
| the same row, does it not use latest-wins logic?
| jeromegn wrote:
| There's a col_version column in a clock table used for last-
| write-wins. In case a tie, the "biggest" value wins.
| greatNespresso wrote:
| While different than the approach offered by Litestream, I am
| fairly excited by the direction of Cloudflare D1, making SQLite
| available at the edge without having to manage anything. Still in
| alpha but worth looking at if you're looking for cheap cloud
| option.
| bradgessler wrote:
| Related: I wrote a piece last week on deploying Rails apps to
| production on Fly.io at https://fly.io/ruby-dispatch/sqlite-and-
| rails-in-production/
|
| The work that's made this possible is:
|
| 1. Litestack https://github.com/oldmoe/litestack runs everything
| on Sqlite
|
| 2. Fly.io's work on the dockerfile-rails generator detecting
| Sqlite and Litestack in a Rails project, then setting up sane
| defaults for where that data is stored and persisted in
| production. This is all done behind the scenes with no
| intervention required from the person deploying.
|
| 3. Servers are overall faster and more powerful
|
| I hope more Rails hosts make it easier and safer to deploy Sqlite
| to production. It will lower costs and reduce complexity for
| folks deploying apps.
___________________________________________________________________
(page generated 2023-09-22 23:02 UTC)