[HN Gopher] SQLite or PostgreSQL? It's Complicated
___________________________________________________________________
SQLite or PostgreSQL? It's Complicated
Author : makaimc
Score : 175 points
Date : 2022-06-28 14:33 UTC (8 hours ago)
(HTM) web link (www.twilio.com)
(TXT) w3m dump (www.twilio.com)
| danaugrs wrote:
| How about PostgreSQLite? Joke aside, is anyone building something
| that takes a stab at fixing the issues with both?
| philliphaydon wrote:
| It's difficult to accept the results without looking at the query
| plans to see how the queries are running and if they are optimal.
| Seems like it's just a straight dump of data into PostgreSQL and
| letting SQLAlchemy handle some queries and indexes but no
| analysis of the indexes at all.
|
| You could probably squeeze more perf out of both SQLite and
| PostgreSQL.
| jayski wrote:
| exactamundo
| miguelgrinberg wrote:
| The point of this exercise was to determine how these
| databases perform with the same schema, and whether it makes
| sense to make the jump from SQLite to PostgreSQL or not. As
| a side note, and since
| you ask, the indexes on this database are fairly well thought
| out, I have exhausted all ideas on indexing improvements to
| help with performance. I do mention a major implementation
| enhancement which is to precalculate monthly totals.
|
| Also, another important point I tried to make is that this
| benchmark is useful for this particular deployment. I don't
| really want to convince you that the numbers I've got will
| also apply to you; my proposal is that you should test your
| own application and make decisions based on your own results,
| which can certainly be different from mine.
| philliphaydon wrote:
| While the indexes /may/ be pretty well thought out, that
| doesn't mean they are useful.
|
| Indexes don't work the same between SQLite and PostgreSQL so
| a 1:1 transition between the two databases on indexes
| probably isn't going to yield the best results.
|
| For example UUIDs were stored incorrectly in PostgreSQL.
|
| SQLite does not enforce varchar lengths, and typically in
| PostgreSQL you just use the text type and don't impose any
| length restrictions unless absolutely necessary.
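A quick sketch of the varchar point above (my own illustration, not from the thread), using Python's built-in sqlite3 module: SQLite treats a declared varchar(10) as mere TEXT affinity and happily stores longer strings, whereas PostgreSQL would reject them with a length violation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name VARCHAR(10))")

# 36 characters -- far past the declared limit, yet SQLite accepts it.
conn.execute("INSERT INTO users (name) VALUES (?)", ("x" * 36,))
stored = conn.execute("SELECT length(name) FROM users").fetchone()[0]
print(stored)  # 36: the varchar(10) declaration was not enforced
```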
|
| PostgreSQL can use multiple indexes in a single query, which
| may help with the 2 GB RAM limit.
|
| PostgreSQL supports partitioning which could help with the
| range queries.
|
| All I'm saying is that, imo, the performance you get from
| SQLite could be better, and the performance from PostgreSQL
| a lot better.
| pphysch wrote:
| > typically in PostgreSQL you just use text type and not
| impose any length restrictions unless absolutely necessary.
|
| There was a horror story posted on here a while back about
| some old API causing an outage because there happened to be
| a random text field that a user was effectively using as an
| S3 bucket. Their conclusion was to _always_ set a sane
| length constraint (even if very large, say 1MiB for the
| smallest fields) on field sizes.
|
| Hyrum's Law. Yes user input validation, blah blah, but that
| is infinitely more surface area than having a sane (or
| absurdly large) limit by default.
| jeffdn wrote:
| Right, but use a check constraint and not a data type
| constraint (i.e. `varchar(x)`).
|
| That sounds more like a code review issue -- `blob`
| would've been an appropriate storage type in that case.
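A minimal sketch of the suggestion above (table and column names are hypothetical): enforce a length limit with a CHECK constraint rather than a varchar(n) type. Demonstrated here with SQLite, where CHECK constraints, unlike declared varchar lengths, are actually enforced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT CHECK (length(body) <= 100))")

conn.execute("INSERT INTO notes VALUES (?)", ("short note",))  # fine
try:
    conn.execute("INSERT INTO notes VALUES (?)", ("x" * 101,))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the over-limit row is refused
print(rejected)  # True
```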
| pphysch wrote:
| > That sounds more like a code review issue -- `blob`
| would've been an appropriate storage type in that case.
|
| No, it was supposed to be a short text field. Not an
| infinite storage endpoint!
| Pv9KKy3 wrote:
| I use and prefer varchar(1000) personally, extremely easy
| to change in the future if needed.
| trollied wrote:
| There are also a shitload more indexing options available
| to you in Postgres.
| BeefWellington wrote:
| > As a side note and since you ask, the indexes on this
| database are fairly well thought out, I have exhausted all
| ideas on indexing improvements to help with performance. I do
| mention a major implementation enhancement which is to
| precalculate monthly totals.
|
| So, this is what Data Warehousing is all about:
| reporting-related optimizations (typically what monthly totals
| and such are aimed at). The idea being that you trade disk for
| speed, so that kind of optimization is actually pretty normal
| if you've identified it as a bottleneck. There's no need
| necessarily to separate data warehousing constructs into
| another DB, though it's often good practice because sooner or
| later you're going to start running into problems on your
| important system related to storage inflation and non-indexed
| sorts/filters.
| sodapopcan wrote:
| I'm surprised yours is the only comment mentioning this. I
| don't have quite enough knowledge to speak up about it here,
| but it seems that while this article is about SQLite vs
| PostgreSQL, really neither is the ideal answer in this
| scenario.
| msbarnett wrote:
| > Once again, SQLite happily stored my UUIDs on this column and
| even indexed them, but to get these across to PostgreSQL I needed
| to fix the schema and properly declare these keys as strings.
|
| In PostgreSQL you're leaving performance on the table if you
| store UUIDs as strings instead of as the built in UUID type
| (128-bit value under the hood)
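A back-of-the-envelope illustration of that size difference (my own, not from the thread): the native uuid type stores the raw 128-bit value, while a text column stores the 36-character hex-and-dashes form.

```python
import uuid

u = uuid.uuid4()
as_text = str(u)      # the 'xxxxxxxx-xxxx-...' textual form
as_binary = u.bytes   # the raw 128-bit value

print(len(as_text))    # 36 bytes per key stored as text
print(len(as_binary))  # 16 bytes per key stored as a native uuid
```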
| cryptonector wrote:
| Plus PG and SQLite3 have fundamentally different approaches to
| types.
|
| SQLite3 has very few types, you cannot create new types, and is
| duck-typed.
|
| PG has many built-in types, you can create new types, and is
| (mostly) statically typed.
|
| You really have to know this going in. If you have to support
| both, you'll probably have to write two versions of your SQL --
| or you'll end up using an ORM that supports the two RDBMSes,
| and then you may get stuck with the ORM.
|
| SQLite3 is amazing, but it's mainly amazing because of how easy
| it is to embed in apps. If that's what you need, use SQLite3.
| If you need more power, if you're building a server, then think
| long and hard about why not just go straight to PG.
|
| EDIT: SQLite3 is amazing for other reasons too.
| m-p-3 wrote:
| And SQLAlchemy made the process of migrating from one to
| another quite simple too.
|
| https://www.sqlalchemy.org/
| password4321 wrote:
| > As of version 3.37.0 (2021-11-27), SQLite provides STRICT
| tables that do rigid type enforcement
|
| https://www.sqlite.org/datatype3.html
| m-p-3 wrote:
| Thanks! Here's the link to the actual details
|
| https://www.sqlite.org/stricttables.html
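A small sketch of what STRICT changes (requires SQLite 3.37+, which ships with recent Python builds; the guard below skips the demo on older versions): inserting a non-numeric string into an INTEGER column raises an error instead of being silently stored.

```python
import sqlite3

rejected = None  # stays None if the runtime SQLite predates STRICT tables
if sqlite3.sqlite_version_info >= (3, 37, 0):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, name TEXT) STRICT")
    try:
        conn.execute("INSERT INTO t VALUES (?, ?)", ("not-an-int", "a"))
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True  # rigid type enforcement at work
print(rejected)
```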
| cryptonector wrote:
| I'm aware of this, and it's a very welcome addition, though
| the types supported are still very limited. But...
|
| > though new types might be added in future releases of
| SQLite
|
| Intriguing! Can't wait for SQLite3 to have `CREATE TYPE`!
| anothercat wrote:
| Is it really duck typed if no static type checking is
| performed on the SQL? Duck typing to me seems to imply
| checking quacks and waddles at "compile" time. I would think
| it's at most loosely typed and at worst dynamically typed.
|
| EDIT: Without the new strict mode of course, where it is not
| duck typed but typed by name.
|
| EDIT 2: Actually maybe duck typing is fine to use for dynamic
| languages too. The affinity of INT keys and stuff is sort of
| duck typing I guess.
| kjeetgill wrote:
| You're getting downvoted because the most well known duck-
| typed languages, and probably where people first learned
| the term from, are all examples of very dynamic languages
| like python, ruby, and even js.
|
| I'm sure there are others, but the main language anyone
| knows static duck-typing from is Go. In Go, at least, the
| interfaces get checked against implementations based on what
| instances are bound to what interface variables or
| parameters at compile time.
| kevin_thibedeau wrote:
| Just to add more detail. SQLite was designed for Tcl
| interop. Its typing model reflects that.
| miguelgrinberg wrote:
| For this particular application I discuss in the article it
| would make no difference, since there is only one not heavily
| used table that uses UUIDs. All the other tables use integer
| primary keys.
|
| Also, this may not be a very popular opinion, but I prefer
| to design the database without using database-specific
| features: things such as using SQLAlchemy and sticking to
| basic types.
| Sure it may affect performance some, but thanks to having done
| that I was able to port the app from SQLite to PostgreSQL in
| like an hour.
| lenkite wrote:
| Prefer this approach too. Makes unit testing easier and makes
| data more portable. Stick to the commonly supported set of
| SQL data-types.
| PeterisP wrote:
| You can't completely avoid using database-specific features -
| even the most basic types such as strings effectively require
| database-specific treatment at least if you want strings that
| are long and/or unicode.
| klysm wrote:
| > Also, this may not be a very popular opinion, but I prefer
| to design the database without using database-specific
| features.
|
| I prefer to leverage the DB schema to do as much as possible
| for me. Generally with PostgreSQL that means using a lot of
| domain types etc.
| msbarnett wrote:
| > Also, this may not be a very popular opinion, but I prefer
| to design the database without using database-specific
| features.
|
| The problem with that is that SQL as a standard is so patchy
| and inconsistently implemented that effectively everything
| has database-specific features. You discovered that yourself
| when you inadvertently relied on SQLite's "feature" of
| letting you store UUID strings in an integer column. No
| matter how portable you imagine your schema is, it will work
| very, very differently between any two SQL implementations.
|
| I can't see any value whatsoever in voluntarily tying one
| hand behind your back performance-wise, honestly. It's not
| like changing "text" to "uuid" would have noticeably
| increased the time this port-job took you.
|
| As a wise man once said, if you're doing a job manually while
| the tool designed for that job is sitting idle beside you,
| the tool is smarter than you are.
| sgbeal wrote:
| > The problem with that is that SQL as a standard is so
| patchy and inconsistently implemented that effectively
| everything has database-specific features.
|
| Noting that when it comes to such patchiness, sqlite's
| developers specifically prefer to follow PG's example. When
| they write new SQL features, their basis for comparison is
| literally always "do we get the same answer as PG?" Yes,
| there are differences between the two, but porting between
| sqlite and PG, provided no fancy PG features are involved,
| is very likely simpler than porting between any two other
| arbitrary SQL engines. (That's ignoring any performance
| factors - i'm only talking about semantic compatibility.)
| masklinn wrote:
| There's also no word about indexing or analysis (and postgres
| is _very_ sensitive to having up-to-date statistics).
|
| And on sqlite you might be leaving memory / storage on the
| table if you use a UUID as PK and don't remove the implicit
| ROWID.
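A sketch of that last point (illustrative table name): declaring a SQLite table WITHOUT ROWID drops the implicit rowid, so a text/UUID primary key is not stored alongside a hidden second key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE k (id TEXT PRIMARY KEY, v TEXT) WITHOUT ROWID")
conn.execute("INSERT INTO k VALUES ('abc-123', 'hello')")

try:
    conn.execute("SELECT rowid FROM k")
    has_rowid = True
except sqlite3.OperationalError:
    has_rowid = False  # no implicit rowid column to select
print(has_rowid)  # False
```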
| miguelgrinberg wrote:
| I did not feel that was a relevant topic for this article. I
| have spent a good amount of time over the life of this
| application optimizing indexes, though. But the article is
| about how I designed a test to benchmark these two databases,
| not a general discussion on all the optimizations you can
| make.
| phibz wrote:
| In postgres there's usually no good reason to use varchar
| unless you specifically want to enforce size limits. If you
| don't care, use TEXT; the performance is the same.
| mikece wrote:
| I really wish WordPress supported using SQLite as its
| database instead of MySQL, as this could make backups as
| simple as rsync'ing from the prod server to backups.
| herbst wrote:
| To be fair, exporting and importing databases is a simple
| command on MySQL (or postgres), so it's only one command
| more than rsyncing the sqlite file.
| lazide wrote:
| Wouldn't you run the (real) risk of a corrupted backup?
| edbaskerville wrote:
| Yeah. SQLite's backup API is the right way to do this:
|
| https://sqlite.org/backup.html
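Python's sqlite3 module exposes this online backup API via Connection.backup(), which copies a live database safely, page by page. A minimal sketch:

```python
import sqlite3

src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE t (x INTEGER)")
src.execute("INSERT INTO t VALUES (42)")
src.commit()

# In real use the destination would be a file on the backup target.
dst = sqlite3.connect(":memory:")
src.backup(dst)  # consistent copy even while src is in use

copied = dst.execute("SELECT x FROM t").fetchone()[0]
print(copied)  # 42
```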
| ElectricalUnion wrote:
| And, if you can't use the backup API for some reason, try
| VACUUM INTO:
|
| https://www.sqlite.org/lang_vacuum.html#vacuum_with_an_into
| _...
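VACUUM INTO (SQLite 3.27+) writes a clean, consistent copy of the database to a new file, a one-statement alternative to the backup API. A small sketch (the backup path is illustrative):

```python
import os
import sqlite3
import tempfile

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (7)")
conn.commit()  # VACUUM cannot run inside an open transaction

backup_path = os.path.join(tempfile.mkdtemp(), "backup.db")
conn.execute(f"VACUUM INTO '{backup_path}'")  # path contains no quotes

restored = sqlite3.connect(backup_path).execute(
    "SELECT x FROM t").fetchone()[0]
print(restored)  # 7
```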
| simonw wrote:
| There's a WordPress plugin that adds support for SQLite as an
| alternative to MySQL.
|
| The implementation is (to my mind) simply astonishing: they run
| regular expressions against the SQL to convert it from MySQL
| dialect to SQLite! https://github.com/aaemnnosttv/wp-sqlite-
| db/blob/9a5604cce13...
|
| I'm amazed that this works, but apparently it works really
| well. I would expect it to break the moment it ran into a
| WordPress plugin that executed custom SQL in a way they hadn't
| anticipated though.
| evanelias wrote:
| WordPress became popular at a time when LAMP hosting was easy,
| widespread, and affordable. It's one reason WordPress largely
| supplanted Movable Type, which was an early popular blogging
| platform / CMS / static site generator. Because Movable Type
| was Perl-based, it was more difficult to host.
|
| Internally, Movable Type had a really nice object persistence /
| ORM abstraction, with support for MySQL, Postgres, SQLite, and
| Oracle. Meanwhile, iirc around this time WordPress internally
| had raw queries strewn about its codebase, which certainly
| would have made it difficult to support multiple DBs. (I haven't
| looked at WP since then so I don't know if this is still the
| case.)
|
| Faced with shrinking market share and being out-executed on
| features by WordPress/Automattic, Movable Type actually removed
| Postgres and SQLite support in response! Then again, these DBs
| weren't nearly as popular back then as they are today.
| forinti wrote:
| It would be great to have the option to move easily between
| databases.
|
| SQLite would be best for development, testing, and small sites.
| MySQL would be better for production. And SQLite would be
| great for archiving old sites.
| forgotpwd16 wrote:
| Not only SQLite. I would've expected WordPress to be
| database agnostic.
| eddieroger wrote:
| Why? WordPress is from an era of PHP where the ties between
| PHP and MySQL were tighter than tight. What reason would they
| have to change away from that? I would love if WP worked with
| more than just MySQL as well, but I understand why it
| doesn't.
| forgotpwd16 wrote:
| It being from an era doesn't mean it has to remain in that
| era. I also understand why it doesn't. I'm saying it isn't
| expected to (anymore).
| cryptonector wrote:
| My advice: go straight to PG. And avoid ORMs.
| heurisko wrote:
| Hibernate+Spring Data allow me to write native queries, and
| map them to interfaces for me.
|
| Using an ORM doesn't mean you have to rely on behaviour that
| generates slow (e.g. n+1) query issues.
| rr888 wrote:
| I like that, but it's the testing that makes it hard. Being
| able to use an ORM and then mock your database with sqlite or
| similar is very useful. You can use testcontainers, but not
| in my firm.
| heurisko wrote:
| This is the promise, but in my experience nasty bugs occur
| when the abstraction inevitably doesn't work.
|
| If it doesn't work with a real-life implementation, e.g. a
| docker test container, then I don't consider it a robust
| test.
| cryptonector wrote:
| You can test with PG. Trivially.
| rco8786 wrote:
| My advice: advice with no reasoning or context is not useful to
| anyone.
| cryptonector wrote:
| Much, much e-ink has been spilled on the subject of ORMs-
| good-or-bad. I shan't reprise all of that here.
| rco8786 wrote:
| Literally you are the person who brought up ORMs.
| cryptonector wrote:
| https://news.ycombinator.com/item?id=31909806
| philliphaydon wrote:
| ORMs are fine but they are not a silver bullet that means you
| never need to look at the database or the queries generated.
|
| I think too many people use ORMs and rely on them to do
| everything, then blame the database when things are slow. :(
| cryptonector wrote:
| Devs start with ORMs because they promise to make everything
| easy.
|
| Then their apps grow complex enough that the ORMs get in the
| way.
|
| Then the devs get sad.
| bbkane wrote:
| Re: no ORMs - do you have any advice on how I could make
| this raw SQL insertion ( https://github.com/bbkane/starghaz
| e/blob/54d255f524f8ef73384... ) more readable and less
| error prone? Right now I'm basically concatenating strings
| and hoping I don't misorder my parameters when I exec. 200
| lines of this!! Am I doing raw SQL wrong?
| bestinterest wrote:
| It takes a lot to say very little in Go sometimes; I'd
| say this is one of those cases. Mainly the `if err != nil`
| boilerplate is at issue here.
|
| I don't think you are doing anything wrong, in a noobie's
| opinion.
| cryptonector wrote:
| > Right now I'm basically concatenating strings
|
| Well, you're using query parameters, but I think I
| understand what you mean.
|
| What I would do is write a stored procedure that takes
| whatever parameters, then I'd write PG SQL to generate
| bindings to that procedure for whatever language you're
| using (Golang here). I might write the bindings by hand
| if it's not a lot of work, but definitely would write a
| little compiler if it's a ton of work. That way you can
| get type-safety.
| bbkane wrote:
| Thank you. I'm using SQLite3 to keep deployment simple,
| which unfortunately doesn't support stored procedures. Do
| you have any other suggestions to improve this code?
| cryptonector wrote:
| SQLite3 very much supports something like stored
| procedures though! Only, instead of calling them that it
| calls them `INSTEAD OF INSERT` triggers.
|
| So, you'd do something like:
|
|     CREATE VIEW IF NOT EXISTS things AS ...;
|
|     CREATE TRIGGER IF NOT EXISTS create_thing
|     INSTEAD OF INSERT ON things
|     FOR EACH ROW BEGIN ... END;
|
| and to invoke such a "stored procedure" you'd
|
|     INSERT INTO things (...) VALUES (...), ...;
|
| or `INSERT .. SELECT ..;`.
|
| Now, with SQLite3 you don't get reflection via SQL, like
| you do in PG via the `pg_catalog` schema, but you do get
| reflection via the API. Also, because it's SQLite3, you
| don't get strong typing, and you get very few types, so
| even with reflection via the API, generating type-safe
| stub functions to invoke the "stored procedure" is not
| exactly easy, if it's possible at all.
|
| FYI, SQLite3 is Turing-complete. You can loop via
| recursive triggers. And you can do conditionals via
| `WHERE` clauses on the statements in your trigger bodies
| (and also via `OR IGNORE` and such).
|
| EDIT: I made a post of this:
| https://news.ycombinator.com/item?id=31913062
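A runnable version of the view-plus-trigger pattern described above (the table, view, and trigger names are illustrative, not from the comment): an INSTEAD OF INSERT trigger acts like a stored procedure that is invoked with a plain INSERT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE things_tbl (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    CREATE VIEW things AS SELECT name FROM things_tbl;

    -- The "procedure body" runs instead of a direct insert on the view.
    CREATE TRIGGER create_thing INSTEAD OF INSERT ON things
    FOR EACH ROW BEGIN
        INSERT INTO things_tbl (name) VALUES (trim(NEW.name));
    END;
""")

conn.execute("INSERT INTO things (name) VALUES (?)", ("  widget  ",))
name = conn.execute("SELECT name FROM things_tbl").fetchone()[0]
print(name)  # 'widget' -- normalized by the trigger body
```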
| ormaaj wrote:
| You do know an ORM is a thing that is used with a database
| rite? Perhaps you actually meant "don't use a relational
| database". Either way, LOL
| oh-4-fucks-sake wrote:
| In the JVM world, jOOQ is great happy-medium technology between
| the extremes of string SQL and full-blown ORM:
|
| - Type-safe DSL generated from your schema that looks like SQL
|
| - (If you can write the SQL, you can write it in jOOQ)
|
| - SQL-injection protection
|
| - Strong multi-dialect and vendor-specific support
|
| - Custom-type mappers
|
| - Composable queries
|
| - Compile-time, schema-aware query safety
|
| - Fine-grained transaction control
|
| - And so much more...
| gmac wrote:
| There are various non-ORM database libraries for TypeScript
| too. I develop Zapatos, but Slonik and pg-typed are probably
| better known.
|
| https://jawj.github.io/zapatos/
|
| https://github.com/gajus/slonik
|
| https://pgtyped.vercel.app/
| xwdv wrote:
| No one ever got fired for just using PostgreSQL, maybe burned
| though.
| ormaaj wrote:
| This is totally uncomplicated. The choice of database is a
| function of your application and its architecture, not a matter
| of all-things-equal performance comparison.
|
| "I want an extensible versitile database platform for use as a
| component in an infrastructure that is part of a larger system" -
| postgres.
|
| "I want an embeddable lightweight SQL storage backend for a
| narrow scope, single-purpose, probably single-user, probably
| single-system application" - sqlite
|
| Postgres should perform well in any role in the hands of an
| expert - it's a question of whether or not that is overkill.
| Sqlite can also perform well in a subset of those roles.
| bob1029 wrote:
| I think it's even more complicated.
|
| The reduction in latency brought on by in-process databases,
| combined with modern NVMe storage, means that SQLite is a
| substantially faster approach than any other solution which
| requires a trip through the network stack.
|
| I've got services in production for several financial
| institutions right now that are resolving _most_ SQL queries
| within 100-1000 _micro_ seconds by simply using SQLite on
| reasonable hardware.
|
| How many more users could you support if the amount of time you
| had to await IO for each was reduced by ~2 orders of magnitude?
|
| Obvious caveats here being the resilience angle. We addressed
| this with application-level protocols and additional instances.
| infogulch wrote:
| Another angle is unbounded horizontal scalability across
| tenants. If your application can be divided cleanly between
| different customers and one customer can be served with one
| instance (see "How many more users could you support if the
| amount of time you had to await IO for each was reduced by ~2
| orders of magnitude?") then scaling to more customers is
| extremely simple and linear.
| srcreigh wrote:
| Can you share more info about resilience, app protocols and
| additional instances?
| andy_ppp wrote:
| Why is this limited to four clients? Maybe I missed something but
| it would be good to see what happens with higher numbers of
| connections (something both of these dbs are rumoured to be bad
| at).
| mritchie712 wrote:
| Do clickhouse next!
| umvi wrote:
| Can SQLite be used outside of single threaded contexts? Last I
| checked having multiple concurrent readers and writers is very
| difficult. Multiple concurrent readers only is probably ok if
| that's all you need.
| simonw wrote:
| In practice, most SQLite writes take less than a few ms, so
| concurrent writes are only likely to be a problem for you if
| you need to handle thousands of writes per second.
|
| I've implemented the pattern where writes sit in an in-memory
| queue and are applied to a dedicated write connection in
| series. I've not run into any problems with this so far, but
| I've also not tried to handle thousands of writes per second
| with it yet.
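A minimal sketch of the pattern described above (my own toy version, not simonw's actual code): writes go into an in-memory queue and a single dedicated thread applies them to the one write connection, in series.

```python
import queue
import sqlite3
import threading

write_q = queue.Queue()
results = []  # the writer reports its final row count here

def writer():
    conn = sqlite3.connect(":memory:")  # the one and only write connection
    conn.execute("CREATE TABLE events (msg TEXT)")
    while True:
        item = write_q.get()
        if item is None:  # sentinel: queue drained, stop
            break
        conn.execute("INSERT INTO events VALUES (?)", (item,))
        conn.commit()
    results.append(conn.execute("SELECT count(*) FROM events").fetchone()[0])

t = threading.Thread(target=writer)
t.start()
for i in range(100):  # any number of producer threads could enqueue like this
    write_q.put(f"event-{i}")
write_q.put(None)
t.join()
print(results[0])  # 100: every queued write applied exactly once, in order
```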
| neverartful wrote:
| It can, but as the concurrency increases you'll start to see
| errors indicating that the database is locked.
| ed25519FUUU wrote:
| The fact that the performance is so close between a database that
| requires NO SERVER and one that requires a dedicated server + ops
| management is a signal to me that all projects should start on
| SQLite unless there is a compelling reason not to.
| rco8786 wrote:
| yes!
|
| It seems to me (or at least, I'm hoping this is happening) that
| the pendulum is swinging back to simple deployments and away
| from AWS/Cloud/K8s for every tiny little app.
|
| One thing I'd love to see is that the 'default' MVP deployment
| is your code + SQLite running on a single VM. That will scale
| well into "traction" for the vast majority of applications
| people are building.
| namecheapTA wrote:
| I have a fully working SaaS with 30+ companies paying, with
| 20 employees each on average. I wrote it as a self taught
| programmer via YouTube and stackoverflow. It is hosted on a
| $2 a month namecheap shared hosting. PHP back, jQuery front. I
| use cpanel to set things up.
|
| It runs faster than 90% of webapps on the internet.
| lazide wrote:
| Yup, I believe it. As an industry, 90% of companies are
| spending 90% of their time on crap that only the remaining
| 10% of companies actually need.
|
| Good resume fodder though I guess.
| Pv9KKy3 wrote:
| As a former php dev I absolutely believe this. I can't get
| over how slow and bloated stuff is these days. I can't even
| do front-end anymore.
|
| We once did an app ages ago where the database was used to
| create a materialized view of code (PHP) + web pages for
| everyone and everything. We then rsynced that to 6
| machines. This is ancient times, but this thing FLEW ->
| click click click click you could fly through the pages. It
| was a read heavy workload, but still, even the front-end
| (just hand coded) was somehow faster than all the latest
| and greatest going on these days.
| downut wrote:
| Do you implement something akin to Role Based Access
| Control for those 30x20 employees? If so where do you store
| and manage that data?
|
| I have zero experience with this but I am very curious how
| people do it in sqlite.
| ElectricalUnion wrote:
| Not OP, but like any other database? Even if the database
| supports it, you usually should not have users directly
| logged in to the database anyway, so whether the database
| "has" or "lacks" access control doesn't matter.
|
| What does matter, however, is enforcing parametrized
| queries everywhere. Unless all the db handles you pass to
| the client-handling code are read-only, chaos will ensue
| from the DDL permissions.
|
| https://xkcd.com/327/
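A small illustration of "parametrized queries everywhere" (shown with SQLite, but the same applies to any DBMS): hostile input is bound as a value, so it cannot become SQL, which is exactly the failure mode the xkcd strip jokes about.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")

hostile = "Robert'); DROP TABLE students;--"
# Bound as a parameter, the payload is stored as an inert string.
conn.execute("INSERT INTO students (name) VALUES (?)", (hostile,))

stored = conn.execute("SELECT name FROM students").fetchone()[0]
print(stored == hostile)  # True -- and the table still exists
```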
| downut wrote:
| I start by assuming a server-side bridge between the
| client and the database, and it's a given the queries are
| parameterized. So only the bridge is doing the "logging
| in".
|
| Why is it superior to put all of the (bespoke) access
| control logic in the server side bridge rather than use
| what's available in the database (accessed by the bridge,
| not the client)?
|
| I have been watching like a hawk for 6 months but I
| haven't stumbled upon a clear reason why this is done,
| except for "it helps source code db portability".
|
| For a multiorg/multiuser application this seems like the
| crucial distinction between sqlite and postgresql.
|
| Again I have no experience here, talk to me like I'm
| stupid (I really am!).
| ElectricalUnion wrote:
| For a multiorg approach, you probably want isolated DBs
| anyways.
|
| Within a single org, multiuser approach, there are 2 big
| problems that I remember with attempting to shoehorn DB
| auth into application auth:
|
| * assuming you use a connection pool, you might run out
| of TCP connections/ports if you need to handle too much
| stuff;
|
| Say, for example, that your load balancer needs 3
| application nodes behind it - you will need 2
| (connections per user) x 3 (application nodes)
| connections just to handle a user - 6 connections/user.
| That will eat your database connection limit very fast,
| for no good reason.
|
| * assuming you don't use a connection pool, you now have
| horrible latency on every connection (bad) and need to
| handle plain text passwords (assuming you use scram-
| sha-256), or md5 non-replay-resistant hashes of user
| passwords in, either sent in every client request, or in
| a shared session system. No matter what you pick, you
| have a security disaster in the making (very bad).
| rco8786 wrote:
| I think OP is saying their customers are mainly companies
| with ~20 employees, not that their company has 20
| employees
| rco8786 wrote:
| I believe it!
| fmakunbound wrote:
| In my experience the pain will come later as it grows, I
| believe it.
| rco8786 wrote:
| And it's a great pain to have
| avinassh wrote:
| hmm in that case I would probably pick Postgres because despite
| the added network call, it is still faster than an embedded DB.
| ormaaj wrote:
| As I understand it the actual low-level database interface
| may be a shared library mechanism in either case. At least,
| for a local database I'm pretty sure that's a function of the
| connection type. You're still dlopen()-ing an ODBC library
| and the underlying glue between that and the rest of the
| userspace DB engine parts can be implemented in many ways. If
| there's a socket involved it would probably be AF_UNIX.
| masklinn wrote:
| It's funny because to me it says pretty much the opposite:
| because sqlite has no server it works entirely in-process,
| whereas the client will be talking to postgres over a unix
| socket, or possibly a full-blown network socket.
|
| Plus it seems to be a readonly workload, something for which
| postgres is not ultra relevant (not that it doesn't work,
| but...).
| lazide wrote:
| Uh, most distributed environments you have no choice but to use
| a server, because you'll have clients on a (relatively)
| uncontrollable set of other machines.
|
| If you're writing something to run in an embedded/client app
| environment, then yeah why would you use Postgres for your one
| machine? You could, but it'll add a lot of moving parts you'll
| never need and probably don't want (like remotely accessible
| network ports you'll need to secure)
| jayd16 wrote:
| Postgres doesn't require a _dedicated_ server. If your
| requirements allow for SQLite but it didn't exist, you could
| conceivably run postgres on your app server to reduce the ops
| load if you really felt it was a significant effort.
|
| If you might want to use multiple app instances connected to a
| shared DB, I would say it's probably easier to just use a local
| postgres container for dev and a managed cloud DB. Really not
| that much effort and you get automatic backups.
|
| If you plan to never use a shared DB, SQLite is great though.
| robonerd wrote:
| I read that as _"dedicated server [process]"_ not
| _"dedicated server [hardware]"_. While the latter would
| obviously imply the former, it's really the former that
| increases the complexity of managing the whole affair.
| Desktop applications with a dependency on mysql/postgres
| running on the same machine are, in my experience without
| exception, a pain in the ass for users.
| riku_iki wrote:
| > you could conceivably run postgres on your app server to
| reduce the ops load if you really felt it was a significant
| effort.
|
| Still, PGSQL requires some level of setup/maintenance: create
| user, permissions, database, service with config/security
| effort. If there is a way to run PGSQL as lib and say to it:
| just store your data in this dir without other steps, I am
| very interested to learn.
| urda wrote:
| You are likely wanting something like RDS [1], Azure [2],
| or even Linode (which I just discovered) [3]. Sure you have
| to pay a larger premium for it, but these products do allow
| you to focus on the app rather than the DBA stuff.
|
| - [1] https://aws.amazon.com/rds/
|
| - [2] https://azure.microsoft.com/en-
| us/services/postgresql/#overv...
|
| - [3] https://www.linode.com/products/postgresql/
| riku_iki wrote:
| > pay a larger premium for it
|
| yeah, the premium is very large if I want to do a lot of data
| crunching (lots of iops, cores and ram needed), plus
| latency to those services will be much higher (and throughput
| lower) than to my local disk through the PCIe bus.
| eropple wrote:
| You can set up Postgres with three lines of bash and a Unix
| socket. It is more work, but I wouldn't say it's appreciably
| more work. If you need more rigorous data modeling (I
| frequently do), it's worth considering whether this is the
| mode in which you'd like to run your data stores (I
| wouldn't, but latency isn't the most critical thing for the
| stuff I do).
| riku_iki wrote:
| I think it is more than that.
|
| Steps will be:
|
| - sudo apt install postgresql (more steps are needed if you
| are not satisfied with the OS default version)
|
| - sudo su postgres
|
| - psql -> create db, user, grant permissions
|
| - modify hba.conf file to change connection permissions
|
| - modify postgresql.conf to change settings to your
| liking because defaults are very out of touch with modern
| hardware
|
| Another option would be to build your own docker image, but
| I am not good with this.
| mikepurvis wrote:
| "never use a shared DB"
|
| I think this is what it comes down to for me too. Yes there
| might be some use cases that really benefit from Postgres
| features like GIS and the native JSON type, but ultimately,
| for most use-cases, it's going to hinge on whether or not you
| ever expect to need to scale any part of the system
| horizontally, whether that's the frontends or the DB itself.
| msbarnett wrote:
| > Yes there might be some use cases that really benefit
| from Postgres features like GIS and the native JSON type
|
| Or a proper UUID type, materialized views, enforced data-
| constraints, lateral joins, queries that consider multiple
| indexes, etc, etc.
|
| It's not like there's just a couple of niches where
| PostgreSQL's features distinguish it from SQLite. SQLite is
| appropriate for very simple DB use-cases where you aren't
| too bothered by silent data inconsistency/corruption, but
| not a whole lot beyond that. It's a good little piece of
| software, but it hardly relegates PostgreSQL to "Just GIS
| things" or "Just JSON use-cases".
| ElectricalUnion wrote:
| > SQLite is appropriate for very simple DB use-cases
| where you aren't too bothered by silent data
| inconsistency/corruption, but not a whole lot beyond
| that.
|
| SQLite is also appropriate for use cases where you are
| bothered by silent data inconsistency/corruption; it's just
| that the hardware running PostgreSQL (or any "serious" DBMS,
| relational or not) is usually less prone to random
| inconsistency/corruption (uninterruptible redundant power
| supplies, a well cooled environment, lower but more stable
| clock speeds on everything, ECC RAM, a RAID disk array, a
| corruption-resistant FS).
|
| If you run PostgreSQL on the cheapest consumer hardware,
| expect random corruption of the database when power cuts out
| several times in a row.
| msbarnett wrote:
| Sorry, "corruption" was probably misleading there. By
| "silent data inconsistency/corruption" I mean, "a bug in
| my application led to me storing strings in an integer
| column, or too-long data in a too-short column, and
| SQLite just silently let me do that"-type errors - the
| data consistency is logically (but not physically)
| corrupted. The author encountered 2 separate occurrences
| of this just in their simple use-case, evidently.
| 323 wrote:
| > _led to me storing strings in an integer column_
|
| Recent SQLite versions have STRICT mode where they forbid
| that.
|
| https://www.sqlite.org/stricttables.html
|
| > _or too-long data in a too-short column_
|
| You can use a constraint for that:
|
|     CREATE TABLE test (
|         name TEXT NOT NULL CHECK(length(name) <= 20)
|     ) STRICT;
| mikepurvis wrote:
| Neat, will definitely give this a try. Definitely one of
| my minor peeves about SQLite is having to set a bunch of
| pragmas on every new connection. Like, especially stuff
| such as foreign_keys=1... really, enforcement of foreign
| key constraints is something each connection has to opt
| into rather than being set once on the whole DB file and
| never thought about again? Blah.
| Beltiras wrote:
| There are enough SQL oopsies in this article to show that you
| should NEVER use sqlite for any production use except as filter
| databases or other sorts of throwaways. RDBMSs are there to
| enforce constraints on data.
| fmajid wrote:
| That's silly, SQLite is the single most widely deployed
| database engine in the world, possibly the most deployed
| software, period. Its unit test suite is second to none.
| It's reliable enough that SMF, Solaris' boot and service
| management (sort of like systemd) depends on it.
|
| That does not mean it is applicable to all database use-
| cases, but production-grade it undoubtedly is.
| ElectricalUnion wrote:
| > NEVER use sqlite for any production
|
| > RDBMSs are there to enforce constraints on data.
|
| If you did not ask SQLite to enforce type constraints on
| relations (with STRICT), it will keep the relaxed, backwards-
| compatible behavior of previous versions of SQLite.
|
| That being said, if you want _actual validation_, you
| probably need more complex CHECK expressions with your
| business rules anyway, and those work by default on any
| database*.
|
| https://www.sqlite.org/stricttables.html
|
| * unless you're a madman that runs things with "PRAGMA
| ignore_check_constraints = true;" or equivalent; in that
| case, no DB can help you.
| BeefWellington wrote:
| Sure, you can start there. Once you scale though you're going
| to have lots of fun.
| neverartful wrote:
| For many applications you're right. However, in the case of a
| dashboard for blog articles I don't think it could ever come
| to that.
| stormdennis wrote:
| I don't know about now but at one time Firebird made this
| question moot. It never received the love it deserved in my view.
| kstrauser wrote:
| Many years ago, I was hired at a web shop running Firebird in
| protection. It had the interesting property that if you ran the
| client software on your desktop to connect to the server
| software on another host, and used ^Z to stop the client, then
| the server process would hang until you resumed the client. I'm
| not entirely sure how I'd implement that as a feature if I
| wanted to.
| RedShift1 wrote:
| Use the right tool for the right job.
| ormaaj wrote:
| Agreed. Performance comparison is interesting in a synthetic
| benchmark context but the premise of the title isn't that.
| KronisLV wrote:
| You know, apart from some of the details in regards to what data
| types to better use etc., I think that both PostgreSQL and SQLite
| are excellent choices for most use cases!
|
| Of course, SQLite might be the best choice for embedded or local
| usage, they even have this lovely page about using it as an
| application file format: https://sqlite.org/appfileformat.html
|
| And PostgreSQL might scale better for multiple writers and
| situations where you need it on a separate VM/server/container
| that might be shared between different apps through a network
| connection of some sort.
|
| Though it's also curious, because you never hear a lot about
| MySQL or MariaDB (comparatively, in my subjective opinion) on HN
| or many other places for that matter. And yet, it's still a
| decent option, has good tooling (MySQL Workbench is great), is
| widely supported and backs a significant amount of production
| systems out there.
|
| Do people just view it as old/legacy, or is there the belief that
| for many use cases PostgreSQL might simply be the better option,
| e.g. also because of projects like PostGIS and support for more
| complex setups, like custom data types, more advanced scripting
| language for in-database processing and whatnot?
| danenania wrote:
| I'm very happy with MySQL currently. I started using it because
| AWS Aurora supports it a bit better than postgres. I haven't
| had any regrets since.
|
| It's great when you just want simple relational storage without
| a lot of features or ceremony, but need a bit more than SQLite.
| It's fast, predictable, and gets the job done.
|
| Postgres is great too and offers more, but sometimes that can
| tempt you to try to do too much in the DB layer.
| mattxxx wrote:
| Some context: I've worked with massive PostgreSQL db's,
| containing millions of daily records spanning multiple years
| (allowing for potentially massive numbers of returned rows). I
| also exclusively MVP simple db architectures in SQLite (think: no
| massive joins or need to partition data).
|
| Best takeaways here:
|
| - SQLite kills it. It appears untuned PostgreSQL is only 2
| seconds (???) faster than untuned SQLite; always try to kick
| off a project with SQLite
|
| - PostgreSQL has a performance cliff when query responses leave
| memory. This is extra true when an index no longer fits in
| memory. work_mem is critical for pushing back this performance
| cliff
|
| Side note: Tuning postgres can be really painful, but it's also a
| really dope tried-and-true db.
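For reference, work_mem can be raised per session or persistently; a sketch (the values are illustrative, not recommendations):

```sql
-- Per-session, e.g. before a heavy analytical query:
SET work_mem = '64MB';

-- Or persistently (written to postgresql.auto.conf, then reload):
ALTER SYSTEM SET work_mem = '64MB';
SELECT pg_reload_conf();
```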
| neverartful wrote:
| Interesting article. It should be noted that the testing done is
| with 100% read/query workload (not surprising since the
| application is a website of blogs).
| miguelgrinberg wrote:
| This is 100% correct. This is a dashboard application that is
| mostly read-only for users. There is a cron job that imports
| and writes new data once per day.
| polyrand wrote:
| Great blog post. A few observations (considering the blog post
| skips numerous details):
|
| SQLite really shines when compared to running a database in a
| separate server from the app. This is when the network latency
| starts being noticeable. Latency between you and the app is
| usually constant. The latency between the app and the database is
| what matters. If in this case, Postgres can run in the same host
| as the main app, I think it's a good thing to do. But very often
| that's not the case.
|
| Even though not surprising, it's great to see how SQLite handles
| concurrency. From my experience, SQLite would be able to handle
| much more read concurrency. It would have been great to see more
| benchmarks (like 1000 concurrent users, even if the app would
| never see such traffic).
|
| I was surprised that SQLite didn't beat Postgres on simpler
| queries. Without seeing the queries, if they contain multiple
| joins, that could be another reason why SQLite is slower since it
| can only do "nested loop" JOINs. The workloads also seem mostly
| analytical, we'll see where DuckDB goes in this space (looking
| promising for now).
|
| One thing not mentioned in the blog post is "easy to maintain
| and update". If the app is deployed with Docker, adding
| Postgres means you go from worrying about:
|
|     1 docker container + 1 docker volume
|
| to:
|
|     2 docker containers (docker-compose?) + 1 docker volume +
|     internal docker networking
|
| The fact that SQLite is so simple to operate also has some
| advantages. You can usually go back to an SQLite-based app 1 year
| later and understand everything. With other databases, you may
| need to review users, roles, permissions, connection string, etc.
|
| Another cool thing about SQLite is that it scales mostly linearly
| as you add more CPU cores. It can scale as long as you can also
| scale the number of processes in your app. In other databases,
| scaling means adding connection pools and other tools in the
| middle.
|
| Lastly, and related to the latency topic. The low latency in
| SQLite means that you can sometimes convert a big query into 5
| separate, small and reusable queries without any performance hit.
| It's really remarkable how much cleaner your code can become
| with this.
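As an illustration of that last point, with in-process SQLite each of the small queries below is just a function call, not a network round trip (hypothetical schema and data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE views (day TEXT, article TEXT, hits INTEGER);
    INSERT INTO views VALUES
        ('2022-06-27', 'a', 10), ('2022-06-27', 'b', 5),
        ('2022-06-28', 'a', 7);
""")

# Instead of one big combined query, several small reusable ones:
def total_hits(conn):
    return conn.execute("SELECT sum(hits) FROM views").fetchone()[0]

def hits_for_day(conn, day):
    return conn.execute(
        "SELECT sum(hits) FROM views WHERE day = ?", (day,)
    ).fetchone()[0]

def top_article(conn):
    return conn.execute(
        "SELECT article FROM views GROUP BY article"
        " ORDER BY sum(hits) DESC LIMIT 1"
    ).fetchone()[0]

print(total_hits(conn), hits_for_day(conn, "2022-06-28"), top_article(conn))
# 22 7 a
```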
|
| In any case, I love working both with Postgres and SQLite, and we
| need more blog posts like this one showing real-world cases.
| mcdonje wrote:
| >If you are going to take one thing away from this article, I
| hope it is that the only benchmarks that are valuable are those
| that run on your own platform, with your own stack, with your own
| data, and with your own software. And even then, you may need to
| add custom optimizations to get the best performance.
|
| Good advice, contextually.
| rammy1234 wrote:
| If you say using SQLite, there is always a risk of losing that
| data file on disk, right, which PostgreSQL does not have? How
| is that managed? _is this a dumb question_
| lazide wrote:
| Not a dumb question, but some people don't care for their use
| case (either the data is cheap, or they don't think it will
| happen)
| neverartful wrote:
| The big question here would be whether the filesystem that the
| data file resides on is mounted on a single-disk volume or a
| RAID-volume (or even a clustered SDS volume such as Ceph). On a
| single-disk volume, if the disk dies your DB is gone (at least
| until you can replace the disk and restore the backup). On a
| sensibly designed RAID-volume, if one of the disks dies the
| application's admin probably would never even know. The
| sysadmin would see it in their logging/alerting infrastructure
| and replace the dead disk. There would not be any loss of data
| or availability of the data (just a temporary degradation in
| the overall resiliency). Same is true for a clustered SDS
| volume such as Ceph.
| rfoo wrote:
| Not only that, the easy mode we all doing now is "run the
| application in a major cloud". Besides AWS, They all promise
| 99.999% or above durability for their block storage offering.
|
| Of course you still need backup for other incidents.
| simonw wrote:
| The best answer I know of to this question is Litestream -
| https://litestream.io/ - you can use it to inexpensively
| replicate a backup of your database up to an S3 bucket (or
| various other storage providers) to give you a very solid
| robust recovery option should you lose the disk that your
| SQLite database lives on.
| stonemetal12 wrote:
| Postgresql stores data to disk, so it has the same risk of
| loss. Both setups need a backup solution to mitigate the
| problem of disk failure.
|
| My understanding of the situation is sqlite's solution is a bit
| more low level than postgresql but should be workable.
| GordonS wrote:
| In combination with a base backup, you can use streaming WAL
| backups with Postgres, shipping compressed WAL files to S3 or
| wherever. I like WAL-G for this.
|
| There are lower-level approaches built-in to SQLite, such as
| the Backup API, but there is at least one 3rd party project,
| Litestream, which does streaming WAL backups too.
| procinct wrote:
| These days, you could replicate it to something like S3 via
| something like Litestream.
| yread wrote:
| I understand and partially agree with all the criticism of the
| article (and could add my own like why did you tweak work_mem out
| of all the hundreds of settings on postgres) but it's a blog
| post, not a scientific article - I still think it's a useful
| post. I might be in a similar situation at some point and it's
| nice to know that you haven't encountered any major gotchas and
| made it even 2x faster. Thanks for sharing your experience.
| mjw1007 wrote:
| I had a look at SQLite around the time sqlite3 appeared, and my
| notes say I decided I didn't like it because of the lax and
| sometimes inconsistent way it can treat its input.
|
| I thought I'd have another go at it today (sqlite 3.34) and I'm
| mildly surprised that the things I didn't like are still there:
|     sqlite> select '1 apple' + '2 oranges';
|     3
|     sqlite> select count(*) where 0.9;
|     1
|     sqlite> select count(*) where 1.1;
|     1
|     sqlite> select count(*) where not 0.9;
|     1
|     sqlite> select count(*) where not 1.1;
|     0
| simonw wrote:
| "I'm mildly surprised that the things I didn't like are still
| there"
|
| That's because SQLite doesn't ship changes that break existing
| applications.
| infogulch wrote:
| You may be interested in STRICT Tables [1], introduced just 6
| months ago:
|
| > Some developers appreciate the freedom that SQLite's flexible
| typing rules provide and use that freedom to advantage. But
| other developers are aghast at SQLite's flagrant rule-breaking
| and prefer the traditional rigid type system found in all other
| SQL database engines and in the SQL standard. For this latter
| group, SQLite supports a strict typing mode, as of version
| 3.37.0 (2021-11-27), that is enabled separately for each table.
|
| [1]: https://www.sqlite.org/stricttables.html
|
| Related past discussions:
|
| Strict Tables - Column type constraints in SQLite - Draft |
| 2021-08-21 | 331 points, 108 comments |
| https://news.ycombinator.com/item?id=28259104
|
| SQLite Release 3.37.0 | 2021-11-27 | 333 points, 112 comments |
| https://news.ycombinator.com/item?id=29363054
| gunapologist99 wrote:
| The debate over type conversion seems somewhat like arguing
| language features (such as weak typing vs strong), which has
| been a spirited debate since forever.
|
| The fact that this is debatable at all means that there are
| arguments and counter-arguments that serve both sides well,
| which further implies that there are use cases where some pros
| outweigh the cons on each side.
|
| (Of course, this entirely neglects the point that it's usually
| fun and even educational to have a technical argument.)
| mjw1007 wrote:
| I don't have any great objection to a database that lets you
| store a value of any type in any column.
|
| And if they want to say '3' + 5 = 8, I could imagine why that
| might be fine.
|
| But I don't think that's a good reason to have behaviour like
| the above.
|
| I think it's more likely that they wouldn't choose to do
| those things again today: I think this is what fell out of
| the initial TCL-based implementation and they feel they're
| stuck with keeping it backwards-compatible.
| zinekeller wrote:
| I have a question that wasn't answered on the blog, and might be
| silly for me to ask. Is SQLite running in WAL mode or not? I
| usually found that WAL mode allows better performance
| even in read-heavy workloads as it doesn't need to exclusively
| lock the database each time.
|
| Otherwise, the post is a nice reminder for SQLite users that
| foresee a database switch to use the newly-introduced strict
| mode.
| miguelgrinberg wrote:
| SQLite isn't running in WAL mode. This application isn't just
| read-heavy, it is read-only for users, so I wouldn't think
| this will make a difference. The writes occur while everybody
| is sleeping, from a cron job that runs once a day to upload the
| traffic for the previous 24 hours.
| dgb23 wrote:
| Yes, WAL mode can have a drastic impact.
|
| Also this is apparently benchmarking Postgres and SQLite _with
| Python on top_. Some of the overhead here might not have much
| to do with either SQLite or Postgres at all. Just because one
| uses an ORM doesn't mean the same code is running and
| especially considering how sluggish Python is, I'm not
| convinced by any of this either way outside of this being an
| improvement for the very specific use-case presented. But we
| don't know _why_ that is.
| miguelgrinberg wrote:
| I explain this in the post. The goal is to optimize the
| application for users. I really don't care if Postgres is
| faster than SQLite or vice versa, I just want users to have a
| snappier application. The experiment was to swap out the
| database while leaving everything else the same to see if
| that affects the overall user experience in a positive or
| negative way.
|
| Also note that I'm not saying Postgres is faster than SQLite
| in absolute terms. That happened to be the case for this
| particular application and how it was deployed. If you
| perform a similar test with a different app or stack you will
| likely get different results. A main point I'm trying to make
| in the article is that people should do their own
| benchmarking instead of making decisions on stuff you read
| online.
| jeffdn wrote:
| > The solution was to grow the slug column to 512 characters and
| retry.
|
| No, the solution is to always use `text` as your column type, and
| use a check constraint if you need to enforce a limit. It's much
| easier and safer to alter a check constraint than it is to change
| a column type live in production. The `text` type and
| `varchar(x)` are identical under the hood; `text` takes up no
| more space on disk.
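In Postgres that pattern looks something like this (hypothetical table; `text` plus a named constraint that can later be swapped without a column-type change):

```sql
CREATE TABLE articles (
    slug text NOT NULL,
    CONSTRAINT slug_len CHECK (length(slug) <= 512)
);

-- Raising the limit later only touches the constraint,
-- never the column type:
ALTER TABLE articles DROP CONSTRAINT slug_len;
ALTER TABLE articles ADD CONSTRAINT slug_len
    CHECK (length(slug) <= 1024) NOT VALID;
ALTER TABLE articles VALIDATE CONSTRAINT slug_len;
```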
| masklinn wrote:
| Note: that's true for postgres and sqlite (which ignores the
| length limit entirely as they discovered anyway), not
| necessarily for other database systems.
| coolsunglasses wrote:
| How convenient then that he's comparing PostgreSQL and SQLite
| then!
| SigmundA wrote:
| I thought as of PG 9.2 expanding a varchar column was metadata
| only operation and therefore low overhead.
|
| I know in SQL Server there are two issues with using
| varchar(max) for everything, even though increasing a column's
| size is metadata only. First, indexes have a limit of 900-byte
| values and will fail at runtime if you index a column with no
| max length and insert a value larger than 900 bytes. PG seems
| to have this issue as well but the limit is 2712 bytes.
|
| Second the query planner makes use of the size to determine how
| much memory to pre-allocate for the query, with unlimited
| length field it assume something like 4096 bytes and wastes
| working memory if your values are not actually that size. Not
| sure if PG has the second issue, having a max value defined is
| valuable information to a database engine along with other type
| information such as UUID's only taking 16 bytes instead of 36
| bytes.
| masklinn wrote:
| > PG seems to have this issue as well but the limit is 2712
| bytes.
|
| Note that this is only for btree indexes. Hash is not so
| limited.
|
| Tho at this sort of sizes it seems unlikely such simple
| indexes are of much use. I guess the range matching abilities
| of btrees could be but that seems unlikely, a 2700 bytes
| path-like datum is a lot.
| luhn wrote:
| For Postgres, there's a strong case to be made VARCHAR(n) is
| preferable to TEXT+check constraint. As you note, they're
| stored the same on disk. But for a varchar, you can increase
| the length with just this:
|
|     ALTER TABLE a ALTER COLUMN b TYPE VARCHAR(n)
|
| This only requires ACCESS EXCLUSIVE for a fraction of a second
| while the metadata is updated.
|
| Whereas updating a CHECK constraint will require a full scan of
| the table. If you do it naively, this locks the table for the
| entire scan. If you want to avoid that, it's three steps:
|
|     ALTER TABLE a DROP CONSTRAINT b_c,
|         ADD CONSTRAINT b_c CHECK (length(b) < n) NOT VALID;
|     COMMIT;
|     ALTER TABLE a VALIDATE CONSTRAINT b_c;
|
| So as long as you're only increasing the length (which in my
| own experience is a safe assumption), VARCHAR is much easier to
| work with.
| singron wrote:
| Btw this procedure doesn't work for partitioned tables. For
| some reason, it locks the table and does a table scan anyway.
|
| The workaround we came up with was to add check constraints
| for instant partition attachment, lock the table, detach all
| the partitions, update the type on each partition, update the
| parent table type, then reattach all the partitions.
|
| In our case, we didn't already have those instant-attach
| check conditions, so we incurred table scans anyway to
| validate those, but if you were to make multiple updates, you
| only have to do that once.
| jeffdn wrote:
| Yep, that's fair enough. In my experience, however, in the
| vast, vast majority of cases, folks either don't actually
| care about the length, or they should be using business logic
| in code to do the validation, rather than waiting for the
| database to yell at them, invalidate their transaction, etc.
| layer8 wrote:
| The benefit is that you can treat the database schema as
| the single source of truth as to what data is valid.
| Otherwise what tends to happen is that different code
| components will have different ideas of what exactly is
| valid. This is also important because databases tend to be
| longer-lived than the code that accesses them.
| HideousKojima wrote:
| >they should be using business logic in code to do the
| validation
|
| I trust myself to do this, I don't trust my coworkers and
| future devs writing applications for the same database to
| follow the same constraints in their code though.
| Constraints at the DB level keep your data clean(er) even
| if some other devs tries to shit it up.
| luhn wrote:
| It's good practice to have a length constraint on all text
| fields, just to put some cap on field sizes. You don't want
| somebody uploading 100MB of garbage into the `first_name`
| field--A 1000-char max is more than enough.
|
| Constraints generally should be enforced both application-
| and database-side wherever possible. Application
| constraints can be more robust and have better UX, but only
| database constraints can offer you guarantees about what is
| actually in your database.
| msbarnett wrote:
| If I had a dollar for every time I saw someone doing
| application-layer-only validation of data and end up being
| surprised that some data that wasn't supposed to be valid
| ended up in the database only to later blow up in their
| face, I'd be a fairly well-off man.
|
| In practice, declaratively creating data constraints at
| schema-creation time has a much much higher success rate
| than trying to imperatively enforce data constraints at
| write time, in my experience. The missed cases and bugs
| always come back to haunt you.
| jeffdn wrote:
| I'm not talking about all validation, I'm talking about a
| length constraint on a string (that isn't something like
| a US state, which should be a two-character field) --
| already a pretty uncommon thing. At work, our codebase
| has 975 instances of a column storing a text value, and
| the length is material only in a very small handful of
| those.
| ehutch79 wrote:
| Is it just me or was testing done with only four concurrent
| clients? In a mostly read only environment?
|
| Wouldn't moving to a static site generator been a better
| direction in that case?
|
| If you're maxing out at four people reading your blog, I can't
| imagine more infrastructure is better.
| tpetry wrote:
| They're not benchmarking the blog, they are benchmarking an
| analytics application aggregating statistics about the blogs.
| Thaxll wrote:
| I mean SQLite is living on a single node, what exactly do you
| do when you have a failure, when more than one process needs to
| access the DB, when you need RBAC etc ...
| rco8786 wrote:
| > what exactly do you do when you have failure
|
| Same thing you have for failure when your pg instance is
| running on a single node?
| simonw wrote:
| You tell the users of your internal statistics dashboard to
| wait a few minutes/hours while you fix it.
| warmwaffles wrote:
| > what exactly do you do when you have failure
|
| Depends on the failure. What do you have in mind?
|
| > when more than one process needs to access the DB
|
| Easy, you use `journal_mode` set to `WAL` when you connect to
| the sqlite database.
|
| > when you need RBAC
|
| If you need RBAC, you aren't going to be using sqlite. In
| theory you could have your application handle the RBAC part
| since sqlite is embedded in your application.
|
| Another solution is to leverage unix users and groups and set
| permissions appropriately for the database file and the
| application running.
| ju-st wrote:
| >> when more than one process needs to access the DB
|
| > Easy, you use `journal_mode` set to `WAL` when you connect
| to the sqlite database.
|
| and set isolation level and check_same_thread and PRAGMA
| synchronous and then put all queries in retry loops because
| the DB is maybe locked https://sqlite.org/wal.html#sometimes_
| queries_return_sqlite_...
|
| I wasted so much time trying to get multiple sqlite queries
| running at the same time
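A sketch of such a retry loop (the busy_timeout pragma often makes it unnecessary, but not for the WAL edge cases linked above; names and delays here are illustrative):

```python
import sqlite3
import time

def execute_with_retry(conn, sql, params=(), retries=5, delay=0.05):
    """Retry a statement when SQLite reports the database is locked."""
    for attempt in range(retries):
        try:
            return conn.execute(sql, params)
        except sqlite3.OperationalError as exc:
            if "locked" not in str(exc) or attempt == retries - 1:
                raise
            time.sleep(delay * (attempt + 1))  # simple linear backoff

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA busy_timeout = 5000")  # let SQLite itself wait up to 5s
execute_with_retry(conn, "CREATE TABLE t (x INTEGER)")
execute_with_retry(conn, "INSERT INTO t VALUES (?)", (1,))
print(execute_with_retry(conn, "SELECT sum(x) FROM t").fetchone()[0])  # 1
```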
| warmwaffles wrote:
| > I wasted so much time trying to get multiple sqlite
| queries running at the same time
|
| You were trying to write at the same time right?
| neverartful wrote:
| It has everything to do with the nature of the application,
| architecture, business, and the business cost of such failures
| doesn't it?
|
| If the discussion is around failures for the use case of the
| article (dashboard for blogs which is read-only), my best guess
| at 'high availability' would be just to have a separate SQLite
| DB file on multiple app server nodes and call it done. The
| author mentions that the DB is updated once per day (probably
| during the night) by a cron job. Just as easy to then scp a few
| copies to other app server nodes.
|
| If the discussion is around a 24x7 read-write workload with
| high business costs for performance, availability, and
| scalability then it's an entirely different problem.
| xsc wrote:
| It seems a solid first step would be to use EXPLAIN to understand
| query planning/scanning issues.
|
| Assuming 6.5M rows/day*1.5yr=~3.5B rows, how large is the dataset
| in both SQLite and pg? With indexes, is it fitting in RAM?
|
| Linode 2GB plan is shared vcpu, so results will vary depending on
| who is sharing it with your service.
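On the SQLite side, EXPLAIN QUERY PLAN is the cheap first look (hypothetical table; the exact detail string varies by SQLite version):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (day TEXT, path TEXT, hits INTEGER)")
conn.execute("CREATE INDEX idx_requests_day ON requests (day)")

# EXPLAIN QUERY PLAN shows whether a query can use an index
# or has to scan the whole table.
rows = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT sum(hits) FROM requests WHERE day = ?",
    ("2022-06-28",),
).fetchall()
detail = rows[0][-1]
print(detail)  # mentions idx_requests_day rather than a full table scan
```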
| miguelgrinberg wrote:
| This isn't the topic of the article so I haven't included it,
| but I have taken several "solid steps" to optimize this
| application throughout its year and a half of being in
| production.
|
| This article is about one of them only, how I designed a test
| system to evaluate the two databases head to head.
___________________________________________________________________
(page generated 2022-06-28 23:01 UTC)