[HN Gopher] Test Postgres in Python Like SQLite
___________________________________________________________________
Test Postgres in Python Like SQLite
Author : wey-gu
Score : 144 points
Date : 2025-06-06 00:56 UTC (22 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| wey-gu wrote:
| I need this in my use case, so built it after downstream e2e
| verified working!
| wey-gu wrote:
| Roadmap of v1
|
| https://github.com/wey-gu/py-pglite/issues/4
| gleenn wrote:
| I would love something like this for databases like Snowflake
| where there isn't an easy way to test without hitting a prod SF
| server
| wey-gu wrote:
| aha, SF is pure managed service, thus 100% local/or even lite-
| local of SF seems not feasible, maybe clickhouse/databend could
| be consider to enable such flexibility of lightweight testing?
| wey-gu wrote:
| This is also an option(verifies data schema, query in some way)
| with SF SQL translated to duckdb SQL and testing it.
|
| https://www.reddit.com/r/SQL/comments/10s4608/comment/j722e1...
| buremba wrote:
| I'm developing https://github.com/buremba/universql for this
| use-case, though DuckDB is used under the hood.
| murkt wrote:
| I wonder if it's possible to compile Postgres directly into a
| Python extension instead of WASM. Just import it and go forth, no
| node dependency, nothing.
| wey-gu wrote:
| wow, thanks! it should be feasible! and as I recall there are
| such thing from some databases(chromadb, milvus-lite) in the
| py-first communities.
|
| we could think big to someday do that within py-pglite project
| actually.
|
| let me put it as the roadmap of v2(much more work to do!)
| heinrichhartman wrote:
| Can you explain this? What is the compilation target? What is
| the compiler? How does being a "python extension" help?
| wey-gu wrote:
| the target would be just baremetal binary with some changes
| needed and the purpose was just to be python dx optimized
| samwillis wrote:
| This is something I've explored as part of my work on PGlite.
| It's _possible_ but needs quite a bit of work, and would come
| with some limitations until Postgres upstream make some
| changes.
|
| You will need to use an unrolled main loop similar to what we
| have in PGlite, using the "single user mode" (you likely don't
| want to fork sub processes like a normal Postgres). The
| problems come with then trying to run multiple instances in a
| single process, Postgres makes heavy use of global vars for
| state (they can as they fork for each session), these would
| clash if you had multiple instances. There is work happening to
| possibly make Postgres multi-threaded, this will solve that
| problem.
|
| The long term ambition of the PGlite project is to create a
| libpglite, a low level embedded Postgres with a C api, that
| will enable all this. We quite far off though - happy to have
| people join the project to help make it happen!
| ewhauser421 wrote:
| Nice! Ive been wanting to try this ever since PGLite came out.
|
| Anything that won't work if you tried this as a drop in
| replacement for a full PG instance in tests? Maybe extensions?
| Any benchmarks?
| laurencerowe wrote:
| This is running pglite in a node subprocess. Why not just run
| Postgres itself as a subprocess with a data directory in a
| tempdir?
| isoprophlex wrote:
| Exactly; in the past I've had reasonable success with test
| fixtures based on temp dirs and a templated docker compose
| file. Just needs docker in the environment, which is not too
| far fetched.
| wey-gu wrote:
| I think the ultimate version in such use case would be
| carefully wire-up the baremetal one with ad-hoc in-mem-disk or
| tempdir :), this could be a future backend of py-pglite(planned
| in v2).
|
| For now, it's more accessible for me to hack it in hours and it
| works.
| ArcaneMoose wrote:
| I recently struggled with PGlite for a while and ended up using
| full postgres via https://github.com/leinelissen/embedded-
| postgres which worked like a charm!
| hardwaresofton wrote:
| TIL this exists, thank you! It's seems obvious in retrospect.
|
| Would love to see a benchmark between this, pglite and
| testcontainers.
| wey-gu wrote:
| Thanks, duno this before and it looks like the ultimate shape
| py-pglite in js world~~
| jauco wrote:
| There's also https://testcontainers.com/ not sure about the speed
| difference but testcontainers has never felt slow to me when
| using it from node js unittests.
| hardwaresofton wrote:
| Ding ding ding. Testcontainers is a fanatstic way to write the
| most important tests (arguably) for your app. Don't test E2E
| with a mock, just use a real database.
|
| If that feels hard to you (to set up your app pointing to
| another DB, run a single E2E-testable part of your app with
| it's own DB connection, etc), fix that.
| maartenh wrote:
| Yep. I do create fresh db's from a fixture db using
| postgres's ability to create a database from a template. Very
| quick, always correct.
| 9rx wrote:
| If you feel you need E2E tests, that means that you have
| implementation details leaking across different areas of your
| application. That's what you really need to fix. Bandaids may
| be found useful, but even better is not cut yourself in the
| first place.
| glookler wrote:
| If you just know when you aren't cutting yourself then why
| do you need tests?
| 9rx wrote:
| Normally you would write tests at the user boundaries
| (like a set of APIs) for the purposes of documenting the
| behavioural intent for the user as to not leave
| unspecified behaviour. While you could theoretically do
| that in Word instead, testing, as we call it, offers a
| couple of advantages:
|
| 1. It is written in code, which is what coders want to
| read. There is nothing worse than having to read
| documentation in English (or insert your natural language
| of choice).
|
| 2. It is able to automatically prove that what is
| documented is true. Programmers tend to have a penchant
| for lying if you let them, so this is a surprisingly big
| advantage.
|
| You might not _need_ them per se, but unless you are
| trying to be an asshole, it is socially expected that you
| will be kind to your users by filling them in on
| important details; to not leave them guessing.
|
| Now, if your application only has one user boundary
| (a.k.a. spaghetti code) then any tests you have will
| technically end up being E2E tests by there being only
| one end (although calling that E2E would be rather
| silly), that is true, but if that's what your code looks
| like then you're going to have trouble connecting to a
| test database anyway, and per the earlier comment you
| need to fix that for the sake of E2E tests. But if you
| reach that point, you may as well go all the way and fix
| it properly.
| wey-gu wrote:
| yeah, w/o py-pglite attempt this should be the only approach,
| the pglite ideally could make it more flexibly/lightweight in
| unittest cases, but as you mentioned it's never felt slow, it
| should be fine to working on it.
|
| And actually, more e2e cases I think it's way better to not use
| the lite backend.
|
| the non-container solutions would do more like the lifecycle
| mgmt/isolated env prep/tear-down with elegantly designed
| abstractions. While I think similar abstractions could be done
| on top of containers.
|
| Maybe we ideally could have unified abstractions on both
| container-based, wasm evantually to boost dx yet with different
| expectation of speed vs compatibility.
| globular-toast wrote:
| For Python specifically I use pytest-docker:
| https://pypi.org/project/pytest-docker/
| kinow wrote:
| That's exactly what we are using for our tests in a new pull
| request to add support to Postgres, https://github.com/BSC-
| ES/autosubmit/pull/2187
|
| The last GH Actions jobs with SQLite and Python 3.9 took 3m
| 41s, and the same tests with Postgres took 4m 11s. Running a
| single test locally in PyCharm also executes in less than 1
| second. You notice some bootstrap happening, but once the
| container image is downloaded locally, it's really quite fast.
| whalesalad wrote:
| It's remarkably easy to integrate. We have a tests/db module
| with a single conftest.py file that does everything. Every test
| suite run spins up a fresh container (no state), runs all
| migrations against it to prepare it for duty, and then the test
| suite continues. You could do this on a per-test basis for
| ultra isolation, but this works for us. That change would
| simply be to modify the `scope` value.
|
| Also a convenient `conn` fixture at the end which allows any
| test in the module to get a new db connection.
|
| linky:
| https://gist.github.com/whalesalad/6ecd284460ac3836a6c2b9ca8...
| samwillis wrote:
| Awesome work Wey! Love that you're building this!
|
| I work on PGlite, we have an experimental WASI build that can run
| in a WASI runtime, it should enable dropping the Node
| requirement. It lacks error handling at the moment (WASM has no
| long jump, and Postgres uses that for error handling - Emscripten
| has hacks that fix this via JS), and so we haven't yet pushed it
| far.
|
| Do ping me on our Discord and I can point you towards it.
|
| Happy to answers any PGlite questions while I'm here!
| jodiug wrote:
| This is great, running tests with a database covers a ton of
| behaviors that are otherwise hard to test, and having the
| database in memory allows CI/CD to run those tests without
| downloading and building containers.
|
| It does depend on SQLAlchemy. Can this also be used with asyncpg?
| Is it on the roadmap?
| wg0 wrote:
| What a library! Just five years ago couldn't have imagined
| Postgress could be running in browser.
|
| And now this.
|
| Going to use right away.
| lucideer wrote:
| This seems great but:
|
| > _The library automatically manages PGlite npm dependencies._
|
| I'm sorry what? Please don't do this. There is no way you can do
| this in a way that:
|
| (a) is compatible with every pre-existing in-repo NodeJS setup
|
| (b) plays nicely with all SCA automation configurations
| (Dependabot, etc.)
|
| ---
|
| Edit:
|
| After closer inspection there's a config[0] to disable this which
| defaults to True, but no documentation on how to manage required
| Node deps when setting that value to false.
|
| I would strongly suggest the authors default this to False with
| accompanying docs in the README
|
| Overall though I do like the simplicity & readability of the
| codebase - it makes these gotchas really easy to find & verify
| even without docs - will definitely be using this.
|
| [1] https://github.com/wey-gu/py-
| pglite/blob/b821cf2cfeb2e4bc58b...
| selimnairb wrote:
| I just use pytest-docker-compose, then I don't need to bother
| with NPM. I usually don't like "magic", but pytest's fixtures are
| so powerful I'm okay with a little bit of "magic".
| ptx wrote:
| How does running PostgreSQL compiled to WebAssembly reduce "the
| overhead of a full PostgreSQL installation"? Couldn't a native
| version be configured similarly and avoid the additional overhead
| of WebAssembly, Node.js and npm?
| CamouflagedKiwi wrote:
| Yes, it can. It's not especially hard to start up a Postgres
| instance, and with a couple of config tweaks you can improve
| the startup time. I've had this working nicely at a previous
| job, it's under a couple of seconds to start.
| wey-gu wrote:
| Thanks~
| WhyNotHugo wrote:
| I can't imagine a wasm running inside nodejs being faster than
| native code that's been optimised for decades.
|
| > No PostgreSQL install needed--just Node.js
|
| postgres is 32MB, nodejs is 63MB. I know that 31MB isn't a huge
| deal for most folks, but it's hard to see as "doesn't require
| postgres" as a selling point when you need something else that's
| twice the size instead.
| wey-gu wrote:
| Haha yeah, I put a (I know) in the readme
|
| > Effortless Setup: No PostgreSQL install needed--just
| Node.js(I know)!
|
| Was just to have kind of SQLite dx in 1 hour thus did so.
|
| And then I thought why not open source it?
|
| Maybe in v2 I could abstract actual binary with same dx
| TheTaytay wrote:
| This isn't because of the size. Frankly, this is appealing
| because we already use both python and node package managers,
| so not needing to reach for another binary install mechanism is
| really appealing.
| ForHackernews wrote:
| Is this about testing Python that _uses_ Postgres?
|
| Because if you're really interested in testing _postgres_ you can
| test PG in PG: https://pgtap.org/
| jacurtis wrote:
| It is for testing python projects that connect to postgres.
|
| So often what you do is when unit testing, you need to test
| CRUD-type operations between the application and the database.
| So you generally have to spin up a temporary database just to
| delete it at the end of the tests. Commonly this is done with
| SQLite during testing, even if you built the application for
| postgres in production. Because it is fast and easy and you can
| interact with it as a file, not requiring connection
| configurations and added bloat.
|
| But then sometimes your app gets complicated enough that you
| are using Postgres features that SQLite doesn't have
| comparables to. So now you need a PG instance just for testing,
| which is a headache.
|
| So this project bridges the gap. Giving you the feature
| completedness and consistency across envs of using postgres,
| but with the conveniences of using SQLite.
| benpacker wrote:
| I have this setup and integrated for Node/Bun -
|
| This is an example of a unit test of an API route on a fully
| isolated WASM backed Postgres - very few lines of code, and all
| your API unit tests can run fully in parallel without any shared
| state: https://github.com/ben-pr-p/bprp-react-router-
| starter/blob/m...
|
| This is all of the code needed to use Postgres in prod and PGLite
| in test/dev: https://github.com/ben-pr-p/bprp-react-router-
| starter/blob/m...
| wey-gu wrote:
| wow thanks! Should I use bun instead of node now?
| benpacker wrote:
| It's the same - just saying it works in both.
|
| I like it because I can do full stack development, including
| the database, with a single system level dependency (Bun).
| wey-gu wrote:
| ha, thanks, make sense
| peterldowns wrote:
| Thanks for sharing this, I'm not a Node/Bun dev but it's quite
| understandable. One question: does each test/database need to
| wait for migrations to run, or are you somehow using template
| databases to run them once and then quickly clone a new
| database for each test?
| perrygeo wrote:
| For Clojure and Java apps, check out Zonky
| (https://github.com/zonkyio/embedded-postgres). It provides a
| similar experience on the JVM, but instead of containers or WASM,
| you're running an embedded native binary.
| wey-gu wrote:
| Thanks! This is the ultimate shape I am going to pursue!
| buremba wrote:
| Amazing, this is what I was trying to find for the last few
| weeks! I wonder if it's possible to run WASM directly from Python
| instead of the subprocess approach, though?
| veggieroll wrote:
| Yo, this installs npm packages at runtime. Very not cool IMO. You
| should disclose this prominently in the README.
|
| This is a nice project idea. But, you should use a Python WASM
| interpreter to run the PostgreSQL WASM.
___________________________________________________________________
(page generated 2025-06-06 23:01 UTC)