hngopher.com

       [HN Gopher] Test Postgres in Python Like SQLite
       ___________________________________________________________________
        
       Test Postgres in Python Like SQLite
        
       Author : wey-gu
       Score  : 144 points
       Date   : 2025-06-06 00:56 UTC (22 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | wey-gu wrote:
       | I need this in my use case, so built it after downstream e2e
       | verified working!
        
       | wey-gu wrote:
       | Roadmap of v1
       | 
       | https://github.com/wey-gu/py-pglite/issues/4
        
       | gleenn wrote:
       | I would love something like this for databases like Snowflake
       | where there isn't an easy way to test without hitting a prod SF
       | server
        
         | wey-gu wrote:
         | aha, SF is pure managed service, thus 100% local/or even lite-
         | local of SF seems not feasible, maybe clickhouse/databend could
         | be consider to enable such flexibility of lightweight testing?
        
         | wey-gu wrote:
         | This is also an option(verifies data schema, query in some way)
         | with SF SQL translated to duckdb SQL and testing it.
         | 
         | https://www.reddit.com/r/SQL/comments/10s4608/comment/j722e1...
        
         | buremba wrote:
         | I'm developing https://github.com/buremba/universql for this
         | use-case, though DuckDB is used under the hood.
        
       | murkt wrote:
       | I wonder if it's possible to compile Postgres directly into a
       | Python extension instead of WASM. Just import it and go forth, no
       | node dependency, nothing.
        
         | wey-gu wrote:
         | wow, thanks! it should be feasible! and as I recall there are
         | such thing from some databases(chromadb, milvus-lite) in the
         | py-first communities.
         | 
         | we could think big to someday do that within py-pglite project
         | actually.
         | 
         | let me put it as the roadmap of v2(much more work to do!)
        
           | heinrichhartman wrote:
           | Can you explain this? What is the compilation target? What is
           | the compiler? How does being a "python extension" help?
        
             | wey-gu wrote:
             | the target would be just baremetal binary with some changes
             | needed and the purpose was just to be python dx optimized
        
         | samwillis wrote:
         | This is something I've explored as part of my work on PGlite.
         | It's _possible_ but needs quite a bit of work, and would come
         | with some limitations until Postgres upstream make some
         | changes.
         | 
         | You will need to use an unrolled main loop similar to what we
         | have in PGlite, using the "single user mode" (you likely don't
         | want to fork sub processes like a normal Postgres). The
         | problems come with then trying to run multiple instances in a
         | single process, Postgres makes heavy use of global vars for
         | state (they can as they fork for each session), these would
         | clash if you had multiple instances. There is work happening to
         | possibly make Postgres multi-threaded, this will solve that
         | problem.
         | 
         | The long term ambition of the PGlite project is to create a
         | libpglite, a low level embedded Postgres with a C api, that
         | will enable all this. We quite far off though - happy to have
         | people join the project to help make it happen!
        
       | ewhauser421 wrote:
       | Nice! Ive been wanting to try this ever since PGLite came out.
       | 
       | Anything that won't work if you tried this as a drop in
       | replacement for a full PG instance in tests? Maybe extensions?
       | Any benchmarks?
        
       | laurencerowe wrote:
       | This is running pglite in a node subprocess. Why not just run
       | Postgres itself as a subprocess with a data directory in a
       | tempdir?
        
         | isoprophlex wrote:
         | Exactly; in the past I've had reasonable success with test
         | fixtures based on temp dirs and a templated docker compose
         | file. Just needs docker in the environment, which is not too
         | far fetched.
        
         | wey-gu wrote:
         | I think the ultimate version in such use case would be
         | carefully wire-up the baremetal one with ad-hoc in-mem-disk or
         | tempdir :), this could be a future backend of py-pglite(planned
         | in v2).
         | 
         | For now, it's more accessible for me to hack it in hours and it
         | works.
        
       | ArcaneMoose wrote:
       | I recently struggled with PGlite for a while and ended up using
       | full postgres via https://github.com/leinelissen/embedded-
       | postgres which worked like a charm!
        
         | hardwaresofton wrote:
         | TIL this exists, thank you! It's seems obvious in retrospect.
         | 
         | Would love to see a benchmark between this, pglite and
         | testcontainers.
        
         | wey-gu wrote:
         | Thanks, duno this before and it looks like the ultimate shape
         | py-pglite in js world~~
        
       | jauco wrote:
       | There's also https://testcontainers.com/ not sure about the speed
       | difference but testcontainers has never felt slow to me when
       | using it from node js unittests.
        
         | hardwaresofton wrote:
         | Ding ding ding. Testcontainers is a fanatstic way to write the
         | most important tests (arguably) for your app. Don't test E2E
         | with a mock, just use a real database.
         | 
         | If that feels hard to you (to set up your app pointing to
         | another DB, run a single E2E-testable part of your app with
         | it's own DB connection, etc), fix that.
        
           | maartenh wrote:
           | Yep. I do create fresh db's from a fixture db using
           | postgres's ability to create a database from a template. Very
           | quick, always correct.
        
           | 9rx wrote:
           | If you feel you need E2E tests, that means that you have
           | implementation details leaking across different areas of your
           | application. That's what you really need to fix. Bandaids may
           | be found useful, but even better is not cut yourself in the
           | first place.
        
             | glookler wrote:
             | If you just know when you aren't cutting yourself then why
             | do you need tests?
        
               | 9rx wrote:
               | Normally you would write tests at the user boundaries
               | (like a set of APIs) for the purposes of documenting the
               | behavioural intent for the user as to not leave
               | unspecified behaviour. While you could theoretically do
               | that in Word instead, testing, as we call it, offers a
               | couple of advantages:
               | 
               | 1. It is written in code, which is what coders want to
               | read. There is nothing worse than having to read
               | documentation in English (or insert your natural language
               | of choice).
               | 
               | 2. It is able to automatically prove that what is
               | documented is true. Programmers tend to have a penchant
               | for lying if you let them, so this is a surprisingly big
               | advantage.
               | 
               | You might not _need_ them per se, but unless you are
               | trying to be an asshole, it is socially expected that you
               | will be kind to your users by filling them in on
               | important details; to not leave them guessing.
               | 
               | Now, if your application only has one user boundary
               | (a.k.a. spaghetti code) then any tests you have will
               | technically end up being E2E tests by there being only
               | one end (although calling that E2E would be rather
               | silly), that is true, but if that's what your code looks
               | like then you're going to have trouble connecting to a
               | test database anyway, and per the earlier comment you
               | need to fix that for the sake of E2E tests. But if you
               | reach that point, you may as well go all the way and fix
               | it properly.
        
         | wey-gu wrote:
         | yeah, w/o py-pglite attempt this should be the only approach,
         | the pglite ideally could make it more flexibly/lightweight in
         | unittest cases, but as you mentioned it's never felt slow, it
         | should be fine to working on it.
         | 
         | And actually, more e2e cases I think it's way better to not use
         | the lite backend.
         | 
         | the non-container solutions would do more like the lifecycle
         | mgmt/isolated env prep/tear-down with elegantly designed
         | abstractions. While I think similar abstractions could be done
         | on top of containers.
         | 
         | Maybe we ideally could have unified abstractions on both
         | container-based, wasm evantually to boost dx yet with different
         | expectation of speed vs compatibility.
        
         | globular-toast wrote:
         | For Python specifically I use pytest-docker:
         | https://pypi.org/project/pytest-docker/
        
         | kinow wrote:
         | That's exactly what we are using for our tests in a new pull
         | request to add support to Postgres, https://github.com/BSC-
         | ES/autosubmit/pull/2187
         | 
         | The last GH Actions jobs with SQLite and Python 3.9 took 3m
         | 41s, and the same tests with Postgres took 4m 11s. Running a
         | single test locally in PyCharm also executes in less than 1
         | second. You notice some bootstrap happening, but once the
         | container image is downloaded locally, it's really quite fast.
        
         | whalesalad wrote:
         | It's remarkably easy to integrate. We have a tests/db module
         | with a single conftest.py file that does everything. Every test
         | suite run spins up a fresh container (no state), runs all
         | migrations against it to prepare it for duty, and then the test
         | suite continues. You could do this on a per-test basis for
         | ultra isolation, but this works for us. That change would
         | simply be to modify the `scope` value.
         | 
         | Also a convenient `conn` fixture at the end which allows any
         | test in the module to get a new db connection.
         | 
         | linky:
         | https://gist.github.com/whalesalad/6ecd284460ac3836a6c2b9ca8...
        
       | samwillis wrote:
       | Awesome work Wey! Love that you're building this!
       | 
       | I work on PGlite, we have an experimental WASI build that can run
       | in a WASI runtime, it should enable dropping the Node
       | requirement. It lacks error handling at the moment (WASM has no
       | long jump, and Postgres uses that for error handling - Emscripten
       | has hacks that fix this via JS), and so we haven't yet pushed it
       | far.
       | 
       | Do ping me on our Discord and I can point you towards it.
       | 
       | Happy to answers any PGlite questions while I'm here!
        
       | jodiug wrote:
       | This is great, running tests with a database covers a ton of
       | behaviors that are otherwise hard to test, and having the
       | database in memory allows CI/CD to run those tests without
       | downloading and building containers.
       | 
       | It does depend on SQLAlchemy. Can this also be used with asyncpg?
       | Is it on the roadmap?
        
       | wg0 wrote:
       | What a library! Just five years ago couldn't have imagined
       | Postgress could be running in browser.
       | 
       | And now this.
       | 
       | Going to use right away.
        
       | lucideer wrote:
       | This seems great but:
       | 
       | > _The library automatically manages PGlite npm dependencies._
       | 
       | I'm sorry what? Please don't do this. There is no way you can do
       | this in a way that:
       | 
       | (a) is compatible with every pre-existing in-repo NodeJS setup
       | 
       | (b) plays nicely with all SCA automation configurations
       | (Dependabot, etc.)
       | 
       | ---
       | 
       | Edit:
       | 
       | After closer inspection there's a config[0] to disable this which
       | defaults to True, but no documentation on how to manage required
       | Node deps when setting that value to false.
       | 
       | I would strongly suggest the authors default this to False with
       | accompanying docs in the README
       | 
       | Overall though I do like the simplicity & readability of the
       | codebase - it makes these gotchas really easy to find & verify
       | even without docs - will definitely be using this.
       | 
       | [1] https://github.com/wey-gu/py-
       | pglite/blob/b821cf2cfeb2e4bc58b...
        
       | selimnairb wrote:
       | I just use pytest-docker-compose, then I don't need to bother
       | with NPM. I usually don't like "magic", but pytest's fixtures are
       | so powerful I'm okay with a little bit of "magic".
        
       | ptx wrote:
       | How does running PostgreSQL compiled to WebAssembly reduce "the
       | overhead of a full PostgreSQL installation"? Couldn't a native
       | version be configured similarly and avoid the additional overhead
       | of WebAssembly, Node.js and npm?
        
         | CamouflagedKiwi wrote:
         | Yes, it can. It's not especially hard to start up a Postgres
         | instance, and with a couple of config tweaks you can improve
         | the startup time. I've had this working nicely at a previous
         | job, it's under a couple of seconds to start.
        
           | wey-gu wrote:
           | Thanks~
        
       | WhyNotHugo wrote:
       | I can't imagine a wasm running inside nodejs being faster than
       | native code that's been optimised for decades.
       | 
       | > No PostgreSQL install needed--just Node.js
       | 
       | postgres is 32MB, nodejs is 63MB. I know that 31MB isn't a huge
       | deal for most folks, but it's hard to see as "doesn't require
       | postgres" as a selling point when you need something else that's
       | twice the size instead.
        
         | wey-gu wrote:
         | Haha yeah, I put a (I know) in the readme
         | 
         | > Effortless Setup: No PostgreSQL install needed--just
         | Node.js(I know)!
         | 
         | Was just to have kind of SQLite dx in 1 hour thus did so.
         | 
         | And then I thought why not open source it?
         | 
         | Maybe in v2 I could abstract actual binary with same dx
        
         | TheTaytay wrote:
         | This isn't because of the size. Frankly, this is appealing
         | because we already use both python and node package managers,
         | so not needing to reach for another binary install mechanism is
         | really appealing.
        
       | ForHackernews wrote:
       | Is this about testing Python that _uses_ Postgres?
       | 
       | Because if you're really interested in testing _postgres_ you can
       | test PG in PG: https://pgtap.org/
        
         | jacurtis wrote:
         | It is for testing python projects that connect to postgres.
         | 
         | So often what you do is when unit testing, you need to test
         | CRUD-type operations between the application and the database.
         | So you generally have to spin up a temporary database just to
         | delete it at the end of the tests. Commonly this is done with
         | SQLite during testing, even if you built the application for
         | postgres in production. Because it is fast and easy and you can
         | interact with it as a file, not requiring connection
         | configurations and added bloat.
         | 
         | But then sometimes your app gets complicated enough that you
         | are using Postgres features that SQLite doesn't have
         | comparables to. So now you need a PG instance just for testing,
         | which is a headache.
         | 
         | So this project bridges the gap. Giving you the feature
         | completedness and consistency across envs of using postgres,
         | but with the conveniences of using SQLite.
        
       | benpacker wrote:
       | I have this setup and integrated for Node/Bun -
       | 
       | This is an example of a unit test of an API route on a fully
       | isolated WASM backed Postgres - very few lines of code, and all
       | your API unit tests can run fully in parallel without any shared
       | state: https://github.com/ben-pr-p/bprp-react-router-
       | starter/blob/m...
       | 
       | This is all of the code needed to use Postgres in prod and PGLite
       | in test/dev: https://github.com/ben-pr-p/bprp-react-router-
       | starter/blob/m...
        
         | wey-gu wrote:
         | wow thanks! Should I use bun instead of node now?
        
           | benpacker wrote:
           | It's the same - just saying it works in both.
           | 
           | I like it because I can do full stack development, including
           | the database, with a single system level dependency (Bun).
        
             | wey-gu wrote:
             | ha, thanks, make sense
        
         | peterldowns wrote:
         | Thanks for sharing this, I'm not a Node/Bun dev but it's quite
         | understandable. One question: does each test/database need to
         | wait for migrations to run, or are you somehow using template
         | databases to run them once and then quickly clone a new
         | database for each test?
        
       | perrygeo wrote:
       | For Clojure and Java apps, check out Zonky
       | (https://github.com/zonkyio/embedded-postgres). It provides a
       | similar experience on the JVM, but instead of containers or WASM,
       | you're running an embedded native binary.
        
         | wey-gu wrote:
         | Thanks! This is the ultimate shape I am going to pursue!
        
       | buremba wrote:
       | Amazing, this is what I was trying to find for the last few
       | weeks! I wonder if it's possible to run WASM directly from Python
       | instead of the subprocess approach, though?
        
       | veggieroll wrote:
       | Yo, this installs npm packages at runtime. Very not cool IMO. You
       | should disclose this prominently in the README.
       | 
       | This is a nice project idea. But, you should use a Python WASM
       | interpreter to run the PostgreSQL WASM.
        
       ___________________________________________________________________
       (page generated 2025-06-06 23:01 UTC)