[HN Gopher] Ephemeral Postgres Databases
___________________________________________________________________
Ephemeral Postgres Databases
Author : sac2171
Score : 29 points
Date : 2021-11-17 01:05 UTC (21 hours ago)
(HTM) web link (eradman.com)
(TXT) w3m dump (eradman.com)
| mborch wrote:
| Another trick is to use (possibly nested) savepoints and rollback
| after each test, never actually committing any data.
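|
| A minimal sketch of the pattern in plain SQL (table and
| savepoint names are illustrative):
|
|     BEGIN;
|     SAVEPOINT test_case;  -- one per test; nest them for fixtures
|     INSERT INTO users (name) VALUES ('alice');
|     -- ... run the test's assertions against the uncommitted data ...
|     ROLLBACK TO SAVEPOINT test_case;  -- undo everything the test did
|     -- the outer transaction itself is never committed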
| Foobar8568 wrote:
| SQL Server has snapshot databases for something similar, and you
| don't have to play with transactions (which may or may not break
| everything in case of errors).
| ccakes wrote:
| The catch with this is you can't test any behaviour with
| deferred constraints. Any activity that doesn't take place
| until the commit will never run.
| mborch wrote:
| Should be possible with "SET CONSTRAINTS".
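|
| e.g. forcing the deferred checks to fire before the rollback (a
| sketch):
|
|     BEGIN;
|     SET CONSTRAINTS ALL DEFERRED;
|     -- ... test activity ...
|     SET CONSTRAINTS ALL IMMEDIATE;  -- deferred constraints are
|                                     -- checked here, just as they
|                                     -- would be at COMMIT
|     ROLLBACK;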
| pquerna wrote:
| At a $previous_job I basically also did what the post is
| describing.
|
| The "best" thing we did was actually using a "template
| database": https://www.postgresql.org/docs/14/manage-ag-
| templatedbs.htm...
|
| We would start a Postgres process, create a new database, and
| run all of our migrations and basic data bring-up. Then we would
| create a new database per test suite, using the one we had just
| migrated as the template.
|
| This meant the initial bring-up was a few seconds, but then
| each test suite would get a new database in a dozen
| milliseconds (IIRC).
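|
| A sketch of the flow in SQL (database names are illustrative):
|
|     CREATE DATABASE migrated_template;
|     -- ... run migrations and seed data against migrated_template ...
|
|     -- per test suite: a cheap file-level copy of the template
|     -- (the template must have no open connections when copied)
|     CREATE DATABASE suite_orders TEMPLATE migrated_template;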
| mborch wrote:
| That's probably more or less the same thing in terms of what
| actually happens in PostgreSQL.
|
| All things considered, an actual database probably gives you
| the least gray hair, but with some careful test setup I have
| had good success using the savepoint/rollback trick (and it
| trivially supports nested fixtures as well).
| latch wrote:
| > For some time proponents of unit testing have asserted that
| unit tests should not touch a real database
|
| Is that still a belief in some circles? I feel like the shift
| away from this started like 15 years ago (largely because of RoR
| in my mind).
|
| Anyways, this essentially launches a pg instance with a
| postgresql.conf that is tuned for speed (at the risk of possible
| data loss/corruption). DO NOT DO THIS IN PRODUCTION, but I just
| bake the following into local/test/CI:
|
|     fsync = off
|     synchronous_commit = off
|
| Some other things I've seen / done in the name of testing with
| PG:
|
| - Use unlogged tables. Maybe it's a bit faster; it never really
| seemed to make much of a difference for me (see the sketch after
| this list).
|
| - Drop all foreign keys. This has a significant non-performance
| impact. On the downside, it materially changes the environment
| your tests run in vs. your real system: tests could now pass
| for otherwise invalid code. On the plus side, it makes setting up
| fake data _a lot_ easier.
|
| - Run tests with a pool_size of 1. This catches cases where you
| start a transaction, but forget to use the transaction object,
| e.g.:
|
|     db.begin_transaction(tx ->
|         tx.query("update x")
|         db.query("delete y")  // should have been tx.query
|     )
|
| With only one connection in the pool, the stray db.query blocks
| (or errors) instead of silently running outside the transaction.
|
| - A robust factory library (NIH or OSS) is probably the most
| important thing you can invest in.
|
| - If you can't pre-seed all the test data, try to write _all_
| your tests without ever having to truncate/delete tables between
| tests. This (a) avoids the slow delete/truncate and (b) lets you
| run tests in parallel. This means using random keys to avoid
| duplicates (obviously uuid is the simple answer here) and not
| selecting all data from a table when verifying/asserting (since
| who knows what's going to be in there).
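|
| A sketch of the unlogged-table and random-key points above
| (schema is illustrative; gen_random_uuid() is built in from
| Postgres 13, older versions need pgcrypto):
|
|     -- unlogged: skips WAL writes; contents are lost on a crash,
|     -- which is fine for throwaway test data
|     CREATE UNLOGGED TABLE events (
|         -- random keys let parallel tests insert into the same
|         -- table without colliding
|         id      uuid PRIMARY KEY DEFAULT gen_random_uuid(),
|         payload text
|     );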
| gregplaysguitar wrote:
| > write _all_ your tests without ever having to truncate/delete
| tables between tests
|
| This is exactly what we do, and it works really well. We
| essentially spin up a new tenant for each test suite. Forcing
| all tests to pass regardless of what other unrelated data is in
| the db is a great way to ensure cross-tenant data isolation.
|
| The other underrated benefit is that you don't need a separate
| test database for local development - local dev instance and
| tests can use the same db. This means you can work on db
| changes in one place and see their impact on both tests and the
| running application.
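|
| A sketch of the idea, assuming every table carries a tenant_id
| column (table and tenant names are illustrative):
|
|     -- each suite creates its own tenant and scopes all reads and
|     -- writes to it, so rows from other suites (or from local
|     -- dev) never affect the outcome
|     INSERT INTO tenants (id, name)
|     VALUES (gen_random_uuid(), 'suite_orders');
|
|     SELECT count(*) FROM orders
|     WHERE tenant_id =
|           (SELECT id FROM tenants WHERE name = 'suite_orders');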
| rubenv wrote:
| Or in Go, with full databases: https://github.com/rubenv/pgtest/
| stansler wrote:
| Another approach is to use Database Lab
| (https://gitlab.com/postgres-ai/database-lab). Our tool allows
| deploying disposable Postgres databases in seconds using a REST
| API, CLI, or GUI. The difference is that we provide clones with
| snapshots of full-sized data. For example, you may have dozens of
| clones of your production databases (with masked sensitive data)
| for use as staging servers, database migration verification, and
| tests, all provisioned in a couple of seconds.
|
| More about using full-sized Postgres clones for migration testing
| and why that is important: https://postgres.ai/products/database-
| migration-testing
| kardianos wrote:
| I also do something like this, but with SQL Server. Highly
| recommended. Except rather than going from a seed or backup, the
| schema and default data are defined in the application; the test
| data is entered in the test setup, the tables are sorted
| topologically (parents before children), and everything is
| inserted all at once before the unit test runs (this can be
| combined with a table-driven test).
|
| Robust, highly recommended.
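|
| The topological ordering matters because a child row can't be
| inserted before the parent row its foreign key references. A
| tiny illustrative example:
|
|     -- orders references customers, so customers sorts first
|     INSERT INTO customers (id, name) VALUES (1, 'acme');
|     INSERT INTO orders (id, customer_id) VALUES (10, 1);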
| tln wrote:
| My test suite has the schema and seed data in a template
| database, then runs:
|
|     CREATE DATABASE testX TEMPLATE seed;
|
| ...which takes about 1 second per test run. The seed data and
| schema are baked into a Docker image, and recreated whenever there
| are new migration files. Starting the docker image is slow but
| that doesn't happen on every test run.
| brasetvik wrote:
| That's a pretty reasonable approach, and similar in spirit to
| what I find to work well:
|
| - Create a template with the necessary extensions, or just
| install them into "template1", which is the default template if
| you don't explicitly specify one. Installing PostGIS, for
| example, takes a few seconds - which is annoying if you create
| the schema from scratch in test runs. (`create extension if not
| exists postgis` can still be around in your schema, it'll just
| return right away.)
|
| - Create a template for the test session, based on the template
| you've pre-installed extensions in, and apply the schema there.
|
| - Create a database based on the second template for whatever
| scope makes sense for your test.
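|
| A sketch of the three layers in psql (database names are
| illustrative):
|
|     -- once per cluster: extensions land in template1, so every
|     -- new database inherits them
|     \c template1
|     CREATE EXTENSION IF NOT EXISTS postgis;
|
|     -- once per test session: a template with the schema applied
|     CREATE DATABASE session_tpl;  -- implicitly copies template1
|     \c session_tpl
|     -- ... apply the application schema here ...
|
|     -- per test scope: a fresh copy of the schema-bearing template
|     CREATE DATABASE test_1 TEMPLATE session_tpl;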
|
| If your Postgres cluster is only serving test workloads,
| `fsync=off` can speed things up as well. (As a stock
| postgresql.conf will point out, this can cause irrecoverable
| data loss - which I don't care about for test data.)
___________________________________________________________________
(page generated 2021-11-17 23:00 UTC)