[HN Gopher] Don't rely on IF NOT EXISTS for concurrent index cre...
___________________________________________________________________
Don't rely on IF NOT EXISTS for concurrent index creation in
PostgreSQL
Author : shayonj
Score : 54 points
Date : 2024-08-12 18:55 UTC (4 hours ago)
(HTM) web link (www.shayon.dev)
(TXT) w3m dump (www.shayon.dev)
| londons_explore wrote:
| Other long-running operations use transactions... Do the
| operation on a snapshot of the data, and then evaluate if it can
| be committed.
|
| Why isn't index creation done the same way?
| AmericanChopper wrote:
| CREATE INDEX CONCURRENTLY is not a transactional statement. If
| you want to create an index inside a transaction it will block
| all writes to the table until the index is created, which can
| take a rather long time on large tables.
| singron wrote:
| Concurrent index creation internally uses multiple
| transactions, so it can't itself be wrapped in a transaction.
|
| The docs explain the mechanics further:
| https://www.postgresql.org/docs/current/sql-createindex.html...
| zelphirkalt wrote:
| That immediately brings up the next question: Why are
| transactions not composable?
| londons_explore wrote:
| Ignoring everything but the outer transaction would be
| technically correct - wonder why it isn't supported?
| paulddraper wrote:
| They are?
|
| Your transaction isolation mode defines how they compose.
|
| Unless you are asking something different.
| paulddraper wrote:
| The entire point of CONCURRENTLY is to _not_ be in a single
| transaction.
|
| So you can interleave writes and (partially complete) indexing.
|
| If you want transactional index creation, just omit
| CONCURRENTLY.
| ars wrote:
| "PostgreSQL doesn't automatically remove the partially created
| index when the operation is aborted due to a lock timeout. This
| is intentional, as it allows for potential recovery or manual
| intervention"
|
| This argument doesn't make much sense. What's the usefulness of a
| partially created index? The entire rest of the article is
| working hard to explain how this invalid index is 100% useless
| and should be dropped!
|
| So this seems like a bug in PostgreSQL to me.
| shayonj wrote:
| I went looking in the postgres code and took it more as a
| "Feature request" where if the index building / table scanning,
| etc is interrupted for concurrent builds, PG could drop the
| index in the end. I can see how PG keeps it around however,
| since invalid indexes can be useful for administrators.
|
| However, for application developers - unless you are familiar
| with the internals, this comes off as a big surprise.
|
| Would love to learn more from folks who are more familiar with
| the internals.
| choilive wrote:
| Index creation can take a very long time for a large database,
| think hours or god forbid days. Restarting from scratch for an
| index that takes 18 hours to build is very painful.. doubly so
| back when concurrent index creation didn't exist.
| whartung wrote:
| Do what is the recovery/restart process for a partially
| created index?
|
| And how do you clean up after an aborted index build?
| asabil wrote:
| REINDEX INDEX CONCURRENTLY[1]
|
| [1]: https://www.postgresql.org/docs/current/sql-
| reindex.html
| Groxx wrote:
| Just to +1 this as a non-Postgres-user who briefly skimmed
| the docs: REINDEX INDEX CONCURRENTLY
|
| ^ it can be rebuilt.
|
| I'm honestly not sure if this is saving anything except re-
| _defining_ (i.e. the literal "create index" statement), but
| they seem definitely not _100%_ useless despite being
| invalid. Maybe 99% useless, but "drop it in the background"
| doesn't seem particularly better either - leaving
| intermediate state for an admin to tackle by hand seems
| reasonable.
| SoftTalker wrote:
| If your database is that big then it should probably be
| managed by someone competent to do so, not a bunch of
| developers who only know ActiveRecord.
| victorbjorklund wrote:
| You can use reindex to rebuild an invalid index. But many
| prefer to just drop it and try again (like the author does).
| I'm sure there are scenarios where reindex is a much better
| option so not a bug (maybe a bad default).
| AmericanChopper wrote:
| I'm pretty sure it's just because it allows you to retry the
| index creation using REINDEX INDEX CONCURRENTLY, which allows
| you to avoid the exclusive lock created by DROP INDEX, because
| you can't always DROP INDEX CONCURRENTLY
|
| https://www.postgresql.org/docs/current/sql-dropindex.html
| merb wrote:
| Actually the index can only fail because of a deadlock or a
| unique failure. Deadlocks are rare and are probably an
| Application bug and for unique failure you need to fix your
| database anyway and you either need to drop and recreate it or
| reinfect it (new stuff will use the index for unique checks
| even in invalid state but not for queries)
| andatki wrote:
| Hey Shayon. IF NOT EXISTS or the Active Record version
| "if_not_exists: true" can be pretty handy though when there's a
| valid index in production, to help drive schema definition
| consistency in all environments (prod, dev, CI, etc.). As you
| pointed out though, it only makes sense when the indexes being
| checked are valid.
|
| In my experience on large tables that are "busy", sometimes
| indexes need to be added manually first from a utility session,
| perhaps inside tmux/screen that's detached from while they are
| created. This could take hours for large tables. Then once done,
| and the index is valid, an Active Record migration can be sent
| out using "if_not_exists: true" to make sure it's applied
| everywhere.
|
| Your point that it could be misused unintentionally due to not
| knowing an index is INVALID is a good one, and I feel it should
| be part of how it works by default in Active Record. Had you
| considered trying to propose that to rails/rails? I would
| certainly support that PR (may be able to collaborate) and could
| add more examples and validation.
| shayonj wrote:
| Hey Andrew! It's great to see you here. I agree with everything
| you said, 100%. I'm honestly torn because I was the first one
| to recommend `if_not_exists` to engineers, but now I'm
| reconsidering because this can lead to very strange issues in
| production, especially for ActiveRecord applications.
| Maintaining an "invalid index reaper job" doesn't sound like a
| good long-term solution.
|
| I've suggested some proposals to Rails here and would love to
| propose a PR based on the solution that makes the most sense,
| too.
|
| https://github.com/rails/rails/issues/52583
| andatki wrote:
| Do you do any alerting for INVALID indexes? For example, by
| default PgHero will display them prominently and I believe
| PgAnalyze does as well. My thought is to put the energy into
| making INVALID indexes highly visible. Perhaps combined with
| a process step. 1. Any CREATE INDEX CONCURRENTLY migrations
| go out in their own deployment. 2. Any queries that depends
| on that index being present means a PR has a process step
| asking the author to verify that the index exists and is
| valid. That way you wouldn't have to lose if_not_exists.
|
| That all said, still would be cool to make this the default
| in Active Record. Nice idea!
| shayonj wrote:
| Yeah, def, good call.
|
| I have a job that runs nightly or weekly and performs a
| concurrent reindex on tables, which nicely covers cases
| like this.
|
| Given that lock timeouts are very much "rescuable," I'm
| also thinking about addressing the problem right at the
| start (when adding migrations) by implementing retries.
| This approach at least would eliminate the need for any
| follow-up steps by developers and reduce additional
| cognitive load.
|
| The ideal goal being "The system just works"(tm) :D
| andatki wrote:
| Makes sense. Check out the implementation of automatic
| lock timeout retries in Strong Migrations.
|
| Lock Timeout Retries [experimental]
| https://github.com/ankane/strong_migrations
| limaoscarjuliet wrote:
| Concurrent index creation or re-indexing is not all magic. It
| does scan the table multiple times and might lead to deadlocks if
| the table is heavily modified.
|
| Furthermore, failed concurrent index or re-index will leave
| invalid cursors behind that are not usable but incur the update
| costs. Of course, they consume the disk space.
|
| Use with caution.
| dogsledsleddog wrote:
| I hope I am missing something in this logic?:
|
| Broken indexes are intentionally left because indexing can be a
| long operation on a big db, so you may regret not just repairing
| them..
|
| Let's delete and recreate indexes all the time, just in case
| because they might be left over..
|
| Hope we don't encounter the reason they left broken indexes and
| wait a few hours for a rebuild?
| shayonj wrote:
| Reindex starts fresh on a temp index and then does a swap in
| the end - https://www.postgresql.org/docs/current/sql-
| reindex.html
| dogsledsleddog wrote:
| That's very helpful, but I still don't get the
| recommendation. The reindex seems to guarantee that any
| useful index will be used, the deletion and recreation seems
| to guarantee a period of bad performance as long as an index
| takes to build even if the old one was fine.
|
| (I.e. I could imagine a customer upgrading from a version
| that already backported an index to the new major version
| where it was added and deciding there's a performance
| regression.)
| kgeist wrote:
| We generally ban any IF NOT EXISTS in our team (we use MySQL)
| because it usually masks the real problem (why is the developer
| trying to create a table which already exists in the test
| environment? maybe there's a name conflict with another branch
| etc.)
| shayonj wrote:
| exactly - i missed to talk about this in the post. I think
| using IF NOT EXISTS can be a sign of a higher degree issue as
| well. Being more mindful can help.
| lisper wrote:
| > why is the developer trying to create a table which already
| exists
|
| Maybe idempotency was a design goal.
| kgeist wrote:
| We have a special table which remembers which migrations were
| already run. So creating a second migration which tries to
| create the table again is a sign something is wrong. It's
| probably not what the developer intended and can result in
| wrong assumptions about the table's schema, for example (they
| expected their version would be used but the new migration is
| actually skipped).
| arp242 wrote:
| In general I agree, but in MySQL/MariaDB this is quite
| tricky because you can't do a lot of stuff in transactions;
| definitely seen problems where someone does;
| -- migrate-2024-08-12-foo.sql create table [..]
| alter table [..] insert into migrations values
| ('migrate-2024-08-12-foo');
|
| And the create works, but the alter fails. And then you try
| to run it again and now the create fails because you have
| half a migration. Herp derp.
|
| And you often can't split those two statements either: that
| would be "half a migration" that won't work with the code.
|
| As far as I know there isn't really a good solution for
| this. It can be quite a pain.
|
| In PostgreSQL and SQLite this is not an issue as you can
| run all the above in a transaction. You indeed rarely need
| "if not exists" there.
| kgeist wrote:
| For that exact reason, we have a rule that every DDL
| migration must be a separate migration in the framework.
| So if migration #2 out of 3 fails, only 1 migration will
| be marked as completed. If we see migration failed in
| CI/CD logs, it's immediately clear what exactly failed,
| and we can quickly revert what has managed to run (by
| applying the corresponding down migration), to return the
| DB to the previous valid state (if we decide to abort the
| release and fix the migrations first). Or we can rerun
| individual DDL migrations again, if it's a timeout issue.
|
| We design migrations in such a way so that new
| schemas/data never conflict with already running (old)
| code. And new code is activated only after all migrations
| completed successfully. So in practice half completed
| migrations is rarely an issue for us.
|
| But yeah, it requires more boilerplate. However, there's
| fewer surprises, compared to IF NOT EXISTS.
| arp242 wrote:
| There's lots of cases where that doesn't really work, and
| can't logically work. An obvious example is:
| drop index old; create index new [..];
|
| There's tons of examples like this where things can kind-
| of work with enough effort, but also not really - if the
| query takes 5 minutes without that index it's just going
| to timeout, so not that different from having broken
| code, and not much you can do about that too. You really
| want those two combined to be atomic.
|
| And in cases of "create table new; drop table old" using
| the old table is often 1) a major hassle, and 2) often
| doesn't actually work all that well as it's not a tested
| path.
|
| Either way, it seems MariaDB does support transactions
| for all of this if you disable "autocommit mode", judging
| from their docs? I haven't used MySQL/MariaDB in quite a
| few years, but I've never seen migrations for them done
| well without tons of hassle and it's a major reason I've
| typically recommended PostgreSQL. It's just eliminates an
| entire class of common headaches and errors.
| kgeist wrote:
| For your index example, why not just do:
| create index new [..] drop index old;
|
| At every point, there's always an index to use. We
| usually have to split releases into several
| "subreleases":
|
| Release #1 (automatic migration step)
| - alter the table - add a new index which uses new
| columns (etc.) (code release step) - new code
| goes live and starts using the new index
|
| Release #2 (we call it "post release") -
| remove the old unused index
|
| Automation, CI/CD help here. Yeah, it's somewhat error
| prone, but you usually get used to it. Not saying it's
| perfect, but manageable.
|
| In any case, you can never make both SQL migration and
| code release atomic (if you use rolling updates with zero
| downtime) so you already have to split into such
| subreleases anyway.
| arp242 wrote:
| > Yeah, it's somewhat error prone, but you usually get
| used to it. Not saying it's perfect, but manageable.
|
| Of course; I'm just saying this is why people want
| idempotency and use "if not exists", especially in
| MariaDB/MySQL land.
| evanelias wrote:
| > it seems MariaDB does support transactions for all of
| this
|
| No, DDL cannot be used in a transaction in any version of
| MySQL or MariaDB.
|
| > I've typically recommended PostgreSQL. It's just
| eliminates an entire class of common headaches and errors
|
| That's fair, but it does create new headaches, for
| example the notion of an invalid index (as described in
| the post) simply does not exist in MySQL or MariaDB.
| Index creation does not block concurrent writes, and
| won't fail mid-stream due to a lock conflict with old row
| purging (equivalent to PG's vacuum). And if a DDL
| statement somehow does fail, e.g. due to an ill-timed
| host crash, it is fully cleaned up -- DDL is atomic, just
| not transactional.
|
| One way to avoid the "half a migration" problem you
| mentioned up-thread is to use declarative schema
| management, which doesn't require tracking completion
| state of "migrations" at all:
| https://www.skeema.io/blog/2019/01/18/declarative/
| marcosdumay wrote:
| There are tools that will break each statement into a
| different migration, and apply a larger grouping by the
| files you create.
|
| I have actually never seen people adopt those. And the
| "enterprise" tools people tend to circle around seem to
| create more failure modes than it helps solving. But I
| imagine somebody uses them somewhere.
|
| Without those tools, you can break your migration around
| the failure point and mark the first half as complete.
| You should have some way to mark them, because you always
| need some channel for "free" management of a database.
|
| Anyway, transactional DDL is great.
| 1a527dd5 wrote:
| We have a similar setup; but there are still scenarios
| where you need a IF NOT EXISTS. A classic example of a
| table on dev being close to empty, and a table on prod
| being huge and taking longer than the default timeout ~30
| seconds.
|
| Depending on what the timeout is (connection timeout,
| transaction timeout etc) the DDL operation _may_ still be
| running. But the migration "process" has failed. So you end
| up having to slightly modify the attempted migration to
| account for the fact the thing already exists.
|
| It's an increasingly rare occurrence, probably a few times
| a year. Like anything the more we do it, the better we get
| at pulling left and preventing issues in the first place.
| 1a527dd5 wrote:
| IF NOT EXISTS is incredibly useful to resolve the state of
| difference between environments.
|
| A classic example is a massive table on production, but an empty
| table on dev. Dev gets migrated fine, but prod fails for reasons.
| You now need to resolve it.
|
| Also, we have a convention (not enforced) that we let PostgreSQL
| name the relation (e.g PK, UQ, Index et al) for us.
|
| So we don't do:-
|
| CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_test_data ON
| test_table (data);
|
| We do:-
|
| CREATE INDEX CONCURRENTLY ON schema_name.table_name
| (column_name);
| lmm wrote:
| How do you do migrations that drop those indices if they're
| potentially differently-named on dev/prod?
___________________________________________________________________
(page generated 2024-08-12 23:00 UTC)