[HN Gopher] Safely Dropping MySQL Tables
___________________________________________________________________
Safely Dropping MySQL Tables
Author : mscccc
Score : 51 points
Date : 2022-07-25 17:09 UTC (5 hours ago)
(HTM) web link (planetscale.com)
(TXT) w3m dump (planetscale.com)
| skilled wrote:
| Would love to get some detailed, in-depth, and technically rich
| answers from people who upvoted this nonsensical article. It's
| literally a 100-word advertisement for a paid product.
| daenz wrote:
| I didn't upvote it, but I think the product does inspire
| discussion about why we don't have basic important features
| like _mandatory_ "last accessed" metadata on external state
| sources. Think of the number of hours of pain due to someone
| being "sure" a database/api/resource wasn't used, and then
| removing it. I'm not really put off by the advert for that
| reason.
| throwusawayus wrote:
| tracking access time (every read) is huge performance
| bottleneck. especially if reliably persisting this
|
| same reason why filesystems are often mounted with noatime
| attribute !
| daenz wrote:
| Yes, the sensible default for this feature would be opt-
| out, like noatime, in case you need the performance boost
| and understand the implications of the choice.
| tobyjsullivan wrote:
| The most generous hypothesis would be that Planetscale is an
| increasing popular product among the HN crowd - specifically,
| those who have to maintain large MySQL deployments. It seems
| like a very good product that is seeing something like viral
| adoption across those companies.
|
| The next ingredient would be if dropping MySQL tables is a
| relatively big pain (or known but unmanaged risk) that most of
| those people have experienced at some point.
|
| Then this feature would be exciting news for that not-so-small
| cohort and that seems sufficient for HN upvotes?
|
| I don't know if that's what happened today. I've heard about
| planetscale a few times but never used it. I can attest that
| dropping tables is always scary.
| throwusawayus wrote:
| article title is misleading! it is about planetscale only,
| not mysql in general. borderline click bait
|
| also posted by a former github person. not sure if poster is
| author or works at planetscale but a lot of ex github engg
| are there
| samlambert wrote:
| It wouldn't be a PlanetScale post without our biggest fan
| throwusawayus. Let me know if you want a hat.
| skilled wrote:
| I know of many great products that a lot of HN users use or
| have used in the past. Not once have I felt the need to
| submit their announcement for a new product feature because I
| feel like it would spark discussion or foster genuine
| inspiration.
|
| Don't get me wrong but clicking on this link I was expecting
| an actual detailed guide/explanation.
|
| There is nothing anyone can say to justify this being on the
| front page considering both the submission title and the
| actual content of the said submission.
| daenz wrote:
| >There is nothing anyone can say to justify this being on
| the front page
|
| There's a "hide" button on HN posts if downvoting them
| isn't enough for you.
| daenz wrote:
| Every source of state should have this feature. It is always
| inevitably needed.
| noasaservice wrote:
| This is about dropping objects in the MySQL database.
|
| It is not about getting rid of MySQL as a database.
| donatj wrote:
| To me this is one of the major values of monorepo. A quick grep
| of the entire ecosystem for a table name and bam, I can drop it.
| dotancohen wrote:
| I would argue the opposite, actually. The storage model or
| database layer should be in a single repo. And thus there would
| be less code to grep through.
| hinkley wrote:
| That doesn't work because that repo needs to expose an API
| that accesses that table, now you have to prove the API is
| never called, and you're back to searching the entire code
| base for that function call (hopefully you've used meaningful
| function names instead of DRY ones...)
| willcipriano wrote:
| Add: log.info("get_user_by_user_id has been deprecated,
| please use the alternative API.")
|
| Into the repo method, push up the change, redeploy the
| projects that depend on the library and wait a while and
| grep the logs to see if it ever gets used.
|
| Monorepo is easier though.
| hinkley wrote:
| Dude, that just tells you if the table is used on the
| happy path. It does not tell you if it's only used by a
| customer whose godson is the CEO's favorite kid. Or if
| customers only use it right before a federal deadline for
| filing forms, or at the end of the quarter when it's time
| to generate reports. Which you didn't collect for 2
| months.
| wereHamster wrote:
| won't work if you dynamically concatenate table names from
| strings! "select * from " + "use" + "rs;"
| Hamuko wrote:
| It also won't work if you worship Satan and let end users
| enter the table name dynamically, but who the fuck does this?
| hinkley wrote:
| Little Bobby Tables would like to have a word with you...
|
| I honestly wonder if you could make a functioning database
| API that either does not accept strings as arguments, or can
| detect string concatenation and reject it. Not just a builder
| pattern to greatly discourage it, but a straight up exception
| on bad input type. Bind variables or GTFO.
| Arnavion wrote:
| Sure you can. Just make your transport protocol only
| support taking in a stored procedure name and parameters
| for DMLs, and some typed representation for DDLs.
|
| But while that prevents people from concatenating strings
| to form DML queries as a whole, it obviously doesn't
| prevent the kind of concatenation wereHamster mentioned.
| hxtk wrote:
| The book "Building Secure and Reliable Systems" from
| Google's series on SRE actually talks about two examples of
| this in C++ and Go which forbid using anything but string
| literals in the query string of an SQL API.
|
| In Go, the solution was very tidy: it aliases string to an
| unexported internal type that consumers cannot instantiate.
| String literals can be coerced to that type, but variables
| that already have type information associated with them are
| rejected at compile time.
|
| The C++ solution was a bit more complicated and involved
| templates.
| thrashh wrote:
| You can probably do it trivially in any language with
| operator overloading or a very expressive type system
|
| However you would also probably want a language with some
| syntactic sugar that let users use your special string
| type easily otherwise the burden on users will be too
| high
| hxtk wrote:
| > However you would also probably want a language with
| some syntactic sugar that let users use your special
| string type easily otherwise the burden on users will be
| too high
|
| In the instance under discussion in that section of the
| book I'm referencing, allowing the user to instantiate
| the special string type was precisely the behavior
| intended to avoid. The idea being that in the following
| three examples, the first succeeds and the rest fail at
| compile time: Query("SELECT * FROM
| Users WHERE id = ?;") id := "6";
| Query("SELECT * FROM Users WHERE id = " + id + ";")
| id := 6; Query(fmt.Sprintf("SELECT * FROM Users WHERE id
| = %d;", id))
| itake wrote:
| Isn't the safe way to do this is to rename the table and watch
| for errors before performing the drop? Queries fail on the old
| table name and can easily be recovered by reverting the name
| change.
| samlambert wrote:
| With https://planetscale.com/features/rewind you can just drop
| the table and bring it back if you have errors.
| mathnode wrote:
| You are spot on. And more importantly, a roll back procedure.
|
| In this case, RENAME TABLE x TO y, and watch or wait for a
| process to fail in test/QA/UAT/etc.
|
| Have a DROP ready, but also a RENAME back.
|
| A production quality MySQL or MariaDB installation should have
| enabled binary logging and maybe delayed replication to handle
| any issues.
| robertlagrant wrote:
| The safest way is to migrate to Postgres and then do what you
| like with your MySQL instance.
|
| (I just wanted to win the prize for the most HNey comment
| ever!)
| itake wrote:
| OP's solution is objectively worse, because it only reports
| access recency. Just because the table was or wasn't used
| recently, doesn't mean its safe to drop.
|
| The code change may of landed the day before that disabled
| accessing that table, but the tool would warn that the table
| was still in use.
|
| A quarterly cron job may access the table and go undetected
| if you look purely at the recency.
|
| The rename strategy protects the data and offers easy
| rollback. This is a false heuristic.
| SnowHill9902 wrote:
| You are missing Rust somewhere.
| grogers wrote:
| If you happen to still be using it then you'd get a lot of
| errors right away though. If you are extremely paranoid you can
| do better by migrating to a new user that doesn't have access
| to the table. Switching to the new user would be a normal
| rolling deploy, so you'd only start with 1/N errors, and your
| normal rollback process gets automatically triggered.
|
| Probably not worth the hassle though.
| jjeaff wrote:
| Oh, that's clever. I have so rarely utilized the user
| management system in MySQL beyond just creating a user with
| the necessary permissions for each application. I hadn't ever
| thought of using new users to do canary deployments.
| jacquesm wrote:
| This is where the value of documentation comes in, as well as
| some process around your data model. If you need this to keep you
| safe you are doing something terribly wrong somewhere.
| siddontang wrote:
| Interesting feature, I think we can learn from this in our
| product.
|
| IMO, it is still not safe even we know there are no queries
| running for this table. You may still meet a scenario that when
| you type `drop table`, another guy begins to run the query at the
| same time.
|
| As the maintainer of another database, We have been trying our
| best to improve this scenario too.
|
| Early on, we have provided a feature called `recover table` to
| recover table immediately after you wrongly drop the table. But
| this still can't avoid affecting current running queries on this
| table.
|
| We call this problem `DDL affects DML`, and now we try to
| introduce a Table meta lock to guarantee that no any query is
| running when the DDL executed. We hope we can release this
| feature in the end of this year.
| rawoke083600 wrote:
| How does this relate to pt-tools ? percona mysql toolkit ?
| mnd999 wrote:
| So grepping the query logs then.
| Scarbutt wrote:
| That's too obvious.
___________________________________________________________________
(page generated 2022-07-25 23:01 UTC)