[HN Gopher] What we learned after I deleted the main production ...
___________________________________________________________________
What we learned after I deleted the main production database by
mistake
Author : fernandopess1
Score : 36 points
Date : 2022-09-19 20:19 UTC (2 hours ago)
(HTM) web link (medium.com)
(TXT) w3m dump (medium.com)
| sulam wrote:
| Being one click away from a DELETE vs a GET sounds like a serious
| foot-gun that I would wrap a check around. "Are you sure? This
| operation will delete 17M entries."
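| One way to get that check, as a minimal client-side sketch in Python
| (the URL and the count endpoint are made up, not from the article):
|
|     import requests
|
|     url = "https://search.internal.example.com/products"
|
|     # Show the blast radius before doing anything destructive.
|     count = requests.get(f"{url}/_count", timeout=10).json()["count"]
|     answer = input(f"Delete {count} entries? [y/N]: ")
|     if answer.lower() == "y":
|         requests.delete(url, timeout=60).raise_for_status()
|     else:
|         print("Aborted.")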
| fabian2k wrote:
| I'd be seriously scared of putting any production credentials
| with write access into my Postman/Insomnia/whatever. Those
| tools are meant for quickly experimenting with requests, they
| don't have any safety barriers.
| partdavid wrote:
| I mean, it shouldn't really be very easy to even _get_ a
 | read-write token to a production database, unless you're a
| correctly-launched instance of a publisher service. This
| screams to me that they're ignorant of, and probably very
| sloppy with, access control up and down their stack.
| layer8 wrote:
| This is the Postman HTTP method selection dropdown that you can
| see on the screenshots on this page ("GET"):
| https://learning.postman.com/docs/sending-requests/requests/...
|
| Postman doesn't know that sending a single DELETE request to
| that URL will delete 17 million records.
|
| Arguably, REST interfaces shouldn't allow deleting an entire
| collection with a single parameterless DELETE request.
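 | A rough sketch of that guard on the server side, here with Flask
 | (route and helper names are hypothetical):
 |
 |     from flask import Flask, abort, request
 |
 |     app = Flask(__name__)
 |
 |     def delete_by_ids(ids):
 |         ...  # stand-in for the real deletion logic
 |
 |     @app.route("/products", methods=["DELETE"])
 |     def delete_products():
 |         # Refuse a bare DELETE on the collection; require an explicit
 |         # filter so one stray request can't drop 17M records.
 |         ids = request.args.getlist("id")
 |         if not ids:
 |             abort(400, "refusing to delete the entire collection")
 |         delete_by_ids(ids)
 |         return "", 204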
| theptip wrote:
| Honestly I'd make the case for writing a simple python script
| for this kind of thing.
|
| `requests.get(url)` is a lot harder to mis-type as
| `requests.delete(url)`.
|
| At $dayjob we would sometimes do this sort of one-off request
| using Django ORM queries in the production shell, which could
| in principle do catastrophic things like delete the whole
| dataset if you typed `qs.delete()`. But if you write a one-off
| 10-line script, and have someone review the code, then you're
 | much less likely to make these sorts of "mis-click" errors.
|
| Obviously you need to find the right balance of safety rails
| vs. moving fast. It might not be a good return on investment to
| turn the slightly-risky daily 15-min ask into a safe 5-hour
| task. But I think with the right level of tooling you can make
| it into a 30 min task that you run/test in staging, and then
| execute in production by copy/pasting (rather than deploying a
| new release).
|
| I would say that the author did well by having a copilot;
| that's the other practice we used to avoid errors. But a
| copilot looking at a complex UI like Postman is much less
| helpful than looking at a small bit of code.
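 | Something like this minimal sketch (URL and IDs made up), which is
 | short enough to review and read-only by construction:
 |
 |     import requests
 |
 |     BASE = "https://catalog.internal.example.com"
 |
 |     def fetch_product(product_id):
 |         resp = requests.get(f"{BASE}/products/{product_id}", timeout=10)
 |         resp.raise_for_status()
 |         return resp.json()
 |
 |     # No requests.delete() anywhere, so a mis-click or a typo can't
 |     # turn this into a bulk delete.
 |     for pid in ("123", "456"):
 |         print(fetch_product(pid))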
| alexjplant wrote:
| I once worked on an app where the staging database was used for
| local testing, all devs used the same shared credentials with
| write access, and you switched environments by changing hosts
| file entries (!!!). This resulted in me accidentally nuking the
| staging database during my first week on the job because I ran a
| dev script containing some DROPs from my corporate Windows system
| and failed to flush the DNS cache.
|
| I had already called out how sub-optimal this entire setup was
| before the incident occurred but it rang hollow from then on
| since it sounded like me just trying to cover for my mistake. The
| footguns were only half-fixed by the time I ended up leaving some
| time later.
| Johnny555 wrote:
| _An old discussion arose about the need for backups. We had
| backups for most databases but no process was implemented for
| ElasticSearch databases. Also, that database was a read model and
| by definition, it wasn't the source of truth for anything. In
| theory, read models shouldn't have backups, they should be
| rebuilt fast enough that won't cause any or minimal impact in
| case of a major incident. Since read models usually have
| information inferred from somewhere else, It is debatable if they
| compensate for the associated monetary cost of maintaining
| regular backups_
|
| My biggest concern about restoring that Elasticsearch backup
| would be that the restored backup would be inconsistent with the
| real source of truth and it might be hard to reconcile to bring
| it up to date.
| soco wrote:
 | While everything there is true, why not have a backup anyway? I
 | have Elasticsearch backups and even used them once (successfully)
 | when I terraformed the index away. The delta was then sourced on
 | the fly.
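 | For reference, taking such a backup is only a couple of calls to
 | the Elasticsearch snapshot API; a minimal sketch via Python
 | requests (host, repository and index names are made up):
 |
 |     import requests
 |
 |     ES = "http://localhost:9200"
 |
 |     # Register a filesystem snapshot repository (the location must
 |     # be whitelisted via path.repo on the cluster).
 |     requests.put(f"{ES}/_snapshot/nightly", json={
 |         "type": "fs",
 |         "settings": {"location": "/mnt/es-backups"},
 |     }).raise_for_status()
 |
 |     # Snapshot the read-model index and wait for it to finish.
 |     requests.put(
 |         f"{ES}/_snapshot/nightly/products-2022-09-19",
 |         params={"wait_for_completion": "true"},
 |     ).raise_for_status()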
| antisthenes wrote:
 | The backup only needs to last until the production database is
 | rebuilt from the source of truth and swapped back in as the
 | current search database.
|
| In other words, it only has to be good enough for a few days
| (ideally - hours).
| glintik wrote:
| <<We had backups for most databases but no process was
| implemented for ElasticSearch databases.>> - that's all you need
| to know
| benjaminpv wrote:
| Funny to think that the issue here is just a relative of the
| '--preserve-root' default rm (now) has: it's easy to let the user
| apply the same actions to the branches of a hierarchy as to the
| leaves, but _should_ they?
|
| Pretty recently corporate changed something on my work laptop
| that resulted in a bunch of temporary files generated during the
| build getting redirected to OneDrive. I went in and nuked the
| temp files and shortly thereafter got a message from OD saying
| 'hey noticed you trashed a ton of files, did you mean to do
| that?'
|
| The developer side of me thought 'of course I did, duh' but I can
| imagine that's useful information for most users that made an
| innocent yet potentially costly mistake.
| duxup wrote:
| Having an endpoint that can just delete... everything seems kinda
| risky.
| SoftTalker wrote:
| > In the fifteen minutes I had before the next meeting, I quickly
| joined with one of my senior members to quickly access the live
| environment and perform the query.
|
| Don't do stuff in a rush like this. That's when I almost always
| make my worst mistakes. If there is a "business urgency" then
| cancel or get excused from the upcoming meeting so you can focus
| and work without that additional pressure. If the meeting is
| urgent, then do the other task afterwards.
| racl101 wrote:
 | Now this meeting will beget many more urgent meetings.
| PeterisP wrote:
| For me, an interesting statement was "However, it took 6 days to
| fetch all data for all 17 million products." In my experience of DB
| systems, 17 million entries is significant but not particularly
| large: it's something that fits in the RAM of a laptop and can be
| imported/exported/indexed/processed in minutes (if you do batch
| processing, not a separate transaction per entry), perhaps hours if
| the architecture is lousy, but certainly not days.
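| As a toy illustration of the batch-processing point with sqlite3
| (any RDBMS behaves similarly): one transaction and executemany()
| instead of a commit per entry.
|
|     import sqlite3
|
|     conn = sqlite3.connect(":memory:")
|     conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
|
|     rows = ((i, f"product-{i}") for i in range(1_000_000))
|     with conn:  # one transaction for the whole batch
|         conn.executemany("INSERT INTO products VALUES (?, ?)", rows)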
| thayne wrote:
| That kind of depends on how big each record is. And it sounds
| like these records are denormalized from multiple sources, so
| you probably have several transactions for each record. It's
| possible to do batching in that situation, but it definitely
| isn't always easy.
| fabian2k wrote:
| I think this is a very clear disadvantage of the microservice
| architecture they chose in this case, and the post does allude
| to that. To recreate this data they needed to query several
| different microservices that would not have been able to
| sustain a higher load.
|
 | If I calculated this right, the time they mention comes down to
 | about 30 items per second (17 million items over 6 days is roughly
 | 33 per second). That is maybe not unreasonable for something that
 | queries a whole bunch of services via HTTP, but it is kinda
 | ridiculous if you compare it to directly querying a single RDBMS.
|
 | You could probably fix this by scaling _everything_ horizontally,
 | if that is possible. But the real solution would be, as you say,
 | to have bulk processing capabilities.
| PeterisP wrote:
 | Yes, adding a "return X items" mode to the same microservices is
 | often a way to get a significant performance boost with only minor
 | changes: even if your main use case needs only one item, it
 | enables mass processing without incurring the immense overhead of
 | a separate request per item.
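 | A minimal sketch of such a batch route, here with Flask (names and
 | the size limit are made up):
 |
 |     from flask import Flask, jsonify, request
 |
 |     app = Flask(__name__)
 |
 |     def load_product(product_id):
 |         return {"id": product_id}  # stand-in for the existing lookup
 |
 |     @app.route("/products/_batch", methods=["POST"])
 |     def get_products_batch():
 |         # One request returns many items, so a rebuild doesn't pay
 |         # per-request overhead 17 million times.
 |         ids = request.get_json(force=True).get("ids", [])
 |         if len(ids) > 500:
 |             return jsonify(error="batch too large"), 400
 |         return jsonify(items=[load_product(i) for i in ids])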
| gtirloni wrote:
| _> Any kind of operation was done through an HTTP call, which
| you would otherwise do with a SQL script, in ElasticSearch, you
| would do an HTTP request_
|
| There you go.
| motoboi wrote:
| People, please don't post things on Medium, because it wants people
| to sign up. Use GitHub Pages or anything else really.
| ThunderSizzle wrote:
| I'm torn on this, honestly.
|
| We want an internet with less ads, but good writers deserve to
| get paid. They can get paid via Medium (though how much, I
 | don't know) through subscriptions. Is that worse than ads
| or newspapers?
| bachmeier wrote:
| What you say is true, but that doesn't mean it should be
| posted to HN. The purpose of this site is to discuss
| articles. This one's behind a paywall. Even if you sign up
| for a free account, you may have used up your two free
| articles per month. That invites people to comment without
| reading the article. That's not why HN exists. (I actually
| checked the comments hoping someone posted a copy of the
| article.)
| Victerius wrote:
| I enjoy good writing, but the only writing I'm willing to pay
| for is print books (I just bought a copy of J.R.R. Tolkien's
| "The Fall of Gondolin", the hardcover, illustrated one by
| HarperCollins). I don't want to pay for newspapers, for
| investigative journalism, or for long form article magazines
 | like The Atlantic or The New Yorker. Never mind Medium of all
 | places, because Medium has no barrier to entry. No
| gatekeeping (and, given how easy it is to merely write a
| blurb of text, I have rather high standards for what I choose
| to pay to read). I'd rather consume from the likes of Amazon
| and have them run these writing platforms (e.g. WaPo) at a
| loss. Which means I'm paying for writers, in the end, just in
| a very indirect way. This sits well with me.
|
| But if the choice before me was to pay for writers directly
| (like Medium), or let non-book writers as a profession
| disappear, I'd opt for the latter. You may criticize this
| attitude. I assume the responsibility for that and I'm being
| honest.
| jacooper wrote:
| There is also hashnode.com
| rlewkov wrote:
| Exactly. Won't read because it requires sign up.
| gumby wrote:
| archive.ph cuts through the medium paywall too.
| contravariant wrote:
| As does basic cookie hygiene.
|
| At least I assume that's what's happening, I haven't seen a
| medium paywall yet.
| baal80spam wrote:
| If you're on Firefox there's an extension to bypass this (only
| for Medium's free articles) -
| https://gitlab.com/magnolia1234/bypass-paywalls-firefox-clea...
| demindiro wrote:
| There is also LibRedirect[1] which automatically redirects to
| an alternative frontend.
|
| [1] https://github.com/libredirect/LibRedirect
| metadat wrote:
| I'm not keen on playing the browser plugin escalation game
| with fundamentally UX hostile sites like Medium. They clearly
| have no respect for the human being at the end of the line
| trying to simply read a document.
| thatguy0900 wrote:
| This is extremely melodramatic. They literally just want
| money so they don't have to run ads
___________________________________________________________________
(page generated 2022-09-19 23:00 UTC)