[HN Gopher] Exploring performance differences between Amazon Aur...
___________________________________________________________________
Exploring performance differences between Amazon Aurora and vanilla
MySQL
Author : bjacokes
Score : 83 points
Date : 2021-06-17 15:20 UTC (7 hours ago)
(HTM) web link (plaid.com)
(TXT) w3m dump (plaid.com)
| wooly_bully wrote:
| H3 tags on this could really use a bump in size and contrast from
| the regular text.
| whs wrote:
| We had a similar problem where a running ETL job caused a
| production outage due to binlog pressure.
|
| One thing that surprised us was our TAM saying that on a
| single-AZ, write-heavy workload, normal MySQL would have higher
| performance, since Aurora synchronously writes to storage servers
| in other AZs. On an immediate read-after-write workload, that
| means it takes longer to acquire a lock.
| bjacokes wrote:
| This seems plausible given our understanding of the database
| internals. In general we found our AWS contacts to be
| knowledgeable and forthcoming about complex tradeoffs between
| Aurora and vanilla MySQL, even if some of that information is
| hard or impossible to find in the docs.
| frakkingcylons wrote:
| > One thing that surprised us was our TAM saying that on a
| single-AZ, write-heavy workload, normal MySQL would have higher
| performance, since Aurora synchronously writes to storage servers
| in other AZs
|
| What is surprising about a multi-AZ database having higher
| latency than one that runs in only one AZ?
| bjacokes wrote:
| From what I can tell, they provisioned their DB instance(s)
| in a single AZ, but weren't aware that Aurora automatically
| provisions its own storage and always uses multiple AZs. We
| touch on the separation of compute and storage in the post.
|
| I think the surprise is that it's not possible to have a
| truly "single AZ" Aurora database, even though you might have
| thought you provisioned your DB instances that way.
| frakkingcylons wrote:
| I see. I haven't used Aurora, but have had experience
| running write heavy workloads on RDS. EBS failures would
| regularly (like monthly) cause our write latency to spike
| up 3-5x. If Aurora's storage layer architecture is more
| resilient to those types of problems, that seems like a
| huge win.
| goeiedaggoeie wrote:
| Should not be a surprise if you are using Aurora hopefully.
| Papers on the topic are very clear on how they scale the
| storage.
| roopawl wrote:
| Every once in a while there is a well written blog post about
| database internal. Uber's Postgres-MySql switch saga produced a
| few of them. This one is pretty good too
| jeandenis wrote:
| We worked closely with AWS on this (problem and blog) and they
| were great and quite transparent. Glad it's interesting/useful
| to you.
| slownews45 wrote:
| The simplest is probably read committed especially if like many
| ETL jobs you are just going to grab stuff using one read for
| further processing. Another option, do a read committed and omit
| last 15 minutes of data if you are doing long running jobs to
| avoid churn at end of tables / logs.
|
| I see folks doing serializable reads for historic ETL jobs with
| one read in the transaction - why? Is there some history / tool
| issue I'm not familiar with?
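The one-read extraction with a 15-minute cutoff suggested above can be sketched as follows. This is a minimal illustration, not code from the post: the table and column names are placeholders, and the SQL is only composed as strings rather than run against a real MySQL server.

```python
from datetime import datetime, timedelta, timezone

def build_etl_read(table, ts_column, lag_minutes=15, now=None):
    """Compose the statements for a single-read ETL extraction:
    READ COMMITTED isolation plus a cutoff that omits the most
    recent rows to avoid churn at the end of the table. The table
    and column names are hypothetical placeholders."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(minutes=lag_minutes)
    return [
        "SET TRANSACTION ISOLATION LEVEL READ COMMITTED",
        f"SELECT * FROM {table} "
        f"WHERE {ts_column} < '{cutoff:%Y-%m-%d %H:%M:%S}'",
    ]

# Example: at 15:20 UTC the query reads only rows older than 15:05.
stmts = build_etl_read(
    "events", "created_at",
    now=datetime(2021, 6, 17, 15, 20, tzinfo=timezone.utc))
```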
| bjacokes wrote:
| For Aurora MySQL, the default for read-only replicas is
| repeatable read. As we mentioned towards the end of the post,
| read committed support appears to have been introduced to
| Aurora MySQL just last year. But you're right - now that it's
| supported, switching to read committed is by far the easiest
| fix.
|
| No idea why people would be using serializable reads for ETL
| jobs though! :O
| slownews45 wrote:
| My own guess is that some ETL jobs were really data integrity
| jobs - in which case folks may have gotten used to higher levels
| of isolation being necessary across many reads to avoid false
| positives in their cross-checks.
___________________________________________________________________
(page generated 2021-06-17 23:01 UTC)