[HN Gopher] Exploring performance differences between Amazon Aur...
       ___________________________________________________________________
        
       Exploring performance differences between Amazon Aurora and vanilla
       MySQL
        
       Author : bjacokes
       Score  : 83 points
       Date   : 2021-06-17 15:20 UTC (7 hours ago)
        
 (HTM) web link (plaid.com)
 (TXT) w3m dump (plaid.com)
        
       | wooly_bully wrote:
       | H3 tags on this could really use a bump in size and contrast from
       | the regular text.
        
       | whs wrote:
       | We had a similar problem where a running ETL job caused a
       | production outage due to binlog pressure.
       | 
       | One thing that surprised us: our TAM says that on a single-AZ,
       | write-heavy workload, normal MySQL would have higher performance,
       | as Aurora synchronously writes to storage servers in other AZs.
       | On an immediate read-after-write workload, that would mean it
       | takes longer to acquire a lock.
        
         | bjacokes wrote:
         | This seems plausible given our understanding of the database
         | internals. In general we found our AWS contacts to be
         | knowledgeable and forthcoming about complex tradeoffs between
         | Aurora and vanilla MySQL, even if some of that information is
         | hard or impossible to find in the docs.
        
         | frakkingcylons wrote:
         | > One thing that surprised us: our TAM says that on a single-AZ,
         | write-heavy workload, normal MySQL would have higher performance,
         | as Aurora synchronously writes to storage servers in other AZs
         | 
         | What is surprising about a multi-AZ database having higher
         | latency than one that runs in only one AZ?
        
           | bjacokes wrote:
           | From what I can tell, they provisioned their DB instance(s)
           | in a single AZ, but weren't aware that Aurora automatically
           | provisions its own storage and always uses multiple AZs. We
           | touch on the separation of compute and storage in the post.
           | 
           | I think the surprise is that it's not possible to have a
           | truly "single AZ" Aurora database, even though you might have
           | thought you provisioned your DB instances that way.
        
             | frakkingcylons wrote:
             | I see. I haven't used Aurora, but I have experience
             | running write-heavy workloads on RDS. EBS failures would
             | regularly (like monthly) cause our write latency to spike
             | up 3-5x. If Aurora's storage layer architecture is more
             | resilient to those types of problems, that seems like a
             | huge win.
        
             | goeiedaggoeie wrote:
             | Hopefully this is not a surprise if you are using Aurora.
             | The papers on the topic are very clear about how they scale
             | the storage.
        
       | roopawl wrote:
       | Every once in a while there is a well-written blog post about
       | database internals. Uber's Postgres-to-MySQL switch saga produced
       | a few of them. This one is pretty good too.
        
         | jeandenis wrote:
         | We worked closely with AWS on this (problem and blog) and they
         | were great and quite transparent. Glad it's interesting/useful
         | to you.
        
       | slownews45 wrote:
       | The simplest fix is probably read committed, especially if, like
       | many ETL jobs, you are just going to grab data with a single read
       | for further processing. Another option: use read committed and
       | omit the last 15 minutes of data if you are running long jobs, to
       | avoid churn at the end of tables / logs.
       | 
       | I see folks doing serializable reads for historic ETL jobs with
       | one read in the transaction - why? Is there some history / tool
       | issue I'm not familiar with?
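
A minimal sketch of the two suggestions above (read committed, plus skipping the most recent 15 minutes of rows), assuming a MySQL-compatible driver that accepts `%s` placeholders. The function names, the `events` table, and the `created_at` column are hypothetical and used only for illustration; Aurora read replicas may need additional session configuration that is not shown here.

```python
from datetime import datetime, timedelta, timezone


def etl_cutoff(now, lag_minutes=15):
    """Return the newest timestamp the ETL job should read up to."""
    return now - timedelta(minutes=lag_minutes)


def build_etl_statements(table, ts_column, now, lag_minutes=15):
    """Build (sql, params) pairs for a single-read ETL extract.

    The first statement lowers the session isolation level to
    READ COMMITTED; the second reads only rows older than the cutoff,
    so the job skips the rows that are still churning at the end of
    the table.
    """
    # Identifiers cannot be passed as bound parameters, so reject anything
    # that is not a plain name before interpolating it into the SQL text.
    for name in (table, ts_column):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")

    cutoff = etl_cutoff(now, lag_minutes)
    return [
        ("SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED", ()),
        (f"SELECT * FROM {table} WHERE {ts_column} < %s", (cutoff,)),
    ]


if __name__ == "__main__":
    # Hypothetical usage: print the statements instead of executing them.
    for sql, params in build_etl_statements(
        "events", "created_at", datetime.now(timezone.utc)
    ):
        print(sql, params)
```

Each pair could then be passed to a DB-API cursor's `execute(sql, params)` on the connection the ETL job uses.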
        
         | bjacokes wrote:
         | For Aurora MySQL, the default for read-only replicas is
         | repeatable read. As we mentioned towards the end of the post,
         | read committed support appears to have been introduced to
         | Aurora MySQL just last year. But you're right - now that it's
         | supported, switching to read committed is by far the easiest
         | fix.
         | 
         | No idea why people would be using serializable reads for ETL
         | jobs though! :O
        
           | slownews45 wrote:
           | My own guess was that some ETL jobs were really data
           | integrity jobs, in which case folks got used to needing
           | higher isolation levels across many reads to avoid false
           | positives in their cross-checks.
        
       ___________________________________________________________________
       (page generated 2021-06-17 23:01 UTC)