[HN Gopher] Squeeze the hell out of the system you have
___________________________________________________________________
Squeeze the hell out of the system you have
Author : sbmsr
Score : 269 points
Date : 2023-08-11 18:18 UTC (4 hours ago)
(HTM) web link (blog.danslimmon.com)
(TXT) w3m dump (blog.danslimmon.com)
| javajosh wrote:
| I'll probably get down-voted for saying this (again), but a key
| way to squeeze unimaginable amounts of performance is to _lean
| into stored procedures_.
|
| Look, I get it, the devx sucks. And it feels like a proprietary,
| icky, COBOL-like experience. It means you have to _dwell_ in the
| database. What are you, a db admin?!
|
| But I'm telling you, the payoff is worth it. (and also, if you
| ship it you own it so yes you're a db admin). My company ran for
| many years on 3 machines, despite its extremely heavy page weight,
| because the original author wrote it with stored procs from the
| beginning. (He also liberally threw away data, which was great,
| but that's another post.) Part of my job was to migrate away from
| .NET and to Java and JavaScript - and another engineer wrote an
| ingenious tool that would generate Java bindings to SQL Server
| stored procs that made it really nice to work with them. And the
| performance really was outrageous - 100x better than any system
| I've worked with before or since. Those 3 boxes handled 300k very
| data intensive monthly actives, and that was like 10 years ago.
|
| Don't worry - even if you lean into SPs there is still plenty of
| engineering to do! It's just that your data layer will simplify,
| and your troubleshooting actually gets easier, not harder. I
| liked the custom bindings - a bit like ActiveRecord, and no ORM.
| But really, truly: if you want to squeeze, move some queries into
| SPs and prepare to be amazed.
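|
| To make that concrete: a minimal sketch of the pattern, assuming
| Postgres and psycopg2 rather than the SQL Server + generated Java
| bindings above; the table, column, and function names are all
| invented.
|
|     import psycopg2  # assumes a reachable Postgres instance
|
|     conn = psycopg2.connect("dbname=app")  # hypothetical DSN
|     with conn, conn.cursor() as cur:
|         # The query logic lives in the database, next to the data.
|         cur.execute("""
|             CREATE OR REPLACE FUNCTION recent_orders(p_user int,
|                                                      p_lim int)
|             RETURNS TABLE (order_id int, total_cents bigint) AS $$
|                 SELECT o.id, o.total_cents FROM orders o
|                 WHERE o.user_id = p_user
|                 ORDER BY o.created_at DESC LIMIT p_lim;
|             $$ LANGUAGE sql STABLE;
|         """)
|         # The app only knows the function's name and signature.
|         cur.execute("SELECT * FROM recent_orders(%s, %s)", (42, 10))
|         rows = cur.fetchall()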
| giantrobot wrote:
| I can't disagree with the results, SPs can change your life.
| HOWEVER, they require significant discipline and regular
| audits. All the code for them needs to be in source control
| with a Process for deployment to the DB. You also need a test
| suite as part of the Process which runs against a staging
| server with a comparable configuration to prod. The SPs need to
| be regularly dumped and compared against what's in source
| control and marked as what's supposed to be released in prod.
| [deleted]
| romafirst3 wrote:
| TLDR. We were going to completely rewrite our architecture but
| instead we optimized a few Postgres queries. LMAO
| endisneigh wrote:
| The bit on the database performance issues leads me to my
| hottest, flamiest take for new projects:
|
| - Design your application's hot path to never use joins. Storage
| is cheap, denormalize everything and update it all in a
| transaction. It's truly amazing how much faster everything is
| when you eliminate joins. For your ad-hoc queries you can
| replicate to another database for analytical purposes.
|
| On this note, I have mixed feelings about Amazon's DynamoDB, but
| one thing about it is that to use it properly you need to plan
| your usage first and your schema second. I think there's something
| you can take from this even with an RDBMS.
|
| In fact, I'd go as far as to say joins are unnecessary for
| nonanalytical purposes these days. Storage is so mind-bogglingly
| cheap and the major DBs have ACID properties. Just denormalize,
| forreal.
|
| - Use something more akin to UUIDs to prevent hot partitions.
| They're not a silver bullet and have their own downsides, but
| you'll already be used to the consistently "OK" performance that
| can be horizontally scaled rather than the great performance of
| say integers that will fall apart eventually.
|
| /hottakes
|
| my sun-level take would also be to just index all columns. but
| that'll have to wait for another day.
| [deleted]
| feoren wrote:
| There are "tall" applications and "wide" applications. Almost
| all advice you ever read about database design and optimization
| is for "tall" applications. Basically, it means that your
| application is only doing one single thing, and everything else
| is in service of that. Most of the big tech companies you can
| think of are tall. They have only a handful of really critical,
| driving concepts in their data model.
|
| Facebook really only has people, posts, and ads.
|
| Netflix really only has accounts and shows.
|
| Amazon (the product) really only has sellers, buyers, and
| products, with maybe a couple more behind the scene for
| logistics.
|
| The reason for this is that tall applications are _easy_.
| Much, much easier than wide applications, which are often
| called "enterprise". Enterprise software is bad because it's
| _hard_. This is where the most unexplored territory is. This is
| where untold riches lie. The existing players in this space are
| abysmally bad at it (Oracle, etc.). You will be too, if you
| enter it with a tall mindset.
|
| Advice like "never use joins" and "design around a single
| table" makes a lot of sense for tall applications. It's awful,
| terrible, very bad, no-good advice for wide applications. You
| see this occasionally when these very tall companies attempt to
| do literally anything other than their core competency: they
| fail miserably, because they're staffed with people who hold
| sacrosanct this kind of advice that _does not translate_ to the
| vast space of "wide" applications. Just realize that: your
| advice is for companies doing easy things who are already
| successful and have run out of low-hanging fruit. Even tall
| applications that aren't yet victims of their own success do
| not need to think about butchering their data model in service
| of performance. Only those who are already vastly successful
| and are trying to squeeze out the last juices of performance.
| But those are the people who _least need advice_. This kind of
| tall-centered advice, justified with "FAANG is doing it so you
| should too" and "but what about when you have a billion users?"
| is poisoning the minds of people who set off to do something
| more interesting than serve ads to billions of people.
| xyzzy123 wrote:
| Thanks I think this is a really interesting way to look at
| things.
|
| What is the market for "wide" applications though? It seems
| like any particular business can only really support one or
| two of them, for some that will be SAP and for others it
| might be Salesforce (if they don't need much ERP), or (as you
| mentioned) some giant semi homebrewed Oracle thing.
|
| Usually there is a legacy system which is failing but still
| runs the business, and a "next gen" system which is not ready
| yet (and might never be, because it only supports a small
| number of use cases from the old software and even with an
| army of BAs it's difficult to spec out all the things the old
| software is actually doing with any accuracy).
|
| Or am I not quite getting the idea?
| endisneigh wrote:
| I agree with your sentiment, but even enterprises work on
| multiple "tall" features.
|
| If they didn't, then I'd change my advice to be simply multi-
| tenant per customer and replicate into a column store for
| cross-customer analytics.
| pipe_connector wrote:
| I agree with the characterization of applications you've laid
| out and think everyone should consider whether they're
| working on a "tall" (most users use a narrow band of
| functionality) or a "wide" (most users use a mostly non-
| overlapping band of functionality) application.
|
| I also agree with your take that tall applications are
| generally easier to build engineering-wise.
|
| Where I disagree is that I think in general wide applications
| are failures in product design, even if profitable for a
| period of time. I've worked on a ton of wide applications,
| and each of them eventually became loathed by users and
| really hard to design features for. I think my advice would
| be to strive to build a tall application for as long as you
| can muster, because it means you understand your customers'
| problems better than anyone else.
| feoren wrote:
| > I've worked on a ton of wide applications, and each of
| them eventually became loathed by users and really hard to
| design features for.
|
| Yes, I agree that this is the fate of most. But I refuse to
| believe it's inevitable; rather, I think it comes from
| systemic flaws in our design thinking. Most of what we
| learn in a college database course, most of what we read
| online, most all ideas in this space, transfer poorly to
| "wide" design. People don't realize this because those
| approaches do work well for tall applications, and because
| they're regarded religiously. This is why I say wide applications
| are so much harder.
| veave wrote:
| >Design your application's hot path to never use joins. Storage
| is cheap, denormalize everything and update it all in a
| transaction. It's truly amazing how much faster everything is
| when you eliminate joins.
|
| Does anybody have documentation about this, with examples?
| newlisp wrote:
| Duplicate data to avoid joins, use serializable transactions
| to update all the duplicated data.
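|
| A rough sketch of what that looks like, assuming Postgres via
| psycopg2 (the tables and columns are invented): a user's name is
| copied onto their posts so reads never join, and one serializable
| transaction keeps both copies in step.
|
|     import psycopg2
|
|     conn = psycopg2.connect("dbname=app")  # hypothetical DSN
|     conn.set_session(isolation_level="SERIALIZABLE")
|
|     def rename_user(user_id, new_name):
|         # Source-of-truth row and duplicated column change in the
|         # same transaction; the block commits on success.
|         with conn, conn.cursor() as cur:
|             cur.execute("UPDATE users SET name = %s WHERE id = %s",
|                         (new_name, user_id))
|             cur.execute("UPDATE posts SET author_name = %s"
|                         " WHERE author_id = %s",
|                         (new_name, user_id))
|         # A serialization failure raises; the caller retries.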
| joshstrange wrote:
| See "Single Table Design" which I talked about in this
| comment above: https://news.ycombinator.com/item?id=37093357
| deely3 wrote:
| And if you don't want to spend money, you can get the basic
| idea from this article:
| https://www.alexdebrie.com/posts/dynamodb-single-table/
| tibbetts wrote:
| Premature denormalization is expensive complexity.
| Denormalization is a great tool, maybe an under-used tool. But
| you should wait until there are hot paths before using it.
| endisneigh wrote:
| I agree. To be clear I'm not suggesting anyone start
| denormalizing everything. I'm saying if you're fortunate
| enough to be on a greenfield project, you should design the schema
| around the access patterns, which will surely be "denormalized,"
| as opposed to designing a normalized schema and designing your
| access patterns around those.
| latchkey wrote:
| > _Design your application's hot path to never use joins._
|
| Grab (uber of asia) did this religiously and it created a ton
| of friction within the company due to the way the teams were
| laid out. It always required one team to add some sort of API
| that another team could take advantage of. Since the first team
| was so busy always implementing their own features, it created
| roadblocks with other teams and everyone started pointing
| fingers at each other to the point that nothing ever got done
| on time.
|
| Law of unintended consequences
| tedunangst wrote:
| Hard to follow the link. How would you join two tables
| between teams that don't communicate?
| latchkey wrote:
| You don't, that's the problem.
| endisneigh wrote:
| yes, this is a fair point. there's no free lunch after all.
| without knowing more about what happened with Grab I'd say
| you could mitigate some of that with good management and
| access patterns, though.
| latchkey wrote:
| All in all though, I don't think that 'never use joins' is
| a good solution either since it does create more developer
| work almost every way you slice it.
|
| I think the op's solution of looking more closely at the
| hot paths and solving for those is a far better solution
| than re-architecting the application in ways that could, or
| can, create unintended consequences. People don't consider
| that enough, at all.
|
| Don't forget that hot path resolution is the antithesis of
| 'premature optimization'.
|
| > you could mitigate some of that with good management and
| access patterns
|
| the CTO fired me for making those sorts of suggestions
| about better management, and then got fired himself a
| couple months later... ¯\_(ツ)_/¯ ... even with the macro
| events, their stock is down 72% since it opened, which
| doesn't surprise me in the least bit having been on the
| inside...
| taylodl wrote:
| My hot take: always use a materialized view or a stored
| procedure. _Hide the actual, physical tables from the
| Application's account!_
|
| The application doesn't need to know how the data is physically
| stored in the database. It specifies the logical view it needs
| of the data. The DBAs create the materialized view/stored
| procedure that's needed to implement that logical view.
|
| Since the application is _never_ directly accessing the
| underlying physical data, it can be changed to make the
| retrieval more efficient without affecting any of the database's
| users. You're also getting the experts to create the
| required data access for you in the fastest, most efficient way
| possible.
|
| We've been doing this for years now and it works great. It's
| alleviated so many headaches we used to have.
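|
| For anyone who hasn't seen the pattern, a minimal sketch in
| Postgres terms via psycopg2; the table, view, and role names are
| made up. The application's account only ever sees the view:
|
|     import psycopg2
|
|     conn = psycopg2.connect("dbname=app")  # hypothetical DSN
|     with conn, conn.cursor() as cur:
|         cur.execute("""
|             CREATE MATERIALIZED VIEW order_totals AS
|             SELECT user_id, sum(total_cents) AS total_cents
|             FROM orders GROUP BY user_id;
|         """)
|         # The application account never touches the base table.
|         cur.execute("REVOKE ALL ON orders FROM app_user;")
|         cur.execute("GRANT SELECT ON order_totals TO app_user;")
|         # DBAs can reshape 'orders' or refresh on their own
|         # schedule without the application noticing.
|         cur.execute("REFRESH MATERIALIZED VIEW order_totals;")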
| walterbell wrote:
| Interface contracts and indirection FTW.
|
| 2011, "Materialized Views" by Rada Chirkova and Jun Yang,
| https://dsf.berkeley.edu/cs286/papers/mv-fntdb2012.pdf
|
| _> We cover three fundamental problems: (1) maintaining
| materialized views efficiently when the base tables change,
| (2) using materialized views effectively to improve
| performance and availability, and (3) selecting which views
| to materialize. We also point out their connections to a few
| other areas in database research, illustrate the benefit of
| cross-pollination of ideas with these areas, and identify
| several directions for research on materialized views._
| downWidOutaFite wrote:
| This doesn't work because DBAs are rarely on the dev team's
| sprint schedule. If the DBAs are blocking them, devs can and
| will figure out how to route around the gatekeepers. In
| general, keep the logic in the app not the db.
| alfor wrote:
| But for saves (writes), is the structure visible?
| taylodl wrote:
| You can update underlying data via a materialized view.
| Scarbutt wrote:
| Normalization is not only about data storage but, most
| importantly, about data integrity.
| endisneigh wrote:
| Yes, but I assert that it's possible to use transactions to
| update everything consistently. Serializable transactions
| weren't really common when MySQL/Postgres _first_ came out,
| but now that they're common in new DBs + ACID, I think it's
| possible to do with reasonable difficulty. If you agree with
| this, then it's easy to prove that the performance increase from
| denormalized tables is well worth the annoyance of updating
| everything transactionally to keep the dependencies in sync.
|
| I won't say that it's trivial to update all of your business
| logic to do this, but I think it's definitely worth it for a
| new project at least.
| Guvante wrote:
| You always need to compare write vs read performance.
|
| Turning a single table update into a 10 table one could tip
| your lock contention to the point where you are write bound
| or worse start hitting retries.
|
| Certainly it makes sense to move rarely updated fields to
| where they are used.
|
| Similarly "build your table against your queries not your
| ideal data model" is always sage advice.
| Bognar wrote:
| Denormalized transactions are not trivial unless you are
| using serializable isolation level which will kill
| performance. If you don't use serializable isolation level,
| then you risk either running into deadlocks (which will
| kill performance) or inconsistency.
|
| Decent SQL databases offer materialized views, which
| probably give you what you want without all the headache of
| maintaining denormalized tables yourself.
| endisneigh wrote:
| all fair points, but to be fair I don't necessarily think
| this makes the most sense for an existing project, for the
| reasons you state. I do think a new project would best be
| able to design around the access patterns in a way that
| eliminates most of the downsides.
| williamdclt wrote:
| Transactions are not only (actually mainly not) about
| atomicity. Of course it's possible to keep data integrity
| without normalisation, but that means you need to maintain
| the invariants yourself at application level, and a bug
| could result in data inconsistency. Normalisation isn't
| there to make integrity possible, it's there to make (some)
| non-integrity impossible.
|
| Nobody says you have to have only one view of your data
| though. You can have a normalised view of your data to
| write, and another denormalised for fast reads (you usually
| have to, at scale). Something like event sourcing is
| another way (which is actually pushing invariants to
| application level, in a structured way)
| wizofaus wrote:
| Can't say I've ever come across a scenario where a join itself
| was the performance bottleneck. If there's any single principle
| I have observed, it's "don't let a table get too big". More often
| than not it's historical-record type tables that are the issue
| - but the amount of data you need for day-to-day operations is
| usually a tiny fraction of what's actually in the table, and
| you're bound to start finding operations on massive tables get
| slow no matter what indexes you have (and even the act of
| adding more indexes becomes problematic). And just indexing all
| columns isn't enough, for traditional RDBMSes at least - you
| have to index the right combinations of columns for them to be
| used. (Might be different for DynamoDB.)
| 8note wrote:
| Dynamo is quick for that, so long as you are picking good
| partition keys.
|
| Instead, it'll throw you hot key throttling if you start
| querying one partition too much
| wtetzner wrote:
| I'd say that probably depends on what your hot path is. If it's
| write-heavy, then you'll probably end up with performance
| issues when you need to write the same data to multiple tables
| in a single transaction. And if all of those columns are
| indexed, it'll be even worse.
| iamwil wrote:
| If you don't use joins, how do you associate records from two
| different tables when displaying the UI? Do you just join in
| the application? Or something else?
| endisneigh wrote:
| this has opinionated answers.
|
| if you ask Amazon, they might suggest that you design around
| a single table
| (https://aws.amazon.com/blogs/compute/creating-a-single-
| table...).
|
| in my opinion it's easier to use join tables, which are what's
| sometimes temporarily created when you do a join anyways.
| in this case, you permanently create table1, table2, and
| table1_join_table2, and keep all three in sync
| transactionally. when you need a join you just select on
| table1_join_table2. you might think this is a waste of space,
| but I'd argue storage is too cheap for you to be thinking
| about that.
|
| that being said, you really have to design around your access
| patterns, don't design your application around your schema.
| most people do the latter because it seems more natural. what
| this might mean in practice is that you do mockups of all of
| the expected pages and what data is necessary on each one.
| _then_ you design a schema that results in you never having
| to do joins on the majority, if not all, of them.
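|
| a tiny sketch of the permanent-join-table idea, using sqlite3
| only so it runs anywhere (all names invented; in production this
| would be your main RDBMS):
|
|     import sqlite3
|
|     db = sqlite3.connect(":memory:")
|     db.executescript("""
|         CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
|         CREATE TABLE orders (id INTEGER PRIMARY KEY,
|                              user_id INTEGER, total INTEGER);
|         -- precomputed "join table": everything a page needs
|         CREATE TABLE user_orders (order_id INTEGER PRIMARY KEY,
|             user_id INTEGER, user_name TEXT, total INTEGER);
|     """)
|
|     def place_order(order_id, user_id, total):
|         with db:  # one transaction keeps all three tables in sync
|             (name,) = db.execute(
|                 "SELECT name FROM users WHERE id = ?",
|                 (user_id,)).fetchone()
|             db.execute("INSERT INTO orders VALUES (?, ?, ?)",
|                        (order_id, user_id, total))
|             db.execute("INSERT INTO user_orders VALUES (?, ?, ?, ?)",
|                        (order_id, user_id, name, total))
|
|     with db:
|         db.execute("INSERT INTO users VALUES (1, 'ada')")
|     place_order(100, 1, 2500)
|     # render the page from user_orders alone: no join at read time
|     print(db.execute("SELECT * FROM user_orders").fetchall())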
| sainez wrote:
| > what this might mean in practice is that you do mockups
| of all of the expected pages and what data is necessary on
| each one. then you design a schema that results in you
| never having to do joins on the majority, if not all, of
| them.
|
| Great suggestion! I had a role where I helped a small team
| develop a full stack, data-heavy application. I felt pretty
| good about the individual layers but I felt we could have
| done a better job at achieving cohesion in the big picture.
| Do you have any resources where people think about these
| sorts of things deeply?
| walterbell wrote:
| 2001, "Denormalization effects on performance of RDBMS",
| by G. L. Sanders and Seungkyoon Shin,
| https://www.semanticscholar.org/paper/Denormalization-
| effect...
|
| _> We have suggested using denormalization as an
| intermediate step between logical and physical modeling,
| to be used as an analytic procedure for the design of the
| applications requirements criteria ... The guidelines and
| methodology presented are sufficiently general, and they
| can be applicable to most databases ... denormalization
| can enhance query performance when it is deployed with a
| complete understanding of application requirements._
|
| PDF: https://web.archive.org/web/20171201030308/https://p
| dfs.sema...
| endisneigh wrote:
| yeah, exactly. in my experience the vast majority of
| access patterns are designed around a normalized schema,
| where it really should be that the schema is designed
| around the access patterns and generously "denormalize"
| (which doesn't make sense in this context of a new
| database) as necessary.
| joshstrange wrote:
| Single Table Design is the way forward here. I can highly
| recommend The DynamoDB Book [0] and anything (talks, blogs,
| etc) that Rick Houlihan has put out. In previous discussions
| the author shared a coupon code ("HACKERNEWS") that will take
| $20-$50 off the cost depending on the package you buy. It
| worked earlier this year for me when I bought the book. It
| was very helpful and I referred back to it a number of times.
| This github repo [1] is also a wealth of information
| (maintained by the same guy who wrote the book).
|
| As an added data point I don't really like programming books
| but bought this since the data out there on Single Table
| Design was sparse or not well organized, it was worth every
| penny for me.
|
| [0] https://www.dynamodbbook.com/
|
| [1] https://github.com/alexdebrie/awesome-dynamodb
| deely3 wrote:
| And if you don't want to spend money, you can get the idea from
| this article:
|
| https://www.alexdebrie.com/posts/dynamodb-single-table/
|
| I'm really curious about real life performance on different
| databases, especially in situation where RAM is smaller
| than database size.
| wizofaus wrote:
| That article didn't appear to be suggesting single-table
| design was appropriate for general purpose RDBMSes (or
| any database other than DynamoDb).
| i_like_apis wrote:
| Yes I like the zero joins on hot paths approach. It can be hard
| to sell people on it. It's a great decision for scaling though.
| skybrian wrote:
| I'm wondering if indexes and materialized views can be used to
| do basically the same thing? That is, assuming they contain all
| the columns you want.
| latchkey wrote:
| The issue is writes, not reads.
| giantrobot wrote:
| There's always money in the banana sta...materialized views.
| Materialized views will get you quite a ways on read heavy
| workloads.
| macNchz wrote:
| Over the years I think I've encountered more pain from
| applications where the devs leaned on denormalization than from
| those that developed issues with join performance on large
| tables.
|
| You can mash those big joins into a materialized view or ETL
| them into a column store or whatever you need to fix
| performance later on, but once someone has copied the
| `subtotal_cents` column onto the Order, Invoice, Payment,
| NotificationEmail, and UserProfileRecentOrders models, and
| they're referenced and/or updated in 296 different
| places...it's a long road back to sanity.
| klodolph wrote:
| I have personally witnessed the "let's build microservices to get
| better performance" argument. I definitely want to nip that in
| the bud.
|
| It's easy to fall in love with complexity, especially since you
| see a lot of complexity in existing systems. But those systems
| became complex as they evolved to meet user needs, or for other
| reasons, over time. Complex systems are impressive, but you need
| to make sure that your team has people who recognize the heavy
| costs of complexity, and who can throw their engineering efforts
| directly against the most important problems your team faces.
| jsight wrote:
| I blame the easy availability of additional resources in the
| cloud for a lot of problems here. Prod db slow? Get a bigger EC2
| instance. Still slow? Hmm, maybe bigger again! Why bother tuning.
|
| Now... Who knows why our AWS bill is so high?
|
| With real hardware in a DC, you'd have to justify large capital
| expenditures to do something that stupid.
| gary_0 wrote:
| No mention of caching? If your database is getting hammered with
| SELECTs, isn't putting a cache in front of it something that
| should at least be considered?
| deathanatos wrote:
| I've been in the OP's situation, and this exact suggestion was
| made in my case. Welcome to one of the hardest problems in CS:
| cache invalidation.
|
| If you have a dataset for which cache invalidation is easy
| (e.g., data that is written and never updated), yeah,
| absolutely go for this.
|
| In our case, and most cases I've seen, it wasn't so simple, and
| "split this off to a DB better suited to it" was less complex
| (maybe still a lot of work, but conceptually _simple_) than
| figuring out cache invalidation.
| lern_too_spel wrote:
| There are systems that will do that for you like
| https://readyset.io/.
| Scarbutt wrote:
| They mentioned adding a DB replica for reads.
| jakey_bakey wrote:
| [The Grug Brained Developer](https://grugbrain.dev/)
| sssspppp wrote:
| Love this post. I've been trying to tell my manager the same
| message for the last few months (with little success). We're
| about to embark on a massive migration to "next-gen
| infrastructure" (read: three different Redshift clusters managed
| by CDK) because our overloaded Redshift cluster (already maxed
| out with RA3 nodes) has melted down one too many times. The next-
| gen infra is significantly more complex than our existing setup
| and I'm not convinced this migration will be the silver bullet
| everyone is hoping for.
| iamwil wrote:
| Ugh. I had a colleague who addressed any scaling problem by
| putting a cache in front of the DB. Praised for solving the
| immediate problem, but shouldered none of the costs. </rant>
|
| I admit in the face of finding product/market fit, you do the
| expedient thing, but damned if I'm not often at the receiving end
| of these sorts of decisions.
| aidos wrote:
| Interestingly, I often ask candidates about optimising a slow
| running db query and the majority of people jump to adding
| caching and very few ask if they can run an explain or see the
| indexes.
| tedunangst wrote:
| "I would make the slow query faster" seems too obvious an
| answer for an interview question.
| andrewstuart wrote:
| Isn't Rails wasteful in its database access patterns?
| iamwil wrote:
| Generally No. But it can be easy to write bad queries using
| ActiveRecord ORM if you're not aware of N + 1 problems.
| romafirst3 wrote:
| 100%, it makes it easy for bad programmers to write bad
| performing queries, but you can easily write performant code.
| Btw that's a feature - letting people ramp up to full db
| knowledge is beneficial, you don't want to be spending your
| time writing performant queries before you need to.
| topspin wrote:
| > it makes it easy for bad programmers to write bad
| performing queries
|
| That is true of every ORM in existence. The easiest thing
| to do is naively follow the object graph in code, because
| that's what the ORM gives you. If the ORM was to somehow
| add friction here to encourage some other approach it would
| be panned as "too hard!!1" and fade away into obscurity.
| Joel_Mckay wrote:
| The Monolith is often a marker of several naive assumptions.
|
| Yet some interesting patterns will emerge if teams accept some
| basic constraints:
|
| 1. A low-cpu-power client-process is identical to a resource
| taxed server-process
|
| 2. A system's client-server pattern will inevitably become
| functionally equivalent to inter-server traffic. Thus, the
| assumption all high performance systems degenerate into a hosted
| peer-to-peer model will counterintuitively generalize.
| Accordingly, if you accept this fact early, then one may avoid
| re-writing a code-base 3 times, and trying to reconcile a bodged
| API.
|
| 3. Forwarding meaningful information does not mean collecting
| verbose telemetry, then trying to use data-science to fix your
| business model later. Assume you will eventually either have
| high-latency queuing, or start pooling users into siloed
| contexts. In either case, the faulty idea of a single database
| shared-state will need to be seriously reconsidered at around 40k
| users, and later abandoned after around 13m users.
|
| 4. sharding only buys time at the cost of reliability. You may
| disagree, but one will need to restart a partitioned-cluster
| under heavy-load to understand why.
|
| 5. All complex systems fail in improbable ways. Eventually
| consistent is usually better than sometimes broken. Thus,
| solutions like Erlang/Elixir have been around for a while...
| perhaps the OTP offers a unique set of tradeoffs.
|
| 6. Everyone thinks these constraints don't apply at first. Thus,
| they will repeat the same tantalizing... yet terrible design
| choices... others have repeated for 40+ years.
|
| Good luck, =) J
| 39 wrote:
| Strangely obvious advice?
| iamwil wrote:
| That no one likes to follow.
| sfink wrote:
| Well, the advice is rarely taken in practice. It is (in my
| experience, and it seems common from others based on what I've
| heard) very very common to jump to the complicated solution at
| the first hint of capacity issues "because we'll need to do it
| eventually anyway."
|
| The advice is obvious when you're thinking at that level of
| abstraction. Which suggests that, in practice, people who are
| architecting such systems rarely think at that level of
| abstraction. Which is why it is nice to have posts like this,
| that periodically remind us to get our heads out of the daily
| minutiae and consider the bigger picture (of complexity
| tradeoffs, realistic projections, staffing and availability,
| etc.)
| bryanlarsen wrote:
| "Common sense is not so common."
|
| - Voltaire
| joelshep wrote:
| It might be obvious as far as it goes, but it's also incomplete
| in at least two ways. One is that tweaks, optimizations,
| and "supplementing the system in some way" often involve
| increasing its complexity, even if just a little bit at a time.
| It adds up with time. The more important thing is this: if
| you're already constrained on vertical scaling, and you don't
| have a firm grip on how fast your system is scaling, then you
| can't just stop with making the db more efficient. That's just
| postponing the inevitable, and possibly not for more than a
| couple of years. If you're in the position the author portrays,
| get the database under control first -- for sure -- but then
| get started on figuring out how you're going to stay in front
| of your scaling problem, whether that's rearchitecture, off-
| loading work to systems better suited for it, or whatever.
| Speaking as a former owner of a very large Amazon database that
| fought this battle many times, trying to buy enough dev time to
| build away from it before it completely collapsed. We were too
| content with performance improvements just like the ones
| described in this article, before finally recognizing we were
| just racing the clock.
| huijzer wrote:
| > We should always put off significant complexity increases as
| long as possible.
|
| Reminds me of the mantra I've read here: go for reversible
| things readily, and be very careful when going for irreversible
| things.
| sssspppp wrote:
| Amazon's one-way vs. two-way door decisions echo the sentiment.
| sainez wrote:
| It is mentioned in this article about the inception of AWS's
| custom silicon: https://semiconductor.substack.com/p/on-the-
| origins-of-aws-c...
|
| > "We use the terms one-way door and two-way door to describe
| the risk of a decision at Amazon. Two-way door decisions have a
| low cost of failure and are easy to undo, while one-way door
| decisions have a high cost of failure and are hard to undo. We
| make two-way door decisions quickly, knowing that speed of
| execution is key, but we make one-way door decisions slowly and
| far more deliberately."
| TX81Z wrote:
| Really curious how much can be attributed to using an ORM.
| kunalgupta wrote:
| I would definitely do the opposite of this - 3 months is a while
| and I think the cost of complexity would take a long time before
| it compared.
| phirschybar wrote:
| I agree with this approach. the other added benefit is that when
| they decided to optimize the app by eliminating or tuning queries
| and utilizing replicas for reads, they ultimately made the app
| much more performant while possibly reducing complexity. the
| "squeeze" mindset pays off in the long-run here. the continued
| optimization over time is infinitely better than adding the
| complexity of microservices or expanded infrastructure because
| the latter will simply bury and compound the potential
| optimizations which could AND SHOULD have been made. squeeze
| squeeze squeeze until you just can't squeeze any more!
| nathias wrote:
| Complexity in software is bad; things can be bad and necessary.
| It's bad in itself, but sometimes it can provide new
| functionality...
| alfalfasprout wrote:
| The problem is this is also a myopic way of looking at things.
| What you should be looking at is also operational complexity.
| What's the current burden on your org/team maintaining what you
| currently have? What about when you need to scale even higher?
|
| A lot of teams that think this way end up with really high oncall
| burdens and then never have the time to even iterate on their
| infrastructure.
| iblaine wrote:
| TL;DR: do the easy things first; in this case it was to fix bad
| SQL.
|
| Given the options to optimize SQL, move read operations to
| replicas, shard data or go towards micro services, optimizing SQL
| is the easy choice.
| bayindirh wrote:
| Actually, I disagree. The "TL;DR:" in the article is "first
| outgrow, then upgrade". In today's software development
| practice, efficiency is second class citizen, because moving
| fast and breaking things is the way to keep the momentum and be
| hip.
|
| However, sometimes everyone needs to chill and sharpen the tool
| they have at hand. It might prove much more capable than first
| anticipated. Or you may be holding the tool wrong to a degree.
| maxboone wrote:
| Relevant blog on improving PostgreSQL performance on ZFS:
| https://news.ycombinator.com/item?id=29647645
| notnmeyer wrote:
| haha, when i read their initial thoughts were write-sharding and
| microservices i whispered "wtf?" to myself.
|
| glad to see there was a better ending to the story though.
| discussDev wrote:
| It's the boring solution. It should also only be the default
| answer if you are not building a system that's super critical to
| life and limb. But it certainly gives a much lower total cost of
| ownership. If you don't have the resources for some big redundant
| system, I've too often seen the complexity added by the redundant
| system be the issue, rather than a focus on simplicity. If you
| need to add a bunch of people to support complexity but neither
| the money nor the risk assessment calls for it, simpler is much
| better. I won't say I haven't seen the case where eventually the
| only way forward was a huge project, but I tend to think sometimes
| even that is less than the sum of having dealt with complexity up
| to that point; it depends a lot on what you are building.
| alfor wrote:
| I wonder if moving the db to beefy dedicated hardware with tons
| of RAM and NVMe would solve the problem. Preferably physically
| connected to the web servers.
|
| Cost: a fraction of the developer cost.
|
| I see so many things done on the cloud that 10X their complexity
| because of it. Modern hardware is incredibly powerful.
| sakopov wrote:
| I thought I was going to read something insightful. Instead it
| was a post about how to completely ignore your database
| performance and then consider overcomplicating everything with
| sharding and microservices because you didn't care to do basic
| profiling on your queries. I'm glad common sense prevailed, but
| this is really some junior-level stuff and it's being celebrated
| as some kind of novelty.
| account-5 wrote:
| I suppose it seems obvious in hindsight that your first move
| should always be to investigate potential causes before a
| wholesale redesign that adds potentially unnecessary complexity
| to your system.
| exabrial wrote:
| This is amazing advice. A side note is to use the hell out of
| replication. These things don't have to be complicated. Set up a
| read-only and a read-write datasource/connection pool in your app
| if you have to.
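|
| A hedged sketch of that split with SQLAlchemy; the DSNs, table
| and column names below are placeholders:
|
|     from sqlalchemy import create_engine, text
|
|     # one pool per role: primary for writes, replica for reads
|     write_engine = create_engine(
|         "postgresql+psycopg2://app@primary/app")  # hypothetical
|     read_engine = create_engine(
|         "postgresql+psycopg2://app@replica/app")  # hypothetical
|
|     def fetch_dashboard(user_id):
|         # read-only, staleness-tolerant query -> replica
|         with read_engine.connect() as conn:
|             return conn.execute(
|                 text("SELECT * FROM dashboards WHERE user_id = :uid"),
|                 {"uid": user_id},
|             ).fetchall()
|
|     def record_event(user_id, kind):
|         # anything that writes -> primary, inside a transaction
|         with write_engine.begin() as conn:
|             conn.execute(
|                 text("INSERT INTO events (user_id, kind)"
|                      " VALUES (:uid, :kind)"),
|                 {"uid": user_id, "kind": kind},
|             )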
| i_like_apis wrote:
| I'm reminded of one of my favorite sayings:
|
| _You go to war with the army you have, not the army you might
| want or wish to have at a later time._
|
| You may want to ignore that this comes from Donald Rumsfeld
| (he has some great ones though: "unknown unknowns ...", etc.)
|
| I think about this a lot when working on teams. Not everyone is
| perfectly agreeable or has the same understanding or collective
| goals. Some may be suboptimal or prone to doing things you don't
| prefer. But having a team is better than no team, so find the
| best way to accomplish goals with the one you have.
|
| It applies to systems well too.
| fuzztester wrote:
| "No battle plan survives contact with the enemy."
|
| https://www.google.com/search?q=no+battle+plan+survives
| sbuk wrote:
| Mike Tyson said it more simply: "Everybody has a plan until
| you get hit in the face."
| fuzztester wrote:
| https://en.m.wikipedia.org/wiki/Helmuth_von_Moltke_the_Elder
|
| Moltke's thesis was that military strategy had to be
| understood as a system of options, since it was possible to
| plan only the beginning of a military operation. As a result,
| he considered the main task of military leaders to consist in
| the extensive preparation of all possible outcomes.[3] His
| thesis can be summed up by two statements, one famous and one
| less so, translated into English as "No plan of operations
| extends with certainty beyond the first encounter with the
| enemy's main strength" (or "no plan survives contact with the
| enemy") and "Strategy is a system of expedients".[18][8]
| Right before the Austro-Prussian War, Moltke was promoted to
| General of the Infantry.[8]
| makeitdouble wrote:
| I've been thinking about this quote for a while but have a hard
| time squeezing the meaning, or really the actionable part, out
| of it.
|
| The unknown unknowns quote brings the concept that however
| confident you are in a plan you absolutely need margin. The
| other quote though... what do you do differently when
| understanding that your team is not perfect?
|
| On one side, outside of VC backed startups I don't see
| companies trying to reinvent linux with a team of 4 new
| graduates. On the other side companies with really big goals
| will hire a bunch until they feel comfortable with their talent
| before "going to war". You'll see recruiting posts seeking
| specialists in a field before a company bets the farm on that
| specific field (imagine Facebook renaming itself to Meta before
| owning Oculus...nobody does that[0])
|
| Edit: sorry, I forgot some guy actually just did that 2 weeks
| ago with a major social platform. And I kinda wanted to forget
| about it I think.
| sainez wrote:
| Great point about working on teams. For the vast majority of
| tasks, people are only marginally better or worse than each
| other. A few people with decent communication will outpace a
| "star" any day of the week.
|
| I try to remind myself of this fact when I'm frustrated with
| other people. A bit of humility and gratitude go a long way.
| tedunangst wrote:
| Mattis "the enemy gets a vote" is another good reminder of
| reality, although people get very angry about it. Useful in
| terms of security, privacy, DRM, etc.
| walterbell wrote:
| Product management outside the box.
| Buttons840 wrote:
| I like a similar quote from Steven Pressfield:
|
| "The athlete knows the day will never come when he wakes up
| pain-free. He has to play hurt."
|
| This applies to ourselves more than our systems though.
| roughly wrote:
| Rumsfeld's got some great quotes, most of which were delivered
| in the context of explaining how the Iraq war turned into such
| a clusterfuck, and boy could that whole situation have used the
| kind of leadership Donald Rumsfeld's quotes would lead you to
| believe the man could've provided.
| xapata wrote:
| > could've
|
| If someone is 83.7% likely to provide good leadership, how
| would you evaluate the choice to hire that person as a leader
| in the hindsight that the person failed to provide good
| leadership -- was it a bad choice, or was it a good choice
| that was unlucky?
|
| (Likelihood was selected arbitrarily.)
| hluska wrote:
| Like everything in politics, I think this is a function of
| what team you cheer for. If your goal was to come up with
| an excuse to invade Iraq, that person was an excellent
| choice. If you're on the other team, what a clusterfuck.
|
| Then you add in a party system and it gets more
| complicated. Realistically, you don't get to be the United
| States Secretary of Defense (twice) if you're the kind of
| person who will ignore the will of the party and whoever is
| President.
| whatshisface wrote:
| > _quotes would lead you to believe_
| dragonwriter wrote:
| > Rumsfeld's got some great quotes, most of which were
| delivered in the context of explaining how the Iraq war
| turned into such a clusterfuck
|
| If by "explaining how" you mean "deflecting (often
| preemptively) responsibility for", yes.
| marcosdumay wrote:
| If I remember it correctly (it was a long time ago), he never
| fully supported the war. It didn't take a genius to notice
| that the goals set by the presidency were (literally)
| impossible, and not the kind of thing you achieve with a war.
|
| But whatever position he had, Iraq turning into a clusterfuck
| wasn't a sign of bad leadership on his part. It was a sign of
| bad ethics, but not leadership. His options were all of
| getting out of his position, disobeying the people above him,
| or leading the US into a clusterfuck.
| mickdeek86 wrote:
| Rumsfeld personally advanced the de-baathification
| directive - the lynchpin of the clusterfuckery - all on his
| own, and he certainly would have known to expect the
| 'unexpected' results to be similar to de-nazification. This
| was absolutely his choice. Another point you have
| (unintentionally?) brought up is the dignified resignation
| option. While it is often a naive, self-serving gesture, we
| can reasonably imagine that the Defense Secretary publicly
| resigning over opposition to a war during the public
| consideration of that war, might have had some effect on
| whether that war was started. I want to like him too, with
| his grandfatherly demeanor and genuine funniness ("My god,
| were there so many vases?!") but, come on.
| moffkalast wrote:
| Could've at least given them some motivational quotes.
| hluska wrote:
| I like to remind myself that very few people reach positions
| of great power after mediocre lives. Rather there's a thread
| of talent that runs through government.
|
| Once they're in, the predilections that led to power often
| rear their dark long tails. But they're all (even the ones I
| disagree with) talented.
| patmcc wrote:
| They're talented at getting into power, and may be talented
| at any number of other things.
|
| They're not always talented at the things we may want them
| to be, unfortunately. And that's true of both the ones I
| agree and disagree with.
| KnobbleMcKnees wrote:
| That was Donald Rumsfeld!? I always assumed this came from some
| techie or agile guru given how much it's used as a concept in
| project planning.
| a_seattle_ian wrote:
| That it came from Donald Rumsfeld in the context of what we
| know now and what he surely knew then is why it's such a good
| quote. The words basically say nothing but are also true
| about everything. So it can implicitly be a warning that there
| is probably some bullshit going on or someone has a sense of
| humor and is also warning people while also avoiding the
| subject - of course just my opinion. How people actually use
| it will depend on what the audience agrees it to mean.
| [deleted]
| midasuni wrote:
| And unknown unknowns is a great way to communicate with
| stakeholders too
| roughly wrote:
| Zizek has a followup to that quote:
|
| "What he forgot to add was the crucial fourth term: the
| "unknown knowns," the things we don't know that we know."
|
| I've found it's really critical during the project planning
| phase to get to not just where the boundaries of our
| knowledge are, but also what things we're either
| tacitly assuming or not even aware that we've assumed. An
| awful lot of postmortems I've been a part of have come down
| to "It didn't occur to us that could happen."
| munificent wrote:
| _> An awful lot of postmortems I've been a part of have
| come down to "It didn't occur to us that could happen." _
|
| Would that not be an unknown unknown?
| roughly wrote:
| Usually there's a tacit assumption of how the system
| works, how the users are using the system, or something
| else about the system or the environment that causes that
| - it's not that the answer wasn't known, it's that it was
| assumed to be something it wasn't and nobody realized
| that was an assumption and not a fact.
| thfuran wrote:
| That's just an unknown unknown masquerading as a known
| known.
| waprin wrote:
| I really enjoy the concept of unknown knowns, but I don't
| agree with your example, which is an unknown unknown.
|
| To me the corporate version of the unknown known is when
| a a project is certainly doomed, for reasons everyone on
| the ground knows about, yet nobody wants to say anything
| and be the messenger that inevitably gets killed, as long
| as the paycheck keeps clearing. An exec ten thousand feet
| from the ground sets a "vision" which can't be blown off
| course by minor details such as reality, until the day it
| does.
|
| Theranos is a famous example of this but I've had less
| extreme versions happen to me many times throughout my
| career.
|
| Another example of unknown knowns might be the conflict
| between a company's stated values (Focus on the User) and
| the unstated values that are often much more important
| (Make Lots of Money)
| killjoywashere wrote:
| As a military officer who was watching CNN live from inside
| an aircraft carrier (moored) when he said that, being in
| charge of anti-terrorism on the ship at the time, it was
| absolutely foundational to my approach to so many things
| after that. Here's the actual footage:
| https://www.youtube.com/watch?v=REWeBzGuzCc
|
| Rumsfeld was complicated, but there's no doubt he was very
| effective at leading the Department. I think most people fail
| to realize how sophisticated the Office of the Secretary of
| Defense is. Their resources make the mind reel, most of all the
| human capital, many with PhDs, many very savvy political
| operators with stunning operational experiences. As a small
| example, as I recall, Google's hallowed SRE system was
| developed by an engineer who had come up through the ranks of
| Navy nuclear power. That's but one small component reporting
| into OSD.
|
| Not a Rumsfeld apologist, by any means. Errol Morris did a
| good job showing the man for who he is, and it's not pretty
| (1). But reading HN comments opining about the leadership
| qualities of a Navy fighter pilot who was both the youngest
| and oldest SECDEF makes me realize how the Internet lets
| people indulge in a Dunning-Kruger situation the likes of
| which humanity has never seen.
|
| https://www.amazon.com/Known-Donald-Rumsfeld/dp/B00JGMJ914
| michael1999 wrote:
| I'll support you there. In any sensible reading of
| Nuremberg, they all deserve to hang from the neck until
| dead. But the central moral failure was Bush. Letting
| Cheney hijack the vp search, and then pairing him up with
| Rumsfeld was a bad move, and obviously bad at the time.
| Those two had established themselves as brilliant but
| paranoid kooks with their Team B fantasies in the 70s, and
| should never have been allowed free rein.
| oDot wrote:
| Every time I hear the name Rumsfeld, I am reminded of the time
| when, for over 10 minutes, he refused to deny being a lizard:
|
| https://www.youtube.com/watch?v=XH_34tqxAjA
| macNchz wrote:
| In my experience, in web apps built on top of ORMs there is often
| a TON of low hanging fruit for query optimization when database
| load becomes an issue. Beyond the basics of "do we have N+1
| issues", ORMs sometimes just don't generate optimal queries. I
| wouldn't want to build a complex production web app _without_ an
| ORM, but being able to eject from it sometimes is key.
|
| Profile real world queries being run in production that use the
| most resources. Take a look at them. Get a sense of the shape of
| the tables that they're running against. Sometimes the ORM will
| be using a join where you actually want a subquery. Sometimes the
| opposite. Sometimes you'll want to aggregate some results
| beforehand, or adjust the WHERE conditions in a complex join.
| I've seen situations where a semi-frequent ORM-generated query
| was murdering the DB, taking 20+ seconds to run, and with a few
| minor tweaks it would run in less than a second.
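|
| As a hedged illustration of "ejecting" for one hot spot (table
| names and the SQLAlchemy usage here are just an example, not the
| setup from any particular app):
|
|     from sqlalchemy import bindparam, create_engine, text
|
|     engine = create_engine(
|         "postgresql+psycopg2://app@db/app")  # hypothetical DSN
|
|     # ORM-ish shape: one query per user to count orders (N+1).
|     def order_counts_slow(conn, user_ids):
|         return {
|             uid: conn.execute(
|                 text("SELECT count(*) FROM orders"
|                      " WHERE user_id = :uid"),
|                 {"uid": uid},
|             ).scalar_one()
|             for uid in user_ids
|         }
|
|     # Hand-written replacement: one aggregate query for them all.
|     def order_counts_fast(conn, user_ids):
|         stmt = text(
|             "SELECT user_id, count(*) FROM orders"
|             " WHERE user_id IN :uids GROUP BY user_id"
|         ).bindparams(bindparam("uids", expanding=True))
|         rows = conn.execute(stmt, {"uids": list(user_ids)})
|         return dict(rows.all())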
| nerdponx wrote:
| I'm working on something right now with the Python ORM
| SQLAlchemy. It turns out that getting it to use RETURNING with
| INSERT is not trivial and requires you to set the non-obvious
| option `expire_on_commit=False`, which doesn't _guarantee_ use
| of RETURNING, but is supposed to use it if your db driver and
| database happen to support it and the ORM happens to support it
| for that particular combination of driver and database. And
| there 's no API to actually inspect the generated SQL even
| though it's emitted in the logs, so there's no way to enforce
| the use of RETURNING in your test suite without capturing and
| scraping your own logs (which fortunately is very easy within
| the Pytest framework).
|
| I like ORMs but this is just frustratingly complicated on so
| many levels. I also understand that SQLAlchemy is an enormous
| library and not everything will be easy. But I think this case
| exemplifies the trade-offs involved with using an ORM.
|
| (Yes I am aware that using insert() itself in Core does what I
| want, I'm talking about .add()-ing an ORM object to an
| AsyncSession).
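|
| For reference, the Core route mentioned above looks roughly like
| this (a sketch only; the engine URL and table are placeholders):
|
|     from sqlalchemy import (Column, Integer, MetaData, String,
|                             Table, create_engine, insert)
|
|     engine = create_engine(
|         "postgresql+psycopg2://app@db/app")  # hypothetical DSN
|     metadata = MetaData()
|     users = Table("users", metadata,
|                   Column("id", Integer, primary_key=True),
|                   Column("name", String))
|
|     with engine.begin() as conn:
|         # Core renders INSERT ... RETURNING id explicitly, so the
|         # behaviour doesn't hinge on session configuration.
|         new_id = conn.execute(
|             insert(users).returning(users.c.id), {"name": "ada"}
|         ).scalar_one()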
| bootsmann wrote:
| There is certainly an API to inspect your query; you can just
| call print() on the object, iirc.
| sheepz wrote:
| Agree wholeheartedly with the conclusion of the article.
|
| But the post makes it seem that there was no real query-level
| monitoring for the Postgres instance in place, other than perhaps
| the basic CPU/memory ones provided by the cloud provider. Using
| an ORM without this kind of monitoring is a sure way to shoot
| yourself in the foot with n+1 queries, queries not using
| indexes/missing indexes etc
|
| The other amazing thing is that everyone immediately reached
| for redesigning the system without analyzing the cause of the
| issues. A single postgres instance can do a lot!
| PeledYuval wrote:
| What's your recommended way of implementing this in a simple
| App Server <> Postgres architecture? Is there a good Postgres
| plugin or do you utilize something on the App side?
| clintonb wrote:
| We use Datadog, which centralizes logs and application
| traces, allowing us to better pinpoint the exact request/code
| path making the slow query.
| sheepz wrote:
| I've used pganalyze which is a non-free SaaS tool. Gives you
| a very good overview of where the DB time is spent with index
| suggestions etc. There are free alternatives, but require
| more work from you.
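|
| On the free side, the built-in pg_stat_statements extension gets
| you surprisingly far. A rough sketch (the extension must be
| enabled via shared_preload_libraries, and the column is
| total_time rather than total_exec_time before Postgres 13):
|
|     import psycopg2
|
|     conn = psycopg2.connect("dbname=app")  # hypothetical DSN
|     with conn, conn.cursor() as cur:
|         cur.execute(
|             "CREATE EXTENSION IF NOT EXISTS pg_stat_statements;")
|         # which normalized queries eat the most total time?
|         cur.execute("""
|             SELECT calls, round(total_exec_time) AS ms,
|                    left(query, 80) AS query
|             FROM pg_stat_statements
|             ORDER BY total_exec_time DESC LIMIT 10;
|         """)
|         for calls, ms, query in cur.fetchall():
|             print(calls, ms, query)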
| gillh wrote:
| Prioritized load shedding works well as a last resort [0]. The
| idea is simple -
|
| - Detect overload/congestion build-up at the database
|
| - Apply queueing at the gateway service and schedule requests
| based on their priority
|
| - Shed excess requests after a timeout
|
| [0]: https://docs.fluxninja.com/blog/protecting-postgresql-
| with-a...
| Xeoncross wrote:
| > The real cost of increased complexity - often the much larger
| cost - is attention.
|
| ...or just mental load. I'm tired of working on micro-service
| systems that still have downtime, but no one knows how it all
| works. Most are actually just distributed monoliths so changes
| often touch multiple services and have to be rolled out in order.
| Data has to be duplicated, tasks have to be synchronized, state
| has to be shared, etc...
|
| https://www.youtube.com/watch?v=y8OnoxKotPQ
| javajosh wrote:
| This is a very common architectural smell, when you have
| uservices and "no-one knows how they all work". The whole point
| is that no-one can or should know how they all work; the fact
| that someone has to, in order to fix or modify the system, is a
| strong signal that you've violated some of the rules - like
| single responsibility, and proper abstraction through API. But,
| in my experience, this is extremely common - debugging a
| pipeline of N microservices often requires running and building
| all N services locally. This is, strictly speaking, a monolith
| + network partitions + (infinite) build/deploy variation. An
| extremely challenging work environment that is ultimately
| beyond any mortal programmer's ability, IMO.
| mandevil wrote:
| He says to avoid complexity, and the team he was on (cleaning up
| some bad queries) was probably improving along that axis (or at
| worst orthogonal to complexity) but, from having done exactly
| this, adding an optional 'query the read-replica' option for
| queries- and determining whether this query can safely go there-
| is definitely extra complexity which will now need to be managed
| into the future. Definitely less overall than a complete re-
| architecting of the system, but this is where engineering judgement
| and experience come into play: would you be better off getting
| those select queries resolved with some other data store or with
| a pg read-replica? If your query can survive against the read-
| replica (so stale data is at least sometimes acceptable) would
| you be better off caching results in redis?
| gwbas1c wrote:
| > If your query can survive against the read-replica (so stale
| data is at least sometimes acceptable) would you be better off
| caching results in redis?
|
| Caching adds a lot of complexity. It denormalizes the data, and
| now you "need to know" when to update the cache. Because "the
| single source of truth" is no longer maintained, it's easy to
| accidentally add regressions.
|
| If it's a matter of adding a read replica, that's a much better
| solution, long-term, because you don't have the effort of "does
| this query also need to update the cache?"
|
| (I'd think by now there would be a way to expose events in a DB
| when certain tables are updated; and then (semi) automatically
| invalidate the cache.)
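|
| (Postgres does expose something close to this: a trigger can call
| pg_notify and the app can LISTEN for it. A rough sketch with
| psycopg2, all names invented; EXECUTE FUNCTION needs Postgres
| 11+.)
|
|     import select
|     import psycopg2
|     import psycopg2.extensions
|
|     conn = psycopg2.connect("dbname=app")  # hypothetical DSN
|     conn.set_isolation_level(
|         psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
|     cur = conn.cursor()
|
|     # announce every change to the 'orders' table
|     cur.execute("""
|         CREATE FUNCTION notify_orders() RETURNS trigger AS $$
|         BEGIN
|             PERFORM pg_notify('orders_changed', NEW.id::text);
|             RETURN NEW;
|         END; $$ LANGUAGE plpgsql;
|
|         CREATE TRIGGER orders_changed AFTER INSERT OR UPDATE
|         ON orders FOR EACH ROW EXECUTE FUNCTION notify_orders();
|     """)
|
|     cur.execute("LISTEN orders_changed;")
|     while True:  # the cache layer evicts keys as events arrive
|         if select.select([conn], [], [], 5) == ([], [], []):
|             continue  # timed out; poll again
|         conn.poll()
|         while conn.notifies:
|             note = conn.notifies.pop(0)
|             print("invalidate cache key:", note.payload)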
| pluto_modadic wrote:
| ah... rails easy mode discovers rails is only performant if you
| don't stray too far from hello world...
| dbg31415 wrote:
| I feel like a lot of people do this instead of upgrading to newer
| versions, even maintenance patches.
|
| And I get that upgrades can be scary, but often they are
| relatively low cost.
|
| Leaving everyone on the old system unhappy... means they will
| eventually push to re-platform, or rebuild, instead of just doing
| suggested maintenance along the way to keep the system they have
| in good shape.
|
| My advice... do the maintenance. Do all the maintenance! Don't
| just drive it into the ground and get mad when it breaks; change
| the oil and tires and spring for a car wash and some new wiper
| blades every now and then and you'll be happier in the long run.
| zengid wrote:
| The solution they went with, squeezing juice out of the system by
| finding performance optimizations, brings me so much joy.
|
| It reminds me of Richard L. Sites's book
| _Understanding_Software_Dynamics_ where he basically teaches how
| to measure and fix latency issues, and how at large scales,
| reducing latency can have tremendous savings.
|
| Measuring and reasoning about those issues are hard, but the
| solutions are often simple. For example, on page 9 he mentions
| that _" [a] simple change paid for 10 years of my salary."_
|
| I hope to someday make such an impactful optimization!
| canucker2016 wrote:
| The problem I have with their eventual solution is that they
| only optimized their queries AFTER they had upgraded their
| instance to the largest config available.
|
| They couldn't upgrade their config with a few clicks in the
| admin console anymore (I'm guessing what's involved here) so
| now they had to use actual grey matter to fix their capacity
| problem. Maybe if they had spent more time optimizing specific
| parts of their code, they wouldn't even need such a large
| config instance.
| fritzo wrote:
| > since our work touched many parts of the codebase and demanded
| collaboration with lots of different devs, we now have a strong
| distributed knowledge base about the existing system
|
| Great to see this cultural side-effect called out.
| WallyFunk wrote:
| > Of course, I'm not saying complexity is bad. It's necessary.
|
| Weird thing about computers, even after a fresh install of your
| favorite OS, the whole thing is sitting on a mountain of
| complexity, and that's _before_ you start installing programs,
| browse the web, etc
|
| Only the die-hard use things like MINIX[0] to do their computing.
| Correction: MINIX is in the Intel Management Engine so you have
| /two/ computers.
|
| [0] https://en.wikipedia.org/wiki/Minix
| deathanatos wrote:
| > _Split up the monolith into multiple interconnected services,
| each with its own data store that could be scaled on its own
| terms._
|
| Just to note: you don't have to split out all the possible
| microservices at this juncture. You can ask, "what split would
| have the most impact?"
|
| In my case, we split out some timeseries data from Mongo into
| Cassandra. Cass's table structure was a much better fit -- that
| dataset had a well defined schema, so Cass could pack the data
| much more efficiently; for that subset, we didn't need the
| flexibility of JSON docs. And it was the bulk of our data, and so
| Mongo was quite happy after that. Only a single split was
| required. (And technically, we were a monolith before and after:
| the same service just ended up writing to two databases.)
|
| Ironically, later, an armchair architect wanted to merge all the
| data into a JSON document store, which resulted in numerous
| "we've been down that road, and we know where it goes" type
| discussions.
| kedean wrote:
| Funny enough I frequently have the opposite problem, justifying
| repeatedly why Cassandra is a bad fit for relatively short
| lived, frequently updated data (tombstones, baby).
| deathanatos wrote:
| I'd agree with you there.
|
| The specific data that went into Cassandra in our case was
| basically immutable. (And somehow, IIRC, we _still_ had
| issues around tombstones. I am not a fan of them.) Cassandra's
| tooling left much to be desired around inspecting the
| exact state of tombstones within the cluster.
| kreetx wrote:
| In a way, in the article they also did a split: specific heavy
| select queries were offloaded to a replica.
| agentultra wrote:
| They could probably squeeze more depending on their workload
| patterns. RDBMSes typically optimize for fast/convenient
| writes. If your write load would be fine with a small
| increase in latency then you can do a lot of de-normalization
| so that your reads can avoid using tonnes of joins,
| aggregates, windows, etc at read-time. Update write path so
| that you update all of the de-normalized views at write time.
|
| Depending on your read load and application structure you can
| get a lot more scale with caching.
|
| Decent article.
| mamcx wrote:
| It's interesting that the idea of microservices is thrown out as
| an obvious "solution".
|
| It's not.
|
| "Scale-up" MUST be the "obvious" solution. What is missed by
| many, and what this article touches on (despite saying that
| microservices are a "solid" choice), is that "scale-up" is
| "scale-out" without breaking the consistency of the DB.
|
| There's a lot you can do to squeeze, and it's rare that you need
| to resort to ignoring joins, skipping data validation, and the
| other anti-patterns that are casually thrown around when
| performance problems happen.
| deathanatos wrote:
| I don't know what to tell you other than I've seen vertical
| scaling hit its ceiling, several times. The OP lists "scale
| vertically first" as a given; to an extent, I agree with it,
| and the comment you're responding to is made with that as a
| base assumption.
|
| There are sometimes diminishing returns to simple scaling;
| e.g., in my current job, each new disk we add adds 1/n disks'
| worth of capacity. Each scaling step happens quicker and
| quicker (assuming growth of the underlying system).
| Eventually, you hit the wall in the OP, in that you need
| design level changes, not just quick fixes.
|
| The situation I mention in my comment was one of those: we'd
| about reached the limits of what was possible with the setup
| we had. We were hitting things such as bringing in new nodes
| was difficult: the time for the replica to replicate was
| getting too long, and Mongo, at the time, had some bug that
| caused like a ~30% chance that the replica would SIGSEGV and
| need to restart the replication from scratch. Operationally,
| it was a headache, and the split moved a _lot_ of data out
| that made these cuts not so bad. (Cassandra did bring its own
| challenges, but the sum of the new state was that it was
| better than where we were.)
|
| Consistency is something you must pay attention to. In our
| case, the old foreign key between the two systems was the
| user ID, and we had specific checks to ensure consistency of
| it.
___________________________________________________________________
(page generated 2023-08-11 23:00 UTC)