[HN Gopher] Postgres scaling advice
___________________________________________________________________
Postgres scaling advice
Author : BrentOzar
Score : 310 points
Date : 2021-01-26 12:25 UTC (10 hours ago)
(HTM) web link (www.cybertec-postgresql.com)
(TXT) w3m dump (www.cybertec-postgresql.com)
| orlovs wrote:
| I am totally pro-hosted solutions. However, hosted Postgres
| never lives up to that. In my experience, on a machine of the
| same size, some queries took 30x longer on hosted vs. self-hosted
| StreamBright wrote:
| >> A single PostgreSQL instance can easily do hundreds of
| thousands of transactions per second
|
| For example, on my (pretty average) workstation, I can do ca. 25k
| simple read transactions per 1 CPU core on an "in memory" pgbench
| dataset...with the default config for Postgres v13! With some
| tuning (by the way, tuning reads is much harder in Postgres than
| tuning writes!) I was able to increase it to ~32k TPS per core,
| meaning: a top-notch, dedicated hardware server can do about 1
| million short reads! With reads, you can also usually employ
| replicas - so multiply that by 10 if needed! You then need to
| somehow solve the query routing problem, but there are tools for
| that. In some cases, the new standard LibPQ connection string
| syntax (target_session_attrs) can be used - with some shuffling.
| By the way, Postgres doesn't limit the number of replicas, though
| I personally have never witnessed more than 10 replicas. With
| some cascading, I'm sure you could run dozens without bigger
| issues.
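|
| (To make the routing bit concrete: since v10, libpq accepts
| several hosts in one connection string, and
| target_session_attrs=read-write makes it pick whichever listed
| node currently accepts writes - hostnames below are made up:)
|
|     postgresql://node1,node2,node3/mydb?target_session_attrs=read-write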
|
| This sort of hand-wavy "benchmark" is not really good for anybody
| other than the author's satisfaction. Real-world scenarios are
| not like that.
| kevincox wrote:
| I wonder when using a distributed database (like CockroachDB)
| will be the default for new applications. Right now they seem to
| have fewer features and be harder to set up than traditional
| RDBMSes, but I can only assume that this gap will narrow and at
| some point in the future things will be "scalable by default".
| (Of course no DB is going to prevent all ways to shoot yourself
| in the foot)
| andreimatei1 wrote:
| I think the "default" will evolve with whatever offers the best
| "serverless" experience in the public clouds. In particular,
| the cheapest and most granularly-billed option.
| WJW wrote:
| This question is similar to asking on a car forum when using a
| 40 foot lorry will be the default starter car for everyone. The
| answer is "probably never" because while it does offer superior
| cargo transport scalability, the tradeoffs are not worth it for
| the vast majority of users. The question is posed like
| distributed databases have no disadvantages over non-
| distributed databases, but that is simply not the case.
  | Clustering introduces all sorts of problems, from network splits
  | to slow(er) joins across nodes.
| __s wrote:
| To back up how far one server can go, Stack Overflow used a
| single database for a long time
|
    | https://nickcraver.com/blog/2016/02/17/stack-overflow-the-ar...
    | (2013 post had much less redundancy, but even their 2016
    | architecture is pretty undistributed in terms of being able
    | to recreate everything from the single source of truth
    | database)
| bcrosby95 wrote:
| Yeah, it's been a while since I've been in a high traffic
| situation, but back in the late 00s we had a couple sites
| with 3-5 million daily unique users. Each site had a single
| master, a couple read replicas, and some queries were
| cached in memcached.
|
| The problem in MySQL we eventually ran into was the read
| replicas fell behind master during peak load (by 30-60
| minutes depending upon the day). So we still had to hit the
| master for certain queries. I left before we took the next
| step to fix this issue.
| kevincox wrote:
| I don't think it is quite the same.
|
| - Switching from a car to a van to a lorry is fairly low
| cost. You don't need to recreate your product (probably).
|
| - You don't need to run distributed databases in a cluster to
| start.
|
    | But I think most importantly the decrease in dev speed and
    | performance is an investment in future scalability. And I can
    | only imagine that this difference will shrink over time, to
    | where for example a 1 node "distributed" DB isn't that
    | different to work with than a 1 node "traditional" DB. And
    | that small difference pays off because adding a second node
    | doesn't have huge tradeoffs.
|
| I agree that right now it doesn't make sense. If I was
| starting a new product I would still fire up a PostgreSQL
| instance. However I think that the day will come where the
| difference is small enough and software support is complete
| enough that we will start with something distributed, much
    | like people don't often start with SQLite today, even though
    | it is quicker to get going (also not a great comparison
    | because it has fewer features).
| andreimatei1 wrote:
| > I agree that right now it doesn't make sense.
|
| This CRDB engineer respectfully disagrees. This thread
| takes it as a given that a non-distributed DB is better if
| you don't need to scale up (i.e. if you run a single
| "node"). Let me offer a couterpoint: it's easier to embed
| CRDB into some software you're distributing than it is to
| embed Postgres. This is to say, we do try to compete at
| every scale (well, perhaps not at SQLite scale). CRDB
| doesn't have all the SQL features of PG, but it does have
| its pluses: CRDB does online schema changes, it is
| generally simpler to run, comes with a UI, comes with more
| observability, can backup to the cloud, can be more easily
| embedded into tests.
|
| Online schema changes are a big deal; the other thing that
| I hope will help us win small-scale hearts and minds is the
| ever-improving observability story. I hope CRDB will
| develop world-class capabilities here. Other open-source
| databases traditionally have not had many capabilities out
| of the box.
| yuribro wrote:
| > But I think most importantly the decrease in dev speed
| and performance is an investment in future scalability.
|
      | It makes sense only if you'll actually need that scalability.
      | And you take a hit on other fronts too: infra cost and
      | deployment complexity. Both deployment complexity and code
      | complexity also increase QA cost and instability, and hurt
      | product and company reputation.
|
| >> much like people don't often start with SQLite today
|
| Maybe they should start with SQLite by default
| lrem wrote:
| For personal projects, I start with SQLite by default.
        | Never came close to its scalability limit either ;)
| WJW wrote:
| Making investments in future scalability at the cost of dev
| speed and performance is exactly the wrong tradeoff given
| that 90% of startups fail. At the start, when you have very
| little income, you want to allow for as much speed and
| flexibility as possible so that you can get to
| product/market fit ASAP. By the time your company gets big
| enough that plain MySQL/Postgres can't handle the load any
| longer, you will have more than enough money to afford a
| few experts that can help you migrate.
| cuu508 wrote:
| To me it looks like there are not many affordable options right
| now.
|
| CockroachDB understandably wants you to use their Cloud or
| Enterprise products: the OSS version is quite limited. For
| example it doesn't support row-level partitioning
  | (https://www.cockroachlabs.com/docs/stable/configure-replicat...).
  | Which means, if I understand correctly, it is not of much help
  | for scaling writes to a single big table.
| nvanbenschoten wrote:
| Hi cuu508, CockroachDB engineer here. You are correct that
| row-level partitioning is not supported in the OSS version of
| CRDB. However, it sounds like there's a bit of confusion
| about where manual table partitioning is and is not needed.
| The primary use-case for row-level partitioning is to control
| the geographic location of various data in a multi-region
| cluster. Imagine a "users" table where EU users are stored on
| European servers and NA users are stored on North American
| servers.
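    |
    | A rough sketch of that (hypothetical names, and assuming
    | "region" is a prefix of the primary key, which list
    | partitioning requires):
    |
    |     ALTER TABLE users PARTITION BY LIST (region) (
    |         PARTITION eu VALUES IN ('eu-west'),
    |         PARTITION na VALUES IN ('us-east')
    |     );
    |     -- pin the EU partition's replicas to EU nodes
    |     ALTER PARTITION eu OF TABLE users
    |         CONFIGURE ZONE USING constraints = '[+region=eu-west]';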
|
    | If you are only looking to scale write throughput then manual
    | partitioning is not needed. This is because CRDB transparently
    | performs range partitioning under the hood on all tables, so
    | all tables scale in response to data size and load
    | automatically. If you are interested in learning more,
    | https://www.cockroachlabs.com/docs/stable/architecture/distr...
    | discusses these concepts in depth.
| cuu508 wrote:
| Thanks for explaining, and sorry -- looks like I jumped to
| conclusions too quickly!
| kevincox wrote:
| Oh, I didn't realize that the free version was that limited.
| I guess I need another name to use as the default open source
| distributed database.
| berns wrote:
| No, you didn't understand correctly. The feature that isn't
    | supported is row level _replication zones_. Replication zones
    | allow you to define the location of replicas.
| kryptiskt wrote:
| Why would it ever be default? Very few applications need a
| distributed database. These days you can get a single machine
| with hundreds of terabytes of storage and terabytes of RAM. Add
| a spare machine for failover and you have a reliable setup for
| any but the biggest tasks. And the tasks that that setup isn't
| sufficient for will certainly demand more thought than a
| cookie-cutter setup of some random distributed DB.
| jandrewrogers wrote:
| What we need is a database that can both scale-up _and_ scale-
| out. Most distributed databases offer poor efficiency and
| performance on a per node basis, which has a high operational
| cost. This is why people avoid using distributed databases
| unless they need it. A scale-up database can serve as much
    | workload as a pretty large scale-out database in practice. This
| discontinuity creates the market for scale-up systems.
|
| There is literally nothing preventing distributed databases
| from having excellent scale-up performance too. Unfortunately,
| people who design distributed databases have a strong bias
| toward unnecessarily throwing hardware at performance and
| scalability problems. This is partly because very few people
| know _how_ to design a modern scale-up database; making
| something (trivially) distributed is easier.
| FpUser wrote:
| >"try all the common vertical scale-up approaches and tricks. Try
| to avoid using derivative Postgres products, or employing
| distributed approaches, or home-brewed sharding at all costs -
| until you have, say, less than 1 year of breathing room
| available."
|
| Very healthy approach. I've always followed the idea of vertical
| scalability when writing my modern C++ app servers with local
| Postgres. Since I do not sell those to FAANG, I've never failed
| to find a decent, very reasonably priced piece of hardware - be
| it dedicated hosting or on-prem - that would satisfy the client's
| needs for any foreseeable future. More than that: I've never even
| needed top-of-the-line hardware for it. I concentrate on features
| and robustness instead. Using C++ also gives a nice speedup.
| isoprophlex wrote:
| Pretty solid advice, nice article.
|
| One thing:
|
| > For example, on my (pretty average) workstation, I can do ca.
| 25k simple read transactions per 1 CPU core on an "in memory"
| pgbench dataset...with the default config for Postgres v13!
|
| Forget about reaching those numbers on managed DB-as-a-service
| instances, specifically Azure managed postgres. In my experience
| these have comparatively poor peak performance, with high
| variability in latency to boot.
|
| Bare metal all the way, if you can spare the allocation of a
| dedicated DBA.
|
| Also... if rapid reads and steady performance are what you're
| after, provision a replica for the occasional analytical
| workloads.
| samf wrote:
| Good info; I came here looking for experience with managed
| databases. Does anyone have experience with managed instances
| on other platforms?
| jandrewrogers wrote:
| The assertion that PostgreSQL can handle dozens of TB of data
| needs to be qualified, as this is definitely not the case in some
| surprising and unexpected cases that are rarely talked about.
|
| PostgreSQL's statistics collection, which is used by the query
| planner, _doesn't scale with storage size_. For some ordinary
| data distributions at scale, the statistical model won't reflect
| any kind of reality and therefore can produce pathological query
| plans. It is quite difficult to get around this scaling behavior.
| Consequently, I've moved away from using PostgreSQL for data
| models with distributions such that these query planner
| limitations will occur. These pathologies occur well before the
| "dozens of TB" range.
|
| The statistics collector is not a scalable design generally, but
| that is another matter. In its current state it does not degrade
| gracefully with scale.
| aeyes wrote:
| Vacuum is a way bigger problem, if you have a few million
| updates per day on a 5TB table you are going to have a hard
| time keeping index and page bloat down.
|
| Sure, if your tables are insert only you might be fine but
| doing any kind of DDL or maintenance (analyze after version
| upgrades) is going to ruin your day.
| laurenz_albe wrote:
| This seems to be unfounded criticism. When statistics are
| gathered, PostgreSQL samples a certain percentage of the table,
| so that obviously scales. The number of "most common values"
| and histogram buckets scales up to 10000, which should be good
| even for large tables. While I'll readily admit that not all
| aspects of cross-column dependencies are dealt with, and cross-
| table distributions are not considered, that has nothing to do
| with size. I guess you hit a problem somewhere, couldn't solve
| it and jumped to unwarranted conclusions.
| jandrewrogers wrote:
| It is ironic that you accuse me of "unwarranted conclusions".
| I've been customizing and modifying PostgreSQL internals for
| almost two decades, I know how to read the source. You aren't
| as familiar with PostgreSQL as you think you are.
|
| This wasn't my problem, I was asked by a well-known company
| with many large PG installations and enterprise support
| contracts to look at the issue because no one else could
| figure it out. The limitations of the statistics collector
| are not only evident in the source _but they are documented
    | there_. There are also deep architectural reasons why you can't
    | trivially modify the statistics collector -- I looked into
| this option -- to work for larger tables without introducing
| other serious issues.
|
| If you have a uniform distribution of values, the statistics
| collector will work fine. In the above case, the values
| followed a power law distribution which created extreme
| biases in the statistical model due to necessary restrictions
| on sampling. Other distributions can have similar effects
| once you exceed the ability of the statistics collector to
| acquire a representative sample.
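    |
    | A rough way to poke at this yourself (a synthetic sketch; the
    | log-uniform expression below gives frequencies roughly
    | proportional to 1/value, i.e. a power law):
    |
    |     -- value 1 appears millions of times, the tail is
    |     -- nearly unique
    |     CREATE TABLE events AS
    |         SELECT floor(power(10, 7 * random()))::bigint AS user_id
    |         FROM generate_series(1, 50000000);
    |     ANALYZE events;
    |     -- compare estimated vs. actual rows for head and tail keys
    |     EXPLAIN SELECT * FROM events WHERE user_id = 1;
    |     EXPLAIN SELECT * FROM events WHERE user_id = 5000000;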
| brasetvik wrote:
| > I know how to read the source. You aren't as familiar
| with PostgreSQL as you think you are.
|
| Oh, maybe you have read some of Laurenz Albe's many
      | contributions to Postgres, then.
      | https://git.postgresql.org/gitweb/?p=postgresql.git&a=search...
| srcreigh wrote:
| Naive question. Wouldn't sampling a power law dataset be
| straightforward? The idea is there's only a few outlier
| values, and the rest are uncommon. This distribution seems
| extremely common. Ie column with mostly NULL values and the
| rest somewhat unique non null strings.
|
| I'm curious what data you saw and why the sampling didn't
| work?
| fabian2k wrote:
| You can change the statistics target, see
| https://www.postgresql.org/docs/current/runtime-config-query...
|
| You can also create more advanced statistics over multiple
  | columns:
  | https://www.postgresql.org/docs/current/planner-stats.html
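  |
  | For example (table and column names hypothetical):
  |
  |     -- sample more rows / keep a bigger histogram for one column
  |     ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 1000;
  |
  |     -- let the planner learn that city and zip are correlated
  |     CREATE STATISTICS orders_city_zip (dependencies)
  |         ON city, zip FROM orders;
  |     ANALYZE orders;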
|
| But if your statistics are bad, it will certainly mess up some
| of your query plans.
| jandrewrogers wrote:
| Yes, everyone knows these things, it does not address the
| issue. Due to its architecture, there are difficult to change
| internal limits on how PostgreSQL samples the data.
|
| Under some conditions, it is not possible for the sampling
| mechanics to build a representative model -- the actual
| statistics used for query planning will be quasi-random. One
| of the ways this manifests is that every time you rebuild the
| statistics you get a completely different statistical model
| even if the data has barely changed.
| gher-shyu3i wrote:
| > PostgreSQL's statistics collection, which is used by the
| query planner, doesn't scale with storage size.
|
| Out of curiosity, do statistics collectors for other offerings
| (e.g. MySQL, SQL Server, Oracle) scale with storage size?
| boomer918 wrote:
| Theoretically your argument would make sense, but practically
| your data distribution should be uniform enough for the
| statistics collector to work for terabytes and terabytes.
|
| I've seen query plans get messed up with partial indexes, but
| if you have a regular index, you needn't worry.
| natmaka wrote:
  | Even when using as many stat samples as possible (3,000,000)
  | for each of those tables:
  |
  |     ALTER TABLE table_name ALTER column_name SET STATISTICS 10000;
  |
  | See https://www.postgresql.org/docs/current/sql-altertable.html
|
| AFAIK the PostgreSQL's approach is based upon
| http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.53....
| fanf2 wrote:
| Does statistics collection scale better with partitioned
| tables? TFA mentions that partitioning helps with vacuum, and
| AIUI vacuum workers also collect statistics.
| jajool wrote:
  | I have seen many PostgreSQL benchmarks showing solid performance
  | with TBs of data, but my real-world experience is the complete
  | opposite.
|
| Here are some of the main issues that I have encountered so
| far:
|
| 1. Queries on large tables (around 10 GB) are slow even when
| "index only scan" is used because of MVCC and the way
  | PostgreSQL manages concurrency.
|
| 2. Hot-standby instances can't be used for anything serious
| since all queries are dropped regularly (I believe it's not
| safe to use "hot_standby_feedback" config to overcome this
| issue).
|
| 3. It is not possible to have tables with heavy "update"
  | workflows (because of simultaneous autovacuum execution).
|
| I would be very happy if anyone could show me that I am wrong.
| YorickPeterse wrote:
| > 2. Hot-standby instances can't be used for anything serious
| since all queries are dropped regularly (I believe it's not
| safe to use "hot_standby_feedback" config to overcome this
| issue).
|
| Hot-standby instances are fine. We've been using these for
| GitLab.com for a number of years now, and never have they
| caused any issues. We do have some code in place that retries
| queries when they are cancelled by the hot-standby, but IIRC
| this doesn't happen often.
| tommyzli wrote:
| 1. You'll have to define "slow" - I have a 3TB table where an
| index only scan takes under 1ms
|
| 2. hot_standby_feedback is absolutely safe. I've got 5 hot
| standbys in prod with that flag enabled
|
| 3. Again, it depends on how "heavy" your update throughput
| is. It is definitely tough to find the right balance to
| configure autovacuum between "so slow that it can't keep up"
| and "so fast that it eats up all your I/O"
| singron wrote:
| 1 mostly applies to update heavy tables since index only
| scans use the visibility map, which would be frequently
| invalidated.
|
| 3 is definitely true, especially the larger the table. I've
| had success splitting frequently updated columns out into
| their own much smaller table, or any other trick to
| concentrate updates into a small table. Also MVCC bookkeeping
| requires updating pages, so an UPDATE that doesn't change any
| field and SELECT FOR UPDATE will cause the same problem.
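    |
    | For instance (hypothetical schema), keeping a hot counter out
    | of the wide row:
    |
    |     CREATE TABLE items (
    |         id   bigint PRIMARY KEY,
    |         name text,
    |         body text  -- wide, rarely changes
    |     );
    |     CREATE TABLE item_view_counts (
    |         item_id bigint PRIMARY KEY REFERENCES items (id),
    |         views   bigint NOT NULL DEFAULT 0
    |     );
    |     -- update churn (and the vacuum work it causes) is now
    |     -- concentrated in the narrow table
    |     UPDATE item_view_counts SET views = views + 1
    |         WHERE item_id = 42;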
| andy_ppp wrote:
  | This sounds really in-depth and useful; are there more in-depth
  | articles about how to avoid these types of queries?
| claytonjy wrote:
| This is intriguing; could you give an example of a distribution
| Postgres struggles with at scale, and an application that
| produces such a distribution?
| mkilling wrote:
| We've recently been surprised by wrong estimates made by the
| planner resulting in inefficient query plans.
|
| It seems that power law distributions over multiple tables
| aren't handled very well:
    | https://stackoverflow.com/questions/65861780/is-there-a-way-...
| silon42 wrote:
| Is it easy to disable it? Personally I would not want any
| unpredictable behavior from query planner anyway, especially at
| scale.
| Tostino wrote:
| You would get predictably bad behavior and performance
| without the statistics collected.
| ben509 wrote:
| Predictably bad won't result in a sudden spike in IO usage
| that brings down a production system. This is why Oracle
| has SQL profiles.
| Tostino wrote:
| Then use something like pg_hint_plan:
| https://pghintplan.osdn.jp/pg_hint_plan.html (never used
| it myself)
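        |
        | Going by its docs, hints ride along as comments, roughly
        | (table/index names hypothetical):
        |
        |     /*+ IndexScan(orders orders_created_idx) */
        |     SELECT * FROM orders
        |     WHERE created_at > now() - interval '1 day';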
|
| I've been bitten by bad optimizer choices which took way
| too long to figure out how to debug / fix, so I know the
| pain. I do hope the PG optimizer continues improving.
| It's one of the weakest areas in comparison to commercial
| DB's IMO.
| eatonphil wrote:
| Neither of you have given a reproducible example one way or the
  | other! I could share that I know of TBs of core analytics data
  | running on PostgreSQL and it does alright. But it would be
  | nicest if there were a more tangible study to talk about.
| jjice wrote:
| In the opinion of a last semester CS student who has never
| written an application from scratch that needed more than a
| SQLite DB (so take me with a half grain of salt), it seems like
| premature optimization, while always talked about, is very
| common. I see people talking about using Kubernetes for internal
| applications and I just can't figure out why. If it's a hobby
| project and you want to learn Kubernetes, that's a different
| situation, but in the case of making a real application that
| people will use, it seems like a lot of us can get away with a
| single DB, a few workers of our app, and maybe a cache.
|
| I'm speaking out of little experience though. I just think that a
| lot of us can get away with traditional vertical scaling and not
| think too much about it.
| pg_bot wrote:
| What you're describing is called resume driven development. It
| happens every few years when people want to cash in on
| trends/buzzwords that people believe will be disruptive to all
| industries but are just tools to have in the toolbox for most.
| New tools pop up all the time that fit this mould. Over the
| past ten years I can think of Hadoop (Big data), MongoDB
| (NoSQL), Kubernetes, "Serverless" computing, and TensorFlow.
| While all these tools have legitimate use cases, they are often
| overused due to marketing or industry hype.
|
| Adding artificial intelligence to your recipe application is
| unlikely to make any sense, but people do it because they want
| to have AI software engineer on their resume.
| nforest wrote:
| For artificial intelligence, I think it's more often
| marketing driven development. It's easier to seem disruptive
| if you claim to have AI in your product. Easier to get
| funding and have people talk about your company. I feel like
| it comes more often from business executives than technical
| people.
| kodah wrote:
| > but in the case of making a real application that people will
| use, it seems like a lot of us can get away with a single DB, a
| few workers of our app, and maybe a cache.
|
| A couple of points:
|
| 1. Kubernetes can run monoliths. It's certainly not exclusive
| to microservices or SOA. It's just a compute scheduler, quite
| similar to AWS's EC2 reservations and auto-scaling groups
| (ASG's).
|
| 2. I can't speak for every corporation, but if you already have
| patterns for one platform (note: "platform" in this context
| means compute scheduling. eg: AWS, GCP, Kubernetes, Serverless)
| then you will inevitably try to copy patterns you already
| implement internally. A lot of times, for better or for worse,
| it's not what fits best unless what fits best and what you have
| available are highly conflicting.
|
| 3. A lot of times "scaling" is actually code for multi-tenancy.
| As an industry, we should probably be explicit when we're
| scaling for throughput, redundancy, and/or isolation. They are
| not the same thing and at times at odds with each other.
|
| 4. I don't really like your use of "real application" here as
| it implies some level of architectural hierarchy. My main
| takeaway after 10+ years of professional development is that
| architectures are often highly contextual to resource
| availability, platform access, and personal preferences.
| Sometimes there's a variable of languages too, because some
| languages make microservice architecture quite easy while
| others make it a royal PITA.
| wejick wrote:
  | I know one of the biggest e-commerce shops in Asia was using 1
  | big DB with multiple read-only slaves in a monolithic
  | architecture for more than 5 years.
  |
  | However, driven not only by DB performance but also by the need
  | to organize hundreds of engineers, they adopted a microservice
  | architecture. Then they slowly migrated to per-domain DBs; it
  | is just the classic microservice migration story.
  |
  | While a single DB may take us a pretty long way, designing the
  | system with disciplined, domain-level logical segregation will
  | help when there's a need to move to microservices.
  |
  | *Looks like HN readers are quite sensitive to microservice
  | migration comments; this kind of comment usually gets downvoted
  | easily.
| collyw wrote:
| Stack Overflow runs what is essentially a monolithic
| architecture. Though they do have a few services, it isn't
| what I would describe as a micro-service architecture.
|
| https://stackexchange.com/performance
| pas wrote:
| Monzo (UK bank) has 1600+ microservices, but mandates a
| common framework/library and uses Cassandra. (Which is
| basically a shared nothing, bring your own schema
| "database".)
|
| So it makes sense to combine advantages of different
| approaches.
| mumblemumble wrote:
| It's resume-driven development, and it's also entertainment-
| driven development. Bringing in new technologies gives you a
| chance to play with a new toy. That's an effective way to make
| your job more interesting when the thing you're supposed to be
| working on is boring. Which, in business applications, is more
| often than not the case.
| spaetzleesser wrote:
| In today's job market resume driven development is a very
| rational choice. I work in medical devices so we are pretty
| conservative and generally way behind the cutting edge. This
| makes it really hard to find jobs at non medical companies. I
| would recommend anybody who has the chance to use the latest
    | and shiniest stuff to do so because it's good for your career.
| mumblemumble wrote:
| Very good point. Seems like yet another example of how
| carefully optimizing all the individual parts of a system
| can paradoxically de-optimize the overall system.
| zdragnar wrote:
| In big enough organizations, it is very easy to lose track of
| who owns what, especially when it is those little ad-hoc
| internal tools. Manually managing the infrastructure for them
| is a recipe for them to become permanently enshrined in the
| wasteland of "services we think we use, but do not maintain
| because we don't remember who needed it or put it up or how to
| configure it".
|
| K8s isn't the only answer, but if you are already using it for
| your large applications, it isn't much work to reuse the
  | existing tooling and infrastructure, and now you at least have
  | the Dockerfile as a reference if nothing else.
|
  | OTOH, if you have an existing tooling setup / pipeline that is
  | _not_ K8s, there isn't a good reason to use it for a small
  | application.
| corty wrote:
| Having a Dockerfile that copies a few binary blobs into an
| age-old distro image isn't an improvement, it's a huge
| liability. And most of that stuff that no one knows anything
| about anymore is like that. Same as with an old VM or PM.
|
| I'd rather have that old crap as a physical machine. Why?
| Because the hardware lifetime "naturally" limits the lifetime
| of such applications. If the hardware dies, it forces a
| decision to spend some money to keep it running or throw it
| away, which, given that hardware is expensive, usually
| results in throwing it away.
| vlovich123 wrote:
| Set up your docker file to be part of your CI so that your
| binary blobs are built from source with regularity? That's
| typically the solution I've seen work well. Manually
| maintained stuff (especially for stuff that may not be the
| thing everyone is primarily doing) generally doesn't scale
| well without automation (speaking as someone who's seen
| organizations grow). This is also true of "getting started"
| guides. Can't tell you how much maintenance and run time
| I've saved converting those wikis to Python scripts.
| corty wrote:
| Yes, of course. That would be ideal. That's what we do
| for everything we can control.
|
| But as someone in the IT dept., far too often you get
| some container that either was built by someone who long
| left the company or an external consultant who got paid
| to never return. Sourcecode is usually unavailable, and
| if it is available, will only build on that one laptop
| that the consultant used. The IT department gets left
| with the instruction "run it, it was expensive" and "no,
| you don't get any budget to fix it". That results in the
| aforementioned containers of doom...
|
| Yes, I'm bitter and cynical. Yes, I'm leaving as soon as
| I can :)
| striking wrote:
| I hate to raise a seemingly obvious point, but this
| doesn't seem like a problem with Docker.
| wdb wrote:
      | I can still remember the multinational that wanted to host
      | the web application we developed for them at their data
      | centre. They wanted to charge us (not the business unit, for
      | some odd reason) nearly EUR 20,000 per year to host a RoR
      | web application.
      |
      | We ended up hosting it ourselves, and in the last year they
      | were paying EUR 100k per year for it, as we would just sell
      | the same setup for each deployment for their customers. They
      | would probably have been cheaper off hosting it themselves.
| giantrobot wrote:
| In theory that works. In practice it rarely does. Docker et
| al gained popularity because they made it way more
| practical for projects to be managed as the world works
| rather than as it should be, for good or ill. Before Docker
| it was moving old applications to VMs and before that it
| was running them in chroot horror shows.
| yowlingcat wrote:
| > the hardware lifetime "naturally" limits the lifetime of
| such applications
|
| Oh, my sweet summer child. Of all the things that
| "naturally" limits the lifetime of such applications, that
| is not it. Consider the case of the mainframes running the
| US banking system, for example.
| waynesonfire wrote:
    | docker and k8s adoption can force a company that over the
    | years has developed and perfected a standard practice of
    | deploying applications into a half-ass'd solution that ends up
    | solving the wrong problems and costing way more. The "shipping
    | beats perfecting" mantra is very much at play here, due to the
    | amount of time it would take to achieve parity. At the end of
    | the day, the new solution ends up looking like a step back.
|
| Such a practice combined with the mentality that software
| engineers should do their own dev-ops can easily lead to an
| environment of spaghetti applications where every developer
| working on the new platform does things slightly different
| because the replacement solution wasn't complete and had to
| be addressed by countless band-aids by engineers across the
| band of talent.
|
| Furthermore, for the features that were able to achieve
| parity, you're now forcing your entire organization to re-
    | learn the new development process. The docs are abysmal, and
    | the software engineers who developed the original solution
    | have moved on, since the hard work of "productionalization"
    | remains and they're not interested in that.
| akiselev wrote:
      | _> docker and k8s adoption can force a company that over the
      | years has developed and perfected a standard practice of
      | deploying applications into a half-ass'd solution that ends
      | up solving the wrong problems and costing way more._
      |
      | docker and k8s adoption can also force a company that over
      | the years has developed and perfected a half-ass'd solution
      | to deploying applications into a single(-ish) source of
      | truth that ends up solving the right organizational problems
      | instead of the wrong technical ones (and ends up costing way
      | more, at least in the short term).
| avereveard wrote:
  | Kubernetes isn't only about scaling. The repeatability of the
  | deployment process is a great asset to have as well.
| Silhouette wrote:
| But there are much simpler ways than K8s to achieve
| automated/repeatable deployments, if that is your goal.
| objektif wrote:
| Can you please name a few?
| kitd wrote:
  | _I see people talking about using Kubernetes for internal
  | applications and I just can't figure out why._
|
| There is benefit in having established platforms for running
| your code, and this is especially true for large orgs where the
| people who run the systems are an entirely different group from
| those that developed or assembled it. And people (+ their
| skills) are what cost the most money in any business.
|
| It's true that many/most systems don't require a full
| Kubernetes stack (for instance), but if a critical mass of the
| business IT is going that way, doing the same with your own
| makes sense from an economies of scale PoV.
| mvanbaak wrote:
| > There is benefit in having established platforms for
| running your code
|
| You do know k8s is very new, there's a constant stream of
    | changes and updates to it, etc., etc.? It's not established.
| It's known, but that's it.
| thraxil wrote:
| I think what they may mean is more that there is an
| established platform within the organization. One that you
| have expertise and experience with,
| monitoring/backup/security tools that work with it, etc.
| K8s might not have as long a pedigree as VMs, but if you
| already have a setup to run K8s, people who know how to use
| it, documentation and tooling that allow devs to run their
      | apps on it securely and efficiently, etc., it's pretty
      | reasonable to want to encourage devs to "just" use k8s if
      | they want to stand up a new service rather than spinning
| up whatever random collection of technologies that dev
| happens to know better.
| pas wrote:
| 1.0 was released in 2015. There are stable LTS vendors for
| it.
|
| It's pretty established. And much saner than cobbling
| together Ansible/Puppet/Chef playbooks for everything.
| majewsky wrote:
| Saying that 2021's Kubernetes is established because 1.0
| was released in 2015 is like saying that 1991's Linux is
| stable because Unix had existed for 20 years at that
| point. Kubernetes 1.0 and 1.20 share the same name,
| design principles and a certain amount of API
| compatibility, but it's impossible to take a nontrivial
| application running on 1.20 and just `kubectl apply` it
| on 1.0. Too much has changed.
|
| Kubernetes is _just now_ entering the realm of "becoming
| stable". Maybe in five years or so it'll finally be
| boring (in the best sense of the word) like Postgres.
| collyw wrote:
| Speaking as someone with 20 years in the industry, what you say
  | is correct. Most applications would be fine on a single server
| and a classic LAMP stack. But that ain't cool these days.
| ccmcarey wrote:
| I think it's easier to run a small k8s cluster than it is to
| attempt to recreate a lot of the functionality provided
| manually, especially if you're running in a cloud where the
| control plane is handled for you.
|
| It provides unified secrets management, automatic service
| discovery and traffic routing, controllable deployments,
| resource quotas, incredibly easy monitoring (with something
| like a prometheus operator).
|
| Being able to have your entire prod environment defined in
| declarative yml is just so much better than running something
| like a mishmash of ansible playbooks.
|
| If your application runs on a single host and you don't really
| care about SLAs or zero downtime deploys, sure, use some adhoc
| deploy scripts or docker compose. Any more than that, and I
| think k8s pays for itself.
| jayd16 wrote:
| Agreed. What is it that people are doing (or not doing) where
| a simple managed k8s cluster is more work than the minimum
| way to do this?
|
| Are teams not even setting up automated builds and just
| sshing into a box?
| hectormalot wrote:
| For me that's Heroku. I just push my app (RoR) and done.
| I've actually moved it once to k8s for a few months and
      | decided to move it back after I understood how easy heroku
      | made the ops side of the business.
|
| Note: it's a side project, 50 business users, $5k annual
| revenue, 4h per month as target for the time spent on
| customer support, admin and maintenance. So it's both easy
| to pay heroku and important for me to not spend too much
| time on Ops.
| Silhouette wrote:
| Very many successful applications can indeed run on a single DB
| server (modulo redundancy in case of failures). Vertical
| scaling isn't trendy, but it is effective, until it's not.
|
| I have yet to encounter a real situation where it _suddenly_
| became impossible to run a production DB on a single, high-spec
| server, without knowing far enough in advance to plan a careful
| migration to a horizontally scaled system if and when it was
| necessary.
| vlovich123 wrote:
| I'm not saying in all companies, but as you grow you have lots
| of different teams with different needs. So then you spin up a
| tools team to manage the engineering infrastructure since you
  | can't do it ad-hoc anymore (CI, source control, etc).
| make that team more efficient, you let them force one size fits
| all solutions. While this may feel constraining for a given
| problem domain, it actually makes engineers more portable
| between projects within the company which is valuable. Thus
| having one DB or cloud thing that's supported for all teams for
| all applications is valuable even if sometimes it isn't
| necessarily the absolute best fit (and the complexity is
| similarly reduced as good companies will ensure there's tooling
| to make those complex things easy to configure in consistent
| ways). Your tools team and the project team will work together
| to smooth out any friction points. Why? Because for larger
| numbers of engineers collaborating this is an efficient
| organization that takes advantage of specialization. A
| generalist may know a bit about everything (useful when
| starting) but a domain expert will be far more equipped to
| develop solutions (better results when you have the headcount).
| gowld wrote:
| Kubernetes is a container system, mostly orthogonal to 3- or
| 4-tier application design.
| h0l0cube wrote:
| And typically a single DO droplet would suffice for a toy
| project or POC, for which Ansible is probably the more
| expedient option. But maybe they're not in a rush, and
    | learning K8s is just another feather in their cap.
| barrkel wrote:
  | Kubernetes is for when you need to allocate CPU like you
  | allocate RAM, _and_ you don't want to be tied to a higher level
  | API sold by a vendor.
| smoyer wrote:
| > I see people talking about using Kubernetes for internal
| applications.
|
| I think the important issue when first starting a project is to
| create a "12 Factor App" so that if and when you create a
| Docker image and/or run the application in Kubernetes, you
| don't have to rewrite the entire application. Most of the tools
| I write run on the CLI but I am in fact a fan of Kubernetes for
| services, message processing and certain batch jobs simply
| because I don't have to manage their life-cycles.
| Deadron wrote:
| 12 factor apps sacrifice performance and simplicity of your
| environment for scalability. Unless you are guaranteed to
    | start with a worldwide audience, it's complete overkill. A
| better solution is to write your application with the rules
| in mind with the goal of making it easy to transition to a 12
| factor style app when its needed. Scale up then scale out
| will result in the best performance for your users.
| vp8989 wrote:
| The thinking on efficiency vs scalability is largely
| influenced by the US tech company meta where founders give
| up equity so they don't have to worry about profitability
| for a very long time.
|
| In that case, it is preferred to burn piles of cash on AWS
| instead of potentially needing to sacrifice revenue because
| you can't scale quickly enough.
|
| An architecture that is not scalable is considered a
| failure whereas one that is complex and inefficient is much
| more tolerated (as long as it can scale out) ... at least
| until the funding dries up or Wall Street activists get
| involved.
| sudhirj wrote:
| The 12 factors are mostly common sense that apply in pretty
| much any situation - they help with fancy deployments but
| also with single servers on DO or even on-Prem servers.
| Turbots wrote:
| For one application, kubernetes is obviously giant overkill.
  | Companies with hundreds or thousands of applications need a
  | platform that can offer standardised deployment, runtime and
  | scaling patterns for those apps. Kubernetes
| is a great start for managing all those containers in a
| consistent, secure, multi tenant environment. I know customers
| with literally 10000s of VMs that are moving to more cloud like
| environments, either on premise in their own datacenter or in
| the public cloud. They need these kind of platforms to automate
| away the repetitive stuff and have a secure posture, throughout
| the company.
| rc_hackernews wrote:
| I haven't worked at a FAANG or any other company even close to
| that level of scale, so you can take me with half a grain of
| salt too.
|
| But what you said is absolutely true. It's also something you
| will very much experience once you start working
| professionally.
|
| I'm in no position to give you advice, and I think I might be
| giving advice to myself...just don't let it get to you.
| btilly wrote:
| I have worked at FAANGs before.
|
| I am in firm agreement. I think that far too many people are
| trying to solve the problems that they wish that they had,
| rather than making it easy to solve the ones that they do
| have. Going to something like Kubernetes when there is no
| particular reason for it is a good example of that trend.
|
| When you really need distributed, there is no substitute. But
| far more think that they need it than do.
| laurencerowe wrote:
| Micro-service architectures are an absurd overcomplexity for
| smaller internal apps. They only really make sense when a
| monolithic system becomes too large for a single team to
| manage, at which point the micro-service boundaries reflect
| team boundaries.
| SPBS wrote:
| If you are going to allocate sharded databases per client with
| identical schemas, might as well give each of them an sqlite
| database? Since you're massively cutting down the writes if it's
| one database per customer.
| brandmeyer wrote:
| And then _just one_ query comes along where you need to make an
| update that should be globally visible, and not just visible to
| the shard.
|
| I can see why you would stick with full Postgres for as long as
| practicable.
| bmcahren wrote:
| Postponing sharding and complex replication as long as possible
| is very smart with any database (mysql, postgres, mongodb). It
| can be very fragile or fail in unexpected or unusual
| ways and most importantly it can take _much_ longer to fix. E.g.
| 18 hours instead of 2 hours of downtime.
| evanelias wrote:
| Once your data grows very large, a successfully-implemented
| sharding solution actually improves availability, rather than
| reducing it.
|
| With a huge monolithic database, a failure causes downtime for
| your entire product/company. Replica cloning and backups are
| slow. Major version upgrades are stressful because it's all-or-
| nothing.
|
| With a sharded environment, a single shard failure only impacts
| a portion of your userbase, and smaller databases are faster to
| perform operational actions on.
|
| There are definitely major downsides to sharding, but they tend
| to be more on the application side in my experience.
| codyb wrote:
| Wouldn't sharding generally result in overlap between the key
| ranges so that a database shard going down doesn't have to
| result in any downtime?
|
| Then your issue is replication of writes I suppose. Probably
| depends on use case what configuration you choose.
| Tostino wrote:
| Totally depends on how you shard. In my case (b2b), i'd be
| sharding by tenant. Having a single tenant go down would
| not have the same impact as every single tenant going down.
| wejick wrote:
      | Honestly I've never found a case where this happens; data
      | always falls into one shard according to the key. Then comes
      | the concept of shard replicas, where the shard can live on
      | several nodes to provide redundancy.
      |
      | However I'm not sure how it's usually set up on Postgres.
| evanelias wrote:
| With a typical sharded relational database setup, each
| sharding key value maps to exactly one shard.
|
| There should be replicas of that shard, which can be
| promoted in case of a master failure. But in rare cases all
| replicas may also be degraded or inconsistent and therefore
| non-promotable. When this happens to a giant monolithic
| non-sharded database, the impact is far more catastrophic
| than when it happens to a single smaller shard.
|
| In any case, replication is a separate concern from
| sharding. Each sharded database replica set (traditionally
| 1 master + N replicas) has a data set that is unique /
| independent of other sharded database replica sets.
|
| That said, some of the "NewSQL" distributed databases may
| arrange their data in more complex ways. [Edit to clarify:
| I mean in respect to the _combination_ of both sharding and
| replication. With a traditional sharded relational
| database, you have replica sets where each replica contains
| the exact same shard or shards; with a NewSQL distributed
| DB, data may be replicated in a more complex arrangement]
| zinclozenge wrote:
| Most, if not all, commercial newsql distributed databases
| do range based splitting of the data, with each range
| managed by a raft group. Raft groups get migrated between
| nodes to avoid hot spots, among other scheduling
| criteria. TiDB does this for sure, I'd be surprised if
| CockroachDB (and yugabyte and dgraph) doesn't do it.
| bmcahren wrote:
| Shards increase the number of failure modes and increase the
| complexity of those failure modes. For _most_ businesses, the
    | recommendation holds true... keep it simple, don't shard
    | until you _need_ to.
|
| I find it somewhat concerning that MongoDB has a better
| architecture for upgrades than Postgres. You add a replica to
| the cluster running the new major version and then switch
| that replica over as your primary once you've replaced enough
| instances in the cluster. Having worked in Oracle and MySQL
| for years then having switched to a company with a MongoDB
| framework I forgot how stressful upgrades would be with such
| archaic limitations.
| evanelias wrote:
| > Shards increase the number of failure modes and increase
| the complexity of those failure modes.
|
| I would only agree with this during the initial
| implementation of sharding. Once deployed and stable, I
| have not found this to be the case, at all.
|
| I say this as someone who has directly architected a
| sharded database layer that scaled to over a trillion rows,
| and later worked on core automation and operations for
| sharded databases that scaled to an incalculable number of
| rows (easily over 1 quadrillion).
|
| In both cases, each company's non-sharded databases were
| FAR more operationally problematic than the sharded ones.
| The sharded database tiers behave in common ways with
| relatively uniform workloads, and the non-sharded databases
| were each special snowflakes using different obscure
| features of the database.
|
| > keep it simple, don't shard until you need
|
| I would have wholeheartedly agreed with this until a few
| years ago. Cloud storage now permits many companies to run
| monster 10+ TB monolithic relational databases. Technically
| these companies no longer "need" to shard, possibly ever.
| But at these data sizes, many operations become extremely
| painful, problematic, and slow.
| throwdbaaway wrote:
| One of those companies is facebook right? I think this
| blog post provides a much more balanced view:
        | http://yoshinorimatsunobu.blogspot.com/2017/11/towards-bigge...
| numlock86 wrote:
| I like how the article starts with the metrics of what a single
| node can already do. The trend appears to be to scale right at
| the start, even before you have your first real customer ... the
| implications and results are obvious.
| AdrianB1 wrote:
| I have experience with other relational DB products, but the
| principle should be similar; I have a few databases over 1 TB,
| but scaling to tens of TB would require more RAM than what a
| typical 1S or 2S can support. CPU's are not that problematic with
| the huge number of cores in AMD Epyc, but RAM is a serious
| limitation: in my world I need a RAM-to-database-size ratio
| between 15% and 60%, depending on whether it is transactional,
| reporting or somewhere in between. Taking a 50 TB database as an
| example, it is too much for a 4TB RAM 2-socket Epyc system.
| beck5 wrote:
| When it comes to replication & automatic failover in PG what is
| the 2021 gold standard setup?
| __s wrote:
| pg_auto_failover is good
| https://www.citusdata.com/blog/2019/05/30/introducing-pg-aut...
|
| Disclosure: I work for Citus
| ksec wrote:
  | In 2016 I was expecting that sane defaults like MySQL's would
  | have arrived by 2018 or 2019.
  |
  | Looks like 2021 isn't that much different to 2016. There is
  | work being done, but it doesn't seem to be anywhere close to
  | the level of MySQL.
| dijit wrote:
| wow, are you kidding? MySQL replication is possibly the worst
| I've ever seen.
|
| There is almost no consideration for the target being up to
| date, I have personally experienced missed inserts,
| replication lags, and the replica being set to read/write.
|
    | PG's integrated replication is far superior; it even has
    | support for bootstrapping a new replica node without rsync
    | and a command to `promote` the replica to write master.
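    |
    | (On recent versions promotion is a single call - a sketch,
    | run on the replica:)
    |
    |     -- PostgreSQL 12+; older versions use pg_ctl promote
    |     SELECT pg_promote();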
| claytonjy wrote:
| What about Percona? I haven't used it, but a common
| sentiment on HN has been that Postgres doesn't have
| something quite on par with it.
|
| Citus and, more recently, Patroni, seem to be the dominant
| Postgres analogues; have they caught up? Where do they
| dominate?
| evanelias wrote:
| Replication issues in modern MySQL are caused by user
| error, not inherent bugs. For example if you're configuring
| replication manually, you must ensure that replicas are set
| to read_only. Clustering options introduced over the past
| few years do this automatically though.
|
| > PGs integrated replication is far superior, it even has
| support for bootstrapping a new replica node without rsync
| and a command to `promote` the replica to write master
|
| MySQL has this functionality too, e.g.
| https://dev.mysql.com/doc/refman/8.0/en/clone-plugin.html
      | and
      | https://dev.mysql.com/doc/mysql-shell/8.0/en/working-with-re...
|
| To be clear, I am not claiming MySQL is superior to PG.
| However, I have found that many PG users are unfamiliar
| with the huge improvements introduced in MySQL over the
| past ~8 years.
| hintbits wrote:
| Use built-in physical replication, it works and it's good.
|
| Patroni is widely used for automatic failover, at least where
| you don't want a possibility of a split brain.
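  |
  | A minimal sketch of bootstrapping a standby with the built-in
  | tooling (v12+; host, user and data directory are hypothetical;
  | -R writes primary_conninfo and standby.signal for you):
  |
  |     pg_basebackup -h primary.example.com -U replicator \
  |         -D /var/lib/postgresql/13/main -R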
| mmacedo_en wrote:
| I'm building an app using Postgres for the first time. Naturally
| I was a bit worried about performance and scaling in case the
| not-yet-launched app becomes a major success.
|
| I began simulating a heavy use scenario. 100k users creating 10
| records daily for three years straight.
|
| 100000 x 10 x 365 x 3 ~= 1 billion rows or about 200 GB with a
| record's length of 200 bytes. This is peanuts for modern
| databases and hardware.
|
| Seems like a single node can support it for a long way before I
| have to worry about performance...
| majewsky wrote:
| You classifying 11 writes per second as "heavy use" reminds me
| of how people on average completely underestimate how fast
| computers actually are (when they're not bogged down by crappy
| programs).
| mixmastamyk wrote:
| I don't believe the grandparent's simulation actually took
| three years, it was likely operations with a particular data
| size that was tested.
|
| Still, your main point stands. Around 2001 I wrote a
| C-program to record every file and size on a large hard disk.
| We were all amazed that it finished (seemingly) before the
| enter key had come back up. Must be a bug somewhere, right?
| Nope.
|
| Much earlier I wrote a Pascal program on a 486 in school that
| did some calculations over and over again, writing the output
| to the screen. It blew my mind then how fast the computer
| could do it.
| sharadov wrote:
| Over-engineering has become a disease, and has reached
| pandemic-level proportions, primarily driven by resume-padders
| and clueless management folk.
| marcus_holmes wrote:
| TLDR: As always, don't fix a problem until just before you have
| it.
| fabian2k wrote:
| One point I found very interesting was the following paragraph:
|
| > For example, on my (pretty average) workstation, I can do ca.
| 25k simple read transactions per 1 CPU core on an "in memory"
| pgbench dataset...with the default config for Postgres v13!
|
| In my own very unscientific experiments I never got values as
| high as that, but in the area of around 4k transactions per
| second total on multiple cores. Of course I'm comparing very
| different things, I was looking at more than just the database
| and there are lots of other aspects I probably did in a different
| way.
|
| I find this interesting as it probably means that my bottleneck
| wasn't entirely where I thought it was. I have to play around
| with that some time again.
|
| If I wanted to find out the baseline Postgres performance on my
| hardware for trivial read queries like looking up individual rows
| by primary key with a database that fits entirely in RAM, how
| would I do that? I know there is pgbench, but that performs more
| complex queries as far as I understand.
| ahachete wrote:
| As this is a public reference: GitLab's Postgres cluster
| handles peaks of 300K tps, where the master node alone supports
| around 60K-80K. And this is not in-memory (datasize in the
| order of 8TB, RAM 600GB).
|
| https://about.gitlab.com/blog/2020/09/11/gitlab-pg-upgrade/
|
| And there's still room for vertical scaling.
|
| Disclaimer: we provide Postgres support for GitLab.
| mmacedo_en wrote:
    | HN is missing a user follow and user tagging button :-)
| ahachete wrote:
| If that's for me, I really appreciate that :)
|
| There's obviously that option on Twitter (same username as
| here), if you want to follow there ;)
| __s wrote:
| You can give pgbench files to have it run the query you want
| (see -f flag)
|
| https://www.postgresql.org/docs/current/pgbench.html
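  |
  | For the single-row lookups described above, a custom script can
  | be as small as this (against the standard pgbench tables, e.g.
  | after pgbench -i -s 10; run with pgbench -n -f lookup.sql -c 8
  | -T 30). The built-in select-only mode (-S) is essentially the
  | same workload:
  |
  |     \set id random(1, 1000000)
  |     SELECT abalance FROM pgbench_accounts WHERE aid = :id;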
| wilsonrocks wrote:
| This was an interesting read for a database novice. It seems like
| a lot of the quoted stats are about in memory datasets - is that
| realistic?
| dap wrote:
| It depends a lot on what you're doing. In my last role, keeping
| the Postgres database in memory was absolutely not an option,
| and we ran into major issues related to physical I/O.
| bmcahren wrote:
  | Yes. Memory can reach 768GB on a single instance today, and I
  | imagine that will expand. From there you can scale by sharding.
|
| In memory provides real-time transactions you can't guarantee
| when using disk-based storage.
| ahachete wrote:
| EC2's u-24tb1.metal is 224/448 cores/hyperthreads and 24TB
| RAM ;)
|
| https://aws.amazon.com/sap/instance-types/
| wiredfool wrote:
| And nearly 2 million for a 3 year reserved instance.
| atonse wrote:
| This reminded me of a story from 15 years ago.
|
| I once worked for a company that was writing a proposal
| for a US Homeland Security IT system. This was 2006. I
| wasn't involved in it but my office-mate was. He randomly
| turned his chair around and said "hey, can you go on
| Dell.com and see if you can build the most expensive
| server imaginable" - so I did and I ended up at around
| $350k. I don't remember what it was, but at the time the
| stats just felt absolutely obscene. I do remember that, at that
| scale, we were buying a closet, not just one box.
|
| And I told him the price, and he said, "They have set
| aside $20 million for server hardware, and what you
| configured is way way overkill for what they think they
| need" - so we were both so damn perplexed because, at
| least from what he told me, this didn't need that much.
| And it wasn't one of those classified projects where they
| couldn't tell you what it was for. It was a pretty boring
| line of business type thing (like HR or building access
| control or something).
|
| Maybe that's more a story of just how much cash was being
| poured into that agency in the years after 9/11.
| jarym wrote:
| They got to the end of their financial year and had $$$
| left to spend... happens all the time.
| dekimir wrote:
| > In memory provides real-time transactions you can't
| guarantee when using disk-based storage.
|
| This is changing as we discover better ways of coding IO
| operations.
|
| https://www.globenewswire.com/news-release/2019/11/05/194114...
|
| https://itnext.io/modern-storage-is-plenty-fast-it-is-the-ap...
| bcrosby95 wrote:
| How long does it take to get the data off disk and into
| memory after coming online? A decade ago filling just 64GB of
| memory with our hot dataset was a painful process. I can't
| imagine it's any nicer with 768GB.
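|
| For the "getting it back into memory" part specifically,
| Postgres ships the pg_prewarm extension; a minimal sketch (the
| table name is hypothetical):
|
|     CREATE EXTENSION IF NOT EXISTS pg_prewarm;
|     -- read a table's heap into the cache in one pass
|     SELECT pg_prewarm('accounts');
|
| It can also re-warm the buffer cache automatically after a
| restart if loaded via shared_preload_libraries = 'pg_prewarm'.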
| fabian2k wrote:
| You can get more than that on common servers; see e.g. the
| recent post by Let's Encrypt on their new database server:
|
| https://letsencrypt.org/2021/01/21/next-gen-database-servers...
|
| They have 2TB RAM in there, and I suspect that is not the
| largest possible amount if you're willing to spend more money
| (though probably the largest possible amount for that
| particular server).
|
| I played around a bit on the Dell website, and the largest
| server I could find supported up to 6TB RAM, priced at ~300k
| EUR (I assume nobody pays list price for these).
| tyingq wrote:
| 4TB seems to be the limit for normal-ish servers. 32 slots
| filled with 128GB DIMMs. You can find servers with more
| slots (48 is common, 6TB), but you're going past most
| people's definition of "commodity".
| [deleted]
| 0x10c0fe11ce wrote:
| Yes. That's why the OP talked about separating hot/cold data
| and scaling up instead of scaling out.
| f430 wrote:
| Anybody know if this applies to temporal tables? I've thought
| of using Datomic, but it seems like temporal tables do the job.
| I'm not sure about immutability, though, and how it might
| impact performance, since you are now keeping a timestamped
| record of every transaction.
| ComodoHacker wrote:
| It's not surprising to hear such advice from a PostgreSQL
| consultancy shop: don't bother and relax until the day you
| desperately need PostgreSQL consultancy! /s
|
| Seriously though, the main point stands. PostgreSQL isn't
| designed with "distributed" in mind, so try vertical scaling
| tricks first.
| [deleted]
| mumblemumble wrote:
| The skepticism is well-founded, but there's some truth here.
|
| PostgreSQL (and traditional RDBMSes generally) aren't built with
| horizontal scaling in mind. That's both a blessing and a curse.
| The curse is obvious. The blessing is that they will let you get
| _much_ further with vertical scaling, especially if you take the
| time to learn how to tune them.
|
| Support for horizontal scaling comes at a cost. When I've done
| performance analyses, I regularly find that parallel and
| distributed implementations spend most of their wall clock time
| on
| synchronization and communication. Occasionally more than 90%.
| That creates a big up-front cost you need to overcome before
| you see a net benefit. It also leads to a sort of horizontal
| scaling version of the rocket equation where returns can
| diminish rapidly.
| ccleve wrote:
| The difficulty with this advice is that it assumes that you have
| many small transactions.
|
| Yes, of course, you should try to build your application so
| queries and transactions are very short. That solves a great many
| problems.
|
| But sometimes you can't. Sometimes you just have to do joins
| across large tables. There just isn't any other way. Your query
| is going to run for 5, 10, maybe 30 seconds. That's a huge burden
| on your server, and it will slow down the other tenants. In that
| case, the _only_ answer is to distribute the queries across other
| boxes.
|
| I agree with the advice in general -- delay moving to a
| distributed system as long as possible -- but sometimes you just
| can't.
| btilly wrote:
| Point #1, there is a world of difference between a reporting
| database and a transactional database. If you need a reporting
| database, ship logs and set up a reporting database separate
| from your transactional one. That solves most of the problem.
|
| Point #2, the fact that you've hit performance problems does
| not mean that you need to distribute. Every real system that
| I've seen has had order of magnitude performance improvements
| left when it first looked like it was topping out. Add a
| caching layer, have better indexes, more targeted queries,
| appropriate use of window functions, etc. Give those a try
| before biting the bullet on distributed data.
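|
| As a concrete example of "more targeted queries": fetching each
| customer's latest order in one pass with a window function,
| instead of issuing one query per customer (the schema is
| invented for illustration):
|
|     SELECT *
|     FROM (SELECT o.*,
|                  row_number() OVER (PARTITION BY customer_id
|                                     ORDER BY created_at DESC) AS rn
|           FROM orders o) latest
|     WHERE rn = 1;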
|
| Point #3, here is a good rule of thumb for distributed systems.
| Whatever scale you hit on a single machine, you can probably
| gain 1-2 orders of magnitude of performance by switching to a
| faster language and carefully optimizing. If you switch to
| distributed, you'll LOSE at least an order of magnitude
| performance due to the overhead of RPCs, but are able to scale
| indefinitely afterwards.
|
| If you're distributing for reliability, great. But if you're
| distributing for performance and you have less than 20 machines
| in your system, either your problem is embarrassingly parallel
| or you likely aren't achieving a net win.
|
| I've seen a lot of people prematurely distribute, run into
| performance challenges, solve them, then pat themselves on the
| back for being smart enough to have distributed their code,
| while failing to recognize that they were addressing a self-
| inflicted injury.
| ajsharp wrote:
| My thoughts exactly on point #1. Nothing in a hot path should
| take multiple seconds.
| giantrobot wrote:
| Postgres has materialized views for the case of expensive
| queries. Instead of executing the view's query in real time,
| Postgres physically caches the results, and the view stays
| accessible just like a regular view.
|
| They don't solve every situation involving expensive queries,
| but they help a lot. Because they behave like dynamic views,
| you can migrate a dynamic view to a materialized one if you run
| into performance issues, without making changes to the client
| application. And with views in general, you can shuffle around
| tables and relationships without the client application knowing
| or caring.
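|
| A minimal sketch of the pattern (names invented; the unique
| index is what lets REFRESH run CONCURRENTLY, i.e. without
| blocking readers):
|
|     CREATE MATERIALIZED VIEW order_totals AS
|       SELECT customer_id, sum(amount) AS total
|       FROM orders
|       GROUP BY customer_id;
|
|     CREATE UNIQUE INDEX ON order_totals (customer_id);
|
|     -- re-run the underlying query on whatever schedule fits
|     REFRESH MATERIALIZED VIEW CONCURRENTLY order_totals;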
| mmacedo_en wrote:
| I've seen queries running for one or two minutes and raising
| user complaints. Then we looked at them, and a few changes to
| indexes and query hints brought them down to sub-second
| execution.
|
| Before thinking about distributed systems, there is an entire
| database optimization toolkit to make use of: primary key
| review, secondary index creation, profiling, view or stored
| procedure creation, temporary tables, memory tables and so on.
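|
| A sketch of the most common such fix (the table is
| hypothetical; the pattern is the point): a composite index
| matching a query's filter and sort order can turn minutes into
| milliseconds:
|
|     -- slow: SELECT * FROM events
|     --       WHERE user_id = 42
|     --       ORDER BY created_at DESC LIMIT 20;
|     CREATE INDEX events_user_created_idx
|         ON events (user_id, created_at DESC);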
| ozkatz wrote:
| This. Using any database requires building up expertise and
| understanding how to use its capabilities properly. If you hit
| a wall with a non-distributed database and your solution is to
| replace it with a distributed one, you will have a bad time.
| The surface area of what you need to know still includes all
| the data modeling, indexing, and troubleshooting that was
| required before, plus a whole lot of networking, consensus
| protocols, and consistency models to worry about (just to name
| a few).
| CharlesW wrote:
| > _Then we looked at it, and with a few changes in indexes
| and query hints brought it down to sub-second execution._
|
| This is exactly what the parent comment said: "you should try
| to build your application so queries and transactions are
| very short".
|
| If you're claiming that the parent is incorrect about
| "sometimes, the only answer is to distribute the queries
| across other boxes", my guess is that you probably don't work
| at a scale where you've learned that query optimization can't
| solve every database performance problem.
| btilly wrote:
| Based on my past experiences, I'd be happy to take an even
| money bet that more than 95% of organizations that go
| distributed for performance reasons actually caused
| themselves more problems than they solved.
|
| This does NOT mean that there are no use cases for
| distributed - I've seen Google's internals and it would be
| impossible to do that any other way. But it does mean that
| when someone is telling you with a straight face that they
| needed horizontal scalability for performance, you should
| assume that they were probably wrong. (Though saying that is
| probably not the wisest thing - particularly if the
| person you're talking to is the architect whose beautiful
| diagrams you'd be criticizing.)
|
| So yes, there are problems that require a distributed
| architecture for performance reasons. That doesn't
| contradict the point that every other option should be
| explored first.
| CharlesW wrote:
| > _Based on my past experiences, I 'd be happy to take an
| even money bet that more than 95% of organizations that
| go distributed for performance reasons actually caused
| themselves more problems than they solved._
|
| Oh, I'm absolutely positive you're right! And as you
| know, even in distributed architectures there's a huge
| range of solutions from "let's do these _n_ reasonably
| simple things" to "let's just rebuild our system to
| solve every conceivable future problem". I don't know of
| a scenario where the latter has ever worked.
| jeffbee wrote:
| Indexes are not free: they take up space and they make
| mutations more costly. Also, building an index may not even be
| possible while your application is running, because PostgreSQL
| and other RDBMSes have inadequate facilities for throttling
| index construction so that it doesn't harm the online workload.
| You might have to build indexes at midnight on Sundays, or even
| take your whole system offline. It can be a nightmare.
|
| This isn't just a problem for SQL databases. Terrible-but-
| popular NoSQL systems like MongoDB also rely heavily on indexes
| while providing few or no safety features to prevent an index
| build from wrecking the online workload.
|
| I personally prefer databases that simply do not have indexes,
| like Bigtable, because they require more forethought from data
| and application architects, leading to fundamentally better
| systems.
| btilly wrote:
| I've seen
| https://www.postgresql.org/docs/12/sql-createindex.html#SQL-...
| work well in practice on busy transactional databases. I'd be
| interested in knowing about cases where it doesn't work well.
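|
| That link is to the CONCURRENTLY option of CREATE INDEX; for
| reference, a minimal sketch (names invented):
|
|     -- builds without taking a write-blocking lock; slower,
|     -- and if it fails it leaves behind an INVALID index that
|     -- must be dropped before retrying
|     CREATE INDEX CONCURRENTLY orders_customer_idx
|         ON orders (customer_id);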
|
| My experience on systems without indexes differs strongly
| from yours. Yes, they can work well. But if you have
| multiple use cases for how your data is being queried, they
| push you into keeping multiple copies of your data. And
| then it is very, very easy to lose consistency. And yes, I
| know about "eventual consistency" and all that - I've found
| that in practice it is a nightmare of special cases that
| winds up nowhere good.
| jeffbee wrote:
| Just from personal experience: if they can build Gmail on top
| of a database (Bigtable) that has neither indexes nor
| consistency, then it will probably also be suitable for the
| purposes of my far smaller, much less demanding products.
|
| On the other hand I've seen, and am currently suffering
| through, products that have desperate performance
| problems with trivial amounts (tens of GB) of data in
| relational databases with indexes aplenty.
| mcherm wrote:
| > Sometimes you just have to do joins across large tables.
| There just isn't any other way. Your query is going to run for
| 5, 10, maybe 30 seconds.
|
| While that is true, I would speculate (based on my own
| experience, not any actual research) that it is far more common
| for users to have an unoptimized query or database schema
| (perhaps it just needs an index) that takes tens of seconds to
| run but doesn't NEED to, than for users to genuinely need a
| complex query that cannot run any faster.
|
| So for MOST users, the best advice is to learn how to use
| EXPLAIN ANALYZE and other database tools to optimize execution.
| Only if you ALREADY know that should you be considering moving
| to a read replica or some sort of distributed database.
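|
| A generic EXPLAIN ANALYZE sketch (not tied to any real schema):
|
|     -- shows the actual plan, row counts, timings, and buffer
|     -- usage; usually enough to spot a missing index or a bad
|     -- join strategy
|     EXPLAIN (ANALYZE, BUFFERS)
|     SELECT * FROM orders WHERE customer_id = 42;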
| cliftonk wrote:
| You could always start with PostgreSQL and use the transaction
| log to materialize views for particular use cases (i.e.
| differential dataflow). This post is clickbaity in that it
| doesn't give any administration advice, but it's right in the
| broad sense that Postgres/MySQL are still the best low-latency
| single sources of truth available.
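|
| The Postgres mechanism for tapping the transaction log this way
| is logical decoding. A tiny sketch using the test_decoding demo
| plugin that ships with Postgres (a real pipeline would use a
| purpose-built output plugin such as wal2json):
|
|     SELECT pg_create_logical_replication_slot('changes',
|                                               'test_decoding');
|     -- ...writes happen on the upstream tables...
|     SELECT * FROM pg_logical_slot_get_changes('changes',
|                                               NULL, NULL);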
| petr25102018 wrote:
| For anyone interested in general advice on scaling traditional
| SQL databases, check out my article [0], where I collected
| various techniques while investigating this topic at my last
| workplace.
|
| [0] https://stribny.name/blog/2020/07/scaling-relational-sql-dat...
___________________________________________________________________
(page generated 2021-01-26 23:00 UTC)