[HN Gopher] Go ahead, self-host Postgres
___________________________________________________________________
Go ahead, self-host Postgres
Author : pavel_lishin
Score : 649 points
Date : 2025-12-20 15:43 UTC (1 day ago)
(HTM) web link (pierce.dev)
(TXT) w3m dump (pierce.dev)
| arichard123 wrote:
| I've been self hosting it for 20 years. Best technical decision I
| ever made. Rock solid
| newsoftheday wrote:
| I've been self-hosting it for at least 10 years, alongside
| MySQL (MySQL for longer). No issues self-hosting either. I have
| backups and I know they work.
| moxplod wrote:
| What server company are you guys using with high reliability?
| Looking for server in US-East right now.
| ipsento606 wrote:
| > If your database goes down at 3 AM, you need to fix it.
|
| Of all the places I've worked that had the attitude "If this goes
| down at 3AM, we need to fix it immediately", there was only one
| where that was actually justifiable from a business perspective.
| I've worked at plenty of places that had this attitude despite the
| fact that overnight traffic was minimal and nothing bad actually
| happened if a few clients had to wait until business hours for a
| fix.
|
| I wonder if some of the preference for big-name cloud
| infrastructure comes from the fact that during an outage,
| employees can just say "AWS (or whatever) is having an outage,
| there's nothing we can do" vs. being expected to actually fix it
|
| From this perspective, the ability to fix problems more quickly
| when self hosting could be considered an antifeature from the
| perspective of the employee getting woken up at 3am
| jonahx wrote:
| This is also the basis for most SaaS purchases by large
| corporations. The old "Nobody gets fired for choosing IBM."
| zbentley wrote:
| Really? That might be an anecdote sampled from unusually small
| businesses, then. Between myself and most peers I've ever
| talked to about availability, I heard an overwhelming majority
| of folks describe systems that really did need to be up 24/7
| with high availability, and thus needed fast 24/7 incident
| response.
|
| That includes big and small businesses, SaaS and non-SaaS, high
| scale (5M+rps) to tiny scale (100s-10krps), and all sorts of
| different markets and user bases. Even at the companies that
| were not staffed or providing a user service over night,
| overnight outages were immediately noticed because on average,
| more than one external integration/backfill/migration job was
| running at any time. Sure, "overnight on call" at small places
| like that was more "reports are hardcoded to email Bob if they
| hit an exception, and integration customers either know Bob's
| phone number or how to ask their operations contact to call
| Bob", but those are still environments where off-hours uptime
| and fast resolution of incidents was expected.
|
| Between me, my colleagues, and friends/peers whose stories I
| know, that's an N of high dozens to low hundreds.
|
| What am I missing?
| runako wrote:
| > What am I missing?
|
| IME the need for 24x7 for B2B apps is largely driven by
| global customer scope. If you have customers in North
| America and Asia, you now need 24x7 (and x365 because of
| little holiday overlap).
|
| That being said, there are a number of B2B apps/industries
| where global scope is not a thing. For example, many
| providers who operate in the $4.9 trillion US healthcare
| market do not have any international users. Similarly the
| $1.5 trillion (revenue) US real estate market. There are
| states where one could operate where healthcare spending is
| over $100B annually. Banks. Securities markets. Lots of
| things do not have 24x7 business requirements.
| zbentley wrote:
| I've worked for banks, multiple large and small US
| healthcare-related companies, and businesses that didn't
| use their software when they were closed for the night.
|
| All of those places needed their backend systems to be up
| 24/7. The banks ran reports and cleared funds with nightly
| batches--hundreds of jobs a night for even small banking
| networks. The healthcare companies needed to receive claims
| and process patient updates (e.g. your provider's EMR is
| updated if you die or have an emergency visit with another
| provider you authorized for records sharing--and no, this
| is not handled by SaaS EMRs in many cases) over night so
| that their systems were up to date when they next opened
| for business. The "regular" businesses that closed for the night
| still generated reports and frequently had IT staff doing
| migrations, or senior staff working on something at
| midnight due the next day (when the head of marketing is
| burning the midnight oil on that presentation, you don't
| want to be the person explaining that she can't do it
| because the file server hosting the assets is down all the
| time after hours).
|
| And again, that's the norm I've heard described from nearly
| everyone in software/IT that I know: most businesses expect
| (and are willing to pay for or at least insist on) 24/7
| uptime for their computer systems. That seems true across
| the board: for big/small/open/closed-off-
| hours/international/single-timezone businesses alike.
| chickensong wrote:
| Uptime is also a sales and marketing point, regardless of
| real-world usage. Business folks in service-providing
| companies will usually expect high availability by
| default, only tempered by the cost and reality of more
| nines.
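The cost of "more nines" is easy to quantify: each extra nine divides the yearly downtime budget by ten. A quick sketch (the helper name is ours, purely illustrative):

```python
# Convert an availability target ("nines") into the downtime it
# permits per year. Illustrative helper, not from the thread.

def allowed_downtime_minutes(availability: float, days: int = 365) -> float:
    """Minutes of permitted downtime per `days` for a given SLA fraction."""
    total_minutes = days * 24 * 60
    return total_minutes * (1.0 - availability)

# 99% allows roughly 3.7 days/year; 99.9% roughly 8.8 hours/year.
```

Three nines already implies someone can respond off-hours, which is exactly the staffing cost being debated above.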
|
| Also, in addition to perception/reputation issues, B2B
| contracts typically include an SLA, and nobody wants to
| be in breach of contract.
|
| I think the parent you're replying to is wrong, because
| I've worked at small companies selling into large
| enterprise, and the expectation is basically 24/7 service
| availability, regardless of industry.
| runako wrote:
| You are right that a lot of systems at a lot of places
| need 24x7. Obviously.
|
| But there are also a not-insignificant number of
| important systems where nobody is on a pager, where there
| is no call rotation[1]. Computers are much more reliable
| than they were even 20 years ago. It is an Acceptable
| Business Choice to not have 24x7 monitoring for some
| subset of systems.
|
| Until very recently[2], Citibank took their public
| website/user portal offline for hours a week.
|
| 1 - if a system does not have a fully staffed call
| rotation with escalations, it's not prepared for a real
| off-hours uptime challenge
| 2 - they may still do this, but I don't have a way to
| verify right now.
| stickfigure wrote:
| This lasts right up until an important customer can't
| access your services. Executives don't care about
| downtime until they have it, then they suddenly care a
| lot.
| true_religion wrote:
| You can often have services available for VIPs, and be
| down for the public.
|
| Unless there's a misconfiguration, usually apps are
| always visible internally to staff, so there's an
| existing methodology to follow to make them visible to
| VIPs.
|
| But sometimes none of that is necessary. At a $1B market
| cap company, I've seen a failure case where the solution
| was _manual_ execution by customer success reps while the
| computers were down. It was slower, but not many people
| complained that their reports took 10 minutes to arrive
| after being parsed by Eye Ball Mk 1s, instead of the 1
| minute of wait time they were used to.
| sixdonuts wrote:
| Thousands of orgs have full stack OT/CI apps/services
| that must run 24/7 365 and are run fully on premise.
| laz wrote:
| The worst SEV calls are the ones where you twiddle your thumbs
| waiting for a support rep to drop a crumb of information about
| the provider outage.
|
| You wake up. It's not your fault. You're helpless to solve it.
| OccamsMirror wrote:
| Not when that provider is AWS and the outage is hitting news
| websites. You share the link to AWS being down and go back to
| sleep.
| laz wrote:
| No. You sit on the call and wait to restore your service to
| your users. There's bullshit toil in disabling scale-in as
| the outage gets longer.
|
| Eventually, AWS has a VP of something dial in to your call
| to apologize. They're unprepared and offer no new
| information. They get handed to a side call for executive
| bullshit.
|
| AWS comes back. Your support rep only vaguely knows what's
| going on. Your system serves some errors but digs out.
|
| Then you go to sleep.
| sixdonuts wrote:
| News is one thing, if the app/service down impacts revenue,
| safety or security you won't be getting any sleep AWS or
| not.
| odie5533 wrote:
| Recommends hosting postgres yourself. Doesn't recommend a
| distribution stack. If you try this at a startup to save $50 a
| month, you will never recoup the time you wasted setting it up.
| We pay dedicated managed services for these things so we can make
| products on top of them.
| ijustlovemath wrote:
| There's not much to recommend; just use the Postgres from your
| distribution's LTS repo. I like Debian for its rock solid
| stability.
| odie5533 wrote:
| Patroni, Pigsty, Crunchy, CloudNativePG, Zalando, ...
| notaseojh wrote:
| "just use postgres from your distro" is *wildly* underselling
| the amount of work that it takes to go from apt install
| postgres to having a production ready setup (backups,
| replica, pooling, etc). Granted, if it's a tiny database,
| just pg_dump-ing might be enough, but for many it won't
| be.
| dvtkrlbs wrote:
| I don't think any of these would take more than a week to
| set up. Assuming you create a nice runbook with every step,
| it would not be horrible to maintain either. Barman for
| backups, and unless you need multi-master you can use the
| built-in publication and subscription. Though with scale,
| things can get complicated really fast; most of the time
| you won't have enough traffic to need anything complicated.
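For the pg_dump end of that spectrum, the retention side of a backup runbook is simple enough to sketch. A minimal example, assuming a hypothetical naming scheme of one dated dump file per day:

```python
# Sketch of a daily-backup retention policy. Assumes pg_dump output
# files named like "app_2025-12-20.dump" (the naming scheme is ours).
from datetime import date

def dumps_to_prune(filenames: list, keep: int = 7) -> list:
    """Return the dump files older than the `keep` most recent ones."""
    def file_date(name):
        # "app_2025-12-20.dump" -> date(2025, 12, 20)
        stem = name.rsplit(".", 1)[0]
        return date.fromisoformat(stem.split("_", 1)[1])
    ordered = sorted(filenames, key=file_date, reverse=True)
    return ordered[keep:]
```

A cron job would call this on the backup directory listing and delete what it returns; the restore test (actually loading a dump somewhere) is the part no script replaces.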
| true_religion wrote:
| If you're a 'startup', you'll never need any of that work
| until you make it big. 99% of startups do not make it even
| medium size.
|
| If you're a small business, you almost never need replicas
| or pooling. Postgres is insanely capable on modern
| hardware, and is probably the fastest part of your
| application if your application is written in a slower
| dynamic language like Python.
|
| I once worked with a company that scaled up to 30M revenue
| annually, and never once needed more than a single
| dedicated server for postgres.
| marcosdumay wrote:
| The one problem with using your distro's Postgres is that
| your upgrade routine will be dictated by a 3rd party.
|
| And Postgres upgrades are not transparent. So you'll have a
| one- to two-hour task every 6 to 18 months, with only a
| small amount of control over when it happens. This is fine
| for a lot of people, and completely unthinkable for some
| other people.
| true_religion wrote:
| Why would your distro dictate the upgrade routine? Unless
| the distro stops supporting an older version of Postgres,
| you can continue using it. Most companies I know of
| wouldn't dare do an upgrade of an existing production
| database for at least 5 years, and when it does happen...
| downtime is acceptable.
| ezekg wrote:
| Maybe come back when your database spend is two or three orders
| of magnitude higher? It gets expensive pretty fast in my
| experience.
| ijustlovemath wrote:
| And if you want a supabase-like functionality, I'm a huge fan of
| PostgREST (which is actually how supabase works/worked under the
| hood). Make a view for your application and boom, you have a GET
| only REST API. Add a plpgsql function, and now you can POST. It
| uses JWT for auth, but usually I have the application on the
| same VLAN as the DB, so it's less exposed to abuse.
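PostgREST's filtering works by mapping query parameters of the form `column=operator.value` onto WHERE clauses against the exposed view. A small sketch of composing such a request URL (the base URL and view name are made up for illustration):

```python
# Build a PostgREST-style GET URL. PostgREST filters use
# column=operator.value, e.g. age=gte.18 for WHERE age >= 18.
# "active_users" is a hypothetical view name.
from urllib.parse import urlencode

def postgrest_url(base: str, view: str, **filters: str) -> str:
    """Compose a PostgREST read URL from filter/order parameters."""
    return f"{base}/{view}?{urlencode(filters)}"

url = postgrest_url("http://localhost:3000", "active_users",
                    age="gte.18", order="name.asc")
```

The same view-per-resource idea is what makes the GET-only API "free": the SQL view defines both the shape and the access surface.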
| satvikpendem wrote:
| You can self host Supabase too.
| SamDc73 wrote:
| Last time I checked, it was a pain in the ass to self-host it
| irusensei wrote:
| And then there is the urge to Postgres everything.
|
| I was disappointed alloy doesn't support timescaledb as a metrics
| endpoint. Considering switching to telegraf just because I can
| store the metrics on Postgres.
| ErroneousBosh wrote:
| I've always just Postgressed everything. I used MySQL a bit in
| the PHP3 days, but eventually moved onto Postgres.
|
| SQLite when prototyping, Postgres for production.
|
| If you need to power a lawnmower and all you have is a 500bhp
| Scania V8, you may as well just do it.
| WillDaSilva wrote:
| It's pretty easy these days to spin up a local Postgres
| container. Might as well use it for prototyping too, and save
| yourself the hassle of switching later.
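For reference, a minimal sketch of such a local container with Docker Compose (the image tag, password, and port are illustrative only, matched to whatever major version prod runs):

```yaml
# Minimal local-dev Postgres, mirroring a typical prod major version.
services:
  db:
    image: postgres:17
    environment:
      POSTGRES_PASSWORD: dev-only-password
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:
```

Pinning the major version here is the point: it keeps dev behavior (planner, SQL features, extensions) aligned with production.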
| tcdent wrote:
| It might seem minor, but the little things add up. Making
| your dev environment mirror prod from the start will save
| you a bunch of headaches. Then, when you're ready to
| deploy, there is nothing to change.
|
| Even better, stage to a production-like environment early,
| and then deploy day can be as simple as a DNS record
| change.
| OccamsMirror wrote:
| Thanks to LetsEncrypt DNS-01, you can absolutely spin up
| a production-like environment with SSL and everything.
| It's definitely worth doing.
| solarengineer wrote:
| Have you given thought to why you prototype with SQLite?
|
| I have switched to using postgres even for prototyping once I
| prepared some shell scripts for various setup. With hibernate
| (java) or knex (Javascript/NodeJS) and with unit tests (Test
| Driven Development approach) for code, I feel I have reduced
| the friction of using postgres from the beginning.
| ErroneousBosh wrote:
| Because when I get tired of reconstructing the contents of
| the database between my various dev machines (at home, at
| work, on a remote server, on my laptop) I can just scp the
| sqlite db across.
|
| Because it's "low effort" to just fire it into sqlite, even
| if I have to do ridiculous things to the schema as I footer
| around working out exactly what I want the database to do.
|
| I don't want to use nodejs if I can possibly avoid it and
| you literally could not pay me to even look at Java, there
| isn't enough money in the world.
| solarengineer wrote:
| I mentioned Hibernate and knex as examples of DB schema
| version control tools.
|
| Incidentally, you can rsync postgres dumps as well.
| That's what I do when testing and when sharing test data
| with team mates. At times, I decide to pgload the
| database dump into a different target system.
|
| My reason for sharing: I accepted that I was being
| lethargic about using postgres, so I just automated
| certain things as I went along.
| nileshtrivedi wrote:
| I have now switched to pglite for prototyping, because it
| lets me use all the postgres features.
| ErroneousBosh wrote:
| Oho, what is this pglite that I have never heard of? I
| already like the sound of it.
| ishandotpage wrote:
| `pglite` is a WASM version of postgres. I use it in one
| of my side projects for providing a postgres DB running
| in the user's browser.
|
| For most purposes, it works perfectly fine, but with two
| main caveats:
|
| 1. It is single user, single connection (i.e. no MVCC)
|
| 2. It doesn't support all postgres extensions (particularly
| postGIS), though it does support pgvector
|
| https://github.com/supabase-community/pg-gateway might let
| you use pglite for prototyping, I guess, but I haven't
| used it.
| ZeroConcerns wrote:
| So, yeah, I guess there's much confusion about what a 'managed
| database' actually _is_? Because for me, the table stakes are:
|
| -Backups: the provider will push a full generic disaster-recovery
| backup of my database to an off-provider location at least daily,
| without the need for a maintenance window
|
| -Optimization: index maintenance and storage optimization are
| performed automatically and transparently
|
| -Multi-datacenter failover: my database will remain available
| even if part(s) of my provider are down, with a minimal data loss
| window (like, 30 seconds, 5 minutes, 15 minutes, depending on SLA
| and thus plan expenditure)
|
| -Point-in-time backups are performed at an SLA-defined
| granularity and with a similar retention window, allowing me to
| access snapshots via a custom DSN, not affecting production
| access or performance in any way
|
| -Slow-query analysis: notifying me of _relevant_ performance
| bottlenecks before they bring down production
|
| -Storage analysis: my plan allows for #GB of fast storage, #TB of
| slow storage: let me know when I'm forecast to run out of either
| in the next 3 billing cycles or so
|
| Because, well, if anyone provides all of that for a monthly fee,
| the whole "self-hosting" argument goes out of the window quickly,
| right? And I say that as someone who absolutely _adores_ self-
| hosting...
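The storage-forecast item in that list is just extrapolation. A naive sketch, assuming evenly spaced per-billing-cycle usage samples (the function name is hypothetical):

```python
# "Forecast to run out in the next N billing cycles": naive linear
# extrapolation from recent usage samples, one sample per cycle.

def cycles_until_full(samples_gb: list, capacity_gb: float):
    """Billing cycles until capacity is hit, or None if usage isn't growing."""
    growth = (samples_gb[-1] - samples_gb[0]) / (len(samples_gb) - 1)
    if growth <= 0:
        return None  # flat or shrinking: no exhaustion forecast
    return (capacity_gb - samples_gb[-1]) / growth
```

A managed provider presumably does something smarter (seasonality, per-tablespace tiers), but even this version catches the "alert me three cycles out" case.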
| thedougd wrote:
| It's even worse when you start finding you're staffing
| specialized skills. You have the Postgres person, and they're
| not quite busy enough, but nobody else wants to do what they
| do. But then you have an issue while they're on vacation, and
| that's a problem. Now I have a critical service but with a bus
| factor problem. So now I staff two people who are now not very
| busy at all. One is a bit ambitious and is tired of being
| bored. So he's decided we need to implement something new in
| our Postgres to solve a problem we don't really have. Uh oh, it
| doesn't work so well, the two spend the next six months trying
| to work out the kinks with mixed success.
| arcbyte wrote:
| Slack is a necessary component in well functioning systems.
| thedougd wrote:
| Of course! It should be included in the math when comparing
| in-housing Postgres vs using a managed service.
| zbentley wrote:
| And rental/SaaS models often provide an extremely cost
| effective alternative to needing to have a lot of slack.
|
| Corollary: rental/SaaS models provide that property in
| large part _because_ their providers have lots of slack.
| satvikpendem wrote:
| This would be a strange scenario because why would you keep
| these people employed? If someone doesn't want to do the job
| required, including servicing Postgres, then they wouldn't be
| with me any longer, I'll find someone who does.
| sixdonuts wrote:
| No doubt. Reading this thread leads me to believe that
| almost no one wants to take responsibility for anything
| anymore, even hiring the right people. Why even hire
| someone who isn't going to take responsibility for their
| work and be part of a team? If an org is worried about the
| "bus factor" they are probably not hiring the right people
| and/or the org management has poor team building skills.
| satvikpendem wrote:
| Exactly, I just don't understand the grandparent's point,
| why have a "Postgres person" at all? I hire an engineer
| who should be able to do it all, no wonder there's been a
| proliferation of full stack engineers over specialized
| ones.
|
| And especially having worked in startups, I was expected
| to do many different things, from fixing infrastructure
| code one day to writing frontend code the next. If you're
| in a bigger company, maybe it's understandable to be
| specialized, but especially if you're at a company with
| only a few people, you must be willing to do the job,
| whatever it is.
| stackskipton wrote:
| Because, now working at what used to be startup size: not
| having an X person leads to really bad technical debt, as
| the person handling X was not really skilled enough to be
| doing so, but it created an illusion of success. Those
| technical debt problems are causing us massive issues now
| and costing the business real money.
| marcosdumay wrote:
| IMO, the reason to self-host your database is latency.
|
| Yes, I'd say backups and analysis are table stakes for hiring
| it, and multi-datacenter failover is a relevant nice to have.
| But the reason to do it yourself is because it's literally
| impossible to get anything as good as you can build in
| somebody's else computer.
| andersmurphy wrote:
| Yup, often orders of magnitude better.
| dangoodmanUT wrote:
| There should be no data loss window with a hosted database
| xboxnolifes wrote:
| Why is that?
| andersmurphy wrote:
| From what I remember, if AWS loses your data they basically
| give you some credits and that's it.
| jeremyjh wrote:
| That requires synchronous replication, which reduces
| availability and performance.
| wahnfrieden wrote:
| Yugabyte open source covers a lot of this
| odie5533 wrote:
| Self-host things the boss won't call at 3 AM about: logs,
| traces, exceptions, internal apps, analytics. Don't self-host
| the database or major services.
| cube00 wrote:
| Depending on your industry, logs can be very serious
| business.
| satvikpendem wrote:
| If you set it up right, you can automate all this as well by
| self-hosting. There is really nothing special about automating
| backups or multi-region failover.
| awestroke wrote:
| But then you have to regularly and manually check that
| these mechanisms work
| satvikpendem wrote:
| One thing I learned working in the industry, you have to
| check them when you're using AWS too.
| awestroke wrote:
| Really? You're saying RDS backups can't be trusted?
| satvikpendem wrote:
| Trusted in what sense, that they'll always work perfectly
| 100% of the time? No, therefore one must still check them
| from time to time, and it's really no different when self
| hosting, again, if you do it correctly.
| awestroke wrote:
| What are some common ways that RDS backups fail to be
| restored?
| satvikpendem wrote:
| Why are you asking me this? Are you trying to test
| whether I've actually used RDS before? I'm sure a quick
| search will find you the answer to your question.
| SoftTalker wrote:
| No backup strategy can be blindly trusted. You must
| verify it, and also test that restores actually work.
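One low-tech way to do that verification, as a sketch: restore into a scratch instance, then compare per-table row counts (or checksums) against production. The comparison step below assumes both sides were collected with SELECT count(*) queries; the helper is ours, not a real tool:

```python
# Compare per-table row counts from production against the same
# counts taken from a scratch restore; report any divergence.

def restore_mismatches(prod: dict, restored: dict) -> dict:
    """Tables whose row counts differ (or are missing) after restore."""
    return {
        table: (prod.get(table), restored.get(table))
        for table in set(prod) | set(restored)
        if prod.get(table) != restored.get(table)
    }
```

Row counts are a weak but cheap invariant; a stricter check would hash a deterministic ordering of each table's rows.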
| graemep wrote:
| Which providers do all of that?
| BoorishBears wrote:
| I don't know which don't?
|
| The defaults I've used on Amazon and GCP (RDS and Cloud
| SQL) both do.
| jeffbee wrote:
| GCP Alloy DB
| isuckatcoding wrote:
| Take a look at https://github.com/vitabaks/autobase
|
| In case you want to self host but also have something that takes
| care of all that extra work for you
| runako wrote:
| Thank you, this looks awesome.
| satvikpendem wrote:
| I wonder how well this plays with other self hosted open source
| PaaS, is it just a Docker container we can run I assume?
| yakkomajuri wrote:
| Just skimmed the readme. What's the connection pooling
| situation here? Or is it out of scope?
| heipei wrote:
| I still don't get how folks can hype Postgres with every second
| post on HN, yet there is no simple batteries-included way to run
| a HA Postgres cluster with automatic failover like you can do
| with MongoDB. I'm genuinely curious how people deal with this in
| production when they're self-hosting.
| christophilus wrote:
| I've been tempted by MariaDB for this reason. I'd love to hear
| from anyone who has run both.
| paulryanrogers wrote:
| IMO Maria has fallen behind MySQL. I wouldn't choose it for
| anything my income depends on.
|
| (I do use Maria at home for legacy reasons, and have used
| MySQL and Pg professionally for years.)
| danaris wrote:
| > IMO Maria has fallen behind MySQL. I wouldn't choose it
| for anything my income depends on.
|
| Can you give any details on that?
|
| I switched to MariaDB back in the day for my personal
| projects because (so far as I could tell) it was being
| updated more regularly, _and_ it was more fully open
| source. (I don't recall offhand at this point whether
| MySQL switched to a fully _paid_ model, or just less open.)
| chuckadams wrote:
| One area where Maria lags significantly is JSON support.
| In MariaDB, JSON is just an alias for LONGTEXT plus
| validation:
| https://mariadb.com/docs/server/reference/data-
| types/string-...
| paulryanrogers wrote:
| IME MariaDB doesn't recover or run as reliably as modern
| versions of MySQL, at least with InnoDB.
| Seattle3503 wrote:
| SKIP LOCKED was added in 10.6 (~2021), years after MySQL
| had it (~2017). My company was using MariaDB around that
| time, trailing a version or two behind, and it made
| implementing a queue very painful.
| paulryanrogers wrote:
| RDS provides some HA. HAProxy or PGBouncer can help when self
| hosting.
| notaseojh wrote:
| It's easy to throw names out like this (pgbackrest is also
| useful...) but getting them set up properly in a production
| environment is not at all straightforward, which I think is
| the point.
| zbentley wrote:
| ...in which case, you should probably use a hosted offering
| that takes care of those things for you. RDS Aurora
| (Serverless or not), Neon, and many other services offer
| those properties without any additional setup. They charge
| a premium for them, however.
|
| It's not like Mongo gives you those properties for free
| either. Replication/clustering related data loss is still
| incredibly common precisely because mongo makes it _seem_
| like all that stuff is handled automatically at setup when
| in reality it requires plenty of manual tuning or extra
| software in order to provide the guarantees everyone thinks
| it does.
| paulryanrogers wrote:
| Yeah my hope is that the core team will adopt a built in
| solution, much as they finally came around on including
| logical replication.
|
| Until then it is nice to have options, even if they do
| require extra steps.
| mfalcao wrote:
| The most common way to achieve HA is using Patroni. The easiest
| way to set it up is using Autobase (https://autobase.tech).
|
| CloudNativePG (https://cloudnative-pg.io) is a great option if
| you're using Kubernetes.
|
| There's also pg_auto_failover which is a Postgres extension and
| a bit less complex than the alternatives, but it has its
| drawbacks.
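All of these tools ultimately rest on some form of leader lease held in an external store (etcd, Consul, or Kubernetes itself, in Patroni's case). A toy model of that mechanism, purely illustrative and not any tool's actual implementation:

```python
# Toy leader-lease model: the primary must keep renewing a TTL'd
# lease; once it expires, a standby may take the lease (and be
# promoted). Real systems do this via a DCS like etcd.

class LeaderLease:
    def __init__(self, ttl: float):
        self.ttl = ttl
        self.holder = None
        self.expires_at = 0.0

    def acquire_or_renew(self, node: str, now: float) -> bool:
        """Grant the lease if free or expired; renew it for the holder."""
        if self.holder in (None, node) or now >= self.expires_at:
            self.holder = node
            self.expires_at = now + self.ttl
            return True
        return False
```

The hard parts Patroni adds on top (fencing the old primary, choosing the most-caught-up standby, avoiding split brain during network partitions) are exactly why "just add a watchdog script" is not enough.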
| franckpachot wrote:
| Be sure to read the Myths and Truths about Synchronous
| Replication in PostgreSQL (by the author of Patroni) before
| considering those solutions as cloud-native high
| availability: https://www.postgresql.eu/events/pgconfde2025/s
| essions/sessi...
| da02 wrote:
| What is your preferred alternative to Patroni?
| tresil wrote:
| If you're running Kubernetes, CloudNativePG seems to be the
| "batteries included" HA Postgres cluster that's becoming the
| standard in this area.
| franckpachot wrote:
| CloudNativePG is automation around PostgreSQL, not "batteries
| included", and not the Kubernetes ideal where pods can die
| or spawn without impacting availability. Unfortunately,
| naming it Cloud Native doesn't transform a monolithic
| database into an elastic cluster.
| monus wrote:
| We've recently had a disk failure in the primary and
| CloudNativePG promoted another to be primary but it wasn't
| zero downtime. During transition, several queries failed. So
| something like pgBouncer together with transactional queries
| (no prepared statements) is still needed which has
| performance penalty.
| _rwo wrote:
| > So something like pgBouncer together with transactional
| queries
|
| FYI - it's already supported by cloudnativepg [1]
|
| I was playing with this operator recently and I'm truly
| impressed - it's a piece of art when it comes to postgres
| automation; alongside with barman [2] it does everything I
| need and more
|
| [1] https://cloudnative-pg.io/docs/1.28/connection_pooling
| [2] https://cloudnative-pg.io/plugin-barman-cloud/
| jknoepfler wrote:
| I use Patroni for that in a k8s environment (although it works
| anywhere). I get an off-the-shelf declarative deployment of an
| HA postgres cluster with automatic failover with a little
| boiler-plate YAML.
|
| Patroni has been around for a while. The database-as-a-service
| team where I work uses it under the hood. I used it to build
| database-as-a-service functionality on the infra platform team
| I was at prior to that.
|
| It's basically push-button production PG.
|
| There's at least one decent operator framework leveraging it,
| if that's your jam. I've been living and dying by self-hosting
| everything with k8s operators for about 6-7 years now.
| tempest_ wrote:
| We use Patroni and run it outside of k8s, on prem; no
| issues in 6 or 7 years. Just upgraded from pg 12 to 17
| with basically no downtime, and without issue either.
| baobun wrote:
| Yo, I'm curious whether you have any pointers to share on
| how you went about this. Did you use their provided upgrade
| script, or did you instrument the upgrade yourself "out of
| band"? rsync?
|
| Currently scratching my head on what the appropriate
| upgrade procedure is for a non-k8s/operator spilo/patroni
| cluster for minimal downtime and risk. The script doesn't
| seem to work for this setup, erroring on mismatching
| PG_VERSION when attempting. If you don't mind sharing it
| would be very appreciated.
| tempest_ wrote:
| I did not use a script (my environment is bare metal
| running ubuntu 24).
|
| I read these and then wrote my own scripts that were
| tailored to my environment.
|
| https://pganalyze.com/blog/5mins-postgres-zero-downtime-
| upgr...
|
| https://www.pgedge.com/blog/always-online-or-bust-zero-
| downt...
|
| https://knock.app/blog/zero-downtime-postgres-upgrades
|
| Basically
|
| - Created a new cluster on new machines
|
| - Started logically replicating
|
| - Waited for that to complete and then left it there
| replicating for a while until I was comfortable with the
| setup
|
| - We were already using haproxy and pgbouncer
|
| - Then I did a cut over to the new setup
|
| - Everything looked good so after a while I tore down the
| old cluster
|
| - This was for a database 600gb-1tb in size
|
| - The client application was not doing anything overly
| fancy which meant there was very little to change going
| from 12 to 17
|
| - Additionally I did all of the above in a staging
| environment first to make sure it would work as expected
|
| Best of luck.
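One detail in a cutover like the one above is deciding when the subscriber has caught up. Postgres reports WAL positions as pg_lsn values ("X/Y" in hex, a 64-bit byte position); a sketch of a lag check built on that format, with an arbitrary threshold:

```python
# Decide cutover readiness by comparing WAL positions (pg_lsn
# strings as returned by e.g. pg_current_wal_lsn()). The lag
# threshold here is an arbitrary illustration.

def lsn_to_int(lsn: str) -> int:
    """'0/16B3748' -> integer byte position in the WAL stream."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)

def ready_to_cut_over(primary_lsn: str, replica_lsn: str,
                      max_lag_bytes: int = 1024) -> bool:
    """True when the replica is within max_lag_bytes of the primary."""
    return lsn_to_int(primary_lsn) - lsn_to_int(replica_lsn) <= max_lag_bytes
```

In a logical-replication cutover you would also pause writes briefly so lag reaches zero before flipping haproxy/pgbouncer to the new cluster.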
| dangoodmanUT wrote:
| Patroni, Zolando operator on k8s
| franckpachot wrote:
| It's largely cultural. In the SQL world, people are used to
| accepting the absence of real HA (resilience to failure, where
| transactions continue without interruption) and instead rely on
| fast DR (stop the service, recover, check for data loss, start
| the service). In practice, this means that all connections are
| rolled back, clients must reconnect to a replica known to be in
| synchronous commit, and everything restarts with a cold cache.
|
| Yet they still call it HA because there's nothing else. Even a
| planned shutdown of the primary to patch the OS results in
| downtime, as all connections are terminated. The situation is
| even worse for major database upgrades: stop the application,
| upgrade the database, deploy a new release of the app because
| some features are not compatible between versions, test, re-
| analyze the tables, reopen the database, and only then can
| users resume work.
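From the application side, surviving the restart-based failover described above mostly comes down to reconnect logic. A generic sketch, where `connect` is a stand-in for a driver call such as psycopg2.connect:

```python
# Reconnect with exponential backoff after a failover terminates
# connections. `connect` is any zero-arg connection factory.
import time

def connect_with_retry(connect, attempts: int = 5, base_delay: float = 0.5,
                       sleep=time.sleep):
    """Retry a connection factory with exponential backoff."""
    for attempt in range(attempts):
        try:
            return connect()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure
            sleep(base_delay * 2 ** attempt)
```

This only papers over connection loss; in-flight transactions are still rolled back, which is the "not real HA" point being made.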
|
| Everything in SQL/RDBMS was designed for a single-node instance,
| not including replicas. It's not HA because there can be only
| one read-write instance at a time. They even claim to be more
| ACID than MongoDB, but the ACID properties are guaranteed only
| on a single node.
|
| One exception is Oracle RAC, but PostgreSQL has nothing like
| that. Some forks, like YugabyteDB, provide real HA with most
| PostgreSQL features.
|
| About the hype: many applications that run on PostgreSQL accept
| hours of downtime, planned or unplanned. Those who run larger,
| more critical applications on PostgreSQL are big companies with
| many expert DBAs who can handle the complexity of database
| automation, and who use logical replication for upgrades. But
| no solution offers both low operational complexity and high
| availability comparable to MongoDB's.
| franckpachot wrote:
| Beyond the hype, the PostgreSQL community is aware of the lack
| of "batteries-included" HA. This discussion on the idea of a
| Built-in Raft replication mentions MongoDB as:
|
| >> "God Send". Everything just worked. Replication was as
| reliable as one could imagine. It outlives several hardware
| incidents without manual intervention. It allowed cluster
| maintenance (software and hardware upgrades) without
| application downtime. I really dream PostgreSQL will be as
| reliable as MongoDB without need of external services.
|
| https://www.postgresql.org/message-id/0e01fb4d-f8ea-4ca9-8c9...
| abrookewood wrote:
| "I really dream PostgreSQL will be as reliable as MongoDB"
| ... someone needs to go and read up on Mongo's history!
|
| Sure, the PostgreSQL HA story isn't what we all want it to
| be, but the reliability is exceptional.
| computerfan494 wrote:
| Postgres violated serializability on a single node for a
| considerable amount of time [1] and used fsync incorrectly
| for 20 years [2]. I personally witnessed lost data on
| Postgres because of the fsync issue.
|
| Database engineering is very hard. MongoDB has had both
| poor defaults as well as bugs in the past. It will
| certainly have durability bugs in the future, just like
| Postgres and all other serious databases. I'm not sure that
| Postgres' durability stacks up especially well with modern
| MongoDB.
|
| [1] https://jepsen.io/analyses/postgresql-12.3
|
| [2] https://archive.fosdem.org/2019/schedule/event/postgres
| ql_fs...
| abrookewood wrote:
| Thanks for adding that - I wasn't aware.
| groundzeros2015 wrote:
| Because that's an expensive and complex boondoggle almost no
| business needs.
| wb14123 wrote:
| Yeah, I'm also wondering that. I'm looking at self-hosted
| PostgreSQL after Cockroach changed their free tier license, but
| found the HA part of PostgreSQL is really lacking. I tested
| Patroni, which seems to be a popular choice, but found some
| pretty critical problems
| (https://www.binwang.me/2024-12-02-PostgreSQL-High-
| Availabili...). I tried to explore some other solutions, but
| found that the lack of a high-level design makes HA for
| PostgreSQL very hard, if not impossible. For example, without
| the necessary information in the WAL, it's hard to enforce a
| single primary node even with an external Raft/Paxos
| coordinator. I wrote some of this down in this blog post
| (https://www.binwang.me/2025-08-13-Why-Consensus-Shortcuts-
| Fa...), especially in the sections "Highly Available PostgreSQL
| Cluster" and "Quorum".
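The quorum reasoning referenced above boils down to the majority rule that Raft/Paxos-style coordinators enforce. A toy Python sketch of that rule (the general principle, not Patroni's actual implementation):

```python
def majority(cluster_size: int) -> int:
    """Smallest number of votes that constitutes a quorum."""
    return cluster_size // 2 + 1

def can_elect_primary(votes_received: int, cluster_size: int) -> bool:
    """A node may become primary only with a strict majority, so
    two sides of a network partition can never both elect one."""
    return votes_received >= majority(cluster_size)

# In a 5-node cluster split 3-2, only the 3-node side can elect.
print(can_elect_primary(3, 5))  # True
print(can_elect_primary(2, 5))  # False
```

The hard part the parent comment points at is not this arithmetic but making PostgreSQL itself respect the coordinator's decision, which is where the missing WAL-level information bites.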
|
| My theory of why Postgres still gets the hype is that either
| people don't know about the problem, or it's acceptable at
| some level. I've worked on a team that maintained an in-house
| database cluster (though we were using MySQL instead of
| PostgreSQL) and the HA story was pretty bad. But there were
| engineers manually recovering lost data and resolving data
| conflicts, whether during incident recovery or in response to
| customer tickets. So I guess that's one way of doing business.
| forinti wrote:
| I love PostgreSQL simply because it never gives me any trouble.
| I've been running it for decades without issues.
|
| OTOH, Oracle takes most of my time with endless issues, bugs,
| unexpected feature modifications, even on OCI!
| dpedu wrote:
| This is my gripe with Postgres as well. Every time I see
| comments extolling the greatness of Postgres, I can't help but
| think "ah, that's a user, not a system administrator" and I
| think that's a completely fair judgement. Postgres is pretty
| great if you don't have to take care of it.
| nhumrich wrote:
| What do you Postgres self-hosters use for performance analysis?
| Both GCP Cloud SQL and RDS have performance-analysis features
| built into their hosted DBs, and they're incredible. Probably
| my favorite reason for using them.
| cuu508 wrote:
| I use pgdash and netdata for monitoring and alerting, and plain
| psql for analyzing specific queries.
| sa46 wrote:
| I've been very happy with Pganalyze.
| donatj wrote:
| The author brings up the point, but I have always found
| surprising how _much_ more expensive managed databases are than a
| comparable VPS.
|
| I would expect a little bit more as a cost of the convenience,
| but in my experience it's generally multiple times the expense.
| It's wild.
|
| This has kept me away from managed databases in all but my
| largest projects.
| orev wrote:
| Once they convince you that you can't do it yourself, you end
| up relying on them, but didn't develop the skills you would
| need to migrate to another provider when they start raising
| prices. And they keep raising prices because by then you have
| no choice.
| zbentley wrote:
| There is plenty of provider markup, to be sure. But it is
| also _very much not a given_ that the hosted version of a
| database is running software /configs that are equivalent to
| what you could do yourself. Many hosted databases are
| extremely different behind the scenes when it comes to
| durability, monitoring, failover, storage provisioning,
| compute provisioning, and more. Just because it _acts_ like a
| connection hanging off a postmaster service running on a
| server doesn't mean that's what your "psql" is connected to
| on RDS Aurora (or many of the other cloud-Postgres
| offerings).
| aranelsurion wrote:
| > Just because it acts like a connection hanging off
|
| If anything that's a feature for ease of use and
| compatibility.
| citizenpaul wrote:
| I have not tested this in real life yet, but it seems like all
| the arguments about vendor lock-in can be solved if you bite
| the bullet and learn basic Kubernetes administration.
| Kubernetes is FOSS and there are countless Kubernetes-as-a-
| service providers.
|
| I know there are other issues with Kubernetes, but at least
| it's transferable knowledge.
| ch2026 wrote:
| Wait, are you talking about cloud providers or LLMs?
| nrhrjrjrjtntbt wrote:
| Yes, if the DB is 5x the VM, and the VM is 10x the dedicated
| server from, say, OVH etc., then you are paying 50x.
| dewey wrote:
| I was also recently doing some research into what projects
| exist that come close to a "managed Postgres on DigitalOcean"
| experience. Sadly, there are some building blocks, but nothing
| that really makes it a complete no-brainer.
|
| https://blog.notmyhostna.me/posts/what-i-wish-existed-for-se...
| notaseojh wrote:
| Another thread where I can't determine whether the "it's easy"
| suggestions are from people who are clueless or expert.
| Nextgrid wrote:
| Ironically you need a bit of both. You need to be expert enough
| to make it work, but not "too" expert to be stuck in your ways
| and/or influenced by all the fear-mongering.
|
| An _expert_ will give you thousands of theoretical reasons why
| self-hosting the DB is a bad idea.
|
| An "expert" will host it, enjoy the cost savings and deal with
| the once-a-year occurrence of the theoretical risk (if it ever
| occurs).
| molf wrote:
| > I'd argue self-hosting is the right choice for basically
| everyone, with the few exceptions at both ends of the extreme:
|
| > If you're just starting out in software & want to get something
| working quickly with vibe coding, it's easier to treat Postgres
| as just another remote API that you can call from your single
| deployed app
|
| > If you're a really big company and are reaching the scale where
| you need trained database engineers to just work on your stack,
| you might get economies of scale by just outsourcing that work to
| a cloud company that has guaranteed talent in that area. The
| second full freight salaries come into play, outsourcing looks a
| bit cheaper.
|
| This is funny. I'd argue the exact opposite. I would self host
| only:
|
| * if I were on a tight budget and trading an hour or two of my
| time for a cost saving of a hundred dollars or so is a good deal;
| or
|
| * at a company that has reached the scale where employing
| engineers to manage self-hosted databases is more cost effective
| than outsourcing.
|
| I have nothing against self-hosting PostgreSQL. Do whatever you
| prefer. But to me outsourcing this to cloud providers seems
| entirely reasonable for small and medium-sized businesses.
| According to the author's article, self hosting costs you between
| 30 and 120 minutes per month (after setup, and if you already
| know what to do). It's easy to do the math...
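That math is easy to sketch. With assumed numbers (an engineer's time at $100/hour and a $50/month VPS; both figures are illustrative, not from the article):

```python
def monthly_self_host_cost(vps_price, minutes_per_month, hourly_rate=100):
    """Total monthly cost of self-hosting: the VPS itself plus
    the labor time its upkeep consumes."""
    return vps_price + (minutes_per_month / 60) * hourly_rate

# The article's range of 30-120 minutes of upkeep per month:
low = monthly_self_host_cost(50, 30)    # $100/month
high = monthly_self_host_cost(50, 120)  # $250/month
print(low, high)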
| convolvatron wrote:
| It's not. I've been in a few shops that use RDS because they
| think their time is better spent doing other things.
|
| Except now they are stuck trying to maintain and debug Postgres
| without the visibility and agency that they would have if they
| hosted it themselves. The situation isn't at all clear.
| molf wrote:
| Interesting. Is this an issue with RDS?
|
| I use Google Cloud SQL for PostgreSQL and it's been rock
| solid. No issues; troubleshooting works fine; all extensions
| we need already installed; can adjust settings where needed.
| convolvatron wrote:
| It's more of a general condition - it's not that RDS is
| somehow really faulty, it's just that when things do go
| wrong, it's not really anybody's job to introspect the
| system because RDS is taking care of it for us.
|
| In the limit I don't think we should need DBAs, but as long
| as we need to manage indices by hand, think more than 10
| seconds about the hot queries, manage replication, tune the
| vacuumer, track updates, and all the other rot - then
| actually installing PG on a node of your choice is really
| the smallest of the problems you face.
| Nextgrid wrote:
| One thing unaccounted for if you've only ever used cloud-
| hosted DBs is just how slow they are compared to a modern
| server with NVME storage.
|
| This leads the developers to do all kinds of workarounds and
| reach for _more_ cloud services (and then integrating them
| and - often poorly - ensuring consistency across them)
| because the cloud hosted DB is not able to handle the load.
|
| On bare-metal, you can go a very long way with just throwing
| everything at Postgres and calling it a day.
| NewJazz wrote:
| Yeah our cloud DBs all have abysmal performance and high
| recurring cost even compared to metal we didn't even buy
| for hosting DBs.
| briHass wrote:
| This is the reason I manage SQL Server on a VM in Azure
| instead of their PaaS offering. The fully managed SQL has
| terrible performance unless you drop many thousands a
| month. The VM I built is closer to 700 a month.
|
| Running on IaaS also gives you more scalability knobs to
| tweak: SSD Iops and b/w, multiple drives for
| logs/partitions, memory optimized VMs, and there's a lot of
| low level settings that aren't accessible in managed SQL.
| Licensing costs are also horrible with managed SQL Server,
| where it seems like you pay the Enterprise level, but
| running it yourself offers lower cost editions like
| Standard or Web.
| andersmurphy wrote:
| 100% this directly connected nvme is a massive win. Often
| several orders of magnitude.
|
| You can take it even further in some context if you use
| sqlite.
|
| I think one of the craziest ideas of the cloud decade was
| to move storage away from compute. It's even worse with
| things like AWS lambda or vercel.
|
| Now vercel et al are charging you extra to have your data
| next to your compute. We're basically back to VMs at
| 100-1000x the cost.
| Nextgrid wrote:
| > employing engineers to manage self-hosted databases is more
| cost effective than outsourcing
|
| Every company out there is using the cloud and yet _still_
| employs infrastructure engineers to deal with its complexity.
| The "cloud" reducing staff costs is and was always a lie.
|
| PaaS platforms (Heroku, Render, Railway) _can_ legitimately be
| operated by your average dev and not have to hire a dedicated
| person; those cost even more though.
|
| Another limitation of both the cloud and PaaS is that they are
| only responsible for the infrastructure/services you use; they
| will not touch your application at all. Can your application
| automatically recover from a slow/intermittent network, a DB
| failover (that you can't even test because your cloud
| providers' failover and failure modes are a black box), and so
| on? Otherwise you're waking up at 3am no matter what.
| matthewmacleod wrote:
| I don't think it's a lie, it's just perhaps overstated. The
| number of staff needed to manage a cloud infrastructure is
| definitely lower than that required to manage the equivalent
| self-hosted infrastructure.
|
| Whether or not you need that equivalence is an orthogonal
| question.
| Nextgrid wrote:
| > The number of staff needed to manage a cloud
| infrastructure is definitely lower than that required to
| manage the equivalent self-hosted infrastructure.
|
| There's probably a sweet spot where that is true, but
| because cloud providers offer more complexity (self-
| inflicted problems) and use PR to encourage you to use them
| ("best practices" and so on) in all the cloud-hosted shops
| I've been in a decade of experience I've always seen
| multiple full-time infra people being busy with...
| _something_?
|
| There was always _something_ to do, whether to keep up with
| cloud provider changes /deprecations, implementing the
| latest "best practice", debugging distributed systems
| failures or self-inflicted problems and so on. I'm sure
| career/resume polishing incentives are at play here too -
| the employee _wants_ the system to require their input
| otherwise their job is no longer needed.
|
| Maybe in a perfect world you can indeed use cloud-hosted
| services to reduce/eliminate dedicated staff, but in
| practice I've never seen anything but solo founders
| actually achieve that.
| freedomben wrote:
| Exactly. Companies with cloud infra often still have to
| hire infra people or even an infra team, but that team will
| be smaller than if they were self-hosting everything, in
| some cases radically smaller.
|
| I _love_ self-hosting stuff and even have a bias towards
| it, but the cost /time tradeoff is more complex than most
| people think.
| molf wrote:
| > Every company out there is using the cloud and yet still
| employs infrastructure engineers
|
| Every company _beyond a particular size_ surely? For many
| small and medium sized companies hiring an infrastructure
| team makes just as little sense as hiring kitchen staff to
| make lunch.
| spwa4 wrote:
| For small companies things like vercel, supabase, firebase,
| ... wipe the floor with Amazon RDS.
|
| For medium sized companies you need "devops engineers". And
| in all honesty, more than you'd need sysadmins for the same
| deployment.
|
| For large companies, they split up AWS responsibilities
| into entire departments of teams (for example, all clouds
| have made auth so damn difficult that most large companies
| have - not one - but _multiple_ departments _just_ dealing
| with authorization, before you so much as start your first
| app)
| add-sub-mul-div wrote:
| You're paying people to do the role either way, if it's not
| dedicated staff then it's taking time away from your
| application developers so they can play the role of
| underqualified architects, sysadmins, security engineers.
| scott_w wrote:
| From experience (because I used to do this), it's a lot
| less time than a self-hosted solution, once you're
| factoring in the multiple services that need to be
| maintained.
| pinkgolem wrote:
| As someone who has done both... I disagree; I find self-
| hosting, to a degree, much easier and much less complex.
|
| Local reproducibility is easier, and performance is often
| much better.
| scott_w wrote:
| It depends entirely on your use case. If all you need is
| a DB and Python/PHP/Node server behind Nginx then you can
| get away with that for a long time. Once you throw in a
| task runner, emails, queue systems, blob storage, user-
| uploaded content, etc. you can start running beyond your
| own ability or time to fix the inevitable problems.
|
| As I pointed out above, you may be better served mixing
| and matching so you spend your time on the critical
| aspects but offload those other tasks to someone else.
|
| Of course, I'm not sitting at your computer so I can't
| tell you what's right for you.
| pinkgolem wrote:
| I mean, fair, we are of course offloading some of that, email
| being one of those, LLMs being another.
|
| As for a task runner/queue, at least for us Postgres works
| for both cases.
|
| We also self-host S3 storage and allow user-uploaded content
| within strict borders.
| flomo wrote:
| Yeah, and nobody is looking at the other side of this.
| There just are not a lot of good DBA/sysop types who even
| want to work for some non-tech SMB. So this either gets
| outsourced to the cloud, or some junior dev or desktop
| support guy hacks it together. And then who knows if the
| backups are even working.
|
| Fact is a lot of these companies are on the cloud because
| their internal IT was a total fail.
| Nextgrid wrote:
| If they just paid half of the markup they currently pay
| for the cloud I'm sure they'll be swimming in qualified
| candidates.
| flomo wrote:
| For companies not heavily into tech, lots of this stuff
| is not that expensive. Again, how many DBAs are even
| looking for a 3 hr/month sidegig?
| strken wrote:
| Our AWS spend is something like $160/month. Want to come
| build bare metal database infrastructure for us for
| $3/day?
| Nextgrid wrote:
| When you need to scale up and don't want that $160 to
| increase 10x to handle the additional load the numbers
| start making more sense: 3 month's worth of the projected
| increase upfront is around 4.3k, which is good money for
| a few days' work for the setup/migration and remains a
| good deal for you since you break even after 3 months and
| keep on pocketing the savings indefinitely from that
| point on.
|
| Of course, my comment wasn't aimed at those who
| successfully keep their cloud bill in the low 3-figures,
| but the majority of companies with a 5-figure bill _and_
| multiple "infrastructure" people on payroll futzing
| around with YAML files. Even half the achieved savings
| should be enough incentive for those guys to learn
| something new.
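The break-even logic above can be made explicit. A sketch using the thread's own numbers (a $160/month bill projected to grow 10x, and a one-off fee of roughly three months of the avoided increase, per the comment's assumption):

```python
def break_even_months(one_time_cost, monthly_savings):
    """Months until a one-off migration cost pays for itself."""
    return one_time_cost / monthly_savings

current = 160                  # current monthly cloud bill
projected = 10 * current       # bill after the anticipated growth
savings = projected - current  # $1440/month avoided by migrating
fee = 3 * savings              # ~$4.3k up-front migration fee

print(break_even_months(fee, savings))  # 3.0
```

After the break-even point, the monthly savings accrue indefinitely, which is the whole argument for paying the one-off fee.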
| solatic wrote:
| > few days' work
|
| But initial setup is maybe 10% of the story. The day 2
| operations of monitoring, backups, scaling, and failover
| still needs to happen, and it still requires expertise.
|
| If you bring that expertise in house, it costs much more
| than 10x ($3/day -> $30/day = $10,950/year).
|
| If you get the expertise from experts who are juggling
| you along with a lot of other clients, you get something
| like PlanetScale or CrunchyData, which are also
| significantly more expensive.
| Nextgrid wrote:
| > monitoring
|
| Most monitoring solutions support Postgres and don't
| actually care where your DB is hosted. Of course this
| only applies _if_ someone was actually looking at the
| metrics to begin with.
|
| > backups
|
| Plenty of options to choose from depending on your
| recovery time objective. From scheduled pg_dumps to WAL
| shipping to disk snapshots and a combination of them at
| any schedule you desire. Just ship them to your favorite
| blob storage provider and call it a day.
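For the scheduled-pg_dump end of that spectrum, retention pruning is the only part with any logic in it. A hedged Python sketch of grandfather-father-son-style pruning (the 7-daily/4-weekly policy is an example, not a recommendation):

```python
from datetime import date, timedelta

def backups_to_keep(backup_dates, today, daily=7, weekly=4):
    """Keep the last `daily` days of backups, plus one Sunday
    backup per week for a further `weekly` weeks.
    `backup_dates` is an iterable of datetime.date objects."""
    keep = set()
    for d in backup_dates:
        age = (today - d).days
        if age < daily:
            keep.add(d)                              # recent dailies
        elif age < daily + 7 * weekly and d.weekday() == 6:
            keep.add(d)                              # Sunday weeklies
    return keep

# 60 days of nightly pg_dump archives, pruned to 7 dailies + 4 weeklies.
today = date(2025, 12, 21)  # a Sunday
all_backups = [today - timedelta(days=i) for i in range(60)]
kept = backups_to_keep(all_backups, today)
print(len(kept))  # 11
```

Everything not in the kept set gets deleted from blob storage on the next run; WAL shipping or snapshots would layer on top of this for tighter recovery points.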
|
| > scaling
|
| That's the main reason I favor bare-metal infrastructure.
| Nothing on the cloud (at a price you can afford) can rival
| the performance of even a mid-range server, so scaling is
| effectively never an issue; if you're outgrowing that, the
| conversation we're having is not about getting a bigger DB
| but about using multiple DBs and sharding at the application
| layer.
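Application-layer sharding of the kind mentioned here can start as simple deterministic routing. A toy Python sketch (the shard hostnames are placeholders):

```python
import hashlib

SHARDS = ["db-shard-0.internal", "db-shard-1.internal", "db-shard-2.internal"]

def shard_for(key: str) -> str:
    """Route a tenant/user key to a shard deterministically.
    A stable hash (not Python's randomized built-in hash()) keeps
    the mapping consistent across processes and restarts."""
    digest = hashlib.sha256(key.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[index]

# The same key always lands on the same shard.
print(shard_for("tenant-42") == shard_for("tenant-42"))  # True
```

The catch, and the reason people defer this as long as possible, is that changing `SHARDS` remaps keys, so growing the cluster needs a resharding plan (or consistent hashing) from day one.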
|
| > failover still needs to happen
|
| Yes, get another server and use Patroni/etc. Or just
| accept the occasional downtime and up to 15 mins of data
| loss if the machine never comes back up. You'd be
| surprised how many businesses are perfectly fine with
| this. Case in point: two major clouds had hour-long
| downtimes recently and everyone basically forgot about it
| a week later.
|
| > If you bring that expertise in house
|
| Infrastructure should not require continuous
| upkeep/repair. You wouldn't buy a car that requires you
| to have a full-time mechanic in the passenger seat at all
| times. If your infrastructure requires this, you should
| ask for a refund and buy from someone who sells more
| reliable infra.
|
| A server will run forever once set up unless hardware
| fails (and some hardware can be redundant with spares
| provisioned ahead of time to automatically take over and
| delay maintenance operations). You should spend a couple
| hours a month max on routine maintenance which can be
| outsourced and _still_ beats the cloud price.
|
| I think you're underestimating the amount of tech that is
| essentially *nix machines all around you that somehow just...
| _work_ despite having zero upkeep or maintenance. Modern
| hardware is surprisingly reliable, and most outages are
| caused by operator error when people are (potentially
| unnecessarily) messing with stuff rather than by the
| hardware failing.
| snovv_crash wrote:
| At 160/mo you are using so little you might as well host
| off of a raspberry pi on your desk with a USB3 SSD
| attached. Maintenance and keeping a hot backup would take
| a few hours to set up, and you're more flexible too. And
| if you need to scale, rent a VPS or even dedicated
| machine from Hetzner.
|
| An LLM could set this up for you, it's dead simple.
| barnabee wrote:
| It depends very much what the company is doing.
|
| At my last two places it very quickly got to the point
| where the technical complexity of deployments, managing
| environments, dealing with large piles of data, etc. meant
| that we needed to hire someone to deal with it all.
|
| They actually preferred managing VMs and self hosting in
| many cases (we kept the cloud web hosting for features like
| deploy previews, but that's about it) to dealing with
| proprietary cloud tooling and APIs. Saved a ton of money,
| too.
|
| On the other hand, the place before that was simple enough
| to build and deploy using cloud solutions without hiring
| someone dedicated (up to at least some pretty substantial
| scale that we didn't hit).
| aranelsurion wrote:
| > still employs infrastructure engineers
|
| > The "cloud" reducing staff costs
|
| Both can be true at the same time.
|
| Also:
|
| > Otherwise you're waking up at 3am no matter what.
|
| Do you account for frequency and variety of wakeups here?
| Nextgrid wrote:
| > Do you account for frequency and variety of wakeups here?
|
| Yes. In my career I've dealt with way more failures due to
| unnecessary distributed systems (that could have been one
| big bare-metal box) rather than hardware failures.
|
| You can never eliminate wake-ups, but bare-metal systems have
| far fewer moving parts, which eliminates a whole bunch of
| failure scenarios, so you're only left with _actual_ hardware
| failure (and HW is pretty reliable nowadays).
| wredcoll wrote:
| If this isn't the truth. I just spent several weeks, on
| and off, debugging a remote hosted build system tool
| thingy because it was in turn made of at least 50
| different microservice type systems and it was breaking
| in the middle of two of them.
|
| There was, I have to admit, a log message that explained
| the problem... once I could find the specific log message
| and understand the 45 steps in the chain that got to that
| spot.
| scott_w wrote:
| > Every company out there is using the cloud and yet still
| employs infrastructure engineers to deal with its complexity.
| The "cloud" reducing staff costs is and was always a lie.
|
| This doesn't make sense as an argument. The reason the cloud
| is more complex is because that complexity is available.
| Under a certain size, a large number of cloud products simply
| can't be managed in-house (and certainly not altogether).
|
| Also your argument is incorrect in my experience.
|
| At a smaller business I worked at, I was able to use these
| services to achieve uptime and performance that I couldn't
| achieve self-hosted, because I had to spend time on the
| product itself. So yeah, we'd saved on infrastructure
| engineers.
|
| At larger scales, what your false dichotomy suggests also
| doesn't actually happen. Where I work now, our data stores
| are all self-managed on top of EC2/Azure, where performance
| and reliability are critical. But we don't self-host
| everything. For example, we use SES to send our emails and we
| use RDS for our app DB, because their performance profiles
| and uptime guarantees are more than acceptable for the price
| we pay. That frees up our platform engineers to spend their
| energy on keeping our uptime on our critical services.
| kijin wrote:
| Yes, mix-and-match is the way to go, depending on what kind
| of skills are available in your team. I wouldn't touch a
| mail server with a 10-foot pole, but I'll happily self-
| manage certain daemons that I'm comfortable with.
|
| Just be careful not to accept more complexity just because
| it is available, which is what the AWS evangelists often
| try to sell. After all, we should always make an informed
| decision when adding a new dependency, whether in code or
| in infrastructure.
| scott_w wrote:
| Of course AWS are trying to sell you everything. It's
| still on you and your team to understand your product and
| infrastructure and decide what makes sense for you.
| pinkgolem wrote:
| >At a smaller business I worked at, I was able to use these
| services to achieve uptime and performance that I couldn't
| achieve self-hosted, because I had to spend time on the
| product itself. So yeah, we'd saved on infrastructure
| engineers.
|
| How sure are you about that one? All of my Hetzner VMs
| reach an uptime of 99.9%-something.
|
| I could see more than one small business stack fitting onto
| a single one of those VMs.
| scott_w wrote:
| 100% certain because I started by self hosting before
| moving to AWS services for specific components and
| improved the uptime and reduced the time I spent keeping
| those services alive.
| pinkgolem wrote:
| How much work did you spend configuring those services and
| keeping them alive? I am genuinely curious...
|
| We have a very limited set of services, but most have
| been very painless to maintain.
| scott_w wrote:
| A Django+Celery app behind Nginx back in the day. Most
| maintenance would be discovering a new failure mode:
|
| - certificates not being renewed in time
|
| - Celery eating up all RAM and having to be recycled
|
| - RabbitMQ getting blocked requiring a forced restart
|
| - random issues with Postgres that usually required a
| hard restart of PG (running low on RAM maybe?)
|
| - configs having issues
|
| - running out of inodes
|
| - DNS not updating when upgrading to a new server (no CDN
| at the time)
|
| - data centre going down, taking the provider's email
| support with it (yes, really)
|
| Bear in mind I'm going back a decade now, my memory is
| rusty. Each issue was solvable but each would happen at
| random and even mitigating them was time that I (a single
| dev) was not spending on new features or fixing bugs.
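A couple of those failure modes (disk space, inodes) are cheap to detect before they turn into outages. A minimal Python health-check sketch (the thresholds are made up; a real setup would feed warnings into whatever alerting you already run):

```python
import os

def disk_health(path="/", min_free_bytes=1 << 30, min_free_inodes=10_000):
    """Return a list of warnings for low disk space or free inodes,
    the kind of silent failures that take down a self-hosted box."""
    st = os.statvfs(path)
    warnings = []
    free_bytes = st.f_bavail * st.f_frsize  # bytes available to non-root
    if free_bytes < min_free_bytes:
        warnings.append(f"low disk: {free_bytes} bytes free on {path}")
    if st.f_favail < min_free_inodes:
        warnings.append(f"low inodes: {st.f_favail} free on {path}")
    return warnings

# Run from cron; a non-empty list means something needs attention.
print(disk_health("/"))
```

Certificate expiry and RAM-hungry workers can be caught the same way; the point is that each failure mode, once discovered, becomes a one-off check rather than a recurring 3 AM surprise.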
| pinkgolem wrote:
| I mean, going back a decade might be part of the reason?
|
| Configs having issues is like the number one reason I like
| this setup so much:
|
| I can configure everything on my local machine and test it
| there, and then just deploy it to a server the same way.
|
| I do not have to build a local setup and then a remote
| one.
| scott_w wrote:
| Er... what? Even in today's world with Docker, you have
| differences between dev and prod. For a start, one is
| accessed via the internet and requires TLS configs to
| work correctly. The other is accessed via localhost.
| chasd00 wrote:
| Just fyi, you can put whatever you want in /etc/hosts, it
| gets hit before the resolver. So you can run your website
| on localhost with your regular host name over https.
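For example (hostname hypothetical), a single `/etc/hosts` line is enough:

```
127.0.0.1   www.example.com
```

The resolver consults this file before DNS, so the browser hits your local server while still using the production hostname and TLS configuration.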
| scott_w wrote:
| I'm aware, I just picked one example but there are others
| like instead of a mail server you're using console, or
| you have a CDN.
| pinkgolem wrote:
| I use https for localhost; there are a ton of options
| for that.
|
| But yes, the cert is created differently in prod and
| there are a few other differences.
|
| But it's much closer than in the cloud.
| squeaky-clean wrote:
| Just because your VM is running doesn't mean the service
| is accessible. Whenever there's a large AWS outage it's
| usually not because the servers turned off. It also
| doesn't guarantee that your backups are working properly.
| pinkgolem wrote:
| If you have a server where everything is on the server,
| the server being on means everything is online... There
| is not a lot of complexity going on inside a single
| server infrastructure.
|
| I mean just because you have backups does not mean you
| can restore them ;-)
|
| We do test backup restoration automatically and also on a
| quarterly basis manually, but so you should do with AWS.
|
| Otherwise how do you know you can restore system a
| without impacting other dependencies d and c?
| riedel wrote:
| Working in a university lab, self-hosting is the default for
| almost anything. While I would agree that costs are quite low,
| I sometimes would be really happy to throw money at problems
| to make them go away. Without having had the chance, and thus
| being no expert, I really see the opportunity of scaling (up
| and down) quickly in the cloud. We ran a Postgres database of
| a few hundred GB with multiple read replicas and we managed
| somehow, but really hit the limits of our expertise at some
| point. We eventually stopped migrating to newer database
| schemas because it was just such a hassle maintaining
| availability. If I had the money as a company, I guess I would
| have paid for a hosted solution.
| spiralpolitik wrote:
| In-house vs Cloud Provider is largely a wash in terms of
| cost. Regardless of the approach, you are going to need people
| to maintain stuff, and people cost money. Similarly, compute
| and storage cost money, so what you lose on the swings, you
| gain on the roundabouts.
|
| In my experience you typically need less people if using a
| Cloud Provider than in-house (or the same number of people
| can handle more instances) due to increased leverage. Whether
| you can maximize what you get via leverage depends on how
| good your team is.
|
| US companies typically like to minimize headcount (either
| through accounting tricks or outsourcing) so usually using a
| Cloud Provider wins out for this reason alone. It's not how
| much money you spend, it's how it looks on the balance sheet
| ;)
| erulabs wrote:
| Well sure you still have 2 or 3 infra people but now you
| don't need 15. Comparing to modern Hetzner is also not fair
| to "cloud" in the sense that click-and-get-server didn't
| exist until cloud providers popped up. That was initially the
| whole point. If bare metal behind an API existed in 2009 the
| whole industry would look very different. Contingencies Rule
| Everything Around Me.
| strken wrote:
| I can't talk about staff costs, but as someone who's self-
| hosted Postgres before, using RDS or Supabase saves weeks of
| time on upgrades, replicas, tuning, and backups (yeah, you
| still need independent backups, but PITRs make life easier).
| Databases and file storage are probably the most useful cloud
| functionality for small teams.
|
| If you have the luxury of spending half a million per year on
| infrastructure engineers then you can of course do better,
| but this is by no means universal or cost-effective.
| AYBABTME wrote:
| The fact that as many engineers are on payroll doesn't mean
| that "cloud" is not an efficiency improvement. When things
| are easier and cheaper, people don't do less or buy less.
| They do more and buy more until they fill their capacity. The
| end result is the same number (or more) of engineers, but
| they deal with a higher level of abstraction and achieve more
| with the same headcount.
| cardanome wrote:
| You are missing that most services don't have high availability
| needs and don't need to scale.
|
| Most projects I have worked on in my career have never seen
| more than a hundred concurrent users. If something goes down on
| Saturday, I am going to fix it on Monday.
|
| I have worked on internal tools where I just added a Postgres
| DB to the Docker setup and that was it. Five minutes of work
| and no issues at all. Sure, if you have something customer
| facing, you need to do a bit more and set up a good backup
| strategy, but that really isn't magic.
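The setup the parent describes is roughly the Compose fragment below; the service name, credentials, and port binding are illustrative:

```yaml
# docker-compose.yml fragment -- illustrative names and credentials.
services:
  db:
    image: postgres:16
    restart: unless-stopped
    environment:
      POSTGRES_DB: internal_tool
      POSTGRES_PASSWORD: change-me
    volumes:
      - pgdata:/var/lib/postgresql/data   # persist data across restarts
    ports:
      - "127.0.0.1:5432:5432"             # don't expose beyond the host

volumes:
  pgdata:
```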
| prisenco wrote:
| | _self hosting costs you between 30 and 120 minutes per month_
|
| Can we honestly say that cloud services taking a half hour to
| two hours a month of someone's time on average is completely
| unheard of?
| esseph wrote:
| Very much depends on what you're doing in the cloud, how many
| services you are using, and how frequently those services and
| your app need updates.
| SatvikBeri wrote:
| I handle our company's RDS instances, and probably spend
| closer to 2 hours a year than 2 hours a month over the last 8
| years.
|
| It's definitely expensive, but it's not time-consuming.
| prisenco wrote:
| Of course. But people also have high uptime servers with
| long-running processes they barely touch.
| anal_reactor wrote:
| The discussion isn't "what is more effective". The discussion
| is "who wants to be blamed in case things go south". If you
| push the decision to move to self-hosted and then one of the
| engineers fucks up the database, you have a serious problem.
| If the same engineer fucks up a cloud database, it's easier to
| save your own ass.
| fhcuvyxu wrote:
| Self hosting does not cost you that much at all. It's basically
| zero once you've got backups automated.
| npn wrote:
| I also encourage people to just use managed databases. After
| all, it is easy to replace such people. Heck actually you can
| fire all of them and replace the demand with genAI nowadays.
| lucideer wrote:
| > _at a company that has reached the scale where employing
| engineers to manage self-hosted databases is more cost
| effective than outsourcing._
|
| This is the crux of one of the most common fallacies in
| software engineering decision making today. I've participated
| in a bunch of architecture / vendor evaluations that concluded
| managed services are more cost effective almost purely because
| they underestimated (or even discarded entirely) the internal
| engineering cost of vendor management. Black box debugging is
| one of the most time-consuming engineering pursuits, & even
| when it's something widely documented & well supported like
| RDS, it's only really tuned for the lowest common denominator -
| the complexities of tuning someone else's system at scale can
| really add up to only marginally less effort than self-hosting
| (if there's any difference at all).
|
| But most importantly - even if it's significantly less effort
| than self-hosting, it's never effectively costed when
| evaluating trade-offs - that's what leads to this persistent
| myth about the engineering cost of self-hosting. "Managing"
| managed services is a non-zero cost.
|
| Add to that the ultimate trade-off of accountability vs
| availability (internal engineers care less about availability
| when it's out of their hands - but it's still a loss to your
| product either way).
| bastawhiz wrote:
| > Black box debugging is one of the most time-consuming
| engineering pursuits, & even when it's something widely
| documented & well supported like RDS, it's only really tuned
| for the lowest common denominator - the complexities of
| tuning someone else's system at scale can really add up to
| only marginally less effort than self-hosting (if there's any
| difference at all).
|
| I'm really not sure what you're talking about here. I manage
| many RDS clusters at work. I think in total, we've spent
| maybe eight hours over the last three years "tuning" the
| system. It runs at about 100kqps during peak load. Could it
| be cheaper or faster? Probably, but it's a small fraction of
| our total infra spend and it's not keeping me up at night.
|
| Virtually all the effort we've ever put in here has been
| making the application query the appropriate indexes. But
| you'd do that no matter how you host your database.
|
| Hell, even the metrics that RDS gives you for free make the
| thing pay for itself, IMO. The thought of setting up grafana
| to monitor a new database makes me sweat.
| solatic wrote:
| > even the metrics that RDS gives you for free make the
| thing pay for itself, IMO. The thought of setting up
| grafana to monitor a new database makes me sweat.
|
| CloudNativePG actually gives you really nice dashboards
| out-of-the-box for free. see:
| https://github.com/cloudnative-pg/grafana-dashboards
| bastawhiz wrote:
| Sure, and I can install something to do RDS performance
| insights without querying PG stats, and something to
| schedule backups to another region, and something to
| aggregate the logs, and then I have N more things that
| can break.
| arevno wrote:
| > trading an hour or two of my time
|
| pacman -S postgresql
|
| initdb -D /pathto/pgroot/data
|
| grok/claude/gpt: "Write a concise Bash script for setting up an
| automated daily PostgreSQL database backup using pg_dump and
| cron on a Linux server, with error handling via logging and
| 7-day retention by deleting older backups."
|
| ctrl+c / ctrl+v
|
| Yeah that definitely took me an hour or two.
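For what it's worth, the script that prompt produces usually lands close to the sketch below; the directory, database name, and retention window are illustrative, and it assumes the invoking user can already authenticate to Postgres (e.g. via ~/.pgpass):

```shell
#!/usr/bin/env bash
# Minimal daily-backup sketch using pg_dump, with logging and 7-day
# retention. BACKUP_DIR and DB_NAME are illustrative names.
set -u

run_backup() {
    local backup_dir="${BACKUP_DIR:-/var/backups/postgres}"
    local db_name="${DB_NAME:-myapp}"
    local log_file="$backup_dir/backup.log"
    local stamp
    stamp="$(date +%Y%m%d_%H%M%S)"

    mkdir -p "$backup_dir"

    # Custom-format dump (-Fc) so pg_restore can do selective restores.
    if pg_dump -Fc "$db_name" > "$backup_dir/${db_name}_${stamp}.dump" \
            2>>"$log_file"; then
        echo "$(date) OK ${db_name}_${stamp}.dump" >> "$log_file"
    else
        echo "$(date) FAILED ${db_name}_${stamp}" >> "$log_file"
        return 1
    fi

    # Drop dumps older than 7 days.
    find "$backup_dir" -name "${db_name}_*.dump" -mtime +7 -delete
}
```

Drop the function call into a root crontab entry and you're done; the real work, as the sibling comments note, is making sure the dumps also leave the machine.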
| solatic wrote:
| So your backups are written to the same disk?
|
| > datacenter goes up in flames
|
| > 3-2-1 backups: 3 copies on 2 different types of media with
| at least 1 copy off-site. No off-site copy.
|
| Whoops!
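To make the off-site leg concrete: one extra function on top of the cron script covers it, assuming the AWS CLI (or an S3-compatible equivalent) is installed and configured; the bucket name is illustrative:

```shell
# Ship the newest local dump off-site -- a sketch, not a full 3-2-1
# setup. Bucket name and paths are illustrative.
offsite_copy() {
    local backup_dir="${BACKUP_DIR:-/var/backups/postgres}"
    local bucket="${OFFSITE_BUCKET:-s3://example-db-backups}"
    local newest
    newest="$(ls -t "$backup_dir"/*.dump 2>/dev/null | head -n 1)"
    [ -n "$newest" ] || { echo "no dumps found" >&2; return 1; }
    aws s3 cp "$newest" "$bucket/$(basename "$newest")"
}
```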
| jrochkind1 wrote:
| Agreed. As someone in a very tiny shop, all us devs want to do
| as little context switching to ops as possible. Not even half a
| day a month. Our hosted services are in aggregate still way
| cheaper than hiring another person. (We do _not_ employ an
| "infrastructure engineer").
| alexpadula wrote:
| Everyone and their mother wants to host Postgres for you!
| bradley13 wrote:
| Huh? Maybe I missed something, but...why should self-hosting a
| database server be hard or scary? Sure, you are then responsible
| for security, backups, etc... but that's not really different in
| the cloud - if anything, the cloud makes it more complicated.
| xboxnolifes wrote:
| I'd say a managed DB, at minimum, should be handling upgrades
| and backups for you. If it doesn't, that's not a managed DB,
| that's a self-service DB. You're paying a premium to do the
| work yourself.
| empthought wrote:
| Self-hosting a database server is not particularly hard or
| scary for an engineer.
|
| Hiring and replacing engineers who can and want to manage
| database servers can be hard or scary for employers.
| Nextgrid wrote:
| > Hiring and replacing engineers who can and want to manage
| database servers can be hard or scary for employers.
|
| I heard there's this magical thing called "money" that is
| claimed to help with this problem. You offer even half of the
| AWS markup to your employees and suddenly they like managing
| database servers. Magic I tell you!
| m4ck_ wrote:
| Well for the clickops folks who've built careers on the idea
| that 'systems administration is dead'... I imagine having to
| open a shell and install some stuff or modify a configuration
| file is quite scary.
| zbentley wrote:
| For a fascinating counterpoint (gist: cloud hosted Postgres on
| RDS aurora is _not_ anything like the system you would host
| yourself, and other cloud deployments of databases should also
| not be done like our field is used to doing it when self-hosting)
| see this other front page article and discussion:
| https://news.ycombinator.com/item?id=46334990
| LunaSea wrote:
| Aurora is a closed-source fork of PostgreSQL. So it is indeed
| not possible to self-host it.
|
| However, a self-hosted PostgreSQL on a bare metal server with
| NVMe SSDs will be much faster than what RDS is capable of,
| especially at the same price points.
| zbentley wrote:
| Yep! I was mostly replying to TFA's claim that AWS RDS is
|
| > Standard Postgres compiled with some AWS-specific
| monitoring hooks
|
| ... and other operational tools deployed alongside it. That's
| not always true: RDS classic may be those things, but RDS
| Aurora/Serverless is anything but.
|
| As to whether
|
| > self-hosted PostgreSQL on a bare metal server with NVMe
| SSDs will be much faster than what RDS is capable of
|
| That's _often_ but not _always_ true. Plenty of workloads
| will perform better on RDS (read auto scaling is huge in
| Serverless: you can have new read replica nodes auto-launch
| in response to e.g. a wave of concurrent, massive reporting
| queries; many queries can benefit from RDS's additions to
| /modifications of the pg buffer cache system that work with
| the underlying storage)--and that's even with the VM tax and
| the networked-storage tax! Of course, it'll cost more in real
| money whether or not it performs better, further complicating
| the cost/benefit analysis here.
|
| Also, pedantically, you can run RDS on bare metal with local
| NVMEs.
| LunaSea wrote:
| > Also, pedantically, you can run RDS on bare metal with
| local NVMEs.
|
| Only if you like your data to evaporate when the server
| stops.
|
| I'm relatively sure that the processing power and memory
| you can buy on OVH / Hetzner / co. is larger and cheaper
| even if you take into account peaks in your usage patterns.
| zbentley wrote:
| > Only if you like your data to evaporate when the server
| stops.
|
| (Edited to remove glib and vague rejoinder, sorry) Then
| hibernate/reboot it instead of stopping it?
| Alternatively, that's what backup-to S3, periodic
| snapshot-to-EBS, clustering, or running an EBS-persisted
| zero-query-volume tiny replica are for.
|
| > the processing power and memory you can buy on OVH /
| Hetzner / co. is larger and cheaper
|
| Cheaper? Yeah, generally. But larger/more performant? Not
| always--it's not about peaks/autoscaling, it's about the
| (large) minority of workloads that will work better on
| RDS/Aurora/Serverless: auto-scale-out makes the reports
| run on time regardless of cost; bulk data loads are
| available on replicas a lot sooner on Aurora because the
| storage is the replication system, not the WAL; and so on
| --if you add up all the situations where the hosted RDBMS
| systems trump self hosted, you get an amount that's not
| "hosted is always better/worth it", but it's not "hosted
| is just ops time savings and is otherwise just
| slower/more expensive" either. And that's before you add
| reliability into the conversation.
| adenta wrote:
| I wish this article had gone more in-depth on how they're
| setting up backups. The great thing about SQLite is that
| Litestream makes backup and restore something you don't really
| have to think about.
| cosmosgenius wrote:
| For Postgres specifically, pgBackRest works well. I'm using it
| at home for backups to R2 and local S3.
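A minimal pgBackRest configuration for that kind of setup looks roughly like the fragment below; the stanza name, paths, bucket, and endpoint are illustrative, so check the pgBackRest docs for your version before copying:

```ini
; /etc/pgbackrest/pgbackrest.conf -- a sketch with illustrative values
[global]
repo1-type=s3
repo1-path=/pgbackrest
repo1-s3-bucket=example-pg-backups
repo1-s3-endpoint=s3.example.com
repo1-s3-region=auto
repo1-retention-full=2

[main]
pg1-path=/var/lib/postgresql/16/main
```

After that it's `pgbackrest --stanza=main stanza-create` once, then `pgbackrest --stanza=main backup` from cron; credentials go in via `repo1-s3-key`/`repo1-s3-key-secret` or environment variables.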
| olavgg wrote:
| ZFS snapshot, send, receive, clone, spin up another PostgreSQL
| server on the backup server, take a full backup on that clone
| once per week.
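Sketched as a script, that flow is roughly the following; dataset, pool, and host names are illustrative, and it assumes the Postgres data directory sits on its own dataset:

```shell
# ZFS snapshot + send/receive backup flow for a Postgres data dir --
# a sketch. DATASET and TARGET are illustrative defaults.
zfs_backup() {
    local dataset="${DATASET:-tank/pgdata}"
    local target="${TARGET:-backup-host}"
    local prev="${1:-}"   # previous snapshot, if doing an incremental send
    local snap
    snap="${dataset}@$(date +%Y%m%d)"

    # Crash-consistent point-in-time snapshot; Postgres recovers from it
    # the same way it would after a power loss.
    zfs snapshot "$snap"

    if [ -n "$prev" ]; then
        # Incremental: only blocks changed since $prev cross the wire.
        zfs send -i "$prev" "$snap" | ssh "$target" zfs receive -F backup/pgdata
    else
        zfs send "$snap" | ssh "$target" zfs receive backup/pgdata
    fi
}
```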
| petterroea wrote:
| I have run (read: helped with infrastructure for) a small
| production service using PSQL for 6 years, with up to hundreds
| of users per day. PSQL has been the problem exactly once, and
| it was because we ran out of disk space. Proper monitoring
| (duh) and a little VACUUM would have solved it.
|
| Later I ran a v2 of that service on k8s. The architecture also
| changed a lot, hosting many smaller services sharing the same
| psql server (not really microservice-related, think more
| "collective of smaller services run by different people"). I
| have hit some issues related to maxing out max_connections,
| but that's about it.
|
| This is something I do in my free time so SLA isn't an issue,
| meaning I've had the ability to learn the ropes of running PSQL
| without many bad consequences. I'm really happy I have had this
| opportunity.
|
| My conclusion is that running PSQL is totally fine if you just
| set up proper monitoring. If you are an engineer that works with
| infrastructure, even just because nobody else can/wants to,
| hosting PSQL is probably fine for you. Just RTFM.
| reilly3000 wrote:
| But it's 1500 pages long!
| petterroea wrote:
| Good point. I sure didn't read it myself :D
|
| I generally read the parts I think I need, based on what I
| read elsewhere like Stackoverflow and blog posts. Usually the
| real docs are better than some random person's SO comment. I
| feel that's sufficient?
| kunley wrote:
| Psql (lowercase) is the name of the textual sql client for
| PostgreSQL. For a general abbreviation we rather use "Pg".
| petterroea wrote:
| Good catch, thx
| cromulent wrote:
| Without disagreeing:
|
| Sometimes it is nice to simplify the conversation with non-tech
| management. Oh, you want HA / DR / etc? We click a button and you
| get it (multi-AZ). Clicking the button doubles your DB costs from
| x to y. Please choose.
|
| Then you have one less repeating conversation and someone to
| blame.
| zsoltkacsandi wrote:
| I've operated both self-hosted and managed database clusters with
| complex topologies and mission-critical data at well-known tech
| companies.
|
| Managed database services mostly automate a subset of routine
| operational work, things like backups, some configuration
| management, and software upgrades. But they don't remove the need
| for real database operations. You still have to validate
| restores, build and rehearse a disaster recovery plan, design and
| review schemas, review and optimize queries, tune indexes, and
| fine-tune configuration, among other essentials.
|
| In one incident, AWS support couldn't determine what was wrong
| with an RDS cluster and advised us to "try restarting it".
|
| Bottom line: even with managed databases, you still need people
| on the team who are strong in DBOps. You need standard operating
| procedures and automation, built by your team. Without that
| expertise, you're taking on serious risk, including potentially
| catastrophic failure modes.
| Nextgrid wrote:
| I've had an RDS instance run out of disk space and then get
| stuck in "modifying" for 24 hours (until an AWS operator
| manually SSH'd in I guess). We had to restore from the latest
| snapshot and manually rebuild the missing data from logs/other
| artifacts in the meantime to restore service.
|
| I would've very much preferred being able to SSH in myself and
| fix it on the spot. Ironically the only reason it ran out of
| space in the first place is that the AWS markup on that is so
| huge we were operating with little margin for error; none of
| that would happen with a bare-metal host where I can rent 1TB
| of NVMe for a mere 20 bucks a month.
|
| As far as I know we never got any kind of compensation for
| this, so RDS ended up being a net negative for this company,
| tens of thousands spent over a few years for laptop-grade
| performance and it couldn't even do its promised job the only
| time it was needed.
| satvikpendem wrote:
| Better yet, self host Postgres on your own open source PaaS with
| Coolify, Dokploy, or Canine, and then you can also self host all
| your apps on your VPS too. I use Dokploy but I'm looking into
| Canine, and I know many have used Coolify with great success.
| gynecologist wrote:
| I didn't even know there were companies that would host postgres
| for you. I self host it for my personal projects with 0 users and
| it works just fine, so I don't know why anyone would do it any
| differently.
| satvikpendem wrote:
| I can't tell if this is satire or not with the first sentence
| and the "0 users" parts of your comment, but I know several
| solo devs with millions of users who self host their database
| and apps as well.
| da02 wrote:
| What hosting providers do they use/recommend?
| satvikpendem wrote:
| I believe they use Hetzner although there are some
| comparison sites too: https://serverlist.dev
| phendrenad2 wrote:
| Self-hosting is one of those things that makes sense when you can
| control all of the variables. For example, can you stop the
| developers from using obscure features of the db, that suddenly
| become deprecated, causing you to need to do a manual rollback
| while they fix the code? A one-button UI to do that might be
| very handy. Can you stop your IT department from breaking the
| VPN, preventing you from logging into the db box at exactly the
| wrong time? Having it all in a UI that routes around IT's fat
| fingers might be helpful.
| kachapopopow wrote:
| since this is on the front page (again?) I guess I'll chime in:
| learn Kubernetes - it's worth it. It did take me 3 attempts to
| finally wrap my head around it. I really suggest trying out
| many different things and seeing what works for you.
|
| And I really recommend starting with *default* k3s; do not look
| at any alternatives to CNI, CSI, or networked storage. Treat
| your first cluster as something that can spontaneously fail,
| don't bother keeping it clean, and learn as much as you can.
|
| Once you have that, you can use great open-source k8s-native
| controllers which take care of the vast majority of
| requirements when it comes to self-hosting, and save more time
| in the long run than it took to set up and learn these things.
|
| Honorable mentions: k9s, Lens (I do not suggest using it in the
| long term, but the UI is really good as a starting point),
| Rancher web UI.
|
| PostgreSQL specifically:
| https://github.com/cloudnative-pg/cloudnative-pg
| If you really want networked storage:
| https://github.com/longhorn/longhorn
|
| I do not recommend Ceph unless you are okay with not using
| shared filesystems (they have a bunch of gotchas), or unless
| you want S3 without having to install a dedicated deployment
| for it.
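For context, a CloudNativePG cluster is declared with a manifest on this order (name and sizes illustrative; see the operator docs for the full CRD):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-main
spec:
  instances: 3        # one primary + two replicas; failover is automatic
  storage:
    size: 20Gi
```

The operator turns that into the StatefulSet-style pods, volumes, and replication wiring for you.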
| satvikpendem wrote:
| Check out canine.sh, it's to Kubernetes what Coolify or Dokploy
| is to Docker, if you're familiar with self hosted open source
| PaaS.
| kachapopopow wrote:
| I just push to git where there is a git action to
| automatically synchronize deployments
| chuckadams wrote:
| And on a similar naming note yet totally unrelated, check out
| k9s, which is a TUI for Kubernetes cluster admin. All kinds
| of nifty features built-in, and highly customizable.
| satvikpendem wrote:
| If we're talking about CLIs, check out Kamal, the build
| system that 37signals / Basecamp / DHH developed,
| specifically to move off the cloud. I think it uses
| Kubernetes but not positive, it might just be Docker.
| Nextgrid wrote:
| It's just Docker - it SSH's in to the target servers and
| runs `docker` commands as needed.
| chandureddyvari wrote:
| Any good recommendations you got for learning kubernetes for
| busy people?
| mystifyingpoi wrote:
| No path for busy people, unfortunately. Learn everything from
| ground up, from containers to Compose to k3s, maybe to
| kubeadm or hosted. Huge abstraction layers coming from
| Kubernetes serve their purpose well, but can screw you up
| when anything goes slightly wrong on the upper layer.
|
| For start, ignore operators, ignore custom CSI/CNI, ignore
| IAM/RBAC. Once you feel good in the basics, you can expand.
| kachapopopow wrote:
| k3sup a cluster, then ask an AI how to serve an nginx static
| site using Traefik on it and to explain every step and what it
| does (it should provide: a ConfigMap, a Deployment, a Service,
| and an Ingress).
|
| k3s provides a CSI and CNI (Container Storage Interface,
| Container Network Interface) out of the box - Flannel for
| networking, and local-path storage which just maps volumes to
| disk (PVCs).
|
| Traefik is what routes your traffic from the outside to the
| inside of your cluster (to an Ingress resource).
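The four resources that answer should contain look roughly like the sketch below; all names and the hostname are illustrative, and k3s's bundled Traefik picks up the plain Ingress without extra configuration:

```yaml
# Minimal static-site stack for a first k3s cluster -- a sketch.
apiVersion: v1
kind: ConfigMap
metadata:
  name: site-content
data:
  index.html: "<h1>hello</h1>"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: static-site
spec:
  replicas: 1
  selector:
    matchLabels: { app: static-site }
  template:
    metadata:
      labels: { app: static-site }
    spec:
      containers:
        - name: nginx
          image: nginx:alpine
          ports: [{ containerPort: 80 }]
          volumeMounts:
            - { name: content, mountPath: /usr/share/nginx/html }
      volumes:
        - name: content
          configMap: { name: site-content }
---
apiVersion: v1
kind: Service
metadata:
  name: static-site
spec:
  selector: { app: static-site }
  ports: [{ port: 80, targetPort: 80 }]
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: static-site
spec:
  rules:
    - host: site.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service: { name: static-site, port: { number: 80 } }
```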
| ninkendo wrote:
| At $WORK we've been using the Zalando Postgres kubernetes
| operator to great success: https://github.com/zalando/postgres-
| operator
|
| As someone who has operated Postgres clusters for over a decade
| before k8s was even a thing, I fully recommend just using a
| Postgres operator like this one and moving on. The out of box
| config is sane, it's easy to override things, and failover/etc
| has been working flawlessly for years. It's just the right line
| between total DIY and the simplicity of having a hosted
| solution. Postgres is solved, next problem.
| vovavili wrote:
| For something like a database, what is the added advantage to
| using Kubernetes as opposed to something simple like Docker
| Compose?
| mystifyingpoi wrote:
| Docker Compose (ignoring Swarm which seems to be obsolete)
| manages containers on a single machine. With Kubernetes,
| the pod that hosts the database is a pod like any other (I
| assume). It gets moved to a healthy machine when a node goes
| bad, respects CPU/mem limits, works with generic monitoring
| tools, can be deployed from GitOps tools etc. All the k8s
| goodies apply.
| Nextgrid wrote:
| When it comes to a DB moving the process around is easy,
| it's the data that matters. The reason bare-metal-hosted
| DBs are so fast is that they use direct-attach storage
| instead of networked storage. You lose those speed
| advantages if you move to distributed storage (Ceph/etc).
| ninkendo wrote:
| You don't need to use networked storage, the zalando
| postgres operator just uses local storage on the host. It
| uses a StatefulSet underneath so that pods will stay on
| the same node until you migrate them.
| Nextgrid wrote:
| But if I'm pinning it to dedicated machines then
| Kubernetes does not give me anything, but I still have to
| deal with its tradeoffs and moving parts - which from
| experience are more likely to bring me down than actual
| hardware failure.
| lukaslalinsky wrote:
| I run PostgreSQL+Patroni on Kubernetes where each
| instance is a separate StatefulSet pinned to dedicated
| hosts, with data on local ZFS volumes, provisioned by the
| OpenEBS controller.
|
| I do this for multiple reasons, one is that I find it
| easier to use Kubernetes as the backend for Patroni,
| rather than running/securing/maintaining just another
| etcd cluster. But I also do it for observability, it's
| much nicer to be able to pull all the metrics and logs
| from all the components. Sure, it's possible to set that
| up without Kubernetes, but why if I can have the logs
| delivered just one way. Plus, I prefer how self-
| documenting the whole thing is. No one likes YAML
| manifests, but they are essentially running documentation
| that can't get out of sync.
| ninkendo wrote:
| It's not like anyone's recommending you setup k8s just to
| use Postgres. The advice is that, if you're already using
| k8s, the Postgres operator is pretty great, and you
| should try it instead of using a hosted Postgres offering
| or having a separate set of dedicated (non-k8s) servers
| just for Postgres.
|
| I will say that even though the StatefulSet pins the pod
| to a node, it still has advantages. The StatefulSet can
| be scaled to N nodes, and if one goes down, failover is
| automatic. Then you have a choice as an admin to either
| recover the node, or just delete the pod and let the
| operator recreate it on some other node. When it gets
| recreated, it resyncs from the new primary and becomes a
| replica and you're back to full health, it's all pretty
| easy IMO.
| kachapopopow wrote:
| I hate that this is starting to sound like a bot Q&A, but the
| primary advantages are that it provides secure remote
| configuration, it's platform agnostic, and it offers multi-
| node orchestration, built-in load balancing and a services
| framework, way more networking control than Docker, better
| security, self-healing, and the list goes on. You have to
| read more about it to really understand the advantages over
| Docker.
| ninkendo wrote:
| The assumption is that you're already using Kubernetes,
| sorry.
|
| Docker compose has always been great for running some
| containers on a local machine, but I've never found it to
| be great for deployments with lots of physical nodes. k8s
| is certainly complex, but the complexity really pays off
| for larger deployments IMO.
| alex23478 wrote:
| In this case the advantage is operators for running
| postgres.
|
| With Docker Compose, the abstraction level you're dealing
| with is containers, which means in this case you're saying
| "run the postgres image and mount the given config and the
| given data directory". When running the service, you need
| to know how to operate the software within the container.
|
| Kubernetes at its heart is an extensible API Server, which
| allows so called "operators" to create custom resources and
| react to them. In the given case, this means that a
| postgres operator defines for example a
| PostgresDatabaseCluster resource, and then contains control
| loops to turn these resources into actual running
| containers. That way, you don't necessarily need to know
| how postgres is configured and that it requires a data
| directory mount. Instead, you create a resource that says
| "give me a postgres 15 database with two instances for HA
| fail-over", and the operator then goes to work and manages
| the underlying containers and volumes.
|
| Essentially operators in kubernetes allow you to manage
| these services at a much higher level.
| groundzeros2015 wrote:
| Are you working on websites with millions of hourly visits?
| markstos wrote:
| I hosted PostgreSQL professionally for over a decade.
|
| Overall, a good experience. Very stable service and when
| performance issues did periodically arise, I like that we had
| full access to all details to understand the root cause and tune
| details.
|
| Nobody was employed as a full-time DBA. We had plenty of other
| things going on in addition to running PostgreSQL.
| jhatemyjob wrote:
| I wish this post went into the actual _how_! He glossed over the
| details. There is a link to his repo, which is a start I suppose:
| https://github.com/piercefreeman/autopg
|
| A blog post that went into the details would be awesome. I know
| Postgres has some docs for this
| (https://www.postgresql.org/docs/current/backup.html), but it's
| too theoretical. I want to see a one-stop-shop with everything
| you'd reasonably need to know to self host: like monitoring
| uptime, backups, stuff like that.
| lbrito wrote:
| I'm probably just an idiot, but I ran unmanaged postgres on
| Fly.io, which is basically self hosting on a vm, and it wasn't
| fun.
|
| I did this for just under two years, and I've lost count of how
| many times one or more of the nodes went down and I had to
| manually deregister it from the cluster with repmgr, clone a new
| vm and promote a healthy node to primary. I ended up writing an
| internal wiki page with the steps. I never got it: if one of the
| purposes of clusters is having higher availability, why did
| repmgr not handle zombie primaries?
|
| Again, I'm probably just an idiot out of my depth with this. And
| I probably didn't need a cluster anyway, although with the nodes
| failing like they did, I didn't feel comfortable moving to a
| single node setup as well.
|
| I eventually switched to managed postgres, and it's amazing being
| able to file a sev1 for someone else to handle when things go
| down, instead of the responsibility being on me.
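For anyone who lands in the same spot, the manual steps the parent describes map onto repmgr commands roughly as below; the config path, node ID, and hostnames are illustrative. One note on the "why didn't repmgr handle it" question: the repmgr CLI never fails over on its own - automatic failover requires the separate repmgrd daemon running with `failover=automatic` configured.

```shell
# Manual repmgr failover runbook -- a sketch of the recovery steps.
# REPMGR_CONF and DEAD_NODE_ID are illustrative; each step runs on the
# host named in its comment, not all on one machine.
repmgr_failover() {
    local conf="${REPMGR_CONF:-/etc/repmgr.conf}"
    local dead_node_id="${DEAD_NODE_ID:-2}"

    # 1. Anywhere: see what repmgr thinks the cluster looks like.
    repmgr -f "$conf" cluster show

    # 2. On the healthiest standby: promote it to primary.
    repmgr -f "$conf" standby promote

    # 3. Remove the dead/zombie primary's registration.
    repmgr -f "$conf" primary unregister --node-id="$dead_node_id"

    # 4. On a fresh VM: clone from the new primary, then register it
    #    as a standby (shown here as the commands you would run):
    # repmgr -f "$conf" standby clone -h new-primary -U repmgr -d repmgr
    # repmgr -f "$conf" standby register
}
```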
| indigodaddy wrote:
| Assuming you are using fly's managed postgres now?
| lbrito wrote:
| Yep
| mittermayr wrote:
| Self-hosting is more a question of responsibility I'd say. I am
| running a couple of SaaS products and self-host at much better
| performance at a fraction of the cost of running this on AWS.
| It's amazing and it works perfectly fine.
|
| For client projects, however, I always try and sell them on
| paying the AWS fees, simply because it shifts the responsibility
| of the hardware being "up" to someone else. It does not
| inherently solve the downtime problem, but it allows me to say,
| "we'll have to wait until they've sorted this out, Ikea and
| Disney are down, too."
|
| Doesn't always work like that and isn't always a tried-and-true
| excuse, but generally lets me sleep much better at night.
|
| With limited budgets, however, it's hard to accept the cost of
| RDS (and we're talking with at least one staging environment)
| when comparing it to a very tight 3-node Galera cluster running
| on Hetzner at barely a couple of bucks a month.
|
| Or Cloudflare, titan at the front, being down again today and the
| past two days (intermittently) after also being down a few weeks
| ago and earlier this year as well. Also had SQS queues time out
| several times this week, they picked up again shortly, but it's
| not like those things ...never happen on managed environments.
| They happen quite a bit.
| bossyTeacher wrote:
| > Self-hosting is more a question of responsibility I'd say. I
| am running a couple of SaaS products and self-host at much
| better performance at a fraction of the cost of running this on
| AWS
|
| It is. You need to answer the question: what are the
| consequences of your service being down for, let's say, 4
| hours, or of a security patch not being properly applied, or
| of not following security best practices? Many people are
| technically unable, or lack the time or the resources, to
| confidently address that question, hence paying for someone
| else to do it.
|
| Your time is money though. You are saving money but giving up
| time.
|
| Like everything, it is always cheaper to do it (it being
| cooking at home, cleaning your home, fixing your own car, etc)
| yourself (if you don't include the cost of your own time doing
| the service you normally pay someone else for).
| jbverschoor wrote:
| Yea I agree.. better outsource product development,
| management, and everything else too by that narrative
| zbentley wrote:
| That's pretty reductive. By that logic the opposite extreme
| is just as true: if using managed services is just as bad
| as outsourcing everything else, then a business shouldn't
| rent real estate either--every business should build and
| own their own facility. They should also never contract out
| janitorial work, nor should they retain outside law firms--
| they should hire and staff those departments internally,
| every time, no nuance allowed.
|
| You see the issue?
|
| Like, I'm all for not procuring things that it makes more
| sense to own/build (and I know most businesses have piss-
| poor instincts on which is which--hell, I work for the
| government! I can see firsthand the consequences of
| outsourcing decision making to contractors, rather than
| just outsourcing implementation).
|
| But it's very case-by-case. There's no general rule like
| "always prefer self hosting" or "always rent real estate,
| never buy" that applies broadly enough to be useful.
| jama211 wrote:
| So well said, I like the technique of taking their logic
| and turning it around, never seen that before but it's
| smart.
| antihipocrat wrote:
| In my experience it only ends well on the Internet and
| with philosophically inclined friends.
| jama211 wrote:
| Anything ending well on the internet is like a mythical
| unicorn though
| gopher_space wrote:
| I'll be reductive in conversations like this just to help
| push the pendulum back a little. The prevailing attitude
| seems (to me) like people find self-hosting mystical and
| occult, yet there's never been a better time to do it.
|
| > But it's very case-by-case. There's no general rule
| like "always prefer self hosting" or "always rent real
| estate, never buy" that applies broadly enough to be
| useful.
|
| I don't know if anyone remembers that irritating "geek
| code" thing we were doing a while back, but coming up
| with some kind of shorthand for whatever context we're
| talking about would be useful.
| zbentley wrote:
| No argument here, that's a fair and thoughtful response,
| and you're not wrong regarding the prejudice against
| self-hosting (and for what it's worth I absolutely come
| from the era where that was the default approach, have
| done it extensively, like it, and still do it/recommend
| it when it makes sense).
|
| > " geek code" thing we were doing a while back
|
| Not sure what you're referring to. "Shibboleet", perhaps?
| https://xkcd.com/806/
| gopher_space wrote:
| > The Geek Code, developed in 1993, is a series of
| letters and symbols used by self-described "geeks" to
| inform fellow geeks about their personality, appearance,
| interests, skills, and opinions. The idea is that
| everything that makes a geek individual can be encoded in
| a compact format which only other geeks can read. This is
| deemed to be efficient in some sufficiently geeky manner.
|
| https://en.wikipedia.org/wiki/Geek_Code
| foo42 wrote:
| geek code is worthy of its own hn submission
| nemothekid wrote:
| Unironically - I agree. You _should_ be outsourcing things
| that aren't your core competency. I think many people on
| this forum have a certain pride about doing this manually,
| but to me it wouldn't make sense in any other context.
|
| Could you imagine accountants arguing that you shouldn't
| use a service like Paychex or Gusto and just run payroll
| manually? After all it's cheaper! Just spend a week
| tracking taxes, benefits and signing checks.
|
| Self-hosting, to me, doesn't make sense unless you are 1.)
| doing something not offered by the cloud or a pathological
| use case, 2.) running a hobby project, or 3.) in
| maintenance mode on the product. Otherwise your time is
| better spent on your core product - and if it isn't, you
| probably aren't busy enough. If the cost of your RDS
| cluster is _so expensive_ relative to your traffic, you
| probably aren't charging enough or your business economics
| really don't make sense.
|
| I've managed large database clusters (MySQL, Cassandra) on
| bare metal hardware in managed colo in the past. I'm well
| aware of the performance that's being left on the table and
| what the cost difference is. For the vast majority of
| businesses, optimizing for self hosting doesn't make sense,
| especially if you don't have PMF. For a company like
| 37signals, sure, product velocity probably is very high,
| and you have engineering cycles to spare. But if you aren't
| profitable, self hosting won't make you profitable, and
| your time is better spent elsewhere.
| belorn wrote:
| You can outsource everything, but outsourcing critical
| parts of the company may also put the existence of the
| company in the hands of a third party. Is that an
| acceptable risk?
|
| Control and risk management cost money, be that by self
| hosting or contracts. At some point it is cheaper to buy
| the competence and make it part of the company rather
| than outsource it.
| nemothekid wrote:
| I think you and I simply disagree about your database
| being a core/critical part of your stack. I believe RDS
| is good enough for most people, and the only advantage
| you would have in self hosting is shaving 33% off your
| instance bill. I'd probably go a step further and argue
| that Neon/CockroachDB Serverless is good enough for most
| people.
| dolmen wrote:
| Access control to your (customer's) data may also be a
| concern that rules out managed services like RDS.
| nemothekid wrote:
| I'm not sure what is meaningfully different about RDS
| that wouldn't rule out the cloud in general if that was a
| concern.
| solatic wrote:
| I'm totally with you on the core vs. context question,
| but you're missing the nuance here.
|
| Postgres operations _are_ part of the core of the
| business. It's not a payroll management service where
| you should comparison shop once the contract comes up for
| renewal and haggle on price. Once Postgres is the
| database for your core systems of record, _you are not
| switching away from it_. The closest analog is how
| difficult it is/was for anybody who built a business on
| top of an Oracle database to switch away from Oracle.
| But Postgres is free ^_^
|
| The question at heart here is whether the _host_ for
| Postgres is context or core. There are a lot of vendors
| for Postgres hosting: AWS RDS and CrunchyData and
| PlanetScale etc. And if you make a conscious choice to
| outsource this bit of context, you should be signing
| yearly-ish contracts with support agreements and re-
| evaluating every year and haggling on price. If your
| business works on top of a small database with not-
| intense access needs, and can handle downtime or
| maintenance windows sometimes, there's a really good
| argument for treating it that way.
|
| But there's also an argument that your Postgres host is
| core to your business as well, because if your Postgres
| host screws up, your customers feel it, and it can affect
| your bottom line. If your Postgres host didn't react in
| time to your quick need for scaling, or tuning Postgres
| settings (that a Postgres host refuses to expose) could
| make a material impact on either customer experience or
| financial bottom-line, _that is indeed core to your
| business_. That simply isn't a factor when picking a
| payroll processor.
| nemothekid wrote:
| Setting aside the assumption that you will
| automatically have as good or better uptime than a cloud
| provider, I just feel like you simply aren't being
| thoughtful enough with the comparison. In what world
| is payroll not as important as your DBMS - if you can't
| pay people, you don't have a business!
|
| If your payroll processor screws up and you can't pay
| your employees or contractors, that can also affect your
| bottom line. This isn't a hypothetical - this is a real
| thing that happened to companies that used Rippling.
|
| If your payroll processor screws up and you end up owing
| tens of thousands to ex-employees because they didn't
| accrue vacation days correctly, that can squeeze your
| business. These are real things I've seen happen.
|
| Despite these real issues that have jammed up businesses
| before, rarely do people suggest moving payroll in-house.
| Many companies treat payroll like cloud, with no need for
| multi-year contracts; Gusto lets you sign up monthly with
| a credit card, and you can easily switch to Rippling or
| Paychex.
|
| What I imagine is you are innately aware of how a DBMS
| can screw up, but not how complex payroll can get. So in
| your world view payroll is a solved problem to be
| outsourced, but DBMS is not.
|
| To me, the question isn't whether or not my cloud
| provider is going to have perfect uptime. The assumption
| that you will achieve better uptime and operations than
| cloud is pure hubris; it's certainly possible, but there
| is nothing inherent about self-hosting that makes it more
| resilient. The question is whether your use case is
| differentiated enough that something like RDS doesn't
| make sense. If it's not, your time is better spent
| focused on your business - not setting up dead man's
| switches to ensure your database backup cron is running.
| bigstrat2003 wrote:
| > Like everything, it is always cheaper to do it (it being
| cooking at home, cleaning your home, fixing your own car,
| etc) yourself (if you don't include the cost of your own time
| doing the service you normally pay someone else for).
|
| In a business context the "time is money" thing actually
| makes sense, because there's a reasonable likelihood that the
| business can put the time to a more profitable use in some
| other way. But in a personal context it makes no sense at
| all. Realistically, the time I spend cooking or cleaning was
| not going to earn me a dime no matter what else I did,
| therefore the opportunity cost is zero. And this is true for
| almost everyone out there.
| _superposition_ wrote:
| Lol this made me laugh, there's a reasonable likelihood
| that time will be filled with meetings.
| bigstrat2003 wrote:
| Heh, true. Although in fairness I said the business _can_
| repurpose the time to make money, not that they _will_.
| I'm splitting hairs, but it seems in keeping with the
| ethos here. ;)
| PunchyHamster wrote:
| You can pay someone else to manage your hardware stack, there
| are literal companies that will just keep it running, while
| you just deploy your apps on that.
|
| > It is. You need to answer the question: what are the
| consequences of your service being down for, let's say, 4
| hours, or some security patch isn't properly applied, or
| you have not followed the best practices in terms of
| security?
|
| There is one advantage a self-hosted setup has here: if you
| set up a VPN, only your employees have access, and you can
| keep the server inaccessible from the internet. So even in
| the case of a zero day that WILL make a SaaS company leak
| your data, you can be safe(r) with a self-hosted solution.
|
| > Your time is money though. You are saving money but giving
| up time.
|
| The investment compounds. Setting up infra to run a single
| container for some app takes time, and there is a good
| chance it won't pay for itself.
|
| But the 2nd service? Cheaper. The 5th? At that point you
| probably have it automated enough that it's just pointing
| it at a Docker container and tweaking a few settings.
|
| > Like everything, it is always cheaper to do it (it being
| cooking at home, cleaning your home, fixing your own car,
| etc) yourself (if you don't include the cost of your own time
| doing the service you normally pay someone else for).
|
| It's cheaper even if you include your own time. You pay a
| technical person at your company to do it. A SaaS company
| also pays one, then pays sales and PR people to sell it,
| then pays income tax, then also needs to "pay" investors.
|
| Yeah, making a service for 4 people in a company can be
| more work than just paying $10/mo to a SaaS company. But
| 20? 50? 100? It quickly gets to the point where self
| hosting (whether actually "self", or by using dedicated
| servers, or by using cloud) _actually_ pays off
| mattmanser wrote:
| Over 20 years I've had lots of clients on self-hosted, even
| self-hosting SQL on the same VM as the webserver as you used to
| in the long distant past for low-usage web apps.
|
| I have never, ever, ever had a SQL box go down. I've had a web
| server go down once. I had someone who probably shouldn't have
| had access to a server accidentally turn one off once.
|
| The only major outage I've had (2/3 hours) was when the box was
| also self-hosting an email server and I accidentally caused it
| to flood itself with failed delivery notices with a deploy.
|
| I may have cried a little in frustration and panic but it got
| fixed in the end.
|
| I actually find using cloud hosted SQL in some ways harder and
| more complicated because it's such a confusing mess of cost and
| what you're actually getting. The only big complication is
| setting up backups, and that's a one-off task.
| paulryanrogers wrote:
| Disks go bad. RAID is nontrivial to set up. Hetzner had a
| big DC outage that led to data loss.
|
| Off site backups or replication would help, though not always
| trivial to fail over.
| alemanek wrote:
| As someone who has set this up while not being a DBA or
| sysadmin:
|
| Replication and backups really aren't that difficult to
| set up properly with something like Postgres. You can also
| expose metrics around this to set up alerting if
| replication lag goes beyond a threshold you set or a
| backup didn't complete. You do need to periodically test
| your backups, but that is also good practice.
|
| I am not saying something like RDS doesn't have value, but
| you are paying a huge premium for it. Once you get to a more
| steady state, owning your database totally makes sense. A
| cluster of $10-20 VPSes with NVMe drives can get really
| good performance and will take you a lot farther than you
| might expect.
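| For anyone curious, the moving parts are smaller than they
| sound. A rough sketch of seeding a streaming replica
| (hostname, password, and data directory are made-up
| examples; check your distro's defaults):

```shell
# On the primary: create a role the standby will connect as.
psql -c "CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secret'"

# On the standby: clone the primary's data directory.
# -R writes the connection settings so the node starts as a hot standby.
pg_basebackup -h primary.example.com -U replicator \
    -D /var/lib/postgresql/data -R -X stream

# Afterwards, replication lag is visible on the primary:
psql -c "SELECT client_addr, replay_lag FROM pg_stat_replication"
```

| That last query is the sort of thing you feed into your
| alerting threshold.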
| andersmurphy wrote:
| Even easier with sqlite thanks to litestream.
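| For reference, the whole setup is roughly one config file
| and one daemon (the bucket name here is hypothetical):

```shell
# /etc/litestream.yml: continuously replicate the SQLite file to S3.
cat >/etc/litestream.yml <<'EOF'
dbs:
  - path: /var/lib/app/app.db
    replicas:
      - url: s3://my-backup-bucket/app-db
EOF

# Run alongside the app; it tails the WAL and ships changes.
litestream replicate -config /etc/litestream.yml
```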
| westurner wrote:
| datasette and datasette-lite (WASM w/pyodide) are web UIs
| for SQLite with sqlite-utils.
|
| For read only applications, it's possible to host
| datasette-lite and the SQLite database as static files on
| a redundant CDN. Datasette-lite + URL redirect API +
| litestream would probably work well, maybe with read-
| write; though also electric-sql has a sync engine (with
| optional partial replication) too, and there's PGlite
| (Postgres in WebAssembly)
| tormeh wrote:
| I think the pricing of the big three is absurd, so I'm on
| your side in principle. However, it's the steady state
| that worries me. When the box has been running for 4
| years and nobody who works there has any (recent)
| experience operating postgres anymore. That shit makes me
| nervous.
| bg24 wrote:
| Yes. Also you can have these replicas of Postgres across
| regions.
| j45 wrote:
| Not as often as you might think. Hardware doesn't fail like
| it used to.
|
| Hardware also monitors itself reasonably well because the
| hosting providers use it.
|
| It's trivial to run mirrored containers on two separate
| Proxmox nodes because hosting providers use the same kind
| of stuff.
|
| Offsite backups and replication? Also point and click and
| trivial with tools like Proxmox.
|
| RAID is actually trivial to set up if you don't compare it
| to doing it manually yourself from the command line. Again,
| tools like Proxmox make it point-and-click plus 5 minutes
| of watching YouTube.
|
| If we want to find a solution, our brain will find it. If
| we don't, we can find reasons not to.
| tempest_ wrote:
| > if you don't compare it to doing it manually yourself
|
| Even if you do ZFS makes this pretty trivial as well.
| mattmanser wrote:
| So can the cloud, and cloud has had more major outages in
| the last 3 months than I've seen on self-hosted in 20
| years.
|
| Deploys these days take minutes so what's the problem if a
| disk does go bad? You lose at most a day of data if you go
| with the 'standard' overnight backups, and if it's mission
| critical, you will have already set up replicas, which
| again is pretty trivial and only slightly more complicated
| than doing it on cloud hosts.
| paulryanrogers wrote:
| > ...you will have already set up replicas, which again
| is pretty trivial and only slightly more complicated than
| doing it on cloud hosts.
|
| Even on PostgreSQL 18 I wouldn't describe self hosted
| replication as "pretty trivial". On RDS you can get an HA
| replica (or cluster) by clicking a radio button.
| fabian2k wrote:
| For this kind of small scale setup, a reasonable backup
| strategy is all you need for that. The one critical part is
| that you actually verify your backups are done and work.
|
| Hardware doesn't fail _that_ often. A single server will
| easily run many years without any issues, if you are not
| unlucky. And many smaller setups can tolerate the downtime
| to rent a new server or VM and restore from backup.
| mcny wrote:
| One thing that will always stick in my mind is one time I
| worked at a national Internet service provider.
|
| The log disk was full or something. That's not the shameful
| part though. What followed is a mass email saying everyone
| needs to update their connection string from bla bla bla 1
| dot foo dot bar to bla bla bla 2 dot foo dot bar
|
| This was inexcusable to me. I mean this is an Internet
| service provider. If we can't even figure out DNS, we
| should shut down the whole business and go home.
| PunchyHamster wrote:
| They do, it isn't, and cloud providers also go bad.
|
| > Off site backups or replication would help, though not
| always trivial to fail over.
|
| You want those regardless of where you host
| znpy wrote:
| > RAID is nontrivial to set up.
|
| Skill issue?
|
| It's not 2003; modern volume-managing filesystems (e.g.
| ZFS) make creating and managing RAID trivial.
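| To illustrate (device and pool names are just examples): a
| mirrored pool plus a dataset for Postgres is a couple of
| commands:

```shell
# Create a two-disk mirror; ZFS handles checksumming and self-healing.
zpool create tank mirror /dev/nvme0n1 /dev/nvme1n1

# A dataset for the Postgres data directory, with compression on.
zfs create -o compression=lz4 -o recordsize=32k tank/pgdata

# Replacing a failed disk later is also a single command.
zpool replace tank /dev/nvme0n1 /dev/nvme2n1
```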
| vb-8448 wrote:
| You can still outsource up to VM level and handle everything
| else on you own.
|
| Obviously it depends on the operational overhead of specific
| technology.
| madeofpalk wrote:
| > but it allows me to say, "we'll have to wait until they've
| sorted this out, Ikea and Disney are down, too."
|
| From my experience your client's clients don't care about this
| when they're still otherwise up.
| tjwebbnorfolk wrote:
| Yes but the fact that it's "not their fault" keeps the person
| from getting fired.
|
| Don't underestimate the power of CYA
| HPsquared wrote:
| That's real microeconomics.
| api wrote:
| This is a major reason the cloud commands such a premium.
| It's a way to make down time someone else's problem.
|
| The other factor is eliminating the "one guy who knows X"
| problem in IT. What happens if that person leaves or you
| have to let them go? But with managed infrastructure
| there's a pool of people who know how to write terraform or
| click buttons and manage it and those are more
| interchangeable than someone's DIY deployment. Worst case
| the cloud provider might sell you premium support and help.
| Might be expensive but you're not down.
|
| Lastly, there's been an exodus of talent from IT. The
| problem is that anyone really good can become a coder and
| make more. So finding IT people at a reasonable cost who
| know how to really troubleshoot and root cause stuff and
| engineer good systems is very hard. The good ones command
| more of a programmer salary which makes the gap with cloud
| costs much smaller. Might as well just go managed cloud.
| 01HNNWZ0MV43FF wrote:
| That is called "bus factor" or "lottery factor". If the
| one IT guy gets hit by a bus or wins the lottery and
| quits, what happens? You want a bus factor of two or more
| - two people would have to get hit by a bus for the
| company to have a big problem
| growse wrote:
| There's a bus factor equivalent with the cloud, too. The
| power to severely disrupt your service (either
| accidentally, or on purpose) rests with a single org (and
| often, a single compliance department within that org).
|
| Ironically, this becomes more of a concern the larger the
| supplier. AWS can live with firing any one of their
| customers - a smaller outfit probably couldn't.
| 6LLvveMx2koXfwn wrote:
| Surely 'the other factor' is no factor at all as IaC can
| target on-prem just as easily as cloud?
| TheNewsIsHere wrote:
| Many people do inaccurately equate IaC with "cloud
| native" or cloud "only".
|
| It can certainly fit into a particular cloud platform's
| offerings. But it's by no means exclusive to the cloud.
|
| My entire stack can be picked up and redeployed anywhere
| where I can run Ubuntu or Debian. My "most external"
| dependencies are domain name registries and an S3-API
| compatible object store, and even that one is technically
| optional, if given a few days of lead time.
| pdimitar wrote:
| I never understood the argument of a senior IT person's
| salary competing for the cloud expenses. In my
| contracting and consulting career I have done all of
| programming, monitoring and DevOps many times; the cost
| of my contract is amortized over multiple activities.
|
| The way you present it makes sense of course. But I have
| to wonder whether there really are such clear demarcation
| lines between responsibilities. At least over the course
| of my career this was very rarely the case.
| blitz_skull wrote:
| From my experience, this completely disavows you from an
| otherwise reputation damaging experience.
| Thaxll wrote:
| That argument does not hold when there is AWS serverless pg
| available, which costs almost nothing for low traffic and is
| vastly superior to self hosting regarding observability,
| security, integration, backups, etc...
|
| There is no reason to self-manage pg for dev environments.
|
| https://aws.amazon.com/rds/aurora/serverless/
| jread wrote:
| This was true for RDS Serverless v1, which scaled to 0 but
| is no longer offered. V2 requires a minimum 0.5 ACU hourly
| commit ($40+/mo).
| cobolcomesback wrote:
| V2 scales to zero as of last year.
|
| https://aws.amazon.com/blogs/database/introducing-scaling-
| to...
|
| It only scales down after a period of inactivity though -
| it's not pay-per-request like other serverless offerings.
| DSQL looks to be more cost effective for small projects if
| you can deal with the deviations from Postgres.
| jread wrote:
| Ah, good to know, I hadn't seen that V2 update. Looks
| like a min 5m inactivity to auto-pause (i.e., scale to
| 0), and any connection attempt (valid or not) resumes the
| DB.
| starttoaster wrote:
| "which cost almost nothing for low traffic" you invented the
| retort "what about high traffic" within your own message. I
| don't even necessarily mean user traffic either. But if you
| constantly have to sync new records over (as could be the
| case in any kind of timeseries use-case) the internal traffic
| could rack up costs quickly.
|
| "vastly superior to self hosting regarding observability" I'd
| suggest looking into the CNPG operator for Postgres on
| Kubernetes. The built-in metrics and official dashboard are
| vastly superior to what I get from CloudWatch for my RDS
| clusters. And the backup mechanism using Barman for database
| snapshots and WAL backups is vastly superior to AWS DMS or
| AWS's disk snapshots which aren't portable to a system
| outside of AWS if you care about avoiding vendor lock-in.
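| To give a flavor of it, a minimal CNPG cluster with Barman
| backups to object storage looks something like this (the
| cluster name and bucket are made up; see the CNPG docs for
| the full spec, including credentials):

```shell
kubectl apply -f - <<'EOF'
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: example-pg
spec:
  instances: 3          # one primary, two replicas, automated failover
  storage:
    size: 20Gi
  backup:
    barmanObjectStore:
      destinationPath: s3://example-backups/pg
EOF
```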
| maccard wrote:
| Aurora serverless requires provisioned compute - it's about
| $40/mo last time I checked.
| snovv_crash wrote:
| The performance disparity is just insane.
|
| Right now from Hetzner you can get a _dedicated_ server
| with a 6c/12t Ryzen 3600, 64GB RAM and 2x512GB NVMe SSD
| for EUR37/mo
|
| Even if you just served files from disc, no RAM, that could
| give 200k small files per second.
|
| From RAM, and with 6 dedicated cores, network will saturate
| long before you hit compute limits on any reasonably
| efficient web framework.
| gonzo41 wrote:
| Just use a pg container on a vm, cheap as chips and you can
| do anything to em.
| arwhatever wrote:
| Me: "Why are we switching from NoNameCMS to Salesforce?"
|
| Savvy Manager: "NoNameCMS often won't take our support calls,
| but if Salesforce goes down it's in the WSJ the next day."
| dilyevsky wrote:
| This ignores the case when BigVendor is down for your account
| and your account only and support is mia, which is not that
| uncommon ime
| yunwal wrote:
| It doesn't ignore that case, it simply allows them to shift
| blame whereas the no name vendor does not.
| ajmurmann wrote:
| "Nobody has ever been fired for buying IBM"
| psychoslave wrote:
| https://www.forbes.com/sites/duenablomstrom1/2018/11/30/n
| obo...
| zelphirkalt wrote:
| So in the end it's not better for the users at all, it's
| just for non-technical people to shift blame. Great
| "business reasoning".
| WJW wrote:
| Nobody in this thread ever claimed it was better for the
| users. It's better for the people involved in the
| decision.
| growse wrote:
| And this speaks to the lack of alignment about what's
| good for the decision makers Vs what's good for the
| customer.
| PunchyHamster wrote:
| It's not though; they have workers they pay who aren't
| making money, all while footing a bigger bill for the
| "pleasure"
| notKilgoreTrout wrote:
| That's more a small business owner perspective. For a
| middle manager rattling some cages during a week of IBM
| downtime is adequate performance while it is unclear how
| much performative response is necessary if mom&pops is
| down for a day.
| zelphirkalt wrote:
| Yes, you are correct. But actually, I am not claiming
| someone claimed it :) I am trying to get at the idea the
| "business people" usually bring up: that they are looking
| after the users'/customers' interest and that others don't
| have the "business mind", while actually, when it comes to
| this kind of decision making, all of that goes out of the
| window, because they want to shift the blame.
|
| Stepping back a few steps further: most of the services we
| use are not so essential that we cannot bear them being
| down a couple of hours over the course of a year.
| We have seen that over and over again with Cloudflare and
| AWS outages. The world continues to revolve. If we were a
| bit more reasonable with our expectations and realistic
| when it comes to required uptime guarantees, there
| wouldn't be much worry about something being down every
| now and then, and we wouldn't need to worry about our
| livelihood, if we need to reboot a customer's database
| server once a year, or their impression about the quality
| of system we built, if such a thing happens.
|
| But even that is unlikely, if we set up things properly.
| I have worked in a company where we self-hosted our
| platform and it didn't have the most complex fail-safe
| setup ever. Just have good backups and make sure you can
| restore, and 95% of the worries go away for such non-
| essential products - and our outages were less frequent
| than trouble with AWS or Cloudflare.
|
| It seems that either way, you need people who know what
| they are doing, whether you self-host or buy some
| service.
| oconnor663 wrote:
| You have to consider the class of problems as a whole,
| from the perspective of management:
|
| - The cheap solution would be equally good, and it's just
| a blame shifting game.
|
| - The cheap solution is worse, and paying more for the
| name brand gets you more reliability.
|
| There are _many_ situations that fall into the second
| category, and anyone running a business probably has
| personal memories of making the second mistake. The
| problem is, if you 're not up to speed on the nitty
| gritty technical details of a tradeoff, you can't tell
| the difference between the first category and the second.
| So you accept that sometimes you will over-spend for "no
| reason" as a cost of doing business. (But the reason is
| that information and trust don't come for free.)
| nwallin wrote:
| > non-technical people
|
| It's also better for the technical people. If you self
| host and the DB goes down at 2am on a Sunday morning, all
| the technical people are gonna get woken up and they will
| be working on it until it's fixed.
|
| If us-east goes down a technical person will be woken up,
| they'll check downdetector.com, and they'll say "us-east
| is down, nothin' we can do" and go back to sleep.
| dilyevsky wrote:
| This excuse only works for one or maybe two such outages
| in most orgs
| TheNewsIsHere wrote:
| Just wait until you end up spending $100,000 on an awful
| implementation from a partner who pretends to understand
| your business need but delivers something that doesn't
| work.
|
| But perhaps I'm bitter from prior Salesforce experiences.
| alexfromapex wrote:
| Just don't try to build it from source haha. Compiling Postgres
| 18 with the PostGIS extension has been such a PITA because the
| topology component won't configure to not use the system
| /usr/bin/postgres and has given me a lot of grief. Finally got it
| fixed I think though.
| olavgg wrote:
| I actually always build PostgreSQL from source as I want 32kb
| block size as default. It makes ZFS compression more awesome.
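| For anyone wanting to try this, the knob is a configure-time
| flag (the prefix path is an example; note that block size is
| baked into a cluster, so a build with one size can't open a
| cluster initialized with another):

```shell
# Build PostgreSQL with 32kB blocks instead of the 8kB default.
./configure --with-blocksize=32 --prefix=/opt/pgsql-32k
make -j"$(nproc)"
make install

# A fresh initdb is required; existing 8kB clusters won't open.
/opt/pgsql-32k/bin/initdb -D /var/lib/pgsql-32k/data
```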
| wreath wrote:
| > Take AWS RDS. Under the hood, it's: standard Postgres
| compiled with some AWS-specific monitoring hooks; a custom
| backup system using EBS snapshots; automated configuration
| management via Chef/Puppet/Ansible; load balancers and
| connection pooling (PgBouncer); monitoring integration
| with CloudWatch; automated failover scripting.
|
| I didn't know RDS had PgBouncer under the hood, is this really
| accurate?
|
| The problem i find with RDS (and most other managed Postgres) is
| that they limit your options for how you want to design your
| database architecture. For instance, if write consistency is
| important and you want to support synchronous replication, there
| is no way to do this in RDS without either Aurora or having the
| readers in another AZ. The other issue is that you only have
| access to logical replication, because you don't have access to
| your WAL archive, so it makes moving off RDS much more difficult.
| mystifyingpoi wrote:
| > I didn't know RDS had PgBouncer under the hood
|
| I don't think it does. AWS has this feature under RDS Proxy,
| but it's an extra service and comes with extra cost (and a bit
| cumbersome to use in my opinion, it should have been designed
| as a checkbox, not an entire separate thing to maintain).
|
| Although, it technically has "load balancer", in form of a DNS
| entry that resolves to a random reader replica, if I recall
| correctly.
| jurschreuder wrote:
| I moved from AWS RDS to Scaleway RDS; it had the same effect on cost
| heyalexej wrote:
| Without stating actual numbers if not comfortable, what was the
| % savings one over the other? Happy with performance? Looking
| at potential of doing the same move.
| ergonaught wrote:
| > Self-hosting a database sounds terrifying.
|
| Is this actually the "common" view (in this context)?
|
| I've got decades with databases so I cannot even begin to fathom
| where such an attitude would develop, but, is it?
|
| Boggling.
| Nextgrid wrote:
| Over a decade of cloud provider propaganda achieves that. We
| appear to have lost the basic skill of operating a *nix
| machine, so anything even remotely close to that now sounds
| terrifying.
|
| You mean you need to SSH into the box? Horrifying!
| gnusi wrote:
| Can't agree more.
| FragrantRiver wrote:
| People really love jumping through hoops to avoid spending five
| dollars.
| TheRealPomax wrote:
| > When self-hosting makes sense: 1. If you're just starting out
| in software & want to get something working quickly [...]
|
| This is when you use SQLite, not Postgres. Easy enough to turn
| into Postgres later, nothing to set up. It already works. And
| backups are literally just "it's a file, incremental backup by
| your daily backups already covers this".
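| And for a consistent copy while the app is running (rather
| than copying the file mid-write), SQLite even has a
| one-liner; something like:

```shell
# VACUUM INTO (SQLite 3.27+) writes a consistent snapshot to a new
# file, safe to run while the database is in use.
sqlite3 app.db "VACUUM INTO 'backup-$(date +%F).db'"
```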
| slroger wrote:
| Great read. I moved my video sharing app from GCP to self-
| hosted on a beefy home server + Cloudflare for object
| storage and video streaming. Had been using Cloud SQL as my
| managed db and am now running Postgres on my own dedicated
| hardware. I was forced to move away from the cloud
| primarily because of the high cost of running video
| processing (not because Cloud SQL was bad), but have
| discovered self hosting the db isn't as difficult as it's
| made out to be. And there was a daily charge for keeping
| the DB hot which I don't have now. Will be moving to a
| rackmount server at a colo datacenter in about a month, so
| this was great to read and confirms my experience.
| darksaints wrote:
| Honestly, at this point I'm actually surprised that there aren't
| specialized Linux distributions for hosting Postgres. There are so
| many kernel-level and filesystem-level optimizations that can be
| done that significantly impact performance, and the ability to
| pare down all of the unneeded stuff in most distributions would
| make for a pretty compact and highly optimized image.
| mind-blight wrote:
| I think a big piece missing from these conversations is
| compliance frameworks and customer trust. If you're selling to
| enterprise customers or governments, they want to go through your
| stack, networking, security, audit logs, and access controls with
| a fine-toothed comb.
|
| Everything you do that isn't "normal" is another conversation you
| need to have with an auditor plus each customer. Those eat up a
| bunch of time and deals take longer to close.
|
| Right or wrong, these decisions make you less "serious" and
| therefore less credible in the eyes of many enterprise customers.
| You can get around that perception, but it takes work. Not
| hosting on one of the big 3 needs to be decided with that cost in
| mind
| yomismoaqui wrote:
| Now for the next step... just use SQLite (it's possible it will
| be enough for your case).
|
| Disclaimer: there's no silver bullet, yadda yadda. But SQLite in
| WAL mode and backups using Litestream have worked perfectly for
| me.
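| For anyone unfamiliar, WAL mode is a single pragma, and the
| setting persists in the database file:

```shell
# Switch the database to write-ahead logging; prints the new mode.
sqlite3 app.db 'PRAGMA journal_mode=WAL;'
```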
| esseph wrote:
| (This is very reductionist)
|
| A lot of this comes down to devs not understanding infrastructure
| and infrastructure components and the insane interplay and
| complexity. And they don't care! Apps, apps apps, developers,
| developers, developers!
|
| On the managerial side, it's often about deflection of
| responsibility for the Big Boss.
|
| It's not part of the app itself; it can be HARD, and if you're not
| familiar with things, then it's also scary! What if you mess up?
|
| (Most apps don't need the elasticity, or the bells and whistles,
| but you're paying for them even if you don't use them,
| indirectly.)
| evnp wrote:
| Enjoyed the article, and the "less can be more than you think"
| mindset in general.
|
| To the author - on Android Chrome I seem to inevitably load the
| page scrolled to the bottom, footnotes area. Scrolling up, back
| button, click link again has the same results - I start out
| seeing footnotes. Might be worth a look.
| lofaszvanitt wrote:
| I was on a severely restricted budget and self hosted everything
| for 15+ years, while the heavily used part of the database was on
| a RAM card. The RAM drive was soft raided to a hard drive pair
| which were 3Ware raid1 hdds, just in case, and also did a daily
| backup of the database. During all that time I never had any
| data loss and never had to restore anything from backup.
|
| The real downside wasn't technical: it was the constant
| background anxiety you had to learn to live with, since the
| hosted news sites were hammered by users. The dreaded SMS alerts
| saying the server was inaccessible (often due to ISP issues),
| and trips abroad meaning persuading one of your mates to keep an
| eye on things just in case, created a lot of unnecessary stress.
|
| AWS is quite good. It has everything you need and removes most of
| that operational burden, so the angst is much lower, but the
| pricing is problematic.
| reilly3000 wrote:
| I think we can get to the point where we have self-hosted agents
| that can manage db maintenance and recovery. There could be
| regular otel -> * -> Grafana -> ~PagerDuty -> you and TriageBot
| which would call specialists to gather state and orchestrate a
| response.
|
| Scripts could kick off health reports and trigger operations.
| Upgrades and recovery runbooks would be clearly defined and
| integration tested.
|
| It would empower personal sovereignty.
|
| Someone should make this in the open. Maybe it already exists,
| there are a lot of interesting agentops projects.
|
| If that worked 60% of the time and I had to figure out the rest,
| I'd self host that. I'd pay for 80%+.
| fullstackchris wrote:
| this is basically supabase. their entire stack (and product)
| can be hosted as a series of something like 10+ docker
| containers:
|
| https://supabase.com/docs/guides/self-hosting/docker
|
| however, like always, 'complexity has to live somewhere'. I
| doubt even Opus 4.5 could handle this. As soon as you get into
| database records themselves, context is going to blow up and
| you're going to have a bad time
| devin wrote:
| What irks me about so many comments in this thread is that they
| often totally ignore questions of scale, the shape of your
| workloads, staffing concerns, time constraints, stage of your
| business, whether you require extensions, etc.
|
| There is a whole raft of reasons why you might be a candidate for
| self-hosting, and a whole raft of reasons why not. This article
| is deeply reductive, and so are many of the comments.
| groundzeros2015 wrote:
| Engineers almost never consider any of those questions. And
| instead deploy the maximally expensive solution their boss will
| say ok to.
| RadiozRadioz wrote:
| Bad, short-sighted engineers will do that. An engineer who is
| not acting solely in the best interests of the wider
| organisation is a bad one. I would not want to work with a
| colleague who was so detached from reality that they wouldn't
| consider all GP's suggested facets. Engineering includes
| soft/business constraints as well as technical ones.
| groundzeros2015 wrote:
| We are saying similar things.
| RadiozRadioz wrote:
| Ah, you are implying that most engineers are bad, I see.
| In that case I agree too
| groundzeros2015 wrote:
| I don't know if they are bad engineers, but they have
| poor judgment.
| npn wrote:
| I bet you also believe the database is the single source of
| truth, right?
| WackyFighter wrote:
| I find it is the opposite way around. I come up with
| _<simple solution>_ based on open source tooling and I am
| forced instead to use _<expensive enterprise shite>_ which
| is 100% lock-in proprietary BS because _<large corporate tech
| company>_ is partnered and is subsidising development. This
| has been a near constant throughout my career.
| groundzeros2015 wrote:
| I agree, my statement is too coarse. There can be a lot of
| organizational pressure to produce complexity and it's not
| fair to just blame engineers.
|
| I've given a lot of engineers tasks only to find they are
| "setting up kubernetes cluster so I can setup automated
| deployments with a dashboard for ..."
|
| And similarly in QA I rarely see a cost/benefit
| consideration for a particular test or automation. Instead
| it's we are going to fully automate this and analyze every
| possible variable.
| roncesvalles wrote:
| I'd argue forget about Postgres completely. If you can shell out
| $90/month, the only database you should use is GCP Spanner (yes,
| this also means forget about any mega cloud other than GCP unless
| you're fine paying ingress and egress).
|
| And for small projects, SQLite, rqlite, or etcd.
|
| My logic is either the project is important enough that data
| durability matters to you and sees enough scale that loss of data
| durability would be a major pain in the ass to fix, or the
| project is not very big and you can tolerate some lost committed
| transactions.
|
| A consensus-replication-less non-embedded database has no place
| in 2025.
|
| This is assuming you have relational needs. For non-relational
| just use the native NoSQL in your cloud, e.g. DynamoDB in AWS.
| rikafurude21 wrote:
| You seem insanely miscalibrated. $90 gets you a dedicated
| server that covers most projects' needs. Data durability isn't
| some magic that only cloud providers can give you.
| roncesvalles wrote:
| If you can lose committed transactions in case of single node
| data failure, you don't have durability. Then it comes down
| to do you really care about durability.
| reillyse wrote:
| Disk read write performance is also orders of magnitude
| better/cheaper/faster.
| 999900000999 wrote:
| I've had to set up Postgres manually (before Docker, TBF) and
| it's best described as suffering.
|
| Things will go wrong. And it's all your fault. You can't just
| blame AWS.
|
| Also, are we changing the definition of self hosting? Self
| hosting on Digital Ocean?!
| bluepuma77 wrote:
| From my point of view the real challenge comes when you want high
| availability and need to setup a Postgres cluster.
|
| With MongoDB you simply create a replicaset and you are done.
|
| When planning a Postgres cluster, you need to understand
| replication options and potentially deal with Patroni. Zalando's
| Docker Spilo image is not really maintained; the way to go seems
| to be CloudNativePG, but that requires k8s.
|
| I still don't understand why there is no easy built-in Postgres
| cluster solution.
| geldedus wrote:
| Pros self-host their DB's
| conradfr wrote:
| I've been self hosting Postgresql for 12+ years at this point.
| Directly on bare metal then and now in a container with CapRover.
|
| I have a cron sh script to backup to S3 (used to be ftp).
|
| It's not "business grade" but it has also actually NEVER failed.
| Well once, but I think it was more the container or a swarm
| thing. I just destroyed and recreated it and it picked up the
| same volume fine.
|
| The biggest pain point is upgrading, as Postgres can't upgrade
| the data directory without the previous version's binaries
| installed. It's VERY annoying.
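The cron backup the commenter describes might look something like this sketch (database name, bucket, and schedule are all hypothetical):

```shell
# crontab entry: nightly logical dump in compressed custom format,
# streamed straight to S3 (retention handled by a bucket lifecycle rule)
0 3 * * * pg_dump -Fc mydb | aws s3 cp - s3://my-backups/mydb-$(date +\%F).dump
```

Note that `pg_dump -Fc` is a point-in-time logical snapshot; it is fine at this scale, but recovering to an arbitrary moment between dumps would need WAL archiving instead.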
| pellepelster wrote:
| I have spent quite some time over the past months and years
| deploying Postgres databases to non-hyperscaler environments.
|
| A popular choice for smaller workloads has always been the
| Hetzner cloud which I finally poured into a ready-to-use
| Terraform module
| https://pellepelster.github.io/solidblocks/hetzner/rds/index....
|
| Main focus here is a tested solution with automated backup and
| recovery, leaving out the complicated parts like clustering,
| prioritizing MTTR over MTBF.
|
| The naming of RDS is a little bit presumptuous I know, but it
| works quite well :-)
| sergiotapia wrote:
| Some fun math for you guys.
|
| I had a single API endpoint performing ~178 Postgres SQL queries.
        Setup                Latency/query   Total time
        -------------------------------------------------
        Same geo area        35ms            6.2s
        Same local network   4ms             712ms
        Same server          ~0ms            170ms
|
| This is with zero code changes, these time shavings are coming
| purely from network latency. A lot of devs lately are not even
| aware of latency costs coming from their service locations. It's
| crazy!
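The commenter's arithmetic checks out: with ~178 sequential queries, the per-query round-trip latency alone accounts for almost all of the totals quoted. A quick sanity check:

```python
# Time spent purely on network round trips for N sequential queries:
# total = N * per-query latency. Computation overlaps with the quoted
# table's "same geo area" and "same local network" rows.
queries = 178

for label, latency_ms in [("same geo area", 35.0),
                          ("same local network", 4.0)]:
    total_s = queries * latency_ms / 1000
    print(f"{label}: {total_s:.2f}s of pure latency")
```

178 × 35ms is about 6.23s and 178 × 4ms is 712ms, matching the quoted totals almost exactly, so nearly all of the 6.2s is network, not query execution.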
| Beltiras wrote:
| I've had my hair on fire because my app code shit the bed. I've
| never ever (throughout 15 years of using it in everything I do)
| had to even think about Postgres, and yes, I always set it up
| self-hosted. The only concern I've had is when I had to do
| migrations where I had to upgrade PG to fit with upgrades in the
| ORM database layer. Made for some interesting stepping-stone
| upgrades once in a while but mostly just careful sysadmining.
| fbuilesv wrote:
| I would have liked to read about the "high availability" that's
| mentioned a couple of times in the article; the _WAL
| Configuration_ section is not enough, and replication is
| expensive 'ish.
| raggi wrote:
| I have been self hosting a product on Postgres that serves GIS
| applications for 20 years and that has been upgraded through all
| of the various versions during that time. It has a near perfect
| uptime record modulo two hardware failures and short maintenance
| periods for final upgrade cutovers. The application has real
| traffic - the database is bigger than those at my day job.
| jeffbee wrote:
| The author's experience is trivial, so it indicates nothing.
| Anybody can set up a rack of postgresql servers and say it's
| great in year 2. All the hardware is under warranty and it still
| works anyway. There haven't been any major releases. The platform
| software is still "LTS". Nobody has needed to renegotiate the
| datacenter lease yet. So experience in year 2 tells you nothing.
| jbmsf wrote:
| I started in this industry before cloud was a thing. I did most
| of the things RDS does the hard way (except being able to
| dynamically increase memory on a running instance, that's magic
| to me). I do not want that responsibility, especially because I
| know how badly it turns out when it's one of a dozen (or dozens)
| of responsibilities asked of the team.
| fhcuvyxu wrote:
| > Self-hosting a database sounds terrifying.
|
| Is this really the state of our industry? Lol. Bunch of babies
| scared of the terminal.
| vbezhenar wrote:
| I don't feel like it's easy to self-host postgres.
|
| Here are my gripes:
|
| 1. Backups are super-important. Losing production data just is
| not an option. Postgres offers pg_dump, which is not the
| appropriate tool, so you should set up WAL archiving or something
| like that. This is complicated to do right.
|
| 2. Horizontal scalability with read replicas is hard to
| implement.
|
| 3. Tuning various postgres parameters is not a trivial task.
|
| 4. Upgrading major version is complicated.
|
| 5. You probably need to use something like pgbouncer.
|
| 6. The database is usually the most important piece of
| infrastructure, so it's especially painful when it fails.
|
| I guess it's not that hard once you've done it and have all the
| scripts and notes to look back on. But otherwise it's hard.
| Clicking a few buttons in a hoster's panel is much easier.
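For point 1, the core of WAL archiving is a few settings in postgresql.conf. A minimal sketch, with a hypothetical local archive path (in practice tools like pgBackRest or WAL-G manage the archive destination for you):

```ini
# postgresql.conf -- continuous archiving basics
wal_level = replica
archive_mode = on
# Copy each finished WAL segment somewhere durable; the command must
# exit non-zero if the file already exists, hence the leading test.
archive_command = 'test ! -f /backup/wal/%f && cp %p /backup/wal/%f'
```

A base backup (e.g. via pg_basebackup) plus a continuous WAL archive is what enables point-in-time recovery, which pg_dump alone cannot give you.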
| nrhrjrjrjtntbt wrote:
| Scaling to a different instance size is also easy on AWS.
|
| That said a self hosted DB on a dedicated Hetzner flies. It
| does things at the price that may save you time reworking your
| app to be more efficient on AWS for cost.
|
| So swings and roundabouts.
| tonyhart7 wrote:
| "all scripts and memory to look back. But otherwise it's hard.
| Clicking few buttons in hoster panel is much easier."
|
| so we need an open source way to do that; coolify/dokploy come
| to mind, and they do exactly that
|
| I would say 80% of your points wouldn't be hit at a certain
| scale; most applications grow and outgrow their tech stack, so
| you would replace those pieces anyway at some point
| npn wrote:
| wal archiving is piss easy. you can also just use basebackup.
| with postgres 17 it is easier than ever with incremental backup
| feature.
|
| you don't need horizontal scalability when a single server can
| have 384 real CPU cores, 6TB of RAM, petabytes of PCIe 5 SSD,
| and a 100Gbps NIC.
|
| for tuning postgres parameters, you can start by using
| pgtune.leopard.in.ua or pgconfig.org.
|
| upgrading major version is piss easy since postgres 10 or so.
| just a single command.
|
| you do not need pgbouncer if your database adapter library
| already provide the database pool functionality (most of them
| do).
|
| for me, managed databases also need the same amount of effort,
| due to shoddy documentation and garbage user interfaces (AWS,
| GCP, and Azure are all the same), not to mention they change
| all the time.
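The incremental backup feature the commenter mentions landed in Postgres 17. A rough sketch of the flow (paths are hypothetical, connection options omitted, and the server needs `summarize_wal = on`):

```shell
# Full base backup; its manifest records what was copied
pg_basebackup -D /backup/full

# Later: incremental backup containing only changes since the full one
pg_basebackup --incremental=/backup/full/backup_manifest -D /backup/incr1

# To restore, combine the chain back into a usable data directory
pg_combinebackup /backup/full /backup/incr1 -o /restore/data
```

Each incremental can itself serve as the base for the next one, so a typical rotation is one full backup plus a chain of small incrementals.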
| npn wrote:
| > I sleep just fine at night thank you.
|
| I also self-host my webapp for 4+ years. never have any trouble
| with databases.
|
| pg_basebackup and WAL archiving work wonders. And since I always
| pull the database (the backup version) for local development, the
| backup is constantly verified, too.
| taylorsatula wrote:
| Self-hosting Postgres is so incredibly easy. People are under
| this strange spell that they need to use an ORM or always reach
| for SQLite when it's trivially easy to write raw SQL. The syntax
| was designed so lithium'd out secretaries were able to write
| queries on a punchcard. Postgres has so many nice lil features.
| kwillets wrote:
| Over time I've realized that the best abstraction for managing a
| computer is a computer.
| nottorp wrote:
| > These settings tell Postgres that random reads are almost as
| fast as sequential reads on NVMe drives, which dramatically
| improves query planning.
|
| Interesting. Whoever wrote
|
| https://news.ycombinator.com/item?id=46334990
|
| didn't seem to be aware of that.
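The settings the quoted article is presumably referring to look something like this; the values are common suggestions for NVMe storage, not universal truths, and are worth benchmarking for your own workload:

```ini
# postgresql.conf -- tell the planner that random I/O on NVMe is
# nearly as cheap as sequential I/O. The default random_page_cost
# of 4.0 was tuned for spinning disks and biases the planner away
# from index scans on fast SSDs.
random_page_cost = 1.1
effective_io_concurrency = 200
```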
| drchaim wrote:
| I've been managing a 100+ GB PostgreSQL database for years. Each
| two years I upgrade the VPS for the size, and also the db and os
| version. The app is in the same VPS as the DB. A 2 hour window
| each two years is ok for the use case. No regrets.
| lukaslalinsky wrote:
| I'm not a cloud-hosting fan, but comparing RDS to a single
| instance DB seems crazy to me. Even for a hobby project, I
| couldn't accept losing data since the last snapshot. If you are
| going to self-host PostgreSQL in production, make sure you have
| at least some knowledge of how to set up streaming replication,
| and have monitoring in place to make sure replication works.
| Ideally, use something like Patroni for automatic failover. I'm
| saying this as someone running fairly large self-hosted HA
| PostgreSQL databases in production.
| tgtweak wrote:
| RDS is not, by default, multi-instance and multi-region or
| fault tolerant at all - you choose all of that in your instance
| config. The amount of single-instance, single-region, zero-backup
| RDS setups I've seen in the wild is honestly concerning. Do devs
| think an RDS instance on its own, without explicit configuration,
| is fault tolerant and backed up? If you have an EC2 instance with
| EBS and auto-restart you have almost identical fault tolerance
| (yes, there are some slight nuances on RDS regarding recovery
| following a failure).
|
| I just find that assumption a bit dangerous. Setting all this up
| on RDS is easy, but it's not on by default.
| jillesvangurp wrote:
| There are a couple of things that are being glossed over:
|
| Hardware failures and automated fail overs. That's a thing AWS
| and other managed hosting solutions do. Hardware will eventually
| fail of course. In AWS this would be a non event. It will fail
| over, a replacement spins up, etc. Same with upgrades, and other
| stuff.
|
| Configuration complexity. The author casually outlines a lot of
| fairly complex design involving all sorts of configuration
| tweaks, load balancing, etc. That implies skills most teams don't
| have. I know enough to know that I have quite a bit of reading up
| to do if I ever were to decide to self host postgresql. Many
| people would make bad assumptions about things being fine out of
| the box because they are not experienced postgresql DBAs.
|
| Vacations/holidays/sick days. Databases may go down when it's not
| convenient to you. To mitigate that, you need to have several
| colleagues that are equally qualified to fix things when they go
| down while you are away from keyboard. If you haven't covered
| that risk, you are taking a bit of risk. In a normal company, at
| least 3-4 people would be a good minimum. If you are just
| measuring your own time, you are not being honest or not being as
| diligent as you should be. Either it's a risk you are covering at
| a cost or a risk you are ignoring.
|
| With managed hosting, covering all of that is what you pay for.
| You are right that there are still failure modes beyond that that
| need covering. But an honest assessment of the time you, and your
| team, put in for this adds up really quickly.
|
| Whatever the reasons you are self hosting, cost is probably a
| poor one.
| vitabaks wrote:
| Just use Autobase for PostgreSQL
|
| https://github.com/vitabaks/autobase
|
| It automates the deployment and management of highly available
| PostgreSQL clusters in production environments. This solution is
| tailored for use on dedicated physical servers, virtual machines,
| and within both on-premises and cloud-based infrastructures.
| banditelol wrote:
| One of the things that made me think twice for self hosting
| postgres is securing the OS I host PG on. Any recommendation
| where to start for that?
| danparsonson wrote:
| Can you get away without exposing it to the internet? Firewall
| it off altogether, or just open the address of a specific
| machine that needs access to it?
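A common baseline for the parent's question, assuming the app server is the only client: don't expose Postgres publicly at all, and restrict host authentication to that one peer. A sketch with hypothetical private addresses and names:

```ini
# postgresql.conf -- bind only to loopback and the private interface
listen_addresses = 'localhost, 10.0.0.5'

# pg_hba.conf -- allow only the app server, password-authenticated:
#   TYPE  DATABASE  USER     ADDRESS       METHOD
#   host  appdb     appuser  10.0.0.7/32   scram-sha-256
```

An OS-level firewall rule restricting port 5432 to the same source address gives you a second, independent layer on top of pg_hba.conf.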
| PunchyHamster wrote:
| Cooking up the RDS equivalent is a reasonable amount of work,
| and takes a pretty big amount of knowledge (it's easy to make a
| failover setup with lower uptime than "just a single VM" if you
| don't get everything right)
|
| ... but you can do a _lot_ with just "a single VM and robust
| backup". PostgreSQL restore is pretty fast, and if you automated
| deployment you can start with it in minutes, so if your service
| can survive 30 minutes of downtime once every 3 years while the
| DB reloads, "downgrading" to "a single cloud VM" or "a single VM
| on your own hardware" might not be a big deal.
| jgalt212 wrote:
| Does anyone offer a managed database service where the database
| and your application server live on the same box? Until I can
| get the latency advantages of such a setup from a managed
| solution, we've found the latency just too high. We are already
| spending too much time batching or vectorizing database reads.
| yoan9224 wrote:
| I've been self-hosting Postgres for production apps for about 6
| years now. The "3 AM database emergency" fear is vastly overblown
| in my experience.
|
| In reality, most database issues are slow queries or connection
| pool exhaustion - things that happen during business hours when
| you're actively developing. The actual database process itself
| just runs. I've had more AWS outages wake me up than Postgres
| crashes.
|
| The cost savings are real, but the bigger win for me is having
| complete visibility. When something does go wrong, I can SSH in
| and see exactly what's happening. With RDS you're often stuck
| waiting for support while your users are affected.
|
| That said, you do need solid backups and monitoring from day one.
| pgBackRest and pgBouncer are your friends.
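For the connection-pool-exhaustion case the commenter mentions, a minimal pgbouncer.ini might look like this (database, user, and paths are placeholders):

```ini
; pgbouncer.ini -- transaction pooling in front of a single Postgres
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
```

The app then connects to port 6432 instead of 5432. One caveat: transaction pooling breaks session-level features such as advisory locks, so check your driver's settings before switching modes.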
| jsight wrote:
| I often find it sad how many things that we did, almost without
| thinking about them, that are considered hard today. Take a
| stroll through this thread and you will find out that everything
| from RAID to basic configuration management are ultrahard things
| that will lead you to having a bus factor of 1.
|
| What went so wrong during the past 25 years?
| cosmodust wrote:
| I would suggest that if you do host your database yourself, you
| take the data seriously. A few easy options are using a multi
| zonal disk [1] with scheduled automatic snapshots [2].
|
| [1] https://docs.cloud.google.com/compute/docs/disks/hd-
| types/hy... [2]
| https://docs.cloud.google.com/compute/docs/disks/create-snap...
| cube00 wrote:
| Scheduled automatic snapshots are not the kind of consistent
| snapshots you need for a filesystem based backup.
| anonu wrote:
| does self-hosting on EC2 instance count?
| tgtweak wrote:
| As someone who self hosted mysql (in complex master/slave setups)
| then mariadb, memsql, mongo and pgsql on bare metal, virtual
| machines then containers for almost 2 decades at this point...
| you can self host with very little downtime and the only real
| challenge is upgrade path and getting replication right.
|
| Now with pgbouncer (or whatever other flavor of sql-aware proxy
| you fancy) you can greatly reduce the complexity involved in
| managing conventionally complex read/write routing and sharding
| to various replicas to enable resilient, scalable production-
| grade database setups on your own infra. Throw in the fact that
| copy-on-write and snapshotting is baked into most storage today
| and it becomes - at least compared to 20 years ago - trivial to
| set up DRS as well. Others have mentioned pgBackRest and that
| further enforces the ease with which you can set up these
| traditionally-complex setups.
|
| Beyond those two significant features there aren't many other
| reasons you'd need to go with hosted/managed pgsql. I've yet to
| find a managed/hosted database solution that doesn't have some
| level of downtime to apply updates and patches so even if you go
| fully hosted/managed it's not a silver bullet. The cost of
| managed DB is also several times that of the actual hardware it's
| running on, so there is a cost factor involved as well.
|
| I guess all this to say it's never been a better time to self-
| host your database and the learning curve is as shallow as it's
| ever been. Add to all of this that any garden-variety LLM can
| hand-hold you through the setup and management, including any
| issues you might encounter on the way.
| jpgvm wrote:
| Beyond the usual points there are some other important factors to
| consider self-hosting PG:
|
| 1. Access to any extension you want and, importantly, the
| ability to create your own extensions.
|
| 2. Being able to run any version you want, including being able
| to adopt patches ahead of releases.
|
| 3. Ability to tune for maximum performance based on the kind of
| workload you have. If it's massively parallel you can fill the
| box with huge amounts of memory and screaming fast SSDs, if it's
| very compute heavy you can spec the box with really tall cores
| etc.
|
| Self hosting is rarely about cost, it's usually about control for
| me. Being able to replace complex application logic/types with a
| nice custom pgrx extension can save massive amounts of time.
| Similarly, using a custom index access method can unlock a step
| change in performance unachievable without some non-PG solution
| that would compromise on simplicity by forcing a second data
| store.
| the-anarchist wrote:
| I generally agree with the author, however, there are a handful
| of relatively prominent, recent examples (eg [1]) that many
| admins might find scary enough to prefer a hosted solution.
|
| [1]: https://matrix.org/blog/2025/07/postgres-corruption-
| postmort...
| polskibus wrote:
| What's the SOTA for on-prem Postgres, in terms of point-in-time-
| recovery? are there any well-tested tools for it?
| ttkciar wrote:
| Huh. I thought hosting one's own databases was _still_ the norm.
| Guess I'm just stuck in the past, or don't consume cloud vendor
| marketing, or something.
|
| Glad my employer is still one of the sane ones.
___________________________________________________________________
(page generated 2025-12-21 23:02 UTC)