[HN Gopher] Moving from AWS to Bare-Metal saved us $230k per year
___________________________________________________________________
Moving from AWS to Bare-Metal saved us $230k per year
Author : devneelpatel
Score : 171 points
Date : 2023-11-16 19:54 UTC (3 hours ago)
(HTM) web link (blog.oneuptime.com)
(TXT) w3m dump (blog.oneuptime.com)
| not_your_vase wrote:
| How are such savings not obvious after putting the amounts in an
| Excel sheet, and spending an hour over it (and most importantly
| doing this _before_ spending half a million/year on AWS)?
| Spivak wrote:
| I would be surprised if people didn't know that coloing was
| cheaper. I certainly evangelize it for workloads that are
| particularly expensive on AWS.
|
| It's not entirely without downsides though and I think many
| shops are willing to pay more for a different set of them. It
| is incredibly rewarding work though. You get to do magic.
|
| * You do need more experienced people; there's no way around it,
| and the skills are hard to come by sometimes. We spent probably
| 3 years looking to hire a senior DBA before we found one.
| Networking people are also unicorns.
|
| * Having to deal with the full stack is a lot more work,
| and needing to manage IRL hardware is a PITA. I hated driving 50
| miles to swap some hard drives. Rather than using those nice
| cloud APIs you are also on the other side implementing them.
| And all the VM management software sucks in its own unique
| way.
|
| * Storage will make you lose sleep. Ceph is a wonder of the
| technological world but it will also follow you in a dark
| alleyway and ruin your sleep.
|
| * Building true redundancy is harder than you think it should
| be. "What if your ceph cluster dies?" "What if your ESXi shits
| the bed?" "What if Consul?" Setting things up so that you don't
| accidentally have single points of failure is tedious work.
|
| * You have to constantly be looking at your horizons. We made a
| stupid little doomsday clock web app where we put all the "in
| the next x days/weeks/months we have to do x or we'll have an
| outage." Because it will take more time than you think it
| should to buy equipment.
| theLiminator wrote:
| It's great when you don't need instant elasticity and traffic
| is very predictable.
|
| I think it's very useful for batch processing, especially
| owning a GPU cluster could be great for ML startups.
|
| Hybrid cloud + bare metal is probably the way to go (though
| that does incur the complexity of dealing with both, which is
| also hard).
| mschuster91 wrote:
| > and most importantly doing this before spending half a
| million/year on AWS
|
| AWS is... incentivizing scope creep, to put it mildly. In ye
| olde days, you had your ESXi blades, and if you were lucky some
| decent storage attached to it, and you had to make do with what
| you had - if you needed more resources, you'd have to go
| through the entire usual corporate bullshit. Get quotes from at
| least three comparable vendors, line up contract details, POs,
| get approval from multiple levels...
|
| Now? Who cares if you spin up entire servers worth of instances
| for feature branch environments, and look, isn't that new AI
| chatbot something we could use... you get the idea. The reason
| why cloud (not just AWS) is so popular in corporate hellscapes
| is because it eliminates a lot of the busybody impeders. Shadow
| IT as a Service.
| tqi wrote:
| Those busybodies are also there to keep rogue engineers from
| burning money on useless features (like AI chat bots) that
| only serve to bolster their promo packet...
| mschuster91 wrote:
| That depends on how the incentive structures for your
| corporate purchase department are set up - and there's
| really a _ton_ of variance there, with results ranging from
| everyone being happy in the best case to frustrated
| employees quitting in droves or the company getting burned
| at employer rating portals.
| tqi wrote:
| > That depends on how the incentive structures for your
| corporate purchase department are set up
|
| Sure, but that seems orthogonal to the pros and cons of
| having more layers of oversight (busybodies, to use your
| term) on infra spend. Badly run companies are badly run,
| and I don't think having the increased flexibility that
| comes from cloud providers changes that.
| withinboredom wrote:
| I literally started laughing at this. I worked at a bare-
| metal shop fairly recently and a guy on my team used a
| corporate credit card to set up an AWS account and create
| an AI chatbot.
|
| The dude nearly got fired, but your comment hit the spot.
| You made my night, thank you.
| threeseed wrote:
| > keep rogue engineers from burning money on useless
| features (like AI chat bots)
|
| As someone who has worked on an AI chat bot I can assure
| you it does not come from engineers.
|
| It's coming from the CFO who is salivating at the thought
| of downsizing their customer support team.
| ldargin wrote:
| Bare-metal solutions save money, but are costly in terms of
| development time and lost agility. Basically, they have much
| more friction.
| withinboredom wrote:
| huh, say wut?
|
| I guess before Amazon invented "the cloud" there weren't any
| software companies...
| threeseed wrote:
| AWS isn't just IaaS; it's also PaaS.
|
| So it's a fact that for most use cases it will be
| significantly easier to manage than bare metal.
|
| Because much of it is being managed for you, e.g. object
| stores, databases, etc.
| withinboredom wrote:
| Setting up k3s: 2 hours
|
| Setting up Garage for obj store: 1 hour.
|
| Setting up Longhorn for storage: .25 hour.
|
| Setting up db: 30 minutes.
|
| Setting up Cilium with a pool of ips to use as a lb: 45
| mins.
|
| All in: ~5 hours and I'm ready to deploy and spending 300
| bucks a month, just renting bare metal servers.
|
| AWS, for far less compute and same capabilities:
| approximately 800-1000 bucks a month, and takes about 3
| hours -- we aren't even counting egress costs yet.
|
| So, for two extra hours on your initial setup, you can
| save a ridiculous amount of money. Maintenance is
| actually less work than AWS too.
|
| (source: I'm working on a youtube video)
| threeseed wrote:
| You should stick to making YouTube videos then.
|
| Because there is a world of difference between installing
| some software and making it robust enough to support a
| multi-million dollar business. I would be surprised if
| you could set up and test a proper highly-available database
| with automated backup in < 30 mins.
| RhodesianHunter wrote:
| Because the amount of time your engineers will spend
| maintaining your little herd of pet servers, and the
| opportunity cost of not being able to spin up _managed service
| X_ to try an experiment, are not measurable.
| yjftsjthsd-h wrote:
| > maintaining your little herd of pet servers
|
| You know bare metal can be an automated fleet of cattle too,
| right?
| withinboredom wrote:
| Have you ever heard of PXE boot? You should check out
| Harvester made by Rancher (IIRC). Basically, manage bare
| metal machines using standard k8s tooling.
| dilyevsky wrote:
| Cloud is putting spend decisions into individual EMs' or even
| devs' hands. With bare metal one team ("infra" or whatever) will
| own all compute, and thus spend decisions need to be justified
| by EMs, which they usually don't like ;)
| hipadev23 wrote:
| They were paying on-demand EC2 prices and reserved instances
| alone would save them ~35%, a savings plan even more, which would
| apply to egress and storage costs too. Anyway, they're still
| saving a lot more (~55%), but it's not nearly as egregious of a
| difference.
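|
| A rough back-of-the-envelope sketch of that math in Python. The
| $38k/month and $230k/year figures come from the article, and the
| ~35% reserved-instance discount is the parent's estimate; none of
| this is an official pricing calculation:
|
|   # Sketch only: figures from the article/thread, not a quote tool.
|   on_demand_monthly = 38_000        # article's stated AWS bill
|   ri_discount = 0.35                # parent's assumed RI discount
|   claimed_savings_yearly = 230_000  # article's headline number
|
|   on_demand_yearly = on_demand_monthly * 12          # ~$456k
|   ri_yearly = on_demand_yearly * (1 - ri_discount)   # ~$296k
|   bare_metal_yearly = on_demand_yearly - claimed_savings_yearly
|
|   print(f"vs on-demand: {claimed_savings_yearly/on_demand_yearly:.0%}")
|   print(f"vs RIs: {1 - bare_metal_yearly/ri_yearly:.0%}")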
| RhodesianHunter wrote:
| Right. Now what's the developer man-hour cost of the move?
|
| Unless their product is pretty static and not seeing much
| development, they're probably in the negative.
| manvillej wrote:
| when we don't optimize for cloud and look at it from this
| angle and squint, it looks like we're saving money!
| zer00eyz wrote:
| It's not a very good question: they still have AWS
| compatibility as their failover/backup (should be live but
| that's another matter...)
|
| What's capex vs opex now? That's $150k of depreciable assets,
| probably ones that will be available for use long after all
| the current staff depart.
|
| Everyone forgets what WhatsApp did with few engineers and
| less hardware, there's probably more than enough room for
| them to grow, and they have space to increase capacity.
|
| The cloud has a place, but candidly so does a Datacenter and
| ownership.
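|
| To make the capex-vs-opex point concrete, a minimal straight-line
| depreciation sketch. The $150k figure is this comment's guess, not
| a number from the article, and colo fees, bandwidth and labor are
| deliberately left out:
|
|   # Straight-line depreciation of an assumed $150k of capex over
|   # the article's 5-year amortization window.
|   capex = 150_000
|   years = 5
|   per_year = capex / years          # $30,000/yr
|   per_month = per_year / 12         # $2,500/mo
|   print(f"${per_year:,.0f}/yr, ${per_month:,.0f}/mo of depreciation")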
| cypress66 wrote:
| Going from AWS to something like Hetzner would be most of the
| way there probably.
| hipadev23 wrote:
| Hetzner in particular is a disaster waiting to happen, but
| yes I agree with the sentiment. OVH doesn't arbitrarily shut
| off your servers or close your account without warning.
| rgrieselhuber wrote:
| We've been on both Hetzner and OVH for years and have never
| had this happen.
|
| The move does cost money, once. Then the savings over years
| add up to a lot. We made this change more than 10 years ago
| and it was one of the best decisions we ever made.
| nonsens3 wrote:
| Hetzner randomly shuts down one of my servers every 2-3
| months.
| codenesium wrote:
| Nice of them to test your failover for you.
| rgrieselhuber wrote:
| Yeah you do have to have redundancy built in, but we
| don't get random shutdowns.
| razemio wrote:
| I am sorry, what? I have been with Hetzner for over 10 years,
| hosting multiple servers without issue. To my knowledge there
| has never been a shutdown without notice on bare metal
| servers, and even announced ones are rare - like once every 2
| years.
| ptico wrote:
| Hetzner suspended the account of a non-profit org I
| voluntarily supported, without explaining the reason or
| giving us any way to take our data out. The issue was
| resolved only after we took it public. Even then they tried
| at first to pretend we were not actually their customers.
| riku_iki wrote:
| I had the same issue, sent them a ticket, they swapped the
| server, and it has worked fine since then.
| dvfjsdhgfv wrote:
| I've been using Hetzner for years and what happens every
| 3-4 years is that a disk dies. So I inform them, they
| usually replace it within an hour and I rebuild the
| array, that's all.
|
| Recently I've been moving most projects to Hetzner Cloud,
| it's a pleasure to work with and pleasantly inexpensive.
| It's a pity they didn't start it 10 years earlier.
| forty wrote:
| I think OVH might let your server burn and the backup which
| was stored next to it (of course) with it ;)
| rgrieselhuber wrote:
| Definitely don't keep all your data in one region only.
| For object storage, I prefer Backblaze unless you need
| high throughput.
| FpUser wrote:
| Using both Hetzner and OVH for years and not a single
| problem, be it technical or administrative. Does not mean it
| never happens, but this is just my experience.
| dvfjsdhgfv wrote:
| > Hetzner in particular is a disaster waiting to happen
|
| Why would you spread FUD? They have several datacenters in
| different locations, and even if they were as incompetent
| as OVH (they are not)[0], the destruction of one datacenter
| doesn't mean you will lose data stored in the remaining
| ones.
|
| [0] I bet OVH is also way smarter than they were before the
| fire.
| Karrot_Kream wrote:
| After that 35% savings, they ended up saving about a US mid
| level engineer's salary, sans benefits. Hope the time needed
| for the migration was worth it.
| alberth wrote:
| I'm sure they are also getting better performance.
|
| Not sure how to factor that $ into the equation.
| flkenosad wrote:
| Also, I'd imagine most companies can fill unused compute
| with long-running batch jobs so you're getting way more
| bang for your buck. It's really egregious what these clouds
| are charging.
| darkwater wrote:
| To get real savings with a complex enough project you will
| need one or more FTE salaries just to stay on top of AWS
| spending optimizations
| baz00 wrote:
| Plus...
|
| 2x FTEs to manage the AWS support tickets
|
| 3x FTE to understand the differences between the AWS
| bundled products and open source stuff which you can't get
| close enough to the config for so that you can actually use
| it as intended.
|
| 3x Security folk to work out how to manage the tangle of
| multiple accounts, networks, WAF and compliance overheads
|
| 3x FTEs to write HCL and YAML to support the cloud.
|
| 2x Solution architects to try and rebuild everything cloud
| native and get stuck in some technicality inside step
| functions for 9 months and achieve nothing.
|
| 1x extra manager to sit in meetings with AWS once a week
| and bitch about the crap support, the broken OSS bundled
| stuff and work out weird network issues.
|
| 1x cloud janitor to clean up all the dirt left around the
| cluster burning cash.
|
| ---
|
| Footnote: Was this to free us or enslave us?
| hotpotamus wrote:
| > Footnote: Was this to free us or enslave us?
|
| I assume whichever provides more margin to Jeff Bezos.
| gymbeaux wrote:
| Our experience hasn't been THAT bad but we did waste a
| lot of time in weekly meetings with AWS "solutions
| architects" who knew next to nothing about AWS aside from
| a shallow, salesman-like understanding. They make around
| $150k too, by the way. I tried to apply to be one, but
| AWS wants someone with more sales experience and they
| don't really care about my AWS certs
| baz00 wrote:
| As an AWS Solution Architect (independent untethered to
| Bezos) I resent that comment. I know slightly more than
| next to nothing about AWS and I can Google something and
| come up with something convincing and sell it to you in a
| couple of minutes!
| makeitdouble wrote:
| Getting a bare metal stack has interesting side effects on
| how they can plan future projects.
|
| One that's not immediately obvious is keeping experienced
| infra engineers on staff who bring their expertise to
| designing future projects.
|
| Another is the option to tackle projects in ways that would
| be too costly if they were still on AWS (e.g. ML training,
| stuff with long and heavy CPU load).
| flkenosad wrote:
| Yep and hardware is only getting cheaper. Better to just
| buy more drives/chips when you need them.
| meowface wrote:
| A possible middle-ground option is to use a cheaper cloud
| provider like Digital Ocean. You don't need dedicated
| infrastructure engineers and you still get a lot of the
| same benefits as AWS, including some API compatibility
| (Digital Ocean's S3-alike, and many others', support S3's
| API).
|
| Perhaps there are some good reasons to not choose such a
| provider once you reach a certain scale, but they now have
| their own versions of a lot of different AWS services, and
| they're more than sufficient for my own relatively small
| scale.
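|
| One concrete example of that API compatibility, as a hedged
| sketch: boto3 (the AWS SDK) can be pointed at any S3-compatible
| provider by overriding endpoint_url. The endpoint, credentials and
| bucket below are placeholders, not anything from the thread:
|
|   import boto3
|
|   # Same client code as for S3; only the endpoint changes.
|   s3 = boto3.client(
|       "s3",
|       region_name="nyc3",
|       endpoint_url="https://nyc3.digitaloceanspaces.com",
|       aws_access_key_id="SPACES_KEY",          # placeholder
|       aws_secret_access_key="SPACES_SECRET",   # placeholder
|   )
|   s3.put_object(Bucket="example-bucket", Key="hello.txt",
|                 Body=b"hello from an S3-compatible store")
|   print(s3.list_objects_v2(Bucket="example-bucket")["KeyCount"])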
| gymbeaux wrote:
| That's the niche DigitalOcean is trying to carve out.
| I've always loved and preferred their UI/UX to that of
| AWS or Azure. No experience with the CLI but I would
| guess it's not any worse than AWS CLI.
| efitz wrote:
| I was thinking the same thing. If the migration took more
| than one man-year then they lost money.
|
| Also what happens at hardware end-of-life?
|
| Also what happens if they encounter an explosive growth or
| burst usage event?
|
| And did their current staffing include enough headcount to
| maintain the physical machines or did they have to hire for
| that?
|
| Etc etc. Cloud is not cheap but if you are honest about TCO
| then the savings likely are WAY less than they imply in the
| article.
| flkenosad wrote:
| > If the migration took more than one man-year then they
| lost money.
|
| Your math is incorrect. The savings are per year. The job
| gets done once.
|
| > Also what happens at hardware end-of-life?
|
| You buy more hardware. A drive should last a few years on
| average at least.
|
| > Also what happens if they encounter an explosive growth
| or burst usage event?
|
| Short term, clouds are always available to handle extra
| compute. It's not a bad idea to use a cloud load-balancing
| system anyway to handle spam or caching.
|
| But also, you can buy hardware from Amazon and get it the
| next day with Prime.
|
| > And did their current staffing include enough headcount
| to maintain the physical machines or did they have to hire
| for that?
|
| I'm sure any team capable of building complex software at
| scale is capable of running a few servers on prem. I'm sure
| there's more than a few programmers on most teams that have
| homelabs they muck around with.
|
| > Etc etc.
|
| I'd love to hear more arguments.
| awslol wrote:
| They also saved the salaries of the team whose job was doing
| nothing but chasing misplaced spaces in YAML configuration
| files. Cloud infrastructure doesn't just appear out of thin
| air. You have to hire people to describe what you want to do.
| And with the complexity mess we're in today it's not at all
| clear which takes more effort.
| quickthrower2 wrote:
| 100% this. Cloud is a hard slog too. A different slog
| though. We spend a lot of time chasing Azure deprecations.
| For example, they are retiring one type of MySQL instance in
| favor of a more "modern" one, but from the end user's point
| of view it is still a MySQL server!
| gymbeaux wrote:
| Exactly. Last job I worked at there was always an issue
| with the YAML... and as a "mere" software engineer, I had
| to wait for offshore DevOps to fix, but that's another
| issue.
| icedchai wrote:
| To manage a large fleet of physical servers, you need
| similar ops skills. You're not going to configure all those
| systems by hand, are you?
| awslol wrote:
| Depends on the size of the fleet.
|
| If you're using fewer than a dozen servers, manual
| configuration is simpler. Depending on what you're doing
| that could mean serving a hundred million customers.
| Which is plenty for most businesses.
| dilyevsky wrote:
| I broke the rules and read the article first:
|
| > In the context of AWS, the expenses associated with
| employing AWS administrators often exceed those of Linux on-
| premises server administrators. This represents an additional
| cost-saving benefit when shifting to bare metal. With today's
| servers being both efficient and reliable, the need for
| "management" has significantly decreased.
|
| I've also never seen an eng org where a substantial part of it
| didn't do useless projects that never amount to anything.
| rewmie wrote:
| I get the point that they tried to make, but this
| comparison between "AWS administrators" and "Linux on-
| premises server administrators" is beyond apple-and-oranges
| and is actually completely meaningless.
|
| A team does not use AWS because it provides compute. AWS,
| even when using barebones EC2 instances, actually means on-
| demand provisioning of computational resources with the
| help of infrastructure-as-code services. A random developer
| logs into his AWS console, clicks a few buttons, and he's
| already running a fully instrumented service with logging
| and metrics a click away. He can click another button and
| delete/shut down everything. He can click on a button again
| and deploy the same application in multiple continents with
| static files provided through a global CDN, deployed with a
| dedicated pipeline. He clicks on another button again and
| everything is shut down again.
|
| How do you pull that off with "Linux on-premises server
| administrators"? You don't.
|
| At most, you can get your Linux server administrators to
| manage their hardware with something like OpenStack, but
| they would be playing the role of the AWS engineers that
| your "AWS administrators" don't even know exist. However,
| anyone who works with AWS only works on the abstraction
| layers above that which a "Linux on premises administrator"
| works on.
| dilyevsky wrote:
| > A random developer logs into his AWS console, clicks a
| few buttons, and he's already running a fully
| instrumented service with logging and metrics a click
| away...
|
| This only works that way for very small spend orgs that
| haven't implemented SOC 2 or the like. If that's what
| you're doing then probably should stay away from
| datacenter, sure
| spamizbad wrote:
| Going to be honest: If your AWS spend is well over 6
| figures and you're still click-ops-ing most things
| you're:
|
| 1) not as reliable as you think you are
|
| 2) probably wasting gobs of money somewhere
| awslol wrote:
| You just log into the server...
|
| Not everything is warehouse scale. You can serve tens of
| millions of customers from a single machine.
| baz00 wrote:
| This is the voice of someone who has never actually ended
| up with a big AWS estate.
|
| You don't click to start and stop. You start with someone
| negotiating credits and reserved instance costs with AWS.
| Then you have to keep up with spending commitments.
| Sometimes clicking stop will cost you more than leaving
| shit running.
|
| It gets to the point where $50k a month is
| indistinguishable from the noise floor of spending.
| ygjb wrote:
| Yeah, that's part of it. The other part is that you can
| move stuff that is working, and working well, into on-
| prem (or colo) if it is designed well and portable. If
| everything is running in containers, and orchestration is
| already configured, and you aren't using AWS or cloud
| provider specific features, portability is not super
| painful (modulo the complexity of your app, and the
| volume of data you need to migrate). Clearly this team
| did the assessment, and the savings they achieved by
| moving to on-prem was worthwhile.
|
| That doesn't preclude continuing to use AWS and other
| cloud service as a click-ops driven platform for
| experimentation, and requiring that anything targeting
| production be refactored to run in the bare-metal
| environment. At least two shops I worked at
| previously have used that as a recurring model (one
| focusing on AWS, the other on GCP) for stuff that was in
| prototyping or development.
| oxfordmale wrote:
| That is having your cake and eating it too. AWS administrators
| don't do the same job as on-prem administrators.
| dheera wrote:
| They probably need to now hire 24/7 security to watch the
| bare metal if they're serious about it, so not sure about
| that engineer
| dilyevsky wrote:
| Onsite security is offered by the colo provider. You can
| also pay for locked cabinets with cameras and anti-
| tampering, or even a completely caged-off section, depending
| on your security requirements.
| baz00 wrote:
| If we saved 35% that would pay for 20 FTEs.
|
| Not that we'd need them as we wouldn't have to write as much
| HCL.
| throw555chip wrote:
| Bandwidth would need to be compared and considered between EC2
| and what they were able to negotiate for bare metal co-
| location.
| bauruine wrote:
| Bandwidth is about 2 orders of magnitude less on non-cloud
| even without any negotiation or commitment. How much do you
| have to commit for e.g. CloudFront to pay 2 orders of
| magnitude less than their list price of $0.02 per GB?
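|
| A rough sketch of how that gap shows up, with assumed numbers: a
| flat-priced 1 Gbps port (the $80/month and 50% utilization are
| assumptions, not quotes) versus the $0.02/GB CloudFront list price
| above and the ~$0.08/GB EC2 egress rate mentioned elsewhere in the
| thread. The exact multiple depends heavily on how full you keep
| the pipe:
|
|   # Assumed flat port price and utilization; per-GB prices are the
|   # figures cited in this thread.
|   port_cost = 80        # $/month for an unmetered 1 Gbps port
|   utilization = 0.5     # average utilization of that port
|   gb_per_month = 1.0 * utilization * 30 * 24 * 3600 / 8
|
|   flat_per_gb = port_cost / gb_per_month
|   print(f"flat port: ${flat_per_gb:.5f}/GB at {gb_per_month:,.0f} GB")
|   for label, price in [("CloudFront list", 0.02), ("EC2 egress", 0.08)]:
|       print(f"{label}: {price / flat_per_gb:.0f}x more per GB")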
| threeseed wrote:
| Also, for an uptime site, I'm surprised they didn't use the m7g
| instance type.
|
| Would've saved another ~30% for minimal difference in
| performance.
|
| For me this doesn't look like a sensible move especially since
| with AWS EKS you have a managed, highly-available, multi-AZ
| control plane.
| hipadev23 wrote:
| I'd be so excited to run my company's observability platform
| on a single self-managed rack.
| cj wrote:
| > savings plan even more which would apply to egress and
| storage
|
| Wait, is this accurate?
|
| If so I need to sign our company up for a savings plan... now.
| We use RIs but I thought savings plans only applied to instance
| cost and not bandwidth (and definitely not S3).
| tommek4077 wrote:
| You got it right. They do not include traffic or S3.
| ActorNightly wrote:
| Also, EKS (i.e. a managed service) is more expensive than
| renting EC2s and doing everything yourself, which is not that
| hard.
| pojzon wrote:
| Is $150 really that much when you are paying hundreds of
| thousands for nodes?
| threeseed wrote:
| EKS's control plane consists of decent EC2 instances, ELB,
| ENIs distributed across multiple availability zones.
|
| You're not saving anything doing it yourself.
|
| And you've just given yourself the massive inconvenience of
| running a HA Kubernetes control plane.
| aranelsurion wrote:
| If that's the case, probably just going for Spot machines would
| save them more than that move.
| 4death4 wrote:
| Cool, so you can hire one additional engineer. Are you sure your
| bare metal setup will occupy less than a single engineer's time?
| VBprogrammer wrote:
| I'm not so sure it's a zero sum kind of thing. Yes, it seems
| likely that they are paying at least one full time employee to
| maintain their production environment. At the same time, AWS
| isn't without its complications; there are people who are
| employed specifically to babysit it.
| r2_pilot wrote:
| Ha, while I'd work for that salary, even half of it would
| almost double what I make as a sysadmin. I guess I work here
| for the mission. Plus I don't have to deal with cloud services
| except for when the external services go down. Our stuff keeps
| running though.
| 4death4 wrote:
| I'm including overhead in that number. FWIW I know many ICs
| earning over double that number, not even including overhead.
| r2_pilot wrote:
| >FWIW I know many ICs earning over double that number, not
| even including overhead.
|
| Keep rubbing salt lol I live in a low cost area though.
| It's even pleasant some times of the year.
| dzikimarian wrote:
| AWS was operated by holy ghost I assume?
| byyll wrote:
| They can fire the "AWS engineer".
| jscheel wrote:
| So, they essentially saved the cost of one good engineer.
| Question is, are they spending 1 man-year of effort to maintain
| this setup themselves? If not, they made the right choice.
| Otherwise, it's not as clear cut.
| jacquesm wrote:
| That depends on where they are located. Good engineers aren't
| $230k/year everywhere.
| corobo wrote:
| Americans get paid so much, got dayum.
|
| Halve that and halve it again and I'd still be looking at a
| decent raise lmao
| uoaei wrote:
| That's about senior-level compensation even among most
| companies in the Bay Area. Only the extreme outliers with
| good performance on the stock market can be said to be
| significantly higher in TC.
|
| Edit: even then, TC is tied to how the stock market is
| doing, and not paid out by the company directly, so it only
| makes sense to compare with base wage plus benefits.
| dmoy wrote:
| TC for senior at big tech companies is over $300k. Over
| $400k for Facebook I hear.
|
| It doesn't take an extreme outlier to get significantly
| above $250k.
|
| > so it only makes sense to compare with base wage plus
| benefits.
|
| Not really, when stock approaches 40%+ of compensation,
| and is in RSUs with a fairly fast vesting schedule.
| jacquesm wrote:
| Indeed. And it does of course need to be offset against the
| cost of living.
| IntelMiner wrote:
| It includes an asterisk. Those salaries generally come with
| the reality of living in locales like the Bay Area or
| Seattle, with all the exorbitant costs of living in those
| areas.
|
| A lot of companies (like Amazon) will gleefully slash your
| salary if you try to move somewhere cheaper, because why
| should we pay you more if you don't just need that money to
| fork over to a landlord every month?
|
| There's also all the things Americans go without, like
| socialized healthcare. Even with their lauded insurance
| plans they still pay significantly more for worse health
| outcomes than any other country
| IshKebab wrote:
| Nah even with bay area costs Americans get paid much much
| more than elsewhere. I could easily double my salary by
| moving from the UK to San Francisco. House prices are
| maybe double too, but since they are only a part of your
| outgoings, overall you come out waaaay ahead.
|
| Of course then I would have to send my kids to schools
| with metal detectors and school shooting drills... It's
| not all about the money.
| the_gipsy wrote:
| The employer is basically subsidizing the mortgage,
| incentivizing workers to move to the most expensive
| location, which makes these locations even more expensive.
| nxm wrote:
| 90%+ of Americans have some form of health insurance,
| especially tech workers. And there are issues with
| socialized health care as well
| acdanger wrote:
| Likely including benefits in this figure.
| cj wrote:
| I was curious about this too, but this company lists a range
| of $200-250k for remote.
|
| https://github.com/OneUptime/interview/blob/master/software-.
| ..
|
| Side note: I'm in slight disbelief at how high that salary
| range is compared to how minimal the job requirements are.
| totallywrong wrote:
| Right, like AWS is set and forget.
| wholinator2 wrote:
| Exactly. A more accurate figure would be the work hours
| spent maintaining bare metal _minus_ the work hours spent
| maintaining AWS. Impossible to know
| without internals but at least a point in favor of bare metal
| threeseed wrote:
| Depending on what parts of AWS you use it is.
|
| Fargate, S3, Aurora etc. These are managed services and are
| incredibly reliable.
|
| A lot of people here seem to think these cloud providers are
| just a bunch of managed servers. It's far more than that.
| SteveNuts wrote:
| Even the "easy" services like that have at least _some_
| barrier to entry. IAM alone is a pretty big beast and I
| doubt someone who's never used AWS would grasp it their
| very first time logging into the web interface - and every
| service uses it extensively.
|
| And then there's the question of whether you're going to
| use Terraform, Ansible, CloudFormation, etc or click
| through the GUI to manage things.
|
| My point is, nothing in AWS is 100% turnkey like a lot of
| folks pretend it is. Most of the time, it's leadership that
| thinks since AWS is "Cloud" that it's as simple as putting in
| your credit card and you're done.
| threeseed wrote:
| IAM and IaC are only needed once you get to a certain
| size.
|
| For smaller projects you can absolutely get away with
| just the UI.
| isbvhodnvemrwvn wrote:
| IAM is absolutely NOT something you can just ignore
| unless you have a huge pile of cash to burn when your
| shit gets compromised.
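|
| For a sense of what "not ignoring IAM" means in practice, a
| minimal illustrative policy, scoped to explicit actions and one
| bucket instead of wildcards. The bucket name is a placeholder and
| this is a sketch, not a recommended production policy:
|
|   import json
|
|   # Least-privilege-style policy: explicit actions, one resource.
|   policy = {
|       "Version": "2012-10-17",
|       "Statement": [{
|           "Effect": "Allow",
|           "Action": ["s3:GetObject", "s3:PutObject"],
|           "Resource": "arn:aws:s3:::example-app-bucket/*",
|       }],
|   }
|   print(json.dumps(policy, indent=2))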
| scns wrote:
| There are companies earning money by showing other
| companies how to reduce their AWS bill.
| avgDev wrote:
| Set and forget until you wake up to an astronomical bill one
| morning.
| politelemon wrote:
| The beauty of decisions like these is that it looks good on a
| bean counter's spreadsheet. The hours of human time they end up
| spending on its maintenance simply don't appear in that
| spreadsheet, but is gladly pushed onto everyone else's plates.
| andrewstuart wrote:
| The fiction remains that AWS requires no specialist expertise.
|
| And your own computers require expertise so expensive and
| frightening that no sane company would host their own
| computers.
|
| How Amazon created this alternate reality should be studied in
| business schools for the next 50 years. Amazon made the IT
| industry doubt its own technical capabilities so much that the
| entire industry essentially gave up on the idea that it can run
| computer systems, and instead bought into the fabulously
| complex and expensive _and technically challenging_ cloud
| systems, whilst still believing they were doing the simplest
| and cheapest thing.
| kennydude wrote:
| AWS does require some expertise to master considering the
| sheer number of products and options. Tick the wrong box and
| cost increases by 50% etc.
|
| Different solutions work best for different companies.
| 1980phipsi wrote:
| That's what he's saying.
| FpUser wrote:
| >"This fiction remains that AWS requires no specialist
| expertise. And your own computers require expertise so
| expensive and frightening that no sane company would host
| their own computers."
|
| Each of these statements is utter BS
|
| PS. Oopsy I just read their third paragraph ;)
| DirkH wrote:
| Read their third paragraph. They completely agree with you
| FpUser wrote:
| LOL Sorry. I was shooting from the hip. Thanks.
| madrox wrote:
| Amazon didn't create it. I was there for the mass cloud
| migrations of the last 15 years. It isn't that AWS requires no
| specialist expertise, it's that it's a certain kind of
| expertise that's easier to plan for and manage. Managing
| physical premises, hardware upgrade costs, etc are all skills
| your typical devops jockey doesn't need anymore. Unless
| you're fine with hosting your company's servers under your
| desk, it's the hidden costs of metal that makes businesses
| move to cloud.
| Nextgrid wrote:
| Fortunately there are companies like Deft, OVH, Hetzner,
| Equinix, etc. that handle all of that for you for a flat fee
| while achieving economies of scale.
|
| Colocation is rarely worth it unless you have non-standard
| requirements. If you just need a general-purpose machine,
| any of the aforementioned providers will sort you out just
| fine.
| viraptor wrote:
| This is a strawman that keeps getting brought up, but
| nobody's claiming that. The difference remains though and the
| scale depends on what exactly you consider as an
| alternative. Renting a couple servers will cost you in
| availability/resilience and extra hardware management time.
| Renting a managed rack will cost you in the above plus a premium
| on management. Doing things yourself will cost you in extra
| contracts / power / network planning, remote hands and time
| to source your components.
|
| Almost everything that the AWS specialist needs to know comes
| in after that and has some equivalent in bare metal world, so
| those costs don't disappear either.
|
| In practice there are extra costs which may or may not make
| sense in each case. And there are companies that don't
| reassess their spending as well as they should. But there's
| no alternative reality really. (As in, the usually discussed
| complications of bare metal are not extremely overplayed)
| killingtime74 wrote:
| With opportunity cost it's multiples more. We don't hire people
| to break even on them, right? We hire to make a profit.
| roamerz wrote:
| Even if they do spend 1 person-year of effort in maintenance
| they still may have made the correct choice. Having a good
| engineer on staff may have additional side benefits as well
| especially if they could manage to hire locally and that
| person's wages then contribute to the local economy. As you
| said though it's definitely not clear cut especially from a
| spectator's point of view.
| matsemann wrote:
| 1 man-year effort is probably less than the effort of AWS,
| though. So a double win!
|
| A bit in jest, but at places I've worked that moved to the
| cloud, we ended up with more people managing k8s and building
| a platform and tooling than when we had a simple in-house scp
| upload to some servers.
| timeon wrote:
| Is this not addressed in section 'Server Admins' of the
| article?
| keremkacel wrote:
| And now their blog won't load
| mike_d wrote:
| A lot of comments here seem to be along the lines of "you can
| hire one more engineer," but given the current economic situation
| remember that might be "keep one more engineer." Would you lay
| off someone on your team to keep AWS?
|
| Keeping a few racks of servers happily humming along isn't the
| massive undertaking that most people here seem to think it is. I
| think lots of "cloud native" engineers are just intimidated by
| having to learn lower levels to keep things running.
| unglaublich wrote:
| > I think lots of "cloud native" engineers are just intimidated
| by having to learn lower levels to keep things running.
|
| Rightly so, because they're cloud native engineers, not system
| administrators. They're intimidated by the things they don't
| know. It'll be a very individual calculation whether it's worth
| it for your enterprise to organize and maintain hardware
| yourself or not.
| nwmcsween wrote:
| eh, to a degree, having to deal with failed hardware and, worse,
| buggy hardware is just a pain and really time consuming.
| bee_rider wrote:
| Especially given the low unemployment rate, laying somebody off
| seems quite risky; if it doesn't work out you'll have trouble
| hiring a replacement, I guess.
| cj wrote:
| The current hiring market in tech is the easiest (for
| employers) it has been in a really long time. It used to
| take 3-4 months to fill a role. In the current market it's
| more like 2-4 weeks.
| Jnr wrote:
| And in other places around the world those would be closer to 3
| or 4 good engineers for the same money. And while each engineer
| costs some money, they probably bring in close to double of
| what they are being paid.
| deepspace wrote:
| > Keeping a few racks of servers happily humming along isn't
| the massive undertaking that most people here seem to think it
| is
|
| Keeping them humming along redundantly, with adequate power and
| cooling, and protection against cooling- and power failures is
| more of an undertaking, though. Now you are maintaining
| generators, UPSs and multiple HVAC systems in addition to your
| 'few racks of servers'.
|
| You also need to maintain full network redundancy (including
| ingress/egress) and all the cost that entails.
|
| All the above hardware needs maintenance and replacement when
| it becomes obsolete.
|
| Now you are good in one DC, but not protected against
| tornadoes, fire and flood like you would be if you used AWS
| with multiple availability zones.
|
| So, you have to build another DC far enough away, staff it, and
| buy tons of replication software, plus several FTEs to manage
| cross-site backups and deal with sync issues.
| lol768 wrote:
| Most of those requirements cease to exist if you decide to
| colo. It's not cloud or "run your own DC".
| 10000truths wrote:
| You don't need to build your own datacenter unless your
| workload requires a datacenter's worth of hardware.
| Colocation is a feasible and popular option for handling all
| of the hands-on stuff you mention. Ship the racks to a colo
| center, they'll install them for you. Ship them replacement
| peripherals, and the operators will hot-swap them for you. If
| you need redundancy, that's just a matter of sending your
| hardware to multiple places instead of one. Slightly more
| involved, but it's hardly rocket science.
| dankwizard wrote:
| And these savings will be passed down to your customers too?
| Or....?
| willsmith72 wrote:
| why should they be?
| candiddevmike wrote:
| Would anyone be interested in an immutable OS with built in
| configuration management that works the same (as in the same
| image) in the cloud and on premise (bare metal, PXE, or virtual)?
| Basically using this image you could almost guarantee everything
| runs the same way.
| quillo_ wrote:
| Yes - I would be interested :) my issue is that there is a
| mixed workload of centralised cloud compute and physical
| hardware in strange locations. I want something like Headscale
| as a global mesh control plane and some mechanism for deploying
| immutable flatcar images that hooks into IAM (for cloud) and
| TPM (for BM) as a system auth mechanism.
| candiddevmike wrote:
| My email is in my profile if you want to discuss this
| more...!
| dilyevsky wrote:
| This already exists - https://docs.fedoraproject.org/en-
| US/fedora-coreos/bare-meta...
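|
| For the curious, the Fedora CoreOS model is roughly: an immutable
| image plus a declarative Ignition config applied on first boot. A
| minimal sketch of generating such a config; the SSH key and
| hostname are placeholders and the spec version may need matching
| to your release:
|
|   import json
|
|   # Sketch of an Ignition-style first-boot config.
|   ignition = {
|       "ignition": {"version": "3.4.0"},
|       "passwd": {"users": [{
|           "name": "core",
|           "sshAuthorizedKeys": ["ssh-ed25519 AAAA... user@example"],
|       }]},
|       "storage": {"files": [{
|           "path": "/etc/hostname",
|           "mode": 420,  # 0644
|           "contents": {"source": "data:,metal-01"},
|       }]},
|   }
|   print(json.dumps(ignition, indent=2))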
| andrewstuart wrote:
| The key point is _per year_ - ongoing saving every year.
| alex_lav wrote:
| Curious how much was spent on the migration? I skimmed but didn't
| see that number.
|
| > Server Admins: When planning a transition to bare metal, many
| believe that hiring server administrators is a necessity. While
| their role is undeniably important, it's worth noting that a
| substantial part of hardware maintenance is actually managed by
| the colocation facility. In the context of AWS, the expenses
| associated with employing AWS administrators often exceed those
| of Linux on-premises server administrators. This represents an
| additional cost-saving benefit when shifting to bare metal. With
| today's servers being both efficient and reliable, the need for
| "management" has significantly decreased.
|
| This feels like a "famous last words" moment. Next year there'll
| be $400k in "emergency Server Admin hire" budget allocated.
| nwmcsween wrote:
| I would go with managed bare-metal, it's a step up from unmanaged
| bare metal cost wise but saves you on headaches from memory,
| storage, network, etc issues.
| lgkk wrote:
| I'm sure if the stack is simple enough it's not hard for most
| senior-plus engineers to figure out the infrastructure.
|
| I've definitely seen a lot of over-engineered solutions in
| pursuit of some ideal or a promotion.
| yieldcrv wrote:
| I had an out of touch cofounder a few years back, he had asked me
| why the coworking space's hours were the way they were, before
| interjecting that companies were probably managing their servers
| up there at the those later hours
|
| like, talk about decades removed! no, nobody has their servers in
| the coworking space anymore sir.
|
| nice to see people attempting a holistic solution to hosting
| though. with containerization redeploying anywhere on anything
| shouldn't be hard.
| maximusdrex wrote:
| It feels like every commenter on this article didn't read past
| the first paragraph. Every comment I see is talking about how they
| likely barely made any money on the transition once all costs are
| factored in, but they explicitly stated a critical business
| rationale behind the move that remains true regardless of how
| much money it cost them to transition. Since they needed to
| function even when AWS is down, it made sense for them to
| transition even if it cost them more. This may increase the cost
| of running their service (though probably not), but it could make
| it more reliable, and therefore a better solution, making them
| more money down the line.
| threeseed wrote:
| > Since they needed to function even when AWS is down
|
| AWS as a whole has never been down.
|
| It's Cloud 101 to architect your platform to operate across
| multiple availability zones (data centres), not only to
| insulate against data centre specific issues (e.g. fire,
| power) but also against AWS backplane software update issues
| or cascading faults.
|
| If you read what they did it's actually _worse_ than AWS
| because their Kubernetes control plane isn't highly available.
| wbsun wrote:
| People often learn lessons the hard way: they will keep
| saving $230k/yr until one day their non-HA bare metal is down
| and major customers leave.
| christophilus wrote:
| > We have a ready to go backup cluster on AWS that can spin
| up in under 10 minutes if something were to happen to our
| co-location facility.
|
| Sounds like they already have their bases covered.
| threeseed wrote:
| Still need to synchronise data, update DNS records, wait
| for TTLs to expire.
|
| HA architectures exist for a reason because that last
| step is a massive headache.
| quickthrower2 wrote:
| They need to do fire drills and practice this maybe daily
| or at least weekly? Failover being a normal case. Can't
| you do failovers in DNS?
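|
| A hedged sketch of what such a DNS failover drill can look like:
| keep the record's TTL low, health-check the primary, and flip the
| record after a few consecutive failures. update_dns_record below
| is a hypothetical stand-in for whatever your DNS provider's API
| offers; the endpoints, IP and thresholds are made up:
|
|   import time
|   import urllib.request
|
|   PRIMARY_HEALTH = "https://primary.example.com/healthz"  # placeholder
|   BACKUP_IP = "203.0.113.10"                              # placeholder
|   FAILURES_BEFORE_FLIP = 3
|
|   def healthy(url, timeout=5.0):
|       try:
|           with urllib.request.urlopen(url, timeout=timeout) as resp:
|               return resp.status == 200
|       except Exception:
|           return False
|
|   def update_dns_record(name, ip):
|       # Hypothetical: call your DNS provider's API here.
|       print(f"would point {name} at {ip}")
|
|   failures = 0
|   while True:
|       failures = 0 if healthy(PRIMARY_HEALTH) else failures + 1
|       if failures >= FAILURES_BEFORE_FLIP:
|           update_dns_record("status.example.com", BACKUP_IP)
|           break
|       time.sleep(30)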
| slig wrote:
| >It's Cloud 101 to architect your platform to operate across
| multiple availability zones (data centres)
|
| A huge multi billion dollar company with "cloud" in its name
| recently had a big downtime because they did not follow
| "cloud 101".
| annexrichmond wrote:
| Some AWS outages have affected all AZs in a given region, so
| they aren't always all that isolated. For this reason many
| orgs are investing in multi-cloud architectures (in addition
| to multi region)
| oxfordmale wrote:
| You can use multiple availability zones and if needed even
| multi-cloud. If you own the hardware, you do regularly need to
| test the UPS power supply to ensure there is a graceful
| failover in case of a power outage. Unless, of course, you buy the
| hardware already hosted in a data centre.
| icedchai wrote:
| I'm not convinced of the critical business rationale. Your
| single data center is much more likely to go down than a
| multi-AZ AWS deployment. The correct business rationale would
| be to go multi-cloud.
| didip wrote:
| The post is super light on details; it's hard to tell whether
| it's worth it or not.
|
| For example:
|
| - How much data are they working with? What's the traffic shape?
| Using NFS makes me think that they don't have a lot of data.
|
| - What happens when their customers accidentally send too many
| events? Will they simply drop the payload? On bare metal they
| lose the ability to auto-scale quickly.
|
| - Are they using S3 or not, if they are, did they move that as
| well to their own Ceph cluster?
|
| - What's the RDBMS setup? Are they running their own DB proxy
| that can handle live switch-over and seamless upgrade?
|
| - What's the details on the bare metal setup? Is everything
| redundant? How quickly can they add several racks in one go?
| What's included as a service from their co-lo provider?
| oxfordmale wrote:
| It is not unlikely that an AWS to GCP migration would have saved
| them significant money too, in the sense that they likely
| reviewed and right-sized different systems.
|
| I also would love to see a comparison done by a financial
| planning analyst to ensure no cost centres are missed. On prem
| is cheaper but only by 30 to 50%. That is the premium you pay
| for flexibility, which you can partly mitigate by purchasing
| reserved instances for multiple years.
| threeseed wrote:
| > On prem is cheaper but only by 30 to 50%
|
| Depending on use case.
|
| If you have traffic which isn't consistent 24/7 then AWS Spot
| instances with Graviton CPUs will be cheaper than on-
| premise.
|
| Because you have the ability to in real-time scale your
| infrastructure up/down.
| Thristle wrote:
| It's 2 separate issues:
|
| fluctuation in traffic is handled by auto scaling
|
| Saving money on stateless (or short start times) services
| is done with spots
| mensetmanusman wrote:
| Nines of uptime?
| renecito wrote:
| now a set of Linux machines is considered bare-metal?
|
| I was under the impression that bare-metal means "no OS".
| jdoss wrote:
| In this context, bare-metal means installing and managing
| Linux (or your OS of choice) directly on hardware without a
| hypervisor. It has never meant no operating system.
| _joel wrote:
| In this context it's running without the use of virtualisation
| 0xbadcafebee wrote:
| Moving from buying Ferraris to Toyota Camrys would save a lot of
| money too. These stories are always bs blog spam by companies
| trying to pretend they pulled off some amazing new hack. In
| reality they were burning cash because they hadn't the faintest
| idea how to control their spend.
|
| > When we were utilizing AWS, our setup consisted of a 28-node
| > managed Kubernetes cluster. Each of these nodes was an m7a EC2
| > instance. With block storage and network fees included, our
| > monthly bills amounted to $38,000+
|
| The hell were you doing with 28 nodes to run an uptime tracking
| app? Did you try just running it on like, 3 nodes, without K8s?
|
| > When compared to our previous AWS costs, we're saving over
| > $230,000 roughly per year if you amortize the cap-ex costs
| > of the server over 5 years.
|
| Compared to a 5-year AWS savings plan? Probably not.
|
| On top of this, they somehow advertise using K8s as a
| simplification? Let's rein in our spend, not only by abandoning
| the convenience of VMs and having to do more maintenance, but
| let's require customers use a minimum of 3 nodes and a dozen
| services to run a dinky _uptime tracking app_.
|
| This meme must be repeating itself due to ignorance. The
| CIOs/CTOs have no clue how to control spend in the cloud, so they
| rack up huge bills and ignore it "because we're trying to grow
| quickly!" Then maybe they hire someone who knows the Cloud, but
| they tell them to ignore the cost too. Finally they run out of
| cash because they weren't watching the billing, so they do the
| only thing they are technically competent enough to do: set up
| some computers and install Linux, and write off the cost as cap-
| ex. Finally they write a blog post in order to try to gain
| political cover for why they burned through several headcount
| worth of funding on nothing.
| Nextgrid wrote:
| > The hell were you doing with 28 nodes to run an uptime
| tracking app?
|
| To be fair, considering the pocket-calculator-grade performance
| you get from AWS (along with terrible IO performance compared
| to direct-attach NVME) I can totally understand they'd need 28
| nodes to run something that would run on a handful of real,
| uncontended bare-metal hosts.
| w4f7z wrote:
| ServeTheHome has also written[0] a bit about the relative costs
| of AWS vs. colocation. They compare various reserved instance
| scenarios and include the labor of colocation. TL;DR: it's still
| far cheaper to colo.
|
| [0] https://www.servethehome.com/falling-from-the-
| sky-2020-self-...
| boiler_up800 wrote:
| I'd say $500k per year on AWS is kind of within a dead man's zone
| where if you're not expecting that spend to grow significantly
| and your infra is relatively simple, migrating off may actually
| make sense.
|
| On the other hand, maintaining $100K a year of spend on AWS is
| unlikely to be worth the effort of optimizing, and maintaining
| $1M+ on AWS probably means the usage patterns are such that the
| cloud is cheaper and easier to maintain.
| dvfjsdhgfv wrote:
| In my experience amounts are meaningless; what counts is what
| kinds of services you need most. In my current org we use all 3
| major public clouds + on-prem services, carefully planning
| what should go where and why.
| m3kw9 wrote:
| Actually saves less if you spread the transition development and
| ops costs over 5 years. Hidden costs.
| monlockandkey wrote:
| I've said this before: unless you are using specific AWS
| services, I think it is a fool's errand to use it.
|
| Compute, storage, database, networking. You would be better off
| using Digital Ocean, Linode, Vultr, etc. - so much cheaper than
| AWS, with lots of bandwidth included rather than the extortionate
| $0.08/GB egress.
|
| Compute is the same story. A 2 vCPU, 4GB VPS is ~$24. The
| equivalent AWS instance (after navigating the obscure pricing
| and naming scheme), the c6g.large, is double the price at ~$50.
|
| This is the happy middle ground between bare metal and AWS.
| greyface- wrote:
| Weren't you searching for a colo provider just yesterday? That
| was a quick $230k! https://news.ycombinator.com/item?id=38275614
| dilyevsky wrote:
| Can anyone comment on the server lifetime of 5 years? I would
| think it's on the order of 8-10 years these days?
| Havoc wrote:
| Cheaper price, lower redundancy:
|
| >single rack configuration at our co-location partner
|
| I've got symmetrical gigabit static IPv4 at home... so I can
| murder commercial offerings out there on bang/buck for many
| things. Right up until you factor in reliability and redundancy.
| FuriouslyAdrift wrote:
| The math I have always seen is cloud is around 2.5x more
| expensive than on-prem UNLESS you can completely re-architect
| your infra to be cloud native.
|
| Lift and shift is brutal and doesn't make a lot of sense.
| dvfjsdhgfv wrote:
| > The math I have always seen is cloud is around 2.5x more
| expensive than on-prem UNLESS you can completely re-architect
| your infra to be cloud native.
|
| And at this point you are completely locked in.
| dstainer wrote:
| In my opinion the story here is that AWS allowed them to quickly
| build and prove out a business idea. This in turn afforded them
| the luxury to make this kind of switch. Cloud did its job.
| allenrb wrote:
| Cue the inevitable litany of reasons why it is wrong to move out
| of "the cloud" in 3... 2... 1...
| tonymet wrote:
| What is the market distortion that allows AWS margins to remain
| so high? There are two major competitors (Azure, GCP) and dozens
| of minor ones.
|
| It seems crazy to me that the two options are AWS vs bare metal
| to save that much money. Why not a moderate solution?
___________________________________________________________________
(page generated 2023-11-16 23:00 UTC)