[HN Gopher] How we migrated onto K8s in less than 12 months
___________________________________________________________________
How we migrated onto K8s in less than 12 months
Author : ianvonseggern
Score : 97 points
Date : 2024-08-08 16:07 UTC (6 hours ago)
(HTM) web link (www.figma.com)
(TXT) w3m dump (www.figma.com)
| jb1991 wrote:
| Can anyone advise what is the most common language used in
| enterprise settings for interfacing with K8s?
| JohnMakin wrote:
| IME almost exclusively golang.
| roshbhatia wrote:
| ++, most controllers are written in go, but there's plenty of
| client libraries for other languages.
|
| A common pattern you'll see, though, is skipping writing any
| sort of code and instead using a higher-level DSL-ish
| configuration, usually via yaml, using tools like Kyverno.
| gadflyinyoureye wrote:
| Depends on what you mean. Helm will handle a lot. You can
| generate the yaml files in any language. You can also admin it
| from command line tools - so again any language, but often zsh
| or bash.
| cortesoft wrote:
| A lot of yaml
| yen223 wrote:
| The kind of yaml that has a lot of {{ }} in them that breaks
| your syntax highlighter.
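|
| Something like this - a hypothetical deployment.yaml from a
| typical chart:
|
|   apiVersion: apps/v1
|   kind: Deployment
|   metadata:
|     name: {{ .Release.Name }}-web
|   spec:
|     replicas: {{ .Values.replicaCount }}
|     selector:
|       matchLabels:
|         app: {{ .Release.Name }}-web
|     template:
|       metadata:
|         labels:
|           app: {{ .Release.Name }}-web
|       spec:
|         containers:
|           - name: web
|             image: "{{ .Values.image.repo }}:{{ .Values.image.tag }}"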
| mplewis wrote:
| I have seen more Terraform than anything else.
| akdor1154 wrote:
| On the platform consumer side (app infra description) - well
| schema'd yaml, potentially orchestrated by helm ("templates to
| hellish extremes") or kustomize ("no templates, this is the
| hill we will die on").
|
| On the platform integration/hook side (app code doing
| specialised platform-specific integration stuff, extensions to
| k8s itself), golang is the lingua franca but bindings for many
| languages are around and good.
| JohnMakin wrote:
| I like how this article clearly and articulately states the
| reasons it stands to benefit from Kubernetes. Many make the jump
| without knowing what they even stand to gain, or whether they
| need to in the first place - the reasons given here are good.
| nailer wrote:
| I was about to write the opposite - the logic is poor and
| circular - but multiple other commenters have already raised
| this: https://news.ycombinator.com/item?id=41194506
| https://news.ycombinator.com/item?id=41194420
| JohnMakin wrote:
| I don't really see those rebuttals as all that valid. The
| reasons given in this article are completely valid from my
| perspective as someone who's worked heavily with
| Kubernetes/ECS.
|
| Helm, for instance, is a great time saver for installing
| software. Often software will support nothing but helm. Ease
| of deployment is a good consideration. Their points on
| networking are absolutely spot on. The scaling considerations
| are spot on. Killing/isolating unhealthy containers is
| completely valid. I could go on a lot more, but I don't see a
| single point listed as invalid.
| samcat116 wrote:
| They're quite specific in that they mention that teams would
| like to make use of existing helm charts for other software
| products. Telling them to build and maintain definitions for
| those services from scratch is added work in their mind.
| dijksterhuis wrote:
| > When applied, Terraform code would spin up a template of what
| the service should look like by creating an ECS task set with
| zero instances. Then, the developer would need to deploy the
| service and clone this template task set [and do a bunch of
| manual things]
|
| > This meant that something as simple as adding an environment
| variable required writing and applying Terraform, then running a
| deploy
|
| This sounds less like a problem with ECS and more like an
| overcomplication in how they were using terraform + ECS to manage
| their deployments.
|
| I get the generating templates part for verification prior to
| live deploys. But this seems... dunno.
| wfleming wrote:
| Very much agree. I have built infra on ECS with terraform at
| two companies now, and we have zero manual steps for actions
| like this, beyond "add the env var to a terraform file, merge
| it and let CI deploy". The majority of config changes we would
| make are that process.
| dijksterhuis wrote:
| Yeah.... thinking about it a bit more, I just don't see why
| they didn't set up their CI to deploy a short-lived
| environment on a push to a feature branch.
|
| To me that seems like the simpler solution.
| roshbhatia wrote:
| I'm with you here -- ECS deploys are pretty painless and
| uncomplicated, but I can picture a few scenarios where this
| ends up being necessary - for example, if they have a lot of services
| deployed on ECS and it ends up bloating the size of the
| Terraform state. That'd slow down plans and applies
| significantly, which makes sharding the Terraform state by
| literally cloning the configuration based on a template a lot
| safer.
| freedomben wrote:
| > ECS deploys are pretty painless and uncomplicated
|
| Unfortunately in my experience, this is true until it isn't.
| Once it isn't true, it can quickly become a painful blackbox
| debugging exercise. If your org is big enough to have
| dedicated AWS support then they can often get help from
| engineers, but if you aren't then life can get really
| complicated.
|
| Still not a bad choice for most apps though, especially if
| it's just a run-of-the-mill HTTP-based app
| ianvonseggern wrote:
| Hey, author here. I totally agree that this is not a
| fundamental limitation of ECS, and we could have iterated on
| this setup and made something better. I intentionally listed
| this under work we decided to scope into the migration process,
| and not under the fundamental reasons we undertook the
| migration, because of that distinction.
| Aeolun wrote:
| Honestly, I find the reasons they name for using Kubernetes
| flimsy as hell.
|
| "ECS doesn't support helm charts!"
|
| No shit sherlock, that's a thing literally built on Kubernetes.
| It's like a government RFP that can only be fulfilled by a
| single vendor.
| Carrok wrote:
| > We also encountered many smaller paper cuts, like attempting
| to gracefully terminate a single poorly behaving EC2 machine
| when running ECS on EC2. This is easy on Amazon's Elastic
| Kubernetes Service (EKS), which allows you to simply cordon off
| the bad node and let the API server move the pods off to
| another machine while respecting their shutdown routines.
|
| I dunno, that seems like a very good reason to me.
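|
| For anyone who hasn't seen it: cordoning is just flipping one
| field on the Node object. Roughly (node name hypothetical):
|
|   apiVersion: v1
|   kind: Node
|   metadata:
|     name: ip-10-0-1-23.ec2.internal  # the misbehaving node
|   spec:
|     unschedulable: true  # what `kubectl cordon` sets
|
| `kubectl drain` then evicts the pods, respecting their graceful
| shutdown.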
| watermelon0 wrote:
| I assume that ECS Fargate would solve this, because one
| misbehaving ECS task would not affect others, and stopping it
| should still respect the shutdown routines.
| ko_pivot wrote:
| Fargate is very expensive at scale. Great for small or
| bursty workloads, but when you're at Figma scale, you
| almost always go EC2 for cost-effectiveness.
| aranelsurion wrote:
| To be fair there are many benefits of running on the platform
| that has the most mindshare.
|
| Unless they are in this space competing against k8s, it's
| reasonable for them, if they want to use Helm charts, to move
| to where they can use them.
|
| Also, it's not just Helm that doesn't work with ECS - neither
| do <50 other tools and tech from the CNCF map>.
| cwiggs wrote:
| I think what they should have said is "there isn't a tool like
| Helm for ECS." If you want to deploy a full prometheus, grafana,
| alertmanager, etc. stack on ECS, good luck with that - no one
| has written the task definitions for you to consume and
| override values.
|
| With k8s you can easily deploy a helm chart that will deploy
| lots of things that all work together fairly easily.
| JohnMakin wrote:
| It's almost like people factor in a piece of software's tooling
| environment before they use the software - wild.
| vouwfietsman wrote:
| Maybe it's normal for a company this size, but I have a hard time
| following much of the decision making around these gigantic
| migrations or technology efforts because the decisions don't seem
| to come from any user or company need. There was a similar post
| from Figma earlier, I think around databases, that left me
| feeling the same.
|
| For instance: they want to go to k8s because they want to use
| etcd/helm, which they can't on ECS? Why do you want to use
| etcd/helm? Is it really this important? Is there really no other
| way to achieve the goals of the company than exactly like that?
|
| When a decision is founded on a desire of the user, it's easy to
| validate that downstream decisions make sense. When a decision is
| founded on a technological desire, downstream decisions may make
| sense in the context of the technical desire, but do they make
| sense in the context of the user, still?
|
| Either I don't understand organizations of this scale, or it is
| fundamentally difficult for organizations of this scale to
| identify and reason about valuable work.
| WaxProlix wrote:
| People move to K8s (specifically from ECS) so that they can use
| cloud provider agnostic tooling and products. I suspect a lot
| of larger company K8s migrations are fueled by a desire to be
| multicloud or hybrid on-prem, mitigate cost, availability, and
| lock-in risk.
| timbotron wrote:
| there's a pretty direct translation from ECS task definition
| to docker-compose file
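|
| Roughly, for a hypothetical web service - image, environment
| and cpu/memory limits map over almost field-for-field:
|
|   # docker-compose.yml equivalent of a simple task definition
|   services:
|     web:
|       image: myorg/web:1.2.3
|       environment:
|         - PORT=8080
|       deploy:
|         resources:
|           limits:
|             cpus: "0.5"
|             memory: 512M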
| zug_zug wrote:
| I've heard all of these lip-service justifications before,
| but I've yet to see anybody actually publish data showing how
| they saved any money. Would love to be proven wrong by some
| hard data, but something tells me I won't be.
| nailer wrote:
| Likewise. I'm not sure Kubernetes' famous complexity (and
| the resulting staff requirements) is worth it to
| preemptively avoid vendor lock-in, or that the problem
| wouldn't be solved more efficiently by migrating to another
| cloud provider's native tools if the need arises.
| bryanlarsen wrote:
| I'm confident Figma isn't paying published rates for AWS.
| The transition might have helped them in their rate
| negotiations with AWS, or it might not have. Hard data on
| the money saved would be difficult to attribute.
| jgalt212 wrote:
| True, but if AWS knows your lock-in is less locked-in, I'd
| bet they'd be more flexible when contracts are up for renewal.
| I mean, it's possible the blog post's primary purpose was a
| shot across the bow to their AWS account manager.
| logifail wrote:
| > it's possible the blog post's primary purpose was a
| shot across the bow to their AWS account manager
|
| Isn't it slightly depressing that this explanation is
| fairly (the most?) plausible?
| jiggawatts wrote:
| Our state department of education is one of the biggest
| networks in the world with about half a million devices.
| They would occasionally publicly announce a migration to
| Linux.
|
| This was just a Microsoft licensing negotiation tactic.
| Before he was CEO, Ballmer flew here to negotiate one of
| the contracts. The discounts were _epic_.
| tengbretson wrote:
| There are large swaths of the b2b space where (for whatever
| reason) being in the same cloud is a hard business
| requirement.
| vundercind wrote:
| The vast majority of corporate decisions are never
| justified by useful data analysis, before or after the
| fact.
|
| Many are so-analyzed, but usually in ways that anyone who
| paid attention in high school science or stats classes can
| tell are so flawed that they're meaningless.
|
| We can't even measure manager efficacy to any useful
| degree, in nearly all cases. We can come up with numbers,
| but they don't mean anything. Good luck with anything more
| complex.
|
| Very small organizations can probably manage to isolate
| enough variables to know how good or bad some move was in
| hindsight, if they try and are competent at it (... if).
| Sometimes an effect is so huge for a large org that it
| overwhelms confounders and you can be pretty confident that
| it was at least good or bad, even if the degree is fuzzy.
| Usually, no.
|
| Big organizations are largely flying blind. This has only
| gotten worse with the shift from people-who-know-the-work-
| as-leadership to professional-managers-as-leadership.
| Alupis wrote:
| Why would you assume it's lip-service?
|
| Being vendor-locked into ECS means you _must_ pay whatever
| ECS wants... using k8s means you can feasibly pick up and
| move if you are forced.
|
| Even if it doesn't save money _today_ it might save a
| tremendous amount in the future and/or provide a much
| stronger position to negotiate from.
| greener_grass wrote:
| Great in theory but in practice when you do K8s on AWS,
| the AWS stuff leaks through and you still have lock-in.
| Alupis wrote:
| Then don't use the AWS stuff. You can bring your own
| anything that they provide.
| cwiggs wrote:
| It doesn't have to be that way though. You can use the
| AWS ingress controller, or you can use ingress-nginx. You
| can use external secrets operator and tie it into AWS
| Secrets manager, or you can tie it into 1pass, or
| Hashicorp Vault.
|
| Just like picking EKS you have to be aware of the pros
| and cons of picking the cloud provider tool or not.
| Luckily the CNCF is doing a lot to reduce vendor lock-in
| and I think it will only continue.
| elktown wrote:
| I don't understand why this "you shouldn't be vendor-
| locked" rationalization is taken at face value at all?
|
| 1. The time it will take to move to another cloud is
| proportional to the complexity of your app. For example,
| if you're a Go shop using managed persistence, are you
| more vendor-locked in any meaningful way than with k8s?
| What's the delta here?
|
| 2. Do you really think you can haggle with the fuel
| producers like you're MAERSK? No, you're more likely just
| a car driving around looking for a gas station with
| increasingly diminishing returns.
| Alupis wrote:
| This year alone we've seen significant price increases
| from web services, including critical ones such as Auth.
| If you are vendor-locked into, say Auth0, and they
| increase their price 300%[1]... What choice do you have?
| What negotiation position do you have? None... They know
| you cannot leave.
|
| It's even worse when your entire platform is vendor-
| locked.
|
| There is nothing but upside to working towards a vendor-
| neutral position. It gives you options. Even if you never
| use those options, they are there.
|
| > Do you really think you can haggle
|
| At the scale of someone like Figma? Yes, they do
| negotiate rates - and a competent account manager will
| understand Figma's position and maximize the revenue they
| can extract. Now, if the account rep doesn't play ball,
| Figma can actually move their stuff somewhere else.
| There's literally nothing but upside.
|
| I swear, it feels like some people are just allergic to
| anything k8s and actively seek out ways to hate on it.
|
| [1] https://auth0.com/blog/upcoming-pricing-changes-for-
| the-cust...
| elktown wrote:
| Why skip point 1 and do some strange tangent on a SaaS
| product unrelated to using k8s or not?
|
| Most people looking into (and using) k8s that are being
| told the "you most avoid vendor lock in!" selling point
| are nowhere near the size where it matters. But I know
| there's essentially bulk-pricing, as we have it where I
| work as well. That it's because of picking k8s or not
| however is an extremely long stretch, and imo mostly
| rationalization. There's nothing saying that a cloud move
| _without_ k8s couldn't be done within the same amount of
| time. Or even that k8s is the main obstacle - I imagine it
| isn't, since it's usually supposed to be stateless apps.
| Alupis wrote:
| The point was about vendor lock, which you asserted is
| not a good reason to make a move, such as this. The
| "tangent" about a SaaS product was to make it clear what
| happens when you build your system in such a way as to
| become entirely dependent on that vendor. Just because
| Auth0 is not part of one of the big "cloud" providers,
| doesn't make it any less vendor-locky. Almost all of the
| vendor services offered on the big clouds are extremely
| vendor-locked and non-portable.
|
| Where you buy compute from is just as big of a deal as
| where you buy your other SaaS' from. In all of the cases,
| if you cannot move even if you had to (ie. it'll take 1
| year+ to move), then you are not in a good position.
|
| Addressing your #1 point - if you use a regular database
| that happens to be offered by a cloud provider (ie.
| Postgres, MySQL, MongoDB, etc) then you can pick up and
| move. If you use something proprietary like Cosmos DB, then
| you are stuck or face significant efforts to migrate.
|
| With k8s, moving to another cloud can be as simple as
| creating an account and updating your configs to point at
| the new cluster. You can run every service you need
| inside your cluster if you wanted. You have freedom of
| choice and mobility.
|
| > Most people looking into (and using) k8s that are being
| told the "you most avoid vendor lock in!" selling point
| are nowhere near the size where it matters.
|
| This is just simply wrong, as highlighted by the SaaS
| example I provided. If you think you are too small so it
| doesn't matter, and decide to embrace all of the cloud
| vendor's proprietary services... what happens to you when
| that cloud provider decides to change their billing
| model, or dramatically increases prices? You are screwed
| and have no option but to cough up more money.
|
| There are more decisions to make and consider when
| choosing a cloud platform and services than just whatever
| is easiest to use today - for any size of business.
| watermelon0 wrote:
| I would assume that the migration from ECS to something
| else would be a lot easier, compared to migrating from
| other managed services, such as S3/SQS/Kinesis/DynamoDB,
| and especially IAM, which ties everything together.
| otterley wrote:
| Amazon ECS is and always has been free of charge. You pay
| for the underlying compute and other resources (just like
| you do with EKS, too), but not the orchestration service.
| WaxProlix wrote:
| It may look like I'm implying that companies are successful
| in getting those things from a K8s transition, but I wasn't
| trying to say that - just thinking of the times when I've
| seen these migrations happen and relaying the stated aims.
| I agree, I think it can be a burner of dev time and a
| burden on the business as devs acquire the new skillset
| instead of doing more valuable work.
| OptionOfT wrote:
| Flexibility was a big thing for us. Many different
| jurisdictions required us to be conscious of where exactly
| data was stored & processed.
|
| K8s makes this really easy. Don't need to worry whether
| country X has a local Cloud data center of Vendor Y.
|
| Plus it makes hiring so much easier as you only need to
| understand the abstraction layer.
|
| We don't hire people for ARM64 or x86. We have abstraction
| layers. Multiple even.
|
| We'd be fooling ourselves not to use them.
| fazkan wrote:
| This, most of it, I think is to support on-prem and cloud
| flexibility. Also, from the customer's point of view, they can
| now sell the entire figma "box" to controlled industries for
| a premium.
| teyc wrote:
| People move to K8s so that their resumes and job ads are
| cloud provider agnostic. People's careers stagnate when their
| employer's platform is home-baked tech, or built on specific
| offerings from cloud providers. Employers find moving to a
| common platform makes recruiting easier.
| samcat116 wrote:
| > I have a hard time following much of the decision making
| around these gigantic migrations or technology efforts because
| the decisions don't seem to come from any user or company need
|
| I mean the blog post is written by the team deciding the
| company needs. They explained exactly why they can't easily use
| etcd on ECS due to technical limitations. They also talked
| about many other technical limitations that were causing them
| issues and increasing cost. What else are you expecting?
| Flokoso wrote:
| Managing 500 or more VMs is a lot of work.
|
| The VM upgrades, auth, backups, log rotation, etc. alone.
|
| With k8s I can give everyone a namespace, policies, volumes,
| and have automatic log aggregation thanks to daemon sets and
| k8s/cloud-native stacks.
|
| Self-healing and more.
|
| It's hard to describe how much better it is.
| ianvonseggern wrote:
| Hey, author here, I think you ask a good question and I think
| you frame it well. I agree that, at least for some major
| decisions - including this one, "it is fundamentally difficult
| for organizations of this scale to identify and reason about
| valuable work."
|
| At its core we are a platform team building tools, often for
| other platform teams, that are building tools that support the
| developers at Figma creating the actual product experience. It
| is often harder to reason about what the right decisions are
| when you are further removed from the end user, although it
| also gives you great leverage. If we do our jobs right the
| multiplier effect of getting this platform right impacts the
| ability of every other engineer to do their job efficiently and
| effectively (many indirectly!).
|
| You bring up good examples of why this is hard. It was
| certainly an alternative to say "sorry, we can't support etcd
| and helm; you will need to find other ways to work around this
| limitation." This was simply two more data points helping push
| us toward the conclusion that we were running our Compute
| platform on the wrong base building blocks.
|
| While difficult to reason about, I do think it's still very
| worth trying to do this reasoning well. It's how as a platform
| team we ensure we are tackling the right work to get to the
| best platform we can. That's why we spent so much time making
| the decision to go ahead with this and part of why I thought it
| was an interesting topic to write about.
| felixgallo wrote:
| I have a constructive recommendation for you and your
| engineering management for future cases such as this.
|
| First, when some team says "we want to use helm and etcd for
| some reason and we haven't been able to figure out how to get
| that working on our existing platform," start by asking them
| what their actual goal is. It is obscenely unlikely that helm
| (of all things) is a fundamental requirement to their work.
| Installing temporal, for example, doesn't require helm and is
| actually simple, if it turns out that temporal is the best
| workflow orchestrator for the job and that none of the
| probably 590 other options will do.
|
| Second, once you have figured out what the actual goal is,
| and have a buffet of options available, price them out. Doing
| some napkin math on how many people were involved and how
| much work had to go into it, it looks to me that what you
| have spent to completely rearchitect your stack and
| operations and retrain everyone -- completely discounting
| opportunity cost -- is likely not to break even in even my
| most generous estimate of increased productivity for about
| five years. More likely, the increased cost of the platform
| switch, the lack of likely actual velocity accrual, and the
| opportunity cost make this a net-net bad move except for the
| resumes of all of those involved.
| Spivak wrote:
| > we can't support etcd and helm and you will need to find
| other ways to work around this limitation
|
| So am I reading this right that either downstream platform
| teams or devs wanted to leverage existing helm templates to
| provision infrastructure, being on ECS locked you out of
| those, and the water eventually boiled over? If so, that's a
| pretty strong statement about the platform effect of k8s.
| wg0 wrote:
| If you haven't broken down your software into 50+ different
| separate applications written in 15 different languages using 5
| different database technologies - you'll find very little use
| for k8s.
|
| All you need is a way to roll out your artifact to production
| in a rolling or blue-green fashion, after preparations such as
| any required database alterations, be it data- or schema-wise.
| javaunsafe2019 wrote:
| But you do know which problems the k8s abstraction solves,
| right? Because it has nothing to do with many languages or
| many services, but with things like discovery, scaling,
| failover and automation ...
| imiric wrote:
| > All you need is a way to roll out your artifact to
| production in a roll over or blue green fashion after the
| preparations such as required database alterations be it data
| or schema wise.
|
| Easier said than done.
|
| You can start by implementing this yourself and thinking how
| simple it is. But then you find that you also need to decide
| how to handle different environments, configuration and
| secret management, rollbacks, failover, load balancing, HA,
| scaling, and a million other details. And suddenly you find
| yourself maintaining a hodgepodge of bespoke infrastructure
| tooling instead of your core product.
|
| K8s isn't for everyone. But it sure helps when someone else
| has thought about common infrastructure problems and solved
| them for you.
| mattmanser wrote:
| You need to remove a lot of things from that list. Almost
| all of that functionality is available in build tools that
| have existed for decades. I want to emphasize the
| DECADES.
|
| And then all you're left with is scaling. Which most
| businesses do not need.
|
| Almost everything you've written there is a standard
| feature of almost any CI toolchain, teamcity, Jenkins,
| Azure DevOps, etc., etc.
|
| We were doing it before k8s was even written.
| mplewis wrote:
| Yeah, all you need is a rollout system that supports blue-
| green! Very easy to homeroll ;)
| friendly_deer wrote:
| Here's a theory about why at least some of these come about:
|
| https://lethain.com/grand-migration/
| tedunangst wrote:
| How long will it take to migrate off?
| codetrotter wrote:
| It's irreversible.
| wrs wrote:
| A migration with the goal of improving the infrastructure
| foundation is great. However, I was surprised to see that one of
| the motivations was to allow teams to use Helm charts rather than
| converting to Terraform. I haven't seen in practice the
| consistent ability to actually use random Helm charts unmodified,
| so by encouraging its use you end up with teams forking and
| modifying the charts. And Helm is such a horrendous tool, you
| don't really want to be maintaining your own bespoke Helm charts.
| IMO you're actually _better off_ rewriting in Terraform so at
| least your local version is maintainable.
|
| Happy to hear counterexamples, though -- maybe the "indent 4"
| insanity and multi-level string templating in Helm are gone
| nowadays?
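|
| (For anyone who hasn't had the pleasure, that's the pattern
| below - YAML whose whitespace is produced by string filters,
| chart name hypothetical:)
|
|   metadata:
|     labels:
|       {{- include "mychart.labels" . | nindent 4 }}
|
| Get the number wrong and you find out at render time, not from
| your editor.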
| smellybigbelly wrote:
| Our team also suffered from the problems you described with
| public helm charts. There is always something you need to
| customise to make things work in your own environment. Our
| approach has been to use the public helm chart as-is and do any
| customisation with `kustomize --enable-helm`.
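|
| For the curious, that looks roughly like this (chart, version
| and values are hypothetical):
|
|   # kustomization.yaml - render the upstream chart, then overlay
|   apiVersion: kustomize.config.k8s.io/v1beta1
|   kind: Kustomization
|   helmCharts:
|     - name: ingress-nginx
|       repo: https://kubernetes.github.io/ingress-nginx
|       version: 4.11.1
|       releaseName: ingress-nginx
|       valuesInline:
|         controller:
|           replicaCount: 2
|   patches:
|     - path: controller-patch.yaml
|
| Built with `kustomize build --enable-helm .`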
| BobbyJo wrote:
| Helm is quite often the default supported way of launching
| containerized third-party products. I have worked at two
| separate startups whose 'on-prem' product was offered this way.
| freedomben wrote:
| Indeed. I try hard to minimize the amount of Helm we use, but
| a significant amount of tools are only shipped as Helm
| charts. Fortunately I'm increasingly seeing people provide
| "raw k8s" yaml, but it's far from universal.
| cwiggs wrote:
| Helm Charts and Terraform are different things IMO. Terraform
| is better suited to deploying cloud resources (s3 bucket, EKS
| cluster, EKS workers, RDS, etc). Sure you can manage your k8s
| workloads with Terraform, but I wouldn't recommend it.
| Terraform keeping its own state when your state already lives
| in k8s makes working with Terraform + k8s a pain. Helm is
| purpose built for k8s, Terraform is not.
|
| I'm not a fan of Helm either though; templated yaml sucks, and
| you still have the "indent 4" insanity too. Kustomize is nice
| when things are simple, but once your app is complex Kustomize
| is worse than Helm IMO. Try to deploy an app that has an
| Ingress, with a TLS cert and external-DNS, with Kustomize for
| multiple environments; you have to patch the resources 3 times
| instead of just having 1 variable you use in 3 places.
|
| Helm is popular and Terraform is popular, so they both get
| talked about a lot, but IMO there is a tool yet to become
| popular that will replace both of them.
| wrs wrote:
| I agree, I wouldn't generate k8s from Terraform either,
| that's just the alternative I thought the OP was presenting.
| But I'd still rather convert charts from Helm to pretty much
| anything else than maintain them.
| stackskipton wrote:
| Lack of Variable substitution in Kustomize is downright
| frustrating. We use Flux so we have the feature anyways, but
| I wish it was built into Kustomize.
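|
| (For reference, the Flux feature in question - post-build
| variable substitution on a Flux Kustomization; names here are
| hypothetical:)
|
|   apiVersion: kustomize.toolkit.fluxcd.io/v1
|   kind: Kustomization
|   metadata:
|     name: apps
|   spec:
|     interval: 10m
|     path: ./deploy
|     prune: true
|     sourceRef:
|       kind: GitRepository
|       name: platform
|     postBuild:
|       substitute:
|         cluster_env: "production"
|       substituteFrom:
|         - kind: ConfigMap
|           name: cluster-vars
|
| Manifests can then reference ${cluster_env} anywhere.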
| gouggoug wrote:
| Talking about helm - I personally have come to profoundly
| loathe it. It was amazing when it came out and filled a
| much-needed gap.
|
| However, it is loaded with so many footguns that I spend my
| time redoing and debugging other engineers' work.
|
| I'm hoping this new tool called << timoni >> picks up steam. It
| fixes pretty much every qualm I have with helm.
|
| So if, like me, you're looking for a better solution, go check
| out timoni.
| JohnMakin wrote:
| It's completely cursed, but I've started deploying helm via
| terraform lately. Many people, ironically me included, find
| that managing deployments via terraform is an anti-pattern.
|
| I'm giving it a try and I don't despise it yet, but it feels
| gross - application configs are typically far more mutable and
| dynamic than cloud infrastructure configs, and IME terraform
| does not like super dynamic configs.
| andrewguy9 wrote:
| I look forward to the blog post where they get off K8s, in just
| 18 months.
| surfingdino wrote:
| ECS makes sense when you are building and breaking stuff. K8s
| makes sense when you are mature (as an org).
| xiwenc wrote:
| I'm baffled to see so many anti-k8s sentiments on HN. Is it
| because most commenters are developers used to services like
| heroku, fly.io, render.com, etc., or run their apps on VMs?
| elktown wrote:
| I think some are just pretty sick and tired of the explosion of
| needless complexity we've seen in the last decade or so in
| software, and rightly so. This is an industry-wide problem of
| deeply misaligned incentives (& some amount of ZIRP gold rush),
| not specific to this particular case - if this one is even a
| good example of this to begin with.
|
| Honestly, as it stands, I think we'd be seen as pretty useless
| craftsmen in any other field due to an unhealthy obsession of
| our tooling and meta-work - consistently throwing any kind of
| sensible resource usage out of the window in favor of just
| getting to work with certain tooling. It's some kind of a
| "Temporarily embarrassed FAANG engineer" situation.
| cwiggs wrote:
| I agree with this somewhat. The other day I was driving home
| and saw a sprinkler head that had broken on the side of the
| road and was spraying water everywhere. It made me think: why
| aren't sprinkler systems designed with HA in mind? Why aren't
| there dual water lines with dual sprinkler heads everywhere
| with an electronic component that detects a break in a line
| and automatically switches to the backup water line? It's
| because the downside of having the water spray everywhere,
| the grass become unhealthy or die is less than how much it
| would cost to deploy it HA.
|
| In the software/tech industry it's commonplace to just
| accept that your app can't be down for any amount of time no
| matter what. No one checks how much more it would cost
| (engineering time & infra costs) to deploy the app so it
| would be HA, so no one knows whether it was worth it.
|
| I blame this logic on a decade of low interest rates. I
| could be wrong.
| maayank wrote:
| It's one of those technologies where there's merit to use them
| in some situations but are too often cargo culted.
| caniszczyk wrote:
| Hating is a sign of success in some ways :)
|
| In some ways, it's nice to see companies move to use mostly
| open source infrastructure, a lot of it coming from CNCF
| (https://landscape.cncf.io), ASF and other organizations out
| there (on top of the random things on github).
| tryauuum wrote:
| For me it is about VMs. I feel uneasy knowing that any kernel
| vulnerability could allow malicious code to escape the
| container and explore the kubernetes host.
|
| There are kata-containers, I think; they might solve my angst
| and make me enjoy k8s.
|
| Overall... there's just nothing cool in kubernetes to me.
| Containers, load balancers, megabytes of yaml -- I've seen it
| all. Nothing feels interesting enough to try.
| stackskipton wrote:
| vs the application getting hacked and running loose on the VM?
|
| If you have never dealt with "I have to run these 50
| containers plus Nginx/CertBot while figuring out which node
| is best to run them on," yeah, I can see you not being
| thrilled about Kubernetes. For the rest of us though,
| Kubernetes handles that easily.
| solatic wrote:
| I don't get the hate for Kubernetes in this thread. TFA is from
| _Figma_. You can talk all day long about how early startups just
| don't need the kind of management benefits that Kubernetes
| offers, but the article isn't written by someone working for a
| startup, it's written by a company that nearly got sold to Adobe
| for $20 billion.
|
| Y'all really don't think a company like Figma stands to benefit
| from the flexibility that Kubernetes offers?
| BobbyJo wrote:
| Kubernetes isn't even that complicated, and first-party support
| from cloud providers often means you're doing something in K8s
| in lieu of doing it in a cloud-specific way (like ingress vs
| cloud-specific load balancer setups).
|
| At a certain scale, K8s is the simple option.
|
| I think much of the hate on HN comes from the "ruby on rails is
| all you need" crowd.
| JohnMakin wrote:
| > I think much of the hate on HN comes from the "ruby on
| rails is all you need" crowd.
|
| Maybe - people seem really gung-ho about serverless solutions
| here too.
| logifail wrote:
| > it's written by a company that nearly got sold to Adobe for
| $20 billion
|
| (Apologies if this is a dumb question) but isn't Figma big
| enough to want to do any of their stuff on their own hardware
| yet? Why would they still be paying AWS rates?
|
| Or is it the case that a high-profile blog post about K8S and
| being provider-agnostic gets you sufficient discount on your
| AWS bill to still be value-for-money?
| jeffbee wrote:
| There are a lot of ex-Dropbox people at Figma who might have
| learned firsthand that bringing your stuff on-prem under a
| theory of saving money is an intensely stupid idea.
| logifail wrote:
| > There are a lot of ex-Dropbox people at Figma who might
| have learned firsthand that bringing your stuff on-prem
| under a theory of saving money is an intensely stupid idea
|
| Well, that's one hypothesis.
|
| Another is that "Every maturing company with predictable
| products must be exploring ways to move workloads out of
| the cloud. AWS took your margin and isn't giving it back."
| ( https://news.ycombinator.com/item?id=35235775 )
| ozim wrote:
| They are preparing for next blog post in a year - ,,how we
| cut costs by xx% by moving to our own servers".
| hyperbolablabla wrote:
| I work for a company making ~$9B in annual revenue and we use
| AWS for everything. I think a big aspect of that is just
| developer buy-in, as well as reliability guarantees, and
| being able to blame Amazon when things do go down.
| NomDePlum wrote:
| Much bigger companies use AWS for very practical, well thought
| out reasons.
|
| Not managing procurement of hardware, upgrades, etc., a
| defined standard operating model with accessible
| documentation, the ability to hire people with experience,
| and needing to hire fewer people as you are doing less is
| enough to build a viable and demonstrable business case.
|
| Scale beyond a certain point is hard without support and
| delegated responsibility.
| cwiggs wrote:
| k8s is complex; if you don't need the following, you probably
| shouldn't use it:
|
| * Service discovery
|
| * Auto bin packing
|
| * Load Balancing
|
| * Automated rollouts and rollbacks
|
| * Horizontal scaling (see the sketch after this list)
|
| * Probably more I forgot about
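|
| (To the horizontal-scaling point, a sketch - an autoscaler for
| a hypothetical deployment named web:)
|
|   apiVersion: autoscaling/v2
|   kind: HorizontalPodAutoscaler
|   metadata:
|     name: web
|   spec:
|     scaleTargetRef:
|       apiVersion: apps/v1
|       kind: Deployment
|       name: web
|     minReplicas: 2
|     maxReplicas: 10
|     metrics:
|       - type: Resource
|         resource:
|           name: cpu
|           target:
|             type: Utilization
|             averageUtilization: 70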
|
| You also have secret and config management built in. If you use
| k8s you also have the added benefit of making it easier to move
| your workloads between clouds and bare metal. As long as you
| have a k8s cluster you can mostly move your app there.
|
| Problem is most companies I've worked at in the past 10 years
| needed multiple of the features above, and they decided to roll
| their own solution with Ansible/Chef, Terraform, ASGs, Packer,
| custom scripts, custom apps, etc. The solutions have always
| been worse than what k8s provides, and it's a bespoke tool that
| you can't hire for.
|
| For what k8s provides, it isn't complex, and it's all
| documented very well, AND it's extensible so you can build your
| own apps on top of it.
|
| I think there are more SWEs on HN than
| Infra/Platform/Devops/buzzword engineers. As a result there are
| a lot of people who don't have a lot of experience managing
| infra and think that spinning up their docker container on a VM
| is the same as putting an app in k8s. That's my opinion on why
| k8s gets so much hate on HN.
| Osiris wrote:
| Those all seem important to even moderately sized products.
| worldsayshi wrote:
| As long as your requirements are simple the config doesn't
| need to be complex either. Not much more than docker-
| compose.
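|
| A minimal deployment really is about docker-compose-sized
| (hypothetical app):
|
|   apiVersion: apps/v1
|   kind: Deployment
|   metadata:
|     name: web
|   spec:
|     replicas: 2
|     selector:
|       matchLabels: { app: web }
|     template:
|       metadata:
|         labels: { app: web }
|       spec:
|         containers:
|           - name: web
|             image: myorg/web:1.2.3
|             ports:
|               - containerPort: 8080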
|
| But once you start using k8s you probably tend to scope
| creep and find a lot of shiny things to add to your set up.
| breakingcups wrote:
| I feel so out of touch when I read a blog post which casually
| mentions 6 CNCF projects with kool names that I've never heard
| of, for gaining seemingly simple functionality.
|
| I'm really wondering if I'm aging out of professional software
| development.
| renewiltord wrote:
| Nah, there's lots of IC work. It just means that you're
| unfamiliar with one approach to org scaling: abstracting over
| hardware, with logging and retries handled by a platform team.
|
| It's not the only approach so you may well be familiar with
| others.
| twodave wrote:
| TL;DR because they already ran everything in containers. Having
| performed a migration where this wasn't the case, the path from
| non-containerized to containerized is way more effort than going
| from containerized non-k8s to k8s.
| _pdp_ wrote:
| In my own experience, AWS Fargate is easier, more secure, and
| way more robust than running your own K8s, even with EKS.
| watermelon0 wrote:
| Do you mean ECS Fargate? Because you can use AWS Fargate with
| EKS, with some limitations.
| ko_pivot wrote:
| I'm not surprised that the first reason they state for moving off
| of ECS was the lack of support for stateful services. The lack of
| integration between EBS and ECS has always felt really strange to
| me, considering that AWS already built all the logic to integrate
| EKS with EBS in a StatefulSet compliant way.
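|
| For readers unfamiliar: the relevant piece is
| volumeClaimTemplates, where each replica gets its own
| EBS-backed volume that follows it across hosts. A sketch
| (storage class and sizes hypothetical):
|
|   apiVersion: apps/v1
|   kind: StatefulSet
|   metadata:
|     name: db
|   spec:
|     serviceName: db
|     replicas: 3
|     selector:
|       matchLabels: { app: db }
|     template:
|       metadata:
|         labels: { app: db }
|       spec:
|         containers:
|           - name: db
|             image: postgres:16
|             volumeMounts:
|               - name: data
|                 mountPath: /var/lib/postgresql/data
|     volumeClaimTemplates:
|       - metadata:
|           name: data
|         spec:
|           accessModes: ["ReadWriteOnce"]
|           storageClassName: gp3
|           resources:
|             requests:
|               storage: 100Gi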
| datatrashfire wrote:
| https://aws.amazon.com/about-aws/whats-new/2024/01/amazon-ec...
|
| This was actually added at the beginning of the year. It was
| definitely on my most-wanted list for a while. You could
| technically use EFS, but that's a very expensive way to run
| anything IO intensive.
| ko_pivot wrote:
| This adds support for ephemeral EBS volumes. When a task is
| created a volume gets created, and when the task is
| destroyed, for whatever reason, the volume is destroyed too.
| It has no concept of task identity. If the task needs to be
| moved to a new host, the volume is destroyed.
| julienmarie wrote:
| I personally love k8s. I run multiple small but complex custom
| e-commerce shops and handle all the tech on top of marketing,
| finance and customer service.
|
| I was running on dedicated servers before. My stack is quite
| complicated and deploys were a nightmare. In the end the dread of
| deploying was slowing down the little company.
|
| Learning and moving to k8s took me a month. I run around 25
| different services (front ends, product admins, logistics
| dashboards, delivery route optimizers, OSRM, ERP,
| recommendation engine, search, etc....).
|
| It forced me to clean up my act and structure things in a
| repeatable way. Having all your cluster config in one place
| allows you to know exactly the state of every service and
| which version is running.
|
| It allowed me to do rolling deploys with no downtime.
|
| Yes it's complex. As programmers we are used to complex. An Nginx
| config file is complex as well.
|
| But the more you dive into it, the more you understand the
| architecture of k8s and how it makes sense. It forces you to
| respect the twelve factors to the letter.
|
| And yes, HA is more than nice, especially when your income is
| directly linked to the availability and stability of your stack.
|
| And it's not that expensive. I pay around 400 USD a month in
| hosting.
| xyst wrote:
| Of course there's no mention of performance loss or gain after
| migration.
|
| I remember when microservices architecture was the latest hot
| trend that came off the presses. Small and big firms were racing
| to redesign/reimplement apps. But most forgot they weren't
| Google/Netflix/Facebook.
|
| I remember end user experience ended up being _worse_ after the
| implementation. There was a saturation point where a single micro
| service called by all of the other micro services would cause
| complete system meltdown. There was also the case of an
| "accidental" dependency loop (S1 -> S2 -> S3 -> S1). Company
| didn't have an easy way to trace logs across different services
| (way before distributed tracing was a thing). Turns out only a
| specific condition would trigger the dependency loop (maybe, 1 in
| 100 requests?).
|
| Good times. Also, job safety.
| api wrote:
| This is a very fad driven industry. One of the things you earn
| after being in it for a long time is intuition for spotting
| fads and gratuitous complexity traps.
___________________________________________________________________
(page generated 2024-08-08 23:00 UTC)