[HN Gopher] Terraform vs. AWS CloudFormation
___________________________________________________________________
Terraform vs. AWS CloudFormation
Author : historynops
Score : 80 points
Date : 2021-10-06 20:25 UTC (2 hours ago)
(HTM) web link (gswallow.medium.com)
(TXT) w3m dump (gswallow.medium.com)
| johnl1479 wrote:
| I can appreciate the author's criticisms of the shortcomings of
| Cloudformation, but this is really just a "Why you should use
| Terraform" post.
| mylons wrote:
| "But CDK transpiles into CloudFormation templates. For that
| reason alone I can't recommend it."
|
| CDK is superior to terraform for a glaring reason: it's a first
| class citizen in AWS' eyes and terraform is not.
| thecopy wrote:
| > With Terraform, your local executable makes rest calls to each
| service's REST API for you, meaning no intermediary sits between
| you and the service you're controlling. Want an RDS instance?
| Terraform will make calls directly to the RDS API.
|
| How is this different than CloudFormation making the same calls?
| lykr0n wrote:
| You give CloudFormation a list of instructions. It accepts it
| and gives you an ID to watch for updates, then it goes off and
| executes them.
|
| Terraform executes a list of instructions. It executes them in
| front of you while you wait.
|
| Both are fine until you run into something like this:
|
| I'm pushing an Elastic Container Service Task Definition change
| via CDK. A CloudFormation change is submitted, and I wait for
| it to finish. In the background, it's trying to do the update
| but the update fails due to some misconfiguration with the new
| container.
|
| CloudFormation doesn't fail or return an error. It times out
| after an hour and reverts the change. I have to know to dig
| into the AWS console to find my failed tasks to view the error.
|
| If I did this update via Terraform, I would get the error back
| in my console quickly as Terraform is directly telling ECS to
| make the change. With CDK, the CloudFormation changeset is
| generated, it is submitted to CloudFormation, then the tool
| polls the AWS API for progress updates. Sometimes you get
| specific messages back, sometimes it fails and you need to go
| in and see what it failed on.
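|
| A rough sketch of the submit-then-poll loop described above, in
| Python. The `get_status` callable is a hypothetical stand-in for
| whatever asks CloudFormation for the stack status; the status
| strings are modeled on CloudFormation's stack statuses, and the
| timeout and interval defaults are assumptions for illustration.

```python
import time
from typing import Callable

# Terminal statuses, modeled on CloudFormation's update statuses.
TERMINAL = {
    "UPDATE_COMPLETE",
    "UPDATE_ROLLBACK_COMPLETE",
    "UPDATE_ROLLBACK_FAILED",
}


def wait_for_stack(get_status: Callable[[], str],
                   timeout_s: float = 3600.0,
                   interval_s: float = 5.0,
                   sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the stack reaches a terminal status or the timeout expires."""
    waited = 0.0
    while waited <= timeout_s:
        status = get_status()
        if status in TERMINAL:
            return status
        sleep(interval_s)
        waited += interval_s
    raise TimeoutError(f"stack still in progress after {timeout_s} seconds")
```

| Note that the caller only ever sees a status string. The actual
| ECS error never passes through this loop, which is why you can
| end up digging through the console for it.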
| kennu wrote:
| That's right - use AWS CDK instead. You don't have to worry about
| the low-level CloudFormation syntax and details. I switched a few
| years ago and haven't looked back. CDK keeps getting better and
| better, also handling things like asset deployments (Docker
| images, S3 content, bundling), quick Lambda updates with
| --hotswap, quick stack debugging with --no-rollback, etc.
| fdgsdfogijq wrote:
| I'm always surprised that more people aren't aware of CDK. It's
| an extremely powerful way to write software. Especially once
| you get good at it. CFN pales in comparison, CDK to me feels
| like the future of software development.
| k__ wrote:
| Pulumi is also nice for non-AWS related stuff.
| nagyf wrote:
| I agree, and have the same experience. CDK is so much easier,
| much less verbose, and unit testable (at least to some degree).
|
| Since resource importing is possible in CDK (not nice, but
| possible) you can even start using it if you already have
| resources that you do not want to recreate.
| zenux wrote:
| Fun fact: in the leak of the Twitch (Amazon) repositories of this
| morning, I saw that the developers use Terraform!
| cube2222 wrote:
| You can use more than one tool.
|
| CloudFormation is great because of its transactionality, so it
| lends itself nicely to deploying multiple services which are
| versioned together. You either succeed fully, or all services
| will be rolled back.
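|
| The all-or-nothing behaviour can be sketched in a few lines of
| Python. This is a toy model, not how CloudFormation is
| implemented: `deploy_all_or_nothing` and the (name, apply, undo)
| step shape are made up for illustration, and it assumes every
| undo succeeds.

```python
from typing import Callable, List, Tuple

# (name, apply, undo): hypothetical shape for one service deployment.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def deploy_all_or_nothing(steps: List[Step]) -> bool:
    """Apply steps in order; on any failure, undo the applied ones in reverse."""
    applied: List[Step] = []
    for step in steps:
        _name, apply, _undo = step
        try:
            apply()
        except Exception:
            # Roll back everything that already succeeded, newest first.
            for _, _, undo_prev in reversed(applied):
                undo_prev()
            return False
        applied.append(step)
    return True
```

| If the third of three services fails, the first two are put
| back, so the versions never end up mixed.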
|
| This way you can deploy your whole infra with Terraform, and then
| deploy to, e.g., your ECS cluster using CloudFormation. Works great
| in practice.
| zapt02 wrote:
| The rollback functionality of CF is a blessing. We use both CF
| and Terraform at my company and I vividly recall multiple times
| where my connection had cut out during "terraform apply" and
| left the Terraform infrastructure in a half-finished state.
| acdha wrote:
| > The rollback functionality of CF is a blessing
|
| When it works, which is a big caveat: we had far more cases
| where it failed in a way which required manual remediation
| and the gaps in validation meant that you'd be in an "apply /
| error / rollback" loop requiring 20+ minutes before you could
| try again. Terraform was always considerably faster but it
| was especially the orders of magnitude improvement in retry
| time which convinced most of us to switch.
|
| The CloudFormation team has been working on this so it's
| possible that experience has improved but the scar tissue
| will take time to fade.
| nickjj wrote:
| Rollback doesn't always work with CF. I've noticed so many
| times that it would mostly delete everything but not certain
| things once in a while. Then you're left having to play
| detective to manually figure out what you need to delete
| while having to delete dependencies by hand in a specific
| order.
|
| I've spent hours just waiting for CF to fail deleting EKS or
| RDS related resources then I end up getting billed for $30+ a
| month sometimes because I forgot to manually delete a NAT
| gateway.
| vageli wrote:
| > i vividly recall multiple times where my connection had cut
| out during "terraform apply"
|
| The issue could be at least partially resolved by using
| automation (like atlantis for example) to apply your plans.
| l0b0 wrote:
| Unless things have changed in the meantime, the killer feature of
| CloudFormation for me is that I don't have to keep track of the
| state locally. Having to set up tracking of the infra state in
| Terraform is a huge pain, since it should be stored independently
| of both the infra code (to allow deploying anything but HEAD) and
| the infra itself (duh). As long as Terraform doesn't query the
| existing infra to work out what needs doing I don't want to go
| back to it.
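|
| To make the state argument concrete, here is a toy model in
| Python. It is not Terraform's actual algorithm; `plan`,
| `desired` and `state` are made-up names. The point is only that
| the planned actions come from comparing the code against the
| recorded state, so the result is only as good as that record.

```python
from typing import Dict, List, Tuple

Resources = Dict[str, dict]  # resource address -> attributes


def plan(desired: Resources, state: Resources) -> List[Tuple[str, str]]:
    """Return (action, address) pairs that move the recorded state to desired."""
    actions: List[Tuple[str, str]] = []
    for addr in sorted(desired.keys() | state.keys()):
        if addr not in state:
            actions.append(("create", addr))
        elif addr not in desired:
            actions.append(("delete", addr))
        elif desired[addr] != state[addr]:
            actions.append(("update", addr))
    return actions
```

| With a lost or empty state, every resource looks like a
| "create", even if it already exists in the account.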
| Pensacola wrote:
| While the read was interesting and informative, something about
| the tone made me search for a disclaimer/disclosure of interest.
| Are you an "influencer?"
| draklor40 wrote:
| CloudFormation, with its HORRIBLE YAML templating (whatever
| dsl/language) and arcane error messages is a horror story. I hate
| it so much that I'd rather quit my job than debug why
| CloudFormation decided for no reason to update my RDS instance
| for a PR that was just a README file update.
| emmanueloga_ wrote:
| How about Pulumi? [1] Seems compatible with CF and supports
| TypeScript as configuration language. Any fans?
|
| 1: https://www.pulumi.com/docs/guides/adopting/from_aws/
| orf wrote:
| I spent a bit of time trying to deploy a lambda app with
| Cloudformation. I wanted to use a relational database, so I
| needed to handle migrations.
|
| Ok, so apparently I need to write a custom Cloudformation
| resource to execute a lambda function that will run the
| migrations prior to deploying the new version of the lambda. Kind
| of neat that you can do that.
|
| Except I messed up the output of the custom resource lambda and
| Cloudformation completely locked my deployment up for _3 hours_.
| 3 hours. I couldn't do _anything_ - rollback, update, whatever.
|
| Cloudformation via a CDK is interesting, and I don't hate it, but
| oh boy if it gets into a weird state it can completely kill your
| iteration loop. And the docs say something along the lines of "if
| it's stuck for too long contact support". No thanks.
| zapt02 wrote:
| CF does have a lot of quirks (especially stacks locking up for
| various reasons, or rollbacks taking hours).
|
| I find it easiest to run migrations when an application is
| first starting up (with an appropriate transaction lock so
| other instances won't cause the migration to run more than
| once), this way you don't have to do a lot of devops magic for
| it to work.
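|
| A minimal sketch of that startup pattern in Python, using SQLite
| as a stand-in so it runs anywhere. The `MIGRATIONS` list and the
| `schema_version` table are hypothetical; on Postgres you would
| probably reach for an advisory lock instead of BEGIN IMMEDIATE.

```python
import sqlite3

# Hypothetical migrations, applied in order; index + 1 is the schema version.
MIGRATIONS = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)",
    "ALTER TABLE users ADD COLUMN email TEXT",
]


def migrate(path: str) -> int:
    """Apply pending migrations under a write lock; return how many ran."""
    # isolation_level=None: we manage the transaction ourselves.
    conn = sqlite3.connect(path, timeout=30, isolation_level=None)
    try:
        # BEGIN IMMEDIATE takes the write lock up front, so a second instance
        # starting at the same time waits here instead of racing this one.
        conn.execute("BEGIN IMMEDIATE")
        conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
        row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
        current = row[0] or 0
        for version, sql in enumerate(MIGRATIONS[current:], start=current + 1):
            conn.execute(sql)
            conn.execute("INSERT INTO schema_version VALUES (?)", (version,))
        conn.execute("COMMIT")
        return len(MIGRATIONS) - current
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
    finally:
        conn.close()
```

| Every instance calls `migrate()` before serving traffic. The
| first one does the work; the rest wait for the lock, then find
| nothing left to apply.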
| singlewind wrote:
| To be honest, I don't agree with this. Managing infrastructure
| requires evidence and a trace of how it got created. I've been in
| this situation a few times: I've been through projects where the
| Terraform code doesn't match the AWS infrastructure, and we don't
| know when or how the drift happened. At least CloudFormation has
| a feature to detect the difference and help me trace back which
| commit has actually been deployed. CDK makes the job easier for
| developers because it delivers some convenience and offers more
| patterns for writing code. I like both.
| Hikikomori wrote:
| Vanilla cloudformation is bad, but so is terraform (for my use
| case anyway). We wrap our cloudformation with python, you need
| something similar for terraform to make it less terrible (cdktf,
| terragrunt, terrascript).
| flurie wrote:
| One of the most amazing things I saw at AWS reInvent was an
| advanced talk on IaC that provided the code of a lambda function
| inline in a CloudFormation template. I realize that this is just
| one talk, and there are plenty of ways to structure things well,
| but this practice is directly encouraged by the design of
| CloudFormation[1]. AWS has attempted redefining the lambda
| deployment story multiple times, there are multiple companies
| whose primary offering is providing a better way to deploy code
| to serverless offerings, but this still stands out to me as one
| of the most terrible ways to do things, and I blame the design of
| CloudFormation.
|
| [1]
| https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGui...
| xyzzy123 wrote:
| I'm going off track here but Pulumi have a totally mind-bending
| feature where you can write the code of a lambda function not
| only inline, but such that it captures the value of variables
| from the surrounding infra code at the time the function is
| serialized.
|
| See: https://www.pulumi.com/docs/intro/concepts/function-serializ...
|
| Seeing the specific examples they use it for (AWS infra glue)
| makes me think that there is room for infrastructure related
| lambdas to be defined right in cfn or infra code, with very low
| ceremony, even if you wouldn't want to deploy "applications"
| like that.
| nzoschke wrote:
| Counterpoint... Use CloudFormation!
|
| Managed services offer big benefits over software. With CF, new
| stacks, change sets, updates, rollbacks and drift detection are
| an API call away.
|
| Managed service providers offer big benefits over software. With
| CF and AWS support, help with problems is a support ticket away.
|
| Using a single cloud provider has a big benefit over multi-cloud
| tooling. I only run workloads on AWS, so the CF syntax,
| specs and docs unlock endless first party features. A portable
| Terraform + Kubernetes contraption is a lowest common denominator
| approach.
|
| Of course everything depends.
|
| I've configured literally 1000s of systems with CloudFormation
| with very few problems.
|
| I have seen Terraform turn into a tire-fire of migrations from
| state files to Terraform enterprise to Atlantis that took an
| entire DevOps team to care for.
| acdha wrote:
| > Managed services offer big benefits over software. With CF,
| new stacks, change sets, updates, rollbacks and drift detection
| are an API call away.
|
| > Managed service providers offer big benefits over software.
| With CF and AWS support, help with problems are a support ticket
| away.
|
| The problem is when those help tickets get responses like "try
| deleting everything by hand and see if it recreates without an
| error next time". They've worked on CloudFormation over the
| last year or so, but everyone I've known who's switched to tools
| like Terraform did so after getting tired of unpredictable
| deployment times or hitting the many cases where CloudFormation
| gets itself into an irrecoverable state. I can count on no
| fingers the number of development teams who used CF and didn't
| ask for help recovering from an error state in CF which
| required out-of-band remediation.
|
| I believe they've also gotten better at tracking new AWS
| features but there were multiple cases where using Terraform
| got you the ability to use a feature 6+ months ahead of CF.
|
| > A portable Terraform + Kubernetes contraption is a lowest
| common denominator approach.
|
| Terraform is much, much richer than CloudFormation so I'd
| compare it to CDK (with the usual aesthetic debate over
| declarative vs. procedural models) and it doesn't really make
| sense to call it LCD in the same way that you might use that to
| describe Kubernetes because it's not trying to build an
| abstraction which covers up the underlying platform details.
| Most of the Terraform I've written controls AWS but there's a
| significant value to also being able to use the same tool to
| control GCP, GitLab, Cloudflare, Docker, various enterprise
| tools, etc. with full access to native functionality.
| dolni wrote:
| > I've configured literally 1000s of systems with
| CloudFormation with very few problems.
|
| This is a great way of saying "I've never used CloudFormation"
| without stating it directly.
| void_mint wrote:
| > Managed services offer big benefits over software.
|
| TF can be used as a managed service.
|
| > Managed service providers offer big benefits over software.
| With CF and AWS support, help with problems are a support
| ticket away.
|
| The same is true with TF, except 100000% better unless you're
| paying boatloads of money for higher tiered support.
|
| > I only run workloads on AWS, so the CF syntax, specs and docs
| unlocks endless first party features.
|
| CF syntax is an abomination. Lots of the bounds of CF are
| dogmatic and unhelpful.
|
| > I have seen Terraform turn into a tire-fire of migrations
| from state files to Terraform enterprise to Atlantis that took
| an entire DevOps team to care for.
|
| CF generally takes an entire DevOps team to care for, for any
| substantial project.
| ldoughty wrote:
| Agree. CF is not a magic bullet, but neither is ansible or
| terraform.
|
| We used ansible heavily with AWS for 2 years. Then we decided
| to gut it out and do CF directly. Why? If we want to switch
| clouds, it's not like the ansible or terraform modules are
| transferable ... So might as well go the native supported
| route.
|
| I agree with the article, messages can be cryptic, but at the
| end of the day, I have a CF stack that represents an entity. I
| can blow away the stack, and if there's any failure or issue, I
| can escalate my permissions and kill it again. Still a problem?
| Then it's AWS's fault and a ticket away (though I've only had
| to do this once in 5 years and > 150,000 CF stacks).
|
| I also would argue, if a stack deletion stalls development, you
| are probably using hard-coded stack names, which isn't wise.
| Throw in a "random" value like a commit or pipeline identifier.
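|
| A small sketch of that naming idea in Python. The `stack_name`
| helper and the 12-character suffix are made up for illustration;
| the constraints it enforces (letters, digits and hyphens only,
| starting with a letter, at most 128 characters) are
| CloudFormation's stack naming rules as I understand them.

```python
import re

MAX_STACK_NAME = 128  # assumed CloudFormation stack name length limit


def stack_name(base: str, build_id: str) -> str:
    """Join a base name and a per-build identifier into a valid stack name."""
    # Replace anything outside [A-Za-z0-9-] with a hyphen.
    suffix = re.sub(r"[^A-Za-z0-9-]", "-", build_id)[:12]
    prefix = re.sub(r"[^A-Za-z0-9-]", "-", base)
    if not prefix or not prefix[0].isalpha():
        raise ValueError("stack name must start with a letter")
    # Trim the base, not the suffix, so the unique part always survives.
    return f"{prefix[:MAX_STACK_NAME - len(suffix) - 1]}-{suffix}"
```

| Feeding it the commit SHA or pipeline ID means a stuck deletion
| of one stack never blocks creating the next one.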
|
| I've had far fewer issues with CF than terraform or ansible. I
| have yet to see CF break backward compatibility, while I had a
| nightmare day when I couldn't run any playbooks in ansible
| because the module had a new required parameter on a minor or
| patch version bump (which was when I called it quits on
| ansible, I then relooked at terraform, and decided to go
| native)
|
| I will caveat that our use case for AWS involves LOTS of
| creation and deletion, so I find it super helpful to manage my
| infrastructure in "stacks" that are created and deleted as a
| unit.. I don't need to worry about partial creations or
| deletions.. like ever... It basically never fails redoing
| known-working stuff... Only "first time" and usually because we
| follow least-privilege heavily
| HatchedLake721 wrote:
| I'm confused. Aren't Ansible and CloudFormation apples and
| oranges, with completely different use cases and purposes?
|
| One is a configuration management and deployment tool.
|
| The other one is a cloud resource provisioning service.
|
| They're meant to work in tandem, not one to replace another.
| mooreds wrote:
| I think Ansible has extensions which allow for managing
| infra such as AWS. See
| https://docs.ansible.com/ansible/latest/collections/amazon/a...
| for example.
| booleanbetrayal wrote:
| Yeah, importing existing resources into Cloudformation is a
| nightmare of "Am I going to break everything? _Fingers Crossed_".
|
| It is also very possible to get into very bad situations if your
| settings drift and you attempt to reconcile those changes.
| easton wrote:
| Something funny (well, kind of sad) about CloudFormation I
| noticed this summer was that if you deploy a CloudFormation stack
| which updates an ECS service and deploys tasks which then fail
| health checks, CloudFormation will do nothing about this and just
| let ECS keep killing and restarting tasks for.. well, at least
| several hours. You have to know to go into ECS and drain the
| tasks manually and then initiate a rollback from CF to get your
| service back into a good state. The bug reports about this I
| found were going back years.
|
| The upside is that I got really well acquainted with how ECS
| worked.
| fictionfuture wrote:
| I had this same bug!! Cost us like $1000 before we fixed it.
| tkahnoski wrote:
| 100x this. Prior company committed to doing Infrastructure as
| Code and CloudFormation worked well except for this hiccup. We
| didn't even have that many services on ECS but we probably had
| 1 ticket a week asking support to help us with a 'stuck' stack.
|
| We doubled down on our commitment to CloudFormation because it
| could handle containers, Lambda, and 95% of the other AWS
| services....
|
| However, in hindsight using SAM and the ECS CLI probably would
| have resulted in a more predictable CI/CD process as we weren't
| fighting deploy semantics through CloudFormation abstraction.
| cmaggiulli wrote:
| Writing Terraform scripts for AWS is 70% of my job. I do have
| some issues with the AWS provider in Terraform. Firstly, there
| are bugs. I ran into a bug a few days ago where the ARN attribute
| on a Lambda alias was resolving to the ARN of the Lambda, not its
| alias. I only figured it out because I found a GitHub Issue.
| Additionally, Hashicorp is often playing catch-up with Amazon. A
| few days ago AWS released a new instruction set architecture for
| Lambdas that would save my org a lot of money. However after I
| saw the announcement in AWS I see tons of different GitHub issues
| created to add this functionality. So I start editing my files
| based off the documentation only for that issue to be closed and
| pointed to a new one with different syntax. So I start working
| off the new syntax only for that issue to close and be pointed to
| a different one.
| robohoe wrote:
| That's right, don't use CloudFormation. Use CDK which will
| generate and obfuscate CF for you and you won't have to worry
| about it.
| Arelius wrote:
| I'm not sure I understand... Is obfuscating the CF a good
| thing?
| yjftsjthsd-h wrote:
| I'm pretty sure that was sarcasm. I disagree with said
| sarcasm, because CDK takes you one layer away from the actual
| thing that gets run but gives you a much nicer thing to work
| with so it can still be a good trade off; writing rust (or
| whatever) "obfuscates" the underlying CPU instructions but it
| still turns out to be a good idea.
| robohoe wrote:
| I will admit that troubleshooting permissions-related deployment
| issues in StackSets is a super nightmare-inducing event.
___________________________________________________________________
(page generated 2021-10-06 23:00 UTC)