https://hub.qovery.com/guides/engineering/terraform-not-the-golden-hammer/

Terraform is Not the Golden Hammer

Feedback on mixed usage (cloud providers, Kubernetes...)

type: engineering, technology: terraform

[deimosfr] Pierre M.
Pierre is an SRE and the CTO of Qovery. He has 15+ years of experience in R&D. From the financial to the ad-tech industry, he has built strong knowledge of distributed and highly reliable systems. He is also the author of the MariaDB High Performance book.

9 min read, updated Sep 17th, 2021

Contents
* Overview
* How we used Terraform
  + HCL
  + GitOps and team usage
  + tfstate
  + Helm management
* Problems facing
  + Heterogeneous resources management
  + (Too) Strong dependencies
  + No automatic reconciliation
  + Automatic import
* Advises and suggestion
  + Split
  + Outsource
* Conclusion

Terraform is probably the most widely used tool for deploying cloud services. It's a fantastic tool: easy to use, built around a declarative language (a DSL called HCL), team-oriented, and supporting tons of cloud providers. On paper, it's an attractive solution. And it's easy to start delegating more and more responsibilities to Terraform because it's like a Swiss Army knife: it knows how to perform many kinds of actions against many kinds of technologies.

Qovery is a platform that helps developers deploy their apps on their own cloud account in a few minutes. Before deploying an app, Qovery needs to deploy a few services on the cloud provider side, where the app code will be hosted. To do so, we decided to use Terraform. The main reasons are:

* Terraform is the industry standard for deploying cloud services.
* Qovery Engine is open source (https://github.com/Qovery/engine), and we wanted to use something that anyone could easily contribute to.
* Terraform is maintained by HashiCorp and by the cloud providers themselves, which gives us confidence in its quality and integration.

At the beginning of Qovery, we took shortcuts. We needed to go fast. Using Terraform as the golden hammer was our shortcut.

Based on our past experiences, we knew the golden hammer didn't exist. We've seen many companies struggle once they start needing customization. In the end, you pay the price for using ill-suited tools! So we were racing against the clock: we knew Terraform wouldn't fit in the mid to long run, but we did not know precisely when that would happen.

This article is an experience report explaining where, when, and how you should use Terraform.

#How we used Terraform

#HCL

The first thing to understand is how Terraform works. As mentioned earlier, it's a DSL, and the code looks like this:

    resource "scaleway_k8s_cluster" "kubernetes_cluster" {
      name    = var.kubernetes_cluster_name
      version = var.scaleway_ks_version
      cni     = "cilium"

      autoscaler_config {
        disable_scale_down          = true
        estimator                   = "binpacking"
        scale_down_delay_after_add  = "10m"
        balance_similar_node_groups = true
      }

      auto_upgrade {
        enable                        = true
        maintenance_window_day        = "any"
        maintenance_window_start_hour = 3
      }
    }

As you can see, it's easy to read and understand. Terraform supports AWS, DigitalOcean, Scaleway, and many other cloud providers.

#GitOps and team usage

You can put this kind of code in a git repository and work with your team members on the same codebase.

When you run terraform against the code you've written, it generates a tfstate file locally. This file records what Terraform has managed, so it can keep track of what it owns.

By default, Terraform is not set up for teamwork with parallel deployments. You will have to configure a remote backend (s3+dynamodb, for example) to store the tfstate file.
    terraform {
      backend "s3" {
        access_key     = "xxx"
        secret_key     = "xxx"
        bucket         = "xxx"
        key            = "xxx.tfstate"
        dynamodb_table = "xxx"
        region         = "xxx"
      }
    }

You'll then have a shared lock mechanism that prevents more than one person from applying a change to the same resources at the same time.

#tfstate

When you run terraform, it first refreshes the state, comparing what is actually deployed with what is stored in the tfstate file. This allows Terraform to perform changes only on what differs from the tfstate file. It's very efficient.

#Helm management

Terraform doesn't only know how to work with several cloud providers; it also knows how to talk to Kubernetes, Helm... the list is... HUGE! As you can see in the provider list (https://registry.terraform.io/browse/providers), there are more than 1.3k providers available!

So we were using it for Helm. Why? Because it's super useful to create something on a cloud provider (like an IAM account), get the results from Terraform, and directly inject them as Helm variables. To show how easy it is:

    # Create user and attach policy
    resource "aws_iam_user" "iam_eks_loki" {
      name = "qovery-logs-${var.kubernetes_cluster_id}"
      tags = local.tags_eks
    }

    resource "aws_iam_access_key" "iam_eks_loki" {
      user = aws_iam_user.iam_eks_loki.name
    }

    resource "aws_iam_policy" "loki_s3_policy" {
      name = aws_iam_user.iam_eks_loki.name
      description = "Policy for logs storage"
      policy = <