Infrastructure as Code: Terraform Fundamentals

The first time you set up cloud infrastructure, you click through a console. The fifth time, you wonder why nobody ever wrote down the steps. The fiftieth time, you realise that "writing down the steps" is not the answer — the answer is treating the infrastructure itself as code, the same way you treat application code: versioned, reviewed, deployed automatically.

Terraform is the dominant tool for this. This article walks through the fundamentals: providers, resources, state, and the patterns that work for real teams.

Why infrastructure as code

Three concrete problems that solve themselves once your infrastructure is in code:

  • Reproducibility. Spinning up a new environment (staging, dev, disaster recovery) is running a command, not a week of console clicking.
  • Auditability. What changed? Who changed it? Why? Git log answers these for code; without IaC, infrastructure changes are scattered across console sessions and hard to trace.
  • Drift detection. Someone clicks something in the console; reality diverges from the documented setup. With IaC, the diff is visible the next time someone runs terraform plan.

The Terraform model

flowchart LR
    Files[".tf files<br/>declarative config"] --> Plan["terraform plan<br/>compute diff"]
    State[("state file<br/>current reality")] --> Plan
    Cloud[("actual cloud")] -.->|refresh| State
    Plan --> Apply["terraform apply<br/>execute diff"]
    Apply --> Cloud
    Apply --> State

Terraform's loop: declared config compared against state, diff applied to the cloud, state updated.

Terraform's job: take the desired state from your .tf files, compare it to the current state (stored in a state file), compute a diff, and execute the changes against the cloud provider's API.

The minimum viable example

// main.tf
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "example" {
  bucket = "my-app-uploads-${var.environment}"

  tags = {
    Environment = var.environment
    Managed_by  = "terraform"
  }
}

variable "environment" {
  type    = string
  default = "dev"
}

output "bucket_name" {
  value = aws_s3_bucket.example.bucket
}

Run terraform init once per working directory (and again whenever you change providers, modules, or the backend), then terraform plan to see what will happen, then terraform apply to execute. Three commands, infrastructure created. terraform destroy tears it all down again.

The state file

Terraform's state is a JSON file recording what resources exist and their attributes. By default it lives in your project as terraform.tfstate. This is the most important Terraform concept and the source of most pain.

  • Never commit state to git. It stores secrets in plain text (RDS passwords, etc.) and is updated frequently in ways that produce massive diffs.
  • Use a remote backend. S3 with a DynamoDB lock table is the long-standing standard for AWS (recent Terraform releases can also lock natively in S3 via use_lockfile). Terraform Cloud / HCP Terraform is the managed alternative. For Azure, use Blob Storage; for GCP, a GCS bucket.
  • Lock during applies. Two engineers running apply at the same time can corrupt state. The lock prevents this.
  • Back up. Losing the state file means Terraform thinks no resources exist; the next plan tries to create everything again. S3 versioning + cross-region replication is cheap insurance.

// backend.tf
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "production/main.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-lock"
    encrypt        = true
  }
}
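
The bucket and lock table named in backend.tf have to exist before terraform init can use them. A minimal sketch of how they might be created from a separate bootstrap configuration follows; the resource labels (tf_state, tf_lock) and the pay-per-request billing mode are assumptions chosen for illustration, and cross-region replication is left out for brevity.

```hcl
# Bootstrap sketch (assumed layout): run once from its own configuration,
# then point backend.tf at the resulting bucket and table.
resource "aws_s3_bucket" "tf_state" {
  bucket = "mycompany-terraform-state"

  # Refuse any plan that would delete the state bucket.
  lifecycle {
    prevent_destroy = true
  }
}

# Versioning keeps earlier copies of the state file recoverable.
resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# The S3 backend's DynamoDB locking expects a string hash key named LockID.
resource "aws_dynamodb_table" "tf_lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

The bootstrap configuration itself usually keeps local state or is migrated into the bucket afterwards; which you choose is a team decision rather than a Terraform requirement.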

Modules

A module is a reusable Terraform package. Once you have written a VPC config that you like, wrap it in a module and you can instantiate identical VPCs in multiple environments.

// modules/vpc/main.tf
variable "cidr_block" { type = string }
variable "name"       { type = string }

resource "aws_vpc" "this" {
  cidr_block = var.cidr_block
  tags       = { Name = var.name }
}

output "id" { value = aws_vpc.this.id }

// callers/main.tf
// Local module paths are resolved relative to the directory of the calling file.
module "prod_vpc" {
  source     = "../modules/vpc"
  cidr_block = "10.0.0.0/16"
  name       = "prod"
}

module "staging_vpc" {
  source     = "../modules/vpc"
  cidr_block = "10.1.0.0/16"
  name       = "staging"
}

The official Terraform Registry hosts thousands of community modules for common patterns (AWS VPC, EKS cluster, GCP networks). For most use cases, a well-maintained registry module is better than rolling your own.
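
As an illustration of consuming a registry module, here is a sketch using the community terraform-aws-modules/vpc/aws module. The version constraint, availability zones, and subnet ranges are assumptions picked for the example; check the module's own documentation for the inputs your version supports.

```hcl
# Sketch: a VPC from the community registry module instead of a hand-rolled one.
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0" # assumed constraint; pin to a release you have tested

  name = "staging"
  cidr = "10.1.0.0/16"

  # Illustrative values only.
  azs             = ["us-east-1a", "us-east-1b"]
  private_subnets = ["10.1.1.0/24", "10.1.2.0/24"]
  public_subnets  = ["10.1.101.0/24", "10.1.102.0/24"]
}

output "vpc_id" {
  value = module.vpc.vpc_id
}
```

Run terraform init after adding the block so the module is downloaded.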

Variables and environments

The standard pattern: one Terraform configuration, multiple .tfvars files for different environments.

// terraform.tfvars (loaded automatically)
environment    = "dev"
instance_count = 1

// production.tfvars
environment    = "production"
instance_count = 3

// Apply with the production values (they override terraform.tfvars)
terraform apply -var-file=production.tfvars

For more complex setups (per-environment differences in resources, not just values), use Terraform workspaces or separate root configurations per environment with shared modules.
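
The values in those .tfvars files need matching variable declarations. A sketch of what they could look like, with validation rules so a typo in a .tfvars file fails at plan time, is below. The allowed environment names and the lower bound on instance_count are assumptions, not something the configuration above prescribes.

```hcl
# variables.tf (hypothetical file name)
variable "environment" {
  type        = string
  description = "Deployment environment name."
  default     = "dev"

  validation {
    # Assumed set of environments; adjust to your own.
    condition     = contains(["dev", "staging", "production"], var.environment)
    error_message = "The environment value must be one of dev, staging, or production."
  }
}

variable "instance_count" {
  type        = number
  description = "Number of instances to run."
  default     = 1

  validation {
    condition     = var.instance_count >= 1
    error_message = "The instance_count value must be at least 1."
  }
}
```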

Common pitfalls

  • Manually changing things in the console. Terraform sees the drift on the next plan and proposes resetting the resource to its declared state. Either embrace IaC fully or accept that manual changes will be reverted on the next apply.
  • Cyclic dependencies. Resource A depends on B, B depends on A. Terraform refuses to plan. Restructure with intermediate resources or split into multiple apply phases.
  • Sensitive values in plan output. Set sensitive = true on variables and outputs that carry passwords or tokens. This redacts them in CLI output only; the values are still stored in state.
  • Using count when you should use for_each. count indexes by integer; removing the middle item shifts everything else, causing destruction. for_each indexes by key; safer.
  • Hardcoding region or account. Use variables. Deploying to a different region then becomes a one-line variable change instead of a search-and-replace across files.
  • Forgetting to format. terraform fmt normalises whitespace. Run it before committing; saves diff noise.
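
To make the count versus for_each pitfall concrete, here is a small sketch. The variable name bucket_names, the resource label per_name, and the bucket naming scheme are hypothetical.

```hcl
# Each bucket is tracked under its key, e.g. aws_s3_bucket.per_name["logs"].
variable "bucket_names" {
  type    = set(string)
  default = ["uploads", "logs", "backups"]
}

resource "aws_s3_bucket" "per_name" {
  for_each = var.bucket_names

  # each.key is the set element itself.
  bucket = "my-app-${each.key}"
}
```

Removing "logs" from the set plans the destruction of that one bucket only. With count over a list, the same removal would shift "backups" from index 2 to index 1 and Terraform would plan to replace it.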

Terraform vs alternatives

  • Pulumi: infrastructure as actual code (Python, TypeScript, Go) instead of HCL. Stronger for complex logic; smaller community than Terraform.
  • AWS CloudFormation: AWS-native. Tightly integrated with AWS, no other clouds. Verbose YAML.
  • CDK (AWS, Terraform CDK): code-based wrappers that generate CloudFormation or Terraform. Best of both worlds for some teams.
  • OpenTofu: open-source fork of Terraform after HashiCorp's license change. API-compatible. Rapidly maturing.

For new projects today: Terraform or OpenTofu. The community, registry, and tutorials are richest there. Pulumi is a credible alternative if your team strongly prefers writing code over HCL.

Production patterns

  • Run Terraform in CI, not locally. Engineers propose changes via PRs; CI runs plan on PR; CI runs apply on merge. Atlantis or Terraform Cloud automates this.
  • Separate state per environment. Different state files for dev, staging, prod. A bad apply in dev should not touch prod.
  • Modules in their own repository. Reusable modules versioned independently from the consuming code. Pin to specific versions in callers.
  • Policy as code. Sentinel (HashiCorp) or OPA, alongside linters such as tflint, catch expensive or dangerous changes before apply (e.g., "no public S3 buckets", "no instances larger than m5.4xlarge").
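
For the "modules in their own repository" pattern, pinning might look like the sketch below. The repository URL, the vpc subdirectory, and the v1.2.0 tag are hypothetical placeholders.

```hcl
# Sketch: consume a module from a separate git repository at a fixed tag.
module "prod_vpc" {
  # "//vpc" selects a subdirectory of the repo; "?ref=" pins a tag or commit.
  source = "git::https://example.com/mycompany/terraform-modules.git//vpc?ref=v1.2.0"

  cidr_block = "10.0.0.0/16"
  name       = "prod"
}
```

Upgrading the module then becomes an explicit, reviewable change to the ref rather than something that happens whenever the module repository moves.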

Frequently Asked Questions

What about destroying production?

The classic horror story. Always run terraform plan first; never use -auto-approve on prod; protect critical resources with prevent_destroy lifecycle rules.
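
A minimal sketch of the prevent_destroy rule, applied to a bucket like the one from the earlier example (the bucket name here is illustrative):

```hcl
resource "aws_s3_bucket" "example" {
  bucket = "my-app-uploads-production"

  lifecycle {
    # Any plan that would destroy this resource fails with an error.
    prevent_destroy = true
  }
}
```

One caveat: the rule lives in the resource block, so it does not help if the whole block is deleted from the configuration. Code review remains the backstop for that case.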

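Is Terraform free?

The CLI is free to use. Since HashiCorp's license change it is source-available under the Business Source License (BSL) rather than open source; OpenTofu is the fully open-source fork. Terraform Cloud / HCP Terraform are hosted commercial services for managed state and team collaboration.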

How do I migrate existing manually-created resources to Terraform?

terraform import brings existing resources under Terraform's management. Tedious but works. The community tool terraformer automates the import process for many providers.
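
Terraform 1.5 and later also accept import blocks in configuration, which lets an import go through the normal plan and review flow. A sketch, assuming a hand-created bucket called existing-bucket-name and a hypothetical resource label legacy:

```hcl
# Declare the import; for S3 buckets the import id is the bucket name.
import {
  to = aws_s3_bucket.legacy
  id = "existing-bucket-name"
}

# The resource block the imported object will be bound to.
resource "aws_s3_bucket" "legacy" {
  bucket = "existing-bucket-name"
}
```

terraform plan then shows the import alongside any other changes, and terraform apply records the resource in state. The import block can be removed once the apply has succeeded.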

Should I learn Terraform or Pulumi?

Terraform first: it has the larger community, better documentation, and more registry modules. Pulumi is worth knowing if your team is highly TypeScript or Python-fluent and finds HCL frustrating.

How does Terraform compare to Ansible?

Different tools for different jobs. Terraform creates infrastructure (VPCs, EC2 instances, RDS databases). Ansible configures running systems (install packages, edit config files, restart services). Many teams use both: Terraform to provision the VM, Ansible to set up what runs on it.
