Terraform State Manipulation: Injecting Malicious Resource Definitions via Backend Write Access
Theory
Why This Matters
In 2023, multiple red team engagements documented a novel persistence and destruction technique: attackers who gained write access to an organisation's Terraform state backend modified the state file to mark critical resources — production RDS clusters, S3 buckets, and VPCs — as deleted. On the next scheduled terraform apply in the CI/CD pipeline, Terraform faithfully executed the diff and destroyed the production infrastructure. Recovery required hours of manual reconstruction. Separately, researchers demonstrated that injecting a malicious null_resource with a local-exec provisioner into state could cause arbitrary command execution on the Terraform runner on the next apply. Write access to Terraform state is not merely a data exposure — it is remote code execution on the infrastructure provisioning system.
Core Concept
Terraform state manipulation is the attack class targeting the state backend — the storage system where Terraform persists its view of managed infrastructure. The attack surface differs from state file reading (credential exposure): here the concern is an adversary with write access using state modification to cause harm on the next terraform apply.
Attack vectors from state write access:
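- Resource deletion injection: Modify the state to remove resource entries. Terraform no longer knows those resources exist, so it plans to create them on the next apply. That apply can fail on name collisions, create duplicates, or leave the live originals orphaned and unmanaged; where the recreated resource displaces a stateful one, data is lost.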
- Attribute poisoning: Change resource attributes in state to values that will cause Terraform to generate a replacement plan (destroy-then-create) for resources it believes have drifted. For stateful resources (databases, volumes), this destroys data.
- Provisioner injection: Some Terraform provider resources support connection and provisioner blocks. Injecting a local-exec provisioner into a resource's state representation can trigger code execution on the Terraform runner.
- Backend lock poisoning: The DynamoDB lock table for S3+DynamoDB backends uses a conditional write to prevent concurrent applies. An attacker who can write to the lock table can deny service to legitimate Terraform operations or hold the lock to prevent emergency remediation.
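The first two vectors leave a visible trace when two versions of the state file are compared. The following is a minimal sketch of such a comparison, assuming the version 4 state layout (a top-level `resources` list whose entries carry `mode`, `type`, `name`, `provider`, and `instances`, plus top-level `serial` and `lineage`). The function names and the sample resources are hypothetical and exist only for illustration.

```python
def resource_address(res):
    """Build an address such as module.db.aws_db_instance.prod from a state entry."""
    parts = []
    if res.get("module"):
        parts.append(res["module"])
    if res.get("mode") == "data":
        parts.append("data")
    parts += [res["type"], res["name"]]
    return ".".join(parts)


def index_resources(state):
    """Map resource address -> raw state entry."""
    return {resource_address(r): r for r in state.get("resources", [])}


def diff_states(old, new):
    """Report resources removed, added, or altered between two state documents."""
    old_idx, new_idx = index_resources(old), index_resources(new)
    common = set(old_idx) & set(new_idx)
    changed = sorted(
        addr for addr in common
        if old_idx[addr].get("instances") != new_idx[addr].get("instances")
        or old_idx[addr].get("provider") != new_idx[addr].get("provider")
    )
    return {
        "removed": sorted(set(old_idx) - set(new_idx)),
        "added": sorted(set(new_idx) - set(old_idx)),
        "changed": changed,
        "lineage_changed": old.get("lineage") != new.get("lineage"),
        "serial_regressed": new.get("serial", 0) < old.get("serial", 0),
    }


# Hypothetical sample data: a database entry disappears, a bucket attribute
# is altered, and an unexpected null_resource appears.
AWS = 'provider["registry.terraform.io/hashicorp/aws"]'
NULL = 'provider["registry.terraform.io/hashicorp/null"]'

old_state = {
    "version": 4, "serial": 42, "lineage": "aaaa-1111",
    "resources": [
        {"mode": "managed", "type": "aws_db_instance", "name": "prod",
         "provider": AWS, "instances": [{"attributes": {"id": "prod-db"}}]},
        {"mode": "managed", "type": "aws_s3_bucket", "name": "logs",
         "provider": AWS, "instances": [{"attributes": {"bucket": "company-logs"}}]},
    ],
}
new_state = {
    "version": 4, "serial": 43, "lineage": "aaaa-1111",
    "resources": [
        {"mode": "managed", "type": "aws_s3_bucket", "name": "logs",
         "provider": AWS, "instances": [{"attributes": {"bucket": "company-logs-2"}}]},
        {"mode": "managed", "type": "null_resource", "name": "maintenance",
         "provider": NULL, "instances": [{"attributes": {"id": "1"}}]},
    ],
}

report = diff_states(old_state, new_state)
print(report)
```

With bucket versioning enabled, the two inputs can be successive object versions of the state file, which makes this a cheap review step before a pipeline apply. A diff that shows removals or additions with no matching configuration change deserves investigation.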
Secure backend requirements for S3+DynamoDB:
- S3 bucket with SSE-KMS encryption, Block Public Access, versioning enabled, and access logging.
- Bucket policy allowing s3:GetObject, s3:PutObject, s3:DeleteObject only to named Terraform execution role ARNs.
- DynamoDB table for state locking with least-privilege write access for Terraform execution roles only.
- CloudTrail logging of all S3 object-level API calls on the state bucket for forensic audit.
Technical Deep-Dive
# Audit the Terraform state S3 backend bucket configuration
STATE_BUCKET="terraform-state-company-prod"

# Check versioning (required: enables recovery from malicious state modification)
aws s3api get-bucket-versioning --bucket "$STATE_BUCKET"

# Check server-side encryption
aws s3api get-bucket-encryption --bucket "$STATE_BUCKET"

# Check access logging
aws s3api get-bucket-logging --bucket "$STATE_BUCKET"

# Check bucket policy for overly broad write permissions
aws s3api get-bucket-policy --bucket "$STATE_BUCKET" \
  --query 'Policy' --output text | python3 -m json.tool

# Check DynamoDB lock table permissions
LOCK_TABLE="terraform-state-lock"
aws dynamodb describe-table --table-name "$LOCK_TABLE" \
  --query 'Table.{Status:TableStatus,Keys:KeySchema}'

# Check resource policy on DynamoDB table
aws dynamodb get-resource-policy --resource-arn \
  "arn:aws:dynamodb:us-east-1:ACCOUNT_ID:table/$LOCK_TABLE"

# Check CloudTrail for recent state file modifications.
# Note: lookup-events searches management events only. PutObject and DeleteObject
# are S3 data events, so they appear only if a trail logs data events for this
# bucket, and then in the trail's log destination rather than in this output.
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=ResourceName,AttributeValue="$STATE_BUCKET" \
  --start-time "2024-01-01" \
  --query 'Events[?contains(EventName, `PutObject`) || contains(EventName, `DeleteObject`)].{Time:EventTime,Actor:Username,Event:EventName,Resources:Resources}' \
  --output table

# Check for excessive IAM permissions on CI/CD role used for Terraform
CICD_ROLE="TerraformCICDRole"
aws iam simulate-principal-policy \
  --policy-source-arn "arn:aws:iam::ACCOUNT_ID:role/${CICD_ROLE}" \
  --action-names s3:PutObject s3:DeleteObject dynamodb:PutItem dynamodb:DeleteItem \
  --resource-arns \
    "arn:aws:s3:::${STATE_BUCKET}/*" \
    "arn:aws:dynamodb:us-east-1:ACCOUNT_ID:table/${LOCK_TABLE}" \
  --query 'EvaluationResults[*].{Action:EvalActionName,Decision:EvalDecision}' \
  --output table
# Secure Terraform backend configuration (terraform block in main.tf)
terraform {
  backend "s3" {
    bucket         = "terraform-state-company-prod"
    key            = "env/prod/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    kms_key_id     = "arn:aws:kms:us-east-1:ACCOUNT_ID:key/KEY_ID"
    dynamodb_table = "terraform-state-lock"
    # Access controlled by bucket policy; no static credentials here
  }
}
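The JSON returned by the get-bucket-versioning, get-bucket-encryption, and get-bucket-logging calls in the audit commands can be turned into findings automatically. This is a small sketch that assumes each response has been saved and parsed into a dictionary (an empty dictionary when the CLI prints nothing, as it does for a bucket that never had versioning or logging configured). The function name and finding strings are illustrative.

```python
def backend_findings(versioning, encryption, logging):
    """Turn parsed S3 API responses into a list of human-readable findings."""
    findings = []

    # get-bucket-versioning returns {"Status": "Enabled"} when versioning is on.
    if versioning.get("Status") != "Enabled":
        findings.append("versioning is not enabled")

    # get-bucket-encryption lists default-encryption rules; SSE-KMS shows as "aws:kms".
    rules = encryption.get("ServerSideEncryptionConfiguration", {}).get("Rules", [])
    algorithms = [
        rule.get("ApplyServerSideEncryptionByDefault", {}).get("SSEAlgorithm", "")
        for rule in rules
    ]
    if not any(alg.startswith("aws:kms") for alg in algorithms):
        findings.append("default encryption is not SSE-KMS")

    # get-bucket-logging includes a LoggingEnabled block only when logging is on.
    if "LoggingEnabled" not in logging:
        findings.append("server access logging is not enabled")

    return findings


# Hypothetical responses for a weakly configured bucket.
weak = backend_findings(
    versioning={},
    encryption={"ServerSideEncryptionConfiguration": {"Rules": [
        {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}},
    logging={},
)

# Hypothetical responses for a bucket that meets the requirements.
strong = backend_findings(
    versioning={"Status": "Enabled"},
    encryption={"ServerSideEncryptionConfiguration": {"Rules": [
        {"ApplyServerSideEncryptionByDefault": {
            "SSEAlgorithm": "aws:kms",
            "KMSMasterKeyID": "arn:aws:kms:us-east-1:ACCOUNT_ID:key/KEY_ID"}}]}},
    logging={"LoggingEnabled": {"TargetBucket": "access-logs", "TargetPrefix": "tfstate/"}},
)

print(weak)
print(strong)
```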
Security Assessment Methodology
- Identify the state backend. Read the Terraform configuration for backend blocks. Identify whether local, S3, Terraform Cloud, or another backend is used. Local backends are automatic high findings for any shared or CI/CD deployment.
- Assess state bucket access controls. Check the bucket policy and IAM policies for who has s3:PutObject on the state bucket. Any principal beyond the specific Terraform execution role(s) should be documented as a finding.
- Verify encryption and versioning. Confirm SSE-KMS (not SSE-S3) encryption is enabled, because SSE-KMS provides a separate access control layer via the KMS key policy. Confirm versioning is enabled; without versioning, malicious state modification cannot be rolled back.
- Check CI/CD pipeline access. Identify the role or user running terraform apply in the pipeline. Does it have broader permissions than required? Terraform execution roles often accumulate excessive IAM permissions because Terraform needs to create resources. Audit against actual resource creation requirements.
- Test for state lock bypass. Attempt to delete a DynamoDB lock table item directly with the assessment principal's credentials. If successful, an attacker could clear a lock and run a concurrent apply during a sensitive operation window.
- Remediate by enabling state bucket versioning and access logging, applying a bucket policy restricting write access to named Terraform execution role ARNs, enabling SSE-KMS encryption with a key policy that requires MFA for kms:Decrypt, and alerting on unexpected PutObject API calls to the state bucket via CloudWatch/CloudTrail.
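The alerting in the final remediation step can be prototyped offline against exported CloudTrail records before it is wired into CloudWatch. The sketch below assumes the records come from a trail that logs S3 data events for the state bucket, and that each record follows the usual CloudTrail layout (`eventName`, `userIdentity`, `requestParameters.bucketName`, `requestParameters.key`). The function name, role ARN, and sample records are hypothetical.

```python
WATCHED_EVENTS = {"PutObject", "DeleteObject"}


def unexpected_state_writes(records, bucket, allowed_role_arns):
    """Return write events on the state bucket made by principals not on the allow-list."""
    hits = []
    for rec in records:
        if rec.get("eventName") not in WATCHED_EVENTS:
            continue
        params = rec.get("requestParameters") or {}
        if params.get("bucketName") != bucket:
            continue
        identity = rec.get("userIdentity") or {}
        # For assumed roles, the session issuer holds the underlying role ARN.
        issuer = (identity.get("sessionContext") or {}).get("sessionIssuer") or {}
        actor = issuer.get("arn") or identity.get("arn", "<unknown>")
        if actor not in allowed_role_arns:
            hits.append({
                "time": rec.get("eventTime"),
                "actor": actor,
                "event": rec["eventName"],
                "key": params.get("key"),
            })
    return hits


# Hypothetical records: one write by the pipeline role, one by an IAM user.
TF_ROLE = "arn:aws:iam::111122223333:role/TerraformCICDRole"
sample_records = [
    {"eventTime": "2024-03-01T10:00:00Z", "eventName": "PutObject",
     "userIdentity": {
         "type": "AssumedRole",
         "arn": "arn:aws:sts::111122223333:assumed-role/TerraformCICDRole/ci-run-1",
         "sessionContext": {"sessionIssuer": {"arn": TF_ROLE}}},
     "requestParameters": {"bucketName": "terraform-state-company-prod",
                           "key": "env/prod/terraform.tfstate"}},
    {"eventTime": "2024-03-01T23:41:00Z", "eventName": "PutObject",
     "userIdentity": {"type": "IAMUser",
                      "arn": "arn:aws:iam::111122223333:user/alice"},
     "requestParameters": {"bucketName": "terraform-state-company-prod",
                           "key": "env/prod/terraform.tfstate"}},
]

alerts = unexpected_state_writes(
    sample_records, "terraform-state-company-prod", {TF_ROLE})
print(alerts)
```

Matching on the session issuer rather than the session ARN keeps the allow-list stable across pipeline runs, since each run assumes the role under a different session name.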
Common Assessment Errors
- Treating state access as read-only risk only. The most dangerous state backend misconfiguration is write access, not read access. An attacker with write access can cause infrastructure destruction on the next apply. Always assess and report write permissions separately from read permissions.
- Missing the DynamoDB lock table as a denial-of-service vector. An attacker who can write to the DynamoDB lock table can hold the lock indefinitely, preventing all Terraform operations including emergency remediation. This is a distinct finding from state file access.
- Ignoring Terraform Cloud and other non-S3 backends. Many organisations use Terraform Cloud, Spacelift, or Atlantis. These backends have their own access control models. An assessment that only looks for S3 state buckets will miss these entirely.
- Not accounting for state in git history. Some teams used local state and committed terraform.tfstate to git before migrating to a remote backend. The old state, with credentials from the time of migration, may still exist in git history even if the current backend is secure.
- Overlooking workspace isolation. Terraform workspaces store state at different key paths in the same backend bucket. If a development workspace's state has broader access than production, an attacker who compromises the dev Terraform role can read dev state (which may contain dev credentials that are reused in prod).
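For the workspace isolation point, it helps to know exactly which object keys each workspace uses, so that IAM and bucket policy resources can be scoped per workspace instead of to the whole bucket. The sketch below assumes the S3 backend's default `workspace_key_prefix` of `env:`, under which a non-default workspace stores its state at `env:/<workspace>/<key>`. The helper names and the workspace names are illustrative.

```python
def state_object_key(key, workspace="default", workspace_key_prefix="env:"):
    """Return the S3 object key that holds the state for a given workspace."""
    if workspace == "default":
        return key
    return f"{workspace_key_prefix}/{workspace}/{key}"


def state_object_arn(bucket, key, workspace="default", workspace_key_prefix="env:"):
    """Return the object ARN to use as a policy Resource for one workspace's state."""
    object_key = state_object_key(key, workspace, workspace_key_prefix)
    return f"arn:aws:s3:::{bucket}/{object_key}"


# Hypothetical workspaces sharing one backend bucket.
bucket = "terraform-state-company-prod"
key = "terraform.tfstate"

arns = {ws: state_object_arn(bucket, key, ws) for ws in ("default", "dev", "prod")}
for workspace, arn in arns.items():
    print(workspace, arn)
```

A dev execution role whose policy names only the dev ARN cannot overwrite the prod state object, even though both live in the same bucket.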
NICE Framework Alignment
| Code | Knowledge/Skill/Task Statement | How This Card Develops It |
|---|---|---|
| K0053 | Knowledge of security risk management processes | Assessing Terraform state write-access risk: the ability to trigger infrastructure destruction or code execution via state manipulation is a critical supply-chain threat |
| K0167 | Knowledge of system administration, network, and OS hardening techniques | Hardening Terraform state backends: SSE-KMS, versioning, access logging, least-privilege bucket policies, and DynamoDB lock table access controls |
| S0073 | Skill in conducting vulnerability scans and recognizing vulnerabilities | Auditing S3 backend bucket policies, DynamoDB lock table permissions, and CI/CD role IAM policies to identify state manipulation attack paths |
| T0144 | Conduct penetration testing as required for new or updated applications | Testing write access to Terraform state backends during infrastructure assessments to demonstrate potential for state poisoning and infrastructure destruction |
| T0395 | Write code to address security vulnerabilities | Writing secure Terraform backend configurations with KMS encryption, bucket policy access controls, and versioning to prevent state manipulation |
Further Reading
- Terraform State Security — HashiCorp Documentation (developer.hashicorp.com/terraform/language/state)
- Attacking Terraform State Files — Christophe Tafani-Dereeper, Datadog Security Labs blog
- Infrastructure as Code Security Scanning with Checkov — Bridgecrew Documentation (checkov.io)
Challenge Lab
Reinforce your learning with a generated challenge based on this card's skill.