terraform-best-practices

Terraform Best Practices Skill

Comprehensive community best practices for Terraform infrastructure as code, based on Anton Babenko's widely-adopted guide at terraform-best-practices.com.

When to Use This Skill

Activate this skill when:
  • Designing project structure - Choosing how to organize Terraform code for small, medium, or large infrastructure
  • Implementing IaC patterns - Following community best practices for modules, compositions, and state management
  • Scaling infrastructure - Growing from simple setups to complex multi-environment deployments
  • Evaluating tools - Deciding between vanilla Terraform vs Terragrunt orchestration
  • Establishing standards - Creating team conventions for naming, styling, and code organization
  • Troubleshooting common issues - Resolving frequent Terraform problems (dependency hell, state management, etc.)

Key Concepts

Infrastructure Sizes & Patterns

Small Infrastructure (< 20 resources)
  • Single Terraform directory
  • Minimal module structure
  • Direct resource definitions
  • Simple state management
Medium Infrastructure (20-100 resources)
  • Multiple environment directories
  • Reusable modules
  • Remote state backend
  • Workspaces or directory-based environments
Large Infrastructure (100+ resources)
  • Module composition approach
  • Terragrunt for orchestration
  • Hierarchical state structure
  • Infrastructure vs resource modules

Module Types

Resource Modules
  • Create individual AWS/Azure/GCP resources
  • Highly reusable across projects
  • Published to registries
  • Example: terraform-aws-modules/vpc/aws
Infrastructure Modules
  • Combine resource modules
  • Environment-specific configurations
  • Less portable, more opinionated
  • Example: Company VPC + security groups + bastion
Compositions
  • Top-level infrastructure assembly
  • Orchestrate multiple modules
  • Environment-specific values
  • No reusable logic, only wiring

Code Structure Patterns

Small Infrastructure

terraform/
  main.tf
  variables.tf
  outputs.tf
  terraform.tfvars

Medium Infrastructure

terraform/
  modules/
    vpc/
    compute/
  environments/
    dev/
    prod/

Large Infrastructure (Terragrunt)

infrastructure/
  _global/
  dev/
    vpc/
      terragrunt.hcl
    compute/
      terragrunt.hcl
  prod/
    vpc/
    compute/

Naming Conventions

Resource Naming

Pattern: {project}-{environment}-{resource-type}-{name}

resource "aws_s3_bucket" "main" {
  bucket = "myapp-prod-data-customer-uploads"
}

Pattern: name the resource "this" when it is the only resource of its type

resource "aws_security_group" "this" {
  name = "${var.project}-${var.environment}-app"
}

Variable Naming

  • Use snake_case: instance_type, vpc_cidr_block
  • Boolean prefix with enable_ or create_: enable_monitoring, create_vpc
  • Plural for lists: subnet_ids, availability_zones
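
The three naming rules above can be sketched in a single variables.tf. The descriptions and defaults here are illustrative assumptions, not values from the source guide:

```hcl
# Illustrative variables.tf; descriptions and defaults are assumptions.

# snake_case name
variable "instance_type" {
  description = "EC2 instance type for application servers"
  type        = string
  default     = "t3.micro"
}

# Boolean with an enable_ prefix
variable "enable_monitoring" {
  description = "Whether to enable detailed monitoring"
  type        = bool
  default     = false
}

# Plural name for a list
variable "subnet_ids" {
  description = "IDs of the subnets to deploy into"
  type        = list(string)
}
```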

File Organization

  • main.tf - Primary resource definitions
  • variables.tf - Input variables
  • outputs.tf - Output values
  • versions.tf - Provider and Terraform version constraints
  • data.tf - Data sources (optional)
  • locals.tf - Local values (optional)
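
As a rough sketch of what the two optional files might hold: the AMI filter below is an assumption (Canonical's owner ID with an Ubuntu 22.04 name pattern), and name_prefix and common_tags are hypothetical local names. The data source is the kind that expressions such as data.aws_ami.ubuntu.id would refer to.

```hcl
# data.tf (assumed AMI filter; adjust the name pattern to the image you need)
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical's AWS account ID

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

# locals.tf (hypothetical local names)
locals {
  name_prefix = "${var.project}-${var.environment}"

  common_tags = {
    Environment = var.environment
    ManagedBy   = "Terraform"
  }
}
```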

Code Styling Best Practices

Formatting

Use terraform fmt

Group related settings

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = var.instance_type

  tags = {
    Name        = "${var.project}-web"
    Environment = var.environment
    ManagedBy   = "Terraform"
  }
}

Align equals signs in blocks

variable "instance_config" {
  type = object({
    instance_type = string
    volume_size   = number
    volume_type   = string
  })
}

Module Structure

versions.tf - Pin versions

terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

variables.tf - Document everything

variable "vpc_cidr" {
  description = "CIDR block for VPC"
  type        = string
  default     = "10.0.0.0/16"

  validation {
    condition     = can(cidrhost(var.vpc_cidr, 0))
    error_message = "Must be valid IPv4 CIDR."
  }
}

State Management

Backend Configuration

Use remote state for team collaboration

terraform {
  backend "s3" {
    bucket         = "myapp-terraform-state"
    key            = "prod/vpc/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}

State Best Practices

  • Never commit .tfstate files to version control (they contain plaintext secrets)
  • Use remote backend (S3, Azure Storage, GCS) with locking
  • CRITICAL: State files contain sensitive data (passwords, keys, IPs)
    • Enable versioning on backend storage for rollback capability
    • Restrict access via IAM policies (least privilege principle)
    • Consider using sensitive = true for sensitive outputs
  • Separate state files by environment and component
  • Use state file encryption at rest (AES-256)
  • Implement state file backups and disaster recovery procedures
  • Use terraform_remote_state data source for cross-stack references
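
Two of the points above, sensitive outputs and cross-stack references, might look roughly like this. The aws_db_instance.this resource is hypothetical, and the sketch assumes the VPC stack declares a root-level output named vpc_id. Note that sensitive = true only redacts the value in CLI output; it is still stored in plaintext in the state file.

```hcl
# Producing stack: outputs.tf (aws_db_instance.this is a hypothetical resource)
output "db_password" {
  description = "Master password for the application database"
  value       = aws_db_instance.this.password
  sensitive   = true # redacted in plan/apply output, still present in state
}

# Consuming stack: read another stack's root outputs from its remote state
data "terraform_remote_state" "vpc" {
  backend = "s3"

  config = {
    bucket = "myapp-terraform-state"
    key    = "prod/vpc/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_security_group" "this" {
  name   = "${var.project}-${var.environment}-app"
  vpc_id = data.terraform_remote_state.vpc.outputs.vpc_id # assumes a vpc_id output exists
}
```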

Terraform vs Terragrunt

When to Use Vanilla Terraform

  ✅ Small to medium infrastructure (< 50 resources)
  ✅ Single cloud provider
  ✅ Few environments (dev/prod)
  ✅ Team comfortable with DRY through modules

When to Use Terragrunt

  ✅ Large infrastructure (100+ resources)
  ✅ Many environments (dev/staging/prod/dr)
  ✅ Deep directory hierarchies
  ✅ Need for inheritance and composition
  ✅ Complex dependency orchestration

Terragrunt Benefits

  • DRY backend configuration
  • Dependency orchestration
  • Variable inheritance
  • Before/after hooks
  • Auto-init and auto-retry
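
A minimal sketch of the first three benefits, assuming a root terragrunt.hcl at the top of the tree and one child per component. The bucket and table names reuse the backend example from this document; the inputs are hypothetical. Newer Terragrunt releases recommend naming the root file root.hcl and passing that name to find_in_parent_folders, so treat the file naming here as an assumption.

```hcl
# --- infrastructure/terragrunt.hcl (root): backend defined once ---
remote_state {
  backend = "s3"

  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }

  config = {
    bucket         = "myapp-terraform-state"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}

# --- infrastructure/dev/compute/terragrunt.hcl (child) ---
include "root" {
  path = find_in_parent_folders()
}

# Terragrunt applies the vpc unit first and exposes its outputs here
dependency "vpc" {
  config_path = "../vpc"
}

inputs = {
  environment = "dev"                             # hypothetical input
  vpc_id      = dependency.vpc.outputs.vpc_id     # assumes the vpc unit outputs vpc_id
}
```

Each child gets its own state key derived from its directory path, which is what keeps the backend configuration DRY.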

Recommended Tools

Essential

  • terraform - Core IaC tool
  • terraform fmt - Code formatter (built-in)
  • terraform validate - Syntax validator (built-in)

Quality & Linting

  • tflint - Terraform linter with provider-specific rules
  • tfsec - Security scanner for Terraform code
  • checkov - Policy-as-code scanner
  • terraform-docs - Auto-generate documentation

Version Management

  • tfenv - Terraform version manager (like nvm for Node)
  • tgenv - Terragrunt version manager

Workflow Automation

  • pre-commit-terraform - Git hooks for quality gates
  • Atlantis - Pull request automation for Terraform
  • Infracost - Cost estimation in PRs

Orchestration

  • Terragrunt - DRY orchestration wrapper
  • Terramate - Stack orchestration and code generation

Common Patterns

Multi-Environment Setup

environments/dev/main.tf

module "infrastructure" {
  source = "../../modules/infrastructure"

  environment    = "dev"
  instance_type  = "t3.micro"
  instance_count = 1
}

environments/prod/main.tf

module "infrastructure" {
  source = "../../modules/infrastructure"

  environment    = "prod"
  instance_type  = "t3.large"
  instance_count = 3
}

Module Composition

modules/infrastructure/main.tf

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "${var.project}-${var.environment}"
  cidr = var.vpc_cidr
}

module "security_group" {
  source  = "terraform-aws-modules/security-group/aws"
  version = "~> 5.0"

  name   = "${var.project}-${var.environment}-app"
  vpc_id = module.vpc.vpc_id
}

Conditional Resources

hcl
resource "aws_instance" "bastion" {
  count = var.create_bastion ? 1 : 0

  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
}

Access with: aws_instance.bastion[0]
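
One caveat worth sketching: the [0] index fails when create_bastion is false, because the resource list is then empty. A common way around this (not taken from the source guide) is the built-in one() function, which returns null for an empty list. The output name below is hypothetical.

```hcl
output "bastion_public_ip" {
  description = "Public IP of the bastion host, or null when it is not created"
  # The splat yields a list of zero or one elements; one() unwraps it safely.
  value = one(aws_instance.bastion[*].public_ip)
}
```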

Frequent Terraform Problems (FTP)

Dependency Hell

Problem: Circular dependencies between modules, version conflicts
Solution:
  • Pin provider and module versions explicitly
  • Use Dependabot for automated updates
  • Implement testing for version upgrades
  • Avoid cross-module dependencies; use data sources instead
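
The last point, looking infrastructure up through data sources instead of wiring one module's outputs into another, could be sketched as follows. It assumes the VPC carries a Name tag following the naming pattern used in this document; the Tier tag is hypothetical.

```hcl
# Look up the VPC by tag instead of depending on the module that created it
data "aws_vpc" "selected" {
  tags = {
    Name = "${var.project}-${var.environment}"
  }
}

# Find subnets in that VPC (the Tier tag is a hypothetical convention)
data "aws_subnets" "private" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.selected.id]
  }

  tags = {
    Tier = "private"
  }
}

# Usage elsewhere: subnet_ids = data.aws_subnets.private.ids
```

The trade-off is that the lookup depends on tagging discipline rather than on an explicit output, so it fits best where tags are enforced.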

State Lock Issues

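Problem: "Error acquiring state lock"
Solution:
  • Implement DynamoDB table for S3 backend locking
  • Use terraform force-unlock cautiously, and only after confirming that no other run still holds the lock
  • Never delete .terraform.lock.hcl (this is the provider dependency lock file, not the state lock; keep it committed to version control)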
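
A lock table for the S3 backend needs a string partition key named LockID. A minimal sketch, reusing the table name from the backend example in this document; the resource label and billing mode are assumptions:

```hcl
resource "aws_dynamodb_table" "terraform_state_lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST" # assumed; lock traffic is low and bursty
  hash_key     = "LockID"          # key name the S3 backend expects

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

This table is usually created in a small bootstrap configuration, separate from the stacks whose state it locks.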

Resource Drift

Problem: Manual changes outside Terraform
Solution:
  • Run terraform plan regularly in CI
  • Use terraform plan -refresh-only to detect drift (the standalone terraform refresh command is deprecated)
  • Implement policy-as-code (OPA, Sentinel)
  • Restrict manual changes via IAM policies

Count vs For_Each

Problem: Changing count causes resource recreation
Solution:
  • Prefer for_each with maps for stable resources
  • Use count only for simple on/off toggles

Bad - index changes cause recreation

resource "aws_subnet" "example" {
  count      = length(var.azs)
  vpc_id     = var.vpc_id # assumed input; aws_subnet requires a VPC ID
  cidr_block = cidrsubnet(var.vpc_cidr, 8, count.index)
}

Good - stable keys prevent recreation with explicit mapping

locals {
  az_cidrs = {
    "us-east-1a" = cidrsubnet(var.vpc_cidr, 8, 0)
    "us-east-1b" = cidrsubnet(var.vpc_cidr, 8, 1)
    "us-east-1c" = cidrsubnet(var.vpc_cidr, 8, 2)
  }
}

resource "aws_subnet" "example" {
  for_each = local.az_cidrs

  vpc_id            = var.vpc_id # assumed input; aws_subnet requires a VPC ID
  availability_zone = each.key
  cidr_block        = each.value
}

Working with This Skill

For Beginners

  1. Start with the Small Infrastructure pattern
  2. Read references/terraform.md for code structure examples
  3. Review naming conventions before writing code
  4. Use terraform fmt and tflint from day one

For Scaling Up

  1. Review Medium Infrastructure patterns when hitting 20+ resources
  2. Evaluate Terragrunt when managing 3+ environments
  3. Implement module composition for reusability
  4. Set up remote state and locking

For Production Readiness

  1. Pin all provider and module versions
  2. Implement pre-commit hooks for quality gates
  3. Use Atlantis or similar for PR-based workflows
  4. Add security scanning (tfsec, checkov) to CI/CD
  5. Set up cost estimation (Infracost)

Reference Files

references/terraform.md

Complete documentation extracted from terraform-best-practices.com covering:
  • Code structure patterns for all infrastructure sizes
  • Module types and composition strategies
  • Real-world examples from small to large setups
  • Tool recommendations and integration guides

references/examples.md

Practical examples demonstrating:
  • Small/medium/large infrastructure implementations
  • Terraform vs Terragrunt comparisons
  • Module composition patterns
  • Environment-specific configurations

references/llms.md

Multilingual index of all content (20+ languages available on source website)

Quick Reference Commands

Initialize and validate

terraform init
terraform validate
terraform fmt -recursive

Plan and apply

terraform plan -out=tfplan
terraform apply tfplan

State management

terraform state list
terraform state show aws_instance.web
terraform state mv aws_instance.old aws_instance.new

Workspace management

terraform workspace list
terraform workspace select dev
terraform workspace new staging

Import existing resources

terraform import aws_instance.web i-1234567890abcdef0
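
On Terraform 1.5 and later, the same import can also be declared in configuration with an import block, so that it shows up in terraform plan and goes through code review. This is an alternative to the CLI command above, not something the source guide prescribes:

```hcl
# Declarative equivalent of the CLI import (requires Terraform >= 1.5).
# A matching resource "aws_instance" "web" block must exist in the configuration,
# or can be generated with: terraform plan -generate-config-out=generated.tf
import {
  to = aws_instance.web
  id = "i-1234567890abcdef0"
}
```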

Debugging

TF_LOG=DEBUG terraform apply
terraform console  # Interactive evaluation

Additional Resources

Notes

  • This skill represents community best practices (Anton Babenko), not official HashiCorp documentation
  • Content is based on Terraform 1.0+ patterns and recommendations
  • Focuses on AWS examples but principles apply to all providers
  • Reference files extracted from multilingual source (English content emphasized)

Updating This Skill

To refresh with latest best practices:
bash
skill-seekers scrape https://www.terraform-best-practices.com/ \
  --name terraform-best-practices \
  --max-pages 50