terraform-iac-expert

Terraform IaC Expert

Overview

Expert in Infrastructure as Code using Terraform and OpenTofu. Specializes in module design, state management, multi-cloud deployments, and CI/CD integration. Handles complex infrastructure patterns including multi-environment setups, remote state backends, and secure secrets management.

When to Use

  • Setting up new Terraform projects and workspaces
  • Designing reusable Terraform modules
  • Managing state files and remote backends
  • Implementing multi-environment (dev/staging/prod) infrastructure
  • Migrating existing infrastructure to Terraform
  • Troubleshooting state drift and plan failures
  • Integrating Terraform with CI/CD pipelines
  • Implementing security best practices (secrets, IAM, policies)

Capabilities

Project Structure

  • Module-based architecture design
  • Workspace vs directory structure strategies
  • Variable and output organization
  • Provider configuration and version constraints
  • Backend configuration for remote state

Module Development

  • Reusable module patterns
  • Input validation and type constraints
  • Output design for module composition
  • Local modules vs registry modules
  • Module versioning and publishing

State Management

  • Remote state backends (S3, GCS, Azure Blob, Terraform Cloud)
  • State locking mechanisms
  • State migration and manipulation
  • Import existing resources
  • Handling state drift

Multi-Environment Patterns

  • Workspace-based environments
  • Directory-based environments
  • Terragrunt for DRY infrastructure
  • Environment-specific variables
  • Promotion workflows

Security

  • Sensitive variable handling
  • IAM role design for Terraform
  • Policy as Code (Sentinel, OPA)
  • Secrets management integration (Vault, AWS Secrets Manager)
  • Least privilege principles

CI/CD Integration

  • GitHub Actions for Terraform
  • Atlantis for PR-based workflows
  • Terraform Cloud/Enterprise
  • Plan/Apply automation
  • Cost estimation integration

Dependencies

Works well with:
  • aws-solutions-architect
    - AWS resource patterns
  • kubernetes-orchestrator
    - K8s infrastructure
  • github-actions-pipeline-builder
    - CI/CD automation
  • site-reliability-engineer
    - Production infrastructure

Examples

Project Structure

terraform/
├── modules/
│   ├── vpc/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── eks/
│   └── rds/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── terraform.tfvars
│   │   └── backend.tf
│   ├── staging/
│   └── prod/
└── shared/
    └── provider.tf

Root Module with Locals

# environments/prod/main.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }

  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-west-2"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

locals {
  environment = "prod"
  project     = "myapp"

  common_tags = {
    Environment = local.environment
    Project     = local.project
    ManagedBy   = "terraform"
  }
}

module "vpc" {
  source = "../../modules/vpc"

  environment = local.environment
  cidr_block  = "10.0.0.0/16"
  tags        = local.common_tags
}

module "eks" {
  source = "../../modules/eks"

  environment        = local.environment
  vpc_id             = module.vpc.vpc_id
  private_subnet_ids = module.vpc.private_subnet_ids
  cluster_version    = "1.29"
  tags               = local.common_tags
}

Reusable Module with Validation

# modules/vpc/variables.tf

variable "environment" {
  type        = string
  description = "Environment name (dev, staging, prod)"

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be dev, staging, or prod."
  }
}

variable "cidr_block" {
  type        = string
  description = "VPC CIDR block"

  validation {
    condition     = can(cidrhost(var.cidr_block, 0))
    error_message = "Must be a valid CIDR block."
  }
}

variable "availability_zones" {
  type        = list(string)
  description = "List of AZs to use"
  default     = ["us-west-2a", "us-west-2b", "us-west-2c"]
}

variable "enable_nat_gateway" {
  type        = bool
  description = "Enable NAT Gateway for private subnets"
  default     = true
}

variable "tags" {
  type        = map(string)
  description = "Tags to apply to all resources"
  default     = {}
}
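
The root module shown earlier reads module.vpc.vpc_id and module.vpc.private_subnet_ids, so the VPC module has to export both. A minimal sketch of how that might look, assuming the module creates one private subnet per availability zone. The resource names aws_vpc.this and aws_subnet.private, the Name tags, and the /24 subnet sizing are illustrative assumptions, not part of the original module:

```hcl
# modules/vpc/main.tf (hypothetical sketch)

resource "aws_vpc" "this" {
  cidr_block = var.cidr_block
  tags       = merge(var.tags, { Name = "${var.environment}-vpc" })
}

# One private subnet per AZ; cidrsubnet adds 8 bits to the VPC prefix,
# so a /16 VPC yields /24 subnets.
resource "aws_subnet" "private" {
  count = length(var.availability_zones)

  vpc_id            = aws_vpc.this.id
  availability_zone = var.availability_zones[count.index]
  cidr_block        = cidrsubnet(var.cidr_block, 8, count.index)
  tags              = merge(var.tags, { Name = "${var.environment}-private-${count.index}" })
}

# modules/vpc/outputs.tf (hypothetical sketch)

output "vpc_id" {
  description = "ID of the VPC"
  value       = aws_vpc.this.id
}

output "private_subnet_ids" {
  description = "IDs of the private subnets, one per availability zone"
  value       = aws_subnet.private[*].id
}
```

Exporting only IDs keeps the module's surface small: callers such as the eks module can compose against the outputs without depending on how the subnets are built.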

Module with Dynamic Blocks

# modules/security-group/main.tf

resource "aws_security_group" "this" {
  name        = var.name
  description = var.description
  vpc_id      = var.vpc_id

  dynamic "ingress" {
    for_each = var.ingress_rules
    content {
      from_port   = ingress.value.from_port
      to_port     = ingress.value.to_port
      protocol    = ingress.value.protocol
      cidr_blocks = ingress.value.cidr_blocks
      description = ingress.value.description
    }
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = merge(var.tags, { Name = var.name })
}
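
The dynamic block iterates over var.ingress_rules, which the example does not declare. One plausible declaration, together with a caller, is sketched below. The object shape is inferred from the attributes the block reads; the module name web_sg and the rule values are hypothetical:

```hcl
# modules/security-group/variables.tf (hypothetical sketch)

variable "ingress_rules" {
  description = "Ingress rules rendered by the dynamic block"
  type = list(object({
    from_port   = number
    to_port     = number
    protocol    = string
    cidr_blocks = list(string)
    description = string
  }))
  default = []
}

# Caller, for example in environments/prod/main.tf (hypothetical)

module "web_sg" {
  source = "../../modules/security-group"

  name        = "web"
  description = "Allow HTTPS from inside the VPC"
  vpc_id      = module.vpc.vpc_id
  tags        = local.common_tags

  ingress_rules = [
    {
      from_port   = 443
      to_port     = 443
      protocol    = "tcp"
      cidr_blocks = ["10.0.0.0/16"]
      description = "HTTPS from the VPC"
    },
  ]
}
```

With the default of an empty list, the dynamic block renders no ingress rules at all, so the group is closed to inbound traffic unless the caller opts in.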

Remote State Data Source

# Reference another environment's state
data "terraform_remote_state" "shared" {
  backend = "s3"

  config = {
    bucket = "mycompany-terraform-state"
    key    = "shared/terraform.tfstate"
    region = "us-west-2"
  }
}

# Use outputs from shared state
resource "aws_instance" "app" {
  ami           = data.terraform_remote_state.shared.outputs.base_ami_id
  instance_type = "t3.medium"
  subnet_id     = data.terraform_remote_state.shared.outputs.private_subnet_id
}
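
terraform_remote_state only exposes root-level outputs of the other configuration, so the shared configuration has to declare base_ami_id and private_subnet_id explicitly. A sketch of the producing side follows. The AMI name filter is a hypothetical placeholder, and aws_subnet.private is assumed to be defined elsewhere in the shared configuration:

```hcl
# shared/outputs.tf (hypothetical sketch)

# Look up the most recent base image owned by this account.
data "aws_ami" "base" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["base-image-*"] # hypothetical naming scheme
  }
}

output "base_ami_id" {
  description = "AMI that application instances should launch from"
  value       = data.aws_ami.base.id
}

output "private_subnet_id" {
  description = "Private subnet shared with application environments"
  value       = aws_subnet.private.id # assumed to exist in this configuration
}
```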

GitHub Actions CI/CD

# .github/workflows/terraform.yml

name: Terraform

on:
  pull_request:
    paths:
      - 'terraform/**'
  push:
    branches: [main]
    paths:
      - 'terraform/**'

env:
  TF_VERSION: 1.6.0
  AWS_REGION: us-west-2

jobs:
  plan:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
      id-token: write # For OIDC
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/terraform-github-actions
          aws-region: ${{ env.AWS_REGION }}

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: ${{ env.TF_VERSION }}

      - name: Terraform Init
        working-directory: terraform/environments/prod
        run: terraform init

      - name: Terraform Plan
        working-directory: terraform/environments/prod
        run: terraform plan -out=tfplan

      - name: Upload Plan
        uses: actions/upload-artifact@v4
        with:
          name: tfplan
          path: terraform/environments/prod/tfplan

  apply:
    needs: plan
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main' && github.event_name == 'push'
    environment: production
    permissions: # Job-level permissions are not inherited from the plan job
      contents: read
      id-token: write # For OIDC
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/terraform-github-actions
          aws-region: ${{ env.AWS_REGION }}

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: ${{ env.TF_VERSION }}

      - name: Download Plan
        uses: actions/download-artifact@v4
        with:
          name: tfplan
          path: terraform/environments/prod

      # Applying a saved plan still needs the providers and backend initialized
      - name: Terraform Init
        working-directory: terraform/environments/prod
        run: terraform init

      - name: Terraform Apply
        working-directory: terraform/environments/prod
        run: terraform apply -auto-approve tfplan
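
The workflow assumes an IAM role named terraform-github-actions that trusts GitHub's OIDC provider. A minimal sketch of that trust relationship in Terraform is shown below, assuming the OIDC provider is already registered in the account. The repository filter my-org/my-repo is a hypothetical placeholder, and the permissions policy attached to the role is left out:

```hcl
# Look up the GitHub OIDC provider that is assumed to exist in this account.
data "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

# Trust policy: only tokens issued for the named repository may assume the role.
data "aws_iam_policy_document" "github_assume" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [data.aws_iam_openid_connect_provider.github.arn]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:my-org/my-repo:*"] # hypothetical repository
    }
  }
}

resource "aws_iam_role" "terraform_github_actions" {
  name               = "terraform-github-actions"
  assume_role_policy = data.aws_iam_policy_document.github_assume.json
}
```

Tightening the sub condition, for instance to a single branch or to the production environment, is one way to apply least privilege to the apply job.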

Import Existing Resources

# Import existing AWS resource into state
terraform import aws_s3_bucket.existing my-existing-bucket

# Import using for_each key
terraform import 'aws_iam_user.users["alice"]' alice

# Generate configuration for resources declared in import blocks (Terraform 1.5+)
terraform plan -generate-config-out=generated.tf
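
The -generate-config-out flag only produces configuration for resources named in import blocks that have no matching resource block yet. A minimal import block for the bucket used in the CLI example might look like this; the file name imports.tf is an arbitrary choice:

```hcl
# imports.tf (hypothetical file name)

# Declares that the existing bucket should be adopted at this address.
# Running `terraform plan -generate-config-out=generated.tf` then writes
# a matching resource block to generated.tf for review.
import {
  to = aws_s3_bucket.existing
  id = "my-existing-bucket"
}
```

Generated configuration is a starting point rather than finished code, so it is worth reviewing and trimming before the import is applied. Once the apply succeeds, the import block can be removed.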

Handling Sensitive Values

# Reference secrets from AWS Secrets Manager
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/db/password"
}

resource "aws_db_instance" "main" {
  # ... other config ...

  password = data.aws_secretsmanager_secret_version.db_password.secret_string
}

# Mark outputs as sensitive
output "db_connection_string" {
  value     = "postgres://admin:${aws_db_instance.main.password}@${aws_db_instance.main.endpoint}"
  sensitive = true
}
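
Input variables can be marked sensitive in the same way as outputs. A short sketch follows; the variable name api_token and the output that passes it on are hypothetical:

```hcl
# Redacted in plan and apply output, but still written to state in plain text.
variable "api_token" {
  type        = string
  description = "Token for a third-party API, supplied at run time"
  sensitive   = true
}

# Any output that passes a sensitive value on must itself be marked
# sensitive, otherwise Terraform rejects the configuration.
output "api_token_passthrough" {
  value     = var.api_token
  sensitive = true
}
```

A value like this is typically supplied through the TF_VAR_api_token environment variable or the CI system's secret store rather than a committed terraform.tfvars file.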

Best Practices

  1. Use remote state - Never store state locally for team projects
  2. Enable state locking - Prevent concurrent modifications
  3. Version pin providers - Use ~> constraints, not >=
  4. Separate environments - Use directories or workspaces, not branches
  5. Modularize everything reusable - But don't over-abstract
  6. Validate inputs - Use variable validation blocks
  7. Use data sources - Reference existing resources instead of hardcoding
  8. Tag all resources - Apply consistent tags for cost tracking
  9. Review plans carefully - Especially for destroy operations
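
Several of these practices fit in a few lines of configuration. The sketch below pins the provider with ~>, applies tags once through the provider's default_tags block, and looks up availability zones with a data source instead of hardcoding them. The tag values and the choice of at most three zones are illustrative assumptions:

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # allows 5.x updates, blocks 6.0
    }
  }
}

provider "aws" {
  region = "us-west-2"

  # Tags applied to every taggable resource this provider creates.
  default_tags {
    tags = {
      Environment = "prod"
      ManagedBy   = "terraform"
    }
  }
}

# Discover the zones the region actually offers instead of hardcoding names.
data "aws_availability_zones" "available" {
  state = "available"
}

locals {
  # Take up to three zones; min() guards against regions with fewer.
  azs = slice(
    data.aws_availability_zones.available.names,
    0,
    min(3, length(data.aws_availability_zones.available.names))
  )
}
```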

Common Pitfalls

  • State file conflicts - Multiple people running terraform simultaneously
  • Hardcoded values - Not using variables for environment differences
  • Circular dependencies - Resources depending on each other
  • Missing dependencies - Not using depends_on when implicit dependencies aren't enough
  • Large state files - Not breaking up large infrastructure
  • Secrets in state - State contains sensitive values in plain text; encrypt it at rest and restrict access
  • Provider version drift - Different team members using different versions
  • Overusing -target - Partial applies can cause drift; use it sparingly
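
The missing-dependency pitfall is easiest to see with IAM. In the sketch below the instance references only the instance profile, so Terraform has no implicit reason to wait for the role policy, yet software on the instance may need those permissions at boot. All names, the bucket ARN, and the ami_id variable are hypothetical:

```hcl
variable "ami_id" {
  type        = string
  description = "AMI to launch (hypothetical input)"
}

resource "aws_iam_role" "app" {
  name = "app-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy" "app_s3" {
  name = "app-s3-read"
  role = aws_iam_role.app.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject"]
      Resource = "arn:aws:s3:::example-app-config/*"
    }]
  })
}

resource "aws_iam_instance_profile" "app" {
  name = "app-profile"
  role = aws_iam_role.app.name
}

resource "aws_instance" "app" {
  ami                  = var.ami_id
  instance_type        = "t3.medium"
  iam_instance_profile = aws_iam_instance_profile.app.name

  # No attribute of the policy is referenced above, so the ordering
  # has to be stated explicitly.
  depends_on = [aws_iam_role_policy.app_s3]
}
```

Where an attribute reference can express the relationship, that is usually preferable, since depends_on makes Terraform plan more conservatively.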