terraform-iac-data-engineering

Terraform IaC for Data Engineering

Skill by ara.so — Data Skills collection.
This project provides Infrastructure-as-Code (IaC) patterns using Terraform specifically for data engineering workloads on AWS. It demonstrates how to provision and manage AWS resources (S3, EC2, IAM) needed for data pipelines and processing.

What This Project Does

  • Provisions AWS S3 buckets for data storage
  • Creates EC2 instances for data processing workloads
  • Manages IAM users, roles, and policies
  • Demonstrates Terraform state management
  • Provides reusable IaC patterns for data engineering infrastructure

Installation

Prerequisites

  1. Terraform CLI
    bash
    # macOS
    brew install terraform
    
    # Linux
    wget https://releases.hashicorp.com/terraform/1.5.0/terraform_1.5.0_linux_amd64.zip
    unzip terraform_1.5.0_linux_amd64.zip
    sudo mv terraform /usr/local/bin/
  2. AWS CLI
    bash
    # macOS
    brew install awscli
    
    # Linux
    curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
    unzip awscliv2.zip
    sudo ./aws/install
  3. Configure AWS CLI
    bash
    aws configure
    # Enter your AWS Access Key ID
    # Enter your AWS Secret Access Key
    # Default region: us-east-1
    # Default output format: json

Project Setup

bash
git clone https://github.com/josephmachado/iac-for-data-engineering-terraform-.git
cd iac-for-data-engineering-terraform-

Key Terraform Commands

Initialize Terraform

bash
# Initialize Terraform (downloads providers, sets up backend)
terraform -chdir=terraform init

# Validate configuration files
terraform -chdir=terraform validate

# Format configuration files
terraform -chdir=terraform fmt

Plan and Apply Infrastructure

bash
# Preview changes before applying
terraform -chdir=terraform plan

# Apply infrastructure changes
terraform -chdir=terraform apply

# Auto-approve without confirmation (use with caution)
terraform -chdir=terraform apply -auto-approve

Inspect Infrastructure

bash
# List all resources in state
terraform -chdir=terraform state list

# Show details of a specific resource
terraform -chdir=terraform state show aws_s3_bucket.data_lake

# Show output values
terraform -chdir=terraform output

# Show current state in JSON
terraform -chdir=terraform show -json

Destroy Infrastructure

bash
# Destroy all managed infrastructure
terraform -chdir=terraform destroy

# Destroy a specific resource
terraform -chdir=terraform destroy -target=aws_instance.data_processor
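
Because `terraform destroy` removes everything tracked in state, a guard on the data bucket can help avoid deleting data by accident. The sketch below assumes the `data_lake` bucket from main.tf (shown later in this document) and adds Terraform's `prevent_destroy` lifecycle setting. Whether this guard suits a given environment is a judgment call, not something the project prescribes.

```hcl
resource "aws_s3_bucket" "data_lake" {
  bucket = "my-unique-data-lake-bucket-${var.environment}"

  lifecycle {
    # Any plan that would destroy this bucket fails with an error instead.
    prevent_destroy = true
  }
}
```

With this setting in place, `terraform destroy` stops with an error for this resource until the setting is removed from the configuration.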

Configuration Structure

Basic Terraform Configuration for Data Engineering

main.tf - Core infrastructure definition:
hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.aws_region
}

# S3 bucket for data storage
resource "aws_s3_bucket" "data_lake" {
  bucket = "my-unique-data-lake-bucket-${var.environment}"

  tags = {
    Name        = "Data Lake Bucket"
    Environment = var.environment
    Project     = "DataEngineering"
  }
}

# Enable versioning for data protection
resource "aws_s3_bucket_versioning" "data_lake_versioning" {
  bucket = aws_s3_bucket.data_lake.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Block public access
resource "aws_s3_bucket_public_access_block" "data_lake_public_access" {
  bucket = aws_s3_bucket.data_lake.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# EC2 instance for data processing
resource "aws_instance" "data_processor" {
  ami           = var.ec2_ami
  instance_type = var.ec2_instance_type

  # Use the instance profile defined below so the instance can assume the S3 access role
  iam_instance_profile = aws_iam_instance_profile.ec2_profile.name

  tags = {
    Name        = "DataProcessor"
    Environment = var.environment
  }

  user_data = <<-EOF
    #!/bin/bash
    sudo yum update -y
    sudo yum install -y python3 python3-pip
    pip3 install pandas boto3
  EOF
}

# IAM role for EC2 to access S3
resource "aws_iam_role" "ec2_s3_access_role" {
  name = "ec2-s3-access-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      }
    ]
  })
}

# IAM policy for S3 access
resource "aws_iam_role_policy" "ec2_s3_policy" {
  name = "ec2-s3-policy"
  role = aws_iam_role.ec2_s3_access_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "s3:GetObject",
          "s3:PutObject",
          "s3:ListBucket"
        ]
        Resource = [
          aws_s3_bucket.data_lake.arn,
          "${aws_s3_bucket.data_lake.arn}/*"
        ]
      }
    ]
  })
}

# Instance profile that attaches the IAM role to the EC2 instance
resource "aws_iam_instance_profile" "ec2_profile" {
  name = "ec2-s3-profile"
  role = aws_iam_role.ec2_s3_access_role.name
}

variables.tf - Input variables:
hcl
variable "aws_region" {
  description = "AWS region for resources"
  type        = string
  default     = "us-east-1"
}

variable "environment" {
  description = "Environment name (dev, staging, prod)"
  type        = string
  default     = "dev"
}

variable "ec2_ami" {
  description = "AMI ID for EC2 instance"
  type        = string
  # Amazon Linux 2. AMI IDs are region-specific, so replace this with an ID valid in your region.
  default     = "ami-0c55b159cbfafe1f0"
}

variable "ec2_instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t2.micro"
}

variable "bucket_prefix" {
  description = "Prefix for S3 bucket names"
  type        = string
  default     = "data-eng"
}
outputs.tf - Output values:
hcl
output "s3_bucket_name" {
  description = "Name of the S3 data lake bucket"
  value       = aws_s3_bucket.data_lake.id
}

output "s3_bucket_arn" {
  description = "ARN of the S3 bucket"
  value       = aws_s3_bucket.data_lake.arn
}

output "ec2_instance_id" {
  description = "ID of the EC2 data processor"
  value       = aws_instance.data_processor.id
}

output "ec2_public_ip" {
  description = "Public IP of EC2 instance"
  value       = aws_instance.data_processor.public_ip
}
terraform.tfvars - Variable values (add this file to .gitignore):
hcl
aws_region         = "us-west-2"
environment        = "prod"
ec2_instance_type  = "t3.medium"
bucket_prefix      = "my-company-data"

Common Data Engineering Patterns

Multi-Environment Setup

environments/dev/main.tf:
hcl
module "data_infrastructure" {
  source = "../../modules/data-infra"
  
  environment       = "dev"
  instance_type     = "t2.micro"
  enable_monitoring = false
}
environments/prod/main.tf:
hcl
module "data_infrastructure" {
  source = "../../modules/data-infra"
  
  environment       = "prod"
  instance_type     = "t3.xlarge"
  enable_monitoring = true
  backup_enabled    = true
}
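
The two module calls above only work if the module declares each input they pass. The module's contents are not shown in this document, so the following is a hypothetical sketch of what `modules/data-infra/variables.tf` might look like, using the argument names from the calls above. The descriptions, defaults, and the validation rule are assumptions for illustration.

```hcl
# modules/data-infra/variables.tf (hypothetical file, not taken from the project)

variable "environment" {
  description = "Environment name"
  type        = string

  validation {
    # Reject typos such as "production" early, at plan time.
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment must be one of: dev, staging, prod."
  }
}

variable "instance_type" {
  description = "EC2 instance type for the data processor"
  type        = string
}

variable "enable_monitoring" {
  description = "Whether to enable monitoring for the instance"
  type        = bool
  default     = false
}

variable "backup_enabled" {
  description = "Whether to enable backups"
  type        = bool
  # The dev call omits this argument, so the variable needs a default.
  default     = false
}
```

Giving `backup_enabled` a default lets the dev environment omit it while prod sets it explicitly.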

S3 Bucket with Lifecycle Policies

hcl
resource "aws_s3_bucket" "data_archive" {
  bucket = "data-archive-${var.environment}"
}

resource "aws_s3_bucket_lifecycle_configuration" "data_archive_lifecycle" {
  bucket = aws_s3_bucket.data_archive.id
  
  rule {
    id     = "archive-old-data"
    status = "Enabled"
    
    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }
    
    transition {
      days          = 90
      storage_class = "GLACIER"
    }
    
    expiration {
      days = 365
    }
  }
  
  rule {
    id     = "delete-incomplete-uploads"
    status = "Enabled"
    
    abort_incomplete_multipart_upload {
      days_after_initiation = 7
    }
  }
}
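
The rules above apply to every object in the bucket. When only part of a bucket should be archived, a rule can be scoped with a `filter` block. The sketch below is a variant of the configuration above rather than an addition to it, since a bucket holds a single lifecycle configuration. The `raw/` prefix and the rule ID are hypothetical.

```hcl
resource "aws_s3_bucket_lifecycle_configuration" "data_archive_lifecycle" {
  bucket = aws_s3_bucket.data_archive.id

  rule {
    id     = "archive-raw-data"
    status = "Enabled"

    # Only objects whose keys start with this prefix are affected (hypothetical prefix).
    filter {
      prefix = "raw/"
    }

    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }
  }
}
```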

VPC Setup for Data Processing

hcl
resource "aws_vpc" "data_vpc" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true
  
  tags = {
    Name = "data-engineering-vpc"
  }
}

resource "aws_subnet" "private_subnet" {
  vpc_id            = aws_vpc.data_vpc.id
  cidr_block        = "10.0.1.0/24"
  availability_zone = "${var.aws_region}a"
  
  tags = {
    Name = "private-data-subnet"
  }
}

resource "aws_security_group" "data_processor_sg" {
  name        = "data-processor-sg"
  description = "Security group for data processing instances"
  vpc_id      = aws_vpc.data_vpc.id
  
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
  
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/16"]
  }
}
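
The VPC resources above are not yet referenced by any instance. A minimal sketch of how the `data_processor` instance could be placed in the subnet and security group follows, assuming the `ec2_ami` and `ec2_instance_type` variables from variables.tf. Treat it as one possible wiring, not the project's own.

```hcl
resource "aws_instance" "data_processor" {
  ami           = var.ec2_ami
  instance_type = var.ec2_instance_type

  # Launch inside the private subnet and apply the security group defined above.
  subnet_id              = aws_subnet.private_subnet.id
  vpc_security_group_ids = [aws_security_group.data_processor_sg.id]

  tags = {
    Name = "DataProcessor"
  }
}
```

The configuration above defines no internet gateway, NAT gateway, or VPC endpoint, so an instance launched this way would likely need one of those added before it could download packages or reach S3.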

Remote State Configuration

backend.tf:
hcl
terraform {
  backend "s3" {
    bucket         = "terraform-state-bucket-unique-name"
    key            = "data-engineering/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}
Create state backend resources:
hcl
resource "aws_s3_bucket" "terraform_state" {
  bucket = "terraform-state-bucket-unique-name"
}

resource "aws_s3_bucket_versioning" "terraform_state_versioning" {
  bucket = aws_s3_bucket.terraform_state.id
  
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"
  
  attribute {
    name = "LockID"
    type = "S"
  }
}
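
State files can contain sensitive values, so the state bucket is usually locked down further. The sketch below adds default encryption and a public access block to the `terraform_state` bucket defined above. The two resource names are hypothetical, and AES256 is just one possible encryption choice.

```hcl
# Encrypt state objects at rest by default.
resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state_encryption" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# Make sure the state bucket can never be exposed publicly.
resource "aws_s3_bucket_public_access_block" "terraform_state_public_access" {
  bucket = aws_s3_bucket.terraform_state.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```

The bucket and lock table have to exist before `terraform init` can use them as a backend, so these resources are typically applied from a separate configuration that keeps its own state.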

Verification and Testing

Verify S3 Bucket Creation

bash
# List all S3 buckets
aws s3 ls

# Check specific bucket
aws s3 ls s3://my-unique-data-lake-bucket-dev/

# Upload test file
echo "test data" > test.txt
aws s3 cp test.txt s3://my-unique-data-lake-bucket-dev/

Verify EC2 Instances

bash
# List running instances
aws ec2 describe-instances \
  --filters "Name=instance-state-name,Values=running" \
  --query 'Reservations[].Instances[].{ID:InstanceId, Name:Tags[?Key==`Name`].Value, Type:InstanceType, State:State.Name, PublicIP:PublicIpAddress}' \
  --output table

# Get specific instance details
aws ec2 describe-instances \
  --instance-ids $(terraform -chdir=terraform output -raw ec2_instance_id)

Verify IAM Roles

bash
# List IAM roles
aws iam list-roles --query 'Roles[?contains(RoleName, `ec2-s3-access`)].RoleName'

# Get role policy
aws iam get-role-policy \
  --role-name ec2-s3-access-role \
  --policy-name ec2-s3-policy

State Management

Inspect State

bash
# View state file (formatted)
cat terraform/terraform.tfstate | jq -r '.resources[] | [.type, .name] | join(",")'

# List resources in state
terraform -chdir=terraform state list

# Show resource details
terraform -chdir=terraform state show aws_s3_bucket.data_lake

Import Existing Resources

bash
# Import existing S3 bucket
terraform -chdir=terraform import aws_s3_bucket.data_lake my-existing-bucket

# Import existing EC2 instance
terraform -chdir=terraform import aws_instance.data_processor i-1234567890abcdef0
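
Terraform 1.5 and later also support declaring imports in configuration, which lets the import show up in `terraform plan` before anything is written to state. A minimal sketch, reusing the bucket name and resource address from the command above:

```hcl
# Declarative equivalent of `terraform import aws_s3_bucket.data_lake my-existing-bucket`.
import {
  to = aws_s3_bucket.data_lake
  id = "my-existing-bucket"
}

# The target resource block must also exist in the configuration.
resource "aws_s3_bucket" "data_lake" {
  bucket = "my-existing-bucket"
}
```

Once the import has been applied, the `import` block can be removed.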

Move Resources in State

bash
# Rename resource in state
terraform -chdir=terraform state mv aws_s3_bucket.old_name aws_s3_bucket.new_name
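
The same rename can be recorded in configuration with a `moved` block (Terraform 1.1 and later), so teammates and CI pick up the change on their next plan without running a state command. A minimal sketch using the placeholder names from the command above:

```hcl
# Tells Terraform the existing object now lives at the new address,
# so it is renamed in state rather than destroyed and recreated.
moved {
  from = aws_s3_bucket.old_name
  to   = aws_s3_bucket.new_name
}
```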

Troubleshooting

Common Issues

Issue: Bucket name already exists
hcl
# Solution: Use unique bucket name with random suffix
resource "random_id" "bucket_suffix" {
  byte_length = 4
}

resource "aws_s3_bucket" "data_lake" {
  bucket = "data-lake-${var.environment}-${random_id.bucket_suffix.hex}"
}

Issue: AWS credentials not found
bash
# Check AWS configuration
aws configure list

# Use environment variables (replace the placeholders with your own values)
export AWS_ACCESS_KEY_ID="<your-access-key-id>"
export AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"
export AWS_DEFAULT_REGION="us-east-1"

Issue: State file locked
bash
# Force unlock (use with caution)
terraform -chdir=terraform force-unlock <LOCK_ID>

Issue: Resource already exists
bash
# Import existing resource
terraform -chdir=terraform import <resource_type>.<resource_name> <resource_id>

# Or remove from state
terraform -chdir=terraform state rm <resource_type>.<resource_name>

Issue: Terraform version mismatch
hcl
# Specify required version in terraform block
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

Debugging

bash
# Enable debug logging
export TF_LOG=DEBUG
terraform -chdir=terraform apply

# Log to file
export TF_LOG_PATH=terraform-debug.log
terraform -chdir=terraform apply

# Disable logging
unset TF_LOG
unset TF_LOG_PATH

Validate and Format

bash
# Validate configuration
terraform -chdir=terraform validate

# Format all files
terraform -chdir=terraform fmt -recursive

# Check formatting without making changes
terraform -chdir=terraform fmt -check

Best Practices

  1. Always use variables for environment-specific values
  2. Enable S3 versioning for state files and data buckets
  3. Use remote state for team collaboration
  4. Tag all resources with environment, project, and owner
  5. Implement lifecycle policies for cost optimization
  6. Use modules for reusable infrastructure patterns
  7. Store secrets in AWS Secrets Manager, reference via data sources
  8. Run terraform plan before apply
  9. Use workspaces for multiple environments
  10. Document your infrastructure with comments and README files
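
Two of the practices above, tagging every resource and reading secrets from AWS Secrets Manager, can be sketched in a few lines. The block below assumes the `aws_region` and `environment` variables from variables.tf. The owner tag value and the secret name `data-eng/db-credentials` are hypothetical, and the secret is assumed to hold a JSON string.

```hcl
provider "aws" {
  region = var.aws_region

  # Tags applied automatically to every taggable resource this provider creates.
  default_tags {
    tags = {
      Environment = var.environment
      Project     = "DataEngineering"
      Owner       = "data-platform-team" # hypothetical owner
    }
  }
}

# Read an existing secret instead of hard-coding credentials in .tf or .tfvars files.
data "aws_secretsmanager_secret_version" "db_credentials" {
  secret_id = "data-eng/db-credentials" # hypothetical secret name
}

locals {
  # Decode the JSON secret so individual fields can be referenced elsewhere.
  db_credentials = jsondecode(data.aws_secretsmanager_secret_version.db_credentials.secret_string)
}
```

Values read through a data source are still written to the Terraform state, which is one more reason to keep remote state encrypted and access-controlled.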