terraform-data-engineering-infrastructure
Terraform Data Engineering Infrastructure
Skill by ara.so — Data Skills collection.
This project provides Infrastructure-as-Code (IaC) patterns for data engineering teams using Terraform to provision and manage AWS resources. It demonstrates how to automate the creation of data infrastructure including S3 buckets for data lakes, EC2 instances for processing, and IAM policies for secure access.
What This Project Does
- Provisions AWS infrastructure specifically designed for data engineering workloads
- Manages S3 buckets for data storage and data lake architectures
- Creates EC2 instances for data processing and ETL jobs
- Configures IAM roles and policies for secure resource access
- Provides declarative infrastructure definitions that can be version-controlled
- Enables reproducible environment creation across dev/staging/prod
Prerequisites
Before using this project, ensure you have:
- An AWS account with root or administrative access
- Terraform installed (v1.0+)
- AWS CLI installed and configured
- IAM user with appropriate permissions (S3, EC2, IAM full access)
Installing Prerequisites
```bash
# Install Terraform (macOS)
brew tap hashicorp/tap
brew install hashicorp/tap/terraform

# Install AWS CLI (macOS)
brew install awscli

# Configure AWS CLI
aws configure
# Enter your AWS Access Key ID, Secret Access Key, region, and output format
```
Setting Up IAM Permissions
Create an IAM user with the following permissions for Terraform:
- Full S3 access (AmazonS3FullAccess)
- Full EC2 access (AmazonEC2FullAccess)
- Full IAM access (IAMFullAccess)
Note: This is for development/learning. In production, use least-privilege policies.
```bash
# Create access keys for your IAM user
aws iam create-access-key --user-name your-terraform-user

# Configure AWS CLI with these credentials
aws configure --profile terraform
```
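The note above recommends least-privilege policies for production. As a rough sketch of what that could look like for the S3 part, the policy below limits the Terraform user to buckets that share a name prefix instead of granting `AmazonS3FullAccess`. The `mycompany-` prefix and the policy name are placeholders (assumptions, not part of this project), and the EC2 and IAM permissions would need similar scoping.

```hcl
# Sketch only: scope S3 permissions by resource (bucket name prefix).
# "mycompany-" and "terraform-s3-scoped" are hypothetical names.
data "aws_iam_policy_document" "terraform_s3_scoped" {
  statement {
    sid     = "ManagePrefixedBuckets"
    effect  = "Allow"
    actions = ["s3:*"] # could be narrowed further to specific actions

    resources = [
      "arn:aws:s3:::mycompany-*",   # the buckets themselves
      "arn:aws:s3:::mycompany-*/*", # objects inside those buckets
    ]
  }
}

resource "aws_iam_policy" "terraform_s3_scoped" {
  name   = "terraform-s3-scoped"
  policy = data.aws_iam_policy_document.terraform_s3_scoped.json
}

# Attach the scoped policy to the Terraform user created earlier
resource "aws_iam_user_policy_attachment" "terraform_s3_scoped" {
  user       = "your-terraform-user"
  policy_arn = aws_iam_policy.terraform_s3_scoped.arn
}
```

Applying this requires credentials that can already manage IAM, so in practice it would be run once by an administrator rather than by the restricted user itself.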
Project Structure
terraform/
├── main.tf            # Main infrastructure definitions
├── variables.tf       # Input variables (if present)
├── outputs.tf         # Output values (if present)
└── terraform.tfstate  # State file (generated)
Key Terraform Commands
Initialize Terraform
```bash
# Initialize the working directory
terraform -chdir=terraform init

# Validate configuration files
terraform -chdir=terraform validate

# Format configuration files
terraform -chdir=terraform fmt
```
Plan and Apply Infrastructure
```bash
# Preview changes without applying
terraform -chdir=terraform plan

# Apply changes and create infrastructure
terraform -chdir=terraform apply

# Apply without confirmation prompt
terraform -chdir=terraform apply -auto-approve
```
Inspect Infrastructure
```bash
# List all resources in state
terraform -chdir=terraform state list

# Show details of a specific resource
terraform -chdir=terraform state show aws_s3_bucket.data_bucket

# Output current state
terraform -chdir=terraform show
```
Destroy Infrastructure
```bash
# Destroy all managed infrastructure
terraform -chdir=terraform destroy

# Destroy specific resources
terraform -chdir=terraform destroy -target=aws_instance.data_processor
```
Configuration Patterns
Basic S3 Bucket for Data Lake
```hcl
# terraform/main.tf
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

# S3 bucket for raw data
resource "aws_s3_bucket" "raw_data" {
  bucket = "my-unique-raw-data-bucket-12345"

  tags = {
    Environment = "dev"
    Purpose     = "data-lake-raw"
    ManagedBy   = "terraform"
  }
}

# Enable versioning for data recovery
resource "aws_s3_bucket_versioning" "raw_data_versioning" {
  bucket = aws_s3_bucket.raw_data.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Block public access
resource "aws_s3_bucket_public_access_block" "raw_data_public_access" {
  bucket = aws_s3_bucket.raw_data.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```
EC2 Instance for Data Processing
```hcl
# Security group for EC2 instance
resource "aws_security_group" "data_processor_sg" {
  name        = "data-processor-sg"
  description = "Security group for data processing EC2 instances"

  ingress {
    description = "SSH access"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"] # Restrict this in production
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name      = "data-processor-sg"
    ManagedBy = "terraform"
  }
}

# EC2 instance for data processing
resource "aws_instance" "data_processor" {
  ami           = "ami-0c55b159cbfafe1f0" # Amazon Linux 2 AMI (update for your region)
  instance_type = "t3.medium"

  vpc_security_group_ids = [aws_security_group.data_processor_sg.id]
  iam_instance_profile   = aws_iam_instance_profile.data_processor_profile.name

  user_data = <<-EOF
              #!/bin/bash
              yum update -y
              yum install -y python3 python3-pip
              pip3 install boto3 pandas
              EOF

  tags = {
    Name        = "data-processor"
    Environment = "dev"
    ManagedBy   = "terraform"
  }
}
```
IAM Role for EC2 to Access S3
```hcl
# IAM role for EC2 instances
resource "aws_iam_role" "data_processor_role" {
  name = "data-processor-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      }
    ]
  })

  tags = {
    ManagedBy = "terraform"
  }
}

# Policy to allow S3 access
resource "aws_iam_role_policy" "s3_access_policy" {
  name = "s3-access-policy"
  role = aws_iam_role.data_processor_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "s3:GetObject",
          "s3:PutObject",
          "s3:ListBucket"
        ]
        Resource = [
          aws_s3_bucket.raw_data.arn,
          "${aws_s3_bucket.raw_data.arn}/*"
        ]
      }
    ]
  })
}

# Instance profile for EC2
resource "aws_iam_instance_profile" "data_processor_profile" {
  name = "data-processor-profile"
  role = aws_iam_role.data_processor_role.name
}
```
Multi-Environment Setup with Variables
```hcl
# terraform/variables.tf
variable "environment" {
  description = "Environment name (dev, staging, prod)"
  type        = string
  default     = "dev"
}

variable "bucket_prefix" {
  description = "Prefix for S3 bucket names"
  type        = string
}

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.medium"
}

variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-east-1"
}

# terraform/main.tf
resource "aws_s3_bucket" "data_bucket" {
  bucket = "${var.bucket_prefix}-${var.environment}-data"

  tags = {
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}
```
Apply with variables:
```bash
terraform -chdir=terraform apply \
  -var="environment=prod" \
  -var="bucket_prefix=mycompany" \
  -var="instance_type=t3.large"
```
Output Values for Integration
```hcl
# terraform/outputs.tf
output "s3_bucket_name" {
  description = "Name of the S3 bucket"
  value       = aws_s3_bucket.raw_data.id
}

output "s3_bucket_arn" {
  description = "ARN of the S3 bucket"
  value       = aws_s3_bucket.raw_data.arn
}

output "ec2_instance_id" {
  description = "ID of the EC2 instance"
  value       = aws_instance.data_processor.id
}

output "ec2_public_ip" {
  description = "Public IP of the EC2 instance"
  value       = aws_instance.data_processor.public_ip
}
```
```bash
# View outputs
terraform -chdir=terraform output
```
Common Workflows
Initial Setup
```bash
# Clone the repository
git clone https://github.com/josephmachado/iac-for-data-engineering-terraform-.git
cd iac-for-data-engineering-terraform-

# Update bucket name in terraform/main.tf to be globally unique
# Edit terraform/main.tf and change bucket name

# Initialize and apply
terraform -chdir=terraform init
terraform -chdir=terraform validate
terraform -chdir=terraform fmt
terraform -chdir=terraform apply
```
Verify Resources Created
```bash
# List S3 buckets
aws s3 ls

# Check EC2 instances
aws ec2 describe-instances \
  --filters "Name=instance-state-name,Values=running" \
  --query 'Reservations[].Instances[].{ID:InstanceId, Name:Tags[?Key==`Name`].Value, Type:InstanceType, State:State.Name, PublicIP:PublicIpAddress}' \
  --output table

# View Terraform state
terraform -chdir=terraform state list
cat terraform/terraform.tfstate | jq -r '.resources[] | [.type, .name] | join(",")'
```
Update Infrastructure
```bash
# Edit terraform files
# Then preview changes
terraform -chdir=terraform plan

# Apply changes
terraform -chdir=terraform apply
```
Clean Up
```bash
# Destroy all resources
terraform -chdir=terraform destroy

# Verify cleanup
aws s3 ls
aws ec2 describe-instances --filters "Name=instance-state-name,Values=running"
```
Advanced Patterns
Data Lake Structure with Multiple Buckets
```hcl
# Raw data bucket
resource "aws_s3_bucket" "raw" {
  bucket = "${var.bucket_prefix}-raw-${var.environment}"

  tags = {
    Layer = "raw"
  }
}

# Processed data bucket
resource "aws_s3_bucket" "processed" {
  bucket = "${var.bucket_prefix}-processed-${var.environment}"

  tags = {
    Layer = "processed"
  }
}

# Curated data bucket
resource "aws_s3_bucket" "curated" {
  bucket = "${var.bucket_prefix}-curated-${var.environment}"

  tags = {
    Layer = "curated"
  }
}

# Lifecycle policy for raw data
resource "aws_s3_bucket_lifecycle_configuration" "raw_lifecycle" {
  bucket = aws_s3_bucket.raw.id

  rule {
    id     = "archive-old-data"
    status = "Enabled"

    # Empty filter: the rule applies to every object in the bucket
    filter {}

    transition {
      days          = 90
      storage_class = "GLACIER"
    }

    expiration {
      days = 365
    }
  }
}
```
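The three layer buckets above differ only in the layer name, so one possible way to cut the repetition is `for_each`. This is an alternative sketch rather than the project's layout: it assumes the same `bucket_prefix` and `environment` variables, and it would replace the three separate resources, since the bucket names it produces are identical. The local name `data_lake_layers` and the resource name `layer` are made up for illustration.

```hcl
locals {
  # One entry per data lake layer (hypothetical local name)
  data_lake_layers = toset(["raw", "processed", "curated"])
}

# Creates one bucket per layer, e.g. mycompany-raw-dev
resource "aws_s3_bucket" "layer" {
  for_each = local.data_lake_layers

  bucket = "${var.bucket_prefix}-${each.key}-${var.environment}"

  tags = {
    Layer = each.key
  }
}
```

A single layer is then referenced by key, for example `aws_s3_bucket.layer["raw"].id`.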
Remote State Management
```hcl
# Create S3 bucket for state
resource "aws_s3_bucket" "terraform_state" {
  bucket = "my-terraform-state-bucket-12345"

  lifecycle {
    prevent_destroy = true
  }
}

resource "aws_s3_bucket_versioning" "terraform_state_versioning" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Configure backend (in a separate backend.tf file)
# terraform {
#   backend "s3" {
#     bucket = "my-terraform-state-bucket-12345"
#     key    = "data-engineering/terraform.tfstate"
#     region = "us-east-1"
#   }
# }
```
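When several people or CI jobs share this state bucket, the backend is usually paired with encryption and state locking. The following is a minimal sketch under assumptions: the table name `terraform-state-locks` is hypothetical, and the lock table would typically be created in the same bootstrap configuration as the state bucket, before the backend is switched on. Recent Terraform releases also offer S3-native locking through the backend's `use_lockfile` argument, so check the S3 backend documentation for the version in use.

```hcl
# Bootstrap configuration: DynamoDB table used by the S3 backend for locking.
# The S3 backend expects a string partition key named "LockID".
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-state-locks" # hypothetical table name
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}

# backend.tf in the data engineering project
terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket-12345"
    key            = "data-engineering/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true                    # encrypt the state object at rest
    dynamodb_table = "terraform-state-locks" # must match the table above
  }
}
```

After adding or changing a backend block, `terraform -chdir=terraform init` has to be run again so Terraform can migrate the existing state.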
Troubleshooting
Bucket Name Already Exists
Error: `BucketAlreadyExists: The requested bucket name is not available`
Solution: S3 bucket names must be globally unique. Change the bucket name in `main.tf`:
```hcl
resource "aws_s3_bucket" "data_bucket" {
  bucket = "your-unique-prefix-data-bucket-${random_id.bucket_suffix.hex}"
}

resource "random_id" "bucket_suffix" {
  byte_length = 4
}
```
Insufficient IAM Permissions
Error: `UnauthorizedOperation` or `AccessDenied`
Solution: Verify that the IAM user has the required permissions:
```bash
# Check current user identity
aws sts get-caller-identity

# Verify policies attached to user
aws iam list-attached-user-policies --user-name your-terraform-user
```
State Lock Issues
Error: `Error acquiring the state lock`
Solution:
```bash
# Force unlock (use with caution)
terraform -chdir=terraform force-unlock LOCK_ID

# Or remove local state lock file
rm terraform/.terraform.tfstate.lock.info
```
Resource Already Exists
Error: Resource already exists but not in state
Solution: Import existing resource:
```bash
# Import S3 bucket
terraform -chdir=terraform import aws_s3_bucket.data_bucket my-existing-bucket-name

# Import EC2 instance
terraform -chdir=terraform import aws_instance.data_processor i-1234567890abcdef0
```
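On Terraform 1.5 and later, the same imports can also be declared in configuration with `import` blocks, which lets them go through the normal plan and review cycle. This is a sketch that reuses the placeholder bucket name and instance ID from the commands above; the matching `resource` blocks are assumed to exist already in `main.tf`.

```hcl
# Declarative equivalent of the two import commands (requires Terraform >= 1.5)
import {
  to = aws_s3_bucket.data_bucket
  id = "my-existing-bucket-name" # placeholder bucket name
}

import {
  to = aws_instance.data_processor
  id = "i-1234567890abcdef0" # placeholder instance ID
}
```

Running `terraform -chdir=terraform plan` then lists the resources to be imported, and `apply` records them in state. The blocks can be removed once the import has been applied.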
Terraform State Drift
Error: Resources differ from state
Solution:
```bash
# Refresh state to match real infrastructure
# (-refresh-only supersedes the deprecated `terraform refresh` command)
terraform -chdir=terraform apply -refresh-only

# Or during plan/apply (refreshing is the default behavior)
terraform -chdir=terraform apply -refresh=true
```
Region-Specific AMI Issues
Error: Invalid AMI ID for region
Solution: Use data source to find correct AMI:
```hcl
data "aws_ami" "amazon_linux_2" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

resource "aws_instance" "data_processor" {
  ami           = data.aws_ami.amazon_linux_2.id
  instance_type = "t3.medium"
}
```
Best Practices
- Use Remote State: Store Terraform state in S3 with versioning enabled
- Separate Environments: Use workspaces or separate state files for dev/staging/prod
- Least Privilege IAM: Use specific IAM policies instead of full access in production
- Tag Everything: Add consistent tags for cost tracking and resource management
- Version Control: Commit `.tf` files but exclude `terraform.tfstate` and `.terraform/`
- Plan Before Apply: Always run `terraform plan` before `apply`
- Use Variables: Parameterize configurations for reusability
- Enable Encryption: Use S3 bucket encryption and EBS encryption for EC2
- Implement Lifecycle Policies: Archive or delete old data automatically
- Document Dependencies: Use comments to explain resource relationships
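As an illustration of the encryption item in the list above, the sketch below makes the default encryption of the `raw_data` bucket explicit and turns on EBS encryption by default. It assumes the `aws_s3_bucket.raw_data` resource from the basic S3 pattern; the resource names `raw_data_encryption` and `this` are made up for the example. Note that `aws_ebs_encryption_by_default` is an account-level setting for the current region, so it affects every new EBS volume there, not only this project's instances.

```hcl
# Explicit server-side encryption for the raw data bucket (SSE-S3)
resource "aws_s3_bucket_server_side_encryption_configuration" "raw_data_encryption" {
  bucket = aws_s3_bucket.raw_data.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256" # use "aws:kms" with a key ID for KMS-managed keys
    }
  }
}

# Encrypt all newly created EBS volumes in this account and region
resource "aws_ebs_encryption_by_default" "this" {
  enabled = true
}
```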
Environment Variables
```bash
# AWS credentials (preferred over hardcoding)
export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_DEFAULT_REGION=us-east-1

# Terraform variables
export TF_VAR_environment=dev
export TF_VAR_bucket_prefix=mycompany
export TF_VAR_instance_type=t3.medium
```
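Because `environment` feeds directly into bucket names, a typo in `TF_VAR_environment` or a `-var` flag would quietly produce a new set of resources. One way to guard against that is a `validation` block on the variable. This is a sketch that would replace the `environment` declaration in `variables.tf`; the allowed values are taken from the variable's own description.

```hcl
variable "environment" {
  description = "Environment name (dev, staging, prod)"
  type        = string
  default     = "dev"

  # Reject anything outside the three documented environments
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment must be one of: dev, staging, prod."
  }
}
```

With this in place, `terraform plan` fails early with the error message when, for example, `TF_VAR_environment=production` is set.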