devops-flow
DevOps Workflow
Description: Infrastructure as Code, CI/CD pipeline automation, and deployment management
Category: DevOps & Deployment
Complexity: High (multi-cloud + orchestration + automation)
Purpose
Automate infrastructure provisioning, CI/CD pipelines, and deployment processes based on SPEC documents and ADR decisions. This ensures consistent, repeatable, and secure deployments across environments.
Capabilities
1. Infrastructure as Code (IaC)
- Terraform: Cloud-agnostic infrastructure provisioning
- CloudFormation: AWS-native infrastructure
- Ansible: Configuration management and provisioning
- Pulumi: Modern IaC with standard programming languages
- Kubernetes manifests: Container orchestration
2. CI/CD Pipeline Generation
- GitHub Actions: Workflow automation
- GitLab CI: Pipeline configuration
- Jenkins: Pipeline as code
- CircleCI: Cloud-native CI/CD
- Azure DevOps: Microsoft ecosystem integration
3. Container Configuration
- Dockerfile: Container image definition
- Docker Compose: Multi-container applications
- Kubernetes: Production orchestration
- Helm charts: Kubernetes package management
- Container registry: Image storage and versioning
4. Deployment Strategies
- Blue-Green: Zero-downtime deployments
- Canary: Gradual rollout with monitoring
- Rolling: Sequential instance updates
- Feature flags: Progressive feature enablement
- Rollback procedures: Automated failure recovery
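Feature flags in the list above decouple deployment from release: code ships dark and is switched on per environment. A minimal environment-variable gate (a sketch — the flag name `FEATURE_NEW_CHECKOUT` and the `is_enabled` helper are illustrative, not part of devops-flow) looks like:

```shell
#!/bin/bash
# Gate a code path on an environment-variable feature flag ("on" enables it).
is_enabled() {
  [ "$(printenv "$1" 2>/dev/null || true)" = "on" ]
}

export FEATURE_NEW_CHECKOUT=on
if is_enabled FEATURE_NEW_CHECKOUT; then
  echo "new checkout path"      # → new checkout path
else
  echo "legacy checkout path"
fi
```

In a real rollout the flag value would come from a flag service or deployment config rather than a raw environment variable, but the gating pattern is the same.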
5. Environment Management
- Environment separation: dev, staging, production
- Configuration management: Environment-specific configs
- Secret management: Vault, AWS Secrets Manager, etc.
- Infrastructure versioning: State management
- Cost optimization: Resource tagging and monitoring
6. Monitoring & Observability
- Logging: Centralized log aggregation
- Metrics: Performance and health monitoring
- Alerting: Incident response automation
- Tracing: Distributed request tracking
- Dashboards: Real-time visualization
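Centralized aggregation works best when every script emits one parseable line per event. A minimal structured-log emitter in the same style as the generated deployment scripts (a sketch — the `ts`/`level`/`msg` field names are an assumed convention, not a devops-flow output format):

```shell
#!/bin/bash
# Emit one key=value log line per event, with a UTC timestamp an aggregator can parse.
slog() {
  local level="$1"; shift
  echo "ts=$(date -u +%Y-%m-%dT%H:%M:%SZ) level=$level msg=\"$*\""
}

slog info "deployment started"
slog error "health check failed"
```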
7. Security & Compliance
- Security scanning: Container and infrastructure
- Compliance checks: Policy enforcement
- Access control: IAM and RBAC
- Network security: Firewall rules, VPC configuration
- Audit logging: Change tracking
8. Disaster Recovery
- Backup automation: Data and configuration backups
- Recovery procedures: Automated restoration
- Failover: Multi-region redundancy
- Data replication: Cross-region sync
- RTO/RPO: Recovery objectives implementation
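As a concrete reading of RPO: if the objective is, say, 15 minutes, the newest backup must never be older than that. A minimal freshness check (a sketch — the timestamps and the `rpo_ok` helper are illustrative):

```shell
#!/bin/bash
# Return success if the newest backup is no older than the RPO.
# All three arguments are in seconds (epoch timestamps and an interval).
rpo_ok() {
  local backup_ts="$1" now_ts="$2" rpo_s="$3"
  [ $(( now_ts - backup_ts )) -le "$rpo_s" ]
}

now=$(date +%s)
# A backup taken 10 minutes ago satisfies a 15-minute RPO.
rpo_ok "$(( now - 600 ))" "$now" 900 && echo "within RPO"   # → within RPO
```

In practice `backup_ts` would come from the backup system's metadata (e.g. the last snapshot's creation time), and a violation would page the on-call rather than just print.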
DevOps Workflow
```mermaid
graph TD
    A[SPEC Document] --> B[Extract Requirements]
    B --> C{Infrastructure Needed?}
    C -->|Yes| D[Generate IaC Templates]
    C -->|No| E[Generate CI/CD Pipeline]
    D --> F[Terraform/CloudFormation]
    F --> G[Validate Infrastructure Code]
    G --> H{Validation Pass?}
    H -->|No| I[Report Issues]
    H -->|Yes| J[Generate Deployment Pipeline]
    E --> J
    J --> K[CI/CD Configuration]
    K --> L[Add Build Stage]
    L --> M[Add Test Stage]
    M --> N[Add Security Scan]
    N --> O[Add Deploy Stage]
    O --> P[Environment Strategy]
    P --> Q{Deployment Type}
    Q -->|Blue-Green| R[Generate Blue-Green Config]
    Q -->|Canary| S[Generate Canary Config]
    Q -->|Rolling| T[Generate Rolling Config]
    R --> U[Add Monitoring]
    S --> U
    T --> U
    U --> V[Add Rollback Procedure]
    V --> W[Generate Documentation]
    W --> X[Review & Deploy]
    I --> X
```

Usage Instructions
Generate Infrastructure from SPEC
```bash
devops-flow generate-infra \
  --spec specs/SPEC-API-V1.md \
  --cloud aws \
  --output infrastructure/
```

Generated Terraform structure:

```
infrastructure/
├── main.tf              # Main configuration
├── variables.tf         # Input variables
├── outputs.tf           # Output values
├── providers.tf         # Cloud provider config
├── modules/
│   ├── vpc/             # Network infrastructure
│   ├── compute/         # EC2, Lambda, etc.
│   ├── database/        # RDS, DynamoDB
│   └── storage/         # S3, EBS
└── environments/
    ├── dev.tfvars       # Development config
    ├── staging.tfvars   # Staging config
    └── prod.tfvars      # Production config
```
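The per-environment `.tfvars` files are typically selected at plan time. A tiny wrapper that validates the environment name before handing off to Terraform (hypothetical — devops-flow does not ship this script; it only prints the command rather than executing it):

```shell
#!/bin/bash
# Validate an environment name against the generated layout and print the
# matching terraform plan invocation.
plan_cmd() {
  local env="$1"
  case "$env" in
    dev|staging|prod) ;;                                   # known tfvars files
    *) echo "unknown environment: $env" >&2; return 1 ;;   # fail fast on typos
  esac
  echo "terraform plan -var-file=environments/${env}.tfvars"
}

plan_cmd staging   # → terraform plan -var-file=environments/staging.tfvars
```

Rejecting unknown names here prevents Terraform from silently planning with defaults when a tfvars file is missing.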
Generate CI/CD Pipeline
```bash
devops-flow generate-pipeline \
  --type github-actions \
  --language python \
  --deploy-strategy blue-green \
  --output .github/workflows/
```

Generated GitHub Actions workflow:
```yaml
name: CI/CD Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

env:
  PYTHON_VERSION: '3.11'
  AWS_REGION: us-east-1

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ env.PYTHON_VERSION }}
      - name: Install dependencies
        run: |
          pip install ruff mypy
      - name: Run linters
        run: |
          ruff check .
          mypy .

  test:
    needs: lint
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install dependencies
        run: |
          pip install -r requirements.txt pytest pytest-cov
      - name: Run tests
        run: |
          pytest --cov=. --cov-report=xml
      - name: Upload coverage
        uses: codecov/codecov-action@v3

  security:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Security scan
        run: |
          pip install bandit
          bandit -r . -f json -o security-report.json
      - name: Upload security report
        uses: actions/upload-artifact@v3
        with:
          name: security-report
          path: security-report.json

  build:
    needs: [lint, test, security]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build Docker image
        run: |
          docker build -t registry.example.com/app:${{ github.sha }} .
      - name: Push to registry
        run: |
          docker push registry.example.com/app:${{ github.sha }}

  deploy-staging:
    needs: build
    if: github.ref == 'refs/heads/develop'
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - name: Deploy to staging
        run: |
          aws ecs update-service \
            --cluster staging-cluster \
            --service app-service \
            --force-new-deployment

  deploy-production:
    needs: build
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production
    steps:
      - name: Deploy blue-green
        run: |
          # Deploy to green environment
          ./scripts/deploy-green.sh
          # Run smoke tests
          ./scripts/smoke-tests.sh
          # Switch traffic to green
          ./scripts/switch-traffic.sh
          # Keep blue for rollback
```
Generate Kubernetes Configuration
```bash
devops-flow generate-k8s \
  --spec specs/SPEC-API-V1.md \
  --replicas 3 \
  --output k8s/
```

Generated Kubernetes manifests:
k8s/deployment.yaml
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
  labels:
    app: api
    version: v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
        version: v1
    spec:
      containers:
        - name: api
          image: registry.example.com/api:latest
          ports:
            - containerPort: 8000
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: api-secrets
                  key: database-url
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 5
```
k8s/service.yaml
```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-service
spec:
  selector:
    app: api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
  type: LoadBalancer
```
k8s/hpa.yaml
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
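The autoscaler follows Kubernetes' scaling rule, `desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization)`. A quick illustration against the 70% target above (the utilization values are hypothetical):

```shell
#!/bin/bash
# Compute the HPA's desired replica count using integer ceiling division.
hpa_desired() {
  local current="$1" util="$2" target="$3"
  # ceil(current * util / target) via (a + b - 1) / b
  echo $(( (current * util + target - 1) / target ))
}

hpa_desired 3 140 70   # pods at 140% of a 70% target → 6
hpa_desired 3 70 70    # exactly on target → stays at 3
```

The result is then clamped to the manifest's `minReplicas`/`maxReplicas` (3 and 10 here) before the controller acts.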
Generate Deployment Scripts from REQ
```bash
devops-flow generate-deployment-scripts \
  --req docs/07_REQ/REQ-NN.md \
  --spec docs/09_SPEC/SPEC-NN.yaml \
  --output scripts/
```

Generated shell scripts structure:

```
scripts/
├── setup.sh          # Initial environment setup
├── install.sh        # Application installation
├── deploy.sh         # Main deployment orchestration
├── rollback.sh       # Rollback to previous version
├── health-check.sh   # Health verification
└── cleanup.sh        # Cleanup old versions
```

Script Generation Logic:
- Parse REQ Section 9.5.3 for script requirements
- Parse SPEC deployment section for technical details
- Apply script standards (Bash 4.0+, error handling, logging)
- Reference cloud provider from REQ @adr tags
- Use environment-specific configurations from REQ 9.5.2

Example generated script (setup.sh):

```bash
#!/bin/bash
set -euo pipefail

# Setup environment for deployment
LOG_FILE="logs/deployment_$(date +%Y%m%d_%H%M%S).log"
mkdir -p logs

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG_FILE"
}

log "Starting environment setup..."

# Install dependencies
if [ ! -f .tool-versions ]; then
    log "Installing Python dependencies..."
    pip install -r requirements.txt
fi

# Configure environment variables
if [ -f .env.deployment ]; then
    log "Loading deployment environment variables..."
    export $(cat .env.deployment | grep -v '^#' | xargs)
fi

log "Environment setup complete"
exit 0
```
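The `export $(cat .env.deployment | grep -v '^#' | xargs)` pattern above splits on whitespace, so values containing spaces are mangled. A more robust loader (a sketch, assuming the file holds plain `KEY=VALUE` lines; `/tmp/demo.env` is just a demo fixture):

```shell
#!/bin/bash
# Safer .env loading: let the shell itself parse the file with auto-export
# enabled, so quoted values with spaces survive intact.
load_env() {
  local file="$1"
  [ -f "$file" ] || return 0   # missing file is not an error
  set -a                       # export every variable assigned while sourcing
  # shellcheck disable=SC1090
  . "$file"
  set +a
}

printf 'GREETING="hello world"\n' > /tmp/demo.env
load_env /tmp/demo.env
echo "$GREETING"   # → hello world
```

The trade-off: sourcing executes the file, so this is only appropriate for env files the deployment pipeline itself writes and controls.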
**Example generated script** (deploy.sh):
```bash
#!/bin/bash
set -euo pipefail

# Main deployment orchestration script
LOG_FILE="logs/deployment_$(date +%Y%m%d_%H%M%S).log"
ENVIRONMENT="${1:-staging}"
mkdir -p logs

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG_FILE"
}

# Step 1: Setup
log "Running setup..."
./scripts/setup.sh

# Step 2: Install
log "Installing application..."
./scripts/install.sh --env "$ENVIRONMENT"

# Step 3: Deploy
log "Deploying application..."
if [ "$ENVIRONMENT" = "production" ]; then
    ./scripts/deploy-prod.sh
else
    ./scripts/deploy-staging.sh
fi

# Step 4: Health check (tested directly: with set -e, checking $? afterwards
# would never run on failure)
log "Running health check..."
if ./scripts/health-check.sh --env "$ENVIRONMENT"; then
    log "Deployment successful"
else
    log "Deployment failed, initiating rollback..."
    ./scripts/rollback.sh --env "$ENVIRONMENT"
    exit 1
fi
```
**Example generated script** (health-check.sh):
```bash
#!/bin/bash
set -euo pipefail

# Health verification script
HEALTH_URL="${1:-http://localhost:8000/health/live}"
TIMEOUT=60
RETRIES=3

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*"
}

log "Starting health check..."
for i in $(seq 1 "$RETRIES"); do
    log "Attempt $i of $RETRIES..."
    # curl exits non-zero on connection failure; fall back to "000" so
    # set -e does not abort the retry loop
    RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" --max-time "$TIMEOUT" "$HEALTH_URL" || echo "000")
    if [ "$RESPONSE" = "200" ]; then
        log "Health check passed"
        exit 0
    fi
    log "Health check failed, sleeping before retry..."
    sleep 5
done
log "Health check failed after $RETRIES attempts"
exit 1
```
Generate Ansible Playbooks from REQ
```bash
devops-flow generate-ansible-playbooks \
  --req docs/07_REQ/REQ-NN.md \
  --spec docs/09_SPEC/SPEC-NN.yaml \
  --output ansible/
```

Generated Ansible playbooks structure:

```
ansible/
├── provision_infra.yml        # Infrastructure provisioning
├── configure_instances.yml    # Instance configuration
├── deploy_app.yml             # Application deployment
├── configure_monitoring.yml   # Monitoring setup
├── configure_security.yml     # Security hardening
└── backup_restore.yml         # Backup/restore procedures
```

Playbook Generation Logic:
- Parse REQ Section 9.5.4 for playbook requirements
- Parse Section 9.5.1 for infrastructure configuration
- Apply Ansible standards (2.9+, modular roles, idempotency)
- Reference cloud provider from REQ @adr tags
- Use environment-specific variables from REQ 9.5.2

Example generated playbook (provision_infra.yml):
```yaml
---
- name: Provision Infrastructure
  hosts: localhost
  gather_facts: no
  vars_files:
    - "environments/{{ target_env }}.yml"
  tasks:
    - name: Create VPC
      ec2_vpc_net:
        name: "{{ vpc_name }}"
        cidr_block: "{{ vpc_cidr }}"
        region: "{{ aws_region }}"
        tags:
          Project: "{{ project_name }}"
          Environment: "{{ target_env }}"
          ManagedBy: "Ansible"

    - name: Create security groups
      ec2_security_group:
        name: "{{ security_group_name }}"
        description: "Security group for {{ application_name }}"
        vpc_id: "{{ vpc.vpc_id }}"
        rules:
          - proto: tcp
            from_port: 80
            to_port: 80
            cidr_ip: 0.0.0.0/0
          - proto: tcp
            from_port: 443
            to_port: 443
            cidr_ip: 0.0.0.0/0
        region: "{{ aws_region }}"
        tags:
          Project: "{{ project_name }}"
          Environment: "{{ target_env }}"

    - name: Create RDS instance
      rds:
        db_name: "{{ db_name }}"
        engine: postgres
        engine_version: "{{ db_version }}"
        instance_type: "{{ db_instance_class }}"
        allocated_storage: "{{ db_storage_gb }}"
        username: "{{ db_username }}"
        password: "{{ db_password }}"
        vpc_security_group_ids:
          - "{{ security_group.group_id }}"
        subnet_group_name: "{{ db_subnet_group }}"
        backup_retention_period: "{{ backup_retention_days }}"
        multi_az: true
        region: "{{ aws_region }}"
        tags:
          Project: "{{ project_name }}"
          Environment: "{{ target_env }}"
          ManagedBy: "Ansible"
```

Example generated playbook (deploy_app.yml):
```yaml
---
- name: Deploy Application
  hosts: app_servers
  gather_facts: yes
  become: yes
  vars_files:
    - "environments/{{ target_env }}.yml"
  tasks:
    - name: Ensure application directory exists
      file:
        path: "{{ app_directory }}"
        state: directory
        mode: '0755'
        owner: "{{ app_user }}"
        group: "{{ app_group }}"

    - name: Copy application code
      synchronize:
        src: "{{ app_source_directory }}/"
        dest: "{{ app_directory }}/"
        delete: yes
        recursive: yes

    - name: Install Python dependencies
      pip:
        requirements: "{{ app_directory }}/requirements.txt"
        virtualenv: "{{ app_venv }}"
        state: present

    - name: Configure application
      template:
        src: "templates/{{ target_env }}_config.yml"
        dest: "{{ app_directory }}/config.yml"
        owner: "{{ app_user }}"
        group: "{{ app_group }}"
        mode: '0640'

    - name: Restart application service
      systemd:
        name: "{{ app_service_name }}"
        state: restarted
        daemon_reload: yes
      notify: Run Health Check

    - name: Wait for application to be ready
      wait_for:
        port: 8000
        host: "{{ inventory_hostname }}"
        timeout: 300

  handlers:
    - name: Run Health Check
      uri:
        url: "http://localhost:8000/health/ready"
        method: GET
        status_code: 200
      register: health_check
```
Infrastructure Templates
AWS Infrastructure (Terraform)
main.tf
```hcl
terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }

  backend "s3" {
    bucket = "terraform-state-bucket"
    key    = "infrastructure/terraform.tfstate"
    region = "us-east-1"
  }
}

provider "aws" {
  region = var.aws_region

  default_tags {
    tags = {
      Project     = var.project_name
      Environment = var.environment
      ManagedBy   = "Terraform"
    }
  }
}
```
VPC Module
```hcl
module "vpc" {
  source = "./modules/vpc"

  vpc_cidr             = var.vpc_cidr
  availability_zones   = var.availability_zones
  public_subnet_cidrs  = var.public_subnet_cidrs
  private_subnet_cidrs = var.private_subnet_cidrs
}
```
ECS Cluster
```hcl
resource "aws_ecs_cluster" "main" {
  name = "${var.project_name}-${var.environment}-cluster"

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}
```
Application Load Balancer
```hcl
resource "aws_lb" "main" {
  name               = "${var.project_name}-${var.environment}-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = module.vpc.public_subnet_ids

  enable_deletion_protection = var.environment == "production"
}
```
RDS Database
```hcl
resource "aws_db_instance" "main" {
  identifier     = "${var.project_name}-${var.environment}-db"
  engine         = "postgres"
  engine_version = "15.3"
  instance_class = var.db_instance_class

  allocated_storage     = var.db_allocated_storage
  max_allocated_storage = var.db_max_allocated_storage
  storage_encrypted     = true

  db_name  = var.db_name
  username = var.db_username
  password = random_password.db_password.result

  vpc_security_group_ids = [aws_security_group.db.id]
  db_subnet_group_name   = aws_db_subnet_group.main.name

  backup_retention_period = var.environment == "production" ? 30 : 7
  skip_final_snapshot     = var.environment != "production"

  tags = {
    Name = "${var.project_name}-${var.environment}-db"
  }
}
```
Docker Configuration
Dockerfile

```dockerfile
FROM python:3.11-slim AS base

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Create non-root user
RUN useradd -m -u 1000 appuser && chown -R appuser:appuser /app
USER appuser

# Expose port
EXPOSE 8000

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s --retries=3 \
    CMD python -c "import requests; requests.get('http://localhost:8000/health')"

# Run application
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

# Multi-stage build for smaller image
FROM base AS production
ENV ENVIRONMENT=production
RUN pip install --no-cache-dir gunicorn
CMD ["gunicorn", "main:app", "-w", "4", "-k", "uvicorn.workers.UvicornWorker", "--bind", "0.0.0.0:8000"]
```
Docker Compose (Local Development)
docker-compose.yml
```yaml
version: '3.8'

services:
  api:
    build:
      context: .
      dockerfile: Dockerfile
      target: base
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgresql://user:password@db:5432/appdb
      - REDIS_URL=redis://redis:6379/0
      - ENVIRONMENT=development
    volumes:
      - .:/app
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
    command: uvicorn main:app --host 0.0.0.0 --port 8000 --reload

  db:
    image: postgres:15
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
      POSTGRES_DB: appdb
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
      interval: 10s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - api

volumes:
  postgres_data:
  redis_data:
```
Deployment Strategies
Blue-Green Deployment
```bash
#!/bin/bash
# deploy-blue-green.sh
set -e

BLUE_ENV="production-blue"
GREEN_ENV="production-green"
CURRENT_ENV=$(get_active_environment)

if [ "$CURRENT_ENV" == "$BLUE_ENV" ]; then
  TARGET_ENV="$GREEN_ENV"
  OLD_ENV="$BLUE_ENV"
else
  TARGET_ENV="$BLUE_ENV"
  OLD_ENV="$GREEN_ENV"
fi

echo "Deploying to $TARGET_ENV (current: $OLD_ENV)"

# Deploy to target environment
deploy_to_environment "$TARGET_ENV"

# Run smoke tests
if ! run_smoke_tests "$TARGET_ENV"; then
  echo "Smoke tests failed, rolling back"
  exit 1
fi

# Switch traffic
switch_load_balancer "$TARGET_ENV"

# Monitor for 5 minutes
monitor_environment "$TARGET_ENV" 300

# If all good, keep old environment for quick rollback
echo "Deployment successful. Old environment $OLD_ENV kept for rollback."
```
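The if/else toggle in the script above can be factored into a small pure helper, which keeps the environment-selection logic unit-testable without touching infrastructure; a sketch (the function name is illustrative, not part of the original script):

```shell
# other_env: given the active environment, return the idle one.
# Pure shell, so it can be tested in isolation.
other_env() {
  if [ "$1" = "production-blue" ]; then
    echo "production-green"
  else
    echo "production-blue"
  fi
}

other_env "production-blue"   # prints: production-green
```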
Canary Deployment
```yaml
# k8s/canary-deployment.yaml
apiVersion: v1
kind: Service
metadata:
  name: api-service
spec:
  selector:
    app: api
  ports:
    - port: 80
      targetPort: 8000
---
# Stable version (90% traffic)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-stable
spec:
  replicas: 9
  selector:
    matchLabels:
      app: api
      version: stable
  template:
    metadata:
      labels:
        app: api
        version: stable
    spec:
      containers:
        - name: api
          image: registry.example.com/api:v1.0.0
---
# Canary version (10% traffic)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-canary
spec:
  replicas: 1
  selector:
    matchLabels:
      app: api
      version: canary
  template:
    metadata:
      labels:
        app: api
        version: canary
    spec:
      containers:
        - name: api
          image: registry.example.com/api:v1.1.0
```
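The 90/10 split works because the Service selector (`app: api`) matches pods from both Deployments, so traffic divides roughly by replica count. A quick sketch of the arithmetic for other splits (assumes the Service balances roughly evenly across pods):

```shell
# Replica counts approximating a desired canary traffic share
total_replicas=10
canary_pct=10
canary_replicas=$(( total_replicas * canary_pct / 100 ))
stable_replicas=$(( total_replicas - canary_replicas ))
echo "stable=${stable_replicas} canary=${canary_replicas}"   # stable=9 canary=1
```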
Monitoring & Observability

Prometheus Configuration
```yaml
# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'api-service'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        action: keep
        regex: api

  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']

rule_files:
  - /etc/prometheus/alerts/*.yml
```
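The `api-service` job above keeps only pods labeled `app: api`. A common generalization (a sketch using the conventional `prometheus.io/*` pod annotations) scrapes any pod that opts in:

```yaml
# Scrape only pods annotated prometheus.io/scrape: "true",
# rewriting the scrape address to the annotated port
- job_name: 'annotated-pods'
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: "true"
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      target_label: __address__
```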
Alert Rules
```yaml
# alerts/api-alerts.yml
groups:
  - name: api-alerts
    interval: 30s
    rules:
      - alert: HighErrorRate
        # Ratio of 5xx responses to all responses per instance
        expr: sum by (instance) (rate(http_requests_total{status=~"5.."}[5m])) / sum by (instance) (rate(http_requests_total[5m])) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate detected"
          description: "{{ $labels.instance }} has error rate {{ $value }}"

      - alert: HighLatency
        # Quantile over the recent rate of buckets, not raw cumulative counters
        expr: histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High latency detected"
          description: "95th percentile latency is {{ $value }}s"

      - alert: PodDown
        expr: up{job="api-service"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Pod is down"
          description: "{{ $labels.instance }} has been down for 2 minutes"
```
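These rules fire into the Alertmanager referenced in prometheus.yml (`alertmanager:9093`); a minimal routing sketch, with the receiver names and webhook URL as placeholders:

```yaml
# alertmanager.yml sketch: send critical alerts to an on-call receiver
route:
  receiver: default
  routes:
    - matchers:
        - severity="critical"
      receiver: oncall
receivers:
  - name: default
  - name: oncall
    webhook_configs:
      - url: http://alert-webhook.internal/notify   # placeholder
```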
Security Configuration

Network Security
```hcl
# security-groups.tf

# ALB Security Group
resource "aws_security_group" "alb" {
  name_prefix = "${var.project_name}-alb-"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    description = "HTTPS from internet"
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Application Security Group
resource "aws_security_group" "app" {
  name_prefix = "${var.project_name}-app-"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port       = 8000
    to_port         = 8000
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
    description     = "From ALB"
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Database Security Group
resource "aws_security_group" "db" {
  name_prefix = "${var.project_name}-db-"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id]
    description     = "From application"
  }
}
```
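Note that because the `db` security group declares no `egress` block, Terraform removes AWS's default allow-all egress rule, leaving the database with no outbound access. If it needs some (e.g. for replication), declare it explicitly; a sketch:

```hcl
# Sketch: explicit outbound rule for the db security group
# (omit entirely to keep the database fully egress-restricted)
resource "aws_security_group_rule" "db_egress" {
  type              = "egress"
  from_port         = 0
  to_port           = 0
  protocol          = "-1"
  cidr_blocks       = ["0.0.0.0/0"]
  security_group_id = aws_security_group.db.id
}
```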
Tool Access

Required tools:
- Read: Read SPEC documents and ADRs
- Write: Generate infrastructure and pipeline files
- Bash: Execute Terraform, Docker, kubectl commands
- Grep: Search for configuration patterns

Required software:
- Terraform / OpenTofu
- Docker / Podman
- kubectl / helm
- aws-cli / gcloud / az-cli
- Ansible (optional)
Integration Points

With doc-flow
- Extract infrastructure requirements from SPEC documents
- Validate ADR compliance in infrastructure code
- Generate deployment documentation

With security-audit
- Security scanning of infrastructure code
- Vulnerability assessment of containers
- Compliance validation

With test-automation
- Integration with CI/CD for automated testing
- Deployment smoke tests
- Infrastructure validation tests

With analytics-flow
- Deployment metrics and trends
- Infrastructure cost tracking
- Performance monitoring integration
Best Practices
- Infrastructure as Code: All infrastructure versioned in Git
- Immutable infrastructure: Replace, don't modify
- Environment parity: Dev/staging/prod consistency
- Secret management: Never commit secrets
- Monitoring from day one: Observability built-in
- Automated rollbacks: Fast failure recovery
- Cost optimization: Tag resources, monitor spending
- Security by default: Least privilege, encryption
- Documentation: Runbooks for common operations
- Disaster recovery: Regular backup testing
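The cost-optimization item above (tag resources) can be enforced once at the provider level rather than per resource; a sketch using the AWS provider's `default_tags`, where `var.aws_region` and `var.environment` are assumed variables:

```hcl
# Every resource created through this provider inherits these tags,
# making spend attributable per project and environment
provider "aws" {
  region = var.aws_region
  default_tags {
    tags = {
      Project     = var.project_name
      Environment = var.environment
      ManagedBy   = "terraform"
    }
  }
}
```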
Success Criteria
- Zero manual infrastructure provisioning
- Deployment time < 15 minutes
- Rollback time < 5 minutes
- Zero-downtime deployments
- Infrastructure drift detection automated
- Security compliance 100%
- Cost variance < 10% from budget
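Automated drift detection (the criterion above) is commonly built on `terraform plan -detailed-exitcode`, which exits 0 when state matches reality, 2 when changes or drift are pending, and 1 on error. A sketch of the exit-code handling, with the terraform invocation itself left as a comment:

```shell
# classify_plan_exit: map terraform plan -detailed-exitcode results to a status.
# In a scheduled CI job: terraform plan -refresh-only -detailed-exitcode; classify_plan_exit $?
classify_plan_exit() {
  case "$1" in
    0) echo "no-drift" ;;
    2) echo "drift" ;;
    *) echo "error" ;;
  esac
}

classify_plan_exit 2   # prints: drift
```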
Notes

- Generated configurations require review before production use
- Cloud provider credentials must be configured separately
- State management (Terraform) requires backend configuration
- Multi-region deployments require additional configuration
- Cost estimation available from `terraform plan` output (e.g. via Terraform Cloud cost estimation or third-party tools)