AWS Cost Optimization & FinOps

Systematic workflows for AWS cost optimization and financial operations management.

When to Use This Skill

Use this skill when you need to:

- **Find cost savings**: Identify unused resources, rightsizing opportunities, or commitment discounts
- **Analyze spending**: Understand cost trends, detect anomalies, or break down costs
- **Optimize architecture**: Choose cost-effective services, storage tiers, or instance types
- **Implement FinOps**: Set up governance, tagging, budgets, or monthly reviews
- **Make purchase decisions**: Evaluate Reserved Instances, Savings Plans, or Spot instances
- **Troubleshoot costs**: Investigate unexpected bills or cost spikes
- **Plan budgets**: Forecast costs or evaluate the impact of new projects

Cost Optimization Workflow

Follow this systematic approach for AWS cost optimization:

┌─────────────────────────────────────────────┐
│ 1. DISCOVER                                 │
│    What are we spending money on?           │
│    Run: find_unused_resources.py            │
│    Run: cost_anomaly_detector.py            │
└─────────────────────────────────────────────┘
                    ↓
┌─────────────────────────────────────────────┐
│ 2. ANALYZE                                  │
│    Where are the optimization opportunities?│
│    Run: rightsizing_analyzer.py             │
│    Run: detect_old_generations.py           │
│    Run: spot_recommendations.py             │
│    Run: analyze_ri_recommendations.py       │
└─────────────────────────────────────────────┘
                    ↓
┌─────────────────────────────────────────────┐
│ 3. PRIORITIZE                               │
│    What should we optimize first?           │
│    - Quick wins (low risk, high savings)    │
│    - Low-hanging fruit (easy to implement)  │
│    - Strategic improvements                 │
└─────────────────────────────────────────────┘
                    ↓
┌─────────────────────────────────────────────┐
│ 4. IMPLEMENT                                │
│    Execute optimization actions             │
│    - Delete unused resources                │
│    - Rightsize instances                    │
│    - Purchase commitments                   │
│    - Migrate to new generations             │
└─────────────────────────────────────────────┘
                    ↓
┌─────────────────────────────────────────────┐
│ 5. MONITOR                                  │
│    Verify savings and track metrics         │
│    - Monthly cost reviews                   │
│    - Tag compliance monitoring              │
│    - Budget variance tracking               │
└─────────────────────────────────────────────┘

Core Workflows

Workflow 1: Monthly Cost Optimization Review

Frequency: Run monthly (first week of each month)

**Step 1: Find Unused Resources**
```bash
# Scan for waste across all resources
python3 scripts/find_unused_resources.py

# Expected output:
# - Unattached EBS volumes
# - Old snapshots
# - Unused Elastic IPs
# - Idle NAT Gateways
# - Idle EC2 instances
# - Unused load balancers
# - Estimated monthly savings
```
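To make the first check concrete, here is a minimal sketch of how unattached EBS volumes could be picked out of an `ec2.describe_volumes` response. It is not the code of `find_unused_resources.py`; the helper names and the price constant are assumptions for illustration, and only `fetch_volumes` needs boto3 and credentials.

```python
from typing import Iterable, List

# Assumed price per GiB-month; replace with the rate for your volume type and region.
ASSUMED_PRICE_PER_GB_MONTH = 0.10


def unattached_volumes(volumes: Iterable[dict]) -> List[dict]:
    """Return volumes in the 'available' state, i.e. not attached to any instance."""
    return [v for v in volumes if v.get("State") == "available"]


def estimated_monthly_cost(volumes: Iterable[dict],
                           price_per_gb: float = ASSUMED_PRICE_PER_GB_MONTH) -> float:
    """Rough monthly cost of the given volumes, based only on their size in GiB."""
    return sum(v.get("Size", 0) for v in volumes) * price_per_gb


def fetch_volumes(region: str) -> List[dict]:
    """Page through describe_volumes; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it

    ec2 = boto3.client("ec2", region_name=region)
    volumes: List[dict] = []
    for page in ec2.get_paginator("describe_volumes").paginate():
        volumes.extend(page["Volumes"])
    return volumes
```

A volume in the `available` state is only a candidate: confirm with its owner before deleting it.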


**Step 2: Analyze Cost Anomalies**
```bash
# Detect unusual spending patterns
python3 scripts/cost_anomaly_detector.py --days 30

# Expected output:
# - Cost spikes and anomalies
# - Top cost drivers
# - Period-over-period comparison
# - 30-day forecast
```
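One simple way to flag a cost spike is a z-score over daily totals. The sketch below assumes a plain list of daily costs and a threshold of 2 standard deviations; both are illustrative choices, and the actual method used by `cost_anomaly_detector.py` may differ.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def find_cost_spikes(daily_costs: List[float],
                     threshold: float = 2.0) -> List[Tuple[int, float]]:
    """Return (day_index, cost) pairs lying more than `threshold`
    standard deviations above the mean of the series."""
    if len(daily_costs) < 2:
        return []
    avg = mean(daily_costs)
    spread = pstdev(daily_costs)
    if spread == 0:
        return []  # perfectly flat spend: nothing to flag
    return [(day, cost) for day, cost in enumerate(daily_costs)
            if (cost - avg) / spread > threshold]
```

A z-score is sensitive to the spike itself inflating the spread, so short windows with one very large day can hide smaller anomalies.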


**Step 3: Identify Rightsizing Opportunities**
```bash
# Find oversized instances
python3 scripts/rightsizing_analyzer.py --days 30

# Expected output:
# - EC2 instances with low utilization
# - RDS instances with low utilization
# - Recommended smaller instance types
# - Estimated savings
```
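"Low utilization" needs a definition before it can be acted on. A minimal sketch, assuming CPU samples (in percent) have already been collected per instance, for example from CloudWatch; the 20% average and 50% peak thresholds are assumptions, not values taken from `rightsizing_analyzer.py`.

```python
from typing import Dict, List


def low_utilization_instances(cpu_samples: Dict[str, List[float]],
                              avg_threshold: float = 20.0,
                              peak_threshold: float = 50.0) -> List[str]:
    """Return IDs of instances whose average AND peak CPU stay under the thresholds."""
    flagged = []
    for instance_id, samples in cpu_samples.items():
        if not samples:
            continue  # no data: do not guess
        average = sum(samples) / len(samples)
        if average < avg_threshold and max(samples) < peak_threshold:
            flagged.append(instance_id)
    return sorted(flagged)
```

Checking the peak as well as the average avoids downsizing an instance that is idle most of the day but busy during a short batch window. CPU alone is not sufficient; memory and network should be reviewed too.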


**Step 4: Generate Monthly Report**
```bash
# Use the template to compile findings
mkdir -p reports
cp assets/templates/monthly_cost_report.md "reports/$(date +%Y-%m)-cost-report.md"

# Fill in:
# - Findings from scripts
# - Action items
# - Team cost breakdowns
# - Optimization wins
```


**Step 5: Team Review Meeting**
- Present findings to engineering teams
- Assign optimization tasks
- Track action items to completion

---



Workflow 2: Commitment Purchase Analysis (RI/Savings Plans)

When: Quarterly or when usage patterns stabilize

**Step 1: Analyze Current Usage**
```bash
# Identify workloads suitable for commitments
python3 scripts/analyze_ri_recommendations.py --days 60

# Looks for:
# - EC2 instances running consistently for 60+ days
# - RDS instances with stable usage
# - Calculates ROI for 1yr vs 3yr commitments
```


**Step 2: Review Recommendations**

Evaluate each recommendation:

✅ Good candidate if:

- Running 24/7 for 60+ days
- Workload is stable and predictable
- No plans to change architecture
- Savings > 30%

❌ Poor candidate if:

- Workload is variable or experimental
- Architecture changes planned
- Instance type may change
- Dev/test environment


**Step 3: Choose Commitment Type**

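**Reserved Instances**:
- Standard RI: Highest RI discount (about 63%), least flexibility (the instance family is fixed for the term)
- Convertible RI: Moderate discount (about 54%), can be exchanged for a different instance type
- Best for: Specific instance types, stable workloads

**Savings Plans**:
- Compute SP: Flexible across instance types and regions (up to 66% savings)
- EC2 Instance SP: Flexible across sizes in the same family (up to 72% savings)
- Best for: Workloads that vary within those constraints

Discount figures are approximate; the actual rate depends on term length, payment option, instance family, and region.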

**Decision Matrix**:

- Known instance type, won't change → Standard RI
- May need to change types → Convertible RI or Compute SP
- Variable workloads → Compute Savings Plan
- Maximum flexibility → Compute Savings Plan
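The decision matrix can be read as a short chain of checks. A minimal sketch, where the function name and the three boolean inputs are assumptions introduced for illustration; the order of the checks (most flexible need first) is one reasonable reading of the matrix, not a rule stated by AWS.

```python
def recommend_commitment(instance_type_fixed: bool,
                         may_change_type: bool,
                         variable_workload: bool) -> str:
    """Map the decision matrix onto a single recommendation string."""
    if variable_workload:
        return "Compute Savings Plan"
    if may_change_type:
        return "Convertible RI or Compute Savings Plan"
    if instance_type_fixed:
        return "Standard RI"
    # Nothing is known for certain: keep maximum flexibility.
    return "Compute Savings Plan"
```

For example, `recommend_commitment(True, False, False)` yields the Standard RI row, while any workload flagged as variable falls through to the Compute Savings Plan regardless of the other answers.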


**Step 4: Purchase and Track**
- Purchase through AWS Console or CLI
- Tag commitments with purchase date and owner
- Monitor utilization monthly
- Aim for >90% utilization
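Utilization here means the share of committed hours that were actually consumed. A minimal sketch of that arithmetic, assuming used and committed hours are read from Cost Explorer or a billing export; the function names are hypothetical.

```python
def commitment_utilization(used_hours: float, committed_hours: float) -> float:
    """Percentage of committed hours that were actually used (capped at 100)."""
    if committed_hours <= 0:
        raise ValueError("committed_hours must be positive")
    return min(used_hours / committed_hours, 1.0) * 100


def needs_attention(used_hours: float, committed_hours: float,
                    target_pct: float = 90.0) -> bool:
    """True when utilization is below the target from the step above."""
    return commitment_utilization(used_hours, committed_hours) < target_pct
```

In a 30-day month one commitment covers 720 hours, so 600 used hours is roughly 83% utilization and would be flagged against the 90% target.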

**Reference**: See `references/best_practices.md` for detailed commitment strategies

---



Workflow 3: Instance Generation Migration

When: During architecture reviews or optimization sprints

**Step 1: Detect Old Instances**
```bash
# Find outdated instance generations
python3 scripts/detect_old_generations.py

# Identifies:
# - t2 → t3 migrations (10% savings)
# - m4 → m5 → m6i migrations
# - Intel → Graviton opportunities (20% savings)
```
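Detecting an old generation comes down to mapping an instance family to its successor. A minimal sketch, assuming a hand-maintained table built from the migrations listed above; it is not the logic of `detect_old_generations.py`, and the table is deliberately incomplete.

```python
from typing import Optional

# Assumed successor families, taken from the migrations listed above.
SUCCESSOR_FAMILY = {"t2": "t3", "m4": "m5", "m5": "m6i"}


def suggest_upgrade(instance_type: str) -> Optional[str]:
    """Return the same size in the successor family, or None if no mapping is known.

    The suggested type is not checked for availability: not every size
    exists in every family or region, so verify before migrating.
    """
    family, _, size = instance_type.partition(".")
    successor = SUCCESSOR_FAMILY.get(family)
    if successor is None or not size:
        return None
    return f"{successor}.{size}"
```

Calling the function twice walks a chain such as m4 → m5 → m6i one generation at a time.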


**Step 2: Prioritize Migrations**

**Quick Wins (Low Risk)**:

- t2 → t3: Drop-in replacement, 10% savings
- m4 → m5: Better performance, 5% savings
- gp2 → gp3: No downtime, 20% savings

Note: t3 and m5 are Nitro-based, so confirm the AMI has ENA and NVMe drivers before changing the instance type.

**Medium Effort (Test Required)**:

x86 → Graviton (ARM64): 20% savings

- Requires ARM64 compatibility testing
- Most modern frameworks support ARM64
- Test in staging first


**Step 3: Execute Migration**

**For EC2 (x86 to x86)**:
1. Stop instance
2. Change instance type
3. Start instance
4. Verify application
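The x86-to-x86 steps above map onto a handful of EC2 API calls. A minimal sketch, assuming an EBS-backed instance and a boto3 EC2 client passed in by the caller; `resize_instance` is a hypothetical helper, and application verification (step 4) is left to the caller.

```python
def resize_instance(ec2, instance_id: str, new_type: str) -> None:
    """Stop, retype, and restart one EBS-backed instance.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    The instance is unavailable between the stop and start calls.
    """
    # 1. Stop the instance and wait until it is fully stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # 2. Change the instance type (only possible while stopped).
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    # 3. Start it again and wait until it is running.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
```

Passing the client in, rather than creating it inside the function, keeps region and profile selection with the caller and makes the sequence easy to exercise against a stub before touching a real instance.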

**For Graviton Migration**:
1. Create ARM64 AMI or Docker image
2. Launch new Graviton instance
3. Test thoroughly
4. Cut over traffic
5. Terminate old instance

**Step 4: Validate Savings**
- Monitor new costs in Cost Explorer
- Verify performance is acceptable
- Document migration for other teams

**Reference**: See `references/best_practices.md` → Compute Optimization

---



Workflow 4: Spot Instance Evaluation

When: For fault-tolerant workloads or Auto Scaling Groups

**Step 1: Identify Candidates**
```bash
# Analyze workloads for Spot suitability
python3 scripts/spot_recommendations.py

# Evaluates:
# - Instances in Auto Scaling Groups (good candidates)
# - Dev/test/staging environments
# - Batch processing workloads
# - CI/CD and build servers
```


**Step 2: Assess Suitability**

**Excellent for Spot**:
- Stateless applications
- Batch jobs
- CI/CD pipelines
- Data processing
- Auto Scaling Groups

**NOT suitable for Spot**:
- Databases (without replicas)
- Stateful applications
- Real-time services
- Mission-critical workloads

**Step 3: Implementation Strategy**

**Option 1: Fargate Spot (Easiest)**
```yaml
# ECS task definition
requiresCompatibilities:
  - FARGATE
capacityProviderStrategy:
  - capacityProvider: FARGATE_SPOT
    weight: 70  # 70% Spot
  - capacityProvider: FARGATE
    weight: 30  # 30% On-Demand
```


**Option 2: EC2 Auto Scaling with Spot**
```yaml
# Mixed instances policy
MixedInstancesPolicy:
  InstancesDistribution:
    OnDemandBaseCapacity: 2
    OnDemandPercentageAboveBaseCapacity: 30
    SpotAllocationStrategy: capacity-optimized
  LaunchTemplate:
    Overrides:
      - InstanceType: m5.large
      - InstanceType: m5a.large
      - InstanceType: m5n.large
```
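To see what the base capacity of 2 and the 30% setting mean for a given group size, the split can be worked out by hand. A minimal sketch of that arithmetic; rounding the On-Demand share up is an assumption made here, and Auto Scaling's own rounding may differ by one instance.

```python
import math
from typing import Tuple


def capacity_split(desired: int, base: int = 2,
                   on_demand_pct_above_base: int = 30) -> Tuple[int, int]:
    """Return (on_demand, spot) instance counts for a desired capacity."""
    on_demand_base = min(desired, base)          # the base is always On-Demand
    above_base = desired - on_demand_base
    on_demand_extra = math.ceil(above_base * on_demand_pct_above_base / 100)
    on_demand = on_demand_base + on_demand_extra
    return on_demand, desired - on_demand
```

With a desired capacity of 12, the first 2 instances are On-Demand, 30% of the remaining 10 adds 3 more, and the other 7 run on Spot.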


**Option 3: EC2 Spot Fleet**
```bash
# Create Spot Fleet with diverse instance types
aws ec2 request-spot-fleet --spot-fleet-request-config file://spot-fleet.json
```


**Step 4: Implement Interruption Handling**
```bash
# Handle 2-minute termination notice
# Instance metadata: /latest/meta-data/spot/instance-action

# In application:
# 1. Poll for termination notice
# 2. Gracefully shutdown (save state)
# 3. Drain connections
# 4. Exit
```
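The polling step can be sketched with the standard library only. This is a minimal example under stated assumptions: it uses IMDSv2 (session token, then the `spot/instance-action` path, which returns HTTP 404 until an interruption is scheduled), the 5-second interval is arbitrary, and `wait_for_interruption` and its callback are hypothetical names. Saving state and draining connections belong in the `on_notice` callback.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable, Optional

IMDS = "http://169.254.169.254"


def fetch_instance_action() -> Optional[str]:
    """Return the instance-action JSON text, or None if no interruption is scheduled."""
    token_request = urllib.request.Request(
        f"{IMDS}/latest/api/token", method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"})
    with urllib.request.urlopen(token_request, timeout=2) as resp:
        token = resp.read().decode()
    request = urllib.request.Request(
        f"{IMDS}/latest/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token})
    try:
        with urllib.request.urlopen(request, timeout=2) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None  # no termination notice yet
        raise


def wait_for_interruption(fetch: Callable[[], Optional[str]] = fetch_instance_action,
                          on_notice: Callable[[dict], None] = print,
                          interval: float = 5.0,
                          max_polls: Optional[int] = None) -> bool:
    """Poll until a notice arrives, call on_notice with it, and return True.

    Returns False if max_polls is reached without a notice.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        body = fetch()
        if body is not None:
            on_notice(json.loads(body))
            return True
        polls += 1
        time.sleep(interval)
    return False
```

The `fetch` parameter is injectable so the loop can be exercised off-instance with a stub; on a real Spot instance the defaults apply.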


**Reference**: See `references/best_practices.md` → Compute Optimization → Spot Instances

---


Quick Reference: Cost Optimization Scripts

All Scripts Location

```bash
ls scripts/
# find_unused_resources.py
# analyze_ri_recommendations.py
# detect_old_generations.py
# spot_recommendations.py
# rightsizing_analyzer.py
# cost_anomaly_detector.py
```

Script Usage Patterns

**Monthly Review (Run all)**:

```bash
python3 scripts/find_unused_resources.py
python3 scripts/cost_anomaly_detector.py --days 30
python3 scripts/rightsizing_analyzer.py --days 30
```

**Quarterly Optimization**:

```bash
python3 scripts/analyze_ri_recommendations.py --days 60
python3 scripts/detect_old_generations.py
python3 scripts/spot_recommendations.py
```

**Specific Region Only**:

```bash
python3 scripts/find_unused_resources.py --region us-east-1
python3 scripts/rightsizing_analyzer.py --region us-west-2
```

**Named AWS Profile**:

```bash
python3 scripts/find_unused_resources.py --profile production
python3 scripts/cost_anomaly_detector.py --profile production --days 60
```
Script Requirements

```bash
# Install dependencies
pip install boto3 tabulate

# AWS credentials required
# Configure via: aws configure
# Or use: --profile PROFILE_NAME
```
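For readers writing their own script in the same style, the `--region`, `--profile`, and `--days` flags can be wired to a boto3 session roughly as follows. This is a sketch of one possible layout, not the scripts' actual source; `build_parser` and `make_session` are hypothetical names and the default of 30 days is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command-line flags matching the usage patterns shown above."""
    parser = argparse.ArgumentParser(description="Example cost analysis script")
    parser.add_argument("--region", default=None,
                        help="AWS region to scan (default: from AWS config)")
    parser.add_argument("--profile", default=None,
                        help="Named AWS profile (default: default credentials)")
    parser.add_argument("--days", type=int, default=30,
                        help="Lookback window in days")
    return parser


def make_session(args: argparse.Namespace):
    """Create a boto3 session from the parsed flags; requires boto3."""
    import boto3  # imported lazily so argument parsing works without it

    return boto3.Session(profile_name=args.profile, region_name=args.region)
```

Leaving `--region` and `--profile` as `None` lets boto3 fall back to the environment and `aws configure` settings.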

---


Service-Specific Optimization

Compute Optimization

Key Actions:

- Migrate to Graviton (20% savings)
- Use Spot for fault-tolerant workloads (70% savings)
- Purchase RIs for stable workloads (40-65% savings)
- Right-size oversized instances

Reference: `references/best_practices.md` → Compute Optimization

Storage Optimization

Key Actions:

- Convert gp2 → gp3 (20% savings)
- Implement S3 lifecycle policies (50-95% savings)
- Delete old snapshots
- Use S3 Intelligent-Tiering

Reference: `references/best_practices.md` → Storage Optimization
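An S3 lifecycle policy is a small configuration document. A minimal sketch that builds one in the shape accepted by boto3's `put_bucket_lifecycle_configuration`; the rule ID, the 30/90/365-day schedule, and the function name are assumptions chosen for illustration.

```python
def lifecycle_configuration(prefix: str = "",
                            ia_days: int = 30,
                            glacier_days: int = 90,
                            expire_days: int = 365) -> dict:
    """Build a rule: Standard → Standard-IA → Glacier → delete."""
    if not ia_days < glacier_days < expire_days:
        raise ValueError("expected ia_days < glacier_days < expire_days")
    return {
        "Rules": [{
            "ID": "tier-then-expire",          # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": prefix},      # empty prefix = whole bucket
            "Transitions": [
                {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                {"Days": glacier_days, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": expire_days},
        }]
    }
```

The result would be applied with `boto3.client("s3").put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=config)`. Expiration deletes objects permanently, so drop that key if the data must be retained.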

Network Optimization

Key Actions:

- Replace NAT Gateways with VPC Endpoints (save $25-30/month each)
- Use CloudFront to reduce data transfer costs
- Colocate resources in the same AZ when possible

Reference: `references/best_practices.md` → Network Optimization

Database Optimization

Key Actions:

- Right-size RDS instances
- Use gp3 storage (20% cheaper than gp2)
- Evaluate Aurora Serverless for variable workloads
- Purchase RDS Reserved Instances

Reference: `references/best_practices.md` → Database Optimization

Service Alternatives Decision Guide

Need help choosing between services?

**Question**: "Should I use EC2, Lambda, or Fargate?"
**Answer**: See `references/service_alternatives.md` → Compute Alternatives

**Question**: "Which S3 storage class should I use?"
**Answer**: See `references/service_alternatives.md` → Storage Alternatives

**Question**: "Should I use RDS or Aurora?"
**Answer**: See `references/service_alternatives.md` → Database Alternatives

**Question**: "NAT Gateway vs VPC Endpoint vs NAT Instance?"
**Answer**: See `references/service_alternatives.md` → Networking Alternatives

FinOps Governance & Process

Setting Up FinOps

**Phase 1: Foundation (Month 1)**

- Enable Cost Explorer
- Set up AWS Budgets
- Define tagging strategy
- Activate cost allocation tags

**Phase 2: Visibility (Months 2-3)**

- Implement tagging enforcement
- Run optimization scripts
- Set up monthly reviews
- Create team cost reports

**Phase 3: Culture (Ongoing)**

- Cost metrics in engineering KPIs
- Cost review in architecture decisions
- Regular optimization sprints
- FinOps champions in each team

**Full Guide**: See `references/finops_governance.md`
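Tagging enforcement starts with measuring compliance. A minimal sketch, assuming tags arrive in the `[{"Key": ..., "Value": ...}]` shape that EC2 and many other AWS APIs return; the required tag set is a hypothetical policy, not one defined by this skill.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical tagging policy; replace with your organization's required keys.
REQUIRED_TAGS = frozenset({"Owner", "Team", "Environment"})


def missing_tags(tags: Iterable[dict],
                 required: Iterable[str] = REQUIRED_TAGS) -> Set[str]:
    """Return required tag keys that are absent or have an empty value."""
    present = {t["Key"] for t in tags if t.get("Value")}
    return set(required) - present


def compliance_rate(resources: Dict[str, List[dict]],
                    required: Iterable[str] = REQUIRED_TAGS) -> float:
    """Percentage of resources carrying every required tag."""
    if not resources:
        return 100.0
    required = set(required)
    compliant = sum(1 for tags in resources.values()
                    if not missing_tags(tags, required))
    return 100.0 * compliant / len(resources)
```

Tracking this percentage month over month gives the tag compliance metric referred to in the MONITOR stage of the workflow.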

Monthly Review Process

**Week 1: Data Collection**

- Run all optimization scripts
- Export Cost & Usage Reports
- Compile findings

**Week 2: Analysis**

- Identify trends
- Find opportunities
- Prioritize actions

**Week 3: Team Reviews**

- Present to engineering teams
- Discuss optimizations
- Assign action items

**Week 4: Executive Reporting**

- Create executive summary
- Forecast next quarter
- Report optimization wins

**Template**: See `assets/templates/monthly_cost_report.md`

**Detailed Process**: See `references/finops_governance.md` → Monthly Review Process

Cost Optimization Checklist

Quick Wins (Do First)

- Delete unattached EBS volumes
- Delete old EBS snapshots (>90 days)
- Release unused Elastic IPs
- Convert gp2 → gp3 volumes
- Stop/terminate idle EC2 instances
- Enable S3 Intelligent-Tiering
- Set up AWS Budgets and alerts

Medium Effort (This Quarter)

- Right-size oversized instances
- Migrate to newer instance generations
- Purchase Reserved Instances for stable workloads
- Implement S3 lifecycle policies
- Replace NAT Gateways with VPC Endpoints (where applicable)
- Enable automated resource scheduling (dev/test)
- Implement tagging strategy and enforcement

Strategic Initiatives (Ongoing)

- Migrate to Graviton instances
- Implement Spot for fault-tolerant workloads
- Establish monthly cost review process
- Set up cost allocation by team
- Implement chargeback/showback model
- Create FinOps culture and practices

Troubleshooting Cost Issues

"My bill suddenly increased"

Run cost anomaly detection:

```bash
python3 scripts/cost_anomaly_detector.py --days 30
```

- Check Cost Explorer for service breakdown
- Review CloudTrail for resource creation events
- Check for Auto Scaling events
- Verify that no Reserved Instances have expired
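The service breakdown can also be pulled programmatically. A minimal sketch that queries Cost Explorer's `get_cost_and_usage` grouped by service and ranks the totals; the function names are hypothetical, pagination via `NextPageToken` is ignored for brevity, and only `fetch_costs_by_service` needs boto3 and credentials.

```python
from typing import Dict, List, Tuple


def fetch_costs_by_service(start: str, end: str) -> dict:
    """Query Cost Explorer for costs per service; dates are YYYY-MM-DD strings."""
    import boto3  # imported lazily so top_services works without it

    ce = boto3.client("ce")
    return ce.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )


def top_services(response: dict, limit: int = 5) -> List[Tuple[str, float]]:
    """Sum UnblendedCost per service across all periods and return the largest."""
    totals: Dict[str, float] = {}
    for period in response.get("ResultsByTime", []):
        for group in period.get("Groups", []):
            service = group["Keys"][0]
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[service] = totals.get(service, 0.0) + amount
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:limit]
```

Comparing the ranking for the current month against the previous one usually points at the service behind a sudden increase.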

"I need to reduce costs by X%"

Follow the optimization workflow:

1. Run all discovery scripts
2. Calculate total potential savings
3. Prioritize by: Savings Amount × (1 / Effort)
4. Focus on quick wins first
5. Implement strategic changes for long-term savings
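The prioritization formula, Savings Amount × (1 / Effort), can be applied directly to a list of findings. A minimal sketch, assuming effort is scored on a relative scale such as 1 (trivial) to 5 (major project); the `Opportunity` class and the scale are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Opportunity:
    name: str
    monthly_savings: float  # estimated USD per month
    effort: float           # relative effort, must be > 0

    @property
    def score(self) -> float:
        """Savings Amount × (1 / Effort)."""
        if self.effort <= 0:
            raise ValueError("effort must be positive")
        return self.monthly_savings * (1 / self.effort)


def prioritize(opportunities: List[Opportunity]) -> List[Opportunity]:
    """Highest score first."""
    return sorted(opportunities, key=lambda o: o.score, reverse=True)
```

For example, rightsizing worth $800/month at effort 2 scores 400 and ranks above deleting unattached volumes worth $300/month at effort 1 (score 300), which in turn ranks above a Graviton migration worth $1,000/month at effort 5 (score 200).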

"How do I know if Reserved Instances make sense?"

Run RI analysis:

```bash
python3 scripts/analyze_ri_recommendations.py --days 60
```

Look for:

- Instances running 60+ days consistently
- Workloads that won't change
- Savings > 30%

"Which resources can I safely delete?"

Run unused resource finder:

```bash
python3 scripts/find_unused_resources.py
```

Safe to delete (usually):

- Unattached EBS volumes (after verifying)
- Snapshots > 90 days (if backups exist elsewhere)
- Unused Elastic IPs (after verifying not in DNS)
- Stopped EC2 instances > 30 days (after confirming abandoned)

Always verify with resource owner before deletion!

Best Practices Summary

1. **Tag Everything**: Consistent tagging enables cost allocation and accountability
2. **Monitor Continuously**: Weekly script runs catch waste early
3. **Review Monthly**: Regular reviews prevent cost drift
4. **Right-size Proactively**: Don't wait for cost issues to optimize
5. **Use Commitments Wisely**: RIs/SPs for stable workloads only
6. **Test Before Migrating**: Especially for Graviton or Spot
7. **Automate Cleanup**: Scheduled shutdown of dev/test resources
8. **Share Wins**: Celebrate cost savings to build FinOps culture

Additional Resources

**Detailed References**:

- `references/best_practices.md`: Comprehensive optimization strategies
- `references/service_alternatives.md`: Cost-effective service selection
- `references/finops_governance.md`: Organizational FinOps practices

**Templates**:

- `assets/templates/monthly_cost_report.md`: Monthly reporting template

**Scripts**:

- All scripts in the `scripts/` directory accept `--help` for usage

**AWS Documentation**:

- AWS Cost Explorer: https://aws.amazon.com/aws-cost-management/aws-cost-explorer/
- AWS Budgets: https://aws.amazon.com/aws-cost-management/aws-budgets/
- FinOps Foundation: https://www.finops.org
