chaos-experiment

Chaos Experiment

Create and manage chaos experiments using Harness Chaos Engineering via MCP.

Instructions

Step 1: Check Infrastructure

Call MCP tool: harness_list
Parameters:
  resource_type: "chaos_infrastructure"
  org_id: "<organization>"
  project_id: "<project>"
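
The call above takes the same three parameters as the other list calls in this guide. As a minimal sketch, the parameter set could be assembled by a small helper. The function name `build_list_call` and the `tool`/`parameters` dictionary layout are hypothetical and chosen only for illustration; how the call is actually sent depends on the MCP client in use.

```python
from typing import Any, Dict


def build_list_call(resource_type: str, org_id: str, project_id: str) -> Dict[str, Any]:
    """Assemble the parameters for a harness_list call (hypothetical helper)."""
    # All three values appear in every list call in this guide, so reject blanks early.
    if not (resource_type and org_id and project_id):
        raise ValueError("resource_type, org_id and project_id are all required")
    return {
        "tool": "harness_list",  # tool name as used in this guide
        "parameters": {
            "resource_type": resource_type,
            "org_id": org_id,
            "project_id": project_id,
        },
    }


# Placeholder identifiers; substitute your own organization and project.
call = build_list_call("chaos_infrastructure", "my_org", "my_project")
```

The same helper would cover the template, experiment, run, and probe listings by changing only `resource_type`.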

Step 2: Browse Templates

Call MCP tool: harness_list
Parameters:
  resource_type: "chaos_experiment_template"
  org_id: "<organization>"
  project_id: "<project>"

Step 3: List Existing Experiments

Call MCP tool: harness_list
Parameters:
  resource_type: "chaos_experiment"
  org_id: "<organization>"
  project_id: "<project>"

Step 4: Create Experiment

Call MCP tool: harness_create
Parameters:
  resource_type: "chaos_experiment"
  org_id: "<organization>"
  project_id: "<project>"
  body: <experiment definition>

Step 5: Run Experiment

Call MCP tool: harness_execute
Parameters:
  resource_type: "chaos_experiment"
  action: "run"
  resource_id: "<experiment_id>"
  org_id: "<organization>"
  project_id: "<project>"

Step 6: Monitor Results

Call MCP tool: harness_list
Parameters:
  resource_type: "chaos_experiment_run"
  org_id: "<organization>"
  project_id: "<project>"

Get specific run details:

Call MCP tool: harness_get
Parameters:
  resource_type: "chaos_experiment_run"
  resource_id: "<run_id>"
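
Monitoring usually means repeating the run lookup until the run finishes. The following is a rough sketch of such a polling loop, not a definitive implementation. The `get_run` callable stands in for whatever wrapper issues the harness_get call, and the `status` field and the terminal state names are assumptions; check the actual run payload before relying on them.

```python
import time
from typing import Any, Callable, Dict, Iterable


def wait_for_run(
    get_run: Callable[[str], Dict[str, Any]],
    run_id: str,
    terminal_states: Iterable[str] = ("Completed", "Failed", "Stopped"),  # assumed names
    poll_seconds: float = 10.0,
    timeout_seconds: float = 1800.0,
) -> Dict[str, Any]:
    """Poll a run until it reaches a terminal state or the timeout expires."""
    terminal = set(terminal_states)
    deadline = time.monotonic() + timeout_seconds
    while True:
        run = get_run(run_id)
        # "status" is an assumed field name in the run details.
        if run.get("status") in terminal:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} did not finish within {timeout_seconds} seconds")
        time.sleep(poll_seconds)
```

Passing the lookup in as a callable keeps the loop independent of any particular MCP client and makes it easy to exercise with a fake.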

Step 7: Check Probes

Call MCP tool: harness_list
Parameters:
  resource_type: "chaos_probe"
  org_id: "<organization>"
  project_id: "<project>"

Common Experiment Types

  • Pod Delete - Kill pods to test recovery
  • Pod CPU Hog - Stress CPU to test throttling
  • Pod Memory Hog - Consume memory to test OOM handling
  • Pod Network Loss - Simulate network failures
  • Pod Network Latency - Add artificial latency
  • Node Drain - Drain K8s nodes
  • EC2 Stop - Stop AWS EC2 instances
  • ECS Task Stop - Stop ECS tasks

Chaos Resource Types

Resource Type              Operations                              Description
chaos_experiment           list, get, create, update, delete, run  Experiments
chaos_experiment_run       list, get                               Run history/results
chaos_experiment_template  list, get                               Pre-built templates
chaos_infrastructure       list, get                               Target infrastructure
chaos_probe                list, get                               Health probes
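
The resource types and operations listed above can be encoded as a lookup so that an unsupported combination is caught before any call is made. This is a small sketch; the names `SUPPORTED_OPERATIONS` and `is_supported` are hypothetical, and the contents simply mirror the table.

```python
from typing import Dict, Set

# Operations per resource type, copied from the table in this section.
SUPPORTED_OPERATIONS: Dict[str, Set[str]] = {
    "chaos_experiment": {"list", "get", "create", "update", "delete", "run"},
    "chaos_experiment_run": {"list", "get"},
    "chaos_experiment_template": {"list", "get"},
    "chaos_infrastructure": {"list", "get"},
    "chaos_probe": {"list", "get"},
}


def is_supported(resource_type: str, operation: str) -> bool:
    """Return True if the table lists this operation for the resource type."""
    return operation in SUPPORTED_OPERATIONS.get(resource_type, set())
```

For instance, `is_supported("chaos_probe", "create")` is false because the table lists only list and get for probes.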

Examples

  • "Show me all chaos experiments" - List chaos_experiment
  • "Create a pod-delete experiment for checkout-service" - Create chaos_experiment
  • "Run the weekly resilience test" - Execute run action
  • "What were the results of the last chaos run?" - Get chaos_experiment_run

Performance Notes

  • Review existing experiments before creating a new one, so that you do not create duplicates. Check for similar fault types targeting the same service.
  • Wait for experiment completion before analyzing results. Do not draw conclusions from partial runs.
  • Verify the target infrastructure and service are healthy before running chaos experiments.
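
The duplicate check in the first note can be sketched as a filter over the results of listing chaos_experiment. This is illustrative only: the field names `fault_type` and `target_service` are assumptions about what an experiment record contains, and `find_similar_experiments` is a hypothetical helper, so adapt both to the actual payload.

```python
from typing import Any, Dict, List


def find_similar_experiments(
    existing: List[Dict[str, Any]],
    fault_type: str,
    target_service: str,
) -> List[Dict[str, Any]]:
    """Return existing experiments with the same fault type and target service."""

    def norm(value: Any) -> str:
        # Compare case-insensitively and ignore surrounding whitespace.
        return str(value or "").strip().lower()

    return [
        exp
        for exp in existing
        # "fault_type" and "target_service" are assumed field names.
        if norm(exp.get("fault_type")) == norm(fault_type)
        and norm(exp.get("target_service")) == norm(target_service)
    ]
```

If the returned list is not empty, reusing or updating one of the matches is usually preferable to creating another experiment.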

Troubleshooting

Experiment Won't Run

  • Verify that the chaos infrastructure is connected and active
  • Check that the target application and namespace exist
  • Ensure the required RBAC permissions for chaos operations are granted

Probes Failing

  • Check that probe endpoints are accessible
  • Verify the probe timeout settings
  • Confirm that the probe type matches the expected behavior