Use when the user wants to deploy and run a prepared AWS FIS experiment. Triggers on "execute FIS experiment", "run FIS experiment", "start chaos experiment", "deploy FIS template", "启动 FIS 实验" (start FIS experiment), "运行混沌实验" (run chaos experiment), "执行故障注入实验" (execute fault injection experiment), "deploy and run the experiment in [directory]". Expects a prepared experiment directory (from aws-fis-experiment-prepare or manually created) containing experiment-template.json, iam-policy.json, cfn-template.yaml, and alarm configs. Deploys resources via CLI or CloudFormation, starts the experiment only after strict user confirmation, monitors progress, and generates a results report.
npx skill4agent add panlm/skills aws-fis-experiment-execute
AWS CLI command groups used: aws fis, aws iam, aws cloudwatch, aws cloudformation
digraph execute_flow {
"Load experiment directory" [shape=box];
"Validate files" [shape=box];
"Choose deployment method" [shape=diamond];
"CLI deployment" [shape=box];
"CFN deployment" [shape=box];
"User confirms deployment" [shape=diamond];
"Deploy resources" [shape=box];
"User confirms experiment start" [shape=diamond, style=bold, color=red];
"Start experiment" [shape=box];
"Monitor experiment" [shape=box];
"Experiment complete?" [shape=diamond];
"Generate results report" [shape=box];
"Load experiment directory" -> "Validate files";
"Validate files" -> "Choose deployment method";
"Choose deployment method" -> "CLI deployment" [label="CLI"];
"Choose deployment method" -> "CFN deployment" [label="CFN"];
"CLI deployment" -> "User confirms deployment";
"CFN deployment" -> "User confirms deployment";
"User confirms deployment" -> "Deploy resources" [label="Yes"];
"User confirms deployment" -> "Load experiment directory" [label="No, abort"];
"Deploy resources" -> "User confirms experiment start";
"User confirms experiment start" -> "Start experiment" [label="Yes, I confirm"];
"User confirms experiment start" -> "Generate results report" [label="No, abort"];
"Start experiment" -> "Monitor experiment";
"Monitor experiment" -> "Experiment complete?" ;
"Experiment complete?" -> "Monitor experiment" [label="No, poll again"];
"Experiment complete?" -> "Generate results report" [label="Yes"];
}
EXPERIMENT_DIR="{USER_PROVIDED_PATH}"
# Required files
ls "${EXPERIMENT_DIR}/experiment-template.json"
ls "${EXPERIMENT_DIR}/iam-policy.json"
ls "${EXPERIMENT_DIR}/cfn-template.yaml"
ls "${EXPERIMENT_DIR}/README.md"
ls "${EXPERIMENT_DIR}/expected-behavior.md"
# Optional files
ls "${EXPERIMENT_DIR}/alarms/stop-condition-alarms.json" 2>/dev/null
ls "${EXPERIMENT_DIR}/alarms/dashboard.json" 2>/dev/null
README.md
How would you like to deploy the experiment resources?
- AWS CLI — Step-by-step deployment with individual commands
- CloudFormation — All-in-one stack deployment
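The file checks listed earlier can be gathered into a single function that fails when a required file is absent, so deployment is never offered for an incomplete directory. This is a minimal sketch, not part of the skill itself: the function name `validate_experiment_dir` and the message wording are hypothetical, while the file names are the ones the checks above look for.

```bash
# Hypothetical helper (not part of the skill): verify that an experiment
# directory contains the files this workflow expects.
validate_experiment_dir() {
  local dir="$1"
  local missing=0
  local f

  # Required files: each missing one counts as an error.
  for f in experiment-template.json iam-policy.json cfn-template.yaml \
           README.md expected-behavior.md; do
    if [ ! -f "${dir}/${f}" ]; then
      echo "MISSING (required): ${f}" >&2
      missing=$((missing + 1))
    fi
  done

  # Optional files: absence is reported but does not fail validation.
  for f in alarms/stop-condition-alarms.json alarms/dashboard.json; do
    [ -f "${dir}/${f}" ] || echo "not found (optional): ${f}" >&2
  done

  # Succeed only when every required file is present.
  [ "$missing" -eq 0 ]
}
```

A caller might run `validate_experiment_dir "${EXPERIMENT_DIR}" || exit 1` before asking the deployment question. Diagnostics go to stderr so they do not mix with any output the caller captures.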
references/cli-commands.md
# Show command to user, wait for confirmation
aws iam create-role \
  --role-name "FISExperimentRole-{SCENARIO}" \
  --assume-role-policy-document '{...}' \
  --region {REGION}
aws iam put-role-policy \
  --role-name "FISExperimentRole-{SCENARIO}" \
  --policy-name FISExperimentPolicy \
  --policy-document "file://${EXPERIMENT_DIR}/iam-policy.json"
alarms/stop-condition-alarms.json
aws cloudwatch put-metric-alarm --cli-input-json '{...}' --region {REGION}
aws cloudwatch put-dashboard \
  --dashboard-name "FIS-{SCENARIO}" \
  --dashboard-body "file://${EXPERIMENT_DIR}/alarms/dashboard.json" \
  --region {REGION}
aws fis create-experiment-template \
  --cli-input-json "file://${EXPERIMENT_DIR}/experiment-template.json" \
  --region {REGION}
experimentTemplate.id
aws cloudformation deploy \
  --template-file "${EXPERIMENT_DIR}/cfn-template.yaml" \
  --stack-name "fis-{SCENARIO}-{TIMESTAMP}" \
  --capabilities CAPABILITY_NAMED_IAM \
  --region {REGION}
aws cloudformation wait stack-create-complete \
  --stack-name "fis-{SCENARIO}-{TIMESTAMP}" \
  --region {REGION}
TEMPLATE_ID=$(aws cloudformation describe-stacks \
  --stack-name "fis-{SCENARIO}-{TIMESTAMP}" \
  --query 'Stacks[0].Outputs[?OutputKey==`ExperimentTemplateId`].OutputValue' \
  --output text --region {REGION})
⚠️ WARNING: Starting this FIS experiment will cause REAL impact:
Scenario: {SCENARIO_NAME}
Region: {REGION}
Target AZ: {AZ_ID}
Duration: {DURATION}
Resources that WILL be affected:
- {list each affected resource type and count}
Stop Conditions:
- {list each alarm that will stop the experiment}
Type "Yes, start experiment" to proceed, or "No" to abort.
aws fis start-experiment \
  --experiment-template-id "{TEMPLATE_ID}" \
  --region {REGION}
experiment.id
aws fis get-experiment \
  --id "{EXPERIMENT_ID}" \
  --region {REGION} \
  --query '{
    State: experiment.state.status,
    Reason: experiment.state.reason,
    StartTime: experiment.startTime,
    EndTime: experiment.endTime,
    Actions: experiment.actions
  }'
expected-behavior.md
Experiment states: initiating, running, completed, stopping, stopped, failed
expected-behavior.md
aws fis stop-experiment --id "{EXPERIMENT_ID}" --region {REGION}
TIMESTAMP=$(date +%Y-%m-%d-%H-%M-%S)
SCENARIO_SLUG=$(echo "{SCENARIO_NAME}" | tr '[:upper:]' '[:lower:]' | tr ' :/' '-')
# File name: ${TIMESTAMP}-${SCENARIO_SLUG}-experiment-results.md
# Save the file in the experiment directory (${EXPERIMENT_DIR})
2025-03-30T14:05:32+08:00
05:05:32
expected-behavior.md
## FIS Experiment Results
**Experiment ID:** {EXPERIMENT_ID}
**Template ID:** {TEMPLATE_ID}
**Status:** {FINAL_STATUS}
**Start Time:** {START_TIME}
**End Time:** {END_TIME}
**Duration:** {ACTUAL_DURATION}
### Action Results
| Action | Action ID | Status | Start (UTC) | End (UTC) | Duration |
|---|---|---|---|---|---|
| {action_name} | {action_id} | {status} | {HH:MM:SS} | {HH:MM:SS} | {duration} |
### Stop Condition Alarms
| Alarm | Final Status |
|---|---|
| {alarm_name} | {OK/ALARM} |
### Per-Service Impact Analysis
For EACH service listed in expected-behavior.md, create a sub-section below.
Also include indirectly affected services (e.g., services impacted by network
disruption even without a dedicated FIS action).
#### {Service Name} ({resource_identifier})
| Time (UTC) | Event | Observation |
|---|---|---|
| {HH:MM:SS} | {event} | {what was observed at this point} |
| {HH:MM:SS} | {event} | {observed result / status change} |
| ... | ... | ... |
**Key Findings:**
- {finding_1 — what happened and why}
- {finding_2 — recovery behavior}
(Repeat for each service)
### Recovery Status Summary
| Resource | Recovery Status | Notes |
|---|---|---|
| {service} | {Recovered / Partially Recovered / Recovering} | {details} |
### Issues Requiring Attention
#### 1. {Issue title}
- **Problem:** {description}
- **Recommendation:** {action to take, with CLI command if applicable}
### Cleanup
{cleanup instructions with CLI commands}
# Delete experiment template
aws fis delete-experiment-template --id "{TEMPLATE_ID}" --region {REGION}
# Delete CloudWatch alarms
aws cloudwatch delete-alarms --alarm-names "FIS-StopCondition-{SCENARIO}-{SERVICE}" --region {REGION}
# Delete CloudWatch dashboard
aws cloudwatch delete-dashboards --dashboard-names "FIS-{SCENARIO}" --region {REGION}
# Delete IAM role
aws iam delete-role-policy --role-name "FISExperimentRole-{SCENARIO}" --policy-name FISExperimentPolicy
aws iam delete-role --role-name "FISExperimentRole-{SCENARIO}"
aws cloudformation delete-stack --stack-name "fis-{SCENARIO}-{TIMESTAMP}" --region {REGION}
| Error | Cause | Resolution |
|---|---|---|
| | Insufficient permissions | Check IAM policy in iam-policy.json |
| | Invalid template JSON | Validate with |
| | Tagged resources not found | Verify resource tags match template |
| Alarm creation fails | Metric/namespace mismatch | Check metric name and namespace exist |
| Stack creation fails | CFN template validation error | Run |
| Experiment stuck in | IAM role propagation delay | Wait 30 seconds and check again |
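The "wait and check again" advice in the table, together with the monitoring loop in the flow diagram, can be sketched as a polling function around `aws fis get-experiment`. This is only an illustration under stated assumptions: the function name `wait_for_experiment`, the 30-second default interval, the attempt cap, and the return codes are hypothetical choices, and only the terminal states named in this document (completed, stopped, failed) are handled.

```bash
# Hypothetical helper (not part of the skill): poll an FIS experiment until
# it reaches one of the terminal states listed in this document.
# Assumes the AWS CLI is installed and configured with credentials.
# Returns 0 on completed, 1 on stopped/failed, 2 on CLI error, 3 on timeout.
wait_for_experiment() {
  local experiment_id="$1"
  local region="$2"
  local interval="${3:-30}"       # seconds between polls (assumed default)
  local max_attempts="${4:-120}"  # upper bound on polls (assumed default)
  local attempt=0
  local status

  while [ "$attempt" -lt "$max_attempts" ]; do
    # Ask only for the status string; a CLI failure aborts the loop.
    status=$(aws fis get-experiment \
      --id "$experiment_id" \
      --region "$region" \
      --query 'experiment.state.status' \
      --output text) || return 2
    echo "status: ${status}"

    case "$status" in
      completed) return 0 ;;
      stopped|failed) return 1 ;;
    esac

    # Any other state (initiating, running, stopping): poll again.
    attempt=$((attempt + 1))
    sleep "$interval"
  done

  echo "timed out waiting for experiment ${experiment_id}" >&2
  return 3
}
```

Usage might look like `wait_for_experiment "{EXPERIMENT_ID}" {REGION}`, branching on the return code to either generate the report or consult the table above. The attempt cap keeps an experiment that never leaves its initial state from blocking the session indefinitely.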