# eks-app-log-analysis

EKS App Log Analysis

Analyze EKS application logs during FIS fault injection experiments to understand how applications respond to infrastructure failures. Supports real-time monitoring and post-hoc analysis modes.
## Output Language Rule

Detect the language of the user's conversation and use the same language for all output:

- Chinese input → Chinese output
- English input → English output
## Prerequisites

Required tools:

- kubectl — configured with access to the target EKS cluster
- AWS CLI — for querying FIS experiment status
- A prepared/executed FIS experiment directory (from aws-fis-experiment-prepare or aws-fis-experiment-execute)
## Workflow

```dot
digraph log_analysis_flow {
  "Receive input path" [shape=box];
  "Detect mode" [shape=diamond];
  "Real-time mode" [shape=box];
  "Post-hoc mode" [shape=box];
  "Read service list" [shape=box];
  "Ask user for app dependencies" [shape=box];
  "Start background log collection" [shape=box];
  "Batch fetch historical logs" [shape=box];
  "Frontend polling + insight display" [shape=box];
  "Experiment complete?" [shape=diamond];
  "Generate analysis report" [shape=box];

  "Receive input path" -> "Detect mode";
  "Detect mode" -> "Real-time mode" [label="directory with README"];
  "Detect mode" -> "Post-hoc mode" [label="*-experiment-results.md"];
  "Real-time mode" -> "Read service list";
  "Post-hoc mode" -> "Read service list";
  "Read service list" -> "Ask user for app dependencies";
  "Ask user for app dependencies" -> "Start background log collection" [label="real-time"];
  "Ask user for app dependencies" -> "Batch fetch historical logs" [label="post-hoc"];
  "Start background log collection" -> "Frontend polling + insight display";
  "Frontend polling + insight display" -> "Experiment complete?";
  "Experiment complete?" -> "Frontend polling + insight display" [label="No, continue"];
  "Experiment complete?" -> "Generate analysis report" [label="Yes"];
  "Batch fetch historical logs" -> "Generate analysis report";
}
```
## Step 1: Detect Mode and Load Context
The user provides either:

- A directory path (e.g., `./az-power-interruption-2026-03-31-14-30-22/`) → Real-time mode
- A report file path (e.g., `./2026-03-31-...-experiment-results.md`) → Post-hoc mode
### Real-time Mode Detection

```bash
# Check if input is a directory with README.md
if [ -d "${INPUT_PATH}" ] && [ -f "${INPUT_PATH}/README.md" ]; then
  MODE="realtime"
  # Extract template ID from README (two possible key formats)
  TEMPLATE_ID=$(grep -oP 'experiment-template-id["\s:]+\K[A-Za-z0-9-]+' "${INPUT_PATH}/README.md" ||
    grep -oP 'ExperimentTemplateId["\s:]+\K[A-Za-z0-9-]+' "${INPUT_PATH}/README.md")
  REGION=$(grep -oP 'Region:["\s]+\K[a-z0-9-]+' "${INPUT_PATH}/README.md")
fi
```
### Post-hoc Mode Detection
```bash
# Check if input is an experiment results file
if [ -f "${INPUT_PATH}" ] && grep -q "FIS Experiment Results" "${INPUT_PATH}"; then
  MODE="posthoc"
  # Extract time range from report
  START_TIME=$(grep -oP 'Start Time:["\s]+\K[0-9T:+-]+' "${INPUT_PATH}")
  END_TIME=$(grep -oP 'End Time:["\s]+\K[0-9T:+-]+' "${INPUT_PATH}")
  EXPERIMENT_ID=$(grep -oP 'Experiment ID:["\s]+\K[A-Za-z0-9-]+' "${INPUT_PATH}")
fi
```
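The two checks above can be folded into a single dispatcher so callers get one mode value; a minimal sketch (the `detect_mode` helper name is ours, not from the source):

```shell
# Sketch: classify an input path as realtime, posthoc, or unknown,
# using the same two checks described above.
detect_mode() {
  local input_path="$1"
  if [ -d "${input_path}" ] && [ -f "${input_path}/README.md" ]; then
    echo "realtime"
  elif [ -f "${input_path}" ] && grep -q "FIS Experiment Results" "${input_path}"; then
    echo "posthoc"
  else
    echo "unknown"
    return 1
  fi
}
```

The nonzero return on `unknown` lets a caller fail fast and ask the user for a valid path instead of silently proceeding.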
## Step 2: Read Service List
Extract affected services from `expected-behavior.md` (real-time) or the experiment report (post-hoc).

```bash
# From expected-behavior.md - look for "### {Service Name}" sections
grep -oP '### \K[A-Za-z0-9 ]+(?= \()' "${EXPERIMENT_DIR}/expected-behavior.md"

# From the experiment report - look for "#### {Service Name}" in Per-Service Impact Analysis
grep -oP '#### \K[A-Za-z0-9 ]+(?= \()' "${REPORT_PATH}"
```

Present the service list to the user:

```
Detected affected services from the experiment:
- RDS (cluster-xxx)
- ElastiCache (redis-xxx)
- EC2 (instances in ap-northeast-1a)

For each service, please provide the EKS applications that depend on it.
```
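The grep output can be loaded straight into the `SERVICES` array used later for per-service grouping; a sketch (requires bash 4+ for `mapfile`, and the `load_services` name is illustrative):

```shell
# Sketch: extract service names from "### Name (...)" headings into an array,
# one element per heading line.
load_services() {
  mapfile -t SERVICES < <(grep -oP '### \K[A-Za-z0-9 ]+(?= \()' "$1")
}
```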
## Step 3: Collect Application Dependencies
For each service, ask the user to provide the dependent applications:

```
Which EKS applications depend on RDS (cluster-xxx)?
Please provide in format: namespace/deployment or namespace/pod-label-selector
Example: default/app-backend, production/api-server
```

Store the mapping:

```yaml
SERVICE_APP_MAP:
  rds-cluster-xxx:
    - namespace: default
      deployment: app-backend
    - namespace: production
      deployment: api-server
  elasticache-redis-xxx:
    - namespace: default
      deployment: cache-layer
```
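The user's free-form reply is worth validating before use; a sketch that splits a comma-separated answer into the `APPS` array consumed in Step 4 (the `parse_app_list` name is illustrative, not from the source):

```shell
# Sketch: split "ns/deploy, ns/deploy" replies into APPS, dropping entries
# that do not match the namespace/deployment shape.
parse_app_list() {
  local raw="$1" entry
  APPS=()
  IFS=',' read -ra _entries <<< "$raw"
  for entry in "${_entries[@]}"; do
    entry="${entry#"${entry%%[![:space:]]*}"}"   # trim leading whitespace
    entry="${entry%"${entry##*[![:space:]]}"}"   # trim trailing whitespace
    if [[ "$entry" == */* ]]; then
      APPS+=("$entry")
    else
      echo "Skipping '${entry}': expected namespace/deployment" >&2
    fi
  done
}
```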
## Step 4: Log Collection
### Real-time Mode: Background Collection

For each application, start a background `kubectl logs -f` process:

```bash
# Create output directory structure
LOG_DIR="${EXPERIMENT_DIR}/app-logs"
mkdir -p "${LOG_DIR}/rds-cluster-xxx"
mkdir -p "${LOG_DIR}/elasticache-redis-xxx"

# Start background log collection for each app
for app in "${APPS[@]}"; do
  NAMESPACE=$(echo "$app" | cut -d'/' -f1)
  DEPLOYMENT=$(echo "$app" | cut -d'/' -f2)
  SERVICE_DIR="${LOG_DIR}/${SERVICE_NAME}"

  kubectl logs -f "deployment/${DEPLOYMENT}" -n "${NAMESPACE}" \
    --timestamps --all-containers=true \
    >> "${SERVICE_DIR}/${DEPLOYMENT}.log" 2>&1 &

  # Store PID for cleanup
  echo $! >> "${LOG_DIR}/.pids"
done
```
### Post-hoc Mode: Batch Fetch
```bash
# Fetch logs for the experiment time window
kubectl logs "deployment/${DEPLOYMENT}" -n "${NAMESPACE}" \
  --timestamps --all-containers=true \
  --since-time="${START_TIME}" \
  > "${SERVICE_DIR}/${DEPLOYMENT}.log" 2>&1
```
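`kubectl logs` accepts `--since-time` but has no end-of-window flag, so the fetched file can include lines from after the experiment ended. A hedged trim step using the `--timestamps` prefix (RFC3339 timestamps in the same time zone compare correctly as plain strings; the `trim_to_window` name is ours):

```shell
# Sketch: keep only lines whose leading timestamp falls inside [start, end].
# Relies on --timestamps putting an RFC3339 timestamp in field 1.
trim_to_window() {
  local start="$1" end="$2"
  awk -v s="$start" -v e="$end" '$1 >= s && $1 <= e'
}
```

Usage: `trim_to_window "$START_TIME" "$END_TIME" < "${SERVICE_DIR}/${DEPLOYMENT}.log" > trimmed.log`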
## Step 5: Real-time Monitoring Display
Poll every 30 seconds and display insights per service group:

```bash
while experiment_is_running; do
  clear_screen_section

  for SERVICE in "${SERVICES[@]}"; do
    echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
    echo "[$(date +%H:%M:%S)] ${SERVICE} Impact Analysis"
    echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

    for APP in ${SERVICE_APPS[$SERVICE]}; do
      LOG_FILE="${LOG_DIR}/${SERVICE}/${APP}.log"

      # Get the last 30 seconds of logs (UTC, matching kubectl --timestamps)
      RECENT_LOGS=$(tail -1000 "$LOG_FILE" | awk -v cutoff="$(date -u -d '30 seconds ago' +%Y-%m-%dT%H:%M:%S)" '$1 >= cutoff')

      # Count errors and warnings
      ERROR_COUNT=$(echo "$RECENT_LOGS" | grep -ciE 'error|exception|fail|refused|timeout')
      WARN_COUNT=$(echo "$RECENT_LOGS" | grep -ciE 'warn|retry')

      echo ""
      echo "▶ ${APP} (last 30s: ${ERROR_COUNT} errors, ${WARN_COUNT} warnings)"
      echo "┌─────────────────────────────────────────────────────────────┐"
      # Show the 5 most recent error-related lines
      echo "$RECENT_LOGS" | grep -iE 'error|exception|fail|refused|timeout' | tail -5
      echo "└─────────────────────────────────────────────────────────────┘"

      # Generate insight
      if [ "$ERROR_COUNT" -gt 0 ]; then
        FIRST_ERROR=$(echo "$RECENT_LOGS" | grep -iE 'error|exception' | head -1 | cut -d' ' -f1)
        LAST_ERROR=$(echo "$RECENT_LOGS" | grep -iE 'error|exception' | tail -1 | cut -d' ' -f1)
        echo "💡 Insight: ${ERROR_COUNT} errors between ${FIRST_ERROR} - ${LAST_ERROR}"

        # Detect recovery
        if echo "$RECENT_LOGS" | tail -5 | grep -qiE 'connected|restored|success|recovered'; then
          echo "✅ Recovery signal detected in recent logs"
        fi
      else
        echo "✅ No errors detected"
      fi
    done
  done

  sleep 30
done
```
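One portability caveat: `date -d '30 seconds ago'` is GNU coreutils syntax and fails on BSD/macOS `date`. A hedged helper that tries the GNU form first and falls back to the BSD flag:

```shell
# Sketch: UTC cutoff 30 seconds in the past, GNU date first, BSD fallback.
cutoff_30s_ago() {
  date -u -d '30 seconds ago' +%Y-%m-%dT%H:%M:%S 2>/dev/null \
    || date -u -v-30S +%Y-%m-%dT%H:%M:%S
}
```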
## Step 6: Check Experiment Status (Real-time Mode)
```bash
check_experiment_status() {
  # Query running experiments for this template
  RUNNING=$(aws fis list-experiments \
    --query "experiments[?experimentTemplateId=='${TEMPLATE_ID}' && state.status=='running']" \
    --region "${REGION}" --output json)

  if [ "$(echo "$RUNNING" | jq length)" -gt 0 ]; then
    return 0  # Still running
  else
    return 1  # Completed or not started
  fi
}
```
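Step 5's `experiment_is_running` guard can wrap this check, with a deadline so monitoring cannot spin forever; a sketch (the 2-hour default cap is an assumption, not from the source):

```shell
# Sketch: poll until the experiment leaves the "running" state, or give up
# after a deadline. Takes an optional cap in seconds (default 7200 = 2h).
wait_for_experiment() {
  local deadline=$(( $(date +%s) + ${1:-7200} ))
  while check_experiment_status; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "Timed out waiting for the FIS experiment to finish" >&2
      return 1
    fi
    sleep 30
  done
}
```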
## Step 7: Generate Analysis Report
After the experiment completes (or immediately in post-hoc mode), generate the report:

```bash
TIMESTAMP=$(date +%Y-%m-%d-%H-%M-%S)
REPORT_FILE="${EXPERIMENT_DIR}/${TIMESTAMP}-app-log-analysis.md"
```

Report structure:

```markdown
# Application Log Analysis Report

Experiment ID: {EXPERIMENT_ID}
Analysis Time: {TIMESTAMP}
Time Range: {START_TIME} - {END_TIME}
Duration: {DURATION}

## Summary

| Service | Application | Total Errors | Peak Error Rate | Recovery Time |
|---|---|---|---|---|
| {service} | {app} | {count} | {rate}/min | {time} |

## Per-Service Application Analysis

### {Service Name} ({resource_id})

#### {Application Name} ({namespace}/{deployment})

Error Timeline:

| Time (UTC) | Level | Message |
|---|---|---|
| {HH:MM:SS} | ERROR | {truncated message} |
| ... | ... | ... |

Key Error Patterns:

| Pattern | Count | First Occurrence | Last Occurrence |
|---|---|---|---|
| Connection refused | {n} | {time} | {time} |
| Timeout | {n} | {time} | {time} |

Log Sample (Critical Errors):

{5-10 lines of actual error logs}

Insights:
- {insight_1}: Error spike at {time}, correlates with {service} failover
- {insight_2}: Recovery detected at {time}, {duration} after fault injection ended
- {insight_3}: Application retry mechanism worked/failed because...

(Repeat for each application)

## Cross-Service Correlation

| Time | Event | RDS Impact | ElastiCache Impact | Application Response |
|---|---|---|---|---|
| {time} | Fault injection start | - | - | First errors appear |
| {time} | {service} failover | Connection errors | - | Retrying... |
| {time} | Recovery | Connections restored | - | Normal operation |

## Recommendations

- {Issue}: {description}
  - Impact: {what happened}
  - Recommendation: {what to improve}

## Appendix: Log File Locations

| Application | Log File |
|---|---|
| {app} | |
```
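The Summary table's "Peak Error Rate" column can be computed from the collected log files; a minimal sketch that buckets error-pattern lines per minute (assumes the RFC3339 `--timestamps` prefix from Step 4; the `peak_error_rate` name is ours):

```shell
# Sketch: count error-pattern lines per minute and print the busiest bucket
# as "<count> <YYYY-MM-DDTHH:MM>". cut -c1-16 keeps minute granularity.
peak_error_rate() {
  grep -iE 'error|exception|fail|refused|timeout' "$1" \
    | cut -c1-16 \
    | sort | uniq -c | sort -rn | head -1 \
    | awk '{print $1, $2}'
}
```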
## Step 8: Cleanup (Real-time Mode)
Stop all background log collection processes:

```bash
cleanup_log_collectors() {
  if [ -f "${LOG_DIR}/.pids" ]; then
    while read -r pid; do
      kill "$pid" 2>/dev/null
    done < "${LOG_DIR}/.pids"
    rm "${LOG_DIR}/.pids"
  fi
}

# Register cleanup on exit
trap cleanup_log_collectors EXIT
```
## Error Handling
| Error | Cause | Resolution |
|---|---|---|
| kubectl not installed | - | Install kubectl and configure kubeconfig |
| kubeconfig not configured | - | Run the kubeconfig setup command for the cluster |
| Deployment/pod doesn't exist | - | Verify deployment name and namespace |
| Pod not running or restarted | - | Check pod status; may need to fetch from CloudWatch Logs |
| Template ID not found | README format changed | Manually provide template ID |
## Output Files

```
{experiment-dir}/
├── app-logs/
│   ├── rds-cluster-xxx/
│   │   ├── app-backend.log
│   │   └── api-server.log
│   ├── elasticache-redis-xxx/
│   │   └── cache-layer.log
│   └── .pids (temporary, cleaned up)
└── {timestamp}-app-log-analysis.md
```
## Usage Examples
### Real-time monitoring (during experiment)

- "Analyze app logs for ./az-power-interruption-2026-03-31-14-30-22/"
- "Monitor application behavior in the experiment directory"
- "Monitor application logs in real time"

### Post-hoc analysis (after experiment)

- "Analyze app logs using ./2026-03-31-14-35-00-az-power-interruption-experiment-results.md"
- "Analyze application behavior from the experiment report"
- "Check what happened to applications during the experiment"
## Integration with Other Skills

- aws-fis-experiment-prepare — reads `README.md` and `expected-behavior.md` for context
- aws-fis-experiment-execute — reads `*-experiment-results.md` for the time range and service list
- Does NOT modify any files from other skills