analyze-github-action-logs


# Analyze GitHub Action Logs

Fetch and analyze recent GitHub Actions runs for a given workflow. Review agent/step performance, identify wasted effort and mistakes, and produce a report with actionable improvements.

## Input

You need:

- `workflow` (required) — The workflow file name or ID (e.g., `issue-triage.yml`, `deploy.yml`).
- `repo` (optional) — The GitHub repository in `OWNER/REPO` format. Defaults to `withastro/astro`.
- `count` (optional) — Number of recent completed runs to analyze. Defaults to `5`.
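The defaults above can be captured in a small wrapper. This is a hypothetical sketch, not part of the skill itself; `resolve_inputs` is an invented name:

```bash
# Hypothetical helper: resolves the three inputs described above,
# applying the documented defaults for repo and count.
resolve_inputs() {
  workflow="${1:?workflow is required}"   # e.g. issue-triage.yml
  repo="${2:-withastro/astro}"            # OWNER/REPO, optional
  count="${3:-5}"                         # number of runs, optional
  printf '%s %s %s\n' "$workflow" "$repo" "$count"
}

resolve_inputs issue-triage.yml
```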

## Step 1: List Recent Runs

Fetch the most recent completed runs for the workflow. Filter by `--status=completed`:

```bash
gh run list --workflow=<workflow> -R <repo> --status=completed -L <count>
```
Present the list to orient yourself: run IDs, titles, status (success/failure), and duration. Pick the runs to analyze — prefer a mix of successes and failures if available, and prefer runs that exercised more steps (longer runs tend to go through more stages, while shorter runs may exit early).
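Picking a mixed sample can be scripted. A self-contained sketch: the tab-separated layout below is invented sample data in the rough shape of a run list, not the exact `gh run list` output format (use `gh run list --json` for stable fields):

```bash
# Invented sample data: conclusion, title, run ID, elapsed time.
cat > /tmp/runs.tsv <<'EOF'
success	Fix typo	1001	2m10s
failure	Triage issue	1002	14m02s
success	Deploy docs	1003	1m05s
failure	Triage issue	1004	9m44s
EOF

# Take the first success and the first failure: a minimal mixed sample.
awk -F'\t' '$1 == "success" && !s { print $3; s = 1 }
            $1 == "failure" && !f { print $3; f = 1 }' /tmp/runs.tsv
```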

## Step 2: Fetch Logs

For each run you want to analyze, save the full log to a temp file:

```bash
gh run view <run_id> -R <repo> --log > /tmp/actions-run-<run_id>.log
```

## Step 3: Identify Step/Skill Boundaries

Search each log file for markers that indicate where each step or skill starts and ends. The markers depend on the workflow — look for patterns like:

- **Flue skill markers:** `[flue] skill("..."): starting` / `completed`
- **GitHub Actions step markers:** step name headers in the log output
- **Custom markers:** any `START`/`END` or similar delimiters the workflow uses

```bash
grep -n "skill(\|step\|START\|END\|starting\|completed" /tmp/actions-run-<run_id>.log | head -50
```

From this, determine which line ranges correspond to each step/skill. Also find any result markers:

```bash
grep -n "RESULT_START\|RESULT_END\|extractResult" /tmp/actions-run-<run_id>.log
```

Note: some log files may contain binary/null bytes. Use `grep -a` if needed.
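As a self-contained illustration, here is how the grep recovers boundaries from a fake log. The flue marker format is assumed from the pattern above:

```bash
# Fake log using the assumed flue marker format.
cat > /tmp/actions-run-demo.log <<'EOF'
setup output
[flue] skill("triage"): starting
...agent output...
[flue] skill("triage"): completed
[flue] skill("label"): starting
...agent output...
[flue] skill("label"): completed
EOF

# -a treats the file as text even if it contains null bytes;
# -n gives the line numbers that bound each step's range.
grep -an 'skill(' /tmp/actions-run-demo.log
```

Pairs of `starting`/`completed` line numbers bound each step's slice of the log.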

## Step 4: Analyze Each Step (Use Subagents)

For each step/skill that ran, launch a subagent to analyze that section's log. This is critical to avoid polluting your context with thousands of log lines.
For each subagent, provide:
  1. The log file path and the line range for that step
  2. If skill instruction files exist for the workflow, tell the subagent to read them first for context
  3. The run title/context so the subagent understands what was being done
  4. The analysis criteria below
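A prompt for one subagent might be assembled like this. The path, line range, and run title below are invented placeholders; real values come from Steps 2 and 3:

```bash
# All values below are invented placeholders.
log=/tmp/actions-run-12345.log
start=120
end=480
title="Triage issue"

cat > /tmp/subagent-prompt.txt <<EOF
Analyze lines $start-$end of $log (one step of the run: $title).
If skill instruction files exist for this workflow, read them first.
Evaluate: correctness, efficiency, mistakes, instruction compliance, scope creep.
Return: Summary, Time Analysis, Issues Found (with time wasted), Suggestions.
EOF

cat /tmp/subagent-prompt.txt
```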

### Analysis Criteria

Tell each subagent to evaluate:

1. **Correctness** — Was the step's final result/verdict correct?
2. **Efficiency** — How long did it take? What's a reasonable baseline? Where was time wasted?
3. **Mistakes** — Wrong tool calls, failed commands retried without changes, unnecessary rebuilds, etc.
4. **Instruction compliance** — If skill instructions exist, did the agent follow them? Where did it deviate?
5. **Scope creep** — Did the agent do work that belongs in a different step?
6. **Suggestions** — Specific, actionable changes that would prevent the issues found.
Tell each subagent to return a structured response with: Summary, Time Analysis, Issues Found (with estimated time wasted for each), and Suggestions for Improvement.

## Step 5: Consolidate Report

After all subagents return, synthesize their findings into a single report. Structure it as:

### Per-Run Summary Table

For each run analyzed, include a table:

| Step/Skill | Time | Result | Time Wasted | Top Issue |
| --- | --- | --- | --- | --- |

### Cross-Cutting Patterns

Identify issues that appeared across multiple runs or multiple steps. These are the highest-value improvements. Common patterns to look for:

- **TodoWrite abuse** — Agent wasting time on task list management during automated runs
- **Server management failures** — Port conflicts, failed process kills, stale log files
- **Tool misuse** — Using `curl` instead of `gh`, `jq` not found, etc.
- **Scope creep** — One step doing work that belongs in another
- **Unnecessary rebuilds** — Building packages multiple times without changes
- **Test timeouts** — Running slow E2E/Playwright tests that time out
- **Instruction violations** — Agent doing something the instructions explicitly forbid
- **Redundant work** — Re-reading files, re-running searches, re-installing dependencies

### Prioritized Recommendations

Rank your improvement suggestions by estimated time savings across all runs. For each recommendation:

1. **What to change** — Which file(s) to edit and what to add/modify
2. **Why** — What pattern it addresses, with evidence from the runs
3. **Estimated impact** — How much time it would save per run
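Ranking by total estimated savings can be mechanical. A sketch over an invented findings file, where each line is one finding (issue name, minutes wasted):

```bash
# Invented example data: issue<TAB>minutes wasted, one line per finding.
cat > /tmp/findings.tsv <<'EOF'
unnecessary-rebuild	6
todowrite-overhead	2
unnecessary-rebuild	5
test-timeout	10
EOF

# Sum minutes per issue, then sort so the biggest total saving comes first.
awk -F'\t' '{ sum[$1] += $2 }
            END { for (k in sum) printf "%d\t%s\n", sum[k], k }' /tmp/findings.tsv | sort -rn
```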

## Output

Present the full consolidated report. Do NOT edit any workflow or skill files — only report findings and recommendations. The user will decide which changes to apply.