optimize-runbook

Optimize Runbook

You are analyzing previous Jetty workflow runs to identify patterns and propose targeted improvements to a local runbook. The goal is to produce specific, evidence-backed changes — not generic advice.

Cross-Agent Compatibility

This skill uses `AskUserQuestion` for interactive choices. If running in an environment where `AskUserQuestion` is not available, replace each call with a direct question in your text output.

Step 1: Identify the Runbook

  1. Look for runbook files in the current directory:
     ```bash
     ls -la RUNBOOK*.md 2>/dev/null
     ```
  2. If multiple runbooks are found, use AskUserQuestion:
    • Header: "Runbook"
    • Question: "Multiple runbooks found. Which one do you want to optimize?"
    • Options: list each filename
  3. Read the chosen runbook with the Read tool. Extract from frontmatter:
    • `version`, `evaluation` (programmatic or rubric)
    • `agent`, `model`, `snapshot` (if present)
  4. Parse the evaluation section:
    • Programmatic: extract the PASS/PARTIAL/FAIL criteria table
    • Rubric: extract the rubric table (criteria, score descriptions, pass threshold)
  5. Identify the collection and task name. Check the skill argument first, then look in the runbook for Jetty API references. If not found, use AskUserQuestion:
    • Header: "Collection/Task"
    • Question: "Which Jetty collection and task does this runbook run as? (format: collection/task_name)"
    • Options:
      • "I'll type it" / "Let me enter the collection and task name"
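
The frontmatter extraction in the list above can also be done from the shell. The following is a minimal sketch, assuming the runbook opens with a frontmatter block delimited by `---` lines and that each field is a plain one-line `key: value` pair. The helper name `frontmatter_field` is hypothetical and not part of Jetty.

```bash
# Hypothetical helper: print the value of a top-level frontmatter key.
# Assumes the file starts with a '---' line, the frontmatter ends at the
# next '---' line, and values are simple one-line scalars (quotes are
# printed as-is, nested YAML is not handled).
frontmatter_field() {
  local file="$1" key="$2"
  awk -v key="$key" '
    NR == 1 && $0 != "---" { exit }    # file has no frontmatter block
    NR > 1 && $0 == "---"  { exit }    # reached the end of the frontmatter
    NR > 1 {
      colon = index($0, ":")
      if (colon > 0 && substr($0, 1, colon - 1) == key) {
        value = substr($0, colon + 1)
        sub(/^[ \t]+/, "", value)      # trim leading whitespace
        sub(/[ \t]+$/, "", value)      # trim trailing whitespace
        print value
        exit
      }
    }
  ' "$file"
}

# Usage: frontmatter_field RUNBOOK.md version
```

An empty result means the field is absent, which is the case where Step 5 proposes adding it.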

Step 2: Fetch Trajectories

Parse the skill argument for trajectory IDs or `--last N`. If neither is provided, use AskUserQuestion:
  • Header: "Trajectories"
  • Question: "How many recent runs should I analyze?"
  • Options:
    • "Last 5 runs" / "Analyze the 5 most recent trajectories"
    • "Last 10 runs" / "Analyze the 10 most recent trajectories"
    • "Specific IDs" / "I'll paste trajectory IDs"
Fetch the trajectory list:
```bash
TOKEN="$(cat ~/.config/jetty/token)"
COLLECTION="the-collection"
TASK="the-task"
LIMIT=5  # number of recent runs to fetch (from --last N or the user's answer)
curl -s -H "Authorization: Bearer $TOKEN" \
  "https://flows-api.jetty.io/api/v1/db/trajectories/$COLLECTION/$TASK?limit=$LIMIT"
```
Parse the response. Its format is `{"trajectories": [...], "total": N}`.
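
As a quick sanity check on the list response, the two documented keys can be compared to see whether fewer runs came back than exist. This is a small sketch that assumes `jq` is installed. The function name `summarize_trajectory_list` and the file name `list.json` are hypothetical.

```bash
# Summarize a trajectory-list response read from stdin.
# Uses only the two keys documented for this response:
# .trajectories (array) and .total (number).
summarize_trajectory_list() {
  jq -r '"fetched \(.trajectories | length) of \(.total) trajectories"'
}

# Usage (assuming the curl output was saved to list.json):
#   summarize_trajectory_list < list.json
```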
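For each trajectory, fetch full details:
```bash
TOKEN="$(cat ~/.config/jetty/token)"
COLLECTION="the-collection"
TASK="the-task"
TRAJECTORY_ID="the-trajectory-id"  # an ID taken from the list response
curl -s -H "Authorization: Bearer $TOKEN" \
  "https://flows-api.jetty.io/api/v1/db/trajectory/$COLLECTION/$TASK/$TRAJECTORY_ID"
```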
Extract and record for each:
  • Status: completed / failed / timed_out
  • Duration: total execution time
  • Step outputs: iterate over the keys of the `.steps` object
  • Errors: any error messages in failed steps
  • Labels: any quality labels applied
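
Because `.steps` is an object keyed by step name, iterating over it means iterating over its keys. The sketch below shows one way to do that with `jq`, assuming `jq` is installed and the trajectory JSON arrives on stdin. The function names and the file name `trajectory.json` are hypothetical, and the shape of each step's value is not specified here, so it is printed unmodified.

```bash
# Print the step names of a trajectory, in the order they appear in the JSON.
list_step_names() {
  jq -r '.steps | keys_unsorted[]'
}

# Print the raw value stored under one step name as compact JSON.
step_output() {
  jq -c --arg name "$1" '.steps[$name]'
}

# Usage (assuming the details response was saved to trajectory.json):
#   list_step_names < trajectory.json
#   step_output build < trajectory.json
```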
Download output files where available (validation_report.json, summary.md):
```bash
TOKEN="$(cat ~/.config/jetty/token)"
FILE_PATH="the-file-path"  # placeholder: path of the output file to download
curl -s -H "Authorization: Bearer $TOKEN" \
  "https://flows-api.jetty.io/api/v1/file/$FILE_PATH"
```

Step 3: Build Analysis Summary

Create and display a summary table:
```markdown
| # | Trajectory ID | Status | Duration | Iterations | Score/Result | Key Issue |
|---|---------------|--------|----------|------------|--------------|-----------|
```
Fill in one row per trajectory from the trajectory data, then present the table to the user.

Step 4: Pattern Analysis

Analyze trajectories against the runbook for these patterns:

4a: Consistent Failures

Evaluation criteria scoring below threshold across multiple runs.
  • Rubric: criteria scoring < 4 in more than half of runs
  • Programmatic: stages showing FAIL/PARTIAL across runs
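
The rubric rule above (a criterion scoring below 4 in more than half of runs) can be checked mechanically once scores are collected. This is a sketch under an assumed input format: one tab-separated line per run and criterion, as `trajectory_id`, `criterion`, `score`. That format and the function name `consistent_failures` are hypothetical. Only the threshold of 4 and the "more than half" rule come from the text.

```bash
# Hypothetical input on stdin, one line per (run, criterion) pair:
#   trajectory_id <TAB> criterion <TAB> score
# Prints each criterion that scored below 4 in more than half of its runs.
consistent_failures() {
  awk -F '\t' '
    { total[$2]++; if ($3 < 4) low[$2]++ }
    END {
      for (c in total)
        if (low[c] * 2 > total[c])     # strictly more than half of runs
          printf "%s\t%d/%d runs below 4\n", c, low[c], total[c]
    }
  ' | sort
}

# Usage (assuming scores were written to scores.tsv):
#   consistent_failures < scores.tsv
```

The output keeps the run counts so each flagged criterion can be cited as evidence in Step 5.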

4b: Iteration Waste

Steps that consistently need 2-3 retry rounds. Predictable first-attempt failures that could be prevented with better instructions or templates.

4c: Timeout Patterns

Runs that timed out or took disproportionately long. Which steps are the bottlenecks?
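
"Disproportionately long" is not defined above, so any cutoff is a judgment call. The sketch below assumes one possible reading: flag runs that took more than twice the median duration. The 2x cutoff, the tab-separated input format (`trajectory_id`, then duration in seconds), and the function name `slow_runs` are all assumptions made for illustration.

```bash
# Hypothetical input on stdin, one line per run:
#   trajectory_id <TAB> duration_in_seconds
# Prints runs whose duration exceeds twice the median (an assumed cutoff).
slow_runs() {
  sort -t "$(printf '\t')" -k2,2n | awk -F '\t' '
    { id[NR] = $1; secs[NR] = $2 }
    END {
      if (NR == 0) exit
      # Input is sorted by duration, so the median sits in the middle.
      median = (NR % 2) ? secs[(NR + 1) / 2] : (secs[NR / 2] + secs[NR / 2 + 1]) / 2
      for (i = 1; i <= NR; i++)
        if (secs[i] > 2 * median)
          printf "%s\t%ss (median %ss)\n", id[i], secs[i], median
    }
  '
}

# Usage (assuming durations were written to durations.tsv):
#   slow_runs < durations.tsv
```

The median is used rather than the mean so that a single very slow run does not raise its own cutoff.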

4d: Divergent Agent Behavior

Cases where the agent interpreted instructions differently across runs. Structurally different outputs suggesting ambiguous instructions.

4e: Missing Guardrails

Errors not caught by evaluation criteria. Environment setup issues (wrong versions, missing tools).

4f: Score Plateaus (rubric only)

Criteria whose scores do not improve across iterations, suggesting that the runbook's Common Fixes table lacks actionable guidance.

For all of 4a through 4f, present each pattern found with supporting evidence (trajectory IDs, scores, error messages).

Step 5: Generate Proposed Changes

For each pattern, propose a specific change to the RUNBOOK.md:
```markdown
## Proposed Changes

### Change 1: {Brief title} (addresses: {pattern})

**Section**: {Which runbook section}
**Current**:
> {Exact current text}

**Proposed**:
> {Replacement text}

**Evidence**: {Trajectory IDs, scores, errors that support this change}
```

Guidelines:
- Changes must be **specific**: quote exact sections, provide exact replacements
- Changes must be **evidence-backed**: cite trajectories, scores, or errors
- Prefer **additive** changes (add a Common Fix, add a tip, strengthen descriptions)
- If frontmatter fields are missing (agent, model, snapshot), propose adding them
- Don't fabricate evidence; only cite patterns actually observed

---

Step 6: Apply Changes

Use AskUserQuestion:
  • Header: "Apply Changes"
  • Question: "I found {N} proposed improvements. Which should I apply?"
  • Options:
    • "Apply all" / "Apply all {N} changes to the runbook"
    • "Let me choose" / "I'll approve each change individually"
    • "Save as report" / "Don't modify the runbook — save analysis to a file"
If "Apply all": Apply each change using Edit, then bump the runbook's `version` (patch increment).
If "Let me choose": For each change, ask approve/skip/modify.
If "Save as report": Write the analysis to `./runbook-optimization-report.md`.
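
The patch increment mentioned for "Apply all" can be computed in the shell. This is a minimal sketch, assuming the runbook's `version` is a dot-separated string whose last component is a plain integer, such as `1.2.3`. The helper name `bump_patch` is hypothetical.

```bash
# Hypothetical helper: increment the last dot-separated component of a version.
# Assumes that component is a plain decimal integer without leading zeros.
bump_patch() {
  local version="$1"
  local prefix="${version%.*}"   # everything before the last dot
  local patch="${version##*.}"   # the last component
  printf '%s.%s\n' "$prefix" "$((patch + 1))"
}

# Usage: bump_patch 1.2.3
```

The new value is then written back to the frontmatter with the Edit tool, the same way as the other changes.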

Step 7: Summary

```markdown
## Optimization Summary

- Runbook: {filename}
- Trajectories analyzed: {count}
- Patterns identified: {count}
- Changes applied: {count} / {total}
- Version: {old} → {new}

### Recommended next steps

- Run the updated runbook 2-3 times to verify improvements
- Run `/jetty optimize-runbook` again after new runs to measure progress
```

---

Important Notes

  • Read the token from file: put `TOKEN="$(cat ~/.config/jetty/token)"` at the start of each bash block.
  • URL: Use `flows-api.jetty.io` for API calls. Never `flows.jetty.io`.
  • Trajectories shape: `{"trajectories": [...]}`; access via `.trajectories[]`.
  • Steps are objects: keyed by name, not indexed.
  • Minimum trajectories: Works with 1+, but 3+ gives more reliable patterns.
  • Don't fabricate: Only report patterns actually observed in the data.