agent-orchestrator

Agent Orchestrator

Overview

Run a disciplined multi-agent workflow where this instance acts as the coordinator: it delegates audits and fixes to other agents, reconciles results, enforces quality gates, and drives the work to a usable, validated end state.

Core pattern: dispatch a fresh implementer per cluster, then run a two-stage review (spec compliance first, then code quality).

Workflow (Coordinator)

Discover and use other skills (when helpful)
- Check the harness-provided skill list (typically present in system context). If a relevant specialized skill exists, explicitly invoke it (e.g., `$some-skill`) and follow its workflow instead of reinventing it.
- Use other skills to: fetch external info safely, generate boilerplate reliably, apply framework-specific conventions, or handle fragile formats (docs/PDFs, CI config, release workflows).
- Keep skill usage intentional: choose the minimal set, state which skills you’re using and why, and avoid duplicating their instructions inside this skill.
Freeze scope + success criteria
- Restate the mission, constraints, and “done” criteria in concrete terms.
- Identify any authoritative sources (docs/specs) and record what claims must be backed by evidence.
Create a phase plan and keep it current
- Use your environment’s planning mechanism (e.g., `update_plan` if available) to track phases and prevent drifting.
- Prefer 4–7 steps; keep exactly one step `in_progress`.
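The plan guidance above can be checked mechanically. The following is a minimal sketch, assuming a plan is a list of steps with a status string; the `Step` type, the `check_plan` helper, and the `pending`/`completed` status names are hypothetical and not part of any particular planning tool (only `in_progress` comes from the text above).

```python
from dataclasses import dataclass


@dataclass
class Step:
    title: str
    status: str  # assumed values: "pending", "in_progress", "completed"


def check_plan(steps: list[Step]) -> list[str]:
    """Return a list of problems; an empty list means the plan follows the guidance."""
    problems = []
    # "Prefer 4-7 steps" is a preference, so treat this as a warning, not a hard failure.
    if not 4 <= len(steps) <= 7:
        problems.append(f"expected 4-7 steps, found {len(steps)}")
    # Exactly one step should be active while work is under way.
    active = [s.title for s in steps if s.status == "in_progress"]
    if len(active) != 1:
        problems.append(f"expected exactly one in_progress step, found {len(active)}")
    return problems
```

A coordinator could run a check like this each time it updates the plan and correct the plan before continuing.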
Decompose into subsystems
- Choose subsystems that can be audited independently (API surface, core logic, error handling, perf, integrations, tests, docs).
- For each subsystem, define 2–5 invariants (what must always be true).
Run dual independent audits per subsystem
- Spawn two independent audits per subsystem (auditA and auditB) and keep them independent until reconciliation.
- Require evidence for every issue (repo location, deterministic repro, expected vs actual, severity).
Reconcile audits into a single confirmed issue list
- Compare auditA vs auditB outputs and keep only mutually confirmed issues.
- Track rejected candidates with a brief reason (weak evidence, out of scope, non-deterministic).
- Use this reconciled list as the only input to implementation.
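The reconciliation rule above (keep only mutually confirmed issues, record a reason for everything else) can be sketched as a set intersection. This is an illustration only: the `Finding` type and its `key` field are hypothetical, and matching findings by an exact key is an assumption; in practice the coordinator may need judgment to decide that two differently worded findings describe the same issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    key: str    # hypothetical stable identifier, e.g. "path/to/file:symbol"
    title: str


def reconcile(audit_a: list[Finding], audit_b: list[Finding]):
    """Split findings into (confirmed, rejected) where rejected carries a reason."""
    a = {f.key: f for f in audit_a}
    b = {f.key: f for f in audit_b}
    confirmed, rejected = [], []
    for key in sorted(a.keys() | b.keys()):
        if key in a and key in b:
            confirmed.append(a[key])  # reported independently by both audits
        else:
            finding = a[key] if key in a else b[key]
            rejected.append((finding, "reported by only one audit"))
    return confirmed, rejected
```

Only the `confirmed` list would be passed on to implementation; the `rejected` list is kept for the completion report.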
Implement in clusters with clear ownership
- Group confirmed issues into clusters that can be fixed with minimal coupling.
- Spawn exactly one fixer per cluster; fixers should “own” a file set and avoid broad refactors.
- Every fix must come with a regression test (unit/integration/e2e as appropriate).
- For each cluster, run a two-stage review loop:
  - Implementer completes the cluster (tests, self-review, commit) and reports what changed.
  - Spec compliance reviewer validates “nothing more, nothing less” by reading code (do not trust the report).
  - Code quality reviewer validates maintainability and test quality (only after spec compliance passes).
  - If any review FAILs, send concrete feedback back to the implementer and repeat the failed review stage.
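One plausible reading of fixers "owning" a file set is that no file belongs to two clusters. The sketch below checks that reading before fixers are spawned; the `ownership_conflicts` helper and the cluster-to-files mapping are hypothetical, and disjoint ownership is an assumption rather than something the workflow states outright.

```python
from itertools import combinations


def ownership_conflicts(clusters: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Map each pair of cluster ids to the files they both claim (pairs without overlap are omitted)."""
    conflicts = {}
    for first, second in combinations(sorted(clusters), 2):
        shared = clusters[first] & clusters[second]
        if shared:
            conflicts[(first, second)] = shared
    return conflicts
```

If the result is non-empty, the coordinator could merge the overlapping clusters or reassign the shared files before dispatching implementers.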
Enforce review gates
- Do not merge/land a cluster unless spec compliance PASS and code quality PASS are both recorded with concrete references.
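The two-stage loop and the gate can be sketched as follows. This assumes each reviewer is a callable returning a pass flag plus notes with concrete references; `run_review_gates`, `apply_feedback`, and the `max_rounds` cap are hypothetical names and choices introduced for illustration.

```python
from typing import Callable

# A review takes a cluster id and returns (passed, notes with concrete references).
Review = Callable[[str], tuple[bool, str]]


def run_review_gates(
    cluster_id: str,
    spec_review: Review,
    quality_review: Review,
    apply_feedback: Callable[[str, str, str], None],
    max_rounds: int = 3,  # assumed cap so a failing stage cannot loop forever
) -> tuple[bool, dict[str, str]]:
    """Run spec compliance, then code quality; return (may_land, recorded PASS notes)."""
    record: dict[str, str] = {}
    stages = (("spec_compliance", spec_review), ("code_quality", quality_review))
    for stage, review in stages:
        for _ in range(max_rounds):
            passed, notes = review(cluster_id)
            if passed:
                record[stage] = notes  # PASS is recorded with its references
                break
            # FAIL: send concrete feedback to the implementer, then repeat this stage.
            apply_feedback(cluster_id, stage, notes)
        else:
            return False, record  # stage never passed; later stages are not run
    return True, record
```

The cluster may land only when the first return value is true, which implies both stages have a recorded PASS.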
Integrate + validate
- Run the repo’s standard validations (tests, lint, build, typecheck).
- If the repo has no clear commands, discover them from `README`, `package.json`, `pyproject.toml`, CI config, etc.
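For a Node-style repo, command discovery can start from the `scripts` table in `package.json`. The sketch below is one way to do that, assuming the repo uses conventional script names; the `discover_npm_validations` helper and the `WANTED` list are hypothetical, and many repos name their scripts differently, so treat a miss as "keep looking" rather than "no validation exists".

```python
import json
from pathlib import Path

# Assumed conventional script names; real repos may use others (e.g. "check", "tsc").
WANTED = ("test", "lint", "build", "typecheck")


def discover_npm_validations(repo_root: str) -> dict[str, str]:
    """Return {script name: command} for the conventional scripts package.json defines."""
    manifest = Path(repo_root) / "package.json"
    if not manifest.is_file():
        return {}
    scripts = json.loads(manifest.read_text(encoding="utf-8")).get("scripts", {})
    return {name: f"npm run {name}" for name in WANTED if name in scripts}
```

The same idea extends to other manifests and to CI config, which often lists the exact commands the project treats as authoritative.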
Deliver a concise completion report
- What is usable now.
- What remains intentionally unsupported (with next steps/issues).
- Commands executed (at least the key validation commands) and results.

Agent Prompt Templates

Use these as starting points; keep subsystem- and repo-specific details in the message you send.

Auditor (per subsystem)

Task:

- Audit the `<SUBSYSTEM>` subsystem independently.
- Do not propose fixes yet; identify issues only.
- If a specialized skill is relevant to the subsystem, invoke it and follow its audit/checklist guidance.

Output (bullet list):

- issue title
- severity: critical/high/medium/low
- evidence: repo file + symbol (and line if stable)
- deterministic repro (commands/steps) or reasoning for why repro is not needed
- expected vs actual
- violated invariant (if known) or propose a new invariant
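The auditor output maps naturally onto a small record type, which lets the coordinator reject incomplete findings before reconciliation. This is a sketch under assumptions: the `Issue` type, its field names, and the `missing_fields` helper are hypothetical and simply mirror the output list above.

```python
from dataclasses import dataclass

SEVERITIES = ("critical", "high", "medium", "low")


@dataclass
class Issue:
    title: str
    severity: str        # one of SEVERITIES
    evidence: str        # repo file + symbol (and line if stable)
    repro: str           # deterministic steps, or why a repro is not needed
    expected: str
    actual: str
    invariant: str = ""  # violated or newly proposed invariant; optional


def missing_fields(issue: Issue) -> list[str]:
    """Return the names of fields that are empty or invalid."""
    problems = []
    if issue.severity not in SEVERITIES:
        problems.append("severity")
    for name in ("title", "evidence", "repro", "expected", "actual"):
        if not getattr(issue, name).strip():
            problems.append(name)
    return problems
```

An issue with a non-empty result could be sent back to the auditor for evidence instead of entering the reconciliation step.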

Reconciler (coordinator task)

Task:

- Compare auditA vs auditB for `<SUBSYSTEM>`.
- Produce a single decision set: confirmed issues (mutual) + rejected candidates (with reason).

Output:

- Confirmed issues (only mutual)
- Rejected candidates (reason)
- Consensus achieved: YES/NO

Implementer (per cluster)

Task:

- Implement cluster `<CLUSTER_ID>` derived from confirmed issues.
- Work from a fresh context: do not assume prior clusters’ details unless provided.
- Do not open plan files unless explicitly instructed; the coordinator should paste the full cluster/task text and context here.
- Ask questions before you start if anything is unclear.
- Stay within agreed owned files; avoid opportunistic refactors.
- Add/adjust regression tests for every change.
- Run relevant validations (targeted tests first, then broader if appropriate).
- Commit your work (unless the repo workflow forbids local commits).
- Invoke specialized skills when they reduce risk (framework conventions, CI/test harness setup, format-sensitive edits).

Output:

- changed files (paths)
- commands executed + results
- brief behavior change summary
- tests added/updated

Spec Compliance Reviewer (per cluster)

Task:

- Verify the implementation matches the cluster’s requirements: nothing missing, nothing extra.
- Do not trust the implementer’s report; verify by reading the actual code.
- Call out missing requirements, extra features, or misunderstandings with concrete file references.

Output:

- PASS/FAIL
- missing requirements (if any) with concrete references
- extra/unneeded work (if any) with concrete references

Code Quality Reviewer (per cluster)

Task:

- Review cluster `<CLUSTER_ID>` changes for maintainability, test quality, and adherence to existing patterns.
- Only run after spec compliance PASS.
- Run the cluster’s relevant tests/commands (or explain what prevented running them).
- Confirm any invoked specialized skills were followed (or explicitly explain deviations).

Output:

- PASS/FAIL
- concrete references (files/symbols)
- any invariant violations or missing tests
