agent-orchestrator

Agent Orchestrator

Overview

Run a disciplined multi-agent workflow where this instance acts as the coordinator: it delegates audits and fixes to other agents, reconciles results, enforces quality gates, and drives the work to a usable, validated end state.
Core pattern: dispatch a fresh implementer per cluster, then run two-stage review (spec compliance first, then code quality).

Workflow (Coordinator)

  1. Discover and use other skills (when helpful)
    • Check the harness-provided skill list (typically present in system context). If a relevant specialized skill exists, explicitly invoke it (e.g., $some-skill) and follow its workflow instead of reinventing it.
    • Use other skills to: fetch external info safely, generate boilerplate reliably, apply framework-specific conventions, or handle fragile formats (docs/PDFs, CI config, release workflows).
    • Keep skill usage intentional: choose the minimal set, state which skills you’re using and why, and avoid duplicating their instructions inside this skill.
  2. Freeze scope + success criteria
    • Restate the mission, constraints, and “done” criteria in concrete terms.
    • Identify any authoritative sources (docs/specs) and record what claims must be backed by evidence.
  3. Create a phase plan and keep it current
    • Use your environment’s planning mechanism (e.g., update_plan if available) to track phases and prevent drifting.
    • Prefer 4–7 steps; keep exactly one step in_progress.
  4. Decompose into subsystems
    • Choose subsystems that can be audited independently (API surface, core logic, error handling, perf, integrations, tests, docs).
    • For each subsystem, define 2–5 invariants (what must always be true).
  5. Run dual independent audits per subsystem
    • Spawn two independent audits per subsystem (auditA and auditB) and keep them independent until reconciliation.
    • Require evidence for every issue (repo location, deterministic repro, expected vs actual, severity).
  6. Reconcile audits into a single confirmed issue list
    • Compare auditA vs auditB outputs and keep only mutually confirmed issues.
    • Track rejected candidates with a brief reason (weak evidence, out of scope, non-deterministic).
    • Use this reconciled list as the only input to implementation.
  7. Implement in clusters with clear ownership
    • Group confirmed issues into clusters that can be fixed with minimal coupling.
    • Spawn exactly one fixer per cluster; fixers should “own” a file set and avoid broad refactors.
    • Every fix must come with a regression test (unit/integration/e2e as appropriate).
    • For each cluster, run a two-stage review loop:
      • Implementer completes the cluster (tests, self-review, commit) and reports what changed.
      • Spec compliance reviewer validates “nothing more, nothing less” by reading code (do not trust the report).
      • Code quality reviewer validates maintainability and test quality (only after spec compliance passes).
      • If any review FAILs, send concrete feedback back to the implementer and repeat the failed review stage.
  8. Enforce review gates
    • Do not merge/land a cluster unless spec compliance PASS and code quality PASS are both recorded with concrete references.
  9. Integrate + validate
    • Run the repo’s standard validations (tests, lint, build, typecheck).
    • If the repo has no clear commands, discover them from README, package.json, pyproject.toml, CI config, etc.
  10. Deliver a concise completion report
    • What is usable now.
    • What remains intentionally unsupported (with next steps/issues).
    • Commands executed (at least the key validation commands) and results.
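
The two-stage review loop from steps 7 and 8 can be sketched as a small control loop. This is a minimal illustration, not a harness API: review_cluster, run_stage, the callback signatures, and the max_rounds cap are all hypothetical (the workflow itself does not set a retry limit).

```python
from typing import Callable, Tuple

# (passed, feedback-or-references); a hypothetical shape for a reviewer's reply.
Review = Tuple[bool, str]


def run_stage(
    stage: Callable[[], Review],
    implement: Callable[[str], None],
    max_rounds: int,
) -> bool:
    """Repeat one review stage until it passes or max_rounds reviews have failed."""
    passed, feedback = stage()
    rounds = 1
    while not passed:
        if rounds >= max_rounds:
            return False  # the gate stays closed for this cluster
        implement(feedback)  # send concrete feedback back to the implementer
        passed, feedback = stage()  # repeat only the stage that failed
        rounds += 1
    return True


def review_cluster(
    implement: Callable[[str], None],
    spec_review: Callable[[], Review],
    quality_review: Callable[[], Review],
    max_rounds: int = 3,  # assumption: the source does not define a cap
) -> bool:
    """Return True only when spec compliance passes first and code quality passes after it."""
    implement("")  # initial implementation, no feedback yet
    # `and` short-circuits, so the quality review never runs before spec compliance passes.
    return run_stage(spec_review, implement, max_rounds) and run_stage(
        quality_review, implement, max_rounds
    )
```

In practice each callback would wrap a message to a spawned agent; the coordinator lands the cluster only when review_cluster returns True.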

Agent Prompt Templates

Use these as starting points; keep subsystem- and repo-specific details in the message you send.

Auditor (per subsystem)

Task:
  • Audit the <SUBSYSTEM> subsystem independently.
  • Do not propose fixes yet; identify issues only.
  • If a specialized skill is relevant to the subsystem, invoke it and follow its audit/checklist guidance.
Output (bullet list):
  • issue title
  • severity: critical/high/medium/low
  • evidence: repo file + symbol (and line if stable)
  • deterministic repro (commands/steps) or reasoning for why repro is not needed
  • expected vs actual
  • violated invariant (if known) or propose a new invariant
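
One way to make the auditor output checkable before reconciliation is to model each finding as a record. The sketch below is illustrative only: the AuditIssue class, its field names, and the missing_evidence helper are hypothetical and simply mirror the output bullets above.

```python
from dataclasses import dataclass
from typing import List, Optional

SEVERITIES = ("critical", "high", "medium", "low")


@dataclass(frozen=True)
class AuditIssue:
    """One auditor finding (hypothetical structure, one field per output bullet)."""
    title: str
    severity: str
    evidence: str   # repo file + symbol, plus a line number if stable
    repro: str      # deterministic commands/steps, or why no repro is needed
    expected: str
    actual: str
    invariant: Optional[str] = None  # violated invariant, or a proposed new one


def missing_evidence(issue: AuditIssue) -> List[str]:
    """Return the names of required fields that are invalid or empty."""
    problems = []
    if issue.severity not in SEVERITIES:
        problems.append("severity")
    for name in ("title", "evidence", "repro", "expected", "actual"):
        if not getattr(issue, name).strip():
            problems.append(name)
    return problems
```

A coordinator could send any finding with a non-empty missing_evidence result back to the auditor instead of carrying it into reconciliation.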

Reconciler (coordinator task)

Task:
  • Compare auditA vs auditB for <SUBSYSTEM>.
  • Produce a single decision set: confirmed issues (mutual) + rejected candidates (with reason).
Output:
  • Confirmed issues (only mutual)
  • Rejected candidates (reason)
  • Consensus achieved: YES/NO
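
The "only mutual" rule amounts to a set intersection once both audits are keyed the same way. The sketch below assumes the coordinator has already normalized each finding to a shared key such as file plus symbol. The Decision type, the reconcile function, the rejection wording, and the reading of "consensus" as "nothing was one-sided" are assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class Decision(NamedTuple):
    confirmed: List[str]             # keys reported by both audits
    rejected: List[Tuple[str, str]]  # (key, reason) for one-sided findings
    consensus: bool                  # assumption: True when nothing was one-sided


def reconcile(audit_a: Dict[str, str], audit_b: Dict[str, str]) -> Decision:
    """Keep only issues present in both audits; record the rest with a reason.

    Each audit maps a normalized issue key to its title (hypothetical format).
    """
    confirmed = sorted(audit_a.keys() & audit_b.keys())
    rejected = [(k, "reported only by auditA") for k in sorted(audit_a.keys() - audit_b.keys())]
    rejected += [(k, "reported only by auditB") for k in sorted(audit_b.keys() - audit_a.keys())]
    return Decision(confirmed, rejected, consensus=not rejected)
```

Real reconciliation usually needs judgment about whether two differently worded findings describe the same defect; exact key matching is the simplest possible stand-in for that step.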

Implementer (per cluster)

Task:
  • Implement cluster <CLUSTER_ID> derived from confirmed issues.
  • Work from a fresh context: do not assume prior clusters’ details unless provided.
  • Do not open plan files unless explicitly instructed; the coordinator should paste the full cluster/task text and context here.
  • Ask questions before you start if anything is unclear.
  • Stay within agreed owned files; avoid opportunistic refactors.
  • Add/adjust regression tests for every change.
  • Run relevant validations (targeted tests first, then broader if appropriate).
  • Commit your work (unless the repo workflow forbids local commits).
  • Invoke specialized skills when they reduce risk (framework conventions, CI/test harness setup, format-sensitive edits).
Output:
  • changed files (paths)
  • commands executed + results
  • brief behavior change summary
  • tests added/updated

Spec Compliance Reviewer (per cluster)

Task:
  • Verify the implementation matches the cluster’s requirements: nothing missing, nothing extra.
  • Do not trust the implementer’s report; verify by reading the actual code.
  • Call out missing requirements, extra features, or misunderstandings with concrete file references.
Output:
  • PASS/FAIL
  • missing requirements (if any) with concrete references
  • extra/unneeded work (if any) with concrete references

Code Quality Reviewer (per cluster)

Task:
  • Review cluster <CLUSTER_ID> changes for maintainability, test quality, and adherence to existing patterns.
  • Only run after spec compliance PASS.
  • Run the cluster’s relevant tests/commands (or explain what prevented running them).
  • Confirm any invoked specialized skills were followed (or explicitly explain deviations).
Output:
  • PASS/FAIL
  • concrete references (files/symbols)
  • any invariant violations or missing tests
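
Recording both reviewer verdicts in a uniform shape makes the merge gate a mechanical check. The following is a sketch under assumptions: ReviewRecord, the stage names "spec" and "quality", and can_land are hypothetical names, and the rule that the most recent verdict per stage wins is an interpretation rather than something the templates state.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReviewRecord:
    stage: str    # "spec" or "quality" (hypothetical labels)
    verdict: str  # "PASS" or "FAIL"
    references: List[str] = field(default_factory=list)  # files/symbols cited


def can_land(records: List[ReviewRecord]) -> bool:
    """Allow landing only if the latest record of each stage is a PASS with references."""
    latest: Dict[str, ReviewRecord] = {}
    for record in records:  # later records overwrite earlier ones for the same stage
        latest[record.stage] = record
    return all(
        stage in latest
        and latest[stage].verdict == "PASS"
        and bool(latest[stage].references)
        for stage in ("spec", "quality")
    )
```

A PASS without concrete references is treated the same as a missing review, which matches the requirement that both verdicts be recorded with references before a cluster lands.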