agent-orchestrator
Agent Orchestrator
Overview
Run a disciplined multi-agent workflow where this instance acts as the coordinator: it delegates audits and fixes to other agents, reconciles results, enforces quality gates, and drives the work to a usable, validated end state.
Core pattern: dispatch a fresh implementer per cluster, then run two-stage review (spec compliance first, then code quality).
Workflow (Coordinator)
- Discover and use other skills (when helpful)
  - Check the harness-provided skill list (typically present in system context). If a relevant specialized skill exists, explicitly invoke it (e.g., $some-skill) and follow its workflow instead of reinventing it.
  - Use other skills to: fetch external info safely, generate boilerplate reliably, apply framework-specific conventions, or handle fragile formats (docs/PDFs, CI config, release workflows).
  - Keep skill usage intentional: choose the minimal set, state which skills you’re using and why, and avoid duplicating their instructions inside this skill.
- Freeze scope + success criteria
  - Restate the mission, constraints, and “done” criteria in concrete terms.
  - Identify any authoritative sources (docs/specs) and record what claims must be backed by evidence.
- Create a phase plan and keep it current
  - Use your environment’s planning mechanism (e.g., update_plan if available) to track phases and prevent drifting.
  - Prefer 4–7 steps; keep exactly one step in_progress.
- Decompose into subsystems
  - Choose subsystems that can be audited independently (API surface, core logic, error handling, perf, integrations, tests, docs).
  - For each subsystem, define 2–5 invariants (what must always be true).
- Run dual independent audits per subsystem
  - Spawn two independent audits per subsystem (auditA and auditB) and keep them independent until reconciliation.
  - Require evidence for every issue (repo location, deterministic repro, expected vs actual, severity).
- Reconcile audits into a single confirmed issue list
  - Compare auditA vs auditB outputs and keep only mutually confirmed issues.
  - Track rejected candidates with a brief reason (weak evidence, out of scope, non-deterministic).
  - Use this reconciled list as the only input to implementation.
- Implement in clusters with clear ownership
  - Group confirmed issues into clusters that can be fixed with minimal coupling.
  - Spawn exactly one fixer per cluster; fixers should “own” a file set and avoid broad refactors.
  - Every fix must come with a regression test (unit/integration/e2e as appropriate).
  - For each cluster, run a two-stage review loop:
    - Implementer completes the cluster (tests, self-review, commit) and reports what changed.
    - Spec compliance reviewer validates “nothing more, nothing less” by reading code (do not trust the report).
    - Code quality reviewer validates maintainability and test quality (only after spec compliance passes).
    - If any review FAILs, send concrete feedback back to the implementer and repeat the failed review stage.
- Enforce review gates
  - Do not merge/land a cluster unless spec compliance PASS and code quality PASS are both recorded with concrete references.
- Integrate + validate
  - Run the repo’s standard validations (tests, lint, build, typecheck).
  - If the repo has no clear commands, discover them from README, package.json, pyproject.toml, CI config, etc.
- Deliver a concise completion report
  - What is usable now.
  - What remains intentionally unsupported (with next steps/issues).
  - Commands executed (at least the key validation commands) and results.
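The two-stage review loop in the workflow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Review`, `ClusterResult`, `run_cluster`, and the `max_rounds` cap are hypothetical names, and the three callables stand in for however your harness actually dispatches agents.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Review:
    """Outcome of one review stage (hypothetical shape)."""
    passed: bool
    references: List[str] = field(default_factory=list)  # concrete files/symbols
    feedback: str = ""


@dataclass
class ClusterResult:
    landed: bool
    rounds: int
    spec: Optional[Review] = None
    quality: Optional[Review] = None


def run_cluster(
    implement: Callable[[str], str],
    spec_review: Callable[[str], Review],
    quality_review: Callable[[str], Review],
    max_rounds: int = 3,  # assumption: the source does not specify a retry cap
) -> ClusterResult:
    """Implement, then review spec compliance first and code quality second.

    A failed stage sends its feedback to the implementer and is repeated;
    the quality review never runs before spec compliance has passed.
    """
    feedback = ""
    spec: Optional[Review] = None
    quality: Optional[Review] = None
    for round_no in range(1, max_rounds + 1):
        change = implement(feedback)
        if spec is None or not spec.passed:
            spec = spec_review(change)
            if not spec.passed:
                feedback = spec.feedback
                continue  # repeat the spec stage on the next round
        quality = quality_review(change)
        if quality.passed:
            return ClusterResult(True, round_no, spec, quality)
        feedback = quality.feedback  # repeat only the quality stage
    return ClusterResult(False, max_rounds, spec, quality)
```

In practice the callables would wrap agent dispatches; passing plain functions keeps the coordinator logic testable without any agents running.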
Agent Prompt Templates
Use these as starting points; keep subsystem- and repo-specific details in the message you send.
Auditor (per subsystem)
Task:
- Audit the <SUBSYSTEM> subsystem independently.
- Do not propose fixes yet; identify issues only.
- If a specialized skill is relevant to the subsystem, invoke it and follow its audit/checklist guidance.
Output (bullet list):
- issue title
- severity: critical/high/medium/low
- evidence: repo file + symbol (and line if stable)
- deterministic repro (commands/steps) or reasoning for why repro is not needed
- expected vs actual
- violated invariant (if known) or propose a new invariant
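If auditor output is collected in structured form, the fields above map naturally onto a small record with a completeness check. The sketch below assumes a hypothetical `AuditIssue` dataclass and `validate_issue` helper; neither is part of the skill itself.

```python
from dataclasses import dataclass
from typing import List, Optional

SEVERITIES = ("critical", "high", "medium", "low")


@dataclass(frozen=True)
class AuditIssue:
    """One auditor finding, mirroring the output bullets (hypothetical shape)."""
    title: str
    severity: str
    evidence_file: str
    evidence_symbol: str
    expected: str
    actual: str
    repro: Optional[str] = None            # commands/steps, if any
    no_repro_reason: Optional[str] = None  # why a repro is not needed
    invariant: Optional[str] = None        # violated or newly proposed invariant


def validate_issue(issue: AuditIssue) -> List[str]:
    """Return a list of problems; an empty list means the issue is complete."""
    problems: List[str] = []
    if not issue.title.strip():
        problems.append("missing title")
    if issue.severity not in SEVERITIES:
        problems.append(f"unknown severity: {issue.severity!r}")
    if not (issue.evidence_file.strip() and issue.evidence_symbol.strip()):
        problems.append("evidence needs a repo file and a symbol")
    if not (issue.repro or issue.no_repro_reason):
        problems.append("needs a deterministic repro or a reason none is needed")
    if not (issue.expected.strip() and issue.actual.strip()):
        problems.append("needs expected vs actual")
    return problems
```

Rejecting incomplete findings at intake keeps weak-evidence candidates out of reconciliation rather than filtering them later.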
Reconciler (coordinator task)
Task:
- Compare auditA vs auditB for <SUBSYSTEM>.
- Produce a single decision set: confirmed issues (mutual) + rejected candidates (with reason).
Output:
- Confirmed issues (only mutual)
- Rejected candidates (reason)
- Consensus achieved: YES/NO
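A minimal sketch of the reconciliation rule, under two assumptions that the template does not fix: issues from the two audits are matched by a hypothetical (file, symbol) key, and "consensus" is taken to mean that no candidate was left unconfirmed. Real reconciliation will usually need judgment about whether two differently worded findings describe the same issue.

```python
from typing import Dict, List, Tuple

# Assumed matching key: (repo file, symbol). The skill does not prescribe one.
IssueKey = Tuple[str, str]


def reconcile(
    audit_a: Dict[IssueKey, str],
    audit_b: Dict[IssueKey, str],
) -> Dict[str, object]:
    """Keep only issues reported by both audits; record why the rest were dropped."""
    confirmed: List[IssueKey] = sorted(audit_a.keys() & audit_b.keys())
    rejected: Dict[IssueKey, str] = {}
    for key in audit_a.keys() - audit_b.keys():
        rejected[key] = "reported only by auditA"
    for key in audit_b.keys() - audit_a.keys():
        rejected[key] = "reported only by auditB"
    # Assumption: consensus is YES only when nothing was rejected.
    consensus = "YES" if not rejected else "NO"
    return {"confirmed": confirmed, "rejected": rejected, "consensus": consensus}
```

Only the `confirmed` list should be handed to implementers; the `rejected` map is kept for the completion report.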
Implementer (per cluster)
Task:
- Implement cluster <CLUSTER_ID> derived from confirmed issues.
- Work from a fresh context: do not assume prior clusters’ details unless provided.
- Do not open plan files unless explicitly instructed; the coordinator should paste the full cluster/task text and context here.
- Ask questions before you start if anything is unclear.
- Stay within agreed owned files; avoid opportunistic refactors.
- Add/adjust regression tests for every change.
- Run relevant validations (targeted tests first, then broader if appropriate).
- Commit your work (unless the repo workflow forbids local commits).
- Invoke specialized skills when they reduce risk (framework conventions, CI/test harness setup, format-sensitive edits).
Output:
- changed files (paths)
- commands executed + results
- brief behavior change summary
- tests added/updated
Spec Compliance Reviewer (per cluster)
Task:
- Verify the implementation matches the cluster’s requirements: nothing missing, nothing extra.
- Do not trust the implementer’s report; verify by reading the actual code.
- Call out missing requirements, extra features, or misunderstandings with concrete file references.
Output:
- PASS/FAIL
- missing requirements (if any) with concrete references
- extra/unneeded work (if any) with concrete references
Code Quality Reviewer (per cluster)
Task:
- Review cluster <CLUSTER_ID> changes for maintainability, test quality, and adherence to existing patterns.
- Only run after spec compliance PASS.
- Run the cluster’s relevant tests/commands (or explain what prevented running them).
- Confirm any invoked specialized skills were followed (or explicitly explain deviations).
Output:
- PASS/FAIL
- concrete references (files/symbols)
- any invariant violations or missing tests
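The coordinator can apply the landing rule mechanically once both reviewer outputs are recorded. The check below is a sketch: `ReviewRecord` and `can_land` are hypothetical names, and "concrete references" is interpreted as at least one file or symbol reference per review.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReviewRecord:
    """A recorded reviewer verdict (hypothetical shape)."""
    stage: str                 # "spec" or "quality"
    verdict: str               # "PASS" or "FAIL"
    references: List[str] = field(default_factory=list)  # files/symbols cited


def can_land(spec: ReviewRecord, quality: ReviewRecord) -> bool:
    """Allow a cluster to land only if both stages passed with references."""
    for record, expected_stage in ((spec, "spec"), (quality, "quality")):
        if record.stage != expected_stage:
            return False  # records supplied in the wrong slot
        if record.verdict != "PASS":
            return False
        if not record.references:
            return False  # a PASS without concrete references does not count
    return True
```

Treating a reference-free PASS as a failure enforces the "recorded with concrete references" requirement instead of relying on reviewers to remember it.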