CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}

Input patterns:
1. --files "src/a.ts,src/b.ts,src/c.ts" --> File-based targets
2. --targets "UserService,OrderService" --> Named targets
3. Infer from task description --> Parse file paths from task
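
The three input patterns above could be resolved roughly as follows. This is a minimal sketch, not the plugin's actual parser: the function name `parse_targets`, the mode labels, and the path-matching regular expression are all assumptions made for illustration.

```python
import re
import shlex


def parse_targets(arguments: str) -> tuple[str, list[str]]:
    """Return (mode, targets) for the three input patterns.

    Hypothetical helper: the mode labels and the regex are illustrative only.
    """
    try:
        tokens = shlex.split(arguments)
    except ValueError:
        # Unbalanced quotes (e.g. an apostrophe in prose): fall back to a plain split.
        tokens = arguments.split()

    # Patterns 1 and 2: an explicit comma-separated list after a flag.
    for flag, mode in (("--files", "files"), ("--targets", "named")):
        if flag in tokens:
            idx = tokens.index(flag)
            if idx + 1 < len(tokens):
                raw = tokens[idx + 1]
                return mode, [item.strip() for item in raw.split(",") if item.strip()]

    # Pattern 3: infer targets by scanning the description for path-like tokens.
    return "inferred", re.findall(r"[\w./-]+\.\w+", arguments)
```

An explicit flag wins over inference here, on the assumption that a user who lists targets explicitly does not also want paths scraped from the description.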
Let me analyze this parallel task step by step to determine the optimal configuration:
1. **Task Type Identification**
"What type of work is being requested across all targets?"
- Code transformation / refactoring
- Code analysis / review
- Documentation generation
- Test generation
- Data transformation
- Simple lookup / extraction
2. **Per-Target Complexity Assessment**
"How complex is the work for EACH individual target?"
- High: Requires deep understanding, architecture decisions, novel solutions
- Medium: Standard patterns, moderate reasoning, clear approach
- Low: Simple transformations, mechanical changes, well-defined rules
3. **Per-Target Output Size**
"How extensive is each target's expected output?"
- Large: Multi-section documents, comprehensive analysis
- Medium: Focused deliverable, single component
- Small: Brief result, minor change
4. **Independence Check**
"Are the targets truly independent?"
- Yes: No shared state, no cross-dependencies, order doesn't matter
- Partial: Some shared context needed, but can run in parallel
- No: Dependencies exist --> Use sequential execution instead

| Check | Question | If YES |
|---|---|---|
| File Independence | Do targets share files? | Cannot parallelize - files conflict |
| State Independence | Do tasks modify shared state? | Cannot parallelize - race conditions |
| Order Independence | Does execution order matter? | Cannot parallelize - sequencing required |
| Output Independence | Does any target read another's output? | Cannot parallelize - data dependency |
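
The File Independence check in the table above can be automated when each task declares the files it will touch. The sketch below assumes a hypothetical mapping from task name to file set; the other three checks (state, order, output) depend on task semantics and are not covered.

```python
from itertools import combinations


def find_file_conflicts(task_files: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Return every pair of tasks that would touch at least one common file.

    `task_files` is a hypothetical structure: task name -> files it will modify.
    An empty result means the File Independence check passes.
    """
    conflicts = []
    for (task_a, files_a), (task_b, files_b) in combinations(task_files.items(), 2):
        shared = files_a & files_b
        if shared:
            conflicts.append((task_a, task_b, shared))
    return conflicts
```

Any non-empty result would mean those two tasks should not run in the same parallel batch.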
/launch-sub-agent

| Grouping Type | When to Apply | Meta-Judges | Implementation Agents | Judges |
|---|---|---|---|---|
| Repeatable | Same task pattern applied across multiple files/modules (e.g., "add tests to all 3 modules") | ONE shared meta-judge for the group | One per task (always isolated) | One per task, each receiving the SAME shared spec |
| Shared | Tasks that should be reviewed/verified together because they are interdependent (e.g., "implement S3 adapter AND integrate it into analytics") | ONE combined meta-judge for the group | One per task (always isolated) | ONE judge for the entire group, reviewing all changes together |
| Independent | Tasks that are fully independent with no grouping benefit | One per task | One per task (always isolated) | One per task |
For each pair of tasks, ask:

1. "Is this the SAME task applied to different targets?"
   +-- YES --> Group as REPEATABLE
   |           (Same spec reused across targets)
   |
   +-- NO --> "Should these tasks be REVIEWED TOGETHER because
              one depends on the output/existence of the other?"
              |
              +-- YES --> Group as SHARED
              |           (Combined spec, single judge reviews all)
              |
              +-- NO --> Mark as INDEPENDENT
                         (Separate meta-judge and judge per task)
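
The grouping table fixes how many agents of each kind a batch needs. The following sketch computes those counts under the table's rules; the `Group` dataclass and the `agent_counts` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Group:
    kind: str         # "repeatable", "shared", or "independent"
    tasks: list[str]  # task names in this group


def agent_counts(groups: list[Group]) -> dict[str, int]:
    """Count meta-judges, implementation agents, and judges for a batch."""
    meta = impl = judges = 0
    for group in groups:
        n = len(group.tasks)
        impl += n                  # implementation agents are always one per task
        if group.kind == "independent":
            meta += n              # one meta-judge per task
            judges += n            # one judge per task
        elif group.kind == "repeatable":
            meta += 1              # one shared meta-judge for the group
            judges += n            # one judge per task, all using the same spec
        elif group.kind == "shared":
            meta += 1              # one combined meta-judge for the group
            judges += 1            # one judge reviews the whole group
        else:
            raise ValueError(f"unknown grouping type: {group.kind}")
    return {"meta_judges": meta, "implementers": impl, "judges": judges,
            "total": meta + impl + judges}
```

For three repeatable tasks plus one independent task this gives 2 meta-judges, 4 implementation agents, and 4 judges, 10 agents in total.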
| Task Profile | Recommended Model | Rationale |
|---|---|---|
| Complex per-target (architecture, design) | Opus | Maximum reasoning capability per task |
| Specialized domain (code review, security) | | Domain expertise matters |
| Medium complexity, large output | Sonnet | Good capability, cost-efficient for volume |
| Simple transformations (rename, format) | Haiku | Fast, cheap, sufficient for mechanical tasks |
| Default (when uncertain) | Opus | Optimize for quality over cost |
Is EACH target's task COMPLEX (architecture, novel problem, critical decision)?
|
+-- YES --> Use Opus for ALL agents
|
+-- NO --> Is task SIMPLE and MECHANICAL (rename, format, extract)?
           |
           +-- YES --> Use Haiku for ALL agents
           |
           +-- NO --> Is output LARGE but task not complex?
                      |
                      +-- YES --> Use Sonnet for ALL agents
                      |
                      +-- NO --> Use Opus for ALL agents (default)
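
The decision tree above maps directly onto a small function. This is a sketch under the assumption that the three questions have already been answered as booleans; `select_model` is a hypothetical name, and the returned strings mirror the lowercase `model:` values used in the Task tool calls.

```python
def select_model(is_complex: bool, is_mechanical: bool, large_output: bool) -> str:
    """Pick one model for ALL agents in the batch, following the tree top to bottom."""
    if is_complex:
        return "opus"    # architecture, novel problem, critical decision
    if is_mechanical:
        return "haiku"   # rename, format, extract
    if large_output:
        return "sonnet"  # large output, but the task itself is not complex
    return "opus"        # default: prefer quality over cost when uncertain
```

The order of the checks matters: a task that is both complex and large still resolves to Opus because complexity is tested first.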
**Repeatable group meta-judge prompt (ONE per group):**

**Shared group meta-judge prompt (ONE per group):**

Use Task tool (one per group/independent task, all in same message):
[Meta-judge for Repeatable Group: "add tests"]
- description: "Meta-judge (repeatable): reusable spec for adding tests across 3 modules"
- prompt: {repeatable group meta-judge prompt}
- model: opus
- subagent_type: "sadd:meta-judge"
[Meta-judge for Shared Group: "S3 adapter + integration"]
- description: "Meta-judge (shared): combined spec for S3 adapter implementation and integration"
- prompt: {shared group meta-judge prompt}
- model: opus
- subagent_type: "sadd:meta-judge"
[Meta-judge for Independent Task: "update CI pipeline"]
- description: "Meta-judge: update CI pipeline"
- prompt: {independent meta-judge prompt}
- model: opus
- subagent_type: "sadd:meta-judge"
[All meta-judges launched simultaneously]

<task>
{Task description from $ARGUMENTS}
</task>
<target>
{Specific target for this agent: file path, component name, etc.}
</target>
<constraints>
- Work ONLY on the specified target
- Do NOT modify other files unless explicitly required
- Follow existing patterns in the target
- {Any additional constraints from context}
</constraints>
<output>
{Expected deliverable location and format}
CRITICAL: At the end of your work, provide a "Summary" section containing:
- Files modified (full paths)
- Key changes (3-5 bullet points)
- Any decisions made and rationale
- Potential concerns or follow-up needed
</output>

| # | Question | Why It Matters |
|---|---|---|
| 1 | Did I achieve the stated objective for this target? | Incomplete work = failed task |
| 2 | Are my changes consistent with patterns in this file/codebase? | Inconsistency creates technical debt |
| 3 | Did I introduce any regressions or break existing functionality? | Breaking changes are unacceptable |
| 4 | Are edge cases and error scenarios handled appropriately? | Edge cases cause production issues |
| 5 | Is my output clear, well-formatted, and ready for review? | Unclear output reduces value |
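
The task template and the five self-critique questions above could be assembled into a single sub-agent prompt along these lines. This is only a sketch: `build_agent_prompt` and its parameters are hypothetical, and the real command may word or order the sections differently.

```python
SELF_CRITIQUE_QUESTIONS = [
    "Did I achieve the stated objective for this target?",
    "Are my changes consistent with patterns in this file/codebase?",
    "Did I introduce any regressions or break existing functionality?",
    "Are edge cases and error scenarios handled appropriately?",
    "Is my output clear, well-formatted, and ready for review?",
]


def build_agent_prompt(task: str, target: str, deliverable: str,
                       extra_constraints: tuple[str, ...] = ()) -> str:
    """Fill the task/target/constraints/output template for ONE target."""
    constraints = [
        "Work ONLY on the specified target",
        "Do NOT modify other files unless explicitly required",
        "Follow existing patterns in the target",
        *extra_constraints,
    ]
    constraint_lines = "\n".join(f"- {c}" for c in constraints)
    critique_lines = "\n".join(
        f"{i}. {q}" for i, q in enumerate(SELF_CRITIQUE_QUESTIONS, start=1)
    )
    return (
        f"<task>\n{task}\n</task>\n"
        f"<target>\n{target}\n</target>\n"
        f"<constraints>\n{constraint_lines}\n</constraints>\n"
        f"<output>\n{deliverable}\n</output>\n\n"
        "## Self-Critique Verification (MANDATORY)\n"
        f"{critique_lines}\n"
    )
```

Because the function receives exactly one target, the prompt never contains the other targets in the batch.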
┌─────────────────────────────────────────────────────────────────────────┐
│                                                                         │
│  Phase 3.5: Meta-Judge Dispatch (ALL in parallel)                       │
│                                                                         │
│  Independent:             Repeatable Group:                             │
│  ┌───────────────┐        ┌──────────────────────┐                      │
│  │ Meta-Judge A  │        │ Meta-Judge (shared)  │                      │
│  │ (Opus)        │        │ (Opus)               │                      │
│  │ → Spec YAML A │        │ → Reusable Spec YAML │                      │
│  └───────┬───────┘        └──────────┬───────────┘                      │
│          │                  ┌────────┴─────────┐                        │
│          ▼                  ▼                  ▼                        │
│  Phase 5: Implementation (ALL in parallel, one per task)                │
│                                                                         │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐                │
│  │ Implementer A │  │ Implementer B │  │ Implementer C │                │
│  └───────┬───────┘  └───────┬───────┘  └───────┬───────┘                │
│          │                  │                  │                        │
│          ▼                  ▼                  ▼                        │
│  Phase 5.2: Judge per task (after ALL implementers complete)            │
│                                                                         │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐                │
│  │ Judge A       │  │ Judge B       │  │ Judge C       │                │
│  │ +Spec YAML A  │  │ +Reusable Spec│  │ +Reusable Spec│                │
│  └───────┬───────┘  └───────┬───────┘  └───────┬───────┘                │
│          ▼                  ▼                  ▼                        │
│  Parse Verdict (per target) → PASS/FAIL → Retry if needed               │
└─────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────┐
│                                                                         │
│  Phase 3.5: Meta-Judge for Shared Group                                 │
│       ┌───────────────────────┐                                         │
│       │ Meta-Judge (combined) │                                         │
│       │ (Opus)                │                                         │
│       │ → Combined Spec YAML  │                                         │
│       └───────────┬───────────┘                                         │
│          ┌────────┴─────────┐                                           │
│          ▼                  ▼                                           │
│  Phase 5: Implementation (one per task, in parallel)                    │
│  ┌───────────────┐  ┌───────────────┐                                   │
│  │ Implementer X │  │ Implementer Y │                                   │
│  └───────┬───────┘  └───────┬───────┘                                   │
│          │                  │                                           │
│          └────────┬─────────┘                                           │
│                   ▼                                                     │
│  Phase 5.2: ONE Judge for entire group                                  │
│  ┌────────────────────────────────┐                                     │
│  │ Judge (shared)                 │                                     │
│  │ +Combined Spec YAML            │                                     │
│  │ +ALL implementation outputs    │                                     │
│  └────────────────┬───────────────┘                                     │
│                   ▼                                                     │
│  Parse per-task verdicts → Retry ONLY failing task(s) if needed         │
└─────────────────────────────────────────────────────────────────────────┘
**Parallelization Guidelines:**
- Launch ALL independent tasks in a single batch (same response)
- Do NOT wait for one task before starting another
- Do NOT make sequential Task tool calls
- Task tool handles parallelization automatically
- Results collected after all complete
**Context Isolation (IMPORTANT):**
- Pass only context relevant to each specific target
- Do NOT pass the full list of all targets to each agent
- Let sub-agents discover local patterns through file reading
- Each agent works in clean context without accumulated confusion
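
As an analogy for the two guidelines above, the sketch below launches every agent in one batch and hands each one only its own context. It uses `asyncio` purely for illustration: `run_agent` is a hypothetical stand-in for a Task tool call, and the dictionary shapes are assumptions, not the plugin's data model.

```python
import asyncio


async def run_agent(context: dict) -> str:
    """Hypothetical stand-in for one Task tool call; a real agent would do work here."""
    await asyncio.sleep(0)
    return f"{context['target']}: ok"


async def run_batch(task: str, targets: list[str],
                    local_context: dict[str, dict]) -> list[str]:
    """Start ALL agents at once; each context holds only its own target."""
    jobs = [
        # Context isolation: no agent receives the full list of targets.
        run_agent({"task": task, "target": target, **local_context.get(target, {})})
        for target in targets
    ]
    # One batch, no sequential awaiting; results come back in launch order.
    return await asyncio.gather(*jobs)
```

A call such as `asyncio.run(run_batch("Add tests", ["auth.ts", "cart.ts"], {}))` collects results only after every agent has finished.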
| Grouping Type | Judge Dispatch | Spec Used |
|---|---|---|
| Independent | One judge per task | Task-specific meta-judge spec |
| Repeatable | One judge per task | SAME shared reusable spec from the group's meta-judge |
| Shared | ONE judge for the entire group | Combined spec from the group's meta-judge |
You are evaluating an implementation artifact for target {target_name} against an evaluation specification produced by the meta judge.
CLAUDE_PLUGIN_ROOT=`${CLAUDE_PLUGIN_ROOT}`
{meta-judge's evaluation specification YAML}

CRITICAL: NEVER provide the score threshold in any format, including `threshold_pass` or any variant of it. The judge MUST NOT know the score threshold, so that its evaluation is not biased.

You are evaluating implementation artifacts for a group of related tasks against a combined evaluation specification produced by the meta judge. These tasks are interdependent and must be reviewed together.
CLAUDE_PLUGIN_ROOT=`${CLAUDE_PLUGIN_ROOT}`
{meta-judge's COMBINED evaluation specification YAML}

Use Task tool:
- description: "Judge: {target name}"
- prompt: {judge verification prompt with exact meta-judge specification YAML, and Pre-existing or Expected Parallel Changes section if applicable}
- model: opus
- subagent_type: "sadd:judge"

Use Task tool:
- description: "Judge (shared): {group description}"
- prompt: {shared group judge prompt from 5.2.3 with combined meta-judge specification YAML and ALL implementation outputs}
- model: opus
- subagent_type: "sadd:judge"

Extract from judge reply:
- VERDICT: PASS or FAIL
- SCORE: X.X/5.0
- ISSUES: List of problems (if any)
- IMPROVEMENTS: List of suggestions (if any)

If score >= 4.0:
  -> VERDICT: PASS
  -> Mark target complete
  -> Include IMPROVEMENTS as optional enhancements
If score >= 3.0 and all found issues are low priority:
  -> VERDICT: PASS
  -> Mark target complete
  -> Include IMPROVEMENTS as optional enhancements
Otherwise (score < 4.0 and the low-priority exception does not apply):
  -> VERDICT: FAIL
  -> Check retry count for this target
  If retries < 3:
    -> Dispatch retry implementation agent with judge feedback
    -> Return to judge verification with same target-specific meta-judge specification
  If retries >= 3:
    -> Mark target as failed (isolate from other targets)
    -> Do NOT proceed with more retries without user decision

Extract from shared judge reply:
- Per-task verdicts:
- Task 1 ({target}): VERDICT: PASS/FAIL, SCORE: X.X/5.0, ISSUES: [...]
- Task 2 ({target}): VERDICT: PASS/FAIL, SCORE: X.X/5.0, ISSUES: [...]
- OVERALL SCORE: X.X/5.0
- CROSS-TASK ISSUES: List of integration problems (if any)

If shared judge finds failures:
1. Identify which specific task(s) failed from per-task verdicts
2. Re-launch ONLY the implementation agent(s) for the failed task(s)
-- Do NOT re-launch agents whose tasks passed
3. After retry implementation completes, re-launch the shared judge
to review ALL changes again (passed + retried)
-- The shared judge still uses the same combined meta-judge spec
4. Repeat until all tasks pass or max retries reached for any task
CRITICAL: Only the specific failing implementation agent(s) are retried.
Passing tasks are NOT re-implemented. The shared judge always reviews
the complete group together on each evaluation round.
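
The per-target verdict rules and the shared-group retry rule can be sketched together. Everything below is illustrative: the function names, the `"low"` priority label, and the tuple shape of a per-task result are assumptions. The sketch also treats a score of at least 3.0 with no issues at all as a pass, since an empty issue list trivially has only low-priority issues.

```python
MAX_RETRIES = 3


def verdict(score: float, issue_priorities: list[str]) -> str:
    """PASS at >= 4.0, or at >= 3.0 when every found issue is low priority."""
    if score >= 4.0:
        return "PASS"
    if score >= 3.0 and all(p == "low" for p in issue_priorities):
        return "PASS"
    return "FAIL"


def next_action(score: float, issue_priorities: list[str], retries: int) -> str:
    """Return "complete", "retry", or "failed" for one target."""
    if verdict(score, issue_priorities) == "PASS":
        return "complete"
    # After the third retry the target is isolated and the user decides.
    return "retry" if retries < MAX_RETRIES else "failed"


def tasks_to_retry(per_task: dict[str, tuple[float, list[str]]],
                   retries: dict[str, int]) -> list[str]:
    """Shared group: re-launch ONLY the implementation agents whose task failed."""
    return [
        name
        for name, (score, priorities) in per_task.items()
        if next_action(score, priorities, retries.get(name, 0)) == "retry"
    ]
```

After the returned tasks are re-implemented, the shared judge would be launched again over the whole group, passing tasks included.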

| Target | Grouping | Model | Judge Score | Retries | Status | Summary |
|---|---|---|---|---|---|---|
| {target_1} | {Repeatable/Shared/Independent} | {model} | {X.X}/5.0 | {0-3} | SUCCESS | {brief outcome} |
| {target_2} | {Repeatable/Shared/Independent} | {model} | {X.X}/5.0 | {0-3} | SUCCESS | {brief outcome} |
| {target_3} | {Repeatable/Shared/Independent} | {model} | {X.X}/5.0 | {3} | FAILED | {failure reason} |
| ... | ... | ... | ... | ... | ... | ... |
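
A results table in the format above could be rendered from collected per-target records. This is a sketch only: `render_results` and the record keys (`target`, `grouping`, `model`, `score`, `retries`, `passed`, `summary`) are hypothetical names, not part of the command.

```python
HEADER = ["Target", "Grouping", "Model", "Judge Score", "Retries", "Status", "Summary"]


def render_results(rows: list[dict]) -> str:
    """Render per-target results as a markdown table matching the template."""
    lines = ["| " + " | ".join(HEADER) + " |", "|" + "---|" * len(HEADER)]
    for row in rows:
        cells = [
            row["target"],
            row["grouping"],
            row["model"],
            f"{row['score']:.1f}/5.0",                  # one decimal place, as in X.X/5.0
            str(row["retries"]),
            "SUCCESS" if row["passed"] else "FAILED",
            row["summary"],
        ]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)
```

Failed targets stay in the same table as successful ones, so one failure never hides the outcome of the others.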
**Failure Handling:**
- Report failed tasks clearly with error details
- Successful tasks are NOT affected by failures
- Failed targets isolated after max retries
- Suggest options: provide guidance, skip, or manual fix

/do-in-parallel add tests to all 3 modules in src folder and add tests step to github actions

Phase 2: Task Analysis + Requirement Grouping
1. Task Identification:
- Task A: "Add tests to src/modules/auth.ts"
- Task B: "Add tests to src/modules/cart.ts"
- Task C: "Add tests to src/modules/payments.ts"
- Task D: "Add tests step to GitHub Actions CI pipeline"
2. Requirement Grouping:
- Tasks A, B, C: REPEATABLE — same task ("add tests") applied to 3 different modules
→ ONE shared meta-judge producing a reusable spec
- Task D: INDEPENDENT — different task type (CI configuration)
→ Separate meta-judge
3. Pre-existing and Expected Parallel Changes Assessment:
- Pre-existing (from prior batch): API documentation updated across
src/api/users.ts, src/api/orders.ts, src/api/products.ts
- Expected parallel: Each agent should be aware that other agents in this
batch are adding tests to other modules and updating GH Actions simultaneously
4. Agent Count:
- Meta-judges: 2 (1 repeatable for tests + 1 independent for GH Actions)
- Implementation agents: 4 (one per task, always isolated)
- Judges: 4 (3 using shared test spec + 1 for GH Actions)
- Total: 10 agents (vs. 12 without grouping)

[Meta-judge 1: Repeatable group — test generation]
Use Task tool:
- description: "Meta-judge (repeatable): reusable spec for adding tests across 3 modules"
- prompt:
## Task
Generate a REUSABLE evaluation specification yaml that can be applied to
ANY of the following targets performing the same task. You will produce
rubrics, checklists, and scoring criteria that individual judge agents
will each use independently to evaluate one target's implementation artifact.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt as Context
add tests to all 3 modules in src folder and add tests step to github actions
## Task Being Repeated
Add comprehensive unit tests to a source module
## Targets in This Group
- src/modules/auth.ts
- src/modules/cart.ts
- src/modules/payments.ts
## Context
Project uses Jest for testing. Test files should be co-located as
*.test.ts files. Existing test patterns available in src/modules/__tests__/.
## Artifact Type
code
## Instructions
CRITICAL: You are generating a REUSABLE spec that will be applied to
EACH target independently by separate judges.
- Use generic language: "target file should align with criteria" instead
of "all files should align"
- Do NOT include file-specific requirements (e.g., NOT "auth.ts should
test only authentication logic") since this same spec will be applied
to different files
- The spec must be applicable to ANY target in this group without modification
- Each judge will receive this same spec and evaluate only its own target
against it
User prompt is provided as context, you should use it only as reference
of changes that can occur in the project by other agents.
Return only the final evaluation specification YAML in your response.
- model: opus
- subagent_type: "sadd:meta-judge"
[Meta-judge 2: Independent — GitHub Actions]
Use Task tool:
- description: "Meta-judge: add tests step to GitHub Actions"
- prompt:
## Task
Generate an evaluation specification yaml for the following task applied
to a specific target. You will produce rubrics, checklists, and scoring
criteria that a judge agent will use to evaluate the implementation
artifact for this specific target.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt as Context
add tests to all 3 modules in src folder and add tests step to github actions
## Target
Add a test execution step to the GitHub Actions CI pipeline
(.github/workflows/ci.yml or similar)
## Context
Project uses Jest for testing. The CI pipeline should run tests after
build step. Existing workflow file may need a new job or step.
## Artifact Type
configuration
## Instructions
User prompt is provided as context, you should use it only as reference
of changes that can occur in the project by other agents. Generate
evaluation specification ONLY for adding the tests step to GitHub Actions.
Your report will be used to verify only this particular task, not
all tasks in the user prompt.
Return only the final evaluation specification YAML in your response.
- model: opus
- subagent_type: "sadd:meta-judge"
[Both meta-judges launched simultaneously]

[Implementation 1: auth module tests]
Use Task tool:
- description: "Parallel: add tests to src/modules/auth.ts"
- prompt:
## Reasoning Approach
Let's think step by step.
Before taking any action, think through the problem systematically:
1. "Let me first understand what is being asked for this specific target..."
2. "Let me analyze this specific target..."
3. "Let me plan my approach..."
Work through each step explicitly before implementing.
<task>Add comprehensive unit tests</task>
<target>src/modules/auth.ts</target>
<constraints>
- Work ONLY on the specified target
- Do NOT modify other files unless explicitly required
- Follow existing test patterns in the project
</constraints>
<output>
Create test file for the auth module.
CRITICAL: At the end of your work, provide a "Summary" section containing:
- Files modified (full paths)
- Key changes (3-5 bullet points)
- Any decisions made and rationale
</output>
## Self-Critique Verification (MANDATORY)
[standard self-critique suffix]
- model: sonnet
[Implementation 2: cart module tests]
Use Task tool:
- description: "Parallel: add tests to src/modules/cart.ts"
- prompt:
## Reasoning Approach
Let's think step by step.
Before taking any action, think through the problem systematically:
1. "Let me first understand what is being asked for this specific target..."
2. "Let me analyze this specific target..."
3. "Let me plan my approach..."
Work through each step explicitly before implementing.
<task>Add comprehensive unit tests</task>
<target>src/modules/cart.ts</target>
<constraints>
- Work ONLY on the specified target
- Do NOT modify other files unless explicitly required
- Follow existing test patterns in the project
</constraints>
<output>
Create test file for the cart module.
CRITICAL: At the end of your work, provide a "Summary" section containing:
- Files modified (full paths)
- Key changes (3-5 bullet points)
- Any decisions made and rationale
</output>
## Self-Critique Verification (MANDATORY)
Before submitting, verify your work:
1. Re-read the original task and confirm every requirement is addressed
2. Check that all tests follow existing patterns in the project
3. Verify no unrelated files were modified
4. Confirm the Summary section is complete and accurate
- model: sonnet
[Implementation 3: payments module tests]
Use Task tool:
- description: "Parallel: add tests to src/modules/payments.ts"
- prompt: [Same CoT prefix + task body for payments.ts + critique suffix]
- model: sonnet
[Implementation 4: GitHub Actions test step]
Use Task tool:
- description: "Parallel: add tests step to GitHub Actions CI"
- prompt:
## Reasoning Approach
Let's think step by step.
Before taking any action, think through the problem systematically:
1. "Let me first understand what is being asked for this specific target..."
2. "Let me analyze this specific target..."
3. "Let me plan my approach..."
Work through each step explicitly before implementing.
<task>Add a test execution step to the GitHub Actions CI pipeline</task>
<target>.github/workflows/ci.yml</target>
<constraints>
- Work ONLY on the CI workflow file
- Add a step that runs the test suite after the build step
- Do NOT modify other workflow files or steps beyond what is necessary
- Follow existing workflow patterns and conventions
</constraints>
<output>
Update the CI workflow with a test execution step.
CRITICAL: At the end of your work, provide a "Summary" section containing:
- Files modified (full paths)
- Key changes (3-5 bullet points)
- Any decisions made and rationale
</output>
## Self-Critique Verification (MANDATORY)
Before submitting, verify your work:
1. Re-read the original task and confirm every requirement is addressed
2. Check that the workflow YAML is valid and well-structured
3. Verify no unrelated workflow steps were modified
4. Confirm the Summary section is complete and accurate
- model: sonnet
[All 4 launched simultaneously]

[Judge 1: auth module — uses SHARED reusable spec from repeatable meta-judge]
Use Task tool:
- description: "Judge: src/modules/auth.ts"
- prompt:
You are evaluating an implementation artifact for target
src/modules/auth.ts against an evaluation specification produced
by the meta judge.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt
add tests to all 3 modules in src folder and add tests step to github actions
## Target
src/modules/auth.ts
## Pre-existing and expected parallel changes (Context Only)
The following changes were made earlier or are expected to be made by
other parallel agents in the current batch. They are NOT part of
the current implementation agent's output. Focus your evaluation
on the current agent's changes to its specific target. Only verify
other changed files/logic if they directly relate to the current
target's task requirements.
### Previous do-in-parallel: "Update API documentation for all endpoints"
The following files were modified as part of a previous parallel batch:
- src/api/users.ts (modified) - Added JSDoc to public methods,
updated module header
- src/api/orders.ts (modified) - Added JSDoc to public methods,
added @example tags
- src/api/products.ts (modified) - Added JSDoc to public methods,
updated type annotations
### Expected parallel changes (current batch)
Other agents in this batch are simultaneously:
- Adding tests to src/modules/cart.ts and src/modules/payments.ts
(repeatable group — same task on other modules)
- Adding a tests step to .github/workflows/ci.yml (independent task)
## Evaluation Specification
```yaml
{EXACT reusable spec YAML from repeatable meta-judge — same for all 3 module judges}
```
## Implementation Output
{Summary from auth implementation agent}
## Instructions
User prompt is provided as context, you should use it only as reference
of changes that can occur in the project by other agents. Evaluate ONLY
the test generation for auth.ts.
Follow your full judge process as defined in your agent instructions!
## Output
CRITICAL: You must reply with this exact structured evaluation report
format in YAML at the START of your response!
- model: opus
- subagent_type: "sadd:judge"
[Judge 2: cart module — uses SAME shared reusable spec]
Use Task tool:
- description: "Judge: src/modules/cart.ts"
- prompt: [Same judge template, same reusable spec YAML, cart implementation output.
Pre-existing and expected parallel changes section: same prior batch info,
expected parallel changes list auth.ts, payments.ts, and GH Actions instead]
- model: opus
- subagent_type: "sadd:judge"
[Judge 3: payments module — uses SAME shared reusable spec]
Use Task tool:
- description: "Judge: src/modules/payments.ts"
- prompt: [Same judge template, same reusable spec YAML, payments implementation output.
Pre-existing and expected parallel changes section: same prior batch info,
expected parallel changes list auth.ts, cart.ts, and GH Actions instead]
- model: opus
- subagent_type: "sadd:judge"
[Judge 4: GitHub Actions — uses INDEPENDENT spec from GH Actions meta-judge]
Use Task tool:
- description: "Judge: GitHub Actions CI"
- prompt:
You are evaluating an implementation artifact for target
.github/workflows/ci.yml against an evaluation specification produced
by the meta judge.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt
add tests to all 3 modules in src folder and add tests step to github actions
## Target
.github/workflows/ci.yml
## Pre-existing and expected parallel changes (Context Only)
The following changes were made earlier or are expected to be made by
other parallel agents in the current batch. They are NOT part of
the current implementation agent's output. Focus your evaluation
on the current agent's changes to its specific target. Only verify
other changed files/logic if they directly relate to the current
target's task requirements.
### Previous do-in-parallel: "Update API documentation for all endpoints"
The following files were modified as part of a previous parallel batch:
- src/api/users.ts (modified) - Added JSDoc to public methods,
updated module header
- src/api/orders.ts (modified) - Added JSDoc to public methods,
added @example tags
- src/api/products.ts (modified) - Added JSDoc to public methods,
updated type annotations
### Expected parallel changes (current batch)
Other agents in this batch are simultaneously:
- Adding tests to src/modules/auth.ts, src/modules/cart.ts,
and src/modules/payments.ts (repeatable group — test generation)
## Evaluation Specification
```yaml
{EXACT spec YAML from independent GH Actions meta-judge}
```
## Implementation Output
{Summary from GH Actions implementation agent}
## Instructions
User prompt is provided as context, you should use it only as reference
of changes that can occur in the project by other agents. Evaluate ONLY
the GitHub Actions test step.
Follow your full judge process as defined in your agent instructions!
## Output
CRITICAL: You must reply with this exact structured evaluation report
format in YAML at the START of your response!
- model: opus
- subagent_type: "sadd:judge"
[All 4 judges launched simultaneously]

| Target | Grouping | Model | Judge Score | Retries | Status |
|---|---|---|---|---|---|
| src/modules/auth.ts | Repeatable | sonnet | 4.2/5.0 | 0 | SUCCESS |
| src/modules/cart.ts | Repeatable | sonnet | 4.0/5.0 | 0 | SUCCESS |
| src/modules/payments.ts | Repeatable | sonnet | 4.1/5.0 | 0 | SUCCESS |
| .github/workflows/ci.yml | Independent | sonnet | 4.3/5.0 | 0 | SUCCESS |
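Each judge prompt above lists every target in the batch except the one being judged under "Expected parallel changes". A minimal sketch of that exclusion rule follows, assuming a simple mapping from target path to a description; the `BATCH` mapping and the `expected_parallel_changes` helper are hypothetical names used only for illustration and are not part of the plugin.

```python
# Hypothetical data layout: target path -> what the agent for that target is doing.
BATCH = {
    "src/modules/auth.ts": "Adding tests (repeatable group)",
    "src/modules/cart.ts": "Adding tests (repeatable group)",
    "src/modules/payments.ts": "Adding tests (repeatable group)",
    ".github/workflows/ci.yml": "Adding a tests step (independent task)",
}


def expected_parallel_changes(batch, current_target):
    """Return context lines for every target except the one being judged."""
    if current_target not in batch:
        raise KeyError(f"unknown target: {current_target}")
    return [
        f"- {description}: {target}"
        for target, description in batch.items()
        if target != current_target
    ]


# The GitHub Actions judge sees only the three module targets.
for line in expected_parallel_changes(BATCH, ".github/workflows/ci.yml"):
    print(line)
```

The same call with a module path as `current_target` would yield the other two modules plus the CI workflow, which matches how the bracketed judge templates describe their lists.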
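/do-in-parallel add tests to all 3 modules in src folder and add tests step to github actions
Phase 2: Task Analysis + Requirement Grouping
1. Task Identification:
- Task A: "Add tests to src/modules/auth.ts"
- Task B: "Add tests to src/modules/cart.ts"
- Task C: "Add tests to src/modules/payments.ts"
- Task D: "Add tests step to GitHub Actions CI pipeline"
2. Requirement Grouping:
- Tasks A, B, C: REPEATABLE - same task ("add tests") applied to 3 different modules
→ ONE shared meta-judge generates a reusable spec
- Task D: INDEPENDENT - different task type (CI configuration)
→ Separate meta-judge
3. Pre-existing and Expected Parallel Changes Assessment:
- Pre-existing (from previous batch): API documentation update touched
src/api/users.ts, src/api/orders.ts, src/api/products.ts
- Expected parallel: each agent should be aware that other agents in this batch are simultaneously adding tests to other modules and updating GH Actions
4. Agent Count:
- Meta-judges: 2 (1 repeatable group for tests + 1 independent for GH Actions)
- Implementation agents: 4 (one per task, always isolated)
- Judges: 4 (3 using the shared test spec + 1 for GH Actions)
- Total: 10 agents (vs. 12 without grouping)
[Meta-judge 1: Repeatable group - test generation]
Use Task tool: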
- description: "Meta-judge (repeatable): reusable spec for adding tests across 3 modules"
- prompt:
## Task
Generate a REUSABLE evaluation specification yaml that can be applied to
ANY of the following targets performing the same task. You will produce
rubrics, checklists, and scoring criteria that individual judge agents
will each use independently to evaluate one target's implementation artifact.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt as Context
add tests to all 3 modules in src folder and add tests step to github actions
## Task Being Repeated
Add comprehensive unit tests to a source module
## Targets in This Group
- src/modules/auth.ts
- src/modules/cart.ts
- src/modules/payments.ts
## Context
Project uses Jest for testing. Test files should be co-located as
*.test.ts files. Existing test patterns available in src/modules/__tests__/.
## Artifact Type
code
## Instructions
CRITICAL: You are generating a REUSABLE spec that will be applied to
EACH target independently by separate judges.
- Use generic language: "target file should align with criteria" instead
of "all files should align"
- Do NOT include file-specific requirements (e.g., NOT "auth.ts should
test only authentication logic") since this same spec will be applied
to different files
- The spec must be applicable to ANY target in this group without modification
- Each judge will receive this same spec and evaluate only its own target
against it
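User prompt is provided as context, you should use it only as reference
of changes that can occur in the project by other agents.
Return only the final evaluation specification YAML in your response.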
- model: opus
- subagent_type: "sadd:meta-judge"
[Meta-judge 2: Independent task - GitHub Actions]
Use Task tool:
- description: "Meta-judge: add tests step to GitHub Actions"
- prompt:
## Task
Generate an evaluation specification yaml for the following task applied
to a specific target. You will produce rubrics, checklists, and scoring
criteria that a judge agent will use to evaluate the implementation
artifact for this specific target.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
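## User Prompt as Context
add tests to all 3 modules in src folder and add tests step to github actions
## Target
Add a test execution step to the GitHub Actions CI pipeline
(.github/workflows/ci.yml or similar)
## Context
Project uses Jest for testing. The CI pipeline should run tests after
build step. Existing workflow file may need a new job or step.
## Artifact Type
configuration
## Instructions
User prompt is provided as context, you should use it only as reference
of changes that can occur in the project by other agents. Generate the
evaluation specification ONLY for adding a test step to GitHub Actions.
Your report will be used to verify only this specific task, not all
tasks in the user prompt.
Return only the final evaluation specification YAML in your response.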
- model: opus
- subagent_type: "sadd:meta-judge"
[Both meta-judges launched simultaneously]
[Implementation 1: auth module tests]
Use Task tool:
- description: "Parallel: add tests to src/modules/auth.ts"
- prompt:
## Reasoning Approach
Let's think step by step.
Before taking any action, think through the problem systematically:
1. "Let me first understand what is being asked for this specific target..."
2. "Let me analyze this specific target..."
3. "Let me plan my approach..."
Work through each step explicitly before implementing.
<task>Add comprehensive unit tests</task>
<target>src/modules/auth.ts</target>
<constraints>
- Work ONLY on the specified target
- Do NOT modify other files unless explicitly required
- Follow existing test patterns in the project
</constraints>
<output>
Create test file for the auth module.
CRITICAL: At the end of your work, provide a "Summary" section containing:
- Files modified (full paths)
- Key changes (3-5 bullet points)
- Any decisions made and rationale
</output>
## Self-Critique Verification (MANDATORY)
[standard self-critique suffix]
- model: sonnet
[Implementation 2: cart module tests]
Use Task tool:
- description: "Parallel: add tests to src/modules/cart.ts"
- prompt:
## Reasoning Approach
Let's think step by step.
Before taking any action, think through the problem systematically:
1. "Let me first understand what is being asked for this specific target..."
2. "Let me analyze this specific target..."
3. "Let me plan my approach..."
Work through each step explicitly before implementing.
<task>Add comprehensive unit tests</task>
<target>src/modules/cart.ts</target>
<constraints>
- Work ONLY on the specified target
- Do NOT modify other files unless explicitly required
- Follow existing test patterns in the project
</constraints>
<output>
Create test file for the cart module.
CRITICAL: At the end of your work, provide a "Summary" section containing:
- Files modified (full paths)
- Key changes (3-5 bullet points)
- Any decisions made and rationale
</output>
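## Self-Critique Verification (MANDATORY)
Before submitting, verify your work:
1. Re-read the original task and confirm every requirement is addressed
2. Check that all tests follow existing patterns in the project
3. Verify no unrelated files were modified
4. Confirm the Summary section is complete and accurate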
- model: sonnet
[Implementation 3: payments module tests]
Use Task tool:
- description: "Parallel: add tests to src/modules/payments.ts"
- prompt: [Same CoT prefix + task body for payments.ts + critique suffix]
- model: sonnet
[Implementation 4: GitHub Actions test step]
Use Task tool:
- description: "Parallel: add tests step to GitHub Actions CI"
- prompt:
## Reasoning Approach
Let's think step by step.
Before taking any action, think through the problem systematically:
1. "Let me first understand what is being asked for this specific target..."
2. "Let me analyze this specific target..."
3. "Let me plan my approach..."
Work through each step explicitly before implementing.
<task>Add a test execution step to the GitHub Actions CI pipeline</task>
<target>.github/workflows/ci.yml</target>
<constraints>
- Work ONLY on the CI workflow file
- Add a step that runs the test suite after the build step
- Do NOT modify other workflow files or steps beyond what is necessary
- Follow existing workflow patterns and conventions
</constraints>
<output>
Update the CI workflow with a test execution step.
CRITICAL: At the end of your work, provide a "Summary" section containing:
- Files modified (full paths)
- Key changes (3-5 bullet points)
- Any decisions made and rationale
</output>
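## Self-Critique Verification (MANDATORY)
Before submitting, verify your work:
1. Re-read the original task and confirm every requirement is addressed
2. Check that the workflow YAML is valid and well-structured
3. Verify no unrelated workflow steps were modified
4. Confirm the Summary section is complete and accurate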
- model: sonnet
[All 4 agents launched simultaneously]
[Judge 1: auth module - uses SHARED reusable spec from repeatable meta-judge]
Use Task tool:
- description: "Judge: src/modules/auth.ts"
- prompt:
You are evaluating an implementation artifact for target
src/modules/auth.ts against an evaluation specification produced
by the meta judge.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt
add tests to all 3 modules in src folder and add tests step to github actions
## Target
src/modules/auth.ts
## Pre-existing and expected parallel changes (Context Only)
The following changes were made before or expected to be done by
other parallel agents in the same batch now. They are NOT part of
the current implementation agent's output. Focus your evaluation
on the current agent's changes to its specific target. Only verify
other changed files/logic if they directly relate to the current
target's task requirements.
### Previous do-in-parallel: "Update API documentation for all endpoints"
The following files were modified as part of a previous parallel batch:
- src/api/users.ts (modified) - Added JSDoc to public methods,
updated module header
- src/api/orders.ts (modified) - Added JSDoc to public methods,
added @example tags
- src/api/products.ts (modified) - Added JSDoc to public methods,
updated type annotations
### Expected parallel changes (current batch)
Other agents in this batch are simultaneously:
- Adding tests to src/modules/cart.ts and src/modules/payments.ts
(repeatable group — same task on other modules)
- Adding a tests step to .github/workflows/ci.yml (independent task)
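## Evaluation Specification
```yaml
{EXACT reusable spec YAML from repeatable meta-judge - same for all 3 module judges}
```
## Implementation Output
{Summary from auth implementation agent}
## Instructions
User prompt is provided as context, you should use it only as reference
of changes that can occur in the project by other agents. Evaluate ONLY
the test generation for auth.ts.
Follow your full judge process as defined in your agent instructions!
## Output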
CRITICAL: You must reply with this exact structured evaluation report
format in YAML at the START of your response!
- model: opus
- subagent_type: "sadd:judge"
[Judge 2: cart module - uses SAME shared reusable spec]
Use Task tool:
- description: "Judge: src/modules/cart.ts"
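- prompt: [Same judge template, same reusable spec YAML, cart implementation output.
Pre-existing and expected parallel changes section: same prior batch info,
expected parallel changes list auth.ts, payments.ts, and GH Actions instead]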
- model: opus
- subagent_type: "sadd:judge"
[Judge 3: payments module - uses SAME shared reusable spec]
Use Task tool:
- description: "Judge: src/modules/payments.ts"
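- prompt: [Same judge template, same reusable spec YAML, payments implementation output.
Pre-existing and expected parallel changes section: same prior batch info,
expected parallel changes list auth.ts, cart.ts, and GH Actions instead]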
- model: opus
- subagent_type: "sadd:judge"
[Judge 4: GitHub Actions - uses INDEPENDENT spec from GH Actions meta-judge]
Use Task tool:
- description: "Judge: GitHub Actions CI"
- prompt:
You are evaluating an implementation artifact for target
.github/workflows/ci.yml against an evaluation specification produced
by the meta judge.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt
add tests to all 3 modules in src folder and add tests step to github actions
## Target
.github/workflows/ci.yml
## Pre-existing and expected parallel changes (Context Only)
The following changes were made before or expected to be done by
other parallel agents in the same batch now. They are NOT part of
the current implementation agent's output. Focus your evaluation
on the current agent's changes to its specific target. Only verify
other changed files/logic if they directly relate to the current
target's task requirements.
### Previous do-in-parallel: "Update API documentation for all endpoints"
The following files were modified as part of a previous parallel batch:
- src/api/users.ts (modified) - Added JSDoc to public methods,
updated module header
- src/api/orders.ts (modified) - Added JSDoc to public methods,
added @example tags
- src/api/products.ts (modified) - Added JSDoc to public methods,
updated type annotations
### Expected parallel changes (current batch)
Other agents in this batch are simultaneously:
- Adding tests to src/modules/auth.ts, src/modules/cart.ts,
and src/modules/payments.ts (repeatable group — test generation)
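## Evaluation Specification
```yaml
{EXACT spec YAML from independent GH Actions meta-judge}
```
## Implementation Output
{Summary from GH Actions implementation agent}
## Instructions
User prompt is provided as context, you should use it only as reference
of changes that can occur in the project by other agents. Evaluate ONLY
the GitHub Actions test step.
Follow your full judge process as defined in your agent instructions!
## Output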
CRITICAL: You must reply with this exact structured evaluation report
format in YAML at the START of your response!
- model: opus
- subagent_type: "sadd:judge"
[All 4 judges launched simultaneously]

| Target | Grouping | Model | Judge Score | Retries | Status |
|---|---|---|---|---|---|
| src/modules/auth.ts | Repeatable | sonnet | 4.2/5.0 | 0 | SUCCESS |
| src/modules/cart.ts | Repeatable | sonnet | 4.0/5.0 | 0 | SUCCESS |
| src/modules/payments.ts | Repeatable | sonnet | 4.1/5.0 | 0 | SUCCESS |
| .github/workflows/ci.yml | Independent | sonnet | 4.3/5.0 | 0 | SUCCESS |
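The agent-count arithmetic in the Phase 2 analysis follows a simple rule: one meta-judge per group, one implementation agent per task, and one judge per task unless the group is shared, in which case the whole group gets a single judge. A rough sketch of that rule is below; the `Group` type and both function names are hypothetical, introduced only to illustrate the counting, and are not part of the plugin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    """Hypothetical description of one requirement group."""
    kind: str   # "independent", "repeatable", or "shared"
    tasks: int  # number of tasks (targets) in the group


def count_agents(groups):
    """Total agents when tasks are grouped as described in Phase 2."""
    meta_judges = len(groups)                       # one spec per group
    implementers = sum(g.tasks for g in groups)     # always one per task
    judges = sum(1 if g.kind == "shared" else g.tasks for g in groups)
    return meta_judges + implementers + judges


def count_agents_ungrouped(groups):
    """Total agents if every task had its own meta-judge, implementer, and judge."""
    return 3 * sum(g.tasks for g in groups)


# Test-generation example: 3 repeatable module tasks + 1 independent CI task.
tests_example = [Group("repeatable", 3), Group("independent", 1)]
print(count_agents(tests_example), count_agents_ungrouped(tests_example))  # 10 12
```

Under the same assumptions, a batch with one shared group of 2 tasks and one repeatable group of 3 tasks comes to 11 agents, versus 15 without grouping.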
/do-in-parallel I wrote class interface for S3 service in s3.adapter.ts, please do 2 tasks: implement s3 adapter with tests and integrate s3 adapter to analytics module. Also refactor and simplify all files in cart module
Phase 2: Task Analysis + Requirement Grouping
1. Task Identification:
- Task A: "Implement S3 adapter with tests in src/adapters/s3.adapter.ts"
- Task B: "Integrate S3 adapter into src/modules/analytics.module.ts"
- Task C: "Refactor and simplify src/modules/cart/cart.service.ts"
- Task D: "Refactor and simplify src/modules/cart/cart.repository.ts"
- Task E: "Refactor and simplify src/modules/cart/cart.controller.ts"
2. Requirement Grouping:
- Tasks A, B: SHARED — interdependent (adapter must match interface consumed
by analytics integration; should be reviewed together)
→ ONE combined meta-judge, ONE shared judge
- Tasks C, D, E: REPEATABLE — same task ("refactor and simplify") applied
to 3 different files in cart module
→ ONE reusable meta-judge
3. Pre-existing and Expected Parallel Changes Assessment:
- Pre-existing (user modifications): Refactored database connection layer
(src/db/connection.ts, src/db/queries.ts), updated service modules,
and added S3 class interface in src/adapters/s3.adapter.ts
- Expected parallel: S3 adapter implementation and analytics integration
run in parallel (shared group); cart refactoring agents run in parallel
(repeatable group); both groups run simultaneously
4. Agent Count:
- Meta-judges: 2 (1 shared for S3 work + 1 repeatable for cart refactoring)
- Implementation agents: 5 (one per task, always isolated)
- Judges: 4 (1 shared for S3 group + 3 individual for cart)
- Total: 11 agents (vs. 15 without grouping)
[Meta-judge 1: Shared group - S3 adapter + integration]
Use Task tool:
- description: "Meta-judge (shared): combined spec for S3 adapter and analytics integration"
- prompt:
## Task
Generate a COMBINED evaluation specification yaml that covers ALL of the
following related tasks. These tasks are interdependent and will be
reviewed TOGETHER by a single judge. You will produce rubrics, checklists,
and scoring criteria that account for cross-task dependencies and
integration points.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt as Context
I wrote class interface for S3 service in s3.adapter.ts, please do 2 tasks:
implement s3 adapter with tests and integrate s3 adapter to analytics module.
Also refactor and simplify all files in cart module
## Tasks in This Shared Group
- Task A: Implement S3 adapter with tests -> src/adapters/s3.adapter.ts
- Task B: Integrate S3 adapter into analytics module -> src/modules/analytics.module.ts
## Context
The user has already written the class interface in s3.adapter.ts. Task A
implements the interface methods and adds unit tests. Task B integrates the
adapter into the analytics module. The adapter's public API from Task A must
match what Task B consumes.
## Artifact Type
code
## Instructions
CRITICAL: You are generating a COMBINED spec for tasks that will be
reviewed TOGETHER by ONE judge.
- Include evaluation criteria for EACH individual task
- Include cross-task verification criteria (e.g., "S3 adapter's public
methods match the calls made by the analytics integration")
- Organize the spec so the judge can identify which criteria apply to
which task's changes
- The judge will review ALL changes from ALL tasks in this group in a
single evaluation
User prompt is provided as context, you should use it only as reference
of changes that can occur in the project by other agents.
Return only the final evaluation specification YAML in your response.
- model: opus
- subagent_type: "sadd:meta-judge"
[Meta-judge 2: Repeatable group — cart refactoring]
Use Task tool:
- description: "Meta-judge (repeatable): reusable spec for refactoring cart module files"
- prompt:
## Task
Generate a REUSABLE evaluation specification yaml that can be applied to
ANY of the following targets performing the same task. You will produce
rubrics, checklists, and scoring criteria that individual judge agents
will each use independently to evaluate one target's implementation artifact.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt as Context
I wrote class interface for S3 service in s3.adapter.ts, please do 2 tasks:
implement s3 adapter with tests and integrate s3 adapter to analytics module.
Also refactor and simplify all files in cart module
## Task Being Repeated
Refactor and simplify a source file in the cart module
## Targets in This Group
- src/modules/cart/cart.service.ts
- src/modules/cart/cart.repository.ts
- src/modules/cart/cart.controller.ts
## Context
All three files are in the cart module. Refactoring should simplify logic,
reduce complexity, improve readability while preserving existing behavior.
## Artifact Type
code
## Instructions
CRITICAL: You are generating a REUSABLE spec that will be applied to
EACH target independently by separate judges.
- Use generic language: "target file should align with criteria" instead
of "all files should align"
- Do NOT include file-specific requirements since this same spec will be
applied to different files
- The spec must be applicable to ANY target in this group without modification
User prompt is provided as context, you should use it only as reference
of changes that can occur in the project by other agents.
Return only the final evaluation specification YAML in your response.
- model: opus
- subagent_type: "sadd:meta-judge"
[Both meta-judges launched simultaneously]
[Implementation 1: S3 adapter]
Use Task tool:
- description: "Parallel: implement S3 adapter with tests"
- prompt:
## Reasoning Approach
Let's think step by step.
Before taking any action, think through the problem systematically:
1. "Let me first understand what is being asked for this specific target..."
2. "Let me analyze this specific target..."
3. "Let me plan my approach..."
Work through each step explicitly before implementing.
<task>Implement S3 adapter with tests based on the existing class interface</task>
<target>src/adapters/s3.adapter.ts</target>
<constraints>
- Work ONLY on the specified target
- Implement all methods defined in the existing class interface
- Add comprehensive unit tests
- Do NOT modify the analytics module
</constraints>
<output>
Implement the S3 adapter and create tests.
CRITICAL: At the end of your work, provide a "Summary" section containing:
- Files modified (full paths)
- Key changes (3-5 bullet points)
- Any decisions made and rationale
</output>
## Self-Critique Verification (MANDATORY)
Before submitting, verify your work:
1. Re-read the original task and confirm every requirement is addressed
2. Check that the adapter implements all interface methods correctly
3. Verify no unrelated files were modified
4. Confirm the Summary section is complete and accurate
- model: opus
[Implementation 2: Analytics integration]
Use Task tool:
- description: "Parallel: integrate S3 adapter into analytics module"
- prompt:
## Reasoning Approach
[standard CoT prefix]
<task>Integrate S3 adapter into the analytics module</task>
<target>src/modules/analytics.module.ts</target>
<constraints>
- Work ONLY on the specified target
- Import and use the S3 adapter from src/adapters/s3.adapter.ts
- Follow existing dependency injection patterns
- Do NOT modify the S3 adapter itself
</constraints>
<output>
Integrate S3 adapter into analytics module.
CRITICAL: At the end of your work, provide a "Summary" section.
</output>
## Self-Critique Verification (MANDATORY)
[standard self-critique suffix]
- model: opus
[Implementation 3: cart.service.ts refactoring]
Use Task tool:
- description: "Parallel: refactor src/modules/cart/cart.service.ts"
- prompt:
## Reasoning Approach
Let's think step by step.
Before taking any action, think through the problem systematically:
1. "Let me first understand what is being asked for this specific target..."
2. "Let me analyze this specific target..."
3. "Let me plan my approach..."
Work through each step explicitly before implementing.
<task>Refactor and simplify the cart service</task>
<target>src/modules/cart/cart.service.ts</target>
<constraints>
- Work ONLY on the specified target
- Simplify logic, reduce complexity, improve readability
- Preserve existing behavior — no functional changes
- Do NOT modify other cart module files
</constraints>
<output>
Refactor the cart service file.
CRITICAL: At the end of your work, provide a "Summary" section containing:
- Files modified (full paths)
- Key changes (3-5 bullet points)
- Any decisions made and rationale
</output>
## Self-Critique Verification (MANDATORY)
Before submitting, verify your work:
1. Re-read the original task and confirm every requirement is addressed
2. Check that existing behavior is preserved after refactoring
3. Verify no unrelated files were modified
4. Confirm the Summary section is complete and accurate
- model: sonnet
[Implementation 4: cart.repository.ts refactoring]
Use Task tool:
- description: "Parallel: refactor src/modules/cart/cart.repository.ts"
- prompt: [Same CoT prefix + refactoring task body for cart.repository.ts + critique suffix]
- model: sonnet
[Implementation 5: cart.controller.ts refactoring]
Use Task tool:
- description: "Parallel: refactor src/modules/cart/cart.controller.ts"
- prompt: [Same CoT prefix + refactoring task body for cart.controller.ts + critique suffix]
- model: sonnet
[All 5 launched simultaneously]
[Judge 1: SHARED judge for S3 group - reviews both S3 adapter + analytics integration]
Use Task tool:
- description: "Judge (shared): S3 adapter implementation and analytics integration"
- prompt:
You are evaluating implementation artifacts for a group of related tasks
against a combined evaluation specification produced by the meta judge.
These tasks are interdependent and must be reviewed together.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt
I wrote class interface for S3 service in s3.adapter.ts, please do 2 tasks:
implement s3 adapter with tests and integrate s3 adapter to analytics module.
Also refactor and simplify all files in cart module
## Tasks in This Shared Group
- Task A: Implement S3 adapter with tests -> src/adapters/s3.adapter.ts
- Task B: Integrate S3 adapter into analytics module -> src/modules/analytics.module.ts
## Pre-existing and expected parallel changes (Context Only)
The following changes were made before or expected to be done by
other parallel agents in the same batch now. They are NOT part of
the current implementation agents' output for this shared group.
Focus your evaluation on the S3 group's changes. Only verify other
changed files/logic if they directly relate to these tasks.
### User modifications (before current task)
The user made changes to the following files/modules before this
task was started:
- src/db/connection.ts (modified) - Refactored database connection
pooling
- src/db/queries.ts (modified) - Updated query builder patterns
- src/adapters/s3.adapter.ts (created) - Added S3 class interface
(the interface that Task A implements)
- Several service modules updated to use new DB connection API
### Expected parallel changes (current batch)
Other agents in this batch are simultaneously:
- Refactoring src/modules/cart/cart.service.ts (repeatable group)
- Refactoring src/modules/cart/cart.repository.ts (repeatable group)
- Refactoring src/modules/cart/cart.controller.ts (repeatable group)
## Evaluation Specification
```yaml
{EXACT combined spec YAML from shared S3 meta-judge}
```
## Implementation Outputs
### Task: Implement S3 adapter with tests -> src/adapters/s3.adapter.ts
{Summary from S3 adapter implementation agent}
Files: src/adapters/s3.adapter.ts (modified), src/adapters/s3.adapter.test.ts (created)
### Task: Integrate S3 adapter into analytics -> src/modules/analytics.module.ts
{Summary from analytics integration agent}
Files: src/modules/analytics.module.ts (modified)
## Instructions
User prompt is provided as context, you should use it only as reference
of changes that can occur in the project by other agents. Evaluate ALL
tasks in this shared group together. Verify cross-task integration points
(e.g., does the adapter's public API match what the analytics module consumes?).
CRITICAL: For each task, indicate separately whether it PASSED or FAILED
so that only failing tasks can be retried.
Follow your full judge process as defined in your agent instructions!
## Output
CRITICAL: You must reply with this exact structured evaluation report
format in YAML at the START of your response! Include per-task verdicts.
- model: opus
- subagent_type: "sadd:judge"
[Judge 2: cart.service.ts — uses SHARED reusable spec from repeatable meta-judge]
Use Task tool:
- description: "Judge: src/modules/cart/cart.service.ts"
- prompt:
You are evaluating an implementation artifact for target
src/modules/cart/cart.service.ts against an evaluation specification
produced by the meta judge.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt
[original user prompt]
## Target
src/modules/cart/cart.service.ts
## Pre-existing and expected parallel changes (Context Only)
The following changes were made before or expected to be done by
other parallel agents in the same batch now. They are NOT part of
the current implementation agent's output. Focus your evaluation
on the current agent's changes to its specific target. Only verify
other changed files/logic if they directly relate to the current
target's task requirements.
### User modifications (before current task)
The user made changes to the following files/modules before this
task was started:
- src/db/connection.ts (modified) - Refactored database connection
pooling
- src/db/queries.ts (modified) - Updated query builder patterns
- src/adapters/s3.adapter.ts (created) - Added S3 class interface
- Several service modules updated to use new DB connection API
### Expected parallel changes (current batch)
Other agents in this batch are simultaneously:
- Implementing S3 adapter in src/adapters/s3.adapter.ts (shared group)
- Integrating S3 adapter into src/modules/analytics.module.ts (shared group)
- Refactoring src/modules/cart/cart.repository.ts (repeatable group)
- Refactoring src/modules/cart/cart.controller.ts (repeatable group)
## Evaluation Specification
```yaml
{EXACT reusable spec YAML from repeatable cart meta-judge — same for all 3 cart judges}
```
## Implementation Output
{Summary from cart.service.ts implementation agent}
## Instructions
User prompt is provided as context, you should use it only as reference
of changes that can occur in the project by other agents. Evaluate ONLY
the refactoring of cart.service.ts.
Follow your full judge process as defined in your agent instructions!
## Output
CRITICAL: You must reply with this exact structured evaluation report
format in YAML at the START of your response!
- model: opus
- subagent_type: "sadd:judge"
[Judge 3: cart.repository.ts — uses SAME shared reusable spec]
Use Task tool:
- description: "Judge: src/modules/cart/cart.repository.ts"
- prompt: [Same judge template, same reusable spec YAML, cart.repository implementation output.
Pre-existing and expected parallel changes section: same user modifications,
expected parallel changes list S3 group, cart.service.ts, and cart.controller.ts instead]
- model: opus
- subagent_type: "sadd:judge"
[Judge 4: cart.controller.ts — uses SAME shared reusable spec]
Use Task tool:
- description: "Judge: src/modules/cart/cart.controller.ts"
- prompt: [Same judge template, same reusable spec YAML, cart.controller implementation output.
Pre-existing and expected parallel changes section: same user modifications,
expected parallel changes list S3 group, cart.service.ts, and cart.repository.ts instead]
- model: opus
- subagent_type: "sadd:judge"
[All 4 judges launched simultaneously]
Shared Judge Verdict:
- Task A (S3 adapter): PASS, SCORE: 4.2/5.0
- Task B (analytics integration): FAIL, SCORE: 3.0/5.0
ISSUES: Analytics module imports wrong method name from S3 adapter
- CROSS-TASK ISSUES: Method signature mismatch between adapter and consumer
Retry Decision:
→ Task A PASSED — do NOT re-launch S3 adapter implementation agent
→ Task B FAILED — re-launch ONLY the analytics integration agent with feedback
→ After retry, re-launch shared judge to review ALL changes again

| Target | Grouping | Model | Judge Score | Retries | Status |
|---|---|---|---|---|---|
| src/adapters/s3.adapter.ts | Shared | opus | 4.2/5.0 | 0 | SUCCESS |
| src/modules/analytics.module.ts | Shared | opus | 4.1/5.0 | 1 | SUCCESS |
| src/modules/cart/cart.service.ts | Repeatable | sonnet | 4.0/5.0 | 0 | SUCCESS |
| src/modules/cart/cart.repository.ts | Repeatable | sonnet | 4.3/5.0 | 0 | SUCCESS |
| src/modules/cart/cart.controller.ts | Repeatable | sonnet | 4.1/5.0 | 0 | SUCCESS |
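The retry decision for a shared group (re-launch only the failing implementation agents, then re-run the shared judge over all changes) can be sketched as follows. This is an illustrative outline under assumed names: `plan_retry` and the verdict mapping are hypothetical and do not reflect the actual report format produced by the judge agent.

```python
def plan_retry(verdicts):
    """Decide what to re-launch from per-task verdicts of a shared judge.

    verdicts: mapping of task name -> "PASS" or "FAIL" (assumed shape).
    """
    failed = [task for task, verdict in verdicts.items() if verdict == "FAIL"]
    return {
        # Only failing tasks get a new implementation agent (with feedback).
        "relaunch_implementers": failed,
        # The shared judge reviews ALL changes again once any task is retried.
        "relaunch_shared_judge": bool(failed),
    }


plan = plan_retry({"S3 adapter": "PASS", "analytics integration": "FAIL"})
print(plan["relaunch_implementers"])  # ['analytics integration']
```

When every task passes, the list is empty and no judge re-run is scheduled, so the group finishes without retries.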
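/do-in-parallel I wrote class interface for S3 service in s3.adapter.ts, please do 2 tasks: implement s3 adapter with tests and integrate s3 adapter to analytics module. Also refactor and simplify all files in cart module
Phase 2: Task Analysis + Requirement Grouping
1. Task Identification:
- Task A: "Implement S3 adapter with tests in src/adapters/s3.adapter.ts"
- Task B: "Integrate S3 adapter into src/modules/analytics.module.ts"
- Task C: "Refactor and simplify src/modules/cart/cart.service.ts"
- Task D: "Refactor and simplify src/modules/cart/cart.repository.ts"
- Task E: "Refactor and simplify src/modules/cart/cart.controller.ts"
2. Requirement Grouping:
- Tasks A, B: SHARED - interdependent (adapter must match interface consumed by analytics integration; should be reviewed together)
→ ONE combined meta-judge, ONE shared judge
- Tasks C, D, E: REPEATABLE - same task ("refactor and simplify") applied to 3 different files in cart module
→ ONE reusable meta-judge
3. Pre-existing and Expected Parallel Changes Assessment:
- Pre-existing (user modifications): Refactored database connection layer
(src/db/connection.ts, src/db/queries.ts), updated service modules,
and added S3 class interface in src/adapters/s3.adapter.ts
- Expected parallel: S3 adapter implementation and analytics integration run in parallel (shared group); cart refactoring agents run in parallel
(repeatable group); both groups run simultaneously
4. Agent Count:
- Meta-judges: 2 (1 shared for S3 work + 1 repeatable for cart refactoring)
- Implementation agents: 5 (one per task, always isolated)
- Judges: 4 (1 shared for S3 group + 3 individual for cart)
- Total: 11 agents (vs. 15 without grouping)
[Meta-judge 1: Shared group - S3 adapter + integration]
Use Task tool: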
- description: "Meta-judge (shared): combined spec for S3 adapter and analytics integration"
- prompt:
## Task
Generate a COMBINED evaluation specification yaml that covers ALL of the
following related tasks. These tasks are interdependent and will be
reviewed TOGETHER by a single judge. You will produce rubrics, checklists,
and scoring criteria that account for cross-task dependencies and
integration points.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt as Context
I wrote class interface for S3 service in s3.adapter.ts, please do 2 tasks:
implement s3 adapter with tests and integrate s3 adapter to analytics module.
Also refactor and simplify all files in cart module
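## Tasks in This Shared Group
- Task A: Implement S3 adapter with tests -> src/adapters/s3.adapter.ts
- Task B: Integrate S3 adapter into analytics module -> src/modules/analytics.module.ts
## Context
The user has already written the class interface in s3.adapter.ts. Task A
implements the interface methods and adds unit tests. Task B integrates the
adapter into the analytics module. The adapter's public API from Task A must
match what Task B consumes.
## Artifact Type
code
## Instructions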
CRITICAL: You are generating a COMBINED spec for tasks that will be
reviewed TOGETHER by ONE judge.
- Include evaluation criteria for EACH individual task
- Include cross-task verification criteria (e.g., "S3 adapter's public
methods match the calls made by the analytics integration")
- Organize the spec so the judge can identify which criteria apply to
which task's changes
- The judge will review ALL changes from ALL tasks in this group in a
single evaluation
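User prompt is provided as context, you should use it only as reference
of changes that can occur in the project by other agents.
Return only the final evaluation specification YAML in your response.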
- model: opus
- subagent_type: "sadd:meta-judge"
[Meta-judge 2: Repeatable group - cart refactoring]
Use Task tool:
- description: "Meta-judge (repeatable): reusable spec for refactoring cart module files"
- prompt:
## Task
Generate a REUSABLE evaluation specification yaml that can be applied to
ANY of the following targets performing the same task. You will produce
rubrics, checklists, and scoring criteria that individual judge agents
will each use independently to evaluate one target's implementation artifact.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt as Context
I wrote class interface for S3 service in s3.adapter.ts, please do 2 tasks:
implement s3 adapter with tests and integrate s3 adapter to analytics module.
Also refactor and simplify all files in cart module
## Task Being Repeated
Refactor and simplify a source file in the cart module
## Targets in This Group
- src/modules/cart/cart.service.ts
- src/modules/cart/cart.repository.ts
- src/modules/cart/cart.controller.ts
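## Context
All three files are in the cart module. Refactoring should simplify logic,
reduce complexity, improve readability while preserving existing behavior.
## Artifact Type
code
## Instructions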
CRITICAL: You are generating a REUSABLE spec that will be applied to
EACH target independently by separate judges.
- Use generic language: "target file should align with criteria" instead
of "all files should align"
- Do NOT include file-specific requirements since this same spec will be
applied to different files
- The spec must be applicable to ANY target in this group without modification
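The user prompt is provided as context; use it only as a reference for changes that other agents may make in the project.
Return only the final evaluation specification YAML in your response.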
- model: opus
- subagent_type: "sadd:meta-judge"
[Both meta-judges launched simultaneously]
[Implementation 1: S3 adapter]
Use Task tool:
- description: "Parallel: implement S3 adapter with tests"
- prompt:
## Reasoning Approach
Let's think step by step.
Before taking any action, think through the problem systematically:
1. "Let me first understand what is being asked for this specific target..."
2. "Let me analyze this specific target..."
3. "Let me plan my approach..."
Work through each step explicitly before implementing.
<task>Implement S3 adapter with tests based on the existing class interface</task>
<target>src/adapters/s3.adapter.ts</target>
<constraints>
- Work ONLY on the specified target
- Implement all methods defined in the existing class interface
- Add comprehensive unit tests
- Do NOT modify the analytics module
</constraints>
<output>
Implement the S3 adapter and create tests.
CRITICAL: At the end of your work, provide a "Summary" section containing:
- Files modified (full paths)
- Key changes (3-5 bullet points)
- Any decisions made and rationale
</output>
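## Self-Critique Verification (MANDATORY)
Before submitting, verify your work:
1. Re-read the original task and confirm every requirement is addressed
2. Check that the adapter correctly implements all interface methods
3. Verify no unrelated files were modified
4. Confirm the Summary section is complete and accurate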
- model: opus
[Implementation 2: analytics integration]
Use Task tool:
- description: "Parallel: integrate S3 adapter into analytics module"
- prompt:
## Reasoning Approach
[standard CoT prefix]
<task>Integrate S3 adapter into the analytics module</task>
<target>src/modules/analytics.module.ts</target>
<constraints>
- Work ONLY on the specified target
- Import and use the S3 adapter from src/adapters/s3.adapter.ts
- Follow existing dependency injection patterns
- Do NOT modify the S3 adapter itself
</constraints>
<output>
Integrate S3 adapter into analytics module.
CRITICAL: At the end of your work, provide a "Summary" section.
</output>
## Self-Critique Verification (MANDATORY)
[standard self-critique suffix]
- model: opus
[Implementation 3: cart.service.ts refactoring]
Use Task tool:
- description: "Parallel: refactor src/modules/cart/cart.service.ts"
- prompt:
## Reasoning Approach
Let's think step by step.
Before taking any action, think through the problem systematically:
1. "Let me first understand what is being asked for this specific target..."
2. "Let me analyze this specific target..."
3. "Let me plan my approach..."
Work through each step explicitly before implementing.
<task>Refactor and simplify the cart service</task>
<target>src/modules/cart/cart.service.ts</target>
<constraints>
- Work ONLY on the specified target
- Simplify logic, reduce complexity, improve readability
- Preserve existing behavior — no functional changes
- Do NOT modify other cart module files
</constraints>
<output>
Refactor the cart service file.
CRITICAL: At the end of your work, provide a "Summary" section containing:
- Files modified (full paths)
- Key changes (3-5 bullet points)
- Any decisions made and rationale
</output>
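## Self-Critique Verification (MANDATORY)
Before submitting, verify your work:
1. Re-read the original task and confirm every requirement is addressed
2. Check that existing behavior is preserved after the refactoring
3. Verify no unrelated files were modified
4. Confirm the Summary section is complete and accurate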
- model: sonnet
[Implementation 4: cart.repository.ts refactoring]
Use Task tool:
- description: "Parallel: refactor src/modules/cart/cart.repository.ts"
- prompt: [same CoT prefix + refactoring task body for cart.repository.ts + self-critique suffix]
- model: sonnet
[Implementation 5: cart.controller.ts refactoring]
Use Task tool:
- description: "Parallel: refactor src/modules/cart/cart.controller.ts"
- prompt: [same CoT prefix + refactoring task body for cart.controller.ts + self-critique suffix]
- model: sonnet
[All 5 agents launched simultaneously]
[Judge 1: Shared judge for the S3 group - reviews S3 adapter + analytics integration together]
Use Task tool:
- description: "Judge (shared): S3 adapter implementation and analytics integration"
- prompt:
You are evaluating implementation artifacts for a group of related tasks
against a combined evaluation specification produced by the meta judge.
These tasks are interdependent and must be reviewed together.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt
I wrote class interface for S3 service in s3.adapter.ts, please do 2 tasks:
implement s3 adapter with tests and integrate s3 adapter to analytics module.
Also refactor and simplify all files in cart module
## Tasks in This Shared Group
- Task A: Implement S3 adapter with tests -> src/adapters/s3.adapter.ts
- Task B: Integrate S3 adapter into analytics module -> src/modules/analytics.module.ts
## Pre-existing and Expected Parallel Changes (Context Only)
The following changes were made before this task or are expected to be made by
other parallel agents in the current batch. They are NOT part of
the current implementation agents' output for this shared group.
Focus your evaluation on the S3 group's changes. Only verify other
changed files/logic if they directly relate to these tasks.
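### User Modifications (Before Current Task)
The user made changes to the following files/modules before this task started:
- src/db/connection.ts (modified) - Refactored database connection pool
- src/db/queries.ts (modified) - Updated query builder patterns
- src/adapters/s3.adapter.ts (created) - Added S3 class interface (the interface that Task A implements)
- Several service modules updated to use the new DB connection API
### Expected Parallel Changes (Current Batch)
Other agents in this batch are simultaneously:
- Refactoring src/modules/cart/cart.service.ts (repeatable group)
- Refactoring src/modules/cart/cart.repository.ts (repeatable group)
- Refactoring src/modules/cart/cart.controller.ts (repeatable group)
## Evaluation Specification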
```yaml
{EXACT combined spec YAML from shared S3 meta-judge}
```
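## Implementation Output
### Task: Implement S3 adapter with tests -> src/adapters/s3.adapter.ts
{Summary from S3 adapter implementation agent}
Files: src/adapters/s3.adapter.ts (modified), src/adapters/s3.adapter.test.ts (created)
### Task: Integrate S3 adapter into analytics -> src/modules/analytics.module.ts
{Summary from analytics integration implementation agent}
Files: src/modules/analytics.module.ts (modified)
## Instructions
The user prompt is provided as context; use it only as a reference for changes that other agents may make in the project. Evaluate all tasks in this shared group together. Verify cross-task integration points (e.g., does the adapter's public API match what the analytics module uses?).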
CRITICAL: For each task, indicate separately whether it PASSED or FAILED
so that only failing tasks can be retried.
Follow your full judge process as defined in your agent instructions!
## Output
CRITICAL: You must reply with this exact structured evaluation report
format in YAML at the START of your response! Include per-task verdicts.
- model: opus
- subagent_type: "sadd:judge"
[Judge 2: cart.service.ts - uses the shared reusable spec from the repeatable meta-judge]
Use Task tool:
- description: "Judge: src/modules/cart/cart.service.ts"
- prompt:
You are evaluating an implementation artifact for target
src/modules/cart/cart.service.ts against an evaluation specification
produced by the meta judge.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt
[original user prompt]
## Target
src/modules/cart/cart.service.ts
## Pre-existing and Expected Parallel Changes (Context Only)
The following changes were made before this task or are expected to be made by
other parallel agents in the current batch. They are NOT part of
the current implementation agent's output. Focus your evaluation
on the current agent's changes to its specific target. Only verify
other changed files/logic if they directly relate to the current
target's task requirements.
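### User Modifications (Before Current Task)
The user made changes to the following files/modules before this task started:
- src/db/connection.ts (modified) - Refactored database connection pool
- src/db/queries.ts (modified) - Updated query builder patterns
- src/adapters/s3.adapter.ts (created) - Added S3 class interface
- Several service modules updated to use the new DB connection API
### Expected Parallel Changes (Current Batch)
Other agents in this batch are simultaneously:
- Implementing the S3 adapter in src/adapters/s3.adapter.ts (shared group)
- Integrating the S3 adapter into src/modules/analytics.module.ts (shared group)
- Refactoring src/modules/cart/cart.repository.ts (repeatable group)
- Refactoring src/modules/cart/cart.controller.ts (repeatable group)
## Evaluation Specification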
```yaml
{EXACT reusable spec YAML from repeatable cart meta-judge, identical for all 3 cart judges}
```
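## Implementation Output
{Summary from cart.service.ts implementation agent}
## Instructions
The user prompt is provided as context; use it only as a reference for changes that other agents may make in the project. Evaluate ONLY the refactoring of cart.service.ts.
Follow your full judge process as defined in your agent instructions!
## Output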
CRITICAL: You must reply with this exact structured evaluation report
format in YAML at the START of your response!
- model: opus
- subagent_type: "sadd:judge"
[Judge 3: cart.repository.ts - uses the same shared reusable spec]
Use Task tool:
- description: "Judge: src/modules/cart/cart.repository.ts"
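- prompt: [Same judge template, same reusable spec YAML, cart.repository implementation output.
Pre-existing and Expected Parallel Changes section: same user modifications;
the expected parallel changes list becomes the S3 group, cart.service.ts, and cart.controller.ts]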
- model: opus
- subagent_type: "sadd:judge"
[Judge 4: cart.controller.ts - uses the same shared reusable spec]
Use Task tool:
- description: "Judge: src/modules/cart/cart.controller.ts"
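- prompt: [Same judge template, same reusable spec YAML, cart.controller implementation output.
Pre-existing and Expected Parallel Changes section: same user modifications;
the expected parallel changes list becomes the S3 group, cart.service.ts, and cart.repository.ts]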
- model: opus
- subagent_type: "sadd:judge"
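[All 4 judges launched simultaneously]
Shared judge verdict:
- Task A (S3 adapter): PASS, score: 4.2/5.0
- Task B (analytics integration): FAIL, score: 3.0/5.0
Issue: the analytics module imports the wrong method name from the S3 adapter
- Cross-task issue: method signature mismatch between the adapter and its consumer
Retry decision:
→ Task A passed: do NOT re-launch the S3 adapter implementation agent
→ Task B failed: re-launch ONLY the analytics integration agent, with feedback
→ After the retry, re-launch the shared judge to review ALL changes again
| Target | Grouping | Model | Judge Score | Retries | Status |
|---|---|---|---|---|---|
| src/adapters/s3.adapter.ts | Shared | opus | 4.2/5.0 | 0 | SUCCESS |
| src/modules/analytics.module.ts | Shared | opus | 4.1/5.0 | 1 | SUCCESS |
| src/modules/cart/cart.service.ts | Repeatable | sonnet | 4.0/5.0 | 0 | SUCCESS |
| src/modules/cart/cart.repository.ts | Repeatable | sonnet | 4.3/5.0 | 0 | SUCCESS |
| src/modules/cart/cart.controller.ts | Repeatable | sonnet | 4.1/5.0 | 0 | SUCCESS |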
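The retry decision for a shared group can be sketched in a few lines of Python. This is only an illustration: `TaskVerdict` and `plan_retry` are hypothetical names, and the verdict shape is an assumption rather than the judge's actual report format. It shows that only failing tasks are re-launched, while the shared judge re-reviews the whole group afterwards.

```python
from dataclasses import dataclass


@dataclass
class TaskVerdict:
    """Per-task result from a shared judge (hypothetical shape)."""
    task: str
    passed: bool
    score: float
    issues: str = ""


def plan_retry(verdicts: list[TaskVerdict]) -> dict:
    """Decide which implementation agents to re-launch after a shared review."""
    failed = [v for v in verdicts if not v.passed]
    return {
        # Only failing tasks get a new implementation agent, with feedback attached.
        "relaunch": {v.task: v.issues for v in failed},
        # Passing tasks are left untouched.
        "keep": [v.task for v in verdicts if v.passed],
        # The shared judge reviews the whole group again if anything was retried.
        "rerun_shared_judge": bool(failed),
    }


verdicts = [
    TaskVerdict("Task A (S3 adapter)", True, 4.2),
    TaskVerdict("Task B (analytics integration)", False, 3.0,
                "imports the wrong method name from the S3 adapter"),
]
plan = plan_retry(verdicts)
print(plan["relaunch"])
print(plan["rerun_shared_judge"])
```

Keeping the passing task out of `relaunch` is what avoids redoing work that the judge already accepted.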
/do-in-parallel write tests for loan.service.ts, add password recovery feature to auth module and enable caching during dependency loading in github actions.
Phase 2: Task Analysis + Requirement Grouping
1. Task Identification:
- Task A: "Write tests for src/services/loan.service.ts"
- Task B: "Add password recovery feature to src/modules/auth/"
- Task C: "Enable caching during dependency loading in .github/workflows/ci.yml"
2. Requirement Grouping:
- Task A: INDEPENDENT — test generation for a specific service
- Task B: INDEPENDENT — new feature in auth module (unrelated to tasks A and C)
- Task C: INDEPENDENT — CI configuration change (unrelated to tasks A and B)
- No grouping possible: all 3 tasks are different task types on different targets
3. Agent Count:
- Meta-judges: 3 (one per task — standard flow)
- Implementation agents: 3 (one per task)
- Judges: 3 (one per task)
- Total: 9 agents (no reduction possible)
[Meta-judge 1: Independent - loan service tests]
Use Task tool:
- description: "Meta-judge: write tests for loan.service.ts"
- prompt:
## Task
Generate an evaluation specification yaml for the following task applied
to a specific target. You will produce rubrics, checklists, and scoring
criteria that a judge agent will use to evaluate the implementation
artifact for this specific target.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt as Context
write tests for loan.service.ts, add password recovery feature to auth
module and enable caching during dependency loading in github actions.
## Target
Write comprehensive unit tests for src/services/loan.service.ts
## Context
Project uses Jest. Tests should cover all public methods, edge cases,
and error scenarios for the loan service.
## Artifact Type
code
## Instructions
The user prompt is provided as context; use it only as a reference for
changes that other agents may make in the project. Generate the
evaluation specification ONLY for the loan service test generation.
Your report will be used to verify only this particular task, not all
the tasks in the user prompt.
Return only the final evaluation specification YAML in your response.
- model: opus
- subagent_type: "sadd:meta-judge"
[Meta-judge 2: Independent — password recovery feature]
Use Task tool:
- description: "Meta-judge: add password recovery to auth module"
- prompt:
## Task
Generate an evaluation specification yaml for the following task applied
to a specific target. You will produce rubrics, checklists, and scoring
criteria that a judge agent will use to evaluate the implementation
artifact for this specific target.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt as Context
write tests for loan.service.ts, add password recovery feature to auth
module and enable caching during dependency loading in github actions.
## Target
Add password recovery feature to src/modules/auth/ (password reset flow:
request, token generation, validation, password update)
## Context
Auth module handles authentication. Password recovery requires new
endpoints, email integration, token management.
## Artifact Type
code
## Instructions
The user prompt is provided as context; use it only as a reference for
changes that other agents may make in the project. Generate the
evaluation specification ONLY for the password recovery feature.
Your report will be used to verify only this particular task, not all
the tasks in the user prompt.
Return only the final evaluation specification YAML in your response.
- model: opus
- subagent_type: "sadd:meta-judge"
[Meta-judge 3: Independent — GH Actions caching]
Use Task tool:
- description: "Meta-judge: enable dependency caching in GitHub Actions"
- prompt:
## Task
Generate an evaluation specification yaml for the following task applied
to a specific target. You will produce rubrics, checklists, and scoring
criteria that a judge agent will use to evaluate the implementation
artifact for this specific target.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt as Context
write tests for loan.service.ts, add password recovery feature to auth
module and enable caching during dependency loading in github actions.
## Target
Enable caching during dependency loading in .github/workflows/ci.yml
(e.g., npm/yarn cache, actions/cache)
## Context
GitHub Actions CI pipeline. Dependency installation step should use
caching to speed up builds.
## Artifact Type
configuration
## Instructions
The user prompt is provided as context; use it only as a reference for
changes that other agents may make in the project. Generate the
evaluation specification ONLY for enabling dependency caching in GH Actions.
Your report will be used to verify only this particular task, not all
the tasks in the user prompt.
Return only the final evaluation specification YAML in your response.
- model: opus
- subagent_type: "sadd:meta-judge"
[All 3 meta-judges launched simultaneously]
[Implementation 1: loan service tests]
Use Task tool:
- description: "Parallel: write tests for loan.service.ts"
- prompt:
## Reasoning Approach
Let's think step by step.
Before taking any action, think through the problem systematically:
1. "Let me first understand what is being asked for this specific target..."
2. "Let me analyze this specific target..."
3. "Let me plan my approach..."
Work through each step explicitly before implementing.
<task>Write comprehensive unit tests for the loan service</task>
<target>src/services/loan.service.ts</target>
<constraints>
- Work ONLY on the specified target
- Create test file co-located with the service
- Cover all public methods, edge cases, and error scenarios
- Follow existing test patterns in the project
</constraints>
<output>
Create test file for the loan service.
CRITICAL: At the end of your work, provide a "Summary" section containing:
- Files modified (full paths)
- Key changes (3-5 bullet points)
- Any decisions made and rationale
</output>
## Self-Critique Verification (MANDATORY)
Before submitting, verify your work:
1. Re-read the original task and confirm every requirement is addressed
2. Check that all tests follow existing patterns in the project
3. Verify no unrelated files were modified
4. Confirm the Summary section is complete and accurate
- model: sonnet
[Implementation 2: password recovery]
Use Task tool:
- description: "Parallel: add password recovery feature to auth module"
- prompt:
## Reasoning Approach
[standard CoT prefix]
<task>Add password recovery feature to the auth module</task>
<target>src/modules/auth/</target>
<constraints>
- Work ONLY on the auth module
- Implement password reset request, token generation, validation,
and password update
- Follow existing auth module patterns
- Do NOT modify unrelated modules
</constraints>
<output>
Implement password recovery feature.
CRITICAL: At the end of your work, provide a "Summary" section.
</output>
## Self-Critique Verification (MANDATORY)
[standard self-critique suffix]
- model: opus
[Implementation 3: GH Actions caching]
Use Task tool:
- description: "Parallel: enable dependency caching in GitHub Actions"
- prompt:
## Reasoning Approach
[standard CoT prefix]
<task>Enable caching during dependency loading in CI pipeline</task>
<target>.github/workflows/ci.yml</target>
<constraints>
- Work ONLY on the CI workflow file
- Add dependency caching (npm/yarn cache or actions/cache)
- Do NOT modify other workflow steps beyond what is necessary
</constraints>
<output>
Update CI workflow with dependency caching.
CRITICAL: At the end of your work, provide a "Summary" section.
</output>
## Self-Critique Verification (MANDATORY)
[standard self-critique suffix]
- model: sonnet
[All 3 launched simultaneously]
[Judge 1: loan service tests - independent spec]
Use Task tool:
- description: "Judge: loan.service.ts tests"
- prompt:
You are evaluating an implementation artifact for target
src/services/loan.service.ts against an evaluation specification
produced by the meta judge.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt
write tests for loan.service.ts, add password recovery feature to auth
module and enable caching during dependency loading in github actions.
## Target
src/services/loan.service.ts
## Evaluation Specification
```yaml
{EXACT spec YAML from loan service meta-judge}
```
## Implementation Output
{Summary from loan service test implementation agent}
## Instructions
The user prompt is provided as context; use it only as a reference for
changes that other agents may make in the project. Evaluate ONLY
the test generation for loan.service.ts.
Follow your full judge process as defined in your agent instructions!
## Output
CRITICAL: You must reply with this exact structured evaluation report
format in YAML at the START of your response!
- model: opus
- subagent_type: "sadd:judge"
[Judge 2: password recovery — independent spec]
Use Task tool:
- description: "Judge: auth password recovery"
- prompt:
You are evaluating an implementation artifact for target
src/modules/auth/ against an evaluation specification produced
by the meta judge.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt
[original user prompt]
## Target
src/modules/auth/ (password recovery feature)
## Evaluation Specification
```yaml
{EXACT spec YAML from password recovery meta-judge}
```
## Implementation Output
{Summary from password recovery implementation agent}
## Instructions
The user prompt is provided as context; use it only as a reference for
changes that other agents may make in the project. Evaluate ONLY
the password recovery feature.
Follow your full judge process as defined in your agent instructions!
## Output
CRITICAL: You must reply with this exact structured evaluation report
format in YAML at the START of your response!
- model: opus
- subagent_type: "sadd:judge"
[Judge 3: GH Actions caching — independent spec]
Use Task tool:
- description: "Judge: GitHub Actions dependency caching"
- prompt:
You are evaluating an implementation artifact for target
.github/workflows/ci.yml against an evaluation specification produced
by the meta judge.
CLAUDE_PLUGIN_ROOT={CLAUDE_PLUGIN_ROOT}
## User Prompt
[original user prompt]
## Target
.github/workflows/ci.yml (dependency caching)
## Evaluation Specification
```yaml
{EXACT spec YAML from GH Actions caching meta-judge}
```
## Implementation Output
{Summary from GH Actions caching implementation agent}
## Instructions
The user prompt is provided as context; use it only as a reference for
changes that other agents may make in the project. Evaluate ONLY
the dependency caching in GitHub Actions.
Follow your full judge process as defined in your agent instructions!
## Output
CRITICAL: You must reply with this exact structured evaluation report
format in YAML at the START of your response!
- model: opus
- subagent_type: "sadd:judge"
[All 3 judges launched simultaneously]
| Target | Grouping | Model | Judge Score | Retries | Status |
|---|---|---|---|---|---|
| src/services/loan.service.ts | Independent | sonnet | 4.1/5.0 | 0 | SUCCESS |
| src/modules/auth/ | Independent | opus | 4.3/5.0 | 0 | SUCCESS |
| .github/workflows/ci.yml | Independent | sonnet | 4.0/5.0 | 0 | SUCCESS |
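The agent counts in the examples follow from the grouping, which a small Python sketch can make explicit. `count_agents` is a hypothetical helper, and its rule (one meta-judge per group, one implementation agent per target, one judge per shared group and one per target otherwise) is inferred from the two worked examples rather than stated as a formula in the source.

```python
def count_agents(groups):
    """Count agents for a batch.

    groups is a list of (kind, target_count) pairs, where kind is
    "independent", "shared", or "repeatable" (labels taken from the
    Grouping column; the function itself is hypothetical).
    """
    meta_judges = implementers = judges = 0
    for kind, targets in groups:
        if kind not in ("independent", "shared", "repeatable"):
            raise ValueError(f"unknown grouping: {kind}")
        meta_judges += 1          # one evaluation spec per group
        implementers += targets   # one implementation agent per target
        # A shared group is reviewed by one judge; otherwise one judge per target.
        judges += 1 if kind == "shared" else targets
    return {
        "meta_judges": meta_judges,
        "implementers": implementers,
        "judges": judges,
        "total": meta_judges + implementers + judges,
    }


# Three independent tasks (loan tests, password recovery, CI caching).
independent = count_agents([("independent", 1)] * 3)
# One shared group of 2 (S3 adapter + analytics) and one repeatable group of 3 (cart files).
grouped = count_agents([("shared", 2), ("repeatable", 3)])
print(independent["total"], grouped["total"])
```

Under these assumptions the independent batch needs 9 agents for 3 targets, while the grouped batch needs 11 agents for 5 targets, because grouping shares meta-judges and, for the shared group, the judge.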
| Scenario | Model | Reason |
|---|---|---|
| Security analysis | Opus | Critical reasoning required |
| Architecture decisions | Opus | Quality over speed |
| Simple refactoring | Haiku | Fast, sufficient |
| Documentation generation | Haiku | Mechanical task |
| Code review per file | Sonnet | Balanced capability |
| Test generation | Sonnet | Extensive but patterned |
| Implementation Model | Judge Model | Rationale |
|---|---|---|
| Opus | Opus | Critical work needs strong verification |
| Sonnet | Opus | Tailored evaluation requires strong reasoning |
| Haiku | Opus | Verify simple work with strong evaluation |
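The two model tables can be read as a simple lookup, sketched below in Python. The names `IMPLEMENTATION_MODEL`, `JUDGE_MODEL`, and `pick_models` are hypothetical, and the `fallback` value for scenarios not listed in the table is an assumption, not something the tables specify.

```python
# Implementation model per scenario, copied from the scenario table.
IMPLEMENTATION_MODEL = {
    "security analysis": "opus",
    "architecture decisions": "opus",
    "simple refactoring": "haiku",
    "documentation generation": "haiku",
    "code review per file": "sonnet",
    "test generation": "sonnet",
}

# Every row of the judge table pairs the implementation model with an Opus judge.
JUDGE_MODEL = "opus"


def pick_models(scenario: str, fallback: str = "sonnet") -> tuple[str, str]:
    """Return (implementation_model, judge_model) for a scenario.

    The fallback for unlisted scenarios is an assumption, not from the tables.
    """
    impl = IMPLEMENTATION_MODEL.get(scenario.strip().lower(), fallback)
    return impl, JUDGE_MODEL


print(pick_models("Security analysis"))   # ('opus', 'opus')
print(pick_models("Simple refactoring"))  # ('haiku', 'opus')
```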
| Failure Type | Description | Recovery Action |
|---|---|---|
| Recoverable | Judge found issues, retry available | Retry with judge feedback (max 3 per target) |
| Approach Failure | The approach for this target is wrong | Escalate to user with options |
| Foundation Issue | Requirements unclear or impossible | Escalate to user for clarification |
| Max Retries Exceeded | Target failed after 3 retries | Mark failed, continue other targets, report at end |
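The failure table maps onto a small decision function. The Python sketch below is illustrative only: the `Failure` enum, `next_action`, and the returned action strings are hypothetical names, while the limit of 3 retries per target comes from the table.

```python
from enum import Enum

MAX_RETRIES = 3  # "max 3 per target" in the failure table


class Failure(Enum):
    RECOVERABLE = "recoverable"   # judge found issues, retry available
    APPROACH = "approach"         # the approach for this target is wrong
    FOUNDATION = "foundation"     # requirements unclear or impossible


def next_action(failure: Failure, retries_used: int) -> str:
    """Pick the recovery action for one target (action names are hypothetical)."""
    if failure is Failure.APPROACH:
        return "escalate_to_user_with_options"
    if failure is Failure.FOUNDATION:
        return "escalate_to_user_for_clarification"
    if retries_used >= MAX_RETRIES:
        # Retries exhausted: mark failed, keep processing other targets, report at end.
        return "mark_failed_and_continue"
    return "retry_with_judge_feedback"


print(next_action(Failure.RECOVERABLE, 1))
print(next_action(Failure.RECOVERABLE, 3))
```

Checking the two escalation cases first means a wrong approach or unclear requirements never consumes retries.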