MassGen
Invoke MassGen for multi-agent iteration on any task — general-purpose work, evaluation, planning, or spec writing. Multiple AI agents independently work on the problem and converge on the strongest result through MassGen's checklist-gated voting system.
When to Use
General (default) — get multi-agent results on any task:
- When you want multiple AI agents to independently tackle a problem
- When the task doesn't fit neatly into evaluate, plan, or spec
- Writing, research, code generation, design, analysis, or any open-ended task
Evaluate — get diverse, critical feedback on existing work:
- After iterating and stalling — need outside perspective
- Before submitting PRs or delivering artifacts
- When wanting diverse AI perspectives on implementation quality
Plan — create or refine a structured project plan:
- When starting a complex feature or project that needs task decomposition
- When an existing plan has gaps, is too vague, or needs restructuring
- When you need a valid task DAG with verification criteria
Spec — create or refine a requirements specification:
- When starting a feature that needs precise requirements before implementation
- When an existing spec has ambiguities, missing edge cases, or gaps
- When you need EARS-formatted requirements with acceptance criteria
Mode Selection
| Mode | Purpose | Input | Output | Default Criteria Preset |
|---|---|---|---|---|
| general | Any task | Task description + context | Winner's deliverables in | Auto-generated |
| evaluate | Critique existing work | Artifacts to evaluate | | |
| plan | Create or refine a plan | Goal + constraints (+ existing plan) | | |
| spec | Create or refine a spec | Problem + needs (+ existing spec) | | |
FIRST: Confirm Config (do this before anything else)
Always ask the user which config to use. The config controls which models
run and how many agents are spawned — this directly affects quality and cost.
Never silently pick a config. The user must confirm the choice every time.
Step A: Check what the user already specified
Scan the user's message for any config signal before searching:
| Signal in message | What to do |
|---|---|
| Explicit file path | Go to Step D (verify it exists, then confirm) |
| Provider/model name (e.g. "use Claude", "GPT-4 agents", "Gemini") | Note the preference; use it to rank options in Step B |
| Named config (e.g. "the teams config", "my 3-agent setup") | Search for a match in Step B, confirm before using |
| "Same as last time" / "use recent" | Find the last-used config (see Step B), confirm before using |
| Nothing about config | Proceed to Step B |
Step B: Discover available configs and models
Run these checks and collect all found paths:

```bash
# Standard locations
ls .massgen/config.yaml 2>/dev/null && echo "PROJECT: .massgen/config.yaml"
ls ~/.config/massgen/config.yaml 2>/dev/null && echo "GLOBAL: ~/.config/massgen/config.yaml"

# Recently used (from past skill runs in this project)
ls -t .massgen/*/run_summary.json 2>/dev/null | head -5
```

If the user said "same as last time", check the most recent `run_summary.json` for
a `"config"` field; that is the last-used path.

If the user mentioned a provider or model name but you need to verify what's
available, run:

```bash
uv run massgen --list-backends
```

This prints all supported backends with their models, capabilities, and required
API keys, which is useful for matching a user's stated preference to a real backend name.

If no configs are found at all, do NOT create a YAML file yourself.
Instead, use the headless quickstart, which auto-detects available API keys
and generates a config without requiring a browser:

```bash
uv run massgen --quickstart --headless
```

This writes a config to `.massgen/config.yaml` and exits. If you need a specific
backend, add `--quickstart-agent backend=claude,model=claude-opus-4-6` (repeat
for multiple agents). Only fall back to `--web-quickstart` if the user
explicitly wants the browser-based setup wizard.
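When the calling agent scripts this step rather than shelling out, the standard-location check could look roughly like the sketch below. It is not part of MassGen: the function name `discover_configs` and the `project` / `global` labels are assumptions, and only the two locations named above are checked.

```python
from pathlib import Path


def discover_configs(project_root: Path, home: Path) -> list[tuple[str, Path]]:
    """Return (label, path) pairs for configs found in the standard locations.

    Hypothetical helper: subdirectories and parent directories are
    deliberately never searched.
    """
    candidates = [
        ("project", project_root / ".massgen" / "config.yaml"),
        ("global", home / ".config" / "massgen" / "config.yaml"),
    ]
    # Keep only the candidates that exist as regular files
    return [(label, path) for label, path in candidates if path.is_file()]
```

Calling `discover_configs(Path.cwd(), Path.home())` yields the list to present to the user in Step C; an empty list means the headless quickstart applies.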
Step C: Ask the user to confirm
Use AskUserQuestion to present the options. Format the question clearly:
I found these MassGen configs:
1. `.massgen/config.yaml` (project config)
2. `~/.config/massgen/config.yaml` (global config)

Which would you like to use? You can also paste a path to a different config, say "create new" to generate one, or tell me which provider/model you prefer.

Rules for presenting options:
- List every found config with its location label (project / global / path)
- If the user expressed a preference (provider name, agent count), note which option best matches and say why
- Always include "create new" as an option
- If only one config exists, still ask, but make it easy: "I found one config at `.massgen/config.yaml`. Use it, or would you prefer a different one?"
Step D: Resolve the user's answer
| User response | Resolution |
|---|---|
| Picks a number from the list | Use that config path |
| Pastes or types a file path | Verify it exists; if not, report error and stop |
| Describes preference (e.g. "the Claude one", "use Gemini") | Match to discovered list or run |
| Says "default" or presses enter with one option | Use the single discovered config |
| Says "create new" / "generate one" | Run |
| Specifies backend+model (e.g. "3 Claude agents") | Run headless quickstart with explicit |
Once resolved, pass the path via `--config <path>` in the `massgen_run.sh`
invocation (Step 4). If the user confirmed `.massgen/config.yaml` (the implicit
default), you may omit `--config`.

STOP here until you have a confirmed config. Do NOT proceed to Scope or
Workflow until the user has explicitly chosen a config. Do NOT write config
YAML files yourself; use the headless quickstart to generate them. Do NOT
search for configs in subdirectories, parent directories, or anywhere else
beyond the standard locations above.
Scope
Before starting, determine what the MassGen invocation covers. Focused invocations
produce far better results than unscoped "do everything" runs.
- General: the task to accomplish, relevant context, quality expectations
- Evaluate: which files/artifacts to evaluate, what to ignore, evaluation focus
- Plan: the goal/objective, constraints, what context to include
- Spec: the problem to specify, user needs, system boundaries
If the user doesn't specify scope, ask them.
Workflow
Step 0: Create Working Directory
Create a timestamped subdirectory so parallel invocations don't conflict:

```bash
MODE="general"  # or "evaluate", "plan", or "spec"
WORK_DIR=".massgen/$MODE/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$WORK_DIR"
```

All artifacts (context, criteria, prompt, output, logs) go in this directory.
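For agents that drive the workflow from Python, the same directory layout could be produced as sketched below. The helper name `make_work_dir` and the mode check are assumptions added for illustration; the path pattern mirrors the shell snippet.

```python
from datetime import datetime
from pathlib import Path

MODES = ("general", "evaluate", "plan", "spec")


def make_work_dir(mode: str, base: Path = Path(".massgen")) -> Path:
    """Create and return base/<mode>/<YYYYmmdd_HHMMSS> (hypothetical helper)."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    work_dir = base / mode / datetime.now().strftime("%Y%m%d_%H%M%S")
    # exist_ok=False raises FileExistsError if the timestamped directory
    # already exists, e.g. when two runs start within the same second
    work_dir.mkdir(parents=True, exist_ok=False)
    return work_dir
```

One design difference from `mkdir -p`: the sketch fails loudly on a same-second collision instead of silently sharing a directory between two runs.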
Step 1: Clarify & Write Context File
Read `references/<mode>/workflow.md` (relative to this skill) for the context
file template specific to your mode.

Write `$WORK_DIR/context.md` using the template from the workflow file.

Key principle for all modes: provide factual context that orients the MassGen
agents. Do NOT bias them with your opinions about quality; let them discover
issues independently. That's the whole point of multi-agent evaluation.
- General: describe the task, relevant context, quality expectations
- Evaluate: describe what was built, scope, git info, verification evidence
- Plan: describe the goal, constraints, existing context, success criteria
- Spec: describe the problem, user needs, system boundaries, constraints
Step 2: Generate Criteria
Each mode has a recommended criteria preset:
| Mode | Preset | How to Apply |
|---|---|---|
| general | Auto-generated | Omit |
| evaluate | Custom or | |
| plan | | |
| spec | | |
For general mode, criteria are auto-generated from the task content — omit
both flags unless you have specific quality axes to enforce.
To use custom criteria: read `references/criteria_guide.md` for the format
and writing guide, then write criteria JSON to `$WORK_DIR/criteria.json`.

If there's a specific focus area, weight your criteria toward that focus.
In Claude Code: use AskUserQuestion to ask the user for focus preference.
In Codex or non-interactive: default to general coverage.

```bash
cat > "$WORK_DIR/criteria.json" << 'EOF'
[
  {"text": "...", "category": "must"},
  {"text": "...", "category": "must"},
  {"text": "...", "category": "should"},
  {"text": "...", "category": "could"}
]
EOF
```
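Before launching a run it can help to sanity-check the criteria file. The sketch below validates the shape shown in the example (a JSON array of objects with `text` and `category`). Treating `must`, `should`, and `could` as the only valid categories is an assumption drawn from that example; `references/criteria_guide.md` remains the authority, and `validate_criteria` is a hypothetical name.

```python
import json

# Assumption: these are the only categories, as seen in the example above
ALLOWED_CATEGORIES = {"must", "should", "could"}


def validate_criteria(raw: str) -> list[dict]:
    """Parse criteria JSON and check each entry has text and a known category."""
    criteria = json.loads(raw)
    if not isinstance(criteria, list) or not criteria:
        raise ValueError("criteria must be a non-empty JSON array")
    for i, item in enumerate(criteria):
        if not isinstance(item, dict):
            raise ValueError(f"criterion {i} is not an object")
        text = item.get("text")
        if not isinstance(text, str) or not text.strip():
            raise ValueError(f"criterion {i} has no text")
        category = item.get("category")
        if not isinstance(category, str) or category not in ALLOWED_CATEGORIES:
            raise ValueError(f"criterion {i} has an unexpected category")
    return criteria
```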
Step 3: Construct the Prompt
- Read the prompt template from `references/<mode>/prompt_template.md` (relative to this skill)
- Read the context file you wrote in Step 1
- Replace `{{CONTEXT_FILE_CONTENT}}` with the context file contents
- Replace `{{CUSTOM_FOCUS}}` with the focus directive (or empty string if none)
- Write the final prompt to `$WORK_DIR/prompt.md`
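The substitution steps above can be sketched as plain string replacement. The function names are hypothetical; the two placeholders and the file locations are the ones named in the list.

```python
from pathlib import Path


def build_prompt(template: str, context: str, focus: str = "") -> str:
    """Fill the two placeholders used by the prompt templates."""
    # Fill the focus first so placeholder-like text inside the context
    # file is copied through verbatim instead of being substituted.
    prompt = template.replace("{{CUSTOM_FOCUS}}", focus)
    return prompt.replace("{{CONTEXT_FILE_CONTENT}}", context)


def write_prompt(skill_dir: Path, work_dir: Path, mode: str, focus: str = "") -> Path:
    """Read the template and context, then write work_dir/prompt.md."""
    template = (skill_dir / "references" / mode / "prompt_template.md").read_text(encoding="utf-8")
    context = (work_dir / "context.md").read_text(encoding="utf-8")
    out = work_dir / "prompt.md"
    out.write_text(build_prompt(template, context, focus), encoding="utf-8")
    return out
```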
Step 4: Run MassGen
Use the launcher script (`scripts/massgen_run.sh` relative to this skill)
to run MassGen, launch the web viewer, and wait for completion in a single
atomic command. This avoids the double-backgrounding issues that cause agents
to lose track of running processes.

Run in the background using your agent's native mechanism (e.g.,
`run_in_background` in Claude Code):

```bash
# SKILL_DIR is the directory containing this SKILL.md file
bash "$SKILL_DIR/scripts/massgen_run.sh" \
  --work-dir "$WORK_DIR" \
  --prompt-file "$WORK_DIR/prompt.md" \
  --criteria-file "$WORK_DIR/criteria.json" \
  --viewer
```

If using default criteria (no custom criteria file), omit `--criteria-file`.
For planning/spec modes, use `--criteria-preset planning` or `--criteria-preset spec` instead.
If you resolved a custom config during config confirmation (Step D), add `--config <path>`.

The script handles everything atomically:
1. Launches MassGen in `--automation` mode
2. Waits for the log directory to appear (with 30s timeout)
3. Starts the web viewer at `http://localhost:8000`
4. Waits for MassGen to complete
5. Writes `$WORK_DIR/run_summary.json` with exit code, duration, log dir

**After the background task completes**, read the summary:

```bash
cat "$WORK_DIR/run_summary.json"
```

Script options:
- `--viewer`: launch web viewer (opens `http://localhost:8000`)
- `--viewer-port PORT`: use a different port
- `--config FILE`: custom MassGen config YAML
- `--output-file FILE`: override result path (default: `$WORK_DIR/result.md`)
- `--no-cwd-context`: disable read-only CWD access
- `--extra-args "..."`: pass additional massgen CLI flags

Timing: expect 15-45 minutes. Do not assume something is stuck; MassGen runs multiple agents through several rounds of iteration.
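The flag rules above (custom criteria file versus preset, optional config) are easy to get wrong by hand, so here is a hedged sketch of a command builder. `build_command` is a hypothetical helper; it only uses flags listed in this section. The preset name for evaluate mode is not given in this text, so the sketch passes no criteria flag for evaluate unless a custom file is used.

```python
from pathlib import Path
from typing import Optional


def build_command(
    skill_dir: Path,
    work_dir: Path,
    mode: str,
    custom_criteria: bool = False,
    config: Optional[Path] = None,
    viewer: bool = True,
) -> list[str]:
    """Assemble the launcher argv from the documented flags."""
    cmd = [
        "bash", str(skill_dir / "scripts" / "massgen_run.sh"),
        "--work-dir", str(work_dir),
        "--prompt-file", str(work_dir / "prompt.md"),
    ]
    if custom_criteria:
        cmd += ["--criteria-file", str(work_dir / "criteria.json")]
    elif mode == "plan":
        cmd += ["--criteria-preset", "planning"]
    elif mode == "spec":
        cmd += ["--criteria-preset", "spec"]
    # Otherwise neither criteria flag is passed (general mode auto-generates)
    if config is not None:
        cmd += ["--config", str(config)]
    if viewer:
        cmd.append("--viewer")
    return cmd
```

The resulting list can be handed to the agent's background-execution mechanism or to `subprocess.Popen` without shell quoting concerns.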
Step 5: Parse the Output
The output depends on the mode. The winner's workspace path is shown in
`$WORK_DIR/result.md` (look for "Workspace cwd" or check `status.json` in
the log directory for `workspace_paths`).

General mode: the winner's answer is in `$WORK_DIR/result.md`. Any files
the agents created are in the winner's workspace (path shown in result.md).
Copy or reference the workspace files as needed.

Evaluate mode: three files: `verdict.json`, `next_tasks.json`, `critique_packet.md`.
Read `verdict.json` first to determine iterate vs converged.
See `references/evaluate/workflow.md` for full output structure.

Plan mode: `project_plan.json`, a structured task list with chunks,
dependencies, and verification. May include auxiliary files in `research/`,
`framework/`, `risks/` subdirectories.
See `references/plan/workflow.md` for full output structure.

Spec mode: `project_spec.json`, EARS requirements with chunks,
rationale, and verification. May include auxiliary files in `research/`,
`design/`, `decisions/` subdirectories.
See `references/spec/workflow.md` for full output structure.
Step 6: Ground in Your Native Task System
This is the most critical step for evaluate, plan, and spec modes. MassGen
produced a structured result — now you must internalize it by entering your
native task/plan mode and enumerating every task or requirement as a tracked
item. Without this, the plan is just text that fades from context as you work.
For general mode, grounding is optional — it applies when the output
contains a structured task list or action items, but many general tasks
produce artifacts (code, documents, designs) rather than task lists.
Why this matters: agents that skip this step tend to execute the first
few tasks, then drift — forgetting verification steps, skipping later
tasks, or losing track of dependencies. Grounding forces you to commit
to the full scope before executing anything.
For all modes:
- Enter your task planning mode (e.g., TodoWrite in Claude Code, task tracking in Codex, or whatever native tracking your environment provides)
- Create one tracked task per item from the MassGen output:
  - Evaluate: each task from `next_tasks.json` becomes a tracked task, preserving `implementation_guidance`, `depends_on`, and `verification`
  - Plan: each task from `project_plan.json` becomes a tracked task, preserving chunk ordering, dependencies, and verification criteria
  - Spec: each requirement from `project_spec.json` becomes a tracked task (implement + verify), preserving priority, dependencies, and acceptance criteria
- Preserve the dependency order — don't flatten the DAG. Tasks in chunk C01 must complete before C02 tasks begin
- Include verification as explicit tasks — don't just track "implement X", also track "verify X meets [criteria]". Verification that isn't tracked doesn't happen
- Mark each task's status as you work: pending → in_progress → completed
Then execute in order, updating status as you go. When you complete a
task, check it off and move to the next one. This creates an execution
trace that keeps you honest about what's done and what remains.
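To make the grounding rules concrete, the sketch below expands a task list into tracked items in dependency order, with verification as its own item. It assumes each task has an `id` field (a hypothetical name) alongside the `depends_on` and `verification` fields mentioned above; the real schemas are described in the mode workflow references.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def ground_tasks(tasks: list[dict]) -> list[dict]:
    """Expand tasks into tracked items, dependencies first."""
    by_id = {t["id"]: t for t in tasks}
    graph = {t["id"]: set(t.get("depends_on", [])) for t in tasks}
    unknown = {dep for deps in graph.values() for dep in deps} - set(by_id)
    if unknown:
        raise ValueError(f"depends_on references unknown tasks: {sorted(unknown)}")
    tracked = []
    # static_order() raises graphlib.CycleError if the graph is not a DAG
    for task_id in TopologicalSorter(graph).static_order():
        task = by_id[task_id]
        tracked.append({"task": f"implement {task_id}", "status": "pending"})
        if task.get("verification"):
            # Verification is tracked as its own item so it cannot be skipped
            tracked.append({"task": f"verify {task_id}: {task['verification']}", "status": "pending"})
    return tracked
```

Each returned item maps to one entry in the native tracker (for example one TodoWrite item), starting as `pending`.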
Step 7: Execute and Iterate
General: read `result.md` for the winning answer. Copy deliverable
files from the winner's workspace if applicable.

Evaluate: read `verdict.json`. If `"iterate"`, check
`approach_assessment.ceiling_status` in `next_tasks.json` first:
- `ceiling_not_reached` → execute `fix_tasks`, then `evolution_tasks` as stretch
- `ceiling_approaching` → execute `fix_tasks`, then `evolution_tasks`
- `ceiling_reached` → consider re-invoking plan mode with evaluation findings (see Plan-Evaluate Loop below)

If `"converged"`, proceed to delivery.

Plan / Spec: store the result as a living document (see below),
then execute the grounded tasks chunk by chunk. At tasks marked with
`eval_checkpoint`, invoke evaluate mode to assess approach viability
before continuing (see Plan-Evaluate Loop below).
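The evaluate-mode branching above can be sketched as a small routing function. Two things here are assumptions, not documented schema: that `verdict.json` stores `"iterate"` / `"converged"` under a top-level `verdict` key, and that `fix_tasks` and `evolution_tasks` are top-level lists in `next_tasks.json`. The function name `next_action` is hypothetical.

```python
def next_action(verdict: dict, next_tasks: dict) -> dict:
    """Map evaluate-mode output to what should happen next."""
    # Assumption: the verdict string lives under a "verdict" key
    if verdict.get("verdict") == "converged":
        return {"action": "deliver", "required": [], "stretch": []}
    status = next_tasks.get("approach_assessment", {}).get("ceiling_status")
    fix = next_tasks.get("fix_tasks", [])
    evolution = next_tasks.get("evolution_tasks", [])
    if status == "ceiling_not_reached":
        # Fixes are required; evolution tasks are stretch goals
        return {"action": "execute", "required": fix, "stretch": evolution}
    if status == "ceiling_approaching":
        # Both lists are executed, fixes first
        return {"action": "execute", "required": fix + evolution, "stretch": []}
    if status == "ceiling_reached":
        return {"action": "replan", "required": [], "stretch": []}
    raise ValueError(f"unrecognized ceiling_status: {status!r}")
```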
Living Document Protocol (Plan & Spec Modes)
This is the most important section for plan/spec modes — it defines how
the output is used after MassGen produces it.
Store
Adopt the MassGen output into `.massgen/plans/` using the existing
`PlanStorage` infrastructure:

```
.massgen/plans/plan_<timestamp>/
├── workspace/              # Mutable working copy
│   ├── plan.json           # (renamed from project_plan.json) or spec.json
│   └── research/           # Auxiliary files from MassGen output
├── frozen/                 # Immutable snapshot (identical to workspace/ at creation)
│   ├── plan.json           # or spec.json
│   └── research/
└── plan_metadata.json      # artifact_type, status, chunk_order, context_paths
```

Copy `project_plan.json` → `workspace/plan.json` (or `project_spec.json` →
`workspace/spec.json`). Copy any auxiliary directories. Create `frozen/` as
an identical snapshot.
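The `PlanStorage` API is not shown in this document, so the sketch below is a plain-filesystem stand-in that only illustrates the copy-then-snapshot order. Prefer `PlanStorage` where it is available. `adopt_output` is a hypothetical name, and the sketch does not write `plan_metadata.json`.

```python
import shutil
from pathlib import Path


def adopt_output(output_dir: Path, plan_dir: Path, artifact_type: str = "plan") -> Path:
    """Copy MassGen output into workspace/ and snapshot it as frozen/."""
    if artifact_type not in ("plan", "spec"):
        raise ValueError(f"unknown artifact type: {artifact_type!r}")
    source = output_dir / f"project_{artifact_type}.json"
    workspace = plan_dir / "workspace"
    workspace.mkdir(parents=True)
    # project_plan.json -> workspace/plan.json (or the spec equivalent)
    shutil.copy2(source, workspace / f"{artifact_type}.json")
    # Auxiliary directories such as research/ travel with the plan
    for child in output_dir.iterdir():
        if child.is_dir():
            shutil.copytree(child, workspace / child.name)
    # frozen/ starts as an identical copy of workspace/
    shutil.copytree(workspace, plan_dir / "frozen")
    return workspace
```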
Read on Restart
FIRST ACTION in every new session: read `workspace/plan.json` (or
`workspace/spec.json`). This is the source of truth for what's done and
what's next.
Update Continuously
As tasks complete (plan) or requirements are implemented (spec), update
the workspace copy. Mark status, add notes, record discoveries.
The workspace copy is a living document.
Check Drift
Periodically compare `workspace/` against `frozen/`. The existing
`PlanSession.compute_plan_diff()` returns a `divergence_score`
(0.0 = no drift, 1.0 = complete rewrite). High drift means re-evaluate
whether the plan/spec is still valid.
Refine When Stuck
If the plan/spec proves wrong or incomplete, re-invoke this skill with
the workspace copy as "What Already Exists" to get multi-agent refinement.
This creates a new plan directory with a fresh `frozen/` snapshot.
Don't Drift Silently
If you deviate from the plan/spec, update the workspace copy first.
An outdated plan is worse than no plan.
Mode Overviews
General
Run any task through MassGen's multi-agent system. Agents independently
produce solutions and converge through checklist-gated voting. Use this
for tasks that don't fit neatly into evaluate, plan, or spec — writing,
code generation, research, analysis, design, or anything where multiple
perspectives improve the result.
See `references/general/workflow.md` for the context template and
output handling.
Evaluate
Critique existing work artifacts. Evaluator agents inspect your code,
documents, or deliverables and produce a structured critique with
machine-readable verdict, per-criterion scores, and actionable
improvement tasks. The checklist-gated voting system ensures agents
converge on the strongest critique.
See `references/evaluate/workflow.md` for the full context template,
output structure, and examples.
Plan
Create or refine a structured project plan. Planning agents decompose
the goal into a task DAG with chunks, dependencies, verification
criteria, and technology choices. Each round of MassGen iteration
improves task quality — descriptions get more actionable, verification
gets more specific, sequencing gets tighter.
See `references/plan/workflow.md` for the full context template,
output format, and lifecycle.
Spec
Create or refine a requirements specification. Spec agents produce
EARS-formatted requirements with acceptance criteria, rationale,
and verification. Iteration focuses on precision — each round
eliminates ambiguities, fills gaps, and strengthens edge case coverage.
See `references/spec/workflow.md` for the full context template,
output format, and lifecycle.
Plan-Evaluate Loop
For complex or creative projects, plan and evaluate modes work together
in a feedback loop:
Plan → Execute → Evaluate → (fix OR re-plan) → Execute → Evaluate → ...
When to Use the Loop
- The task has exploratory components (visual design, creative writing, UX)
- The project is complex enough that the initial plan is partly speculative
- Quality expectations are high and "correct but adequate" isn't enough
- Prior iterations show diminishing returns
Loop Protocol
- Plan: invoke plan mode. Agents classify tasks as `deterministic` or `exploratory` and create prototypes to validate assumptions
- Execute: implement the plan chunk by chunk
- Evaluate: at `eval_checkpoint` tasks (or after any exploratory chunk), invoke evaluate mode
- Decide: read `approach_assessment` in the evaluation output:
  - `ceiling_not_reached` → execute fix_tasks, continue
  - `ceiling_approaching` → execute fix_tasks + evolution_tasks, continue
  - `ceiling_reached` → re-invoke plan mode with evaluation discoveries
- Evolve: if re-planning, pass `approach_assessment` and `breakthroughs` as context. The new plan amplifies what worked and avoids approaches that hit their ceiling
- Repeat until evaluation returns "converged"
What Makes This Different from Just Re-Running Eval
- Eval assesses whether the APPROACH has room to grow, not just whether the OUTPUT has defects
- When the approach is limited, the loop goes back to PLANNING, not just more implementation
- Breakthroughs discovered during execution feed FORWARD into new plans, not just into preserve lists
- The plan evolves based on evidence from execution, not speculation
Loop Termination
- Max 3 plan mutations per chunk; if still not converging, escalate to user
- If evaluation returns "converged" with `ceiling_not_reached`, the loop is complete
- If the user provides explicit direction, follow it regardless of ceiling status
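One way to enforce the per-chunk mutation limit is a small counter, sketched below. The class name `MutationBudget` and its interface are hypothetical; only the limit of 3 comes from the list above.

```python
from collections import Counter

MAX_MUTATIONS_PER_CHUNK = 3  # limit stated in the termination rules


class MutationBudget:
    """Counts plan mutations per chunk and signals when to escalate."""

    def __init__(self) -> None:
        self._counts = Counter()

    def record(self, chunk_id: str) -> bool:
        """Record one re-plan for chunk_id; return True while within budget."""
        self._counts[chunk_id] += 1
        return self._counts[chunk_id] <= MAX_MUTATIONS_PER_CHUNK
```

When `record` returns `False`, the loop should stop re-planning that chunk and escalate to the user.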
Condensed Examples
General: Multi-Agent Task Execution
```bash
WORK_DIR=".massgen/general/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$WORK_DIR"

cat > "$WORK_DIR/context.md" << 'EOF'
## Task
Build a responsive landing page for a developer tool that converts
CSV files to JSON. Single page with hero, features, and CTA sections.

## Context
- Target audience: developers and data engineers
- Tech stack: HTML, CSS, vanilla JS (no frameworks)
- Must work on mobile and desktop

## Quality Expectations
- Visually polished, not template-looking
- Fast load time, no external dependencies
EOF

# Build prompt from references/general/prompt_template.md, then run
# No --criteria-file: criteria are auto-generated from the task
bash "$SKILL_DIR/scripts/massgen_run.sh" \
  --work-dir "$WORK_DIR" \
  --prompt-file "$WORK_DIR/prompt.md" \
  --viewer
```

Evaluate: Pre-PR Review
```bash
WORK_DIR=".massgen/evaluate/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$WORK_DIR"

# Write context (scope: specific deliverables, no quality opinions)
cat > "$WORK_DIR/context.md" << 'EOF'
## Deliverables in Scope
- `src/api/handler.ts`: API request handler
- `src/hooks/useAuth.ts`: authentication hook

## Out of Scope
- Test files, CI config

## Original Task
Add JWT authentication to the API layer

## What Was Done
Implemented JWT validation in handler and auth hook for React components.

## Verification Evidence
pytest: 24 passed, 0 failed
EOF

# Write criteria (or omit --criteria-file to use the default preset)
cat > "$WORK_DIR/criteria.json" << 'EOF'
[
  {"text": "Auth security: JWT validation covers expiration, signature, and audience checks.", "category": "must"},
  {"text": "Error handling: invalid/expired tokens produce clear error responses.", "category": "must"},
  {"text": "Code quality: clean separation between auth logic and business logic.", "category": "should"}
]
EOF

# Build prompt from template, then run
bash "$SKILL_DIR/scripts/massgen_run.sh" \
  --work-dir "$WORK_DIR" \
  --prompt-file "$WORK_DIR/prompt.md" \
  --criteria-file "$WORK_DIR/criteria.json" \
  --viewer
```
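The Evaluate example assumes that `context.md`, `prompt.md`, and `criteria.json` all exist in the work directory before the run script is called. As a minimal sketch, a pre-flight check can catch a missing or empty input early. The helper names `require_file` and `preflight` are hypothetical and not part of MassGen; the file names are the ones used in the example above.

```shell
# Hypothetical helper (not part of MassGen): fail if a file is missing or empty.
require_file() {
  if [ ! -s "$1" ]; then
    echo "missing or empty: $1" >&2
    return 1
  fi
}

# Hypothetical helper: check the three inputs used by the Evaluate example.
# Stops at the first file that is missing or empty.
preflight() {
  require_file "$1/context.md" &&
  require_file "$1/prompt.md" &&
  require_file "$1/criteria.json"
}

# Usage sketch (run only once the inputs are written):
#   preflight "$WORK_DIR" && bash "$SKILL_DIR/scripts/massgen_run.sh" ...
```

When `--criteria-file` is omitted in favor of a preset, the `criteria.json` check would need to be dropped.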
Plan: New Feature Planning
```bash
WORK_DIR=".massgen/plan/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$WORK_DIR"

cat > "$WORK_DIR/context.md" << 'EOF'
## Goal
Add real-time collaboration to the document editor: multiple users
editing the same document simultaneously with cursor presence.

## Constraints
- Must work with existing PostgreSQL database
- Timeline: 2 weeks
- Team: 2 engineers

## Existing Context
Express.js backend, React frontend, WebSocket already used for notifications.

## Success Criteria
Two users can edit the same document with <500ms sync latency and no data loss.
EOF

# Build prompt from references/plan/prompt_template.md, then run
bash "$SKILL_DIR/scripts/massgen_run.sh" \
  --work-dir "$WORK_DIR" \
  --prompt-file "$WORK_DIR/prompt.md" \
  --criteria-preset planning \
  --viewer
```
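The plan context above is organized under four headings: Goal, Constraints, Existing Context, and Success Criteria. As a rough sketch, a small check can confirm that a context file mentions each expected heading before the prompt is built. The function name `check_sections` is hypothetical and not part of MassGen, and the check is a plain substring match, so it does not verify that the text is actually a heading.

```shell
# Hypothetical helper (not part of MassGen): report every expected section
# name that does not appear anywhere in the given file.
check_sections() {
  file="$1"
  shift
  missing=0
  for section in "$@"; do
    # -F: fixed-string match, -q: quiet; "--" guards names starting with "-"
    if ! grep -Fq -- "$section" "$file"; then
      echo "missing section: $section" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage sketch:
#   check_sections "$WORK_DIR/context.md" \
#     "Goal" "Constraints" "Existing Context" "Success Criteria"
```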
Spec: Feature Specification
```bash
WORK_DIR=".massgen/spec/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$WORK_DIR"

cat > "$WORK_DIR/context.md" << 'EOF'
## Problem Statement
Users cannot recover deleted items: deletion is permanent and irreversible.

## User Needs / Personas
- End users: accidentally delete items, need easy recovery
- Admins: need to purge items for compliance after retention period

## Constraints
- PostgreSQL database, soft-delete pattern preferred
- 30-day retention before permanent purge
- Must not break existing API consumers
EOF

# Build prompt from references/spec/prompt_template.md, then run
bash "$SKILL_DIR/scripts/massgen_run.sh" \
  --work-dir "$WORK_DIR" \
  --prompt-file "$WORK_DIR/prompt.md" \
  --criteria-preset spec \
  --viewer
```
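The Evaluate, Plan, and Spec examples share one setup pattern: a timestamped work directory under `.massgen/<mode>/`, plus a mode-specific criteria flag. The sketch below factors that pattern into two helpers. The names `new_work_dir` and `criteria_preset_for` are hypothetical and not part of MassGen. The `.massgen/general/` path is an assumption extrapolated from the other modes, and the preset names are the ones passed to `--criteria-preset` in the examples.

```shell
# Hypothetical helper (not part of MassGen): create a timestamped work
# directory for a mode and print its path.
new_work_dir() {
  case "$1" in
    general|evaluate|plan|spec) ;;
    *) echo "unknown mode: $1" >&2; return 1 ;;
  esac
  dir=".massgen/$1/$(date +%Y%m%d_%H%M%S)"
  mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Hypothetical helper: print the preset name used with --criteria-preset
# in the examples, or nothing for modes whose examples pass no preset.
criteria_preset_for() {
  case "$1" in
    plan) echo "planning" ;;
    spec) echo "spec" ;;
    *) ;;
  esac
}

# Usage sketch:
#   WORK_DIR="$(new_work_dir plan)" || exit 1
#   preset="$(criteria_preset_for plan)"
```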
Reference Files
- `references/general/workflow.md`: general mode context template and output handling
- `references/general/prompt_template.md`: general prompt template with placeholders
- `references/criteria_guide.md`: how to write quality criteria (format, tiers, examples)
- `references/evaluate/workflow.md`: evaluate mode context template, output structure, examples
- `references/evaluate/prompt_template.md`: evaluation prompt template with placeholders
- `references/plan/workflow.md`: plan mode context template, output format, lifecycle
- `references/plan/prompt_template.md`: planning prompt template with placeholders
- `references/spec/workflow.md`: spec mode context template, output format, lifecycle
- `references/spec/prompt_template.md`: spec prompt template with placeholders
- `massgen/subagent_types/round_evaluator/SUBAGENT.md`: source methodology for evaluation
- `massgen/skills/massgen-develops-massgen/SKILL.md`: reference pattern for `--automation`