spec-first

Spec-First Development

A structured workflow for LLM-assisted coding that delays implementation until decisions are explicit.

When This Activates

  • "Build X" or "Create Y" (new features/projects)
  • "Implement..." (non-trivial functionality)
  • "Add a feature that..." (multi-step work)
  • Any request that touches 3+ files or has unclear requirements

When to Skip

  • Single-file changes under 50 lines
  • Typo fixes, log additions, config tweaks
  • User explicitly says "just do it" or "quick fix"

Core Principles

  1. Delay implementation until tradeoffs are explicit — Use conversation to clarify constraints, compare options, surface risks. Only then write code.
  2. Treat the model like a junior engineer with infinite typing speed — Provide structure: clear interfaces, small tasks, explicit acceptance criteria. Code is cheap; understanding and correctness are scarce.
  3. Specs beat prompts — For anything non-trivial, create a durable artifact (spec file) that can be re-fed, diffed, and reused across sessions.
  4. Generated code is disposable; tests are not — Assume rewrites. Design for easy replacement: small modules, minimal coupling, clean seams, strong tests.
  5. The model is over-confident; reality is the judge — Everything important gets verified by execution: tests, linters, typecheckers, reproducible builds.

The 6-Stage Workflow

Stage A: Frame the Problem (conversation mode)

Goal: Decide before you implement.
Prompts that work:
  • "List 3 viable approaches. Compare on: complexity, failure modes, testability, future change, time to first demo."
  • "What assumptions are you making? Which ones are risky?"
  • "Propose a minimal version that can be deleted later without regret."
Output: Decision notes for .agents/DECISIONS/[feature-name].md

Stage B: Write spec.md (freeze decisions)

Goal: Turn decisions into unambiguous requirements.
File: .agents/SPECS/[feature-name].md

[Feature Name] Spec

Purpose

One paragraph: what this is for.

Non-Goals

Explicitly state what you are NOT building.

Interfaces

Inputs/outputs, data types, file formats, API endpoints, CLI commands.

Key Decisions

Libraries, architecture, persistence choices, constraints.

Edge Cases and Failure Modes

Timeouts, retries, partial failures, invalid input, concurrency, idempotency.

Acceptance Criteria

Bullet list of testable statements. Avoid "should be fast." Prefer: "processes 1k items under 2s on M1 Mac."

Test Plan

Unit/integration boundaries, fixtures, golden files, what must be mocked.
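
An acceptance criterion phrased like the example in the template above ("processes 1k items under 2s") can be turned into an executable check. The following is a minimal sketch, not a prescribed harness: `process_items` is a hypothetical stand-in for the feature under test, and `meets_time_budget` is an illustrative helper name. Wall-clock checks depend on the machine, so the hardware part of the criterion has to be pinned by where the test runs.

```python
import time


def process_items(items):
    """Hypothetical stand-in for the feature under test."""
    return [item * 2 for item in items]


def meets_time_budget(func, items, budget_seconds):
    """Return True if func(items) finishes within budget_seconds of wall-clock time."""
    start = time.perf_counter()
    func(items)
    elapsed = time.perf_counter() - start
    return elapsed <= budget_seconds


def test_processes_1k_items_under_2s():
    # Mirrors the criterion "processes 1k items under 2s".
    assert meets_time_budget(process_items, list(range(1000)), budget_seconds=2.0)
```

Written this way, the criterion either passes or fails when the test suite runs, which is the property "should be fast" lacks.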

Stage C: Generate todo.md (planning mode)

Goal: Stepwise checklist where each step has a verification command.
File: .agents/TODOS/[feature-name].md

[Feature Name] TODO

  • Add project scaffolding (build/run/test commands)
    Verify: npm run build && npm test
  • Implement module X with interface Y
    Verify: npm test -- --grep "module X"
  • Add tests for edge cases A/B/C
    Verify: npm test -- --grep "edge cases"
  • Wire integration
    Verify: npm run integration
  • Add docs
    Verify: npm run docs && open docs/index.html

Each item must be independently checkable. This prevents "looks right" progress.
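
Because every item carries its own verification command, the checklist can be read by a script as well as by a person. The sketch below assumes a specific on-disk layout that this document does not mandate: a `- [ ]` checkbox line for the task, followed by a `Verify:` line. The names `parse_todo` and `run_step` are hypothetical.

```python
import subprocess

# Assumed layout: a checkbox line followed by a "Verify:" line.
SAMPLE_TODO = """\
- [ ] Add project scaffolding (build/run/test commands)
  Verify: npm run build && npm test
- [ ] Implement module X with interface Y
  Verify: npm test -- --grep "module X"
"""


def parse_todo(text):
    """Return (task, verify_command) pairs from checklist text."""
    steps = []
    task = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("- [ ]") or line.startswith("- [x]"):
            task = line[5:].strip()
        elif line.startswith("Verify:") and task is not None:
            steps.append((task, line[len("Verify:"):].strip()))
            task = None
    return steps


def run_step(command):
    """Run one verification command in a shell; return (passed, combined output)."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr
```

The output returned by `run_step` is what gets pasted back into the conversation, so progress is judged by the exit code and the real log rather than by how the diff looks.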

Stage D: Execute Changes (implementation mode)

Goal: Small diffs, frequent verification, controlled context.
Rules:
  • One logical change per step
  • Keep focus on one interface at a time
  • After each change: run verification command, paste actual output back
  • Commit early and often
For large codebases:
  • Provide only relevant files plus spec/todo
  • If you summarize the repo, do it once and keep the summary as a reusable artifact

Stage E: Verify and Review (adversarial mode)

Goal: Force the model to try to break its own work.
Prompts:
  • "Act as a hostile reviewer. Find correctness bugs, not style nits. List concrete failing scenarios."
  • "Given these acceptance criteria, which are not actually satisfied? Be specific."
  • "Propose 5 tests that would fail if the implementation is wrong."

Stage F: Decide What Lasts

Goal: Keep the system easy to delete and rewrite.
Heuristics:
  • Keep "policy" (business rules) separate from "mechanism" (I/O, DB, HTTP)
  • Prefer shallow abstractions that can be removed without cascade
  • Invest in tests and fixtures more than clever architecture
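
One way to picture the policy/mechanism split is a pure function for the business rule and a narrow interface for storage. Everything in this sketch is hypothetical and chosen only for illustration: the `Order` type, the 10% discount rule, and the `OrderStore` seam do not come from this document.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Order:
    order_id: str
    total_cents: int


def discount_cents(order: Order) -> int:
    """Policy: a hypothetical business rule, pure and free of I/O."""
    # 10% off orders of 100.00 or more; otherwise no discount.
    return order.total_cents // 10 if order.total_cents >= 10_000 else 0


class OrderStore(Protocol):
    """Mechanism: the seam where a DB, HTTP, or file backend plugs in."""

    def load(self, order_id: str) -> Order: ...


class InMemoryStore:
    """Throwaway mechanism for tests; replaceable without touching the policy."""

    def __init__(self, orders: dict[str, Order]) -> None:
        self._orders = orders

    def load(self, order_id: str) -> Order:
        return self._orders[order_id]


def quote(store: OrderStore, order_id: str) -> int:
    """Thin glue: fetch through the mechanism, decide with the policy."""
    order = store.load(order_id)
    return order.total_cents - discount_cents(order)
```

If the storage layer is regenerated or swapped, the tests for `discount_cents` keep passing unchanged, which is what makes the generated mechanism code cheap to delete.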

The Three-File Convention

Keep these files in the .agents/ folder (not the project root):
.agents/
├── SPECS/
│   └── [feature-name].md    # what/why/constraints
├── TODOS/
│   └── [feature-name].md    # steps + verification commands
└── DECISIONS/
    └── [feature-name].md    # tradeoffs, rejected options, assumptions
Naming: Use the feature/task name as the filename (e.g., user-auth.md, api-refactor.md).
Why the .agents/ folder:
  • Keeps project root clean
  • Groups all AI-assisted planning artifacts
  • Works with task-prd-creator and ai-dev-loop skills
  • Persists across sessions
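
The layout above is simple enough to create with a few lines of script. This is a sketch under assumptions: the `scaffold` function and the one-line placeholder heading it writes are illustrative and not part of the convention itself.

```python
from pathlib import Path

SUBFOLDERS = ("SPECS", "TODOS", "DECISIONS")


def scaffold(project_root: Path, feature_name: str) -> list[Path]:
    """Create .agents/{SPECS,TODOS,DECISIONS}/<feature_name>.md if missing."""
    created = []
    for sub in SUBFOLDERS:
        folder = project_root / ".agents" / sub
        folder.mkdir(parents=True, exist_ok=True)
        path = folder / f"{feature_name}.md"
        if not path.exists():  # never overwrite an existing artifact
            # Placeholder heading only; the real content comes from Stages A-C.
            path.write_text(f"# {feature_name} ({sub.lower()})\n", encoding="utf-8")
            created.append(path)
    return created
```

Calling `scaffold(Path("."), "user-auth")` a second time returns an empty list, so the script is safe to re-run when a session resumes.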

Agent Readiness Checklist (IMPACT)

Before running autonomous/agentic execution, verify:
| Dimension | Question | If No... |
|---|---|---|
| Intent | Do you have acceptance criteria and a test harness? | Don't run agent |
| Memory | Do you have durable artifacts (spec/todo) so it can resume? | It will thrash |
| Planning | Can it produce/update a plan with checkpoints? | It will improvise badly |
| Authority | Is what it can do restricted (edit, test, commit)? | Too risky |
| Control Flow | Does it decide next step based on tool output? | It's just generating blobs |
| Tools | Does it have minimum necessary tooling and nothing extra? | Attack surface too large |
Approve at meaningful checkpoints (end of todo item, after test suite passes), not every micro-step.
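
The checklist can also act as a gate in front of an agent run. The sketch below encodes the six dimensions and their "If No" consequences as data; the `blockers` function name and the rule that an unanswered dimension counts as "No" are assumptions made for this example.

```python
# Dimension -> consequence of answering "No", taken from the checklist above.
IMPACT = {
    "Intent": "Don't run agent",
    "Memory": "It will thrash",
    "Planning": "It will improvise badly",
    "Authority": "Too risky",
    "Control Flow": "It's just generating blobs",
    "Tools": "Attack surface too large",
}


def blockers(answers: dict[str, bool]) -> list[str]:
    """List 'Dimension: consequence' for each dimension answered No or left unanswered."""
    return [
        f"{dimension}: {consequence}"
        for dimension, consequence in IMPACT.items()
        if not answers.get(dimension, False)
    ]
```

An empty result means all six questions were answered "Yes"; anything else is a reason to stay in supervised mode.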

Prompt Patterns

Authoritarian (for correctness):
    Edit these files: [paths]
    Interface: [exact signatures]
    Acceptance criteria: [list]
    Required tests: [list]
    Don't change anything else.

Options and tradeoffs (for design):
    Give me 3 options and a recommendation.
    Make the recommendation conditional on constraints A/B/C.

Context discipline (for large codebases):
    Only use the files I provided.
    If you need more context, ask for a specific file and explain why.

Make it provable:
    Add a test that fails on the buggy version and passes on the correct one.
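
As an illustration of the "make it provable" pattern, the sketch below pairs a deliberately buggy function with its fix and a single check that tells them apart. The chunking example and all names in it are hypothetical; in a real project the buggy and fixed versions would be two revisions of the same function, not two functions.

```python
def chunk_buggy(items, size):
    """Buggy version: silently drops a trailing partial chunk."""
    return [items[i:i + size] for i in range(0, len(items) - size + 1, size)]


def chunk_fixed(items, size):
    """Corrected version: keeps the trailing partial chunk."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def check_keeps_partial_chunk(chunk):
    """The regression check, parameterized over the implementation under test."""
    return chunk([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]]
```

The check fails for `chunk_buggy` and passes for `chunk_fixed`, so a green run is evidence of the fix and not merely of a test that cannot fail.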

Output Format

When this skill activates, produce:
SPEC-FIRST WORKFLOW

STAGE A - FRAMING:
[3 approaches with tradeoffs]
[Recommendation]

STAGE B - SPEC:
[Draft spec.md content]

STAGE C - TODO:
[Draft todo.md with verification commands]

Ready to proceed to Stage D (execution)?