tdd

TDD

Value: Feedback -- short cycles with verifiable evidence keep AI-generated code honest and the human in control. Tests express intent; evidence confirms progress.

Purpose

Teaches a five-step TDD cycle (RED, DOMAIN, GREEN, DOMAIN, COMMIT) that adapts to whatever harness runs it. Detects available delegation primitives and routes to guided mode (human drives each phase) or automated mode (system orchestrates phases). Prevents primitive obsession, skipped reviews, and untested complexity regardless of mode.

Practices

The Five-Step Cycle

Every feature is built by repeating: RED -> DOMAIN -> GREEN -> DOMAIN -> COMMIT.
  1. RED -- Write one failing test with one assertion. Only edit test files. Write the code you wish you had -- reference types and functions that do not exist yet. Run the test. Paste the failure output. Stop. Done when: tests run and FAIL (compilation error OR assertion failure).
  2. DOMAIN (after RED) -- Review the test for primitive obsession and invalid-state risks. Create type definitions with stub bodies (todo!(), raise NotImplementedError, etc.). Do not implement logic. Stop. Done when: tests COMPILE but still FAIL (assertion/panic, not compilation error).
  3. GREEN -- Write the minimal code to make the test pass. Only edit production files. Run the test. Paste passing output. Stop. Done when: tests PASS with minimal implementation.
  4. DOMAIN (after GREEN) -- Review the implementation for domain violations: anemic models, leaked validation, primitive obsession that slipped through. If violations are found, raise a concern and propose a revision. Done when: types are clean and tests still pass.
  5. COMMIT -- Run the full test suite. Stage all changes and create a git commit referencing the GWT (Given-When-Then) scenario. This is a hard gate: no new RED phase may begin until this commit exists. Done when: git commit created with all tests passing.
After step 5, either start the next RED phase or tidy the code (structural changes only, separate commit).
A compilation failure IS a test failure. Do not pre-create types to avoid compilation errors. Types flow FROM tests, never precede them.
Domain review has veto power over primitive obsession and invalid-state representability. Vetoes escalate to the human after two rounds.
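As a rough illustration of how types flow from the test, the sketch below walks one pass through RED, DOMAIN, and GREEN in Python. The EmailAddress type, its parse method, and the test name are hypothetical, invented only for this example. The stub and the final implementation carry different names purely so the block runs as a single file; in a real cycle the GREEN phase fills in the body of the same type.

```python
from dataclasses import dataclass


# RED: written first, in a test file. It references EmailAddress (hypothetical),
# which does not exist yet, so the first run fails with a NameError.
def test_parse_accepts_well_formed_address() -> None:
    assert EmailAddress.parse("a@example.com").value == "a@example.com"


# DOMAIN (after RED): a semantic type instead of a bare str, with a stub body.
# The name now resolves, but the test still fails on NotImplementedError.
# (Called EmailAddressStub here only so both versions fit in one file.)
@dataclass(frozen=True)
class EmailAddressStub:
    value: str

    @staticmethod
    def parse(raw: str) -> "EmailAddressStub":
        raise NotImplementedError


# GREEN: the minimal body that makes the single assertion pass.
# There is no validation yet because no test demands it; a later RED phase
# (for example, rejecting input without "@") would drive that in.
@dataclass(frozen=True)
class EmailAddress:
    value: str

    @staticmethod
    def parse(raw: str) -> "EmailAddress":
        return EmailAddress(raw)
```

The COMMIT step would then run the full suite and record the change before any further test is written.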

User-Facing Modes

Guided mode (/tdd red, /tdd domain, /tdd green, /tdd commit): Each phase loads references/{phase}.md with detailed instructions for that step. For experienced engineers who want explicit phase control. Works on any harness -- no delegation primitives required. The human decides when to advance phases.
Automated mode (/tdd or /tdd auto): The system detects harness capabilities, selects an execution strategy, and orchestrates the full cycle. The user sees working code, not sausage-making. For verbose output showing phase transitions and evidence, use /tdd auto --verbose.

Capability Detection (Automated Mode)

When automated mode activates, detect available primitives in this order:
  1. Agent teams available? Check for TeamCreate tool. If present, use the agent teams strategy with persistent pair sessions.
  2. Subagents available? Check for Task tool (subagent spawning). If present, use the serial subagents strategy with focused per-phase agents.
  3. Fallback. Use the chaining strategy -- role-switch internally between phases within a single context.
Select the most capable strategy available. Do not attempt a higher strategy when its primitives are missing.
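The detection order above can be sketched as a simple priority check. This is a minimal sketch that assumes the harness can report its tool names as a set of strings; the Strategy enum and select_strategy function are hypothetical names, while TeamCreate and Task are the tool names given above.

```python
from enum import Enum


class Strategy(Enum):
    AGENT_TEAMS = "agent teams"
    SERIAL_SUBAGENTS = "serial subagents"
    CHAINING = "chaining"


def select_strategy(available_tools: set[str]) -> Strategy:
    """Pick the most capable strategy whose primitive is present."""
    if "TeamCreate" in available_tools:
        return Strategy.AGENT_TEAMS
    if "Task" in available_tools:
        return Strategy.SERIAL_SUBAGENTS
    # Neither delegation primitive is available: fall back to chaining.
    return Strategy.CHAINING
```

Because the checks run from most to least capable, a harness offering both tools gets the agent teams strategy, and a missing primitive never leads to a higher strategy being attempted.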

Execution Strategy: Chaining (Fallback)

Used when no delegation primitives are available. The agent plays each role sequentially:
  1. Load references/red.md. Execute the RED phase.
  2. Load references/domain.md. Execute DOMAIN review of the test.
  3. Load references/green.md. Execute the GREEN phase.
  4. Load references/domain.md. Execute DOMAIN review of the implementation.
  5. Load references/commit.md. Execute the COMMIT phase.
  6. Repeat.
Role boundaries are advisory in this mode. The agent must self-enforce phase boundaries: only edit file types permitted by the current phase (see references/phase-boundaries.md).
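The chaining sequence is essentially a fixed list of (phase, reference file) pairs that repeats. A small sketch, with the CHAIN constant and chain_steps generator as hypothetical names and the file paths taken from the steps above:

```python
from typing import Iterator, Tuple

# (phase label, reference file) in the order the chaining strategy loads them.
CHAIN: Tuple[Tuple[str, str], ...] = (
    ("RED", "references/red.md"),
    ("DOMAIN (after RED)", "references/domain.md"),
    ("GREEN", "references/green.md"),
    ("DOMAIN (after GREEN)", "references/domain.md"),
    ("COMMIT", "references/commit.md"),
)


def chain_steps(cycles: int) -> Iterator[Tuple[int, str, str]]:
    """Yield (cycle number, phase, reference file) for the given number of cycles."""
    for cycle in range(1, cycles + 1):
        for phase, reference in CHAIN:
            yield cycle, phase, reference
```

Note that the same reference file serves both DOMAIN reviews; only the thing under review (test versus implementation) differs.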

Execution Strategy: Serial Subagents

Used when the Task tool is available for spawning focused subagents. Each phase runs in an isolated subagent with constrained scope.
  • Spawn each phase agent using the prompt template in references/{phase}-prompt.md.
  • The orchestrator follows references/orchestrator.md for coordination rules.
  • Structural handoff schema (references/handoff-schema.md): every phase agent must return evidence fields (test output, file paths changed, domain concerns). Missing evidence fields = handoff blocked. The orchestrator does not proceed to the next phase until the schema is satisfied.
  • Context isolation provides structural enforcement: each subagent receives only the files relevant to its phase.
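The handoff gate described above might look like the following. This is only a sketch: the field names test_output, files_changed, and domain_concerns are guesses based on the evidence listed above, and the authoritative names are whatever references/handoff-schema.md defines.

```python
from typing import Any, Mapping

# Hypothetical field names; the real ones live in references/handoff-schema.md.
REQUIRED_EVIDENCE = ("test_output", "files_changed", "domain_concerns")


def missing_evidence(handoff: Mapping[str, Any]) -> list[str]:
    """Return the required evidence fields that are absent from a handoff."""
    return [field for field in REQUIRED_EVIDENCE if field not in handoff]


def may_proceed(handoff: Mapping[str, Any]) -> bool:
    """A handoff stays blocked for as long as any evidence field is missing."""
    return not missing_evidence(handoff)
```

The check is on presence rather than truthiness, so a phase that found no domain concerns can still hand off with an empty list.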

Execution Strategy: Agent Teams

Used when TeamCreate is available for persistent agent sessions. Maximum enforcement through role specialization and persistent pair context.
  • Follow references/ping-pong-pairing.md for pair session lifecycle, role selection, structured handoffs, and drill-down ownership.
  • Both engineers persist for the entire TDD cycle of a vertical slice. Handoffs happen via lightweight structured messages, not agent recreation.
  • Track pairing history in .team/pairing-history.json. Do not repeat either of the last 2 pairings.
  • The orchestrator monitors and intervenes only for external clarification routing or blocking disagreements.
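The pairing-history rule can be checked mechanically. The sketch below assumes, purely for illustration, that .team/pairing-history.json holds a JSON array of two-name arrays ordered oldest first; the actual file layout is not specified here, and the function names are hypothetical. It takes the JSON text as a string so the example stays self-contained.

```python
import json
from typing import Iterable


def recent_pairs(history_json: str) -> list[frozenset[str]]:
    """Return the last two pairings as unordered sets of names."""
    history = json.loads(history_json)
    return [frozenset(entry) for entry in history[-2:]]


def is_pair_allowed(pair: Iterable[str], history_json: str) -> bool:
    """A pairing is allowed only if it matches neither of the last two."""
    return frozenset(pair) not in recent_pairs(history_json)
```

Using frozenset makes the comparison order-independent, so swapping driver and navigator does not count as a new pairing.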

Phase Boundary Rules

Each phase edits only its own file types. This prevents drift. See references/phase-boundaries.md for the complete file-type matrix.

| Phase | Can Edit | Cannot Edit |
| --- | --- | --- |
| RED | Test files | Production code, type definitions |
| DOMAIN | Type definitions (stubs) | Test logic, implementation bodies |
| GREEN | Implementation bodies | Test files, type signatures |
| COMMIT | Nothing -- git operations only | All source files |

If blocked by a boundary, stop and return to the orchestrator (automated) or report to the user (guided). Never circumvent boundaries.
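The matrix can be expressed as a lookup table. This is a simplified sketch: it collapses the table into three coarse edit categories, leaves out how a given file or edit gets classified, and uses hypothetical names (FileKind, ALLOWED, can_edit). The complete matrix remains references/phase-boundaries.md.

```python
from enum import Enum


class FileKind(Enum):
    TEST = "test file"
    TYPE_DEFINITION = "type definition (stub)"
    IMPLEMENTATION = "implementation body"


# What each phase may edit; anything not listed is off limits for that phase.
ALLOWED: dict[str, set[FileKind]] = {
    "RED": {FileKind.TEST},
    "DOMAIN": {FileKind.TYPE_DEFINITION},
    "GREEN": {FileKind.IMPLEMENTATION},
    "COMMIT": set(),  # git operations only
}


def can_edit(phase: str, kind: FileKind) -> bool:
    """Return True if the phase is permitted to edit this kind of content."""
    return kind in ALLOWED[phase]
```

A False result corresponds to the blocked case above: stop and hand control back rather than working around the boundary.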

Walking Skeleton First

The first vertical slice must be a walking skeleton: the thinnest end-to-end path proving all architectural layers connect. It may use hardcoded values or stubs. Build it before any other slice. It de-risks the architecture and gives subsequent slices a proven wiring path to extend.

Outside-In TDD

Start from an acceptance test at the application boundary -- the point where external input enters the system. Drill inward through unit tests. The outer acceptance test stays RED while inner unit tests go through their own red-green-domain-commit cycles. The slice is complete only when the outer acceptance test passes.
A test that calls internal functions directly is a unit test, not an acceptance test -- even if it asserts on user-visible behavior.
Boundary enforcement by mode:
  • Pipeline mode: The CYCLE_COMPLETE evidence must include boundary_type and boundary_evidence on the acceptance test. The pipeline's TDD gate rejects evidence where the acceptance test calls internal functions directly.
  • Automated mode (non-pipeline): The orchestrator checks boundary scope and re-delegates if the first test is not a boundary test. Advisory -- no gate blocks progression.
  • Guided mode: The human is responsible for ensuring boundary-level tests. The skill text instructs correct behavior but cannot enforce it.

Cycle-Complete Evidence

At the end of each complete RED-DOMAIN-GREEN-DOMAIN-COMMIT cycle, produce a CYCLE_COMPLETE evidence packet containing: slice_id, acceptance_test {file, name, output, boundary_type, boundary_evidence}, unit_tests {count, all_passing, output}, domain_reviews [{phase, verdict, concerns}], commits [{hash, message}], rework_cycles, pair {driver, navigator}.
When pipeline-state is provided in context metadata, the TDD skill operates in pipeline mode: it receives a slice_id and stores evidence to .factory/audit-trail/slices/<slice-id>/tdd-cycles/cycle-NNN.json. When running standalone, the evidence is informational only (not stored).
See references/cycle-evidence.md for full schema.
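To make the packet shape concrete, here is a sketch of an empty CYCLE_COMPLETE skeleton and the storage path used in pipeline mode. The field names come from the list above; the value types, the reading of NNN as a zero-padded three-digit cycle number, and the function names are assumptions. The authoritative schema is references/cycle-evidence.md.

```python
from pathlib import PurePosixPath
from typing import Any


def empty_cycle_packet(slice_id: str) -> dict[str, Any]:
    """Return a CYCLE_COMPLETE skeleton with every field present but unfilled."""
    return {
        "slice_id": slice_id,
        "acceptance_test": {
            "file": "", "name": "", "output": "",
            "boundary_type": "", "boundary_evidence": "",
        },
        "unit_tests": {"count": 0, "all_passing": False, "output": ""},
        "domain_reviews": [],  # entries shaped {"phase", "verdict", "concerns"}
        "commits": [],         # entries shaped {"hash", "message"}
        "rework_cycles": 0,
        "pair": {"driver": "", "navigator": ""},
    }


def evidence_path(slice_id: str, cycle_number: int) -> PurePosixPath:
    """Build the pipeline-mode storage path (NNN assumed zero-padded to 3 digits)."""
    return (
        PurePosixPath(".factory/audit-trail/slices")
        / slice_id
        / "tdd-cycles"
        / f"cycle-{cycle_number:03d}.json"
    )
```

For example, evidence_path("slice-7", 3) yields .factory/audit-trail/slices/slice-7/tdd-cycles/cycle-003.json, where "slice-7" is a made-up identifier.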

Harness-Specific Guidance

If running on Claude Code, also read references/claude-code.md for harness-specific rules including hook-based enforcement. For maximum mechanical enforcement, ask the bootstrap skill to install optional hooks from references/hooks/claude-code-hooks.json.

Enforcement Note

Enforcement is proportional to capability:
  • Guided mode: Advisory. The skill text instructs correct behavior but cannot prevent violations. The human enforces by controlling phase transitions.
  • Automated mode (chaining): Advisory with self-enforcement. The agent follows phase boundaries by convention.
  • Automated mode (serial subagents): Structural enforcement via context isolation and handoff schemas. Subagents receive only phase-relevant files. Missing evidence blocks handoffs.
  • Automated mode (agent teams): Maximum enforcement through role specialization. Neither engineer can skip review because the other is watching. Persistent context means accumulated understanding, not just rules.
  • Optional hooks (Claude Code): Mechanical enforcement. Pre-tool-use hooks block unauthorized file edits per phase. See references/claude-code.md.
No mode guarantees perfect discipline. If you observe violations -- production code edited during RED, domain review skipped, commits missing -- point it out.

Verification

After completing a cycle, verify:
  • Every failing test was written BEFORE its implementation
  • Domain review occurred after EVERY RED and GREEN phase
  • Phase boundary rules were respected (file-type restrictions)
  • Evidence (test output) was provided at each handoff
  • Commit exists for every completed RED-GREEN cycle
  • Walking skeleton completed first (first vertical slice)
HARD GATE -- COMMIT (must pass before any new RED phase):
  • All tests pass
  • Git commit created with message referencing the current GWT scenario
  • No new RED phase started before this commit was made
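The hard gate reduces to a small predicate. In this sketch, "referencing the scenario" is assumed to mean that the scenario identifier appears somewhere in the commit message; that convention, the function name, and its parameters are all hypothetical.

```python
from typing import Optional


def commit_gate_passed(
    all_tests_pass: bool,
    commit_message: Optional[str],
    scenario: str,
) -> bool:
    """Return True only if a new RED phase may begin.

    commit_message is None when no commit has been made for the current cycle.
    """
    if not all_tests_pass or commit_message is None:
        return False
    # Assumed convention: the commit references the scenario by containing its id.
    return scenario in commit_message
```

Any False result means the next RED phase must wait until the suite is green and the commit exists.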

Dependencies

This skill works standalone. For enhanced workflows, it integrates with:
  • domain-modeling: Strengthens the domain review phases with parse-don't-validate, semantic types, and invalid-state prevention principles.
  • code-review: Three-stage review (spec compliance, code quality, domain integrity) after TDD cycles complete.
  • mutation-testing: Validates test quality by checking that tests detect injected mutations in production code.
  • ensemble-team: Provides real-world expert personas for pair selection and mob review.
Missing a dependency? Install with:
npx skills add jwilger/agent-skills --skill domain-modeling