/tdd - Test-Driven Development Workflow

Strict TDD workflow: tests first, then implementation.

When to Use

  • "Implement X using TDD"
  • "Build this feature test-first"
  • "Write tests for X then implement"
  • Any feature where test coverage is critical
  • Bug fixes that need regression tests

TDD Philosophy

Overview

Write the test first. Watch it fail. Write minimal code to pass.
Core principle: If you didn't watch the test fail, you don't know if it tests the right thing.
Violating the letter of the rules is violating the spirit of the rules.

The Iron Law

NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
Write code before the test? Delete it. Start over.
No exceptions:
  • Don't keep it as "reference"
  • Don't "adapt" it while writing tests
  • Don't look at it
  • Delete means delete
Implement fresh from tests. Period.

Red-Green-Refactor

RED - Write Failing Test

Write one minimal test showing what should happen.
Good:
typescript
test('retries failed operations 3 times', async () => {
  let attempts = 0;
  const operation = async () => {
    attempts++;
    if (attempts < 3) throw new Error('fail');
    return 'success';
  };

  const result = await retryOperation(operation);

  expect(result).toBe('success');
  expect(attempts).toBe(3);
});
Clear name, tests real behavior, one thing.
Bad:
typescript
test('retry works', async () => {
  const mock = jest.fn()
    .mockRejectedValueOnce(new Error())
    .mockRejectedValueOnce(new Error())
    .mockResolvedValueOnce('success');
  await retryOperation(mock);
  expect(mock).toHaveBeenCalledTimes(3);
});
Vague name, tests mock not code.
Requirements:
  • One behavior
  • Clear name
  • Real code (no mocks unless unavoidable)

Verify RED - Watch It Fail

MANDATORY. Never skip.
bash
npm test path/to/test.test.ts
# or
pytest path/to/test_file.py
Confirm:
  • Test fails (not errors)
  • Failure message is expected
  • Fails because feature missing (not typos)
Test passes? You're testing existing behavior. Fix test.
Test errors? Fix error, re-run until it fails correctly.
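The RED check separates a test that fails on its assertion from one that errors before reaching it. A minimal sketch of that distinction follows, using Node's built-in `assert` module rather than any particular test runner. `classifyRun`, `RunOutcome`, and `stubAdd` are hypothetical names introduced only for illustration.

```typescript
import { AssertionError, strict as assert } from "node:assert";

type RunOutcome = "passed" | "failed" | "errored";

// Hypothetical helper: runs a test body and reports how it ended.
// "failed" means an assertion was reached and did not hold;
// "errored" means something else broke first (typo, missing import, ...).
function classifyRun(testBody: () => void): RunOutcome {
  try {
    testBody();
    return "passed";
  } catch (e) {
    return e instanceof AssertionError ? "failed" : "errored";
  }
}

// Feature missing: the stub returns the wrong value, so the assertion fails.
const stubAdd = (_a: number, _b: number): number => 0;
const redRun = classifyRun(() => assert.equal(stubAdd(2, 3), 5));

// Simulated typo: the test blows up before reaching any assertion.
const typoRun = classifyRun(() => {
  throw new ReferenceError("stubAd is not defined");
});

console.log(redRun, typoRun); // failed errored
```

Only the first outcome counts as a valid RED: the test reached its assertion and the assertion reported the missing behavior.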

GREEN - Minimal Code

Write simplest code to pass the test.
Good:
typescript
async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
  for (let i = 0; i < 3; i++) {
    try {
      return await fn();
    } catch (e) {
      if (i === 2) throw e;
    }
  }
  throw new Error('unreachable');
}
Just enough to pass.
Bad:
typescript
async function retryOperation<T>(
  fn: () => Promise<T>,
  options?: {
    maxRetries?: number;
    backoff?: 'linear' | 'exponential';
    onRetry?: (attempt: number) => void;
  }
): Promise<T> {
  // YAGNI - over-engineered
}
Don't add features, refactor other code, or "improve" beyond the test.

Verify GREEN - Watch It Pass

MANDATORY.
bash
npm test path/to/test.test.ts
Confirm:
  • Test passes
  • Other tests still pass
  • Output pristine (no errors, warnings)
Test fails? Fix code, not test. Other tests fail? Fix now.

REFACTOR - Clean Up

After green only:
  • Remove duplication
  • Improve names
  • Extract helpers
Keep tests green. Don't add behavior.
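As an illustration of a refactor that changes structure but not behavior, the `retryOperation` function from the GREEN step could be tidied roughly as sketched here. `MAX_ATTEMPTS` and `isLastAttempt` are hypothetical names, not part of the original example. The retry count and the error handling stay the same.

```typescript
// Named constant replaces the magic numbers 3 and 2 from the GREEN version.
const MAX_ATTEMPTS = 3;

// Extracted helper: true when no further retry will follow this attempt.
function isLastAttempt(attempt: number): boolean {
  return attempt === MAX_ATTEMPTS - 1;
}

async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
  for (let attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
    try {
      return await fn();
    } catch (e) {
      // Same rule as before: rethrow only once the attempts are exhausted.
      if (isLastAttempt(attempt)) throw e;
    }
  }
  throw new Error('unreachable');
}
```

The existing retry test should pass unchanged against this version. If it needs editing, the change was more than a refactor.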

Common Rationalizations

| Excuse | Reality |
|--------|---------|
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests passing immediately prove nothing. |
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
| "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. |
| "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is technical debt. |
| "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete. |
| "Need to explore first" | Fine. Throw away exploration, start with TDD. |
| "Test hard = design unclear" | Listen to test. Hard to test = hard to use. |
| "TDD will slow me down" | TDD faster than debugging. Pragmatic = test-first. |
| "Manual test faster" | Manual doesn't prove edge cases. You'll re-test every change. |

Red Flags - STOP and Start Over

  • Code before test
  • Test after implementation
  • Test passes immediately
  • Can't explain why test failed
  • Tests added "later"
  • Rationalizing "just this once"
  • "I already manually tested it"
  • "Tests after achieve the same purpose"
  • "Keep as reference" or "adapt existing code"
All of these mean: Delete code. Start over with TDD.

Verification Checklist

Before marking work complete:
  • Every new function/method has a test
  • Watched each test fail before implementing
  • Each test failed for expected reason (feature missing, not typo)
  • Wrote minimal code to pass each test
  • All tests pass
  • Output pristine (no errors, warnings)
  • Tests use real code (mocks only if unavoidable)
  • Edge cases and errors covered
Can't check all boxes? You skipped TDD. Start over.

When Stuck

| Problem | Solution |
|---------|----------|
| Don't know how to test | Write wished-for API. Write assertion first. Ask your human partner. |
| Test too complicated | Design too complicated. Simplify interface. |
| Must mock everything | Code too coupled. Use dependency injection. |
| Test setup huge | Extract helpers. Still complex? Simplify design. |
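
For the "Must mock everything" case, a small sketch of dependency injection may help. The names here (`Clock`, `isExpired`, `fixedClock`) are hypothetical and chosen only for illustration. The function takes its time source as a parameter, so a test can pass a fixed clock instead of mocking a global.

```typescript
// A dependency expressed as a plain function type.
type Clock = () => number; // milliseconds since the Unix epoch

// The clock is injected; production callers rely on the Date.now default.
function isExpired(expiresAtMs: number, now: Clock = Date.now): boolean {
  return now() >= expiresAtMs;
}

// In a test, pass a fixed clock: real code runs, nothing is mocked.
const fixedClock: Clock = () => 1_000;
console.log(isExpired(500, fixedClock));   // true
console.log(isExpired(2_000, fixedClock)); // false
```

Because the dependency arrives through the signature, the test exercises the real comparison logic and needs no mocking library.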

Workflow Execution

Workflow Overview

┌────────────┐    ┌──────────┐    ┌──────────┐    ┌───────────┐
│   plan-    │───▶│ arbiter  │───▶│  kraken  │───▶│  arbiter  │
│   agent    │    │          │    │          │    │           │
└────────────┘    └──────────┘    └──────────┘    └───────────┘
   Design          Write           Implement        Verify
   approach        failing         minimal          all tests
                   tests           code             pass

Agent Sequence

| # | Agent | Role | Output |
|---|-------|------|--------|
| 1 | plan-agent | Design test cases and implementation approach | Test plan |
| 2 | arbiter | Write failing tests (RED phase) | Test files |
| 3 | kraken | Implement minimal code to pass (GREEN phase) | Implementation |
| 4 | arbiter | Run all tests, verify nothing broken | Test report |

Core Principle

NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
Each agent follows the TDD contract:
  • arbiter writes tests that MUST fail initially
  • kraken writes MINIMAL code to make tests pass
  • arbiter confirms the full suite passes

Execution

Phase 1: Plan Test Cases

Task(
  subagent_type="plan-agent",
  prompt="""
  Design TDD approach for: [FEATURE_NAME]

  Define:
  1. What behaviors need to be tested
  2. Edge cases to cover
  3. Expected test structure

  DO NOT write any implementation code.
  Output: Test plan document
  """
)

Phase 2: Write Failing Tests (RED)

Task(
  subagent_type="arbiter",
  prompt="""
  Write failing tests for: [FEATURE_NAME]

  Test plan: [from phase 1]

  Requirements:
  - Write tests FIRST
  - Run tests to confirm they FAIL
  - Tests must fail because feature is missing (not syntax errors)
  - Create clear test names describing expected behavior

  DO NOT write any implementation code.
  """
)

Phase 3: Implement (GREEN)

Task(
  subagent_type="kraken",
  prompt="""
  Implement MINIMAL code to pass tests: [FEATURE_NAME]

  Tests location: [test file path]

  Requirements:
  - Write ONLY enough code to make tests pass
  - No additional features beyond what tests require
  - No "improvements" or "enhancements"
  - Run tests after each change

  Follow Red-Green-Refactor strictly.
  """
)

Phase 4: Validate

Task(
  subagent_type="arbiter",
  prompt="""
  Validate TDD implementation: [FEATURE_NAME]

  - Run full test suite
  - Verify all new tests pass
  - Verify no existing tests broke
  - Check test coverage if available
  """
)

TDD Rules Enforced

  1. arbiter cannot write implementation code
  2. kraken cannot add untested features
  3. Tests must fail before implementation
  4. Tests must pass after implementation
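
One way to read rules 3 and 4 is as a gate over the test reports that each phase produces. The sketch below assumes a hypothetical `PhaseReport` shape with pass and fail counts for the new tests. The workflow itself does not define such a type, so treat this as an illustration only.

```typescript
// Hypothetical summary of one run of the NEW tests; not an actual arbiter output format.
interface PhaseReport {
  passed: number;
  failed: number;
}

// Returns the list of violated rules; an empty list means the cycle is valid.
function checkTddCycle(red: PhaseReport, green: PhaseReport): string[] {
  const violations: string[] = [];
  // Rule 3: every new test must fail before implementation.
  if (red.failed === 0 || red.passed > 0) {
    violations.push("tests must fail before implementation");
  }
  // Rule 4: every new test must pass after implementation.
  if (green.failed > 0 || green.passed === 0) {
    violations.push("tests must pass after implementation");
  }
  return violations;
}

console.log(checkTddCycle({ passed: 0, failed: 8 }, { passed: 8, failed: 0 })); // []
console.log(checkTddCycle({ passed: 8, failed: 0 }, { passed: 8, failed: 0 }).length); // 1
```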

Example

User: /tdd Add email validation to the signup form

Claude: Starting /tdd workflow for email validation...

Phase 1: Planning test cases...
[Spawns plan-agent]
Test plan:
- Valid email formats
- Invalid email formats
- Empty email rejection
- Edge cases (unicode, long emails)

Phase 2: Writing failing tests (RED)...
[Spawns arbiter]
✅ 8 tests written, all failing as expected

Phase 3: Implementing minimal code (GREEN)...
[Spawns kraken]
✅ All 8 tests now passing

Phase 4: Validating...
[Spawns arbiter]
✅ 247 tests passing (8 new), 0 failing

TDD workflow complete!
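
To make the session above more concrete, here is a rough sketch of what a few planned cases and a minimal validator might look like once GREEN is reached. Everything in it is an assumption made for illustration: the function name `isValidEmail`, the 254-character limit, and the specific cases are not taken from the workflow. In a real run the `cases` table would be written and seen failing before the function body exists.

```typescript
const MAX_EMAIL_LENGTH = 254; // assumed limit for this sketch

// Minimal validator: just enough logic to satisfy the cases listed below.
function isValidEmail(input: string): boolean {
  if (input.length === 0 || input.length > MAX_EMAIL_LENGTH) return false;
  const at = input.indexOf("@");
  // Exactly one "@", with something on both sides of it.
  if (at <= 0 || at !== input.lastIndexOf("@") || at === input.length - 1) return false;
  const domain = input.slice(at + 1);
  return domain.includes(".") && !domain.startsWith(".") && !domain.endsWith(".");
}

// Cases mirroring the test plan: valid, invalid, empty, overly long.
const cases: Array<[string, boolean]> = [
  ["user@example.com", true],
  ["no-at-sign.example.com", false],
  ["", false],
  ["a".repeat(250) + "@example.com", false],
];

const mismatches = cases.filter(([input, expected]) => isValidEmail(input) !== expected);
console.log(mismatches.length); // 0
```

Each further rule, such as unicode handling, would start as a new failing row in `cases` before any change to `isValidEmail`.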

Refactor Phase (Optional)

After GREEN, you can add a refactor phase:
Task(
  subagent_type="kraken",
  prompt="""
  Refactor: [FEATURE_NAME]

  - Clean up code while keeping tests green
  - Remove duplication
  - Improve naming
  - Extract helpers if needed

  DO NOT add new behavior. Keep all tests passing.
  """
)