test-driven-development

Test-Driven Development

Overview

TDD enforces the RED-GREEN-REFACTOR cycle as an unbreakable discipline: write a failing test, make it pass with minimal code, then clean up. This skill prevents untested production code from ever existing and ensures every line of implementation is driven by a verified requirement.
Announce at start: "I'm using the test-driven-development skill with the RED-GREEN-REFACTOR cycle."

Iron Law

┌─────────────────────────────────────────────────────────────────┐
│  HARD-GATE: NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST    │
│                                                                 │
│  This is non-negotiable. There are no exceptions. If you are   │
│  writing production code and there is no failing test demanding │
│  that code, you are violating this skill. STOP immediately     │
│  and write the test first.                                     │
└─────────────────────────────────────────────────────────────────┘

Phase 1: RED (Write a Failing Test)

Goal: Write exactly ONE test that fails for the right reason.

Actions

  1. Identify the smallest unit of behavior to implement next
  2. Write a test that asserts that behavior exists
  3. Run the test suite — confirm the new test FAILS
  4. Read the failure message — confirm it fails for the RIGHT reason (missing functionality, not syntax error or import error)
  5. If it fails for the wrong reason, fix the test until it fails correctly

STOP — HARD-GATE: Do NOT proceed to GREEN until:

  • Test is written and saved
  • Test suite has been run
  • New test fails
  • Failure reason is correct (tests the intended behavior)

Phase 2: GREEN (Make It Pass)

Goal: Write the MINIMUM production code to make the failing test pass.

Actions

  1. Write only enough code to make the failing test pass
  2. Do NOT refactor. Do NOT clean up. Do NOT optimize
  3. Hardcode values if that makes the test pass — that is fine
  4. Run the full test suite
  5. ALL tests must pass (not just the new one)

STOP — HARD-GATE: Do NOT proceed to REFACTOR until:

  • Production code is written
  • Full test suite has been run
  • ALL tests pass (new and existing)
  • No more code was written than necessary
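
A rough illustration of a GREEN step for the registration test, assuming a hypothetical `User` dataclass. The implementation deliberately does no validation, hashing, or storage, because no failing test demands them yet.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical value object used only for this sketch."""
    email: str


def register(email, password):
    # Minimal GREEN step: just enough to satisfy the one failing test.
    # The password is ignored because no test specifies what to do with it.
    return User(email=email)


def test_registration_with_valid_email_and_password_succeeds():
    user = register("ada@example.com", "correct-horse-battery")
    assert user is not None
    assert user.email == "ada@example.com"


# Calling the test directly stands in for running the full suite here.
test_registration_with_valid_email_and_password_succeeds()
```

Anything that looks missing, such as rejecting an empty email, becomes the subject of the next RED phase rather than part of this one.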

Phase 3: REFACTOR (Clean Up)

Goal: Improve code quality without changing behavior.

Actions

  1. Look for duplication, poor naming, long methods, code smells
  2. Make ONE refactoring change at a time
  3. Run the full test suite after EACH change
  4. If any test fails, undo the refactoring immediately
  5. Continue until the code is clean

STOP — HARD-GATE: Do NOT proceed to next RED until:

  • Code is clean and readable
  • All tests still pass after refactoring
  • No behavior was changed during refactoring
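
One way to picture a single refactoring step is to run the same assertions against the code before and after the change. The names `register_before`, `register_after`, `_validate_email`, and `check_behavior` are assumptions made for this sketch; in practice there is one `register` and the existing test suite plays the role of `check_behavior`.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical value object used only for this sketch."""
    email: str


def register_before(email, password):
    # Validation is inlined in the function body.
    if not email:
        raise ValueError("email is required")
    return User(email=email)


def _validate_email(email):
    # The ONE refactoring change: validation extracted into a helper.
    if not email:
        raise ValueError("email is required")


def register_after(email, password):
    _validate_email(email)
    return User(email=email)


def check_behavior(register):
    """Assertions that must hold both before and after the refactoring."""
    assert register("ada@example.com", "pw").email == "ada@example.com"
    try:
        register("", "pw")
    except ValueError:
        pass
    else:
        raise AssertionError("empty email should be rejected")


check_behavior(register_before)
check_behavior(register_after)
```

If the second call raised, the refactoring changed behavior and would be undone before trying anything else.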

HARD-GATE Enforcement

┌─────────────────────────────────────────────────────────────┐
│  HARD-GATE: PHASE COMPLETION CHECK                          │
│                                                             │
│  Before moving to next phase, ALL items in the              │
│  STOP MARKER checklist must be satisfied.                   │
│                                                             │
│  If ANY item is not satisfied:                              │
│  → STOP                                                    │
│  → Complete the missing item                               │
│  → Re-verify ALL items                                     │
│  → ONLY THEN proceed                                       │
└─────────────────────────────────────────────────────────────┘

Watch Mode Discipline

After every change to any file (test or production), run the relevant test suite. No exceptions.
| Action | Run Tests? | Expected Result |
| --- | --- | --- |
| Write a test | Yes | Failure (RED) |
| Write production code | Yes | Pass (GREEN) |
| Refactor code | Yes | Pass (still GREEN) |
| Any other edit | Yes | No regressions |
If your test runner supports watch mode, use it. If not, run tests manually after every save.

Decision Table: Test Type Selection

| Behavior Being Tested | Test Type | Framework Example |
| --- | --- | --- |
| Pure function logic | Unit test | Vitest, pytest, cargo test |
| API endpoint request/response | Integration test | Supertest, httpx |
| Database query correctness | Integration test | Testcontainers |
| UI component rendering | Unit test | React Testing Library |
| Full user workflow | E2E test | Playwright |
| Error handling path | Unit test | Vitest, pytest |

Example Cycle

Requirement: "Users can register with email and password"

Behavior List:
1. Registration with valid email and password succeeds
2. Registration fails if email is empty
3. Registration fails if password is too short
4. Registration fails if email is already taken

Cycle 1 - Behavior 1:
  RED:   test_registration_with_valid_email_and_password_succeeds → FAIL (no register function)
  GREEN: def register(email, password): return User(email=email) → PASS
  REFACTOR: rename variable for clarity → PASS

Cycle 2 - Behavior 2:
  RED:   test_registration_fails_if_email_is_empty → FAIL (no validation)
  GREEN: add if not email: raise ValueError → PASS
  REFACTOR: extract validation to separate method → PASS

...continue for each behavior...
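
To show how the cycle continues, here is a sketch of the suite after a third cycle, for behavior 3. The requirement does not state a minimum password length, so `MIN_PASSWORD_LENGTH = 8` is an assumption, as are the `User` dataclass and the `expect_value_error` helper.

```python
from dataclasses import dataclass

MIN_PASSWORD_LENGTH = 8  # assumption: the requirement gives no number


@dataclass
class User:
    """Hypothetical value object used only for this sketch."""
    email: str


def register(email, password):
    if not email:  # added in cycle 2
        raise ValueError("email is required")
    if len(password) < MIN_PASSWORD_LENGTH:  # added in cycle 3 GREEN
        raise ValueError("password is too short")
    return User(email=email)


def expect_value_error(func, *args):
    """Return True if calling func(*args) raises ValueError."""
    try:
        func(*args)
    except ValueError:
        return True
    return False


def test_registration_with_valid_email_and_password_succeeds():
    assert register("ada@example.com", "long-enough-pw").email == "ada@example.com"


def test_registration_fails_if_email_is_empty():
    assert expect_value_error(register, "", "long-enough-pw")


def test_registration_fails_if_password_is_too_short():
    assert expect_value_error(register, "ada@example.com", "short")


# GREEN means the whole suite passes, not just the newest test.
for test in (
    test_registration_with_valid_email_and_password_succeeds,
    test_registration_fails_if_email_is_empty,
    test_registration_fails_if_password_is_too_short,
):
    test()
```

Each cycle adds one test and one small piece of production code, and the earlier tests keep running alongside the new one.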

Checklist: Starting a New Feature with TDD

  1. Understand the requirement fully before writing any code
  2. Break the requirement into a list of specific behaviors
  3. Order behaviors from simplest to most complex
  4. Create a task for the first behavior
  5. Enter RED phase: write failing test for first behavior
  6. Enter GREEN phase: write minimal code to pass
  7. Enter REFACTOR phase: clean up
  8. Create task for next behavior, repeat from step 5
  9. After all behaviors are implemented, run full test suite
  10. Invoke verification-before-completion before claiming done

Test Quality Standards

Each test must be:
| Standard | Definition |
| --- | --- |
| Fast | Milliseconds, not seconds |
| Isolated | No shared state between tests, no test ordering dependencies |
| Repeatable | Same result every time, no flakiness |
| Self-validating | Pass or fail, no manual interpretation needed |
| Timely | Written before the production code (that is the whole point) |
Each test should:
  • Test ONE behavior or scenario
  • Have a descriptive name that explains the scenario and expected outcome
  • Follow Arrange-Act-Assert (or Given-When-Then) structure
  • Use the minimum setup necessary
  • Assert outcomes, not implementation details
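
The points above might look like this in a single test. The `InMemoryUsers` store and this three-argument `register` are hypothetical, introduced only to give the test something to exercise. Each test builds its own store, which keeps tests isolated.

```python
class InMemoryUsers:
    """Hypothetical in-memory store, created fresh in each test."""

    def __init__(self):
        self._emails = set()

    def add(self, email):
        if email in self._emails:
            raise ValueError("email is already taken")
        self._emails.add(email)

    def exists(self, email):
        return email in self._emails


def register(users, email, password):
    users.add(email)
    return email


def test_registration_fails_if_email_is_already_taken():
    # Arrange: minimum setup, one existing user.
    users = InMemoryUsers()
    register(users, "ada@example.com", "long-enough-pw")

    # Act: attempt the one behavior under test.
    try:
        register(users, "ada@example.com", "another-pw")
        rejected = False
    except ValueError:
        rejected = True

    # Assert: check observable outcomes, not the private _emails set.
    assert rejected
    assert users.exists("ada@example.com")


test_registration_fails_if_email_is_already_taken()
```

Because the assertions go through `exists` rather than `_emails`, the store could later switch to a different internal structure without breaking the test.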

Anti-Patterns / Common Mistakes

| Anti-Pattern | Why It Is Wrong | Correct Approach |
| --- | --- | --- |
| Writing production code first | Defeats the purpose of TDD; tests shaped to pass | Write the test first, always |
| Writing multiple tests before any code | Batch testing defeats incremental design | One test, one cycle |
| Test passes on first run | Either test is wrong or behavior already exists | Investigate before proceeding |
| Spending >5 minutes in GREEN | Writing too much code at once | Simplify; make test more specific |
| Modifying tests to match code | Tests specify behavior; code must match tests | Fix the code, not the test |
| Skipping REFACTOR phase | Technical debt accumulates rapidly | Refactor every cycle |
| Not running tests after every change | Regressions go unnoticed | Run tests after every save |

Rationalization Prevention

| Excuse | Reality |
| --- | --- |
| "It's just a small change" | Small changes cause production outages. Test it. |
| "I'll write the tests after" | You will not. And if you do, they will be weaker because they were shaped to pass, not to specify. |
| "This is just a refactor" | Refactors change behavior more often than you think. Only the test suite proves that this one did not. |
| "I know this works" | You do not. You think you do. The test proves it. |
| "Tests would slow me down" | Debugging without tests slows you down 10x more. |
| "This code is too simple to test" | If it is too simple to get wrong, the test will be trivial to write. Write it. |
| "I can't test this because of dependencies" | Then your design has a coupling problem. Fix the design. |
| "The test would be harder to write than the code" | That means you do not understand the requirements well enough. The test forces you to clarify. |
| "I'll just manually verify it" | Manual verification is not repeatable, not documented, and not trustworthy. |
| "This is throwaway/prototype code" | Prototype code has a habit of becoming production code. Test it now or regret it later. |
| "The framework makes it hard to test" | Use the framework's testing utilities, or isolate your logic from the framework. |
| "I'm under time pressure" | TDD is faster over any timeline longer than 20 minutes. The pressure is exactly why you need it. |
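
For the "I can't test this because of dependencies" row, one common way to fix the design is to pass the dependency in. The sketch below does this for the system clock. The `is_token_expired` function and the one-hour `TOKEN_LIFETIME` are invented for illustration and are not part of this skill.

```python
from datetime import datetime, timedelta, timezone

TOKEN_LIFETIME = timedelta(hours=1)  # assumption for illustration


def is_token_expired(issued_at, now=None):
    """`now` is injectable so tests do not depend on the real clock."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - issued_at >= TOKEN_LIFETIME


def test_token_expires_after_lifetime():
    issued_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    # One minute before the lifetime ends: still valid.
    assert not is_token_expired(issued_at, now=issued_at + timedelta(minutes=59))
    # Exactly at the lifetime: expired.
    assert is_token_expired(issued_at, now=issued_at + timedelta(hours=1))


test_token_expires_after_lifetime()
```

Production callers omit `now` and get the real clock, while the test controls time explicitly, so the result is repeatable.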

Red Flags

If you observe any of these, STOP and reassess:
| Red Flag | What It Means | Action |
| --- | --- | --- |
| Writing production code with no failing test | Immediate violation | Stop. Write the test. |
| Test passes immediately on first run | Test is wrong or behavior exists | Investigate before proceeding |
| More than 5 minutes in GREEN phase | Writing too much code | Simplify. Make test more specific. |
| Refactoring changes behavior | Test coverage has a gap | Add missing tests |
| Tests modified to pass | Requirements inverted | Fix code to match tests |
| Multiple tests before any production code | Batch testing defeats purpose | One test at a time |
| Test suite not run after a change | Regressions invisible | Run tests. Always. Every time. |

Integration Points

| Skill | Relationship |
| --- | --- |
| verification-before-completion | MUST be invoked before claiming any TDD work is complete |
| systematic-debugging | When a test fails unexpectedly during REFACTOR, switch to debugging |
| code-review | After completing a feature via TDD, review the test suite for completeness |
| acceptance-testing | Acceptance criteria drive the behavior list for TDD cycles |
| planning | Plan breaks features into behaviors suitable for TDD cycles |
| testing-strategy | Strategy defines frameworks; TDD defines the cycle |

Test Types in TDD

| Type | Scope | Speed | When to Write |
| --- | --- | --- | --- |
| Unit (Primary) | Individual functions, methods, classes | Milliseconds | RED phase for every behavior |
| Integration (Secondary) | Component interactions | Seconds | After unit tests cover individual behaviors |
| E2E (Tertiary) | Complete user workflows | Seconds-minutes | Critical paths after unit and integration are solid |

Skill Type

RIGID — The RED-GREEN-REFACTOR cycle is mandatory and cannot be reordered, skipped, or combined. Every phase has a HARD-GATE that must be satisfied before proceeding. No production code without a failing test first.