test-driven-development

Test-Driven Development

Overview

TDD enforces the RED-GREEN-REFACTOR cycle as an unbreakable discipline: write a failing test, make it pass with minimal code, then clean up. This skill prevents untested production code from ever existing and ensures every line of implementation is driven by a verified requirement.

Announce at start: "I'm using the test-driven-development skill with the RED-GREEN-REFACTOR cycle."

Iron Law

┌─────────────────────────────────────────────────────────────────┐
│  HARD-GATE: NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST    │
│                                                                 │
│  This is non-negotiable. There are no exceptions. If you are   │
│  writing production code and there is no failing test demanding │
│  that code, you are violating this skill. STOP immediately     │
│  and write the test first.                                     │
└─────────────────────────────────────────────────────────────────┘

Phase 1: RED (Write a Failing Test)

Goal: Write exactly ONE test that fails for the right reason.

Actions

Identify the smallest unit of behavior to implement next
Write a test that asserts that behavior exists
Run the test suite — confirm the new test FAILS
Read the failure message — confirm it fails for the RIGHT reason (missing functionality, not syntax error or import error)
If it fails for the wrong reason, fix the test until it fails correctly

STOP — HARD-GATE: Do NOT proceed to GREEN until:

Test is written and saved
Test suite has been run
New test fails
Failure reason is correct (tests the intended behavior)

Phase 2: GREEN (Make It Pass)

Goal: Write the MINIMUM production code to make the failing test pass.

Actions

Write only enough code to make the failing test pass
Do NOT refactor. Do NOT clean up. Do NOT optimize
Hardcode values if that makes the test pass — that is fine
Run the full test suite
ALL tests must pass (not just the new one)

STOP — HARD-GATE: Do NOT proceed to REFACTOR until:

Production code is written
Full test suite has been run
ALL tests pass (new and existing)
No more code was written than necessary
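
A small sketch of what "minimum code" can look like, again assuming a hypothetical `slugify` function that is not part of this skill. The hardcoded return value is deliberate: the only test that exists asks for exactly this result, so nothing more general has been demanded yet.

```python
def slugify(title: str) -> str:
    """Hypothetical function under test."""
    # Hardcoded on purpose: the single existing test requires only this value.
    return "hello-world"


def test_slugify_replaces_spaces_with_hyphens():
    assert slugify("Hello World") == "hello-world"
```

A later RED test with a different title would fail against this implementation, and that failure is what would justify generalizing it.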

Phase 3: REFACTOR (Clean Up)

Goal: Improve code quality without changing behavior.

Actions

Look for duplication, poor naming, long methods, code smells
Make ONE refactoring change at a time
Run the full test suite after EACH change
If any test fails, undo the refactoring immediately
Continue until the code is clean

STOP — HARD-GATE: Do NOT proceed to next RED until:

Code is clean and readable
All tests still pass after refactoring
No behavior was changed during refactoring
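
As an illustration of one refactoring change that leaves behavior untouched, the sketch below removes duplicated discount arithmetic by extracting a helper. All names (`member_price`, `student_price`, `_apply_discount`) are hypothetical and chosen only for this example. The `_before` versions are kept side by side so the two shapes can be compared.

```python
# Before the refactoring: the same arithmetic is written twice.
def member_price_before(amount_cents: int) -> int:
    return amount_cents - (amount_cents * 10) // 100


def student_price_before(amount_cents: int) -> int:
    return amount_cents - (amount_cents * 20) // 100


# After ONE refactoring change: the duplication is extracted into a helper.
def _apply_discount(amount_cents: int, percent: int) -> int:
    return amount_cents - (amount_cents * percent) // 100


def member_price(amount_cents: int) -> int:
    return _apply_discount(amount_cents, 10)


def student_price(amount_cents: int) -> int:
    return _apply_discount(amount_cents, 20)


# The tests are unchanged by the refactoring and must still pass.
def test_member_price_applies_ten_percent_discount():
    assert member_price(1000) == 900


def test_student_price_applies_twenty_percent_discount():
    assert student_price(1000) == 800
```

If either test failed after the extraction, the step described above applies: undo the change rather than patching forward.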

HARD-GATE Enforcement

┌─────────────────────────────────────────────────────────────┐
│  HARD-GATE: PHASE COMPLETION CHECK                          │
│                                                             │
│  Before moving to next phase, ALL items in the              │
│  STOP MARKER checklist must be satisfied.                   │
│                                                             │
│  If ANY item is not satisfied:                              │
│  → STOP                                                    │
│  → Complete the missing item                               │
│  → Re-verify ALL items                                     │
│  → ONLY THEN proceed                                       │
└─────────────────────────────────────────────────────────────┘

Watch Mode Discipline

After every change to any file (test or production), run the relevant test suite. No exceptions.

Action	Run Tests?	Expected Result
Write a test	Yes	Failure (RED)
Write production code	Yes	Pass (GREEN)
Refactor code	Yes	Pass (still GREEN)
Any other edit	Yes	No regressions

If your test runner supports watch mode, use it. If not, run tests manually after every save.
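
Where no built-in watch mode is available, a small polling loop can approximate one. The sketch below is an assumption-laden illustration, not a recommended tool: it watches only `.py` files, polls modification times, and assumes pytest is installed and is the project's test runner.

```python
import os
import subprocess
import sys
import time

# Assumed test command; replace with whatever the project actually uses.
DEFAULT_COMMAND = [sys.executable, "-m", "pytest", "-q"]


def snapshot(root: str) -> dict:
    """Map every .py file under root to its last-modified time."""
    mtimes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                mtimes[path] = os.path.getmtime(path)
    return mtimes


def watch(root: str, command=DEFAULT_COMMAND, interval: float = 0.5) -> None:
    """Run command once, then again whenever a .py file is added, removed, or modified."""
    previous = snapshot(root)
    subprocess.run(command)
    while True:  # runs until interrupted, e.g. with Ctrl+C
        time.sleep(interval)
        current = snapshot(root)
        if current != previous:
            previous = current
            subprocess.run(command)
```

Calling `watch(".")` from the project root would start the loop. A dedicated watch mode in the test runner, where one exists, is likely to be more efficient than polling.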

Decision Table: Test Type Selection

Behavior Being Tested	Test Type	Framework Example
Pure function logic	Unit test	Vitest, pytest, cargo test
API endpoint request/response	Integration test	Supertest, httpx
Database query correctness	Integration test	Testcontainers
UI component rendering	Unit test	React Testing Library
Full user workflow	E2E test	Playwright
Error handling path	Unit test	Vitest, pytest

Example Cycle

Requirement: "Users can register with email and password"

Behavior List:
1. Registration with valid email and password succeeds
2. Registration fails if email is empty
3. Registration fails if password is too short
4. Registration fails if email is already taken

Cycle 1 - Behavior 1:
  RED:   test_registration_with_valid_email_and_password_succeeds → FAIL (no register function)
  GREEN: def register(email, password): return User(email=email) → PASS
  REFACTOR: rename variable for clarity → PASS

Cycle 2 - Behavior 2:
  RED:   test_registration_fails_if_email_is_empty → FAIL (no validation)
  GREEN: add if not email: raise ValueError → PASS
  REFACTOR: extract validation to separate method → PASS

...continue for each behavior...
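
One way the code might look after Cycles 1 and 2 is sketched below. The `User` dataclass, the helper name `_validate_email`, and the sample email and password values are assumptions made for illustration. The tests use plain `try`/`except` so the block has no third-party dependencies.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Assumed minimal shape of a registered user."""
    email: str


def _validate_email(email: str) -> None:
    # Extracted during the REFACTOR step of Cycle 2.
    if not email:
        raise ValueError("email must not be empty")


def register(email: str, password: str) -> User:
    # password is not checked yet: no test has demanded it (that is Behavior 3).
    _validate_email(email)
    return User(email=email)


def test_registration_with_valid_email_and_password_succeeds():
    user = register("ada@example.com", "correct horse battery")
    assert user.email == "ada@example.com"


def test_registration_fails_if_email_is_empty():
    try:
        register("", "correct horse battery")
    except ValueError:
        return  # expected outcome
    raise AssertionError("expected ValueError for an empty email")
```

Password length and duplicate emails are intentionally left unhandled here, since Behaviors 3 and 4 have not yet had their RED step.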

Checklist: Starting a New Feature with TDD

1. Understand the requirement fully before writing any code
2. Break the requirement into a list of specific behaviors
3. Order behaviors from simplest to most complex
4. Create a task for the first behavior
5. Enter RED phase: write failing test for first behavior
6. Enter GREEN phase: write minimal code to pass
7. Enter REFACTOR phase: clean up
8. Create task for next behavior, repeat from step 5
9. After all behaviors are implemented, run full test suite
10. Invoke `verification-before-completion` before claiming done

Test Quality Standards

Each test must be:

Standard	Definition
Fast	Milliseconds, not seconds
Isolated	No shared state between tests, no test ordering dependencies
Repeatable	Same result every time, no flakiness
Self-validating	Pass or fail, no manual interpretation needed
Timely	Written before the production code (that is the whole point)

Each test should:

Test ONE behavior or scenario
Have a descriptive name that explains the scenario and expected outcome
Follow Arrange-Act-Assert (or Given-When-Then) structure
Use the minimum setup necessary
Assert outcomes, not implementation details
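
The points above can be illustrated with a short Arrange-Act-Assert test. The `Cart` class is a hypothetical example, not something defined by this skill. The test checks the outcome through the public method and never inspects the private `_items` list.

```python
class Cart:
    """Hypothetical shopping cart used only for this illustration."""

    def __init__(self):
        self._items = []  # implementation detail; tests should not read this

    def add(self, name: str, unit_price_cents: int, quantity: int = 1) -> None:
        self._items.append((name, unit_price_cents, quantity))

    def total_cents(self) -> int:
        return sum(price * quantity for _name, price, quantity in self._items)


def test_total_is_sum_of_price_times_quantity_for_each_item():
    # Arrange: the minimum setup this scenario needs.
    cart = Cart()
    cart.add("pen", 150, quantity=2)
    cart.add("notebook", 400)

    # Act: one behavior.
    total = cart.total_cents()

    # Assert: the observable outcome, not how items are stored.
    assert total == 700
```

Because the assertion depends only on `total_cents()`, the internal storage could change from a list to a dictionary without the test needing to change.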

Anti-Patterns / Common Mistakes

Anti-Pattern	Why It Is Wrong	Correct Approach
Writing production code first	Defeats the purpose of TDD; tests shaped to pass	Write the test first, always
Writing multiple tests before any code	Batch testing defeats incremental design	One test, one cycle
Test passes on first run	Either test is wrong or behavior already exists	Investigate before proceeding
Spending >5 minutes in GREEN	Writing too much code at once	Simplify; make test more specific
Modifying tests to match code	Tests specify behavior; code must match tests	Fix the code, not the test
Skipping REFACTOR phase	Technical debt accumulates rapidly	Refactor every cycle
Not running tests after every change	Regressions go unnoticed	Run tests after every save

Rationalization Prevention

Excuse	Reality
"It's just a small change"	Small changes cause production outages. Test it.
"I'll write the tests after"	You will not. And if you do, they will be weaker because they were shaped to pass, not to specify.
"This is just a refactor"	Refactors change behavior more often than you think. A passing test suite is the proof that this one did not.
"I know this works"	You do not. You think you do. The test proves it.
"Tests would slow me down"	Debugging without tests slows you down 10x more.
"This code is too simple to test"	If it is too simple to get wrong, the test will be trivial to write. Write it.
"I can't test this because of dependencies"	Then your design has a coupling problem. Fix the design.
"The test would be harder to write than the code"	That means you do not understand the requirements well enough. The test forces you to clarify.
"I'll just manually verify it"	Manual verification is not repeatable, not documented, and not trustworthy.
"This is throwaway/prototype code"	Prototype code has a habit of becoming production code. Test it now or regret it later.
"The framework makes it hard to test"	Use the framework's testing utilities, or isolate your logic from the framework.
"I'm under time pressure"	TDD is faster over any timeline longer than 20 minutes. The pressure is exactly why you need it.

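For the "dependencies" excuse in the table above, one common way to loosen the coupling is to pass the dependency in as a parameter. The sketch below assumes a hypothetical `greeting` function that depends on the current time. The names `greeting` and `_utc_now` are illustrative only.

```python
from datetime import datetime, timezone
from typing import Callable


def _utc_now() -> datetime:
    """Default clock used in production."""
    return datetime.now(timezone.utc)


def greeting(now: Callable[[], datetime] = _utc_now) -> str:
    """Return a greeting based on the hour reported by the injected clock."""
    return "Good morning" if now().hour < 12 else "Good afternoon"


def test_greeting_is_good_morning_before_noon():
    # A fixed clock makes the test repeatable at any time of day.
    fixed_clock = lambda: datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
    assert greeting(now=fixed_clock) == "Good morning"


def test_greeting_is_good_afternoon_from_noon():
    fixed_clock = lambda: datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    assert greeting(now=fixed_clock) == "Good afternoon"
```

Production callers can keep calling `greeting()` with no arguments, while tests supply their own clock. The same idea extends to databases, HTTP clients, and other collaborators.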

Red Flags

If you observe any of these, STOP and reassess:

Red Flag	What It Means	Action
Writing production code with no failing test	Immediate violation	Stop. Write the test.
Test passes immediately on first run	Test is wrong or behavior exists	Investigate before proceeding
More than 5 minutes in GREEN phase	Writing too much code	Simplify. Make test more specific.
Refactoring changes behavior	Test coverage has a gap	Add missing tests
Tests modified to pass	Requirements inverted	Fix code to match tests
Multiple tests before any production code	Batch testing defeats purpose	One test at a time
Test suite not run after a change	Regressions invisible	Run tests. Always. Every time.

Integration Points

Skill	Relationship
`verification-before-completion`	MUST be invoked before claiming any TDD work is complete
`systematic-debugging`	When a test fails unexpectedly during REFACTOR, switch to debugging
`code-review`	After completing a feature via TDD, review the test suite for completeness
`acceptance-testing`	Acceptance criteria drive the behavior list for TDD cycles
`planning`	Plan breaks features into behaviors suitable for TDD cycles
`testing-strategy`	Strategy defines frameworks; TDD defines the cycle

Test Types in TDD

Type	Scope	Speed	When to Write
Unit (Primary)	Individual functions, methods, classes	Milliseconds	RED phase for every behavior
Integration (Secondary)	Component interactions	Seconds	After unit tests cover individual behaviors
E2E (Tertiary)	Complete user workflows	Seconds-minutes	Critical paths after unit and integration are solid

Skill Type

RIGID — The RED-GREEN-REFACTOR cycle is mandatory and cannot be reordered, skipped, or combined. Every phase has a HARD-GATE that must be satisfied before proceeding. No production code without a failing test first.
