test-driven-development
Test-Driven Development
Overview
TDD enforces the RED-GREEN-REFACTOR cycle as an unbreakable discipline: write a failing test, make it pass with minimal code, then clean up. This skill prevents untested production code from ever existing and ensures every line of implementation is driven by a verified requirement.
Announce at start: "I'm using the test-driven-development skill with the RED-GREEN-REFACTOR cycle."
Iron Law
┌─────────────────────────────────────────────────────────────────┐
│ HARD-GATE: NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST │
│ │
│ This is non-negotiable. There are no exceptions. If you are │
│ writing production code and there is no failing test demanding │
│ that code, you are violating this skill. STOP immediately │
│ and write the test first. │
└─────────────────────────────────────────────────────────────────┘
Phase 1: RED (Write a Failing Test)
Goal: Write exactly ONE test that fails for the right reason.
Actions
- Identify the smallest unit of behavior to implement next
- Write a test that asserts that behavior exists
- Run the test suite — confirm the new test FAILS
- Read the failure message — confirm it fails for the RIGHT reason (missing functionality, not syntax error or import error)
- If it fails for the wrong reason, fix the test until it fails correctly
STOP — HARD-GATE: Do NOT proceed to GREEN until:
- Test is written and saved
- Test suite has been run
- New test fails
- Failure reason is correct (tests the intended behavior)
Phase 2: GREEN (Make It Pass)
Goal: Write the MINIMUM production code to make the failing test pass.
Actions
- Write only enough code to make the failing test pass
- Do NOT refactor. Do NOT clean up. Do NOT optimize
- Hardcode values if that makes the test pass — that is fine
- Run the full test suite
- ALL tests must pass (not just the new one)
STOP — HARD-GATE: Do NOT proceed to REFACTOR until:
- Production code is written
- Full test suite has been run
- ALL tests pass (new and existing)
- No more code was written than necessary
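A minimal sketch of what "no more code than necessary" might look like for the registration behavior, assuming a hypothetical `User` dataclass with a single `email` field. Nothing here validates, hashes, or stores anything, because no failing test demands it yet.

```python
from dataclasses import dataclass


@dataclass
class User:
    email: str


def register(email, password):
    # Minimum code for the one failing test: no validation, no hashing, no storage.
    return User(email=email)


def test_registration_with_valid_email_and_password_succeeds():
    user = register("ada@example.com", "correct-horse")
    assert user.email == "ada@example.com"


if __name__ == "__main__":
    test_registration_with_valid_email_and_password_succeeds()
    print("GREEN")
```

The unused `password` parameter is deliberate in this sketch: a later RED test (for example, "password is too short") would be what forces the code to use it.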
Phase 3: REFACTOR (Clean Up)
Goal: Improve code quality without changing behavior.
Actions
- Look for duplication, poor naming, long methods, code smells
- Make ONE refactoring change at a time
- Run the full test suite after EACH change
- If any test fails, undo the refactoring immediately
- Continue until the code is clean
STOP — HARD-GATE: Do NOT proceed to next RED until:
- Code is clean and readable
- All tests still pass after refactoring
- No behavior was changed during refactoring
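One way to picture a single refactoring step is to hold the same checks against the code before and after the change. The sketch below is illustrative only: `register_before`, `register_after`, `_validate_email`, and `behaves_the_same` are hypothetical names, and in practice there would be one `register` function edited in place with the existing test suite re-run after the edit.

```python
from dataclasses import dataclass


@dataclass
class User:
    email: str


def register_before(email, password):
    # State at the end of GREEN: validation is inlined.
    if not email:
        raise ValueError("email is required")
    return User(email=email)


def _validate_email(email):
    if not email:
        raise ValueError("email is required")


def register_after(email, password):
    # One refactoring step: validation extracted, behavior unchanged.
    _validate_email(email)
    return User(email=email)


def behaves_the_same(register):
    """The existing checks, parameterized over the implementation under test."""
    assert register("ada@example.com", "correct-horse") == User(email="ada@example.com")
    try:
        register("", "correct-horse")
    except ValueError:
        return True
    return False


if __name__ == "__main__":
    print(behaves_the_same(register_before), behaves_the_same(register_after))
```

If the second call reported a different outcome from the first, the refactoring changed behavior and should be undone.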
HARD-GATE Enforcement
┌─────────────────────────────────────────────────────────────┐
│ HARD-GATE: PHASE COMPLETION CHECK │
│ │
│ Before moving to next phase, ALL items in the │
│ STOP MARKER checklist must be satisfied. │
│ │
│ If ANY item is not satisfied: │
│ → STOP │
│ → Complete the missing item │
│ → Re-verify ALL items │
│ → ONLY THEN proceed │
└─────────────────────────────────────────────────────────────┘
Watch Mode Discipline
After every change to any file (test or production), run the relevant test suite. No exceptions.
| Action | Run Tests? | Expected Result |
|---|---|---|
| Write a test | Yes | Failure (RED) |
| Write production code | Yes | Pass (GREEN) |
| Refactor code | Yes | Pass (still GREEN) |
| Any other edit | Yes | No regressions |
If your test runner supports watch mode, use it. If not, run tests manually after every save.
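Where a runner has no watch mode, the "run after every save" rule could be approximated with a small polling loop. The following is a rough sketch using only the standard library, not a substitute for a runner's built-in watcher; `snapshot`, `changed`, and `watch` are hypothetical helper names, and the `.py` filter and one-second interval are assumptions.

```python
import os
import subprocess
import time


def snapshot(root):
    """Map every .py file under root to its last-modified time in nanoseconds."""
    mtimes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                mtimes[path] = os.stat(path).st_mtime_ns
    return mtimes


def changed(before, after):
    """Return the sorted paths that were added, removed, or modified."""
    return sorted(p for p in before.keys() | after.keys() if before.get(p) != after.get(p))


def watch(root, command, interval=1.0):
    """Re-run the test command whenever a file changes. Runs until interrupted."""
    previous = snapshot(root)
    while True:
        time.sleep(interval)
        current = snapshot(root)
        if changed(previous, current):
            subprocess.run(command)
            previous = current
```

A call such as `watch(".", ["python", "-m", "unittest"])` would re-run standard-library test discovery after each detected change; the command list is whatever invokes the project's own test suite.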
Decision Table: Test Type Selection
| Behavior Being Tested | Test Type | Framework Example |
|---|---|---|
| Pure function logic | Unit test | Vitest, pytest, cargo test |
| API endpoint request/response | Integration test | Supertest, httpx |
| Database query correctness | Integration test | Testcontainers |
| UI component rendering | Unit test | React Testing Library |
| Full user workflow | E2E test | Playwright |
| Error handling path | Unit test | Vitest, pytest |
Example Cycle
Requirement: "Users can register with email and password"
Behavior List:
1. Registration with valid email and password succeeds
2. Registration fails if email is empty
3. Registration fails if password is too short
4. Registration fails if email is already taken
Cycle 1 - Behavior 1:
RED: test_registration_with_valid_email_and_password_succeeds → FAIL (no register function)
GREEN: def register(email, password): return User(email=email) → PASS
REFACTOR: rename variable for clarity → PASS
Cycle 2 - Behavior 2:
RED: test_registration_fails_if_email_is_empty → FAIL (no validation)
GREEN: add if not email: raise ValueError → PASS
REFACTOR: extract validation to separate method → PASS
...continue for each behavior...
Checklist: Starting a New Feature with TDD
1. Understand the requirement fully before writing any code
2. Break the requirement into a list of specific behaviors
3. Order behaviors from simplest to most complex
4. Create a task for the first behavior
5. Enter RED phase: write a failing test for the current behavior
6. Enter GREEN phase: write minimal code to pass
7. Enter REFACTOR phase: clean up
8. Create a task for the next behavior and repeat from step 5
9. After all behaviors are implemented, run the full test suite
10. Invoke verification-before-completion before claiming done
Test Quality Standards
Each test must be:
| Standard | Definition |
|---|---|
| Fast | Milliseconds, not seconds |
| Isolated | No shared state between tests, no test ordering dependencies |
| Repeatable | Same result every time, no flakiness |
| Self-validating | Pass or fail, no manual interpretation needed |
| Timely | Written before the production code (that is the whole point) |
Each test should:
- Test ONE behavior or scenario
- Have a descriptive name that explains the scenario and expected outcome
- Follow Arrange-Act-Assert (or Given-When-Then) structure
- Use the minimum setup necessary
- Assert outcomes, not implementation details
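The points above can be sketched in a single test, assuming Python's standard `unittest` module. The `register` function and the `run_tests` helper are hypothetical stand-ins; the test name states the scenario and the expected outcome, setup is kept to two values, and the assertion targets the observable result rather than how validation is implemented.

```python
import unittest


def register(email, password):
    """Hypothetical function under test."""
    if not email:
        raise ValueError("email is required")
    return {"email": email}


class RegistrationValidationTest(unittest.TestCase):
    def test_registration_with_empty_email_raises_value_error(self):
        # Arrange: the minimum setup this behavior needs.
        email, password = "", "correct-horse"

        # Act and Assert: check the observable outcome (the error raised),
        # not which internal helper performed the validation.
        with self.assertRaises(ValueError):
            register(email, password)


def run_tests():
    """Run the test case and return the raw unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RegistrationValidationTest)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    print("passed:", run_tests().wasSuccessful())
```

Because the test creates everything it needs and shares nothing with other tests, it should give the same result in any order and on every run.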
Anti-Patterns / Common Mistakes
| Anti-Pattern | Why It Is Wrong | Correct Approach |
|---|---|---|
| Writing production code first | Defeats the purpose of TDD; tests shaped to pass | Write the test first, always |
| Writing multiple tests before any code | Batch testing defeats incremental design | One test, one cycle |
| Test passes on first run | Either test is wrong or behavior already exists | Investigate before proceeding |
| Spending >5 minutes in GREEN | Writing too much code at once | Simplify; make test more specific |
| Modifying tests to match code | Tests specify behavior; code must match tests | Fix the code, not the test |
| Skipping REFACTOR phase | Technical debt accumulates rapidly | Refactor every cycle |
| Not running tests after every change | Regressions go unnoticed | Run tests after every save |
Rationalization Prevention
| Excuse | Reality |
|---|---|
| "It's just a small change" | Small changes cause production outages. Test it. |
| "I'll write the tests after" | You will not. And if you do, they will be weaker because they were shaped to pass, not to specify. |
| "This is just a refactor" | Refactors change behavior more often than you think. Only a passing test suite proves this one did not. |
| "I know this works" | You do not. You think you do. The test proves it. |
| "Tests would slow me down" | Debugging without tests slows you down 10x more. |
| "This code is too simple to test" | If the code is that simple, the test will be trivial to write. Write it. |
| "I can't test this because of dependencies" | Then your design has a coupling problem. Fix the design. |
| "The test would be harder to write than the code" | That means you do not understand the requirements well enough. The test forces you to clarify. |
| "I'll just manually verify it" | Manual verification is not repeatable, not documented, and not trustworthy. |
| "This is throwaway/prototype code" | Prototype code has a habit of becoming production code. Test it now or regret it later. |
| "The framework makes it hard to test" | Use the framework's testing utilities, or isolate your logic from the framework. |
| "I'm under time pressure" | TDD is faster over any timeline longer than 20 minutes. The pressure is exactly why you need it. |
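For the "I can't test this because of dependencies" row, one common way to fix the coupling is to pass the dependency in, so a test can supply an in-memory fake. This is a sketch under assumptions: `InMemoryUserStore` and the `store` parameter are hypothetical, and a real project would pair the fake with an integration test against the actual store.

```python
class InMemoryUserStore:
    """Test double standing in for a real database-backed store."""

    def __init__(self):
        self._emails = set()

    def exists(self, email):
        return email in self._emails

    def add(self, email):
        self._emails.add(email)


def register(email, password, store):
    # The store is passed in rather than created here, so tests can supply a fake.
    if store.exists(email):
        raise ValueError("email is already taken")
    store.add(email)
    return email


def test_registration_fails_if_email_is_already_taken():
    store = InMemoryUserStore()
    register("ada@example.com", "correct-horse", store)
    try:
        register("ada@example.com", "another-pass", store)
    except ValueError as exc:
        assert str(exc) == "email is already taken"
    else:
        raise AssertionError("expected ValueError for a duplicate email")


if __name__ == "__main__":
    test_registration_fails_if_email_is_already_taken()
    print("ok")
```

With the dependency injected, the unit test needs no database and stays within the millisecond range expected of unit tests.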
Red Flags
If you observe any of these, STOP and reassess:
| Red Flag | What It Means | Action |
|---|---|---|
| Writing production code with no failing test | Immediate violation | Stop. Write the test. |
| Test passes immediately on first run | Test is wrong or behavior exists | Investigate before proceeding |
| More than 5 minutes in GREEN phase | Writing too much code | Simplify. Make test more specific. |
| Refactoring changes behavior | Test coverage has a gap | Add missing tests |
| Tests modified to pass | Requirements inverted | Fix code to match tests |
| Multiple tests before any production code | Batch testing defeats purpose | One test at a time |
| Test suite not run after a change | Regressions invisible | Run tests. Always. Every time. |
Integration Points
| Skill | Relationship |
|---|---|
| verification-before-completion | MUST be invoked before claiming any TDD work is complete |
| | When a test fails unexpectedly during REFACTOR, switch to debugging |
| | After completing a feature via TDD, review the test suite for completeness |
| | Acceptance criteria drive the behavior list for TDD cycles |
| | Plan breaks features into behaviors suitable for TDD cycles |
| | Strategy defines frameworks; TDD defines the cycle |
Test Types in TDD
| Type | Scope | Speed | When to Write |
|---|---|---|---|
| Unit (Primary) | Individual functions, methods, classes | Milliseconds | RED phase for every behavior |
| Integration (Secondary) | Component interactions | Seconds | After unit tests cover individual behaviors |
| E2E (Tertiary) | Complete user workflows | Seconds-minutes | Critical paths after unit and integration are solid |
Skill Type
RIGID — The RED-GREEN-REFACTOR cycle is mandatory and cannot be reordered, skipped, or combined. Every phase has a HARD-GATE that must be satisfied before proceeding. No production code without a failing test first.