tdd-workflow
Test-Driven Development Workflow
Core Philosophy
TDD is a design discipline, not just a testing technique. Writing tests first forces you to think about the interface before the implementation, producing code that is inherently testable, loosely coupled, and driven by actual requirements.
The Red-Green-Refactor Loop
Step 1: RED — Write a Failing Test
- Identify the smallest piece of behavior to implement next.
- Write a test that describes that behavior from the caller's perspective.
- Run the test suite. The new test MUST fail.
- If the test passes without writing new code, either the behavior already exists or the test is wrong.
Checklist before moving to GREEN:
- The test describes WHAT, not HOW.
- The test name reads like a specification (e.g., `it should return 404 when resource not found`).
- The test fails for the right reason (expected assertion failure, not a syntax error or import error).
- The test is isolated — it does not depend on other tests or external state.
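As a rough sketch of a RED step that fails for the right reason, the example below uses a hypothetical `get_status` function and an in-memory `store` dict; both are illustrative assumptions, not part of any real API. The stub exists only so that the check fails on its assertion rather than on an import error.

```python
def get_status(resource_id, store):
    """Hypothetical stub: present so the test can run, but not yet implemented."""
    return None


# Named without the test_ prefix so a test runner does not collect this
# deliberately failing example.
def should_return_404_when_resource_not_found():
    # Describes WHAT the caller observes, not HOW the lookup works.
    assert get_status("missing-id", store={}) == 404


def fails_on_assertion(test):
    """Return True if the test fails with an assertion error (the expected RED)."""
    try:
        test()
    except AssertionError:
        return True
    return False


print(fails_on_assertion(should_return_404_when_resource_not_found))
```

Once the assertion failure has been observed, the stub can be replaced with the minimal implementation.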
Step 2: GREEN — Write the Minimal Implementation
- Write the simplest code that makes the failing test pass.
- It is acceptable — even encouraged — to hardcode values, use naive algorithms, or write "ugly" code at this stage.
- Do NOT add logic that is not required by a failing test.
- Run the full test suite. ALL tests must pass.
The "simplest thing that could work" principle:
- If one test expects `42`, write `return 42`. Do not write a general formula yet.
- If two tests expect different outputs, write an `if` statement. Do not write a loop yet.
- If three tests show a pattern, NOW consider a general algorithm.
Checklist before moving to REFACTOR:
- The new test passes.
- All previously passing tests still pass.
- No code was added beyond what the tests demand.
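The "simplest thing that could work" progression might look like this for a hypothetical `double` function. The three versions are shown side by side under different names purely for illustration; in practice each would replace the previous one.

```python
# After one test (double(21) == 42): hardcode the answer.
def double_v1(n):
    return 42


# After a second test (double(1) == 2): branch, but do not generalize yet.
def double_v2(n):
    if n == 21:
        return 42
    return 2


# After a third test (double(5) == 10): the pattern is clear, so generalize.
def double_v3(n):
    return n * 2


def test_double_satisfies_all_three_examples():
    assert double_v3(21) == 42
    assert double_v3(1) == 2
    assert double_v3(5) == 10
```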
Step 3: REFACTOR — Improve the Design
- Look for duplication in both production code and test code.
- Extract methods, rename variables, simplify conditionals.
- Apply design patterns only when the code asks for them (the same structure appears three or more times).
- Run the full test suite after every change. All tests must stay green.
Refactoring targets:
- Duplicate code extracted into shared functions.
- Variable and method names clearly express intent.
- No method exceeds 10-15 lines.
- No function has more than 3 parameters.
- Test code is also clean and readable.
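A small sketch of a REFACTOR step that removes duplication while keeping behavior fixed. All function names here are hypothetical, and the "before" versions are kept only so the test can confirm that nothing changed.

```python
# Before: the same formatting logic appears in both functions.
def user_label_before(name):
    return "[" + name.strip().title() + "] user"


def admin_label_before(name):
    return "[" + name.strip().title() + "] admin"


# After: the duplication is extracted into one intention-revealing helper.
def _display_name(name):
    return "[" + name.strip().title() + "]"


def user_label(name):
    return f"{_display_name(name)} user"


def admin_label(name):
    return f"{_display_name(name)} admin"


def test_refactor_preserves_behavior():
    for raw in ["ada", "  grace hopper ", "LINUS"]:
        assert user_label(raw) == user_label_before(raw)
        assert admin_label(raw) == admin_label_before(raw)
```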
Step 4: REPEAT
Return to Step 1 with the next smallest behavior.
Test Naming Conventions
Use descriptive names that serve as documentation:
// Pattern: [unit]_[scenario]_[expected result]
test_calculateTotal_withEmptyCart_returnsZero
test_calculateTotal_withSingleItem_returnsItemPrice
test_calculateTotal_withDiscount_appliesDiscountToSubtotal
// BDD-style
describe("calculateTotal") {
it("should return zero for an empty cart")
it("should return the item price for a single item")
it("should apply discount to subtotal")
}
// Given-When-Then
given_emptyCart_when_calculateTotal_then_returnsZero

Rules:
- Never use "test1", "test2", or other meaningless names.
- The test name should tell you what broke without reading the test body.
- Group related tests under a `describe` block or test class.
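For instance, in a pytest-style layout the `[unit]_[scenario]_[expected result]` pattern could be applied as below. `calculate_total` and its integer-cents cart representation are assumptions made for this sketch.

```python
def calculate_total(prices_in_cents, discount_percent=0):
    """Hypothetical unit under test: sums prices, then applies a percentage discount."""
    subtotal = sum(prices_in_cents)
    return subtotal - subtotal * discount_percent // 100


class TestCalculateTotal:
    """Groups related tests, much like a describe block."""

    def test_calculateTotal_withEmptyCart_returnsZero(self):
        assert calculate_total([]) == 0

    def test_calculateTotal_withSingleItem_returnsItemPrice(self):
        assert calculate_total([250]) == 250

    def test_calculateTotal_withDiscount_appliesDiscountToSubtotal(self):
        assert calculate_total([1000, 500], discount_percent=10) == 1350
```

If the discount test fails, its name alone says which behavior broke.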
Choosing What to Test Next
Start with the degenerate cases:
- Null / empty / zero inputs.
- Single element / simplest valid input.
- Typical valid inputs.
- Boundary values and edge cases.
- Error conditions and invalid inputs.
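The ordering above could be sketched as follows for a hypothetical `largest` function. The tests appear in the order they might be written, and the implementation shown is the state after the last one passes.

```python
def largest(numbers):
    """Hypothetical function grown one test at a time."""
    if numbers is None:
        raise TypeError("numbers must be a list")
    if not numbers:
        return None
    result = numbers[0]
    for n in numbers[1:]:
        if n > result:
            result = n
    return result


# 1. Degenerate input
def test_largest_withEmptyList_returnsNone():
    assert largest([]) is None


# 2. Simplest valid input
def test_largest_withSingleElement_returnsThatElement():
    assert largest([7]) == 7


# 3. Typical valid input
def test_largest_withSeveralElements_returnsMaximum():
    assert largest([3, 9, 4]) == 9


# 4. Boundary case
def test_largest_withAllNegativeValues_returnsLeastNegative():
    assert largest([-5, -2, -8]) == -2


# 5. Error condition
def test_largest_withNone_raisesTypeError():
    try:
        largest(None)
    except TypeError:
        return
    raise AssertionError("expected TypeError")
```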
Prioritization:
- Begin with the behavior that drives the most architectural decisions.
- Defer I/O, persistence, and external service tests until core logic is solid.
- Test the happy path first, then edge cases, then error paths.
TDD for Different Test Types
Unit Tests (most common in TDD)
RED: Write a test for a single function or method.
GREEN: Implement just that function.
REFACTOR: Clean up the function and its test.
Cycle time: 1-5 minutes.
Integration Tests
RED: Write a test that exercises two or more components together.
GREEN: Wire the components and implement any missing glue code.
REFACTOR: Extract shared setup, clean interfaces between components.
Cycle time: 5-15 minutes.
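A minimal sketch of an integration-level cycle, assuming two hypothetical components, `PriceCatalog` and `Checkout`. The test exercises them together instead of stubbing one of them out.

```python
class PriceCatalog:
    """Hypothetical component: looks up prices in cents by SKU."""

    def __init__(self, prices_in_cents):
        self._prices = dict(prices_in_cents)

    def price_of(self, sku):
        return self._prices[sku]


class Checkout:
    """Hypothetical component: accumulates scanned SKUs and totals them."""

    def __init__(self, catalog):
        self._catalog = catalog
        self._skus = []

    def scan(self, sku):
        self._skus.append(sku)

    def total(self):
        # Glue code: the checkout delegates pricing to the catalog.
        return sum(self._catalog.price_of(sku) for sku in self._skus)


def test_checkout_withCatalogPrices_totalsScannedItems():
    checkout = Checkout(PriceCatalog({"apple": 50, "pear": 75}))
    for sku in ["apple", "pear", "apple"]:
        checkout.scan(sku)
    assert checkout.total() == 175
```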
API / HTTP Tests
RED: Write a test that sends an HTTP request and asserts on status + body.
GREEN: Implement the route handler with minimal logic.
REFACTOR: Extract validation, business logic, and serialization into separate layers.
Cycle time: 5-20 minutes.
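One way to sketch an API-level cycle without a web framework is a bare WSGI callable invoked in-process rather than over a real socket. The `app` handler, the `/health` route, and the `get` helper are all hypothetical; a real project would more likely use its framework's test client.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Hypothetical minimal route handler."""
    if environ["PATH_INFO"] == "/health":
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"ok"]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


def get(path):
    """Issue an in-process GET against the WSGI app; return (status, body)."""
    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


def test_health_returns200WithOkBody():
    assert get("/health") == ("200 OK", b"ok")


def test_unknownPath_returns404():
    status, _ = get("/missing")
    assert status.startswith("404")
```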
Example: Building a URL Shortener (Unit Level)
```python
import pytest

# RED: First test (empty URL)
def test_shorten_rejects_empty_url():
    with pytest.raises(ValueError):
        shorten("")

# GREEN: Minimal implementation
def shorten(url):
    if not url:
        raise ValueError("URL cannot be empty")
    pass  # nothing else needed yet

# RED: Second test (returns a short code)
def test_shorten_returns_six_char_code():
    result = shorten("https://example.com")
    assert len(result) == 6

# GREEN: Naive implementation first, generalize later
def shorten(url):
    if not url:
        raise ValueError("URL cannot be empty")
    return url[:6]  # naive but passing

# RED: Third test (codes are unique)
def test_shorten_returns_unique_codes():
    a = shorten("https://example.com/a")
    b = shorten("https://example.com/b")
    assert a != b  # fails with url[:6]: both URLs start with "https:"

# GREEN: Now we need real logic
import hashlib

def shorten(url):
    if not url:
        raise ValueError("URL cannot be empty")
    return hashlib.md5(url.encode()).hexdigest()[:6]

# REFACTOR: Extract hashing, add type hints, rename for clarity
```
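One possible result of the final REFACTOR step is sketched here. The helper name `_digest` and the constant `CODE_LENGTH` are illustrative choices, not taken from the original example; MD5 is kept only because the GREEN step used it.

```python
import hashlib

CODE_LENGTH = 6  # assumed name for the six-character requirement in the tests


def _digest(url: str) -> str:
    """Return a hex fingerprint of the URL (hashing extracted from shorten)."""
    return hashlib.md5(url.encode()).hexdigest()


def shorten(url: str) -> str:
    """Return a short code for the given URL; reject empty input."""
    if not url:
        raise ValueError("URL cannot be empty")
    return _digest(url)[:CODE_LENGTH]
```

Truncating a hash to six characters satisfies the three tests written so far, but it does not guarantee uniqueness across all URLs; a collision test would be the next RED step if that guarantee matters.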
Handling Untested Legacy Code
When adding features to code without tests:
- Characterization tests first. Write tests that document what the code currently does, not what it should do. Lock down existing behavior.
- Find the seam. Identify a point where you can intercept behavior (dependency injection, method override, function parameter).
- Apply TDD to the new feature. Write failing tests for the new behavior, implement it, then refactor.
- Expand the safety net. Gradually add tests around the touched code.
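As a hedged illustration of the first two steps, suppose a hypothetical legacy function `greeting` used to read the system clock directly. Adding an optional `now` parameter creates a seam, and the characterization tests then pin down what the code currently returns, including the odd double exclamation marks.

```python
from datetime import datetime


def greeting(name, now=None):
    """Hypothetical legacy function; `now` is the seam added for testing."""
    now = now or datetime.now()  # default keeps existing callers unchanged
    if now.hour < 12:
        return "Good morning, " + name + "!!"
    return "Hello, " + name + "!!"


# Characterization tests: record current behavior, not desired behavior.
def test_greeting_beforeNoon_currentlyReturnsMorningText():
    assert greeting("Ada", now=datetime(2024, 1, 1, 9, 0)) == "Good morning, Ada!!"


def test_greeting_afterNoon_currentlyReturnsHelloText():
    assert greeting("Ada", now=datetime(2024, 1, 1, 15, 0)) == "Hello, Ada!!"
```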
The Mikado Method for legacy TDD:
- Try a naive change.
- If tests break, note what broke and revert.
- Fix the prerequisites first (add missing tests, extract dependencies).
- Retry the original change.
Common TDD Pitfalls
Writing too large a test
- Symptom: The GREEN step takes more than 15 minutes.
- Fix: Break the test into smaller behavioral increments.
Testing implementation details
- Symptom: Tests break when you refactor, even though behavior is unchanged.
- Fix: Test inputs and outputs, not internal method calls or data structures.
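The difference can be sketched with a hypothetical `Stack` class. Both tests pass today, but only the second would survive a change in how items are stored.

```python
class Stack:
    """Hypothetical class under test."""

    def __init__(self):
        self._items = []  # internal detail, free to change during refactoring

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()


# Brittle: coupled to the internal list, so it breaks if storage changes.
def test_push_appendsToInternalList():
    stack = Stack()
    stack.push(1)
    assert stack._items == [1]


# Robust: asserts only on behavior visible through the public interface.
def test_pop_afterTwoPushes_returnsLastPushedItem():
    stack = Stack()
    stack.push(1)
    stack.push(2)
    assert stack.pop() == 2
```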
Skipping the RED step
- Symptom: You write code and tests at the same time.
- Fix: Discipline. Always see the test fail first. A test you have never seen fail is a test you cannot trust.
Skipping the REFACTOR step
- Symptom: Code works but is messy, duplicated, or hard to read.
- Fix: Set a timer. After every GREEN, spend at least 2 minutes looking for cleanup opportunities.
Gold-plating during GREEN
- Symptom: You add error handling, logging, or optimizations not required by any test.
- Fix: If no test demands it, delete it. You can add it later when a test asks for it.
Fragile test fixtures
- Symptom: Many tests break when a shared fixture changes.
- Fix: Use factory functions or builders. Each test should set up only what it needs.
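A factory function along these lines lets each test state only the field it cares about. `make_user` and `can_log_in` are hypothetical names used for the sketch.

```python
def make_user(**overrides):
    """Hypothetical factory: sensible defaults, overridden per test."""
    user = {"name": "Ada", "email": "ada@example.com", "active": True}
    user.update(overrides)
    return user


def can_log_in(user):
    """Hypothetical rule under test: only active users may log in."""
    return user["active"]


def test_canLogIn_withInactiveUser_returnsFalse():
    # Only the field relevant to this test is spelled out.
    assert can_log_in(make_user(active=False)) is False


def test_canLogIn_withDefaultUser_returnsTrue():
    assert can_log_in(make_user()) is True
```

Adding a new default field to `make_user` later does not require touching either test.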
Test interdependence
- Symptom: Tests pass in one order but fail in another.
- Fix: Each test must set up and tear down its own state. Run tests in random order to detect this.
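A toy illustration of order-independent tests, assuming a hypothetical `Counter` class and a hand-rolled `run_in_random_order` helper. A real project would normally rely on its test runner's random-ordering support instead.

```python
import random


class Counter:
    """Hypothetical class under test."""

    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1


def test_increment_fromZero_yieldsOne():
    counter = Counter()  # fresh state created inside the test
    counter.increment()
    assert counter.value == 1


def test_newCounter_startsAtZero():
    assert Counter().value == 0


def run_in_random_order(tests, seed):
    """Run the tests in a seeded shuffled order; return the order used."""
    shuffled = list(tests)
    random.Random(seed).shuffle(shuffled)
    for test in shuffled:
        test()
    return [test.__name__ for test in shuffled]
```

Because neither test shares a `Counter` instance, every shuffled order passes.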
TDD Decision Framework
Is the behavior well-understood?
  YES -> Classic TDD (test-first)
  NO  -> Spike first (throwaway prototype), then TDD the real implementation

Is the code interacting with external systems?
  YES -> Write a contract/interface test, then use a fake/stub for unit TDD
  NO  -> Pure function TDD (easiest case)

Is the code algorithmically complex?
  YES -> Start with simple examples, build up with property-based tests
  NO  -> Standard example-based TDD

Are you fixing a bug?
  YES -> Write a test that reproduces the bug FIRST, then fix it
  NO  -> Normal TDD cycle
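For the external-systems branch, a fake behind a small interface might look like the sketch below. `RateSource`, `FakeRateSource`, and `convert` are hypothetical names; the real implementation of the interface (for example, one that calls a remote service) is deliberately left out.

```python
from typing import Protocol


class RateSource(Protocol):
    """Assumed interface that both the real client and the fake would satisfy."""

    def rate(self, currency: str) -> float: ...


class FakeRateSource:
    """In-memory stand-in for the external rate service."""

    def __init__(self, rates):
        self._rates = rates

    def rate(self, currency):
        return self._rates[currency]


def convert(amount_cents: int, currency: str, source: RateSource) -> int:
    """Convert an amount using whatever rate the injected source reports."""
    return round(amount_cents * source.rate(currency))


def test_convert_withFakeRate_usesRateFromSource():
    source = FakeRateSource({"EUR": 0.5})
    assert convert(1000, "EUR", source) == 500
```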
Key Metrics
- Cycle time: Each red-green-refactor should take 1-15 minutes. Longer cycles mean the step is too big.
- Test count growth: Roughly 1 test per 5-15 lines of production code.
- Refactor frequency: You should refactor at least every 3 cycles.
- All tests passing: At the end of every GREEN and REFACTOR step. Never commit with failing tests.
Summary: The Three Laws of TDD (Robert C. Martin)
- You may not write production code until you have written a failing test.
- You may not write more of a test than is sufficient to fail (and not compiling counts as failing).
- You may not write more production code than is sufficient to pass the currently failing test.