tdd-workflow

Test-Driven Development Workflow

Core Philosophy

TDD is a design discipline, not just a testing technique. Writing tests first forces you to think about the interface before the implementation, producing code that is inherently testable, loosely coupled, and driven by actual requirements.

The Red-Green-Refactor Loop

Step 1: RED — Write a Failing Test

Identify the smallest piece of behavior to implement next.
Write a test that describes that behavior from the caller's perspective.
Run the test suite. The new test MUST fail.
If the test passes without writing new code, either the behavior already exists or the test is wrong.

Checklist before moving to GREEN:

The test describes WHAT, not HOW.
The test name reads like a specification (e.g., `it should return 404 when resource not found`).
The test fails for the right reason (expected assertion failure, not a syntax error or import error).
The test is isolated — it does not depend on other tests or external state.

Step 2: GREEN — Write the Minimal Implementation

Write the simplest code that makes the failing test pass.
It is acceptable — even encouraged — to hardcode values, use naive algorithms, or write "ugly" code at this stage.
Do NOT add logic that is not required by a failing test.
Run the full test suite. ALL tests must pass.

The "simplest thing that could work" principle:

If one test expects `42`, write `return 42`. Do not write a general formula yet.
If two tests expect different outputs, write an `if` statement. Do not write a loop yet.
If three tests show a pattern, NOW consider a general algorithm.

Checklist before moving to REFACTOR:

The new test passes.
All previously passing tests still pass.
No code was added beyond what the tests demand.
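As a rough illustration of the "simplest thing that could work" progression, the sketch below shows three successive versions of one function. The names `total_v1`, `total_v2`, and `total_v3` are hypothetical and numbered only so the stages can sit side by side. In practice each version would replace the previous one.

```python
# Stage 1: the only test is total([]) == 0, so a constant is enough.
def total_v1(items):
    return 0


# Stage 2: a second test, total([5]) == 5, forces a branch but not a loop.
def total_v2(items):
    if not items:
        return 0
    return items[0]


# Stage 3: a third test, total([5, 7]) == 12, reveals the pattern,
# so a general algorithm is now justified.
def total_v3(items):
    return sum(items)
```

Each stage passes every test written so far and nothing more. `total_v2([5, 7])` still returns `5`, which is exactly what makes the third test fail and drive the generalization.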

Step 3: REFACTOR — Improve the Design

Step 4: REPEAT

Return to Step 1 with the next smallest behavior.

Test Naming Conventions

Use descriptive names that serve as documentation:

// Pattern: [unit]_[scenario]_[expected result]
test_calculateTotal_withEmptyCart_returnsZero
test_calculateTotal_withSingleItem_returnsItemPrice
test_calculateTotal_withDiscount_appliesDiscountToSubtotal

// BDD-style
describe("calculateTotal") {
  it("should return zero for an empty cart")
  it("should return the item price for a single item")
  it("should apply discount to subtotal")
}

// Given-When-Then
given_emptyCart_when_calculateTotal_then_returnsZero

Rules:

Never use "test1", "test2", or other meaningless names.
The test name should tell you what broke without reading the test body.
Group related tests under a `describe` block or test class.
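In Python, the same naming rules might look like the sketch below. It assumes a hypothetical `calculate_total(prices, discount)` function that is not part of the original text. The test class plays the role of a `describe` block.

```python
def calculate_total(prices, discount=0.0):
    """Hypothetical unit under test: sum the prices, then apply a fractional discount."""
    subtotal = sum(prices)
    return subtotal * (1 - discount)


class TestCalculateTotal:
    """Groups every test for calculate_total, like a describe block."""

    def test_with_empty_cart_returns_zero(self):
        assert calculate_total([]) == 0

    def test_with_single_item_returns_item_price(self):
        assert calculate_total([25]) == 25

    def test_with_discount_applies_discount_to_subtotal(self):
        assert calculate_total([40, 60], discount=0.25) == 75
```

If any of these fails, the class and method name together say which behavior broke, without opening the test body.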

Choosing What to Test Next

Start with the degenerate cases, then work outward in this order:

Null / empty / zero inputs.
Single element / simplest valid input.
Typical valid inputs.
Boundary values and edge cases.
Error conditions and invalid inputs.
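The ordering above could play out as follows for a small function. `word_count` is a hypothetical example chosen for this sketch. The tests are listed in the order they would be written, one red-green-refactor cycle each.

```python
def word_count(text):
    """Hypothetical unit under test: count whitespace-separated words."""
    if not isinstance(text, str):
        raise TypeError("text must be a string")
    return len(text.split())


# 1. Empty input
def test_word_count_of_empty_string_is_zero():
    assert word_count("") == 0


# 2. Simplest valid input
def test_word_count_of_single_word_is_one():
    assert word_count("hello") == 1


# 3. Typical valid input
def test_word_count_of_sentence_counts_each_word():
    assert word_count("the quick brown fox") == 4


# 4. Boundary case: leading, trailing, and repeated spaces
def test_word_count_ignores_extra_whitespace():
    assert word_count("  padded   spaces  ") == 2


# 5. Invalid input
def test_word_count_rejects_non_string_input():
    try:
        word_count(None)
    except TypeError:
        return
    raise AssertionError("expected TypeError")
```

The last test uses a plain `try`/`except` so the sketch runs without a test framework. With pytest, `pytest.raises(TypeError)` would be the usual form.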

Prioritization:

Begin with the behavior that drives the most architectural decisions.
Defer I/O, persistence, and external service tests until core logic is solid.
Once the degenerate cases pass, test the happy path, then edge cases, then error paths.

TDD for Different Test Types

Unit Tests (most common in TDD)

RED:   Write a test for a single function or method.
GREEN: Implement just that function.
REFACTOR: Clean up the function and its test.
Cycle time: 1-5 minutes.

Integration Tests

RED:   Write a test that exercises two or more components together.
GREEN: Wire the components and implement any missing glue code.
REFACTOR: Extract shared setup, clean interfaces between components.
Cycle time: 5-15 minutes.
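A minimal sketch of an integration-level cycle, assuming two hypothetical components invented for this example: an in-memory repository and a service that reads from it. The test exercises both together rather than stubbing one out.

```python
class InMemoryCartRepository:
    """Hypothetical storage component: keeps cart items in a dict."""

    def __init__(self):
        self._carts = {}

    def add_item(self, cart_id, price):
        self._carts.setdefault(cart_id, []).append(price)

    def items(self, cart_id):
        return list(self._carts.get(cart_id, []))


class CheckoutService:
    """Hypothetical service component: totals a cart read from the repository."""

    def __init__(self, repository):
        self._repository = repository

    def total(self, cart_id):
        return sum(self._repository.items(cart_id))


def test_checkout_totals_items_saved_in_repository():
    repository = InMemoryCartRepository()
    repository.add_item("cart-1", 10)
    repository.add_item("cart-1", 15)
    service = CheckoutService(repository)
    assert service.total("cart-1") == 25
```

The glue under test here is the constructor wiring and the `items` call, which is where integration bugs usually hide.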

API / HTTP Tests

RED:   Write a test that sends an HTTP request and asserts on status + body.
GREEN: Implement the route handler with minimal logic.
REFACTOR: Extract validation, business logic, and serialization into separate layers.
Cycle time: 5-20 minutes.
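One way to sketch this cycle with only the standard library is a plain WSGI callable. The `app` handler, the `RESOURCES` data, and the `call` helper are all hypothetical. The helper invokes the handler in-process instead of opening a socket, so this approximates an HTTP test rather than performing a real network request.

```python
import json
from wsgiref.util import setup_testing_defaults

RESOURCES = {"1": {"id": "1", "name": "example"}}  # hypothetical data


def app(environ, start_response):
    """Hypothetical WSGI handler for GET /resources/<id>, with minimal logic."""
    resource_id = environ["PATH_INFO"].rsplit("/", 1)[-1]
    resource = RESOURCES.get(resource_id)
    headers = [("Content-Type", "application/json")]
    if resource is None:
        start_response("404 Not Found", headers)
        return [json.dumps({"error": "not found"}).encode()]
    start_response("200 OK", headers)
    return [json.dumps(resource).encode()]


def call(path):
    """Invoke the app with a fake GET request; return (status, parsed body)."""
    environ = {"PATH_INFO": path, "REQUEST_METHOD": "GET"}
    setup_testing_defaults(environ)  # fills in the remaining WSGI keys
    captured = {}

    def start_response(status, response_headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


def test_get_missing_resource_returns_404_with_error_body():
    status, body = call("/resources/missing")
    assert status == "404 Not Found"
    assert body == {"error": "not found"}
```

During REFACTOR, the lookup and the JSON serialization inside `app` would be the first candidates to extract into separate layers.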

Example: Building a URL Shortener (Unit Level)

RED: First test (empty URL)

import pytest

def test_shorten_rejects_empty_url():
    with pytest.raises(ValueError):
        shorten("")

GREEN: Minimal implementation

def shorten(url):
    if not url:
        raise ValueError("URL cannot be empty")
    pass  # nothing else needed yet

RED: Second test — returns a short code

def test_shorten_returns_six_char_code():
    result = shorten("https://example.com")
    assert len(result) == 6

GREEN: Hardcode, then generalize

def shorten(url):
    if not url:
        raise ValueError("URL cannot be empty")
    return url[:6]  # naive but passing

RED: Third test — codes are unique

def test_shorten_returns_unique_codes():
    a = shorten("https://example.com/a")
    b = shorten("https://example.com/b")
    assert a != b

GREEN: Now we need real logic

import hashlib

def shorten(url):
    if not url:
        raise ValueError("URL cannot be empty")
    # Hash the URL so that different URLs yield different codes
    return hashlib.md5(url.encode()).hexdigest()[:6]

REFACTOR: Extract hashing, add type hints, rename for clarity
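The original does not show the refactored code. One possible result, offered only as a sketch, is below. The helper name `_url_digest` and the constant `CODE_LENGTH` are assumptions introduced here. Behavior is unchanged, so the three tests above should still pass.

```python
import hashlib

CODE_LENGTH = 6  # assumed name for the magic number used in the GREEN step


def _url_digest(url: str) -> str:
    """Return the hex MD5 digest of the URL (used as an ID, not for security)."""
    return hashlib.md5(url.encode()).hexdigest()


def shorten(url: str) -> str:
    """Return a short code for the URL; reject empty input."""
    if not url:
        raise ValueError("URL cannot be empty")
    return _url_digest(url)[:CODE_LENGTH]
```

A truncated hash only makes collisions unlikely, not impossible. A later test that forces a collision would be the trigger for adding real uniqueness handling.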

---

Handling Untested Legacy Code

When adding features to code without tests:

Characterization tests first. Write tests that document what the code currently does, not what it should do. Lock down existing behavior.
Find the seam. Identify a point where you can intercept behavior (dependency injection, method override, function parameter).
Apply TDD to the new feature. Write failing tests for the new behavior, implement it, then refactor.
Expand the safety net. Gradually add tests around the touched code.
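The first two steps might look like the sketch below. `format_receipt` stands in for a hypothetical legacy function. Its `now` parameter is the seam (a function parameter), and the test pins down what the code does today, including the quirk of keeping only one decimal place.

```python
import datetime


def format_receipt(amount, now=datetime.datetime.now):
    """Hypothetical legacy function; the `now` parameter is the seam."""
    stamp = now().strftime("%Y-%m-%d")
    return f"{stamp} TOTAL:{amount:.1f}"


def fixed_clock():
    """Replacement clock injected through the seam to make output repeatable."""
    return datetime.datetime(2024, 1, 2, 3, 4, 5)


def test_format_receipt_characterization():
    # Records current behavior, quirks included. It does not assert what
    # the output "should" be, only what it is today.
    assert format_receipt(12.345, now=fixed_clock) == "2024-01-02 TOTAL:12.3"
```

With this test in place, a new feature can be driven by its own failing test while the characterization test guards the existing output.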

The Mikado Method for legacy TDD:

Try a naive change.
If tests break, note what broke and revert.
Fix the prerequisites first (add missing tests, extract dependencies).
Retry the original change.

Common TDD Pitfalls

Writing too large a test

Symptom: The GREEN step takes more than 15 minutes.
Fix: Break the test into smaller behavioral increments.

Testing implementation details

Symptom: Tests break when you refactor, even though behavior is unchanged.
Fix: Test inputs and outputs, not internal method calls or data structures.

Skipping the RED step

Symptom: You write code and tests at the same time.
Fix: Discipline. Always see the test fail first. A test you have never seen fail is a test you cannot trust.

Skipping the REFACTOR step

Symptom: Code works but is messy, duplicated, or hard to read.
Fix: Set a timer. After every GREEN, spend at least 2 minutes looking for cleanup opportunities.

Gold-plating during GREEN

Symptom: You add error handling, logging, or optimizations not required by any test.
Fix: If no test demands it, delete it. You can add it later when a test asks for it.

Fragile test fixtures

Symptom: Many tests break when a shared fixture changes.
Fix: Use factory functions or builders. Each test should set up only what it needs.
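A factory function in this sense could be as small as the following sketch. `make_user` and `can_log_in` are hypothetical names. Each test overrides only the field it cares about, so changing an unrelated default does not break it.

```python
def make_user(**overrides):
    """Hypothetical factory: sensible defaults, overridden per test."""
    user = {"name": "Ada", "email": "ada@example.com", "active": True}
    user.update(overrides)
    return user


def can_log_in(user):
    """Hypothetical unit under test."""
    return user["active"]


def test_active_user_can_log_in():
    assert can_log_in(make_user())


def test_inactive_user_cannot_log_in():
    # Only the relevant field is stated; the rest comes from the factory.
    assert not can_log_in(make_user(active=False))
```

Because every call builds a fresh dict, tests also cannot leak state to each other through the fixture.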

Test interdependence

Symptom: Tests pass in one order but fail in another.
Fix: Each test must set up and tear down its own state. Run tests in random order to detect this.

TDD Decision Framework

Is the behavior well-understood?
  YES -> Classic TDD (test-first)
  NO  -> Spike first (throwaway prototype), then TDD the real implementation

Is the code interacting with external systems?
  YES -> Write a contract/interface test, then use a fake/stub for unit TDD
  NO  -> Pure function TDD (easiest case)

Is the code algorithmically complex?
  YES -> Start with simple examples, build up with property-based tests
  NO  -> Standard example-based TDD

Are you fixing a bug?
  YES -> Write a test that reproduces the bug FIRST, then fix it
  NO  -> Normal TDD cycle
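For the external-systems branch, a fake can keep the unit-level cycle fast. In the sketch below, `FakeRateClient`, `get_rate`, and `convert` are hypothetical names. The only assumption about the real client is that it exposes the same `get_rate(currency)` method, which a separate contract test would verify.

```python
class FakeRateClient:
    """Hypothetical stand-in for a client of an external exchange-rate service."""

    def __init__(self, rates):
        self._rates = rates

    def get_rate(self, currency):
        return self._rates[currency]


def convert(amount, currency, client):
    """Hypothetical unit under test: depends on the client only via get_rate."""
    return amount * client.get_rate(currency)


def test_convert_multiplies_amount_by_rate_from_client():
    client = FakeRateClient({"EUR": 0.5})
    assert convert(10, "EUR", client) == 5.0
```

Passing the client as a parameter is what makes the fake possible, which is an instance of test-first design shaping the interface.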

Key Metrics

Cycle time: Each red-green-refactor cycle should take 1-15 minutes for unit and integration work, and up to 20 minutes for API/HTTP tests. Longer cycles mean the step is too big.
Test count growth: Roughly 1 test per 5-15 lines of production code.
Refactor frequency: You should refactor at least every 3 cycles.
All tests passing: At the end of every GREEN and REFACTOR step. Never commit with failing tests.

Summary: The Three Laws of TDD (Robert C. Martin)

You may not write production code until you have written a failing test.
You may not write more of a test than is sufficient to fail (and not compiling counts as failing).
You may not write more production code than is sufficient to pass the currently failing test.
