Test-Driven Development

Philosophy

Core principle: Tests should verify behavior through public interfaces, not implementation details. The code behind the interface can change entirely; the tests should not have to.

Good tests are integration-style: they exercise real code paths through public APIs. They describe what the system does, not how it does it. A good test reads like a specification - "user can checkout with valid cart" tells you exactly what capability exists. These tests survive refactors because they don't care about internal structure.

Bad tests are coupled to implementation. They mock internal collaborators, test private methods, or verify through external means (like querying a database directly instead of using the interface). The warning sign: your test breaks when you refactor, but behavior hasn't changed. If you rename an internal function and tests fail, those tests were testing implementation, not behavior.

See tests.md for examples and mocking.md for mocking guidelines.
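
As a rough illustration of the contrast above, here is a minimal sketch using Python's standard `sqlite3` module. `UserStore`, `register`, and `find_name` are hypothetical names invented for this example, not part of any real project. The first test goes through the public interface only; the second verifies through external means by querying the table directly, which is the coupled style to avoid.

```python
import sqlite3


class UserStore:
    """Hypothetical store, for illustration only."""

    def __init__(self, conn):
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS users (email TEXT PRIMARY KEY, name TEXT)"
        )

    def register(self, email, name):
        self._conn.execute(
            "INSERT INTO users (email, name) VALUES (?, ?)", (email, name)
        )

    def find_name(self, email):
        row = self._conn.execute(
            "SELECT name FROM users WHERE email = ?", (email,)
        ).fetchone()
        return row[0] if row else None


def test_registered_user_can_be_found():
    # Behavior-style: only public methods, asserts on the observable result.
    store = UserStore(sqlite3.connect(":memory:"))
    store.register("ada@example.com", "Ada")
    assert store.find_name("ada@example.com") == "Ada"


def test_register_writes_users_row():
    # Coupled-style (avoid): bypasses the interface and reads the table,
    # so renaming a column or table breaks it with no change in behavior.
    conn = sqlite3.connect(":memory:")
    store = UserStore(conn)
    store.register("ada@example.com", "Ada")
    rows = conn.execute("SELECT email, name FROM users").fetchall()
    assert rows == [("ada@example.com", "Ada")]
```

Both tests pass today. Only the first would keep passing if `UserStore` moved to a different schema or storage backend.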

Anti-Pattern: Horizontal Slices

DO NOT write all tests first, then all implementation. This is "horizontal slicing" - treating RED as "write all tests" and GREEN as "write all code."

This produces crap tests:

Tests written in bulk test imagined behavior, not actual behavior
You end up testing the shape of things (data structures, function signatures) rather than user-facing behavior
Tests become insensitive to real changes - they pass when behavior breaks, fail when behavior is fine
You outrun your headlights, committing to test structure before understanding the implementation

Correct approach: Vertical slices via tracer bullets. One test → one implementation → repeat. Each test responds to what you learned from the previous cycle. Because you just wrote the code, you know exactly what behavior matters and how to verify it.

WRONG (horizontal):
  RED:   test1, test2, test3, test4, test5
  GREEN: impl1, impl2, impl3, impl4, impl5

RIGHT (vertical):
  RED→GREEN: test1→impl1
  RED→GREEN: test2→impl2
  RED→GREEN: test3→impl3
  ...

Workflow

1. Planning

Before writing any code:

Confirm with the user what interface changes are needed
Confirm with the user which behaviors to test (prioritize)
Identify opportunities for deep modules (small interface, deep implementation)
Design interfaces for testability
List the behaviors to test (not implementation steps)
Get user approval on the plan

Ask: "What should the public interface look like? Which behaviors are most important to test?"

You can't test everything. Confirm with the user exactly which behaviors matter most. Focus testing effort on critical paths and complex logic, not every possible edge case.
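
One way "design interfaces for testability" might look in practice, as a sketch only: pass in what the code would otherwise fetch for itself (here, today's date), so each planned behavior can be checked through the public function. `Coupon` and `apply_coupon` are hypothetical names, and the test names stand in for a behavior list agreed with the user.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Coupon:
    """Hypothetical coupon, for illustration only."""
    code: str
    percent_off: int
    expires: date


def apply_coupon(total_cents: int, coupon: Coupon, today: date) -> int:
    """Return the discounted total; an expired coupon leaves it unchanged.

    `today` is a parameter rather than a call to date.today(), so tests
    can pick any date without patching internals.
    """
    if today > coupon.expires:
        return total_cents
    return total_cents * (100 - coupon.percent_off) // 100


SAVE10 = Coupon(code="SAVE10", percent_off=10, expires=date(2030, 1, 1))


def test_valid_coupon_reduces_the_total():
    assert apply_coupon(1000, SAVE10, today=date(2029, 12, 31)) == 900


def test_expired_coupon_leaves_the_total_unchanged():
    assert apply_coupon(1000, SAVE10, today=date(2030, 1, 2)) == 1000
```

The interface stays small (one function, three arguments) while the rules behind it are free to grow.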

2. Tracer Bullet

Write ONE test that confirms ONE thing about the system:

RED:   Write test for first behavior → test fails
GREEN: Write minimal code to pass → test passes

This is your tracer bullet: it proves the path works end-to-end.
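
A single RED→GREEN cycle could look like the sketch below. The `Shortener` class and its methods are hypothetical, chosen only because one test can cross the whole path (store, then look up). The test is written first and fails while `Shortener` does not exist; the class is then the least code that makes it pass.

```python
# RED: written first. With no Shortener defined yet, calling this test fails.
def test_shortened_url_resolves_to_original():
    shortener = Shortener()
    code = shortener.shorten("https://example.com/a")
    assert shortener.resolve(code) == "https://example.com/a"


# GREEN: minimal code to pass. No validation, no expiry, no persistence;
# nothing the current test does not demand.
class Shortener:
    def __init__(self):
        self._urls = {}

    def shorten(self, url):
        code = str(len(self._urls) + 1)
        self._urls[code] = url
        return code

    def resolve(self, code):
        return self._urls[code]
```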

3. Incremental Loop

For each remaining behavior:

RED:   Write next test → fails
GREEN: Minimal code to pass → passes

Rules:

One test at a time
Only enough code to pass current test
Don't anticipate future tests
Keep tests focused on observable behavior
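
To make the loop concrete, here is a sketch of where three cycles might end up. The `Cart` class is hypothetical; the comments record which cycle forced each piece of code into existence, assuming the tests were written one at a time in the order shown.

```python
class Cart:
    """Hypothetical cart, for illustration only."""

    def __init__(self):
        self._lines = []  # cycle 2: something has to remember added items

    def add(self, price, quantity=1):  # cycle 2 added add(); cycle 3 added quantity
        self._lines.append((price, quantity))

    def total(self):
        # Cycle 1 was just `return 0`; cycles 2 and 3 generalized it.
        return sum(price * quantity for price, quantity in self._lines)


def test_empty_cart_totals_zero():  # cycle 1
    assert Cart().total() == 0


def test_total_sums_item_prices():  # cycle 2
    cart = Cart()
    cart.add(300)
    cart.add(200)
    assert cart.total() == 500


def test_quantity_multiplies_price():  # cycle 3
    cart = Cart()
    cart.add(300, quantity=3)
    assert cart.total() == 900
```

Note that no test mentions `_lines`. Each one asserts only on what `total()` returns, so the storage could change without touching the tests.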

4. Refactor

After all tests pass, look for refactor candidates:

Extract duplication
Deepen modules (move complexity behind simple interfaces)
Apply SOLID principles where natural
Consider what new code reveals about existing code
Run tests after each refactor step

Never refactor while RED. Get to GREEN first.
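
A small sketch of "extract duplication" under green tests. The function names and the 20% tax and 10% member discount are made-up values for illustration. The same behavior check runs against the code before and after the refactor, which is the point: the check never has to change.

```python
TAX_PERCENT = 20  # hypothetical rate, integer arithmetic in cents


def order_total_before(prices_cents, is_member):
    """Before the refactor: the tax arithmetic appears twice."""
    subtotal = sum(prices_cents)
    if is_member:
        discounted = subtotal * 90 // 100
        return discounted + discounted * TAX_PERCENT // 100
    return subtotal + subtotal * TAX_PERCENT // 100


def _with_tax(amount_cents):
    # Extracted helper: internal, so no test refers to it directly.
    return amount_cents + amount_cents * TAX_PERCENT // 100


def order_total_after(prices_cents, is_member):
    """After the refactor: duplication extracted, behavior unchanged."""
    subtotal = sum(prices_cents)
    if is_member:
        subtotal = subtotal * 90 // 100
    return _with_tax(subtotal)


def check_order_total_behavior(order_total):
    # Run after every refactor step, against whichever version is current.
    assert order_total([1000, 500], is_member=False) == 1800
    assert order_total([1000, 500], is_member=True) == 1620
```

Calling `check_order_total_behavior(order_total_before)` and then `check_order_total_behavior(order_total_after)` passes in both cases.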

Checklist Per Cycle

[ ] Test describes behavior, not implementation
[ ] Test uses public interface only
[ ] Test would survive internal refactor
[ ] Code is minimal for this test
[ ] No speculative features added
