Testing

This skill provides guidance on testing philosophy and practices, emphasizing tests as specifications and API design through TDD.

Core Philosophy

Tests as Executable Specifications

Tests are not just verification tools — they are executable specifications that document how the system should behave. A well-written test suite serves as living documentation.

Tests as API Consumers

Tests are the first users of your code's APIs. This is why TDD is valuable: you design the API by thinking about the consumer first, before thinking about implementation.

When writing tests:

Consider what interface would be most convenient for the caller
Let the test drive the API design
If the test is awkward to write, the API is awkward to use
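
As a minimal sketch of letting the test drive the API, the example below writes the call site first and only then adds the simplest class that supports it. `Cart`, `add`, and `total_cents` are hypothetical names chosen for illustration, and the test is plain Ruby so that it runs without a framework.

```ruby
# Step 1: write the call site you wish existed. This test is the first
# consumer of the API, so it is written before Cart is implemented.
def test_cart_totals_its_line_items
  cart = Cart.new
  cart.add("apple", price_cents: 150, quantity: 2)
  cart.add("pear", price_cents: 80) # quantity defaults to 1 for the caller's convenience

  raise "expected 380, got #{cart.total_cents}" unless cart.total_cents == 380
  :passed
end

# Step 2: the simplest implementation that supports that call site.
# Cart is a hypothetical class, not part of any library.
class Cart
  def initialize
    @lines = []
  end

  def add(name, price_cents:, quantity: 1)
    @lines << { name: name, price_cents: price_cents, quantity: quantity }
    self
  end

  def total_cents
    @lines.sum { |line| line[:price_cents] * line[:quantity] }
  end
end

result = test_cart_totals_its_line_items
```

The keyword arguments and the default quantity came from asking what the caller would want to type, not from how the data is stored internally.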

Test-Driven Development (TDD)

Red-Green-Refactor

The TDD cycle consists of three phases:

Red: Write a failing test for the next piece of functionality
Green: Write the minimum code necessary to make the test pass
Refactor: Improve the code while keeping tests green

Each cycle should be short — ideally minutes, not hours. Small steps reduce risk and provide frequent feedback.
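
One pass through the cycle might look like the sketch below. `slugify` and `slug_spec` are hypothetical names used only for illustration, and the spec is a plain Ruby method that reports `:red` or `:green` so the three phases can be observed in a single script.

```ruby
# The spec is written first, before slugify exists.
def slug_spec
  slugify("Hello, World!") == "hello-world" ? :green : :red
rescue NameError
  :red # slugify is not defined yet, so the spec fails
end

phases = []

# Red: the spec fails because there is no implementation.
phases << slug_spec

# Green: the first thing that makes the spec pass, however clumsy.
def slugify(title)
  title.downcase.gsub(/[^a-z0-9]/, "-").gsub(/-+/, "-").gsub(/\A-|-\z/, "")
end
phases << slug_spec

# Refactor: a clearer implementation, with the spec still passing.
def slugify(title)
  title.downcase.scan(/[a-z0-9]+/).join("-")
end
phases << slug_spec
```

The refactor step changes only the implementation; the spec is untouched, which is what makes the change safe.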

The Value of TDD

Forces thinking about the API before implementation
Produces code with high test coverage by default
Encourages simpler designs (testable code tends to be well-designed)
Provides immediate feedback on whether code works
Creates executable documentation of intended behavior

Flexible TDD

Strict TDD (one test at a time, red-green-refactor) is the ideal for learning and for complex logic. However, flexibility is acceptable:

Writing all tests first is appropriate when:

Tests need human review/approval before implementation
The behavior is well-understood and stable
Documenting a specification before implementing

Writing tests after is acceptable when:

Exploring or prototyping (but add tests before committing)
The design is genuinely uncertain
Spiking to learn about a problem

The goal is well-tested code with tests that serve as specifications. The path matters less than the destination, but TDD often produces better results.

Speed Matters

Tests should be fast. Slow tests discourage running them frequently, which defeats their purpose.

Target sub-second feedback for unit tests
Keep the full suite under a few minutes when possible
Identify and isolate slow tests

Database Access

Avoid hitting the database in tests except when:

Testing database-specific functionality (queries, constraints, transactions)
Running integration tests that specifically verify database behavior

Do not hit the database just to:

Populate models or data structures
Create test fixtures when in-memory objects would suffice
Test business logic that happens to use database-backed models

Use factories or builders that create in-memory objects when database persistence isn't the thing being tested.
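
A builder that returns in-memory objects can be very small. The sketch below assumes a hypothetical `User` value object and a hypothetical `build_user` helper; in a Rails project the same idea is usually expressed with FactoryBot's `build`.

```ruby
# Hypothetical model: plain Ruby, no persistence layer involved.
User = Struct.new(:name, :email, :verified, keyword_init: true) do
  # Business logic under test; it never needs a database row.
  def can_checkout?
    verified && !email.to_s.empty?
  end
end

# Hypothetical builder: sensible defaults, overridden only where the test cares.
def build_user(**overrides)
  defaults = { name: "Test User", email: "test@example.com", verified: true }
  User.new(**defaults.merge(overrides))
end

verified_user   = build_user
unverified_user = build_user(verified: false)
```

Each test states only the attribute that matters to it (`verified: false` here), which also keeps the source of the test data obvious.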

Test Structure

One Thing Per Test

Each test should verify one behavior. This doesn't always mean one assertion — sometimes verifying one behavior requires multiple assertions, especially when tests are slow. But the test should have a single reason to fail.
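
To illustrate the difference between one behavior and one assertion, the sketch below uses a hypothetical `parse_money` function. The first test makes two assertions about a single behavior (parsing a dollar amount); rejecting bad input is a separate behavior and gets its own test.

```ruby
Money = Struct.new(:cents, :currency)

# Hypothetical parser, used only for illustration.
def parse_money(text)
  match = /\A\$(\d+)\.(\d{2})\z/.match(text)
  raise ArgumentError, "unsupported format: #{text.inspect}" unless match

  Money.new(match[1].to_i * 100 + match[2].to_i, "USD")
end

# One behavior, two assertions: both describe the same parsed result.
def test_parses_a_dollar_amount
  money = parse_money("$12.50")

  raise "wrong amount: #{money.cents}" unless money.cents == 1250
  raise "wrong currency: #{money.currency}" unless money.currency == "USD"
  :passed
end

# A different behavior, so a different test with its own reason to fail.
def test_rejects_unsupported_formats
  parse_money("12,50 EUR")
  raise "expected ArgumentError for unsupported input"
rescue ArgumentError
  :passed
end

results = [test_parses_a_dollar_amount, test_rejects_unsupported_formats]
```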

AAA Pattern

Structure tests using Arrange-Act-Assert:

Arrange: Set up the preconditions
Act: Execute the behavior being tested
Assert: Verify the expected outcome

Keep each section clearly delineated. If any section is complex, consider extracting helper methods.
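
A minimal sketch of the three sections, using a hypothetical `Account` class and a plain Ruby test method so the example runs without a framework:

```ruby
# Hypothetical class under test.
class Account
  attr_reader :balance_cents

  def initialize(balance_cents:)
    @balance_cents = balance_cents
  end

  def withdraw(amount_cents)
    raise ArgumentError, "insufficient funds" if amount_cents > @balance_cents

    @balance_cents -= amount_cents
  end
end

def test_withdraw_reduces_the_balance
  # Arrange: an account with a known starting balance
  account = Account.new(balance_cents: 10_000)

  # Act: the single behavior being tested
  account.withdraw(2_500)

  # Assert: the observable outcome
  raise "expected 7500, got #{account.balance_cents}" unless account.balance_cents == 7_500
  :passed
end

outcome = test_withdraw_reduces_the_balance
```

Blank lines between the sections are often enough; the comments are optional once the pattern is familiar to the team.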

Given-When-Then

The BDD mindset aligns with AAA:

Given (Arrange): The initial context
When (Act): The event or action
Then (Assert): The expected outcome

This framing helps focus on behavior from the user's perspective.

Mocking and Test Doubles

Prefer Real Objects

Avoid mocking when possible. Build small, simple components with immutable data to reduce the need for mocks.

When Mocking is Necessary

If mocking is unavoidable:

Mock roles, not objects — mock interfaces/behaviors, not concrete implementations
Prefer fakes over mocks — fakes (simplified implementations) are often clearer than mock expectations
Keep mock setups simple; complex mocking often signals design problems
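
As a sketch of preferring a fake and of mocking a role, the example below stands in for a "mailer" role, meaning anything that responds to `deliver(to:, subject:)`. `FakeMailer` and `Signup` are hypothetical classes invented for illustration.

```ruby
# Fake: a simplified, working implementation of the mailer role.
# It records deliveries in memory instead of sending email.
class FakeMailer
  attr_reader :deliveries

  def initialize
    @deliveries = []
  end

  def deliver(to:, subject:)
    @deliveries << { to: to, subject: subject }
  end
end

# Production code depends on the role (anything with #deliver),
# not on a concrete mail library.
class Signup
  def initialize(mailer:)
    @mailer = mailer
  end

  def register(email)
    @mailer.deliver(to: email, subject: "Welcome!")
    email
  end
end

mailer = FakeMailer.new
Signup.new(mailer: mailer).register("ada@example.com")
```

The test can then assert on `mailer.deliveries` as ordinary data, with no expectation setup and no coupling to how `Signup` calls its collaborator internally.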

Signs of Excessive Mocking

Tests that are mostly mock setup
Mocks returning mocks
Tests that break when implementation details change
Difficulty understanding what's actually being tested

Consider these as signals to refactor the production code.

Custom Matchers

Use custom matchers (RSpec matchers, Jest matchers, etc.) to make assertions readable and intention-revealing.

Good:

```ruby
expect(order).to be_fulfilled
expect(user).to have_permission(:admin)
```

Less clear:

```ruby
expect(order.status).to eq("fulfilled")
expect(user.permissions).to include("admin")
```

Custom matchers:

Make tests read like specifications
Provide better failure messages
Encapsulate complex assertions
Can be reused across tests
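
In RSpec, the two "good" assertions above do not need a hand-written matcher: `be_fulfilled` is a dynamic predicate matcher that calls `fulfilled?`, and `have_permission(:admin)` calls `has_permission?(:admin)`. The sketch below shows the predicates that would back them. The `Order` and `User` classes and their string-based storage are assumptions made for illustration.

```ruby
# Hypothetical domain classes; only the predicate methods matter here.
class Order
  def initialize(status)
    @status = status
  end

  # `expect(order).to be_fulfilled` delegates to this predicate.
  def fulfilled?
    @status == "fulfilled"
  end
end

class User
  def initialize(permissions)
    @permissions = permissions
  end

  # `expect(user).to have_permission(:admin)` delegates to this predicate.
  def has_permission?(name)
    @permissions.include?(name.to_s)
  end
end

order = Order.new("fulfilled")
user  = User.new(["admin", "editor"])
```

When an assertion needs more than a predicate, such as a tailored failure message or several checks at once, RSpec's `RSpec::Matchers.define` DSL is the usual route.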

Language-Specific Guidelines

Ruby (RSpec)

Use RSpec as the primary testing framework
Prefer `describe` for classes/methods, `context` for states/conditions
Use `let` for lazy-evaluated test data
Use `subject` for the thing being tested
Prefer `expect` syntax over `should`
Use `before` sparingly; prefer explicit setup in each test when clarity matters
Create custom matchers for domain-specific assertions
Use `shared_examples` for common behavior across contexts
Use FactoryBot for test data, but prefer `build` over `create` when persistence isn't needed

```ruby
RSpec.describe Order do
  describe "#fulfill" do
    context "when all items are in stock" do
      it "marks the order as fulfilled" do
        order = build(:order, :with_available_items)

        order.fulfill

        expect(order).to be_fulfilled
      end
    end
  end
end
```

JavaScript (Jest/Vitest)

Use descriptive test names that read as specifications
Use `describe` blocks to group related tests
Prefer explicit assertions over snapshot tests (unless testing UI output)
Use `beforeEach` for common setup
Mock external dependencies, not internal modules

```javascript
describe("Order", () => {
  describe("fulfill", () => {
    it("marks the order as fulfilled when all items are in stock", () => {
      const order = buildOrder({ items: availableItems });

      order.fulfill();

      expect(order.isFulfilled()).toBe(true);
    });
  });
});
```

Bash (BATS or similar)

Test scripts by testing their behavior, not their output format
Use temporary directories for file-based tests
Clean up test artifacts in teardown
Test error conditions and exit codes

Test Smells

Watch for these warning signs:

Slow tests: Usually means too much real I/O or database access
Flaky tests: Often timing issues or shared state
Fragile tests: Break when the implementation changes even though the behavior has not
Mystery guests: Test data coming from somewhere non-obvious
Eager tests: Testing too many things at once
Obscure tests: Hard to understand what's being tested
