testing-best-practices

Test layering policy

Unit tests

Purpose: verify individual functions and invariants in isolation.
  • Data-driven: parameterized tables covering happy path, boundary, error, and edge cases.
  • Property-based: fuzz invariants that must hold across all inputs (e.g., idempotency, sort stability, roundtrip serialization).
  • Derive cases from the module's public API surface: input types/constraints, output shape, error modes, invariants.
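A minimal sketch of the data-driven and property-based styles above, using only the standard library. `normalize_tags` is a hypothetical function under test, not something from a real codebase; the seeded generator stands in for a property-testing library.

```python
import json
import random


def normalize_tags(tags):
    """Hypothetical function under test: trim, lowercase, dedupe, and sort tags."""
    return sorted({t.strip().lower() for t in tags if t.strip()})


# Data-driven table: (case id, input, expected output).
CASES = [
    ("HP-01", ["b", "a"], ["a", "b"]),       # happy path
    ("BV-01", [], []),                        # boundary: empty input
    ("EDGE-01", [" A ", "a", ""], ["a"]),     # edge: whitespace, case, blanks
]


def run_table():
    for case_id, given, expected in CASES:
        actual = normalize_tags(given)
        assert actual == expected, f"{case_id}: expected {expected}, got {actual}"


def run_properties(seed=1234, iterations=200):
    rng = random.Random(seed)  # seeded so any failure is reproducible
    alphabet = "abAB "
    for _ in range(iterations):
        tags = [
            "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 4)))
            for _ in range(rng.randint(0, 6))
        ]
        once = normalize_tags(tags)
        # Idempotency: normalizing an already-normalized list changes nothing.
        assert normalize_tags(once) == once
        # Roundtrip serialization: JSON encode/decode preserves the value.
        assert json.loads(json.dumps(once)) == once


run_table()
run_properties()
```

In a real suite the table would typically feed the test framework's parameterization feature, and the iteration count would be raised in slower CI lanes.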

Integration / contract tests

Purpose: verify interactions between components and external services.
  • API envelope: request/response shape, status codes, content types, pagination.
  • Error contract: error codes, error shapes, rate limiting, retries.
  • Auth and scoping: token validation, role-based access, tenant isolation.
  • Eventual consistency: verify convergence within bounded time; poll rather than sleep.
  • Reuse auth state across tests where possible; avoid redundant login flows.
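One way to pin down an error contract is a checker that returns every violation at once. The envelope below (`{"error": {"code": ..., "message": ...}}`) and the expectation of a `Retry-After` header on 429 responses are assumptions made for illustration; the real shape must come from the API under test.

```python
def check_error_envelope(status, headers, body):
    """Return a list of contract violations for an error response (empty means OK)."""
    problems = []
    if not 400 <= status < 600:
        problems.append(f"status {status} is not an error status")
    if not headers.get("Content-Type", "").startswith("application/json"):
        problems.append("content type is not application/json")
    error = body.get("error") if isinstance(body, dict) else None
    if not isinstance(error, dict):
        problems.append("body.error is missing or not an object")
        return problems
    for field in ("code", "message"):
        if not isinstance(error.get(field), str):
            problems.append(f"body.error.{field} is missing or not a string")
    # Assumed rate-limiting rule: a 429 tells the client when to retry.
    if status == 429 and "Retry-After" not in headers:
        problems.append("429 response lacks a Retry-After header")
    return problems
```

Returning a list rather than raising on the first problem keeps a single failing test informative about the whole response.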

E2E tests

Purpose: verify real user workflows through the full stack.
  • No mocks; exercise real services, databases, and APIs.
  • Happy-path workflows only; save edge cases for lower layers.
  • State-tolerant: never assume a clean slate; tolerate and work with prior state.
  • Idempotent: safe to run repeatedly without cleanup between runs.
  • Flow-oriented: validate real data paths end-to-end rather than isolated assertions.
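The state-tolerant and idempotent properties can be sketched with a get-or-create step. `InMemoryProjects` is a hypothetical stand-in so the example runs on its own; a real E2E flow would call the actual service, per the no-mocks rule above.

```python
import uuid


class InMemoryProjects:
    """Stand-in for a real service client, used only so this sketch runs."""

    def __init__(self):
        self._by_name = {}

    def find(self, name):
        return self._by_name.get(name)

    def create(self, name):
        if name in self._by_name:
            raise ValueError(f"project {name!r} already exists")
        self._by_name[name] = {"id": str(uuid.uuid4()), "name": name}
        return self._by_name[name]

    def list_names(self):
        return sorted(self._by_name)


def ensure_project(client, name):
    """Get-or-create: reuse prior state instead of assuming a clean slate."""
    return client.find(name) or client.create(name)


def e2e_create_project_flow(client, name="e2e-smoke-project"):
    before = set(client.list_names())
    project = ensure_project(client, name)
    after = set(client.list_names())
    # Assert on what this flow is responsible for, not on global counts.
    assert name in after
    assert before <= after  # pre-existing data is left untouched
    return project
```

Running the flow twice against the same client returns the same project, which is what makes it safe to repeat without cleanup.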

Hard rules

  • Never invent signatures, source locations, or line numbers. Only reference what you have read from the codebase.
  • No fabricated fixtures. Derive test data from actual schemas, types, or seed data in the repo.
  • No test-only hacks in product code. No `if (process.env.TEST)` branches, no test-specific exports, no test backdoors.
  • E2E must not rely on clean slate. Tests must tolerate pre-existing data, prior test runs, and shared environments.
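One common alternative to a test-only branch is to make the varying dependency an ordinary parameter. The sketch below is an assumption about how that might look, with a hypothetical `is_expired` function whose clock can be supplied by any caller.

```python
import time


def is_expired(issued_at, ttl_seconds, now=time.time):
    """Product code: the clock is a normal parameter, not a test backdoor.

    There is no environment check here; production callers use the default
    clock, and any caller (including a test) may pass a different one.
    """
    return now() - issued_at >= ttl_seconds
```

A test then passes a fixed clock, for example `is_expired(100.0, 30, now=lambda: 130.0)`, without the product code knowing it is under test.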

Execution guidance

Preflight checks (before e2e)

  1. Verify the target environment is reachable (health endpoint, ping).
  2. Confirm required services are running (database, API, auth provider).
  3. Validate test user / credentials exist and are functional.
  4. Check for leftover state that could cause false failures; log it, do not fail on it.
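The four steps above can be sketched as a small runner that separates blocking checks from the leftover-state probe. The function name, the check names, and the result shape are all hypothetical.

```python
import logging

logger = logging.getLogger("preflight")


def run_preflight(required_checks, leftover_probe):
    """Run blocking checks, then log (but never fail on) leftover state.

    required_checks: list of (name, callable returning a truthy value on success).
    leftover_probe: callable returning descriptions of leftover state.
    """
    failures = []
    for name, check in required_checks:
        try:
            ok = bool(check())
        except Exception as exc:  # a crashing check counts as a failed check
            logger.error("preflight %s raised: %s", name, exc)
            ok = False
        if not ok:
            failures.append(name)
    leftovers = list(leftover_probe())
    for item in leftovers:
        logger.warning("leftover state (not a failure): %s", item)
    return {"ok": not failures, "failures": failures, "leftovers": leftovers}
```

Only `failures` decides whether the E2E run proceeds; `leftovers` is recorded so a later false failure can be traced back to it.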

Deterministic fixtures

  • Use seeded randomness for generated data (seeded faker, deterministic UUIDs).
  • Fixtures should be self-contained; avoid cross-test fixture dependencies.
  • Prefer factory functions over shared mutable fixture objects.
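A minimal sketch of a seeded factory, assuming a user record with hypothetical fields (`handle`, `age`, `roles`) and a made-up namespace name. `uuid.uuid5` derives the same UUID from the same name every time, and a per-call `random.Random(seed)` keeps the generated values reproducible.

```python
import random
import uuid

# Hypothetical namespace so fixture UUIDs cannot collide with other uuid5 uses.
FIXTURE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "fixtures.example.test")


def make_user(seed, **overrides):
    """Factory: returns a fresh dict on each call; the same seed gives the same data."""
    rng = random.Random(seed)
    handle = "user-" + "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(8))
    user = {
        "id": str(uuid.uuid5(FIXTURE_NAMESPACE, handle)),  # deterministic UUID
        "handle": handle,
        "age": rng.randint(18, 90),
        "roles": ["member"],
    }
    user.update(overrides)
    return user
```

Because each call builds a new object, one test mutating its user cannot leak into another, which a shared module-level fixture would allow.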

Async handling

  • Poll with bounded timeout and backoff; never use fixed `sleep` / `waitForTimeout`.
  • Set explicit timeout per operation; fail fast with a descriptive message on timeout.
  • Bound retry attempts (e.g., max 3 retries with exponential backoff).
  • Use framework-native waiting (Playwright `expect`, async assertions) over manual loops.
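Where no framework-native wait is available, bounded polling with backoff might look like the sketch below. `poll_until` and its parameters are hypothetical names; the `clock` and `sleep` arguments exist so the helper itself can be tested without real waiting.

```python
import time


def poll_until(condition, timeout=5.0, initial_delay=0.05, max_delay=1.0,
               description="condition", clock=time.monotonic, sleep=time.sleep):
    """Call condition() until it returns a truthy value or the timeout expires."""
    deadline = clock() + timeout
    delay = initial_delay
    attempts = 0
    while True:
        attempts += 1
        result = condition()
        if result:
            return result
        remaining = deadline - clock()
        if remaining <= 0:
            # Fail fast with a message that says what was being waited for.
            raise TimeoutError(
                f"timed out after {timeout}s and {attempts} attempts "
                f"waiting for {description}"
            )
        sleep(min(delay, remaining))       # never sleep past the deadline
        delay = min(delay * 2, max_delay)  # exponential backoff, capped
```

The waits here are bounded and adaptive: the helper returns as soon as the condition holds, which is the difference from a fixed sleep that always pays the full delay.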

Flake handling

  • Allow a single infrastructure retry per test run; a test that fails twice is not a flake.
  • On retry failure, collect diagnostics: screenshots, network logs, service health, timestamps.
  • Classify the failure (flaky / outdated / bug) before attempting a fix.
  • Never add arbitrary delays or retry loops as a flake "fix."
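The single-retry rule can be sketched as a wrapper that reports how a test behaved instead of hiding it. The outcome labels and the `collect_diagnostics` callback are hypothetical; a real runner would plug in its own screenshot and log collection.

```python
def run_with_single_retry(test_fn, collect_diagnostics):
    """Run test_fn at most twice and report how it behaved."""
    errors = []
    for attempt in (1, 2):
        try:
            test_fn()
        except Exception as exc:
            errors.append(repr(exc))
            continue
        if attempt == 1:
            return {"outcome": "passed", "errors": errors}
        # Passed only on retry: surface it so it gets classified, not ignored.
        return {"outcome": "passed_on_retry", "errors": errors}
    # Failed twice: not a flake. Gather evidence before anyone attempts a fix.
    return {"outcome": "failed", "errors": errors, "diagnostics": collect_diagnostics()}
```

Keeping the first error even when the retry passes gives the classification step (flaky, outdated, or bug) something concrete to work from.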

API surface discovery

Before generating test cases:
  • Read the module source to enumerate exports/public functions.
  • Confirm scope from the user request and inspected code context; if ambiguous, state assumptions and proceed conservatively.
  • For each function: input types/constraints, output shape, error modes, invariants.
  • Probe for state dependencies and ordering constraints between functions.
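For a Python module, the enumeration step can be cross-checked at runtime with the standard `inspect` module. This is a sketch that supplements reading the source rather than replacing it; `public_functions` is a hypothetical helper, and it assumes `__all__`, when present, defines the public surface.

```python
import inspect
import json


def public_functions(module):
    """Map each public function name in a module to its signature string."""
    exported = getattr(module, "__all__", None)
    surface = {}
    for name, obj in inspect.getmembers(module, inspect.isfunction):
        if exported is not None:
            if name not in exported:
                continue
        elif name.startswith("_") or obj.__module__ != module.__name__:
            # Without __all__, skip private names and functions imported from elsewhere.
            continue
        surface[name] = str(inspect.signature(obj))
    return surface


# Example: list the functions the standard json module exports.
for name, signature in public_functions(json).items():
    print(f"{name}{signature}")
```

Signatures give input names and defaults only; constraints, error modes, and invariants still have to be read from the code and its documentation.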

Output format

Use markdown. Produce three sections:
  • Test Strategy: one bullet per layer (unit/integration/e2e) naming the functions/flows and their coverage type.
  • Test Matrix: one table per function with columns `ID | Category | Name | Input | Expected`. Case ID scheme: `{CATEGORY}-{NN}`, where the category is HP (happy path), BV (boundary), ERR (error), or EDGE (edge case). IDs are append-only; never renumber.
  • Implementation Plan: ordered steps covering fixtures, unit tests, integration tests, e2e flows, and the run command.
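The append-only ID scheme and the matrix layout can be sketched as two small helpers. Both function names are hypothetical, and the two-digit zero padding is an assumption read from the `{NN}` placeholder.

```python
import re

CATEGORIES = ("HP", "BV", "ERR", "EDGE")


def next_case_id(existing_ids, category):
    """Return the next append-only ID for a category, e.g. 'ERR-03'."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    pattern = re.compile(rf"^{category}-(\d+)$")
    used = [int(m.group(1)) for m in map(pattern.match, existing_ids) if m]
    # Continue after the highest number ever used; gaps are never refilled.
    number = max(used, default=0) + 1
    return f"{category}-{number:02d}"


def render_matrix(rows):
    """Render (id, category, name, input, expected) rows as a markdown table."""
    lines = [
        "| ID | Category | Name | Input | Expected |",
        "| --- | --- | --- | --- | --- |",
    ]
    lines += ["| " + " | ".join(str(cell) for cell in row) + " |" for row in rows]
    return "\n".join(lines)
```

Because numbering continues from the highest existing ID, deleting a case leaves a permanent gap, so references to old IDs in reviews or bug reports stay valid.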

CI guidance

Fast PR smoke lane

  • Unit tests + linting + type-check on every PR.
  • Subset of integration tests covering critical contracts.
  • Target: under 5 minutes.

Nightly full lane

Full unit + integration + e2e suite with higher property-based iteration counts. Flag tests that pass on retry but failed initially.

Workflow

  1. Spec or code defines the module behavior (types, constraints, API surface).
  2. An agent with this skill produces the test strategy, matrix, and implementation plan.
  3. The test-writer agent translates the plan into runnable code in the target language's idiom.
  4. Developer implements to pass the tests.
  5. If implementation reveals missing cases, propose them first; append to spec only when explicitly requested.