test-harness

Test Harness

Systematic test suite generation that transforms source code into comprehensive, runnable pytest files. Analyzes function signatures, dependency graphs, and complexity hotspots to produce tests covering happy paths, boundary conditions, error states, and async flows — with properly scoped fixtures and focused mocks.

Reference Files

File	Contents	Load When
`references/pytest-patterns.md`	Fixture scopes, parametrize, marks, conftest layout, built-in fixtures	Always
`references/mock-strategies.md`	Mock decision tree, patch boundaries, assertions, anti-patterns	Target has external dependencies
`references/async-testing.md`	pytest-asyncio modes, event loop fixtures, async mocking	Target contains async code
`references/fixture-design.md`	Factory fixtures, yield teardown, scope selection, composition	Test requires non-trivial setup
`references/coverage-targets.md`	Threshold table, branch vs line, pytest-cov config, exclusion patterns	Coverage assessment requested

Prerequisites

pytest >= 7.0
Python >= 3.10
pytest-asyncio — required only when generating async tests
pytest-mock: optional, provides the `mocker` fixture as an alternative to `unittest.mock`

Workflow

Phase 1: Reconnaissance

Before writing a single test, build a model of the target code:

Identify scope: What functions, classes, or modules need tests? If unspecified, check for recent modifications with `git diff --name-only HEAD~5`.
Read function signatures: Parameters, types, return types, defaults. Every parameter is a test dimension.
Map dependencies: Which calls go to external systems (DB, API, filesystem, clock)? These are mock candidates.
Detect complexity hotspots: Functions with high branch counts, deep nesting, or multiple return paths need more test cases.
Check existing tests: If tests already exist, understand what they cover. Do not duplicate; extend.
Read project conventions: Check CLAUDE.md, conftest.py, and pytest.ini/pyproject.toml for fixtures, markers, and test organization patterns already in use.
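
The signature and hotspot steps above can be approximated with the standard library alone. The following is a minimal sketch, not the skill's actual analyzer: `parse_port`, `TARGET_SOURCE`, and `summarize_functions` are hypothetical names introduced for illustration, and the branch count is a rough proxy for complexity rather than a true cyclomatic metric.

```python
import ast

# Hypothetical target source; in practice this would be read from the file under test.
TARGET_SOURCE = '''
def parse_port(value: str, default: int = 8080) -> int:
    if not value:
        return default
    try:
        port = int(value)
    except ValueError:
        return default
    if port < 1 or port > 65535:
        raise ValueError("port out of range")
    return port
'''

# Node types treated as branch points (an assumption, not a standard metric).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.IfExp)


def summarize_functions(source: str) -> dict:
    """Return parameter names, branch count, and return count per function."""
    summary = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            children = list(ast.walk(node))
            summary[node.name] = {
                "params": [arg.arg for arg in node.args.args],
                "branches": sum(isinstance(c, BRANCH_NODES) for c in children),
                "returns": sum(isinstance(c, ast.Return) for c in children),
                "is_async": isinstance(node, ast.AsyncFunctionDef),
            }
    return summary


report = summarize_functions(TARGET_SOURCE)
```

Each parameter in `params` suggests a test dimension, and functions with higher `branches` or `returns` counts are candidates for more cases.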

Phase 2: Test Case Enumeration

For each function under test, enumerate cases across four categories:

Category	What to Test	Example
Happy path	Expected inputs produce expected outputs	`add(2, 3)` returns `5`
Boundary	Edge values at limits of valid input	Empty string, zero, max int, single element
Error	Invalid inputs trigger proper exceptions	`None` where `str` expected, negative index
State	State transitions produce correct side effects	Object moves from `pending` to `active`

For each case, note:

Input values (concrete, not abstract)
Expected output or exception
Required setup (fixtures)
Required mocks (external calls to suppress)

Parametrize cases that share the same test logic but differ only in input/output values.
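
As an illustration of the enumeration above, here is a sketch that covers the happy path, boundary, and error categories for one function (the state category is not shown). `normalize_username` is a hypothetical function under test, invented for this example.

```python
import pytest


def normalize_username(raw):
    """Hypothetical function under test: trims and lowercases a username."""
    if not isinstance(raw, str):
        raise TypeError("expected str")
    cleaned = raw.strip().lower()
    if not cleaned:
        raise ValueError("username must not be empty")
    return cleaned


@pytest.mark.parametrize(
    "raw, expected",
    [
        ("Alice", "alice"),   # happy path
        ("  Bob  ", "bob"),   # boundary: surrounding whitespace
        ("x", "x"),           # boundary: single character
    ],
    ids=["mixed-case", "padded", "single-char"],
)
def test_normalize_username_valid_input_returns_lowercase(raw, expected):
    assert normalize_username(raw) == expected


@pytest.mark.parametrize(
    "raw, exc",
    [(None, TypeError), ("", ValueError), ("   ", ValueError)],
    ids=["none", "empty", "blank"],
)
def test_normalize_username_invalid_input_raises(raw, exc):
    with pytest.raises(exc):
        normalize_username(raw)
```

The two tests stay separate because their logic differs (equality check versus expected exception); only cases that share logic are folded into one parametrized test.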

Phase 3: Fixture Design

Identify shared setup — If 3+ tests need the same object, extract a fixture.

Select scope — Use the narrowest scope that avoids redundant setup:

Scope	Use When	Example
`function`	Default. Each test gets fresh state	Most unit tests
`class`	Tests within a class share expensive setup	DB connection per test class
`module`	All tests in a file share setup	Loaded config file
`session`	Entire test run shares setup	Docker container startup

Design teardown: Use `yield` fixtures when cleanup is needed. Never leave side effects (temp files, DB rows, monkey-patches) after a test.
Identify conftest candidates: Fixtures used across multiple test files belong in `conftest.py`. Fixtures used in one file stay in that file.
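
A function-scoped `yield` fixture with teardown might look like the sketch below. The in-memory SQLite database and the names `open_inventory_db` and `inventory_db` are assumptions chosen to keep the example self-contained; they are not part of the skill.

```python
import sqlite3

import pytest


def open_inventory_db():
    """Hypothetical setup helper: fresh in-memory database with one table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (name TEXT, qty INTEGER)")
    return conn


@pytest.fixture  # default scope is "function": every test gets fresh state
def inventory_db():
    conn = open_inventory_db()
    yield conn    # the test body runs while the fixture is suspended here
    conn.close()  # teardown: runs after the test, whether it passed or failed


def test_inventory_insert_single_item_persists_quantity(inventory_db):
    inventory_db.execute("INSERT INTO items VALUES (?, ?)", ("widget", 3))
    row = inventory_db.execute(
        "SELECT qty FROM items WHERE name = ?", ("widget",)
    ).fetchone()
    assert row == (3,)
```

If several test files needed `inventory_db`, it would move to `conftest.py`; widening the scope to `module` or `session` would only be worth it if setup were expensive.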

Phase 4: Mock Strategy

Decide what to mock — Mock external dependencies only:
- Network calls (API, database, message queues)
- Filesystem operations (when testing logic, not I/O)
- Time-dependent behavior (`datetime.now`, `time.sleep`)
- Random/non-deterministic behavior
Decide what NOT to mock — Never mock:
- The function under test
- Pure functions called by the target (test them through the target)
- Data structures and value objects
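Choose mock level: Patch at the import boundary of the module under test, not at the definition site: `@patch('mymodule.requests.get')`, not `@patch('requests.get')`. The distinction matters most when the module binds the name directly with `from requests import get`; the target is then `@patch('mymodule.get')`, because patching `requests.get` leaves the module's own reference untouched.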
Add mock assertions — Every mock should assert it was called with expected arguments and the expected number of times. Mocks without assertions are coverage holes.
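
The import-boundary rule and the assertion rule can be seen together in a small sketch. Everything here is hypothetical: `mymodule` is built in memory so the example runs on its own, and `is_expired` and `now` are invented names.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical module under test, created in memory for a self-contained demo.
MYMODULE_SOURCE = """
from time import time as now  # binds the name inside mymodule at import time

def is_expired(deadline):
    return now() > deadline
"""
mymodule = types.ModuleType("mymodule")
exec(MYMODULE_SOURCE, mymodule.__dict__)
sys.modules["mymodule"] = mymodule  # lets patch("mymodule.now") resolve


def check_definition_site_patch():
    # Patching the definition site does not affect the name mymodule already holds,
    # so the real clock is used and the deadline of 1.0 has long passed.
    with patch("time.time", return_value=0.0):
        return mymodule.is_expired(deadline=1.0)


def check_import_boundary_patch():
    # Patching where the name is looked up replaces what is_expired actually calls.
    with patch("mymodule.now", return_value=0.0) as fake_now:
        result = mymodule.is_expired(deadline=1.0)
    fake_now.assert_called_once_with()  # the mock asserts how it was used
    return result
```

Under these assumptions the first check still sees the real clock, while the second sees the fake value and also verifies the call.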

Phase 5: Output

Generate the test file following this structure:

Imports (pytest, mocks, target module)
Constants and test data
Fixtures (ordered by scope: session > module > class > function)
Test classes or functions grouped by target function
Parametrized tests where applicable

Output Format

```text
# tests/test_{module}.py

import pytest
from unittest.mock import Mock, patch, MagicMock

from {module} import {target_function, TargetClass}


# ============================================================
# Fixtures
# ============================================================

@pytest.fixture
def valid_input():
    """Standard valid input for happy path tests."""
    return {concrete values}


@pytest.fixture
def mock_database():
    """Mock database connection."""
    with patch("{module}.db_connection") as mock_db:
        mock_db.query.return_value = [{expected data}]
        yield mock_db


# ============================================================
# {target_function} Tests
# ============================================================

class TestTargetFunction:
    """Tests for {target_function}."""

    def test_happy_path(self, valid_input):
        """Returns expected result for valid input."""
        result = target_function(valid_input)
        assert result == {expected}

    @pytest.mark.parametrize(
        "input_val, expected",
        [
            ({boundary_1}, {expected_1}),
            ({boundary_2}, {expected_2}),
            ({boundary_3}, {expected_3}),
        ],
        ids=["empty", "single", "maximum"],
    )
    def test_boundary_conditions(self, input_val, expected):
        """Handles boundary inputs correctly."""
        assert target_function(input_val) == expected

    def test_invalid_input_raises(self):
        """Raises TypeError for invalid input."""
        with pytest.raises(TypeError, match="expected str"):
            target_function(None)

    def test_external_call(self, mock_database):
        """Calls database with correct query."""
        target_function("lookup_key")
        mock_database.query.assert_called_once_with("SELECT * FROM t WHERE key = %s", ("lookup_key",))
```

Configuring Scope

Mode	Scope	Depth	When to Use
`quick`	Single function	Happy path + 1 error case	Rapid iteration, TDD red-green cycle
`standard`	File or class	Happy + boundary + error + mocks	Default for most requests
`comprehensive`	Module or package	All categories + async + parametrized matrix	Pre-release, critical path code

Calibration Rules

Test isolation is non-negotiable. Every test must pass when run alone and in any order. No test may depend on the side effects of another test.
Mock discipline. Mock external dependencies, not internal logic. Over-mocking produces tests that pass when the code is broken. Under-mocking produces tests that fail when the network is down.
Concrete over abstract. Test data must be concrete values, not placeholders: `"alice@example.com"` not `"test_email"`, `42` not `"some_number"`. Concrete values catch type mismatches that abstract placeholders mask.
One assertion focus per test. A test should verify one behavior. Multiple assertions are acceptable when they verify different aspects of the same behavior (e.g., return value AND side effect), but not when they verify unrelated behaviors.
Parametrize, don't duplicate. If two tests differ only in input/output values, combine them with `@pytest.mark.parametrize`. Use `ids` for readable test names.
Match project conventions. If the project uses `conftest.py` fixtures, class-based tests, or specific markers, follow those patterns. Do not introduce a conflicting test style.

Error Handling

Problem	Resolution
Target function has no type hints	Infer types from usage patterns, default values, and docstrings. Note uncertainty in test docstring.
Target has deeply nested dependencies	Mock at the nearest boundary to the function under test. Do not mock transitive dependencies individually.
No existing test infrastructure (no conftest, no pytest config)	Generate a minimal `conftest.py` alongside the test file. Note the addition in output.
Target code is untestable (global state, hidden dependencies)	Flag the design issue in the output. Generate tests for what is testable. Suggest refactoring to improve testability.
Async code detected but pytest-asyncio not installed	Note the dependency requirement. Generate async test stubs with `@pytest.mark.asyncio` and instruct user to install.
Target module cannot be imported	Report the import error. Do not generate tests for unimportable code.
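
For the async row in the table above, a generated stub could take roughly the following shape. This is a sketch under assumptions: `fetch_display_name` and its `client` are hypothetical, and the `@pytest.mark.asyncio` marker only takes effect once the pytest-asyncio plugin is installed.

```python
import asyncio
from unittest.mock import AsyncMock

import pytest


async def fetch_display_name(client, user_id):
    """Hypothetical async target: awaits an external client call."""
    profile = await client.get_profile(user_id)
    return profile["name"].title()


@pytest.mark.asyncio  # needs the pytest-asyncio plugin when run under pytest
async def test_fetch_display_name_known_user_returns_title_case():
    # Arrange: AsyncMock makes client.get_profile awaitable
    client = AsyncMock()
    client.get_profile.return_value = {"name": "alice smith"}

    # Act
    name = await fetch_display_name(client, 42)

    # Assert
    assert name == "Alice Smith"
    client.get_profile.assert_awaited_once_with(42)


if __name__ == "__main__":
    # Drives the coroutine by hand when the plugin is unavailable.
    asyncio.run(test_fetch_display_name_known_user_returns_title_case())
```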

When NOT to Generate Tests

Push back if:

The code is auto-generated (protobuf, OpenAPI client, ORM models) — test the generator or the schema, not the output
The request is for UI/E2E tests — this skill generates unit and integration tests only
The code has no clear behavior to test (pure configuration, constant definitions)
The user wants tests for third-party library code — test your usage of the library, not the library itself

Rationalizations

Rationalization	Reality
"Manual testing is sufficient"	Manual testing doesn't run in CI, doesn't catch regressions, and doesn't scale with the codebase
"This code is too simple to test"	Simple code becomes complex code — tests document expected behavior and catch regressions from future changes
"I'll add tests later"	Tests are specifications; without them, code behavior is undefined and later never comes
"Mocking everything makes the test fast"	Over-mocked tests pass when the real system fails — mock at boundaries, not deep in the call chain
"100% coverage means the code is correct"	Coverage measures execution, not correctness — a test that runs code without meaningful assertions adds no value
"The happy path test is enough"	Edge cases and error paths cause most production incidents — happy-path-only testing is false confidence

Red Flags

Tests that only cover the happy path with no edge cases or error paths
Test names that describe implementation ("test_calls_function") instead of behavior ("test_returns_404_when_not_found")
More than two mocks per test — indicates the unit under test is too coupled
Tests that depend on execution order or shared mutable state
Assertions on implementation details (mock call counts) instead of observable behavior
Skipping integration tests because "unit tests cover it"

Verification

Tests follow Arrange-Act-Assert structure with clear phase separation
Test names describe behavior: `test_<unit>_<scenario>_<expected_outcome>`
Edge cases covered: empty input, boundary values, error paths, null/None
Coverage meets thresholds: 80% overall, 90% new code, 95% critical paths
All tests pass: `pytest` / `npm test` exits 0 with output captured
No test depends on execution order — can run in any sequence
Mocks used only at boundaries (external APIs, system clock, filesystem in unit tests)
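
To make the first two checklist items concrete, here is a small sketch of tests that follow Arrange-Act-Assert and the `test_<unit>_<scenario>_<expected_outcome>` naming pattern. `Cart` is a hypothetical unit under test, and prices are integer cents to avoid float rounding.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    """Hypothetical unit under test; prices are integer cents."""
    prices: list = field(default_factory=list)

    def add(self, price):
        self.prices.append(price)

    def total(self, discount_percent=0):
        if not 0 <= discount_percent <= 100:
            raise ValueError("discount_percent must be between 0 and 100")
        return sum(self.prices) * (100 - discount_percent) // 100


def test_cart_total_with_ten_percent_discount_returns_reduced_price():
    # Arrange
    cart = Cart()
    cart.add(2000)
    cart.add(1000)

    # Act
    total = cart.total(discount_percent=10)

    # Assert
    assert total == 2700


def test_cart_total_empty_cart_returns_zero():
    # Arrange
    cart = Cart()

    # Act
    total = cart.total()

    # Assert
    assert total == 0
```

Each test builds its own `Cart`, so neither depends on the other or on execution order.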
