mutation-testing

Mutation Testing

Value: Feedback -- mutation testing closes the verification loop by proving that tests actually detect the bugs they claim to prevent. Without it, passing tests may provide false confidence.

Purpose

Teaches the agent to run mutation testing as a quality gate before PR creation. Mutation testing makes small changes (mutations) to production code and checks whether tests catch them. Surviving mutants reveal gaps where bugs could hide undetected. The required mutation kill rate is 100%.

Practices

Detect and Run the Right Tool

Detect the project type and run the appropriate mutation testing tool.
  1. Check for project markers:
    • Cargo.toml -> Rust -> cargo mutants
    • package.json -> TypeScript/JavaScript -> npx stryker run
    • pyproject.toml or setup.py -> Python -> mutmut run
    • mix.exs -> Elixir -> mix muzak
  2. Verify the tool is installed. If not, provide installation instructions:
    • Rust: cargo install cargo-mutants
    • TypeScript: npm install --save-dev @stryker-mutator/core
    • Python: pip install mutmut
    • Elixir: add {:muzak, "~> 1.0", only: :test} to deps
  3. Run mutation testing against the relevant scope. Prefer scoping to changed files or packages rather than the entire codebase when possible:
    # Rust (scoped to package)
    cargo mutants --package <package> --jobs 4
    
    # TypeScript
    npx stryker run
    
    # Python (scoped to source)
    mutmut run --paths-to-mutate=src/
    mutmut results
    
    # Elixir
    mix muzak
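The marker check in step 1 can be sketched as a small lookup. This is a minimal illustration, not part of the skill itself: the function name `detect_mutation_tool` and the rule that the first matching marker wins when several are present are assumptions made for the example.

```python
from pathlib import Path

# Marker file -> (ecosystem, command), copied from the list above.
# The ordering (and so the precedence when several markers exist)
# is an assumption of this sketch.
MARKERS = [
    ("Cargo.toml", "Rust", ["cargo", "mutants"]),
    ("package.json", "TypeScript/JavaScript", ["npx", "stryker", "run"]),
    ("pyproject.toml", "Python", ["mutmut", "run"]),
    ("setup.py", "Python", ["mutmut", "run"]),
    ("mix.exs", "Elixir", ["mix", "muzak"]),
]


def detect_mutation_tool(project_root):
    """Return (ecosystem, command) for the first marker found, else None."""
    root = Path(project_root)
    for marker, ecosystem, command in MARKERS:
        if (root / marker).is_file():
            return ecosystem, command
    return None
```

A caller could pass the returned command list to `subprocess.run` after appending any scoping flags from step 3. Returning `None` leaves the no-marker case to the caller, which can then ask the user which tool applies.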

Parse and Report Results

Extract from the mutation tool output:
  • Total mutants generated
  • Mutants killed (tests detected the change)
  • Mutants survived (tests did NOT detect the change)
  • Timed-out mutants
  • Mutation score percentage
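One way to hold these five values, sketched under assumptions: the class name `MutationSummary` and its field names are hypothetical, and the score formula counts timed-out mutants as detected, following this skill's rule that timeouts are treated as killed. Individual tools may compute their own score differently, so the tool's reported figure should take precedence where it exists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MutationSummary:
    total: int
    killed: int
    survived: int
    timed_out: int

    @property
    def score_percent(self) -> float:
        """Percentage of mutants detected, counting timeouts as killed."""
        if self.total == 0:
            # Assumption of this sketch: no mutants means nothing survived.
            return 100.0
        return 100.0 * (self.killed + self.timed_out) / self.total

    def render(self) -> str:
        """One-line summary suitable for a report header."""
        return (
            f"{self.killed}/{self.total} killed, "
            f"{self.survived} survived, "
            f"{self.timed_out} timed out, "
            f"score {self.score_percent:.1f}%"
        )
```

For example, `MutationSummary(total=10, killed=8, survived=1, timed_out=1)` has a score of 90%, because the single survivor is the only undetected mutant.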

Analyze Surviving Mutants

For each surviving mutant, report three things:
  1. Location: File and line number
  2. Mutation: What was changed (e.g., "replaced + with -")
  3. Meaning: What class of bug this lets through
Common mutation types and what survival indicates:
  • Arithmetic (+ -> -, * -> /): Calculations not verified
  • Comparison (> -> >=, == -> !=): Boundary conditions untested
  • Boolean (&& -> ||, ! removed): Logic branches not covered
  • Return value (true -> false, Ok -> Err): Return paths not checked
  • Statement removal (line deleted): Side effects not asserted
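The three-part report (location, mutation, meaning) could be assembled as follows. This is a sketch: the `Survivor` type, the `kind` keys, and the fallback text for unrecognized kinds are hypothetical, while the meanings themselves are taken from the list above.

```python
from dataclasses import dataclass

# Mutation class -> what a surviving mutant of that class indicates.
MEANING = {
    "arithmetic": "Calculations not verified",
    "comparison": "Boundary conditions untested",
    "boolean": "Logic branches not covered",
    "return_value": "Return paths not checked",
    "statement_removal": "Side effects not asserted",
}


@dataclass(frozen=True)
class Survivor:
    file: str
    line: int
    mutation: str  # human-readable description of the change
    kind: str      # one of the MEANING keys


def describe(survivor: Survivor) -> str:
    """Format one survivor as 'location -- mutation -- meaning'."""
    meaning = MEANING.get(survivor.kind, "Unclassified mutation; inspect manually")
    return f"{survivor.file}:{survivor.line} -- {survivor.mutation} -- {meaning}"
```

Classifying the mutation into a `kind` is left to the caller, since each tool describes its mutations in its own output format.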

Recommend Missing Tests

For each surviving mutant, suggest a specific test:
Surviving: src/money.rs:45 -- replaced `+` with `-` in Money::add()
Recommend: Test that adding Money(50) + Money(30) equals Money(80),
           not Money(20). The current tests do not assert the sum value.

Surviving: src/account.rs:78 -- replaced `>` with `>=` in check_balance()
Recommend: Test the exact boundary -- check_balance with exactly zero
           balance. Current tests only check positive and negative.
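To make the two recommendations concrete, here is a Python analogue of the tests that would kill those mutants. The recommendations above refer to Rust sources. The `Money` class and `check_balance` function below are hypothetical stand-ins, and in particular the rule that a balance passes only when strictly positive is an assumption, since the original does not show the function body.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    amount: int

    def __add__(self, other: "Money") -> "Money":
        return Money(self.amount + other.amount)


def check_balance(balance: int) -> bool:
    """Hypothetical rule: the balance passes only when strictly positive."""
    return balance > 0


def test_add_returns_the_sum():
    # Fails if `+` is mutated to `-`, because the result would be Money(20).
    assert Money(50) + Money(30) == Money(80)


def test_zero_balance_is_rejected():
    # Fails if `>` is mutated to `>=`, because zero would then pass.
    assert check_balance(0) is False
```

Each test pins one exact value that the mutant would change. A test that only checked that `Money.__add__` returns a `Money`, or only checked clearly positive and clearly negative balances, would still let both mutants survive.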

Enforce the Quality Gate

The required mutation kill rate is 100%. All mutants must be killed.
  • If score is 100%: Report success, proceed to PR creation
  • If score is below 100%: List all survivors with recommendations. Block PR creation with a clear warning. The user may override, but the default is to fix first.
Do:
  • Scope mutation runs to changed code when possible
  • Report survivors with actionable fix recommendations
  • Re-run after fixes to confirm all mutants are now killed
  • Treat timeouts as killed (the mutation broke something)
Do not:
  • Skip mutation testing before PR creation
  • Accept surviving mutants without reporting them
  • Run mutations on the entire codebase when only a module changed
  • Recommend tests for data validation that should instead be enforced by domain types
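The gate rule can be sketched as a single decision function. The name `quality_gate`, the `user_override` flag, and the message wording are assumptions made for illustration. As a simplification, survivors are passed in as already-formatted report lines.

```python
def quality_gate(survivors, user_override=False):
    """Return (allowed, message) for PR creation.

    `survivors` is a list of already-formatted survivor report lines.
    Timed-out mutants are treated as killed, so they are not included.
    """
    if not survivors:
        return True, "All mutants killed; proceeding to PR creation."

    listing = "\n".join(f"  - {line}" for line in survivors)
    count = len(survivors)
    if user_override:
        return True, (
            f"WARNING: proceeding by user override with "
            f"{count} surviving mutant(s):\n{listing}"
        )
    return False, f"PR creation blocked: {count} surviving mutant(s):\n{listing}"
```

The override path still lists every survivor, so choosing to proceed never hides the gap from the report.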

Enforcement Note

This skill provides advisory guidance. It instructs the agent to run mutation testing and enforce a 100% kill rate, but cannot mechanically prevent PR creation with surviving mutants. When used with the tdd skill in automated mode, the orchestrator can gate PR creation on mutation score. In guided mode or standalone, the agent follows this practice by convention. If you observe the agent skipping mutation testing before a PR, point it out.

Verification

After completing mutation testing, verify:
  • Mutation testing tool was run against the relevant scope
  • All surviving mutants are listed with file, line, and mutation type
  • Each survivor has a specific test recommendation
  • Mutation score is 100% (or user explicitly chose to override)
  • If fixes were made, mutation testing was re-run to confirm
If any criterion is not met, revisit the relevant practice before proceeding.

Dependencies

This skill works standalone but is most valuable as a pre-PR quality gate. It integrates with:
  • tdd: TDD produces the tests that mutation testing validates; surviving mutants indicate the TDD cycle missed a case
  • code-review: Mutation results inform code review -- reviewers can check that new code has no surviving mutants
Missing a dependency? Install with:
npx skills add jwilger/agent-skills --skill tdd