genotoxic

Genotoxic

Combines mutation testing and necessist (test statement removal) with code graph analysis to triage findings into actionable categories: false positives, missing unit tests, and fuzzing targets.

When to Use

After mutation testing reveals survived mutants that need triage
Identifying where unit tests would have the highest impact
Finding functions that need fuzz harnesses instead of unit tests
Prioritizing test improvements using data flow context
Filtering out harmless mutants from actionable ones
Finding unnecessary test statements that indicate weak assertions (necessist)

When NOT to Use

Codebase has no existing test suite (write tests first)
Pure documentation or configuration changes
Single-file scripts with trivial logic

Prerequisites

trailmark installed: if `uv run trailmark` fails, run:
```bash
uv pip install trailmark
```
DO NOT fall back to "manual verification" or "manual analysis" as a substitute for running trailmark. Install it first. If installation fails, report the error instead of switching to manual analysis.
A mutation testing framework for the target language: if the framework command fails (not found, not installed), install it using the instructions in references/mutation-frameworks.md. DO NOT fall back to "manual mutation analysis" or skip mutation testing. Install the framework first. If installation fails, report the error instead of switching to manual mutation analysis.
necessist (optional, recommended): if the target language is supported (Go, Rust, Solidity/Foundry, TypeScript/Hardhat, TypeScript/Vitest, Rust/Anchor), install with `cargo install necessist`. See references/mutation-frameworks.md for details.
An existing test suite that passes
macOS environment: run `ulimit -n 1024` before any `mull-runner` invocation. macOS Tahoe (26+) sets unlimited file descriptors by default, which crashes Mull's subprocess spawning. See references/mutation-frameworks.md for details.
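
The prerequisite checks above could be automated with a small preflight step. The following is a minimal sketch, not part of the skill itself: the function names and the split into required and optional tools are illustrative assumptions, and it only checks that a command is on `PATH`, not that `uv run trailmark` actually succeeds.

```python
import shutil

# Commands named in the prerequisites above. Which ones are required depends
# on the target language, so this grouping is an illustrative assumption.
REQUIRED = ["uv"]
OPTIONAL = ["necessist", "mull-runner"]


def missing_tools(names, which=shutil.which):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


def preflight(required=REQUIRED, optional=OPTIONAL, which=shutil.which):
    """Raise if a required tool is absent; return the missing optional tools."""
    missing = missing_tools(required, which)
    if missing:
        # Report the failure rather than silently switching to manual analysis.
        raise RuntimeError(f"missing required tools: {', '.join(missing)}")
    return missing_tools(optional, which)
```

Raising on a missing required tool mirrors the rule above: install first, and report the error if installation fails.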

Rationalizations to Reject

| Rationalization | Why It's Wrong | Required Action |
|---|---|---|
| "All survived mutants need tests" | Many are harmless or equivalent | Triage before writing tests |
| "Mutation testing is too noisy" | Noise means you're not triaging | Use graph data to filter |
| "Unit tests cover everything" | Complex data flows need fuzzing | Check entrypoint reachability |
| "Dead code mutants don't matter" | Dead code should be removed | Flag for cleanup |
| "Low complexity = low risk" | Boundary bugs hide in simple code | Check mutant location |
| "Tool isn't installed, I'll do it manually" | Manual analysis misses what tooling catches | Install the tool first |
| "Necessist isn't mutation testing, skip it" | Necessist finds what mutation testing misses: weak tests | Run both when the language supports it |

Quick Start

```bash
# 1. Build the code graph
uv run trailmark analyze --summary {targetDir}

# 2. Run mutation testing (language-dependent)
# Python:
uv run mutmut run --paths-to-mutate {targetDir}/src
uv run mutmut results

# 2b. Run necessist (if language supported)
necessist

# 3. Analyze results with this skill's workflow (Phase 3)
```

---

Workflow Overview

Phase 1: Graph Build      → Parse codebase with trailmark
      ↓
Phase 2: Mutation Run     → Execute mutation testing framework
Phase 2b: Necessist Run   → Remove test statements (optional, parallel)
      ↓
Phase 3: Triage           → Classify findings using graph data
      ↓
Output: Categorized Report
  ├── Corroborated         (both tools flag same function — highest value)
  ├── False Positives      (harmless, skip)
  ├── Missing Tests        (write unit tests)
  └── Fuzzing Targets      (set up fuzz harnesses)

Decision Tree

├─ Need to set up mutation testing for a language?
│  └─ Read: references/mutation-frameworks.md
│
├─ Need to set up necessist or find weak test statements?
│  └─ Read: references/mutation-frameworks.md (Necessist section)
│
├─ Need to understand the triage criteria in depth?
│  └─ Read: references/triage-methodology.md
│
├─ Need to understand how graph data informs triage?
│  └─ Read: references/graph-analysis.md
│
└─ Already have results + graph? Use Phase 3 below.

Phase 1: Build Code Graph and Run Pre-Analysis

Parse the target codebase with trailmark and run pre-analysis before mutation testing. Pre-analysis computes blast radius, entry points, privilege boundaries, and taint propagation, which Phase 3 uses for triage.

```bash
uv run trailmark analyze --summary {targetDir}
```

Use the `QueryEngine` API to build the graph and run pre-analysis:

`QueryEngine.from_directory("{targetDir}", language="{lang}")`
Call `engine.preanalysis()` (mandatory before triage)
Export with `engine.to_json()` for cross-referencing with mutation results

See references/graph-analysis.md for the full API: node mapping, reachability queries, blast radius, and pre-analysis subgraph lookups.
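
As a rough illustration, the three calls above can be wrapped in one helper. This is a sketch under stated assumptions: `build_graph` and its parameters are hypothetical names, the engine class is passed in because the import path for `QueryEngine` is not shown in this document, and `to_json()` is assumed to return a JSON string.

```python
from pathlib import Path


def build_graph(engine_cls, target_dir, lang, out_path):
    """Build the graph, run pre-analysis, and save the exported JSON.

    `engine_cls` is expected to be trailmark's QueryEngine. It is passed in
    because the import path is not shown here (see references/graph-analysis.md).
    """
    engine = engine_cls.from_directory(target_dir, language=lang)
    engine.preanalysis()  # mandatory before triage
    exported = engine.to_json()  # assumption: returns a JSON string
    Path(out_path).write_text(exported, encoding="utf-8")
    return engine
```

Returning the engine keeps it available for the Phase 3 queries, while the file on disk can be cross-referenced with mutation results.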

Phase 2: Run Mutation Testing

Select and run the appropriate framework. See references/mutation-frameworks.md for language-specific setup.

Capture survived mutants. Each framework reports differently, but extract these fields per mutant:

| Field | Description |
|---|---|
| File path | Source file containing the mutant |
| Line number | Line where mutation was applied |
| Mutation type | What was changed (operator, value, etc.) |
| Status | survived, killed, timeout, error |

Filter to survived mutants only for Phase 3.
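
One way to hold the extracted fields is a small record type plus a filter. The sketch below is illustrative only: the `Mutant` class, its attribute names, and the sample data are assumptions, not the output format of any particular framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutant:
    """One mutant, normalized to the four fields listed above."""
    file_path: str
    line: int
    mutation_type: str
    status: str  # "survived", "killed", "timeout", or "error"


def survived_only(mutants):
    """Keep only the mutants that Phase 3 needs to triage."""
    return [m for m in mutants if m.status == "survived"]


# Made-up sample records for illustration.
mutants = [
    Mutant("src/parser.py", 42, "operator: < to <=", "survived"),
    Mutant("src/parser.py", 57, "value: 0 to 1", "killed"),
    Mutant("src/io.py", 10, "operator: + to -", "timeout"),
]
print(survived_only(mutants))
```

Normalizing every framework's output to the same record shape means the Phase 3 triage logic does not need to know which framework produced a finding.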

Phase 2b: Run Necessist (Optional)

If the target language is supported (Go, Rust, Solidity/Foundry, TypeScript/Hardhat, TypeScript/Vitest, Rust/Anchor), run necessist to find unnecessary test statements. This runs independently of Phase 2 and can execute in parallel.

```bash
# Auto-detect framework
necessist

# Or target specific test files
necessist tests/test_parser.rs

# Export results
necessist --dump
```

Filter to findings where the test **passed after removal**. See
[references/mutation-frameworks.md](references/mutation-frameworks.md)
for framework-specific configuration and the normalized record format.

Map each removal to a production function using the algorithm in
[references/graph-analysis.md](references/graph-analysis.md).
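
The filtering step might look like the sketch below. The record keys (`test_file`, `line`, `statement`, `outcome`) and the `"passed"` value are hypothetical placeholders. The actual normalized record format is defined in references/mutation-frameworks.md and may differ.

```python
def passed_after_removal(records):
    """Keep removals whose test still passed, i.e. the statement was not needed."""
    return [r for r in records if r["outcome"] == "passed"]


# Made-up records; the key names are assumptions for illustration.
records = [
    {"test_file": "tests/test_parser.rs", "line": 12,
     "statement": "parser.reset();", "outcome": "passed"},
    {"test_file": "tests/test_parser.rs", "line": 18,
     "statement": "assert!(ok);", "outcome": "failed"},
]
print(passed_after_removal(records))
```

Only the records that survive this filter need to be mapped to a production function.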

---


Phase 3: Triage Findings

For each survived mutant and each necessist removal, determine its triage bucket using graph data. Necessist removals must first be mapped to a production function (see references/graph-analysis.md).

Quick Classification (Mutation Testing)

| Signal | Bucket | Reasoning |
|---|---|---|
| No callers in graph | False Positive | Dead code, mutant is unreachable |
| Only test callers | False Positive | Test infrastructure, not production |
| Logging/display string | False Positive | Cosmetic, no behavioral impact |
| Equivalent mutant | False Positive | Behavior unchanged despite mutation |
| Simple function, low CC, no entrypoint path | Missing Tests | Unit test is straightforward |
| Error handling path | Missing Tests | Should have negative test cases |
| Boundary condition (off-by-one) | Missing Tests | Property-based test candidate |
| Pure function, deterministic | Missing Tests | Easy to test, high value |
| High CC (>10), entrypoint reachable | Fuzzing Target | Complex + exposed = fuzz it |
| Parser/validator/deserializer | Fuzzing Target | Structured input handling |
| Many callers (>10) + moderate CC | Fuzzing Target | High blast radius |
| Binary/wire protocol handling | Fuzzing Target | Fuzzers excel at format testing |

Quick Classification (Necessist)

| Signal | Bucket | Reasoning |
|---|---|---|
| Redundant setup or debug call | False Positive | Statement genuinely unnecessary |
| Cannot map to production function | False Positive | No graph context for triage |
| Call removed, no assertion checks its effect | Missing Tests | Test has weak assertions |
| Assertion removed, test still passes | Missing Tests | Redundant or insufficient coverage |
| Maps to high-CC entrypoint-reachable function | Fuzzing Target | Complex + exposed + weak test |

When both mutation testing and necessist flag the same production function, mark as corroborated — highest confidence finding.

For detailed criteria, see references/triage-methodology.md.
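
Finding corroborated functions amounts to intersecting the two sets of flagged production functions. A minimal sketch, assuming each finding has already been mapped to a function identifier (the `(function_id, detail)` pair shape and the function name are hypothetical):

```python
from collections import defaultdict


def corroborated_functions(mutation_findings, necessist_findings):
    """Return production functions flagged by both tools, in sorted order.

    Each finding is a (function_id, detail) pair, where function_id is the
    production function the finding was mapped to.
    """
    by_function = defaultdict(set)
    for function_id, _ in mutation_findings:
        by_function[function_id].add("mutation")
    for function_id, _ in necessist_findings:
        by_function[function_id].add("necessist")
    return sorted(
        function_id
        for function_id, sources in by_function.items()
        if sources == {"mutation", "necessist"}
    )
```

Necessist removals that could not be mapped to a production function never enter this step, since they have no function identifier to match on.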

Graph Queries for Triage

For each mutant, map it to its containing graph node and use pre-analysis subgraphs (tainted, high_blast_radius, privilege_boundary) from Phase 1 to classify it. The classification logic checks: no callers → false positive, privilege boundary → fuzzing, high CC + tainted → fuzzing, high blast radius → fuzzing, otherwise → missing tests.

See references/graph-analysis.md for the `batch_triage` implementation and node mapping functions.
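
The check order described above can be sketched as a first-match-wins function. This is not the `batch_triage` implementation from references/graph-analysis.md. The `NodeFacts` shape, its field names, and the bucket labels are hypothetical, and the CC threshold of 10 is borrowed from the Quick Classification table.

```python
from dataclasses import dataclass

HIGH_CC = 10  # threshold taken from the Quick Classification table (CC > 10)


@dataclass
class NodeFacts:
    """Graph facts for the function containing a mutant (hypothetical shape)."""
    caller_count: int
    cyclomatic_complexity: int
    tainted: bool = False
    high_blast_radius: bool = False
    privilege_boundary: bool = False


def classify(node):
    """Apply the checks in the order described above; first match wins."""
    if node.caller_count == 0:
        return "false_positive"
    if node.privilege_boundary:
        return "fuzzing_target"
    if node.cyclomatic_complexity > HIGH_CC and node.tainted:
        return "fuzzing_target"
    if node.high_blast_radius:
        return "fuzzing_target"
    return "missing_tests"
```

For example, `classify(NodeFacts(caller_count=3, cyclomatic_complexity=14, tainted=True))` falls through the first two checks and lands in the fuzzing bucket. Signals that need source inspection, such as equivalent mutants or logging strings, are outside what this sketch covers.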

Output Format

Generate a markdown report:

```markdown
# Genotoxic Triage Report

## Summary

- Total survived mutants: N
- Total necessist removals: N
- Corroborated findings: N
- False positives: N (N%)
- Missing test coverage: N (N%)
- Fuzzing targets: N (N%)

## Corroborated Findings

| File | Line | Function | Mutation Signal | Necessist Signal | Action |
|---|---|---|---|---|---|

## False Positives

| File | Line | Mutation | Reason | Source |
|---|---|---|---|---|

## Missing Test Coverage

| File | Line | Function | CC | Callers | Suggested Test | Source |
|---|---|---|---|---|---|---|

## Fuzzing Targets

| File | Line | Function | CC | Entrypoint Path | Blast Radius | Source |
|---|---|---|---|---|---|---|
```

The `Source` column is `mutation`, `necessist`, or `corroborated`.

Write the report to `GENOTOXIC_REPORT.md` in the working directory.
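
The Summary section of the report could be generated along these lines. This is a sketch only: `render_summary` is a hypothetical helper, and computing each percentage against the total of all bucketed findings is an assumption, since the template does not state the denominator.

```python
def render_summary(survived, removals, corroborated, buckets):
    """Render the Summary lines; `buckets` maps a bucket label to its count."""
    total = sum(buckets.values())
    lines = [
        f"- Total survived mutants: {survived}",
        f"- Total necessist removals: {removals}",
        f"- Corroborated findings: {corroborated}",
    ]
    for label, count in buckets.items():
        # Assumption: percentages are relative to all bucketed findings.
        pct = round(100 * count / total) if total else 0
        lines.append(f"- {label}: {count} ({pct}%)")
    return "\n".join(lines)
```

The returned string can be combined with the per-bucket tables and written to `GENOTOXIC_REPORT.md`, for example with `pathlib.Path.write_text`.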

---


Quality Checklist

Before delivering:

Trailmark graph built for target language
Mutation framework ran to completion
Necessist ran (if language supported) or noted as not applicable
All survived mutants triaged (none unclassified)
All necessist removals triaged (if applicable)
Corroborated findings identified (if both tools ran)
False positives have clear justifications
Missing test items include suggested test type
Fuzzing targets include entrypoint paths and blast radius
Report file written to `GENOTOXIC_REPORT.md`
User notified with summary statistics

Integration

trailmark skill:

Phase 1: Build code graph, query complexity and entrypoints
Phase 3: Caller analysis, reachability, blast radius

property-based-testing skill:

Missing test coverage items involving boundary conditions
Roundtrip/idempotence properties for serialization mutants

testing-handbook-skills (fuzzing):

Fuzzing target items: use `harness-writing`, `cargo-fuzz`, `atheris`

Supporting Documentation

references/mutation-frameworks.md - Language-specific framework setup, output parsing, and necessist configuration
references/triage-methodology.md - Detailed triage criteria, edge cases, and worked examples for both mutation testing and necessist
references/graph-analysis.md - Graph query patterns, test-to-production mapping, and result merging

First-time users: Start with Phase 1 (graph build), then run mutations, then use the Quick Classification table in Phase 3.

Experienced users: Jump to Phase 3 and use the Decision Tree to load specific reference material.
