Genotoxic
Combines mutation testing and necessist (test statement removal) with
code graph analysis to triage findings into actionable categories:
false positives, missing unit tests, and fuzzing targets.
When to Use
- After mutation testing reveals survived mutants that need triage
- Identifying where unit tests would have the highest impact
- Finding functions that need fuzz harnesses instead of unit tests
- Prioritizing test improvements using data flow context
- Filtering out harmless mutants from actionable ones
- Finding unnecessary test statements that indicate weak assertions (necessist)
When NOT to Use
- Codebase has no existing test suite (write tests first)
- Pure documentation or configuration changes
- Single-file scripts with trivial logic
Prerequisites
- trailmark installed: if `uv run trailmark` fails, run `uv pip install trailmark`. DO NOT fall back to "manual verification" or "manual analysis" as a substitute for running trailmark. Install it first. If installation fails, report the error instead of switching to manual analysis.
- A mutation testing framework for the target language: if the framework command fails (not found, not installed), install it using the instructions in references/mutation-frameworks.md. DO NOT fall back to "manual mutation analysis" or skip mutation testing. Install the framework first. If installation fails, report the error instead of switching to manual mutation analysis.
- necessist (optional, recommended): if the target language is supported (Go, Rust, Solidity/Foundry, TypeScript/Hardhat, TypeScript/Vitest, Rust/Anchor), install with `cargo install necessist`. See references/mutation-frameworks.md for details.
- An existing test suite that passes
- macOS environment: Run `ulimit -n 1024` before any `mull-runner` invocation. macOS Tahoe (26+) sets unlimited file descriptors by default, which crashes Mull's subprocess spawning. See references/mutation-frameworks.md for details.
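When `mull-runner` is launched from a Python driver script rather than an interactive shell, the same descriptor cap can be applied in-process before spawning it. This is a minimal sketch of that alternative, not part of the skill itself. The helper name `clamp_open_files` is hypothetical, and the cap only reaches child processes started from the same Python process afterwards.

```python
import resource


def clamp_open_files(limit: int = 1024) -> int:
    """Lower the soft RLIMIT_NOFILE to `limit` if it is higher or unlimited.

    Hypothetical helper: an in-process stand-in for `ulimit -n 1024`.
    Returns the soft limit in effect after the call.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft > limit:
        # Only the soft limit is lowered; the hard limit is left untouched.
        resource.setrlimit(resource.RLIMIT_NOFILE, (limit, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


print("soft RLIMIT_NOFILE:", clamp_open_files())
```

In an interactive shell, `ulimit -n 1024` remains the simpler option.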
Rationalizations to Reject
| Rationalization | Why It's Wrong | Required Action |
|---|---|---|
| "All survived mutants need tests" | Many are harmless or equivalent | Triage before writing tests |
| "Mutation testing is too noisy" | Noise means you're not triaging | Use graph data to filter |
| "Unit tests cover everything" | Complex data flows need fuzzing | Check entrypoint reachability |
| "Dead code mutants don't matter" | Dead code should be removed | Flag for cleanup |
| "Low complexity = low risk" | Boundary bugs hide in simple code | Check mutant location |
| "Tool isn't installed, I'll do it manually" | Manual analysis misses what tooling catches | Install the tool first |
| "Necessist isn't mutation testing, skip it" | Necessist finds what mutation testing misses: weak tests | Run both when the language supports it |
Quick Start
```bash
# 1. Build the code graph
uv run trailmark analyze --summary {targetDir}

# 2. Run mutation testing (language-dependent)
# Python:
uv run mutmut run --paths-to-mutate {targetDir}/src
uv run mutmut results

# 2b. Run necessist (if language supported)
necessist

# 3. Analyze results with this skill's workflow (Phase 3)
```
---
Workflow Overview
Phase 1: Graph Build → Parse codebase with trailmark
↓
Phase 2: Mutation Run → Execute mutation testing framework
Phase 2b: Necessist Run → Remove test statements (optional, parallel)
↓
Phase 3: Triage → Classify findings using graph data
↓
Output: Categorized Report
├── Corroborated (both tools flag same function — highest value)
├── False Positives (harmless, skip)
├── Missing Tests (write unit tests)
└── Fuzzing Targets (set up fuzz harnesses)
Decision Tree
├─ Need to set up mutation testing for a language?
│ └─ Read: references/mutation-frameworks.md
│
├─ Need to set up necessist or find weak test statements?
│ └─ Read: references/mutation-frameworks.md (Necessist section)
│
├─ Need to understand the triage criteria in depth?
│ └─ Read: references/triage-methodology.md
│
├─ Need to understand how graph data informs triage?
│ └─ Read: references/graph-analysis.md
│
└─ Already have results + graph? Use Phase 3 below.
Phase 1: Build Code Graph and Run Pre-Analysis
Parse the target codebase with trailmark and run pre-analysis before
mutation testing. Pre-analysis computes blast radius, entry points, privilege
boundaries, and taint propagation, which Phase 3 uses for triage.
```bash
uv run trailmark analyze --summary {targetDir}
```
Use the `QueryEngine` API to build the graph and run pre-analysis:
- `QueryEngine.from_directory("{targetDir}", language="{lang}")`
- Call `engine.preanalysis()` (mandatory before triage)
- Export with `engine.to_json()` for cross-referencing with mutation results
See references/graph-analysis.md for the
full API: node mapping, reachability queries, blast radius, and
pre-analysis subgraph lookups.
Phase 2: Run Mutation Testing
Select and run the appropriate framework. See
references/mutation-frameworks.md for
language-specific setup.
Capture survived mutants. Each framework reports differently, but
extract these fields per mutant:
| Field | Description |
|---|---|
| File path | Source file containing the mutant |
| Line number | Line where mutation was applied |
| Mutation type | What was changed (operator, value, etc.) |
| Status | survived, killed, timeout, error |
Filter to survived mutants only for Phase 3.
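The fields above can be normalized into one record type regardless of framework, which keeps the Phase 3 filter trivial. The following is a minimal sketch only. The `Mutant` dataclass, its field names, and the sample values are hypothetical, and each real framework still needs its own output parser (see references/mutation-frameworks.md).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutant:
    """Hypothetical normalized record; field names are illustrative."""
    file_path: str
    line: int
    mutation_type: str
    status: str  # "survived", "killed", "timeout", or "error"


def survived_only(mutants):
    """Keep only mutants whose status is 'survived' (case-insensitive)."""
    return [m for m in mutants if m.status.strip().lower() == "survived"]


# Made-up sample data, standing in for parsed framework output.
sample = [
    Mutant("src/parser.py", 42, "operator: < to <=", "survived"),
    Mutant("src/parser.py", 57, "value: 0 to 1", "killed"),
    Mutant("src/io.py", 10, "operator: and to or", "timeout"),
]

for m in survived_only(sample):
    print(f"{m.file_path}:{m.line} {m.mutation_type}")
```

Timeouts and errors are dropped here along with killed mutants, matching the instruction to pass only survived mutants to Phase 3.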
Phase 2b: Run Necessist (Optional)
If the target language is supported (Go, Rust, Solidity/Foundry,
TypeScript/Hardhat, TypeScript/Vitest, Rust/Anchor), run necessist to
find unnecessary test statements. This runs independently of Phase 2 and
can execute in parallel.
```bash
# Auto-detect framework
necessist

# Or target specific test files
necessist tests/test_parser.rs

# Export results
necessist --dump
```

Filter to findings where the test **passed after removal**. See
[references/mutation-frameworks.md](references/mutation-frameworks.md)
for framework-specific configuration and the normalized record format.
Map each removal to a production function using the algorithm in
[references/graph-analysis.md](references/graph-analysis.md).

---
Phase 3: Triage Findings
For each survived mutant and each necessist removal, determine its
triage bucket using graph data. Necessist removals must first be mapped
to a production function (see
references/graph-analysis.md).
Quick Classification (Mutation Testing)
| Signal | Bucket | Reasoning |
|---|---|---|
| No callers in graph | False Positive | Dead code, mutant is unreachable |
| Only test callers | False Positive | Test infrastructure, not production |
| Logging/display string | False Positive | Cosmetic, no behavioral impact |
| Equivalent mutant | False Positive | Behavior unchanged despite mutation |
| Simple function, low CC, no entrypoint path | Missing Tests | Unit test is straightforward |
| Error handling path | Missing Tests | Should have negative test cases |
| Boundary condition (off-by-one) | Missing Tests | Property-based test candidate |
| Pure function, deterministic | Missing Tests | Easy to test, high value |
| High CC (>10), entrypoint reachable | Fuzzing Target | Complex + exposed = fuzz it |
| Parser/validator/deserializer | Fuzzing Target | Structured input handling |
| Many callers (>10) + moderate CC | Fuzzing Target | High blast radius |
| Binary/wire protocol handling | Fuzzing Target | Fuzzers excel at format testing |
Quick Classification (Necessist)
| Signal | Bucket | Reasoning |
|---|---|---|
| Redundant setup or debug call | False Positive | Statement genuinely unnecessary |
| Cannot map to production function | False Positive | No graph context for triage |
| Call removed, no assertion checks its effect | Missing Tests | Test has weak assertions |
| Assertion removed, test still passes | Missing Tests | Redundant or insufficient coverage |
| Maps to high-CC entrypoint-reachable function | Fuzzing Target | Complex + exposed + weak test |
When both mutation testing and necessist flag the same production
function, mark as corroborated — highest confidence finding.
For detailed criteria, see
references/triage-methodology.md.
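Corroboration amounts to intersecting the two sets of flagged production functions. This is a minimal sketch under assumptions: `merge_findings` and the function identifiers are hypothetical, and the inputs are presumed to be already mapped from raw findings to production functions. The labels match the `Source` values used in the report.

```python
from collections import defaultdict


def merge_findings(mutation_funcs, necessist_funcs):
    """Label each production function by which tool(s) flagged it.

    Both arguments are iterables of function identifiers (hypothetical
    strings such as "module.func") already mapped from raw findings.
    """
    sources = defaultdict(set)
    for func in mutation_funcs:
        sources[func].add("mutation")
    for func in necessist_funcs:
        sources[func].add("necessist")
    # Two distinct tools on one function means the finding is corroborated.
    return {
        func: "corroborated" if len(tools) == 2 else next(iter(tools))
        for func, tools in sources.items()
    }


labels = merge_findings(
    ["parser.parse_header", "util.clamp"],
    ["parser.parse_header", "cache.evict"],
)
for func, source in sorted(labels.items()):
    print(f"{func}: {source}")
```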
Graph Queries for Triage
For each mutant, map it to its containing graph node and use pre-analysis
subgraphs (tainted, high_blast_radius, privilege_boundary) from Phase 1
to classify it. The classification logic checks: no callers → false
positive, privilege boundary → fuzzing, high CC + tainted → fuzzing,
high blast radius → fuzzing, otherwise → missing tests.
See references/graph-analysis.md for
the `batch_triage` implementation and node mapping functions.
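The check order described above can be sketched in a few lines. This is an illustration under assumptions, not the `batch_triage` implementation from the reference file. `NodeFacts`, `containing_node`, `classify`, and the bucket labels are hypothetical names, the node facts are hand-written rather than read from a trailmark graph, and the CC threshold of 10 is borrowed from the Quick Classification table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NodeFacts:
    """Hypothetical per-function facts gathered from the graph."""
    node_id: str
    file_path: str
    start_line: int
    end_line: int
    caller_count: int
    complexity: int
    tainted: bool = False
    high_blast_radius: bool = False
    privilege_boundary: bool = False


def containing_node(nodes, file_path, line) -> Optional[NodeFacts]:
    """Return the innermost node whose line span contains the mutant."""
    hits = [
        n for n in nodes
        if n.file_path == file_path and n.start_line <= line <= n.end_line
    ]
    return min(hits, key=lambda n: n.end_line - n.start_line, default=None)


def classify(node: NodeFacts, high_cc: int = 10) -> str:
    """Apply the checks in the order listed above; first match wins."""
    if node.caller_count == 0:
        return "false_positive"
    if node.privilege_boundary:
        return "fuzzing_target"
    if node.complexity > high_cc and node.tainted:
        return "fuzzing_target"
    if node.high_blast_radius:
        return "fuzzing_target"
    return "missing_tests"


# Made-up nodes and mutant locations for illustration.
nodes = [
    NodeFacts("parse", "src/parser.py", 10, 80, caller_count=3,
              complexity=14, tainted=True),
    NodeFacts("clamp", "src/util.py", 1, 6, caller_count=2, complexity=2),
    NodeFacts("legacy", "src/util.py", 20, 30, caller_count=0, complexity=5),
]
for path, line in [("src/parser.py", 42), ("src/util.py", 3), ("src/util.py", 25)]:
    node = containing_node(nodes, path, line)
    print(path, line, classify(node))
```

Because the first matching check wins, a function with no callers is treated as a false positive even if it is complex or tainted.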
Output Format
Generate a markdown report:
```markdown
# Genotoxic Triage Report

## Summary
- Total survived mutants: N
- Total necessist removals: N
- Corroborated findings: N
- False positives: N (N%)
- Missing test coverage: N (N%)
- Fuzzing targets: N (N%)

## Corroborated Findings
| File | Line | Function | Mutation Signal | Necessist Signal | Action |
|---|---|---|---|---|---|

## False Positives
| File | Line | Mutation | Reason | Source |
|---|---|---|---|---|

## Missing Test Coverage
| File | Line | Function | CC | Callers | Suggested Test | Source |
|---|---|---|---|---|---|---|

## Fuzzing Targets
| File | Line | Function | CC | Entrypoint Path | Blast Radius | Source |
|---|---|---|---|---|---|---|
```

The `Source` column is `mutation`, `necessist`, or `corroborated`.
Write the report to `GENOTOXIC_REPORT.md` in the working directory.
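The Summary bullets can be generated from the triage results rather than counted by hand. This is a minimal sketch under assumptions: `summary_lines` and the bucket labels are hypothetical, and percentages are taken over all triaged findings, a base the template does not specify.

```python
from collections import Counter


def summary_lines(buckets, survived_total, necessist_total, corroborated):
    """Render the Summary bullets from a list of bucket labels."""
    counts = Counter(buckets)
    total = len(buckets)

    def pct(n):
        # Guard against an empty report to avoid division by zero.
        return round(100 * n / total) if total else 0

    rows = [
        ("False positives", counts["false_positive"]),
        ("Missing test coverage", counts["missing_tests"]),
        ("Fuzzing targets", counts["fuzzing_target"]),
    ]
    lines = [
        f"- Total survived mutants: {survived_total}",
        f"- Total necessist removals: {necessist_total}",
        f"- Corroborated findings: {corroborated}",
    ]
    lines += [f"- {label}: {n} ({pct(n)}%)" for label, n in rows]
    return lines


# Made-up triage results for illustration.
buckets = ["false_positive", "missing_tests", "missing_tests", "fuzzing_target"]
lines = summary_lines(buckets, survived_total=3, necessist_total=1, corroborated=0)
print("\n".join(lines))
```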

---
Quality Checklist
Before delivering:
- Trailmark graph built for target language
- Mutation framework ran to completion
- Necessist ran (if language supported) or noted as not applicable
- All survived mutants triaged (none unclassified)
- All necessist removals triaged (if applicable)
- Corroborated findings identified (if both tools ran)
- False positives have clear justifications
- Missing test items include suggested test type
- Fuzzing targets include entrypoint paths and blast radius
- Report file written to `GENOTOXIC_REPORT.md`
- User notified with summary statistics
Integration
trailmark skill:
- Phase 1: Build code graph, query complexity and entrypoints
- Phase 3: Caller analysis, reachability, blast radius
property-based-testing skill:
- Missing test coverage items involving boundary conditions
- Roundtrip/idempotence properties for serialization mutants
testing-handbook-skills (fuzzing):
- Fuzzing target items: use `harness-writing`, `cargo-fuzz`, `atheris`
Supporting Documentation
- references/mutation-frameworks.md - Language-specific framework setup, output parsing, and necessist configuration
- references/triage-methodology.md - Detailed triage criteria, edge cases, and worked examples for both mutation testing and necessist
- references/graph-analysis.md - Graph query patterns, test-to-production mapping, and result merging
First-time users: Start with Phase 1 (graph build), then run mutations,
then use the Quick Classification table in Phase 3.
Experienced users: Jump to Phase 3 and use the Decision Tree to load
specific reference material.