graph-evolution


Graph Evolution


Builds Trailmark code graphs at two source snapshots and computes a structural diff. Surfaces security-relevant changes that text-level diffs miss: new attack paths, complexity shifts, blast radius growth, taint propagation changes, and privilege boundary modifications.

When to Use


  • Comparing two git refs to understand what structurally changed
  • Auditing a range of commits for security-relevant evolution
  • Detecting new attack paths created by code changes
  • Finding functions whose blast radius or complexity grew silently
  • Identifying taint propagation changes across refactors
  • Pre-release structural comparison (tag-to-tag or branch-to-branch)

When NOT to Use


  • Line-level code review (use `differential-review` for text-diff analysis)
  • Single-snapshot analysis (use the `trailmark` skill directly)
  • Diagram generation from a single snapshot (use the `diagramming-code` skill)
  • Mutation testing triage (use the `genotoxic` skill)

Rationalizations to Reject


| Rationalization | Why It's Wrong | Required Action |
|---|---|---|
| "We just need the structural diff, skip pre-analysis" | Without pre-analysis, you miss taint changes, blast radius growth, and privilege boundary shifts | Run `engine.preanalysis()` on both snapshots |
| "Text diff covers what changed" | Text diffs miss new attack paths, transitive complexity shifts, and subgraph membership changes | Use structural diff to complement text diff |
| "Only added nodes matter" | Removed security functions and shifted privilege boundaries are equally dangerous | Review removals and modifications, not just additions |
| "Low-severity structural changes can be ignored" | INFO-level changes (dead code removal) can mask removed security checks | Classify every change, review removals for replaced functionality |
| "One snapshot's graph is enough for comparison" | Single-snapshot analysis can't detect evolution; you need both before and after | Always build and export both graphs |
| "Tool isn't installed, I'll compare manually" | Manual comparison misses what graph analysis catches | Install trailmark first |


Prerequisites


`trailmark` must be installed. If `uv run trailmark` fails, run:

```bash
uv pip install trailmark
```

DO NOT fall back to "manual comparison" or reading source files as a substitute for running trailmark. The tool must be installed and used programmatically. If installation fails, report the error.
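
The availability check can be scripted so that a run stops early with a clear error. This is a minimal sketch, assuming only that `uv run trailmark` exits with a non-zero status (or that `uv` is missing) when the tool is unavailable. `command_succeeds` and `trailmark_available` are hypothetical helper names, not part of trailmark.

```python
import subprocess


def command_succeeds(argv):
    """Return True if the command can be started and exits with status 0."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True, check=False)
    except (FileNotFoundError, PermissionError):
        # The executable itself (for example `uv`) is missing or not runnable.
        return False
    return result.returncode == 0


def trailmark_available():
    """Assumption: `uv run trailmark` exits 0 when trailmark is installed."""
    return command_succeeds(["uv", "run", "trailmark"])
```

If `trailmark_available()` returns `False`, install the tool as shown above and report any installation error instead of continuing.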


Quick Start

Compare two git refs (e.g., tags, branches, commits):

  1. Build graphs at each snapshot
  2. Run pre-analysis on both
  3. Compute structural diff
  4. Generate report

Step-by-step: see Workflow below.

Decision Tree


├─ Need to understand what each metric means?
│  └─ Read: references/evolution-metrics.md
├─ Need the report output format?
│  └─ Read: references/report-format.md
├─ Already have two graph JSON exports?
│  └─ Jump to Phase 3 (run graph_diff.py directly)
└─ Starting from two git refs?
   └─ Start at Phase 1


Workflow


Graph Evolution Progress:
- [ ] Phase 1: Create snapshots (git worktrees)
- [ ] Phase 2: Build graphs + pre-analysis on both snapshots
- [ ] Phase 3: Compute structural diff
- [ ] Phase 4: Interpret diff and generate report
- [ ] Phase 5: Clean up worktrees

Phase 1: Create Snapshots

Use git worktrees to get clean copies of each ref without disturbing the working tree.

```bash
# Create temp directories for worktrees
BEFORE_DIR=$(mktemp -d)
AFTER_DIR=$(mktemp -d)

# Create worktrees (run from repo root)
git worktree add "$BEFORE_DIR" {before_ref}
git worktree add "$AFTER_DIR" {after_ref}
```

If comparing two directories instead of git refs, skip this phase and use the directory paths directly in Phase 2.
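
When the workflow is driven from Python, snapshot creation and the Phase 5 cleanup can be paired so that a failed run does not leave worktrees behind. The sketch below is one possible arrangement, not part of the skill. `snapshot_worktree` and the two command builders are hypothetical names, and the only git commands used are `git worktree add` and `git worktree remove`.

```python
import os
import subprocess
import tempfile
from contextlib import contextmanager


def worktree_add_cmd(path, ref):
    """Build the `git worktree add` command used in Phase 1."""
    return ["git", "worktree", "add", path, ref]


def worktree_remove_cmd(path):
    """Build the matching `git worktree remove` command used in Phase 5."""
    return ["git", "worktree", "remove", path]


@contextmanager
def snapshot_worktree(ref, repo_root="."):
    """Check out `ref` into a temporary worktree and remove it on exit."""
    path = tempfile.mkdtemp(prefix="graph_evolution_")
    try:
        subprocess.run(worktree_add_cmd(path, ref), cwd=repo_root, check=True)
    except subprocess.CalledProcessError:
        os.rmdir(path)  # nothing was checked out, so drop the empty temp dir
        raise
    try:
        yield path
    finally:
        subprocess.run(worktree_remove_cmd(path), cwd=repo_root, check=True)


# Usage (the refs are placeholders):
# with snapshot_worktree("{before_ref}") as before_dir, \
#      snapshot_worktree("{after_ref}") as after_dir:
#     ...  # Phases 2 to 4 run here
```

Note that `git worktree remove` refuses to delete a worktree containing modified or untracked files unless `--force` is passed, so cleanup may need adjusting if any step writes into the snapshot directories.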

Phase 2: Build Graphs and Run Pre-Analysis


Build Trailmark graphs for both snapshots and run pre-analysis on each. Pre-analysis computes blast radius, taint propagation, privilege boundaries, and entrypoint enumeration.
```python
import os
import tempfile

from trailmark.query.api import QueryEngine


def build_and_export(target_dir, language, output_path):
    """Build graph, run pre-analysis, export JSON."""
    engine = QueryEngine.from_directory(target_dir, language=language)
    engine.preanalysis()
    json_str = engine.to_json()
    with open(output_path, "w", encoding="utf-8") as f:
        f.write(json_str)
    return engine.summary()


# Both exports go into one scratch directory that Phase 3 reuses.
work_dir = tempfile.mkdtemp(prefix="trailmark_evolution_")
before_json = os.path.join(work_dir, "before_graph.json")
after_json = os.path.join(work_dir, "after_graph.json")

before_summary = build_and_export(
    "{before_dir}", "{lang}", before_json
)
after_summary = build_and_export(
    "{after_dir}", "{lang}", after_json
)
```
Verify both graphs built successfully by checking the summary output. If either fails, check that the language parameter matches the codebase and that trailmark supports all file types present.
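
Besides reading the summaries, it can help to confirm that each exported file exists, is non-empty, and parses as JSON before moving on to Phase 3. The sketch below makes no assumption about the Trailmark export schema. `load_graph_export` is a hypothetical helper name.

```python
import json
import os


def load_graph_export(path):
    """Load an exported graph, failing loudly if it is missing, empty, or not JSON."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"graph export not found: {path}")
    if os.path.getsize(path) == 0:
        raise ValueError(f"graph export is empty: {path}")
    with open(path, encoding="utf-8") as f:
        # json.load raises json.JSONDecodeError on malformed content.
        return json.load(f)
```

Calling it on both `before_json` and `after_json` surfaces a failed export immediately rather than as a confusing diff later.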

Phase 3: Compute Structural Diff


Run the diff script on the two exported JSON files (using the same `work_dir` from Phase 2):

```bash
uv run {baseDir}/scripts/graph_diff.py \
    --before "{before_json}" \
    --after "{after_json}" > "{work_dir}/evolution_diff.json"
```

The output JSON contains:

| Key | Contents |
|---|---|
| `summary_delta` | Changes in node/edge/entrypoint counts |
| `nodes.added` | New functions, classes, methods |
| `nodes.removed` | Deleted functions, classes, methods |
| `nodes.modified` | Functions with changed CC, params, return type, span |
| `edges.added` | New call/inheritance/import relationships |
| `edges.removed` | Deleted relationships |
| `subgraphs` | Per-subgraph membership changes (tainted, high_blast_radius, etc.) |
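
A quick way to get oriented in the diff is to count the entries under each change key. The sketch below is illustrative only. It assumes that dotted keys such as `nodes.added` are either literal top-level keys or nested objects (`diff["nodes"]["added"]`), and that each holds a collection. Check a real `evolution_diff.json` before relying on either layout. `lookup` and `change_counts` are hypothetical helper names.

```python
CHANGE_KEYS = (
    "nodes.added",
    "nodes.removed",
    "nodes.modified",
    "edges.added",
    "edges.removed",
)


def lookup(diff, dotted_key, default=None):
    """Resolve a key such as "nodes.added" in either a flat or a nested layout."""
    if dotted_key in diff:
        return diff[dotted_key]
    current = diff
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


def change_counts(diff):
    """Count entries under each change key; missing or null keys count as zero."""
    return {key: len(lookup(diff, key) or []) for key in CHANGE_KEYS}
```

Non-zero counts under `nodes.removed` or `edges.removed` are a cue to check whether a security function was dropped without a replacement.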

Phase 4: Interpret Diff and Generate Report


Read the diff JSON and generate a security-focused markdown report. See references/report-format.md for the full template.
Interpretation priorities (highest to lowest):
  1. New tainted paths: nodes entering the `tainted` subgraph, especially if they also appear in added edges targeting sensitive functions
  2. Privilege boundary changes: new or removed trust transitions
  3. Attack surface growth: new entrypoints, especially `untrusted_external`
  4. Blast radius increases: nodes entering `high_blast_radius`
  5. Complexity spikes: CC increases > 3 on tainted or entrypoint-reachable nodes
  6. Structural additions: new nodes and edges (review needed)
  7. Structural removals: verify removed security functions were replaced

Cross-reference structural changes with `git diff {before_ref}..{after_ref}` to add source-level context to findings.
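
The complexity-spike rule (priority 5) is mechanical enough to sketch in code. The example below is illustrative only. The field names `id`, `cc_before`, and `cc_after` are hypothetical stand-ins for whatever the diff actually records for modified nodes, and `sensitive_ids` is assumed to be the set of tainted or entrypoint-reachable node IDs collected from the diff.

```python
CC_SPIKE_THRESHOLD = 3  # priority 5: CC increases > 3


def complexity_spikes(modified_nodes, sensitive_ids, threshold=CC_SPIKE_THRESHOLD):
    """Return (node_id, delta) pairs for sensitive nodes whose CC grew by more than threshold."""
    spikes = []
    for node in modified_nodes:
        delta = node["cc_after"] - node["cc_before"]
        if delta > threshold and node["id"] in sensitive_ids:
            spikes.append((node["id"], delta))
    # Largest increases first, so the report leads with the worst spikes.
    return sorted(spikes, key=lambda item: item[1], reverse=True)
```

The comparison is strict, so an increase of exactly 3 is not flagged, matching the "> 3" wording above.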
Severity classification:
| Severity | Structural Signal |
|---|---|
| CRITICAL | New tainted path to sensitive function, removed auth boundary |
| HIGH | New entrypoint + high blast radius, large CC increase on tainted node |
| MEDIUM | New trust-boundary-crossing edges, moderate CC increase |
| LOW | Added nodes without entrypoint reachability |
| INFO | Dead code removal, complexity reductions |
For detailed metric definitions, see references/evolution-metrics.md.
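
Once findings carry a severity label, ordering them for the report is straightforward. This is a minimal sketch, assuming each finding is a dict with a `severity` field holding one of the five labels from the table. The `severity` and `title` field names and the `sort_findings` helper are hypothetical, not a format defined by the skill.

```python
SEVERITY_ORDER = ("CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO")
_RANK = {name: index for index, name in enumerate(SEVERITY_ORDER)}


def sort_findings(findings):
    """Order findings from most to least severe, keeping input order within a level."""
    for finding in findings:
        if finding["severity"] not in _RANK:
            raise ValueError(f"unknown severity: {finding['severity']!r}")
    # sorted() is stable, so findings with equal severity keep their original order.
    return sorted(findings, key=lambda finding: _RANK[finding["severity"]])
```

Rejecting unknown labels up front also acts as a check that every finding was assigned one of the defined severity levels.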

Phase 5: Clean Up


Remove git worktrees after the report is written:

```bash
git worktree remove "{before_dir}"
git worktree remove "{after_dir}"
```


Diff Script Reference


```bash
uv run {baseDir}/scripts/graph_diff.py [OPTIONS]
```

| Argument | Default | Description |
|---|---|---|
| `--before` | required | Path to the "before" graph JSON |
| `--after` | required | Path to the "after" graph JSON |
| `--indent` | `2` | JSON output indentation |

Input format: Trailmark JSON exports from `engine.to_json()`. Output: JSON structural diff to stdout.
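
If the rest of the pipeline is in Python, the script can be invoked with `subprocess` and its stdout parsed directly. The sketch below uses only the three documented flags and assumes that `--indent` takes an integer value. `graph_diff_cmd` and `run_graph_diff` are hypothetical wrapper names, and `script_path` stands for `{baseDir}/scripts/graph_diff.py`.

```python
import json
import subprocess


def graph_diff_cmd(script_path, before_json, after_json, indent=2):
    """Build the documented graph_diff.py invocation as an argv list."""
    return [
        "uv", "run", script_path,
        "--before", before_json,
        "--after", after_json,
        "--indent", str(indent),
    ]


def run_graph_diff(script_path, before_json, after_json):
    """Run the diff script and parse the JSON it prints to stdout."""
    result = subprocess.run(
        graph_diff_cmd(script_path, before_json, after_json),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script exits non-zero
    )
    return json.loads(result.stdout)
```

Passing the arguments as a list avoids shell quoting problems when the temporary paths contain spaces.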


Quality Checklist


Before delivering the report:
  • Both graphs built successfully (check summaries)
  • Pre-analysis ran on both snapshots
  • Structural diff computed (non-empty diff JSON)
  • All subgraph changes interpreted (tainted, blast radius, etc.)
  • Critical findings include evidence (node IDs, edge diffs)
  • Severity levels assigned to all findings
  • Source-level context added via git diff cross-reference
  • Worktrees cleaned up (or temp dirs removed)
  • Report written to `GRAPH_EVOLUTION_*.md`


Integration


trailmark skill: Phase 2 uses the trailmark API for graph building and pre-analysis. All trailmark query patterns work on either snapshot's engine.
differential-review skill: Use graph-evolution for structural analysis, differential-review for line-level code review. The two are complementary: graph-evolution finds attack paths that text diffs miss, while differential-review provides git blame context and micro-adversarial analysis.
genotoxic skill: If graph-evolution reveals new high-CC tainted nodes, feed them to genotoxic for mutation testing triage.
diagramming-code skill: Generate before/after diagrams to visualize structural changes. Use `call-graph` or `data-flow` diagrams focused on changed nodes.


Supporting Documentation


  • references/evolution-metrics.md — What each structural metric means and why it matters for security
  • references/report-format.md — Report template, severity classification, and example findings