# Audit Augmentation

Projects findings from external tools (SARIF) and human auditors (weAudit) onto Trailmark code graphs as annotations and subgraphs.

## When to Use

- Importing Semgrep, CodeQL, or other SARIF-producing tool results into a graph
- Importing weAudit audit annotations into a graph
- Cross-referencing static analysis findings with blast radius or taint data
- Querying which functions have high-severity findings
- Visualizing audit coverage alongside code structure

## When NOT to Use

- Running static analysis tools (use semgrep/codeql directly, then import)
- Building the code graph itself (use the `trailmark` skill)
- Generating diagrams (use the `diagramming-code` skill after augmenting)

## Rationalizations to Reject

| Rationalization | Why It's Wrong | Required Action |
|---|---|---|
| "The user only asked about SARIF, skip pre-analysis" | Without pre-analysis, you can't cross-reference findings with blast radius or taint | Always run `engine.preanalysis()` before augmenting |
| "Unmatched findings don't matter" | Unmatched findings may indicate parsing gaps or out-of-scope files | Report unmatched count and investigate if high |
| "One severity subgraph is enough" | Different severities need different triage workflows | Query all severity subgraphs, not just `error` |
| "SARIF results speak for themselves" | Findings without graph context lack blast radius and taint reachability | Cross-reference with pre-analysis subgraphs |
| "weAudit and SARIF overlap, pick one" | Human auditors and tools find different things | Import both when available |
| "Tool isn't installed, I'll do it manually" | Manual analysis misses what tooling catches | Install trailmark first |

## Installation

MANDATORY: If `uv run trailmark` fails, install trailmark first:

```bash
uv pip install trailmark
```

## Quick Start

### CLI

```bash
# Augment with SARIF
uv run trailmark augment {targetDir} --sarif results.sarif

# Augment with weAudit
uv run trailmark augment {targetDir} --weaudit .vscode/alice.weaudit

# Both at once, output JSON
uv run trailmark augment {targetDir} \
  --sarif results.sarif \
  --weaudit .vscode/alice.weaudit \
  --json
```

### Programmatic API

```python
from trailmark.query.api import QueryEngine

engine = QueryEngine.from_directory("{targetDir}", language="python")

# Run pre-analysis first for cross-referencing
engine.preanalysis()

# Augment with SARIF
result = engine.augment_sarif("results.sarif")
# result: {matched_findings: 12, unmatched_findings: 3, subgraphs_created: [...]}

# Augment with weAudit
result = engine.augment_weaudit(".vscode/alice.weaudit")

# Query findings
engine.findings()                       # All findings
engine.subgraph("sarif:error")          # High-severity SARIF
engine.subgraph("weaudit:high")         # High-severity weAudit
engine.subgraph("sarif:semgrep")        # By tool name
engine.annotations_of("function_name")  # Per-node lookup
```

## Workflow

Augmentation Progress:
- [ ] Step 1: Build graph and run pre-analysis
- [ ] Step 2: Locate SARIF/weAudit files
- [ ] Step 3: Run augmentation
- [ ] Step 4: Inspect results and subgraphs
- [ ] Step 5: Cross-reference with pre-analysis

**Step 1:** Build the graph and run pre-analysis for blast radius and taint context:

```python
engine = QueryEngine.from_directory("{targetDir}", language="{lang}")
engine.preanalysis()
```

**Step 2:** Locate input files:

- SARIF: Usually output by tools like `semgrep --sarif -o results.sarif` or `codeql database analyze --format=sarif-latest`
- weAudit: Stored in `.vscode/<username>.weaudit` within the workspace

**Step 3:** Run augmentation via `engine.augment_sarif()` or `engine.augment_weaudit()`. Check `unmatched_findings` in the result — these are findings whose file/line locations didn't overlap any parsed code unit.

**Step 4:** Query findings and subgraphs. Use `engine.findings()` to list all annotated nodes. Use `engine.subgraph_names()` to see available subgraphs.

**Step 5:** Cross-reference with pre-analysis data to prioritize:

- Findings on tainted nodes: overlap `sarif:error` with the `tainted` subgraph
- Findings on high blast radius nodes: overlap with `high_blast_radius`
- Findings on privilege boundaries: overlap with `privilege_boundary`
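Step 5 can be sketched as plain set intersections. The sketch below assumes each subgraph can be reduced to a set of node names; the `subgraphs` dict is a stand-in for real query results, whose actual shape may differ:

```python
# Hypothetical sketch of Step 5: rank SARIF error findings by how many
# pre-analysis subgraphs also contain the node. The dict below is sample
# data standing in for real engine.subgraph() results.
subgraphs = {
    "sarif:error": {"parse_input", "render_page", "load_config"},
    "tainted": {"parse_input", "render_page"},
    "high_blast_radius": {"load_config", "parse_input"},
    "privilege_boundary": {"render_page"},
}

def priority(node: str) -> int:
    """Count how many pre-analysis subgraphs also contain this node."""
    contexts = ("tainted", "high_blast_radius", "privilege_boundary")
    return sum(node in subgraphs[ctx] for ctx in contexts)

# Highest-priority findings first: error-level nodes that are also
# tainted, high blast radius, or on a privilege boundary.
triage = sorted(subgraphs["sarif:error"], key=priority, reverse=True)
```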

## Annotation Format

Findings are stored as standard Trailmark annotations:

- Kind: `finding` (tool-generated) or `audit_note` (human notes)
- Source: `sarif:<tool_name>` or `weaudit:<author>`
- Description: Compact single-line: `[SEVERITY] rule-id: message (tool)`
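The compact description line can be rendered with a one-line formatter. This is an illustrative sketch of the format above, not Trailmark's actual implementation:

```python
def finding_description(severity: str, rule_id: str, message: str, tool: str) -> str:
    """Render a finding as the compact single-line annotation description."""
    return f"[{severity.upper()}] {rule_id}: {message} ({tool})"

finding_description("error", "python.lang.security.eval", "eval() on user input", "semgrep")
# '[ERROR] python.lang.security.eval: eval() on user input (semgrep)'
```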

## Subgraphs Created

| Subgraph | Contents |
|---|---|
| `sarif:error` | Nodes with SARIF error-level findings |
| `sarif:warning` | Nodes with SARIF warning-level findings |
| `sarif:note` | Nodes with SARIF note-level findings |
| `sarif:<tool>` | Nodes flagged by a specific tool |
| `weaudit:high` | Nodes with high-severity weAudit findings |
| `weaudit:medium` | Nodes with medium-severity weAudit findings |
| `weaudit:low` | Nodes with low-severity weAudit findings |
| `weaudit:findings` | All weAudit findings (entryType=0) |
| `weaudit:notes` | All weAudit notes (entryType=1) |
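Conceptually, each finding contributes its node to one severity subgraph and one tool (or entry-type) subgraph. A minimal sketch of that bucketing, using the names from the table above (the argument shapes are assumptions, not Trailmark's API):

```python
# Hypothetical sketch: which subgraphs a single finding lands in.
def sarif_subgraphs(level: str, tool: str) -> list[str]:
    """Subgraph names a SARIF finding contributes its node to."""
    return [f"sarif:{level}", f"sarif:{tool}"]

def weaudit_subgraphs(severity: str, entry_type: int) -> list[str]:
    """Subgraph names a weAudit entry contributes its node to."""
    kind = "findings" if entry_type == 0 else "notes"  # entryType=0 finding, 1 note
    return [f"weaudit:{severity}", f"weaudit:{kind}"]
```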

## How Matching Works

Findings are matched to graph nodes by file path and line range overlap:

1. The finding's file path is normalized relative to the graph's `root_path`
2. Nodes whose `location.file_path` matches AND whose line range overlaps are selected
3. The tightest match (smallest span) is preferred
4. If a finding's location doesn't overlap any node, it counts as unmatched

SARIF paths may be relative, absolute, or `file://` URIs — all are handled. weAudit uses 0-indexed lines which are converted to 1-indexed automatically.
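The matching rules above can be sketched as follows. This is an illustrative reimplementation under assumed data shapes (tuples of path and 1-indexed inclusive line ranges), not Trailmark's actual code:

```python
from pathlib import PurePosixPath

def normalize(path: str, root: str) -> str:
    """Strip a file:// prefix and make the path relative to the graph root."""
    if path.startswith("file://"):
        path = path[len("file://"):]
    p = PurePosixPath(path)
    try:
        return str(p.relative_to(root))
    except ValueError:
        return str(p)  # already relative, or outside the root

def match_finding(finding, nodes, root):
    """Return the tightest node overlapping the finding, or None if unmatched.

    finding and each node are (file_path, start_line, end_line) tuples
    with 1-indexed, inclusive line ranges.
    """
    f_path = normalize(finding[0], root)
    f_start, f_end = finding[1], finding[2]
    candidates = [
        n for n in nodes
        if n[0] == f_path and n[1] <= f_end and f_start <= n[2]  # ranges overlap
    ]
    # Prefer the tightest match: the node spanning the fewest lines.
    return min(candidates, key=lambda n: n[2] - n[1], default=None)
```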

## Supporting Documentation

- references/formats.md — SARIF 2.1.0 and weAudit file format field reference