Trailmark

Parses source code into a directed graph of functions, classes, calls, and semantic metadata for security analysis. Supports 16 languages.

When to Use

Mapping call paths from user input to sensitive functions
Finding complexity hotspots for audit prioritization
Identifying attack surface and entrypoints
Understanding call relationships in unfamiliar codebases
Security review or audit preparation across polyglot projects
Adding LLM-inferred annotations (assumptions, preconditions) to code units
Pre-analysis before mutation testing (genotoxic skill) or diagramming

When NOT to Use

Single-file scripts where call graph adds no value (read the file directly)
Architecture diagrams not derived from code (use the `diagramming-code` skill or draw by hand)
Mutation testing triage (use the genotoxic skill, which calls trailmark internally)
Runtime behavior analysis (trailmark is static, not dynamic)

Rationalizations to Reject

| Rationalization | Why It's Wrong | Required Action |
|---|---|---|
| "I'll just read the source files manually" | Manual reading misses call paths, blast radius, and taint data | Install trailmark and use the API |
| "Pre-analysis isn't needed for a quick query" | Blast radius, taint, and privilege data are only available after `preanalysis()` | Always run `engine.preanalysis()` before handing off to other skills |
| "The graph is too large, I'll sample" | Sampling misses cross-module attack paths | Build the full graph; use subgraph queries to focus |
| "Uncertain edges don't matter" | Dynamic dispatch is where type confusion bugs hide | Account for `uncertain` edges in security claims |
| "Single-language analysis is enough" | Polyglot repos have FFI boundaries where bugs cluster | Use the correct `--language` flag per component |
| "Complexity hotspots are the only thing worth checking" | Low-complexity functions on tainted paths are high-value targets | Combine complexity with taint and blast radius data |

Installation

MANDATORY: If `uv run trailmark` fails (command not found, import error, ModuleNotFoundError), install trailmark before doing anything else:

```bash
uv pip install trailmark
```

DO NOT fall back to "manual verification", "manual analysis", or reading source files by hand as a substitute for running trailmark. The tool must be installed and used programmatically. If installation fails, report the error to the user instead of silently switching to manual code reading.
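
The check-then-install rule above could be automated roughly as follows. This is a minimal sketch, not part of trailmark: the helper name `ensure_trailmark`, the injectable `run` parameter, and the use of `--help` as a harmless probe command are assumptions made for illustration.

```python
import subprocess
from typing import Callable


def ensure_trailmark(
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Return True if trailmark had to be installed, False if it already worked.

    Raises RuntimeError when installation fails, so the error reaches the
    user instead of being hidden behind a switch to manual code reading.
    """
    # Assumption: `--help` exits with status 0 when the CLI is importable.
    probe = run(["uv", "run", "trailmark", "--help"], capture_output=True, text=True)
    if probe.returncode == 0:
        return False

    # Same install command as documented above.
    install = run(["uv", "pip", "install", "trailmark"], capture_output=True, text=True)
    if install.returncode != 0:
        raise RuntimeError(f"trailmark installation failed: {install.stderr.strip()}")
    return True
```

Passing the runner as a parameter keeps the helper testable without touching the real environment; in normal use it would be called with no arguments.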

Quick Start

```bash
# Python (default)
uv run trailmark analyze --summary {targetDir}

# Other languages
uv run trailmark analyze --language rust {targetDir}
uv run trailmark analyze --language javascript {targetDir}
uv run trailmark analyze --language go --summary {targetDir}

# Complexity hotspots
uv run trailmark analyze --complexity 10 {targetDir}
```

Programmatic API

```python
from trailmark.models import AnnotationKind
from trailmark.query.api import QueryEngine

# Specify language (defaults to "python")
engine = QueryEngine.from_directory("{targetDir}", language="rust")

engine.callers_of("function_name")
engine.callees_of("function_name")
engine.paths_between("entry_func", "db_query")
engine.complexity_hotspots(threshold=10)
engine.attack_surface()
engine.summary()
engine.to_json()

# Run pre-analysis (blast radius, entrypoints, privilege
# boundaries, taint propagation)
result = engine.preanalysis()

# Query subgraphs created by pre-analysis
engine.subgraph_names()
engine.subgraph("tainted")
engine.subgraph("high_blast_radius")
engine.subgraph("privilege_boundary")
engine.subgraph("entrypoint_reachable")

# Add LLM-inferred annotations
engine.annotate("function_name", AnnotationKind.ASSUMPTION, "input is URL-encoded", source="llm")

# Query annotations (including pre-analysis results)
engine.annotations_of("function_name")
engine.annotations_of("function_name", kind=AnnotationKind.BLAST_RADIUS)
engine.annotations_of("function_name", kind=AnnotationKind.TAINT_PROPAGATION)
```

Pre-Analysis Passes

Always run `engine.preanalysis()` before handing off to the genotoxic or `diagramming-code` skills. Pre-analysis enriches the graph with four passes:

Blast radius estimation — counts downstream and upstream nodes per function, identifies critical high-complexity descendants
Entry point enumeration — maps entrypoints by trust level, computes reachable node sets
Privilege boundary detection — finds call edges where trust levels change (untrusted -> trusted)
Taint propagation — marks all nodes reachable from untrusted entrypoints

Results are stored as annotations and named subgraphs on the graph.
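
To make the passes concrete, here is a toy model of what blast radius, taint propagation, and privilege boundary detection compute on a small call graph. It is an illustrative sketch only and does not reproduce trailmark's implementation: the graph, the function names, and the per-node trust labels are all hypothetical.

```python
from collections import deque

# Hypothetical call graph: caller -> list of callees.
CALLS = {
    "http_handler": ["parse_request", "check_auth"],
    "parse_request": ["decode_body"],
    "check_auth": ["db_query"],
    "decode_body": [],
    "db_query": [],
    "cron_job": ["db_query"],
}
# Hypothetical trust labels; nodes not listed are treated as "trusted".
TRUST = {
    "http_handler": "untrusted",
    "parse_request": "untrusted",
    "decode_body": "untrusted",
}
UNTRUSTED_ENTRYPOINTS = ["http_handler"]


def reachable(graph, start):
    """Return every node reachable from `start` by following one or more edges."""
    seen, queue = set(), deque([start])
    while queue:
        for nxt in graph.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def reverse(graph):
    """Flip every edge so that callees point at their callers."""
    rev = {node: [] for node in graph}
    for src, dsts in graph.items():
        for dst in dsts:
            rev.setdefault(dst, []).append(src)
    return rev


def blast_radius(graph):
    """Count downstream (descendant) and upstream (ancestor) nodes per function."""
    rev = reverse(graph)
    return {
        node: {
            "downstream": len(reachable(graph, node)),
            "upstream": len(reachable(rev, node)),
        }
        for node in graph
    }


def tainted(graph, entrypoints):
    """Mark the untrusted entrypoints and every node reachable from them."""
    marked = set(entrypoints)
    for entry in entrypoints:
        marked |= reachable(graph, entry)
    return marked


def privilege_boundaries(graph, trust):
    """Return call edges whose two endpoints carry different trust levels."""
    return [
        (src, dst)
        for src, dsts in graph.items()
        for dst in dsts
        if trust.get(src, "trusted") != trust.get(dst, "trusted")
    ]


print(blast_radius(CALLS)["http_handler"])   # {'downstream': 4, 'upstream': 0}
print(privilege_boundaries(CALLS, TRUST))    # [('http_handler', 'check_auth')]
```

In this toy graph `cron_job` is the only node left untainted, and the single privilege boundary is the edge where the untrusted handler calls into trusted authentication code.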

For detailed documentation, see references/preanalysis-passes.md.

Supported Languages

| Language | `--language` value | Extensions |
|---|---|---|
| Python | `python` | `.py` |
| JavaScript | `javascript` | `.js`, `.jsx` |
| TypeScript | `typescript` | `.ts`, `.tsx` |
| PHP | `php` | `.php` |
| Ruby | `ruby` | `.rb` |
| C | `c` | `.c`, `.h` |
| C++ | `cpp` | `.cpp`, `.hpp`, `.cc`, `.hh`, `.cxx`, `.hxx` |
| C# | `c_sharp` | `.cs` |
| Java | `java` | `.java` |
| Go | `go` | `.go` |
| Rust | `rust` | `.rs` |
| Solidity | `solidity` | `.sol` |
| Cairo | `cairo` | `.cairo` |
| Haskell | `haskell` | `.hs` |
| Circom | `circom` | `.circom` |
| Erlang | `erlang` | `.erl` |
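
For polyglot repositories, the table can be turned into a lookup that decides which `--language` value each component needs. The sketch below only transcribes the table; the helper name `group_by_language` is hypothetical and trailmark itself may detect languages differently.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Extension -> `--language` value, transcribed from the table above.
LANGUAGE_BY_EXTENSION = {
    ".py": "python",
    ".js": "javascript", ".jsx": "javascript",
    ".ts": "typescript", ".tsx": "typescript",
    ".php": "php",
    ".rb": "ruby",
    ".c": "c", ".h": "c",
    ".cpp": "cpp", ".hpp": "cpp", ".cc": "cpp",
    ".hh": "cpp", ".cxx": "cpp", ".hxx": "cpp",
    ".cs": "c_sharp",
    ".java": "java",
    ".go": "go",
    ".rs": "rust",
    ".sol": "solidity",
    ".cairo": "cairo",
    ".hs": "haskell",
    ".circom": "circom",
    ".erl": "erlang",
}


def group_by_language(paths):
    """Group file paths by the `--language` value their extension maps to.

    Files with extensions outside the table are skipped.
    """
    groups = defaultdict(list)
    for path in paths:
        language = LANGUAGE_BY_EXTENSION.get(PurePosixPath(path).suffix.lower())
        if language is not None:
            groups[language].append(path)
    return dict(groups)


files = ["src/lib.rs", "web/app.tsx", "README.md", "ffi/bind.h"]
print(group_by_language(files))
# {'rust': ['src/lib.rs'], 'typescript': ['web/app.tsx'], 'c': ['ffi/bind.h']}
```

Note that `.h` maps to `c` per the table, so C++ projects that use `.h` headers would need a manual override.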

Graph Model

Node kinds: `function`, `method`, `class`, `module`, `struct`, `interface`, `trait`, `enum`, `namespace`, `contract`, `library`

Edge kinds: `calls`, `inherits`, `implements`, `contains`, `imports`

Edge confidence: `certain` (direct call, `self.method()`), `inferred` (attribute access on non-self object), `uncertain` (dynamic dispatch)

Per Code Unit

Parameters with types, return types, exception types
Cyclomatic complexity and branch metadata
Docstrings

Annotations: `assumption`, `precondition`, `postcondition`, `invariant`, `blast_radius`, `privilege_boundary`, `taint_propagation`
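
For orientation on the cyclomatic complexity figure recorded per code unit, the sketch below approximates it for a Python function using the standard `ast` module: one plus the number of decision points. This is a common textbook approximation and an assumption here; trailmark's exact counting rules may differ.

```python
import ast

# Node types counted as one decision point each in this approximation.
_BRANCH_NODES = (
    ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp, ast.comprehension,
)


def cyclomatic_complexity(func_source: str) -> int:
    """Approximate cyclomatic complexity as 1 + number of decision points."""
    tree = ast.parse(func_source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` has three operands and adds two extra paths.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def route(user, path):
    if user is None or path == "":
        return 400
    for rule in user.rules:
        if rule.matches(path):
            return 200
    return 403
'''

print(cyclomatic_complexity(SAMPLE))  # 5
```

The sample scores 5: the base path, two `if` statements, one `for` loop, and one `or`. A threshold such as `--complexity 10` flags functions with roughly twice that many decision points.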

Per Edge

Source/target node IDs, edge kind, confidence level
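
The per-edge fields can be pictured as a small record type. The classes below are a hypothetical model written for illustration, not trailmark's own data structures. The `path_confidence` helper adopts a conservative convention, assumed here, that a call path is only as reliable as its least confident edge.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    CERTAIN = "certain"
    INFERRED = "inferred"
    UNCERTAIN = "uncertain"


@dataclass(frozen=True)
class Edge:
    source: str          # source node ID
    target: str          # target node ID
    kind: str            # e.g. "calls", "inherits", "imports"
    confidence: Confidence


# Higher rank means more reliable.
_RANK = {Confidence.CERTAIN: 2, Confidence.INFERRED: 1, Confidence.UNCERTAIN: 0}


def path_confidence(edges):
    """Return the lowest confidence found on a non-empty sequence of edges."""
    return min((edge.confidence for edge in edges), key=_RANK.__getitem__)


path = [
    Edge("handler", "service.run", "calls", Confidence.CERTAIN),
    Edge("service.run", "plugin.execute", "calls", Confidence.UNCERTAIN),
]
print(path_confidence(path).value)  # uncertain
```

A security claim built on this path would need to state that the second hop depends on dynamic dispatch.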

Project Level

Dependencies (imported packages)
Entrypoints with trust levels and asset values
Named subgraphs (populated by pre-analysis)

Key Concepts

Declared contract vs. effective input domain: Trailmark separates what a function declares it accepts from what can actually reach it via call paths. Mismatches are where vulnerabilities hide:

Widening: Unconstrained data reaches a function that assumes validation
Safe by coincidence: No validation, but only safe callers exist today
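
A widening mismatch can be illustrated on a toy call graph: a sink assumes its input was validated, yet one call path reaches it without passing through the validator. The sketch below is a simplification under stated assumptions. The graph and function names are hypothetical, and validation is modelled as a node on the path rather than as real data flow.

```python
def unvalidated_paths(calls, entry, sink, validators):
    """Yield call paths from `entry` to `sink` that never pass through a validator."""
    stack = [(entry, [entry])]
    while stack:
        node, path = stack.pop()
        if node == sink:
            yield path
            continue
        for callee in calls.get(node, []):
            if callee in validators or callee in path:
                continue  # skip validated branches and cycles
            stack.append((callee, path + [callee]))


# Hypothetical graph: `write_file` assumes its argument was sanitized.
CALLS = {
    "api_upload": ["sanitize_name", "legacy_import"],
    "sanitize_name": ["write_file"],
    "legacy_import": ["write_file"],
    "write_file": [],
}

print(list(unvalidated_paths(CALLS, "api_upload", "write_file", {"sanitize_name"})))
# [['api_upload', 'legacy_import', 'write_file']]
```

The path through `legacy_import` is the widening case: unconstrained data reaches `write_file` even though the function's contract assumes sanitization.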

Edge confidence: Dynamic dispatch produces `uncertain` edges. Account for confidence when making security claims.

Subgraphs: Named collections of node IDs produced by pre-analysis. Query with `engine.subgraph("name")`. Available after `engine.preanalysis()`.
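
Because subgraphs exist only after pre-analysis, a small wrapper can enforce the ordering. The sketch below uses only the `preanalysis()`, `subgraph_names()`, and `subgraph()` calls named in this document. The helper `collect_subgraphs` and the `FakeEngine` stand-in are hypothetical, and the return types (a list of names, a set of node IDs) are assumptions.

```python
def collect_subgraphs(engine):
    """Run pre-analysis, then return {subgraph name: subgraph contents}."""
    engine.preanalysis()  # must run first; subgraphs are populated by this call
    return {name: engine.subgraph(name) for name in engine.subgraph_names()}


class FakeEngine:
    """Stand-in used only to make this sketch runnable without trailmark."""

    def __init__(self):
        self._subgraphs = {}

    def preanalysis(self):
        self._subgraphs = {"tainted": {"http_handler", "db_query"}}

    def subgraph_names(self):
        return list(self._subgraphs)

    def subgraph(self, name):
        return self._subgraphs[name]


print(sorted(collect_subgraphs(FakeEngine())["tainted"]))
# ['db_query', 'http_handler']
```

With a real engine, the same helper would be called on the object returned by `QueryEngine.from_directory(...)`.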

Query Patterns

See references/query-patterns.md for common security analysis patterns.

See references/preanalysis-passes.md for pre-analysis pass documentation.
