Trailmark
Parses source code into a directed graph of functions, classes, calls, and
semantic metadata for security analysis. Supports 16 languages.
When to Use
- Mapping call paths from user input to sensitive functions
- Finding complexity hotspots for audit prioritization
- Identifying attack surface and entrypoints
- Understanding call relationships in unfamiliar codebases
- Security review or audit preparation across polyglot projects
- Adding LLM-inferred annotations (assumptions, preconditions) to code units
- Pre-analysis before mutation testing (genotoxic skill) or diagramming
When NOT to Use
- Single-file scripts where call graph adds no value (read the file directly)
- Architecture diagrams not derived from code (use the `diagramming-code` skill or draw by hand)
- Mutation testing triage (use the genotoxic skill, which calls trailmark internally)
- Runtime behavior analysis (trailmark is static, not dynamic)
Rationalizations to Reject
| Rationalization | Why It's Wrong | Required Action |
|---|---|---|
| "I'll just read the source files manually" | Manual reading misses call paths, blast radius, and taint data | Install trailmark and use the API |
| "Pre-analysis isn't needed for a quick query" | Blast radius, taint, and privilege data are only available after `engine.preanalysis()` | Always run `engine.preanalysis()` before handing off to other skills |
| "The graph is too large, I'll sample" | Sampling misses cross-module attack paths | Build the full graph; use subgraph queries to focus |
| "Uncertain edges don't matter" | Dynamic dispatch is where type confusion bugs hide | Account for `uncertain` edges in security claims |
| "Single-language analysis is enough" | Polyglot repos have FFI boundaries where bugs cluster | Use the correct language setting for each component |
| "Complexity hotspots are the only thing worth checking" | Low-complexity functions on tainted paths are high-value targets | Combine complexity with taint and blast radius data |
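The last row can be made concrete with a small ranking sketch. This is not Trailmark code: the `FunctionFacts` record, its fields, and the scoring weights are hypothetical stand-ins for data you would collect from complexity, taint, and blast radius queries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionFacts:
    """Hypothetical per-function record; not a Trailmark type."""
    name: str
    complexity: int    # cyclomatic complexity
    tainted: bool      # reachable from an untrusted entrypoint
    blast_radius: int  # number of downstream functions


def audit_priority(facts: FunctionFacts) -> int:
    """Toy score: taint dominates, then blast radius, then complexity."""
    return (100 if facts.tainted else 0) + 2 * facts.blast_radius + facts.complexity


def rank(functions: list[FunctionFacts]) -> list[str]:
    """Return function names, highest audit priority first."""
    ordered = sorted(functions, key=audit_priority, reverse=True)
    return [f.name for f in ordered]


facts = [
    FunctionFacts("render_report", complexity=25, tainted=False, blast_radius=3),
    FunctionFacts("parse_header", complexity=2, tainted=True, blast_radius=8),
    FunctionFacts("log_line", complexity=1, tainted=False, blast_radius=0),
]
ranking = rank(facts)
print(ranking)
```

With these made-up weights, the low-complexity but tainted `parse_header` outranks the complex but untainted `render_report`, which is the point the table makes about complexity-only triage.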
Installation
MANDATORY: If `uv run trailmark` fails (command not found, import error, ModuleNotFoundError), install trailmark before doing anything else:
```bash
uv pip install trailmark
```
DO NOT fall back to "manual verification", "manual analysis", or reading source files by hand as a substitute for running trailmark. The tool must be installed and used programmatically. If installation fails, report the error to the user instead of silently switching to manual code reading.
Quick Start
```bash
# Python (default)
uv run trailmark analyze --summary {targetDir}

# Other languages
uv run trailmark analyze --language rust {targetDir}
uv run trailmark analyze --language javascript {targetDir}
uv run trailmark analyze --language go --summary {targetDir}

# Complexity hotspots
uv run trailmark analyze --complexity 10 {targetDir}
```
Programmatic API
```python
from trailmark.query.api import QueryEngine

# Specify language (defaults to "python")
engine = QueryEngine.from_directory("{targetDir}", language="rust")

engine.callers_of("function_name")
engine.callees_of("function_name")
engine.paths_between("entry_func", "db_query")
engine.complexity_hotspots(threshold=10)
engine.attack_surface()
engine.summary()
engine.to_json()

# Run pre-analysis (blast radius, entrypoints, privilege
# boundaries, taint propagation)
result = engine.preanalysis()

# Query subgraphs created by pre-analysis
engine.subgraph_names()
engine.subgraph("tainted")
engine.subgraph("high_blast_radius")
engine.subgraph("privilege_boundary")
engine.subgraph("entrypoint_reachable")

# Add LLM-inferred annotations
from trailmark.models import AnnotationKind

engine.annotate("function_name", AnnotationKind.ASSUMPTION,
                "input is URL-encoded", source="llm")

# Query annotations (including pre-analysis results)
engine.annotations_of("function_name")
engine.annotations_of("function_name",
                      kind=AnnotationKind.BLAST_RADIUS)
engine.annotations_of("function_name",
                      kind=AnnotationKind.TAINT_PROPAGATION)
```
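To make the caller, callee, and path queries less abstract, here is a standard-library sketch of what such queries compute on a call graph. The `CALLS` dictionary and the three helper functions are hypothetical illustrations. They borrow the method names used in this section but say nothing about how Trailmark implements them or what types it returns.

```python
# Toy call graph: caller -> list of callees. All names are made up.
CALLS: dict[str, list[str]] = {
    "entry_func": ["parse", "authorize"],
    "parse": ["db_query"],
    "authorize": ["db_query", "audit_log"],
    "db_query": [],
    "audit_log": [],
}


def callees_of(name: str) -> list[str]:
    """Functions called directly by `name`."""
    return list(CALLS.get(name, []))


def callers_of(name: str) -> list[str]:
    """Functions that call `name` directly."""
    return [caller for caller, callees in CALLS.items() if name in callees]


def paths_between(src: str, dst: str) -> list[list[str]]:
    """All simple (cycle-free) call paths from `src` to `dst`, found by DFS."""
    paths: list[list[str]] = []

    def walk(node: str, path: list[str]) -> None:
        if node == dst:
            paths.append(path)
            return
        for nxt in CALLS.get(node, []):
            if nxt not in path:  # skip nodes already on this path
                walk(nxt, path + [nxt])

    walk(src, [src])
    return paths


print(callers_of("db_query"))
print(paths_between("entry_func", "db_query"))
```

Enumerating every simple path is exponential in the worst case, so a sketch like this only suits small graphs or focused subgraphs.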
Pre-Analysis Passes
Always run `engine.preanalysis()` before handing off to genotoxic or `diagramming-code` skills. Pre-analysis enriches the graph with four passes:
- Blast radius estimation: counts downstream and upstream nodes per function, identifies critical high-complexity descendants
- Entry point enumeration: maps entrypoints by trust level, computes reachable node sets
- Privilege boundary detection: finds call edges where trust levels change (untrusted -> trusted)
- Taint propagation: marks all nodes reachable from untrusted entrypoints
Results are stored as annotations and named subgraphs on the graph.
For detailed documentation, see references/preanalysis-passes.md.
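The four passes can be approximated on a toy graph using only the standard library. This is a simplified sketch, not Trailmark's implementation: the `CALLS`, `TRUST`, and `ENTRYPOINTS` tables and every function name in them are hypothetical, per-node trust labels are an assumption made for brevity, and blast radius is reduced to a downstream count.

```python
from collections import deque

# Toy call graph and trust labels. All names are hypothetical.
CALLS = {
    "handle_request": ["parse_body", "check_auth"],
    "parse_body": ["decode"],
    "check_auth": ["load_user"],
    "decode": [],
    "load_user": [],
    "cron_cleanup": ["load_user"],
}
TRUST = {
    "handle_request": "untrusted",
    "parse_body": "untrusted",
    "decode": "untrusted",
    "check_auth": "trusted",
    "load_user": "trusted",
    "cron_cleanup": "trusted",
}
ENTRYPOINTS = {"handle_request": "untrusted", "cron_cleanup": "trusted"}


def reachable(start: str) -> set[str]:
    """Nodes reachable from `start` by following one or more call edges."""
    seen: set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in CALLS.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Pass 1: blast radius, here only the number of downstream nodes.
blast_radius = {name: len(reachable(name)) for name in CALLS}

# Pass 2: each entrypoint with its trust level and reachable node set.
entry_reach = {ep: (level, reachable(ep)) for ep, level in ENTRYPOINTS.items()}

# Pass 3: call edges whose two endpoints carry different trust labels.
privilege_boundaries = [
    (src, dst)
    for src, dsts in CALLS.items()
    for dst in dsts
    if TRUST[src] != TRUST[dst]
]

# Pass 4: taint, here the untrusted entrypoints plus everything they reach.
tainted: set[str] = set()
for ep, level in ENTRYPOINTS.items():
    if level == "untrusted":
        tainted |= {ep} | reachable(ep)

print(blast_radius, privilege_boundaries, sorted(tainted))
```

In this toy graph `load_user` ends up tainted even though it sits behind `check_auth`, which shows why taint and privilege boundary results are worth reading together.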
Supported Languages
| Language | | Extensions |
|---|---|---|
| Python | | |
| JavaScript | | |
| TypeScript | | |
| PHP | | |
| Ruby | | |
| C | | |
| C++ | | |
| C# | | |
| Java | | |
| Go | | |
| Rust | | |
| Solidity | | |
| Cairo | | |
| Haskell | | |
| Circom | | |
| Erlang | | |
Graph Model
Node kinds: `function`, `method`, `class`, `module`, `struct`, `interface`, `trait`, `enum`, `namespace`, `contract`, `library`

Edge kinds: `calls`, `inherits`, `implements`, `contains`, `imports`

Edge confidence: `certain` (direct call, `self.method()`), `inferred` (attribute access on non-self object), `uncertain` (dynamic dispatch)
Per Code Unit
- Parameters with types, return types, exception types
- Cyclomatic complexity and branch metadata
- Docstrings
- Annotations: `assumption`, `precondition`, `postcondition`, `invariant`, `blast_radius`, `privilege_boundary`, `taint_propagation`
Per Edge
- Source/target node IDs, edge kind, confidence level
Project Level
- Dependencies (imported packages)
- Entrypoints with trust levels and asset values
- Named subgraphs (populated by pre-analysis)
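One way to picture the graph model is as a handful of plain records. The sketch below is illustrative only: the class names, field names, and the `calls_from` helper are assumptions chosen to echo the description above, and Trailmark's real classes may look quite different.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Confidence(Enum):
    CERTAIN = "certain"      # direct call or self.method()
    INFERRED = "inferred"    # attribute access on a non-self object
    UNCERTAIN = "uncertain"  # dynamic dispatch


@dataclass
class CodeUnit:
    """Illustrative node record; not Trailmark's actual class."""
    id: str
    kind: str  # "function", "method", "class", ...
    parameters: dict[str, str] = field(default_factory=dict)  # name -> type
    return_type: Optional[str] = None
    cyclomatic_complexity: int = 1
    annotations: list[tuple[str, str]] = field(default_factory=list)  # (kind, text)


@dataclass(frozen=True)
class Edge:
    source: str  # source node ID
    target: str  # target node ID
    kind: str    # "calls", "inherits", "implements", "contains", "imports"
    confidence: Confidence


@dataclass
class Graph:
    nodes: dict[str, CodeUnit] = field(default_factory=dict)
    edges: list[Edge] = field(default_factory=list)
    subgraphs: dict[str, set[str]] = field(default_factory=dict)

    def calls_from(self, node_id: str, skip_uncertain: bool = False) -> list[str]:
        """Targets of `calls` edges leaving `node_id`."""
        return [
            e.target
            for e in self.edges
            if e.source == node_id
            and e.kind == "calls"
            and not (skip_uncertain and e.confidence is Confidence.UNCERTAIN)
        ]


graph = Graph()
graph.nodes["Handler.run"] = CodeUnit("Handler.run", "method", {"request": "Request"})
graph.edges.append(Edge("Handler.run", "Handler.validate", "calls", Confidence.CERTAIN))
graph.edges.append(Edge("Handler.run", "plugin.execute", "calls", Confidence.UNCERTAIN))
graph.subgraphs["tainted"] = {"Handler.run", "Handler.validate"}

print(graph.calls_from("Handler.run"))
print(graph.calls_from("Handler.run", skip_uncertain=True))
```

Skipping uncertain edges gives a more precise but less complete view. For security claims, keeping them and reporting their confidence is the more conservative choice.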
Key Concepts
Declared contract vs. effective input domain: Trailmark separates what a
function declares it accepts from what can actually reach it via call
paths. Mismatches are where vulnerabilities hide:
- Widening: Unconstrained data reaches a function that assumes validation
- Safe by coincidence: No validation, but only safe callers exist today
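The widening case can be sketched as a path search that tracks whether a validator has run earlier on the call path. Everything here is hypothetical: the `CALLS`, `VALIDATES`, and `ASSUMES_VALIDATED` tables are hand-written stand-ins for facts you would derive from annotations and call paths, and treating validation as a per-path flag is a deliberate simplification.

```python
# Toy call graph. All names are made up.
CALLS = {
    "http_handler": ["sanitize", "render"],  # render is called without sanitizing
    "sanitize": ["store"],
    "render": [],
    "store": [],
}
VALIDATES = {"sanitize"}                 # functions that validate their input
ASSUMES_VALIDATED = {"render", "store"}  # declared contract: input is already validated


def widening(entry: str) -> set[str]:
    """Functions that assume validation but are reachable on an unvalidated path."""
    flagged: set[str] = set()
    seen: set[tuple[str, bool]] = set()
    stack = [(entry, False)]  # (function, validated earlier on this path?)
    while stack:
        node, validated = stack.pop()
        if (node, validated) in seen:
            continue
        seen.add((node, validated))
        if node in ASSUMES_VALIDATED and not validated:
            flagged.add(node)
        validated_after = validated or node in VALIDATES
        for callee in CALLS.get(node, []):
            stack.append((callee, validated_after))
    return flagged


print(widening("http_handler"))
```

Here `store` is only reached through `sanitize`, so its declared contract holds, while `render` is reached directly from the handler and is flagged.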
Edge confidence: Dynamic dispatch produces `uncertain` edges. Account for confidence when making security claims.

Subgraphs: Named collections of node IDs produced by pre-analysis. Query with `engine.subgraph("name")`. Available after `engine.preanalysis()`.
Query Patterns
See references/query-patterns.md for common
security analysis patterns.
See references/preanalysis-passes.md for
pre-analysis pass documentation.