xrefs
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseTrigger Intents
触发场景
Use this skill when user asks:
- "Who calls this?" / "What does this call?"
- "Where is this string/import referenced?"
- "Show call graph dependencies."
Route to:
- for candidate entity lookup by name/pattern before relationship analysis
grep - for broader triage context
analysis - for semantic interpretation after graph narrowing
decompiler - for instruction-level call-site proof
disassembly
当用户提出以下问题时使用此技能:
- "谁调用了这个?" / "这个调用了什么?"
- "这个字符串/导入项在哪里被引用?"
- "展示调用图依赖关系。"
路由至:
- 进行关系分析前,使用按名称/模式查找候选实体
grep - 使用获取更广泛的分类排查上下文
analysis - 调用图范围缩小后,使用进行语义解读
decompiler - 使用获取指令级调用站点验证
disassembly
Do This First (Warm-Start Sequence)
优先执行步骤(预热流程)
sql
-- 1) Core relation volume
SELECT COUNT(*) AS xref_count FROM xrefs;
-- 2) Top imports (dependency hints)
SELECT module, COUNT(*) AS import_count
FROM imports
GROUP BY module
ORDER BY import_count DESC;
-- 3) Most called functions
SELECT printf('0x%X', to_ea) AS callee, COUNT(*) AS callers
FROM xrefs
WHERE is_code = 1
GROUP BY to_ea
ORDER BY callers DESC
LIMIT 20;Interpretation guidance:
- Use relation counts to prioritize hotspots before expensive deep analysis.
- Prefer indexed filters (/
to_ea) for fast response.from_ea
sql
-- 1) 核心关联数量
SELECT COUNT(*) AS xref_count FROM xrefs;
-- 2) 热门导入项(依赖提示)
SELECT module, COUNT(*) AS import_count
FROM imports
GROUP BY module
ORDER BY import_count DESC;
-- 3) 被调用次数最多的函数
SELECT printf('0x%X', to_ea) AS callee, COUNT(*) AS callers
FROM xrefs
WHERE is_code = 1
GROUP BY to_ea
ORDER BY callers DESC
LIMIT 20;解读指南:
- 在进行高开销深度分析前,使用关联数量优先处理热点内容。
- 优先使用索引过滤器(/
to_ea)以获得快速响应。from_ea
Failure and Recovery
故障与恢复
- Full-scan query too slow:
- Add or
to_ea = Xconstraints.from_ea = X
- Add
- Target unresolved by name:
- Resolve/verify address first through the table or explicit EA literals.
names
- Resolve/verify address first through the
- Sparse results:
- Pivot through ,
imports, orstringsjoins.disasm_calls
- Pivot through
- 全量扫描查询过慢:
- 添加或
to_ea = X约束条件。from_ea = X
- 添加
- 无法通过名称定位目标:
- 先通过表或明确的EA字面量解析/验证地址。
names
- 先通过
- 结果稀疏:
- 通过、
imports或strings关联表进行转查。disasm_calls
- 通过
Handoff Patterns
移交模式
- ->
xrefsfor top candidate function semantics.decompiler - ->
xrefsfor campaign-level synthesis.analysis - ->
xrefsto persist relationship findings.annotations
- ->
xrefs:获取候选函数的语义信息。decompiler - ->
xrefs:进行全局范围的综合分析。analysis - ->
xrefs:保存关联关系分析结果。annotations
xrefs
xrefs
Cross-references - the most important table for understanding code relationships.
| Column | Type | Description |
|---|---|---|
| INT | Source address (who references) |
| INT | Target address (what is referenced) |
| INT | Xref type code |
| INT | 1=code xref (call/jump), 0=data xref |
| INT | Pre-computed containing function address (0 when not in a function) |
sql
-- Who calls function at 0x401000?
SELECT printf('0x%X', from_ea) as caller FROM xrefs WHERE to_ea = 0x401000 AND is_code = 1;
-- What does function at 0x401000 reference?
SELECT printf('0x%X', to_ea) as target FROM xrefs WHERE from_ea >= 0x401000 AND from_ea < 0x401100;交叉引用——理解代码关系最重要的表。
| 列 | 类型 | 描述 |
|---|---|---|
| INT | 源地址(引用方) |
| INT | 目标地址(被引用方) |
| INT | 交叉引用类型代码 |
| INT | 1=代码交叉引用(调用/跳转),0=数据交叉引用 |
| INT | 预计算的所属函数地址(不在函数内时为0) |
sql
-- 谁调用了地址为0x401000的函数?
SELECT printf('0x%X', from_ea) as caller FROM xrefs WHERE to_ea = 0x401000 AND is_code = 1;
-- 地址为0x401000的函数引用了什么?
SELECT printf('0x%X', to_ea) as target FROM xrefs WHERE from_ea >= 0x401000 AND from_ea < 0x401100;imports
imports
Imported functions from external libraries.
| Column | Type | Description |
|---|---|---|
| INT | Import address (IAT entry) |
| TEXT | Import name |
| TEXT | Module/DLL name |
| INT | Import ordinal |
sql
-- Imports from kernel32.dll
SELECT name FROM imports WHERE module LIKE '%kernel32%';来自外部库的导入函数。
| 列 | 类型 | 描述 |
|---|---|---|
| INT | 导入地址(IAT条目) |
| TEXT | 导入名称 |
| TEXT | 模块/DLL名称 |
| INT | 导入序号 |
sql
-- 来自kernel32.dll的导入项
SELECT name FROM imports WHERE module LIKE '%kernel32%';Convenience Views
便捷视图
callers
callers
Who calls each function. Caller names are resolved by the view from the and tables.
funcsnames| Column | Type | Description |
|---|---|---|
| INT | Target function address |
| INT | Xref source address |
| TEXT | Calling function name |
| INT | Calling function start (from |
Underlying query:
sql
SELECT x.to_ea as func_addr, x.from_ea as caller_addr,
COALESCE((SELECT name FROM names WHERE address = x.from_func LIMIT 1), printf('sub_%X', x.from_func)) as caller_name,
x.from_func as caller_func_addr
FROM xrefs x WHERE x.is_code = 1 AND x.from_func != 0sql
-- Who calls function at 0x401000?
SELECT caller_name, printf('0x%X', caller_addr) as from_addr
FROM callers WHERE func_addr = 0x401000;
-- Most called functions
SELECT printf('0x%X', func_addr) as addr, COUNT(*) as callers
FROM callers GROUP BY func_addr ORDER BY callers DESC LIMIT 10;每个函数的调用方。该视图通过和表解析调用方名称。
funcsnames| 列 | 类型 | 描述 |
|---|---|---|
| INT | 目标函数地址 |
| INT | 交叉引用源地址 |
| TEXT | 调用函数名称 |
| INT | 调用函数起始地址(来自 |
底层查询:
sql
SELECT x.to_ea as func_addr, x.from_ea as caller_addr,
COALESCE((SELECT name FROM names WHERE address = x.from_func LIMIT 1), printf('sub_%X', x.from_func)) as caller_name,
x.from_func as caller_func_addr
FROM xrefs x WHERE x.is_code = 1 AND x.from_func != 0sql
-- 谁调用了地址为0x401000的函数?
SELECT caller_name, printf('0x%X', caller_addr) as from_addr
FROM callers WHERE func_addr = 0x401000;
-- 被调用次数最多的函数
SELECT printf('0x%X', func_addr) as addr, COUNT(*) as callers
FROM callers GROUP BY func_addr ORDER BY callers DESC LIMIT 10;callees
callees
What each function calls. Inverse of callers view. Uses for efficient function-level grouping.
from_func| Column | Type | Description |
|---|---|---|
| INT | Calling function address (from |
| TEXT | Calling function name |
| INT | Called address |
| TEXT | Called function/symbol name |
sql
-- What does main call?
SELECT callee_name, printf('0x%X', callee_addr) as addr
FROM callees WHERE func_name LIKE '%main%';
-- Functions making most calls
SELECT func_name, COUNT(*) as call_count
FROM callees GROUP BY func_addr ORDER BY call_count DESC LIMIT 10;每个函数调用的对象。与callers视图相反。使用进行高效的函数级分组。
from_func| 列 | 类型 | 描述 |
|---|---|---|
| INT | 调用函数地址(来自 |
| TEXT | 调用函数名称 |
| INT | 被调用地址 |
| TEXT | 被调用函数/符号名称 |
sql
-- main函数调用了什么?
SELECT callee_name, printf('0x%X', callee_addr) as addr
FROM callees WHERE func_name LIKE '%main%';
-- 调用次数最多的函数
SELECT func_name, COUNT(*) as call_count
FROM callees GROUP BY func_addr ORDER BY call_count DESC LIMIT 10;string_refs
string_refs
Pre-joined view of string cross-references with containing function info.
| Column | Type | Description |
|---|---|---|
| INT | Address of the string |
| TEXT | String content |
| INT | String length |
| INT | Address of the referencing instruction |
| INT | Containing function address |
| TEXT | Containing function name |
sql
-- Strings referenced by a specific function
SELECT string_value, func_name FROM string_refs WHERE func_addr = 0x401000;
-- Find functions referencing password-related strings
SELECT string_value, func_name FROM string_refs WHERE string_value LIKE '%password%';
-- Most referenced strings
SELECT string_value, COUNT(*) as ref_count
FROM string_refs GROUP BY string_addr ORDER BY ref_count DESC LIMIT 10;字符串交叉引用与所属函数信息的预关联视图。
| 列 | 类型 | 描述 |
|---|---|---|
| INT | 字符串地址 |
| TEXT | 字符串内容 |
| INT | 字符串长度 |
| INT | 引用指令的地址 |
| INT | 所属函数地址 |
| TEXT | 所属函数名称 |
sql
-- 特定函数引用的字符串
SELECT string_value, func_name FROM string_refs WHERE func_addr = 0x401000;
-- 查找引用密码相关字符串的函数
SELECT string_value, func_name FROM string_refs WHERE string_value LIKE '%password%';
-- 被引用次数最多的字符串
SELECT string_value, COUNT(*) as ref_count
FROM string_refs GROUP BY string_addr ORDER BY ref_count DESC LIMIT 10;data_refs
data_refs
Cached table of data (non-code) cross-references with containing function info.
Whole-program aggregates are supported.
| Column | Type | Description |
|---|---|---|
| INT | Source address of the reference |
| INT | Target data address |
| INT | Containing function address |
| TEXT | Containing function name |
| INT | Xref type code |
sql
-- Data references from a specific function
SELECT * FROM data_refs WHERE from_func_addr = 0x401000;
-- Functions with most data references
SELECT from_func_name, COUNT(*) as data_ref_count
FROM data_refs GROUP BY from_func_addr ORDER BY data_ref_count DESC LIMIT 10;包含所属函数信息的数据(非代码)交叉引用缓存表。支持全程序聚合查询。
| 列 | 类型 | 描述 |
|---|---|---|
| INT | 引用的源地址 |
| INT | 目标数据地址 |
| INT | 所属函数地址 |
| TEXT | 所属函数名称 |
| INT | 交叉引用类型代码 |
sql
-- 特定函数的引用数据
SELECT * FROM data_refs WHERE from_func_addr = 0x401000;
-- 数据引用次数最多的函数
SELECT from_func_name, COUNT(*) as data_ref_count
FROM data_refs GROUP BY from_func_addr ORDER BY data_ref_count DESC LIMIT 10;call_graph
call_graph
Table-valued function for BFS call graph traversal. Uses HIDDEN parameters for traversal control.
| Column | Type | Description |
|---|---|---|
| INT | Function address in the graph |
| TEXT | Function name |
| INT | BFS depth from start |
| INT | Parent function address in the traversal |
| INT | HIDDEN — Starting function address |
| TEXT | HIDDEN — |
| INT | HIDDEN — Maximum traversal depth |
Always provide WHERE constraints for all 3 hidden parameters.
sql
-- Forward call tree from main
SELECT func_name, depth FROM call_graph
WHERE start = (SELECT address FROM funcs WHERE name = 'main')
AND direction = 'down' AND max_depth = 5;
-- All transitive callers
SELECT func_name, depth FROM call_graph
WHERE start = 0x405000 AND direction = 'up' AND max_depth = 10;
-- Bidirectional exploration
SELECT func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'both' AND max_depth = 3;
-- Join with string_refs to find strings reachable from a function
SELECT DISTINCT sr.string_value, sr.func_name
FROM call_graph cg
JOIN string_refs sr ON sr.func_addr = cg.func_addr
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 3;
-- Imported APIs reachable from a function's call tree
SELECT DISTINCT i.module, i.name as api
FROM call_graph cg
JOIN disasm_calls dc ON dc.func_addr = cg.func_addr
JOIN imports i ON dc.callee_addr = i.address
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 5
ORDER BY i.module, i.name;Performance: BFS with visited set. O(reachable functions). Always constrain hidden params.
Use this pattern when the destination is an import.
用于BFS调用图遍历的表值函数。使用隐藏参数控制遍历。
| 列 | 类型 | 描述 |
|---|---|---|
| INT | 图中的函数地址 |
| TEXT | 函数名称 |
| INT | 与起始点的BFS深度 |
| INT | 遍历中的父函数地址 |
| INT | 隐藏 — 起始函数地址 |
| TEXT | 隐藏 — |
| INT | 隐藏 — 最大遍历深度 |
必须为所有3个隐藏参数提供WHERE约束。
sql
-- 从main函数出发的正向调用树
SELECT func_name, depth FROM call_graph
WHERE start = (SELECT address FROM funcs WHERE name = 'main')
AND direction = 'down' AND max_depth = 5;
-- 所有间接调用方
SELECT func_name, depth FROM call_graph
WHERE start = 0x405000 AND direction = 'up' AND max_depth = 10;
-- 双向探索
SELECT func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'both' AND max_depth = 3;
-- 关联string_refs查找函数可访问的字符串
SELECT DISTINCT sr.string_value, sr.func_name
FROM call_graph cg
JOIN string_refs sr ON sr.func_addr = cg.func_addr
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 3;
-- 函数调用树可访问的导入API
SELECT DISTINCT i.module, i.name as api
FROM call_graph cg
JOIN disasm_calls dc ON dc.func_addr = cg.func_addr
JOIN imports i ON dc.callee_addr = i.address
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 5
ORDER BY i.module, i.name;性能:带访问集合的BFS。时间复杂度为O(可访问函数数量)。必须约束隐藏参数。
当目标是导入项时使用此模式。
shortest_path
shortest_path
Table-valued function for finding the shortest call path between two functions. Uses bidirectional BFS.
| Column | Type | Description |
|---|---|---|
| INT | Step number in the path (0 = source) |
| INT | Function address at this step |
| TEXT | Function name at this step |
| INT | HIDDEN — Source function address |
| INT | HIDDEN — Destination function address |
| INT | HIDDEN — Maximum search depth |
Always provide WHERE constraints for all 3 hidden parameters.
Both endpoints must resolve to functions. Imported API addresses from
are not valid endpoints.
importsshortest_pathsql
-- Find shortest call path between two functions
SELECT step, func_name FROM shortest_path
WHERE from_addr = (SELECT address FROM funcs WHERE name = 'main')
AND to_addr = (SELECT address FROM funcs WHERE name = 'target_func')
AND max_depth = 20;
-- Check reachability between two functions
SELECT COUNT(*) > 0 as reachable FROM shortest_path
WHERE from_addr = 0x401000 AND to_addr = 0x405000 AND max_depth = 20;
-- Annotate path steps with call count and string refs
SELECT sp.step, sp.func_name,
(SELECT COUNT(*) FROM disasm_calls dc WHERE dc.func_addr = sp.func_addr) as calls_made,
(SELECT COUNT(*) FROM string_refs sr WHERE sr.func_addr = sp.func_addr) as strings_used
FROM shortest_path sp
WHERE sp.from_addr = 0x401000 AND sp.to_addr = 0x405000 AND sp.max_depth = 20
ORDER BY sp.step;Performance: Bidirectional BFS. O(b^(d/2)) where b is branching factor and d is path length. Returns empty result set if no path exists within max_depth.
用于查找两个函数间最短调用路径的表值函数。使用双向BFS。
| 列 | 类型 | 描述 |
|---|---|---|
| INT | 路径中的步骤编号(0 = 源点) |
| INT | 该步骤的函数地址 |
| TEXT | 该步骤的函数名称 |
| INT | 隐藏 — 源函数地址 |
| INT | 隐藏 — 目标函数地址 |
| INT | 隐藏 — 最大搜索深度 |
必须为所有3个隐藏参数提供WHERE约束。
两个端点必须解析为函数。来自的导入API地址不能作为的端点。
importsshortest_pathsql
-- 查找两个函数间的最短调用路径
SELECT step, func_name FROM shortest_path
WHERE from_addr = (SELECT address FROM funcs WHERE name = 'main')
AND to_addr = (SELECT address FROM funcs WHERE name = 'target_func')
AND max_depth = 20;
-- 检查两个函数间的可达性
SELECT COUNT(*) > 0 as reachable FROM shortest_path
WHERE from_addr = 0x401000 AND to_addr = 0x405000 AND max_depth = 20;
-- 为路径步骤添加调用次数和字符串引用注释
SELECT sp.step, sp.func_name,
(SELECT COUNT(*) FROM disasm_calls dc WHERE dc.func_addr = sp.func_addr) as calls_made,
(SELECT COUNT(*) FROM string_refs sr WHERE sr.func_addr = sp.func_addr) as strings_used
FROM shortest_path sp
WHERE sp.from_addr = 0x401000 AND sp.to_addr = 0x405000 AND sp.max_depth = 20
ORDER BY sp.step;性能:双向BFS。时间复杂度为O(b^(d/2)),其中b是分支因子,d是路径长度。如果在max_depth内不存在路径,则返回空结果集。
Table-First Cross-Reference Queries
基于表的交叉引用查询示例
sql
-- Incoming references to an address
SELECT from_ea, to_ea, type, is_code, from_func
FROM xrefs
WHERE to_ea = 0x401000;
-- Exact outgoing references from an item address
SELECT from_ea, to_ea, type, is_code, from_func
FROM xrefs
WHERE from_ea = 0x401000;
-- Outgoing references from anywhere inside a function
SELECT from_ea, to_ea, type, is_code, from_func
FROM xrefs
WHERE from_func = 0x401000;sql
-- 某个地址的传入引用
SELECT from_ea, to_ea, type, is_code, from_func
FROM xrefs
WHERE to_ea = 0x401000;
-- 某个项目地址的精确传出引用
SELECT from_ea, to_ea, type, is_code, from_func
FROM xrefs
WHERE from_ea = 0x401000;
-- 某个函数内部所有位置的传出引用
SELECT from_ea, to_ea, type, is_code, from_func
FROM xrefs
WHERE from_func = 0x401000;grep
grep
Use to resolve internal symbols, types, and members before doing relationship analysis. Use when the callee may exist only as an imported API.
Canonical usage lives in .
For canonical schema and owner mapping, see ().
grepimports../grep/SKILL.md../connect/references/schema-catalog.mdgrepsql
-- Resolve internal functions with grep
SELECT name, kind, address
FROM grep
WHERE pattern = 'main%' AND kind = 'function'
ORDER BY name;
-- Resolve imported APIs with imports
SELECT module, name, address
FROM imports
WHERE name LIKE 'CreateFile%'
ORDER BY module, name;
-- Then pivot into callers/callees/xrefs
SELECT caller_name, printf('0x%X', caller_addr) AS from_addr
FROM callers
WHERE func_addr = (
SELECT address
FROM imports
WHERE name = 'CreateFileW'
ORDER BY name
LIMIT 1
);在进行关系分析前,使用解析内部符号、类型和成员。当被调用方可能仅作为导入API存在时,使用。
标准用法请参考。
关于标准模式和所有者映射,请查看(部分)。
grepimports../grep/SKILL.md../connect/references/schema-catalog.mdgrepsql
-- 使用grep解析内部函数
SELECT name, kind, address
FROM grep
WHERE pattern = 'main%' AND kind = 'function'
ORDER BY name;
-- 使用imports解析导入API
SELECT module, name, address
FROM imports
WHERE name LIKE 'CreateFile%'
ORDER BY module, name;
-- 转查调用方/被调用方/xrefs
SELECT caller_name, printf('0x%X', caller_addr) AS from_addr
FROM callers
WHERE func_addr = (
SELECT address
FROM imports
WHERE name = 'CreateFileW'
ORDER BY name
LIMIT 1
);Performance Rules
性能规则
Constraint Pushdown
约束下推
The table has optimized filters using efficient IDA SDK APIs:
xrefs| Filter | Cost | Behavior |
|---|---|---|
| 0.5 | O(xrefs to X) — fast, uses IDA's xref index |
| 0.5 | O(xrefs from X) — fast, uses IDA's xref index |
| 1.0 | O(callees of X) — uses XrefsFromFuncIterator |
No equality filter on | — | Falls back to a cache of xrefs to function entry points only — avoid relying on it for complete import/data/non-function coverage |
Always filter xrefs by , , or . Avoid unconstrained scans.
to_eafrom_eafrom_funcsql
-- FAST: xrefs to a specific target
SELECT * FROM xrefs WHERE to_ea = 0x401000;
-- FAST: xrefs from a specific source
SELECT * FROM xrefs WHERE from_ea = 0x401000;
-- FAST: all xrefs originating from a function
SELECT * FROM xrefs WHERE from_func = 0x401000;
-- INCOMPLETE/avoid: unconstrained scan falls back to a function-entry cache
SELECT * FROM xrefs WHERE is_code = 1;xrefs| 过滤器 | 开销 | 行为 |
|---|---|---|
| 0.5 | O(指向X的交叉引用数量) — 快速,使用IDA的交叉引用索引 |
| 0.5 | O(来自X的交叉引用数量) — 快速,使用IDA的交叉引用索引 |
| 1.0 | O(X的被调用方数量) — 使用XrefsFromFuncIterator |
未对 | — | 回退到仅包含函数入口点的交叉引用缓存 — 避免依赖它获取完整的导入/数据/非函数覆盖范围 |
始终通过、或过滤xrefs。避免无约束扫描。
to_eafrom_eafrom_funcsql
-- 快速:指向特定目标的交叉引用
SELECT * FROM xrefs WHERE to_ea = 0x401000;
-- 快速:来自特定源的交叉引用
SELECT * FROM xrefs WHERE from_ea = 0x401000;
-- 快速:来自某个函数的所有交叉引用
SELECT * FROM xrefs WHERE from_func = 0x401000;
-- 不完整/避免:无约束扫描回退到函数入口缓存
SELECT * FROM xrefs WHERE is_code = 1;Common Xref Patterns
常见交叉引用模式
Find Most Called Functions
查找被调用次数最多的函数
sql
SELECT f.name, COUNT(*) as caller_count
FROM funcs f
JOIN xrefs x ON f.address = x.to_ea
WHERE x.is_code = 1
GROUP BY f.address
ORDER BY caller_count DESC
LIMIT 10;sql
SELECT f.name, COUNT(*) as caller_count
FROM funcs f
JOIN xrefs x ON f.address = x.to_ea
WHERE x.is_code = 1
GROUP BY f.address
ORDER BY caller_count DESC
LIMIT 10;Find Functions Calling a Specific API
查找调用特定API的函数
sql
SELECT DISTINCT (SELECT name FROM funcs WHERE from_ea >= address AND from_ea < end_ea LIMIT 1) as caller
FROM xrefs
WHERE to_ea = (SELECT address FROM imports WHERE name = 'CreateFileW');sql
SELECT DISTINCT (SELECT name FROM funcs WHERE from_ea >= address AND from_ea < end_ea LIMIT 1) as caller
FROM xrefs
WHERE to_ea = (SELECT address FROM imports WHERE name = 'CreateFileW');String Cross-Reference Analysis
字符串交叉引用分析
sql
SELECT s.content, (SELECT name FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) as used_by
FROM strings s
JOIN xrefs x ON s.address = x.to_ea
WHERE s.content LIKE '%password%';sql
SELECT s.content, (SELECT name FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) as used_by
FROM strings s
JOIN xrefs x ON s.address = x.to_ea
WHERE s.content LIKE '%password%';Import Dependency Map
导入依赖映射
sql
-- Which modules does each function depend on?
SELECT f.name as func_name, i.module, COUNT(*) as api_count
FROM funcs f
JOIN disasm_calls dc ON dc.func_addr = f.address
JOIN imports i ON dc.callee_addr = i.address
GROUP BY f.address, i.module
ORDER BY f.name, api_count DESC;sql
-- 每个函数依赖哪些模块?
SELECT f.name as func_name, i.module, COUNT(*) as api_count
FROM funcs f
JOIN disasm_calls dc ON dc.func_addr = f.address
JOIN imports i ON dc.callee_addr = i.address
GROUP BY f.address, i.module
ORDER BY f.name, api_count DESC;Data Section References
数据段引用
sql
-- Functions referencing data sections
SELECT
f.name,
s.name as segment,
COUNT(*) as data_refs
FROM funcs f
JOIN xrefs x ON x.from_ea BETWEEN f.address AND f.end_ea
JOIN segments s ON x.to_ea BETWEEN s.start_ea AND s.end_ea
WHERE s.class = 'DATA' AND x.is_code = 0
GROUP BY f.address, s.name
ORDER BY data_refs DESC
LIMIT 20;sql
-- 引用数据段的函数
SELECT
f.name,
s.name as segment,
COUNT(*) as data_refs
FROM funcs f
JOIN xrefs x ON x.from_ea BETWEEN f.address AND f.end_ea
JOIN segments s ON x.to_ea BETWEEN s.start_ea AND s.end_ea
WHERE s.class = 'DATA' AND x.is_code = 0
GROUP BY f.address, s.name
ORDER BY data_refs DESC
LIMIT 20;Advanced Xref Patterns (CTEs and Recursive Queries)
高级交叉引用模式(CTE与递归查询)
Preferover manual CTEs. Thecall_graphtable uses C++ BFS with a visited set — correct BFS depths, no duplicate expansion on diamond-shaped call graphs, and early termination.call_graph
Before (recursive CTE — no visited tracking, exponential on diamond graphs):
sql
WITH RECURSIVE call_chain(root, current_func, depth) AS (
SELECT 0x401000, callee_addr, 1
FROM disasm_calls WHERE func_addr = 0x401000
UNION ALL
SELECT cc.root, dc.callee_addr, cc.depth + 1
FROM call_chain cc
JOIN disasm_calls dc ON dc.func_addr = cc.current_func
WHERE cc.depth < 10
)
SELECT DISTINCT current_func, MIN(depth) FROM call_chain GROUP BY current_func;After (call_graph table — C++ BFS with visited set, correct depths):
sql
SELECT func_addr, func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'down' AND max_depth = 10;优先使用而非手动CTE。call_graph表使用带访问集合的C++ BFS — 能正确计算BFS深度,避免菱形调用图中的重复扩展,并支持提前终止。call_graph
之前的方式(递归CTE — 无访问跟踪,菱形图上呈指数级增长):
sql
WITH RECURSIVE call_chain(root, current_func, depth) AS (
SELECT 0x401000, callee_addr, 1
FROM disasm_calls WHERE func_addr = 0x401000
UNION ALL
SELECT cc.root, dc.callee_addr, cc.depth + 1
FROM call_chain cc
JOIN disasm_calls dc ON dc.func_addr = cc.current_func
WHERE cc.depth < 10
)
SELECT DISTINCT current_func, MIN(depth) FROM call_chain GROUP BY current_func;优化后的方式(call_graph表 — 带访问集合的C++ BFS,深度计算正确):
sql
SELECT func_addr, func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'down' AND max_depth = 10;Recursive Call Graph — Forward Traversal
递归调用图 — 正向遍历
Note: For targeted traversal, prefer thetable which uses C++ BFS with visited tracking. Use recursive CTEs only when you need custom join logic (e.g., filtering by callee properties at each step).call_graph
Find all functions reachable from a starting function (up to depth 5):
sql
-- Preferred: use call_graph table
SELECT func_name, depth FROM call_graph
WHERE start = (SELECT address FROM funcs WHERE name = 'main')
AND direction = 'down' AND max_depth = 5;
-- Manual CTE (when custom filtering is needed at each step):
WITH RECURSIVE cg AS (
SELECT address as func_addr, name, 0 as depth
FROM funcs WHERE name = 'main'
UNION ALL
SELECT f.address, f.name, cg.depth + 1
FROM cg
JOIN disasm_calls dc ON dc.func_addr = cg.func_addr
JOIN funcs f ON f.address = dc.callee_addr
WHERE cg.depth < 5
AND dc.callee_addr != 0
)
SELECT DISTINCT func_addr, name, MIN(depth) as min_depth
FROM cg
GROUP BY func_addr
ORDER BY min_depth, name;注意: 对于定向遍历,优先使用带访问跟踪的C++ BFS实现的表。仅当需要自定义关联逻辑(例如,在每一步过滤被调用方属性)时,才使用递归CTE。call_graph
查找从起始函数可访问的所有函数(最大深度5):
sql
-- 推荐:使用call_graph表
SELECT func_name, depth FROM call_graph
WHERE start = (SELECT address FROM funcs WHERE name = 'main')
AND direction = 'down' AND max_depth = 5;
-- 手动CTE(需要每步自定义过滤时使用):
WITH RECURSIVE cg AS (
SELECT address as func_addr, name, 0 as depth
FROM funcs WHERE name = 'main'
UNION ALL
SELECT f.address, f.name, cg.depth + 1
FROM cg
JOIN disasm_calls dc ON dc.func_addr = cg.func_addr
JOIN funcs f ON f.address = dc.callee_addr
WHERE cg.depth < 5
AND dc.callee_addr != 0
)
SELECT DISTINCT func_addr, name, MIN(depth) as min_depth
FROM cg
GROUP BY func_addr
ORDER BY min_depth, name;Recursive Call Graph — Reverse (Who Calls This?)
递归调用图 — 反向(谁调用了这个?)
Note: For targeted traversal, prefer thetable withcall_graph. Use recursive CTEs only when you need custom join logic.direction = 'up'
Trace callers transitively up to depth 5:
sql
-- Preferred: use call_graph table
SELECT func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'up' AND max_depth = 5;
-- Manual CTE (when custom filtering is needed at each step):
WITH RECURSIVE callers_cte AS (
SELECT DISTINCT dc.func_addr, 1 as depth
FROM disasm_calls dc
WHERE dc.callee_addr = 0x401000
UNION ALL
SELECT DISTINCT dc.func_addr, c.depth + 1
FROM callers_cte c
JOIN disasm_calls dc ON dc.callee_addr = c.func_addr
WHERE c.depth < 5
)
SELECT (SELECT name FROM funcs WHERE func_addr >= address AND func_addr < end_ea LIMIT 1) as caller, MIN(depth) as distance
FROM callers_cte
GROUP BY func_addr
ORDER BY distance, caller;注意: 对于定向遍历,优先使用的direction = 'up'表。仅当需要自定义关联逻辑时,才使用递归CTE。call_graph
向上追踪间接调用方(最大深度5):
sql
-- 推荐:使用call_graph表
SELECT func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'up' AND max_depth = 5;
-- 手动CTE(需要每步自定义过滤时使用):
WITH RECURSIVE callers_cte AS (
SELECT DISTINCT dc.func_addr, 1 as depth
FROM disasm_calls dc
WHERE dc.callee_addr = 0x401000
UNION ALL
SELECT DISTINCT dc.func_addr, c.depth + 1
FROM callers_cte c
JOIN disasm_calls dc ON dc.callee_addr = c.func_addr
WHERE c.depth < 5
)
SELECT (SELECT name FROM funcs WHERE func_addr >= address AND func_addr < end_ea LIMIT 1) as caller, MIN(depth) as distance
FROM callers_cte
GROUP BY func_addr
ORDER BY distance, caller;CTE: Functions That Both Call malloc AND Check NULL
CTE:同时调用malloc和检查NULL的函数
sql
WITH malloc_callers AS (
SELECT DISTINCT func_addr
FROM disasm_calls
WHERE callee_name LIKE '%malloc%'
),
null_checkers AS (
SELECT DISTINCT func_addr
FROM ctree_v_comparisons
WHERE rhs_num = 0 AND op_name = 'cot_eq'
)
SELECT f.name
FROM funcs f
JOIN malloc_callers m ON f.address = m.func_addr
JOIN null_checkers n ON f.address = n.func_addr;sql
WITH malloc_callers AS (
SELECT DISTINCT func_addr
FROM disasm_calls
WHERE callee_name LIKE '%malloc%'
),
null_checkers AS (
SELECT DISTINCT func_addr
FROM ctree_v_comparisons
WHERE rhs_num = 0 AND op_name = 'cot_eq'
)
SELECT f.name
FROM funcs f
JOIN malloc_callers m ON f.address = m.func_addr
JOIN null_checkers n ON f.address = n.func_addr;CTE: Memory Allocation Without Free (Potential Leaks)
CTE:只分配内存不释放的函数(潜在内存泄漏)
sql
WITH allocators AS (
SELECT func_addr, COUNT(*) as alloc_count
FROM disasm_calls
WHERE callee_name LIKE '%alloc%' OR callee_name LIKE '%malloc%'
GROUP BY func_addr
),
freers AS (
SELECT func_addr, COUNT(*) as free_count
FROM disasm_calls
WHERE callee_name LIKE '%free%'
GROUP BY func_addr
)
SELECT f.name,
COALESCE(a.alloc_count, 0) as allocations,
COALESCE(r.free_count, 0) as frees
FROM funcs f
LEFT JOIN allocators a ON f.address = a.func_addr
LEFT JOIN freers r ON f.address = r.func_addr
WHERE a.alloc_count > 0 AND COALESCE(r.free_count, 0) = 0
ORDER BY allocations DESC;sql
WITH allocators AS (
SELECT func_addr, COUNT(*) as alloc_count
FROM disasm_calls
WHERE callee_name LIKE '%alloc%' OR callee_name LIKE '%malloc%'
GROUP BY func_addr
),
freers AS (
SELECT func_addr, COUNT(*) as free_count
FROM disasm_calls
WHERE callee_name LIKE '%free%'
GROUP BY func_addr
)
SELECT f.name,
COALESCE(a.alloc_count, 0) as allocations,
COALESCE(r.free_count, 0) as frees
FROM funcs f
LEFT JOIN allocators a ON f.address = a.func_addr
LEFT JOIN freers r ON f.address = r.func_addr
WHERE a.alloc_count > 0 AND COALESCE(r.free_count, 0) = 0
ORDER BY allocations DESC;EXISTS: Functions With at Least One String Reference
EXISTS:至少引用一个字符串的函数
More efficient than JOIN + DISTINCT for existence checks:
sql
SELECT f.name
FROM funcs f
WHERE EXISTS (
SELECT 1 FROM xrefs x
JOIN strings s ON x.to_ea = s.address
WHERE x.from_ea BETWEEN f.address AND f.end_ea
);对于存在性检查,比JOIN + DISTINCT更高效:
sql
SELECT f.name
FROM funcs f
WHERE EXISTS (
SELECT 1 FROM xrefs x
JOIN strings s ON x.to_ea = s.address
WHERE x.from_ea BETWEEN f.address AND f.end_ea
);EXISTS: Leaf Functions (No Outgoing Calls)
EXISTS:叶子函数(无传出调用)
sql
SELECT f.name, f.size
FROM funcs f
WHERE NOT EXISTS (
SELECT 1 FROM disasm_calls dc
WHERE dc.func_addr = f.address
)
ORDER BY f.size DESC;sql
SELECT f.name, f.size
FROM funcs f
WHERE NOT EXISTS (
SELECT 1 FROM disasm_calls dc
WHERE dc.func_addr = f.address
)
ORDER BY f.size DESC;