xrefs

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese


Trigger Intents

触发场景

Use this skill when user asks:
  • "Who calls this?" / "What does this call?"
  • "Where is this string/import referenced?"
  • "Show call graph dependencies."
Route to:
  • grep
    for candidate entity lookup by name/pattern before relationship analysis
  • analysis
    for broader triage context
  • decompiler
    for semantic interpretation after graph narrowing
  • disassembly
    for instruction-level call-site proof

当用户提出以下问题时使用此技能:
  • "谁调用了这个?" / "这个调用了什么?"
  • "这个字符串/导入项在哪里被引用?"
  • "展示调用图依赖关系。"
路由至:
  • 进行关系分析前,使用
    grep
    按名称/模式查找候选实体
  • 使用
    analysis
    获取更广泛的分类排查上下文
  • 调用图范围缩小后,使用
    decompiler
    进行语义解读
  • 使用
    disassembly
    获取指令级调用站点验证

Do This First (Warm-Start Sequence)

优先执行步骤(预热流程)

sql
-- 1) Core relation volume
SELECT COUNT(*) AS xref_count FROM xrefs;

-- 2) Top imports (dependency hints)
SELECT module, COUNT(*) AS import_count
FROM imports
GROUP BY module
ORDER BY import_count DESC;

-- 3) Most called functions
SELECT printf('0x%X', to_ea) AS callee, COUNT(*) AS callers
FROM xrefs
WHERE is_code = 1
GROUP BY to_ea
ORDER BY callers DESC
LIMIT 20;
Interpretation guidance:
  • Use relation counts to prioritize hotspots before expensive deep analysis.
  • Prefer indexed filters (
    to_ea
    /
    from_ea
    ) for fast response.

sql
-- 1) 核心关联数量
SELECT COUNT(*) AS xref_count FROM xrefs;

-- 2) 热门导入项(依赖提示)
SELECT module, COUNT(*) AS import_count
FROM imports
GROUP BY module
ORDER BY import_count DESC;

-- 3) 被调用次数最多的函数
SELECT printf('0x%X', to_ea) AS callee, COUNT(*) AS callers
FROM xrefs
WHERE is_code = 1
GROUP BY to_ea
ORDER BY callers DESC
LIMIT 20;
解读指南:
  • 在进行高开销深度分析前,使用关联数量优先处理热点内容。
  • 优先使用索引过滤器(
    to_ea
    /
    from_ea
    )以获得快速响应。

Failure and Recovery

故障与恢复

  • Full-scan query too slow:
    • Add
      to_ea = X
      or
      from_ea = X
      constraints.
  • Target unresolved by name:
    • Resolve/verify address first through the
      names
      table or explicit EA literals.
  • Sparse results:
    • Pivot through
      imports
      ,
      strings
      , or
      disasm_calls
      joins.

  • 全量扫描查询过慢:
    • 添加
      to_ea = X
      from_ea = X
      约束条件。
  • 无法通过名称定位目标:
    • 先通过
      names
      表或明确的EA字面量解析/验证地址。
  • 结果稀疏:
    • 通过
      imports
      strings
      disasm_calls
      关联表进行转查。

Handoff Patterns

移交模式

  1. xrefs
    ->
    decompiler
    for top candidate function semantics.
  2. xrefs
    ->
    analysis
    for campaign-level synthesis.
  3. xrefs
    ->
    annotations
    to persist relationship findings.

  1. xrefs
    ->
    decompiler
    :获取候选函数的语义信息。
  2. xrefs
    ->
    analysis
    :进行全局范围的综合分析。
  3. xrefs
    ->
    annotations
    :保存关联关系分析结果。

xrefs

xrefs

Cross-references - the most important table for understanding code relationships.
ColumnTypeDescription
from_ea
INTSource address (who references)
to_ea
INTTarget address (what is referenced)
type
INTXref type code
is_code
INT1=code xref (call/jump), 0=data xref
from_func
INTPre-computed containing function address (0 when not in a function)
sql
-- Who calls function at 0x401000?
SELECT printf('0x%X', from_ea) as caller FROM xrefs WHERE to_ea = 0x401000 AND is_code = 1;

-- What does function at 0x401000 reference?
SELECT printf('0x%X', to_ea) as target FROM xrefs WHERE from_ea >= 0x401000 AND from_ea < 0x401100;

交叉引用——理解代码关系最重要的表。
类型描述
from_ea
INT源地址(引用方)
to_ea
INT目标地址(被引用方)
type
INT交叉引用类型代码
is_code
INT1=代码交叉引用(调用/跳转),0=数据交叉引用
from_func
INT预计算的所属函数地址(不在函数内时为0)
sql
-- 谁调用了地址为0x401000的函数?
SELECT printf('0x%X', from_ea) as caller FROM xrefs WHERE to_ea = 0x401000 AND is_code = 1;

-- 地址为0x401000的函数引用了什么?
SELECT printf('0x%X', to_ea) as target FROM xrefs WHERE from_ea >= 0x401000 AND from_ea < 0x401100;

imports

imports

Imported functions from external libraries.
ColumnTypeDescription
address
INTImport address (IAT entry)
name
TEXTImport name
module
TEXTModule/DLL name
ordinal
INTImport ordinal
sql
-- Imports from kernel32.dll
SELECT name FROM imports WHERE module LIKE '%kernel32%';

来自外部库的导入函数。
类型描述
address
INT导入地址(IAT条目)
name
TEXT导入名称
module
TEXT模块/DLL名称
ordinal
INT导入序号
sql
-- 来自kernel32.dll的导入项
SELECT name FROM imports WHERE module LIKE '%kernel32%';

Convenience Views

便捷视图

callers

callers

Who calls each function. Caller names are resolved by the view from the
funcs
and
names
tables.
ColumnTypeDescription
func_addr
INTTarget function address
caller_addr
INTXref source address
caller_name
TEXTCalling function name
caller_func_addr
INTCalling function start (from
from_func
)
Underlying query:
sql
SELECT x.to_ea as func_addr, x.from_ea as caller_addr,
       COALESCE((SELECT name FROM names WHERE address = x.from_func LIMIT 1), printf('sub_%X', x.from_func)) as caller_name,
       x.from_func as caller_func_addr
FROM xrefs x WHERE x.is_code = 1 AND x.from_func != 0
sql
-- Who calls function at 0x401000?
SELECT caller_name, printf('0x%X', caller_addr) as from_addr
FROM callers WHERE func_addr = 0x401000;

-- Most called functions
SELECT printf('0x%X', func_addr) as addr, COUNT(*) as callers
FROM callers GROUP BY func_addr ORDER BY callers DESC LIMIT 10;
每个函数的调用方。该视图通过
funcs
names
表解析调用方名称。
类型描述
func_addr
INT目标函数地址
caller_addr
INT交叉引用源地址
caller_name
TEXT调用函数名称
caller_func_addr
INT调用函数起始地址(来自
from_func
底层查询:
sql
SELECT x.to_ea as func_addr, x.from_ea as caller_addr,
       COALESCE((SELECT name FROM names WHERE address = x.from_func LIMIT 1), printf('sub_%X', x.from_func)) as caller_name,
       x.from_func as caller_func_addr
FROM xrefs x WHERE x.is_code = 1 AND x.from_func != 0
sql
-- 谁调用了地址为0x401000的函数?
SELECT caller_name, printf('0x%X', caller_addr) as from_addr
FROM callers WHERE func_addr = 0x401000;

-- 被调用次数最多的函数
SELECT printf('0x%X', func_addr) as addr, COUNT(*) as callers
FROM callers GROUP BY func_addr ORDER BY callers DESC LIMIT 10;

callees

callees

What each function calls. Inverse of callers view. Uses
from_func
for efficient function-level grouping.
ColumnTypeDescription
func_addr
INTCalling function address (from
from_func
)
func_name
TEXTCalling function name
callee_addr
INTCalled address
callee_name
TEXTCalled function/symbol name
sql
-- What does main call?
SELECT callee_name, printf('0x%X', callee_addr) as addr
FROM callees WHERE func_name LIKE '%main%';

-- Functions making most calls
SELECT func_name, COUNT(*) as call_count
FROM callees GROUP BY func_addr ORDER BY call_count DESC LIMIT 10;
每个函数调用的对象。与callers视图相反。使用
from_func
进行高效的函数级分组。
类型描述
func_addr
INT调用函数地址(来自
from_func
func_name
TEXT调用函数名称
callee_addr
INT被调用地址
callee_name
TEXT被调用函数/符号名称
sql
-- main函数调用了什么?
SELECT callee_name, printf('0x%X', callee_addr) as addr
FROM callees WHERE func_name LIKE '%main%';

-- 调用次数最多的函数
SELECT func_name, COUNT(*) as call_count
FROM callees GROUP BY func_addr ORDER BY call_count DESC LIMIT 10;

string_refs

string_refs

Pre-joined view of string cross-references with containing function info.
ColumnTypeDescription
string_addr
INTAddress of the string
string_value
TEXTString content
string_length
INTString length
ref_addr
INTAddress of the referencing instruction
func_addr
INTContaining function address
func_name
TEXTContaining function name
sql
-- Strings referenced by a specific function
SELECT string_value, func_name FROM string_refs WHERE func_addr = 0x401000;

-- Find functions referencing password-related strings
SELECT string_value, func_name FROM string_refs WHERE string_value LIKE '%password%';

-- Most referenced strings
SELECT string_value, COUNT(*) as ref_count
FROM string_refs GROUP BY string_addr ORDER BY ref_count DESC LIMIT 10;
字符串交叉引用与所属函数信息的预关联视图。
类型描述
string_addr
INT字符串地址
string_value
TEXT字符串内容
string_length
INT字符串长度
ref_addr
INT引用指令的地址
func_addr
INT所属函数地址
func_name
TEXT所属函数名称
sql
-- 特定函数引用的字符串
SELECT string_value, func_name FROM string_refs WHERE func_addr = 0x401000;

-- 查找引用密码相关字符串的函数
SELECT string_value, func_name FROM string_refs WHERE string_value LIKE '%password%';

-- 被引用次数最多的字符串
SELECT string_value, COUNT(*) as ref_count
FROM string_refs GROUP BY string_addr ORDER BY ref_count DESC LIMIT 10;

data_refs

data_refs

Cached table of data (non-code) cross-references with containing function info. Whole-program aggregates are supported.
ColumnTypeDescription
from_addr
INTSource address of the reference
to_addr
INTTarget data address
from_func_addr
INTContaining function address
from_func_name
TEXTContaining function name
ref_type
INTXref type code
sql
-- Data references from a specific function
SELECT * FROM data_refs WHERE from_func_addr = 0x401000;

-- Functions with most data references
SELECT from_func_name, COUNT(*) as data_ref_count
FROM data_refs GROUP BY from_func_addr ORDER BY data_ref_count DESC LIMIT 10;

包含所属函数信息的数据(非代码)交叉引用缓存表。支持全程序聚合查询。
类型描述
from_addr
INT引用的源地址
to_addr
INT目标数据地址
from_func_addr
INT所属函数地址
from_func_name
TEXT所属函数名称
ref_type
INT交叉引用类型代码
sql
-- 特定函数的引用数据
SELECT * FROM data_refs WHERE from_func_addr = 0x401000;

-- 数据引用次数最多的函数
SELECT from_func_name, COUNT(*) as data_ref_count
FROM data_refs GROUP BY from_func_addr ORDER BY data_ref_count DESC LIMIT 10;

call_graph

call_graph

Table-valued function for BFS call graph traversal. Uses HIDDEN parameters for traversal control.
ColumnTypeDescription
func_addr
INTFunction address in the graph
func_name
TEXTFunction name
depth
INTBFS depth from start
parent_addr
INTParent function address in the traversal
start
INTHIDDEN — Starting function address
direction
TEXTHIDDEN
'down'
(callees),
'up'
(callers), or
'both'
max_depth
INTHIDDEN — Maximum traversal depth
Always provide WHERE constraints for all 3 hidden parameters.
sql
-- Forward call tree from main
SELECT func_name, depth FROM call_graph
WHERE start = (SELECT address FROM funcs WHERE name = 'main')
  AND direction = 'down' AND max_depth = 5;

-- All transitive callers
SELECT func_name, depth FROM call_graph
WHERE start = 0x405000 AND direction = 'up' AND max_depth = 10;

-- Bidirectional exploration
SELECT func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'both' AND max_depth = 3;

-- Join with string_refs to find strings reachable from a function
SELECT DISTINCT sr.string_value, sr.func_name
FROM call_graph cg
JOIN string_refs sr ON sr.func_addr = cg.func_addr
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 3;

-- Imported APIs reachable from a function's call tree
SELECT DISTINCT i.module, i.name as api
FROM call_graph cg
JOIN disasm_calls dc ON dc.func_addr = cg.func_addr
JOIN imports i ON dc.callee_addr = i.address
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 5
ORDER BY i.module, i.name;
Performance: BFS with visited set. O(reachable functions). Always constrain hidden params. Use this pattern when the destination is an import.

用于BFS调用图遍历的表值函数。使用隐藏参数控制遍历。
类型描述
func_addr
INT图中的函数地址
func_name
TEXT函数名称
depth
INT与起始点的BFS深度
parent_addr
INT遍历中的父函数地址
start
INT隐藏 — 起始函数地址
direction
TEXT隐藏
'down'
(被调用方)、
'up'
(调用方)或
'both'
(双向)
max_depth
INT隐藏 — 最大遍历深度
必须为所有3个隐藏参数提供WHERE约束。
sql
-- 从main函数出发的正向调用树
SELECT func_name, depth FROM call_graph
WHERE start = (SELECT address FROM funcs WHERE name = 'main')
  AND direction = 'down' AND max_depth = 5;

-- 所有间接调用方
SELECT func_name, depth FROM call_graph
WHERE start = 0x405000 AND direction = 'up' AND max_depth = 10;

-- 双向探索
SELECT func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'both' AND max_depth = 3;

-- 关联string_refs查找函数可访问的字符串
SELECT DISTINCT sr.string_value, sr.func_name
FROM call_graph cg
JOIN string_refs sr ON sr.func_addr = cg.func_addr
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 3;

-- 函数调用树可访问的导入API
SELECT DISTINCT i.module, i.name as api
FROM call_graph cg
JOIN disasm_calls dc ON dc.func_addr = cg.func_addr
JOIN imports i ON dc.callee_addr = i.address
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 5
ORDER BY i.module, i.name;
性能:带访问集合的BFS。时间复杂度为O(可访问函数数量)。必须约束隐藏参数。 当目标是导入项时使用此模式。

shortest_path

shortest_path

Table-valued function for finding the shortest call path between two functions. Uses bidirectional BFS.
ColumnTypeDescription
step
INTStep number in the path (0 = source)
func_addr
INTFunction address at this step
func_name
TEXTFunction name at this step
from_addr
INTHIDDEN — Source function address
to_addr
INTHIDDEN — Destination function address
max_depth
INTHIDDEN — Maximum search depth
Always provide WHERE constraints for all 3 hidden parameters. Both endpoints must resolve to functions. Imported API addresses from
imports
are not valid
shortest_path
endpoints.
sql
-- Find shortest call path between two functions
SELECT step, func_name FROM shortest_path
WHERE from_addr = (SELECT address FROM funcs WHERE name = 'main')
  AND to_addr = (SELECT address FROM funcs WHERE name = 'target_func')
  AND max_depth = 20;

-- Check reachability between two functions
SELECT COUNT(*) > 0 as reachable FROM shortest_path
WHERE from_addr = 0x401000 AND to_addr = 0x405000 AND max_depth = 20;

-- Annotate path steps with call count and string refs
SELECT sp.step, sp.func_name,
       (SELECT COUNT(*) FROM disasm_calls dc WHERE dc.func_addr = sp.func_addr) as calls_made,
       (SELECT COUNT(*) FROM string_refs sr WHERE sr.func_addr = sp.func_addr) as strings_used
FROM shortest_path sp
WHERE sp.from_addr = 0x401000 AND sp.to_addr = 0x405000 AND sp.max_depth = 20
ORDER BY sp.step;
Performance: Bidirectional BFS. O(b^(d/2)) where b is branching factor and d is path length. Returns empty result set if no path exists within max_depth.

用于查找两个函数间最短调用路径的表值函数。使用双向BFS。
类型描述
step
INT路径中的步骤编号(0 = 源点)
func_addr
INT该步骤的函数地址
func_name
TEXT该步骤的函数名称
from_addr
INT隐藏 — 源函数地址
to_addr
INT隐藏 — 目标函数地址
max_depth
INT隐藏 — 最大搜索深度
必须为所有3个隐藏参数提供WHERE约束。 两个端点必须解析为函数。来自
imports
的导入API地址不能作为
shortest_path
的端点。
sql
-- 查找两个函数间的最短调用路径
SELECT step, func_name FROM shortest_path
WHERE from_addr = (SELECT address FROM funcs WHERE name = 'main')
  AND to_addr = (SELECT address FROM funcs WHERE name = 'target_func')
  AND max_depth = 20;

-- 检查两个函数间的可达性
SELECT COUNT(*) > 0 as reachable FROM shortest_path
WHERE from_addr = 0x401000 AND to_addr = 0x405000 AND max_depth = 20;

-- 为路径步骤添加调用次数和字符串引用注释
SELECT sp.step, sp.func_name,
       (SELECT COUNT(*) FROM disasm_calls dc WHERE dc.func_addr = sp.func_addr) as calls_made,
       (SELECT COUNT(*) FROM string_refs sr WHERE sr.func_addr = sp.func_addr) as strings_used
FROM shortest_path sp
WHERE sp.from_addr = 0x401000 AND sp.to_addr = 0x405000 AND sp.max_depth = 20
ORDER BY sp.step;
性能:双向BFS。时间复杂度为O(b^(d/2)),其中b是分支因子,d是路径长度。如果在max_depth内不存在路径,则返回空结果集。

Table-First Cross-Reference Queries

基于表的交叉引用查询示例

sql
-- Incoming references to an address
SELECT from_ea, to_ea, type, is_code, from_func
FROM xrefs
WHERE to_ea = 0x401000;

-- Exact outgoing references from an item address
SELECT from_ea, to_ea, type, is_code, from_func
FROM xrefs
WHERE from_ea = 0x401000;

-- Outgoing references from anywhere inside a function
SELECT from_ea, to_ea, type, is_code, from_func
FROM xrefs
WHERE from_func = 0x401000;

sql
-- 某个地址的传入引用
SELECT from_ea, to_ea, type, is_code, from_func
FROM xrefs
WHERE to_ea = 0x401000;

-- 某个项目地址的精确传出引用
SELECT from_ea, to_ea, type, is_code, from_func
FROM xrefs
WHERE from_ea = 0x401000;

-- 某个函数内部所有位置的传出引用
SELECT from_ea, to_ea, type, is_code, from_func
FROM xrefs
WHERE from_func = 0x401000;

grep

grep

Use
grep
to resolve internal symbols, types, and members before doing relationship analysis. Use
imports
when the callee may exist only as an imported API. Canonical usage lives in
../grep/SKILL.md
. For canonical schema and owner mapping, see
../connect/references/schema-catalog.md
(
grep
).
sql
-- Resolve internal functions with grep
SELECT name, kind, address
FROM grep
WHERE pattern = 'main%' AND kind = 'function'
ORDER BY name;

-- Resolve imported APIs with imports
SELECT module, name, address
FROM imports
WHERE name LIKE 'CreateFile%'
ORDER BY module, name;

-- Then pivot into callers/callees/xrefs
SELECT caller_name, printf('0x%X', caller_addr) AS from_addr
FROM callers
WHERE func_addr = (
    SELECT address
    FROM imports
    WHERE name = 'CreateFileW'
    ORDER BY name
    LIMIT 1
);

在进行关系分析前,使用
grep
解析内部符号、类型和成员。当被调用方可能仅作为导入API存在时,使用
imports
。 标准用法请参考
../grep/SKILL.md
。 关于标准模式和所有者映射,请查看
../connect/references/schema-catalog.md
grep
部分)。
sql
-- 使用grep解析内部函数
SELECT name, kind, address
FROM grep
WHERE pattern = 'main%' AND kind = 'function'
ORDER BY name;

-- 使用imports解析导入API
SELECT module, name, address
FROM imports
WHERE name LIKE 'CreateFile%'
ORDER BY module, name;

-- 转查调用方/被调用方/xrefs
SELECT caller_name, printf('0x%X', caller_addr) AS from_addr
FROM callers
WHERE func_addr = (
    SELECT address
    FROM imports
    WHERE name = 'CreateFileW'
    ORDER BY name
    LIMIT 1
);

Performance Rules

性能规则

Constraint Pushdown

约束下推

The
xrefs
table has optimized filters using efficient IDA SDK APIs:
FilterCostBehavior
to_ea = X
0.5O(xrefs to X) — fast, uses IDA's xref index
from_ea = X
0.5O(xrefs from X) — fast, uses IDA's xref index
from_func = X
1.0O(callees of X) — uses XrefsFromFuncIterator
No equality filter on
to_ea
/
from_ea
/
from_func
Falls back to a cache of xrefs to function entry points only — avoid relying on it for complete import/data/non-function coverage
Always filter xrefs by
to_ea
,
from_ea
, or
from_func
. Avoid unconstrained scans.
sql
-- FAST: xrefs to a specific target
SELECT * FROM xrefs WHERE to_ea = 0x401000;

-- FAST: xrefs from a specific source
SELECT * FROM xrefs WHERE from_ea = 0x401000;

-- FAST: all xrefs originating from a function
SELECT * FROM xrefs WHERE from_func = 0x401000;

-- INCOMPLETE/avoid: unconstrained scan falls back to a function-entry cache
SELECT * FROM xrefs WHERE is_code = 1;

xrefs
表使用高效的IDA SDK API实现了优化过滤器
过滤器开销行为
to_ea = X
0.5O(指向X的交叉引用数量) — 快速,使用IDA的交叉引用索引
from_ea = X
0.5O(来自X的交叉引用数量) — 快速,使用IDA的交叉引用索引
from_func = X
1.0O(X的被调用方数量) — 使用XrefsFromFuncIterator
未对
to_ea
/
from_ea
/
from_func
设置相等过滤器
回退到仅包含函数入口点的交叉引用缓存 — 避免依赖它获取完整的导入/数据/非函数覆盖范围
始终通过
to_ea
from_ea
from_func
过滤xrefs。避免无约束扫描。
sql
-- 快速:指向特定目标的交叉引用
SELECT * FROM xrefs WHERE to_ea = 0x401000;

-- 快速:来自特定源的交叉引用
SELECT * FROM xrefs WHERE from_ea = 0x401000;

-- 快速:来自某个函数的所有交叉引用
SELECT * FROM xrefs WHERE from_func = 0x401000;

-- 不完整/避免:无约束扫描回退到函数入口缓存
SELECT * FROM xrefs WHERE is_code = 1;

Common Xref Patterns

常见交叉引用模式

Find Most Called Functions

查找被调用次数最多的函数

sql
SELECT f.name, COUNT(*) as caller_count
FROM funcs f
JOIN xrefs x ON f.address = x.to_ea
WHERE x.is_code = 1
GROUP BY f.address
ORDER BY caller_count DESC
LIMIT 10;
sql
SELECT f.name, COUNT(*) as caller_count
FROM funcs f
JOIN xrefs x ON f.address = x.to_ea
WHERE x.is_code = 1
GROUP BY f.address
ORDER BY caller_count DESC
LIMIT 10;

Find Functions Calling a Specific API

查找调用特定API的函数

sql
SELECT DISTINCT (SELECT name FROM funcs WHERE from_ea >= address AND from_ea < end_ea LIMIT 1) as caller
FROM xrefs
WHERE to_ea = (SELECT address FROM imports WHERE name = 'CreateFileW');
sql
SELECT DISTINCT (SELECT name FROM funcs WHERE from_ea >= address AND from_ea < end_ea LIMIT 1) as caller
FROM xrefs
WHERE to_ea = (SELECT address FROM imports WHERE name = 'CreateFileW');

String Cross-Reference Analysis

字符串交叉引用分析

sql
SELECT s.content, (SELECT name FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) as used_by
FROM strings s
JOIN xrefs x ON s.address = x.to_ea
WHERE s.content LIKE '%password%';
sql
SELECT s.content, (SELECT name FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) as used_by
FROM strings s
JOIN xrefs x ON s.address = x.to_ea
WHERE s.content LIKE '%password%';

Import Dependency Map

导入依赖映射

sql
-- Which modules does each function depend on?
SELECT f.name as func_name, i.module, COUNT(*) as api_count
FROM funcs f
JOIN disasm_calls dc ON dc.func_addr = f.address
JOIN imports i ON dc.callee_addr = i.address
GROUP BY f.address, i.module
ORDER BY f.name, api_count DESC;
sql
-- 每个函数依赖哪些模块?
SELECT f.name as func_name, i.module, COUNT(*) as api_count
FROM funcs f
JOIN disasm_calls dc ON dc.func_addr = f.address
JOIN imports i ON dc.callee_addr = i.address
GROUP BY f.address, i.module
ORDER BY f.name, api_count DESC;

Data Section References

数据段引用

sql
-- Functions referencing data sections
SELECT
    f.name,
    s.name as segment,
    COUNT(*) as data_refs
FROM funcs f
JOIN xrefs x ON x.from_ea BETWEEN f.address AND f.end_ea
JOIN segments s ON x.to_ea BETWEEN s.start_ea AND s.end_ea
WHERE s.class = 'DATA' AND x.is_code = 0
GROUP BY f.address, s.name
ORDER BY data_refs DESC
LIMIT 20;

sql
-- 引用数据段的函数
SELECT
    f.name,
    s.name as segment,
    COUNT(*) as data_refs
FROM funcs f
JOIN xrefs x ON x.from_ea BETWEEN f.address AND f.end_ea
JOIN segments s ON x.to_ea BETWEEN s.start_ea AND s.end_ea
WHERE s.class = 'DATA' AND x.is_code = 0
GROUP BY f.address, s.name
ORDER BY data_refs DESC
LIMIT 20;

Advanced Xref Patterns (CTEs and Recursive Queries)

高级交叉引用模式(CTE与递归查询)

Prefer
call_graph
over manual CTEs.
The
call_graph
table uses C++ BFS with a visited set — correct BFS depths, no duplicate expansion on diamond-shaped call graphs, and early termination.
Before (recursive CTE — no visited tracking, exponential on diamond graphs):
sql
WITH RECURSIVE call_chain(root, current_func, depth) AS (
    SELECT 0x401000, callee_addr, 1
    FROM disasm_calls WHERE func_addr = 0x401000
    UNION ALL
    SELECT cc.root, dc.callee_addr, cc.depth + 1
    FROM call_chain cc
    JOIN disasm_calls dc ON dc.func_addr = cc.current_func
    WHERE cc.depth < 10
)
SELECT DISTINCT current_func, MIN(depth) FROM call_chain GROUP BY current_func;
After (call_graph table — C++ BFS with visited set, correct depths):
sql
SELECT func_addr, func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'down' AND max_depth = 10;
优先使用
call_graph
而非手动CTE。
call_graph
表使用带访问集合的C++ BFS — 能正确计算BFS深度,避免菱形调用图中的重复扩展,并支持提前终止。
之前的方式(递归CTE — 无访问跟踪,菱形图上呈指数级增长):
sql
WITH RECURSIVE call_chain(root, current_func, depth) AS (
    SELECT 0x401000, callee_addr, 1
    FROM disasm_calls WHERE func_addr = 0x401000
    UNION ALL
    SELECT cc.root, dc.callee_addr, cc.depth + 1
    FROM call_chain cc
    JOIN disasm_calls dc ON dc.func_addr = cc.current_func
    WHERE cc.depth < 10
)
SELECT DISTINCT current_func, MIN(depth) FROM call_chain GROUP BY current_func;
优化后的方式(call_graph表 — 带访问集合的C++ BFS,深度计算正确):
sql
SELECT func_addr, func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'down' AND max_depth = 10;

Recursive Call Graph — Forward Traversal

递归调用图 — 正向遍历

Note: For targeted traversal, prefer the
call_graph
table which uses C++ BFS with visited tracking. Use recursive CTEs only when you need custom join logic (e.g., filtering by callee properties at each step).
Find all functions reachable from a starting function (up to depth 5):
sql
-- Preferred: use call_graph table
SELECT func_name, depth FROM call_graph
WHERE start = (SELECT address FROM funcs WHERE name = 'main')
  AND direction = 'down' AND max_depth = 5;

-- Manual CTE (when custom filtering is needed at each step):
WITH RECURSIVE cg AS (
    SELECT address as func_addr, name, 0 as depth
    FROM funcs WHERE name = 'main'

    UNION ALL

    SELECT f.address, f.name, cg.depth + 1
    FROM cg
    JOIN disasm_calls dc ON dc.func_addr = cg.func_addr
    JOIN funcs f ON f.address = dc.callee_addr
    WHERE cg.depth < 5
      AND dc.callee_addr != 0
)
SELECT DISTINCT func_addr, name, MIN(depth) as min_depth
FROM cg
GROUP BY func_addr
ORDER BY min_depth, name;
注意: 对于定向遍历,优先使用带访问跟踪的C++ BFS实现的
call_graph
表。仅当需要自定义关联逻辑(例如,在每一步过滤被调用方属性)时,才使用递归CTE。
查找从起始函数可访问的所有函数(最大深度5):
sql
-- 推荐:使用call_graph表
SELECT func_name, depth FROM call_graph
WHERE start = (SELECT address FROM funcs WHERE name = 'main')
  AND direction = 'down' AND max_depth = 5;

-- 手动CTE(需要每步自定义过滤时使用):
WITH RECURSIVE cg AS (
    SELECT address as func_addr, name, 0 as depth
    FROM funcs WHERE name = 'main'

    UNION ALL

    SELECT f.address, f.name, cg.depth + 1
    FROM cg
    JOIN disasm_calls dc ON dc.func_addr = cg.func_addr
    JOIN funcs f ON f.address = dc.callee_addr
    WHERE cg.depth < 5
      AND dc.callee_addr != 0
)
SELECT DISTINCT func_addr, name, MIN(depth) as min_depth
FROM cg
GROUP BY func_addr
ORDER BY min_depth, name;

Recursive Call Graph — Reverse (Who Calls This?)

递归调用图 — 反向(谁调用了这个?)

Note: For targeted traversal, prefer the
call_graph
table with
direction = 'up'
. Use recursive CTEs only when you need custom join logic.
Trace callers transitively up to depth 5:
sql
-- Preferred: use call_graph table
SELECT func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'up' AND max_depth = 5;

-- Manual CTE (when custom filtering is needed at each step):
WITH RECURSIVE callers_cte AS (
    SELECT DISTINCT dc.func_addr, 1 as depth
    FROM disasm_calls dc
    WHERE dc.callee_addr = 0x401000

    UNION ALL

    SELECT DISTINCT dc.func_addr, c.depth + 1
    FROM callers_cte c
    JOIN disasm_calls dc ON dc.callee_addr = c.func_addr
    WHERE c.depth < 5
)
SELECT (SELECT name FROM funcs WHERE func_addr >= address AND func_addr < end_ea LIMIT 1) as caller, MIN(depth) as distance
FROM callers_cte
GROUP BY func_addr
ORDER BY distance, caller;
注意: 对于定向遍历,优先使用
direction = 'up'
call_graph
表。仅当需要自定义关联逻辑时,才使用递归CTE。
向上追踪间接调用方(最大深度5):
sql
-- 推荐:使用call_graph表
SELECT func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'up' AND max_depth = 5;

-- 手动CTE(需要每步自定义过滤时使用):
WITH RECURSIVE callers_cte AS (
    SELECT DISTINCT dc.func_addr, 1 as depth
    FROM disasm_calls dc
    WHERE dc.callee_addr = 0x401000

    UNION ALL

    SELECT DISTINCT dc.func_addr, c.depth + 1
    FROM callers_cte c
    JOIN disasm_calls dc ON dc.callee_addr = c.func_addr
    WHERE c.depth < 5
)
SELECT (SELECT name FROM funcs WHERE func_addr >= address AND func_addr < end_ea LIMIT 1) as caller, MIN(depth) as distance
FROM callers_cte
GROUP BY func_addr
ORDER BY distance, caller;

CTE: Functions That Both Call malloc AND Check NULL

CTE:同时调用malloc和检查NULL的函数

sql
WITH malloc_callers AS (
    SELECT DISTINCT func_addr
    FROM disasm_calls
    WHERE callee_name LIKE '%malloc%'
),
null_checkers AS (
    SELECT DISTINCT func_addr
    FROM ctree_v_comparisons
    WHERE rhs_num = 0 AND op_name = 'cot_eq'
)
SELECT f.name
FROM funcs f
JOIN malloc_callers m ON f.address = m.func_addr
JOIN null_checkers n ON f.address = n.func_addr;
sql
WITH malloc_callers AS (
    SELECT DISTINCT func_addr
    FROM disasm_calls
    WHERE callee_name LIKE '%malloc%'
),
null_checkers AS (
    SELECT DISTINCT func_addr
    FROM ctree_v_comparisons
    WHERE rhs_num = 0 AND op_name = 'cot_eq'
)
SELECT f.name
FROM funcs f
JOIN malloc_callers m ON f.address = m.func_addr
JOIN null_checkers n ON f.address = n.func_addr;

CTE: Memory Allocation Without Free (Potential Leaks)

CTE:只分配内存不释放的函数(潜在内存泄漏)

sql
WITH allocators AS (
    SELECT func_addr, COUNT(*) as alloc_count
    FROM disasm_calls
    WHERE callee_name LIKE '%alloc%' OR callee_name LIKE '%malloc%'
    GROUP BY func_addr
),
freers AS (
    SELECT func_addr, COUNT(*) as free_count
    FROM disasm_calls
    WHERE callee_name LIKE '%free%'
    GROUP BY func_addr
)
SELECT f.name,
       COALESCE(a.alloc_count, 0) as allocations,
       COALESCE(r.free_count, 0) as frees
FROM funcs f
LEFT JOIN allocators a ON f.address = a.func_addr
LEFT JOIN freers r ON f.address = r.func_addr
WHERE a.alloc_count > 0 AND COALESCE(r.free_count, 0) = 0
ORDER BY allocations DESC;
sql
WITH allocators AS (
    SELECT func_addr, COUNT(*) as alloc_count
    FROM disasm_calls
    WHERE callee_name LIKE '%alloc%' OR callee_name LIKE '%malloc%'
    GROUP BY func_addr
),
freers AS (
    SELECT func_addr, COUNT(*) as free_count
    FROM disasm_calls
    WHERE callee_name LIKE '%free%'
    GROUP BY func_addr
)
SELECT f.name,
       COALESCE(a.alloc_count, 0) as allocations,
       COALESCE(r.free_count, 0) as frees
FROM funcs f
LEFT JOIN allocators a ON f.address = a.func_addr
LEFT JOIN freers r ON f.address = r.func_addr
WHERE a.alloc_count > 0 AND COALESCE(r.free_count, 0) = 0
ORDER BY allocations DESC;

EXISTS: Functions With at Least One String Reference

EXISTS:至少引用一个字符串的函数

More efficient than JOIN + DISTINCT for existence checks:
sql
SELECT f.name
FROM funcs f
WHERE EXISTS (
    SELECT 1 FROM xrefs x
    JOIN strings s ON x.to_ea = s.address
    WHERE x.from_ea BETWEEN f.address AND f.end_ea
);
对于存在性检查,比JOIN + DISTINCT更高效:
sql
SELECT f.name
FROM funcs f
WHERE EXISTS (
    SELECT 1 FROM xrefs x
    JOIN strings s ON x.to_ea = s.address
    WHERE x.from_ea BETWEEN f.address AND f.end_ea
);

EXISTS: Leaf Functions (No Outgoing Calls)

EXISTS:叶子函数(无传出调用)

sql
SELECT f.name, f.size
FROM funcs f
WHERE NOT EXISTS (
    SELECT 1 FROM disasm_calls dc
    WHERE dc.func_addr = f.address
)
ORDER BY f.size DESC;
sql
SELECT f.name, f.size
FROM funcs f
WHERE NOT EXISTS (
    SELECT 1 FROM disasm_calls dc
    WHERE dc.func_addr = f.address
)
ORDER BY f.size DESC;