analysis

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Additional Resources

额外资源

  • For crypto/network detection patterns: references/crypto-detection.md, references/network-detection.md
  • For advanced SQL patterns (CTEs, window functions, batch analysis, pagination) and extended query cookbook: references/analysis-cookbook.md

  • 加密/网络检测模式相关:references/crypto-detection.mdreferences/network-detection.md
  • 高级SQL模式(CTEs、窗口函数、批量分析、分页)及扩展查询指南:references/analysis-cookbook.md

Trigger Intents

触发场景

Use this skill when user prompts sound like:
  • "What does this binary do?"
  • "Find suspicious/security-relevant behavior."
  • "Which libraries/frameworks are present?"
  • "Give me a prioritized triage plan."
  • "Show higher-level insights, not just raw rows."
  • "Compare this decompilation to source."
  • "Help make this function review-ready."
Route to adjacent skills when needed:
  • Need raw caller/callee detail:
    xrefs
  • Need assembly-level investigation:
    disassembly
  • Need pseudocode semantics:
    decompiler
  • Need editing/annotation:
    annotations

当用户提出以下类似需求时使用本技能:
  • "这个二进制文件的功能是什么?"
  • "查找可疑/安全相关行为。"
  • "包含哪些库/框架?"
  • "给我一个优先级明确的排查方案。"
  • "展示更高级的洞察,不要只返回原始行数据。"
  • "对比反编译结果与源代码。"
  • "帮我让这个函数达到可评审的状态。"
必要时可转至关联技能:
  • 需要原始调用者/被调用者详情:
    xrefs
  • 需要汇编级别的调查:
    disassembly
  • 需要伪代码语义分析:
    decompiler
  • 需要编辑/注释:
    annotations

Do This First (Warm-Start Sequence)

初始步骤(快速启动流程)

Start broad, then narrow:
sql
-- 1) Binary orientation
SELECT * FROM welcome;

-- 2) Capability hints from imports
SELECT module, name FROM imports ORDER BY module, name;

-- 3) Behavioral hints from strings
SELECT content, printf('0x%X', address) AS addr
FROM strings
WHERE length >= 8
ORDER BY length DESC
LIMIT 40;
Interpretation guidance:
  • Crypto/network/process APIs + suspicious strings usually indicate highest-value functions to inspect first.
  • Move from "signals" to "functions" via
    xrefs
    /
    disasm_calls
    , then decompile likely hotspots.

先从广度分析,再逐步缩小范围:
sql
-- 1) 二进制文件定位
SELECT * FROM welcome;

-- 2) 从导入表获取功能线索
SELECT module, name FROM imports ORDER BY module, name;

-- 3) 从字符串获取行为线索
SELECT content, printf('0x%X', address) AS addr
FROM strings
WHERE length >= 8
ORDER BY length DESC
LIMIT 40;
解读指南:
  • 加密/网络/进程API + 可疑字符串通常指示最需要优先检查的高价值函数。
  • 通过
    xrefs
    /
    disasm_calls
    从“信号”定位到“函数”,然后反编译潜在的关键区域。

Failure and Recovery

故障处理与恢复

  • Empty/high-noise results:
    • Tighten patterns and pivot to module-specific imports.
    • Add JOINs with
      funcs
      and limit by size/name or call density.
  • Missing decompiler surfaces:
    • Continue with
      disassembly
      +
      xrefs
      patterns.
  • Timeout on complex analytics:
    • Decompose into staged CTEs and smaller candidate sets.

  • 结果为空或噪音过多:
    • 收紧匹配模式,转向特定模块的导入表。
    • 添加与
      funcs
      的JOIN操作,按大小/名称或调用密度进行限制。
  • 缺少反编译界面:
    • 继续使用
      disassembly
      +
      xrefs
      模式。
  • 复杂分析超时:
    • 分解为分步CTEs和更小的候选集。

Handoff Patterns

技能转接模式

  1. analysis
    ->
    xrefs
    : map signal addresses to caller/callee graph.
  2. analysis
    ->
    decompiler
    : inspect semantic logic in highest-risk functions.
  3. analysis
    ->
    annotations
    : persist findings as comments/renames and make the decompilation review-ready.

  1. analysis
    ->
    xrefs
    :将信号地址映射到调用者/被调用者图。
  2. analysis
    ->
    decompiler
    :检查高风险函数的语义逻辑。
  3. analysis
    ->
    annotations
    :将发现结果保存为注释/重命名,使反编译结果达到可评审状态。

High-Fidelity Review Handoff

高保真评审转接

When the user wants side-by-side comparison with source, or asks to "clean up" a function so it reads better, stop treating the task as read-only triage and hand off to
annotations
.
Use this review probe first:
sql
SELECT decompile(0x401000);
SELECT idx, name, type FROM ctree_lvars WHERE func_addr = 0x401000 ORDER BY idx;
SELECT callee_name FROM disasm_calls WHERE func_addr = 0x401000;
Then route to
annotations
for the edit pass. Success markers for a review-ready function are:
  • typed signature and clearer field access
  • named locals, globals, and labels
  • one heading-style summary comment near function start
  • less raw pointer math and fewer generic temp names
Treat that summary comment as part of the analysis product, not just presentation polish:
  • it should support semantic search later
  • it should help whole-program understanding when many functions have already been annotated
Non-goal:
  • exact source syntax. Decompiler-stable forms such as
    qmemcpy(...)
    can still be acceptable if names, types, and comments are correct.

当用户需要与源代码进行并排对比,或要求“整理”函数使其更易读时,停止只读式排查,转至
annotations
技能。
先使用以下评审探查语句:
sql
SELECT decompile(0x401000);
SELECT idx, name, type FROM ctree_lvars WHERE func_addr = 0x401000 ORDER BY idx;
SELECT callee_name FROM disasm_calls WHERE func_addr = 0x401000;
然后转至
annotations
进行编辑。函数达到可评审状态的标志:
  • 带有类型的签名和更清晰的字段访问方式
  • 已命名的局部变量、全局变量和标签
  • 函数开头附近有一条标题式的总结注释
  • 更少的原始指针运算和通用临时变量名
将该总结注释视为分析成果的一部分,而非仅用于展示优化:
  • 它应支持后续的语义搜索
  • 当已有大量函数被注释时,它应有助于理解整个程序
非目标:
  • 完全匹配源代码语法。如果名称、类型和注释正确,反编译稳定形式如
    qmemcpy(...)
    仍可接受。

Quick Start Examples

快速入门示例

"What does this binary do?"

"这个二进制文件的功能是什么?"

sql
-- Entry points
SELECT * FROM entries;

-- Imported APIs (hints at functionality)
SELECT module, name FROM imports ORDER BY module, name;

-- Interesting strings
SELECT content FROM strings WHERE length > 10 ORDER BY length DESC LIMIT 20;
sql
-- 入口点
SELECT * FROM entries;

-- 导入的API(功能线索)
SELECT module, name FROM imports ORDER BY module, name;

-- 有趣的字符串
SELECT content FROM strings WHERE length > 10 ORDER BY length DESC LIMIT 20;

"Find security-relevant code"

"查找安全相关代码"

sql
-- Dangerous string functions
SELECT DISTINCT (SELECT name FROM funcs WHERE func_addr >= address AND func_addr < end_ea LIMIT 1) FROM disasm_calls
WHERE callee_name IN ('strcpy', 'strcat', 'sprintf', 'gets');

-- Crypto-related
SELECT * FROM imports WHERE name LIKE '%Crypt%' OR name LIKE '%Hash%';

-- Network-related
SELECT * FROM imports WHERE name LIKE '%socket%' OR name LIKE '%connect%' OR name LIKE '%send%';
sql
-- 危险的字符串函数
SELECT DISTINCT (SELECT name FROM funcs WHERE func_addr >= address AND func_addr < end_ea LIMIT 1) FROM disasm_calls
WHERE callee_name IN ('strcpy', 'strcat', 'sprintf', 'gets');

-- 加密相关
SELECT * FROM imports WHERE name LIKE '%Crypt%' OR name LIKE '%Hash%';

-- 网络相关
SELECT * FROM imports WHERE name LIKE '%socket%' OR name LIKE '%connect%' OR name LIKE '%send%';

"Understand a specific function"

"理解特定函数"

sql
-- Basic info
SELECT * FROM funcs WHERE address = 0x401000;

-- Full disassembly
SELECT disasm_func(0x401000);

-- Decompile (if Hex-Rays available)
SELECT decompile(0x401000);

-- Local variables
SELECT name, type, size FROM ctree_lvars WHERE func_addr = 0x401000;

-- What it calls
SELECT callee_name FROM disasm_calls WHERE func_addr = 0x401000;

-- What calls it
SELECT (SELECT name FROM funcs WHERE from_ea >= address AND from_ea < end_ea LIMIT 1) FROM xrefs WHERE to_ea = 0x401000 AND is_code = 1;
sql
-- 基本信息
SELECT * FROM funcs WHERE address = 0x401000;

-- 完整反汇编
SELECT disasm_func(0x401000);

-- 反编译(若Hex-Rays可用)
SELECT decompile(0x401000);

-- 局部变量
SELECT name, type, size FROM ctree_lvars WHERE func_addr = 0x401000;

-- 它调用的函数
SELECT callee_name FROM disasm_calls WHERE func_addr = 0x401000;

-- 调用它的函数
SELECT (SELECT name FROM funcs WHERE from_ea >= address AND from_ea < end_ea LIMIT 1) FROM xrefs WHERE to_ea = 0x401000 AND is_code = 1;

"Find all uses of a string"

"查找字符串的所有使用场景"

sql
SELECT s.content, (SELECT name FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) as function, printf('0x%X', x.from_ea) as location
FROM strings s
JOIN xrefs x ON s.address = x.to_ea
WHERE s.content LIKE '%config%';

sql
SELECT s.content, (SELECT name FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) as function, printf('0x%X', x.from_ea) as location
FROM strings s
JOIN xrefs x ON s.address = x.to_ea
WHERE s.content LIKE '%config%';

Natural Language Query Examples

自然语言查询示例

Function Signature Queries

函数签名查询

"Show me functions that return integers"
sql
SELECT name, return_type, arg_count FROM funcs
WHERE return_is_integral = 1 LIMIT 20;

-- Or via types_func_args (typedef-aware)
SELECT DISTINCT type_name FROM types_func_args
WHERE arg_index = -1 AND is_integral_resolved = 1;
"Show me functions that take 4 string arguments"
sql
SELECT type_name, COUNT(*) as string_args
FROM types_func_args
WHERE arg_index >= 0
  AND is_ptr_resolved = 1
  AND base_type_resolved IN ('char', 'wchar_t', 'CHAR', 'WCHAR')
GROUP BY type_ordinal
HAVING string_args = 4;
"Which functions return pointers?"
sql
SELECT name, return_type FROM funcs
WHERE return_is_ptr = 1 ORDER BY name LIMIT 20;
"Find void functions with many arguments"
sql
SELECT name, arg_count FROM funcs
WHERE return_is_void = 1 AND arg_count >= 4
ORDER BY arg_count DESC;
"What calling conventions are used?"
sql
SELECT calling_conv, COUNT(*) as count FROM funcs
WHERE calling_conv IS NOT NULL AND calling_conv != ''
GROUP BY calling_conv ORDER BY count DESC;
"显示返回整数的函数"
sql
SELECT name, return_type, arg_count FROM funcs
WHERE return_is_integral = 1 LIMIT 20;

-- 或通过types_func_args(支持typedef)
SELECT DISTINCT type_name FROM types_func_args
WHERE arg_index = -1 AND is_integral_resolved = 1;
"显示接受4个字符串参数的函数"
sql
SELECT type_name, COUNT(*) as string_args
FROM types_func_args
WHERE arg_index >= 0
  AND is_ptr_resolved = 1
  AND base_type_resolved IN ('char', 'wchar_t', 'CHAR', 'WCHAR')
GROUP BY type_ordinal
HAVING string_args = 4;
"哪些函数返回指针?"
sql
SELECT name, return_type FROM funcs
WHERE return_is_ptr = 1 ORDER BY name LIMIT 20;
"查找带有多个参数的void函数"
sql
SELECT name, arg_count FROM funcs
WHERE return_is_void = 1 AND arg_count >= 4
ORDER BY arg_count DESC;
"使用了哪些调用约定?"
sql
SELECT calling_conv, COUNT(*) as count FROM funcs
WHERE calling_conv IS NOT NULL AND calling_conv != ''
GROUP BY calling_conv ORDER BY count DESC;

Return Value Analysis

返回值分析

"Which functions return 0?"
sql
SELECT DISTINCT f.name FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
WHERE r.return_num = 0;
"Find functions that return -1 (error pattern)"
sql
SELECT DISTINCT f.name FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
WHERE r.return_num = -1;
"Functions that return their input argument"
sql
SELECT DISTINCT f.name FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
WHERE r.returns_arg = 1;
"Functions that return the result of another call (wrappers)"
sql
SELECT DISTINCT f.name FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
WHERE r.returns_call_result = 1;
"Functions with multiple return statements"
sql
SELECT f.name, COUNT(*) as return_count
FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
GROUP BY f.address
HAVING return_count > 1
ORDER BY return_count DESC LIMIT 20;

"哪些函数返回0?"
sql
SELECT DISTINCT f.name FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
WHERE r.return_num = 0;
"查找返回-1的函数(错误模式)"
sql
SELECT DISTINCT f.name FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
WHERE r.return_num = -1;
"返回输入参数的函数"
sql
SELECT DISTINCT f.name FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
WHERE r.returns_arg = 1;
"返回另一个调用结果的函数(包装器)"
sql
SELECT DISTINCT f.name FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
WHERE r.returns_call_result = 1;
"包含多个返回语句的函数"
sql
SELECT f.name, COUNT(*) as return_count
FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
GROUP BY f.address
HAVING return_count > 1
ORDER BY return_count DESC LIMIT 20;

Common Query Patterns

常见查询模式

Find Most Called Functions

查找被调用次数最多的函数

sql
SELECT f.name, COUNT(*) as callers
FROM funcs f
JOIN xrefs x ON f.address = x.to_ea
WHERE x.is_code = 1
GROUP BY f.address
ORDER BY callers DESC
LIMIT 10;
sql
SELECT f.name, COUNT(*) as callers
FROM funcs f
JOIN xrefs x ON f.address = x.to_ea
WHERE x.is_code = 1
GROUP BY f.address
ORDER BY callers DESC
LIMIT 10;

Find Functions Calling a Specific API

查找调用特定API的函数

sql
SELECT DISTINCT (SELECT name FROM funcs WHERE from_ea >= address AND from_ea < end_ea LIMIT 1) as caller
FROM xrefs
WHERE to_ea = (SELECT address FROM imports WHERE name = 'CreateFileW');
sql
SELECT DISTINCT (SELECT name FROM funcs WHERE from_ea >= address AND from_ea < end_ea LIMIT 1) as caller
FROM xrefs
WHERE to_ea = (SELECT address FROM imports WHERE name = 'CreateFileW');

String Cross-Reference Analysis

字符串交叉引用分析

sql
SELECT s.content, (SELECT name FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) as used_by
FROM strings s
JOIN xrefs x ON s.address = x.to_ea
WHERE s.content LIKE '%password%';
sql
SELECT s.content, (SELECT name FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) as used_by
FROM strings s
JOIN xrefs x ON s.address = x.to_ea
WHERE s.content LIKE '%password%';

Function Complexity (by Block Count)

函数复杂度(按块计数)

sql
SELECT (SELECT name FROM funcs WHERE func_ea >= address AND func_ea < end_ea LIMIT 1) as name, COUNT(*) as block_count
FROM blocks
GROUP BY func_ea
ORDER BY block_count DESC
LIMIT 10;
sql
SELECT (SELECT name FROM funcs WHERE func_ea >= address AND func_ea < end_ea LIMIT 1) as name, COUNT(*) as block_count
FROM blocks
GROUP BY func_ea
ORDER BY block_count DESC
LIMIT 10;

Find Leaf Functions (No Outgoing Calls)

查找叶子函数(无对外调用)

sql
SELECT f.name, f.size
FROM funcs f
LEFT JOIN disasm_calls c ON c.func_addr = f.address
GROUP BY f.address
HAVING COUNT(c.ea) = 0
ORDER BY f.size DESC;
sql
SELECT f.name, f.size
FROM funcs f
LEFT JOIN disasm_calls c ON c.func_addr = f.address
GROUP BY f.address
HAVING COUNT(c.ea) = 0
ORDER BY f.size DESC;

Functions with Deep Call Chains

调用链较深的函数

sql
SELECT f.name, MAX(cc.depth) as max_depth
FROM disasm_v_call_chains cc
JOIN funcs f ON f.address = cc.root_func
GROUP BY cc.root_func
ORDER BY max_depth DESC
LIMIT 10;
For targeted traversal, prefer the
call_graph
table over
disasm_v_call_chains
:
sql
-- Map all functions in a call subtree
SELECT func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'down' AND max_depth = 5;
sql
SELECT f.name, MAX(cc.depth) as max_depth
FROM disasm_v_call_chains cc
JOIN funcs f ON f.address = cc.root_func
GROUP BY cc.root_func
ORDER BY max_depth DESC
LIMIT 10;
如需定向遍历,优先使用
call_graph
表而非
disasm_v_call_chains
sql
-- 映射调用子树中的所有函数
SELECT func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'down' AND max_depth = 5;

Trace Call Path to Target Function

追踪到目标函数的调用路径

sql
-- Trace call path to an internal helper
SELECT step, func_name FROM shortest_path
WHERE from_addr = (SELECT address FROM funcs WHERE name = 'main')
  AND to_addr = (SELECT address FROM funcs WHERE name = 'copy_user_input')
  AND max_depth = 20;
Use
call_graph
+
disasm_calls
+
imports
when the destination is an imported API.
shortest_path
endpoints must resolve to functions.
sql
-- 追踪到内部辅助函数的调用路径
SELECT step, func_name FROM shortest_path
WHERE from_addr = (SELECT address FROM funcs WHERE name = 'main')
  AND to_addr = (SELECT address FROM funcs WHERE name = 'copy_user_input')
  AND max_depth = 20;
当目标是导入API时,结合使用
call_graph
+
disasm_calls
+
imports
shortest_path
的端点必须解析为函数。

Find All Strings Reachable from a Function

查找函数可访问的所有字符串

sql
SELECT DISTINCT sr.string_value
FROM call_graph cg
JOIN string_refs sr ON sr.func_addr = cg.func_addr
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 3;
sql
SELECT DISTINCT sr.string_value
FROM call_graph cg
JOIN string_refs sr ON sr.func_addr = cg.func_addr
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 3;

Three-Way: Strings + Imports Reachable from a Function

三方分析:函数可访问的字符串+导入项

sql
-- Three-way: strings + imports reachable from a function
SELECT 'string' as kind, sr.string_value as detail, cg.func_name as via_func
FROM call_graph cg
JOIN string_refs sr ON sr.func_addr = cg.func_addr
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 5
  AND sr.string_value LIKE '%http%'
UNION ALL
SELECT 'import', i.name, cg.func_name
FROM call_graph cg
JOIN disasm_calls dc ON dc.func_addr = cg.func_addr
JOIN imports i ON dc.callee_addr = i.address
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 5
ORDER BY kind, detail;
sql
-- 三方分析:函数可访问的字符串+导入项
SELECT 'string' as kind, sr.string_value as detail, cg.func_name as via_func
FROM call_graph cg
JOIN string_refs sr ON sr.func_addr = cg.func_addr
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 5
  AND sr.string_value LIKE '%http%'
UNION ALL
SELECT 'import', i.name, cg.func_name
FROM call_graph cg
JOIN disasm_calls dc ON dc.func_addr = cg.func_addr
JOIN imports i ON dc.callee_addr = i.address
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 5
ORDER BY kind, detail;