Trigger Intents
Use this skill when user asks:
- "Who calls this?" / "What does this call?"
- "Where is this string/import referenced?"
- "Show call graph dependencies."
Route to:
- for candidate entity lookup by name/pattern before relationship analysis
- for broader triage context
- for semantic interpretation after graph narrowing
- for instruction-level call-site proof
Do This First (Warm-Start Sequence)
sql
-- 1) Core relation volume
SELECT COUNT(*) AS xref_count FROM xrefs;
-- 2) Top imports (dependency hints)
SELECT module, COUNT(*) AS import_count
FROM imports
GROUP BY module
ORDER BY import_count DESC;
-- 3) Most called functions
SELECT printf('0x%X', to_ea) AS callee, COUNT(*) AS callers
FROM xrefs
WHERE is_code = 1
GROUP BY to_ea
ORDER BY callers DESC
LIMIT 20;
Interpretation guidance:
- Use relation counts to prioritize hotspots before expensive deep analysis.
- Prefer indexed filters (/) for fast response.
Failure and Recovery
- Full-scan query too slow:
- Target unresolved by name:
- Resolve/verify address first through the table or explicit EA literals.
- Sparse results:
- Pivot through , , or joins.
Handoff Patterns
- -> for top candidate function semantics.
- -> for campaign-level synthesis.
- -> to persist relationship findings.
xrefs
Cross-references - the most important table for understanding code relationships.
| Column | Type | Description |
|---|
| INT | Source address (who references) |
| INT | Target address (what is referenced) |
| INT | Xref type code |
| INT | 1=code xref (call/jump), 0=data xref |
| INT | Pre-computed containing function address (0 when not in a function) |
sql
-- Who calls function at 0x401000?
SELECT printf('0x%X', from_ea) as caller FROM xrefs WHERE to_ea = 0x401000 AND is_code = 1;
-- What does function at 0x401000 reference?
SELECT printf('0x%X', to_ea) as target FROM xrefs WHERE from_ea >= 0x401000 AND from_ea < 0x401100;
imports
Imported functions from external libraries.
| Column | Type | Description |
|---|
| INT | Import address (IAT entry) |
| TEXT | Import name |
| TEXT | Module/DLL name |
| INT | Import ordinal |
sql
-- Imports from kernel32.dll
SELECT name FROM imports WHERE module LIKE '%kernel32%';
Convenience Views
callers
Who calls each function. Caller names are resolved by the view from the
and
tables.
| Column | Type | Description |
|---|
| INT | Target function address |
| INT | Xref source address |
| TEXT | Calling function name |
| INT | Calling function start (from ) |
Underlying query:
sql
SELECT x.to_ea as func_addr, x.from_ea as caller_addr,
COALESCE((SELECT name FROM names WHERE address = x.from_func LIMIT 1), printf('sub_%X', x.from_func)) as caller_name,
x.from_func as caller_func_addr
FROM xrefs x WHERE x.is_code = 1 AND x.from_func != 0
sql
-- Who calls function at 0x401000?
SELECT caller_name, printf('0x%X', caller_addr) as from_addr
FROM callers WHERE func_addr = 0x401000;
-- Most called functions
SELECT printf('0x%X', func_addr) as addr, COUNT(*) as callers
FROM callers GROUP BY func_addr ORDER BY callers DESC LIMIT 10;
callees
What each function calls. Inverse of callers view. Uses
for efficient function-level grouping.
| Column | Type | Description |
|---|
| INT | Calling function address (from ) |
| TEXT | Calling function name |
| INT | Called address |
| TEXT | Called function/symbol name |
sql
-- What does main call?
SELECT callee_name, printf('0x%X', callee_addr) as addr
FROM callees WHERE func_name LIKE '%main%';
-- Functions making most calls
SELECT func_name, COUNT(*) as call_count
FROM callees GROUP BY func_addr ORDER BY call_count DESC LIMIT 10;
string_refs
Pre-joined view of string cross-references with containing function info.
| Column | Type | Description |
|---|
| INT | Address of the string |
| TEXT | String content |
| INT | String length |
| INT | Address of the referencing instruction |
| INT | Containing function address |
| TEXT | Containing function name |
sql
-- Strings referenced by a specific function
SELECT string_value, func_name FROM string_refs WHERE func_addr = 0x401000;
-- Find functions referencing password-related strings
SELECT string_value, func_name FROM string_refs WHERE string_value LIKE '%password%';
-- Most referenced strings
SELECT string_value, COUNT(*) as ref_count
FROM string_refs GROUP BY string_addr ORDER BY ref_count DESC LIMIT 10;
data_refs
Cached table of data (non-code) cross-references with containing function info.
Whole-program aggregates are supported.
| Column | Type | Description |
|---|
| INT | Source address of the reference |
| INT | Target data address |
| INT | Containing function address |
| TEXT | Containing function name |
| INT | Xref type code |
sql
-- Data references from a specific function
SELECT * FROM data_refs WHERE from_func_addr = 0x401000;
-- Functions with most data references
SELECT from_func_name, COUNT(*) as data_ref_count
FROM data_refs GROUP BY from_func_addr ORDER BY data_ref_count DESC LIMIT 10;
call_graph
Table-valued function for BFS call graph traversal. Uses HIDDEN parameters for traversal control.
| Column | Type | Description |
|---|
| INT | Function address in the graph |
| TEXT | Function name |
| INT | BFS depth from start |
| INT | Parent function address in the traversal |
| INT | HIDDEN — Starting function address |
| TEXT | HIDDEN — (callees), (callers), or |
| INT | HIDDEN — Maximum traversal depth |
Always provide WHERE constraints for all 3 hidden parameters.
sql
-- Forward call tree from main
SELECT func_name, depth FROM call_graph
WHERE start = (SELECT address FROM funcs WHERE name = 'main')
AND direction = 'down' AND max_depth = 5;
-- All transitive callers
SELECT func_name, depth FROM call_graph
WHERE start = 0x405000 AND direction = 'up' AND max_depth = 10;
-- Bidirectional exploration
SELECT func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'both' AND max_depth = 3;
-- Join with string_refs to find strings reachable from a function
SELECT DISTINCT sr.string_value, sr.func_name
FROM call_graph cg
JOIN string_refs sr ON sr.func_addr = cg.func_addr
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 3;
-- Imported APIs reachable from a function's call tree
SELECT DISTINCT i.module, i.name as api
FROM call_graph cg
JOIN disasm_calls dc ON dc.func_addr = cg.func_addr
JOIN imports i ON dc.callee_addr = i.address
WHERE cg.start = 0x401000 AND cg.direction = 'down' AND cg.max_depth = 5
ORDER BY i.module, i.name;
Performance: BFS with visited set. O(reachable functions). Always constrain hidden params.
Use this pattern when the destination is an import.
shortest_path
Table-valued function for finding the shortest call path between two functions. Uses bidirectional BFS.
| Column | Type | Description |
|---|
| INT | Step number in the path (0 = source) |
| INT | Function address at this step |
| TEXT | Function name at this step |
| INT | HIDDEN — Source function address |
| INT | HIDDEN — Destination function address |
| INT | HIDDEN — Maximum search depth |
Always provide WHERE constraints for all 3 hidden parameters.
Both endpoints must resolve to functions. Imported API addresses from
are not valid
endpoints.
sql
-- Find shortest call path between two functions
SELECT step, func_name FROM shortest_path
WHERE from_addr = (SELECT address FROM funcs WHERE name = 'main')
AND to_addr = (SELECT address FROM funcs WHERE name = 'target_func')
AND max_depth = 20;
-- Check reachability between two functions
SELECT COUNT(*) > 0 as reachable FROM shortest_path
WHERE from_addr = 0x401000 AND to_addr = 0x405000 AND max_depth = 20;
-- Annotate path steps with call count and string refs
SELECT sp.step, sp.func_name,
(SELECT COUNT(*) FROM disasm_calls dc WHERE dc.func_addr = sp.func_addr) as calls_made,
(SELECT COUNT(*) FROM string_refs sr WHERE sr.func_addr = sp.func_addr) as strings_used
FROM shortest_path sp
WHERE sp.from_addr = 0x401000 AND sp.to_addr = 0x405000 AND sp.max_depth = 20
ORDER BY sp.step;
Performance: Bidirectional BFS. O(b^(d/2)) where b is branching factor and d is path length. Returns empty result set if no path exists within max_depth.
Table-First Cross-Reference Queries
sql
-- Incoming references to an address
SELECT from_ea, to_ea, type, is_code, from_func
FROM xrefs
WHERE to_ea = 0x401000;
-- Exact outgoing references from an item address
SELECT from_ea, to_ea, type, is_code, from_func
FROM xrefs
WHERE from_ea = 0x401000;
-- Outgoing references from anywhere inside a function
SELECT from_ea, to_ea, type, is_code, from_func
FROM xrefs
WHERE from_func = 0x401000;
grep
Use
to resolve internal symbols, types, and members before doing relationship analysis. Use
when the callee may exist only as an imported API.
Canonical usage lives in
.
For canonical schema and owner mapping, see
../connect/references/schema-catalog.md
(
).
sql
-- Resolve internal functions with grep
SELECT name, kind, address
FROM grep
WHERE pattern = 'main%' AND kind = 'function'
ORDER BY name;
-- Resolve imported APIs with imports
SELECT module, name, address
FROM imports
WHERE name LIKE 'CreateFile%'
ORDER BY module, name;
-- Then pivot into callers/callees/xrefs
SELECT caller_name, printf('0x%X', caller_addr) AS from_addr
FROM callers
WHERE func_addr = (
SELECT address
FROM imports
WHERE name = 'CreateFileW'
ORDER BY name
LIMIT 1
);
Performance Rules
Constraint Pushdown
The
table has
optimized filters using efficient IDA SDK APIs:
| Filter | Cost | Behavior |
|---|
| 0.5 | O(xrefs to X) — fast, uses IDA's xref index |
| 0.5 | O(xrefs from X) — fast, uses IDA's xref index |
| 1.0 | O(callees of X) — uses XrefsFromFuncIterator |
| No equality filter on / / | — | Falls back to a cache of xrefs to function entry points only — avoid relying on it for complete import/data/non-function coverage |
Always filter xrefs by , , or . Avoid unconstrained scans.
sql
-- FAST: xrefs to a specific target
SELECT * FROM xrefs WHERE to_ea = 0x401000;
-- FAST: xrefs from a specific source
SELECT * FROM xrefs WHERE from_ea = 0x401000;
-- FAST: all xrefs originating from a function
SELECT * FROM xrefs WHERE from_func = 0x401000;
-- INCOMPLETE/avoid: unconstrained scan falls back to a function-entry cache
SELECT * FROM xrefs WHERE is_code = 1;
Common Xref Patterns
Find Most Called Functions
sql
SELECT f.name, COUNT(*) as caller_count
FROM funcs f
JOIN xrefs x ON f.address = x.to_ea
WHERE x.is_code = 1
GROUP BY f.address
ORDER BY caller_count DESC
LIMIT 10;
Find Functions Calling a Specific API
sql
SELECT DISTINCT (SELECT name FROM funcs WHERE from_ea >= address AND from_ea < end_ea LIMIT 1) as caller
FROM xrefs
WHERE to_ea = (SELECT address FROM imports WHERE name = 'CreateFileW');
String Cross-Reference Analysis
sql
SELECT s.content, (SELECT name FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) as used_by
FROM strings s
JOIN xrefs x ON s.address = x.to_ea
WHERE s.content LIKE '%password%';
Import Dependency Map
sql
-- Which modules does each function depend on?
SELECT f.name as func_name, i.module, COUNT(*) as api_count
FROM funcs f
JOIN disasm_calls dc ON dc.func_addr = f.address
JOIN imports i ON dc.callee_addr = i.address
GROUP BY f.address, i.module
ORDER BY f.name, api_count DESC;
Data Section References
sql
-- Functions referencing data sections
SELECT
f.name,
s.name as segment,
COUNT(*) as data_refs
FROM funcs f
JOIN xrefs x ON x.from_ea BETWEEN f.address AND f.end_ea
JOIN segments s ON x.to_ea BETWEEN s.start_ea AND s.end_ea
WHERE s.class = 'DATA' AND x.is_code = 0
GROUP BY f.address, s.name
ORDER BY data_refs DESC
LIMIT 20;
Advanced Xref Patterns (CTEs and Recursive Queries)
Prefer over manual CTEs. The
table uses C++ BFS with a visited set — correct BFS depths, no duplicate expansion on diamond-shaped call graphs, and early termination.
Before (recursive CTE — no visited tracking, exponential on diamond graphs):
sql
WITH RECURSIVE call_chain(root, current_func, depth) AS (
SELECT 0x401000, callee_addr, 1
FROM disasm_calls WHERE func_addr = 0x401000
UNION ALL
SELECT cc.root, dc.callee_addr, cc.depth + 1
FROM call_chain cc
JOIN disasm_calls dc ON dc.func_addr = cc.current_func
WHERE cc.depth < 10
)
SELECT DISTINCT current_func, MIN(depth) FROM call_chain GROUP BY current_func;
After (call_graph table — C++ BFS with visited set, correct depths):
sql
SELECT func_addr, func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'down' AND max_depth = 10;
Recursive Call Graph — Forward Traversal
Note: For targeted traversal, prefer the
table which uses C++ BFS with visited tracking. Use recursive CTEs only when you need custom join logic (e.g., filtering by callee properties at each step).
Find all functions reachable from a starting function (up to depth 5):
sql
-- Preferred: use call_graph table
SELECT func_name, depth FROM call_graph
WHERE start = (SELECT address FROM funcs WHERE name = 'main')
AND direction = 'down' AND max_depth = 5;
-- Manual CTE (when custom filtering is needed at each step):
WITH RECURSIVE cg AS (
SELECT address as func_addr, name, 0 as depth
FROM funcs WHERE name = 'main'
UNION ALL
SELECT f.address, f.name, cg.depth + 1
FROM cg
JOIN disasm_calls dc ON dc.func_addr = cg.func_addr
JOIN funcs f ON f.address = dc.callee_addr
WHERE cg.depth < 5
AND dc.callee_addr != 0
)
SELECT DISTINCT func_addr, name, MIN(depth) as min_depth
FROM cg
GROUP BY func_addr
ORDER BY min_depth, name;
Recursive Call Graph — Reverse (Who Calls This?)
Note: For targeted traversal, prefer the
table with
. Use recursive CTEs only when you need custom join logic.
Trace callers transitively up to depth 5:
sql
-- Preferred: use call_graph table
SELECT func_name, depth FROM call_graph
WHERE start = 0x401000 AND direction = 'up' AND max_depth = 5;
-- Manual CTE (when custom filtering is needed at each step):
WITH RECURSIVE callers_cte AS (
SELECT DISTINCT dc.func_addr, 1 as depth
FROM disasm_calls dc
WHERE dc.callee_addr = 0x401000
UNION ALL
SELECT DISTINCT dc.func_addr, c.depth + 1
FROM callers_cte c
JOIN disasm_calls dc ON dc.callee_addr = c.func_addr
WHERE c.depth < 5
)
SELECT (SELECT name FROM funcs WHERE func_addr >= address AND func_addr < end_ea LIMIT 1) as caller, MIN(depth) as distance
FROM callers_cte
GROUP BY func_addr
ORDER BY distance, caller;
CTE: Functions That Both Call malloc AND Check NULL
sql
WITH malloc_callers AS (
SELECT DISTINCT func_addr
FROM disasm_calls
WHERE callee_name LIKE '%malloc%'
),
null_checkers AS (
SELECT DISTINCT func_addr
FROM ctree_v_comparisons
WHERE rhs_num = 0 AND op_name = 'cot_eq'
)
SELECT f.name
FROM funcs f
JOIN malloc_callers m ON f.address = m.func_addr
JOIN null_checkers n ON f.address = n.func_addr;
CTE: Memory Allocation Without Free (Potential Leaks)
sql
WITH allocators AS (
SELECT func_addr, COUNT(*) as alloc_count
FROM disasm_calls
WHERE callee_name LIKE '%alloc%' OR callee_name LIKE '%malloc%'
GROUP BY func_addr
),
freers AS (
SELECT func_addr, COUNT(*) as free_count
FROM disasm_calls
WHERE callee_name LIKE '%free%'
GROUP BY func_addr
)
SELECT f.name,
COALESCE(a.alloc_count, 0) as allocations,
COALESCE(r.free_count, 0) as frees
FROM funcs f
LEFT JOIN allocators a ON f.address = a.func_addr
LEFT JOIN freers r ON f.address = r.func_addr
WHERE a.alloc_count > 0 AND COALESCE(r.free_count, 0) = 0
ORDER BY allocations DESC;
EXISTS: Functions With at Least One String Reference
More efficient than JOIN + DISTINCT for existence checks:
sql
SELECT f.name
FROM funcs f
WHERE EXISTS (
SELECT 1 FROM xrefs x
JOIN strings s ON x.to_ea = s.address
WHERE x.from_ea BETWEEN f.address AND f.end_ea
);
EXISTS: Leaf Functions (No Outgoing Calls)
sql
SELECT f.name, f.size
FROM funcs f
WHERE NOT EXISTS (
SELECT 1 FROM disasm_calls dc
WHERE dc.func_addr = f.address
)
ORDER BY f.size DESC;