Trigger Intents
Use this skill when user asks to:
- search strings/bytes/patterns quickly
- map string evidence to code usage
- investigate raw data-level indicators (IOCs, constants, signatures)
Route to:
- for relationship expansion from matched addresses
- for triage synthesis from data signals
- when findings should drive patch/breakpoint actions
Do This First (Warm-Start Sequence)
sql
-- 1) Validate string availability
SELECT COUNT(*) AS strings FROM strings;
-- 2) Sample high-value strings
SELECT content, printf('0x%X', address) AS addr
FROM strings
WHERE length >= 8
ORDER BY length DESC
LIMIT 40;
-- 3) If expected strings are missing, rebuild once
SELECT rebuild_strings();
Interpretation guidance:
- Strings are often quickest behavioral clues; pivot to immediately for execution context.
- For opcode/pattern hunts, prefer the table over full instruction scans.
Failure and Recovery
- No strings or unexpectedly low count:
- Too many false positives:
- Increase specificity (, regex-like pattern narrowing, module/function join filters).
- Byte pattern search too broad:
- Restrict by range or join matched byte addresses to .
- Need named functions, labels, structs, or members instead of string contents:
Handoff Patterns
- -> to convert data hits into code paths/functions.
- -> for risk scoring and campaign-level insight.
- -> for actionable patch/watchpoint setup.
strings
String literals found in the binary. IDA maintains a cached string list that can be configured.
| Column | Type | Description |
|---|
| INT | String address |
| INT | String length |
| INT | String type (raw encoding bits) |
| TEXT | Type name: ascii, utf16, utf32 |
| INT | Char width (0=1-byte, 1=2-byte, 2=4-byte) |
| TEXT | Width name: 1-byte, 2-byte, 4-byte |
| INT | String layout (0=null-terminated, 1-3=pascal) |
| TEXT | Layout name: termchr, pascal1, pascal2, pascal4 |
| INT | Encoding index (0=default) |
| TEXT | String content (the actual text — not or ) |
String Type Encoding:
IDA stores string type as a 32-bit value:
- Bits 0-1: Width (0=1B/ASCII, 1=2B/UTF-16, 2=4B/UTF-32)
- Bits 2-7: Layout (0=TERMCHR, 1=PASCAL1, 2=PASCAL2, 3=PASCAL4)
- Bits 8-15: term1 (first termination character)
- Bits 16-23: term2 (second termination character)
- Bits 24-31: encoding index
sql
-- Find error messages
SELECT content, printf('0x%X', address) as addr FROM strings WHERE content LIKE '%error%';
-- ASCII strings only
SELECT * FROM strings WHERE type_name = 'ascii';
-- UTF-16 strings (common in Windows)
SELECT * FROM strings WHERE type_name = 'utf16';
-- Count strings by type
SELECT type_name, layout_name, COUNT(*) as count
FROM strings GROUP BY type_name, layout_name ORDER BY count DESC;
Important: For new analysis (exe/dll), strings are auto-built. For existing databases (i64/idb), strings are already saved. If you see 0 strings unexpectedly, run
once to rebuild the list. See String List Surfaces section below.
String References (explicit join pattern)
Use
directly. This is the canonical pattern.
sql
-- Find call sites/functions referencing error-like strings
SELECT
s.content as string_value,
printf('0x%X', x.from_ea) as ref_addr,
(SELECT name FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) as func_name
FROM strings s
JOIN xrefs x ON x.to_ea = s.address
WHERE s.content LIKE '%error%' OR s.content LIKE '%fail%'
ORDER BY func_name, ref_addr;
-- Functions with most string references
SELECT
(SELECT name FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) as func_name,
COUNT(*) as string_refs
FROM strings s
JOIN xrefs x ON x.to_ea = s.address
GROUP BY func_name
ORDER BY string_refs DESC
LIMIT 10;
SQL Functions — Byte Access (Read-Only)
| Function | Description |
|---|
| Read raw bytes as hex string |
| Read bytes as BLOB |
For row-shaped byte workflows,
is a pure mapped-byte table:
sql
SELECT ea, value
FROM bytes
WHERE ea >= 0x401000 AND ea < 0x401010
ORDER BY ea;
Use
when you need IDA item size/type metadata.
Binary Search Table
Use
for raw bytes/opcodes. It requires
;
is an output column, not the search input.
| Column | Type | Description |
|---|
| INT | Match address |
| TEXT | Matched bytes rendered as hex text |
| BLOB | Matched bytes as raw bytes |
| INT | Match size in bytes |
| HIDDEN TEXT | Required IDA byte pattern input |
| HIDDEN INT | Optional inclusive lower bound |
| HIDDEN INT | Optional exclusive upper bound |
| HIDDEN INT | Optional generator cap |
Pattern syntax (IDA native):
- - Exact bytes (hex, space-separated)
- or - = any byte wildcard (whole byte only)
- - Alternatives (match any of these bytes)
Note: Nibble wildcards and regex are not supported in byte patterns.
Example:
sql
-- Find all matches for a pattern
SELECT address, matched_hex, size
FROM byte_search
WHERE pattern = '48 8B ? 00'
LIMIT 10;
-- First match only
SELECT printf('0x%llX', address) AS addr
FROM byte_search
WHERE pattern = 'CC CC CC'
ORDER BY address
LIMIT 1;
-- Search with alternatives
SELECT address, matched_hex
FROM byte_search
WHERE pattern = 'E8 (01 02 03 04)'
LIMIT 20;
Optimization Pattern: Find functions using specific instruction
To answer "How many functions use RDTSC instruction?" efficiently:
sql
-- Count unique functions containing RDTSC (opcode: 0F 31)
SELECT COUNT(DISTINCT f.address) as count
FROM byte_search b
JOIN funcs f ON b.address >= f.address AND b.address < f.end_ea
WHERE b.pattern = '0F 31';
-- List those functions with names
SELECT DISTINCT
f.address as func_ea,
f.name as func_name
FROM byte_search b
JOIN funcs f ON b.address >= f.address AND b.address < f.end_ea
WHERE b.pattern = '0F 31';
This is much faster than scanning all disassembly lines because:
- uses IDA's native binary search
- the containment join uses the compact table instead of scanning every instruction
Choose the Right Search Surface
- Use for named entities such as functions, labels, structs, enums, and members.
- Use when the user is searching literal string contents inside the binary.
- Use when the target is a raw byte or opcode pattern.
SQL Surfaces — String List
IDA maintains a cached list of strings. Use
to detect and cache strings,
for the current count, and
for row-level analysis.
| Surface | Description |
|---|
| Rebuild with ASCII + UTF-16, minlen 5 (default) |
| Rebuild with custom minimum length |
rebuild_strings(minlen, types)
| Rebuild with custom length and type mask |
SELECT COUNT(*) FROM strings
| Current string-list count (optimized without row materialization) |
Type mask values:
- = ASCII only (STRTYPE_C)
- = UTF-16 only (STRTYPE_C_16)
- = UTF-32 only (STRTYPE_C_32)
- = ASCII + UTF-16 (default)
- = All types
sql
-- Check current string count
SELECT COUNT(*) AS strings FROM strings;
-- Rebuild with defaults (ASCII + UTF-16, minlen 5)
SELECT rebuild_strings();
-- Rebuild with shorter minimum length
SELECT rebuild_strings(4);
-- Rebuild with specific types
SELECT rebuild_strings(5, 1); -- ASCII only
SELECT rebuild_strings(5, 7); -- All types (ASCII + UTF-16 + UTF-32)
-- Typical workflow: rebuild then query
SELECT rebuild_strings();
SELECT * FROM strings WHERE content LIKE '%error%';
IMPORTANT - Agent Behavior for String Queries:
When the user asks about strings (e.g., "show me the strings", "what strings are in this binary"):
- First run to ensure strings are detected
- Then query the table
The
function configures IDA's string detection with sensible defaults (ASCII + UTF-16, minimum length 5) and rebuilds the string list. This ensures the user gets results even if the database had no prior string analysis.
Performance Rules
| Table/Function | Architecture | Notes |
|---|
| Cached table count path | O(1) current string-list count |
| Cached | Rebuilt on demand via ; fast once cached |
| Native binary search table | Much faster than iterating instructions table |
| Direct read | O(1) per address, no table overhead |
Key rules:
- Always call before the first string query on a new database or after making code/data changes that may create new strings.
- uses IDA's native binary search engine; for "find functions containing opcode X", join matches to by containment instead of scanning the table.
- table is a snapshot of the cached string list. If IDA's analysis creates new strings after your initial query, call again.
- For cross-referencing strings with functions, the JOIN pattern is canonical — IDA's xref index makes the JOIN fast.
Advanced Data Patterns (CTEs)
Security-relevant string triage
Categorize strings by security relevance for rapid threat assessment:
sql
-- Categorize strings by security relevance
SELECT rebuild_strings();
WITH categorized AS (
SELECT address, content,
CASE
WHEN content LIKE '%password%' OR content LIKE '%passwd%' OR content LIKE '%secret%'
THEN 'credential'
WHEN content LIKE '%http://%' OR content LIKE '%https://%' OR content LIKE '%ftp://%'
THEN 'url'
WHEN content LIKE '%error%' OR content LIKE '%fail%' OR content LIKE '%exception%'
THEN 'error'
WHEN content LIKE '%debug%' OR content LIKE '%trace%' OR content LIKE '%assert%'
THEN 'debug'
WHEN content LIKE '%.exe%' OR content LIKE '%.dll%' OR content LIKE '%.sys%'
THEN 'file_path'
WHEN content LIKE '%HKEY_%' OR content LIKE '%SOFTWARE\\%'
THEN 'registry'
ELSE 'other'
END AS category
FROM strings
WHERE length >= 5
)
SELECT category, COUNT(*) AS count,
GROUP_CONCAT(SUBSTR(content, 1, 60), ' | ') AS samples
FROM categorized
WHERE category != 'other'
GROUP BY category
ORDER BY count DESC;
Combine string references with function size for suspicion scoring
Functions referencing security-relevant strings AND having significant size are high-priority targets:
sql
-- Suspicious functions: reference interesting strings AND are non-trivial
WITH interesting_strings AS (
SELECT address, content FROM strings
WHERE content LIKE '%password%' OR content LIKE '%encrypt%'
OR content LIKE '%decrypt%' OR content LIKE '%http%'
OR content LIKE '%socket%' OR content LIKE '%connect%'
),
string_funcs AS (
SELECT DISTINCT (SELECT address FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) AS func_addr,
s.content AS matched_string
FROM interesting_strings s
JOIN xrefs x ON x.to_ea = s.address
WHERE (SELECT address FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) IS NOT NULL
)
SELECT (SELECT name FROM funcs WHERE sf.func_addr >= address AND sf.func_addr < end_ea LIMIT 1) AS func_name,
printf('0x%X', sf.func_addr) AS addr,
f.size AS func_size,
GROUP_CONCAT(sf.matched_string, ' | ') AS strings_referenced
FROM string_funcs sf
JOIN funcs f ON f.address = sf.func_addr
GROUP BY sf.func_addr
ORDER BY f.size DESC
LIMIT 20;
Find functions referencing both crypto-related and network-related strings
Cross-category correlation for identifying data exfiltration or C2 communication:
sql
-- Functions touching both crypto and network strings
WITH crypto_refs AS (
SELECT DISTINCT (SELECT address FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) AS func_addr
FROM strings s JOIN xrefs x ON x.to_ea = s.address
WHERE s.content LIKE '%crypt%' OR s.content LIKE '%aes%'
OR s.content LIKE '%cipher%' OR s.content LIKE '%hash%'
),
network_refs AS (
SELECT DISTINCT (SELECT address FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) AS func_addr
FROM strings s JOIN xrefs x ON x.to_ea = s.address
WHERE s.content LIKE '%socket%' OR s.content LIKE '%connect%'
OR s.content LIKE '%send%' OR s.content LIKE '%recv%'
OR s.content LIKE '%http%'
)
SELECT (SELECT name FROM funcs WHERE c.func_addr >= address AND c.func_addr < end_ea LIMIT 1) AS func_name,
printf('0x%X', c.func_addr) AS addr
FROM crypto_refs c
JOIN network_refs n ON n.func_addr = c.func_addr;