
# Analyzing SPAA Files

SPAA (Stack Profile for Agentic Analysis) is an NDJSON file format for representing sampled performance stack traces. Each line is a self-contained JSON object with a `type` field. Before analyzing an SPAA file, read the full specification at references/SPEC.md.

## Quick Start

### 1. Examine the header

The first line is always the header. It tells you which profiler generated the data and how to interpret the metrics:

```bash
head -1 profile.spaa | jq .
```

Key header fields:
- `source_tool`: The profiler that generated this data (e.g., "perf", "dtrace")
- `frame_order`: Either `"leaf_to_root"` or `"root_to_leaf"`; determines how to read stack frames
- `events[].sampling.primary_metric`: The authoritative metric for weighting (e.g., "period" for perf, "samples" for DTrace)
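As a quick sanity check, all three fields can be pulled out in one jq call. The header record below is a hypothetical minimal example written for illustration; real headers carry more fields:

```shell
# Write a hypothetical one-line SPAA header (values are illustrative only)
cat > /tmp/header_demo.spaa <<'EOF'
{"type":"header","source_tool":"perf","frame_order":"leaf_to_root","events":[{"sampling":{"primary_metric":"period"}}]}
EOF

# Extract the three fields that drive the rest of the analysis
head -1 /tmp/header_demo.spaa | \
  jq -r '[.source_tool, .frame_order, .events[0].sampling.primary_metric] | @tsv'
```

This prints one tab-separated line (`perf`, `leaf_to_root`, `period`), which is handy to capture into shell variables before running the heavier queries below.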

### 2. Understand the record types

```bash
# Count records by type
grep -o '"type":"[^"]*"' profile.spaa | sort | uniq -c | sort -rn
```

Common types:
- `header` - File metadata (exactly one, always first)
- `dso` - Shared libraries and binaries
- `frame` - Individual stack frame definitions
- `thread` - Thread/process info
- `stack` - Aggregated call stacks with weights (the main data)
- `sample` - Individual sample events (optional, for temporal analysis)
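To see what the counting command produces, here is a run against a tiny hypothetical SPAA file (the record shapes are illustrative, not taken from a real profile):

```shell
# A hypothetical three-line SPAA file: one header, one frame, one stack
cat > /tmp/types_demo.spaa <<'EOF'
{"type":"header","source_tool":"perf"}
{"type":"frame","id":1,"func":"main"}
{"type":"stack","frames":[1],"weights":[{"metric":"period","value":10}]}
EOF

# Each distinct type appears once, so each line of output has count 1
grep -o '"type":"[^"]*"' /tmp/types_demo.spaa | sort | uniq -c | sort -rn
```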

### 3. Find performance hotspots

Extract the heaviest stacks by the primary metric:

```bash
# For perf data (uses the "period" metric)
grep '"type":"stack"' profile.spaa |
  jq -s 'sort_by(.weights[] | select(.metric=="period") | .value) | reverse | .[0:10]'

# For DTrace data (uses the "samples" metric)
grep '"type":"stack"' profile.spaa |
  jq -s 'sort_by(.weights[] | select(.metric=="samples") | .value) | reverse | .[0:10]'
```
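A minimal demonstration of the sort, using two hypothetical stack records (shapes are illustrative): the stack with the larger `period` weight should come out first.

```shell
# Two hypothetical stacks with different period weights
cat > /tmp/hot_demo.spaa <<'EOF'
{"type":"stack","frames":[1],"weights":[{"metric":"period","value":100}]}
{"type":"stack","frames":[2],"weights":[{"metric":"period","value":900}]}
EOF

# Sort ascending by period, reverse, take the top stack's leading frame ID
grep '"type":"stack"' /tmp/hot_demo.spaa | \
  jq -s 'sort_by(.weights[] | select(.metric=="period") | .value) | reverse | .[0].frames[0]'
# prints: 2
```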

### 4. Find hot functions (exclusive time)

Exclusive time shows where the CPU actually spent time, not just which functions were on the call path:

```bash
grep '"type":"stack"' profile.spaa | \
  jq -s '[.[] | select(.exclusive) | {frame: .exclusive.frame, value: (.exclusive.weights[] | select(.metric=="period") | .value)}] | group_by(.frame) | map({frame: .[0].frame, total: (map(.value) | add)}) | sort_by(-.total) | .[0:20]'
```

Then look up the frame IDs to get function names:

```bash
# Get frame details for a specific ID
grep '"type":"frame"' profile.spaa | jq 'select(.id == 101)'
```
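The group-and-sum step is the core of that pipeline. Here it runs on three hypothetical stacks (illustrative shapes) where frame 7 is the exclusive leaf twice, so its weights should be summed:

```shell
# Frame 7 appears as the exclusive (leaf) frame in two stacks: 40 + 60 = 100
cat > /tmp/excl_demo.spaa <<'EOF'
{"type":"stack","exclusive":{"frame":7,"weights":[{"metric":"period","value":40}]}}
{"type":"stack","exclusive":{"frame":7,"weights":[{"metric":"period","value":60}]}}
{"type":"stack","exclusive":{"frame":9,"weights":[{"metric":"period","value":30}]}}
EOF

# group_by(.frame) collects both frame-7 entries; add sums their values
grep '"type":"stack"' /tmp/excl_demo.spaa | \
  jq -s '[.[] | select(.exclusive) | {frame: .exclusive.frame, value: (.exclusive.weights[] | select(.metric=="period") | .value)}]
         | group_by(.frame) | map({frame: .[0].frame, total: (map(.value) | add)})
         | sort_by(-.total) | .[0]'
# prints: {"frame": 7, "total": 100}
```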

## Analyzing Memory Profiles

SPAA also supports heap/allocation profilers. Memory events use different metrics:

```bash
# Find top allocation sites by bytes allocated
grep '"type":"stack"' profile.spaa |
  jq -s '[.[] | select(any(.weights[]; .metric == "alloc_bytes"))] | sort_by(.weights[] | select(.metric=="alloc_bytes") | .value) | reverse | .[0:10]'

# Find potential memory leaks (high live_bytes)
grep '"type":"stack"' profile.spaa |
  jq -s '[.[] | select(any(.weights[]; .metric == "live_bytes"))] | sort_by(.weights[] | select(.metric=="live_bytes") | .value) | reverse | .[0:10]'
```

Key memory metrics:
- `alloc_bytes` / `alloc_count` - Total allocations
- `live_bytes` / `live_count` - Currently unreleased memory (potential leaks)
- `peak_bytes` - High-water mark
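On a small hypothetical memory profile (record shapes are illustrative), the leak query surfaces the site that still holds live memory, even though another site allocated more in total:

```shell
# Site [1] allocated 4 KiB but freed it all; site [2] still holds 1 KiB live
cat > /tmp/mem_demo.spaa <<'EOF'
{"type":"stack","frames":[1],"weights":[{"metric":"alloc_bytes","value":4096},{"metric":"live_bytes","value":0}]}
{"type":"stack","frames":[2],"weights":[{"metric":"alloc_bytes","value":1024},{"metric":"live_bytes","value":1024}]}
EOF

# Ranking by live_bytes makes frames [2] the leak suspect
grep '"type":"stack"' /tmp/mem_demo.spaa | \
  jq -c -s '[.[] | select(any(.weights[]; .metric == "live_bytes"))]
            | sort_by(.weights[] | select(.metric=="live_bytes") | .value) | reverse
            | .[0].frames'
# prints: [2]
```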

## Reconstructing Call Stacks

Stack records contain frame IDs. To see the actual function names:

```bash
# Extract a stack and resolve its frames
STACK_FRAMES=$(grep '"type":"stack"' profile.spaa | head -1 | jq -r '.frames | @csv')

# Build a frame lookup table, then query it
grep '"type":"frame"' profile.spaa | jq -s 'INDEX(.id)' > /tmp/frames.json
echo "$STACK_FRAMES" | tr ',' '\n' | while read -r fid; do
  jq -r --arg id "$fid" '.[$id] | "\(.func) (\(.srcline // "unknown"))"' /tmp/frames.json
done
```
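End to end on a hypothetical two-frame profile (illustrative record shapes), this resolves the stack's frame IDs to readable names, falling back to "unknown" when a frame has no `srcline`:

```shell
# Two hypothetical frames plus one stack referencing them (leaf first)
cat > /tmp/resolve_demo.spaa <<'EOF'
{"type":"frame","id":1,"func":"main","srcline":"main.c:10"}
{"type":"frame","id":2,"func":"work"}
{"type":"stack","frames":[2,1],"weights":[{"metric":"period","value":5}]}
EOF

STACK_FRAMES=$(grep '"type":"stack"' /tmp/resolve_demo.spaa | head -1 | jq -r '.frames | @csv')
grep '"type":"frame"' /tmp/resolve_demo.spaa | jq -s 'INDEX(.id)' > /tmp/resolve_frames.json
echo "$STACK_FRAMES" | tr ',' '\n' | while read -r fid; do
  jq -r --arg id "$fid" '.[$id] | "\(.func) (\(.srcline // "unknown"))"' /tmp/resolve_frames.json
done
# prints: work (unknown)
#         main (main.c:10)
```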

## Common Analysis Patterns

### Filter by thread/process

```bash
grep '"type":"stack"' profile.spaa | jq 'select(.context.tid == 4511)'
```

### Filter by event type

```bash
grep '"type":"stack"' profile.spaa | jq 'select(.context.event == "cycles")'
```
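Both context filters work the same way; here they run on two hypothetical stacks (illustrative shapes), keeping only the one whose `context` matches:

```shell
# Hypothetical stacks tagged with thread and event context
cat > /tmp/ctx_demo.spaa <<'EOF'
{"type":"stack","context":{"tid":4511,"event":"cycles"},"frames":[1]}
{"type":"stack","context":{"tid":4512,"event":"cache-misses"},"frames":[2]}
EOF

# Only the stack from tid 4511 survives the filter
grep '"type":"stack"' /tmp/ctx_demo.spaa | jq -c 'select(.context.tid == 4511) | .frames'
# prints: [1]
```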

### Find kernel vs userspace time

```bash
# Kernel stacks
grep '"type":"stack"' profile.spaa | jq 'select(.stack_type == "kernel")'

# Or check frame kinds
grep '"type":"frame"' profile.spaa | jq 'select(.kind == "kernel")' | head -20
```

### Temporal analysis (if sample records exist)

```bash
# Check if raw samples are included
grep -c '"type":"sample"' profile.spaa

# Plot sample distribution over time
grep '"type":"sample"' profile.spaa | jq -s 'group_by(.timestamp | floor) | map({time: .[0].timestamp | floor, count: length})'
```
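The bucketing query floors each timestamp to a one-second bin. On three hypothetical sample records (illustrative shapes), two fall in second 0 and one in second 1:

```shell
# Hypothetical raw samples spanning two one-second buckets
cat > /tmp/time_demo.spaa <<'EOF'
{"type":"sample","timestamp":0.1}
{"type":"sample","timestamp":0.9}
{"type":"sample","timestamp":1.5}
EOF

grep '"type":"sample"' /tmp/time_demo.spaa | \
  jq -c -s 'group_by(.timestamp | floor) | map({time: .[0].timestamp | floor, count: length})'
# prints: [{"time":0,"count":2},{"time":1,"count":1}]
```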

## Tips for Performance Analysis

1. Start with the header - Understand the profiler, sampling mode, and time range
2. Check the primary metric - Use `period` for perf, `samples` for DTrace
3. Look at exclusive time first - This shows actual hotspots, not just callers
4. Cross-reference frame IDs - Build a lookup table for readable output
5. Filter by context - Narrow down by thread, CPU, or event type
6. For memory issues - Focus on `live_bytes` to find leaks, `alloc_bytes` for churn