# Exploring LLM traces with MCP tools


PostHog captures LLM/AI agent activity as traces. Each trace is a tree of events representing a single AI interaction — from the top-level agent invocation down to individual LLM API calls.

## Available tools


| Tool | Purpose |
| --- | --- |
| `posthog:query-llm-traces-list` | Search and list traces (compact — no large content) |
| `posthog:query-llm-trace` | Get a single trace by ID with full event tree |
| `posthog:execute-sql` | Ad-hoc SQL for complex trace analysis |

## Event hierarchy


See the event reference for the full schema.

```text
$ai_trace (top-level container)
  └── $ai_span (logical groupings, e.g. "RAG retrieval", "tool execution")
        ├── $ai_generation (individual LLM API call)
        └── $ai_embedding (embedding creation)
```

Events are linked via `$ai_parent_id` → the parent's `$ai_span_id` or `$ai_trace_id`.
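This linking scheme can be sketched as a small helper that groups events by parent. This is an illustrative sketch, not the tool's output format: the `id`, `event`, and `properties` keys mirror the trace JSON structure described later, and falling back to `$ai_trace_id` for top-level events is an assumption:

```python
from collections import defaultdict

def build_tree(events):
    """Group events by parent ID so a trace can be walked top-down."""
    children = defaultdict(list)
    for ev in events:
        props = ev.get("properties", {})
        # Top-level events have no $ai_parent_id; hang them off the trace ID.
        parent = props.get("$ai_parent_id") or props.get("$ai_trace_id")
        children[parent].append(ev)
    return children

def print_tree(children, node_id, depth=0):
    """Indent children under their parent, like the hierarchy diagram."""
    for ev in children.get(node_id, []):
        name = ev.get("properties", {}).get("$ai_span_name", "")
        print("  " * depth + f"{ev['event']} {name}".rstrip())
        print_tree(children, ev["id"], depth + 1)
```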

## Workflow: debug a trace from a URL


### Step 1 — Fetch the trace


`posthog:query-llm-trace`

```json
{
  "traceId": "<trace_id>",
  "dateRange": {"date_from": "-7d"}
}
```

The result contains the full event tree with all properties. The response may be large — when it exceeds the inline limit, Claude Code auto-persists it to a file.

From the result you get:

- Every event with its type (`$ai_span`, `$ai_generation`, etc.)
- Span names (`$ai_span_name`) — these are the tool/step names
- Latency, error flags, models used
- Parent-child relationships via `$ai_parent_id`
- `_posthogUrl` — always include this in your response so the user can click through to the UI
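Those fields can be pulled into a compact overview with a few lines of Python. A minimal sketch, assuming events are dicts shaped like the trace JSON structure described later in this document:

```python
def summarize_events(events):
    """Collect type, span name, latency, and error flag for each event."""
    rows = []
    for ev in events:
        props = ev.get("properties", {})
        rows.append({
            "type": ev.get("event"),
            "name": props.get("$ai_span_name"),
            "latency": props.get("$ai_latency"),
            "error": bool(props.get("$ai_is_error")),
        })
    return rows
```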

### Step 2 — Parse large results with scripts


When the result is persisted to a file (large traces with full `$ai_input`/`$ai_output_choices`), use the parsing scripts to explore it.

Start with the summary to get the full picture, then drill into specifics:

1. Overview: metadata, tool calls, final output, errors

```bash
python3 scripts/print_summary.py /path/to/persisted-file.json
```

2. Timeline: chronological event list with truncated I/O

```bash
python3 scripts/print_timeline.py /path/to/persisted-file.json
```

3. Drill into a specific span's full input/output

```bash
SPAN="tool_name" python3 scripts/extract_span.py /path/to/persisted-file.json
```

4. Full conversation with thinking blocks and tool calls

```bash
python3 scripts/extract_conversation.py /path/to/persisted-file.json
```

5. Search for a keyword across all properties

```bash
SEARCH="keyword" python3 scripts/search_traces.py /path/to/persisted-file.json
```

All scripts support the `MAX_LEN=N` env var to control truncation (0 = unlimited).

## Investigation patterns


### "Did the agent use the tool correctly?"


1. Find the `$ai_span` for the tool call (look at `$ai_span_name`)
2. Check `$ai_input_state` — what arguments were passed to the tool?
3. Check `$ai_output_state` — what did the tool return?
4. Check `$ai_is_error` — did the tool call fail?
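These checks can be automated against the persisted trace JSON. A minimal sketch; the span name you pass in is whatever `$ai_span_name` you found in step 1, and the dict shapes follow the trace JSON structure described later:

```python
def check_tool_call(events, span_name):
    """Return the first matching span's tool input, output, and error flag."""
    for ev in events:
        props = ev.get("properties", {})
        if ev.get("event") == "$ai_span" and props.get("$ai_span_name") == span_name:
            return {
                "input": props.get("$ai_input_state"),    # arguments passed to the tool
                "output": props.get("$ai_output_state"),  # what the tool returned
                "failed": bool(props.get("$ai_is_error")),
            }
    return None  # no span with that name in this trace
```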

### "Was the context correct?" / "Were the right files surfaced?"


1. Find the `$ai_generation` event where the LLM made the decision
2. Check `$ai_input` — this is the full message history the LLM saw
3. Look at preceding `$ai_span` events for retrieval/search steps
4. Check their `$ai_output_state` — what content was retrieved and fed to the LLM?
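To locate the retrieval steps that ran before a given generation, sorting by `createdAt` works because ISO-8601 timestamps sort lexicographically. A sketch under that assumption, using the `id`, `event`, and `createdAt` fields from the trace JSON structure:

```python
def spans_before(events, generation_id):
    """Return the $ai_span events that started before the given generation."""
    by_time = sorted(events, key=lambda ev: ev["createdAt"])
    ids = [ev["id"] for ev in by_time]
    cutoff = ids.index(generation_id)  # position of the generation in the timeline
    return [ev for ev in by_time[:cutoff] if ev["event"] == "$ai_span"]
```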

### "Did the subagent work?"


1. In the structural overview, find spans that are children of other spans (via `$ai_parent_id`)
2. The parent span is the orchestrator; child spans are subagent steps
3. Check each child's `$ai_output_state` and `$ai_is_error`
4. If a child span contains `$ai_generation` events, those are the subagent's LLM calls

### "Why did the LLM say X?"


1. Use `search_traces.py` to find where the text appears: `SEARCH="the text" python3 scripts/search_traces.py FILE`
2. This shows which event and property path contains it
3. Check the `$ai_input` of that generation to see what the LLM was told before it said X

## Constructing UI links


The trace tools return `_posthogUrl` — always surface this to the user.

You can also construct links manually:

- Trace detail: `https://app.posthog.com/llm-observability/traces/<trace_id>?timestamp=<url_encoded_timestamp>&event=<optional_event_id>`
- Traces list with filters: returned in `_posthogUrl` from `query-llm-traces-list`

The `timestamp` query param is required — use the `createdAt` of the earliest event in the trace, URL-encoded (e.g. `timestamp=2026-04-01T19%3A39%3A20Z`).

When presenting findings, always include the relevant PostHog URL so the user can verify.
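A link builder is a few lines with the standard library; `urllib.parse.quote` handles the required timestamp encoding. The trace ID and timestamp passed in are placeholders for real values from a trace:

```python
from urllib.parse import quote

def trace_url(trace_id, created_at, event_id=None):
    """Build a PostHog trace-detail URL with a URL-encoded timestamp."""
    url = (f"https://app.posthog.com/llm-observability/traces/{trace_id}"
           f"?timestamp={quote(created_at, safe='')}")
    if event_id:
        url += f"&event={event_id}"
    return url
```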

## Finding traces


Use `posthog:query-llm-traces-list` to search and filter traces.

CRITICAL: Never assume event names, property names, or property values from training data. Every project instruments different custom properties. Always call `posthog:read-data-schema` first to discover what properties and values actually exist in the project's data before constructing filters.

### Discovering the schema first


Before filtering traces, discover what's available:

1. Confirm AI events exist — call `posthog:read-data-schema` with `kind: "events"` and look for `$ai_*` events
2. Find filterable properties — call `posthog:read-data-schema` with `kind: "event_properties"` and `event_name: "$ai_generation"` (or another AI event) to see what properties are captured
3. Get actual values — call `posthog:read-data-schema` with `kind: "event_property_values"`, `event_name: "$ai_generation"`, and `property_name: "$ai_model"` to see real model names in use

Only then construct the `query-llm-traces-list` call with property filters.

This is especially important for custom properties like `project_id`, `conversation_id`, `user_tier`, etc. — these vary per project and cannot be guessed.

You do not need to confirm `$ai_*` properties, but do confirm any others, such as a person's `email`.

### By filters


`posthog:query-llm-traces-list`

```json
{
  "dateRange": {"date_from": "-1h"},
  "filterTestAccounts": true,
  "limit": 20,
  "properties": [
    {"type": "event", "key": "$ai_model", "value": "gpt-4o", "operator": "exact"}
  ]
}
```
Multiple filters are AND-ed together:

```json
{
  "dateRange": {"date_from": "-1h"},
  "filterTestAccounts": true,
  "properties": [
    {"type": "event", "key": "$ai_provider", "value": "anthropic", "operator": "exact"},
    {"type": "event", "key": "$ai_is_error", "value": ["true"], "operator": "exact"}
  ]
}
```
You can also filter by person properties (discover them via `read-data-schema` with `kind: "entity_properties"` and `entity: "person"`):

```json
{
  "dateRange": {"date_from": "-1h"},
  "filterTestAccounts": true,
  "properties": [
    {"type": "person", "key": "email", "value": "@company.com", "operator": "icontains"}
  ]
}
```

### By external identifiers


Customers often store their own IDs as event or person properties. Use `posthog:read-data-schema` to discover what custom properties exist, then filter:

1. Call `posthog:read-data-schema` with `kind: "event_properties"` and `event_name: "$ai_trace"` to find custom properties
2. Review the returned properties and their sample values
3. Construct the filter using the discovered property key and a known value

```json
{
  "dateRange": {"date_from": "-7d"},
  "properties": [
    {"type": "event", "key": "project_id", "value": "proj_abc123", "operator": "exact"}
  ]
}
```

### By SQL (for full-text search or custom aggregations)


Use SQL when you need something `query-llm-traces-list` can't express — typically full-text search across message content or custom aggregations.

```sql
SELECT
    properties.$ai_trace_id AS trace_id,
    properties.$ai_model AS model,
    timestamp
FROM events
WHERE
    event = '$ai_generation'
    AND timestamp >= now() - INTERVAL 1 HOUR
    AND properties.$ai_input ILIKE '%search term%'
ORDER BY timestamp DESC
LIMIT 20
```

For more complex SQL patterns, read these references:

- Single trace retrieval — fetches a single trace by ID with all events and properties (renders the `TraceQuery` HogQL)
- Traces list with aggregated metrics — two-phase query: find trace IDs first, then fetch aggregated latency, tokens, costs, and error counts

## Parsing large trace results


Trace tool results are JSON. When too large to read inline, Claude Code persists them to a file.

### Persisted file format


```json
[{ "type": "text", "text": "{\"results\": [...], \"_posthogUrl\": \"...\"}" }]
```

### Trace JSON structure


```text
results (array for list, object for single trace)
  ├── id, traceName, createdAt, totalLatency, totalCost
  ├── inputState, outputState (trace-level state)
  └── events[]
        ├── event ($ai_span | $ai_generation | $ai_embedding | $ai_metric | $ai_feedback)
        ├── id, createdAt
        └── properties
              ├── $ai_span_name, $ai_latency, $ai_is_error
              ├── $ai_input_state, $ai_output_state (span tool I/O)
              ├── $ai_input, $ai_output_choices (generation messages)
              ├── $ai_model, $ai_provider
              └── $ai_input_tokens, $ai_output_tokens, $ai_total_cost_usd
```
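With that structure in hand, per-trace rollups are a short fold over `events`. A sketch assuming a single-trace `results` object shaped as above (field names come from the schema; missing values are treated as zero):

```python
def totals(trace):
    """Sum token counts and USD cost across all events in one trace."""
    out = {"input_tokens": 0, "output_tokens": 0, "cost_usd": 0.0}
    for ev in trace.get("events", []):
        props = ev.get("properties", {})
        out["input_tokens"] += props.get("$ai_input_tokens") or 0
        out["output_tokens"] += props.get("$ai_output_tokens") or 0
        out["cost_usd"] += props.get("$ai_total_cost_usd") or 0.0
    return out
```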

### Available scripts


| Script | Purpose | Usage |
| --- | --- | --- |
| `print_summary.py` | Trace metadata, tool calls, errors, and final LLM output | `python3 scripts/print_summary.py FILE` |
| `print_timeline.py` | Chronological event timeline with I/O summaries | `python3 scripts/print_timeline.py FILE` |
| `extract_span.py` | Full input/output of a specific span by name | `SPAN="name" python3 scripts/extract_span.py FILE` |
| `extract_conversation.py` | LLM messages with thinking blocks and tool calls | `python3 scripts/extract_conversation.py FILE` |
| `search_traces.py` | Find a keyword across all event properties | `SEARCH="keyword" python3 scripts/search_traces.py FILE` |
| `show_structure.py` | Show JSON keys and types without values | `cat blob.json \| python3 scripts/show_structure.py` |

## Tips


- Always set `dateRange` — queries without a time range are slow. Use narrow windows (`-30m`, `-1h`) for broad listing queries; wider windows (`-7d`, `-30d`) are fine for narrow queries filtered by trace ID or specific property values
- Always include the `_posthogUrl` in your response so the user can click through
- `$ai_input_state`/`$ai_output_state` on spans contain tool call inputs and outputs
- `$ai_input`/`$ai_output_choices` on generations contain the full LLM conversation — can be megabytes; when the result is persisted to a file, use the parsing scripts
- Use `filterTestAccounts: true` to exclude internal/test traffic when searching
- `$ai_trace` events are NOT in the `events` array — their data is surfaced via trace-level `inputState`, `outputState`, and `traceName`