langsmith-trace-analyzer
LangSmith Trace Analyzer
Use this skill to move from raw LangSmith traces to actionable debugging/evaluation insights.
Quick Start
```bash
# Install dependencies
uv pip install langsmith langsmith-fetch

# Auth
export LANGSMITH_API_KEY=<your_langsmith_api_key>
```

Fast workflow:
- Download traces with `scripts/download_traces.py` (or `scripts/download_traces.ts`).
- Analyze downloaded JSON with `scripts/analyze_traces.py`.
- Load targeted references only when needed:
  - `references/filtering-querying.md` for query/filter syntax
  - `references/analysis-patterns.md` for deeper diagnostics
  - `references/benchmark-analysis.md` for benchmark-specific workflows
Decision Guide
- Known trace IDs: use `langsmith-fetch trace <id>` directly, or `--trace-ids` in the downloader scripts.
- Need to discover traces first: use the LangSmith SDK `list_runs`/`listRuns` with filters, then download the selected trace IDs.
- Need aggregate insights: run `analyze_traces.py` for summary stats, patterns, and passed-vs-failed comparisons.
Core Workflows
1) Download and organize traces
Python:

```bash
uv run skills/langsmith-trace-analyzer/scripts/download_traces.py \
  --project "my-project" \
  --filter "job_id=abc123" \
  --last-hours 24 \
  --limit 100 \
  --output ./traces \
  --organize
```

TypeScript:

```bash
ts-node skills/langsmith-trace-analyzer/scripts/download_traces.ts \
  --project "my-project" \
  --filter "job_id=abc123" \
  --last-hours 24 \
  --limit 100 \
  --output ./traces
```

Output layout:

```text
traces/
├── manifest.json
└── by-outcome/
    ├── passed/
    ├── failed/
    └── error/
        ├── GraphRecursionError/
        ├── TimeoutError/
        └── DaytonaError/
```

Notes:
- Python script supports `--organize/--no-organize`.
- Both scripts use SDK filtering plus `langsmith-fetch` for full trace payload export.
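Before running the analyzer, it can help to confirm that the download produced the expected distribution of outcomes. The following is a minimal sketch, assuming each trace is stored as one `*.json` file somewhere under `by-outcome/`; the helper name `count_by_outcome` is hypothetical and not part of the skill's scripts.

```python
from collections import Counter
from pathlib import Path


def count_by_outcome(root: str) -> Counter:
    """Count trace files under <root>/by-outcome, keyed by outcome folder."""
    base = Path(root) / "by-outcome"
    counts: Counter = Counter()
    if not base.is_dir():
        return counts
    # Assumption: every trace is a single *.json file in an outcome folder.
    for path in base.rglob("*.json"):
        # Keys look like "passed", "failed", or "error/TimeoutError".
        outcome = path.parent.relative_to(base).as_posix()
        counts[outcome] += 1
    return counts


if __name__ == "__main__":
    for outcome, n in sorted(count_by_outcome("./traces").items()):
        print(f"{outcome}: {n}")
```

An empty result usually means the download ran without organizing, or that the output path differs from the one passed to `--output`.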
2) Analyze downloaded traces
```bash
# Markdown report
uv run skills/langsmith-trace-analyzer/scripts/analyze_traces.py ./traces --output report.md

# JSON output
uv run skills/langsmith-trace-analyzer/scripts/analyze_traces.py ./traces --json

# Compare passed vs failed (expects by-outcome folders)
uv run skills/langsmith-trace-analyzer/scripts/analyze_traces.py ./traces --compare --output comparison.md
```

The analyzer reports:
- message/tool-call/token/duration summaries
- top tool usage
- anomaly patterns (high message count, repeated tools, quick failures)
- passed-vs-failed metric deltas when comparison is enabled
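To make the passed-vs-failed comparison concrete, here is a minimal sketch of how a metric delta could be computed. It is an illustration only, not the logic in `analyze_traces.py`; the function name `metric_deltas` and the metric names `message_count` and `duration_s` are assumptions.

```python
from statistics import mean


def metric_deltas(passed: list[dict], failed: list[dict], metrics: list[str]) -> dict:
    """Return mean(failed) - mean(passed) for each metric found in both groups."""
    deltas = {}
    for name in metrics:
        p = [t[name] for t in passed if name in t]
        f = [t[name] for t in failed if name in t]
        # Skip metrics that are missing from either group.
        if p and f:
            deltas[name] = mean(f) - mean(p)
    return deltas


# Hypothetical per-trace summaries.
passed = [
    {"message_count": 10, "duration_s": 30.0},
    {"message_count": 14, "duration_s": 50.0},
]
failed = [{"message_count": 40, "duration_s": 120.0}]

print(metric_deltas(passed, failed, ["message_count", "duration_s"]))
```

A large positive delta on message count, for example, would suggest that failed runs tend to loop before giving up.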
3) Query traces correctly (SDK)
Use official LangSmith run filter syntax via `filter` and/or `start_time`:

```python
from datetime import datetime, timedelta, timezone

from langsmith import Client

client = Client()  # reads LANGSMITH_API_KEY from the environment

# Only consider runs from the last 24 hours.
start = datetime.now(timezone.utc) - timedelta(hours=24)
# Match root runs whose metadata contains job_id == "abc123".
filter_query = 'and(eq(metadata_key, "job_id"), eq(metadata_value, "abc123"))'

runs = client.list_runs(
    project_name="my-project",
    is_root=True,
    start_time=start,
    filter=filter_query,
)
```

For TypeScript:

```ts
import { Client } from "langsmith";

const client = new Client();

for await (const run of client.listRuns({
  projectName: "my-project",
  isRoot: true,
  filter: 'and(eq(metadata_key, "job_id"), eq(metadata_value, "abc123"))',
})) {
  console.log(run.id, run.status);
}
```
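When the metadata key and value come from variables, building the filter string by hand is error-prone. The sketch below shows one way to assemble the same expression used above. The helper name `metadata_filter` is hypothetical, and using JSON-style escaping for embedded quotes is an assumption about the filter grammar, not something this skill documents.

```python
import json


def metadata_filter(key: str, value: str) -> str:
    """Build a run filter that matches a single metadata key/value pair."""
    # json.dumps wraps each string in double quotes and escapes embedded quotes.
    k = json.dumps(key, ensure_ascii=False)
    v = json.dumps(value, ensure_ascii=False)
    return f"and(eq(metadata_key, {k}), eq(metadata_value, {v}))"


print(metadata_filter("job_id", "abc123"))
```

The returned string can be passed as the `filter` argument shown in the examples above.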
Accuracy and Schema Notes
- LangSmith run fields are commonly top-level (`status`, `error`, `total_tokens`, `start_time`, `end_time`).
- Some exported traces also include nested metadata (`metadata` or `extra.metadata`) and/or `messages`.
- `analyze_traces.py` is resilient to multiple payload shapes, including raw array payloads.
- For full conversation content, prefer downloaded trace payloads over bare `list_runs` results.
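The schema notes above suggest reading fields defensively in any custom analysis code. A minimal sketch follows, assuming a run is a plain dict loaded from JSON. The helper names `get_metadata` and `summarize_run` are hypothetical and do not mirror the internals of `analyze_traces.py`.

```python
def get_metadata(run: dict) -> dict:
    """Return run metadata from `metadata`, falling back to `extra.metadata`."""
    meta = run.get("metadata")
    if isinstance(meta, dict):
        return meta
    extra = run.get("extra")
    if isinstance(extra, dict) and isinstance(extra.get("metadata"), dict):
        return extra["metadata"]
    return {}


def summarize_run(run: dict) -> dict:
    """Collect commonly top-level fields, tolerating missing keys."""
    return {
        "status": run.get("status"),
        "error": run.get("error"),
        "total_tokens": run.get("total_tokens"),
        "metadata": get_metadata(run),
    }


example = {"status": "error", "extra": {"metadata": {"job_id": "abc123"}}}
print(summarize_run(example))
```

Returning `None` or an empty dict for absent fields keeps downstream aggregation from failing on a single unusual payload.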
Troubleshooting
| Issue | Likely Cause | Action |
|---|---|---|
| Missing API key | Auth not configured | Set the key as shown under Auth in Quick Start |
| No runs returned | Wrong project/filter/time range | Verify project name and filter syntax |
| Empty/partial message arrays | Run schema differs or incomplete data | Use the downloaded trace JSON and inspect its structure |
| JSON parse error on downloaded files | Bad/incomplete export | Re-download the trace |
| Re-downloading same traces repeatedly | Existing files in nested folders | Use current scripts (they check existing files across output tree) |
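For the JSON parse error case in the table above, a quick scan can identify which files need to be downloaded again. This is a minimal sketch that assumes traces are stored as UTF-8 `*.json` files. `find_bad_json` is a hypothetical helper, not one of the skill's scripts.

```python
import json
from pathlib import Path


def find_bad_json(root: str) -> list:
    """Return the *.json files under root that fail to parse."""
    bad = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            # Truncated or corrupted export: candidate for re-download.
            bad.append(path)
    return bad


if __name__ == "__main__":
    for path in find_bad_json("./traces"):
        print(f"unparseable: {path}")
```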
Safety for Open Source
- Do not commit downloaded trace artifacts (`manifest.json`, trace JSON dumps) unless sanitized.
- Trace payloads can contain user prompts, outputs, metadata, and other sensitive runtime data.
- Keep this skill repository focused on scripts/templates, not production trace exports.
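If a trace must be shared, for example as a test fixture, one possible first step is to blank out the fields most likely to carry user content. The sketch below is illustrative only. The key list in `SENSITIVE_KEYS` is an assumption, and key-based redaction alone does not guarantee that a payload is safe to publish.

```python
from typing import Any

# Assumption: these keys are the ones most likely to hold user content.
SENSITIVE_KEYS = frozenset({"inputs", "outputs", "messages", "metadata"})


def redact(obj: Any, keys: frozenset = SENSITIVE_KEYS) -> Any:
    """Return a copy of obj with the values of sensitive keys replaced."""
    if isinstance(obj, dict):
        return {
            k: "[REDACTED]" if k in keys else redact(v, keys)
            for k, v in obj.items()
        }
    if isinstance(obj, list):
        return [redact(item, keys) for item in obj]
    return obj


trace = {
    "id": "1",
    "inputs": {"question": "secret"},
    "child_runs": [{"outputs": "x", "status": "success"}],
}
print(redact(trace))
```

Free-text fields such as error messages can still leak data, so a manual review remains necessary before committing anything.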
Resources
scripts/
- `scripts/download_traces.py`: Python downloader + organizer
- `scripts/download_traces.ts`: TypeScript downloader + organizer
- `scripts/analyze_traces.py`: Offline analysis and reporting

references/
- `references/filtering-querying.md`: LangSmith query/filter examples
- `references/analysis-patterns.md`: Diagnostic patterns and heuristics
- `references/benchmark-analysis.md`: Benchmark-oriented analysis