langsmith-trace-analyzer
Fetch, organize, and analyze LangSmith traces for debugging and evaluation. Use when you need to: query traces/runs by project, metadata, status, or time window; download traces to JSON; organize outcomes into passed/failed/error buckets; analyze token/message/tool-call patterns; compare passed vs failed behavior; or investigate benchmark and production failures.
NPX Install
npx skill4agent add lubu-labs/langchain-agent-skills langsmith-trace-analyzer
LangSmith Trace Analyzer
Use this skill to move from raw LangSmith traces to actionable debugging/evaluation insights.
Quick Start
```bash
# Install dependencies
uv pip install langsmith langsmith-fetch
# Auth
export LANGSMITH_API_KEY=<your_langsmith_api_key>
```
Fast workflow
- Download traces with `scripts/download_traces.py` (or `scripts/download_traces.ts`).
- Analyze downloaded JSON with `scripts/analyze_traces.py`.
- Load targeted references only when needed:
  - `references/filtering-querying.md` for query/filter syntax
  - `references/analysis-patterns.md` for deeper diagnostics
  - `references/benchmark-analysis.md` for benchmark-specific workflows
Decision Guide
- Known trace IDs: use `langsmith-fetch trace <id>` directly, or `--trace-ids` in downloader scripts.
- Need to discover traces first: use LangSmith SDK `list_runs`/`listRuns` with filters, then download selected trace IDs.
- Need aggregate insights: run `analyze_traces.py` for summary stats, patterns, and passed-vs-failed comparisons.
Core Workflows
1) Download and organize traces
Python:
```bash
uv run skills/langsmith-trace-analyzer/scripts/download_traces.py \
  --project "my-project" \
  --filter "job_id=abc123" \
  --last-hours 24 \
  --limit 100 \
  --output ./traces \
  --organize
```
TypeScript:
```bash
ts-node skills/langsmith-trace-analyzer/scripts/download_traces.ts \
  --project "my-project" \
  --filter "job_id=abc123" \
  --last-hours 24 \
  --limit 100 \
  --output ./traces
```
Output layout:
```text
traces/
├── manifest.json
└── by-outcome/
    ├── passed/
    ├── failed/
    └── error/
        ├── GraphRecursionError/
        ├── TimeoutError/
        └── DaytonaError/
```
Notes:
- Python script supports `--organize`/`--no-organize`.
- Both scripts use SDK filtering plus `langsmith-fetch` for full trace payload export.
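The output layout above implies a mapping from a run's outcome to a folder. The sketch below shows one way such a mapping could work; it is not taken from `download_traces.py`. The helper name `outcome_bucket`, the caller-supplied `passed` flag, and the rule of reading the error type from the leading identifier of the error string are all assumptions for illustration.

```python
import re
from pathlib import Path
from typing import Optional


def outcome_bucket(error: Optional[str], passed: Optional[bool]) -> Path:
    """Return the relative folder for one trace (hypothetical helper)."""
    root = Path("by-outcome")
    if error:
        # Assumption: error strings start with the exception name,
        # e.g. "TimeoutError: request took too long".
        match = re.match(r"\s*([A-Za-z_][A-Za-z0-9_]*)", error)
        error_type = match.group(1) if match else "UnknownError"
        return root / "error" / error_type
    # Assumption: the caller decides pass/fail (for example from benchmark
    # metadata); anything not explicitly passed is treated as failed.
    return root / ("passed" if passed else "failed")
```

Errors take priority over the pass/fail flag here, so a run that raised an exception never lands in `passed/` or `failed/`.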
2) Analyze downloaded traces
```bash
# Markdown report
uv run skills/langsmith-trace-analyzer/scripts/analyze_traces.py ./traces --output report.md
# JSON output
uv run skills/langsmith-trace-analyzer/scripts/analyze_traces.py ./traces --json
# Compare passed vs failed (expects by-outcome folders)
uv run skills/langsmith-trace-analyzer/scripts/analyze_traces.py ./traces --compare --output comparison.md
```
The analyzer reports:
- message/tool-call/token/duration summaries
- top tool usage
- anomaly patterns (high message count, repeated tools, quick failures)
- passed-vs-failed metric deltas when comparison is enabled
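To make the anomaly patterns listed above concrete, here is a minimal sketch of how the three checks could be expressed. It does not reproduce the logic in `analyze_traces.py`; the function name `flag_anomalies`, its parameters, and all three thresholds are assumptions chosen for illustration.

```python
from collections import Counter
from typing import List

# Illustrative thresholds (assumptions, not values from analyze_traces.py).
MAX_MESSAGES = 50
MAX_SAME_TOOL = 5
QUICK_FAILURE_SECONDS = 5.0


def flag_anomalies(message_count: int, tool_calls: List[str],
                   duration_s: float, failed: bool) -> List[str]:
    """Return the names of the anomaly patterns one trace matches."""
    flags = []
    if message_count > MAX_MESSAGES:
        flags.append("high_message_count")
    # Count how often each tool name appears in the trace.
    counts = Counter(tool_calls)
    if counts and max(counts.values()) > MAX_SAME_TOOL:
        flags.append("repeated_tool")
    # A failed run that ended almost immediately often points to setup issues.
    if failed and duration_s < QUICK_FAILURE_SECONDS:
        flags.append("quick_failure")
    return flags
```

Tune the thresholds against your own traces; a value that is anomalous for one agent can be routine for another.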
3) Query traces correctly (SDK)
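Use official LangSmith run filter syntax via `filter` and/or `start_time`:
```python
from datetime import datetime, timedelta, timezone

from langsmith import Client

# Reads LANGSMITH_API_KEY from the environment.
client = Client()

# Only consider runs started in the last 24 hours (UTC).
start = datetime.now(timezone.utc) - timedelta(hours=24)

# Match runs whose metadata contains job_id == "abc123".
filter_query = 'and(eq(metadata_key, "job_id"), eq(metadata_value, "abc123"))'

runs = client.list_runs(
    project_name="my-project",
    is_root=True,  # root runs only, one per trace
    start_time=start,
    filter=filter_query,
)

# list_runs returns a lazy iterator; iterate to actually fetch results.
for run in runs:
    print(run.id, run.status)
```
For TypeScript: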
```ts
import { Client } from "langsmith";

const client = new Client();

for await (const run of client.listRuns({
  projectName: "my-project",
  isRoot: true,
  filter: 'and(eq(metadata_key, "job_id"), eq(metadata_value, "abc123"))',
})) {
  console.log(run.id, run.status);
}
```
Accuracy and Schema Notes
- LangSmith run fields are commonly top-level (`status`, `error`, `total_tokens`, `start_time`, `end_time`).
- Some exported traces also include nested metadata (`metadata` or `extra.metadata`) and/or `messages`.
- `analyze_traces.py` is resilient to multiple payload shapes, including raw array payloads.
- For full conversation content, prefer downloaded trace payloads over bare `list_runs` results.
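The payload shapes described in the notes above can be handled with a small normalization step. The following is a sketch only, not the code in `analyze_traces.py`: the function name `normalize_trace`, the output keys, and the choice to treat a raw array payload as the message list are assumptions.

```python
from typing import Any, Dict


def normalize_trace(payload: Any) -> Dict[str, Any]:
    """Flatten one exported payload into a small, uniform dict (sketch)."""
    if isinstance(payload, list):
        # Assumption: a raw array payload is the message list itself.
        return {"status": None, "error": None, "metadata": {}, "messages": payload}
    # Metadata may sit at the top level or under extra.metadata.
    extra = payload.get("extra") or {}
    metadata = payload.get("metadata") or extra.get("metadata") or {}
    return {
        "status": payload.get("status"),
        "error": payload.get("error"),
        "metadata": metadata,
        "messages": payload.get("messages") or [],
    }
```

Downstream analysis can then read `status`, `metadata`, and `messages` without checking which shape a given file used.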
Troubleshooting
| Issue | Likely Cause | Action |
|---|---|---|
| Authentication error | Auth not configured | Set `LANGSMITH_API_KEY` (see Quick Start) |
| No runs returned | Wrong project/filter/time range | Verify project name and filter syntax |
| Empty/partial message arrays | Run schema differs or incomplete data | Inspect the downloaded trace JSON |
| JSON parse error on downloaded files | Bad/incomplete export | Re-download the trace |
| Re-downloading same traces repeatedly | Existing files in nested folders | Use current scripts (they check existing files across output tree) |
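For the JSON parse error row in the table above, it can help to find every unreadable file before re-downloading. This is a minimal sketch that assumes traces are stored as `.json` files anywhere under the output directory; the helper name `find_unparseable` is hypothetical and not part of the skill's scripts.

```python
import json
from pathlib import Path
from typing import List


def find_unparseable(root: Path) -> List[Path]:
    """Return every .json file under root that fails to parse."""
    bad = []
    # rglob walks nested folders such as by-outcome/error/TimeoutError/.
    for path in sorted(root.rglob("*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            bad.append(path)
    return bad
```

Calling `find_unparseable(Path("./traces"))` returns the paths to re-download; an empty list means every file parsed.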
Safety for Open Source
- Do not commit downloaded trace artifacts (`manifest.json`, trace JSON dumps) unless sanitized.
- Trace payloads can contain user prompts, outputs, metadata, and other sensitive runtime data.
- Keep this skill repository focused on scripts/templates, not production trace exports.
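As a rough illustration of what sanitizing a trace could involve, the sketch below replaces the values of selected keys throughout a nested payload. The key list and the helper name `redact` are assumptions, and key-based redaction alone may miss sensitive values stored under other names (metadata, for example), so treat it as a starting point rather than a guarantee.

```python
from typing import Any

# Assumption: these keys hold user-visible content; adjust to your payloads.
SENSITIVE_KEYS = {"inputs", "outputs", "messages", "content"}


def redact(value: Any) -> Any:
    """Return a copy of value with sensitive keys replaced by a marker."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        # Recurse into lists such as child runs.
        return [redact(item) for item in value]
    return value
```

The function builds a new structure instead of mutating its input, so the original download stays intact for local analysis.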
Resources
scripts/
- `scripts/download_traces.py`: Python downloader + organizer
- `scripts/download_traces.ts`: TypeScript downloader + organizer
- `scripts/analyze_traces.py`: Offline analysis and reporting
references/
- `references/filtering-querying.md`: LangSmith query/filter examples
- `references/analysis-patterns.md`: Diagnostic patterns and heuristics
- `references/benchmark-analysis.md`: Benchmark-oriented analysis