extract-transcripts


Extract Transcripts

Extracts readable markdown transcripts from Claude Code and Codex CLI session JSONL files.

Scripts

Claude Code Sessions

Extract a single session

uv run ~/.claude/skills/extract-transcripts/extract_transcript.py <session.jsonl>

With tool calls and thinking blocks

uv run ~/.claude/skills/extract-transcripts/extract_transcript.py <session.jsonl> --include-tools --include-thinking

Extract all sessions from a directory

uv run ~/.claude/skills/extract-transcripts/extract_transcript.py <directory> --all

Output to file

uv run ~/.claude/skills/extract-transcripts/extract_transcript.py <session.jsonl> -o output.md

Summary only (quick overview)

uv run ~/.claude/skills/extract-transcripts/extract_transcript.py <session.jsonl> --summary

Skip empty/warmup-only sessions

uv run ~/.claude/skills/extract-transcripts/extract_transcript.py <directory> --all --skip-empty

**Options:**
- `--include-tools`: Include tool calls and results
- `--include-thinking`: Include Claude's thinking blocks
- `--all`: Process all .jsonl files in directory
- `-o, --output`: Output file path (default: stdout)
- `--summary`: Only output brief summary
- `--skip-empty`: Skip empty and warmup-only sessions
- `--min-messages N`: Minimum messages for --skip-empty (default: 2)
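
The option list above maps naturally onto an `argparse` parser. The following is only a sketch of how such a command line could be declared, not the script's actual source: the function name `build_parser`, the positional argument name `path`, and the attribute name `all_sessions` are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(
        prog="extract_transcript.py",
        description="Extract readable markdown transcripts from session JSONL files.",
    )
    # A session file, or a directory when --all is given (name is an assumption).
    parser.add_argument("path")
    parser.add_argument("--include-tools", action="store_true",
                        help="Include tool calls and results")
    parser.add_argument("--include-thinking", action="store_true",
                        help="Include Claude's thinking blocks")
    parser.add_argument("--all", action="store_true", dest="all_sessions",
                        help="Process all .jsonl files in directory")
    parser.add_argument("-o", "--output", default=None,
                        help="Output file path (default: stdout)")
    parser.add_argument("--summary", action="store_true",
                        help="Only output brief summary")
    parser.add_argument("--skip-empty", action="store_true",
                        help="Skip empty and warmup-only sessions")
    parser.add_argument("--min-messages", type=int, default=2, metavar="N",
                        help="Minimum messages for --skip-empty (default: 2)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["sessions/", "--all", "--skip-empty", "--min-messages", "3"]
    )
    print(args.all_sessions, args.skip_empty, args.min_messages, args.output)
```

Declaring `--min-messages` with `type=int` and `default=2` keeps the documented default in one place, so `--skip-empty` on its own behaves as the list describes.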

Codex CLI Sessions

Extract a Codex session

uv run ~/.claude/skills/extract-transcripts/extract_codex_transcript.py <session.jsonl>

Extract from Codex history file

uv run ~/.claude/skills/extract-transcripts/extract_codex_transcript.py ~/.codex/history.jsonl --history

Session File Locations

Claude Code

  • Sessions:
    ~/.claude/projects/<project-path>/<session-id>.jsonl

Codex CLI

  • Sessions:
    ~/.codex/sessions/<session_id>/rollout.jsonl
  • History:
    ~/.codex/history.jsonl
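
To collect candidate files from the locations listed above, a small helper along these lines may be useful. It is a sketch under stated assumptions: the function name `find_session_files` and the result keys are hypothetical, and the glob patterns simply follow the path templates given in this section.

```python
from pathlib import Path


def find_session_files(home: Path) -> dict[str, list[Path]]:
    """Return session files found under the documented locations."""
    claude_root = home / ".claude" / "projects"
    codex_root = home / ".codex" / "sessions"
    history = home / ".codex" / "history.jsonl"

    # ~/.claude/projects/<project-path>/<session-id>.jsonl
    claude = sorted(claude_root.glob("*/*.jsonl")) if claude_root.is_dir() else []
    # ~/.codex/sessions/<session_id>/rollout.jsonl
    codex = sorted(codex_root.glob("*/rollout.jsonl")) if codex_root.is_dir() else []

    return {
        "claude": claude,
        "codex_sessions": codex,
        "codex_history": [history] if history.is_file() else [],
    }


if __name__ == "__main__":
    for kind, paths in find_session_files(Path.home()).items():
        print(f"{kind}: {len(paths)} file(s)")
```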

DuckDB-Based Transcript Index

For querying across many sessions, use the DuckDB-based indexer:

Index all sessions (incremental - only new/changed files)

uv run ~/.claude/skills/extract-transcripts/transcript_index.py index
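
How the indexer decides that a file is "new/changed" is not documented here. One common approach, shown purely as an illustration, is to remember a fingerprint (modification time and size) per file and re-index only when it differs. The names `fingerprint` and `files_needing_index` are hypothetical, and the `full` and `limit` parameters only mimic the `--full` and `--limit` flags.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Optional, Tuple

Fingerprint = Tuple[int, int]  # (mtime in nanoseconds, size in bytes)


def fingerprint(path: Path) -> Fingerprint:
    """Cheap change marker for a file; an assumption, not the tool's method."""
    st = path.stat()
    return (st.st_mtime_ns, st.st_size)


def files_needing_index(
    paths: Iterable[Path],
    seen: Dict[str, Fingerprint],
    full: bool = False,
    limit: Optional[int] = None,
) -> List[Path]:
    """Select files that are new or changed since the last run.

    `seen` maps a file path to the fingerprint recorded when it was last
    indexed. With full=True every file is selected regardless of `seen`.
    """
    todo = [p for p in paths if full or seen.get(str(p)) != fingerprint(p)]
    return todo if limit is None else todo[:limit]
```

After indexing a file, the caller would store `fingerprint(path)` under `str(path)` in `seen` so the next run skips it.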

Force full reindex

uv run ~/.claude/skills/extract-transcripts/transcript_index.py index --full

Limit number of files to process

uv run ~/.claude/skills/extract-transcripts/transcript_index.py index --limit 10

List recent sessions

uv run ~/.claude/skills/extract-transcripts/transcript_index.py recent
uv run ~/.claude/skills/extract-transcripts/transcript_index.py recent --limit 20
uv run ~/.claude/skills/extract-transcripts/transcript_index.py recent --project myapp
uv run ~/.claude/skills/extract-transcripts/transcript_index.py recent --since 7d
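
The `--since 7d` form suggests a relative time window. The sketch below shows one way such a value could be turned into a cutoff timestamp. Only the `7d` spelling appears in this document, so the `h` and `w` units, the function name `parse_since`, and the use of UTC are all assumptions.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Unit suffix -> timedelta keyword. Only "d" is attested in the document.
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_since(value: str, now: Optional[datetime] = None) -> datetime:
    """Convert a relative window such as '7d' into an absolute cutoff."""
    match = re.fullmatch(r"(\d+)([hdw])", value.strip())
    if match is None:
        raise ValueError(f"unsupported --since value: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    if now is None:
        now = datetime.now(timezone.utc)
    return now - timedelta(**{_UNITS[unit]: amount})
```

Sessions with a timestamp at or after the returned cutoff would then count as recent.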

Search across sessions

uv run ~/.claude/skills/extract-transcripts/transcript_index.py search "error handling"
uv run ~/.claude/skills/extract-transcripts/transcript_index.py search "query" --cwd ~/myproject

Show a session transcript

uv run ~/.claude/skills/extract-transcripts/transcript_index.py show <file_path>
uv run ~/.claude/skills/extract-transcripts/transcript_index.py show <file_path> --summary

**Requirements:** uv (dependencies auto-installed via inline script metadata)

**Database location:** `~/.claude/transcript-index/sessions.duckdb`

Output Format

Transcripts are formatted as markdown with:
  • Session metadata (date, duration, model, working directory, git branch)
  • User messages prefixed with `## User`
  • Assistant responses prefixed with `## Assistant`
  • Tool calls in code blocks (if --include-tools)
  • Thinking in blockquotes (if --include-thinking)
  • Tool usage summary for Codex sessions
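
The layout described above can be sketched as a small renderer. This is an illustration rather than the script's real code: it assumes each JSONL line is an object with a `type` of `user` or `assistant` and a `message.content` that is either a string or a list of blocks typed `text`, `thinking`, or `tool_use`. The function name `render_transcript` is hypothetical, and session metadata and tool results are left out.

```python
import json

FENCE = "`" * 3  # built at runtime so this example nests cleanly in markdown


def render_transcript(jsonl_text: str, include_tools: bool = False,
                      include_thinking: bool = False) -> str:
    """Render session JSONL (schema assumed, see lead-in) as markdown."""
    sections = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        role = entry.get("type")
        if role not in ("user", "assistant"):
            continue  # skip summaries and other record types
        content = (entry.get("message") or {}).get("content")
        if isinstance(content, str):
            blocks = [{"type": "text", "text": content}]
        elif isinstance(content, list):
            blocks = [b for b in content if isinstance(b, dict)]
        else:
            blocks = []
        parts = []
        for block in blocks:
            kind = block.get("type")
            if kind == "text":
                parts.append(block.get("text", ""))
            elif kind == "thinking" and include_thinking:
                # Thinking goes into a blockquote.
                lines = block.get("thinking", "").splitlines()
                parts.append("\n".join("> " + text for text in lines))
            elif kind == "tool_use" and include_tools:
                # Tool calls go into a code block.
                payload = json.dumps(block.get("input", {}), indent=2)
                name = block.get("name", "tool")
                parts.append(f"Tool: {name}\n{FENCE}json\n{payload}\n{FENCE}")
        if parts:
            sections.append(f"## {role.capitalize()}\n\n" + "\n\n".join(parts))
    return "\n\n".join(sections)
```

With both flags left off, only text blocks are emitted, which matches the default behaviour described for the extractor.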