trx -- Agent-First Transcription CLI

Install: `npx skills add crafter-station/trx -g`

Prerequisites

Check setup: `trx doctor --output json`. If dependencies are missing, run `trx init`.

Install: `bun add -g @crafter/trx`

Workflow

1. Dry-run first (always)

```bash
trx transcribe <input> --dry-run --output json
```

Validates input, checks dependencies, shows execution plan without running.

2. Transcribe

For URLs (YouTube, Twitter, Instagram, etc.):

```bash
trx transcribe "https://youtube.com/watch?v=..." --output json
```

For local files:

```bash
trx transcribe ./recording.mp4 --output json
```

Agent-optimized (text only, saves tokens):

```bash
trx transcribe <input> --fields text --output json
```
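
The dry-run-then-transcribe sequence above can be scripted. What follows is a minimal Python sketch, not part of trx: `build_cmd` and `transcribe` are hypothetical helper names, and the sketch assumes that `trx` is on the PATH and that a failed dry run is reported through a non-zero exit status (the source does not document exit codes).

```python
import subprocess
from typing import List


def build_cmd(source: str, dry_run: bool = False, text_only: bool = False) -> List[str]:
    """Assemble a trx command line using only flags documented in this README."""
    cmd = ["trx", "transcribe", source]
    if dry_run:
        cmd.append("--dry-run")
    if text_only:
        cmd += ["--fields", "text"]
    cmd += ["--output", "json"]
    return cmd


def transcribe(source: str) -> str:
    """Dry-run first, then transcribe; returns the raw stdout of the real run."""
    # check=True raises CalledProcessError on a non-zero exit status.
    # Assumption: trx signals a failed dry run through its exit status.
    subprocess.run(build_cmd(source, dry_run=True),
                   check=True, capture_output=True, text=True)
    result = subprocess.run(build_cmd(source, text_only=True),
                            check=True, capture_output=True, text=True)
    return result.stdout
```

Passing the command as a list avoids shell quoting problems with URLs that contain `?` or `&`.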

3. Post-process (fix whisper mistakes)

After transcription, read the `.txt` output and apply corrections. Read whisper-fixes.md for common patterns.

Correction checklist:

Punctuation: Whisper drops periods at paragraph boundaries and misplaces commas. Fix sentence boundaries.
Accents (Spanish): Whisper often drops diacritics. Restore them where the context calls for it: como -> como/cómo, esta -> esta/está, mas -> mas/más.
Technical terms: Whisper misspells domain-specific words. Ask the user for a glossary or infer from context.
Repeated phrases: Whisper sometimes stutters on word boundaries. Remove exact consecutive duplicates.
Speaker attribution: If the user provides speaker names, insert `[Speaker Name]:` markers.
Filler words: Remove "um", "uh", "este", "o sea" if the user wants clean output.
Timestamp alignment: If editing `.srt`, preserve the timestamp structure. Only modify text between timestamps.
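
Two of the checklist items, removing exact consecutive duplicates and editing `.srt` text without touching timestamps, lend themselves to a small script. The Python sketch below is illustrative only: `collapse_repeats` and `edit_srt` are hypothetical helpers, not part of trx, and the sketch assumes standard SRT timing lines of the form `00:00:01,000 --> 00:00:04,000`. The duplicate filter is deliberately naive and will also collapse legitimate repetitions such as "that that", so review its output.

```python
import re

# A standard SRT timing line, e.g. "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def collapse_repeats(text: str) -> str:
    """Collapse an exactly repeated word or phrase ('we we', 'you know you know')."""
    # Group 1 lazily grows to the shortest run of words that is immediately
    # repeated; the second group consumes every consecutive repeat of it.
    return re.sub(r"\b(\w+(?: \w+)*?)(?: \1\b)+", r"\1", text)


def edit_srt(srt: str, fix) -> str:
    """Apply fix() to caption text only; cue numbers and timing lines pass through."""
    out = []
    in_text = False  # True once the current cue's timing line has been seen
    for line in srt.splitlines():
        if not line.strip():
            in_text = False          # a blank line ends the cue
            out.append(line)
        elif in_text:
            out.append(fix(line))    # caption text
        else:
            if TIMING.match(line):
                in_text = True
            out.append(line)         # cue number or timing line, unchanged
    return "\n".join(out) + ("\n" if srt.endswith("\n") else "")
```

Calling `edit_srt(srt_text, collapse_repeats)` rewrites only the caption lines, so cue numbering and timestamps stay byte-for-byte identical.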

4. Schema introspection

```bash
trx schema transcribe
trx schema init
```

Commands

| Command | Example |
| --- | --- |
| `init` | `trx init --model small` |
| `transcribe` | `trx transcribe <url-or-file> --output json` |
| `doctor` | `trx doctor --output json` |
| `schema` | `trx schema transcribe` |

Shorthand

`trx <input>` is equivalent to `trx transcribe <input>`.

Output format

`--output json`: Machine-readable (default when piped)
`--output table`: Human-readable with progress (default when TTY)
`--fields text`: Only return transcript text (saves tokens)
`--fields metadata`: Only return metadata (language, model)
`--dry-run`: Validate without executing
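
The piped-versus-TTY defaults follow a common terminal-detection pattern. The Python sketch below shows that general pattern, assuming nothing about how trx implements it; `default_output` and `resolve_output` are hypothetical names.

```python
import sys
from typing import Optional


def default_output(is_tty: bool) -> str:
    """Pick 'table' for an interactive terminal, 'json' when piped or redirected."""
    return "table" if is_tty else "json"


def resolve_output(explicit: Optional[str] = None) -> str:
    """An explicit --output value wins; otherwise fall back to the TTY-based default."""
    return explicit or default_output(sys.stdout.isatty())
```

In practice this means an agent that captures stdout gets JSON even without the flag, but passing `--output json` explicitly keeps scripts independent of how they are launched.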

Flags reference

| Flag | Description | Default |
| --- | --- | --- |
| `--language <code>` | ISO 639-1 language code | `auto` (from config) |
| `--model <size>` | Override model: tiny, base, small, medium, large | from config |
| `--output-dir <dir>` | Output directory | `.` (cwd) |
| `--no-download` | Skip yt-dlp (local files only) | false |
| `--no-clean` | Skip ffmpeg audio cleaning | false |
| `--json <payload>` | Raw JSON input | - |
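
When a script assembles these flags, it can help to validate values before invoking the CLI. The Python sketch below is one possible approach: `optional_flags` is a hypothetical helper, the validation rules are inferred from the table (two-letter ISO 639-1 codes or `auto`, and the five listed model sizes), and `--json` is left out because its payload format is not described here.

```python
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")  # sizes listed in the flags table


def optional_flags(language=None, model=None, output_dir=None,
                   no_download=False, no_clean=False):
    """Translate keyword options into trx flags; omitted options use trx's own defaults."""
    args = []
    if language is not None:
        # ISO 639-1 codes are two lowercase ASCII letters; "auto" is the documented default.
        is_code = (len(language) == 2 and language.isascii()
                   and language.isalpha() and language.islower())
        if language != "auto" and not is_code:
            raise ValueError(f"not an ISO 639-1 code: {language!r}")
        args += ["--language", language]
    if model is not None:
        if model not in MODEL_SIZES:
            raise ValueError(f"unknown model size: {model!r}")
        args += ["--model", model]
    if output_dir is not None:
        args += ["--output-dir", str(output_dir)]
    if no_download:
        args.append("--no-download")
    if no_clean:
        args.append("--no-clean")
    return args
```

The returned list can be appended to a base command such as `["trx", "transcribe", "./recording.mp4"]` before it is handed to `subprocess.run`.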

Edge cases

yt-dlp extension mismatch: yt-dlp sometimes outputs `.mp4.webm` instead of `.mp4`. The CLI handles this by scanning for the downloaded file by prefix.
Large files (>1hr): Whisper processes in segments. Works but is slow on CPU. Consider `--model tiny` for speed.
No GPU: whisper-cli uses CPU by default. Acceptable for tiny/base/small models.
Auto-detect language: When `--language auto` is set, Whisper detects the language from the first 30 seconds. For multilingual content, specify the primary language.
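
The prefix scan mentioned for the yt-dlp extension mismatch could look roughly like the Python sketch below. This is not trx's actual code: `find_download` is a hypothetical helper, and choosing the most recently modified match when several files share the prefix is an assumption.

```python
from pathlib import Path
from typing import Optional


def find_download(directory: Path, prefix: str) -> Optional[Path]:
    """Return the newest file in `directory` whose name starts with `prefix`, or None."""
    matches = [p for p in directory.iterdir()
               if p.is_file() and p.name.startswith(prefix)]
    # Assumption: if several files share the prefix, the newest one is the download.
    return max(matches, key=lambda p: p.stat().st_mtime, default=None)
```

Matching on the name prefix rather than a full filename means `abc123.mp4.webm` is found even when `abc123.mp4` was expected.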
