omnicaptions-translate

Caption Translation

Default: Claude native translation (no API key needed).

Use the Gemini API only when the user explicitly requests it.

Default Workflow (Claude)

Read the caption file
Translate using Claude's native ability
Write the output with the `_Claude_{lang}` suffix
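
The naming step of the workflow above can be sketched in a few lines of Python. This is only an illustration of the `_Claude_{lang}` suffix rule; the helper name `translated_name` is hypothetical and not part of omnicaptions.

```python
from pathlib import Path


def translated_name(source: str, lang: str, engine: str = "Claude") -> str:
    """Insert an `_{engine}_{lang}` suffix before the file extension."""
    path = Path(source)
    # Path.stem drops only the last extension, so "video.en_LaiCut.srt"
    # keeps "video.en_LaiCut" as its stem.
    return str(path.with_name(f"{path.stem}_{engine}_{lang}{path.suffix}"))


print(translated_name("input.srt", "zh"))                             # input_Claude_zh.srt
print(translated_name("video.en_LaiCut.srt", "zh", engine="Gemini"))  # video.en_LaiCut_Gemini_zh.srt
```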

Gemini API (Optional)

Use the CLI when the user requests Gemini:

```bash
omnicaptions translate input.srt -l zh --bilingual
```

Output: `input_Gemini_zh.srt`

When to Use

Translate SRT/VTT/ASS to another language
Generate bilingual captions (original + translation)
Translate YouTube video transcripts
Need context-aware translation (not line-by-line)

When NOT to Use

Need transcription (use `/omnicaptions:transcribe`)
Just format conversion without translation (use `/omnicaptions:convert`)

Setup

```bash
pip install omni-captions-skills
```

API Key

Priority: `GEMINI_API_KEY` env → `.env` file → `~/.config/omnicaptions/config.json`

If not set, ask the user:

Please enter your Gemini API key (get from https://aistudio.google.com/apikey):

Then run with `-k <key>`. The key will be saved to the config file automatically.
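
The lookup order described above (environment variable, then `.env`, then the config file) could be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the function names are hypothetical, the `.env` parser handles only simple `KEY=VALUE` lines, and the JSON field name `gemini_api_key` is an assumption, since the source does not show the config file layout.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def read_dotenv_value(dotenv_path: Path, name: str) -> Optional[str]:
    """Return NAME's value from a simple KEY=VALUE .env file, or None."""
    if not dotenv_path.is_file():
        return None
    for line in dotenv_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            return value.strip().strip("'\"") or None
    return None


def resolve_gemini_key(
    env: Mapping[str, str] = os.environ,
    dotenv_path: Path = Path(".env"),
    config_path: Path = Path.home() / ".config" / "omnicaptions" / "config.json",
) -> Optional[str]:
    """Check the environment, then .env, then the config file; first hit wins."""
    if env.get("GEMINI_API_KEY"):
        return env["GEMINI_API_KEY"]
    from_dotenv = read_dotenv_value(dotenv_path, "GEMINI_API_KEY")
    if from_dotenv:
        return from_dotenv
    if config_path.is_file():
        config = json.loads(config_path.read_text(encoding="utf-8"))
        # "gemini_api_key" is a hypothetical field name used for illustration.
        return config.get("gemini_api_key") or None
    return None
```

A return value of `None` corresponds to the "ask the user" case.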

Context-Aware Translation

LLM-based translation improves on line-by-line machine translation because it can use context across multiple lines:

Why Context Matters

| Approach | Context | Result |
|----------|---------|--------|
| Line-by-line | No context | Robotic, disconnected translations |
| Batch + Context | Sees surrounding lines | Natural, coherent dialogue |

How It Works

┌─────────────────────────────────────────┐
│  Batch size: 30 lines                   │
│  Context: 5 lines before/after          │
├─────────────────────────────────────────┤
│  [5 previous lines] → context           │
│  [30 current lines] → translate         │
│  [5 next lines]     → preview           │
└─────────────────────────────────────────┘

Benefits:

Speaker continuity - maintains character voice
Split sentences - handles dialogue spanning multiple lines
Idioms & culture - adapts cultural references naturally
Pronoun resolution - correct he/she/they based on context
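
The windowing scheme in the diagram can be sketched as a small generator. This is an illustration under the stated numbers (30 lines per batch, 5 lines of context on each side); the names `Window` and `batch_windows` are hypothetical and do not come from omnicaptions.

```python
from typing import Iterator, List, NamedTuple


class Window(NamedTuple):
    before: List[str]   # preceding lines, sent as context only
    current: List[str]  # lines to translate in this request
    after: List[str]    # following lines, sent as a preview only


def batch_windows(lines: List[str], batch_size: int = 30, context: int = 5) -> Iterator[Window]:
    """Yield consecutive batches, each with its surrounding context lines."""
    for start in range(0, len(lines), batch_size):
        end = min(start + batch_size, len(lines))
        yield Window(
            before=lines[max(0, start - context):start],
            current=lines[start:end],
            after=lines[end:end + context],
        )


lines = [f"line {i}" for i in range(1, 71)]  # 70 caption lines
for window in batch_windows(lines):
    print(len(window.before), len(window.current), len(window.after))
# prints: 0 30 5, then 5 30 5, then 5 10 0
```

Only `current` is translated in each request; `before` and `after` are there so the model can resolve pronouns and sentences that cross a batch boundary.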

Advanced Features

Bilingual Output

```bash
# Original + Translation (for language learning)
omnicaptions translate input.srt -l zh --bilingual
```

Output example:

```srt
1
00:00:01,000 --> 00:00:03,500
Welcome to the show.
欢迎来到节目。

2
00:00:03,500 --> 00:00:06,000
Thank you for having me.
感谢邀请我。
```
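
As a rough illustration of the bilingual layout shown in the example (original line first, translation directly beneath it, same timing), the merge step might look like this. The `Cue` type and `to_bilingual_srt` helper are hypothetical names for this sketch, not part of the omnicaptions API.

```python
from typing import List, NamedTuple


class Cue(NamedTuple):
    index: int
    timing: str  # e.g. "00:00:01,000 --> 00:00:03,500", copied unchanged
    text: str    # original caption text


def to_bilingual_srt(cues: List[Cue], translations: List[str]) -> str:
    """Emit SRT blocks with the original text followed by its translation."""
    if len(cues) != len(translations):
        raise ValueError("each cue needs exactly one translation")
    blocks = [
        f"{cue.index}\n{cue.timing}\n{cue.text}\n{translated}"
        for cue, translated in zip(cues, translations)
    ]
    return "\n\n".join(blocks) + "\n"


cues = [
    Cue(1, "00:00:01,000 --> 00:00:03,500", "Welcome to the show."),
    Cue(2, "00:00:03,500 --> 00:00:06,000", "Thank you for having me."),
]
print(to_bilingual_srt(cues, ["欢迎来到节目。", "感谢邀请我。"]))
```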

Custom Glossary (Coming Soon)

For domain-specific or branded content:

```bash
# Use glossary for consistent terminology
omnicaptions translate input.srt -l zh --glossary terms.json
```

Glossary format:

```json
{
  "API": "接口",
  "Token": "令牌",
  "Machine Learning": "机器学习"
}
```

Benefits:

Terminology consistency - "one term, one translation"
Brand compliance - use official product names
Domain accuracy - medical, legal, technical terms
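
Since the glossary option is not yet available, the "one term, one translation" rule can in the meantime be checked after translation. The sketch below is one possible approach and an assumption of this edit, not the planned feature: `glossary_violations` is a hypothetical helper, and it uses naive case-insensitive substring matching, so a short term like "API" would also match inside longer words.

```python
import json
from typing import Dict, List


def glossary_violations(source: str, translated: str, glossary: Dict[str, str]) -> List[str]:
    """Return glossary terms found in `source` whose required rendering is missing from `translated`."""
    lowered = source.lower()
    return [
        term
        for term, rendering in glossary.items()
        if term.lower() in lowered and rendering not in translated
    ]


glossary = json.loads('{"API": "接口", "Token": "令牌", "Machine Learning": "机器学习"}')
print(glossary_violations("Each API call needs a token.", "每次接口调用都需要一个代币。", glossary))
# prints ['Token'] because "token" was rendered as 代币 instead of 令牌
```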

Best Practices

1. Provide Context for Better Quality

For specialized content, use custom prompts:

```python
from omnicaptions import GeminiCaption

gc = GeminiCaption()
# Replace the default translation prompt with domain-specific instructions.
gc._translation_prompt = """
You are translating captions for a medical documentary.
Use formal Chinese medical terminology.
Glossary: {glossary}
"""
# Arguments: input file, output file, target language code.
gc.translate("input.srt", "output.srt", "zh")
```

2. Choose the Right Model

| Model | Best For |
|-------|----------|
| `gemini-3-flash-preview` | Fast, everyday content |
| `gemini-3-pro-preview` | Complex, nuanced content |

3. Review Bilingual Output

Bilingual captions let viewers verify translation quality - ideal for:

Language learners
Quality assurance
Accessibility

CLI Usage

```bash
# Translate (auto-output to same directory)
omnicaptions translate input.srt -l zh              # → ./input_Gemini_zh.srt

# Specify output file or directory
omnicaptions translate input.srt -o output/ -l zh   # → output/input_Gemini_zh.srt
omnicaptions translate input.srt -o zh.srt -l zh    # → zh.srt

# Bilingual output (original + translation)
omnicaptions translate input.srt -l zh --bilingual

# Specify model
omnicaptions translate input.vtt -l ja -m gemini-3-pro-preview
```

| Option | Description |
|--------|-------------|
| `-k, --api-key` | Gemini API key (auto-prompted if missing) |
| `-o, --output` | Output file or directory (default: same dir as input) |
| `-l, --language` | Target language code (required) |
| `--bilingual` | Output both original and translation |
| `-m, --model` | Model name (default: gemini-3-flash-preview) |
| `-v, --verbose` | Verbose output |
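
The three `-o` cases in the examples above (omitted, a directory, a file) can be summarized in a short sketch. This mirrors the documented results only; how the CLI actually tells a directory from a file is not stated in the source, so the trailing-slash and existing-directory checks below are assumptions, and `resolve_output` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional


def resolve_output(input_file: str, lang: str, output: Optional[str] = None) -> Path:
    """Pick the output path for a Gemini translation, following the CLI examples."""
    source = Path(input_file)
    default_name = f"{source.stem}_Gemini_{lang}{source.suffix}"
    if output is None:
        # No -o: write next to the input file.
        return source.with_name(default_name)
    target = Path(output)
    # Assumption: a trailing separator or an existing directory means "directory".
    if output.endswith(("/", "\\")) or target.is_dir():
        return target / default_name
    # Otherwise -o names the output file directly.
    return target


print(resolve_output("input.srt", "zh"))             # input_Gemini_zh.srt
print(resolve_output("input.srt", "zh", "output/"))  # output/input_Gemini_zh.srt
print(resolve_output("input.srt", "zh", "zh.srt"))   # zh.srt
```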

Language Codes

| Language | Code |
|----------|------|
| Chinese (Simplified) | `zh` |
| Chinese (Traditional) | `zh-TW` |
| Japanese | `ja` |
| Korean | `ko` |
| English | `en` |
| Spanish | `es` |
| French | `fr` |
| German | `de` |
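
A script that wraps the CLI could guard against mistyped codes before calling it. The sketch below checks only the codes listed in the table above; the tool may well accept others, so treat the lookup table as an assumption, and `normalize_language_code` as a hypothetical helper.

```python
LANGUAGE_CODES = {
    "zh": "Chinese (Simplified)",
    "zh-TW": "Chinese (Traditional)",
    "ja": "Japanese",
    "ko": "Korean",
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
}


def normalize_language_code(code: str) -> str:
    """Match case-insensitively and return the code as written in the table."""
    wanted = code.strip().lower()
    for known in LANGUAGE_CODES:
        if known.lower() == wanted:
            return known
    raise ValueError(f"unknown language code {code!r}; expected one of {sorted(LANGUAGE_CODES)}")


print(normalize_language_code("ZH-tw"))  # zh-TW
```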

Supported Formats

All formats from `lattifai-captions`: SRT, VTT, ASS, TTML, JSON, Gemini MD, etc.

Common Mistakes

| Mistake | Fix |
|---------|-----|
| No API key | Use `-k YOUR_KEY` or follow the prompt |
| Wrong language code | Use ISO codes: `zh`, `ja`, `en`, etc. |
| Lost formatting | ASS styles preserved; SRT basic only |
| Inconsistent terms | Use glossary for technical content |


Related Skills

| Skill | Use When |
|-------|----------|
| `/omnicaptions:transcribe` | Need transcript first |
| `/omnicaptions:LaiCut` | Align timing before translation |
| `/omnicaptions:convert` | Convert format after translation |
| `/omnicaptions:download` | Download captions to translate |

Workflow Examples

| Translation method | Suffix | Example |
|--------------------|--------|---------|
| Claude (default) | `_Claude_zh` | `video.en_LaiCut_Claude_zh.srt` |
| Gemini API | `_Gemini_zh` | `video.en_LaiCut_Gemini_zh.srt` |

```bash
# 1. Align with LaiCut (keeps word-level timing)
omnicaptions LaiCut video.mp4 video.en.vtt
# → video.en_LaiCut.json

# 2. Convert to SRT (for translation; smaller file)
omnicaptions convert video.en_LaiCut.json -o video.en_LaiCut.srt

# 3a. Translate with Claude (default)
# → video.en_LaiCut_Claude_zh.srt

# 3b. Or translate with Gemini
omnicaptions translate video.en_LaiCut.srt -l zh --bilingual
# → video.en_LaiCut_Gemini_zh.srt

# 4. Convert to colored ASS
omnicaptions convert video.en_LaiCut_Claude_zh.srt -o video.en_LaiCut_Claude_zh_Color.ass \
  --line1-color "#00FF00" --line2-color "#FFFF00"
```

Large JSON Files

LaiCut outputs JSON with word-level timing. For translation, convert to SRT first (much smaller):

```bash
# JSON (word-level, ~150KB) → SRT (segment-level, ~15KB)
omnicaptions convert video.en_LaiCut.json -o video.en_LaiCut.srt
```

Why? JSON preserves word timing for karaoke, but translation only needs segment text. SRT is 10-20x smaller.
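
To make the size difference concrete, here is a sketch of what collapsing word-level entries into segment-level SRT involves. The JSON layout used here (segments with `start`, `end`, and a `words` list) is purely hypothetical; the source does not show LaiCut's actual schema, and in practice `omnicaptions convert` does this step.

```python
from typing import Dict, List


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: List[Dict]) -> str:
    """Drop per-word timing and keep one timed text line per segment."""
    blocks = []
    for number, seg in enumerate(segments, start=1):
        text = " ".join(word["word"] for word in seg["words"])
        blocks.append(
            f"{number}\n{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n{text}"
        )
    return "\n\n".join(blocks) + "\n"


# Hypothetical word-level input: four timed words become one SRT cue.
segment = {
    "start": 1.0,
    "end": 3.5,
    "words": [
        {"word": "Welcome", "start": 1.0, "end": 1.5},
        {"word": "to", "start": 1.5, "end": 1.8},
        {"word": "the", "start": 1.8, "end": 2.1},
        {"word": "show.", "start": 2.1, "end": 3.5},
    ],
}
print(segments_to_srt([segment]))
```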

Claude Translation Rules (Default)

Preserve format exactly - Keep all timing codes, formatting tags, style definitions
Context-aware - Consider surrounding lines for coherent dialogue
Speaker consistency - Maintain character voice and tone
Cultural adaptation - Adapt idioms and references naturally
Large files - Process in batches of 100 lines to maintain quality
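
The "preserve format exactly" rule can be spot-checked after translation by comparing timing lines between the input and the output. The sketch below is one way to do that for SRT and VTT style timings only (ASS uses a different line format and is not covered); the helper names are hypothetical.

```python
import re
from typing import List

# Matches SRT ("00:00:01,000 --> ...") and VTT ("00:00:01.000 --> ...") timing lines.
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2}[,.]\d{3} --> \d{2}:\d{2}:\d{2}[,.]\d{3}")


def timing_lines(caption_text: str) -> List[str]:
    """Collect every timing line, in order."""
    return [line.strip() for line in caption_text.splitlines() if TIMING.match(line.strip())]


def timing_preserved(original: str, translated: str) -> bool:
    """True when both files carry the same timing lines in the same order."""
    return timing_lines(original) == timing_lines(translated)


original = "1\n00:00:01,000 --> 00:00:03,500\nWelcome to the show.\n"
translated = "1\n00:00:01,000 --> 00:00:03,500\n欢迎来到节目。\n"
print(timing_preserved(original, translated))  # True
```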

Claude vs Gemini

| Feature | Claude (Default) | Gemini API |
|---------|------------------|------------|
| API Key | None needed | Required |
| Invocation | Skill (Read/Write) | CLI command |
| Output suffix | `_Claude_{lang}` | `_Gemini_{lang}` |
| Best for | Most tasks | Large files, automation |
