youtube-transcript-cn

YouTube Subtitle Extraction Skill

Extracts subtitles from a YouTube video and outputs a Chinese transcript.

Quick Start

bash

python3 scripts/get_transcript.py "https://www.youtube.com/watch?v=VIDEO_ID"

Workflow

1. Get the video link: extract the YouTube URL from the user's message.
2. Run the script: execute `scripts/get_transcript.py` to extract the subtitles.
3. Process the output:
   - If the subtitles are in Chinese, output them directly.
   - If the subtitles are in English, translate them into Chinese before output.
4. Save the result: save it as a Markdown file if the user asks for one.
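
The "process the output" step depends on knowing whether the extracted text is already Chinese. As a rough sketch, assuming no language metadata is available and that a simple character-ratio heuristic is good enough, the check could look like the following. The function name `is_mostly_chinese` and the 0.3 threshold are illustrative assumptions, not part of the script.

```python
def is_mostly_chinese(text: str, threshold: float = 0.3) -> bool:
    """Heuristic: True when enough of the alphabetic characters are CJK ideographs.

    Hypothetical helper; the threshold is an arbitrary illustrative value.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    # Count characters in the CJK Unified Ideographs block (U+4E00 to U+9FFF).
    cjk = sum(1 for ch in letters if "\u4e00" <= ch <= "\u9fff")
    return cjk / len(letters) >= threshold
```

If the script already reports the subtitle language, that value is the better signal; a heuristic like this would only serve as a fallback.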

Script Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| `url` | YouTube video link or ID | Required |
| `-f, --format` | Output format: `text`, `markdown`, `json` | `markdown` |
| `-t, --timestamps` | Include timestamps | No |
| `-l, --lang` | Preferred languages (comma-separated) | `zh,en` |
| `-o, --output` | Output file path | stdout |
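
For readers who want to see how these options fit together, here is a minimal `argparse` sketch that mirrors the table above. It is an assumption about how such a command line could be declared, not the actual contents of `scripts/get_transcript.py`; the function name `build_parser` is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="get_transcript.py")
    parser.add_argument("url", help="YouTube video link or ID")
    parser.add_argument("-f", "--format", choices=["text", "markdown", "json"],
                        default="markdown", help="output format")
    parser.add_argument("-t", "--timestamps", action="store_true",
                        help="include timestamps")
    parser.add_argument("-l", "--lang", default="zh,en",
                        help="preferred languages, comma-separated")
    parser.add_argument("-o", "--output", default=None,
                        help="output file path (stdout when omitted)")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-t", "-f", "json", "VIDEO_ID"])
languages = args.lang.split(",")  # "zh,en" becomes ["zh", "en"]
```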

Usage Examples

Basic Usage

bash

python3 scripts/get_transcript.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

Output with Timestamps

bash

python3 scripts/get_transcript.py -t "https://youtu.be/VIDEO_ID"

Save to File

bash

python3 scripts/get_transcript.py -o output.md "https://www.youtube.com/watch?v=VIDEO_ID"

JSON Format (convenient for post-processing)

bash

python3 scripts/get_transcript.py -f json "VIDEO_ID"

Supported URL Formats

- `https://www.youtube.com/watch?v=VIDEO_ID`
- `https://youtu.be/VIDEO_ID`
- `https://www.youtube.com/embed/VIDEO_ID`
- A bare video ID
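
To illustrate how the four input forms above can be reduced to one video ID, here is a minimal sketch using only the standard library. It is not taken from the script: the function name `extract_video_id`, the accepted host names, and the assumption that an ID is 11 characters of letters, digits, `-`, and `_` are all illustrative.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumption: a video ID is 11 characters drawn from letters, digits, "-" and "_".
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")
# Assumption: these are the youtube.com host names worth accepting.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(value: str) -> str:
    """Return the video ID from a supported URL form or a bare ID."""
    value = value.strip()
    if VIDEO_ID_RE.fullmatch(value):
        return value  # already a bare video ID

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    candidate = ""
    if host == "youtu.be":
        # https://youtu.be/VIDEO_ID
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # https://www.youtube.com/watch?v=VIDEO_ID
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        elif parsed.path.startswith("/embed/"):
            # https://www.youtube.com/embed/VIDEO_ID
            candidate = parsed.path.split("/")[2]

    if VIDEO_ID_RE.fullmatch(candidate):
        return candidate
    raise ValueError("Cannot extract video ID")
```

Raising a single `ValueError` for every unrecognized input keeps the failure path simple: the caller can catch it and ask the user for a valid link.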

Language Priority

The script looks for subtitles in the following order:

1. Simplified Chinese (zh-Hans)
2. Traditional Chinese (zh-Hant)
3. Chinese (zh)
4. English (en)
5. Auto-generated subtitles
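
The priority order can be sketched as a small selection function. This is one reading of the list, in which the four language codes apply to manually created tracks and auto-generated tracks are a last resort; the track dictionaries with `lang` and `auto` keys and the name `pick_track` are hypothetical, not the script's actual data model.

```python
from typing import Optional

# Priority order taken from the documented list.
LANGUAGE_PRIORITY = ["zh-Hans", "zh-Hant", "zh", "en"]


def pick_track(tracks: list[dict]) -> Optional[dict]:
    """Pick a manual track by language priority, else an auto-generated one."""
    manual = {t["lang"]: t for t in tracks if not t["auto"]}
    for lang in LANGUAGE_PRIORITY:
        if lang in manual:
            return manual[lang]

    def rank(track: dict) -> int:
        # Unknown languages sort after every listed language.
        lang = track["lang"]
        if lang in LANGUAGE_PRIORITY:
            return LANGUAGE_PRIORITY.index(lang)
        return len(LANGUAGE_PRIORITY)

    # Fall back to auto-generated subtitles, preferring the same language order.
    auto = sorted((t for t in tracks if t["auto"]), key=rank)
    return auto[0] if auto else None
```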

Translation Processing

If the retrieved subtitles are in English, translate them into Chinese:

1. Run the script to get the English subtitles.
2. Use Claude's built-in capability to translate them into fluent Chinese.
3. Preserve the original structure and paragraph breaks.

Error Handling

| Error | Cause | Solution |
| --- | --- | --- |
| No captions available | The video has no subtitles | Tell the user that no subtitles are available for this video |
| Cannot extract video ID | Malformed URL | Ask the user for a valid YouTube link |
| Network timeout | Network problem | Retry, or check the network connection |
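
The "retry" advice for network timeouts can be sketched as a small wrapper with exponential backoff. This assumes the fetch step surfaces a timeout as Python's built-in `TimeoutError`; the name `with_retries` and its parameters are illustrative and not part of the script.

```python
import time


def with_retries(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(), retrying on TimeoutError with exponential backoff.

    Hypothetical helper: `fetch` is any zero-argument callable, and `sleep`
    is injectable so the wait can be skipped in tests.
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller report the failure
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

Only timeouts are retried here; the other two errors in the table describe conditions that a retry would not change, so they are better reported to the user right away.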

Output Format Example

Markdown Format

markdown

视频标题

Video ID: abc123
语言: 中文（自动生成）
链接: https://www.youtube.com/watch?v=abc123

文字稿

这是视频的第一段内容...

这是视频的第二段内容...