youtube-transcribe-skill

YouTube Transcript Extraction

Extract subtitles/transcripts from a YouTube video URL and save them as a local file.

Input YouTube URL: $ARGUMENTS

Step 1: Verify URL and Get Video Information

- Verify URL Format: Confirm the input is a valid YouTube URL (supports `youtube.com/watch?v=` or `youtu.be/` formats).
- Get Video Information: Use WebFetch or firecrawl to fetch the page and extract the video title for subsequent file naming.
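
The URL check above can be sketched as a small function. This is a minimal illustration rather than part of the skill: `extractVideoId` is a hypothetical name, accepting the `www.` and `m.` host prefixes is an assumption, and so is the 11-character ID pattern, which YouTube does not formally guarantee.

```javascript
// Hypothetical helper: returns the video ID for the two supported URL shapes, or null.
function extractVideoId(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return null; // not a parseable URL at all
  }
  // Assumption: treat www. and m. hosts the same as the bare domain
  const host = url.hostname.replace(/^(www\.|m\.)/, "");
  let id = null;
  if (host === "youtube.com" && url.pathname === "/watch") {
    id = url.searchParams.get("v");
  } else if (host === "youtu.be") {
    id = url.pathname.slice(1).split("/")[0];
  }
  // Assumption: IDs are 11 characters drawn from [A-Za-z0-9_-]
  return id && /^[A-Za-z0-9_-]{11}$/.test(id) ? id : null;
}
```

A `null` result means the input should be rejected before any fetch or download is attempted.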

Step 2: CLI Quick Extraction (Priority Attempt)

Use command-line tools to quickly extract subtitles.

Check Tool Availability: Execute `which yt-dlp`.
- If `yt-dlp` is found, proceed to subtitle download.
- If `yt-dlp` is NOT found, skip immediately to Step 3.

Execute Subtitle Download (only if `yt-dlp` is found):

Tip: Always add `--cookies-from-browser` to avoid sign-in restrictions. Default to `chrome`.

Retry Logic: If `yt-dlp` fails with a browser error (e.g., "Could not open Chrome"), ask the user to specify their available browser (e.g., `firefox`, `safari`, `edge`) and retry.

```bash
# Get the title first (try chrome first)
yt-dlp --cookies-from-browser=chrome --get-title "[VIDEO_URL]"

# Download subtitles
yt-dlp --cookies-from-browser=chrome --write-auto-sub --write-sub --sub-lang zh-Hans,zh-Hant,en --skip-download --output "<Video Title>.%(ext)s" "[VIDEO_URL]"
```
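
Video titles often contain characters that are awkward in file names or in yt-dlp output templates. The sketch below shows one way to prepare the download arguments. `safeFileName` and `subtitleArgs` are hypothetical helpers, and the character list and the 200-character cap are assumptions, not rules from the skill.

```javascript
// Hypothetical helper: make a video title safe to use as a file name.
function safeFileName(title) {
  return title
    .replace(/[\/\\:*?"<>|]/g, "_") // characters rejected by common file systems
    .replace(/\s+/g, " ")
    .trim()
    .slice(0, 200); // rough cap; multi-byte titles may need a byte-based limit
}

// Builds the argument list for the subtitle download command shown above.
function subtitleArgs(videoUrl, title, browser = "chrome") {
  return [
    "--cookies-from-browser", browser,
    "--write-auto-sub", "--write-sub",
    "--sub-lang", "zh-Hans,zh-Hant,en",
    "--skip-download",
    // Double any literal "%" so yt-dlp does not read it as a template field
    "--output", `${safeFileName(title).replace(/%/g, "%%")}.%(ext)s`,
    videoUrl,
  ];
}
```

Returning an array instead of one command string means the arguments can be handed to a process launcher such as Node's `child_process.execFile` without shell quoting, and retrying with another browser only requires changing the third parameter.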

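Verify Results:
- Check the command exit code.
- Exit code 0 (Success): Confirm that a subtitle file was actually written (yt-dlp adds a language code and subtitle extension, e.g., `<Video Title>.en.vtt`). If it was, the subtitles are saved locally and the task is complete. If no file appeared, proceed to Step 3, because yt-dlp can exit with 0 when the video has no subtitles in the requested languages.
- Exit code non-0 (Failure):
  - If the error is related to the browser or cookies, ask the user for the correct browser and retry Step 2.
  - If the error is something else (e.g., video unavailable), proceed to Step 3.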
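
The branching above can be summarized as a small decision function. This is a sketch under assumptions: `nextAction` is a hypothetical name, the `stderr` keywords are guesses at what a browser or cookie failure looks like, and it adds one check that is not spelled out in the exit-code rules, namely that exit code 0 with no subtitle file counts as a failure, since yt-dlp can finish without error when no subtitles exist.

```javascript
// Hypothetical decision helper for the "Verify Results" rules.
function nextAction({ exitCode, stderr = "", subtitleFiles = [] }) {
  if (exitCode === 0) {
    // Assumption: success only counts if a subtitle file was written
    return subtitleFiles.length > 0 ? "done" : "fallback";
  }
  // Assumption: these keywords indicate a browser or cookie problem
  if (/\b(cookies?|browser|keyring|chrome|firefox|safari|edge)\b/i.test(stderr)) {
    return "ask-browser-and-retry";
  }
  return "fallback"; // any other error goes to Step 3
}
```

For example, `nextAction({ exitCode: 1, stderr: "ERROR: Could not open Chrome" })` yields `"ask-browser-and-retry"`, while an unrelated error such as a video being unavailable yields `"fallback"`.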

Step 3: Browser Automation (Fallback)

When the CLI method fails or `yt-dlp` is missing, use browser UI automation to extract subtitles.

Check Tool Availability:
- Check if `chrome-devtools-mcp` tools (specifically `mcp__plugin_claude-code-settings_chrome__new_page`) are available.
- CRITICAL CHECK: If `chrome-devtools-mcp` is NOT available AND `yt-dlp` was NOT found in Step 2:
  - STOP execution.
  - Notify the User: "Unable to proceed. Please either install `yt-dlp` (for fast CLI extraction) OR configure `chrome-devtools-mcp` (for browser automation)."

Initialize Browser Session (if tools are available): Call `mcp__plugin_claude-code-settings_chrome__new_page` to open the video URL.
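
The availability rules from Steps 2 and 3 reduce to a short lookup, sketched below. `planExtraction` is a hypothetical name, and the two flags are assumed to come from the `which yt-dlp` check and the MCP tool check.

```javascript
// Hypothetical planner: which extraction methods to try, in order.
function planExtraction({ hasYtDlp, hasChromeMcp }) {
  if (hasYtDlp) return hasChromeMcp ? ["cli", "browser"] : ["cli"];
  if (hasChromeMcp) return ["browser"];
  return []; // neither tool is available: stop and notify the user
}
```

An empty plan corresponds to the CRITICAL CHECK case, where execution stops and the user is asked to install `yt-dlp` or configure `chrome-devtools-mcp`.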

3.2 Analyze Page State

Call `mcp__plugin_claude-code-settings_chrome__take_snapshot` to read the page accessibility tree.

3.3 Expand Video Description

Reason: The "Show transcript" button is usually hidden within the collapsed description area.

Search the snapshot for a button labeled "...more", "...更多", or "Show more" (usually located in the description block below the video title).

Call `mcp__plugin_claude-code-settings_chrome__click` to click that button.

3.4 Open Transcript Panel

Call `mcp__plugin_claude-code-settings_chrome__take_snapshot` to get the updated UI snapshot.

Search for a button labeled "Show transcript", "显示转录稿", or "内容转文字".

Call `mcp__plugin_claude-code-settings_chrome__click` to click that button.

3.5 Extract Content via DOM

Reason: Directly reading the accessibility tree for long lists is slow and consumes many tokens; DOM injection is more efficient.

Call `mcp__plugin_claude-code-settings_chrome__evaluate_script` to execute the following JavaScript:

```javascript
() => {
  // Select all transcript segment containers
  const segments = document.querySelectorAll("ytd-transcript-segment-renderer");
  if (!segments.length) return "BUFFERING"; // Panel not populated yet; the caller should retry

  // Format each segment as "timestamp text"
  return Array.from(segments)
    .map((seg) => {
      // Fall back to "" so a missing node does not print "undefined"
      const time = seg.querySelector(".segment-timestamp")?.innerText.trim() ?? "";
      const text = seg.querySelector(".segment-text")?.innerText.trim() ?? "";
      return `${time} ${text}`.trim();
    })
    .join("\n");
}
```

If it returns "BUFFERING", wait a few seconds and retry.
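
The wait-and-retry rule can be made concrete with a bounded polling loop. This is an illustrative sketch only: `pollTranscript` is a hypothetical helper, `runScript` stands in for one `evaluate_script` call, and the defaults of five attempts and two seconds are assumptions.

```javascript
// Hypothetical polling helper. `runScript` must return a Promise that
// resolves to the string result of one evaluate_script call.
async function pollTranscript(runScript, { attempts = 5, delayMs = 2000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    const result = await runScript();
    if (result !== "BUFFERING") return result;
    // Wait before the next try, but not after the last one
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`Transcript still empty after ${attempts} attempts`);
}
```

Capping the number of attempts keeps the workflow from looping forever on a video that has no transcript panel at all.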

3.6 Save and Cleanup

Use the Write tool to save the extracted text as a local file (e.g., `<Video Title>.txt`).

Call `mcp__plugin_claude-code-settings_chrome__close_page` to release resources.

Output Requirements

- Save the subtitle file to the current working directory.
- Filename format: `<Video Title>.txt`
- File content format: Each line should be `Timestamp Subtitle Text`.
- Report upon completion: File path, subtitle language, total number of lines.
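
The Step 2 path leaves a subtitle file in yt-dlp's own format (typically WebVTT) rather than the one-line-per-entry text described here. One possible conversion is sketched below. `vttToLines` is a hypothetical helper, not part of the skill, and it assumes simple cues of the kind YouTube produces: it drops milliseconds, strips inline tags, and skips a line that repeats the previous one, which is common in auto-generated rolling captions.

```javascript
// Hypothetical converter from WebVTT text to "Timestamp Subtitle Text" lines.
function vttToLines(vtt) {
  const lines = [];
  let start = null; // start time of the cue being read, or null between cues
  let lastText = "";
  for (const raw of vtt.split(/\r?\n/)) {
    if (raw.trim() === "") {
      start = null; // a blank line ends the current cue
      continue;
    }
    const cue = raw.match(/^(\d{2}:)?(\d{2}:\d{2})\.\d{3}\s+-->/);
    if (cue) {
      start = (cue[1] || "") + cue[2]; // keep hh:mm:ss or mm:ss
      continue;
    }
    if (start === null) continue; // header or other non-cue line
    const text = raw.replace(/<[^>]+>/g, "").trim(); // strip inline timing/style tags
    if (!text || text === lastText) continue; // skip empties and rolling repeats
    lines.push(`${start} ${text}`);
    lastText = text;
  }
  return lines;
}
```

Joining the result with newlines gives the file content, and the array's `length` is the total line count for the completion report.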
