douyin-video-extractor
Douyin Video Extractor Skill
Overview
Key features of douyin-mcp-server:
- Extract high-quality watermark-free video download links
- AI-powered speech-to-text transcription using SenseVoice
- Automatic chunking for large audio files (>1 hour or >50MB)
- MCP integration for Claude Desktop and other AI assistants
- Web interface for browser-based usage
Installation
Prerequisites
bash
# Install uv (Python package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install FFmpeg (required for audio processing)
# macOS
brew install ffmpeg
# Ubuntu/Debian
apt install ffmpeg
# Windows (with Chocolatey)
choco install ffmpeg
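Before moving on to setup, it can help to confirm that both prerequisites are reachable from the shell. The sketch below is not part of douyin-mcp-server; `missing_tools` is a hypothetical helper that uses only the standard library's `shutil.which` to report which of the named executables are absent from `PATH`.

```python
import shutil


def missing_tools(tools=("uv", "ffmpeg")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("uv and ffmpeg are both available.")
```

An empty result means both tools resolved; anything listed still needs to be installed or added to `PATH`.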
Setup
bash
# Clone the repository
git clone https://github.com/yzfly/douyin-mcp-server.git
cd douyin-mcp-server
# Install dependencies
uv sync
# Set API key for transcription (optional, only needed for text extraction)
export API_KEY="sk-xxxxxxxxxxxxxxxx"
Usage Modes
1. WebUI (Recommended for Interactive Use)
bash
# Start the web server
uv run python web/app.py
# Access in browser: http://localhost:8080
**WebUI Features:**
- Parse video info without an API key
- Extract transcripts with an API key (configured in the browser or via environment variable)
- Download videos directly
- Export transcripts as Markdown
2. MCP Server (For AI Assistants)
Configure in `claude_desktop_config.json` or a similar MCP client config:
json
{
  "mcpServers": {
    "douyin-mcp": {
      "command": "uvx",
      "args": ["douyin-mcp-server"],
      "env": {
        "API_KEY": "sk-xxxxxxxxxxxxxxxx"
      }
    }
  }
}
Available MCP Tools:
- `parse_douyin_video_info` - Parse video metadata (no API key needed)
- `get_douyin_download_link` - Get watermark-free download URL (no API key needed)
- `extract_douyin_text` - Extract video transcript via AI (requires API key)
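If the client config already lists other servers, the entry can be merged in rather than pasted by hand. A minimal sketch, assuming the config is a JSON file with a top-level `mcpServers` object as in the snippet above; `add_douyin_server` and the path passed to it are illustrative names, and the real location of `claude_desktop_config.json` depends on the platform.

```python
import json
from pathlib import Path


def add_douyin_server(config_path, api_key):
    """Insert or update the "douyin-mcp" entry, keeping other servers intact."""
    path = Path(config_path)
    # Start from the existing config if there is one, otherwise from scratch.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["douyin-mcp"] = {
        "command": "uvx",
        "args": ["douyin-mcp-server"],
        "env": {"API_KEY": api_key},
    }
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Most MCP clients read this file at startup, so a restart is typically needed after editing it.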
3. Command Line Interface
bash
# Get video information (no API key required)
uv run python douyin-video/scripts/douyin_downloader.py \
  -l "https://v.douyin.com/xxxxx/" \
  -a info
# Download watermark-free video
uv run python douyin-video/scripts/douyin_downloader.py \
  -l "https://v.douyin.com/xxxxx/" \
  -a download \
  -o ./videos
# Extract transcript (requires API_KEY)
uv run python douyin-video/scripts/douyin_downloader.py \
  -l "https://v.douyin.com/xxxxx/" \
  -a extract \
  -o ./output
# Extract transcript and save video
uv run python douyin-video/scripts/douyin_downloader.py \
  -l "https://v.douyin.com/xxxxx/" \
  -a extract \
  -o ./output \
  --save-video
**CLI Arguments:**
- `-l, --link` - Douyin share link (required)
- `-a, --action` - Action: `info`, `download`, or `extract` (required)
- `-o, --output` - Output directory (default: `./output`)
- `--save-video` - Save the video file when extracting a transcript
- `--api-key` - Override the API key from the environment
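When the script is driven from other Python code, the flags listed above can be assembled into an argument list for `subprocess.run`. This is a sketch: `build_command` is a hypothetical helper, and it assumes the script path and flag names exactly as documented in this section.

```python
import subprocess

SCRIPT = "douyin-video/scripts/douyin_downloader.py"  # path as documented above
ACTIONS = ("info", "download", "extract")


def build_command(link, action, output="./output", save_video=False, api_key=None):
    """Assemble the documented CLI invocation as a list of arguments."""
    if action not in ACTIONS:
        raise ValueError(f"action must be one of {ACTIONS}, got {action!r}")
    cmd = ["uv", "run", "python", SCRIPT, "-l", link, "-a", action, "-o", output]
    if save_video:
        cmd.append("--save-video")
    if api_key:
        cmd += ["--api-key", api_key]
    return cmd


if __name__ == "__main__":
    cmd = build_command("https://v.douyin.com/xxxxx/", "extract", save_video=True)
    print(" ".join(cmd))
    # To actually run it from the repository root:
    # subprocess.run(cmd, check=True)
```

Passing a list instead of a shell string avoids quoting problems when the share link contains characters the shell would interpret.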
Python Integration
Parse Video Info
python
from douyin_video.parser import DouyinParser
# Initialize parser
parser = DouyinParser()
# Parse video information
share_link = "https://v.douyin.com/xxxxx/"
video_info = parser.parse_video_info(share_link)
print(f"Title: {video_info['title']}")
print(f"Video ID: {video_info['video_id']}")
print(f"Download URL: {video_info['download_url']}")
Download Video
python
from douyin_video.downloader import DouyinDownloader
downloader = DouyinDownloader()
# Download watermark-free video
video_url = "https://v.douyin.com/xxxxx/"
output_path = "./videos"
file_path = downloader.download_video(video_url, output_path)
print(f"Video saved to: {file_path}")
Extract Transcript
python
import os
from douyin_video.transcriber import VideoTranscriber
# Initialize with API key
api_key = os.getenv("API_KEY")
transcriber = VideoTranscriber(api_key=api_key)
# Extract transcript from video URL
video_url = "https://v.douyin.com/xxxxx/"
transcript = transcriber.extract_transcript(video_url)
print(f"Transcript: {transcript['text']}")
print(f"Video ID: {transcript['video_id']}")
print(f"Title: {transcript['title']}")
# Save as Markdown
transcriber.save_markdown(
    transcript=transcript,
    output_dir="./output"
)
Handle Large Files
The library automatically handles large audio files:
python
# Files >1 hour or >50MB are automatically chunked
# No special configuration needed
transcript = transcriber.extract_transcript(long_video_url)
# Chunks are processed and merged automatically
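To make the chunking concrete, the following sketch computes the time ranges a long recording would be split into. It is not the library's implementation: `chunk_ranges` is a hypothetical helper, and the 540-second window is borrowed from the `chunk_duration` default mentioned under Advanced Usage.

```python
def chunk_ranges(total_seconds, chunk_seconds=540):
    """Split [0, total_seconds) into consecutive (start, end) windows."""
    if total_seconds < 0 or chunk_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and chunk_seconds > 0")
    ranges = []
    start = 0
    while start < total_seconds:
        # The final window is shortened so it never runs past the recording.
        end = min(start + chunk_seconds, total_seconds)
        ranges.append((start, end))
        start = end
    return ranges


# A 20-minute recording (1200 s) yields two full windows and a 120 s remainder.
print(chunk_ranges(1200))  # [(0, 540), (540, 1080), (1080, 1200)]
```

Each window could then be cut with FFmpeg and transcribed separately before the text is joined in order.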
Configuration
API Key Setup
Get a free API key from SiliconFlow (new users get free credits).
Option 1: Environment Variable
bash
export API_KEY="sk-xxxxxxxxxxxxxxxx"
Option 2: WebUI Browser Storage
- Open the WebUI
- Click the "API 未配置" ("API not configured") button
- Enter and save the API key
- The key persists in the browser's localStorage
Option 3: CLI Argument
bash
uv run python douyin-video/scripts/douyin_downloader.py \
  --api-key "sk-xxxxxxxxxxxxxxxx" \
  -l "https://v.douyin.com/xxxxx/" \
  -a extract
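The CLI option is described as overriding the environment variable, so the effective key can be thought of as "explicit argument first, then `API_KEY`". A small sketch of that precedence, with `resolve_api_key` as a hypothetical helper rather than a function from the project:

```python
import os


def resolve_api_key(cli_value=None, environ=None):
    """Return the CLI-supplied key if present, else API_KEY from the environment."""
    environ = os.environ if environ is None else environ
    key = cli_value or environ.get("API_KEY")
    if not key:
        raise RuntimeError("No API key: pass --api-key or set API_KEY")
    return key
```

Accepting the environment as a parameter keeps the function easy to test without touching the real process environment.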
Output Format
Extracted transcripts are saved as Markdown:
markdown
# Video Title
| 属性 | 值 |
|---|---|
| 视频ID | |
| 提取时间 | 2026-01-30 14:19:00 |
| 下载链接 | 点击下载 |
## 文案内容
Transcribed text content appears here...
The labels in this sample are in Chinese: 属性 (property), 值 (value), 视频ID (video ID), 提取时间 (extraction time), 下载链接 (download link), 点击下载 (click to download), and 文案内容 (transcript content).
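The layout above can be produced from a transcript dictionary with plain string formatting. The sketch below assumes the `title`, `video_id`, `download_url`, and `text` keys used elsewhere in this document; `render_markdown` is illustrative and not the project's `save_markdown`, and the heading levels and link syntax are assumptions.

```python
from datetime import datetime


def render_markdown(transcript, extracted_at=None):
    """Format a transcript dict in the table-plus-text layout shown above."""
    when = (extracted_at or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    lines = [
        f"# {transcript['title']}",
        "",
        "| 属性 | 值 |",
        "|---|---|",
        f"| 视频ID | {transcript['video_id']} |",
        f"| 提取时间 | {when} |",
        f"| 下载链接 | [点击下载]({transcript['download_url']}) |",
        "",
        "## 文案内容",
        "",
        transcript["text"],
    ]
    return "\n".join(lines) + "\n"
```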
Common Patterns
Batch Processing Multiple Videos
python
import os
from douyin_video.transcriber import VideoTranscriber
api_key = os.getenv("API_KEY")
transcriber = VideoTranscriber(api_key=api_key)
video_urls = [
    "https://v.douyin.com/xxxxx1/",
    "https://v.douyin.com/xxxxx2/",
    "https://v.douyin.com/xxxxx3/",
]
for url in video_urls:
    try:
        transcript = transcriber.extract_transcript(url)
        transcriber.save_markdown(transcript, "./batch_output")
        print(f"✓ Processed: {transcript['title']}")
    except Exception as e:
        print(f"✗ Failed {url}: {e}")
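For longer lists it is often handy to keep the failures for a second pass instead of only printing them. The sketch below wraps the same loop in a hypothetical `process_batch` helper; `extract` stands in for any callable such as `transcriber.extract_transcript`, so the example runs without network access.

```python
def process_batch(urls, extract):
    """Run `extract` on each URL and return (results, failures)."""
    results = {}
    failures = {}
    for url in urls:
        try:
            results[url] = extract(url)
        except Exception as exc:  # keep going; record what went wrong
            failures[url] = str(exc)
    return results, failures


if __name__ == "__main__":
    def fake_extract(url):
        # Stand-in for a real transcription call.
        if url.endswith("2/"):
            raise RuntimeError("simulated failure")
        return {"title": f"video from {url}"}

    ok, failed = process_batch(
        ["https://v.douyin.com/xxxxx1/", "https://v.douyin.com/xxxxx2/"],
        fake_extract,
    )
    print(f"{len(ok)} succeeded, {len(failed)} failed")
```

The keys of the failures dictionary can be fed straight back into `process_batch` for a retry.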
Error Handling
python
from douyin_video.parser import DouyinParser
from douyin_video.exceptions import ParseError, DownloadError
parser = DouyinParser()
share_link = "https://v.douyin.com/xxxxx/"
try:
    video_info = parser.parse_video_info(share_link)
except ParseError as e:
    print(f"Failed to parse video: {e}")
except DownloadError as e:
    print(f"Failed to download: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
Custom Output Handling
python
import json
import os
from douyin_video.transcriber import VideoTranscriber
transcriber = VideoTranscriber(api_key=os.getenv("API_KEY"))
video_url = "https://v.douyin.com/xxxxx/"
transcript = transcriber.extract_transcript(video_url)
# Save as JSON
with open("transcript.json", "w", encoding="utf-8") as f:
    json.dump(transcript, f, ensure_ascii=False, indent=2)
# Extract specific fields
video_id = transcript["video_id"]
text_content = transcript["text"]
download_url = transcript["download_url"]
Troubleshooting
FFmpeg Not Found
Error:
FileNotFoundError: [Errno 2] No such file or directory: 'ffmpeg'
Solution:
bash
# Verify FFmpeg installation
ffmpeg -version
# If not installed, install via package manager
brew install ffmpeg  # macOS
apt install ffmpeg   # Ubuntu
API Key Not Working
Error:
Unauthorized: Invalid API key
Solution:
- Verify the API key is correct
- Check the environment variable: `echo $API_KEY`
- Ensure the API key has sufficient credits at SiliconFlow
Large File Processing Fails
Error: `Request Entity Too Large` or timeout errors
Solution: The library automatically chunks large files, but ensure:
- FFmpeg is installed and accessible
- Sufficient disk space is available for temporary files
- The network connection is stable enough for multiple API calls
Video Link Not Parsing
Error:
Failed to parse video link
Solution:
- Ensure the link is a valid Douyin share link (starts with `https://v.douyin.com/`)
- Try copying the share link again from the Douyin app
- Check that the video is still available (not deleted)
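Text copied from the app often carries extra words around the URL, so it can be worth isolating and checking the link before parsing. A sketch using only the standard library; `extract_share_link` is a hypothetical helper, and the pattern only covers the `https://v.douyin.com/` short-link form named above.

```python
import re

# Matches the short-link form only; the path characters allowed here are an assumption.
SHARE_LINK = re.compile(r"https://v\.douyin\.com/[A-Za-z0-9_-]+/?")


def extract_share_link(text):
    """Return the first v.douyin.com short link found in `text`, or None."""
    match = SHARE_LINK.search(text)
    return match.group(0) if match else None


print(extract_share_link("Check this out https://v.douyin.com/xxxxx/ copy and open Douyin"))
print(extract_share_link("https://example.com/video/123"))
```

The first call prints the bare short link and the second prints `None`, which is a cheap signal to ask for a fresh share link before calling the parser.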
Permission Denied on Output Directory
Error:
PermissionError: [Errno 13] Permission denied
Solution:
bash
# Ensure output directory exists and is writable
mkdir -p ./output
chmod 755 ./output
# Or specify a different output directory
uv run python douyin-video/scripts/douyin_downloader.py \
  -l "url" -a extract -o ~/Documents/douyin_output
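When the extractor is called from Python rather than the shell, the same precondition can be checked up front. A minimal sketch, with `ensure_writable_dir` as a hypothetical helper that relies only on the standard library:

```python
import os
import tempfile


def ensure_writable_dir(path):
    """Create `path` if needed and confirm a file can be written inside it."""
    try:
        os.makedirs(path, exist_ok=True)
        # Creating and removing a temp file is a more direct check than mode bits.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError:
        return False
    return True
```

A `False` result before a long transcription run is a cue to pick another output directory, as in the shell example above.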
WebUI Not Loading
Error: Browser shows connection refused or 404
Solution:
bash
# Ensure server is running
uv run python web/app.py
# Check if port 8080 is available
lsof -i :8080
# Use different port if needed
PORT=8081 uv run python web/app.py
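The same check can be made from Python when `lsof` is unavailable (for example on Windows). A sketch with a hypothetical `port_in_use` helper that simply tries to connect to the port:

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

If `port_in_use(8080)` is true before the server is started, another process likely holds the port; if it is false while the server is supposed to be running, the server probably did not start.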
Advanced Usage
Custom Transcription Settings
python
import os
from douyin_video.transcriber import VideoTranscriber
transcriber = VideoTranscriber(
    api_key=os.getenv("API_KEY"),
    model="FunAudioLLM/SenseVoiceSmall",  # Default model
    chunk_duration=540,  # 9 minutes per chunk (default)
)
Programmatic MCP Server
python
import asyncio
import os
from douyin_video.mcp_server import DouyinMCPServer
server = DouyinMCPServer(api_key=os.getenv("API_KEY"))
# `await server.run()` only works inside a coroutine; from a script, use asyncio.run
asyncio.run(server.run())