douyin-video-extractor

Douyin Video Extractor Skill

Skill by ara.so — MCP Skills collection.

Overview

`douyin-mcp-server` extracts watermark-free videos from Douyin (the Chinese counterpart of TikTok) share links and uses AI to transcribe their audio into text. It supports three usage modes: WebUI, MCP server integration, and command-line interface.

**Key Features:**
- Extract high-quality watermark-free video download links
- AI-powered speech-to-text transcription using SenseVoice
- Automatic chunking for large audio files (>1 hour or >50MB)
- MCP integration for Claude Desktop and other AI assistants
- Web interface for browser-based usage

Installation

Prerequisites

bash

# Install uv (Python package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install FFmpeg (required for audio processing)

# macOS
brew install ffmpeg

# Ubuntu/Debian
apt install ffmpeg

# Windows (with chocolatey)
choco install ffmpeg

Setup

bash

# Clone the repository
git clone https://github.com/yzfly/douyin-mcp-server.git
cd douyin-mcp-server

# Install dependencies
uv sync

# Set API key for transcription (optional, only needed for text extraction)
export API_KEY="sk-xxxxxxxxxxxxxxxx"

Usage Modes

1. WebUI (Recommended for Interactive Use)

bash

# Start the web server
uv run python web/app.py

# Access in browser: http://localhost:8080

**WebUI Features:**
- Parse video info without API key
- Extract transcripts with API key (configured in browser or env var)
- Download videos directly
- Export transcripts as Markdown


2. MCP Server (For AI Assistants)

Configure in `claude_desktop_config.json` or a similar MCP client config:

json

{
  "mcpServers": {
    "douyin-mcp": {
      "command": "uvx",
      "args": ["douyin-mcp-server"],
      "env": {
        "API_KEY": "sk-xxxxxxxxxxxxxxxx"
      }
    }
  }
}
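If the client config already lists other servers, the entry above has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge on an in-memory dictionary. The helper name `add_douyin_server` and the pre-existing `other` server are hypothetical; only the `douyin-mcp` entry itself comes from the configuration shown above.

```python
import json


def add_douyin_server(config: dict, api_key: str) -> dict:
    """Return a copy of an MCP client config with the douyin-mcp entry added."""
    merged = dict(config)
    # Copy the server map so the caller's dictionary is not mutated.
    servers = dict(merged.get("mcpServers", {}))
    servers["douyin-mcp"] = {
        "command": "uvx",
        "args": ["douyin-mcp-server"],
        "env": {"API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(add_douyin_server(existing, "sk-xxxxxxxxxxxxxxxx"), indent=2))
```

Reading and writing the actual config file is left to the caller, since its location depends on the client and operating system.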

**Available MCP Tools:**
- `parse_douyin_video_info` - Parse video metadata (no API key needed)
- `get_douyin_download_link` - Get watermark-free download URL (no API key needed)
- `extract_douyin_text` - Extract video transcript via AI (requires API key)

3. Command Line Interface

bash

# Get video information (no API key required)
uv run python douyin-video/scripts/douyin_downloader.py \
  -l "https://v.douyin.com/xxxxx/" \
  -a info

# Download watermark-free video
uv run python douyin-video/scripts/douyin_downloader.py \
  -l "https://v.douyin.com/xxxxx/" \
  -a download \
  -o ./videos

# Extract transcript (requires API_KEY)
uv run python douyin-video/scripts/douyin_downloader.py \
  -l "https://v.douyin.com/xxxxx/" \
  -a extract \
  -o ./output

# Extract transcript and save video
uv run python douyin-video/scripts/douyin_downloader.py \
  -l "https://v.douyin.com/xxxxx/" \
  -a extract \
  -o ./output \
  --save-video


**CLI Arguments:**
- `-l, --link` - Douyin share link (required)
- `-a, --action` - Action: `info`, `download`, or `extract` (required)
- `-o, --output` - Output directory (default: `./output`)
- `--save-video` - Save video file when extracting transcript
- `--api-key` - Override API key from environment
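When the CLI is driven from another script, it can help to assemble the argument list in one place. The sketch below builds such a list from the flags documented above; the helper name `build_command` is hypothetical, and the function only constructs the command without running it.

```python
SCRIPT = "douyin-video/scripts/douyin_downloader.py"
ACTIONS = {"info", "download", "extract"}


def build_command(link, action, output=None, save_video=False, api_key=None):
    """Build the argv list for the downloader script from the documented flags."""
    if action not in ACTIONS:
        raise ValueError(f"action must be one of {sorted(ACTIONS)}")
    cmd = ["uv", "run", "python", SCRIPT, "-l", link, "-a", action]
    if output is not None:
        cmd += ["-o", output]
    if save_video:
        cmd.append("--save-video")
    if api_key:
        cmd += ["--api-key", api_key]
    return cmd


print(" ".join(build_command("https://v.douyin.com/xxxxx/", "info")))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Passing a list rather than a shell string avoids quoting problems with links that contain special characters.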

Python Integration

Parse Video Info

python

from douyin_video.parser import DouyinParser

# Initialize parser
parser = DouyinParser()

# Parse video information
share_link = "https://v.douyin.com/xxxxx/"
video_info = parser.parse_video_info(share_link)

print(f"Title: {video_info['title']}")
print(f"Video ID: {video_info['video_id']}")
print(f"Download URL: {video_info['download_url']}")

Download Video

python

from douyin_video.downloader import DouyinDownloader

downloader = DouyinDownloader()

# Download watermark-free video
video_url = "https://v.douyin.com/xxxxx/"
output_path = "./videos"
file_path = downloader.download_video(video_url, output_path)
print(f"Video saved to: {file_path}")

Extract Transcript

python

from douyin_video.transcriber import VideoTranscriber
import os

# Initialize with API key
api_key = os.getenv("API_KEY")
transcriber = VideoTranscriber(api_key=api_key)

# Extract transcript from video URL
video_url = "https://v.douyin.com/xxxxx/"
transcript = transcriber.extract_transcript(video_url)

print(f"Transcript: {transcript['text']}")
print(f"Video ID: {transcript['video_id']}")
print(f"Title: {transcript['title']}")

# Save as Markdown
transcriber.save_markdown(
    transcript=transcript,
    output_dir="./output"
)

Handle Large Files

The library automatically handles large audio files:

python

# Files >1 hour or >50MB are automatically chunked
# No special configuration needed
transcript = transcriber.extract_transcript(long_video_url)

# Chunks are processed and merged automatically

Configuration

API Key Setup

Get a free API key from SiliconFlow (new users get free credits).

Option 1: Environment Variable

bash

export API_KEY="sk-xxxxxxxxxxxxxxxx"

Option 2: WebUI Browser Storage

1. Open the WebUI
2. Click the "API 未配置" ("API not configured") button
3. Enter and save the API key
4. The key persists in browser localStorage

Option 3: CLI Argument

bash

uv run python douyin-video/scripts/douyin_downloader.py \
  --api-key "sk-xxxxxxxxxxxxxxxx" \
  -l "https://v.douyin.com/xxxxx/" \
  -a extract
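Options 1 and 3 interact: the `--api-key` argument is documented as overriding the environment variable. A minimal sketch of that precedence, assuming the key is looked up in exactly this order, might look as follows. The function name `resolve_api_key` is hypothetical and is not taken from the project; the browser-stored key from Option 2 lives in the WebUI and is not covered here.

```python
import os


def resolve_api_key(cli_value=None, env=None):
    """Pick the API key: an explicit CLI value wins over the API_KEY variable."""
    env = os.environ if env is None else env
    key = cli_value or env.get("API_KEY")
    # Treat an empty string the same as a missing key.
    return key or None


key = resolve_api_key()
print("API key configured" if key else "API key missing: transcript extraction will not work")
```

The sketch reports only whether a key is present and never prints the key itself, which keeps secrets out of logs.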

Output Format

Extracted transcripts are saved as Markdown. Field labels are written in Chinese: 属性 / 值 (property / value), 视频ID (video ID), 提取时间 (extraction time), 下载链接 (download link, shown as 点击下载, "click to download"), and 文案内容 (transcript content).

markdown

# Video Title

| 属性 | 值 |
| --- | --- |
| 视频ID | `7600361826030865707` |
| 提取时间 | 2026-01-30 14:19:00 |
| 下载链接 | 点击下载 |

## 文案内容

Transcribed text content appears here...
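To produce a similar layout from a transcript dictionary without going through `save_markdown`, a small renderer is enough. This is only a sketch and not the library's implementation: the function name `render_markdown`, the heading levels, and the link syntax are assumptions, while the dictionary keys (`title`, `video_id`, `text`, `download_url`) are the ones used elsewhere in this document.

```python
def render_markdown(transcript: dict, extracted_at: str) -> str:
    """Render a transcript dictionary in a layout similar to the sample output."""
    rows = [
        ("视频ID", f"`{transcript['video_id']}`"),
        ("提取时间", extracted_at),
        ("下载链接", f"[点击下载]({transcript['download_url']})"),
    ]
    lines = [f"# {transcript['title']}", "", "| 属性 | 值 |", "| --- | --- |"]
    lines += [f"| {label} | {value} |" for label, value in rows]
    lines += ["", "## 文案内容", "", transcript["text"], ""]
    return "\n".join(lines)


sample = {
    "title": "Video Title",
    "video_id": "7600361826030865707",
    "text": "Transcribed text content appears here...",
    "download_url": "https://example.com/video.mp4",  # placeholder URL
}
print(render_markdown(sample, "2026-01-30 14:19:00"))
```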

Common Patterns

Batch Processing Multiple Videos

python

from douyin_video.transcriber import VideoTranscriber
import os

api_key = os.getenv("API_KEY")
transcriber = VideoTranscriber(api_key=api_key)

video_urls = [
    "https://v.douyin.com/xxxxx1/",
    "https://v.douyin.com/xxxxx2/",
    "https://v.douyin.com/xxxxx3/",
]

for url in video_urls:
    try:
        transcript = transcriber.extract_transcript(url)
        transcriber.save_markdown(transcript, "./batch_output")
        print(f"✓ Processed: {transcript['title']}")
    except Exception as e:
        print(f"✗ Failed {url}: {e}")
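Because each transcript involves network calls, a batch run may hit transient failures. One option is to wrap the extraction call in a retry helper with exponential backoff. The sketch below is generic and takes any callable, so it makes no assumption about the library beyond what is shown above. The name `with_retries` and its defaults are hypothetical.

```python
import time


def with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Wrap func so that failures are retried with exponential backoff."""
    def wrapper(*args, **kwargs):
        for attempt in range(1, attempts + 1):
            try:
                return func(*args, **kwargs)
            except Exception:
                if attempt == attempts:
                    raise  # out of attempts: surface the last error
                sleep(base_delay * 2 ** (attempt - 1))
    return wrapper


# Usage with the batch loop above (requires the library and an API key):
# extract = with_retries(transcriber.extract_transcript)
# transcript = extract(url)
```

Retrying every exception is a blunt choice; in practice it may be better to retry only on network-related errors and let an invalid link fail immediately.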

Error Handling

python

from douyin_video.parser import DouyinParser
from douyin_video.exceptions import ParseError, DownloadError

parser = DouyinParser()
share_link = "https://v.douyin.com/xxxxx/"

try:
    video_info = parser.parse_video_info(share_link)
except ParseError as e:
    print(f"Failed to parse video: {e}")
except DownloadError as e:
    print(f"Failed to download: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")

Custom Output Handling

python

from douyin_video.transcriber import VideoTranscriber
import json
import os

video_url = "https://v.douyin.com/xxxxx/"
transcriber = VideoTranscriber(api_key=os.getenv("API_KEY"))
transcript = transcriber.extract_transcript(video_url)

# Save as JSON
with open("transcript.json", "w", encoding="utf-8") as f:
    json.dump(transcript, f, ensure_ascii=False, indent=2)

# Extract specific fields
video_id = transcript["video_id"]
text_content = transcript["text"]
download_url = transcript["download_url"]

Troubleshooting

FFmpeg Not Found

Error:

FileNotFoundError: [Errno 2] No such file or directory: 'ffmpeg'

Solution:

bash

# Verify FFmpeg installation
ffmpeg -version

# If not installed, install via package manager
brew install ffmpeg  # macOS
apt install ffmpeg   # Ubuntu
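The same check can run from Python before any transcription starts, so that a missing binary produces a clear message instead of a `FileNotFoundError` halfway through a job. This is a minimal sketch using the standard library's `shutil.which`; the helper name `find_ffmpeg` is hypothetical.

```python
import shutil


def find_ffmpeg(which=shutil.which):
    """Return the path of the ffmpeg executable, or raise if it is not on PATH."""
    path = which("ffmpeg")
    if path is None:
        raise RuntimeError("ffmpeg not found on PATH; install it with your package manager")
    return path


try:
    print(f"Using ffmpeg at {find_ffmpeg()}")
except RuntimeError as exc:
    print(exc)
```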

API Key Not Working

Error:

Unauthorized: Invalid API key

Solution:

- Verify API key is correct
- Check environment variable: `echo $API_KEY`
- Ensure API key has sufficient credits at SiliconFlow

Large File Processing Fails

Error:

`Request Entity Too Large` or timeout errors

Solution: The library automatically chunks large files, but ensure:

- FFmpeg is installed and accessible
- Sufficient disk space for temporary files
- Stable network connection for multiple API calls
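The disk-space condition is easy to verify up front with the standard library. The sketch below checks free space on the volume that holds a given directory. The helper name `has_free_space` and the 500 MB figure are illustrative assumptions, since the amount of temporary space the library actually needs is not documented here.

```python
import shutil


def has_free_space(path, needed_bytes, disk_usage=shutil.disk_usage):
    """Return True if the volume containing path has at least needed_bytes free."""
    return disk_usage(path).free >= needed_bytes


# Illustrative threshold only: 500 MB of headroom in the current directory.
print(has_free_space(".", 500 * 1024 * 1024))
```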

Video Link Not Parsing

Error:

Failed to parse video link

Solution:

- Ensure link is a valid Douyin share link (starts with `https://v.douyin.com/`)
- Try copying the share link again from the Douyin app
- Check if video is still available (not deleted)
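Text copied from the app often carries extra words around the link, so a quick pre-check can pull out the URL and confirm the expected prefix before calling the parser. This is a sketch under the assumption that a valid share link is an `https://v.douyin.com/` URL with a non-empty path; the helper name `find_share_link` is hypothetical.

```python
import re
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://\S+")


def find_share_link(text: str):
    """Return the first https://v.douyin.com/ link found in text, or None."""
    for candidate in URL_RE.findall(text):
        parsed = urlparse(candidate)
        has_code = bool(parsed.path.strip("/"))  # reject the bare domain
        if parsed.scheme == "https" and parsed.netloc == "v.douyin.com" and has_code:
            return candidate
    return None


print(find_share_link("Copy this: https://v.douyin.com/xxxxx/ and open Douyin"))
```

A `None` result means the text does not contain a usable share link, which is cheaper to detect locally than through a failed parse request.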

Permission Denied on Output Directory

Error:

PermissionError: [Errno 13] Permission denied

Solution:

bash

# Ensure output directory exists and is writable
mkdir -p ./output
chmod 755 ./output

# Or specify a different output directory
uv run python douyin-video/scripts/douyin_downloader.py \
  -l "url" -a extract -o ~/Documents/douyin_output
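From Python, the equivalent preparation is to create the directory and prove it is writable before starting a long extraction. The sketch below does this by creating and discarding a temporary file, which tends to be more reliable than inspecting permission bits. The helper name `ensure_writable_dir` is hypothetical.

```python
import os
import tempfile


def ensure_writable_dir(path: str) -> str:
    """Create path if needed and verify that files can be written inside it."""
    os.makedirs(path, exist_ok=True)
    # Raises PermissionError (or another OSError) if the directory is not writable.
    with tempfile.TemporaryFile(dir=path):
        pass
    return os.path.abspath(path)


# Usage: output_dir = ensure_writable_dir("./output")
```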

WebUI Not Loading

Error: Browser shows connection refused or 404

Solution:

bash

# Ensure server is running
uv run python web/app.py

# Check if port 8080 is available
lsof -i :8080

# Use different port if needed
PORT=8081 uv run python web/app.py
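Where `lsof` is not available, for example on Windows, the port check can be approximated in Python by trying to bind the port locally. This is a minimal sketch; the helper name `port_is_free` is hypothetical, and it checks the loopback address only.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


print(f"Port 8080 free: {port_is_free(8080)}")
```

If the port is reported as busy while the WebUI is not running, another process holds it, and starting the server with a different `PORT` value is the simpler fix.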

Advanced Usage

Custom Transcription Settings

python

import os

from douyin_video.transcriber import VideoTranscriber

transcriber = VideoTranscriber(
    api_key=os.getenv("API_KEY"),
    model="FunAudioLLM/SenseVoiceSmall",  # Default model
    chunk_duration=540  # 9 minutes per chunk (default)
)
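To see what `chunk_duration=540` implies for a long recording, the following sketch computes chunk boundaries for a given duration, together with the documented chunking trigger (>1 hour or >50MB). It illustrates the arithmetic only and is not the library's implementation. The function names are hypothetical, and reading 50MB as 50 × 1024 × 1024 bytes is an assumption.

```python
def needs_chunking(duration_s, size_bytes,
                   max_duration_s=3600, max_bytes=50 * 1024 * 1024):
    """Apply the documented rule: chunk when longer than 1 hour or larger than 50MB."""
    return duration_s > max_duration_s or size_bytes > max_bytes


def plan_chunks(duration_s: float, chunk_s: float = 540):
    """Split [0, duration_s) into consecutive (start, end) windows of chunk_s seconds."""
    chunks = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        chunks.append((start, end))
        start = end
    return chunks


# A 70-minute recording (4200 s) exceeds the 1-hour limit.
print(needs_chunking(4200, 30 * 1024 * 1024))
print(len(plan_chunks(4200)), "chunks")
```

Each window could then be handed to FFmpeg for extraction and transcribed separately, with the resulting texts joined in order.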

Programmatic MCP Server

python

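import asyncio
import os
from douyin_video.mcp_server import DouyinMCPServer

async def main():
    # `await` is only valid inside a coroutine, so run the server from main()
    await DouyinMCPServer(api_key=os.getenv("API_KEY")).run()

asyncio.run(main())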
