douyin-batch-download

Douyin Video Batch Download

This skill uses the F2 framework to batch-download Douyin videos efficiently and reliably.

Feature Overview

  • Single-creator download - Enter a profile link or ID to download all videos or a specified number
  • Batch download - Specify several creators at once and process them in one run
  • Incremental download - Automatically skips videos that are already downloaded (matched by aweme_id)
  • Cookie management - Tries to read cookies from the browser automatically; prompts for manual configuration if that fails
  • Following list management - Maintains following.json to record processed creators
  • Differential update - Downloads only the videos that are on the creator's profile but missing locally
  • Directory compatibility - Uses a numeric ID plus a three-layer structure (video / cover / transcript)
  • Video compression - Compresses videos with ffmpeg to save storage space
  • Video metadata - Captures and saves video statistics (like, comment, favorite, and share counts)
  • Data visualization - A Web interface shows statistics for creators and videos, with sorting and filtering
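
The incremental and differential behaviour listed above comes down to a set difference on aweme_id. The following is a minimal sketch, not the skill's actual implementation. It assumes the remote listing is a list of dicts with an `aweme_id` key and that local files are named `<aweme_id>.mp4`. Both the function names and the file layout are hypothetical.

```python
from pathlib import Path


def local_aweme_ids(creator_dir: Path) -> set[str]:
    """Collect IDs of videos already on disk (assumes files named <aweme_id>.mp4)."""
    if not creator_dir.is_dir():
        return set()
    return {p.stem for p in creator_dir.rglob("*.mp4")}


def select_missing(remote_videos: list[dict], downloaded: set[str]) -> list[dict]:
    """Keep only remote videos whose aweme_id is not present locally, preserving order."""
    return [v for v in remote_videos if str(v["aweme_id"]) not in downloaded]
```

Passing the result of `local_aweme_ids` for a creator's directory into `select_missing` yields the videos that still need downloading.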

Usage Scenarios

  • Server batch download: Deploy on a dedicated server and fetch videos from specific creators on a schedule
  • Regular video library updates: Detect new videos automatically and download only the missing ones
  • Backup and migration: Video files are stored by category, which simplifies backup and later processing
  • Content analysis: Analyze a creator's content using video statistics (likes, comments, favorites)

Video Metadata

When downloading videos, the system automatically extracts and saves the following data:

| Field | Description |
|-------|-------------|
| `aweme_id` | Unique video ID |
| `uid` | Creator UID |
| `desc` | Video description / caption |
| `create_time` | Publish time |
| `duration` | Video duration |
| `digg_count` | Like count |
| `comment_count` | Comment count |
| `collect_count` | Favorite count |
| `share_count` | Share count |

Data is stored in the `video_metadata` table of `douyin_users.db`.
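
As an illustration of how the stored statistics might be queried, here is a hedged sketch using Python's built-in `sqlite3` module. The column names follow the field list above, but the column types, the primary key, and the `top_liked` helper are assumptions. The sketch uses an in-memory database so it runs without the real file.

```python
import sqlite3

# Hypothetical schema: column names follow the field list, types are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS video_metadata (
    aweme_id      TEXT PRIMARY KEY,
    uid           TEXT,
    "desc"        TEXT,
    create_time   INTEGER,
    duration      INTEGER,
    digg_count    INTEGER,
    comment_count INTEGER,
    collect_count INTEGER,
    share_count   INTEGER
);
"""


def top_liked(conn: sqlite3.Connection, uid: str, limit: int = 5) -> list[tuple]:
    """Return (aweme_id, digg_count) rows for a creator's most-liked videos."""
    cur = conn.execute(
        "SELECT aweme_id, digg_count FROM video_metadata "
        "WHERE uid = ? ORDER BY digg_count DESC LIMIT ?",
        (uid, limit),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")  # swap for "douyin_users.db" to query real data
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO video_metadata (aweme_id, uid, digg_count) VALUES (?, ?, ?)",
    [("a1", "u1", 10), ("a2", "u1", 30), ("a3", "u2", 99)],
)
print(top_liked(conn, "u1"))
```

Check the real table layout (for example with `.schema video_metadata` in the `sqlite3` shell) before relying on these column types.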

Manual Extraction/Update of Metadata

```bash
# Scan local videos and extract metadata (basic information)
python scripts/extract-metadata.py

# View the statistics summary
python scripts/extract-metadata.py --stats
```

> ⚠️ **Note**: The `--fetch` option is deprecated. Re-download videos with `download-v2.py` instead; it saves the statistics automatically.

Quick Start

```bash
# Create configuration
mkdir -p config
cp config/config.yaml.example config/config.yaml

# Edit configuration (fill in the Cookie)
${EDITOR:-nano} config/config.yaml

# Single download (recommended)
python scripts/download-v2.py "https://www.douyin.com/user/MS4wLjABAAAA..."

# Batch download
python scripts/batch-download.py --all

# Choose creators interactively, then download
python scripts/batch-download.py

# Sample download (1 video per creator, for a quick data refresh)
python scripts/batch-download.py --sample

# Generate Web interface data
python scripts/generate-data.py

# View Web interface
open downloads/index.html
```

Recommended Workflow

1. Add creators → python scripts/manage-following.py --batch
2. Batch download → python scripts/batch-download.py --all
3. View data → open downloads/index.html

The download step automatically saves:
  • ✅ Video files
  • ✅ Like, comment, favorite, and share counts
  • ✅ Video description, publish time, and duration
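
For scheduled, unattended runs, the download and data-generation commands can be chained from a small wrapper. This is a sketch under assumptions. The two commands come from this document, but running them non-interactively is assumed. The document also does not say whether `generate-data.py` must be re-run after each batch. The `run_steps` helper is hypothetical.

```python
import subprocess
import sys

# Commands taken from this document; unattended use is an assumption.
STEPS = [
    [sys.executable, "scripts/batch-download.py", "--all"],
    [sys.executable, "scripts/generate-data.py"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order; stop at the first non-zero exit code."""
    for cmd in steps:
        result = runner(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling `sys.exit(run_steps(STEPS))` from the skill's root directory passes the first failing exit code to the scheduler (cron, systemd timer, and so on), so a failed download is not masked by a later step.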

Directory Structure

skills/douyin-batch-download/
├── SKILL.md                  # This file
├── references/
│   ├── INSTALLATION.md        # Detailed dependency installation instructions
│   └── USAGE.md              # Detailed usage instructions
├── scripts/
│   ├── download-v2.py        # ✅ Recommended download script (automatically saves statistical data)
│   ├── batch-download.py     # Batch download entry
│   ├── download.py           # ⚠️ Legacy download script (deprecated)
│   ├── manage-following.py   # Following list management (add/delete/search)
│   ├── sync-following.py     # Sync following.json from F2 database
│   ├── compress.py           # Video compression script
│   ├── extract-metadata.py   # Video metadata extraction
│   ├── generate-data.py      # Generate Web interface data files
│   ├── following.py          # following.json operation library
│   └── login.py              # QR code login script
├── config/
│   ├── config.yaml.example  # Configuration template
│   └── following.json       # Following list (downloaded creators)
├── downloads/
│   ├── {uid}/               # Video directory categorized by creator UID
│   ├── data.js              # Web interface data file
│   └── index.html           # Web management interface
└── douyin_users.db          # SQLite database (user info + video metadata)
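
Given the `downloads/{uid}/` layout above, a quick inventory of the library can be sketched as follows. It assumes UID directories have purely numeric names and that videos use the `.mp4` extension somewhere beneath them. Neither detail is confirmed by this document, and `videos_per_creator` is a hypothetical helper.

```python
from pathlib import Path


def videos_per_creator(downloads: Path) -> dict[str, int]:
    """Map each numeric UID directory under downloads/ to its .mp4 count."""
    counts = {}
    for entry in sorted(downloads.iterdir()):
        # Skip data.js, index.html and any directory that is not a numeric UID.
        if entry.is_dir() and entry.name.isdigit():
            counts[entry.name] = sum(1 for _ in entry.rglob("*.mp4"))
    return counts
```

Using `rglob` rather than `glob` means the count still works if videos sit in a nested subfolder of the three-layer structure.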

Dependencies

System Dependencies

| Dependency | Installation Method |
|------------|---------------------|
| Chrome/Chromium | Download link |
| ffmpeg | macOS: `brew install ffmpeg` / Ubuntu: `sudo apt install ffmpeg` |

ffmpeg is used only for video compression; you can skip it if you only need downloading.
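
To show what an ffmpeg-based compression step can look like, here is a generic sketch that re-encodes video with libx264 at a constant rate factor (CRF). The settings `compress.py` actually uses are not documented here, so the codec, the CRF value, and both function names are assumptions.

```python
import shutil
import subprocess
from pathlib import Path


def build_compress_cmd(src: Path, dst: Path, crf: int = 28) -> list[str]:
    """Build an H.264 re-encode command; a higher CRF gives smaller, lower-quality output."""
    return [
        "ffmpeg", "-y",          # overwrite dst if it already exists
        "-i", str(src),
        "-c:v", "libx264",
        "-crf", str(crf),
        "-preset", "medium",
        "-c:a", "copy",          # keep the audio stream untouched
        str(dst),
    ]


def compress(src: Path, dst: Path, crf: int = 28) -> bool:
    """Run ffmpeg if it is installed; return False when it is unavailable or fails."""
    if shutil.which("ffmpeg") is None:
        return False
    return subprocess.run(build_compress_cmd(src, dst, crf)).returncode == 0
```

Keeping command construction separate from execution lets you inspect or log the command without ffmpeg installed.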

Python Packages

| Package | Purpose |
|---------|---------|
| `f2` | Douyin video download framework |
| `playwright` | Browser automation (QR code login) |
| `pyyaml` | YAML configuration file parsing |
| `httpx` | Asynchronous HTTP client |
| `aiofiles` | Asynchronous file operations |

Detailed installation instructions: see references/INSTALLATION.md
Detailed usage instructions: see references/USAGE.md

Reference Resources

Collaboration with Other Skills

FunASR Speech-to-Text

Videos downloaded by this skill can be transcribed into timestamped Markdown files with the funasr-transcribe skill.
How they work together: use the Douyin download skill to fetch the videos first, then run the FunASR skill to transcribe them. The two skills run independently and can be combined as needed.