tiktok-collection-scraper

TikTok Collection Scraper

Batch-extract a TikTok user's collection folders and their video links, including play counts, likes, comments, and shares. No external API and no paid service: the only dependency is `curl_cffi`.

Features

🔓 No login required: public collections work without cookies (~80% coverage)
🔑 Full access with a cookie: retrieves all collections, including private ones (100%)
🚀 Zero external API: needs only `curl_cffi`; no TikHub, RapidAPI, or other paid services
📥 7 input formats: username, @username, profile URL, video URL, short link, user_id, secUid
📊 Rich metadata: plays, likes, comments, and shares per video
⚡ Fast: 50 collections + 300 videos in ~40 seconds

Prerequisites

Ensure `curl_cffi` is installed:

```bash
pip install curl_cffi
```

Quick Start

Run the bundled script. All paths below are relative to this skill's directory.

```bash
# Guest mode (public collections, no cookie needed)
python3 scripts/scrape_collections.py <target> -o /tmp/result.json

# Login mode (all collections, 100% coverage)
python3 scripts/scrape_collections.py <target> --cookie /path/to/cookie.txt -o /tmp/result.json
```
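When the script is driven from Python rather than a shell, the guest-mode and login-mode invocations can be assembled programmatically. This is a minimal sketch that assumes only the script path and the `--cookie` and `-o` flags shown in this section; `build_command` is a hypothetical helper, not part of the tool.

```python
from typing import List, Optional


def build_command(target: str,
                  output: str = "/tmp/result.json",
                  cookie_path: Optional[str] = None) -> List[str]:
    """Assemble the argument list for scrape_collections.py (hypothetical helper)."""
    cmd = ["python3", "scripts/scrape_collections.py", target]
    if cookie_path:
        # Login mode: pass the cookie file so private collections are included.
        cmd += ["--cookie", cookie_path]
    # Write the JSON result to a file instead of stdout.
    cmd += ["-o", output]
    return cmd


print(build_command("chengfeng_yulin"))
print(build_command("chengfeng_yulin", cookie_path="/path/to/cookie.txt"))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` with the working directory set to the skill's directory, since the script path is relative.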

Supported Input Formats

Format	Example
Username	`chengfeng_yulin`
@Username	`@chengfeng_yulin`
Profile URL	`https://www.tiktok.com/@chengfeng_yulin`
Video URL	`https://www.tiktok.com/@user/video/7602514407133941000`
Short link	`https://vm.tiktok.com/ZMkVKQxsb/`
User ID	`6811802142106764293`
secUid	`MS4wLjABAAAA...`
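To illustrate how the seven formats in the table can be told apart, here is a rough sketch of an input classifier. The rules are assumptions inferred from the examples alone (the `vm.tiktok.com` host, the `MS4wLjAB` prefix, all-digit IDs); the bundled script's actual detection logic may differ, and `classify_target` is a hypothetical name.

```python
import re
from urllib.parse import urlparse


def classify_target(target: str) -> str:
    """Guess which of the seven input formats a target string uses."""
    target = target.strip()
    if target.startswith(("http://", "https://")):
        parsed = urlparse(target)
        if parsed.netloc.lower() == "vm.tiktok.com":
            return "short_link"
        if re.fullmatch(r"/@[^/]+/video/\d+/?", parsed.path):
            return "video_url"
        if re.fullmatch(r"/@[^/]+/?", parsed.path):
            return "profile_url"
        return "unknown_url"
    if target.startswith("MS4wLjAB"):
        # Assumption: secUid values share the prefix seen in the example.
        return "sec_uid"
    if target.isdigit():
        # Assumption: an all-digit string is a user ID, not a username.
        return "user_id"
    if target.startswith("@"):
        return "at_username"
    return "username"


for example in ("chengfeng_yulin", "@chengfeng_yulin",
                "https://www.tiktok.com/@chengfeng_yulin",
                "https://vm.tiktok.com/ZMkVKQxsb/", "6811802142106764293"):
    print(example, "->", classify_target(example))
```

Checking URLs first keeps the cheaper string tests from misreading a link as a username.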

Output Format

The script writes JSON with the following structure:

```json
{
  "target": "chengfeng_yulin",
  "secUid": "MS4wLjAB...",
  "uid": "68118...",
  "uniqueId": "chengfeng_yulin",
  "mode": "guest",
  "totalCollections": 50,
  "totalVideos": 308,
  "elapsedSeconds": 40.0,
  "collections": [
    {
      "collectionId": "760379...",
      "name": "Collection name",
      "expected": 3,
      "actual": 3,
      "items": [
        {
          "id": "760251...",
          "url": "https://www.tiktok.com/@author/video/760251...",
          "desc": "Video description...",
          "author": "author_username",
          "plays": 2000000,
          "likes": 25100,
          "comments": 632,
          "shares": 85000
        }
      ]
    }
  ]
}
```
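As an illustration of consuming this structure, the sketch below flags collections whose `actual` count falls short of `expected` and ranks videos by `plays`. It relies only on the field names shown in the structure; the helper names and the tiny sample data are hypothetical.

```python
import json
from typing import Any, Dict, List


def load_result(path: str) -> Dict[str, Any]:
    """Read a result file written with -o."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def collection_status(collection: Dict[str, Any]) -> str:
    """Return 'empty', 'partial', or 'ok' by comparing actual to expected."""
    if collection["actual"] == 0:
        return "empty"
    if collection["actual"] < collection["expected"]:
        return "partial"
    return "ok"


def top_videos(result: Dict[str, Any], n: int = 3) -> List[Dict[str, Any]]:
    """Flatten every collection and return the n most-played videos."""
    items = [item for c in result["collections"] for item in c["items"]]
    return sorted(items, key=lambda item: item["plays"], reverse=True)[:n]


# Made-up sample data with the same shape as the real output.
sample = {
    "collections": [
        {"name": "A", "expected": 3, "actual": 2,
         "items": [{"id": "1", "plays": 10}, {"id": "2", "plays": 500}]},
        {"name": "B", "expected": 1, "actual": 0, "items": []},
    ]
}
print([collection_status(c) for c in sample["collections"]])
print([video["id"] for video in top_videos(sample, n=1)])
```

For a real run, replace `sample` with `load_result("/tmp/result.json")`.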

Cookie

Not needed for public collections (status=3), which typically account for ~50% of folders and ~80% of videos
Required for private collections (status=1); it must be the target account's own login cookie
Cookie format: the raw cookie string copied from the browser (semicolon-separated key=value pairs)
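To make the cookie format concrete, here is a small sketch that splits a raw browser cookie string into name/value pairs. How the bundled script itself reads the cookie file is not described here, so `parse_cookie_string` is a hypothetical helper, and the cookie names in the example are placeholders rather than real TikTok cookie names.

```python
from typing import Dict


def parse_cookie_string(raw: str) -> Dict[str, str]:
    """Split 'k1=v1; k2=v2' into a dict, skipping malformed fragments."""
    cookies: Dict[str, str] = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair or "=" not in pair:
            continue
        # Split on the first '=' only, since values may contain '='.
        name, value = pair.split("=", 1)
        cookies[name.strip()] = value.strip()
    return cookies


print(parse_cookie_string("name_a=abc123; name_b=x=y; ; stray"))
```

A parsed dict like this is convenient for checking that an expected cookie is present before starting a long login-mode run.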

How It Works

Uses TikTok's internal web APIs, with `curl_cffi` impersonating Chrome's TLS fingerprint:

Resolve user: any input format → `secUid` (via TikTok's own redirects and page parsing)
Fetch collections: `GET /api/user/collection_list/` (no auth needed)
Fetch videos: `GET /api/collection/item_list/` with `sourceType=113` (the undocumented key parameter)

`sourceType=113` is an undocumented parameter discovered through browser request interception. Without it, the API returns success with empty results.

See `references/api-notes.md` for full API documentation.
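The two fetch steps can be sketched roughly as follows. Only the endpoint paths, `sourceType=113`, and the use of `curl_cffi` with Chrome impersonation come from the description above; the other query parameter names (`secUid`, `collectionId`, `cursor`, `count`) and the page size are assumptions for illustration, and real requests may need further parameters that this sketch omits.

```python
from typing import Any, Dict, Tuple

BASE_URL = "https://www.tiktok.com"


def collection_list_request(sec_uid: str, cursor: int = 0) -> Tuple[str, Dict[str, Any]]:
    """Build URL and query for the collection list (parameter names are assumed)."""
    return (f"{BASE_URL}/api/user/collection_list/",
            {"secUid": sec_uid, "cursor": cursor, "count": 30})


def item_list_request(collection_id: str, cursor: int = 0) -> Tuple[str, Dict[str, Any]]:
    """Build URL and query for one collection's videos."""
    return (f"{BASE_URL}/api/collection/item_list/",
            {"collectionId": collection_id, "cursor": cursor, "count": 30,
             # From the text: without sourceType=113 the result list is empty.
             "sourceType": 113})


def fetch_json(url: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request with a Chrome TLS fingerprint via curl_cffi."""
    # Imported lazily so the builders above work without curl_cffi installed.
    from curl_cffi import requests
    # The generic "chrome" target needs a reasonably recent curl_cffi release.
    response = requests.get(url, params=params, impersonate="chrome", timeout=30)
    response.raise_for_status()
    return response.json()


url, params = item_list_request("760379...")
print(url, params)
```

Keeping request construction separate from the network call makes the parameters easy to inspect and compare against what the browser sends.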

Error Handling

`⚠️` = fewer videos than expected (likely deleted videos)
`❌` = zero videos returned (video removed or API issue)
The script retries failed requests up to 3 times with a 5s backoff
Progress is printed to stderr; JSON output goes to stdout (or to a file with `-o`)
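The retry behavior can be sketched as below. This is not the script's actual code: it assumes "up to 3 times" means three retries after the first attempt and that the 5-second wait is fixed rather than growing. `with_retries` is a hypothetical helper, and the `sleep` argument exists only so the wait can be stubbed out.

```python
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def with_retries(call: Callable[[], T],
                 retries: int = 3,
                 backoff_seconds: float = 5.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run call(); on an exception, wait and retry up to `retries` more times."""
    last_error: Optional[Exception] = None
    for attempt in range(retries + 1):
        try:
            return call()
        except Exception as error:
            last_error = error
            if attempt < retries:
                sleep(backoff_seconds)
    # Every attempt failed: surface the most recent error.
    raise last_error


# Demo: a call that fails twice, then succeeds; waits are recorded, not slept.
attempts = {"count": 0}
waits: List[float] = []


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


print(with_retries(flaky, sleep=waits.append), waits)
```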
