Supadata API

Use the Supadata API via direct `curl` calls to extract video transcripts and scrape web content for AI applications.
Official docs: https://docs.supadata.ai/

When to Use

Use this skill when you need to:
  • Extract transcripts from YouTube, TikTok, Instagram, X (Twitter), Facebook videos
  • Scrape web pages to markdown format for AI processing
  • Get video/channel metadata from social platforms
  • Crawl websites to extract content from multiple pages

Prerequisites

  1. Sign up at the Supadata Dashboard
  2. An API key is generated automatically on signup (no credit card required)
  3. Store your API key in an environment variable:
bash
export SUPADATA_API_KEY="your-api-key"
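Scripts that call the API can fail early when the key is missing. A minimal sketch, assuming a helper named `require_api_key` (a hypothetical name, not part of Supadata):

```bash
# Return non-zero with a message on stderr when the key is unset or empty.
require_api_key() {
  if [ -z "${SUPADATA_API_KEY:-}" ]; then
    echo "SUPADATA_API_KEY is not set" >&2
    return 1
  fi
}

# Typical use at the top of a script:
# require_api_key || exit 1
```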

Pricing

  • Transcript fetch (existing): 1 credit
  • Transcript generation (AI): 2 credits/minute
  • Free tier available

Important: When using `$VAR` in a command that pipes to another command, wrap the command containing `$VAR` in `bash -c '...'`. Due to a Claude Code bug, environment variables are silently cleared when pipes are used directly.
bash
bash -c 'curl -s "https://api.example.com" -H "Authorization: Bearer $API_KEY"' | jq .

How to Use

All examples below assume you have `SUPADATA_API_KEY` set.
The base URL for the API is:
  • https://api.supadata.ai/v1
Authentication uses the `x-api-key` header.

1. Get YouTube Video Transcript

Extract the transcript from a YouTube video.
Write the video URL to `/tmp/supadata_url.txt`:
https://www.youtube.com/watch?v=dQw4w9WgXcQ
Then run:
bash
bash -c 'curl -s "https://api.supadata.ai/v1/transcript" -H "x-api-key: ${SUPADATA_API_KEY}" -G --data-urlencode "url@/tmp/supadata_url.txt" -d "text=true"'
Parameters:
  • `url`: Video URL (required)
  • `text`: Return plain text (`true`) or timestamped chunks (`false`, default)
  • `lang`: Preferred language (ISO 639-1 code, e.g., `en`, `zh`)
  • `mode`: `native` (existing only), `generate` (AI), `auto` (default)
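The `name@filename` form of `--data-urlencode` makes curl load the file's contents, newlines included, before URL-encoding them. Whether the API tolerates an encoded trailing newline is not stated here, so one cautious way to create the file is with `printf`, which writes no newline:

```bash
# Write the video URL without a trailing newline
printf '%s' 'https://www.youtube.com/watch?v=dQw4w9WgXcQ' > /tmp/supadata_url.txt

# Check the size: this example URL is 43 bytes long
wc -c < /tmp/supadata_url.txt
```

An editor or `echo` would normally append a newline, which is why `printf '%s'` is used in this sketch.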

2. Get Transcript with Timestamps

Get transcript with timing information:
bash
bash -c 'curl -s "https://api.supadata.ai/v1/transcript" -H "x-api-key: ${SUPADATA_API_KEY}" -G --data-urlencode "url@/tmp/supadata_url.txt" -d "text=false"' | jq '.content[:3]'
Response format:
json
{
  "content": [
    {"text": "Hello", "offset": 0, "duration": 1500, "lang": "en"}
  ],
  "lang": "en",
  "availableLangs": ["en", "es", "zh"]
}
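As an illustration of working with this response shape, the sketch below saves the sample response to a file (the path `/tmp/supadata_sample.json` is an arbitrary choice) and reads it with `jq`. It assumes `offset` is expressed in milliseconds, which the sample values suggest but this document does not state.

```bash
# Sample response copied from the format shown above
cat > /tmp/supadata_sample.json <<'EOF'
{
  "content": [
    {"text": "Hello", "offset": 0, "duration": 1500, "lang": "en"}
  ],
  "lang": "en",
  "availableLangs": ["en", "es", "zh"]
}
EOF

# One line per chunk: start time in seconds (assumed ms input), a tab, then the text
jq -r '.content[] | "\(.offset / 1000)\t\(.text)"' /tmp/supadata_sample.json

# Languages offered for this video, comma-separated (prints en,es,zh)
jq -r '.availableLangs | join(",")' /tmp/supadata_sample.json
```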

3. Get TikTok/Instagram/X Transcript

Extract transcripts from other platforms. Write the TikTok or Instagram URL to `/tmp/supadata_url.txt` first; the request itself is the same for every platform:
bash
# TikTok
bash -c 'curl -s "https://api.supadata.ai/v1/transcript" -H "x-api-key: ${SUPADATA_API_KEY}" -G --data-urlencode "url@/tmp/supadata_url.txt" -d "text=true"'

# Instagram Reel
bash -c 'curl -s "https://api.supadata.ai/v1/transcript" -H "x-api-key: ${SUPADATA_API_KEY}" -G --data-urlencode "url@/tmp/supadata_url.txt" -d "text=true"'
Supported platforms: YouTube, TikTok, Instagram, X (Twitter), Facebook

---

4. Native Transcript Only (Save Credits)

Fetch only existing transcripts without AI generation:
bash
bash -c 'curl -s "https://api.supadata.ai/v1/transcript" -H "x-api-key: ${SUPADATA_API_KEY}" -G --data-urlencode "url@/tmp/supadata_url.txt" -d "text=true" -d "mode=native"'
Use `mode=native` to avoid AI generation costs (1 credit vs 2 credits/min).
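To make the cost difference concrete, here is a small sketch of the arithmetic using the prices listed under Pricing. The function name `estimate_credits` is hypothetical, and it assumes the duration is given in whole minutes; how partial minutes are billed is not covered here.

```bash
# estimate_credits MODE MINUTES
# "generate" costs 2 credits per minute; any other mode is treated as a
# 1-credit fetch of an existing transcript.
estimate_credits() {
  mode="$1"
  minutes="$2"
  if [ "$mode" = "generate" ]; then
    echo $((2 * minutes))
  else
    echo 1
  fi
}

estimate_credits native 12     # prints 1
estimate_credits generate 12   # prints 24
```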

5. Get YouTube Channel Metadata

Get channel information:
bash
bash -c 'curl -s "https://api.supadata.ai/v1/youtube/channel" -H "x-api-key: ${SUPADATA_API_KEY}" -G --data-urlencode "id=@mkbhd"' | jq '{name, subscriberCount, videoCount}'
Accepts a channel URL, channel ID, or handle (e.g., `@mkbhd`).

6. Get YouTube Video Metadata

Get video information:
bash
bash -c 'curl -s "https://api.supadata.ai/v1/youtube/video" -H "x-api-key: ${SUPADATA_API_KEY}" -G --data-urlencode "url@/tmp/supadata_url.txt"' | jq '{title, viewCount, likeCount, duration}'

7. Get Social Media Metadata

Get metadata from any supported platform:
bash
bash -c 'curl -s "https://api.supadata.ai/v1/metadata" -H "x-api-key: ${SUPADATA_API_KEY}" -G --data-urlencode "url@/tmp/supadata_url.txt"'
Works with YouTube, TikTok, Instagram, X, Facebook posts.

8. Scrape Web Page to Markdown

Extract web page content:
bash
bash -c 'curl -s "https://api.supadata.ai/v1/web/scrape" -H "x-api-key: ${SUPADATA_API_KEY}" -G --data-urlencode "url@/tmp/supadata_url.txt"'
Returns page content in Markdown format, ideal for AI processing.

9. Map Website Links

Get all links from a website:
bash
bash -c 'curl -s "https://api.supadata.ai/v1/web/map" -H "x-api-key: ${SUPADATA_API_KEY}" -G --data-urlencode "url@/tmp/supadata_url.txt"' | jq '.urls[:10]'

10. Crawl Website (Async)

Start a crawl job for multiple pages.
Write to `/tmp/supadata_request.json`:
json
{
  "url": "https://example.com",
  "maxPages": 10
}
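One way to create this file from the shell is a quoted heredoc, which writes the JSON verbatim without shell expansion:

```bash
# The quoted 'EOF' delimiter stops the shell from expanding anything in the body
cat > /tmp/supadata_request.json <<'EOF'
{
  "url": "https://example.com",
  "maxPages": 10
}
EOF
```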
Then run:
bash
# Start crawl
JOB_ID="$(bash -c 'curl -s "https://api.supadata.ai/v1/web/crawl" -X POST -H "x-api-key: ${SUPADATA_API_KEY}" -H "Content-Type: application/json" -d @/tmp/supadata_request.json' | jq -r '.jobId')"
echo "Job ID: ${JOB_ID}"

# Check status (replace <your-job-id> with the job ID printed above)
bash -c 'curl -s "https://api.supadata.ai/v1/web/crawl/<your-job-id>" -H "x-api-key: ${SUPADATA_API_KEY}"' | jq '{status, pagesCompleted}'
Status values: `queued`, `active`, `completed`, `failed`
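Checking the status by hand can be replaced with a small polling loop. The sketch below is one possible shape, not an official client: `poll_job` is a hypothetical helper that reruns any status-printing command until it reports `completed` or `failed`, or until the attempts run out. The attempt count and delay are arbitrary choices.

```bash
# poll_job MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it prints "completed" or "failed", or attempts run out.
poll_job() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    status="$("$@")"
    case "$status" in
      completed) echo "completed"; return 0 ;;
      failed)    echo "failed";    return 1 ;;
    esac
    # queued or active: wait, then ask again
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  echo "timeout"
  return 2
}

# Example (not run here): poll the crawl job every 5 seconds, up to 60 times.
# JOB_ID must be exported so the inner bash can see it.
# export JOB_ID
# poll_job 60 5 bash -c 'curl -s "https://api.supadata.ai/v1/web/crawl/${JOB_ID}" -H "x-api-key: ${SUPADATA_API_KEY}" | jq -r ".status"'
```

Passing the status command as arguments keeps the loop independent of any one endpoint, so the same helper can be pointed at a different job URL.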

---

11. Translate Transcript

Translate a YouTube transcript to another language:
bash
bash -c 'curl -s "https://api.supadata.ai/v1/youtube/transcript/translate" -H "x-api-key: ${SUPADATA_API_KEY}" -G --data-urlencode "url@/tmp/supadata_url.txt" -d "lang=zh" -d "text=true"'

Response Handling

Synchronous (HTTP 200): Direct result returned.
Asynchronous (HTTP 202): Returns `jobId` for polling:
json
{"jobId": "abc123"}
Poll the job endpoint until status is `completed`.
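To tell the two cases apart in a script, the HTTP status code can be captured separately from the body. This is a sketch under assumptions: `classify_http_code` and `fetch_transcript` are hypothetical helper names, and any code other than 200 or 202 is simply treated as an error.

```bash
# Map an HTTP status code to the handling path described above.
classify_http_code() {
  case "$1" in
    200) echo "sync" ;;
    202) echo "async" ;;
    *)   echo "error" ;;
  esac
}

# fetch_transcript BODY_FILE
# Saves the response body to BODY_FILE and prints sync, async, or error.
# -o writes the body to a file; -w '%{http_code}' prints only the status code.
fetch_transcript() {
  body_file="$1"
  code="$(curl -s -o "$body_file" -w '%{http_code}' \
    "https://api.supadata.ai/v1/transcript" \
    -H "x-api-key: ${SUPADATA_API_KEY}" \
    -G --data-urlencode "url@/tmp/supadata_url.txt" -d "text=true")"
  classify_http_code "$code"
}

# Example (not run here):
# fetch_transcript /tmp/supadata_body.json
```

When the result is `async`, the saved body holds the `jobId` to poll with.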

Guidelines

  1. Use `mode=native` to save credits: Only fetches existing transcripts
  2. URL encode parameters: Use `--data-urlencode` for URLs
  3. Check available languages: Response includes `availableLangs` array
  4. Handle async responses: Some requests return job IDs for polling
  5. Max file size: 1GB for direct file URLs
  6. Supported formats: MP4, WEBM, MP3, FLAC, MPEG, M4A, OGG, WAV