Production-grade YouTube transcript extraction with multi-format output, language selection, support for auto-generated captions, 24-hour caching, rate limiting, and retry logic. Supports SRT, VTT, JSON, CSV, TSV, and plain text output.
npx skill4agent add winsorllc/upgraded-carnival youtube-transcript
# Requires Node.js (already available)
node --version
# No additional dependencies required
# Extract transcript with video ID
{baseDir}/youtube-transcript.js VIDEO_ID
# Extract with full URL
{baseDir}/youtube-transcript.js "https://www.youtube.com/watch?v=VIDEO_ID"
# Extract with short URL
{baseDir}/youtube-transcript.js "https://youtu.be/VIDEO_ID"
# Plain text with timestamps (default)
{baseDir}/youtube-transcript.js VIDEO_ID --format text
[0:00:00.00] Here is the transcript text
[0:00:05.32] More transcript content
# Plain text without timestamps
{baseDir}/youtube-transcript.js VIDEO_ID --format plain
Here is the transcript text More transcript content
# JSON with metadata
{baseDir}/youtube-transcript.js VIDEO_ID --format json
{
"title": "Video Title",
"author": "Channel Name",
"language": "en",
"isAutoGenerated": false,
"transcript": [...]
}
# SRT subtitle format
{baseDir}/youtube-transcript.js VIDEO_ID --format srt > video.srt
1
00:00:00,000 --> 00:00:05,320
Here is the transcript text
2
00:00:05,320 --> 00:00:08,150
More transcript content
# VTT subtitle format
{baseDir}/youtube-transcript.js VIDEO_ID --format vtt > video.vtt
WEBVTT
1
00:00.000 --> 00:05.320
Here is the transcript text
# TSV tab-separated values
{baseDir}/youtube-transcript.js VIDEO_ID --format tsv
start\tduration\ttext
0.000\t5.320\tHere is the transcript text
# CSV comma-separated values
{baseDir}/youtube-transcript.js VIDEO_ID --format csv
start,duration,text
0.000,5.320,"Here is the transcript text"
# Auto-select best available (default)
{baseDir}/youtube-transcript.js VIDEO_ID
# Specific language by code
{baseDir}/youtube-transcript.js VIDEO_ID --language en
{baseDir}/youtube-transcript.js VIDEO_ID --language es
{baseDir}/youtube-transcript.js VIDEO_ID --language fr
# Partial matches work too
{baseDir}/youtube-transcript.js VIDEO_ID --language zh # Matches zh-CN, zh-TW, etc.
# Language with auto-generated preference
{baseDir}/youtube-transcript.js VIDEO_ID --language ja --format srt
| Code | Language |
|---|---|
| en | English |
| es | Spanish |
| fr | French |
| de | German |
| ja | Japanese |
| ko | Korean |
| zh | Chinese |
| pt | Portuguese |
| ru | Russian |
| hi | Hindi |
| ar | Arabic |
| it | Italian |
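The partial matching described above (`zh` matching `zh-CN`, `zh-TW`, and so on) could be sketched roughly as follows. This is an illustration only, not the tool's actual code: the `selectTrack` helper, the track shape (`languageCode`, `isAutoGenerated`), and the preference for manually created captions over auto-generated ones are all assumptions.

```javascript
// Hypothetical helper: pick a caption track for a requested language code.
// The track shape ({ languageCode, isAutoGenerated }) is assumed for illustration.
function selectTrack(tracks, requested) {
  // Assumption: within a set of matches, manual captions beat auto-generated ones.
  const preferManual = (list) =>
    list.find((t) => !t.isAutoGenerated) || list[0] || null;

  // No language requested: fall back to the best available track.
  if (!requested) return preferManual(tracks);

  const want = requested.toLowerCase();

  // Exact match first, e.g. "en" === "en".
  const exact = tracks.filter((t) => t.languageCode.toLowerCase() === want);
  if (exact.length > 0) return preferManual(exact);

  // Partial match: "zh" matches "zh-CN", "zh-TW", and other regional variants.
  const partial = tracks.filter((t) =>
    t.languageCode.toLowerCase().startsWith(want + "-")
  );
  return preferManual(partial);
}

const exampleTracks = [
  { languageCode: "en", isAutoGenerated: true },
  { languageCode: "zh-CN", isAutoGenerated: true },
  { languageCode: "zh-TW", isAutoGenerated: false },
];
console.log(selectTrack(exampleTracks, "zh"));
```

Matching on the prefix followed by a hyphen keeps a short code such as `zh` from accidentally matching an unrelated code that merely starts with the same letters.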
# Save transcript directly to file
{baseDir}/youtube-transcript.js VIDEO_ID --output transcript.txt
{baseDir}/youtube-transcript.js VIDEO_ID --format srt --output subtitles.srt
{baseDir}/youtube-transcript.js VIDEO_ID --format json --output data.json
# Shell redirection (equivalent)
{baseDir}/youtube-transcript.js VIDEO_ID --format vtt > captions.vtt
# Skip cache (force fresh fetch)
{baseDir}/youtube-transcript.js VIDEO_ID --no-cache
# Verbose debugging output
DEBUG=1 {baseDir}/youtube-transcript.js VIDEO_ID
# Combine options
{baseDir}/youtube-transcript.js VIDEO_ID --language es --format srt --output spanish.srt --no-cache
| Format | Use Case | Human Readable | Machine Readable |
|---|---|---|---|
| text | Default viewing | ✅ | ⚠️ |
| plain | Content only | ✅ | ⚠️ |
| json | API integration | ⚠️ | ✅ |
| srt | Subtitle files | ✅ | ✅ |
| vtt | Web captions | ✅ | ✅ |
| csv | Spreadsheet import | ⚠️ | ✅ |
| tsv | Database import | ⚠️ | ✅ |
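To make the relationship between the JSON entries and the SRT output concrete, here is a minimal sketch of converting `{ start, duration, text }` entries into SRT cues. The `toSrtTime` and `toSrt` names are hypothetical, and computing each cue's end time as `start + duration` is an assumption based on the sample output, not a statement about the tool's internals.

```javascript
// Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(seconds) {
  // Work in whole milliseconds to avoid floating-point drift.
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n, width) => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Convert transcript entries into SRT text: a numbered cue, a time range,
// the caption text, and a blank line between cues.
function toSrt(entries) {
  return entries
    .map((entry, index) => {
      const end = entry.start + entry.duration; // assumed: end = start + duration
      return `${index + 1}\n${toSrtTime(entry.start)} --> ${toSrtTime(end)}\n${entry.text}\n`;
    })
    .join("\n");
}

console.log(
  toSrt([
    { start: 0, duration: 5.32, text: "Here is the transcript text" },
    { start: 5.32, duration: 4.18, text: "More transcript content" },
  ])
);
```

A VTT variant would differ mainly in the `WEBVTT` header line and in using a period instead of a comma before the milliseconds.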
# Plain video ID (11 characters)
EBw7gsDPAYQ
# Standard YouTube URL
https://www.youtube.com/watch?v=EBw7gsDPAYQ
# Short youtu.be URL
https://youtu.be/EBw7gsDPAYQ
# Embed URL
https://www.youtube.com/embed/EBw7gsDPAYQ
# YouTube Live URL
https://www.youtube.com/live/EBw7gsDPAYQ
# URLs with additional parameters (automatically handled)
https://www.youtube.com/watch?v=EBw7gsDPAYQ&t=120s
https://www.youtube.com/watch?v=EBw7gsDPAYQ&index=2
# Playlist URLs (extracts first video)
https://www.youtube.com/watch?v=EBw7gsDPAYQ&list=...
# Cache directory
/tmp/youtube-transcript-cache/
# Force fresh fetch
{baseDir}/youtube-transcript.js VIDEO_ID --no-cache
| Code | Name | Description | Resolution |
|---|---|---|---|
| 0 | SUCCESS | Transcript fetched | None needed |
| 1 | INVALID_VIDEO_ID | Bad URL/ID format | Double-check the video ID |
| 2 | VIDEO_NOT_FOUND | Video doesn't exist | Verify video exists |
| 3 | TRANSCRIPT_DISABLED | Creator disabled captions | Contact creator |
| 4 | NO_TRANSCRIPT | No captions available | Wait for transcript |
| 5 | VIDEO_UNAVAILABLE | Can't access | Check restrictions |
| 6 | PRIVATE_VIDEO | Video is private | Get access/permission |
| 7 | RATE_LIMITED | Too many requests | Wait before retry |
| 8 | NETWORK_ERROR | Connection issue | Check internet |
| 9 | PARSE_ERROR | Data extraction failed | Try again |
| 99 | UNKNOWN | Unexpected error | Report issue |
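A caller scripting around the tool might branch on these codes. The sketch below assumes the codes in the table are returned as the process exit status, which the table does not state explicitly. The `describeExit` and `runTranscript` helpers are hypothetical, and treating codes 7, 8, and 9 as retryable is an assumption drawn from the "Resolution" column.

```javascript
const { spawnSync } = require("node:child_process");

// Names copied from the error-code table above.
const EXIT_CODES = {
  0: "SUCCESS",
  1: "INVALID_VIDEO_ID",
  2: "VIDEO_NOT_FOUND",
  3: "TRANSCRIPT_DISABLED",
  4: "NO_TRANSCRIPT",
  5: "VIDEO_UNAVAILABLE",
  6: "PRIVATE_VIDEO",
  7: "RATE_LIMITED",
  8: "NETWORK_ERROR",
  9: "PARSE_ERROR",
  99: "UNKNOWN",
};

// Assumption: only transient failures are worth retrying automatically.
const RETRYABLE = new Set([7, 8, 9]);

// Translate a numeric code into a name plus a retry hint.
function describeExit(code) {
  const name = EXIT_CODES[code] ?? "UNKNOWN";
  return { code, name, retryable: RETRYABLE.has(code) };
}

// Hypothetical wrapper: run the script with Node and classify the result.
// scriptPath stands in for {baseDir}/youtube-transcript.js.
function runTranscript(scriptPath, args) {
  const result = spawnSync(process.execPath, [scriptPath, ...args], {
    encoding: "utf8",
  });
  // status is null when the process was killed by a signal; treat that as UNKNOWN.
  const status = result.status ?? 99;
  return { ...describeExit(status), stdout: result.stdout };
}

console.log(describeExit(7));
```

Permanent conditions such as `PRIVATE_VIDEO` or `TRANSCRIPT_DISABLED` are left non-retryable here, since repeating the request would not change the outcome.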
Video ID/URL
↓
Extract Video ID ← URL parser (7+ formats)
↓
Check Cache ← 24hr TTL store
↓[cache miss]
Fetch YouTube Page ← HTTP with retry logic
↓
Extract Player Data ← ytInitialPlayerResponse
↓
Parse Caption Tracks ← Language selection
↓
Fetch Transcript ← Select appropriate URL
↓
Parse Entries ← XML/JSON parsing
↓
Format Output ← 7 output formats
↓
Cache & Return ← Store for 24hr
{
"title": "How Artificial Intelligence Works",
"author": "Example Channel",
"duration": "PT10M32S",
"language": "en",
"isAutoGenerated": true,
"transcript": [
{
"start": 0.000,
"duration": 5.320,
"text": "In this video, we'll explore how AI systems learn and adapt"
},
{
"start": 5.320,
"duration": 4.180,
"text": "to perform tasks that traditionally required human intelligence"
}
],
"word_count": 2847,
"total_entries": 156
}
1
00:00:00,000 --> 00:00:05,320
In this video, we'll explore how AI systems
learn and adapt
2
00:00:05,320 --> 00:00:09,500
to perform tasks that traditionally
required human intelligence
3
00:00:09,500 --> 00:00:13,240
This process is called
machine learning
...
WEBVTT
1
00:00.000 --> 00:05.320
In this video, we'll explore how AI systems
learn and adapt
2
00:05.320 --> 00:09.500
to perform tasks that traditionally
required human intelligence
...
#!/bin/bash
# Process multiple videos from an IDs file (one video ID per line)
mkdir -p transcripts
for video_id in $(cat video_ids.txt); do
  echo "Processing: $video_id"
  # Test the command directly instead of inspecting $? afterwards
  if {baseDir}/youtube-transcript.js "$video_id" --format srt --output "transcripts/${video_id}.srt" 2>/dev/null; then
    echo " ✓ Success"
  else
    echo " ✗ Failed"
  fi
  # Sleep to respect rate limits
  sleep 2
done
#!/bin/bash
VIDEO_ID="EBw7gsDPAYQ"
# Get transcript
{baseDir}/youtube-transcript.js "$VIDEO_ID" --format plain > transcript.txt
# Convert to PDF (requires pandoc)
pandoc transcript.txt -o transcript.pdf
echo "PDF created: transcript.pdf"
#!/bin/bash
VIDEO_ID="EBw7gsDPAYQ"
# Get JSON format
# jq has no ternary operator, so the auto-generated marker uses if/then/else
{baseDir}/youtube-transcript.js "$VIDEO_ID" --format json | jq -r '
"Title: \(.title)",
"Author: \(.author)",
"Words: \(.word_count)",
"Entries: \(.total_entries)",
"Language: \(.language)\(if .isAutoGenerated then " (auto)" else "" end)"
'
#!/bin/bash
VIDEOS=("VIDEO1" "VIDEO2" "VIDEO3")
TOTAL=${#VIDEOS[@]}
for i in "${!VIDEOS[@]}"; do
id="${VIDEOS[$i]}"
echo "[$((i+1))/$TOTAL] Processing $id..."
{baseDir}/youtube-transcript.js "$id" --format json --output "data/${id}.json" 2>/dev/null
sleep 1 # Rate limit protection
done
#!/bin/bash
VIDEO_ID="your-video-id"
# Get English and Spanish
{baseDir}/youtube-transcript.js "$VIDEO_ID" --language en --format srt > english.srt
echo "English ✓"
{baseDir}/youtube-transcript.js "$VIDEO_ID" --language es --format srt > spanish.srt
echo "Spanish ✓"
# Combine (requires ffmpeg and a local copy of the video saved as video.mp4)
ffmpeg -i video.mp4 -i english.srt -i spanish.srt \
-map 0:v -map 0:a -map 1:s:0 -map 2:s:0 \
-c:v copy -c:a copy -c:s mov_text \
"${VIDEO_ID}_bilingual.mp4"
echo "Bilingual video created ✓"
# First time (slow)
{baseDir}/youtube-transcript.js VIDEO_ID
# Second time (fast - from cache)
{baseDir}/youtube-transcript.js VIDEO_ID
# Force refresh (slow)
{baseDir}/youtube-transcript.js VIDEO_ID --no-cache
# Bad - might hit rate limits
for id in $IDS; do
{baseDir}/youtube-transcript.js "$id"
done
# Good - respects rate limits
for id in $IDS; do
{baseDir}/youtube-transcript.js "$id"
sleep 2
done
# Process 2-3 at a time (don't exceed rate limit)
{baseDir}/youtube-transcript.js VIDEO1 &
{baseDir}/youtube-transcript.js VIDEO2 &
{baseDir}/youtube-transcript.js VIDEO3 &
wait
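For larger batches, the "2-3 at a time" guidance above can be enforced from Node rather than with shell background jobs. The following is a sketch under assumptions: `runLimited` is a hypothetical helper, and the per-lane delay is an illustrative choice rather than a documented limit of the tool.

```javascript
// Run an async worker over items with at most `limit` tasks in flight.
// Each lane optionally pauses for delayMs after finishing a task.
async function runLimited(items, limit, worker, delayMs = 0) {
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

  async function lane() {
    while (next < items.length) {
      const index = next++; // claim an item; safe because JS runs one callback at a time
      try {
        results[index] = { ok: true, value: await worker(items[index]) };
      } catch (error) {
        // Record the failure and keep going so one bad video does not stop the batch.
        results[index] = { ok: false, error };
      }
      if (delayMs > 0) await sleep(delayMs);
    }
  }

  // Start up to `limit` lanes and wait for all of them to drain the queue.
  const lanes = Array.from({ length: Math.min(limit, items.length) }, () => lane());
  await Promise.all(lanes);
  return results;
}

// Usage sketch (not executed here): wrap the CLI with a promisified execFile.
//   const { execFile } = require("node:child_process");
//   const { promisify } = require("node:util");
//   const run = promisify(execFile);
//   runLimited(ids, 2, (id) => run("node", [scriptPath, id]), 2000);
// where ids and scriptPath are placeholders supplied by the caller.
```

Results come back in input order, each marked `ok: true` or `ok: false`, so failed IDs can be collected and retried later.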