Transcribe audio and video files to text using OpenAI Whisper. Use when: converting podcasts to blog posts; creating video subtitles; extracting quotes from interviews; repurposing video content to text; building searchable audio archives
npx skill4agent add guia-matthieu/clawfu-skills whisper-transcription

Transcribe any audio or video to text using OpenAI's Whisper model, the same technology powering ChatGPT voice features.
| Claude Does | You Decide |
|---|---|
| Structures production workflow | Final creative direction |
| Suggests technical approaches | Equipment and tool choices |
| Creates templates and checklists | Quality standards |
| Identifies best practices | Brand/voice decisions |
| Generates script outlines | Final script approval |
pip install openai-whisper torch ffmpeg-python click
# Also requires ffmpeg installed on system
# macOS: brew install ffmpeg
# Ubuntu: sudo apt install ffmpeg

python scripts/main.py transcribe audio.mp3 --model medium --output transcript.txt

python scripts/main.py transcribe video.mp4 --format srt --output subtitles.srt

python scripts/main.py batch ./recordings/ --format txt --output ./transcripts/

python scripts/main.py translate foreign-audio.mp3 --to en

python scripts/main.py timestamps podcast.mp3 --format json

# Transcribe 1-hour podcast
python scripts/main.py transcribe episode-42.mp3 --model medium
# Output: episode-42.txt (full transcript with timestamps)
# Processing time: ~5 min for 1 hour audio on M1 Mac

# Generate SRT for video upload
python scripts/main.py transcribe marketing-video.mp4 --format srt
# Output: marketing-video.srt
# Upload directly to YouTube/Vimeo

# Transcribe all recordings in folder
python scripts/main.py batch ./customer-interviews/ --model small --format txt
# Output: ./customer-interviews/*.txt (one per audio file)

| Model | Speed | Accuracy | VRAM | Best For |
|---|---|---|---|---|
| tiny | Fastest | ~70% | 1GB | Quick drafts, short clips |
| base | Fast | ~80% | 1GB | Social media clips |
| small | Medium | ~85% | 2GB | Podcasts, interviews |
| medium | Slow | ~90% | 5GB | Professional transcripts |
| large | Slowest | ~95% | 10GB | Critical accuracy needs |
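The VRAM column can drive model selection automatically. The sketch below is illustrative, not part of `scripts/main.py`: it assumes the five table rows correspond, in order, to the standard Whisper sizes `tiny`, `base`, `small`, `medium`, and `large`, and the helper names `pick_model` and `transcribe_file` are hypothetical. The only library calls used are `whisper.load_model` and `model.transcribe` from the `openai-whisper` package.

```python
# VRAM needs in GB, copied from the table above. The model names are an
# assumption: the standard Whisper sizes matched to the rows in order.
MODEL_VRAM_GB = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large", 10),
]


def pick_model(available_vram_gb: float) -> str:
    """Return the largest model whose listed VRAM need fits the budget."""
    best = None
    for name, needed_gb in MODEL_VRAM_GB:
        if needed_gb <= available_vram_gb:
            best = name  # later entries are larger, so keep overwriting
    if best is None:
        raise ValueError(f"No model fits in {available_vram_gb} GB of VRAM")
    return best


def transcribe_file(path: str, available_vram_gb: float) -> str:
    """Hypothetical helper: load the chosen model and return plain text."""
    import whisper  # imported lazily so pick_model works without the package

    model = whisper.load_model(pick_model(available_vram_gb))
    result = model.transcribe(path)
    return result["text"]
```

With a 4 GB budget, `pick_model(4)` selects `small`; because `tiny` and `base` both list 1GB, a 1 GB budget resolves to `base`.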
| Format | Extension | Use Case |
|---|---|---|
| txt | .txt | Blog posts, analysis |
| srt | .srt | Video subtitles (YouTube) |
| vtt | .vtt | Web video subtitles |
| json | .json | Programmatic access |
| tsv | .tsv | Spreadsheet analysis |
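To make the `.srt` format concrete, here is a minimal sketch of turning timed segments into subtitle text. It assumes segments shaped like the entries in the `"segments"` list of a Whisper transcription result (dicts with `start` and `end` in seconds plus `text`). The function names `srt_timestamp` and `segments_to_srt` are hypothetical, and the script's actual implementation may differ.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm, the SRT timestamp style."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text: numbered cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        text = seg["text"].strip()  # segment text often has a leading space
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


# Hand-written sample segments, not real transcription output.
sample = [
    {"start": 0.0, "end": 2.5, "text": " Hello"},
    {"start": 2.5, "end": 4.0, "text": " world"},
]
print(segments_to_srt(sample))
```

Each cue is an index line, a `start --> end` line, and the text. WebVTT (`.vtt`) uses a similar cue layout but begins with a `WEBVTT` header and writes the milliseconds after a period rather than a comma.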
--language

category: automation
subcategory: audio-processing
dependencies: [openai-whisper, torch, ffmpeg-python]
difficulty: beginner
time_saved: 10+ hours/week