whisper-transcription

Original：🇺🇸 English

Translated

1 scriptsChecked / no sensitive code detected

Transcribe audio and video files to text using OpenAI Whisper. Use when: converting podcasts to blog posts; creating video subtitles; extracting quotes from interviews; repurposing video content to text; building searchable audio archives

9installs

Sourceguia-matthieu/clawfu-skills

Added on2026-02-22

NPX Install

npx skill4agent add guia-matthieu/clawfu-skills whisper-transcription

SKILL.md Content

View Translation Comparison →

Whisper Transcription

Transcribe any audio or video to text using OpenAI's Whisper model - the same technology powering ChatGPT voice features.

When to Use This Skill

Podcast repurposing - Convert episodes to blog posts, show notes, social snippets
Video subtitles - Generate SRT/VTT files for YouTube, social media
Interview extraction - Pull quotes and insights from recorded calls
Content audit - Make audio/video libraries searchable
Translation - Transcribe and translate foreign language content

What Claude Does vs What You Decide

Claude Does	You Decide
Structures production workflow	Final creative direction
Suggests technical approaches	Equipment and tool choices
Creates templates and checklists	Quality standards
Identifies best practices	Brand/voice decisions
Generates script outlines	Final script approval

Dependencies

bash

pip install openai-whisper torch ffmpeg-python click
# Also requires ffmpeg installed on system
# macOS: brew install ffmpeg
# Ubuntu: sudo apt install ffmpeg

Commands

Transcribe Single File

bash

python scripts/main.py transcribe audio.mp3 --model medium --output transcript.txt
python scripts/main.py transcribe video.mp4 --format srt --output subtitles.srt

Batch Transcription

bash

python scripts/main.py batch ./recordings/ --format txt --output ./transcripts/

Transcribe + Translate

bash

python scripts/main.py translate foreign-audio.mp3 --to en

Extract Timestamps

bash

python scripts/main.py timestamps podcast.mp3 --format json

Examples

Example 1: Podcast to Blog Post

bash

# Transcribe 1-hour podcast
python scripts/main.py transcribe episode-42.mp3 --model medium

# Output: episode-42.txt (full transcript with timestamps)
# Processing time: ~5 min for 1 hour audio on M1 Mac

Example 2: YouTube Subtitles

bash

# Generate SRT for video upload
python scripts/main.py transcribe marketing-video.mp4 --format srt

# Output: marketing-video.srt
# Upload directly to YouTube/Vimeo

Example 3: Batch Process Interview Library

bash

# Transcribe all recordings in folder
python scripts/main.py batch ./customer-interviews/ --model small --format txt

# Output: ./customer-interviews/*.txt (one per audio file)

Model Selection Guide

Model	Speed	Accuracy	VRAM	Best For
`tiny`	Fastest	~70%	1GB	Quick drafts, short clips
`base`	Fast	~80%	1GB	Social media clips
`small`	Medium	~85%	2GB	Podcasts, interviews
`medium`	Slow	~90%	5GB	Professional transcripts
`large`	Slowest	~95%	10GB	Critical accuracy needs

Recommendation: Start with

small

for most marketing content. Use

medium

for client deliverables.

Output Formats

Format	Extension	Use Case
`txt`	.txt	Blog posts, analysis
`srt`	.srt	Video subtitles (YouTube)
`vtt`	.vtt	Web video subtitles
`json`	.json	Programmatic access
`tsv`	.tsv	Spreadsheet analysis

Performance Tips

GPU acceleration - 10x faster with CUDA GPU
Audio extraction - Script auto-extracts audio from video
Chunking - Long files auto-split for memory efficiency
Language detection - Automatic, or specify with
```
--language
```

Skill Boundaries

What This Skill Does Well

Structuring audio production workflows
Providing technical guidance
Creating quality checklists
Suggesting creative approaches

What This Skill Cannot Do

Replace audio engineering expertise
Make subjective creative decisions
Access or edit audio files directly
Guarantee commercial success

Related Skills

video-processing - Extract audio from video
youtube-downloader - Download videos to transcribe
content-repurposer - Transform transcripts to content
podcast-production - Create podcasts

Skill Metadata

Mode: cyborg

yaml

category: automation
subcategory: audio-processing
dependencies: [openai-whisper, torch, ffmpeg-python]
difficulty: beginner
time_saved: 10+ hours/week

whisper-transcription

NPX Install

Tags

SKILL.md Content

Whisper Transcription

When to Use This Skill

What Claude Does vs What You Decide

Dependencies

Commands

Transcribe Single File

Batch Transcription

Transcribe + Translate

Extract Timestamps

Examples

Example 1: Podcast to Blog Post

Example 2: YouTube Subtitles

Example 3: Batch Process Interview Library

Model Selection Guide

Output Formats

Performance Tips

Skill Boundaries

What This Skill Does Well

What This Skill Cannot Do

Related Skills

Skill Metadata