Transcribe audio to text with Whisper models via inference.sh CLI. Models: Fast Whisper Large V3, Whisper V3 Large. Capabilities: transcription, translation, multi-language, timestamps. Use for: meeting transcription, subtitles, podcast transcripts, voice notes. Triggers: speech to text, transcription, whisper, audio to text, transcribe audio, voice to text, stt, automatic transcription, subtitles generation, transcribe meeting, audio transcription, whisper ai
npx skill4agent add inference-shell/skills speech-to-text
Requires inference.sh CLI (infsh). Get installation instructions:
npx skills add inference-sh/skills@agent-tools
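Because every command here depends on the CLI, a script can check that `infsh` is on the PATH before it tries to log in or run an app. The following is a minimal sketch; `have_cmd` is a hypothetical helper name, not part of the CLI.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd infsh; then
  echo "infsh found: $(command -v infsh)"
else
  echo "infsh not found; install the CLI before running the examples" >&2
fi
```

The check only confirms that the binary exists; it does not verify that a login session is active.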
infsh login
infsh app run infsh/fast-whisper-large-v3 --input '{"audio_url": "https://audio.mp3"}'

| Model | App ID | Best For |
|---|---|---|
| Fast Whisper V3 | infsh/fast-whisper-large-v3 | Fast transcription |
| Whisper V3 Large | infsh/whisper-v3-large | Highest accuracy |
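The speed versus accuracy trade-off in the table can be encoded in a small wrapper so that scripts name a priority rather than an app ID. This is a sketch only: `whisper_app_for` is a hypothetical helper, and the two app IDs are the ones used in this document's commands.

```shell
#!/bin/sh
# Hypothetical helper: map a priority to one of the two Whisper app IDs
# that appear in this document's example commands.
whisper_app_for() {
  case "$1" in
    speed)    echo "infsh/fast-whisper-large-v3" ;;
    accuracy) echo "infsh/whisper-v3-large" ;;
    *)        echo "usage: whisper_app_for speed|accuracy" >&2; return 1 ;;
  esac
}

# Print (rather than run) the command that would be executed.
app=$(whisper_app_for accuracy) && echo "infsh app run $app --input input.json"
```

Printing the command first makes it easy to inspect before running it for real.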
infsh app run infsh/fast-whisper-large-v3 --input '{"audio_url": "https://meeting.mp3"}'

infsh app sample infsh/fast-whisper-large-v3 --save input.json
# {
# "audio_url": "https://podcast.mp3",
# "timestamps": true
# }
infsh app run infsh/fast-whisper-large-v3 --input input.json

infsh app run infsh/whisper-v3-large --input '{
"audio_url": "https://french-audio.mp3",
"task": "translate"
}'

# Extract audio from video first
infsh app run infsh/video-audio-extractor --input '{"video_url": "https://video.mp4"}' > audio.json
# Transcribe the extracted audio
infsh app run infsh/fast-whisper-large-v3 --input '{"audio_url": "<audio-url>"}'

# 1. Transcribe video audio
infsh app run infsh/fast-whisper-large-v3 --input '{
"audio_url": "https://video.mp4",
"timestamps": true
}' > transcript.json
# 2. Use transcript for captions
infsh app run infsh/caption-videos --input '{
"video_url": "https://video.mp4",
"captions": "<transcript-from-step-1>"
}'

Output fields: text, segments, language

# Full platform skill (all 150+ apps)
npx skills add inference-sh/skills@agent-tools
# Text-to-speech (reverse direction)
npx skills add inference-sh/skills@text-to-speech
# Video generation (add captions)
npx skills add inference-sh/skills@ai-video-generation
# AI avatars (lipsync with transcripts)
npx skills add inference-sh/skills@ai-avatar-video

infsh app list --category audio
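For the subtitle use case, timestamped segments eventually need to be written as SRT cues. The sketch below shows only the formatting step. It assumes each segment provides a start time, an end time (both in seconds), and a caption text, as in the open-source Whisper output; the exact segment schema returned by these apps is not confirmed here. `srt_time` and `srt_cue` are hypothetical helper names, and the sample values are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: convert seconds (possibly fractional) to HH:MM:SS,mmm.
srt_time() {
  awk -v t="$1" 'BEGIN {
    ms = int(t * 1000 + 0.5)                   # round to whole milliseconds
    h  = int(ms / 3600000); ms -= h * 3600000  # hours
    m  = int(ms / 60000);   ms -= m * 60000    # minutes
    s  = int(ms / 1000);    ms -= s * 1000     # seconds; remainder is ms
    printf "%02d:%02d:%02d,%03d\n", h, m, s, ms
  }'
}

# Hypothetical helper: print one SRT cue from index, start, end, and text.
srt_cue() {
  printf '%s\n%s --> %s\n%s\n\n' "$1" "$(srt_time "$2")" "$(srt_time "$3")" "$4"
}

# Illustrative values, not real transcription output.
srt_cue 1 0 2.5 "Hello and welcome."
srt_cue 2 2.5 6.75 "Thanks for listening."
```

Extracting the segment values from the transcription JSON is left out on purpose, since it depends on the actual response layout.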