Speech-to-Text
Transcribe audio to text via inference.sh CLI.
Quick Start
```bash
curl -fsSL https://cli.inference.sh | sh && infsh login
infsh app run infsh/fast-whisper-large-v3 --input '{"audio_url": "https://audio.mp3"}'
```
Available Models
| Model | App ID | Best For |
|---|---|---|
| Fast Whisper V3 | `infsh/fast-whisper-large-v3` | Fast transcription |
| Whisper V3 Large | `infsh/whisper-v3-large` | Highest accuracy |
Examples
Basic Transcription
```bash
infsh app run infsh/fast-whisper-large-v3 --input '{"audio_url": "https://meeting.mp3"}'
```
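When several recordings need the same treatment, the single command above can be wrapped in a loop. The following is a minimal sketch, not part of the original skill: the helper names `build_input` and `transcribe_all` and the `transcript-N.json` output naming are illustrative assumptions, and the payload builder assumes URLs contain no double quotes or backslashes.

```bash
# Build the JSON payload for a single audio URL.
# Assumes the URL contains no double quotes or backslashes.
build_input() {
  printf '{"audio_url": "%s"}' "$1"
}

# Transcribe each URL given as an argument, one run per file.
# The transcript-N.json naming is an illustrative choice.
transcribe_all() {
  n=0
  for url in "$@"; do
    n=$((n + 1))
    infsh app run infsh/fast-whisper-large-v3 \
      --input "$(build_input "$url")" > "transcript-$n.json"
  done
}
```

Calling `transcribe_all "https://meeting.mp3" "https://podcast.mp3"` would then run the app once per URL.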
With Timestamps
```bash
# Save a sample input file for the app
infsh app sample infsh/fast-whisper-large-v3 --save input.json
# Edit input.json so that it contains:
# {
#   "audio_url": "https://podcast.mp3",
#   "timestamps": true
# }
infsh app run infsh/fast-whisper-large-v3 --input input.json
```
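If editing the sample file by hand is inconvenient, the same input file can be written from a script. This is a sketch under the assumption that only the two fields shown above are needed; `write_timestamp_input` is a hypothetical helper name, not an `infsh` command.

```bash
# Write an input file that requests timestamps.
# $1: audio URL, $2: output path (defaults to input.json)
write_timestamp_input() {
  cat > "${2:-input.json}" <<EOF
{
  "audio_url": "$1",
  "timestamps": true
}
EOF
}
```

After `write_timestamp_input "https://podcast.mp3"`, the resulting `input.json` can be passed to `infsh app run` with `--input input.json` as in the example above.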
Translation (to English)
```bash
infsh app run infsh/whisper-v3-large --input '{
  "audio_url": "https://french-audio.mp3",
  "task": "translate"
}'
```
From Video
```bash
# Extract audio from video first
infsh app run infsh/video-audio-extractor --input '{"video_url": "https://video.mp4"}' > audio.json
# Transcribe the extracted audio
infsh app run infsh/fast-whisper-large-v3 --input '{"audio_url": "<audio-url>"}'
```
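The `<audio-url>` placeholder has to be filled in from `audio.json`, whose schema is not shown here. The sketch below assumes the extractor writes a flat JSON object with the URL in a top-level string field; the field name `audio` is purely hypothetical, so inspect `audio.json` for the real name. The `sed` match is a stopgap for flat output, not a general JSON parser.

```bash
# Print the first string value stored under key $2 in JSON file $1.
# A plain sed match: fine for flat output, not a general JSON parser.
json_string_field() {
  sed -n 's/.*"'"$2"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Hypothetical: "audio" is an assumed field name; check audio.json for the real one.
# $1: extractor output file (default audio.json), $2: field name (default audio)
transcribe_extracted() {
  audio_url="$(json_string_field "${1:-audio.json}" "${2:-audio}")"
  infsh app run infsh/fast-whisper-large-v3 --input "{\"audio_url\": \"$audio_url\"}"
}
```

If the field turns out to be named differently, pass it as the second argument, for example `transcribe_extracted audio.json url`.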
Workflow: Video Subtitles
```bash
# 1. Transcribe video audio
infsh app run infsh/fast-whisper-large-v3 --input '{
  "audio_url": "https://video.mp4",
  "timestamps": true
}' > transcript.json
# 2. Use transcript for captions
infsh app run infsh/caption-videos --input '{
  "video_url": "https://video.mp4",
  "captions": "<transcript-from-step-1>"
}'
```
Supported Languages
Whisper supports 99+ languages including:
English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Hindi, Russian, and many more.
Use Cases
- Meetings: Transcribe recordings
- Podcasts: Generate transcripts
- Subtitles: Create captions for videos
- Voice Notes: Convert to searchable text
- Interviews: Transcription for research
- Accessibility: Make audio content accessible
Output Format
Returns JSON with:
- `text`: Full transcription
- `segments`: Timestamped segments (if requested)
- `language`: Detected language
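Timestamped segments are typically turned into subtitle cues. The layout of each segment is not documented here, so this sketch only covers the formatting step and assumes each segment provides a start time, an end time (both in seconds), and its text. `srt_time` and `srt_cue` are hypothetical helper names.

```bash
# Convert seconds (may be fractional) to an SRT timestamp: HH:MM:SS,mmm
srt_time() {
  awk -v s="$1" 'BEGIN {
    ms = int(s * 1000 + 0.5)            # round to whole milliseconds
    h = int(ms / 3600000); ms -= h * 3600000
    m = int(ms / 60000);   ms -= m * 60000
    sec = int(ms / 1000);  ms -= sec * 1000
    printf "%02d:%02d:%02d,%03d\n", h, m, sec, ms
  }'
}

# Print one SRT cue followed by a blank line.
# $1: cue index, $2: start seconds, $3: end seconds, $4: cue text
srt_cue() {
  printf '%d\n%s --> %s\n%s\n\n' "$1" "$(srt_time "$2")" "$(srt_time "$3")" "$4"
}
```

For example, `srt_cue 1 0 2.5 "Hello"` prints a cue running from `00:00:00,000` to `00:00:02,500`.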
Related Skills
```bash
# Full platform skill (all 150+ apps)
npx skills add inference-sh/skills@inference-sh
# Text-to-speech (reverse direction)
npx skills add inference-sh/skills@text-to-speech
# Video generation (add captions)
npx skills add inference-sh/skills@ai-video-generation
# AI avatars (lipsync with transcripts)
npx skills add inference-sh/skills@ai-avatar-video
```
Browse all audio apps: `infsh app list --category audio`
Documentation
- Running Apps - How to run apps via CLI
- Audio Transcription Example - Complete transcription guide
- Apps Overview - Understanding the app ecosystem