ez-stt
ez-stt - Local Speech-to-Text
Unified local speech-to-text using ONNX Runtime with int8 quantization. Choose your backend:
- Parakeet (default): Best accuracy for English, correctly captures names and filler words
- Whisper: Fastest inference, supports 99 languages
Requires `ffmpeg` to be installed.
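A quick preflight check for the `ffmpeg` requirement can be sketched as follows. This is a minimal sketch that only looks for an executable named `ffmpeg` on `PATH`; the helper name `ffmpeg_available` is hypothetical and is not part of ez-stt.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an executable named `ffmpeg` is found on PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found")
    else:
        print("ffmpeg not found on PATH; install it before running scripts/stt.py")
```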
Usage
```bash
# Default: Parakeet v2 (best English accuracy)
scripts/stt.py audio.ogg

# Explicit backend selection
scripts/stt.py audio.ogg -b whisper
scripts/stt.py audio.ogg -b parakeet -m v3

# Quiet mode (suppress progress)
scripts/stt.py audio.ogg --quiet
```
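The same invocations can be driven from Python. The sketch below builds the argument list from the flags shown in the usage commands and runs the script with `subprocess`. The helper names `build_stt_command` and `transcribe` are hypothetical, and it is an assumption that the script prints the transcript to stdout; the README does not say where the output goes.

```python
import subprocess
from typing import List, Optional


def build_stt_command(
    audio_path: str,
    backend: Optional[str] = None,
    model: Optional[str] = None,
    quiet: bool = False,
    script: str = "scripts/stt.py",
) -> List[str]:
    """Assemble an argument list using only flags documented in this README."""
    cmd = [script, audio_path]
    if backend is not None:
        cmd += ["-b", backend]
    if model is not None:
        cmd += ["-m", model]
    if quiet:
        cmd.append("--quiet")
    return cmd


def transcribe(audio_path: str, **options) -> str:
    """Run the script and return whatever it writes to stdout.

    Assumption: the transcript is written to stdout.
    """
    result = subprocess.run(
        build_stt_command(audio_path, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Only print the command here, so the sketch is safe to run without audio.
    print(build_stt_command("audio.ogg", backend="parakeet", model="v3", quiet=True))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.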
Options
- `-b/--backend`: `parakeet` (default), `whisper`
- `-m/--model`: Model variant (see below)
- `--no-int8`: Disable int8 quantization
- `-q/--quiet`: Suppress progress
- `--room-id`: Matrix room ID for direct message
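To see how these options fit together, here is an `argparse` sketch that mirrors the documented flags. It is an illustration only, not the source of `scripts/stt.py`: the positional name `audio`, the help strings, and the `None` defaults for `--model` and `--room-id` are assumptions.

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    # Hypothetical parser that mirrors the documented flags.
    parser = argparse.ArgumentParser(prog="stt.py", description="Local speech-to-text")
    parser.add_argument("audio", help="path to the audio file")
    parser.add_argument("-b", "--backend", choices=["parakeet", "whisper"],
                        default="parakeet", help="speech-to-text backend")
    parser.add_argument("-m", "--model", default=None,
                        help="model variant; the backend default is used if omitted")
    parser.add_argument("--no-int8", action="store_true",
                        help="disable int8 quantization")
    parser.add_argument("-q", "--quiet", action="store_true",
                        help="suppress progress")
    parser.add_argument("--room-id", default=None,
                        help="Matrix room ID for direct message")
    return parser


if __name__ == "__main__":
    args = make_parser().parse_args(["audio.ogg", "-b", "whisper", "--no-int8"])
    print(args.backend, args.no_int8, args.quiet)
```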
Models
Parakeet (default backend)
| Model | Description |
|---|---|
| v2 (default) | English only, best accuracy |
| v3 | Multilingual |
Whisper
| Model | Description |
|---|---|
| tiny | Fastest, lower accuracy |
| base (default) | Good balance |
| small | Better accuracy |
| large-v3-turbo | Best quality, slower |
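The backend and model tables can be captured as a small lookup for validating a `-b`/`-m` combination before launching a run. The variants and defaults below come from the tables; the `MODELS` mapping and the `resolve_model` helper are hypothetical and are not part of ez-stt.

```python
from typing import Optional

# Variants and defaults as listed in the Parakeet and Whisper tables.
MODELS = {
    "parakeet": {"default": "v2", "variants": ["v2", "v3"]},
    "whisper": {"default": "base", "variants": ["tiny", "base", "small", "large-v3-turbo"]},
}


def resolve_model(backend: str = "parakeet", model: Optional[str] = None) -> str:
    """Return the model to use, falling back to the backend's default."""
    if backend not in MODELS:
        raise ValueError(f"unknown backend {backend!r}; expected one of {sorted(MODELS)}")
    entry = MODELS[backend]
    if model is None:
        return entry["default"]
    if model not in entry["variants"]:
        raise ValueError(
            f"model {model!r} is not listed for backend {backend!r}; "
            f"expected one of {entry['variants']}"
        )
    return model


if __name__ == "__main__":
    print(resolve_model())                    # default backend, default model
    print(resolve_model("whisper", "small"))  # explicit backend and model
```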
Benchmark (24s audio)
| Backend/Model | Time | RTF | Notes |
|---|---|---|---|
| Whisper Base int8 | 0.43s | 0.018x | Fastest |
| Parakeet v2 int8 | 0.60s | 0.025x | Best accuracy |
| Parakeet v3 int8 | 0.63s | 0.026x | Multilingual |
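The RTF column is consistent with the usual definition of real-time factor: processing time divided by audio duration, so values below 1.0 mean faster than real time. A short sketch that recomputes it from the timings in the table follows; the function name `real_time_factor` is chosen here for illustration.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    audio_seconds = 24.0  # clip length used in the benchmark
    timings = {
        "Whisper Base int8": 0.43,
        "Parakeet v2 int8": 0.60,
        "Parakeet v3 int8": 0.63,
    }
    for name, seconds in timings.items():
        rtf = real_time_factor(seconds, audio_seconds)
        print(f"{name}: RTF {rtf:.3f}x")
```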
OpenClaw
See OPENCLAW.md for OpenClaw-specific setup and `openclaw.json` configuration.