parakeet
Parakeet Dictation Skill
Local speech-to-text powered by NVIDIA Parakeet TDT 0.6B V3 (~600MB model, 100% offline).
Two Modes
1. Handy App (Primary — Push-to-Talk into Any Text Field)
Handy is a free, open-source Tauri app (Rust + React) that provides
push-to-talk dictation with Parakeet V3 built in. Inference runs through
transcribe-rs (ONNX Runtime, int8 quantized).
bash
brew install --cask handy
- Default hotkey: ⌥Space (Option-Space) on macOS, Ctrl-Space on Windows/Linux
- Modes: Push-to-talk (hold) or toggle (press to start/stop)
- Select Parakeet V3 in Settings → Models (auto-downloads ~478MB)
- Grant microphone + accessibility permissions
- Includes VAD (Silero), model management UI
- Additional models: Whisper (Small/Medium/Turbo/Large), Moonshine, SenseVoice
- Models stored at `~/Library/Application Support/com.pais.handy/models/`
2. CLI Scripts (Claude Code File Transcription & Terminal Dictation)
CLI scripts remain for headless/terminal use within Claude Code. These use NeMo/PyTorch.
Performance
| System | Speed | Engine |
|---|---|---|
| Handy (M4 Max) | ~30x realtime | transcribe-rs / ONNX int8 |
| Handy (Zen 3) | ~20x realtime | transcribe-rs / ONNX int8 |
| Handy (Skylake i5) | ~5x realtime | transcribe-rs / ONNX int8 |
| NeMo CLI (MPS) | Varies | NeMo / PyTorch |
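To make the speed column concrete, here is a minimal sketch that converts an "Nx realtime" figure into an estimated processing time. The function name and the `SPEEDS` mapping are illustrative assumptions; the factors are the approximate values from the table, so the results are rough estimates rather than measurements.

```python
def transcription_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Estimate processing time for a clip at a given speed multiple of realtime."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


# Approximate factors copied from the table above (hypothetical mapping name).
SPEEDS = {"M4 Max": 30.0, "Zen 3": 20.0, "Skylake i5": 5.0}

one_hour = 60 * 60  # seconds of audio
estimates = {name: transcription_seconds(one_hour, x) for name, x in SPEEDS.items()}

for name, secs in estimates.items():
    print(f"{name}: about {secs / 60:.0f} min for a 60 min recording")
```

Under these figures, a one-hour recording would take roughly 2, 3, and 12 minutes respectively.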
- Accuracy: 6.05% WER (Word Error Rate)
- Languages: 25 European languages with automatic detection (no prompting)
- Privacy: 100% local processing, no cloud API
- License: CC BY 4.0 (model), MIT (Handy app)
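For readers unfamiliar with the accuracy metric, the sketch below shows the usual definition of word error rate: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It is only an illustration of the metric; the source does not say which text normalization or test set produced the 6.05% figure, and `word_error_rate` is a hypothetical helper name.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("a b c d", "a x c")` counts one substitution and one deletion against four reference words.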
Commands
Transcribe Audio File
bash
/parakeet path/to/audio.wav
/parakeet ~/recordings/interview.mp3
/parakeet meeting.m4a
Supported formats: `.wav`, `.mp3`, `.m4a`, `.flac`, `.ogg`, `.aac`
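A caller could reject unsupported files before launching the transcription script. The sketch below assumes a simple extension check against the formats listed above; `is_supported_audio` is a hypothetical helper, not part of the skill's scripts, and it does not inspect the file contents.

```python
from pathlib import Path

# Extensions listed as supported in this document.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac", ".ogg", ".aac"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension (case-insensitive) is in the supported set."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```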
Live Dictation (Terminal)
bash
/parakeet
/parakeet dictate
Record from the microphone until Enter is pressed, then transcribe.
Check Installation
bash
/parakeet check
Verify that Parakeet is properly installed and the model can load.
Setup
Handy (Push-to-Talk UI)
bash
brew install --cask handy
Launch Handy from Applications, select the Parakeet V3 model, and configure the hotkey.
CLI Scripts (Prerequisites)
- Parakeet Dictate repo at `~/Programming/parakeet-dictate/` with a Python venv
- Install dependencies:
bash
cd ~/Programming/parakeet-dictate
uv venv && uv pip install -r requirements.txt
- (Optional) Set a custom path: `export PARAKEET_HOME=/path/to/parakeet-dictate`
Implementation
When this skill is invoked:
- For audio files: Run the transcription script
bash
cd ~/.claude/skills/parakeet/scripts && \
${PARAKEET_HOME:-~/Programming/parakeet-dictate}/.venv/bin/python transcribe.py "<filepath>"
- For live dictation: Run the dictation script
bash
cd ~/.claude/skills/parakeet/scripts && \
${PARAKEET_HOME:-~/Programming/parakeet-dictate}/.venv/bin/python dictate.py
- For checking setup: Run the check script
bash
cd ~/.claude/skills/parakeet/scripts && \
${PARAKEET_HOME:-~/Programming/parakeet-dictate}/.venv/bin/python check_setup.py
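The three invocations share one pattern: change into the scripts directory, then run a script with the venv's Python, falling back to the default repo path when `PARAKEET_HOME` is unset or empty. As a rough sketch of that pattern for callers who prefer Python over shell, the helper below builds the argument list and working directory. `build_command` and the `env` parameter are assumptions made for illustration (the `env` argument exists only to make the function easy to test); the paths and script names come from the commands above.

```python
import os

DEFAULT_HOME = "~/Programming/parakeet-dictate"
SCRIPTS_DIR = "~/.claude/skills/parakeet/scripts"


def build_command(script, *args, env=None):
    """Return (argv, cwd) mirroring the shell commands shown above."""
    env = os.environ if env is None else env
    # Same semantics as ${PARAKEET_HOME:-default}: unset or empty uses the default.
    home = env.get("PARAKEET_HOME") or DEFAULT_HOME
    python = os.path.join(os.path.expanduser(home), ".venv", "bin", "python")
    cwd = os.path.expanduser(SCRIPTS_DIR)
    return [python, script, *args], cwd
```

A caller could then run, for example, `argv, cwd = build_command("transcribe.py", "meeting.m4a")` followed by `subprocess.run(argv, cwd=cwd, check=True)`. Passing arguments as a list also avoids shell quoting problems with file paths that contain spaces.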
Model Caches
| System | Cache Location | Size | Engine |
|---|---|---|---|
| Handy | `~/Library/Application Support/com.pais.handy/models/` | ~478MB | transcribe-rs (ONNX int8) |
| NeMo CLI | | ~1.2GB | NeMo / PyTorch |
Model caches are separate. Handy's Parakeet V3 int8 model structure:
parakeet-tdt-0.6b-v3-int8/
├── encoder-model.int8.onnx
├── decoder_joint-model.int8.onnx
├── nemo128.onnx (audio preprocessor)
└── vocab.txt
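If Handy fails to load the model, one quick sanity check is whether the model directory contains the four files listed above. The following is a minimal sketch under that assumption; `missing_model_files` is a hypothetical helper, and it only checks that the files exist, not that they are complete or uncorrupted.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_FILES = (
    "encoder-model.int8.onnx",
    "decoder_joint-model.int8.onnx",
    "nemo128.onnx",
    "vocab.txt",
)


def missing_model_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir).expanduser()
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means all four files are present; otherwise the list names what to re-download.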
Troubleshooting
"No module named nemo"
Use the Parakeet virtual environment. Scripts automatically use the correct Python.
"MPS not available"
Metal (MPS) acceleration on Apple Silicon requires PyTorch 2.0+. If MPS is unavailable, the scripts fall back to CPU automatically.
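The fallback can be pictured as a simple preference order. The sketch below is an assumption about how such selection might look, not the scripts' actual code: it prefers CUDA, then MPS, then CPU, and treats a missing PyTorch install as CPU-only. `torch.cuda.is_available()` and `torch.backends.mps.is_available()` are real PyTorch calls; the function names are hypothetical.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a device name; the CUDA > MPS > CPU order is an assumption."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe PyTorch for accelerators, falling back to CPU when unavailable."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch installed, nothing to accelerate
    # Older PyTorch builds may lack the mps backend attribute entirely.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)
```

Keeping the decision in `pick_device` separate from the probing makes the preference order easy to test without a GPU.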
"Permission denied: microphone"
Grant microphone access in System Settings → Privacy & Security → Microphone (System Preferences → Security & Privacy on macOS 12 and earlier).
Model download slow
The Parakeet model downloads on first use (~478MB for Handy, ~1.2GB for NeMo). Subsequent runs use cache.
Configuration
| Variable | Default | Description |
|---|---|---|
| `PARAKEET_HOME` | `~/Programming/parakeet-dictate` | Parakeet Dictate installation path |
Dependencies
Handy: `brew install --cask handy` (standalone, no other deps)
CLI scripts require:
- Parakeet Dictate repo at `$PARAKEET_HOME` (default: `~/Programming/parakeet-dictate`)
- Python virtual environment at `$PARAKEET_HOME/.venv`
- NeMo toolkit with ASR support (`nemo_toolkit[asr]>=2.0.0`)
- PyTorch 2.0+ (for MPS/CUDA acceleration)
- soundfile and sounddevice for audio handling
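A lightweight way to confirm the Python-side requirements is to check whether each package is importable from the venv's interpreter. This sketch is not the skill's `check_setup.py`; it assumes the import names `nemo`, `torch`, `soundfile`, and `sounddevice`, and uses `importlib.util.find_spec`, which locates a module without importing it. It does not verify versions.

```python
from importlib.util import find_spec

# Import names assumed for the packages listed above.
REQUIRED_MODULES = ("nemo", "torch", "soundfile", "sounddevice")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in names if find_spec(name) is None]
```

Run it with the venv's Python (for example `$PARAKEET_HOME/.venv/bin/python`) so the result reflects that environment rather than the system interpreter.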