parakeet

Parakeet Dictation Skill

Local speech-to-text powered by NVIDIA Parakeet TDT 0.6B V3 (~600MB model, 100% offline).

Two Modes

1. Handy App (Primary — Push-to-Talk into Any Text Field)

Handy is a free, open-source Tauri app (Rust + React) providing push-to-talk dictation with Parakeet V3 built in. Inference via transcribe-rs (ONNX Runtime, int8 quantized).
    brew install --cask handy
  • Default hotkey: ⌥Space (Option-Space) on macOS, Ctrl-Space on Windows/Linux
  • Modes: Push-to-talk (hold) or toggle (press to start/stop)
  • Select Parakeet V3 in Settings → Models (auto-downloads ~478MB)
  • Grant microphone + accessibility permissions
  • Includes VAD (Silero), model management UI
  • Additional models: Whisper (Small/Medium/Turbo/Large), Moonshine, SenseVoice
  • Models stored at ~/Library/Application Support/com.pais.handy/models/

2. CLI Scripts (Claude Code File Transcription & Terminal Dictation)

CLI scripts remain for headless/terminal use within Claude Code. These use NeMo/PyTorch.

Performance

| System | Speed | Engine |
| --- | --- | --- |
| Handy (M4 Max) | ~30x realtime | transcribe-rs / ONNX int8 |
| Handy (Zen 3) | ~20x realtime | transcribe-rs / ONNX int8 |
| Handy (Skylake i5) | ~5x realtime | transcribe-rs / ONNX int8 |
| NeMo CLI (MPS) | Varies | NeMo / PyTorch |
  • Accuracy: 6.05% WER (Word Error Rate)
  • Languages: 25 European languages with automatic detection (no prompting)
  • Privacy: 100% local processing, no cloud API
  • License: CC BY 4.0 (model), MIT (Handy app)
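
The speed figures above are real-time factors: at Nx realtime, one second of compute handles roughly N seconds of audio. A minimal sketch of that arithmetic, using the factors listed above (the function name `estimated_seconds` is illustrative, and real throughput will vary with the audio and system load):

```python
def estimated_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Approximate wall-clock time to transcribe audio at a given real-time factor."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


# Factors copied from the performance figures above; a one-hour recording is 3600 s.
for system, factor in [("M4 Max", 30), ("Zen 3", 20), ("Skylake i5", 5)]:
    minutes = estimated_seconds(3600, factor) / 60
    print(f"{system}: about {minutes:.0f} min for one hour of audio")
```

By this estimate, an hour of audio takes about 2 minutes on the M4 Max, 3 minutes on Zen 3, and 12 minutes on the Skylake i5.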

Commands

Transcribe Audio File

    /parakeet path/to/audio.wav
    /parakeet ~/recordings/interview.mp3
    /parakeet meeting.m4a
Supported formats: .wav, .mp3, .m4a, .flac, .ogg, .aac
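
A caller may want to reject unsupported files before starting a transcription. The sketch below checks a path against the extensions listed above. Whether transcribe.py performs such a check itself is not stated here, so `is_supported_audio` is a hypothetical helper, not part of the skill:

```python
from pathlib import Path

# Extensions taken from the supported-formats list above.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac", ".ogg", ".aac"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension (case-insensitive) is a supported format."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

The check looks only at the file name; it does not confirm that the file exists or that its contents match the extension.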

Live Dictation (Terminal)

    /parakeet
    /parakeet dictate
Records from the microphone until Enter is pressed, then transcribes the recording.

Check Installation

    /parakeet check
Verifies that Parakeet is installed correctly and that the model can load.
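
As a rough idea of what part of such a check can involve, the sketch below reports which required Python modules the current interpreter cannot find. The contents of check_setup.py are not shown in this document, so this is an assumption rather than its actual logic. The module names follow the Dependencies section, and `missing_modules` is a hypothetical helper:

```python
import importlib.util
from typing import Iterable, List

# Import names for the packages listed under Dependencies (nemo_toolkit imports as "nemo").
REQUIRED_MODULES = ("nemo", "torch", "soundfile", "sounddevice")


def missing_modules(names: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the module names that cannot be located, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

This only confirms that the packages are discoverable. It does not load the model, and it is meaningful only when run with the virtual environment's Python.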

Setup

Handy (Push-to-Talk UI)

    brew install --cask handy
Launch Handy from Applications, select the Parakeet V3 model, and configure the hotkey.

CLI Scripts (Prerequisites)

  1. Parakeet Dictate repo at ~/Programming/parakeet-dictate/ with a Python venv
  2. Install dependencies:
    cd ~/Programming/parakeet-dictate
    uv venv && uv pip install -r requirements.txt
  3. (Optional) Set a custom path:
    export PARAKEET_HOME=/path/to/parakeet-dictate

Implementation

When this skill is invoked:
  1. For audio files: Run the transcription script
    cd ~/.claude/skills/parakeet/scripts && \
    ${PARAKEET_HOME:-~/Programming/parakeet-dictate}/.venv/bin/python transcribe.py "<filepath>"
  2. For live dictation: Run the dictation script
    cd ~/.claude/skills/parakeet/scripts && \
    ${PARAKEET_HOME:-~/Programming/parakeet-dictate}/.venv/bin/python dictate.py
  3. For checking setup: Run the check script
    cd ~/.claude/skills/parakeet/scripts && \
    ${PARAKEET_HOME:-~/Programming/parakeet-dictate}/.venv/bin/python check_setup.py
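
The dispatch above can be sketched in Python as follows. This is an illustration under stated assumptions, not the skill's actual dispatcher: `venv_python` and `build_command` are hypothetical names, and the argument-to-script mapping is inferred from the Commands section.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional

DEFAULT_HOME = "~/Programming/parakeet-dictate"  # documented default for PARAKEET_HOME
SCRIPTS_DIR = Path("~/.claude/skills/parakeet/scripts").expanduser()


def venv_python(env: Optional[Mapping[str, str]] = None) -> Path:
    """Mirror ${PARAKEET_HOME:-default}: fall back when the variable is unset or empty."""
    env = os.environ if env is None else env
    home = env.get("PARAKEET_HOME") or DEFAULT_HOME
    return Path(home).expanduser() / ".venv" / "bin" / "python"


def build_command(args: List[str], env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Map /parakeet arguments to the matching script invocation."""
    python = str(venv_python(env))
    if not args or args[0] == "dictate":
        return [python, "dictate.py"]          # live dictation
    if args[0] == "check":
        return [python, "check_setup.py"]      # installation check
    return [python, "transcribe.py", args[0]]  # anything else is treated as a file path
```

The resulting list could be passed to `subprocess.run(cmd, cwd=SCRIPTS_DIR, check=True)`. Because the scripts run from the scripts directory, it is probably safest to pass the audio file as an absolute path.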

Model Caches

| System | Cache Location | Size | Engine |
| --- | --- | --- | --- |
| Handy | ~/Library/Application Support/com.pais.handy/models/ | ~478MB | transcribe-rs (ONNX int8) |
| NeMo CLI | ~/.cache/nemo/ | ~1.2GB | NeMo / PyTorch |
Model caches are separate. Handy's Parakeet V3 int8 model structure:
parakeet-tdt-0.6b-v3-int8/
├── encoder-model.int8.onnx
├── decoder_joint-model.int8.onnx
├── nemo128.onnx (audio preprocessor)
└── vocab.txt
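
If a Handy download is interrupted, the model directory may be incomplete. Assuming the layout shown above, a small check such as the following lists any expected files that are absent (`missing_model_files` is a hypothetical helper, not part of Handy):

```python
from pathlib import Path
from typing import List

# File names taken from the directory listing above.
EXPECTED_FILES = (
    "encoder-model.int8.onnx",
    "decoder_joint-model.int8.onnx",
    "nemo128.onnx",
    "vocab.txt",
)


def missing_model_files(model_dir: str) -> List[str]:
    """Return the expected model files that are not present in model_dir."""
    directory = Path(model_dir).expanduser()
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]
```

An empty list means all four files are present. The check does not validate their contents or sizes.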

Troubleshooting

"No module named nemo"

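Run the scripts with the Parakeet virtual environment's Python. The skill's commands already call $PARAKEET_HOME/.venv/bin/python, so this error usually means a script was launched with a different interpreter.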

"MPS not available"

Metal (MPS) acceleration on Apple Silicon requires PyTorch 2.0+. If MPS is unavailable, the scripts fall back to CPU automatically.
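
A fallback of this kind is commonly written as below. This is a sketch of the general pattern, not the scripts' actual code: `pick_device` is a hypothetical name, and both the CUDA-before-MPS ordering and the "cpu" result when PyTorch is missing are assumptions made for illustration.

```python
def pick_device() -> str:
    """Choose the best available PyTorch device name, falling back to CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"  # illustrative choice: without PyTorch, report the CPU default
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps is absent on older PyTorch builds, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

The returned string can be passed to calls such as `model.to(pick_device())`.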

"Permission denied: microphone"

Grant microphone access in System Settings → Privacy & Security → Microphone (on macOS 12 and earlier: System Preferences → Security & Privacy → Privacy → Microphone).

Model download slow

The Parakeet model downloads on first use (~478MB for Handy, ~1.2GB for NeMo). Subsequent runs use the cached copy.

Configuration

| Variable | Default | Description |
| --- | --- | --- |
| PARAKEET_HOME | ~/Programming/parakeet-dictate | Parakeet Dictate installation path |

Dependencies

Handy: brew install --cask handy (standalone, no other deps)
CLI scripts require:
  • Parakeet Dictate repo at $PARAKEET_HOME (default: ~/Programming/parakeet-dictate)
  • Python virtual environment at $PARAKEET_HOME/.venv
  • NeMo toolkit with ASR support (nemo_toolkit[asr]>=2.0.0)
  • PyTorch 2.0+ (for MPS/CUDA acceleration)
  • soundfile and sounddevice for audio handling