chough

Fast ASR (automatic speech recognition) CLI tool for transcribing audio and video files. Use it when the user wants to transcribe audio or video, generate subtitles (VTT), convert speech to text with timestamps (JSON), or reduce memory use during transcription.

NPX Install

npx skill4agent add hyperpuncher/dotagents chough

Installation

Arch Linux: paru -S chough-bin
macOS: brew install --cask hyperpuncher/tap/chough
Windows: winget install chough
Source: go install github.com/hyperpuncher/chough/cmd/chough@latest
Requires: ffmpeg for audio/video support

Quick Reference

# Basic transcription (text to stdout)
chough audio.mp3

# JSON with timestamps
chough -f json podcast.mp3 > transcript.json

# WebVTT subtitles
chough -f vtt -o subs.vtt video.mp4

# Low memory (30s chunks)
chough -c 30 audiobook.mp3
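
The commands above handle one file at a time. As a minimal sketch of batch use, the helpers below write a .vtt file next to each input. The function names `vtt_path` and `transcribe_all` are hypothetical and not part of chough; only the `-f vtt` and `-o` flags shown above are used.

```bash
#!/usr/bin/env bash

# Derive the subtitle path for a media file: same directory and name,
# with the last extension replaced by .vtt.
vtt_path() {
  local src=$1 dir base
  dir=$(dirname -- "$src")
  base=$(basename -- "$src")
  printf '%s/%s.vtt\n' "$dir" "${base%.*}"
}

# Transcribe every file passed as an argument; stop at the first failure.
transcribe_all() {
  local src
  for src in "$@"; do
    chough -f vtt -o "$(vtt_path "$src")" "$src" || return 1
  done
}
```

Calling `transcribe_all *.mp4` would then produce one subtitle file per video, assuming chough is on the PATH.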

Flags

Flag               Description               Default
-c, --chunk-size   Chunk size in seconds     60
-f, --format       Output: text, json, vtt   text
-o, --output       Output file               stdout
--version          Show version              -

Chunk Size Guide

  • 15-30s: Low memory (~500MB), higher error rate
  • 60s: Balanced (default), ~1.6GB RAM
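
The guide above can be turned into a simple rule of thumb. The sketch below is only an illustration: `pick_chunk_size` is a hypothetical helper, the 1600MB threshold is taken from the approximate figures in the guide, and how to measure free memory is left to the caller because it differs by platform.

```bash
#!/usr/bin/env bash

# Suggest a value for -c from the available memory in MB.
# Assumption: about 1.6GB is needed for the 60s default, so anything
# below that falls back to 30s chunks.
pick_chunk_size() {
  local avail_mb=$1
  if [ "$avail_mb" -ge 1600 ]; then
    echo 60
  else
    echo 30
  fi
}

# Example (not run here): chough -c "$(pick_chunk_size 1024)" audiobook.mp3
```

If 30s chunks still exhaust memory, dropping to `-c 15` by hand remains an option, at the cost of a higher error rate.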

Performance

Duration   Time    Speed
15s        2.0s    7.4x realtime
1min       4.3s    14.1x realtime
5min       16.2s   18.5x realtime
30min      90.2s   19.9x realtime
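
The Speed column is the audio duration divided by the processing time. The sketch below shows that arithmetic with two hypothetical helpers, `realtime_factor` and `estimate_seconds`. Recomputing from the rounded times in the table can differ from the listed speeds in the last digit, and any estimate for longer files is an extrapolation rather than a measured result.

```bash
#!/usr/bin/env bash

# Realtime factor: seconds of audio processed per second of wall-clock time.
realtime_factor() {
  awk -v audio="$1" -v wall="$2" 'BEGIN { printf "%.1f\n", audio / wall }'
}

# Rough wall-clock estimate for a file, given its length in seconds
# and an assumed realtime factor.
estimate_seconds() {
  awk -v audio="$1" -v factor="$2" 'BEGIN { printf "%.0f\n", audio / factor }'
}
```

For the 5min row, `realtime_factor 300 16.2` prints 18.5.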

Troubleshooting

  • Out of memory: Use -c 30 or -c 15
  • Model fails: Check the internet connection and verify that $XDG_CACHE_HOME is writable
  • ffmpeg errors: Ensure ffmpeg is installed
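
Two of these checks can be scripted before a long run. The sketch below is a hedged example, not part of chough: `cache_dir` and `preflight` are hypothetical names, and the fallback to `$HOME/.cache` when `$XDG_CACHE_HOME` is unset follows the XDG default, which chough may or may not use on every platform.

```bash
#!/usr/bin/env bash

# Resolve the cache root: $XDG_CACHE_HOME if set, otherwise ~/.cache
# (assumption: XDG default applies).
cache_dir() {
  printf '%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}"
}

# Report each problem on stderr; return non-zero if any check fails.
preflight() {
  local status=0 dir
  if ! command -v ffmpeg >/dev/null 2>&1; then
    echo "ffmpeg not found on PATH" >&2
    status=1
  fi
  dir=$(cache_dir)
  # Only flag the directory if it exists and cannot be written to.
  if [ -e "$dir" ] && [ ! -w "$dir" ]; then
    echo "cache directory not writable: $dir" >&2
    status=1
  fi
  return "$status"
}
```

Running `preflight && chough audio.mp3` would skip transcription when a check fails. Network problems during the model download are not covered here.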

Notes

  • First run downloads ~650MB model to $XDG_CACHE_HOME/chough/models
  • Auto-extracts audio from video files
  • Set CHOUGH_MODEL env var to use custom model path
  • VTT groups tokens into subtitle cues automatically
