vox

Original：🇨🇳 Chinese

Translated

5 scripts

Vox single-entry voice orchestration skill. Used to complete environment guarding, CLI installation, on-demand model download, ASR transcription, voice cloning, pipeline execution and task troubleshooting through natural language. It is used when users only describe the target without providing specific commands.

11installs

Sourcecatoncat/vox-cli

Added on2026-02-19

NPX Install

npx skill4agent add catoncat/vox-cli vox

SKILL.md Content (Chinese)

View Translation Comparison →

Vox

Use a single entry point to cover the full-process capabilities of Vox CLI. Always execute via CLI, do not write inference code directly.

First switch to the skill root directory (where

SKILL.md

is located) before running the following scripts to avoid relative path errors.

Fixed Workflow

First perform initialization and guarding:

bash

# Side-effect-free pre-check (recommended to run first)
bash scripts/bootstrap.sh --check

# Actual installation and repair
bash scripts/bootstrap.sh

Match intents based on requirements (see
```
references/intents.md
```
).
When models are needed, first ensure they are available:

bash

bash scripts/ensure_model.sh <model_id|asr-auto|tts-default>

Execute the corresponding
```
vox
```
command.
Mandatory self-check before delivery:

bash

bash scripts/self_check.sh [--require-model <...>] [--require-file <...>]

Record samples when failure occurs:

bash

bash scripts/log_failure.sh --stage "<stage>" --command "<cmd>" --error "<msg>" [--retry "<retry-cmd>"]

5 Hard Standards (Must Be Fully Met)

The platform must be
```
Darwin + arm64
```
.
```
vox doctor --json
```
must return
```
ok=true
```
.
The models required for this task must have
```
verified=true
```
.
The exit code of the main command must be
```
0
```
.
When file output is required, the target file must exist and the path must be reported.

Execution Rules

Prioritize using JSON output (
```
--json
```
or
```
--format ndjson
```
).
Declare the command to be run before execution; report key results after execution.
If the command fails, first provide the retry command, then call
```
log_failure.sh
```
to record the failure sample.
It is forbidden to skip guarding and directly perform heavy operations (model download, cloning, pipeline).

Common Command Templates

bash

# Environment check
scripts/vox_cmd.sh doctor --json

# Check model status
scripts/vox_cmd.sh model status --json

# Offline transcription
scripts/vox_cmd.sh asr transcribe --audio ./speech.wav --lang zh --model auto --json

# Voice cloning
scripts/vox_cmd.sh tts clone --profile narrator --text "你好" --out ./out.wav --model qwen-tts-1.7b --json

# Integrated workflow
scripts/vox_cmd.sh pipeline run --profile narrator --audio ./input.wav --clone-text "把这段内容读出来" --lang zh --json

Dependencies and Release Conventions

Prioritize using
```
uv
```
for installation.
System dependencies installed by default:
```
ffmpeg
```
,
```
portaudio
```
.
CLI sources:
- Prioritize
```
VOX_CLI_PACKAGE_SPEC
```
  (explicitly specified package).
- If
```
VOX_CLI_GIT_URL
```
  is set, install using
```
git+<url>
```
  .
- Default fallback to
```
git+https://github.com/catoncat/vox-cli.git
```
  .

vox

NPX Install

Tags

SKILL.md Content (Chinese)

Vox

Fixed Workflow

5 Hard Standards (Must Be Fully Met)

Execution Rules

Common Command Templates

Dependencies and Release Conventions