VoxFlow Skill

Use this skill whenever users ask for text-to-speech, voice synthesis, podcast creation, narrated stories, or voice/audio generation tasks.

VoxFlow provides 200+ voices in 40+ languages.

IMPORTANT: How to execute (read this first)

Follow this decision tree EVERY TIME:

Step 1: Try MCP tools

If VoxFlow MCP tools are available (`voice_list`, `tts_synthesize`, etc.), use them directly:

- `voice_list`: search voices
- `tts_synthesize`: text to speech (returns base64; save it to a file, then `open` it)
- `asr_transcribe`: transcribe audio
- `podcast_script`: generate podcast dialogue
- `quota_check`: check remaining quota

Example: the user says "给我讲个故事" ("Tell me a story") → write a story → call `tts_synthesize` → save the mp3 → `open` it.

If MCP tools work, stop here. Do NOT install or use CLI.
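
The "save to file" step can be sketched in shell. This is only a minimal illustration, assuming the base64 string returned by `tts_synthesize` has already been placed in a shell variable; the function name `save_audio` and the demo path are hypothetical and not part of VoxFlow.

```shell
# save_audio: decode a base64 string into a file.
# The function name and its arguments are illustrative, not part of VoxFlow.
save_audio() {
  # $1 = base64 payload, $2 = output path
  # --decode is accepted by both GNU coreutils and macOS base64.
  printf '%s' "$1" | base64 --decode > "$2"
}

# Stand-in payload ("hello" in base64) instead of real audio data:
save_audio "aGVsbG8=" /tmp/voxflow_demo.bin
```

With a real response, the output path would end in `.mp3` and the file would then be opened with `open`.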

Step 2: MCP not available → install CLI automatically

If MCP tools are not available, install the CLI yourself (do NOT ask the user to do it):

bash

npm install -g voxflow

Then log in:

bash

voxflow login

This opens a browser. After the user logs in, the token is cached and all subsequent commands work.

Step 3: CLI not installable → guide user to add MCP

If `npm install -g` fails (permissions, no Node.js, etc.), tell the user:

Run this in your terminal, then restart the session:
claude mcp add voxflow https://api.voxflow.studio/api/mcp
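
For the case where MCP tools are not available, Steps 2 and 3 can be roughly sketched as a shell check. The function name `choose_voxflow_path` is hypothetical, and the sketch simplifies the rule above: it only tests whether `voxflow` and `npm` exist on the PATH, so an install that fails for permission reasons would still need the Step 3 fallback.

```shell
# choose_voxflow_path: print which branch applies when MCP tools are missing.
# The function name and the branch labels are illustrative only.
choose_voxflow_path() {
  if command -v voxflow >/dev/null 2>&1; then
    echo "cli"        # CLI already installed: use it directly
  elif command -v npm >/dev/null 2>&1; then
    echo "install"    # Step 2: install the CLI, then log in
  else
    echo "mcp"        # Step 3: guide the user to add the MCP server
  fi
}

case "$(choose_voxflow_path)" in
  cli)     echo "voxflow CLI found" ;;
  install) echo "run: npm install -g voxflow, then voxflow login" ;;
  mcp)     echo "ask the user to run: claude mcp add voxflow https://api.voxflow.studio/api/mcp" ;;
esac
```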

Using CLI (when MCP is not available)

If MCP is not available but CLI is installed:

Quick Reference

| Command | What it does | Example |
| --- | --- | --- |
| `say` | Text → speech audio | `voxflow say "Hello" -o hello.mp3` |
| `narrate` | File/text → multi-segment TTS | `voxflow narrate script.txt -o out.wav` |
| `podcast` | Topic → AI podcast episode | `voxflow podcast "AI trends" --duration 3` |
| `story` | Topic → AI narrated story | `voxflow story "太空冒险" -o story.wav` |
| `voices` | Search voice library | `voxflow voices --lang zh --gender female` |
| `asr` | Audio → text transcript | `voxflow asr meeting.mp3` |
| `status` | Check login & quota | `voxflow status` |

Authentication

bash

# Log in (opens a browser for Google or email OTP)
voxflow login

# Check login status and remaining quota
voxflow status

# Log out
voxflow logout

- Login is required before any command that calls the API.
- Token is cached at `~/.config/voxflow/token.json`.
- For CI environments, set `VOXFLOW_TOKEN` env var.
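
A quick way to see whether a login is needed is to check both credential sources named above. This is a minimal sketch; the function name `has_voxflow_credentials` is hypothetical, and it only tests that a token exists, not that it is still valid. It never prints the token itself.

```shell
# has_voxflow_credentials: succeed if a token is available without a new login.
# Checks the VOXFLOW_TOKEN variable first, then the cached token file.
has_voxflow_credentials() {
  [ -n "${VOXFLOW_TOKEN:-}" ] || [ -f "$HOME/.config/voxflow/token.json" ]
}

if has_voxflow_credentials; then
  echo "credentials found"
else
  echo "no credentials: run voxflow login, or set VOXFLOW_TOKEN in CI"
fi
```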

Commands

Text-to-Speech (`say`)

The core command. Convert text to speech audio.

bash

# Basic usage
voxflow say "你好世界" -o hello.mp3

# With a specific voice
voxflow say "Hello world" --voice v-female-R2s4N9qJ -o greeting.mp3

# Slow narration speed
voxflow say "慢速朗读" --speed 0.8 -o slow.mp3

# WAV format
voxflow say "高质量音频" --format wav -o output.wav

**Flags:** `--voice <id>` · `--speed <0.5-2.0>` · `--format <mp3|wav>` · `-o <path>`

Narrate (`narrate`)

Read a file or long text, split into segments, synthesize each with TTS.

bash

# From a file
voxflow narrate article.txt -o narration.wav

# From stdin (pipe any text)
cat readme.md | voxflow narrate -o readme_audio.wav

# With a specific voice
voxflow narrate script.txt --voice v-male-Bk7vD3xP -o output.wav


Best for: long documents, articles, README files, email newsletters.

Podcast (`podcast`)

Generate a multi-speaker podcast episode with an AI-written script.

bash

# From a topic
voxflow podcast "程序员如何用AI提升效率" --duration 3

# With background music
voxflow podcast "Tech trends" --bgm lofi --duration 5

# From an existing script
voxflow podcast --script dialogue.txt

# Control the language
voxflow podcast "量子计算" --language zh-CN


**Flags:** `--duration <min>` · `--bgm <lofi|jazz|ambient>` · `--script <file>` · `--language <zh-CN|en-US|ja-JP>`

Story (`story`)

AI writes a short story and narrates it with TTS.

bash

voxflow story "一只会飞的小猫" -o story.wav
voxflow story "space adventure" --lang en -o adventure.wav

Best for: bedtime stories, creative writing demos, content samples.

Voice Search (`voices`)

Browse the voice library. No login required.

bash

# Chinese female voices
voxflow voices --lang zh --gender female

# English voices
voxflow voices --lang en

# Search by keyword
voxflow voices --search "narrator"

# All voices
voxflow voices --all


Always call `voices` first when the user wants a specific voice style.

Speech Recognition (`asr`)

Transcribe audio to text. Supports Chinese, English, Japanese, Korean.

bash

voxflow asr recording.mp3
voxflow asr meeting.wav --lang en

Note: Requires a publicly accessible audio URL or a local file upload.

Voice Selection Guide

bash

# Step 1: Search for matching voices
voxflow voices --lang zh --gender female

# Step 2: Use the voice ID in synthesis
voxflow say "测试" --voice v-female-R2s4N9qJ -o test.mp3


Popular voices:
- `v-female-R2s4N9qJ` — 温柔姐姐 (Gentle Female, Chinese)
- `v-male-Bk7vD3xP` — 威严霸总 (Authoritative Male, Chinese)
- `v-female-m1KpW7zE` — 傲娇学姐 (Sassy Female, Chinese)

Common Scenarios

"把这段话念出来" ("Read this passage aloud")

bash

voxflow say "用户输入的文字" -o output.mp3 && open output.mp3

"用温柔女声读这个文件" ("Read this file in a gentle female voice")

bash

voxflow voices --lang zh --gender female   # find a voice first
voxflow narrate file.txt --voice v-female-R2s4N9qJ -o narration.mp3

"生成一个关于 XX 的播客" ("Generate a podcast about XX")

bash

voxflow status                              # check quota (a podcast costs about 5,000)
voxflow podcast "话题" --duration 3 --bgm lofi

"讲个睡前故事" ("Tell me a bedtime story")

bash

voxflow story "小狐狸的星星种子" -o bedtime.mp3 && open bedtime.mp3

"转录这段录音" ("Transcribe this recording")

bash

voxflow asr recording.mp3

Creative Workflows

These workflows combine VoxFlow TTS with the AI agent's own abilities (writing, coding, web fetching). The agent writes content, then calls `voxflow say` or `voxflow narrate` to synthesize each part.

Audio Storybook

AI writes a children's story, generates SVG illustrations, synthesizes narration per page, and bundles everything into a single offline HTML file.

Steps:

1. Write a 6-page children's story
2. For each page: generate an inline SVG illustration (400×300)
3. Synthesize narration: `voxflow say "page text" --voice v-female-R2s4N9qJ --speed 0.85 -o /tmp/page_N.mp3`
4. Read the mp3 files, base64 encode, embed inline in HTML
5. Output a single self-contained HTML file with audio play buttons per page
6. `open /tmp/storybook.html`
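
The base64 embedding step can be sketched as a small shell helper that turns one audio file into an inline `<audio>` element using a data URI. The function name `audio_tag`, the markup, and the demo file are assumptions for illustration; a real storybook page would wrap this in its own template.

```shell
# audio_tag: print an HTML <audio> element with the file embedded as base64.
# Function name and markup are illustrative; adapt them to the page template.
audio_tag() {
  # tr strips the line wrapping that GNU base64 adds by default.
  data="$(base64 < "$1" | tr -d '\n')"
  printf '<audio controls src="data:audio/mpeg;base64,%s"></audio>\n' "$data"
}

# Stand-in file so the sketch runs without real audio:
printf 'hello' > /tmp/page_demo.mp3
audio_tag /tmp/page_demo.mp3
```

Because the audio lives inside the HTML, the resulting file plays offline with no extra assets, at the cost of roughly a one-third size increase from base64 encoding.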

Audio Presentation

AI creates an HTML slide deck with TTS narration on each slide.

Steps:

1. Generate N slides (title + bullet points + narration script)
2. For each slide: `voxflow say "narration script" -o /tmp/slide_N.mp3`
3. Build an HTML file with slide navigation (prev/next) and audio buttons
4. Embed audio inline as base64
5. `open /tmp/presentation.html`

Best for: product introductions, technical tutorials, course materials.

Article → Audio Briefing

Read a URL or document, summarize, synthesize as audio.

Steps:

1. Fetch/read the content
2. Summarize into 3-5 key points
3. `voxflow say "summary text" --voice v-male-Bk7vD3xP -o /tmp/briefing.mp3`
4. `open /tmp/briefing.mp3`

Document Narration

Read a README, code comments, or any text file aloud.

bash

voxflow narrate README.md --voice v-female-R2s4N9qJ --speed 0.9 -o /tmp/readme.mp3
open /tmp/readme.mp3

Multi-language Voice

AI translates text, then synthesizes in each language with matching voices.

Steps:

1. Translate user text to target languages (AI does this natively)
2. `voxflow voices --lang en --gender female` → pick an English voice
3. `voxflow voices --lang ja --gender female` → pick a Japanese voice
4. `voxflow say "English text" --voice <en_id> -o /tmp/en.mp3`
5. `voxflow say "日本語テキスト" --voice <ja_id> -o /tmp/ja.mp3`

Git Daily Report Audio

Steps:

1. Read `git log --oneline --since="1 day ago"`
2. Summarize changes into a brief report
3. `voxflow say "today's summary..." -o /tmp/daily_report.mp3`
4. `open /tmp/daily_report.mp3`
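
In the workflow above the agent writes the summary itself. As a rough fallback sketch, the one-line log can also be turned into a report sentence mechanically. The function name `build_report` and the sample commits below are made up for illustration.

```shell
# build_report: turn `git log --oneline` output (read from stdin) into one line.
build_report() {
  awk '{
         $1 = ""                      # drop the abbreviated commit hash
         sub(/^ /, "")                # trim the space left behind
         sep = (NR > 1) ? "; " : ""
         subjects = subjects sep $0
       }
       END { printf "Commits in the last day (%d): %s\n", NR, subjects }'
}

# Demo with made-up log lines; in a repository the input would come from:
#   git log --oneline --since="1 day ago" | build_report
printf 'abc123 Fix login bug\ndef456 Add WAV export\n' | build_report
```

The resulting line is the text that would be passed to `voxflow say` in step 3.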

Code PR Explanation

Steps:

1. Read the PR diff
2. Write a plain-language explanation
3. `voxflow say "explanation..." -o /tmp/pr_review.mp3`
4. `open /tmp/pr_review.mp3`

Mock Interview

Steps:

1. Generate 3 interview questions on the topic
2. For each: `voxflow say "question N..." --voice v-male-Bk7vD3xP -o /tmp/q_N.mp3`
3. Play questions in sequence: `open /tmp/q_1.mp3`

Quota

- Free: 10,000 quota/month
- 1 TTS call ≈ 100 quota
- 1 podcast ≈ 5,000 quota
- Check: `voxflow status`
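
To make the arithmetic concrete, the approximate figures above can be wrapped in a small estimator. This is a sketch only: the function name `estimate_quota` is hypothetical, and the real cost of a call may differ from these rounded values.

```shell
# estimate_quota: approximate quota cost of a batch of work.
# Uses the rough figures listed above: 100 per TTS call, 5000 per podcast.
estimate_quota() {
  # $1 = number of TTS calls, $2 = number of podcasts
  echo $(( $1 * 100 + $2 * 5000 ))
}

# A 6-page storybook (6 TTS calls) plus one podcast:
estimate_quota 6 1   # prints 5600
```

Under these figures, that batch would leave about 4,400 of the free 10,000 monthly quota.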

Prerequisites

- Node.js 20+ required
- ffmpeg optional (only for video-related commands)
- Install CLI: `npm install -g voxflow`

Rules

- Always try MCP first: if MCP tools are available, use them instead of the CLI.
- Always search voices before synthesizing; never guess voice IDs.
- Check quota before expensive operations (podcast ≈ 5,000 quota).
- After synthesis, auto-play the file: `open output.mp3` (macOS).
- Never print tokens or secrets.
- If the CLI fails with "not logged in", suggest MCP as an alternative: `claude mcp add voxflow https://api.voxflow.studio/api/mcp`
- If a command fails, check `--help` and correct the flags before retrying.
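
The auto-play rule names `open`, which is macOS only. A hedged sketch of choosing a player per platform follows; `xdg-open` as the Linux counterpart is an assumption added here and is not mentioned in the original text, and `pick_opener` is a hypothetical helper name.

```shell
# pick_opener: print a command that opens a file with the default application.
# `open` is the macOS command used in this document; `xdg-open` is the usual
# Linux counterpart and is an addition, not something the original prescribes.
pick_opener() {
  case "$(uname -s)" in
    Darwin) echo "open" ;;
    *)      echo "xdg-open" ;;
  esac
}

opener="$(pick_opener)"
echo "would run: $opener output.mp3"
```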

MCP Setup (if not already configured)

If MCP tools are not available and you want the easiest setup:

bash

claude mcp add voxflow https://api.voxflow.studio/api/mcp

After adding the server and restarting the agent session, the MCP tools work right away: OAuth handles login automatically, and no CLI install is needed.
