VoxFlow Skill

Use this skill whenever users ask for text-to-speech, voice synthesis, podcast creation, narrated stories, or voice/audio generation tasks.
VoxFlow provides 200+ voices in 40+ languages.

IMPORTANT: How to execute (read this first)

Follow this decision tree EVERY TIME:

Step 1: Try MCP tools

If VoxFlow MCP tools are available (`voice_list`, `tts_synthesize`, etc.), use them directly:
  • `voice_list`: search voices
  • `tts_synthesize`: text to speech (returns base64; save it to a file, then `open` it)
  • `asr_transcribe`: transcribe audio
  • `podcast_script`: generate podcast dialogue
  • `quota_check`: check remaining quota
Example: user says "给我讲个故事" ("tell me a story") → write a story → call `tts_synthesize` → save mp3 → `open` it.
If MCP tools work, stop here. Do NOT install or use CLI.
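
The "save to file" part of the `tts_synthesize` flow can be sketched in shell as follows. This is a minimal sketch, assuming the base64 payload has already been pulled out of the tool response into a string; the helper name `save_audio_b64` and the `$payload` variable are illustrative and not part of VoxFlow.

```shell
#!/bin/sh
# save_audio_b64 PATH: read base64 text on stdin and write the decoded bytes to PATH.
# Hypothetical helper; the layout of the MCP response itself is not specified here.
save_audio_b64() {
    out_path=$1
    # --decode is accepted by both GNU coreutils and macOS base64.
    base64 --decode > "$out_path"
}

# Usage ($payload stands in for the base64 string returned by the tool):
#   printf '%s' "$payload" | save_audio_b64 /tmp/story.mp3 && open /tmp/story.mp3
```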

Step 2: MCP not available → install CLI automatically

If MCP tools are not available, install the CLI yourself (do NOT ask the user to do it):
bash
npm install -g voxflow
Then login:
bash
voxflow login
This opens a browser. After the user logs in, the token is cached and all subsequent commands work.

Step 3: CLI not installable → guide user to add MCP

If `npm install -g` fails (permissions, no Node.js, etc.), tell the user:
Run this in your terminal, then restart the session:
claude mcp add voxflow https://api.voxflow.studio/api/mcp
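
The three steps above amount to a small decision function. The sketch below is one way to express it, under two assumptions: MCP availability is known only to the agent, so it is passed in as an argument, and the presence of `npm` is used as a rough proxy for "the CLI can be installed" (a real install can still fail on permissions). The function names are illustrative.

```shell
#!/bin/sh
# choose_path MCP_AVAILABLE CLI_INSTALLED NPM_AVAILABLE   (each "yes" or "no")
# Prints which branch of the decision tree applies.
choose_path() {
    if [ "$1" = "yes" ]; then
        echo "use-mcp"                 # Step 1: MCP tools work, stop here
    elif [ "$2" = "yes" ]; then
        echo "use-cli"                 # CLI already installed
    elif [ "$3" = "yes" ]; then
        echo "install-cli"             # Step 2: npm install -g voxflow, then voxflow login
    else
        echo "ask-user-to-add-mcp"     # Step 3: guide the user to `claude mcp add`
    fi
}

# have CMD: print "yes" if CMD is on PATH, otherwise "no".
have() { command -v "$1" >/dev/null 2>&1 && echo yes || echo no; }

# Usage: choose_path "$mcp_available" "$(have voxflow)" "$(have npm)"
```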

Using CLI (when MCP is not available)

If MCP is not available but CLI is installed:

Quick Reference

| Command | What it does | Example |
| --- | --- | --- |
| `say` | Text → speech audio | `voxflow say "Hello" -o hello.mp3` |
| `narrate` | File/text → multi-segment TTS | `voxflow narrate script.txt -o out.wav` |
| `podcast` | Topic → AI podcast episode | `voxflow podcast "AI trends" --duration 3` |
| `story` | Topic → AI narrated story | `voxflow story "太空冒险" -o story.wav` |
| `voices` | Search voice library | `voxflow voices --lang zh --gender female` |
| `asr` | Audio → text transcript | `voxflow asr meeting.mp3` |
| `status` | Check login & quota | `voxflow status` |

Authentication

bash
# Login (opens browser for Google/email OTP)
voxflow login

# Check login status and remaining quota
voxflow status

# Logout
voxflow logout

- Login is required before any command that calls the API.
- Token is cached at `~/.config/voxflow/token.json`.
- For CI environments, set `VOXFLOW_TOKEN` env var.
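
A script can check for credentials before calling the API. The sketch below assumes that either a non-empty `VOXFLOW_TOKEN` or an existing token cache file is enough to count as "logged in"; it does not validate the token itself, and `has_credentials` is an illustrative name, not a VoxFlow command.

```shell
#!/bin/sh
# has_credentials [TOKEN_FILE]: succeed if VOXFLOW_TOKEN is set or the token cache exists.
# TOKEN_FILE defaults to the cache path documented above.
has_credentials() {
    token_file=${1:-"$HOME/.config/voxflow/token.json"}
    [ -n "${VOXFLOW_TOKEN:-}" ] || [ -f "$token_file" ]
}

# Usage: has_credentials || voxflow login
```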

Commands

Text-to-Speech (`say`)

The core command. Convert text to speech audio.
bash
# Basic usage
voxflow say "你好世界" -o hello.mp3

# With specific voice
voxflow say "Hello world" --voice v-female-R2s4N9qJ -o greeting.mp3

# Slow narration speed
voxflow say "慢速朗读" --speed 0.8 -o slow.mp3

# WAV format
voxflow say "高质量音频" --format wav -o output.wav

**Flags:** `--voice <id>` · `--speed <0.5-2.0>` · `--format <mp3|wav>` · `-o <path>`
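
Since `--speed` is documented with a `0.5-2.0` range, a wrapper script may want to reject out-of-range values before spending quota. A minimal sketch, assuming the range is inclusive at both ends (the source does not say); `valid_speed` is an illustrative helper, not part of the CLI.

```shell
#!/bin/sh
# valid_speed VALUE: succeed when VALUE is a plain decimal number in [0.5, 2.0].
valid_speed() {
    case $1 in
        ''|*[!0-9.]*|*.*.*) return 1 ;;   # reject empty, non-numeric, or multi-dot input
    esac
    # awk does the floating-point comparison; exit status 0 means "in range".
    awk -v s="$1" 'BEGIN { exit !(s >= 0.5 && s <= 2.0) }'
}

# Usage: valid_speed "$speed" && voxflow say "$text" --speed "$speed" -o out.mp3
```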

Narrate (`narrate`)

Read a file or long text, split it into segments, and synthesize each with TTS.
bash
# From file
voxflow narrate article.txt -o narration.wav

# From stdin (pipe any text)
cat readme.md | voxflow narrate -o readme_audio.wav

# With voice
voxflow narrate script.txt --voice v-male-Bk7vD3xP -o output.wav

Best for: long documents, articles, README files, email newsletters.

Podcast (`podcast`)

Generate a multi-speaker podcast episode with an AI-written script.
bash
# From topic
voxflow podcast "程序员如何用AI提升效率" --duration 3

# With background music
voxflow podcast "Tech trends" --bgm lofi --duration 5

# From existing script
voxflow podcast --script dialogue.txt

# Control language
voxflow podcast "量子计算" --language zh-CN

**Flags:** `--duration <min>` · `--bgm <lofi|jazz|ambient>` · `--script <file>` · `--language <zh-CN|en-US|ja-JP>`

Story (`story`)

AI writes a short story and narrates it with TTS.
bash
voxflow story "一只会飞的小猫" -o story.wav
voxflow story "space adventure" --lang en -o adventure.wav
Best for: bedtime stories, creative writing demos, content samples.

Voice Search (`voices`)

Browse the voice library. No login required.
bash
# Chinese female voices
voxflow voices --lang zh --gender female

# English voices
voxflow voices --lang en

# Search by keyword
voxflow voices --search "narrator"

# All voices
voxflow voices --all

Always call `voices` first when the user wants a specific voice style.

Speech Recognition (`asr`)

Transcribe audio to text. Supports Chinese, English, Japanese, Korean.
bash
voxflow asr recording.mp3
voxflow asr meeting.wav --lang en
Note: Requires a publicly accessible audio URL or a local file upload.

Voice Selection Guide

bash
# Step 1: Search for matching voices
voxflow voices --lang zh --gender female

# Step 2: Use the voice ID in synthesis
voxflow say "测试" --voice v-female-R2s4N9qJ -o test.mp3

Popular voices:
- `v-female-R2s4N9qJ`: 温柔姐姐 (Gentle Female, Chinese)
- `v-male-Bk7vD3xP`: 威严霸总 (Authoritative Male, Chinese)
- `v-female-m1KpW7zE`: 傲娇学姐 (Sassy Female, Chinese)

Common Scenarios

"把这段话念出来" (Read this passage aloud)

bash
voxflow say "用户输入的文字" -o output.mp3 && open output.mp3

"用温柔女声读这个文件" (Read this file in a gentle female voice)

bash
voxflow voices --lang zh --gender female   # find a voice first
voxflow narrate file.txt --voice v-female-R2s4N9qJ -o narration.mp3

"生成一个关于 XX 的播客" (Generate a podcast about XX)

bash
voxflow status                              # check quota (a podcast costs about 5,000)
voxflow podcast "话题" --duration 3 --bgm lofi

"讲个睡前故事" (Tell a bedtime story)

bash
voxflow story "小狐狸的星星种子" -o bedtime.mp3 && open bedtime.mp3

"转录这段录音" (Transcribe this recording)

bash
voxflow asr recording.mp3

Creative Workflows

These workflows combine VoxFlow TTS with the AI agent's own abilities (writing, coding, web fetching). The agent writes content, then calls `voxflow say` or `voxflow narrate` to synthesize each part.

Audio Storybook (有声绘本)

AI writes a children's story, generates SVG illustrations, synthesizes narration per page, and bundles everything into a single offline HTML file.
Steps:
  1. Write a 6-page children's story
  2. For each page: generate an inline SVG illustration (400×300)
  3. Synthesize narration:
    voxflow say "page text" --voice v-female-R2s4N9qJ --speed 0.85 -o /tmp/page_N.mp3
  4. Read the mp3 files, base64 encode, embed inline in HTML
  5. Output a single self-contained HTML file with audio play buttons per page
  6. open /tmp/storybook.html
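
Step 3 above runs once per page, so it is naturally a loop. The sketch below shows one way to write it; the `VOXFLOW` shell variable is an assumption of this example (not a CLI feature) that lets the command be swapped for `echo` to preview the calls without spending quota.

```shell
#!/bin/sh
# synthesize_pages TEXT...: one `say` call per page text, writing
# /tmp/page_1.mp3, /tmp/page_2.mp3, ... and stopping at the first failure.
synthesize_pages() {
    n=1
    for text in "$@"; do
        "${VOXFLOW:-voxflow}" say "$text" \
            --voice v-female-R2s4N9qJ --speed 0.85 \
            -o "/tmp/page_$n.mp3" || return 1
        n=$((n + 1))
    done
}

# Dry run:  VOXFLOW=echo synthesize_pages "Once upon a time" "The end"
# Real run: synthesize_pages "Once upon a time" "The end"
```

At roughly 100 quota per TTS call, a 6-page story costs about 600 quota.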

Audio Presentation (有声演示文稿)

AI creates an HTML slide deck with TTS narration on each slide.
Steps:
  1. Generate N slides (title + bullet points + narration script)
  2. For each slide:
    voxflow say "narration script" -o /tmp/slide_N.mp3
  3. Build an HTML file with slide navigation (prev/next) and audio buttons
  4. Embed audio inline as base64
  5. open /tmp/presentation.html
Best for: product introductions, technical tutorials, course materials.
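
Embedding audio inline as base64 (step 4) usually means writing a `data:` URI into an `<audio>` element. A minimal sketch, assuming the files are MP3 (MIME type `audio/mpeg`); `audio_tag` is an illustrative helper name.

```shell
#!/bin/sh
# audio_tag FILE: print an <audio> element whose src is a base64 data URI of FILE.
audio_tag() {
    # tr removes the line wrapping that GNU base64 inserts every 76 characters.
    data=$(base64 < "$1" | tr -d '\n')
    printf '<audio controls src="data:audio/mpeg;base64,%s"></audio>\n' "$data"
}

# Usage: audio_tag /tmp/slide_1.mp3 >> /tmp/presentation.html
```

Base64 inflates the payload by about a third, so the single-file approach suits short clips better than long recordings.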

Article → Audio Briefing (文章有声摘要)

Read a URL or document, summarize, synthesize as audio.
Steps:
  1. Fetch/read the content
  2. Summarize into 3-5 key points
  3. voxflow say "summary text" --voice v-male-Bk7vD3xP -o /tmp/briefing.mp3
  4. open /tmp/briefing.mp3

Document Narration (文档朗读)

Read a README, code comments, or any text file aloud.
bash
voxflow narrate README.md --voice v-female-R2s4N9qJ --speed 0.9 -o /tmp/readme.mp3
open /tmp/readme.mp3

Multi-language Voice (多语言合成)

AI translates text, then synthesizes in each language with matching voices.
Steps:
  1. Translate user text to target languages (AI does this natively)
  2. voxflow voices --lang en --gender female
    → pick English voice
  3. voxflow voices --lang ja --gender female
    → pick Japanese voice
  4. voxflow say "English text" --voice <en_id> -o /tmp/en.mp3
  5. voxflow say "日本語テキスト" --voice <ja_id> -o /tmp/ja.mp3

Git Daily Report Audio (Git 日报音频)

Steps:
  1. Read
    git log --oneline --since="1 day ago"
  2. Summarize changes into a brief report
  3. voxflow say "today's summary..." -o /tmp/daily_report.mp3
  4. open /tmp/daily_report.mp3
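
In the workflow above the agent writes the summary itself. Where a purely mechanical fallback is wanted, the log can be turned into a speakable sentence directly; the sketch below is such a stand-in, not the agent's summarization, and `report_text` is an illustrative name.

```shell
#!/bin/sh
# report_text: read `git log --oneline` output on stdin and print a one-line
# spoken summary: the commit count followed by the commit subjects.
report_text() {
    awk '
        {
            count++
            sub(/^[^ ]+ /, "")            # drop the abbreviated commit hash
            subjects = subjects (count > 1 ? "; " : "") $0
        }
        END {
            if (count == 0) print "No commits in the last day."
            else printf "%d commit(s) in the last day: %s.\n", count, subjects
        }'
}

# Usage:
#   text=$(git log --oneline --since="1 day ago" | report_text)
#   voxflow say "$text" -o /tmp/daily_report.mp3
```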

Code PR Explanation (PR 语音讲解)

Steps:
  1. Read the PR diff
  2. Write a plain-language explanation
  3. voxflow say "explanation..." -o /tmp/pr_review.mp3
  4. open /tmp/pr_review.mp3

Mock Interview (模拟面试)

Steps:
  1. Generate 3 interview questions on the topic
  2. For each:
    voxflow say "question N..." --voice v-male-Bk7vD3xP -o /tmp/q_N.mp3
  3. Play questions in sequence:
    open /tmp/q_1.mp3

Quota

  • Free: 10,000 quota/month
  • 1 TTS call ≈ 100 quota
  • 1 podcast ≈ 5,000 quota
  • Check:
    voxflow status
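
The figures above make it easy to estimate a plan's cost before running it. A small sketch using only the approximate numbers listed here (actual consumption may differ); the function names are illustrative.

```shell
#!/bin/sh
# Approximate per-operation costs taken from the list above.
TTS_COST=100
PODCAST_COST=5000

# estimate_quota TTS_CALLS PODCASTS: print the approximate quota a plan consumes.
estimate_quota() {
    echo $(( $1 * TTS_COST + $2 * PODCAST_COST ))
}

# fits_budget REMAINING TTS_CALLS PODCASTS: succeed if the plan fits in REMAINING.
fits_budget() {
    [ "$(estimate_quota "$2" "$3")" -le "$1" ]
}

# Example: a 6-page storybook plus one podcast is 6*100 + 5000 = 5600 quota,
# which fits in the 10,000 free monthly quota; two podcasts plus one TTS call
# (10,100) does not.
```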

Prerequisites

  • Node.js 20+ required
  • ffmpeg optional (only for video-related commands)
  • Install CLI:
    npm install -g voxflow

Rules

  1. Always try MCP first: if MCP tools are available, use them instead of CLI.
  2. Always search voices before synthesizing; never guess voice IDs.
  3. Check quota before expensive operations (podcast ≈ 5000 quota).
  4. After synthesis, auto-play the file: `open output.mp3` (macOS).
  5. Never print tokens or secrets.
  6. If CLI fails with "not logged in", suggest MCP as alternative: `claude mcp add voxflow https://api.voxflow.studio/api/mcp`
  7. If a command fails, check `--help` and correct flags before retrying.
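
The auto-play rule names `open`, which is the macOS command. On other systems a different opener is needed; the sketch below assumes `xdg-open` on Linux desktops, which this document does not itself mention, and reports the saved path when no opener is known. Both function names are illustrative.

```shell
#!/bin/sh
# opener_for OS_NAME: print the command that opens a file with the default app.
opener_for() {
    case $1 in
        Darwin) echo open ;;        # macOS, as documented above
        Linux)  echo xdg-open ;;    # assumption: freedesktop-style Linux desktop
        *)      return 1 ;;
    esac
}

# play_file PATH: open PATH with the opener for the current OS, if one is known.
play_file() {
    opener=$(opener_for "$(uname -s)") || {
        echo "no known opener; file saved at $1" >&2
        return 1
    }
    "$opener" "$1"
}

# Usage: voxflow say "Hello" -o output.mp3 && play_file output.mp3
```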

MCP Setup (if not already configured)

If MCP tools are not available and you want the easiest setup:
bash
claude mcp add voxflow https://api.voxflow.studio/api/mcp
After adding, restart the agent session to load the MCP tools. Login then happens automatically through OAuth, and no CLI install is needed.