Text-to-Speech

Convert text to natural speech via inference.sh CLI.

Quick Start

Install CLI

curl -fsSL https://cli.inference.sh | sh && infsh login

Generate speech

infsh app run infsh/kokoro-tts --input '{"text": "Hello, welcome to our product demo."}'

Available Models

| Model | App ID | Best For |
|-------|--------|----------|
| DIA TTS | `infsh/dia-tts` | Conversational, expressive |
| Kokoro TTS | `infsh/kokoro-tts` | Fast, natural |
| Chatterbox | `infsh/chatterbox` | General purpose |
| Higgs Audio | `infsh/higgs-audio` | Emotional control |
| VibeVoice | `infsh/vibevoice` | Podcasts, long-form |

Browse All Audio Apps

infsh app list --category audio

Examples

Basic Text-to-Speech

infsh app run infsh/kokoro-tts --input '{"text": "Welcome to our tutorial."}'

Conversational TTS with DIA

infsh app sample infsh/dia-tts --save input.json

Edit input.json:

{
  "text": "Hey! How are you doing today? I'm really excited to share this with you.",
  "voice": "conversational"
}

infsh app run infsh/dia-tts --input input.json
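The DIA example text contains an apostrophe ("I'm"), which would end a single-quoted inline `--input '...'` argument early, so writing the JSON to a file is the safer route. The following is a minimal sketch of generating input.json from a script; `json_escape` and `write_dia_input` are hypothetical helper names, and the `text` and `voice` fields are taken from the example above rather than from a confirmed schema.

```shell
#!/bin/sh
# Escape backslashes and double quotes so a value is safe inside a JSON string.
# Assumption: the value has no newlines or other control characters;
# those would need extra handling.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Write a two-field input file: $1 = text, $2 = voice, $3 = output path.
write_dia_input() {
  printf '{\n  "text": "%s",\n  "voice": "%s"\n}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")" > "$3"
}

write_dia_input "Hey! How are you doing today? I'm really excited to share this with you." \
  "conversational" input.json

# Next step, as in the example above:
#   infsh app run infsh/dia-tts --input input.json
```

Running `infsh app sample infsh/dia-tts --save input.json` first is still the most reliable way to see which fields the app expects.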

Long-form Audio (Podcasts)

infsh app sample infsh/vibevoice --save input.json

Edit input.json with your podcast script

infsh app run infsh/vibevoice --input input.json

Expressive Speech with Higgs

infsh app sample infsh/higgs-audio --save input.json

Edit input.json:

{
  "text": "This is absolutely incredible!",
  "emotion": "excited"
}

infsh app run infsh/higgs-audio --input input.json
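To hear how different models render the same line, the run command can be looped over several app IDs. The sketch below is illustrative only: it assumes each listed app accepts an input file containing just a `text` field (the examples above show `text` for all three, but DIA and Higgs also take extra fields), and `output_name`, `compare_models`, and the `DRY_RUN` variable are hypothetical names. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Apps whose examples in this document use a "text" field.
APPS="infsh/kokoro-tts infsh/dia-tts infsh/higgs-audio"

# Turn an app ID such as infsh/kokoro-tts into a file name: kokoro-tts.json
output_name() {
  printf '%s.json\n' "${1##*/}"
}

# Run (or, when DRY_RUN is 1, just print) one command per app.
# $1 = path to the shared input file.
compare_models() {
  for app in $APPS; do
    out="$(output_name "$app")"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "infsh app run $app --input $1 > $out"
    else
      infsh app run "$app" --input "$1" > "$out"
    fi
  done
}

compare_models input.json
```

Setting `DRY_RUN=0` executes the commands and saves each app's output to its own file for comparison.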

Use Cases

Voiceovers: Product demos, explainer videos
Audiobooks: Convert text to spoken word
Podcasts: Generate podcast episodes
Accessibility: Make content accessible
IVR: Phone system voice prompts
Video Narration: Add narration to videos

Combine with Video

Generate speech, then create a talking head video:

1. Generate speech

infsh app run infsh/kokoro-tts --input '{"text": "Your script here"}' > speech.json

2. Use the audio URL with OmniHuman for avatar video

infsh app run bytedance/omnihuman-1-5 --input '{ "image_url": "https://portrait.jpg", "audio_url": "<audio-url-from-step-1>" }'
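Step 2 needs the audio URL from speech.json, but the layout of that file is not documented here. One schema-agnostic way to find it is to scan the file for the first URL that looks like an audio file. This is a heuristic sketch, not the CLI's own mechanism: `first_audio_url` and `avatar_input` are hypothetical helpers, and the list of file extensions is an assumption.

```shell
#!/bin/sh
# Print the first URL in file $1 whose path ends in a common audio extension.
# Query strings after the extension (e.g. signed URLs) are kept intact.
first_audio_url() {
  grep -oE 'https?://[^"[:space:]]+' "$1" \
    | grep -E '\.(wav|mp3|flac|ogg|m4a)([?#]|$)' \
    | head -n 1
}

# Build the JSON input for step 2: $1 = portrait image URL, $2 = audio URL.
avatar_input() {
  printf '{ "image_url": "%s", "audio_url": "%s" }\n' "$1" "$2"
}

# Usage once step 1 has written speech.json:
if [ -f speech.json ]; then
  audio_url="$(first_audio_url speech.json)"
  if [ -n "$audio_url" ]; then
    avatar_input "https://portrait.jpg" "$audio_url" > avatar_input.json
    echo "Wrote avatar_input.json; run: infsh app run bytedance/omnihuman-1-5 --input avatar_input.json"
  else
    echo "No audio URL found in speech.json; inspect the file manually." >&2
  fi
fi
```

If the scan finds nothing, open speech.json and copy the URL by hand, since the output may use an extension or structure this pattern does not cover.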

Related Skills

Full platform skill (all 150+ apps)

npx skills add inference-sh/skills@inference-sh

AI avatars (combine TTS with talking heads)

npx skills add inference-sh/skills@ai-avatar-video

AI music generation

npx skills add inference-sh/skills@ai-music-generation

Speech-to-text (transcription)

npx skills add inference-sh/skills@speech-to-text

Video generation

npx skills add inference-sh/skills@ai-video-generation

Browse all apps: `infsh app list`

Documentation

Running Apps - How to run apps via CLI
Audio Transcription Example - Audio processing workflows
Apps Overview - Understanding the app ecosystem
