ai-voice-cloning

AI Voice Generation

Generate natural-sounding AI voices with the inference.sh CLI.

Quick Start

bash

curl -fsSL https://cli.inference.sh | sh && infsh login

# Generate speech
infsh app run infsh/kokoro-tts --input '{ "text": "Hello! This is an AI-generated voice that sounds natural and engaging.", "voice": "af_sarah" }'

Available Models

Model	App ID	Best For
Kokoro TTS	`infsh/kokoro-tts`	Natural, multiple voices
DIA	`infsh/dia-tts`	Conversational, expressive
Chatterbox	`infsh/chatterbox`	Casual, entertainment
Higgs	`infsh/higgs-tts`	Professional narration
VibeVoice	`infsh/vibevoice`	Emotional range

Kokoro Voice Library

American English

Voice ID	Gender	Style
`af_sarah`	Female	Warm, friendly
`af_nicole`	Female	Professional
`af_sky`	Female	Youthful
`am_michael`	Male	Authoritative
`am_adam`	Male	Conversational
`am_echo`	Male	Clear, neutral

British English

Voice ID	Gender	Style
`bf_emma`	Female	Refined
`bf_isabella`	Female	Warm
`bm_george`	Male	Classic
`bm_lewis`	Male	Modern
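
The voice IDs in the two tables appear to follow a pattern: the first letter marks the accent (`a` for American, `b` for British) and the second the gender (`f` or `m`). The sketch below decodes an ID on that assumption. The pattern is inferred from the tables rather than documented, and `describe_voice` is a hypothetical helper, not part of the CLI.

```shell
#!/usr/bin/env bash
# describe_voice ID: decode a Kokoro voice ID such as "bf_emma".
# Assumption (inferred from the tables): first letter = accent, second letter = gender.
describe_voice() {
  local id=$1 accent gender
  case ${id:0:1} in
    a) accent="American English" ;;
    b) accent="British English" ;;
    *) accent="unknown accent" ;;
  esac
  case ${id:1:1} in
    f) gender="female" ;;
    m) gender="male" ;;
    *) gender="unknown gender" ;;
  esac
  printf '%s: %s, %s\n' "$id" "$accent" "$gender"
}
```

For example, `describe_voice am_adam` reports an American English male voice.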

Voice Generation Examples

Professional Narration

bash

infsh app run infsh/kokoro-tts --input '{
  "text": "Welcome to our quarterly earnings call. Today we will discuss the financial performance and strategic initiatives for the past quarter.",
  "voice": "am_michael",
  "speed": 1.0
}'

Conversational Style

bash

infsh app run infsh/dia-tts --input '{
  "text": "Hey, so I was thinking about that project we discussed. What if we tried a different approach?",
  "voice": "conversational"
}'

Audiobook Narration

bash

infsh app run infsh/kokoro-tts --input '{
  "text": "Chapter One. The morning mist hung low over the valley as Sarah made her way down the winding path. She had been walking for hours.",
  "voice": "bf_emma",
  "speed": 0.9
}'

Video Voiceover

bash

infsh app run infsh/kokoro-tts --input '{
  "text": "Introducing the next generation of productivity. Work smarter, not harder.",
  "voice": "af_nicole",
  "speed": 1.1
}'

Podcast Host

bash

infsh app run infsh/kokoro-tts --input '{
  "text": "Welcome back to Tech Talk! I am your host, and today we are diving deep into the world of artificial intelligence.",
  "voice": "am_adam"
}'
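
The examples pass JSON inside single quotes, so text containing an apostrophe or a double quote would break either the shell quoting or the JSON. One way around this is to build the input string in a small helper. The sketch below is a minimal illustration: `json_escape` and `tts_input` are hypothetical helpers, and only the `text`, `voice`, and `speed` fields shown in this document are assumed.

```shell
#!/usr/bin/env bash
# json_escape TEXT: escape backslashes, double quotes, newlines, and tabs
# so TEXT can sit inside a JSON string. Other control characters are not handled.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}        # backslashes first, so later escapes are not doubled
  s=${s//\"/\\\"}        # double quotes
  s=${s//$'\n'/\\n}      # newlines
  s=${s//$'\t'/\\t}      # tabs
  printf '%s' "$s"
}

# tts_input TEXT VOICE [SPEED]: print a JSON object for --input.
# SPEED is inserted as-is and is assumed to be a plain number.
tts_input() {
  local text=$1 voice=$2 speed=${3:-}
  if [ -n "$speed" ]; then
    printf '{"text": "%s", "voice": "%s", "speed": %s}' \
      "$(json_escape "$text")" "$(json_escape "$voice")" "$speed"
  else
    printf '{"text": "%s", "voice": "%s"}' \
      "$(json_escape "$text")" "$(json_escape "$voice")"
  fi
}

# Usage (not run here):
#   infsh app run infsh/kokoro-tts --input "$(tts_input "I'm your host." am_adam)"
```

Because the text is passed in double quotes and escaped by the helper, apostrophes need no special treatment.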

Multi-Voice Conversation

bash

# Generate dialogue between two speakers

# Speaker 1
infsh app run infsh/kokoro-tts --input '{ "text": "Have you seen the latest AI developments? It is incredible how fast things are moving.", "voice": "am_michael" }' > speaker1.json

# Speaker 2
infsh app run infsh/kokoro-tts --input '{ "text": "I know, right? Just last week I tried that new image generator and was blown away.", "voice": "af_sarah" }' > speaker2.json

# Merge conversation
infsh app run infsh/media-merger --input '{ "audio_files": ["<speaker1-url>", "<speaker2-url>"], "crossfade_ms": 300 }'
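
The commands above save each result to a JSON file and then use `<speaker1-url>` placeholders in the merge step. The document does not show the output schema, so the sketch below simply takes the first http(s) URL found in each saved file. That is an assumption, and `first_url` and `merge_input` are hypothetical helpers. Inspect the real JSON before relying on it.

```shell
#!/usr/bin/env bash
# first_url FILE: print the first http(s) URL found in FILE.
# Assumption: the saved output contains the audio URL as a quoted string.
first_url() {
  grep -oE 'https?://[^"[:space:]]+' "$1" | head -n 1
}

# merge_input CROSSFADE_MS URL...: print the media-merger input JSON
# using the audio_files and crossfade_ms fields shown in this document.
merge_input() {
  local crossfade=$1 url list=""
  shift
  for url in "$@"; do
    list="${list:+$list, }\"$url\""
  done
  printf '{"audio_files": [%s], "crossfade_ms": %s}' "$list" "$crossfade"
}

# Usage (not run here):
#   infsh app run infsh/media-merger \
#     --input "$(merge_input 300 "$(first_url speaker1.json)" "$(first_url speaker2.json)")"
```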

Long-Form Content

Chunked Processing

For content over 5000 characters, split into chunks:

bash

# Process long text in chunks
TEXT="Your very long text here..."

# Split and generate
# Chunk 1
infsh app run infsh/kokoro-tts --input '{ "text": "<chunk-1>", "voice": "bf_emma" }' > chunk1.json

# Chunk 2
infsh app run infsh/kokoro-tts --input '{ "text": "<chunk-2>", "voice": "bf_emma" }' > chunk2.json

# Merge chunks
infsh app run infsh/media-merger --input '{ "audio_files": ["<chunk1-url>", "<chunk2-url>"], "crossfade_ms": 100 }'
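
The splitting step itself is left as a placeholder above. A minimal sketch of one way to do it follows: pack whole sentences into chunks that stay under a character limit. `split_text` is a hypothetical helper. It treats any word ending in `.`, `!`, or `?` as the end of a sentence, so abbreviations such as "Dr." will cause early breaks, and a single sentence longer than the limit is emitted unsplit.

```shell
#!/usr/bin/env bash
# split_text MAX: read text on stdin and print one chunk per line.
# Each chunk holds whole sentences and is at most MAX characters long,
# unless a single sentence alone exceeds MAX.
split_text() {
  local max=$1 word sentence="" chunk="" s
  local -a words sentences=()
  read -r -d '' -a words || true      # all whitespace-separated words
  for word in "${words[@]}"; do
    sentence=${sentence:+$sentence }$word
    case $word in
      *[.!?]) sentences+=("$sentence"); sentence="" ;;
    esac
  done
  # Keep trailing text that has no closing punctuation.
  if [ -n "$sentence" ]; then sentences+=("$sentence"); fi
  for s in "${sentences[@]}"; do
    if [ -z "$chunk" ]; then
      chunk=$s
    elif (( ${#chunk} + 1 + ${#s} <= max )); then
      chunk="$chunk $s"
    else
      printf '%s\n' "$chunk"
      chunk=$s
    fi
  done
  if [ -n "$chunk" ]; then printf '%s\n' "$chunk"; fi
}

# Usage (not run here): printf '%s' "$TEXT" | split_text 5000
```

Breaking at sentence boundaries rather than at a fixed offset keeps each chunk's intonation intact, which should make the merged audio less likely to have audible seams.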

Voice + Video Workflow

Add Voiceover to Video

bash

# 1. Generate voiceover
infsh app run infsh/kokoro-tts --input '{ "text": "This stunning footage shows the beauty of nature in its purest form.", "voice": "am_michael" }' > voiceover.json

# 2. Merge with video
infsh app run infsh/media-merger --input '{ "video_url": "https://your-video.mp4", "audio_url": "<voiceover-url>" }'

Create Talking Head

bash

# 1. Generate speech
infsh app run infsh/kokoro-tts --input '{ "text": "Hi, I am excited to share some updates with you today.", "voice": "af_sarah" }' > speech.json

# 2. Animate with avatar
infsh app run bytedance/omnihuman-1-5 --input '{ "image_url": "https://portrait.jpg", "audio_url": "<speech-url>" }'

Speed and Pacing

Speed	Effect	Use For
0.8	Slow, deliberate	Audiobooks, meditation
0.9	Slightly slow	Education, tutorials
1.0	Normal	General purpose
1.1	Slightly fast	Commercials, energy
1.2	Fast	Quick announcements

bash

# Slow narration
infsh app run infsh/kokoro-tts --input '{ "text": "Take a deep breath. Let yourself relax.", "voice": "bf_emma", "speed": 0.8 }'
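
When generating many clips, the speed table can be encoded as a lookup so each content type gets a consistent value. The sketch below is one way to do that. `speed_for` and its content-type labels are hypothetical names chosen for illustration, and only the speed values come from the table.

```shell
#!/usr/bin/env bash
# speed_for TYPE: print the speed suggested by the table for a content type.
# The TYPE labels are illustrative; unknown types fall back to 1.0 (normal).
speed_for() {
  case $1 in
    audiobook|meditation) echo 0.8 ;;
    education|tutorial)   echo 0.9 ;;
    commercial)           echo 1.1 ;;
    announcement)         echo 1.2 ;;
    *)                    echo 1.0 ;;
  esac
}

# Usage (not run here):
#   infsh app run infsh/kokoro-tts \
#     --input "{ \"text\": \"Take a deep breath.\", \"voice\": \"bf_emma\", \"speed\": $(speed_for meditation) }"
```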

Punctuation for Pacing

Use punctuation to control speech rhythm:

Punctuation	Effect
Period `.`	Full pause
Comma `,`	Brief pause
`...`	Extended pause
`!`	Emphasis
`?`	Question intonation
`-`	Quick break

bash

infsh app run infsh/kokoro-tts --input '{
  "text": "Wait... Did you hear that? Something is coming. Something big!",
  "voice": "am_adam"
}'

Best Practices

Match voice to content - Professional voice for business, casual for social
Use punctuation - Control pacing with periods and commas
Keep sentences short - Short sentences are easier to generate and sound more natural
Test different voices - Same text sounds different across voices
Adjust speed - Slightly slower often sounds more natural
Break long content - Process in chunks for consistency
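
The advice to keep sentences short can be checked before generating audio. The sketch below prints every sentence that exceeds a word limit so it can be rewritten first. `long_sentences` is a hypothetical helper, the limit is whatever the caller chooses (the document gives no number), and a sentence is assumed to end at any word that finishes with `.`, `!`, or `?`.

```shell
#!/usr/bin/env bash
# long_sentences LIMIT: read text on stdin and print each sentence
# that contains more than LIMIT words, one per line.
long_sentences() {
  local limit=$1 word sentence="" count=0
  local -a words
  read -r -d '' -a words || true      # all whitespace-separated words
  for word in "${words[@]}"; do
    sentence=${sentence:+$sentence }$word
    count=$((count + 1))
    case $word in
      *[.!?])
        if (( count > limit )); then printf '%s\n' "$sentence"; fi
        sentence=""
        count=0
        ;;
    esac
  done
  # Also check trailing text that has no closing punctuation.
  if [ -n "$sentence" ] && (( count > limit )); then
    printf '%s\n' "$sentence"
  fi
}

# Usage (not run here): long_sentences 20 < script.txt
```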

Use Cases

Voiceovers - Video narration, commercials
Audiobooks - Full book narration
Podcasts - AI hosts and guests
E-learning - Course narration
Accessibility - Screen reader content
IVR - Phone system messages
Content localization - Translate and voice

Related Skills

All TTS models

npx skills add inference-sh/skills@text-to-speech

Podcast creation

npx skills add inference-sh/skills@ai-podcast-creation

AI avatars

npx skills add inference-sh/skills@ai-avatar-video

Video generation

npx skills add inference-sh/skills@ai-video-generation

Full platform skill

npx skills add inference-sh/skills@inference-sh


Browse audio apps: `infsh app list --category audio`
