ai-avatar-video

AI Avatar & Talking Head Videos

Create AI avatars and talking head videos via inference.sh CLI.

Quick Start

bash
curl -fsSL https://cli.inference.sh | sh && infsh login

Create avatar video from image + audio

infsh app run bytedance/omnihuman-1-5 --input '{ "image_url": "https://portrait.jpg", "audio_url": "https://speech.mp3" }'

Available Models

| Model | App ID | Best For |
|-------|--------|----------|
| OmniHuman 1.5 | bytedance/omnihuman-1-5 | Multi-character, best quality |
| OmniHuman 1.0 | bytedance/omnihuman-1-0 | Single character |
| Fabric 1.0 | falai/fabric-1-0 | Image talks with lipsync |
| PixVerse Lipsync | falai/pixverse-lipsync | Highly realistic |

Search Avatar Apps

bash
infsh app list --search "omnihuman"
infsh app list --search "lipsync"
infsh app list --search "fabric"

Examples

OmniHuman 1.5 (Multi-Character)

bash
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'
Supports specifying which character to drive in multi-person images.

Fabric 1.0 (Image Talks)

bash
infsh app run falai/fabric-1-0 --input '{
  "image_url": "https://face.jpg",
  "audio_url": "https://audio.mp3"
}'

PixVerse Lipsync

bash
infsh app run falai/pixverse-lipsync --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'
Generates highly realistic lipsync from any audio.
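
The three examples above pass the same two fields, `image_url` and `audio_url`, so the call can be wrapped once and reused across models. The sketch below is only a convenience: the function names `avatar_input` and `run_avatar` are hypothetical and not part of the infsh CLI, and it assumes the URLs contain no double quotes or backslashes.

```bash
# Hypothetical helpers; not part of the infsh CLI.

# Build the JSON input shared by the avatar apps shown above.
# Assumes the URLs contain no double quotes or backslashes.
avatar_input() {
  printf '{"image_url": "%s", "audio_url": "%s"}' "$1" "$2"
}

# Run any avatar app with an image URL and an audio URL.
run_avatar() {
  local app="$1" image_url="$2" audio_url="$3"
  infsh app run "$app" --input "$(avatar_input "$image_url" "$audio_url")"
}

# Usage (same placeholder URLs as above):
#   run_avatar falai/fabric-1-0 "https://face.jpg" "https://audio.mp3"
```

Keeping the payload builder separate from the runner makes it easy to print and inspect the JSON before spending a run on it.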

Full Workflow: TTS + Avatar

bash

1. Generate speech from text

infsh app run infsh/kokoro-tts --input '{ "text": "Welcome to our product demo. Today I will show you..." }' > speech.json

2. Create avatar video with the speech

infsh app run bytedance/omnihuman-1-5 --input '{ "image_url": "https://presenter-photo.jpg", "audio_url": "<audio-url-from-step-1>" }'
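
Step 2 needs the audio URL that step 1 wrote to `speech.json`. The layout of the TTS app's output is not given here, so the sketch below avoids naming a field and simply takes the first http(s) URL found in the file. That choice is an assumption, and the helper name `first_url` is hypothetical. Check the actual output of `infsh/kokoro-tts` and tighten the extraction if it contains more than one URL.

```bash
# Hypothetical helper: print the first http(s) URL found in a file.
# Assumes the URL is stored as a plain JSON string (no escaped slashes).
first_url() {
  grep -oE 'https?://[^"[:space:]]+' "$1" | head -n 1
}

# Usage, after step 1 has written speech.json:
#   audio_url="$(first_url speech.json)"
#   infsh app run bytedance/omnihuman-1-5 --input \
#     "{\"image_url\": \"https://presenter-photo.jpg\", \"audio_url\": \"$audio_url\"}"
```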

Full Workflow: Dub Video in Another Language

bash

1. Transcribe original video

infsh app run infsh/fast-whisper-large-v3 --input '{"audio_url": "https://video.mp4"}' > transcript.json

2. Translate text (manually or with an LLM)

3. Generate speech in new language

infsh app run infsh/kokoro-tts --input '{"text": "<translated-text>"}' > new_speech.json

4. Lipsync the original video with new audio

infsh app run infsh/latentsync-1-6 --input '{ "video_url": "https://original-video.mp4", "audio_url": "<new-audio-url>" }'
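
Translated text pasted into the `--input` JSON in step 3 will often contain double quotes or backslashes, which break a hand-written JSON string. Below is a minimal sketch of one way to escape it first, using only `sed`. The helper names `json_escape` and `tts_input` are hypothetical, and the sketch assumes single-line text (newlines and other control characters are not handled).

```bash
# Hypothetical helper: escape backslashes and double quotes for a JSON string.
# Single-line text only; newlines and control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the input for the TTS step from raw translated text.
tts_input() {
  printf '{"text": "%s"}' "$(json_escape "$1")"
}

# Usage for step 3:
#   infsh app run infsh/kokoro-tts --input "$(tts_input 'Dijo "hola" y se fue.')" > new_speech.json
```

Backslashes are escaped before quotes so that the backslashes added for the quotes are not escaped a second time.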

Use Cases

  • Marketing: Product demos with AI presenter
  • Education: Course videos, explainers
  • Localization: Dub content in multiple languages
  • Social Media: Consistent virtual influencer
  • Corporate: Training videos, announcements

Tips

  • Use high-quality portrait photos (front-facing, good lighting)
  • Audio should be clear with minimal background noise
  • OmniHuman 1.5 supports multiple people in one image
  • LatentSync is best for syncing existing videos to new audio

Related Skills

bash

Full platform skill (all 150+ apps)

npx skills add inference-sh/skills@inference-sh

Text-to-speech (generate audio for avatars)

npx skills add inference-sh/skills@text-to-speech

Speech-to-text (transcribe for dubbing)

npx skills add inference-sh/skills@speech-to-text

Video generation

npx skills add inference-sh/skills@ai-video-generation

Image generation (create avatar images)

npx skills add inference-sh/skills@ai-image-generation

Browse all video apps: `infsh app list --category video`

Documentation
