video-translation

Video Translation

Translate a video's speech into another language, using TTS to generate the dubbed audio and replacing the original audio track.

Triggers

translate this video
dub this video to English
把视频从 X 语译成 Y 语
视频翻译

Use Cases

The user wants to watch a foreign-language YouTube video but would rather hear it in their native language.
The user provides a video link and explicitly requests changing the audio language.

Workflow

When the user asks to translate a video:

1. Download Video & Subtitles: Use the `youtube-downloader` skill to download the video and its subtitles as SRT. Specify the source language so that the correct subtitle track is fetched.

   ```bash
   python path/to/youtube-downloader/scripts/download_video.py "VIDEO_URL" --subtitles --sub-lang <source_lang_code> -o /tmp/video-translation
   ```

2. Translate Subtitles: Read the downloaded `.srt` file and translate its text sentence by sentence into the target language using the fixed prompt below. Keep every SRT index and timestamp exactly as in the source file; only the subtitle text changes.

   Translation Prompt:

   Translate the following subtitle text from <Source Language> to <Target Language>. Provide ONLY the translated text. Do not explain, do not add notes, do not add index numbers. The translation must be colloquial, natural-sounding, and suitable for video dubbing.

   Save the translated subtitles, with the original indices and timestamps, to a new file `translated.srt`.

3. Generate Dubbed Audio: Use the `tts` skill to render timeline-accurate audio from the translated SRT. The Noiz backend automatically aligns the duration of each sentence to the original video's subtitle timestamps.

   To make the cloned voice match the original speaker's tone and emotion in each sentence, pass the original video file to `--ref-audio-track`. The TTS engine slices the original audio at each subtitle's timestamp and uses that slice as the reference for the corresponding segment.

   Create a basic `voice_map.json`:

   ```json
   {
     "default": {
       "target_lang": "<target_lang_code>"
     }
   }
   ```

   Render the timeline-accurate audio:

   ```bash
   bash skills/tts/scripts/tts.sh render --srt translated.srt --voice-map voice_map.json --backend noiz --auto-emotion --ref-audio-track original_video.mp4 -o dubbed.wav
   ```

4. Replace Audio in Video: Use the `replace_audio.sh` script to merge the original video with the new dubbed audio. To keep the original video's non-speech background audio outside the translated segments, pass the SRT file with `--srt`.

   ```bash
   bash skills/video-translation/scripts/replace_audio.sh --video original_video.mp4 --audio dubbed.wav --output final_video.mp4 --srt translated.srt
   ```

5. Present the Result: Return the `final_video.mp4` file path to the user.
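
The "Translate Subtitles" step asks for translated text while the SRT indices and timestamps stay untouched. One way to guarantee that is to parse the file into cues and hand only the text to the translator. The following is a minimal sketch, not part of this skill's scripts: `translate_srt` and its `translate` callback are hypothetical names, and the callback stands in for whatever applies the fixed translation prompt to one cue's text.

```python
import re
from typing import Callable

# A cue is: an index line, a timestamp line, then one or more text lines.
TIMESTAMP = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3} --> \d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def translate_srt(srt_text: str, translate: Callable[[str], str]) -> str:
    """Return SRT text with cue text translated; indices and timestamps are kept verbatim.

    `translate` is a hypothetical callback (for example, a call that sends the
    fixed translation prompt plus the cue text to a model).
    """
    normalized = srt_text.replace("\r\n", "\n").strip()
    blocks = re.split(r"\n\s*\n", normalized)
    out = []
    for block in blocks:
        lines = block.split("\n")
        if len(lines) < 3 or not TIMESTAMP.match(lines[1]):
            # Not a well-formed cue: pass it through unchanged.
            out.append(block)
            continue
        index, timing = lines[0], lines[1]
        # Join wrapped subtitle lines so the translator sees one sentence.
        text = " ".join(line.strip() for line in lines[2:])
        out.append("\n".join([index, timing, translate(text).strip()]))
    return "\n\n".join(out) + "\n"
```

Because the index and timestamp lines never reach the translator, a prompt that forbids index numbers in its output cannot disturb the timeline that the TTS step relies on.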

Inputs

Required inputs:
- `VIDEO_URL`: The URL of the video to translate.
- `target_language`: The language to translate the audio into.
Optional inputs:
- `source_language`: The language of the original video, needed when it is neither auto-detected nor already specified.
- `reference_audio`: A specific audio file or URL to use for voice cloning, instead of the reference sliced dynamically from the original video's audio track.

Outputs

Success: Path to the final video file with replaced audio.
Failure: Clear error message specifying whether download, TTS, or audio replacement failed.
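
To report which stage failed, the three commands from the workflow can be run one after another, stopping at the first non-zero exit code. The sketch below is one possible wrapper, assuming the commands behave like ordinary CLI tools that signal failure through their exit status. `StageError`, `run_stages`, and `example_stages` are hypothetical names, and the file paths are copied from the workflow and may need adjusting to where the files actually land.

```python
import subprocess
from typing import Callable, Iterable, Sequence, Tuple


class StageError(RuntimeError):
    """Raised when one pipeline stage exits with a non-zero status."""

    def __init__(self, stage: str, returncode: int, detail: str):
        super().__init__(f"{stage} failed (exit code {returncode}): {detail}")
        self.stage = stage
        self.returncode = returncode


def run_stages(
    stages: Iterable[Tuple[str, Sequence[str]]],
    runner: Callable = subprocess.run,
) -> None:
    """Run each (name, command) pair in order; raise StageError on the first failure."""
    for name, cmd in stages:
        result = runner(list(cmd), capture_output=True, text=True)
        if result.returncode != 0:
            detail = (result.stderr or "").strip() or "no error output"
            raise StageError(name, result.returncode, detail)


def example_stages(video_url: str, source_lang: str):
    """Commands copied from the workflow; translated.srt and voice_map.json must already exist."""
    return [
        ("download", ["python", "path/to/youtube-downloader/scripts/download_video.py",
                      video_url, "--subtitles", "--sub-lang", source_lang,
                      "-o", "/tmp/video-translation"]),
        ("TTS", ["bash", "skills/tts/scripts/tts.sh", "render",
                 "--srt", "translated.srt", "--voice-map", "voice_map.json",
                 "--backend", "noiz", "--auto-emotion",
                 "--ref-audio-track", "original_video.mp4", "-o", "dubbed.wav"]),
        ("audio replacement", ["bash", "skills/video-translation/scripts/replace_audio.sh",
                               "--video", "original_video.mp4", "--audio", "dubbed.wav",
                               "--output", "final_video.mp4", "--srt", "translated.srt"]),
    ]
```

A caller can catch `StageError` and pass its message to the user as the failure output, or return the path to `final_video.mp4` when `run_stages` completes without raising.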

Requirements

Dependencies (other skills)
- youtube-downloader (crazynomad/skills; see its SKILL.md)
  Install: clone or copy the `skills/youtube-downloader` directory from crazynomad/skills into your `skills/` folder so that `skills/youtube-downloader/scripts/download_video.py` is available.
- tts (NoizAI/skills; see its SKILL.md)
  If not already in this repo: clone or copy the `skills/tts` directory from NoizAI/skills into your `skills/` folder. Ensure `skills/tts/scripts/tts.sh` and related scripts are present.

`NOIZ_API_KEY` configured for the Noiz backend. If it is not set, first guide the user to get an API key from `https://developers.noiz.ai/api-keys`. After the user provides the key, ask whether they want to persist it; if they agree, either write or update `NOIZ_API_KEY=...` in the project's `.env` file, or run `bash skills/tts/scripts/tts.sh config --set-api-key YOUR_KEY` to store it.

`ffmpeg` installed.
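
The requirements above can be checked before starting a long run. The following preflight is a rough sketch under stated assumptions: `missing_requirements` is a hypothetical helper, it only looks at the process environment for `NOIZ_API_KEY` (a key stored in `.env` or through `tts.sh config` would not be seen), and it assumes the scripts live under the `skills/` paths used in the workflow.

```python
import os
import shutil
from pathlib import Path
from typing import Callable, List, Mapping, Optional

# Script locations taken from the workflow and install notes.
REQUIRED_SCRIPTS = (
    "skills/youtube-downloader/scripts/download_video.py",
    "skills/tts/scripts/tts.sh",
    "skills/video-translation/scripts/replace_audio.sh",
)


def missing_requirements(
    root: str = ".",
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    env = os.environ if env is None else env
    problems = []
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # Assumption: the key is exposed as an environment variable.
    if not env.get("NOIZ_API_KEY"):
        problems.append("NOIZ_API_KEY is not set")
    for rel in REQUIRED_SCRIPTS:
        if not (Path(root) / rel).is_file():
            problems.append(f"missing script: {rel}")
    return problems
```

If the returned list mentions `NOIZ_API_KEY`, that is the point at which to guide the user to the API key page before attempting the TTS step.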

Limitations

The source video must have subtitles (or auto-generated subtitles) available on the platform for the source language.
Very long videos may take a significant amount of time to translate and dub.
