alibabacloud-video-translation

Video Translation Skill

One-click video translation powered by Alibaba Cloud IMS, supporting subtitle-level and speech-level translation.

Input Format Requirements

IMPORTANT: Different APIs use different address formats!

API Address Format Reference

API	Address Format	Example
`SubmitIProductionJob` (subtitle extraction)	`oss://` format	`oss://my-bucket/videos/test.mp4`
`SubmitVideoTranslationJob` (video translation)	HTTP URL format	`https://my-bucket.oss-cn-shanghai.aliyuncs.com/videos/test.mp4`

Key: Subtitle extraction uses `oss://`; video translation uses an HTTP URL!

User Input Handling

User Input Type	Processing Method
HTTP URL format	Use directly for video translation; convert to `oss://` if subtitle extraction needed
`oss://` format	Use directly for subtitle extraction; convert to HTTP URL for video translation
Local video	MUST ask for OSS upload path, save both formats after upload

Format Conversion Rules

oss:// format ⇄ HTTP URL format

oss://my-bucket/videos/test.mp4
    ⇄
https://my-bucket.oss-cn-shanghai.aliyuncs.com/videos/test.mp4

Conversion Formula:

`oss://<bucket>/<path>` → `https://<bucket>.oss-<region>.aliyuncs.com/<path>`

The HTTP URL does not require signing; use the bucket domain format directly.
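
The conversion formula above can be sketched in Python. This is a minimal illustration, not part of the skill itself: the function names `oss_to_http` and `http_to_oss` are hypothetical, the region (for example `cn-shanghai`) must be supplied by the caller because the `oss://` form does not carry it, and object paths are assumed to need no extra URL encoding.

```python
from urllib.parse import urlparse


def oss_to_http(oss_uri: str, region: str) -> str:
    """Convert oss://<bucket>/<path> to the bucket-domain HTTPS URL."""
    parsed = urlparse(oss_uri)
    if parsed.scheme != "oss" or not parsed.netloc:
        raise ValueError(f"not an oss:// URI: {oss_uri!r}")
    # netloc is the bucket name, path keeps its leading slash
    return f"https://{parsed.netloc}.oss-{region}.aliyuncs.com{parsed.path}"


def http_to_oss(url: str) -> str:
    """Convert a bucket-domain HTTP(S) URL back to oss://<bucket>/<path>."""
    parsed = urlparse(url)
    bucket, sep, rest = parsed.netloc.partition(".oss-")
    if (
        parsed.scheme not in ("http", "https")
        or not sep
        or not rest.endswith(".aliyuncs.com")
    ):
        raise ValueError(f"not an OSS bucket-domain URL: {url!r}")
    return f"oss://{bucket}{parsed.path}"
```

Keeping both directions side by side makes it easy to save both formats after an upload, as the input handling table requires.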

Local Video Processing Flow

User provides local video path
    │
    ├─ AskUserQuestion: "Please provide OSS upload path (format: oss://<bucket>/<path>/<filename>.mp4)"
    │
    ├─ User specifies upload path
    │   ├─ Check if Bucket exists
    │   ├─ Upload file: aliyun oss cp <local_path> <oss_path>
    │   ├─ Save oss:// format → for subtitle extraction
    │   └─ Save HTTP URL format → for video translation
    │
    └─ User does not specify path → STOP, user MUST provide upload path

Upload Command:

aliyun oss cp <local_path> oss://<bucket>/<path>/<filename>.mp4

Save both formats after upload:

Local: /Users/demo/videos/test.mp4
Uploaded to: oss://my-bucket/videos/test.mp4
    ├─ oss:// format: oss://my-bucket/videos/test.mp4 (for subtitle extraction)
    └─ HTTP URL: https://my-bucket.oss-cn-shanghai.aliyuncs.com/videos/test.mp4 (for video translation)

Execution Gate Checklist

Strict Requirement: Agent MUST execute the phases in order and cannot proceed until the current phase passes!

Phase 0: Environment and Credential Check (HARD-GATE)

Check Item	Command	Pass Condition	Failure Handling
CLI version	`aliyun version`	>= 3.3.1	STOP, see cli-installation-guide.md
Credential status	`aliyun configure list`	Valid status	STOP, guide configuration
Plugin installation	`aliyun configure set --auto-plugin-install true`	Set	Auto-set

HARD-GATE: Cannot proceed with any subsequent operations without passing!
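
The CLI version gate can be sketched as a numeric comparison rather than a string comparison (so that `3.10.0` counts as newer than `3.3.1`). This is an illustrative helper only: the name `cli_version_ok` is hypothetical, and it assumes the output of `aliyun version` contains a dotted `x.y.z` version string, which is not confirmed here.

```python
import re


def cli_version_ok(version_output: str, minimum: str = "3.3.1") -> bool:
    """Return True if the first x.y.z found in the output is >= minimum."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        return False  # no recognizable version: treat as a failed gate
    found = tuple(int(part) for part in match.groups())
    required = tuple(int(part) for part in minimum.split("."))
    # Tuples compare element by element, so 3.10.0 > 3.3.1 as intended
    return found >= required
```

A `False` result corresponds to the STOP branch in the table above.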

Phase 1: Translation Mode Confirmation (BLOCKING)

AskUserQuestion: "Do you need subtitle translation (translate subtitles only) or speech translation (translate subtitles + replace voiceover)?"

┌─ Subtitle translation → NeedSpeechTranslate: false
└─ Speech translation → NeedSpeechTranslate: true

⚠️ No reply received → STOP, cannot proceed!

DO NOT infer translation mode from input type!

Phase 2: Subtitle Processing Confirmation (BLOCKING)

AskUserQuestion: "Do you need to erase original subtitles from the video? Do you need to burn-in translated subtitles?"

⚠️ No reply received → STOP, cannot proceed!

Parameter Mapping:

Feature	Parameter	Value
Erase original subtitles	`DetextArea`	`"Auto"` / coordinates / not set (no erasure)
Burn-in new subtitles	`SubtitleConfig`	config object / not set (no burn-in)
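
The "not set" rule in the mapping above can be sketched as follows: a parameter is included only when the user asked for the feature. The helper name `subtitle_params` is hypothetical, and the flat dictionary is an assumption for illustration; where these keys sit in the real request is defined in api-parameters.md.

```python
from typing import Any, Optional


def subtitle_params(
    detext_area: Optional[Any] = None,
    subtitle_config: Optional[dict] = None,
) -> dict:
    """Include DetextArea / SubtitleConfig only when the user requested them."""
    params = {}
    if detext_area is not None:
        # "Auto" or explicit coordinates, as confirmed by the user
        params["DetextArea"] = detext_area
    if subtitle_config is not None:
        # Config object for burning in translated subtitles
        params["SubtitleConfig"] = subtitle_config
    return params
```

Passing nothing yields an empty dictionary, which corresponds to no erasure and no burn-in.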

Phase 3: Output Path Confirmation (Non-blocking)

Condition	Processing Method
User explicitly specifies	Use user's path
User does not specify	Use default path and inform user

Default Output Rules:

Bucket: Same bucket as input video
Directory: Same directory as input video
Filename: `{source}_translated_{random8}.mp4`

Example:

oss://bucket/videos/demo.mp4

→

oss://bucket/videos/demo_translated_a1b2c3d4.mp4

DO NOT use shell variables, use Python:

python3 -c "import random; print(''.join(random.choices('abcdefghijkmnpqrstuvwxyz23456789', k=8)))"
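
The default output rules can be combined into one small Python sketch that derives the full output path from the input path. The function name `default_output_path` is hypothetical; the alphabet is the one used in the command above, and the input is assumed to be an `oss://` path with a file name.

```python
import posixpath
import random

# Same alphabet as the one-liner above (no l, o, 0, 1)
ALPHABET = "abcdefghijkmnpqrstuvwxyz23456789"


def default_output_path(input_oss_uri: str) -> str:
    """Same bucket and directory as the input, with a random 8-char suffix."""
    directory, filename = posixpath.split(input_oss_uri)
    stem, _ext = posixpath.splitext(filename)
    suffix = "".join(random.choices(ALPHABET, k=8))
    return f"{directory}/{stem}_translated_{suffix}.mp4"
```

For `oss://bucket/videos/demo.mp4` this returns a path of the form `oss://bucket/videos/demo_translated_<random8>.mp4`.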

Phase 4: Subtitle Review Confirmation (Conditional Blocking)

Trigger Condition	Processing Method
User chooses to review subtitles	BLOCKING, MUST wait for user confirmation of review result
User does not need review	Non-blocking, proceed

CRITICAL: After subtitle extraction, MUST output content as-is for user review, DO NOT change format!

Scenario Entry Selector

Key Points:

When user inputs local video, MUST first upload to OSS and get HTTP URL

When user does not provide subtitle, MUST ask if subtitle extraction and review is needed

User inputs video
    │
    ├─ Local video?
    │   └─ Yes → AskUserQuestion: "Please provide OSS upload path"
    │       ├─ User provides path → Upload to OSS → Convert to HTTP URL → Continue
    │       └─ User does not provide → STOP
    │
    ├─ oss:// format?
    │   └─ Yes → Inform user to convert to HTTP URL format
    │
    └─ HTTP URL format? → Continue
        │
        ├─ User provides SRT file?
        │   ├─ Yes → Input type = with_subtitle
        │   │   ├─ Translation mode = speech → 【Scenario 4】 ⚠️ MUST ask CustomSrtType
        │   │   └─ Translation mode = subtitle → 【Scenario 3】
        │   │
        │   └─ No → Input type = only_video ⚠️ MUST ask if review needed
        │       │
        │       ├─ AskUserQuestion: "Do you need to extract subtitles for review first, or translate directly?"
        │       │
        │       ├─ Need review → 【Scenario 2】 ⚠️ Phase 4 blocking
        │       │
        │       └─ Direct translation → 【Scenario 1】 (TextSource=OCR_ASR)

Scenario	Name	Blocking Point	TextSource	Flow
0	Local video upload	OSS upload path inquiry	-	Upload→HTTP URL→Subsequent scenario
1	Direct translation	Phase 1, 2	`OCR_ASR`	Submit translation directly
2	Subtitle review	Phase 1, 2, Subtitle review inquiry, Phase 4	`SubtitleFile`	Extract subtitle→Review→Translate
3	Subtitle translation + user subtitle	Phase 1, 2	`SubtitleFile`	Use user SRT to translate directly
4	Speech translation + user subtitle	Phase 1, 2 + CustomSrtType confirmation	`SubtitleFile`	Confirm subtitle language then translate
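
The selector and table above reduce to a small decision function for scenarios 1 to 4 (scenario 0 is the upload step that precedes them). This is a sketch with hypothetical names (`Scenario`, `select_scenario`); every argument is meant to come from an explicit user answer, never from inference.

```python
from typing import NamedTuple


class Scenario(NamedTuple):
    number: int
    text_source: str


def select_scenario(
    has_srt: bool, speech_mode: bool, wants_review: bool = False
) -> Scenario:
    """Map the user's answers to a scenario number and TextSource value."""
    if has_srt:
        # User-provided SRT; speech mode also requires asking CustomSrtType
        return Scenario(4 if speech_mode else 3, "SubtitleFile")
    if wants_review:
        # Extract subtitles, wait for review (Phase 4), then translate
        return Scenario(2, "SubtitleFile")
    # Video only, translate directly
    return Scenario(1, "OCR_ASR")
```

When no SRT is provided, the translation mode does not change the scenario number; it only affects `NeedSpeechTranslate`.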

Scenario 0 (Local video) detailed flow:
- AskUserQuestion: "Please provide OSS upload path (format: oss://<bucket>/<path>/<filename>.mp4)"
- After user specifies path, execute `aliyun oss cp <local_path> <oss_path>`
- Convert to HTTP URL: `https://<bucket>.oss-<region>.aliyuncs.com/<path>/<filename>.mp4`
- Continue with subsequent scenario flow

Scenario 2 detailed flow:
- Ask for subtitle detection region (roi parameter)
- Call `CaptionExtraction` to extract subtitles; input and output use oss:// format
- Output subtitle content as-is for user review
- After user confirmation, use reviewed SRT to submit translation

Parameter Decision Table

Decision Rules: Clearly define handling for each parameter, DO NOT assume arbitrarily!

Parameter	Trigger Condition	Handling Method	Default Value	Prohibited Behavior
`NeedSpeechTranslate`	Always	MUST ask	None	DO NOT infer from input
`NeedFaceTranslate`	Always	Fixed value	`false`	DO NOT set to true
`DetextArea`	User chooses erasure	MUST ask	None	DO NOT set to Auto arbitrarily
`SubtitleConfig`	User chooses burn-in	Can use default	Standard style	DO NOT skip confirmation
`TextSource`	Scenario decides	Scenario rules	See scenario mapping	DO NOT choose arbitrarily
`CustomSrtType`	Scenario 4	MUST ask	None	DO NOT infer arbitrarily
`OutputConfig.MediaURL`	Output path	Can use default	Default rules	DO NOT use shell variables
`JobParams.roi`	Subtitle extraction	MUST ask	`[[0.5,1],[0,1]]`	DO NOT set default arbitrarily
`SourceLanguage`	User specifies or inferable	Can use default	Auto detect	Use zh for Chinese only
`TargetLanguage`	User specifies	Can use default	`en`	Ask for other languages

TextSource Scenario Mapping:

Scenario	Value	Description
1	`OCR_ASR`	Auto-detect subtitles
2	`SubtitleFile`	Reviewed SRT
3, 4	`SubtitleFile`	User-provided SRT

CustomSrtType Trigger Rules:

Condition	Value
CaptionExtraction extracted	`SourceSrt`
User provides subtitle (Scenario 4)	MUST ask: SourceSrt / TargetSrt

Failure Protection Mechanism

HARD-GATE: After speech translation fails, DO NOT auto-switch to subtitle translation!

API Error Handling

ErrorCode	Handling Action
`Forbidden.SubscriptionRequired`	See ram-policies.md
`InvalidParameter`	See api-parameters.md
`InputConfig.Subtitle is invalid`	See troubleshooting.md
`JobFailed`	Record JobId, ask user if retry needed
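
The error table above can be held as a simple lookup. This is a sketch: the names `ERROR_ACTIONS` and `handling_action` are hypothetical, and matching by substring is an assumption, chosen because `InputConfig.Subtitle is invalid` reads like a message fragment rather than a code.

```python
from typing import Optional

# Keys and actions copied from the error handling table
ERROR_ACTIONS = {
    "Forbidden.SubscriptionRequired": "See ram-policies.md",
    "InvalidParameter": "See api-parameters.md",
    "InputConfig.Subtitle is invalid": "See troubleshooting.md",
    "JobFailed": "Record JobId, ask user if retry needed",
}


def handling_action(error_text: str) -> Optional[str]:
    """Return the documented action for the first matching error key."""
    for key, action in ERROR_ACTIONS.items():
        if key in error_text:
            return action
    return None  # not covered by the table
```

Note that none of these actions switches translation mode, in line with the HARD-GATE above.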

SRT Format Repair Flow

Detect empty subtitle entries → Delete empty entries → Renumber → Upload repaired file → Inform user

See troubleshooting.md for details.
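
The first three steps of the repair flow (detect, delete, renumber) can be sketched in Python. This is an illustration under stated assumptions, not the procedure from troubleshooting.md: the function name `repair_srt` is hypothetical, and an entry counts as "empty" when its timing line is followed by no text. Uploading the repaired file and informing the user are left out.

```python
import re


def repair_srt(srt_text: str) -> str:
    """Drop entries that have no subtitle text and renumber the rest from 1."""
    normalized = srt_text.replace("\r\n", "\n").strip()
    kept = []
    # Entries are separated by one or more blank lines
    for block in re.split(r"\n\s*\n", normalized):
        lines = [line for line in block.split("\n") if line.strip()]
        # Locate the timing line, e.g. "00:00:01,000 --> 00:00:02,000"
        timing_index = next(
            (i for i, line in enumerate(lines) if "-->" in line), None
        )
        if timing_index is None:
            continue  # not a subtitle entry
        text_lines = lines[timing_index + 1:]
        if not text_lines:
            continue  # empty entry: timing but no text
        kept.append((lines[timing_index], text_lines))
    entries = [
        "\n".join([str(number), timing, *text])
        for number, (timing, text) in enumerate(kept, start=1)
    ]
    return "\n\n".join(entries) + "\n" if entries else ""
```

Timing lines and subtitle text are carried over unchanged; only the index numbers are rewritten.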

CLI Command Templates

IMPORTANT: Before submitting API, MUST reference api-parameters.md to confirm parameter format!

See cli-commands.md for details.

Core Commands:

Register media asset

aliyun ice register-media-info --input-url "oss://<bucket>/<object>" --media-type video --user-agent AlibabaCloud-Agent-Skills

Submit subtitle extraction (use OSS path)

aliyun ice submit-iproduction-job \
  --function-name CaptionExtraction \
  --input "Media=oss://<bucket>/<object> Type=OSS" \
  --biz-output "Media=oss://<bucket>/<output>.srt Type=OSS" \
  --job-params '{"lang":"ch","roi":[[0.5,1],[0,1]]}' \
  --force \
  --user-agent AlibabaCloud-Agent-Skills

Submit video translation

aliyun ice submit-video-translation-job \
  --user-agent AlibabaCloud-Agent-Skills


> **CLI Format Key Points**:
> - Subtitle extraction uses command name `submit-iproduction-job` (lowercase, `-` separator)
> - `--input` and `--biz-output` format: space-separated string `"Media=... Type=OSS"`, NOT JSON
> - `--job-params` format: JSON string
> - MUST add `--force` to skip plugin parameter validation
> - **All ICE commands MUST add `--user-agent AlibabaCloud-Agent-Skills`**

---

Documentation Reference

Document	Content
workflow-details.md	Detailed execution flow for 4 scenarios
cli-commands.md	CLI command template library
troubleshooting.md	Error handling details
api-parameters.md	Complete API parameter documentation
ram-policies.md	RAM permission requirements
cli-installation-guide.md	CLI installation guide

Key Constraints

- Before submitting API, MUST reference api-parameters.md to confirm parameter format
- All ICE CLI commands MUST add `--user-agent AlibabaCloud-Agent-Skills`
- Subtitle extraction (SubmitIProductionJob) uses `oss://` format
- Video translation (SubmitVideoTranslationJob) uses HTTP URL format, no signing needed
- Local videos MUST first be uploaded to OSS, user MUST provide upload path
- `NeedFaceTranslate` MUST be `false`
- `SpeechTranslate` and `SubtitleTranslate` are mutually exclusive
- `InputConfig.Subtitle` MUST use HTTPS format, DO NOT use `oss://`
- Speech translation + SRT input requires `SpeechTranslate.CustomSrtType`
- DO NOT infer translation mode from input type

Task Polling

Mandatory: MUST continuously poll task status until completion (`State=Finished`) or failure (`State=Failed`), DO NOT exit early!

Task Type	Query Command	Interval	Timeout
Subtitle extraction	`QueryIProductionJob`	30 seconds	5 minutes
Video translation	`get-smart-handle-job`	30 seconds	30 minutes

Polling Logic:

Loop polling until:
  - State == "Finished" → Return result
  - State == "Failed" → Report error
  - Exceeds the timeout for the task type (5 minutes for subtitle extraction, 30 minutes for video translation) → Report TimeoutError

Prohibited: Return after single query / Skip polling and return JobId directly
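
The polling logic can be sketched as a loop with an injectable query function. This is an illustration only: `poll_until_done` and `query_state` are hypothetical names, and `query_state` stands in for whatever wraps the query command and extracts `State` from its response, which is not shown here. The `sleep` and `clock` parameters exist so the loop can be exercised without waiting.

```python
import time
from typing import Callable


def poll_until_done(
    query_state: Callable[[], str],
    interval_seconds: float = 30.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the job is Finished or Failed, or the timeout is exceeded."""
    deadline = clock() + timeout_seconds
    while True:
        state = query_state()
        if state == "Finished":
            return state
        if state == "Failed":
            raise RuntimeError("job failed")
        if clock() >= deadline:
            raise TimeoutError(
                f"job still {state!r} after {timeout_seconds} seconds"
            )
        # Any other state means the job is still running: wait and retry
        sleep(interval_seconds)
```

For subtitle extraction, pass `timeout_seconds=300` to match the 5-minute limit in the table; the defaults correspond to video translation.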

Time Reference (3-minute video):

Subtitle-level translation: 3-5 minutes
Speech-level translation: 10-20 minutes

Result Retrieval

Get media asset info

aliyun ice get-media-info --media-id "<MediaId>" --user-agent AlibabaCloud-Agent-Skills

Generate signed URL (for private Bucket)

aliyun oss sign oss://<bucket>/<object> --timeout 3600


---

*End of Document*
