alibabacloud-video-translation

Video Translation Skill

One-click video translation powered by Alibaba Cloud IMS, supporting subtitle-level and speech-level translation.

Input Format Requirements

IMPORTANT: Different APIs use different address formats!

API Address Format Reference

| API | Address Format | Example |
| --- | --- | --- |
| `SubmitIProductionJob` (subtitle extraction) | `oss://` format | `oss://my-bucket/videos/test.mp4` |
| `SubmitVideoTranslationJob` (video translation) | HTTP URL format | `https://my-bucket.oss-cn-shanghai.aliyuncs.com/videos/test.mp4` |

Key: Subtitle extraction uses `oss://`, video translation uses HTTP URL!

User Input Handling

| User Input Type | Processing Method |
| --- | --- |
| HTTP URL format | Use directly for video translation; convert to `oss://` if subtitle extraction is needed |
| `oss://` format | Use directly for subtitle extraction; convert to HTTP URL for video translation |
| Local video | MUST ask for OSS upload path; save both formats after upload |

Format Conversion Rules

`oss://` format ⇄ HTTP URL format:

`oss://my-bucket/videos/test.mp4` ⇄ `https://my-bucket.oss-cn-shanghai.aliyuncs.com/videos/test.mp4`

Conversion Formula:
  • `oss://<bucket>/<path>` ⇄ `https://<bucket>.oss-<region>.aliyuncs.com/<path>`
  • HTTP URL does not require signing; use the Bucket domain format directly
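
The conversion formula above can be sketched in Python as follows. This is a minimal illustration, not part of the skill itself: the function names `oss_to_http` and `http_to_oss` are hypothetical, the region must be supplied by the caller because an `oss://` path does not carry it, and object keys are assumed not to need URL-encoding.

```python
from urllib.parse import urlparse


def oss_to_http(oss_uri, region):
    """Convert oss://<bucket>/<path> to https://<bucket>.oss-<region>.aliyuncs.com/<path>."""
    parsed = urlparse(oss_uri)
    if parsed.scheme != "oss" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"not an oss:// object path: {oss_uri!r}")
    return f"https://{parsed.netloc}.oss-{region}.aliyuncs.com{parsed.path}"


def http_to_oss(url):
    """Convert a Bucket-domain HTTP(S) URL back to oss://<bucket>/<path>."""
    parsed = urlparse(url)
    host = parsed.netloc
    if (
        parsed.scheme not in ("http", "https")
        or ".oss-" not in host
        or not host.endswith(".aliyuncs.com")
    ):
        raise ValueError(f"not a Bucket-domain URL: {url!r}")
    # Everything before ".oss-<region>" is the bucket name.
    bucket = host.split(".oss-", 1)[0]
    return f"oss://{bucket}{parsed.path}"
```

For example, `oss_to_http("oss://my-bucket/videos/test.mp4", "cn-shanghai")` yields the HTTP URL shown in the table above, and `http_to_oss` reverses it.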

Local Video Processing Flow

```
User provides local video path
    ├─ AskUserQuestion: "Please provide OSS upload path (format: oss://<bucket>/<path>/<filename>.mp4)"
    ├─ User specifies upload path
    │   ├─ Check if Bucket exists
    │   ├─ Upload file: aliyun oss cp <local_path> <oss_path>
    │   ├─ Save oss:// format → for subtitle extraction
    │   └─ Save HTTP URL format → for video translation
    └─ User does not specify path → STOP, user MUST provide upload path
```

Upload Command:

```bash
aliyun oss cp <local_path> oss://<bucket>/<path>/<filename>.mp4
```

Save both formats after upload:

```
Local: /Users/demo/videos/test.mp4
Uploaded to: oss://my-bucket/videos/test.mp4
    ├─ oss:// format: oss://my-bucket/videos/test.mp4 (for subtitle extraction)
    └─ HTTP URL: https://my-bucket.oss-cn-shanghai.aliyuncs.com/videos/test.mp4 (for video translation)
```
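
As a rough sketch of the STOP rule in this flow, the helper below refuses to build the upload command unless the user has supplied a path in the documented `oss://<bucket>/<path>/<filename>.mp4` shape. The function name and the bucket-name pattern (lowercase letters, digits, hyphens) are assumptions made for illustration; the command itself is the `aliyun oss cp` call shown above.

```python
import re

# Assumed shape of a valid upload target: oss://<bucket>/<key ending in .mp4>
OSS_MP4_PATTERN = re.compile(r"^oss://[a-z0-9][a-z0-9-]*/.+\.mp4$")


def build_upload_command(local_path, oss_path):
    """Return the argv list for `aliyun oss cp`, or raise if the path is missing or malformed."""
    if not oss_path:
        # Mirrors "User does not specify path → STOP".
        raise ValueError("STOP: user MUST provide an OSS upload path")
    if not OSS_MP4_PATTERN.match(oss_path):
        raise ValueError(f"expected oss://<bucket>/<path>/<filename>.mp4, got {oss_path!r}")
    return ["aliyun", "oss", "cp", local_path, oss_path]
```

Passing the returned list to `subprocess.run` avoids shell quoting problems with paths that contain spaces.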

Execution Gate Checklist

Strict Requirement: Agent MUST execute in phase order, cannot proceed without passing current phase!

Phase 0: Environment and Credential Check (HARD-GATE)

| Check Item | Command | Pass Condition | Failure Handling |
| --- | --- | --- | --- |
| CLI version | `aliyun version` | >= 3.3.1 | STOP, see cli-installation-guide.md |
| Credential status | `aliyun configure list` | Valid status | STOP, guide configuration |
| Plugin installation | `aliyun configure set --auto-plugin-install true` | Set | Auto-set |

HARD-GATE: Cannot proceed with any subsequent operations without passing!
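
The version gate needs a numeric comparison, since a plain string comparison would rank `3.10.0` below `3.3.1`. The sketch below shows one way to do it; the helper names are hypothetical, and it assumes the output of `aliyun version` contains a dotted `major.minor.patch` number somewhere in its text.

```python
import re

MIN_CLI_VERSION = (3, 3, 1)


def parse_version(text):
    """Extract the first major.minor.patch triple from text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def cli_version_ok(version_output, minimum=MIN_CLI_VERSION):
    """True when the reported CLI version is at least the required minimum."""
    return parse_version(version_output) >= minimum
```

The text to check could be captured with `subprocess.run(["aliyun", "version"], capture_output=True, text=True).stdout`.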

Phase 1: Translation Mode Confirmation (BLOCKING)

```
AskUserQuestion: "Do you need subtitle translation (translate subtitles only) or speech translation (translate subtitles + replace voiceover)?"

┌─ Subtitle translation → NeedSpeechTranslate: false
└─ Speech translation → NeedSpeechTranslate: true

⚠️ No reply received → STOP, cannot proceed!
```
DO NOT infer translation mode from input type!

Phase 2: Subtitle Processing Confirmation (BLOCKING)

AskUserQuestion: "Do you need to erase original subtitles from the video? Do you need to burn-in translated subtitles?"

⚠️ No reply received → STOP, cannot proceed!
Parameter Mapping:

| Feature | Parameter | Value |
| --- | --- | --- |
| Erase original subtitles | `DetextArea` | `"Auto"` / coordinates / not set (no erasure) |
| Burn-in new subtitles | `SubtitleConfig` | config object / not set (no burn-in) |

Phase 3: Output Path Confirmation (Non-blocking)

| Condition | Processing Method |
| --- | --- |
| User explicitly specifies | Use user's path |
| User does not specify | Use default path and inform user |

Default Output Rules:
  • Bucket: Same bucket as input video
  • Directory: Same directory as input video
  • Filename: `{source}_translated_{random8}.mp4`
  • Example: `oss://bucket/videos/demo.mp4` → `oss://bucket/videos/demo_translated_a1b2c3d4.mp4`

DO NOT use shell variables to generate the random suffix; use Python:

```bash
python3 -c "import random; print(''.join(random.choices('abcdefghijkmnpqrstuvwxyz23456789', k=8)))"
```
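
Putting the default output rules together, a small helper could derive the whole default path rather than just the suffix. This is only a sketch: the function name `default_output_path` is hypothetical, and it assumes the input is given in the `oss://` form used by the example above.

```python
import posixpath
import random

# Same alphabet as the one-liner above.
SUFFIX_ALPHABET = "abcdefghijkmnpqrstuvwxyz23456789"


def default_output_path(input_path):
    """Build <same bucket>/<same directory>/{source}_translated_{random8}.mp4."""
    directory, filename = posixpath.split(input_path)
    source, _extension = posixpath.splitext(filename)
    suffix = "".join(random.choices(SUFFIX_ALPHABET, k=8))
    return f"{directory}/{source}_translated_{suffix}.mp4"
```

Calling `default_output_path("oss://bucket/videos/demo.mp4")` returns a path of the form `oss://bucket/videos/demo_translated_<8 random characters>.mp4`, which should then be reported to the user as the rules require.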

Phase 4: Subtitle Review Confirmation (Conditional Blocking)

| Trigger Condition | Processing Method |
| --- | --- |
| User chooses to review subtitles | BLOCKING, MUST wait for user confirmation of review result |
| User does not need review | Non-blocking, proceed |

CRITICAL: After subtitle extraction, MUST output content as-is for user review, DO NOT change format!

Scenario Entry Selector

Key Points:
  1. When user inputs local video, MUST first upload to OSS and get HTTP URL
  2. When user does not provide subtitle, MUST ask if subtitle extraction and review is needed
```
User inputs video
    ├─ Local video?
    │   └─ Yes → AskUserQuestion: "Please provide OSS upload path"
    │       ├─ User provides path → Upload to OSS → Convert to HTTP URL → Continue
    │       └─ User does not provide → STOP
    ├─ oss:// format?
    │   └─ Yes → Inform user to convert to HTTP URL format
    └─ HTTP URL format? → Continue
        ├─ User provides SRT file?
        │   ├─ Yes → Input type = with_subtitle
        │   │   ├─ Translation mode = speech → 【Scenario 4】 ⚠️ MUST ask CustomSrtType
        │   │   └─ Translation mode = subtitle → 【Scenario 3】
        │   │
        │   └─ No → Input type = only_video ⚠️ MUST ask if review needed
        │       │
        │       ├─ AskUserQuestion: "Do you need to extract subtitles for review first, or translate directly?"
        │       │
        │       ├─ Need review → 【Scenario 2】 ⚠️ Phase 4 blocking
        │       │
        │       └─ Direct translation → 【Scenario 1】 (TextSource=OCR_ASR)
```

| Scenario | Name | Blocking Point | TextSource | Flow |
| --- | --- | --- | --- | --- |
| 0 | Local video upload | OSS upload path inquiry | - | Upload → HTTP URL → Subsequent scenario |
| 1 | Direct translation | Phase 1, 2 | `OCR_ASR` | Submit translation directly |
| 2 | Subtitle review | Phase 1, 2, Subtitle review inquiry, Phase 4 | `SubtitleFile` | Extract subtitle → Review → Translate |
| 3 | Subtitle translation + user subtitle | Phase 1, 2 | `SubtitleFile` | Use user SRT to translate directly |
| 4 | Speech translation + user subtitle | Phase 1, 2 + CustomSrtType confirmation | `SubtitleFile` | Confirm subtitle language then translate |

Scenario 0 (Local video) detailed flow:
  1. AskUserQuestion: "Please provide OSS upload path (format: oss://<bucket>/<path>/<filename>.mp4)"
  2. After user specifies path, execute `aliyun oss cp <local_path> <oss_path>`
  3. Convert to HTTP URL: `https://<bucket>.oss-<region>.aliyuncs.com/<path>/<filename>.mp4`
  4. Continue with subsequent scenario flow

Scenario 2 detailed flow:
  1. Ask for subtitle detection region (roi parameter)
  2. Call `CaptionExtraction` to extract subtitles; input and output use `oss://` format
  3. Output subtitle content as-is for user review
  4. After user confirmation, use reviewed SRT to submit translation
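
Once the video is available as an HTTP URL, the branch into Scenarios 1 to 4 depends only on three answers collected from the user. The function below is an illustrative encoding of that branch, not the skill's actual implementation; its name and return shape are hypothetical, and every argument must come from an explicit user answer rather than inference.

```python
def select_scenario(has_srt, speech_translate, wants_review=False):
    """Return (scenario number, TextSource value, whether CustomSrtType must be asked)."""
    if has_srt:
        if speech_translate:
            # Speech translation + user subtitle: CustomSrtType confirmation required.
            return 4, "SubtitleFile", True
        return 3, "SubtitleFile", False
    if wants_review:
        # Extract subtitles, wait for review (Phase 4 blocking), then translate.
        return 2, "SubtitleFile", False
    return 1, "OCR_ASR", False
```

For instance, a video without an SRT file whose owner wants to translate directly maps to `(1, "OCR_ASR", False)`.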

Parameter Decision Table

Decision Rules: Clearly define handling for each parameter, DO NOT assume arbitrarily!

| Parameter | Trigger Condition | Handling Method | Default Value | Prohibited Behavior |
| --- | --- | --- | --- | --- |
| `NeedSpeechTranslate` | Always | MUST ask | None | DO NOT infer from input |
| `NeedFaceTranslate` | Always | Fixed value | `false` | DO NOT set to true |
| `DetextArea` | User chooses erasure | MUST ask | None | DO NOT set to Auto arbitrarily |
| `SubtitleConfig` | User chooses burn-in | Can use default | Standard style | DO NOT skip confirmation |
| `TextSource` | Scenario decides | Scenario rules | See scenario mapping | DO NOT choose arbitrarily |
| `CustomSrtType` | Scenario 4 | MUST ask | None | DO NOT infer arbitrarily |
| `OutputConfig.MediaURL` | Output path | Can use default | Default rules | DO NOT use shell variables |
| `JobParams.roi` | Subtitle extraction | MUST ask | `[[0.5,1],[0,1]]` | DO NOT set default arbitrarily |
| `SourceLanguage` | User specifies or inferable | Can use default | Auto detect | Use zh for Chinese only |
| `TargetLanguage` | User specifies | Can use default | `en` | Ask for other languages |

TextSource Scenario Mapping:

| Scenario | Value | Description |
| --- | --- | --- |
| 1 | `OCR_ASR` | Auto-detect subtitles |
| 2 | `SubtitleFile` | Reviewed SRT |
| 3, 4 | `SubtitleFile` | User-provided SRT |

CustomSrtType Trigger Rules:

| Condition | Value |
| --- | --- |
| CaptionExtraction extracted | `SourceSrt` |
| User provides subtitle (Scenario 4) | MUST ask: `SourceSrt` / `TargetSrt` |

Failure Protection Mechanism

HARD-GATE: After speech translation fails, DO NOT auto-switch to subtitle translation!

API Error Handling

| ErrorCode | Handling Action |
| --- | --- |
| `Forbidden.SubscriptionRequired` | See ram-policies.md |
| `InvalidParameter` | See api-parameters.md |
| `InputConfig.Subtitle is invalid` | See troubleshooting.md |
| `JobFailed` | Record JobId, ask user if retry needed |

SRT Format Repair Flow

Detect empty subtitle entries → Delete empty entries → Renumber → Upload repaired file → Inform user
See troubleshooting.md for details.
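
The first three steps of this flow (detect, delete, renumber) can be sketched as a pure function over the SRT text. This is a minimal illustration under the assumption that entries are separated by blank lines and that each entry has a timing line containing `-->`; the function name `repair_srt` is hypothetical, and uploading the result and informing the user remain separate steps.

```python
import re


def repair_srt(srt_text):
    """Drop entries that have no subtitle text and renumber the rest from 1."""
    normalized = srt_text.replace("\r\n", "\n").strip()
    kept = []
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        # Locate the timing line; blocks without one are skipped as malformed.
        timing_index = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if timing_index is None:
            continue
        text_lines = [line for line in lines[timing_index + 1:] if line.strip()]
        if not text_lines:
            continue  # empty subtitle entry: delete it
        kept.append((lines[timing_index].strip(), text_lines))
    entries = [
        "\n".join([str(number), timing, *text_lines])
        for number, (timing, text_lines) in enumerate(kept, start=1)
    ]
    return "\n\n".join(entries) + "\n" if entries else ""
```

An entry that has a timing line but no text is removed, and the entries after it shift down by one index.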

CLI Command Templates

IMPORTANT: Before submitting API, MUST reference api-parameters.md to confirm parameter format!
See cli-commands.md for details.
Core Commands:

```bash
# Register media asset
aliyun ice register-media-info --input-url "oss://<bucket>/<object>" --media-type video --user-agent AlibabaCloud-Agent-Skills

# Submit subtitle extraction (use OSS path)
aliyun ice submit-iproduction-job \
  --function-name CaptionExtraction \
  --input "Media=oss://<bucket>/<object> Type=OSS" \
  --biz-output "Media=oss://<bucket>/<output>.srt Type=OSS" \
  --job-params '{"lang":"ch","roi":[[0.5,1],[0,1]]}' \
  --force \
  --user-agent AlibabaCloud-Agent-Skills

# Submit video translation
aliyun ice submit-video-translation-job \
  --user-agent AlibabaCloud-Agent-Skills
```

> **CLI Format Key Points**:
> - Subtitle extraction uses command name `submit-iproduction-job` (lowercase, `-` separator)
> - `--input` and `--biz-output` format: space-separated string `"Media=... Type=OSS"`, NOT JSON
> - `--job-params` format: JSON string
> - MUST add `--force` to skip plugin parameter validation
> - **All ICE commands MUST add `--user-agent AlibabaCloud-Agent-Skills`**
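
To show how the format rules fit together, here is a hedged Python sketch that assembles the subtitle extraction command as an argument list. The helper name `caption_extraction_argv` is hypothetical; the flags and value shapes are the ones in the template, and `lang` and `roi` are deliberately required arguments because `roi` must be asked of the user.

```python
import json

USER_AGENT = "AlibabaCloud-Agent-Skills"


def caption_extraction_argv(input_oss, output_srt_oss, roi, lang):
    """Build the argv list for `aliyun ice submit-iproduction-job` (CaptionExtraction)."""
    for path in (input_oss, output_srt_oss):
        if not path.startswith("oss://"):
            raise ValueError(f"subtitle extraction requires oss:// paths, got {path!r}")
    # --job-params is a JSON string; --input and --biz-output are NOT JSON.
    job_params = json.dumps({"lang": lang, "roi": roi}, separators=(",", ":"))
    return [
        "aliyun", "ice", "submit-iproduction-job",
        "--function-name", "CaptionExtraction",
        "--input", f"Media={input_oss} Type=OSS",
        "--biz-output", f"Media={output_srt_oss} Type=OSS",
        "--job-params", job_params,
        "--force",
        "--user-agent", USER_AGENT,
    ]
```

Building a list instead of a shell string sidesteps quoting of the JSON payload when the command is run with `subprocess.run`.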

---

Documentation Reference

| Document | Content |
| --- | --- |
| workflow-details.md | Detailed execution flow for 4 scenarios |
| cli-commands.md | CLI command template library |
| troubleshooting.md | Error handling details |
| api-parameters.md | Complete API parameter documentation |
| ram-policies.md | RAM permission requirements |
| cli-installation-guide.md | CLI installation guide |

Key Constraints

  • Before submitting API, MUST reference api-parameters.md to confirm parameter format
  • All ICE CLI commands MUST add `--user-agent AlibabaCloud-Agent-Skills`
  • Subtitle extraction (SubmitIProductionJob) uses `oss://` format
  • Video translation (SubmitVideoTranslationJob) uses HTTP URL format, no signing needed
  • Local videos MUST first be uploaded to OSS, user MUST provide upload path
  • `NeedFaceTranslate` MUST be `false`
  • `SpeechTranslate` and `SubtitleTranslate` are mutually exclusive
  • `InputConfig.Subtitle` MUST use HTTPS format, DO NOT use `oss://`
  • Speech translation + SRT input requires `SpeechTranslate.CustomSrtType`
  • DO NOT infer translation mode from input type

Task Polling

Mandatory: MUST continuously poll task status until completion (`State=Finished`) or failure (`State=Failed`), DO NOT exit early!

| Task Type | Query Command | Interval | Timeout |
| --- | --- | --- | --- |
| Subtitle extraction | `QueryIProductionJob` | 30 seconds | 5 minutes |
| Video translation | `get-smart-handle-job` | 30 seconds | 30 minutes |

Polling Logic:

```
Loop polling until:
  - State == "Finished" → Return result
  - State == "Failed" → Report error
  - Exceeds the timeout for the task type (5 minutes for subtitle extraction, 30 minutes for video translation) → Report TimeoutError
```

Prohibited: Return after single query / Skip polling and return JobId directly
Time Reference (3-minute video):
  • Subtitle-level translation: 3-5 minutes
  • Speech-level translation: 10-20 minutes
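
The polling rule can be sketched as a small loop that takes the status query as a callable, so the same code serves both task types with their own timeouts. Everything here is illustrative: `poll_until_done` and its parameters are hypothetical names, the caller is assumed to supply a `query_state` function that runs the appropriate query command and returns the `State` string, and any state other than `Finished` or `Failed` is treated as still in progress.

```python
import time


def poll_until_done(query_state, timeout_s, interval_s=30.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll query_state() every interval_s seconds until Finished, Failed, or timeout."""
    deadline = clock() + timeout_s
    while True:
        state = query_state()
        if state == "Finished":
            return state
        if state == "Failed":
            raise RuntimeError("job failed; record the JobId and ask the user about a retry")
        if clock() >= deadline:
            raise TimeoutError(f"job still {state!r} after {timeout_s} seconds")
        sleep(interval_s)
```

With the table above, subtitle extraction would use `timeout_s=300` and video translation `timeout_s=1800`. The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.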

Result Retrieval

```bash
# Get media asset info
aliyun ice get-media-info --media-id "<MediaId>" --user-agent AlibabaCloud-Agent-Skills

# Generate signed URL (for private Bucket)
aliyun oss sign oss://<bucket>/<object> --timeout 3600
```

---

*End of Document*