alibabacloud-oss-media-process

Alibaba Cloud OSS Media Processing


Process images, audio, and video files stored in Alibaba Cloud OSS using native OSS media processing capabilities. Synchronous processing returns immediate results via `x-oss-process`; asynchronous processing handles long-running jobs via `x-oss-async-process` with polling.

Default language: reply in Chinese by default. Only use English when the user explicitly writes in English.


Quick Start


Working directory

All script commands run from the skill package root. Use full absolute paths to invoke scripts:

bash

python /path/to/skill/scripts/process.py ...

Do not `cd` into the directory and use relative paths. If a script fails with "No such file or directory", use Glob to find `**/alibabacloud-oss-media-process/scripts/process.py` and use its full path.

Set up the workspace output directory (run once per session):

bash

WORKSPACE_OUTPUT=$(pwd)/outputs && mkdir -p "$WORKSPACE_OUTPUT"

All `--output-path` arguments MUST use `$WORKSPACE_OUTPUT/<filename>`; files saved inside the skill directory will NOT be renderable.


Credentials (Aliyun CLI)

This skill uses Aliyun CLI for credential management. Python scripts auto-discover credentials via the `alibabacloud-credentials` default chain (supporting `~/.aliyun/config.json`, environment variables, ECS instance roles, etc.).

Security rules:

Never read, echo, print, `cat`, or dump `~/.aliyun/config.json`, credential files, or any raw command output that contains `access_key_id`, `access_key_secret`, `sts_token`, `AccessKeyId`, `AccessKeySecret`, or `SecurityToken` values.
Never ask the user to input AK/SK directly in the conversation or on the command line.
Guide users to use `aliyun configure` to set up credentials securely.
Never write `AccessKeyId`, `AccessKeySecret`, or `SecurityToken` into any temporary Python/Shell script, here-doc, env export, or intermediate file. All credentials must be discovered through Aliyun CLI or the SDK default credential chain.
For credential diagnostics, use `aliyun configure list`, `python scripts/load_env.py`, or other non-secret checks. If you must inspect configuration structure, only inspect non-sensitive fields and do not print secret or token values to the transcript.
Treat full presigned URLs as sensitive whenever they contain signing parameters such as `OSSAccessKeyId`, `accessKeyId`, `x-oss-credential`, `Signature`, `x-oss-signature`, `security-token`, `SecurityToken`, or `sts_token`. Do not print these full URLs into the conversation transcript, command echo, markdown summary, or ordinary log files.
When a signed URL is needed for user consumption, distinguish between delivery and display: it is acceptable to generate a usable signed URL, but unless the runtime provides a secure private-output channel that does not enter the transcript or logs, only display a redacted URL or an OSS path in normal user-facing text.


Prerequisites


Step	Action	Command
1	Install Aliyun CLI (>=3.3.3)	`curl -fsSL https://aliyuncli.alicdn.com/setup.sh
2	Configure credentials	`aliyun configure`
3	Run blocking preflight check 1	`python scripts/load_env.py`
4	Run blocking preflight check 2	`aliyun configure list`
5	Enable plugins	`aliyun configure set --auto-plugin-install true && aliyun plugin update`
6	Install Python deps	`pip install -r scripts/requirements.txt`
7	Set bucket/region (choose one)	`export ALIBABA_CLOUD_OSS_BUCKET=<b> ALIBABA_CLOUD_OSS_REGION=<r>` (add to `~/.bashrc` / `~/.zshrc` for persistence), or pass `--bucket <b> --region <r>` on every command

Blocking preflight policy:

`python scripts/load_env.py` may report missing SDKs, missing credentials, missing bucket/region, or RAM permission problems.
`aliyun configure list` must show a usable configured CLI profile.
Treat preflight results as stale after any environment or runtime change. If you install Python packages, run `aliyun configure`, change env vars, edit shell profiles, switch users, or otherwise modify credential/runtime state, you must rerun both `python scripts/load_env.py` and `aliyun configure list` before the next `python scripts/process.py ...` command.
If either command fails these checks, stop immediately.
Do not run `python scripts/process.py ...`.
Do not retry media processing.
Do not simulate a successful result.
Return only configuration guidance until both checks pass.
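
The blocking policy above can be sketched as a small gate that runs both checks and refuses to continue if either one fails. This is only an illustration: it assumes that a non-zero exit code signals a failed check (the exit-code contract of `load_env.py` is not stated here), and `preflight_ok`, `run_quietly`, and the injectable `runner` parameter are hypothetical names, not part of the skill.

```python
import subprocess
from typing import Callable, Sequence

# The two blocking checks named in the policy above.
PREFLIGHT_COMMANDS = (
    ("python", "scripts/load_env.py"),
    ("aliyun", "configure", "list"),
)


def run_quietly(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code without echoing its output."""
    try:
        return subprocess.run(
            list(cmd),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        ).returncode
    except OSError:
        # The executable is missing or not runnable: treat it as a failed check.
        return 127


def preflight_ok(runner: Callable[[Sequence[str]], int] = run_quietly) -> bool:
    """Return True only if every check exits with code 0 (assumed contract)."""
    for cmd in PREFLIGHT_COMMANDS:
        if runner(cmd) != 0:
            return False  # Stop at the first failure; process.py must not run.
    return True
```

A caller would run `process.py` only when `preflight_ok()` returns `True`, and call it again after any environment change, since earlier results are treated as stale.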


AI-Mode


Enable at session start:

bash

aliyun configure ai-mode enable
aliyun configure ai-mode set-user-agent --user-agent "AlibabaCloud-Agent-Skills/alibabacloud-oss-media-process"

Disable on every exit: success, failure, error, cancellation, or session end:

bash

aliyun configure ai-mode disable
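
One way to guarantee the disable step on every exit path is a context manager with a `finally` clause. The sketch below reuses the three CLI commands shown above verbatim; `ai_mode`, `run_cli`, and the injectable `run` parameter are hypothetical names introduced for illustration.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, Sequence

USER_AGENT = "AlibabaCloud-Agent-Skills/alibabacloud-oss-media-process"


def run_cli(cmd: Sequence[str]) -> None:
    """Run a CLI command; check=False so a non-zero exit code does not raise."""
    subprocess.run(list(cmd), check=False)


@contextmanager
def ai_mode(run: Callable[[Sequence[str]], None] = run_cli) -> Iterator[None]:
    """Enable AI-mode on entry and disable it on every exit path."""
    run(["aliyun", "configure", "ai-mode", "enable"])
    try:
        run(["aliyun", "configure", "ai-mode", "set-user-agent",
             "--user-agent", USER_AGENT])
        yield
    finally:
        # Runs after success, after an exception, and after cancellation.
        run(["aliyun", "configure", "ai-mode", "disable"])
```

Wrapping the session's work in `with ai_mode(): ...` keeps the enable and disable steps paired even when the work inside raises.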


Preflight then Execute

When the user requests a media operation (resize, detect faces, watermark, etc.), apply the blocking preflight policy above before running any `python scripts/process.py ...` command.

`process.py` also performs a runtime dependency preflight and exits with `pip install -r scripts/requirements.txt` guidance if required SDKs are missing. If you change the environment after a failed attempt (for example by installing dependencies, editing env vars, or re-running `aliyun configure`), do not assume the earlier preflight still holds; rerun the full blocking preflight first.


First-time setup

Direct users to run `aliyun configure` to set up credentials, then verify with:

bash

aliyun configure list

Python scripts use the `alibabacloud-credentials` SDK to auto-discover credentials from the Aliyun CLI config. Bucket and region are read from the `ALIBABA_CLOUD_OSS_BUCKET` / `ALIBABA_CLOUD_OSS_REGION` environment variables, or from the `--bucket` / `--region` CLI flags. `load_env.py` scans shell config files (`~/.bashrc`, `~/.zshrc`) for these exports and loads them into `os.environ`.


Recommended Workflow

Quick Decision Guide


All processing goes through `process.py`

Image, video, and audio operations MUST be executed via `python scripts/process.py --operations "..."`. The agent must not write its own SDK or CLI calls to bypass `process.py` or `imm_admin.py` for video/audio/image processing. Underlying SDK or API requests triggered internally by these scripts (including IMM requests such as `CreateMediaConvertTask`) are expected implementation behavior and do not count as direct agent-side SDK usage. The only intentional script-level IMM entry points are `imm_admin.py` for project setup and `blindwatermark-extract` for async watermark extraction.

Never create your own Python scripts or wrappers to bypass `process.py`. When `process.py` doesn't support a feature, check SKILL.md and the `references/` documentation, use `--dry-run` to preview, and report to the user if it truly cannot be done.


IMM setup (before IMM-dependent ops)

Before running video, audio, HLS, or image-intelligent operations, first run `imm_admin.py auto-setup` to ensure the bucket is bound to an IMM project. Pass `--imm-project <project_name>` only for `blindwatermark-extract`, or if you intentionally want to override the optional `ALIBABA_CLOUD_IMM_PROJECT` fallback used by that operation.


Source selection

OSS object → `--source object-key`
Local file or URL → `--uri /path/to/file` (auto-uploads, processes, cleans up)


Sync vs Async (auto-detected)

Sync (`x-oss-process`): image ops, `video/snapshot`, `video/info`, `audio/info`, `hls/m3u8`, AI detection
Async (`x-oss-async-process`): `video/convert`, `video/animation`, `video/snapshots`, `video/sprite`, `video/concat`, `audio/convert`, `audio/concat`, `blindwatermark-extract`

The script auto-detects async-only operations and handles routing/polling automatically; no `--async` or `--wait` flags are needed.
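
The routing decision described above amounts to a set lookup on the operation name. The following is a minimal sketch, not the logic `process.py` actually uses; it assumes operation strings take the `name:key=value,...` form seen elsewhere in this document, and `operation_name` and `processing_parameter` are hypothetical helper names.

```python
SYNC_PARAM = "x-oss-process"
ASYNC_PARAM = "x-oss-async-process"

# Async-only operations, copied from the list above.
ASYNC_OPERATIONS = frozenset({
    "video/convert", "video/animation", "video/snapshots", "video/sprite",
    "video/concat", "audio/convert", "audio/concat", "blindwatermark-extract",
})


def operation_name(operation: str) -> str:
    """Return the part before the first ':' of an operation string."""
    return operation.split(":", 1)[0].strip()


def processing_parameter(operation: str) -> str:
    """Pick the OSS processing parameter for a single operation string."""
    if operation_name(operation) in ASYNC_OPERATIONS:
        return ASYNC_PARAM
    return SYNC_PARAM
```

Note that `video/snapshot` (single frame) and `video/snapshots` (multi-frame) land on different sides of the split, which is why the lookup compares the full operation name rather than a prefix.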


Output rules


Operation type	Output mode	Command pattern
Sync (image)	`download`	`--output-mode download --output-path $WORKSPACE_OUTPUT/<file>`
Async (video/audio)	`save` then `download`	1. `--output-mode save --target-key output/<file>` → 2. `--operations download --output-path $WORKSPACE_OUTPUT/<file>`
`video/snapshots`	`save` with auto-download	`--output-mode save --target-key output/frames/frame --output-path $WORKSPACE_OUTPUT/` — script auto-polls and downloads all frames
`hls/m3u8`	`url`	`--output-mode url` — returns signed URL for browser/player (not a downloadable file)

All `--output-path` arguments MUST use `$WORKSPACE_OUTPUT/<filename>`; files saved inside the skill directory will NOT be renderable.

No-local-download rule: if the user explicitly says not to download locally, only to save in OSS, or only to return a link/URL, do not pass `--output-path` and do not perform any follow-up download for verification. Use `--output-mode url` for sync results meant to be consumed remotely, and use `--output-mode save --target-key ...` for async media results that should remain in OSS. Never download to `$WORKSPACE_OUTPUT`, `/tmp`, or any local path just to verify success; rely on the `process.py` JSON response instead.

Ambiguous save wording rule: if the user says "保存" ("save"), "保存下来" ("save it"), "存起来" ("store it"), or similar wording but does not explicitly say "下载到本地" ("download locally"), "本地查看" ("view locally"), "给我本地文件" ("give me a local file"), or another clear local-destination phrase, default to saving the result back to OSS with `--output-mode save --target-key ...`. Only use `--output-mode download --output-path ...` when the user explicitly asks for a local file. If the user only wants to inspect the result and does not require a persisted local copy, prefer `--output-mode url` for sync outputs and `--output-mode save` plus the OSS path for async outputs.
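
The output rules above can be condensed into a small decision function for the first `process.py` call. This is a sketch under stated assumptions: the three boolean inputs are a hypothetical summary of the user's request, `choose_output_mode` is an illustrative name, and special cases such as `hls/m3u8` (always `url`) are not modeled.

```python
def choose_output_mode(is_async: bool, explicit_local: bool, wants_persisted: bool) -> str:
    """Return the --output-mode value for the first process.py call.

    is_async:        the operation is async-only (video/audio conversion, etc.)
    explicit_local:  the user clearly asked for a local file
    wants_persisted: the user asked to "save" without naming a local destination
    """
    if is_async:
        # Async results are saved to OSS first; a local copy, if explicitly
        # requested, is a separate follow-up download step.
        return "save"
    if explicit_local:
        return "download"
    # Ambiguous "save" wording defaults to OSS; inspect-only requests get a URL.
    return "save" if wants_persisted else "url"
```

Keeping the local-download branch behind an explicit flag mirrors the rule that ambiguous wording never authorizes a local copy.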

Signed-URL delivery rule: the purpose of `--output-mode url` is to make a remote result accessible, not to force the full signed query string into the transcript. In ordinary text responses, prefer an OSS path or a redacted URL. Only provide a full presigned URL when the runtime offers a secure private-output channel that keeps the raw URL out of transcript/log surfaces. If no such channel exists, explain the limitation briefly and avoid printing the full signed query parameters. A redacted URL should keep the path and any non-sensitive query parameters, while replacing sensitive signing values with `***`, for example: `https://bucket.oss-cn-hangzhou.aliyuncs.com/output/result.webp?OSSAccessKeyId=***&x-oss-credential=***&Signature=***&security-token=***&Expires=1700000000`
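
Redaction of this kind can be done with the standard library alone. The sketch below is an illustration rather than the skill's own implementation; `redact_signed_url` is a hypothetical helper, and the parameter names are the ones listed in the security rules, matched case-insensitively.

```python
from urllib.parse import unquote, urlsplit, urlunsplit

# Signing parameter names taken from the security rules in this document.
SENSITIVE_PARAMS = frozenset(name.lower() for name in (
    "OSSAccessKeyId", "accessKeyId", "x-oss-credential", "Signature",
    "x-oss-signature", "security-token", "SecurityToken", "sts_token",
))


def redact_signed_url(url: str) -> str:
    """Replace the values of sensitive signing parameters with '***'."""
    parts = urlsplit(url)
    pieces = []
    for piece in parts.query.split("&"):
        key, _, _ = piece.partition("=")
        if unquote(key).lower() in SENSITIVE_PARAMS:
            pieces.append(f"{key}=***")
        else:
            pieces.append(piece)  # Keep non-sensitive parameters byte-for-byte.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, "&".join(pieces), parts.fragment)
    )
```

The query string is split by hand instead of being parsed and re-encoded, so non-sensitive parameters such as `Expires` keep their original encoding.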

Unique suffix rule: when you need a unique OSS target key suffix for evals, retries, or parallel runs, prefer Python-generated UUIDs or a timestamp-plus-random suffix. Do not rely on `uuidgen` being available. If you must generate a suffix from shell commands, first verify the command exists; otherwise fall back to a timestamp plus random digits. Safe shell example: `SUFFIX=$(python3 -c "import uuid; print(uuid.uuid4().hex[:8])" 2>/dev/null || date +%Y%m%d_%H%M%S_$RANDOM)`
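
The Python-generated variant of the suffix can be sketched as follows. `unique_suffix` and `with_suffix` are hypothetical helper names, and placing the suffix before the file extension is an assumption about how target keys are usually named, not a rule from this document.

```python
import posixpath
import uuid


def unique_suffix(length: int = 8) -> str:
    """Return the first `length` hex characters of a random UUID4."""
    if not 1 <= length <= 32:
        raise ValueError("length must be between 1 and 32")
    return uuid.uuid4().hex[:length]


def with_suffix(target_key: str) -> str:
    """Insert a unique suffix before the extension, or append it if there is none."""
    stem, ext = posixpath.splitext(target_key)
    return f"{stem}_{unique_suffix()}{ext}"
```

For example, `with_suffix("output/result.mp4")` yields a key of the form `output/result_<8 hex chars>.mp4`, while an extension-less key such as `output/frames/frame` simply gets the suffix appended.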


Chaining rules

See the dedicated Chaining Rules section below for full chaining guidelines.

Core Parameter Rules

Only pass parameters the user specifies; do not invent defaults. OSS uses official defaults for unspecified parameters (e.g., keep original width/height, original bitrate, original framerate).
Recipes are examples, not defaults: parameter values in recipe tables (e.g., `w=800`, `vb=2000000`) are for specific scenarios and should NOT be used as defaults.
video/convert (remux vs re-encode): omitting `vcodec` means OSS only does remux (stream copy without re-encoding). Parameters like `videoslim`, `vb`, `crf`, `s`, `fps` are silently ignored in remux mode. Always specify `vcodec` (default `h264`) when the user says "transcode", "compress", or "slim". Only omit `vcodec` for pure remux (e.g., AVI→MP4 container switch) or audio extraction.
video/concat (when input params differ): if input videos have different resolution, framerate, or codec, you must ask the user which video to align to (option A: first video, B: second video, C: custom params). Never auto-decide.
video/concat (validation scope): `process.py` always performs input compatibility checks before submitting the async task. Additional local `ffprobe` output validation only runs when the command also downloads the result via `--output-path`. If you use `--output-mode save` without a local download path, there is no post-download media validation step.
Snapshots vs snapshot: use `video/snapshots` (async) for multi-frame extraction. Never use multiple `video/snapshot` calls as a workaround. The `video/snapshots` target-key must NOT have a file extension.
For full parameter specifications, see the corresponding reference files in `references/`.
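
The remux pitfall described above can be caught before a command is submitted. The sketch below is a hypothetical pre-check, not part of `process.py`; it assumes the `name:key=value,...` operation syntax seen in this document's examples, and `parse_params` and `ignored_in_remux` are illustrative names.

```python
# Parameters the text above says are silently ignored in remux mode.
REENCODE_ONLY_PARAMS = frozenset({"videoslim", "vb", "crf", "s", "fps"})


def parse_params(operation: str) -> dict:
    """Split 'video/convert:f=mp4,vb=2000000' into {'f': 'mp4', 'vb': '2000000'}."""
    _, _, raw = operation.partition(":")
    params = {}
    for item in filter(None, raw.split(",")):
        key, _, value = item.partition("=")
        params[key.strip()] = value.strip()
    return params


def ignored_in_remux(operation: str) -> list:
    """Return re-encode-only parameters that would be ignored because vcodec is absent."""
    if not operation.startswith("video/convert"):
        return []
    params = parse_params(operation)
    if "vcodec" in params:
        return []
    return sorted(key for key in params if key in REENCODE_ONLY_PARAMS)
```

A non-empty result means the request mixes re-encode parameters with remux mode, so `vcodec` should be added before running the command.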


Result Presentation

After every successful `process.py` execution, present results in this format:

Language rule: unless the user explicitly requested English, the final user-facing result summary in this section must be written in Chinese. Use a result template that matches the response language. For Chinese responses, use a Chinese lead-in such as `处理结果如下：` and Chinese field labels such as `状态`, `请求 ID`, `任务 ID`, `源文件`, `输出`, `参数`, `文件大小`, `OSS 路径`. For English responses, use `Result summary:` and the corresponding English labels `Status`, `RequestID`, `Task ID`, `Source`, `Output`, `Params`, `File Size`, `OSS Path`.

1. File path: output the local absolute path in a code block (e.g., `/path/to/outputs/snapshot.jpg`). Never use `open` or the Read tool to display files. Only include this section when the file was actually downloaded or written locally. Do not present an `outputs/...` path that was only planned, inferred, or mentioned in a transcript.

2. Result table:

Item	Detail
Status	✅ Completed
RequestID	`<request_id>` (or `N/A` )
Task ID	`<task_id>` (async only)
Source	`source/input.mp4`
Output	`output/result.mp4`
Params	Dynamic — from your command (e.g., MP4/H.264/2Mbps, or 800x600/JPEG)
File Size	From download output
OSS Path	`oss://<bucket>/<target-key>` (save mode only)

Field sourcing rules: `Status` and `Params` must be quoted directly from the `process.py` JSON response. `Status` must come from the returned `success` field, and `Params` must come from the returned `operations` field. Never rewrite, estimate, normalize, or summarize numeric/media values by hand, including confidence scores, bitrate, resolution, dimensions, frame rate, or codec details.

If you need a textual summary, include the original command or process string in a fenced code block and describe it conservatively. Do not invent parameter values or restate them in free-form prose when they are not explicitly present in the `process.py` response.

Final summary constraints:

Do not insert fixed English filler such as `Task Completed Successfully`.
Numeric values such as sample rate, bitrate, resolution, duration, frame rate, and file count must be copied directly from `process.py` JSON fields or an explicitly performed read-only verification result.
If a value was not obtained directly from machine output, omit it instead of rewriting, estimating, rounding, or normalizing it by hand.
If an explicitly performed read-only verification result differs from the requested value, report the actual verified output value and describe the request as only partially satisfied when necessary. Do not replace the verified value with the requested one.
If no read-only verification result was obtained, do not claim that machine-verifiable output properties were independently confirmed.
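
The sourcing rules above suggest building the summary rows mechanically from the JSON response instead of retyping values. The following is a rough sketch: the field names `success`, `request_id`, `task_id`, and `operations` are the ones mentioned in this section, but the exact response schema (for example, whether `operations` is a plain string) is an assumption, and `summary_rows` and the `Failed` label are hypothetical.

```python
import json


def summary_rows(response_json: str) -> list:
    """Build (label, value) rows, copying values verbatim and omitting absent ones."""
    data = json.loads(response_json)
    rows = [
        # Status is derived only from the returned `success` field.
        ("Status", "✅ Completed" if data.get("success") is True else "Failed"),
        ("RequestID", data.get("request_id") or "N/A"),
    ]
    if data.get("task_id"):  # Present for async tasks only.
        rows.append(("Task ID", data["task_id"]))
    if data.get("operations"):  # Params come from the returned `operations` field.
        rows.append(("Params", data["operations"]))
    return rows
```

Rows whose source field is missing are left out entirely, in line with the constraint that values not obtained from machine output are omitted rather than estimated.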

If the user forbids local downloads, omit the `File path` row/section entirely and do not create temporary local files for validation. In that case, present only the JSON-backed metadata returned by `process.py`, such as `success`, `request_id`, `task_id`, `target_key`, `generated_keys`, or `url`.

If `process.py` returns a signed URL, treat the full query string as sensitive output. In normal visible summaries, prefer the OSS path, target key, or a redacted URL. Do not expand raw signing parameters into the final summary unless the runtime has a secure private-output channel for secret delivery.

If independent verification was requested but the workflow returned only a signed URL and did not create a persisted OSS target object, do not claim that a follow-up `info` check was performed on a final output object. Either save the result first and verify the saved object, or state clearly that no persisted-object verification was available.

For image outputs and visual effects such as watermarks, overlays, blur regions, or face redaction, distinguish between metadata verification and visual verification. If the output was not downloaded or rendered locally, do not claim that a visual element was independently confirmed by inspection; state that only the service-reported processing result was verified unless a local render or explicit inspection step was actually performed.

Rules:

Do not run `video/info`, `audio/info`, or image `--operations info` after processing for ordinary result reporting. However, if the user explicitly asks you to verify concrete machine-verifiable output properties such as codec, bitrate, sample rate, channel count, duration, resolution, frame rate, width, height, or format, or if the eval/acceptance criteria explicitly require an independent property check, prefer running one additional read-only verification step against a persisted OSS output object and report that verification separately from the main `process.py` result. Use `audio/info` or `video/info` for audio/video outputs, and use a separate `--operations info` command for image outputs.
Do not assume local verification libraries or binaries such as `PIL`/`Pillow`, `ffprobe`, or similar tools are preinstalled. Use them only when they are actually available and the workflow genuinely requires a local-file check; otherwise rely on `process.py` JSON output and permitted read-only OSS-side checks.
For image width/height/format verification, prefer OSS-side `--operations info` on the saved target object even if a local file is present. Do not use `PIL`/`Pillow` as the default verification method for evals or routine skill runs.
Requests to verify image width, height, format, or similar machine-verifiable properties do not by themselves authorize a local download. If the user did not explicitly request a local file, and a saved OSS target object can be verified with `info`, do not switch to `--output-mode download` solely for verification.
Do not use `head_object` as a substitute for media-property verification.
Avoid `sleep` + retry loops; the script handles async polling internally.
All media processing goes through `process.py`; if unsupported, check `references/` and report. Do not write custom scripts.


Chaining Rules


Image Operations

Basic operations can be freely chained with each other
`blindwatermark-embed` can follow basic ops but must be the last operation
`blindwatermark-extract` must be used alone (no chaining)
AI detection (`faces`, `bodies`, `cars`, `codes`, `labels`, `score`) must be used alone


Video/Audio Operations


Video/audio operations cannot be chained with image operations
Only one video/audio operation per request (no chaining)
For complex workflows, use multiple separate requests
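
The image and video/audio chaining rules can be checked mechanically before a command is built. The sketch below is a hypothetical validator, not something `process.py` is documented to expose; it takes a list of operation names (without parameters), identifies video/audio operations by their `video/` or `audio/` prefix, and does not model `hls/m3u8`. `chain_errors` and `is_media_op` are illustrative names.

```python
AI_DETECTION = frozenset({"faces", "bodies", "cars", "codes", "labels", "score"})


def is_media_op(name: str) -> bool:
    """Video/audio operations are written as 'video/...' or 'audio/...' here."""
    return name.startswith(("video/", "audio/"))


def chain_errors(names: list) -> list:
    """Return chaining-rule violations; an empty list means the chain is allowed."""
    errors = []
    if len(names) > 1:
        if any(is_media_op(name) for name in names):
            errors.append("video/audio operations cannot be chained")
        for name in names:
            if name in AI_DETECTION or name == "blindwatermark-extract":
                errors.append(f"{name} must be used alone")
        if "blindwatermark-embed" in names[:-1]:
            errors.append("blindwatermark-embed must be the last operation")
    return errors
```

A single operation always passes, which matches the guidance to split complex workflows into several separate requests.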


Credential & Environment Setup

Credentials are managed by Aliyun CLI (`~/.aliyun/config.json`). Python scripts auto-discover them via the `alibabacloud-credentials` SDK default chain. See Prerequisites above for setup steps.

Diagnostic check:

bash

python scripts/load_env.py

This scans for legacy env vars and verifies RAM permissions. Use this if operations fail with access errors.

Runtime dependency preflight: `process.py` checks required Python packages before execution. Basic OSS/file operations require `oss2` and `alibabacloud-credentials`; video/audio/HLS/IMM operations also require the IMM SDK packages from `scripts/requirements.txt`. If any dependency is missing, the command fails fast with an install hint instead of starting a partial execution.

IMM project: usually discovered by `imm_admin.py auto-setup`. `process.py` only consumes `--imm-project` / `ALIBABA_CLOUD_IMM_PROJECT` for `blindwatermark-extract`.


IMM Auto-Setup


Video/audio processing and image-intelligent features require an IMM project bound to the bucket. Follow this workflow for IMM-dependent operations:

Step 1 — Detect IMM project (before any processing command):

bash

python scripts/imm_admin.py auto-setup --bucket <bucket> --region <region>

This ensures the bucket is bound to a usable IMM project and prints the resolved project name.

Step 2 — Execute the media operation:

bash

python scripts/process.py --source video.mp4 \
  --operations "video/convert:f=mp4,vcodec=h264" \
  --output-mode save --target-key output/video.mp4

For `blindwatermark-extract`, append `--imm-project <project_name>` if you do not want to rely on the optional `ALIBABA_CLOUD_IMM_PROJECT` fallback.

Step 3 — Present results per Execution & Output Workflow above.

Operations that require IMM bucket setup: all video/audio/HLS ops, image-intelligent ops (faces, bodies, cars, codes, labels, score, blindwatermark-embed/extract), smart crop (`crop:g=auto`, `crop:g=face`), face blur (`blur:g=face`, `blur:g=faces`). Only `blindwatermark-extract` requires the project name as a direct `process.py` input.

视频/音频处理和图片智能功能需要绑定到存储桶的IMM项目。依赖IMM的操作遵循以下工作流程：

步骤1——检测IMM项目（任何处理命令前）：

bash

python scripts/imm_admin.py auto-setup --bucket <bucket> --region <region>

该命令确保存储桶已绑定到可用的IMM项目，并打印解析后的项目名称。

步骤2——执行媒体操作：

bash

python scripts/process.py --source video.mp4 \
  --operations "video/convert:f=mp4,vcodec=h264" \
  --output-mode save --target-key output/video.mp4

对于 `blindwatermark-extract`，若不想依赖可选的 `ALIBABA_CLOUD_IMM_PROJECT` 回退值，附加 `--imm-project <project_name>`。

步骤3——按执行与输出工作流程展示结果。

需要IMM存储桶设置的操作：所有视频/音频/HLS操作、图片智能操作（faces、bodies、cars、codes、labels、score、blindwatermark-embed/extract）、智能裁剪（`crop:g=auto`、`crop:g=face`）、人脸模糊（`blur:g=face`、`blur:g=faces`）。仅 `blindwatermark-extract` 需要将项目名称作为 `process.py` 的直接输入。
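The list of IMM-dependent operations above can be turned into a pre-check that decides whether `imm_admin.py auto-setup` has to run first. This is a minimal sketch: the operation names come from the list above, while the helper names `requires_imm` and `needs_project_flag` are hypothetical and not part of the skill.

```python
# Operation names are taken from the list above; the helpers are hypothetical.
IMM_OPERATIONS = {
    "faces", "bodies", "cars", "codes", "labels", "score",
    "blindwatermark-embed", "blindwatermark-extract",
}
IMM_PREFIXES = ("video/", "audio/", "hls/")
# Parameter values that turn crop/blur into IMM-backed smart operations.
SMART_PARAMS = {"crop": {"g=auto", "g=face"}, "blur": {"g=face", "g=faces"}}


def requires_imm(operation: str) -> bool:
    """Return True if the operation string needs an IMM project bound to the bucket."""
    name, _, params = operation.partition(":")
    if name.startswith(IMM_PREFIXES) or name in IMM_OPERATIONS:
        return True
    smart = SMART_PARAMS.get(name, set())
    return any(p.strip() in smart for p in params.split(","))


def needs_project_flag(operation: str) -> bool:
    """Only blindwatermark-extract takes the project name as a direct input."""
    return operation.partition(":")[0] == "blindwatermark-extract"


if __name__ == "__main__":
    for op in ("resize:w=600", "crop:g=face", "video/convert:f=mp4,vcodec=h264"):
        print(op, "->", "IMM setup first" if requires_imm(op) else "no IMM needed")
```

If any operation in a chain returns `True`, run the Step 1 command before calling `process.py`.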

Available Operations

可用操作

Image Processing (Sync)

图片处理（同步）

Operation	Description	Reference
`resize` , `crop` , `indexcrop` , `rotate` , `flip`	Basic transformations	`references/image-basic-operations.md`
`quality` , `format` , `interlace`	Quality & format	`references/image-basic-operations.md`
`watermark` , `blur` , `sharpen` , `bright` , `contrast`	Effects	`references/image-basic-operations.md`
`auto-orient` , `circle` , `rounded-corners`	Utilities	`references/image-basic-operations.md`
`info` , `average-hue`	Metadata (JSON)	`references/image-basic-operations.md`

操作	描述	参考
`resize` 、 `crop` 、 `indexcrop` 、 `rotate` 、 `flip`	基础变换	`references/image-basic-operations.md`
`quality` 、 `format` 、 `interlace`	质量与格式	`references/image-basic-operations.md`
`watermark` 、 `blur` 、 `sharpen` 、 `bright` 、 `contrast`	特效	`references/image-basic-operations.md`
`auto-orient` 、 `circle` 、 `rounded-corners`	工具类	`references/image-basic-operations.md`
`info` 、 `average-hue`	元数据（JSON）	`references/image-basic-operations.md`

Image-Intelligent (IMM)

图片智能处理（IMM）

Operation	Mode	Description	Reference
`blindwatermark-embed`	Sync	Embed invisible watermark. Must be last in chain.	`references/image-imm-operations.md`
`blindwatermark-extract`	Async	Extract watermark. Use alone.	`references/image-imm-operations.md`
`faces` , `bodies` , `cars`	Sync	Detect faces/bodies/cars (JSON).	`references/image-imm-operations.md`
`codes` , `labels` , `score`	Sync	QR/barcode recognition, labels, quality score (JSON).	`references/image-imm-operations.md`

操作	模式	描述	参考
`blindwatermark-embed`	同步	嵌入不可见水印。必须是操作链的最后一步。	`references/image-imm-operations.md`
`blindwatermark-extract`	异步	提取水印。单独使用。	`references/image-imm-operations.md`
`faces` 、 `bodies` 、 `cars`	同步	检测人脸/人体/车辆（JSON）。	`references/image-imm-operations.md`
`codes` 、 `labels` 、 `score`	同步	二维码/条形码识别、标签、质量评分（JSON）。	`references/image-imm-operations.md`

Video Processing

视频处理

Operation	Mode	Description	Reference
`video/convert`	Async	Transcode video. Must specify `vcodec` for re-encode.	`references/video-operations.md`
`video/snapshot`	Sync	Extract single frame. `t` (time ms) required.	`references/video-operations.md`
`video/info`	Sync	Video metadata (JSON).	`references/video-operations.md`
`video/animation`	Async	Video to GIF/WebP.	`references/video-operations.md`
`video/snapshots`	Async	Multi-frame extraction. target-key must NOT have extension.	`references/video-operations.md`
`video/sprite`	Async	Sprite sheet. Must specify `num` or `inter` .	`references/video-operations.md`
`video/concat`	Async	Concatenate videos (max 11). Must verify input params match.	`references/video-operations.md`

操作	模式	描述	参考
`video/convert`	异步	视频转码。重新编码必须指定 `vcodec` 。	`references/video-operations.md`
`video/snapshot`	同步	提取单帧。必须指定 `t` （时间，毫秒）。	`references/video-operations.md`
`video/info`	同步	视频元数据（JSON）。	`references/video-operations.md`
`video/animation`	异步	视频转GIF/WebP。	`references/video-operations.md`
`video/snapshots`	异步	多帧提取。目标键不得包含扩展名。	`references/video-operations.md`
`video/sprite`	异步	雪碧图。必须指定 `num` 或 `inter` 。	`references/video-operations.md`
`video/concat`	异步	视频拼接（最多11个）。必须验证输入参数匹配。	`references/video-operations.md`

Audio Processing

音频处理

Operation	Mode	Description	Reference
`audio/convert`	Async	Transcode audio.	`references/audio-operations.md`
`audio/concat`	Async	Concatenate audio files.	`references/audio-operations.md`
`audio/info`	Sync	Audio metadata (JSON).	`references/audio-operations.md`

操作	模式	描述	参考
`audio/convert`	异步	音频转码。	`references/audio-operations.md`
`audio/concat`	异步	音频文件拼接。	`references/audio-operations.md`
`audio/info`	同步	音频元数据（JSON）。	`references/audio-operations.md`

HLS Streaming

HLS流媒体

Operation	Mode	Description	Reference
`hls/m3u8`	Sync	HLS playlist (returns a playlist, not a file — use `--output-mode url` ).	`references/video-operations.md`

操作	模式	描述	参考
`hls/m3u8`	同步	HLS播放列表（返回播放列表，非文件——使用 `--output-mode url` ）。	`references/video-operations.md`

File Operations

文件操作

Operation	Mode	Description
`upload`	Sync	Upload local file/URL to OSS. Use with `--uri` and `--target-key` .
`download`	Sync	Download OSS object. Use with `--source` and `--output-path` .

操作	模式	描述
`upload`	同步	上传本地文件/URL到OSS。配合 `--uri` 和 `--target-key` 使用。
`download`	同步	下载OSS对象。配合 `--source` 和 `--output-path` 使用。

Processing Modes

处理模式

Synchronous (`x-oss-process`): image basic processing, `video/snapshot`, `video/info`, `audio/info`, `hls/m3u8`, AI detection. Results are returned immediately.

Asynchronous (`x-oss-async-process`): video/audio transcoding, animation, sprite, snapshots, concat, blindwatermark-extract. The mode is auto-detected and the job is auto-polled until completion.

同步（`x-oss-process`）：图片基础处理、`video/snapshot`、`video/info`、`audio/info`、`hls/m3u8`、AI检测。结果即时返回。

异步（`x-oss-async-process`）：视频/音频转码、动图生成、雪碧图、多帧截图、拼接、盲水印提取。自动检测，自动轮询直到完成。
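The sync/async split described above is essentially a lookup on the operation name. The sketch below mirrors the "Mode" columns of the operation tables; it is an illustration of the rule, not the detection logic `process.py` actually uses, and `processing_parameter` is a hypothetical name.

```python
# Operations marked "Async" in the tables above.
ASYNC_OPERATIONS = {
    "video/convert", "video/animation", "video/snapshots", "video/sprite",
    "video/concat", "audio/convert", "audio/concat", "blindwatermark-extract",
}


def processing_parameter(operation: str) -> str:
    """Pick the OSS parameter that would carry the given operation."""
    name = operation.partition(":")[0]
    return "x-oss-async-process" if name in ASYNC_OPERATIONS else "x-oss-process"


if __name__ == "__main__":
    for op in ("video/snapshot:t=1000", "video/convert:f=mp4,vcodec=h264"):
        print(op, "->", processing_parameter(op))
```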

Usage

使用方法

bash

python scripts/process.py \
  [--bucket BUCKET_NAME] \
  [--region REGION_ID] \
  (--source OSS_OBJECT_KEY | --uri URI) \
  --operations OPERATION [OPERATION ...] \
  [--output-mode url|download|save] \
  [--expires SECONDS] \
  [--output-path LOCAL_PATH] \
  [--target-key OSS_TARGET_KEY] \
  [--endpoint CUSTOM_ENDPOINT] \
  [--imm-project IMM_PROJECT_NAME] \
  [--dry-run]

`--imm-project` is only consumed by `blindwatermark-extract`; other operations rely on IMM bucket binding, not this flag.

bash

python scripts/process.py \
  [--bucket BUCKET_NAME] \
  [--region REGION_ID] \
  (--source OSS_OBJECT_KEY | --uri URI) \
  --operations OPERATION [OPERATION ...] \
  [--output-mode url|download|save] \
  [--expires SECONDS] \
  [--output-path LOCAL_PATH] \
  [--target-key OSS_TARGET_KEY] \
  [--endpoint CUSTOM_ENDPOINT] \
  [--imm-project IMM_PROJECT_NAME] \
  [--dry-run]

`--imm-project` 仅被 `blindwatermark-extract` 使用；其他操作依赖IMM存储桶绑定，而非此标志。
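When the command is assembled programmatically, the synopsis above translates into an argument list. The following is a sketch under the assumption that the flags behave as the synopsis describes; `build_argv` and its `script` default are hypothetical, and only a subset of the flags is covered.

```python
from typing import List, Optional, Sequence


def build_argv(
    operations: Sequence[str],
    source: Optional[str] = None,
    uri: Optional[str] = None,
    bucket: Optional[str] = None,
    region: Optional[str] = None,
    output_mode: Optional[str] = None,
    output_path: Optional[str] = None,
    target_key: Optional[str] = None,
    dry_run: bool = False,
    script: str = "scripts/process.py",
) -> List[str]:
    """Assemble a process.py command line from the flags in the synopsis."""
    # --source and --uri are mutually exclusive, and one of them is required.
    if (source is None) == (uri is None):
        raise ValueError("pass exactly one of source or uri")
    if output_mode not in (None, "url", "download", "save"):
        raise ValueError(f"unsupported output mode: {output_mode}")
    argv = ["python", script]
    for flag, value in (("--bucket", bucket), ("--region", region)):
        if value is not None:
            argv += [flag, value]
    argv += ["--source", source] if source is not None else ["--uri", uri]
    argv += ["--operations", *operations]
    for flag, value in (
        ("--output-mode", output_mode),
        ("--output-path", output_path),
        ("--target-key", target_key),
    ):
        if value is not None:
            argv += [flag, value]
    if dry_run:
        argv.append("--dry-run")
    return argv


if __name__ == "__main__":
    print(build_argv(["resize:w=600"], source="images/photo.jpg", dry_run=True))
```

Passing the result as a list to `subprocess.run` avoids shell quoting problems with operation strings that contain spaces.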

--uri

--uri

Process a file from a local file path or URL (http/https) without pre-uploading. The script auto-uploads to a temp key, processes, and cleans up.

`--uri` and `--source` are mutually exclusive.

无需预上传，直接处理本地文件路径或URL（http/https）的文件。脚本会自动上传到临时键、处理并清理。

`--uri` 和 `--source` 互斥。

--dry-run

--dry-run

Prints the generated process string and operation details as JSON to stdout, then exits without connecting to OSS.

将生成的处理字符串和操作详情以JSON格式打印到标准输出，然后退出，不连接到OSS。
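The exact process string that `--dry-run` prints is not shown in this document. As a rough guide to what to expect, the public OSS image-processing syntax writes each action as `<action>,<key>_<value>` and joins actions with `/` under an `image/` prefix. The helpers below are hypothetical and only illustrate that public syntax; how `process.py` maps its own operation strings onto it is an assumption.

```python
def oss_action(name: str, **params) -> str:
    """Render one action in OSS syntax: <name>,<key>_<value>,..."""
    return ",".join([name] + [f"{key}_{value}" for key, value in params.items()])


def image_process_value(*actions: str) -> str:
    """Join rendered actions into an x-oss-process value for images."""
    return "image/" + "/".join(actions)


if __name__ == "__main__":
    value = image_process_value(oss_action("resize", w=600), oss_action("quality", q=80))
    print("x-oss-process=" + value)
```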

Operation String Format

操作字符串格式

Each operation: `name:key=value,key=value`. No-param operations use just the name (e.g., `info`, `video/info`). Video/audio operations use slash notation: `video/convert`, `audio/convert`.

每个操作格式：`name:key=value,key=value`。无参数操作仅使用名称（例如 `info`、`video/info`）。视频/音频操作使用斜杠表示法：`video/convert`、`audio/convert`。
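The operation string format is simple enough to build and split with a few lines of code. This is a sketch of the `name:key=value,key=value` convention only; `build_operation` and `parse_operation` are hypothetical helpers, and the naive parser does not handle values that themselves contain commas.

```python
from typing import Dict, Tuple


def build_operation(name: str, **params) -> str:
    """Build an operation string; no-param operations are just the name."""
    if not params:
        return name
    pairs = ",".join(f"{key}={value}" for key, value in params.items())
    return f"{name}:{pairs}"


def parse_operation(operation: str) -> Tuple[str, Dict[str, str]]:
    """Split an operation string into its name and a dict of parameters."""
    name, _, raw = operation.partition(":")
    params = dict(pair.split("=", 1) for pair in raw.split(",")) if raw else {}
    return name, params


if __name__ == "__main__":
    print(build_operation("resize", w=600))
    print(parse_operation("video/convert:f=mp4,vcodec=h264"))
```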

End-to-End Example

端到端示例

User request:

text

Resize `images/photo.jpg` in OSS to width 600px, add a bottom-right text watermark `Copyright 2026`, and download the result locally. The bucket is `my-media-bucket` in region `cn-shanghai`.

Command:

bash

python scripts/process.py --bucket my-media-bucket --region cn-shanghai \
  --source images/photo.jpg \
  --operations "resize:w=600" "watermark:text=Copyright 2026,g=se,opacity=60,size=30" \
  --output-mode download \
  --output-path "$WORKSPACE_OUTPUT/photo-watermarked.jpg"

Expected result shape:

json

{
  "success": true,
  "mode": "download",
  "path": "/absolute/path/to/outputs/photo-watermarked.jpg",
  "size": 12345,
  "request_id": "xxxxxx"
}

Interpretation:

`success: true` means OSS processing completed successfully.
`path` is the local file path you should present to the user.
`request_id` is the server-side request trace ID for troubleshooting.

用户请求：

text

将OSS中`images/photo.jpg`调整为宽度600px，添加右下角文字水印`Copyright 2026`，并将结果下载到本地。存储桶为`my-media-bucket`，地域为`cn-shanghai`。

命令：

bash

python scripts/process.py --bucket my-media-bucket --region cn-shanghai \
  --source images/photo.jpg \
  --operations "resize:w=600" "watermark:text=Copyright 2026,g=se,opacity=60,size=30" \
  --output-mode download \
  --output-path "$WORKSPACE_OUTPUT/photo-watermarked.jpg"

预期结果格式：

json

{
  "success": true,
  "mode": "download",
  "path": "/absolute/path/to/outputs/photo-watermarked.jpg",
  "size": 12345,
  "request_id": "xxxxxx"
}

解释：

`success: true` 表示OSS处理成功完成。
`path` 是应展示给用户的本地文件路径。
`request_id` 是服务器端请求跟踪ID，用于故障排查。
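A caller that captures the command's stdout can read the fields explained above directly from the JSON. The sketch below assumes the success shape shown in the example; the shape of a failed result is not documented here, so anything other than `success: true` is simply treated as an error. `local_path_from_result` is a hypothetical helper.

```python
import json


def local_path_from_result(stdout: str) -> str:
    """Return the downloaded file path from a process.py download result."""
    result = json.loads(stdout)
    if result.get("success") is not True:
        # Keep the request ID in the message so it can be used for troubleshooting.
        raise RuntimeError(f"processing failed, request_id={result.get('request_id')}")
    if result.get("mode") != "download":
        raise ValueError(f"expected a download result, got mode={result.get('mode')}")
    return result["path"]


if __name__ == "__main__":
    sample = json.dumps({
        "success": True,
        "mode": "download",
        "path": "/absolute/path/to/outputs/photo-watermarked.jpg",
        "size": 12345,
        "request_id": "xxxxxx",
    })
    print(local_path_from_result(sample))
```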

Additional Examples

更多示例

bash

# HLS streaming (with IMM auto-setup)
python scripts/imm_admin.py auto-setup --bucket my-bucket --region cn-hangzhou
# → Capture project name from output

python scripts/process.py --bucket my-bucket --region cn-hangzhou \
  --source videos/input.mp4 \
  --operations "hls/m3u8:ss=15000,t=1800000,vcodec=h264,fps=25,s=1280x720,vb=2000000,acodec=aac,ab=128000" \
  --output-mode url

# Upload a local file to OSS
python scripts/process.py --bucket my-bucket --region cn-hangzhou \
  --uri /path/to/report.pdf --operations upload --target-key documents/report.pdf

# Download a file from OSS
python scripts/process.py --bucket my-bucket --region cn-hangzhou \
  --source documents/report.pdf --operations download --output-path "$WORKSPACE_OUTPUT/report.pdf"

bash

# HLS流媒体（配合IMM自动设置）
python scripts/imm_admin.py auto-setup --bucket my-bucket --region cn-hangzhou
# → 从输出中获取项目名称

python scripts/process.py --bucket my-bucket --region cn-hangzhou \
  --source videos/input.mp4 \
  --operations "hls/m3u8:ss=15000,t=1800000,vcodec=h264,fps=25,s=1280x720,vb=2000000,acodec=aac,ab=128000" \
  --output-mode url

# 上传本地文件到OSS
python scripts/process.py --bucket my-bucket --region cn-hangzhou \
  --uri /path/to/report.pdf --operations upload --target-key documents/report.pdf

# 从OSS下载文件
python scripts/process.py --bucket my-bucket --region cn-hangzhou \
  --source documents/report.pdf --operations download --output-path "$WORKSPACE_OUTPUT/report.pdf"

Edge Cases

边缘情况

`watermark` values that contain commas should be quoted. For example: `preprocess="resize:w=200,text=demo,image/logo.png"`
`video/snapshots` target keys must not include a file extension. Use `output/frames/frame`, not `output/frames/frame.jpg`.
`video/concat` always performs input compatibility checks before task submission. Additional local `ffprobe` output validation only runs when the result is also downloaded via `--output-path`.
Async media polling defaults to 600 seconds. Override with `--timeout-seconds <n>` or `ALIBABA_CLOUD_ASYNC_TIMEOUT_SECONDS`.
`blindwatermark-extract` must run alone.
`blindwatermark-embed` can follow basic image operations, but it must be the last operation in the chain.

包含逗号的 `watermark` 值需加引号。例如：`preprocess="resize:w=200,text=demo,image/logo.png"`。
`video/snapshots` 的目标键不得包含文件扩展名。使用 `output/frames/frame`，而非 `output/frames/frame.jpg`。
`video/concat` 在提交任务前始终执行输入兼容性检查。仅当结果同时通过 `--output-path` 下载时，才会执行额外的本地 `ffprobe` 输出验证。
异步媒体轮询默认超时为600秒。可通过 `--timeout-seconds <n>` 或 `ALIBABA_CLOUD_ASYNC_TIMEOUT_SECONDS` 覆盖。
`blindwatermark-extract` 必须单独运行。
`blindwatermark-embed` 可跟随基础图片操作，但必须是操作链的最后一步。
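Several of the edge cases above are structural rules that can be checked before a command is ever submitted. The following sketch covers three of them (extract runs alone, embed comes last, snapshots target keys have no extension). `validate_chain` is a hypothetical helper, and whether `process.py` performs equivalent checks itself is not stated in this document.

```python
import posixpath
from typing import List, Optional, Sequence


def validate_chain(operations: Sequence[str], target_key: Optional[str] = None) -> List[str]:
    """Return a list of rule violations; an empty list means the chain looks valid."""
    problems = []
    names = [op.partition(":")[0] for op in operations]
    if "blindwatermark-extract" in names and len(names) > 1:
        problems.append("blindwatermark-extract must run alone")
    if "blindwatermark-embed" in names and names[-1] != "blindwatermark-embed":
        problems.append("blindwatermark-embed must be the last operation in the chain")
    # OSS keys use forward slashes, so posixpath is used on every platform.
    if "video/snapshots" in names and target_key and posixpath.splitext(target_key)[1]:
        problems.append("video/snapshots target key must not include a file extension")
    return problems


if __name__ == "__main__":
    print(validate_chain(["video/snapshots"], target_key="output/frames/frame.jpg"))
    print(validate_chain(["resize:w=200", "blindwatermark-embed"]))
```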

Error Recovery

错误恢复

Error	Cause	Recovery
`AccessDenied` or `InvalidArgument` twice in a row	Configuration or authorization is still unresolved, and blind retries risk a fabricated diagnosis	Stop immediately. Do not simulate output, fabricate logs, or keep retrying `process.py`. Run `aliyun configure list` to verify the active CLI profile, then check RAM permissions with `python scripts/check_permissions.py` or the relevant RAM policy setup. If you changed dependencies, env vars, or CLI configuration while recovering, rerun `python scripts/load_env.py` and `aliyun configure list` before the next `process.py` attempt.
`task_id: null`	IMM project not bound to bucket, or `blindwatermark-extract` missing `--imm-project` / `ALIBABA_CLOUD_IMM_PROJECT`	Run `python scripts/imm_admin.py auto-setup --bucket <b> --region <r>` first; for `blindwatermark-extract`, also pass `--imm-project <project>` if needed
`NoSuchKey`	Source file does not exist in OSS	Check `--source` path, or upload first with `--uri` and `upload` operation
`AccessDenied` / `403`	RAM policy missing required permissions	Run `python scripts/check_permissions.py` for diagnosis
`InvalidArgument`	Wrong parameter format or unsupported combination	Check parameter spelling; verify against `references/` docs
Async timeout / polling exceeds limit	Job too large or queue backlog	Note the `task_id` and tell the user to retry later; do NOT use `sleep` loops

错误	原因	恢复方法
连续两次出现 `AccessDenied` 或 `InvalidArgument`	配置或授权问题仍未解决，盲目重试可能导致错误诊断	立即停止。切勿模拟输出、伪造日志或持续重试 `process.py` 。运行 `aliyun configure list` 验证当前CLI配置文件，然后使用 `python scripts/check_permissions.py` 或相关RAM策略设置检查RAM权限。若在恢复过程中更改了依赖、环境变量或CLI配置，在下次尝试 `process.py` 前重新运行 `python scripts/load_env.py` 和 `aliyun configure list` 。
`task_id: null`	IMM项目未绑定到存储桶，或 `blindwatermark-extract` 缺少 `--imm-project` / `ALIBABA_CLOUD_IMM_PROJECT`	先运行 `python scripts/imm_admin.py auto-setup --bucket <b> --region <r>` ；对于 `blindwatermark-extract` ，若需要还需传递 `--imm-project <project>`
`NoSuchKey`	源文件在OSS中不存在	检查 `--source` 路径，或先使用 `--uri` 和 `upload` 操作上传文件
`AccessDenied` / `403`	RAM策略缺少必要权限	运行 `python scripts/check_permissions.py` 进行诊断
`InvalidArgument`	参数格式错误或不支持的组合	检查参数拼写；对照 `references/` 文档验证
异步超时 / 轮询超过限制	任务过大或队列积压	记录 `task_id` ，告知用户稍后重试；切勿使用 `sleep` 循环
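The first row of the table is a stop condition rather than a retry hint, which is easy to encode. The sketch below assumes "twice in a row" means the last two errors both fall in the set {`AccessDenied`, `InvalidArgument`}, not necessarily the same code twice; that reading, and the `next_step` helper, are assumptions rather than documented behavior.

```python
from typing import Sequence

# Codes that trigger the hard stop after two consecutive occurrences.
STOP_CODES = {"AccessDenied", "InvalidArgument"}

# Single-occurrence recovery hints, condensed from the table above.
RECOVERY = {
    "NoSuchKey": "check --source, or upload first with --uri and the upload operation",
    "AccessDenied": "run python scripts/check_permissions.py",
    "InvalidArgument": "check parameter spelling against the references/ docs",
}


def next_step(error_history: Sequence[str]) -> str:
    """Suggest the next action given the error codes seen so far, oldest first."""
    if not error_history:
        return "run process.py"
    if len(error_history) >= 2 and all(code in STOP_CODES for code in error_history[-2:]):
        return "stop: verify the CLI profile and RAM permissions before any retry"
    return RECOVERY.get(error_history[-1], "consult the error recovery table")


if __name__ == "__main__":
    print(next_step(["AccessDenied"]))
    print(next_step(["AccessDenied", "InvalidArgument"]))
```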

Quick References

快速参考

Parameter details: `references/image-basic-operations.md`, `references/image-imm-operations.md`, `references/video-operations.md`, `references/audio-operations.md`
RAM Permissions: `references/ram-policies.md`
Format Support & Limitations: `references/limitations.md`
IMM Administration: `references/imm-admin.md`

参数详情：`references/image-basic-operations.md`、`references/image-imm-operations.md`、`references/video-operations.md`、`references/audio-operations.md`
RAM权限：`references/ram-policies.md`
格式支持与限制：`references/limitations.md`
IMM管理：`references/imm-admin.md`
