volcengine-video-understanding

Volcengine Video Understanding

Uses the ByteDance Volcano Ark video understanding API (models such as doubao-seed-2-0-pro-260215) to understand and analyze videos in depth.

Recommended method: upload through the Files API, then analyze through the Responses API.

Supports video files up to 512MB
Automatic video preprocessing (FPS sampling)
Uploaded files can be reused (stored for 7 days)

Features

Video upload: upload local videos through the Files API (recommended, max 512MB)
Content understanding: analyze the scenes, people, actions, and emotions in a video
Video Q&A: answer user questions based on the video content
Video description: automatically generate video descriptions and summaries

Prerequisites

The `ARK_API_KEY` environment variable must be set.

Configuration Method (Recommended)

Copy the configuration template:

bash

cp .canghe-skills/.env.example .canghe-skills/.env

Edit the `.canghe-skills/.env` file and fill in your API key:

ARK_API_KEY=your-actual-api-key-here

Or use an environment variable

bash

export ARK_API_KEY="your-api-key"

Loading Priority

1. System environment variables (`process.env`)
2. `.canghe-skills/.env` in the current directory
3. `~/.canghe-skills/.env` in the user home directory
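
The lookup order above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual loader: the function names `read_env_file` and `load_api_key` are hypothetical, the `.env` parser only handles plain `KEY=VALUE` lines, and it assumes the first source that defines the key wins.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_env_file(path: Path) -> dict:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_api_key(environ: Optional[Mapping[str, str]] = None,
                 cwd: Optional[Path] = None,
                 home: Optional[Path] = None) -> Optional[str]:
    """Return ARK_API_KEY from the first source that defines it, else None."""
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else cwd
    home = Path.home() if home is None else home

    # 1. System environment variable.
    if environ.get("ARK_API_KEY"):
        return environ["ARK_API_KEY"]
    # 2. .canghe-skills/.env in the current directory, then
    # 3. .canghe-skills/.env in the user home directory.
    for base in (cwd, home):
        key = read_env_file(base / ".canghe-skills" / ".env").get("ARK_API_KEY")
        if key:
            return key
    return None
```

Calling `load_api_key()` with no arguments reads the real environment, the current working directory, and the home directory; the parameters exist only so the order can be exercised in isolation.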

Usage

1. Basic Video Analysis (Files API method, recommended)

bash

cd ~/.openclaw/workspace/skills/volcengine-video-understanding
python3 scripts/video_understand.py /path/to/video.mp4 "Describe the content of this video"

2. Video Q&A

bash

python3 scripts/video_understand.py /path/to/video.mp4 "Which people appear in the video?"

3. Sentiment Analysis

bash

python3 scripts/video_understand.py /path/to/video.mp4 "Analyze how the emotions of the people in the video change"

4. Specify Model and Frame Rate

bash

python3 scripts/video_understand.py /path/to/video.mp4 "Summarize the key points of the video" \
  --model doubao-seed-2-0-pro-260215 \
  --fps 2

5. Save Results to File

bash

python3 scripts/video_understand.py /path/to/video.mp4 "Describe the video" --output result.json

Parameter Description

Parameter	Default	Description
`video_path`	Required	Path to the video file
`instruction`	Required	Analysis instruction or question
`--model`	doubao-seed-2-0-pro-260215	Model ID
`--fps`	1	Video sampling frame rate (preprocessing)
`--output`	-	Path of the file to write the result to
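
For readers who want to see how this parameter table maps onto a command-line parser, here is a minimal `argparse` sketch. It is an illustration built from the table alone, not the actual contents of `scripts/video_understand.py`; the function name `build_parser` is hypothetical, and `--fps` is parsed as a float on the assumption that fractional rates such as 0.5 are accepted.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two positional and three optional parameters from the table."""
    parser = argparse.ArgumentParser(
        description="Analyze a video with an Ark video understanding model."
    )
    parser.add_argument("video_path", help="Path to the video file")
    parser.add_argument("instruction", help="Analysis instruction or question")
    parser.add_argument("--model", default="doubao-seed-2-0-pro-260215", help="Model ID")
    # Float rather than int so that rates below 1 frame per second can be passed.
    parser.add_argument("--fps", type=float, default=1.0,
                        help="Video sampling frame rate (preprocessing)")
    parser.add_argument("--output", default=None,
                        help="Path of the file to write the result to")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```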

Supported Models

`doubao-seed-2-0-pro-260215` (default)
`doubao-seed-2-0-lite-250728`
`doubao-seed-1-6-251015`
Other Seed series video understanding models

Analysis Examples

Example 1: Video Content Description

bash

python3 scripts/video_understand.py ~/Desktop/video.mp4 "Describe the content of this video in detail, including scenes, people, and actions"

Example 2: Video Summary

bash

python3 scripts/video_understand.py ~/Desktop/video.mp4 "Summarize the key points of this video in 3 sentences"

Example 3: Action Recognition

bash

python3 scripts/video_understand.py ~/Desktop/video.mp4 "What are the people in the video doing? Describe their actions in chronological order"

Example 4: Scene Analysis

bash

python3 scripts/video_understand.py ~/Desktop/video.mp4 "Analyze the scene changes and environmental characteristics in the video"

Technical Details

Call Process

1. Upload the video: upload the local video file through the Files API, specifying the FPS preprocessing configuration
2. Wait for processing: wait until video preprocessing completes (status changes to processed)
3. Create the task: call the Responses API for video understanding
4. Get the result: return the analysis result
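
The "wait for processing" step is a polling loop. The sketch below shows one way to write it under stated assumptions: `fetch_status` is a hypothetical callable that you supply (for example, a function that queries the uploaded file and returns its status string), the success value `"processed"` is taken from the step above, and the failure value `"failed"` is an assumption rather than a documented status.

```python
import time
from typing import Callable


def wait_until_processed(fetch_status: Callable[[], str],
                         timeout: float = 120.0,
                         interval: float = 2.0,
                         sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll fetch_status() until it returns "processed" or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status == "processed":
            return status
        if status == "failed":  # assumed name for a failure status
            raise RuntimeError("video preprocessing failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"file still {status!r} after {timeout} seconds")
        sleep(interval)
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays; in normal use the default `time.sleep` is enough.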

API Format

Files API upload:

bash

curl https://ark.cn-beijing.volces.com/api/v3/files \
  -H "Authorization: Bearer $ARK_API_KEY" \
  -F 'purpose=user_data' \
  -F 'file=@video.mp4' \
  -F 'preprocess_configs[video][fps]=1'

Responses API analysis:

json

{
  "model": "doubao-seed-2-0-pro-260215",
  "input": [
    {
      "role": "user",
      "content": [
        {
          "type": "input_video",
          "file_id": "file-xxxx"
        },
        {
          "type": "input_text",
          "text": "User instruction"
        }
      ]
    }
  ]
}
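
The Responses API request body shown above can be assembled programmatically. The following sketch only builds and prints the JSON; it does not send a request. The function name `build_responses_payload` is hypothetical, and `file-xxxx` remains a placeholder for the ID returned by the upload step.

```python
import json


def build_responses_payload(file_id: str, instruction: str,
                            model: str = "doubao-seed-2-0-pro-260215") -> dict:
    """Build the request body for one uploaded video and one text instruction."""
    return {
        "model": model,
        "input": [
            {
                "role": "user",
                "content": [
                    {"type": "input_video", "file_id": file_id},
                    {"type": "input_text", "text": instruction},
                ],
            }
        ],
    }


if __name__ == "__main__":
    payload = build_responses_payload("file-xxxx", "Describe the video")
    # ensure_ascii=False keeps non-ASCII instructions readable in the output.
    print(json.dumps(payload, ensure_ascii=False, indent=2))
```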

FPS Setting Recommendations

FPS	Applicable Scenarios
0.3-0.5	Slow-paced videos, static scenes, saving tokens
1	General video analysis (default)
2-3	Fast action, detailed analysis
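
To get a feel for what these settings mean in practice, the sketch below estimates how many frames are sampled from a video. It assumes uniform sampling, so the count is roughly duration times FPS; the exact behavior of the preprocessing step and its relation to token usage are not specified here, and the function name is hypothetical.

```python
import math


def estimate_sampled_frames(duration_seconds: float, fps: float) -> int:
    """Approximate frame count, assuming uniform sampling at `fps` frames per second."""
    if duration_seconds < 0 or fps <= 0:
        raise ValueError("duration must be non-negative and fps must be positive")
    return math.ceil(duration_seconds * fps)


if __name__ == "__main__":
    for fps in (0.5, 1, 2):
        frames = estimate_sampled_frames(60, fps)
        print(f"fps={fps}: about {frames} frames for a 60-second video")
```

Under this assumption a 60-second video yields about 30 frames at 0.5 FPS, 60 at 1 FPS, and 120 at 2 FPS, which is why the lower settings suit static footage where extra frames add little.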

Limitations

Video format: MP4 (recommended), MOV, AVI
File size: max 512MB (Files API method)
Storage time: uploaded files are stored for 7 days by default
Processing time: usually 10-60 seconds, depending on video length and complexity
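
The format and size limits above can be checked locally before uploading. This is a minimal sketch under two assumptions: that 512MB means 512 × 1024 × 1024 bytes, and that the format can be judged from the file extension. The names `check_video`, `MAX_BYTES`, and `SUPPORTED_SUFFIXES` are hypothetical.

```python
from pathlib import Path

MAX_BYTES = 512 * 1024 * 1024  # assumes 512MB means 512 MiB
SUPPORTED_SUFFIXES = {".mp4", ".mov", ".avi"}


def check_video(path: str) -> list:
    """Return a list of problems found; an empty list means the file passes the checks."""
    p = Path(path)
    if not p.is_file():
        return [f"file does not exist: {path}"]
    problems = []
    # The extension check is case-insensitive, so clip.MP4 is accepted.
    if p.suffix.lower() not in SUPPORTED_SUFFIXES:
        problems.append(f"unsupported format: {p.suffix or '(none)'}")
    size = p.stat().st_size
    if size > MAX_BYTES:
        problems.append(f"file too large: {size} bytes")
    return problems
```

Running the check first avoids spending time on an upload that would be rejected anyway.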

Python API Usage

python

from scripts.video_understand import analyze_video

result = analyze_video(
    file_path="/path/to/video.mp4",
    instruction="Describe the video content",
    model="doubao-seed-2-0-pro-260215",
    fps=1
)

# Extract the answer
text = ""
for item in result.get("output", []):
    if item.get("type") == "message":
        for content in item.get("content", []):
            if content.get("type") == "output_text":
                text = content.get("text", "")
                break

print(text)

Error Handling

Common errors and solutions:

Error	Cause	Solution
API key error	Key not set or incorrect	Check the ARK_API_KEY environment variable
File does not exist	Wrong path	Check the file path
Upload failed	File too large or format not supported	Check the file size (<512MB) and format
Processing timeout	Video too long or too complex	Shorten the video or reduce the FPS
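
The "reduce the FPS" remedy for processing timeouts can be automated with a small retry wrapper. The sketch below is hypothetical: `analyze` is any callable you supply that runs the analysis at a given frame rate, and it assumes a timeout surfaces as Python's built-in `TimeoutError`, which may not match how the real script reports it.

```python
from typing import Callable, Sequence


def analyze_with_fps_fallback(analyze: Callable[[float], dict],
                              fps_steps: Sequence[float] = (2, 1, 0.5)) -> dict:
    """Call analyze(fps) with each rate in turn, moving to a lower one after a timeout."""
    last_error = None
    for fps in fps_steps:
        try:
            return analyze(fps)
        except TimeoutError as exc:  # assumed exception type for a processing timeout
            last_error = exc
    raise TimeoutError(f"all FPS settings timed out: {list(fps_steps)}") from last_error
```

With the Python API from this document, the callable could be `lambda fps: analyze_video(file_path="/path/to/video.mp4", instruction="Describe the video", fps=fps)`.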
