volcengine-video-understanding
Volcengine Video Understanding
Uses the ByteDance Volcano Ark video understanding API (models such as doubao-seed-2-0-pro-260215) to analyze and understand video content in depth.
Recommended method: upload via the Files API, then analyze via the Responses API
- Supports video files up to 512MB
- Automatic video preprocessing (FPS sampling)
- Uploaded files can be reused (stored for 7 days)
Features
- Video Upload: Upload local videos via the Files API (recommended, max 512MB)
- Content Understanding: Analyze scenes, people, actions, and emotions in a video
- Video Q&A: Answer user questions based on video content
- Video Description: Automatically generate video descriptions and summaries
Prerequisites
The ARK_API_KEY environment variable must be set.
Configuration Method (Recommended)
- Copy the configuration template:
bash
cp .canghe-skills/.env.example .canghe-skills/.env
- Edit the .canghe-skills/.env file and fill in your API key:
ARK_API_KEY=your-actual-api-key-here
Or use environment variables
bash
export ARK_API_KEY="your-api-key"
Loading Priority
- System environment variables (process.env)
- .canghe-skills/.env in the current directory
- ~/.canghe-skills/.env in the user's home directory
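The lookup order above can be sketched in Python as follows. This is a minimal illustration rather than the skill's actual loader: it assumes the list is ordered from highest to lowest priority, the names `parse_env_file` and `load_ark_api_key` are hypothetical, and the parser only handles simple KEY=value lines.

```python
import os
from pathlib import Path
from typing import Optional


def parse_env_file(path: Path) -> dict:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values = {}
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_ark_api_key(
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
    environ=None,
) -> Optional[str]:
    """Return ARK_API_KEY from the first source that defines it, else None."""
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else cwd
    home = Path.home() if home is None else home

    # 1. System environment variable
    if environ.get("ARK_API_KEY"):
        return environ["ARK_API_KEY"]
    # 2. .canghe-skills/.env in the current directory, then
    # 3. .canghe-skills/.env in the user's home directory
    for base in (cwd, home):
        key = parse_env_file(base / ".canghe-skills" / ".env").get("ARK_API_KEY")
        if key:
            return key
    return None
```

Passing `cwd`, `home`, and `environ` explicitly keeps the function easy to test without touching the real environment.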
Usage
1. Basic Video Analysis (Files API method - Recommended)
bash
cd ~/.openclaw/workspace/skills/volcengine-video-understanding
python3 scripts/video_understand.py /path/to/video.mp4 "Describe the content of this video"
2. Video Q&A
bash
python3 scripts/video_understand.py /path/to/video.mp4 "What characters appear in the video?"
3. Sentiment Analysis
bash
python3 scripts/video_understand.py /path/to/video.mp4 "Analyze the emotional changes of the characters in the video"
4. Specify Model and Frame Rate
bash
python3 scripts/video_understand.py /path/to/video.mp4 "Summarize the key points of the video" \
  --model doubao-seed-2-0-pro-260215 \
  --fps 2
5. Save Results to File
bash
python3 scripts/video_understand.py /path/to/video.mp4 "Describe the video" --output result.json
Parameter Description
| Parameter | Default Value | Description |
|---|---|---|
| (first positional argument) | Required | Video file path |
| (second positional argument) | Required | Analysis instruction/question |
| --model | doubao-seed-2-0-pro-260215 | Model ID |
| --fps | 1 | Video sampling frame rate (preprocessing) |
| --output | - | Result output file path |
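For readers who want to see how a command line with these parameters might be declared, here is a minimal argparse sketch. It is not the source of scripts/video_understand.py: the positional names `video_path` and `instruction` are hypothetical, and only the flags and defaults come from the table above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare a CLI matching the parameter table (names are illustrative)."""
    parser = argparse.ArgumentParser(description="Analyze a video with a Volcano Ark model")
    # Positional names are hypothetical; the table only states that both are required.
    parser.add_argument("video_path", help="Video file path")
    parser.add_argument("instruction", help="Analysis instruction/question")
    parser.add_argument("--model", default="doubao-seed-2-0-pro-260215", help="Model ID")
    parser.add_argument("--fps", type=float, default=1.0,
                        help="Video sampling frame rate (preprocessing)")
    parser.add_argument("--output", default=None, help="Result output file path")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is self-contained.
args = build_parser().parse_args(["/path/to/video.mp4", "Describe the video", "--fps", "0.5"])
print(args.model, args.fps, args.output)
```

Declaring `--fps` as a float allows the fractional rates (such as 0.5) that suit slow-paced videos.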
Supported Models
- doubao-seed-2-0-pro-260215 (default)
- doubao-seed-2-0-lite-250728
- doubao-seed-1-6-251015
- Other Seed series video understanding models
Analysis Examples
Example 1: Video Content Description
bash
python3 scripts/video_understand.py ~/Desktop/video.mp4 "Describe the content of this video in detail, including scenes, characters and actions"
Example 2: Video Summary
bash
python3 scripts/video_understand.py ~/Desktop/video.mp4 "Summarize the key points of this video in 3 sentences"
Example 3: Action Recognition
bash
python3 scripts/video_understand.py ~/Desktop/video.mp4 "What actions are the characters in the video doing? Describe in chronological order"
Example 4: Scene Analysis
bash
python3 scripts/video_understand.py ~/Desktop/video.mp4 "Analyze the scene changes and environmental characteristics in the video"
Technical Details
Call Process
- Upload Video: Upload the local video file via the Files API, specifying the FPS preprocessing configuration
- Wait for Processing: Wait for video preprocessing to complete (the file status changes to processed)
- Create Task: Call the Responses API to run video understanding
- Get Results: Return the analysis results
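The "wait for processing" step is essentially a polling loop. The sketch below shows one way to write it, under stated assumptions: the helper name `wait_until_processed`, the 120-second timeout, and the 2-second interval are all hypothetical, and fetching the file status is left to the caller because this document does not describe the status endpoint.

```python
import time
from typing import Callable


def wait_until_processed(
    get_status: Callable[[], str],
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll get_status() until it returns "processed" or the timeout elapses."""
    waited = 0.0
    while True:
        status = get_status()
        if status == "processed":
            return
        if waited >= timeout_s:
            raise TimeoutError(f"file still in status {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Demonstration with a stubbed status source instead of a real API call.
statuses = iter(["processing", "processing", "processed"])
wait_until_processed(lambda: next(statuses), sleep=lambda _seconds: None)
print("ready")
```

In real use, `get_status` would wrap whatever call returns the uploaded file's current status; the Responses API request is sent only after this function returns.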
API Format
Files API Upload:
bash
curl https://ark.cn-beijing.volces.com/api/v3/files \
  -H "Authorization: Bearer $ARK_API_KEY" \
  -F 'purpose=user_data' \
  -F 'file=@video.mp4' \
  -F 'preprocess_configs[video][fps]=1'
Responses API Analysis:
json
{
  "model": "doubao-seed-2-0-pro-260215",
  "input": [
    {
      "role": "user",
      "content": [
        {
          "type": "input_video",
          "file_id": "file-xxxx"
        },
        {
          "type": "input_text",
          "text": "User instruction"
        }
      ]
    }
  ]
}
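The Responses API request body shown above can be assembled programmatically. The following is a small sketch: the function name `build_responses_payload` is hypothetical, while the field names and structure are taken directly from the JSON example.

```python
import json


def build_responses_payload(
    file_id: str,
    instruction: str,
    model: str = "doubao-seed-2-0-pro-260215",
) -> dict:
    """Build a request body with one video input and one text instruction."""
    return {
        "model": model,
        "input": [
            {
                "role": "user",
                "content": [
                    {"type": "input_video", "file_id": file_id},
                    {"type": "input_text", "text": instruction},
                ],
            }
        ],
    }


# "file-xxxx" is the placeholder ID from the example, not a real file.
payload = build_responses_payload("file-xxxx", "Summarize the key points of the video")
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Using `ensure_ascii=False` keeps non-ASCII instructions (for example, Chinese prompts) readable when the payload is printed or logged.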
FPS Setting Recommendations
| FPS | Applicable Scenarios |
|---|---|
| 0.3-0.5 | Slow-paced videos, static scenes, saving tokens |
| 1 | General video analysis (default) |
| 2-3 | Fast actions, detailed analysis |
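To get a feel for the trade-off, the number of frames sent to the model can be roughly estimated as duration times FPS. This is a back-of-the-envelope sketch that assumes uniform sampling; the helper name `estimate_sampled_frames` is hypothetical, and the service's exact sampling behavior and per-frame token cost are not described here.

```python
def estimate_sampled_frames(duration_s: float, fps: float) -> int:
    """Approximate frame count for a clip, assuming uniform sampling at `fps`."""
    if duration_s < 0 or fps <= 0:
        raise ValueError("duration_s must be >= 0 and fps must be > 0")
    return round(duration_s * fps)


# Compare the three FPS tiers from the table for a 60-second clip.
for fps in (0.5, 1, 2):
    print(f"fps={fps}: about {estimate_sampled_frames(60, fps)} frames")
```

For a 60-second clip this gives about 30, 60, and 120 frames, which is why lowering the FPS is the suggested way to save tokens on slow-paced footage.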
Limitations
- Video Format: MP4 (recommended), MOV, AVI
- File Size: Max 512MB (Files API method)
- Storage Time: Uploaded files are stored for 7 days by default
- Processing Time: Usually 10-60 seconds, depending on video length and complexity
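The format and size limits can be checked locally before uploading. The sketch below is illustrative only: `check_upload_limits` is a hypothetical helper, and it assumes "512MB" means 512 × 1024 × 1024 bytes, which the source does not specify.

```python
from pathlib import Path

# Assumption: 512MB is interpreted as 512 MiB.
MAX_UPLOAD_BYTES = 512 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".mp4", ".mov", ".avi"}


def check_upload_limits(path: str, size_bytes: int) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    ext = Path(path).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(no extension)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file too large: {size_bytes} bytes exceeds {MAX_UPLOAD_BYTES}")
    return problems


print(check_upload_limits("clip.MP4", 10 * 1024 * 1024))
print(check_upload_limits("clip.mkv", 600 * 1024 * 1024))
```

For a real file, the size can be obtained with `Path(path).stat().st_size` and passed in as `size_bytes`.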
Python API Usage
python
from scripts.video_understand import analyze_video

result = analyze_video(
    file_path="/path/to/video.mp4",
    instruction="Describe video content",
    model="doubao-seed-2-0-pro-260215",
    fps=1,
)

# Extract the answer text from the response
text = ""
for item in result.get("output", []):
    if item.get("type") == "message":
        for content in item.get("content", []):
            if content.get("type") == "output_text":
                text = content.get("text", "")
                break
print(text)
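The extraction loop can also be wrapped in a small reusable function and tried against a hand-written result. This is a sketch: `extract_output_text` is a hypothetical helper, the sample dictionary is invented for illustration and only mirrors the output / message / output_text nesting used above, and the function returns the first match it finds.

```python
def extract_output_text(result: dict) -> str:
    """Return the first output_text found in a result dict, or "" if none."""
    for item in result.get("output", []):
        if item.get("type") != "message":
            continue
        for content in item.get("content", []):
            if content.get("type") == "output_text":
                return content.get("text", "")
    return ""


# Hand-written sample; the non-message item type is a made-up placeholder.
sample = {
    "output": [
        {"type": "other", "content": []},
        {
            "type": "message",
            "content": [{"type": "output_text", "text": "A cat jumps onto a sofa."}],
        },
    ]
}
print(extract_output_text(sample))
```

Returning an empty string when nothing matches lets callers treat a missing answer the same way as the loop version does.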
Error Handling
Common errors and solutions:
| Error | Cause | Solution |
|---|---|---|
| API Key error | Key not set or incorrect | Check the ARK_API_KEY environment variable |
| File does not exist | Wrong path | Check the file path |
| Upload failed | File too large or format not supported | Check the file size (<512MB) and format |
| Processing timeout | Video too long or too complex | Shorten the video or reduce the FPS |
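The first two rows of the table can be caught before any network call is made. The following is a minimal pre-flight sketch, not part of the skill itself: `preflight_errors` is a hypothetical helper, and it only checks that the key is present, not that it is valid.

```python
import os
from pathlib import Path


def preflight_errors(video_path: str, environ=None) -> list:
    """Return messages for problems detectable locally; empty list means OK."""
    environ = os.environ if environ is None else environ
    errors = []
    if not environ.get("ARK_API_KEY"):
        errors.append("API Key error: ARK_API_KEY is not set")
    if not Path(video_path).is_file():
        errors.append(f"File does not exist: {video_path}")
    return errors


# An explicit empty mapping stands in for an environment without the key.
for message in preflight_errors("/path/to/missing.mp4", environ={}):
    print(message)
```

Upload failures and processing timeouts still have to be handled at request time, since they depend on the service's response.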