byted-seedance-video-generate

Video Generate Skill

This skill generates videos using Doubao Seedance 1.0/1.5 models.

Trigger Conditions

  1. User wants to generate videos from text descriptions
  2. User wants to create videos based on images (first/last frame)
  3. User wants to create videos with reference materials (images, videos, audio)
  4. User asks for video generation capabilities

Usage

Environment Variables

Before using this skill, ensure the following environment variables are set:
  • ARK_API_KEY, MODEL_VIDEO_API_KEY, or MODEL_AGENT_API_KEY: API key for the video generation service
  • MODEL_VIDEO_API_BASE: API base URL (optional, has default)
  • MODEL_VIDEO_NAME: Model name (optional, has default)
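
As a rough illustration of how the three key variables might be resolved, here is a minimal sketch. The helper name `resolve_api_key` and the lookup order are assumptions, not taken from scripts/video_generate.py; only the variable names and the error message come from this document.

```python
import os
from typing import Mapping, Optional

# Variable names come from the list above; the lookup order is an assumption.
API_KEY_VARS = ("ARK_API_KEY", "MODEL_VIDEO_API_KEY", "MODEL_AGENT_API_KEY")


def resolve_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first non-empty API key among the supported variables."""
    env = os.environ if env is None else env
    for name in API_KEY_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    # Mirrors the message quoted in the Error Handling section.
    raise PermissionError(
        " or ".join(API_KEY_VARS) + " not found in environment variables"
    )
```

Passing a plain dict instead of `os.environ` makes the lookup easy to check without touching the real environment.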

Function Signature

```python
async def video_generate(
    params: list,
    batch_size: int = 10,
    max_wait_seconds: int = 1200,
    model_name: str = None,
) -> Dict:
```

Parameters

params (list[dict])

A list of video generation requests. Each item is a dict with the following fields.

Required per item:
  • video_name (str): Name/identifier of the output video file
  • prompt (str): Text describing the video to generate. Supports Chinese and English.

Optional per item - Input Materials:
  • first_frame (str): URL for the first frame image
  • last_frame (str): URL for the last frame image
  • reference_images (list[str]): 1-4 reference image URLs for style/content guidance
  • reference_videos (list[str]): 0-3 reference video URLs (mp4/mov, 2-15s each, total ≤15s)
  • reference_audios (list[str]): 0-3 reference audio URLs (mp3/wav, 2-15s each, total ≤15s)

Optional per item - Video Output Parameters:
  • ratio (str): Aspect ratio. Options: "16:9" (default), "9:16", "4:3", "3:4", "1:1", "2:1", "21:9", "adaptive"
  • duration (int): Video length in seconds. Range: 2-12s depending on model
  • resolution (str): Video resolution. Options: "480p", "720p", "1080p"
  • frames (int): Total frame count. Must be in [29, 289] and follow the format 25 + 4n
  • camera_fixed (bool): Lock camera movement. Default: false
  • seed (int): Random seed for reproducibility. Range: [-1, 2^32-1]
  • watermark (bool): Whether to add a watermark. Default: false
  • generate_audio (bool): Whether to generate audio. Only Seedance 1.5 supports this
  • tools (list[dict]): Tool configuration, e.g., [{"type": "web_search"}]
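
The ranges above can be checked before a request is sent. The following is a minimal sketch of such a pre-flight check; the function name `check_output_params` is hypothetical, and the real script may validate differently or leave validation to the service.

```python
VALID_RATIOS = {"16:9", "9:16", "4:3", "3:4", "1:1", "2:1", "21:9", "adaptive"}
VALID_RESOLUTIONS = {"480p", "720p", "1080p"}


def check_output_params(item: dict) -> list:
    """Return a list of problems; an empty list means the item passes these checks."""
    problems = []
    for field in ("video_name", "prompt"):
        if not item.get(field):
            problems.append(f"missing required field: {field}")
    if item.get("ratio", "16:9") not in VALID_RATIOS:
        problems.append(f"unsupported ratio: {item['ratio']}")
    if "resolution" in item and item["resolution"] not in VALID_RESOLUTIONS:
        problems.append(f"unsupported resolution: {item['resolution']}")
    if "duration" in item and not 2 <= item["duration"] <= 12:
        problems.append("duration must be between 2 and 12 seconds")
    frames = item.get("frames")
    # Valid frame counts are 29, 33, ..., 289, i.e. 25 + 4n within [29, 289].
    if frames is not None and not (29 <= frames <= 289 and (frames - 25) % 4 == 0):
        problems.append("frames must be in [29, 289] and of the form 25 + 4n")
    seed = item.get("seed")
    if seed is not None and not -1 <= seed <= 2**32 - 1:
        problems.append("seed must be in [-1, 2^32-1]")
    return problems
```

The duration check uses the widest documented range (2-12s); a specific model may accept a narrower one.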

Input Modes

  1. Text-to-Video: Only provide prompt, no images/videos
  2. First Frame Guidance: Provide first_frame for starting image
  3. First + Last Frame Guidance: Provide both for transition video
  4. Reference Images: Provide reference_images for style/content guidance
  5. Multimodal Reference: Combine reference_images, reference_videos, reference_audios
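
To make the five modes concrete, here is a small sketch that infers the mode from the fields present in one request item. The function `input_mode`, its return labels, and the precedence applied when several kinds of input are combined are all assumptions for illustration; the actual script may decide differently.

```python
def input_mode(item: dict) -> str:
    """Guess which input mode a request item corresponds to (illustrative only)."""
    if item.get("reference_videos") or item.get("reference_audios"):
        return "multimodal_reference"
    if item.get("reference_images"):
        return "reference_images"
    if item.get("first_frame") and item.get("last_frame"):
        return "first_last_frame"
    if item.get("first_frame"):
        return "first_frame"
    return "text_to_video"
```

For example, an item with only `prompt` and `video_name` maps to `"text_to_video"`, while adding `reference_videos` switches it to `"multimodal_reference"`.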

Return Value

Script Return Info

The video_generate.py script returns the following information:
```python
{
    "status": "success" | "partial_success" | "error",
    "success_list": [{"video_name": "video_url"}],
    "error_list": ["video_name"],
    "error_details": [{"video_name": "...", "error": {...}}],
    "pending_list": [{"video_name": "...", "task_id": "cgt-xxx", ...}]
}
```
Based on the script's return info, the final response to the user consists of a description of the video generation task and the video URL(s). You may download the video from the URL, but still provide the URL to the user for viewing and downloading.
Note: each URL is the value stored under the video name in a success_list entry. Return it in the two forms described under Final Return Info.

Final Return Info

You must return the following information:
  1. The file: send the video file itself (if you have a way to send files) and give its local path, for example: /root/.openclaw/workspace/skills/video-generate/xxx.mp4
  2. The URLs: after generation, present the list of video URLs as video tags embedded in Markdown, for example:
<video src="https://example.com/video1.mp4" width="640" controls>video-1</video>
<video src="https://example.com/video2.mp4" width="640" controls>video-2</video>
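
One way to turn the script's return info into the tags shown above is sketched below. The helper `render_video_tags` is hypothetical; it assumes only the `success_list` shape documented under Script Return Info, where each entry maps a video name to its URL.

```python
from html import escape


def render_video_tags(result: dict, width: int = 640) -> str:
    """Build one <video> tag per successful video, one tag per line."""
    lines = []
    for entry in result.get("success_list", []):
        for name, url in entry.items():
            lines.append(
                f'<video src="{escape(url, quote=True)}" width="{width}" controls>'
                f"{escape(name)}</video>"
            )
    return "\n".join(lines)
```

Escaping the name and URL keeps the output well-formed even if a video name contains characters such as `<` or `&`.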

Code Implementation

See scripts/video_generate.py for the full implementation.

Example Usage

Text-to-Video

```bash
python scripts/video_generate.py -p "小猫骑着滑板穿过公园" -n cat_park -r 16:9 -d 5 --resolution 720p
```

First Frame Guidance

```bash
python scripts/video_generate.py -p "小猫跳起来" -n cat_jump -f "https://example.com/cat.png" -r adaptive -d 5
```

First + Last Frame Guidance

```bash
python scripts/video_generate.py -p "平滑过渡动画" -n transition \
  -f "https://example.com/start.png" \
  -l "https://example.com/end.png" \
  -d 6
```

Reference Images (style/content guidance)

```bash
python scripts/video_generate.py -p "[图1]戴着眼镜的男生和[图2]柯基小狗坐在草坪上" -n styled \
  --ref-images "https://example.com/boy.png" "https://example.com/dog.png" \
  -r 16:9 -d 5
```

Multimodal Reference (video + audio)

```bash
python scripts/video_generate.py -p "将视频中的人物换成[图1]中的男孩" -n multimodal \
  --ref-images "https://example.com/boy.png" \
  --ref-videos "https://example.com/source.mp4" \
  --ref-audios "https://example.com/voice.wav" \
  -d 5
```

With Audio Generation (Seedance 1.5 only)

```bash
python scripts/video_generate.py -p "女孩抱着狐狸,可以听到风声和树叶沙沙声" -n with_audio \
  -f "https://example.com/girl_fox.png" \
  --generate-audio \
  -m doubao-seedance-1-5-pro-251215 \
  -d 6 --resolution 1080p
```

Query task status

```bash
python scripts/video_generate.py -q "cgt-20260222165751-wsnw8"
```

Use specific model

```bash
python scripts/video_generate.py -p "A futuristic city" -m doubao-seedance-1-5-pro-251215
```

No watermark

```bash
python scripts/video_generate.py -p "A beautiful landscape" --no-watermark
```

Command Line Options

| Option | Short | Description |
| --- | --- | --- |
| --prompt | -p | Text description of the video (required) |
| --name | -n | Video name identifier (default: video) |
| --model | -m | Model name (default: doubao-seedance-1-0-pro-250528) |
| --ratio | -r | Aspect ratio (default: 16:9) |
| --duration | -d | Video duration in seconds (2-12) |
| --resolution | | Video resolution: 480p, 720p, 1080p |
| --first-frame | -f | First frame image URL |
| --last-frame | -l | Last frame image URL |
| --ref-images | | Reference image URLs (space-separated, 1-4 images) |
| --ref-videos | | Reference video URLs (space-separated, 0-3 videos) |
| --ref-audios | | Reference audio URLs (space-separated, 0-3 audios) |
| --generate-audio | | Generate audio (Seedance 1.5 only) |
| --seed | | Random seed for reproducibility |
| --no-watermark | | Disable watermark |
| --timeout | -t | Max wait time in seconds (default: 1200) |
| --query-task | -q | Query task status by task_id |
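
For readers who want to see how these options could map onto a parser, here is a minimal `argparse` sketch. It is an assumption about how such a CLI might be declared, not a copy of scripts/video_generate.py. In particular, `--prompt` is left optional here because the query example passes only `-q`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented options; defaults follow the options listed above."""
    parser = argparse.ArgumentParser(description="Seedance video generation (sketch)")
    parser.add_argument("-p", "--prompt", help="Text description of the video")
    parser.add_argument("-n", "--name", default="video")
    parser.add_argument("-m", "--model", default="doubao-seedance-1-0-pro-250528")
    parser.add_argument("-r", "--ratio", default="16:9")
    parser.add_argument("-d", "--duration", type=int)
    parser.add_argument("--resolution", choices=["480p", "720p", "1080p"])
    parser.add_argument("-f", "--first-frame")
    parser.add_argument("-l", "--last-frame")
    # Space-separated URL lists, as in the examples.
    parser.add_argument("--ref-images", nargs="+", default=[])
    parser.add_argument("--ref-videos", nargs="*", default=[])
    parser.add_argument("--ref-audios", nargs="*", default=[])
    parser.add_argument("--generate-audio", action="store_true")
    parser.add_argument("--seed", type=int)
    parser.add_argument("--no-watermark", action="store_true")
    parser.add_argument("-t", "--timeout", type=int, default=1200)
    parser.add_argument("-q", "--query-task")
    return parser
```

Calling `build_parser().parse_args(["-p", "A futuristic city", "-d", "5"])` yields a namespace with `prompt`, `duration`, and the defaults filled in.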

Model Fallback

If you encounter a model-related error (like ModelNotOpen), you can fall back to one of these models:
  • doubao-seedance-1-5-pro-251215
  • doubao-seedance-1-0-pro-250528
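
The fallback idea can be sketched as a loop over the listed models. Everything besides the two model names is hypothetical: `ModelNotOpenError` stands in for however the real script surfaces a ModelNotOpen failure, and `generate` is any callable you supply that runs one generation attempt for a given model.

```python
from typing import Callable, Iterable

FALLBACK_MODELS = (
    "doubao-seedance-1-5-pro-251215",
    "doubao-seedance-1-0-pro-250528",
)


class ModelNotOpenError(Exception):
    """Hypothetical stand-in for a model-related failure such as ModelNotOpen."""


def generate_with_fallback(
    generate: Callable[[str], dict],
    models: Iterable[str] = FALLBACK_MODELS,
) -> dict:
    """Try each model in order and return the first result that succeeds."""
    last_error = None
    for model in models:
        try:
            return generate(model)
        except ModelNotOpenError as exc:
            last_error = exc  # remember the failure and try the next model
    raise RuntimeError("no listed model is available") from last_error
```

Only model-related errors trigger a fallback in this sketch; any other exception propagates immediately so that unrelated failures are not hidden.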

Error Handling

  • If the script raises the error "PermissionError: ARK_API_KEY or MODEL_VIDEO_API_KEY or MODEL_AGENT_API_KEY not found in environment variables", tell the user that they need to provide the ARK_API_KEY, MODEL_VIDEO_API_KEY, or MODEL_AGENT_API_KEY environment variable. Once they do, write it to the environment variable file in the workspace (append it to the end if the file already exists), make sure the entry is correctly formatted, load it so that it takes effect, and retry the video generation task that just failed.
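
The append-and-reload step might look like the sketch below. The file path, the `NAME=value` line format, and the helper name `append_env_var` are assumptions; this document does not say which file the workspace uses or how it is loaded.

```python
import os
from pathlib import Path


def append_env_var(env_file: Path, name: str, value: str) -> None:
    """Append NAME=value to env_file (creating it if needed) and export it."""
    existing = env_file.read_text(encoding="utf-8") if env_file.exists() else ""
    # Start on a fresh line if the file does not already end with a newline.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    with env_file.open("a", encoding="utf-8") as handle:
        handle.write(f"{separator}{name}={value}\n")
    # Make the variable effective for the current process as well.
    os.environ[name] = value
```

After this call the failed generation task can be retried in the same process, since the key is now visible through `os.environ`.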

Notes

  • Keep prompt concise (recommended ≤ 500 characters)
  • For first/last frame, ensure aspect ratios match your chosen ratio
  • Reference images: 1-4 images, formats: jpeg/png/webp/bmp/tiff/gif
  • Reference videos: 0-3 videos, formats: mp4/mov, total duration ≤ 15s
  • Reference audios: 0-3 audios, formats: mp3/wav, total duration ≤ 15s
  • Multimodal requires at least one image or video (audio-only not supported)
  • Audio generation is only supported by Seedance 1.5 pro
  • If polling times out, use --query-task with the returned task_id
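
Several of these notes are simple count checks that can be run before submitting a request. Below is a minimal sketch; `check_references` is a hypothetical helper, and it does not cover file formats or the 15-second total duration limit, since those would require inspecting the media itself.

```python
def check_references(item: dict) -> list:
    """Return warnings about prompt length and reference counts for one request item."""
    problems = []
    images = item.get("reference_images") or []
    videos = item.get("reference_videos") or []
    audios = item.get("reference_audios") or []
    if len(item.get("prompt", "")) > 500:
        problems.append("prompt is longer than the recommended 500 characters")
    if len(images) > 4:
        problems.append("at most 4 reference images are allowed")
    if len(videos) > 3:
        problems.append("at most 3 reference videos are allowed")
    if len(audios) > 3:
        problems.append("at most 3 reference audios are allowed")
    # Audio-only multimodal input is not supported.
    if audios and not (images or videos):
        problems.append("reference audio requires at least one image or video")
    return problems
```

An empty list means none of these checks failed; the service may still reject a request for reasons not covered here.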