video-agent

Video Agent - AI Content Generation Suite

Video Agent is an AI content generation package that provides a unified interface to 35+ models for image, video, and audio creation.

When to Use This Skill

  • Text-to-image generation
  • Image-to-image transformations
  • Text-to-video creation
  • Image-to-video animation
  • Professional text-to-speech
  • Multi-step content pipelines
  • Batch content generation

Supported Providers

FAL AI

  • FLUX models (text-to-image)
  • Image transformations
  • Fast inference

Google Vertex AI

  • Imagen 4 (text-to-image)
  • Veo (text-to-video)
  • High quality outputs

ElevenLabs

  • 20+ voice options
  • Professional TTS
  • Multiple languages

OpenRouter

  • Access to various LLMs
  • Text generation
  • Content writing

Core Capabilities

Image Generation

Generate image:
Prompt: "A serene Japanese garden at sunset"
Model: flux-pro
Size: 1024x1024
Style: photorealistic

Available Models:
  • FLUX Pro/Dev (FAL)
  • Imagen 4 (Google)
  • Stable Diffusion variants

Video Creation

Generate video:
Prompt: "Ocean waves crashing on rocky shore"
Model: veo
Duration: 5 seconds
Resolution: 1080p

Available Models:
  • Google Veo
  • MiniMax Hailuo
  • Kling

Image-to-Video

Animate image:
Source: /path/to/image.png
Motion: "gentle zoom out with particle effects"
Duration: 4 seconds

Text-to-Speech

Generate audio:
Text: "Welcome to our product demo..."
Voice: professional-female-1
Speed: 1.0
Output: welcome.mp3

Voice Options:
  • Professional male/female
  • Casual conversational
  • Narrator styles
  • Multiple accents

Pipeline Orchestration

YAML Configuration

yaml
pipeline: product-demo
steps:
  - name: generate-logo
    type: image
    model: flux-pro
    prompt: "Modern tech logo for AI startup"

  - name: create-intro
    type: video
    model: veo
    prompt: "Logo animation reveal"

  - name: add-voiceover
    type: audio
    model: elevenlabs
    text: "Introducing the future of AI..."
    voice: professional-male

  - name: combine
    type: merge
    inputs: [create-intro, add-voiceover]
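
The `inputs` field on the `combine` step implies that some steps depend on others. The document does not describe how the tool schedules steps, so the following is only a minimal sketch of one way a runner could derive a valid order. It assumes a step depends solely on the names listed in its optional `inputs` field; `execution_order` is a hypothetical helper, not part of the package.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# The pipeline from the YAML configuration, written as a plain dict so the
# sketch needs no YAML parser.
PIPELINE = {
    "pipeline": "product-demo",
    "steps": [
        {"name": "generate-logo", "type": "image"},
        {"name": "create-intro", "type": "video"},
        {"name": "add-voiceover", "type": "audio"},
        {"name": "combine", "type": "merge",
         "inputs": ["create-intro", "add-voiceover"]},
    ],
}


def execution_order(pipeline):
    """Return step names ordered so that every step runs after its inputs.

    Hypothetical helper: assumes dependencies are exactly the names in the
    optional `inputs` list of each step.
    """
    graph = {
        step["name"]: set(step.get("inputs", []))
        for step in pipeline["steps"]
    }
    # static_order() raises graphlib.CycleError if the steps form a cycle.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(PIPELINE)
print(order)
```

Steps that share no dependency, such as `create-intro` and `add-voiceover`, may appear in either order, which is also what would make them candidates for running concurrently.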

JSON Configuration

json
{
  "pipeline": "social-content",
  "parallel": true,
  "steps": [
    {
      "type": "image",
      "variants": 4,
      "prompt": "Product hero shot"
    }
  ]
}
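
One plausible reading of `variants` is that a single step fans out into several independent jobs. The sketch below illustrates that reading using only the standard `json` module; the fan-out behavior and the `expand_jobs` helper are assumptions for illustration, not documented behavior of the package.

```python
import json

CONFIG = """
{
  "pipeline": "social-content",
  "parallel": true,
  "steps": [
    {"type": "image", "variants": 4, "prompt": "Product hero shot"}
  ]
}
"""


def expand_jobs(config_text):
    """Expand each step into one job per requested variant.

    Hypothetical helper: a step without `variants` yields a single job.
    """
    config = json.loads(config_text)
    jobs = []
    for step in config["steps"]:
        for index in range(step.get("variants", 1)):
            jobs.append({
                "type": step["type"],
                "prompt": step["prompt"],
                "variant": index,
            })
    return jobs


jobs = expand_jobs(CONFIG)
print(len(jobs))
```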

Cost Management

Real-time Estimation

Estimate cost for:
- 10 images (1024x1024)
- 2 videos (5 seconds)
- 1 audio (60 seconds)

Estimated: $2.45
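
The arithmetic behind an estimate like this can be sketched as a sum of per-unit rates. The rates below are placeholders chosen only so that the example reproduces the $2.45 figure; they are not real provider prices, and `estimate_cost` is a hypothetical helper rather than the package's estimator.

```python
from decimal import Decimal

# Placeholder rates for illustration only, NOT actual provider pricing.
RATES = {
    "image": Decimal("0.05"),    # per 1024x1024 image (assumed)
    "video": Decimal("0.18"),    # per second of video (assumed)
    "audio": Decimal("0.0025"),  # per second of audio (assumed)
}


def estimate_cost(images, video_seconds, audio_seconds, rates=RATES):
    """Sum per-unit costs and round to whole cents."""
    total = (
        rates["image"] * images
        + rates["video"] * video_seconds
        + rates["audio"] * audio_seconds
    )
    return total.quantize(Decimal("0.01"))


# 10 images, 2 videos of 5 seconds each, 1 audio clip of 60 seconds.
estimate = estimate_cost(images=10, video_seconds=2 * 5, audio_seconds=60)
print(f"Estimated: ${estimate}")  # prints "Estimated: $2.45"
```

`Decimal` is used instead of `float` so that cent amounts add up exactly.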

Budget Limits

yaml
budget:
  max_per_job: $5.00
  max_daily: $50.00
  alert_threshold: 80%
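
How these three limits interact is not spelled out, so the following sketch rests on assumptions: a job is rejected if it exceeds `max_per_job` or would push the day's spend past `max_daily`, and an alert fires once projected daily spend reaches `alert_threshold` of `max_daily`. The `check_budget` helper is hypothetical.

```python
from decimal import Decimal

# The budget values from the YAML, kept as the strings a YAML parser
# would produce for "$5.00" and "80%".
BUDGET = {
    "max_per_job": "$5.00",
    "max_daily": "$50.00",
    "alert_threshold": "80%",
}


def parse_money(value):
    """Convert a string such as "$5.00" to a Decimal amount."""
    return Decimal(value.lstrip("$"))


def parse_percent(value):
    """Convert a string such as "80%" to a Decimal fraction (0.8)."""
    return Decimal(value.rstrip("%")) / 100


def check_budget(job_cost, spent_today, budget=BUDGET):
    """Return "reject", "alert", or "ok" for a proposed job (assumed rules)."""
    max_job = parse_money(budget["max_per_job"])
    max_daily = parse_money(budget["max_daily"])
    threshold = parse_percent(budget["alert_threshold"])
    projected = spent_today + job_cost
    if job_cost > max_job or projected > max_daily:
        return "reject"
    if projected >= max_daily * threshold:
        return "alert"
    return "ok"


print(check_budget(Decimal("2.45"), Decimal("10.00")))
```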

Performance Features

Parallel Execution

Generate 10 image variants in parallel
Threads: 4
Expected speedup: 2-3x
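
Because generation requests are network calls, a thread pool is one common way to run several at once. The sketch below shows that pattern with a stand-in function in place of a real API call; `fake_generate` and `generate_variants` are hypothetical names, and the actual speedup depends on provider latency and rate limits.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_generate(prompt):
    """Stand-in for a real image-generation API call."""
    return f"image for: {prompt}"


def generate_variants(prompt, count=10, threads=4):
    """Run `count` generation calls on a pool of `threads` worker threads."""
    prompts = [f"{prompt} (variant {i + 1})" for i in range(count)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(fake_generate, prompts))


results = generate_variants("Product hero shot")
print(len(results))  # prints 10
```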

Caching

  • Automatic prompt caching
  • Reuse similar generations
  • Reduce redundant API calls
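
The document does not say how "similar" prompts are matched. As a rough sketch under a simple assumption, the cache below treats prompts as the same when they differ only in letter case or whitespace, and keys entries on the model name plus the normalized prompt. `PromptCache` is a hypothetical class, not the package's implementation.

```python
import hashlib


class PromptCache:
    """In-memory cache keyed on model name and normalized prompt."""

    def __init__(self):
        self._store = {}
        self.hits = 0

    @staticmethod
    def _key(model, prompt):
        # Normalize case and collapse whitespace so trivially different
        # prompts map to the same entry (an assumed notion of "similar").
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(f"{model}\n{normalized}".encode("utf-8")).hexdigest()

    def get_or_generate(self, model, prompt, generate):
        """Return the cached result, calling `generate` only on a miss."""
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        result = generate(prompt)
        self._store[key] = result
        return result


calls = []


def fake_generate(prompt):
    """Stand-in for a real API call; records each invocation."""
    calls.append(prompt)
    return f"image for: {prompt}"


cache = PromptCache()
first = cache.get_or_generate("flux-pro", "Sunset over mountains", fake_generate)
second = cache.get_or_generate("flux-pro", "  sunset over   MOUNTAINS ", fake_generate)
print(len(calls), cache.hits)  # prints "1 1"
```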

CLI Commands

bash
# Image generation
video-agent image "prompt" --model flux-pro --size 1024

# Video generation
video-agent video "prompt" --model veo --duration 5

# Audio generation
video-agent audio "text" --voice professional-female

# Pipeline execution
video-agent pipeline config.yaml

# Cost check
video-agent cost --estimate
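
When driving the CLI from a script, building the argument list in code avoids shell-quoting mistakes in prompts. The sketch below uses only the `image` subcommand and the `--model` and `--size` flags shown above; `build_image_command` is a hypothetical helper, and the command is printed rather than executed.

```python
import shlex


def build_image_command(prompt, model="flux-pro", size=1024):
    """Build the argument list for the image subcommand shown above."""
    return ["video-agent", "image", prompt, "--model", model, "--size", str(size)]


command = build_image_command("A serene Japanese garden at sunset")
# shlex.join quotes arguments so the line can be pasted into a shell.
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would execute it without involving a shell, assuming `video-agent` is installed and on the `PATH`.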

Python API

python
from video_agent import ImageGenerator, VideoGenerator

# Generate image
img = ImageGenerator(model="flux-pro")
result = img.generate("sunset over mountains")

# Generate video
vid = VideoGenerator(model="veo")
result = vid.generate("timelapse of clouds")

Setup

1. Install Package

bash
pip install video-agent-claude-skill

2. Configure API Keys

bash
export FAL_API_KEY="your-key"
export GOOGLE_VERTEX_KEY="your-key"
export ELEVENLABS_API_KEY="your-key"
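
A script can check for these variables before starting a job, which gives a clearer error than a failed API call. This is a small sketch using only `os.environ`; it assumes all three keys are needed, whereas in practice only the keys for the providers you use may be required. `missing_keys` is a hypothetical helper.

```python
import os

# The variable names listed in the export commands above.
REQUIRED_KEYS = ("FAL_API_KEY", "GOOGLE_VERTEX_KEY", "ELEVENLABS_API_KEY")


def missing_keys(environ=os.environ):
    """Return the names of required keys that are unset or empty."""
    return [name for name in REQUIRED_KEYS if not environ.get(name)]


missing = missing_keys()
if missing:
    print("Missing API keys:", ", ".join(missing))
else:
    print("All API keys are set.")
```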

3. Verify Setup

bash
video-agent status

Use Cases

  • Marketing: Product images, promo videos
  • Social Media: Content at scale
  • Education: Explainer videos, voiceovers
  • Prototyping: Visual concepts, mockups
  • Automation: Batch content pipelines

Credits

Created by donghaozhang. Licensed under MIT.