agent-media

Agent Media

Agent Media is an agent-first media toolkit that provides CLI-accessible commands for image, video, and audio processing. All commands produce deterministic, machine-readable JSON output.

Available Commands

Image Commands

  • agent-media image resize
    - Resize an image
  • agent-media image convert
    - Convert image format
  • agent-media image remove-background
    - Remove image background
  • agent-media image generate
    - Generate image from text

Audio Commands

  • agent-media audio extract
    - Extract audio from video
  • agent-media audio transcribe
    - Transcribe audio to text

Video Commands

  • agent-media video generate
    - Generate video from text or image

Output Format

All commands return JSON to stdout:
{
  "ok": true,
  "media_type": "image",
  "action": "resize",
  "provider": "local",
  "output_path": "output_123.webp",
  "mime": "image/webp",
  "bytes": 12345
}
On error:
{
  "ok": false,
  "error": {
    "code": "INVALID_INPUT",
    "message": "input file not found"
  }
}
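
Because every result carries an `ok` field, a caller can branch on it instead of scraping free-form text. The following is a minimal Python sketch that parses a result string shaped like the two examples above. `MediaError` and `parse_result` are hypothetical helper names introduced for illustration; they are not part of agent-media.

```python
import json


class MediaError(Exception):
    """Raised when a result has "ok": false (hypothetical helper)."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def parse_result(stdout_text: str) -> dict:
    """Parse one JSON result; return it on success, raise MediaError otherwise."""
    result = json.loads(stdout_text)
    if result.get("ok") is True:
        return result
    # Field names follow the error example above; defaults cover missing keys.
    error = result.get("error") or {}
    raise MediaError(error.get("code", "UNKNOWN"), error.get("message", ""))


# Usage with a literal string standing in for captured stdout.
success = parse_result('{"ok": true, "output_path": "output_123.webp", "bytes": 12345}')
print(success["output_path"])
```

In practice the string would come from the captured stdout of a command; the sketch leaves the invocation out because the per-command flags are not listed here.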

Providers

  • local - Default provider using Sharp (resize, convert) and Transformers.js (remove-background, transcribe)
  • fal - fal.ai provider (generate, edit, remove-background, transcribe, video)
  • replicate - Replicate API (generate, edit, remove-background, transcribe, video)
  • runpod - Runpod API (generate, edit)
  • ai-gateway - Vercel AI Gateway (generate, edit)
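
The provider list above can be treated as a lookup table when an agent needs to know which providers support a given action. The Python sketch below copies the action names verbatim from that list; `PROVIDER_ACTIONS` and `providers_for` are hypothetical names used only for illustration, and the table is not read from agent-media itself.

```python
# Actions per provider, transcribed from the list above.
PROVIDER_ACTIONS = {
    "local": {"resize", "convert", "remove-background", "transcribe"},
    "fal": {"generate", "edit", "remove-background", "transcribe", "video"},
    "replicate": {"generate", "edit", "remove-background", "transcribe", "video"},
    "runpod": {"generate", "edit"},
    "ai-gateway": {"generate", "edit"},
}


def providers_for(action: str) -> list[str]:
    """Return the providers that list the given action, in table order."""
    return [name for name, actions in PROVIDER_ACTIONS.items() if action in actions]


print(providers_for("transcribe"))  # ['local', 'fal', 'replicate']
print(providers_for("resize"))      # ['local']
```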

Provider Selection

  1. Explicit: --provider <name>
  2. Auto-detect from environment variables
  3. Fall back to the local provider
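
The precedence above can be sketched in Python as follows. The environment variable names are the ones listed under Environment Variables; the order in which they are checked during auto-detection is an assumption, since the document does not state it, and `select_provider` is a hypothetical function, not agent-media's actual implementation.

```python
import os
from typing import Mapping, Optional

# Variable that enables each remote provider (from Environment Variables).
# ASSUMPTION: the check order below is illustrative, not documented.
PROVIDER_ENV_VARS = [
    ("fal", "FAL_API_KEY"),
    ("replicate", "REPLICATE_API_TOKEN"),
    ("runpod", "RUNPOD_API_KEY"),
    ("ai-gateway", "AI_GATEWAY_API_KEY"),
]


def select_provider(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick a provider: explicit flag, then environment, then local."""
    env = os.environ if env is None else env
    if explicit:  # 1. value passed via --provider
        return explicit
    for name, var in PROVIDER_ENV_VARS:  # 2. auto-detect from environment
        if env.get(var):
            return name
    return "local"  # 3. fallback


print(select_provider(env={"REPLICATE_API_TOKEN": "example-token"}))  # replicate
```

Passing `env` explicitly keeps the function easy to test; leaving it out reads the real process environment.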

Environment Variables

  • AGENT_MEDIA_DIR
    - Custom output directory
  • FAL_API_KEY
    - Enable fal provider
  • REPLICATE_API_TOKEN
    - Enable replicate provider
  • RUNPOD_API_KEY
    - Enable runpod provider
  • AI_GATEWAY_API_KEY
    - Enable ai-gateway provider