video-analyzer


Analyze video content using visual/video large models. This tool is triggered when the user uses phrases like "analyze video", "video understanding", or "look at this video".


NPX Install

npx skill4agent add zrong/skills video-analyzer

SKILL.md Content (translated from Chinese)


Video Analyzer

Analyze video content using visual/video large models, supporting local video files and online videos.

Use Cases

  • Users request to analyze, understand, or describe a video
  • Users provide a video file path or URL and want to know the video content
  • Users need to ask questions about the video

Configuration

Environment Variables

Set the corresponding API Key environment variables based on the model used:
```bash
# VolcEngine (Doubao)
export ARK_API_KEY="your-api-key"

# OpenAI
export OPENAI_API_KEY="your-api-key"
```

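As a sketch of how the script might use these variables (an assumption, not the actual `analyze.py` code): each model's configuration names the environment variable to read, and the key is resolved at call time with a clear error if it is missing.

```python
import os

def resolve_api_key(api_key_env: str) -> str:
    """Read the API key from the environment variable named in models.json.

    Hypothetical helper illustrating the api_key_env indirection.
    """
    key = os.environ.get(api_key_env)
    if not key:
        raise RuntimeError(f"environment variable {api_key_env} is not set")
    return key
```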
Model Configuration

Edit scripts/models.json to add or modify model configurations. Each model entry requires:
  • base_url — API endpoint
  • api_key_env — name of the environment variable holding the API key
  • model — model ID
  • api_type — responses or chat_completions
  • supports_video — whether native video input is supported
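Putting those fields together, an entry in models.json might look like the following. This is an illustrative sketch: the endpoint is the public VolcEngine Ark base URL, but the exact file layout, the `default_model` key placement, and the model ID are assumptions, not taken from the skill's actual scripts/models.json.

```json
{
  "default_model": "doubao-vision",
  "models": {
    "doubao-vision": {
      "base_url": "https://ark.cn-beijing.volces.com/api/v3",
      "api_key_env": "ARK_API_KEY",
      "model": "doubao-1.5-vision-pro",
      "api_type": "chat_completions",
      "supports_video": true
    }
  }
}
```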

Workflow

  1. Confirm the video source: obtain the video path or URL provided by the user.
  2. Confirm the analysis requirements: clarify what the user wants to know (e.g. a content summary, question answering, or a scene description). If $ARGUMENTS is not empty, use it as the analysis prompt.
  3. Select a model: by default, use the default_model from models.json; the user may also specify a model.
  4. Execute the analysis: run the script from the scripts/ directory:
     ```bash
     uv run analyze.py --video <video path or URL> --prompt "<analysis prompt>"
     ```
     Optional parameters:
     • --model <name> — model to use (a key in models.json)
     • --frames <number> — number of frames to extract (default: 10)
     • --max-size <pixels> — maximum side length of extracted frames (default: 720)
  5. Display the results: present the model's analysis to the user.
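The --frames option above implies selecting a fixed number of frames spread across the whole video. A minimal sketch of such evenly spaced sampling (a hypothetical helper, not the skill's actual code):

```python
def sample_frame_indices(total_frames: int, num_frames: int = 10) -> list[int]:
    """Pick num_frames evenly spaced frame indices across a video.

    Illustrative sketch of what a --frames option might do; clamps to the
    available frame count so short videos still work.
    """
    if total_frames <= 0:
        return []
    n = min(num_frames, total_frames)
    step = total_frames / n
    return [min(int(i * step), total_frames - 1) for i in range(n)]
```

For a 100-frame video with the default of 10 frames, this picks one frame every 10 frames, starting at frame 0.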

CLI Reference

```bash
# Local video
uv run analyze.py --video /path/to/video.mp4 --prompt "Describe the video content"

# Direct online video URL
uv run analyze.py --video https://example.com/video.mp4 --prompt "Analyze the video"

# Video website URL (YouTube, Bilibili, etc.)
uv run analyze.py --video https://www.youtube.com/watch?v=xxxxx --prompt "Summarize the video"

# Specify the model and number of frames
uv run analyze.py --video video.mp4 --model doubao-vision --frames 20 --prompt "Analyze"
```
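The --max-size option caps the longer side of extracted frames before they are sent to the model. Aspect-ratio-preserving scaling for that cap can be sketched as follows (a hypothetical helper, not the skill's actual code):

```python
def fit_max_side(width: int, height: int, max_size: int = 720) -> tuple[int, int]:
    """Scale (width, height) so the longer side is at most max_size.

    Illustrative sketch of a --max-size style limit; preserves aspect
    ratio and never upscales small frames.
    """
    longest = max(width, height)
    if longest <= max_size:
        return width, height
    scale = max_size / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 1920x1080 frame with the default max size of 720 is scaled down to 720x405.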

Notes

  • Downloading videos from website URLs depends on yt-dlp, which is installed automatically as a Python dependency.
  • In frame-extraction mode, more frames yield a more detailed analysis but increase API call costs.
  • Downloading large video files can take a long time.
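Since --video accepts both local paths and URLs, the script has to decide which it was given before deciding whether to invoke yt-dlp. A rough heuristic (an assumption about the implementation, not the skill's actual code):

```python
from urllib.parse import urlparse

def is_remote(video: str) -> bool:
    """Rough check whether --video is a URL rather than a local path.

    Illustrative heuristic: treat http/https schemes as remote. Edge cases
    (e.g. Windows drive letters parsing as a scheme) would need extra care.
    """
    return urlparse(video).scheme in ("http", "https")
```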