Bilibili Video Analyzer
Description
A content analysis tool for Bilibili videos. Given a video URL, it automatically downloads the video, splits it into frame images, analyzes the frames with AI, and generates a thematic document or a practical tutorial.
Core Features:
- Rather than logging the video in timeline order, it reorganizes and restructures the content into a complete document
- Practical videos → Generate ready-to-use operation tutorials
- Knowledge-based videos → Generate structured thematic documents
- Key screenshots are inserted into the report as Markdown image references that name the frame file (frame_xxxx.jpg)

Source & Documentation
Installation
1. Install .NET 10 SDK
The script relies on the single-file execution feature of .NET 10 (running a .cs file directly with dotnet run), so the .NET 10 SDK is required.
Verify the installation by running `dotnet --version`.
2. Install FFmpeg
Windows:
powershell
# Chocolatey
choco install ffmpeg
# Or Scoop
scoop install ffmpeg
# Or manual download: https://ffmpeg.org/download.html
macOS:
bash
# Homebrew
brew install ffmpeg
Linux:
bash
# Ubuntu/Debian
sudo apt install ffmpeg
# CentOS/RHEL
sudo yum install ffmpeg
Verify the installation by running `ffmpeg -version`.
Trigger
- command
- User requests to analyze Bilibili videos
- User provides a Bilibili video link and requests analysis
Provided Script
This skill provides the prepare.cs script for downloading videos and extracting frame images.
Script Location:
skills/tools/bilibili-analyzer/scripts/prepare.cs
Execution Method: Run the file directly with the .NET 10 single-file execution feature (dotnet run)
Usage
bash
# Basic usage
dotnet run scripts/prepare.cs "<Video URL>" -o <Output Directory>
# Example
dotnet run scripts/prepare.cs "https://www.bilibili.com/video/BV1xx411c7mD" -o ./output
# Long video (reduce frame rate)
dotnet run scripts/prepare.cs "https://www.bilibili.com/video/BV1xx411c7mD" -o ./output --fps 0.5
Parameter Description
| Parameter | Description | Default Value |
|---|---|---|
| `<Video URL>` | Bilibili video URL (required) | - |
| `-o` | Output directory | Current directory |
| `--fps` | Frames extracted per second | 1.0 |
| | Similarity threshold (0-1); adjacent frames whose similarity exceeds this value are deduplicated | 0.80 |
| | Disable similar-frame deduplication | false |
| `--video-only` | Only download the video; do not extract frames | false |
| | Only extract frames (requires an existing video.mp4) | false |
Similar Frame Deduplication
The script automatically performs similarity detection on adjacent frames and removes duplicate frames with similarity exceeding the threshold (default 80%):
- Similarity is calculated with FFmpeg's SSIM/PSNR filters
- Only compares adjacent frames, no cross-frame comparison
- Automatically renumbers after deduplication (frame_0001.jpg, frame_0002.jpg, ...)
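- The threshold can be adjusted with the similarity-threshold parameter listed in the table above
- Deduplication can be disabled with the disable-deduplication parameter listed in the table above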
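The deduplication rules above can be illustrated with a minimal Python sketch. It is not the logic of prepare.cs: the function name `dedupe_adjacent` and the `similarity` callback are hypothetical, and the sketch assumes each frame is compared with the most recently kept frame, which the description does not spell out.

```python
from typing import Callable, List, Tuple


def dedupe_adjacent(
    frames: List[str],
    similarity: Callable[[str, str], float],
    threshold: float = 0.80,
) -> List[Tuple[str, str]]:
    """Return (original_name, new_name) pairs for the frames that survive."""
    kept: List[str] = []
    for frame in frames:
        # Assumption: compare against the most recently kept frame.
        if kept and similarity(kept[-1], frame) > threshold:
            continue  # similarity exceeds the threshold, so drop the frame
        kept.append(frame)
    # Renumber the survivors consecutively: frame_0001.jpg, frame_0002.jpg, ...
    return [(name, f"frame_{i:04d}.jpg") for i, name in enumerate(kept, start=1)]


# Stand-in similarity scores; a real run would obtain them from SSIM/PSNR.
scores = {("f1", "f2"): 0.95, ("f1", "f3"): 0.40, ("f3", "f4"): 0.90}
result = dedupe_adjacent(["f1", "f2", "f3", "f4"], lambda a, b: scores[(a, b)])
print(result)  # [('f1', 'frame_0001.jpg'), ('f3', 'frame_0002.jpg')]
```

Passing the similarity measure as a callback keeps the sketch independent of how the score is actually computed.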
Output Structure
<Output Directory>/
├── video.mp4 # Downloaded video file
└── images/ # Frame image directory
├── frame_0001.jpg
├── frame_0002.jpg
├── frame_0003.jpg
└── ...
Workflow (Prompt)
You are a video content analysis assistant. When a user provides a Bilibili video link, follow these steps:
Step 1: Download Video and Split into Frames
Use the provided script to download the video and split it into frame images:
bash
dotnet run skills/tools/bilibili-analyzer/scripts/prepare.cs "<Video URL>" -o <Output Directory>
Notes:
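- Short videos (<10 minutes): Use the default frame rate (1 frame per second)
- Medium videos (10-30 minutes): Use a reduced frame rate via --fps
- Long videos (>30 minutes): Use a further reduced frame rate via --fps (the examples in this document use 0.5 and 0.2)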
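To see why longer videos call for a lower frame rate, the number of extracted frames can be estimated as duration times frames per second. The helper below is a rough sketch with a hypothetical name; it gives the count before similar-frame deduplication and ignores rounding details inside FFmpeg.

```python
def estimated_frame_count(duration_seconds: float, fps: float = 1.0) -> int:
    """Approximate number of frames extracted before deduplication."""
    if duration_seconds < 0 or fps <= 0:
        raise ValueError("duration must be non-negative and fps must be positive")
    return round(duration_seconds * fps)


# A 45-minute video at the default rate versus a reduced rate.
print(estimated_frame_count(45 * 60))        # 2700
print(estimated_frame_count(45 * 60, 0.5))   # 1350
```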
Step 2: Analyze Frame Images
Use the Task tool to analyze the images in the <Output Directory>/images/ directory in batches and in parallel.
Batching Strategy (dynamically calculated based on total number of images):
| Total Images | Number of Batches | Images per Batch |
|---|---|---|
| 1-30 | 1 batch | All |
| 31-60 | 2 batches | ~15-30 images/batch |
| 61-120 | 3 batches | ~20-40 images/batch |
| 121-200 | 4 batches | ~30-50 images/batch |
| 200+ | 5 batches | Evenly distributed |
Calculation Formula:
Total Images <= 30: 1 batch
Total Images <= 60: 2 batches
Total Images <= 120: 3 batches
Total Images <= 200: 4 batches
Total Images > 200: 5 batches
Images per Batch = Total Images / Number of Batches (rounded up)
Example: If there are 85 images → split into 3 batches
Task 1: Analyze frame_0001.jpg ~ frame_0029.jpg (29 images)
Task 2: Analyze frame_0030.jpg ~ frame_0058.jpg (29 images)
Task 3: Analyze frame_0059.jpg ~ frame_0085.jpg (27 images)
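The calculation formula and the 85-image example can be reproduced with a short Python sketch. The function names `batch_count` and `plan_batches` are illustrative only and are not part of the provided script.

```python
import math
from typing import List, Tuple


def batch_count(total_images: int) -> int:
    """Number of batches for a given image count, per the thresholds above."""
    if total_images <= 30:
        return 1
    if total_images <= 60:
        return 2
    if total_images <= 120:
        return 3
    if total_images <= 200:
        return 4
    return 5


def plan_batches(total_images: int) -> List[Tuple[int, int]]:
    """Return inclusive (first_frame, last_frame) ranges, one per batch."""
    if total_images <= 0:
        return []
    # Images per batch = total / batches, rounded up.
    size = math.ceil(total_images / batch_count(total_images))
    return [
        (start, min(start + size - 1, total_images))
        for start in range(1, total_images + 1, size)
    ]


print(plan_batches(85))  # [(1, 29), (30, 58), (59, 85)]
```

Because the batch size is rounded up, the last batch absorbs the remainder and can be smaller than the others, as in the example (29, 29, 27).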
Task Prompt Template:
Read and analyze frame_0001.jpg to frame_0020.jpg (total 20 images) in the <Output Directory>/images/ directory.
【Important Requirements】
Your response must be a **complete and detailed report** of the content of these images. Do not omit any information.
For each image, record in detail:
1. **Frame Number**: frame_xxxx.jpg
2. **Scene Type**: Code Editor/Terminal/Browser/PPT/Dialog/Other
3. **Interface Content**:
- UI elements like window titles, menus, buttons
- Currently open files/pages
4. **Text Content**:
- Full transcription of all text on the screen
- Code content (fully copied, retain format)
- Terminal commands and output
- Comments and explanatory text
5. **Operation Actions**:
- Mouse position, click target
- Ongoing operation
6. **Key Information**:
- Important configuration items
- Key step explanations
- Error messages or warnings
【Output Format】
## frame_0001.jpg
- Scene: [Scene Type]
- Content: [Detailed Description]
- Text/Code:
[Complete text or code content]
- Operation: [Ongoing Operation]
- Key Points: [Key Information]
## frame_0002.jpg
...
【Notes】
- Do not omit any images
- Code and text must be fully transcribed
- The more detailed the information, the better
Analysis Key Points:
- Fully transcribe all text and code content
- Describe interface elements and operation steps in detail
- Record key information of each image
- Mark important screenshot frame numbers (e.g., frame_0042.jpg)
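When the batches are dispatched programmatically, the opening line of the Task prompt template can be filled in from a batch range. The helper below is a small sketch; its name and parameters are hypothetical, and the rest of the template would be appended unchanged.

```python
def task_prompt_header(first: int, last: int, output_dir: str) -> str:
    """Build the first line of the Task prompt for one batch of frames."""
    count = last - first + 1
    return (
        f"Read and analyze frame_{first:04d}.jpg to frame_{last:04d}.jpg "
        f"(total {count} images) in the {output_dir}/images/ directory."
    )


print(task_prompt_header(30, 58, "./output"))
```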
Step 3: Generate Document
Reorganize and restructure the analysis results into a document in the output directory, choosing the template based on video type:
Determine the Video Type:
- Practical type: Programming tutorials, software operations, configuration demonstrations, etc.
- Knowledge type: Concept explanations, principle analysis, experience sharing, etc.
【Key】Images and Content Must Strictly Correspond:
Wrong Example ❌:
### Install Node.js
First download Node.js...
 ← The image may be other content
Correct Example ✅:
### Install Node.js
First download Node.js...
 ← The image is indeed the download page
Correct Document Generation Process:
1. First organize all analysis results returned by the Tasks
   - Summarize the analysis content of all frames
   - Establish the correspondence between "frame number → content"
2. Reorganize content by topic (not in chronological order)
   - Categorize related content into the same chapter
   - Determine which frames' information is needed for each chapter
3. Verify carefully when inserting images
   - Only insert images directly related to the current content
   - Image descriptions must accurately reflect the actual content of the image
   - Use a Markdown image reference that names the frame file (frame_xxxx.jpg)
4. Code must come from actual code in the images
   - Do not fabricate code yourself
   - Mark the code source: `<!-- Code from frame_xxxx -->`
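The "frame number → content" correspondence can be built mechanically from the Task reports, since each frame section starts with a `## frame_xxxx.jpg` heading in the output format above. The following is a minimal sketch under that assumption; `index_by_frame` is a hypothetical helper name.

```python
import re
from typing import Dict

# Matches section headings such as "## frame_0001.jpg" on their own line.
FRAME_HEADING = re.compile(r"^## (frame_\d{4}\.jpg)[ \t]*$", re.MULTILINE)


def index_by_frame(report: str) -> Dict[str, str]:
    """Map each frame file name to the text of its report section."""
    matches = list(FRAME_HEADING.finditer(report))
    index: Dict[str, str] = {}
    for i, match in enumerate(matches):
        # A section runs until the next frame heading or the end of the report.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(report)
        index[match.group(1)] = report[match.end():end].strip()
    return index


sample = "## frame_0001.jpg\n- Scene: Terminal\n## frame_0002.jpg\n- Scene: Browser\n"
print(index_by_frame(sample))
```

Merging the dictionaries returned for every batch gives one lookup table to consult when deciding which frame belongs in which chapter.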
Important Principles:
- Image-Text Correspondence - Each image must match the content of its context
- No Chronological Play-by-Play - Reorganize the content as you would when writing an article
- Clear Structure - Have chapter divisions and logical order
- Authentic Code - Only use code that appears in images, do not fabricate
- Independent Readability - Fully understandable without watching the video
Output Format
Practical Tutorial Type
markdown
# {Tutorial Topic}
## Introduction
{Tutorial Objectives}
{Prerequisites and Requirements}
## Environment Preparation
{Software to be Installed}
{Configuration Requirements}
## Operation Steps
### 1. {Step Title}
{Detailed description, content must correspond to the image below}

<!-- Code from frame_xxxx -->
```code block```
### 2. {Step Title}
{Detailed description}

...
## Complete Code
<!-- Summarized from frame_xxxx, frame_xxxx, frame_xxxx -->
{Summarize all code snippets, mark source frame numbers}
## Common Issues
{Possible issues and solutions}
## Summary
{Review of core points}
{Extended learning suggestions}
Knowledge Document Type
markdown
# {Topic}
## Overview
{Topic background introduction}
{Why it's important}
## {Chapter 1 Title}
{Content, must correspond to the accompanying image}

## {Chapter 2 Title}
{Content}

## Core Points
- Point 1
- Point 2
- Point 3
## Extended Reading
{Related resources and suggestions}
Image Insertion Specifications
| Rule | Description |
|---|---|
| Frame Number Must Be Marked | The image reference names the frame file (frame_xxxx.jpg) |
| Description Must Be Accurate | Describe the actual content of the image, not the expected content |
| Content Must Match | Text above/below the image must be directly related to the image content |
| Code Source Marked | `<!-- Code from frame_xxxx -->` |
| No Random Images | If no suitable image exists, insert none; do not force a match |
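One of these rules can be checked automatically: every frame mentioned in the generated document should exist among the extracted frames. The sketch below uses a hypothetical helper name and only looks for `frame_NNNN` tokens, so it covers both file names and code-source comments. It cannot judge whether an image actually matches the surrounding text.

```python
import re
from typing import Iterable, List

# Four-digit frame numbers, e.g. frame_0042 or frame_0042.jpg.
FRAME_REF = re.compile(r"frame_(\d{4})(?!\d)")


def unknown_frame_references(document: str, available: Iterable[str]) -> List[str]:
    """Return referenced frame files that are not in the extracted set."""
    known = set()
    for name in available:
        match = FRAME_REF.search(name)
        if match:
            known.add(match.group(1))
    missing: List[str] = []
    for number in FRAME_REF.findall(document):
        if number not in known and number not in missing:
            missing.append(number)
    return [f"frame_{number}.jpg" for number in missing]


doc = "See images/frame_0003.jpg\n<!-- Code from frame_0042 -->\n"
print(unknown_frame_references(doc, ["frame_0001.jpg", "frame_0003.jpg"]))
```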
API Reference
Bilibili API
The script uses the official Bilibili API to download videos:
# Get video information
GET https://api.bilibili.com/x/web-interface/view?bvid=BV1xx411c7mD
# Get playback address
GET https://api.bilibili.com/x/player/playurl?bvid=BV1xx411c7mD&cid={cid}&qn=80&fnval=1
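The two request URLs can be assembled from a video link as in the sketch below. It is an illustration, not the code in prepare.cs: the helper names are hypothetical, the BV-id pattern (BV followed by ten alphanumeric characters) is an assumption, and the `cid` value is assumed to come from the video-information response. No request is sent.

```python
import re
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://api.bilibili.com"
# Assumed shape of a BV id: "BV" plus ten alphanumeric characters.
BVID = re.compile(r"BV[0-9A-Za-z]{10}")


def extract_bvid(video_url: str) -> Optional[str]:
    """Pull the BV id out of a video URL, or return None if absent."""
    match = BVID.search(video_url)
    return match.group(0) if match else None


def view_url(bvid: str) -> str:
    """URL of the video-information endpoint."""
    return f"{API_BASE}/x/web-interface/view?" + urlencode({"bvid": bvid})


def playurl_url(bvid: str, cid: int, qn: int = 80, fnval: int = 1) -> str:
    """URL of the playback-address endpoint."""
    query = urlencode({"bvid": bvid, "cid": cid, "qn": qn, "fnval": fnval})
    return f"{API_BASE}/x/player/playurl?" + query


bvid = extract_bvid("https://www.bilibili.com/video/BV1xx411c7mD")
print(view_url(bvid))
```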
FFmpeg Frame Extraction Commands
bash
# 1 frame per second
ffmpeg -i video.mp4 -vf "fps=1" -q:v 2 images/frame_%04d.jpg
# 0.5 frames per second (1 frame every 2 seconds)
ffmpeg -i video.mp4 -vf "fps=0.5" -q:v 2 images/frame_%04d.jpg
# Specify time range
ffmpeg -i video.mp4 -ss 00:01:00 -to 00:05:00 -vf "fps=1" -q:v 2 images/frame_%04d.jpg
# Extract scene-change frames (scene score above 0.3); FFmpeg 5.1+ prefers -fps_mode vfr over the deprecated -vsync vfr
ffmpeg -i video.mp4 -vf "select='gt(scene,0.3)'" -vsync vfr -q:v 2 images/frame_%04d.jpg
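For callers that drive FFmpeg from code, the fixed-rate commands above can be built as an argument list. This is a sketch with a hypothetical function name; it only constructs the command and does not run it.

```python
from pathlib import PurePosixPath
from typing import List


def frame_extraction_command(video: str, image_dir: str, fps: float = 1.0) -> List[str]:
    """Build the ffmpeg argument list for fixed-rate frame extraction."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    # Output pattern yields frame_0001.jpg, frame_0002.jpg, ...
    pattern = str(PurePosixPath(image_dir) / "frame_%04d.jpg")
    return ["ffmpeg", "-i", video, "-vf", f"fps={fps:g}", "-q:v", "2", pattern]


print(" ".join(frame_extraction_command("video.mp4", "images", 0.5)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues with the filter expression.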
Examples
Example 1: Analyze Programming Tutorial
bash
# 1. Download and split into frames
dotnet run scripts/prepare.cs "https://www.bilibili.com/video/BV1xx411c7mD" -o ./react-tutorial
# 2. Analyze images (use Task tool)
# 3. Generate react-tutorial/Video Analysis.md
Example 2: Analyze Long Video
bash
# Reduce frame rate to reduce number of images
dotnet run scripts/prepare.cs "https://www.bilibili.com/video/BV1xx411c7mD" -o ./long-video --fps 0.2
Example 3: Only Download Video
bash
dotnet run scripts/prepare.cs "https://www.bilibili.com/video/BV1xx411c7mD" -o ./output --video-only
Quality Checklist
Before generating the document, verify each of the following items:
Content Quality
Image-Text Correspondence (Important!)
Code Quality
Tags
Compatibility
- Codex: Yes
- Claude Code: Yes