Clipify Video Clip Generator


Skill by ara.so — Devtools Skills collection.

Clipify is a Claude Code skill that automatically turns long-form videos into social-ready clips. It transcribes video, identifies clip-worthy moments, reframes 16:9 to 9:16 with face-tracking pans, and burns opus-style word-by-word captions.

Key capabilities:

- Auto-detect punchlines, reversals, and awkward pauses via Whisper transcription
- Face-tracking pan for 9:16 vertical clips (no ML models; uses motion energy)
- Opus/karaoke/minimal subtitle styles with word-level highlighting
- Hardware-accelerated rendering (VideoToolbox on macOS)
- ~20s render time for 20s clips on Apple Silicon


Installation

```bash
# Clone to Claude Code skills directory
git clone https://github.com/louisedesadeleer/clipify.git ~/.claude/skills/clipify

# Install dependencies
brew install ffmpeg
pip install openai-whisper numpy
```


**Requirements:**
- macOS (or Linux/Windows with `-hwaccel videotoolbox` removed from SKILL.md)
- ffmpeg with libx264
- Python 3 with numpy
- Whisper (openai-whisper)

Restart Claude Code after installation. The `/clipify` slash command will be available.


Usage Workflow

1. Invoke the skill

In Claude Code:

/clipify

Provide the path to your source video when prompted:

/path/to/long-interview.mp4

2. Review proposed clips

Clipify transcribes the video and proposes 3-5 candidates with:

- Timestamp range
- Title/description
- Reason (punchline, reversal, audio peak, awkward pause)

Example output:

Clip 1: "The worst product advice" (02:34 - 02:51)
  Reason: Reversal after awkward pause

Clip 2: "We burned $2M on this" (08:12 - 08:29)
  Reason: Audio peak + punchline

Clip 3: "My co-founder quit on Zoom" (15:03 - 15:24)
  Reason: Punchline

3. Select clip and format

Choose which clip to cut, then specify:

- Aspect ratio: 9:16 (vertical), 16:9 (horizontal), 1:1 (square)
- Reframe style (if 9:16 from 16:9 with two speakers): pan (follow speaker) or split-screen
- Subtitle style: opus (bold white + yellow highlight), karaoke (word-by-word), minimal, or paste a reference image

4. Output

Final clips are saved to:

<source-video-dir>/clipify_out/clip_<timestamp>.mp4

Scripts Reference

Clipify uses standalone Python scripts for each processing step. You can call these directly for custom workflows.

analyze.py — Speaker timeline from motion energy

```bash
# Generate motion energy files for two face regions
ffmpeg -i video.mp4 -vf "crop=300:200:100:50,format=gray,tblend=all_mode=difference" \
  -f rawvideo -pix_fmt gray - | python scripts/analyze.py motion_left.bin

ffmpeg -i video.mp4 -vf "crop=300:200:1000:50,format=gray,tblend=all_mode=difference" \
  -f rawvideo -pix_fmt gray - | python scripts/analyze.py motion_right.bin

# Analyze both to generate speaker timeline
python scripts/analyze.py motion_left.bin motion_right.bin --fps 30 > timeline.txt
```

**Output format (timeline.txt):**

```
0.00-2.34:left
2.34-5.67:right
5.67-8.12:left
```
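To illustrate the idea behind the timeline, here is a minimal sketch. It is not the actual `analyze.py` implementation. It assumes each motion file has already been reduced to one mean motion-energy value per frame, and the names `speaker_timeline` and `format_timeline`, as well as the "larger energy wins" rule, are assumptions made for this example.

```python
def speaker_timeline(left_energy, right_energy, fps=30):
    """Return [(start_s, end_s, side)], picking the side with more motion per frame."""
    segments = []
    for i, (left, right) in enumerate(zip(left_energy, right_energy)):
        side = "left" if left >= right else "right"
        t0, t1 = i / fps, (i + 1) / fps
        if segments and segments[-1][2] == side:
            # Same speaker as the previous frame: extend the open segment.
            segments[-1] = (segments[-1][0], t1, side)
        else:
            segments.append((t0, t1, side))
    return segments


def format_timeline(segments):
    """Render segments in the start-end:side form shown for timeline.txt."""
    return "\n".join(f"{s:.2f}-{e:.2f}:{side}" for s, e, side in segments)


if __name__ == "__main__":
    print(format_timeline(speaker_timeline([5, 5, 1, 1], [1, 1, 5, 5], fps=2)))
```

A per-frame decision like this flips often on noisy footage, which is why a threshold or minimum segment length is usually layered on top.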

build_pan.py — Generate ffmpeg crop expression

```bash
# From speaker timeline, build hard-cut pan expression
python scripts/build_pan.py timeline.txt --left-x 100 --right-x 1000 --width 608 > pan_expr.txt

# Use in ffmpeg crop filter
ffmpeg -i source.mp4 -vf "crop=608:1080:'$(cat pan_expr.txt)':0" output.mp4
```

**Arguments:**
- `--left-x`: X coordinate of left speaker's face center
- `--right-x`: X coordinate of right speaker's face center
- `--width`: Width of the 9:16 crop window (e.g., 608 for 1080p)

**Output:** ffmpeg expression string like:

```
if(between(t,0,2.34),100,if(between(t,2.34,5.67),1000,if(between(t,5.67,8.12),100,1000)))
```
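The nested expression can be produced by folding the timeline from its last segment backwards. The sketch below is one possible way to do that, not the real `build_pan.py`; the function name `build_pan_expr` and the `fallback_x` parameter (the value used outside every segment) are hypothetical.

```python
def build_pan_expr(segments, left_x, right_x, fallback_x=None):
    """Nest one if(between(t,a,b),x,...) term per segment for ffmpeg's crop x."""
    x_for = {"left": left_x, "right": right_x}
    # Innermost value: used when t falls outside every segment (assumed default).
    expr = str(right_x if fallback_x is None else fallback_x)
    for start, end, side in reversed(segments):
        expr = f"if(between(t,{start:g},{end:g}),{x_for[side]},{expr})"
    return expr


if __name__ == "__main__":
    timeline = [(0, 2.34, "left"), (2.34, 5.67, "right"), (5.67, 8.12, "left")]
    print(build_pan_expr(timeline, left_x=100, right_x=1000))
```

ffmpeg's `crop` takes `x` as the left edge of the window. If the inputs are face centers, a real implementation would presumably subtract half the crop width and clamp to the frame. This sketch passes the values through unchanged so that it reproduces the example output.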

build_ass.py — Generate ASS subtitle file

```bash
# From Whisper JSON output, create opus-style captions
python scripts/build_ass.py whisper_output.json --style opus > captions.ass

# Burn into video
ffmpeg -i video.mp4 -vf "ass=captions.ass" output.mp4
```

**Whisper JSON format (input):**
```json
{
  "segments": [
    {
      "start": 0.5,
      "end": 2.3,
      "text": "This is the worst advice",
      "words": [
        {"word": "This", "start": 0.5, "end": 0.7},
        {"word": "is", "start": 0.7, "end": 0.85},
        {"word": "the", "start": 0.85, "end": 1.0},
        {"word": "worst", "start": 1.0, "end": 1.4},
        {"word": "advice", "start": 1.4, "end": 2.3}
      ]
    }
  ]
}
```

Styles:

- `opus`: Bold white text, yellow active-word highlight, centered top
- `karaoke`: Word-by-word color change, bottom positioned
- `minimal`: Clean white text, no highlights
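As a rough sketch of how word timings can become a karaoke caption line, the code below turns one Whisper segment into an ASS `Dialogue` event with a `\k` tag per word. In ASS, `\k` durations are centiseconds counted from the start of the line. The helper names `ass_time` and `karaoke_dialogue` are assumptions for illustration, and the real `build_ass.py` may group or style words differently.

```python
def ass_time(seconds):
    """Format seconds as H:MM:SS.cc, the timestamp form used in ASS events."""
    cs = round(seconds * 100)
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"


def karaoke_dialogue(segment, style="Default"):
    """Build one Dialogue line with a {\\k<centiseconds>} tag per word."""
    parts = []
    for w in segment["words"]:
        dur_cs = round((w["end"] - w["start"]) * 100)
        # Whisper word strings often carry a leading space, so strip them.
        parts.append(f"{{\\k{dur_cs}}}{w['word'].strip()}")
    text = " ".join(parts)
    return (f"Dialogue: 0,{ass_time(segment['start'])},{ass_time(segment['end'])},"
            f"{style},,0,0,0,,{text}")


if __name__ == "__main__":
    segment = {
        "start": 0.5, "end": 2.3,
        "words": [
            {"word": "This", "start": 0.5, "end": 0.7},
            {"word": "is", "start": 0.7, "end": 0.85},
            {"word": "the", "start": 0.85, "end": 1.0},
            {"word": "worst", "start": 1.0, "end": 1.4},
            {"word": "advice", "start": 1.4, "end": 2.3},
        ],
    }
    print(karaoke_dialogue(segment))
```

The sketch assumes words are contiguous. Gaps between words would need to be added to a neighbouring `\k` duration to keep later highlights aligned.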

audio_align.py — Find clip offset in source

```bash
# Find where a 20s clip appears in a 2-hour source video
python scripts/audio_align.py source.mp4 clip.mp4
# Output: 00:15:34.2 (offset timestamp)
```

Uses audio cross-correlation. Useful for re-linking edited clips to source timestamps.
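The principle of cross-correlation alignment can be shown with a naive sliding dot product over two sample lists. This is a sketch under simplifying assumptions (audio already decoded to mono samples at the same rate, unnormalized scores), and `find_offset` is a hypothetical name, not a function from `audio_align.py`.

```python
def find_offset(source, clip, sample_rate):
    """Return the offset in seconds where clip best matches source."""
    n = len(clip)
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(source) - n + 1):
        # Cross-correlation at this lag: dot product of clip and the source window.
        score = sum(s * c for s, c in zip(source[lag:lag + n], clip))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


if __name__ == "__main__":
    pattern = [1, -2, 3, -1, 2]
    source = [0] * 30 + pattern + [0] * 20
    print(find_offset(source, pattern, sample_rate=10))
```

This loop costs O(N·M), so for a 2-hour source a practical version would typically downsample the audio and compute the correlation with an FFT instead.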

Common Patterns

Extract clip manually (without auto-detection)

```bash
# 1. Transcribe with Whisper
whisper source.mp4 --model base --output_format json --output_dir ./

# 2. Cut segment (03:15 to 03:42)
ffmpeg -i source.mp4 -ss 00:03:15 -to 00:03:42 -c copy raw_clip.mp4

# 3. Generate captions for this segment
python scripts/build_ass.py source.json --start 195 --end 222 --style opus > clip.ass

# 4. Reframe to 9:16 with center crop (no pan)
ffmpeg -i raw_clip.mp4 -vf "crop=608:1080:656:0,ass=clip.ass" final_clip.mp4
```
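The numbers in the steps above follow from simple arithmetic, sketched here for a 1920x1080 source (an assumption; the source resolution is not stated). The helpers `to_seconds` and `center_crop` are illustrative names only.

```python
def to_seconds(mmss):
    """Convert 'MM:SS' to whole seconds, e.g. for --start/--end."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def center_crop(src_w, src_h, ratio_w=9, ratio_h=16):
    """Return (w, h, x, y) for a full-height, horizontally centered crop."""
    w = round(src_h * ratio_w / ratio_h)
    w += w % 2  # keep the width even, as yuv420p H.264 encoding requires
    x = (src_w - w) // 2
    return w, src_h, x, 0


if __name__ == "__main__":
    print(to_seconds("03:15"), to_seconds("03:42"))
    print("crop=%d:%d:%d:%d" % center_crop(1920, 1080))
```

For a 1080-pixel-high frame, 1080 × 9 / 16 = 607.5, which becomes the even width 608, and (1920 − 608) / 2 gives the x offset 656 used in the crop filter.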

Two-speaker pan with manual face coordinates

```bash
# 1. Identify face regions on a sample frame
ffplay -ss 00:01:00 source.mp4  # visual inspection
# Left face: x=200, y=100, width=300, height=200
# Right face: x=1100, y=100, width=300, height=200

# 2. Generate motion energy files
ffmpeg -i source.mp4 -ss 00:03:15 -to 00:03:42 \
  -vf "crop=300:200:200:100,format=gray,tblend=all_mode=difference" \
  -f rawvideo -pix_fmt gray motion_left.bin

ffmpeg -i source.mp4 -ss 00:03:15 -to 00:03:42 \
  -vf "crop=300:200:1100:100,format=gray,tblend=all_mode=difference" \
  -f rawvideo -pix_fmt gray motion_right.bin

# 3. Analyze speaker timeline
python scripts/analyze.py motion_left.bin motion_right.bin --fps 30 > timeline.txt

# 4. Build pan expression (faces centered at x=350 and x=1250)
python scripts/build_pan.py timeline.txt --left-x 350 --right-x 1250 --width 608 > pan.txt

# 5. Apply crop with pan
ffmpeg -i source.mp4 -ss 00:03:15 -to 00:03:42 \
  -vf "crop=608:1080:'$(cat pan.txt)':0" panned_clip.mp4
```

Custom subtitle styling

Edit the ASS file generated by `build_ass.py`:

```ass
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,Impact,68,&H00FFFFFF,&H0000FFFF,&H00000000,&H00000000,-1,0,0,0,100,100,0,0,1,3,0,2,10,10,120,1

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:00.50,0:00:02.30,Default,,0,0,0,,{\k20}This {\k15}is {\k15}the {\k40}worst {\k90}advice
```

The `\k` values are word durations in centiseconds measured from the start of the line, so they add up to the 1.8 s line length.

Customize:

- `Fontname`: Impact, Arial, Montserrat
- `Fontsize`: 68 for 1080p vertical
- `PrimaryColour`: `&H00FFFFFF` (white in BGR hex)
- `SecondaryColour`: `&H0000FFFF` (yellow highlight)
- `Outline`: Border thickness (3 = thick black outline)
- `Alignment`: 2=bottom center, 8=top center
- `MarginV`: Vertical margin from edge
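ASS colours are written as `&HAABBGGRR`, with the byte order reversed relative to the usual RGB hex notation. The small helper below, with the hypothetical name `ass_colour`, shows the conversion for the two values used in the style line.

```python
def ass_colour(r, g, b, alpha=0):
    """Convert RGB (0-255 each) to ASS &HAABBGGRR; alpha 0 means fully opaque."""
    return f"&H{alpha:02X}{b:02X}{g:02X}{r:02X}"


if __name__ == "__main__":
    print(ass_colour(255, 255, 255))  # white
    print(ass_colour(255, 255, 0))    # yellow
```

Feeding a web-style hex value such as `FFFF00` straight into the style line would render cyan rather than yellow, which is a common source of confusion when restyling captions.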

Batch processing multiple clips

```python
import subprocess
import json

# Load Whisper transcript
with open("source.json") as f:
    data = json.load(f)

# Define clip ranges
clips = [
    {"start": 154, "end": 171, "title": "clip1"},
    {"start": 492, "end": 509, "title": "clip2"},
    {"start": 903, "end": 924, "title": "clip3"},
]

for clip in clips:
    # Cut raw clip
    subprocess.run([
        "ffmpeg", "-i", "source.mp4",
        "-ss", str(clip["start"]),
        "-to", str(clip["end"]),
        "-c", "copy",
        f"raw_{clip['title']}.mp4"
    ], check=True)

    # Generate captions (the file is closed once the script has written to it)
    with open(f"{clip['title']}.ass", "w") as ass_file:
        subprocess.run([
            "python", "scripts/build_ass.py", "source.json",
            "--start", str(clip["start"]),
            "--end", str(clip["end"]),
            "--style", "opus"
        ], stdout=ass_file, check=True)

    # Reframe and burn captions
    subprocess.run([
        "ffmpeg", "-i", f"raw_{clip['title']}.mp4",
        "-vf", f"crop=608:1080:656:0,ass={clip['title']}.ass",
        f"{clip['title']}_final.mp4"
    ], check=True)
```

Configuration

Hardware acceleration

macOS (default):

```bash
ffmpeg -hwaccel videotoolbox -i input.mp4 ...
```

Linux with NVIDIA:

```bash
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 ...
```

With `-hwaccel_output_format cuda`, decoded frames stay in GPU memory, so software filters such as `crop` and `ass` need an `hwdownload` step in front of them (or drop that flag).

Windows:

```bash
ffmpeg -hwaccel dxva2 -i input.mp4 ...
```

Disable (CPU only): Remove `-hwaccel` flags from SKILL.md ffmpeg commands.

Whisper model size

Faster but less accurate:

```bash
whisper video.mp4 --model tiny  # ~1GB, 10x faster
```

More accurate but slower:

```bash
whisper video.mp4 --model medium  # ~1.5GB, 2x slower
whisper video.mp4 --model large   # ~3GB, 4x slower
```

Default in SKILL.md: `base` (good balance for dialogue).

Output quality settings

High quality (larger file):

```bash
ffmpeg -i input.mp4 -c:v libx264 -preset slow -crf 18 -c:a aac -b:a 192k output.mp4
```

Fast encode (lower quality):

```bash
ffmpeg -i input.mp4 -c:v libx264 -preset veryfast -crf 23 -c:a aac -b:a 128k output.mp4
```

Social media optimized (SKILL.md default):

```bash
ffmpeg -i input.mp4 -c:v libx264 -preset medium -crf 20 -c:a aac -b:a 160k output.mp4
```

Troubleshooting

"No motion detected in face regions"

Face crop coordinates are wrong. Verify on a sample frame:

```bash
# Extract frame at 1 minute mark
ffmpeg -ss 00:01:00 -i source.mp4 -frames:v 1 sample.png

# Overlay crop rectangles (adjust x,y,w,h)
ffmpeg -i source.mp4 -ss 00:01:00 -frames:v 1 \
  -vf "drawbox=x=200:y=100:w=300:h=200:color=red:t=5,drawbox=x=1100:y=100:w=300:h=200:color=blue:t=5" \
  sample_boxes.png
```

Red = left face, blue = right face. Adjust coordinates until boxes frame each person's mouth/chin.

"Captions out of sync"

Whisper timestamps drift on long videos. Use smaller segments:

```bash
# Transcribe only the relevant 5-minute section
ffmpeg -i source.mp4 -ss 00:15:00 -to 00:20:00 -c copy segment.mp4
whisper segment.mp4 --model base --output_format json
```

Or enable Whisper's word-level timestamps:

```bash
whisper video.mp4 --model base --word_timestamps True
```

"Pan cuts too frequently"

Increase motion threshold in `analyze.py`:

```python
# Default threshold
MOTION_THRESHOLD = 0.1

# Higher = less sensitive (fewer speaker changes)
MOTION_THRESHOLD = 0.3
```

Or add minimum duration between cuts in `build_pan.py`:

```python
# Ignore speaker changes shorter than 2 seconds
MIN_SEGMENT_DURATION = 2.0
```
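One way a minimum segment duration could be enforced is to fold short segments into the one before them. The sketch below is an assumption about how such a filter might look, not code from `build_pan.py`; `drop_short_segments` is a hypothetical name, and segments are `(start, end, side)` tuples.

```python
def drop_short_segments(segments, min_duration=2.0):
    """Absorb segments shorter than min_duration into the previous segment."""
    merged = []
    for start, end, side in segments:
        if merged and (end - start < min_duration or merged[-1][2] == side):
            # Too short to cut to, or the same speaker again: extend the previous one.
            merged[-1] = (merged[-1][0], end, merged[-1][2])
        else:
            merged.append((start, end, side))
    return merged


if __name__ == "__main__":
    timeline = [(0, 3, "left"), (3, 3.5, "right"), (3.5, 6, "left"), (6, 9, "right")]
    print(drop_short_segments(timeline))
```

In this version a short opening segment is kept as-is because there is nothing earlier to merge it into.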

"ffmpeg not found"

Ensure ffmpeg is in PATH:

```bash
which ffmpeg
# Should print: /usr/local/bin/ffmpeg or similar

# If not installed:
brew install ffmpeg       # macOS
sudo apt install ffmpeg   # Ubuntu
```

"Whisper import error"

```bash
# Uninstall conflicting whisper packages
pip uninstall whisper openai-whisper

# Reinstall correct package
pip install openai-whisper

# Verify
python -c "import whisper; print(whisper.__version__)"
```

"VideoToolbox acceleration failed"

macOS-specific. Fall back to CPU:

```bash
# Remove -hwaccel videotoolbox from all ffmpeg commands
ffmpeg -i input.mp4 ...  # (no -hwaccel flag)
```

Or use software decode explicitly:

```bash
ffmpeg -hwaccel none -i input.mp4 ...
```

Advanced: Integration with Custom Workflows

Use Clipify detection with external editor

```python
# 1. Run Clipify's detection logic (without cutting)
# This would be in your custom script:
import subprocess
import json

# Transcribe
subprocess.run(["whisper", "source.mp4", "--model", "base", "--output_format", "json"])

# Load transcript
with open("source.json") as f:
    transcript = json.load(f)

# Simple punchline detector (look for laughter indicators)
candidates = []
for segment in transcript["segments"]:
    text = segment["text"].lower()
    if any(word in text for word in ["haha", "lol", "crazy", "worst", "insane"]):
        candidates.append({
            "start": segment["start"],
            "end": segment["end"],
            "text": segment["text"]
        })

print(json.dumps(candidates, indent=2))

# Output to DaVinci Resolve, Premiere, etc.
```

Export timeline for manual editing

```python
# Generate EDL (Edit Decision List) from Clipify candidates
def frames_to_tc(frames, fps):
    h = frames // (fps * 3600)
    m = (frames % (fps * 3600)) // (fps * 60)
    s = (frames % (fps * 60)) // fps
    f = frames % fps
    return f"{h:02d}:{m:02d}:{s:02d}:{f:02d}"

def to_edl(clips, fps=30):
    edl = ["TITLE: Clipify Export", "FCM: NON-DROP FRAME", ""]
    record_in = 0  # running position on the output timeline, in frames

    for i, clip in enumerate(clips, 1):
        src_in = int(clip["start"] * fps)
        src_out = int(clip["end"] * fps)
        # Record out must span the same number of frames as the source range
        record_out = record_in + (src_out - src_in)

        edl.append(
            f"{i:03d}  AX       V     C        "
            f"{frames_to_tc(src_in, fps)} {frames_to_tc(src_out, fps)} "
            f"{frames_to_tc(record_in, fps)} {frames_to_tc(record_out, fps)}"
        )
        record_in = record_out

    return "\n".join(edl)

# Usage
clips = [{"start": 154.2, "end": 171.8}, {"start": 492.5, "end": 509.1}]
print(to_edl(clips))
```

Custom caption animations

Modify ASS file for animated entrances:

```ass
Dialogue: 0,0:00:00.50,0:00:02.30,Default,,0,0,0,,{\fad(200,200)\move(640,1000,640,900)}This is animated text
```

- `\fad(200,200)`: 200ms fade in/out
- `\move(x1,y1,x2,y2)`: Slide from bottom to center
- `\t(0,500,\fscx120\fscy120)`: Scale animation over 500ms

Performance Tips

Use proxy files for preview: Transcode to lower resolution before Clipify analysis

```bash
ffmpeg -i source.mp4 -vf scale=960:540 -c:v libx264 -crf 28 proxy.mp4
```

Skip transcription on re-runs: Cache Whisper JSON output

```bash
if [ ! -f source.json ]; then
  whisper source.mp4 --model base --output_format json
fi
```

Parallel clip rendering: Process multiple clips simultaneously

```bash
for clip in clip1 clip2 clip3; do
  ffmpeg -i "$clip.mp4" -vf "..." "${clip}_out.mp4" &
done
wait
```

GPU acceleration: Use NVIDIA NVENC for faster encoding. NVENC does not honor libx264's `-crf`; its constant-quality setting is `-cq`.

```bash
ffmpeg -hwaccel cuda -i input.mp4 -c:v h264_nvenc -preset p4 -rc vbr -cq 20 output.mp4
```