Loading...
Loading...
Use when the user has a long-form video (interview / lecture / podcast / conversation) and a transcript SRT, and wants to extract 3–6 stand-alone topical short clips from it. This skill ONLY cuts and crops — it produces raw clips + per-clip SRTs as a hand-off package for downstream post-production (`/wjs-overlaying-video`). Triggers — "切成几段", "分主题", "拆成短视频", "切片", "topic segments", "split into clips".
npx skill4agent add jianshuo/claude-skills wjs-segmenting-video/wjs-overlaying-video/wjs-overlaying-videoffmpeg -ss A -to B/wjs-transcribing-audio/wjs-translating-subtitles/wjs-editing-multicam| Is | Is not |
|---|---|
| You (the agent) read the full SRT and decide the topic boundaries | A script that runs NLP topic modeling, silence detection, or "viral moment" scoring. Topic boundaries are semantic; competing tools (Descript, OpusClip, Riverside Magic Clips) all get this wrong by automating it. |
| An end-to-end "magic" pipeline |
| Accurate-seek cuts by default (re-encode) — clip starts EXACTLY at requested timestamp | Stream-copy cuts (those produce keyframe-snap drift up to GOP duration) |
| Hands off raw cropped clips + per-clip SRTs | Burned subtitles, covers, intros, CTAs (those live in |
long video + SRT
↓ (agent reads SRT, decides topics — judgment, not parsing)
segments.json
↓ segment.py --reencode (accurate seek; clip starts exactly at requested t)
clip_NN.mp4 + frame_NN.jpg
↓ ASK: target platform orientation match source?
↓ /wjs-reframing-video on each clip (if 16:9 → 9:16, etc.)
↓ re-extract frames from cropped clips
clip_NN.mp4 (now in target orientation) + clip_NN.zh-CN.burn.srt
↓
HAND OFF → /wjs-overlaying-video
(does covers + captions + illustrations + CTA + final render)segments.jsonreferences/segments_schema.jsonreferences/example_segments.json{
"source_video": "input.mp4",
"source_srt": "input.zh-CN.srt",
"platform": "wechat_channels",
"segments": [{
"id": 1, "slug": "intent-not-code",
"title": "AI 时代不是写代码\n而是写意图",
"summary": "Two-sentence pitch — what's the insight, what's at stake.",
"start": "00:00:43.460", "end": "00:02:35.220",
"cover_prompt": "Visual concept for gpt-image-2 (style anchor, not literal scene)"
}]
}slugtitle\ncover_prompt/wjs-overlaying-videopython3 ~/.claude/skills/wjs-segmenting-video/scripts/segment.py \
--segments segments.json --out output/ --reencode--reencodeffmpeg -ss N -i src -c:v libx264 -c:a aacoutput/frame_NN_slug.jpg--reencodeffmpeg -ss N -c copy-force_key_frames# Build the comma-separated keyframe list from segments.json
KF=$(python3 -c "import json; s=json.load(open('segments.json'))
ts=[]
for seg in s['segments']:
ts += [seg['start'], seg['end']]
print(','.join(ts))")
# Re-encode master once, forcing keyframes at all segment boundaries
ffmpeg -i master.mp4 \
-c:v libx264 -preset medium -crf 18 \
-force_key_frames "$KF" \
-c:a copy master_kf.mp4
# Now stream-copy cuts land exactly:
python3 segment.py --segments segments.json --source master_kf.mp4 --out output/--reencodeffprobe -v error -select_streams v:0 -read_intervals "$((N-2))%$((N+5))" \
-show_entries packet=pts_time,flags -of csv=p=0 master.mp4 | grep "K_"360.023,K__ 362.023,K__-c copyrequested_start − nearest_preceding_keyframe--reencode| Platform | Native orientation | Aspect |
|---|---|---|
| 视频号 (WeChat Channels) | vertical | 9:16 |
| 抖音 / TikTok / Reels | vertical | 9:16 |
| 小红书 (Xiaohongshu video) | vertical | 9:16 |
| YouTube Shorts | vertical | 9:16 |
| YouTube (regular) | horizontal | 16:9 |
| B站 (Bilibili) | horizontal | 16:9 |
ffprobeffprobe -v error -select_streams v:0 \
-show_entries stream=width,height -of csv=p=0 clip_01_*.mp4源视频是横屏 (1920×1080),平台 视频号 需要竖屏 (9:16)。是否对每段 调用转成竖屏?(crop 会用 MediaPipe 跟踪正在说话 的人的脸,保持说话人始终在画面中)/wjs-reframing-video
/wjs-reframing-videomediapipe + opencv + numpyuv venv --python 3.12 /tmp/_crop_venv
/tmp/_crop_venv/bin/python -m pip install mediapipe opencv-python numpyfor n in 01 02 03 04 05; do
slug=$(ls clip_${n}_*.mp4 | grep -v -E "_intro|_burned|_vert" | head -1 | sed -E "s/clip_${n}_(.+)\.mp4/\1/")
/tmp/_crop_venv/bin/python ~/.claude/skills/wjs-reframing-video/scripts/crop.py \
"clip_${n}_${slug}.mp4" \
--out "clip_${n}_${slug}_vert.mp4" \
--target portrait \
--bitrate 8M # 视频号 caps at 10Mbps
donemkdir -p _horizontal_archive
for n in 01 02 03 04 05; do
base=$(ls clip_${n}_*_vert.mp4 | sed -E "s/_vert\.mp4$//")
mv "${base}.mp4" "_horizontal_archive/"
mv "${base}_vert.mp4" "${base}.mp4"
# Re-extract midpoint frame:
mid=$(ffprobe -v error -show_entries format=duration -of csv=p=0 "${base}.mp4" | awk '{print $1/2}')
slug=$(echo "$base" | sed -E "s/^clip_${n}_//")
ffmpeg -hide_banner -loglevel error -ss "$mid" -i "${base}.mp4" \
-frames:v 1 -q:v 3 "frame_${n}_${slug}.jpg" -y
doneface#0: 9.6s on screen (9%)python3 ~/.claude/skills/wjs-segmenting-video/scripts/burn_subs.py \
--segments segments.json --out output/ --no-burn--no-burnclip_NN_slug.zh-CN.burn.srt/wjs-overlaying-videoburn_subs.py--no-burn/wjs-overlaying-video/wjs-overlaying-videooutput/
clip_NN_slug.mp4 # raw cropped clip (target orientation, no subs, no cover)
clip_NN_slug.zh-CN.burn.srt # per-clip SRT, timestamps shifted to start at 0
frame_NN_slug.jpg # midpoint frame (cover reference)
segments.json # for slug/title/summary/cover_prompt metadata/wjs-overlaying-video| Task | Command |
|---|---|
| Cut clips (accurate, default) | |
| Probe source aspect | |
| Convert orientation (ask first) | invoke |
| Slice per-clip SRTs | |
| Diagnose keyframe positions | |
--force_key_frames--reencode/wjs-overlaying-video/wjs-transcribing-audio/wjs-translating-subtitles/wjs-reframing-video/wjs-editing-multicam/wjs-overlaying-videoscripts/segment.pyscripts/burn_subs.py--no-burn/wjs-overlaying-videoreferences/segments_schema.jsonreferences/example_segments.json