Loading...
Loading...
AI-assisted video editing workflows for cutting, structuring, and augmenting real footage. Covers the full pipeline from raw capture through FFmpeg, Remotion, ElevenLabs, fal.ai, and final polish in Descript or CapCut. Use when the user wants to edit video, cut footage, create vlogs, or build video content.
npx skill4agent add affaan-m/everything-claude-code video-editingScreen Studio / raw footage
→ Claude / Codex
→ FFmpeg
→ Remotion
→ ElevenLabs / fal.ai
→ Descript or CapCutvideodbExample prompt:
"Here's the transcript of a 4-hour recording. Identify the 8 strongest segments
for a 24-minute vlog. Give me FFmpeg cut commands for each segment."ffmpeg -i raw.mp4 -ss 00:12:30 -to 00:15:45 -c copy segment_01.mp4#!/bin/bash
# cuts.txt: start,end,label
while IFS=, read -r start end label; do
ffmpeg -i raw.mp4 -ss "$start" -to "$end" -c copy "segments/${label}.mp4"
done < cuts.txt# Create file list
for f in segments/*.mp4; do echo "file '$f'"; done > concat.txt
ffmpeg -f concat -safe 0 -i concat.txt -c copy assembled.mp4ffmpeg -i raw.mp4 -vf "scale=960:-2" -c:v libx264 -preset ultrafast -crf 28 proxy.mp4ffmpeg -i raw.mp4 -vn -acodec pcm_s16le -ar 16000 audio.wavffmpeg -i segment.mp4 -af loudnorm=I=-16:TP=-1.5:LRA=11 -c:v copy normalized.mp4import { AbsoluteFill, Sequence, Video, useCurrentFrame } from "remotion";
export const VlogComposition: React.FC = () => {
const frame = useCurrentFrame();
return (
<AbsoluteFill>
{/* Main footage */}
<Sequence from={0} durationInFrames={300}>
<Video src="/segments/intro.mp4" />
</Sequence>
{/* Title overlay */}
<Sequence from={30} durationInFrames={90}>
<AbsoluteFill style={{
justifyContent: "center",
alignItems: "center",
}}>
<h1 style={{
fontSize: 72,
color: "white",
textShadow: "2px 2px 8px rgba(0,0,0,0.8)",
}}>
The AI Editing Stack
</h1>
</AbsoluteFill>
</Sequence>
{/* Next segment */}
<Sequence from={300} durationInFrames={450}>
<Video src="/segments/demo.mp4" />
</Sequence>
</AbsoluteFill>
);
};npx remotion render src/index.ts VlogComposition output.mp4import os
import requests
resp = requests.post(
f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}",
headers={
"xi-api-key": os.environ["ELEVENLABS_API_KEY"],
"Content-Type": "application/json"
},
json={
"text": "Your narration text here",
"model_id": "eleven_turbo_v2_5",
"voice_settings": {"stability": 0.5, "similarity_boost": 0.75}
}
)
with open("voiceover.mp3", "wb") as f:
f.write(resp.content)fal-ai-mediagenerate(app_id: "fal-ai/nano-banana-pro", input_data: {
"prompt": "professional thumbnail for tech vlog, dark background, code on screen",
"image_size": "landscape_16_9"
})voiceover = coll.generate_voice(text="Narration here", voice="alloy")
music = coll.generate_music(prompt="lo-fi background for coding vlog", duration=120)
sfx = coll.generate_sound_effect(prompt="subtle whoosh transition")| Platform | Aspect Ratio | Resolution |
|---|---|---|
| YouTube | 16:9 | 1920x1080 |
| TikTok / Reels | 9:16 | 1080x1920 |
| Instagram Feed | 1:1 | 1080x1080 |
| X / Twitter | 16:9 or 1:1 | 1280x720 or 720x720 |
# 16:9 to 9:16 (center crop)
ffmpeg -i input.mp4 -vf "crop=ih*9/16:ih,scale=1080:1920" vertical.mp4
# 16:9 to 1:1 (center crop)
ffmpeg -i input.mp4 -vf "crop=ih:ih,scale=1080:1080" square.mp4from videodb import ReframeMode
# Smart reframe (AI-guided subject tracking)
reframed = video.reframe(start=0, end=60, target="vertical", mode=ReframeMode.smart)# Detect scene changes (threshold 0.3 = moderate sensitivity)
ffmpeg -i input.mp4 -vf "select='gt(scene,0.3)',showinfo" -vsync vfr -f null - 2>&1 | grep showinfo# Find silent segments (useful for cutting dead air)
ffmpeg -i input.mp4 -af silencedetect=noise=-30dB:d=2 -f null - 2>&1 | grep silence"Given this transcript with timestamps and these scene change points,
identify the 5 most engaging 30-second clips for social media."| Tool | Strength | Weakness |
|---|---|---|
| Claude / Codex | Organization, planning, code generation | Not the creative taste layer |
| FFmpeg | Deterministic cuts, batch processing, format conversion | No visual editing UI |
| Remotion | Programmable overlays, composable scenes, reusable templates | Learning curve for non-devs |
| Screen Studio | Polished screen recordings immediately | Only screen capture |
| ElevenLabs | Voice, narration, music, SFX | Not the center of the workflow |
| Descript / CapCut | Final pacing, captions, polish | Manual, not automatable |
fal-ai-mediavideodbcontent-engine