Generate videos using ComfyUI with Wan 2.2, FramePack, or AnimateDiff. Handles image-to-video, text-to-video, talking heads, and motion-controlled animation. Use when creating any video content from character images or text descriptions.
npx skill4agent add mckruz/comfyui-expert comfyui-video-pipeline

VIDEO REQUEST
|
|-- Need film-level quality?
| |-- Yes + 24GB+ VRAM → Wan 2.2 MoE 14B
| |-- Yes + 8GB VRAM → Wan 2.2 1.3B
|
|-- Need long video (>10 seconds)?
| |-- Yes → FramePack (60 seconds on 6GB)
|
|-- Need fast iteration?
| |-- Yes → AnimateDiff Lightning (4-8 steps)
|
|-- Need camera/motion control?
| |-- Yes → AnimateDiff V3 + Motion LoRAs
|
|-- Need first+last frame control?
| |-- Yes → Wan 2.2 MoE (exclusive feature)
|
|-- Default → Wan 2.2 (best general quality)

- wan2.1_i2v_720p_14b_bf16.safetensors → models/diffusion_models/
- umt5_xxl_fp8_e4m3fn_scaled.safetensors → models/clip/
- open_clip_vit_h_14.safetensors → models/clip_vision/
- wan_2.1_vae.safetensors → models/vae/

| Parameter | Value | Notes |
|---|---|---|
| Resolution | 1280x720 (landscape) or 720x1280 (portrait) | Native training resolution |
| Frames | 81 (~5 seconds at 16fps) | Multiples of 4 + 1 |
| Steps | 30-50 | Higher = better quality |
| CFG | 5-7 | |
| Sampler | uni_pc | Recommended for Wan |
| Scheduler | normal | |

| Duration | Frames (16fps) |
|---|---|
| 1 second | 17 |
| 3 seconds | 49 |
| 5 seconds | 81 |
| 10 seconds | 161 |
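
Every row in the duration table fits frames = 16 × seconds + 1, which also satisfies the "multiples of 4 + 1" rule. A minimal sketch of that arithmetic follows. The function names are hypothetical, and rounding up to the next valid count for durations not listed in the table is an assumption, not documented Wan behavior.

```python
def wan_frame_count(seconds: float, fps: int = 16) -> int:
    """Return a frame count of the form 4k + 1 that covers `seconds` of video.

    Hypothetical helper: the 16 fps default and the 4k + 1 rule come from
    the tables above; rounding up for in-between durations is an assumption.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    raw = round(seconds * fps)   # frames needed at the native frame rate
    k = -(-raw // 4)             # ceiling division to the next multiple of 4
    return 4 * k + 1             # one extra frame on top of the multiple of 4


def is_valid_frame_count(frames: int) -> bool:
    """Check the 'multiples of 4 + 1' rule from the parameter table."""
    return frames >= 1 and (frames - 1) % 4 == 0


if __name__ == "__main__":
    for seconds in (1, 3, 5, 10):
        print(seconds, wan_frame_count(seconds))
```

For the durations in the table this reproduces 17, 49, 81, and 161 frames.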
- Model: wan2.1_t2v_14b_bf16.safetensors
- Node: EmptySD3LatentImage

| Parameter | Value | Notes |
|---|---|---|
| Resolution | 640x384 to 1280x720 | Depends on VRAM |
| Duration | Up to 60 seconds | VRAM-invariant |
| Quality | High (comparable to Wan) | Uses same base models |

| Parameter | Value (Standard) | Value (Lightning) |
|---|---|---|
| Motion Module | | |
| Steps | 20-25 | 4-8 |
| CFG | 7-8 | 1.5-2.0 |
| Sampler | euler_ancestral | lcm |
| Resolution | 512x512 | 512x512 |
| Context Length | 16 | 16 |
| Context Overlap | 4 | 4 |

| LoRA | Motion |
|---|---|
| v2_lora_ZoomIn | Camera zooms in |
| v2_lora_ZoomOut | Camera zooms out |
| v2_lora_PanLeft | Camera pans left |
| v2_lora_PanRight | Camera pans right |
| v2_lora_TiltUp | Camera tilts up |
| v2_lora_TiltDown | Camera tilts down |
| v2_lora_RollingClockwise | Camera rolls clockwise |

Input (16fps) → RIFE 2x → Output (32fps)
Input (16fps) → RIFE 4x → Output (64fps)

RIFE models: rife47, rife49

frame_rate: 16 (native) or 24/30 (after interpolation)
format: "video/h264-mp4"
crf: 19 (high quality) to 23 (smaller file)

1. Generate audio → comfyui-voice-pipeline
2. Generate base video → This skill (Wan I2V or AnimateDiff)
   - Prompt: "{character}, talking naturally, slight head movement"
   - Duration: match audio length
3. Apply lip-sync → Wav2Lip or LatentSync
4. Enhance faces → FaceDetailer + CodeFormer
5. Final output → video-assembly

- references/workflows.md
- references/models.md
- references/research-2025.md
- state/inventory.json