AI media generation CLI tool using Google's Imagen 4, Veo 3.1, and Gemini TTS. Use when the user wants to (1) generate images from text prompts, (2) edit existing images with AI, (3) explain image contents, (4) generate videos from text or images, (5) create narration/voice audio with character settings. Triggers on requests like "generate an image of...", "create a video...", "make a voice that says...", "edit this image to...", "describe this image".
npx skill4agent add hirokidaichi/ergon ergon

Note: Run with `npx ergon` if not installed globally.

npx ergon image gen "<prompt>" -t <style> -a <ratio> # Image generation
npx ergon image edit <file> "<instruction>" # Image editing
npx ergon video gen "<prompt>" [-i <image>] # Video with audio
npx ergon narration gen "<text>" -c "<character>" # Voice generation

npx ergon image gen [options] <theme>

| Use Case | Style (`-t`) | Aspect (`-a`) |
|---|---|---|
| Product photo, landscape | | 16:9, 4:3 |
| Character, mascot | | 1:1, 3:4 |
| Icon, logo | | 1:1 |
| Art, poster | | varies |
| Game asset | | 1:1 |
| Business, presentation | | 16:9 |
| Concept sketch | | varies |

| Option | Values | Default |
|---|---|---|
| Style (`-t`) | realistic, illustration, flat, anime, watercolor, oil-painting, pixel-art, sketch, 3d-render, corporate, minimal, pop-art | flat |
| Aspect ratio (`-a`) | 16:9, 4:3, 1:1, 9:16, 3:4 | 16:9 |
| Size | tiny, hd, fullhd, 2k, 4k | fullhd |
| Model | imagen4, imagen4-fast, imagen4-ultra | imagen4 |

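The allowed style and aspect values in the table above can be checked before any API call is made. The following is a minimal sketch, not part of ergon: `contains` and `image_gen_cmd` are hypothetical helper names, and the function only prints a command line for review instead of running it.

```shell
#!/bin/sh
# Hypothetical pre-flight helper; not part of ergon itself.
# Allowed values are copied from the option table above.
STYLES="realistic illustration flat anime watercolor oil-painting pixel-art sketch 3d-render corporate minimal pop-art"
ASPECTS="16:9 4:3 1:1 9:16 3:4"

# contains LIST ITEM: succeeds when ITEM is exactly one of the
# space-separated words in LIST. Empty or multi-word items are rejected.
contains() {
  case "$2" in
    ''|*' '*) return 1 ;;
  esac
  case " $1 " in
    *" $2 "*) return 0 ;;
  esac
  return 1
}

# image_gen_cmd PROMPT [STYLE] [ASPECT]
# Prints the command line, or returns 1 with a message on stderr.
# Defaults (flat, 16:9) follow the table. Double quotes inside PROMPT are
# not escaped, so treat the output as text to review, not to eval.
image_gen_cmd() {
  ig_style="${2:-flat}"
  ig_aspect="${3:-16:9}"
  contains "$STYLES" "$ig_style" || { echo "unknown style: $ig_style" >&2; return 1; }
  contains "$ASPECTS" "$ig_aspect" || { echo "unknown aspect ratio: $ig_aspect" >&2; return 1; }
  printf 'npx ergon image gen "%s" -t %s -a %s\n' "$1" "$ig_style" "$ig_aspect"
}

image_gen_cmd "cute cat mascot for tech startup" anime 1:1
# prints: npx ergon image gen "cute cat mascot for tech startup" -t anime -a 1:1
```

Size and model are left out of the sketch because their flag names are not shown in this document.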
npx ergon image gen "cute cat mascot for tech startup" -t anime -a 1:1
npx ergon image gen "professional team meeting in modern office" -t corporate -a 16:9
npx ergon image gen "abstract geometric logo" -t minimal -a 1:1 -o logo.png

npx ergon image edit [options] <file> <prompt>

npx ergon image edit photo.jpg "change background to blue sky"
npx ergon image edit portrait.png "convert to anime style"

npx ergon video gen [options] <theme>

# Sound effects included
npx ergon video gen "cat meowing and playing with a ball, soft purring sounds"
# Music/ambient audio
npx ergon video gen "sunset timelapse over ocean, with calming wave sounds and soft piano music"
# Dialogue/voice
npx ergon video gen "person saying 'welcome to our channel' with friendly tone, waving at camera"

npx ergon video gen "character starts dancing to upbeat music" -i character.png
npx ergon video gen "logo reveals with whoosh sound effect" -i logo.png

| Option | Values | Default |
|---|---|---|
| Start image (`-i`) | image file | - |
| Duration | 5-8 seconds | 8 |
| Aspect ratio (`-a`) | 16:9, 9:16 | 16:9 |
| Fast mode | use Veo 3.1 Fast | false |

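Video generation accepts fewer aspect ratios than image generation, and `-i` needs an existing file, so both are worth checking up front. The sketch below assumes `-a` is accepted by `video gen` as the table suggests; `video_gen_cmd` is a hypothetical helper name, and it prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical pre-flight helper; not part of ergon itself.

# video_gen_cmd PROMPT [ASPECT] [IMAGE]
# Prints the command line, or returns 1 with a message on stderr.
# ASPECT defaults to 16:9 (per the table); IMAGE is optional.
video_gen_cmd() {
  vg_aspect="${2:-16:9}"
  vg_image="${3:-}"

  # Only the two ratios listed for video are allowed.
  case "$vg_aspect" in
    16:9|9:16) ;;
    *) echo "video aspect must be 16:9 or 9:16, got: $vg_aspect" >&2; return 1 ;;
  esac

  # A start image, when given, must exist on disk.
  if [ -n "$vg_image" ] && [ ! -f "$vg_image" ]; then
    echo "start image not found: $vg_image" >&2
    return 1
  fi

  if [ -n "$vg_image" ]; then
    printf 'npx ergon video gen "%s" -a %s -i "%s"\n' "$1" "$vg_aspect" "$vg_image"
  else
    printf 'npx ergon video gen "%s" -a %s\n' "$1" "$vg_aspect"
  fi
}

video_gen_cmd "sunset timelapse over ocean, with calming wave sounds" 9:16
# prints: npx ergon video gen "sunset timelapse over ocean, with calming wave sounds" -a 9:16
```

Duration and fast mode are not handled here because their flag names are not shown in this document.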
`-a 9:16`

npx ergon narration gen [options] <text>

`-c`, `-d`

# Character defines WHO is speaking
npx ergon narration gen "Let's go on an adventure!" -c "energetic young girl"
# Direction defines HOW they speak
npx ergon narration gen "The results are in..." -c "news anchor" -d "serious, building suspense"
# Combined for full expression
npx ergon narration gen "Yay! We did it!" -c "excited child" -d "jumping with joy, high energy"

| Voice | Character |
|---|---|
| Kore | Female, versatile (default) |
| Aoede | Female, warm |
| Charon | Male, deep |
| Fenrir | Male, strong |
| Puck | Neutral, playful |

| Option | Values | Default |
|---|---|---|
| Voice | Kore, Aoede, Charon, Fenrir, Puck | Kore |
| Character (`-c`) | character description | - |
| Direction (`-d`) | acting direction | - |
| Speed | 0.25-4.0 | 1.0 |
| Language | ja, en, zh, ko, etc. | ja |

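When several lines of narration share the same pattern, the character (`-c`) and direction (`-d`) split can be kept in a small script. This is a rough sketch under stated assumptions: `narration_cmd`, `narration_batch`, and the `text|character|direction` record format are all hypothetical, and the script prints commands for review instead of calling the API.

```shell
#!/bin/sh
# Hypothetical batch helper; not part of ergon itself.

# narration_cmd TEXT [CHARACTER] [DIRECTION]
# Prints one command line; -c and -d are added only when non-empty.
# Double quotes inside the arguments are not escaped.
narration_cmd() {
  nc_out=$(printf 'npx ergon narration gen "%s"' "$1")
  if [ -n "${2:-}" ]; then nc_out="$nc_out -c \"$2\""; fi
  if [ -n "${3:-}" ]; then nc_out="$nc_out -d \"$3\""; fi
  printf '%s\n' "$nc_out"
}

# narration_batch: reads "text|character|direction" records from stdin,
# one per line, and prints a command for each. Blank lines are skipped.
narration_batch() {
  while IFS='|' read -r nb_text nb_char nb_dir; do
    [ -n "$nb_text" ] || continue
    narration_cmd "$nb_text" "$nb_char" "$nb_dir"
  done
}

narration_batch <<'EOF'
Let's go on an adventure!|energetic young girl|
The results are in...|news anchor|serious, building suspense
EOF
# prints:
# npx ergon narration gen "Let's go on an adventure!" -c "energetic young girl"
# npx ergon narration gen "The results are in..." -c "news anchor" -d "serious, building suspense"
```

Voice, speed, and language are omitted because their flag names are not shown in this document.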
npx ergon image gen "product photo of headphones" -t realistic
npx ergon image edit headphones.png "add soft shadow, white background"

npx ergon image gen "mascot character standing" -t anime -a 1:1
npx ergon video gen "mascot waves and says hello cheerfully" -i mascot.png

npx ergon image gen "complex scene" --dry-run # Check settings
npx ergon video gen "expensive render" --dry-run # Verify before API call

--json

--dry-run

-o, --output <path>
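The image-to-video workflow and the `--dry-run` check can be combined into one script. The sketch below is illustrative only: `mascot_pipeline` is a hypothetical name, it assumes `--dry-run` can be combined with the other flags, and it passes `-o` explicitly so the file given to `-i` is the one the first step writes.

```shell
#!/bin/sh
# Hypothetical workflow helper; not part of ergon itself.

# mascot_pipeline IMAGE_FILE [check]
# Prints the two-step image -> video workflow. With "check" as the second
# argument, --dry-run is appended to both commands so settings can be
# verified before any API call is made.
mascot_pipeline() {
  mp_extra=""
  if [ "${2:-}" = "check" ]; then
    mp_extra=" --dry-run"
  fi
  printf 'npx ergon image gen "mascot character standing" -t anime -a 1:1 -o %s%s\n' "$1" "$mp_extra"
  printf 'npx ergon video gen "mascot waves and says hello cheerfully" -i %s%s\n' "$1" "$mp_extra"
}

mascot_pipeline mascot.png check
# prints:
# npx ergon image gen "mascot character standing" -t anime -a 1:1 -o mascot.png --dry-run
# npx ergon video gen "mascot waves and says hello cheerfully" -i mascot.png --dry-run
```

Running the function without `check` prints the same two commands without `--dry-run`.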