Loading...
Loading...
Found 68 Skills
Use this skill for AI text-to-speech generation. Triggers include: "generate voice", "create audio", "text to speech", "TTS", "read this aloud", "generate narration", "create voiceover", "synthesize speech", "podcast audio", "dialogue audio", "multi-speaker", "audiobook" Supports Google Gemini TTS, ElevenLabs, and OpenAI TTS.
Multimodal image processing skill, supporting text-to-image, image-to-image, image-to-text, long image stitching, marketing material packs, product design images, element disassembly diagrams, and social media image sets. Triggered when the user mentions keywords such as "draw", "generate image", "draw XX", "image processing", "image-to-image", "OCR", "image recognition", "stitch long image", "infographic", "illustration", "product image", "material pack", "marketing material", "detail page", "e-commerce image", "design drawing", "exploded view", "disassembly", "image set", "nine-grid", etc. Note: If the user requests a video (including illustrations + voiceover), use the video-creator skill instead.
Use when the user asks for text-to-speech narration or voiceover, accessibility reads, audio prompts, or batch speech generation via the OpenAI Audio API; run the bundled CLI (`scripts/text_to_speech.py`) with built-in voices and require `OPENAI_API_KEY` for live calls. Custom voice creation is out of scope.
Use this skill to create single-voice audio content like audiobooks, voiceovers, narrations, jingles, and audio ads. Triggers: "create audiobook", "generate voiceover", "narration", "audio ad", "radio ad", "jingle", "brand audio", "sonic logo", "text to audio", "read this aloud", "audio guide", "meditation audio", "soundscape" Orchestrates: narration/TTS, background music, and audio assembly. NOTE: For conversations/dialogues, use podcast-producer instead.
Create production-grade motion graphics and videos using Remotion (React). Use whenever the user wants branded video content, product demos, data-driven video generation, or motion graphics with audio sync, web fonts, TailwindCSS styling, or media embedding. Covers: marketing videos, product launches, data visualizations, social media content, personalized video at scale, explainer videos with voiceover, animated charts, 3D scenes via Three.js. Requires Node.js and Claude Code environment. Trigger on: "create a Remotion video", "React video", "motion graphics", "branded video", "product demo video", "remotion", "video with audio", "TailwindCSS video", "data-driven video generation", "personalized video at scale", "video with voiceover". For mathematical animations, algorithm visualizations, or headless container rendering, use concept-to-video (Manim) instead.
Full video production workflow for Remotion projects. Teaches how to orchestrate MCP tools (TTS, music, SFX, stock footage, video analysis) into complete Remotion compositions. Use this skill whenever producing a video that needs audio, voiceovers, music, stock footage, or analyzing existing video files.
Eva-skill: A Thinking Coaching & Viral Short Video Toolkit for Creators. It helps creators organize their desire to express, aids thinking with creator thinking tools, thinking lenses and MBTI lenses, reframes superficial problems, deconstructs concepts, expands content directions, and completes the production of voiceover short videos for platforms like Xiaohongshu, Douyin, and Video Account. It covers modules including Thinking Assistant, Creator Thinking Tool Library, Thinking Lenses, MBTI Lens, Superficial Problem Reframing, Viral Topic Selection, Viral Case Deconstruction, Title Anchor, User Question Validation, Title & Cover, Voiceover Script, Material Retrieval, Voiceover Performance, Post-Publishing Review, Sedimentation Mechanism, Fallback Mechanism, and Interactive Tone & Rhythm. Trigger Methods: /eva, /thinking-flow, /viral-short-video, /eva-think, /eva-lens, /eva-lenses, /eva-mbti, /eva-mbti-lens, /eva-reframe, /eva-shortvideo, /eva-topic, /eva-deconstruct, /eva-title-cover, /eva-script, /eva-performance, /eva-review, /eva-sediment, "Help me think", "My mind is messy", "Thinking Lens", "Scholar's Perspective", "MBTI Lens", "MBTI Lens", "MBTI", "Personality Type", "Reframe Problem", "Deconstruct Concept", "Help me make a voiceover video", "How to make this Xiaohongshu video", "Help me write a video script", "Help me review data", "Sediment", "Save", "Archive", "Continue next time"
Process videos with the VideoDB Python SDK. Handles trimming, combining clips, audio overlays, background music, subtitles, transcription, voiceover, text/image overlays, transcoding, resolution change, aspect-ratio fix, resizing for social platforms, media generation, search, and real-time capture — all server-side with no ffmpeg or local encoding tools needed.
Scene-by-scene narration scripts for videos. Use when writing voiceover scripts, adding timing markers, or creating CTA patterns for demos
Generate speech audio from text using HeyGen's Starfish TTS model. Use when: (1) Generating standalone speech audio files from text, (2) Converting text to speech with voice selection, speed, and pitch control, (3) Creating audio for voiceovers, narration, or podcasts, (4) Working with HeyGen's /v1/audio endpoints, (5) Listing available TTS voices by language or gender.
Convert text to natural speech using Sarvam AI's Bulbul v3 model. Handles audio generation, voiceovers, and voice interfaces for 11 Indian languages with 30+ voices. Supports REST, HTTP streaming, WebSocket, and pronunciation dictionaries. Use when generating spoken audio from text.
Master the essential audio post-production techniques—normalization, compression, EQ, and noise reduction—using the correct processing order to achieve professional-quality audio. Use when: Editing podcast episodes or video soundtracks; Cleaning up recorded voiceovers; Improving audio quality for marketing content; Preparing audio files for distribution; Troubleshooting common audio issues