Found 56 Skills
Integrate Modellix's unified API for AI image and video generation into applications. Use this skill whenever the user wants to generate images from text, create videos from text or images, edit images, do virtual try-on, or call any Modellix model API. Also trigger when the user mentions Modellix, model-as-a-service for media generation, or needs to work with providers like Qwen, Wan, Seedream, Seedance, Kling, Hailuo, or MiniMax through a unified API.
Universal AI video generation supporting OpenAI Sora, Google Veo 2/3, Runway Gen-3/Gen-4, Pika 2.2, Luma Dream Machine (Ray 2), FAL (Kling / Wan / Veo / Sora wrappers), Ark Seedance 1.5 Pro/Lite, Bailian Wanx (i2v), MiniMax Hailuo-02, and Vidu Q3. Use this skill whenever the user asks to generate, create, make, or synthesize a video from a text prompt or from a first-frame image. Covers text-to-video and image-to-video, with optional last-frame control on providers that support it. Typical phrases include "generate a video of ...", "make a 5-second clip of ...", "animate this image", "生成一段视频" ("generate a video"), "做个短片" ("make a short clip"), or any mention of video-generation model families like Sora, Veo, Runway Gen, Kling, Wan, Seedance, Hailuo, Pika, Dream Machine, Vidu. Always use this skill even if the user does not name a specific model; in that case, pick a provider from their EXTEND.md defaults or available API keys. Do NOT use this skill when the user explicitly mentions 即梦 / Dreamina / Jimeng; those requests go to happy-dreamina instead.
Automatically collect trending topics in the AI field, or write AI technical articles on specified topics in the writing style of 'Second Brother'. Focus areas: hands-on tests of AI coding tools (Claude Code, Qoder, Cursor, TRAE, etc.), engineering practice with large models (SpringAI, LangChain, RAG, etc.), AI agents and workflow orchestration, evaluations of domestic (Chinese) large models (GLM, Tongyi Qianwen, DeepSeek, MiniMax, Kimi, etc.), and evaluations of AI tools and agent tools. Trigger keywords: write an AI article, AI technical article, large model evaluation, AI tool actual test, GLM, Claude Code, Qoder, Cursor, TRAE, SpringAI, RAG, Agent, workflow, domestic large model, collect AI hot topics, AI topic, etc.
Complete fal.ai image-to-video system. PROACTIVELY activate for: (1) Kling 2.5/2.6 Pro image animation, (2) MiniMax Hailuo with prompt optimizer, (3) LTX image-to-video, (4) Runway Gen-3 Turbo, (5) Luma Dream Machine with loop, (6) Stable Video Diffusion, (7) Motion description prompts, (8) Portrait/product animation workflows. Provides: Model endpoints, motion keywords, animation techniques, workflow examples. Ensures natural image animation with proper motion description.
Universal AI voice / text-to-speech skill supporting OpenAI TTS (gpt-4o-mini-tts, tts-1), ElevenLabs multilingual TTS with voice cloning, Bailian Qwen TTS (qwen-tts / qwen3-tts-vd with voice-design custom voices, long-text chunking built in), MiniMax speech-02-hd, SiliconFlow CosyVoice / SenseVoice, and PlayHT 2.0. Use this skill whenever the user asks to read text aloud, synthesize speech, generate narration, create voice-over, dub a script, or turn any text into audio (mp3 / wav / ogg / flac). Typical phrases include "read this aloud", "generate voice for ...", "create a narration of ...", "tts this", "把这段念出来" ("read this passage aloud"), "做个配音" ("do a voice-over"), "合成语音" ("synthesize speech"), or mentions of voices / TTS model names like Alloy, Ash, Cherry, Rachel, CosyVoice, PlayHT. Always use this skill even if the user does not specify a provider; in that case, pick one from EXTEND.md defaults or available env keys.
Return public original model architecture diagrams for user-specified LLM, VLM, MoE, diffusion, OCR, and SGLang/sgl-cookbook model families. Use when the user asks for a model structure chart, architecture diagram, or rendered image link for a specific model such as DeepSeek, GLM, Qwen, Kimi, MiniMax, Step, Hunyuan, or Qwen3-VL.
Audio generation skill — jingles, beds, voiceover, and sound effects. Routes music requests to Suno V5 / Udio / Lyria, speech to MiniMax TTS / FishAudio / ElevenLabs V3, and SFX to ElevenLabs SFX or AudioCraft. Output is one MP3/WAV file saved to the project folder.
Generate a 65-second founder-style product video from a product URL + user-supplied imagery. Output is a 16:9 1080p MP4 — 4 × 15s SeeDance acts of a talking founder + 5s branded end card + background music. The user's actual product screenshots appear on the founder's phone in reveal shots, so on-screen UI is real, not AI-imagined. Triggers — "founder video", "product video", "60s pitch video", "make a video of [founder] for [URL]", "talking founder explainer". Requires Pika MCP. Uses a supplied brand kit folder (`brand.json` or an exported build-a-brand kit with `brand.md`, tokens, logo assets); if no kit exists, run build-a-brand first.