Found 29 Skills
Generates image prompts for Seedream 5.0/4.0 (Jimeng AI) and can call the API to generate images, automatically downloading them to the output/ directory. Workflow: describe your idea → the agent outputs a prompt for review → user confirms → the agent runs generate.py. It covers text-to-image, image editing, multi-image fusion, character consistency, knowledge cards, posters, PPT backgrounds, e-commerce images, avatars, and group/storyboard generation. Activate this tool when the user mentions terms like seedream, jimeng, AI image generation, text-to-image, image-to-image, seedream prompt, prompt keyword, one-click image generation, knowledge card, poster design, e-commerce image, character consistency, or image generation.
Generate and edit images using TensorsLab's AI models. Supports text-to-image and image-to-image generation, plus advanced editing: avatar generation, watermark removal, object erasure, face replacement, and general image editing. Features automatic prompt enhancement, progress tracking, and local file saving. Requires browser-based authorization before first use.
Generate images with GPT Image 2 (ChatGPT Images 2.0) inside Claude Code, using your existing ChatGPT Plus or Pro subscription — no separate OpenAI access, no per-image billing. Supports text-to-image, image-to-image editing, style transfer, and multi-reference composition via the local Codex CLI. Triggers on "gpt image 2", "gpt-image-2", "ChatGPT Images 2.0", "image 2", or any explicit ask to generate or edit an image through the user's ChatGPT plan.
State-of-the-art text-to-image generation with Stable Diffusion models via HuggingFace Diffusers. Use when generating images from text prompts, performing image-to-image translation, inpainting, or building custom diffusion pipelines.
AI image generation and editing built on Nano Banana (Gemini Image), supporting text-to-image, image-to-image, and image editing. Suitable for creative design, marketing materials, social media content, and presentation illustrations. Supports multiple styles, high-resolution output (up to 4K), text rendering, and character consistency preservation.
Generate/edit images with Nano Banana Pro (Gemini 3 Pro Image). Use for image create/modify requests incl. edits. Supports text-to-image + image-to-image; 1K/2K/4K; use --input-image.
Generate images using Google Gemini AI with text prompts and reference images. Use when creating game assets, concept art, UI mockups, promotional images, or any visual content. Supports text-to-image, image-to-image with style transfer, and multiple output sizes. Requires GEMINI_API_KEY environment variable. Triggers on requests for AI image generation, concept art, visual assets, or Gemini images.
AI image generation and processing workflow. Generates images from prompts, supporting text-to-image, image-to-image, batch generation, image hosting management, long image merging, and PPT packaging. The core feature is one-by-one confirmation: each image is confirmed before it is generated, to avoid wasting API credits.
Generate/edit images with Nano Banana Pro via grsai.com API. Use for image create/modify requests incl. edits. Supports text-to-image + image-to-image; 1K/2K/4K; use --input-image.
Generate, edit, or transform images with Gemini Nano Banana using bundled Python scripts (Flash or Pro) including aspect ratio, resolution, image-to-image edits, logo overlays, and reference images. Use when users request image generation, image edits, image-to-image transformations, logo placement, or specific aspect ratios or resolutions.
Generate and edit images using Google's Nano Banana Pro (Gemini 3 Pro Image) API. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports both text-to-image generation and image-to-image editing with configurable resolution (1K default, 2K, or 4K for high resolution). DO NOT read the image file first - use this skill directly with the --input-image parameter.
MiniMax multimodal model skill — use MiniMax Multi-Modal models for speech, music, video, and image. Create voice, music, video, and images with MiniMax AI: TTS (text-to-speech, voice cloning, voice design, multi-segment), music (songs, instrumentals), video (text-to-video, image-to-video, start-end frame, subject reference, templates, long-form multi-scene), image (text-to-image, image-to-image with character reference), and media processing (convert, concat, trim, extract). Use when the user mentions MiniMax, multimodal generation, or wants speech/music/video/image AI, MiniMax APIs, or FFmpeg workflows alongside MiniMax outputs.