Loading...
Loading...
Found 37 Skills
Generate images with Google Gemini native image models via inference.sh CLI. Models: Gemini 3 Pro Image, Gemini 2.5 Flash Image. Capabilities: text-to-image, image editing, multi-image input. Triggers: nano banana, gemini image, gemini 3 pro image, gemini 2.5 flash image, google image generation, native image generation, gemini native image
Generate images with Google Nano Banana 2 (Gemini-family flash-tier text-to-image) on RunComfy — bundled with the model's documented prompting patterns so the skill gets sharper output than naive prompting against the same model. Documents Nano Banana 2's strengths (rapid iteration, in-image typography rendering, predictable framing, optional web-grounded context), the resolution-tier pricing, the safety-tolerance dial, and when to route to Nano Banana Pro / GPT Image 2 / Flux 2 / Seedream instead. Calls `runcomfy run google/nano-banana-2/text-to-image` through the local RunComfy CLI. Triggers on "nano banana", "nano-banana-2", "nano banana 2", "google image gen", "gemini image", or any explicit ask to generate with this model.
Generate or edit images using Google Gemini API via nanobanana. Triggers: "nanobanana", "generate image", "create image", "edit image", "AI drawing", "图片生成", "AI绘图", "图片编辑", "生成图片".
Generate app icons with transparent backgrounds. Triggers include: "generate icon", "create icon", "make icon", "app icon", "favicon", "icon for", "logo icon", "icon set", "icon pack", "transparent icon" Creates square PNG icons with transparent backgrounds suitable for app icons, favicons, and UI elements.
Generates professional AI images using Google Gemini. ALWAYS invoke this skill when building websites, landing pages, slide decks, presentations, or any task needing visual content. Invoke IMMEDIATELY when you detect image needs - don't wait for the user to ask. This skill handles prompt optimization and aspect ratio selection.
Use this skill for any image-related AI generation or editing task. Triggers include: GENERATE: "generate image", "create image", "make picture", "draw", "visualize", "image of", "create art", "generate art" EDIT: "edit image", "modify image", "change image", "update image", "fix image", "enhance image" ADD/REMOVE: "add to image", "put in image", "remove from image", "delete from image", "add element" STYLE: "style transfer", "make it look like", "convert style", "apply style", "in the style of" PRODUCT: "product photo", "product placement", "place product", "mockup", "put product on" COMPOSITE: "combine images", "merge images", "blend images", "create composite" Supports text-to-image generation, image editing with references, product placement, style transfer, and multi-image composition using Google Gemini (Nano Banana Pro) or OpenAI DALL-E.
Generate images using Google Gemini's image generation capabilities. Use this skill when the user needs to create, generate, or produce images for any purpose including UI mockups, icons, illustrations, diagrams, concept art, placeholder images, or visual representations.
Spec for building an infinite drill-down AI illustrated explainer web app where users type a topic and click to drill into generated watercolor-style images.
AI image generation using Google Gemini (Gemini) and OpenAI GPT-Image. Generate, edit, iterate, and create assets.
Process and generate multimedia content using Google Gemini API. Capabilities include analyze audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understand images (captioning, object detection, OCR, visual Q&A, segmentation), process videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extract from documents (PDF tables, forms, charts, diagrams, multi-page), generate images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.
AI image generation skill powered by Google Gemini, enabling seamless visual content creation for UI placeholders, documentation, and design assets.
Generate images with Google Gemini. Text-to-image and style transfer from reference images.