Found 45 Skills
AI image generation skill powered by Google Gemini, enabling seamless visual content creation for UI placeholders, documentation, and design assets.
AI image generation using Google Gemini and OpenAI GPT-Image. Generate, edit, iterate, and create assets.
Generate images using Google Gemini's image generation API for UI mockups, icons, illustrations, and visual assets.
Spec for building an infinite drill-down AI illustrated explainer web app where users type a topic and click to drill into generated watercolor-style images.
AI image generation with Google Gemini via MCP - smart model selection, 4K output, aspect ratios, and templates
Process and generate multimedia content using the Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis, up to 9.5 hours), understanding images (captioning, object detection, OCR, visual Q&A, segmentation), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), and generating images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.
AI image generation CLI using Gemini. Use when generating images or checking syntax for resolution, aspect ratio, and reference image options.
Generate images with Google Gemini. Text-to-image and style transfer from reference images.
Generate audio narration of blog posts using Google Gemini TTS. Supports summary narration, full article read-aloud, and two-speaker podcast/dialogue mode with 30 voice options. Outputs MP3 with HTML5 audio embed code. Works standalone via /blog audio or internally from blog-write. Falls back gracefully when API key is not configured. Use when user says "blog audio", "narrate blog", "audio version", "text to speech", "tts", "podcast mode", "read aloud", "audio narration", "voice", "narration", "generate audio".