Found 58 Skills
Generate and edit images using Google's Nano Banana 2 (Gemini 3.1 Flash Image Preview) API. This skill should be used when the user asks to create or modify images, especially when they need fast iteration, explicit aspect-ratio control, or resolution control from 512px to 4K.
Generate professional presentation slides and high-quality illustrations using Gemini image generation API (Nano Banana 2), with interactive browser-based review and iterative editing. Full workflow: content planning conversation → slides_plan.json → batch image generation → review with feedback → targeted slide editing → PPTX packaging. Use when: user wants to create a presentation, make slides, generate a PPT/PPTX, prepare a talk deck, design visual slide content, or generate high-quality figures/illustrations for papers and documents. Do NOT use for: writing academic papers (use paper-writing) or planning academic conference talk narrative structure (use academic-slides).
Google Gemini API for Pro/Flash/Ultra models with 1M token context.
Operate OpenWord end-to-end for live adventure sessions. Use when Codex needs to download/install/start OpenWord, guide a human player in the browser, or play autonomously through REST API (create/load game, do_action loop, state/image retrieval), including configuring GEMINI_API_KEY and sharing interesting scenes and choices during play.
Analyze local or downloaded social video files with the official Gemini API, especially for TikTok/Reels benchmark breakdowns, script decomposition, and structured JSON outputs. Use this when you need video-level analysis beyond metadata, including uploading video files, prompting Gemini 3.1 Pro Preview, and linking results back to source metadata.
Generate AI images using Gemini image generation API. Use this skill when content needs images: thumbnails, social posts, blog headers, or creative visuals. Follows an iterative workflow: brainstorm concepts, select direction, generate in multiple styles, then produce via API.
Official skill for integrating Firebase AI Logic (Gemini API) into web applications. Covers setup, multimodal inference, structured output, and security.
Consult external LLMs (Gemini, OpenAI/Codex, Qwen) for second opinions, alternative plans, independent reviews, or delegated tasks. Use when a user asks for another model's perspective, wants to compare answers, or requests delegating a subtask to Gemini/Codex/Qwen.
AI image generation and editing for blog content powered by Gemini via MCP. Claude acts as Creative Director - interpreting intent, selecting domain expertise, constructing optimized 6-component prompts (Subject + Action + Context + Composition + Lighting + Style), and orchestrating Gemini for blog-quality results. Generates hero images, inline illustrations, social preview cards, and OG images. Edits existing blog images. Supports 6 blog-optimized domain modes (Editorial, Product, Landscape, UI/Web, Infographic, Abstract). Works standalone via /blog image or internally from blog-write and blog-rewrite workflows. Falls back gracefully when MCP is not configured. Use when user says "blog image", "generate hero image", "blog illustration", "social card", "generate blog image", "edit blog image", "image generate", "blog cover image", "inline image", "OG image".
Turn video moments into AI-generated hand-drawn storyboards, with Gemini-powered frame analysis and social media content generation.