Found 411 Skills
AI media generation CLI tool using Google's Imagen 4, Veo 3.1, and Gemini TTS. Use when the user wants to (1) generate images from text prompts, (2) edit existing images with AI, (3) explain image contents, (4) generate videos from text or images, (5) create narration/voice audio with character settings. Triggers on requests like "generate an image of...", "create a video...", "make a voice that says...", "edit this image to...", "describe this image".
Consult Gemini AI for architecture alternatives, design trade-offs, and brainstorming. Use when seeking different perspectives on design, evaluating architectural approaches, comparing solutions, or generating creative ideas.
Generate and edit high-quality images using Gemini 2.5 Flash Image and Gemini 3 Pro Image (Nano Banana). Supports Text-to-Image, Style Transfer, Virtual Try-On, and Character Consistency.
Deep multi-framework reasoning using Gemini. Use for complex problem analysis, challenging ideas, and evaluating multiple options with structured thinking.
Generate images using Google Gemini AI with text prompts and reference images. Use when creating game assets, concept art, UI mockups, promotional images, or any visual content. Supports text-to-image, image-to-image with style transfer, and multiple output sizes. Requires GEMINI_API_KEY environment variable. Triggers on requests for AI image generation, concept art, visual assets, or Gemini images.
Wield Google's Gemini CLI as a powerful auxiliary tool for code generation, review, analysis, and web research. Use when tasks benefit from a second AI perspective, current web information via Google Search, codebase architecture analysis, or parallel code generation. Also use when the user explicitly requests Gemini operations.
Use the Gemini API (Nano Banana image generation, Veo video, Gemini TTS speech and audio understanding) to deliver end-to-end multimodal media workflows and code templates for "generation + understanding".
Upload and manage files using Google Gemini File API via scripts/. Use for uploading images, audio, video, PDFs, and other files for use with Gemini models. Supports file upload, status checking, and file management. Triggers on "upload file", "file API", "upload image", "upload PDF", "upload video", "file management".
Use this skill when writing code that calls the Gemini API for text generation, multi-turn chat, multimodal understanding, image generation, streaming responses, background research tasks, function calling, structured output, or migrating from the old generateContent API. This skill covers the Interactions API, the recommended way to use Gemini models and agents in Python and TypeScript.
Generates publication-quality figures for ML papers from research context. Given a paper section or description, extracts system components and relationships to generate architecture diagrams via Gemini. Given experiment results or data, auto-selects chart type and generates data-driven figures via matplotlib/seaborn. Use when creating any figure for a conference paper.
Google Gemini File Search for managed RAG with 100+ file formats. Use for document Q&A and knowledge bases, or when encountering immutability errors, quota issues, or polling failures. Supports Gemini 3 Pro/Flash (Gemini 2.5 legacy).
Analyze short-form videos with Gemini AI to extract hooks, content structure, and replicable patterns. Supports Instagram Reels, TikTok, and YouTube Shorts. Use when asked to analyze video content for hooks and structure, extract replicable formulas from viral videos, understand why a video performed well, or get AI analysis of video content patterns. Triggers: "analyze videos", "extract hooks", "video analysis", "analyze reels", "what makes this video work", "hook analysis", "content structure analysis".