Loading...
Loading...
Found 23 Skills
Generate and edit images using Google's Nano Banana 2 (Gemini 3.1 Flash Image Preview) API. This skill should be used when the user asks to create or modify images, especially when they need fast iteration, explicit aspect-ratio control, or resolution control from 512px to 4K.
Operate OpenWord end-to-end for live adventure sessions. Use when Codex needs to download/install/start OpenWord, guide a human player in the browser, or play autonomously through REST API (create/load game, do_action loop, state/image retrieval), including configuring GEMINI_API_KEY and sharing interesting scenes and choices during play.
Generate illustration images for articles and documentation using Gemini Nano Banana 2 API. Produces clean, minimal-style diagrams and concept illustrations.
Generate or edit images using Gemini's native `generateContent` via New-API. Suitable for scenarios requiring text-to-image generation, reference image editing, local PNG output, and those who want to reuse the `.sofunny-image.env` file or current shell environment variables.
Expert guidance for writing Python code using the official Google GenAI SDK (google-genai) for Gemini API and Vertex AI. Use for text generation, multimodal inputs, reasoning, tools, and media generation.
Process and generate multimedia content using Google Gemini API for better vision capabilities. Capabilities include analyze audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understand images (better image analysis than Claude models, captioning, reasoning, object detection, design extraction, OCR, visual Q&A, segmentation, handle multiple images), process videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extract from documents (PDF tables, forms, charts, diagrams, multi-page), generate images (text-to-image with Imagen 4, editing, composition, refinement), generate videos (text-to-video with Veo 3, 8-second clips with native audio). Use when working with audio/video files, analyzing images or screenshots (instead of default vision capabilities of Claude, only fallback to Claude's vision capabilities if needed), processing PDF documents, extracting structured data from media, creating images/videos from text prompts, or implementing multimodal AI features. Supports Gemini 3/2.5, Imagen 4, and Veo 3 models with context windows up to 2M tokens.
Enables grounded question answering by automatically executing the Google Search tool within Gemini models. Use when the required information is recent (post knowledge cutoff) or requires verifiable citation.
Use when "nanobanana", "generate image", "create image", "edit image", "AI drawing", "Gemini image", "image generation"
Consult external LLMs (Gemini, OpenAI/Codex, Qwen) for second opinions, alternative plans, independent reviews, or delegated tasks. Use when a user asks for another model's perspective, wants to compare answers, or requests delegating a subtask to Gemini/Codex/Qwen.
Form a high-level investment committee consisting of three virtual experts modeled after legendary investors (Buffett, Wood, Druckenmiller) to conduct independent multi-round adversarial debates. True independent thinking is achieved through physically isolated Gemini API calls, and final resolutions are formed via voting. Use when evaluating investment decisions, reviewing stock research reports, or seeking multi-perspective analysis on public companies.
AI Course Content Generator - Generate complete online courses with Gemini API. Triggers on "create course", "generate lesson", "course content", "ccg", "/ccg".