Found 39 Skills
Generate illustration images for articles and documentation using Gemini Nano Banana 2 API. Produces clean, minimal-style diagrams and concept illustrations.
Generate or edit images using Gemini's native `generateContent` via New-API. Suitable for text-to-image generation, reference-image editing, and local PNG output, and for users who want to reuse the `.sofunny-image.env` file or the current shell environment variables.
Expert guidance for writing Python code using the official Google GenAI SDK (google-genai) for Gemini API and Vertex AI. Use for text generation, multimodal inputs, reasoning, tools, and media generation.
Process and generate multimedia content using the Google Gemini API for better vision capabilities. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis, up to 9.5 hours), understanding images (better image analysis than Claude models, captioning, reasoning, object detection, design extraction, OCR, visual Q&A, segmentation, multiple-image handling), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), generating images (text-to-image with Imagen 4, editing, composition, refinement), and generating videos (text-to-video with Veo 3, 8-second clips with native audio). Use when working with audio/video files, analyzing images or screenshots (in place of Claude's default vision capabilities, falling back to them only if needed), processing PDF documents, extracting structured data from media, creating images/videos from text prompts, or implementing multimodal AI features. Supports Gemini 3/2.5, Imagen 4, and Veo 3 models with context windows up to 2M tokens.
Enables grounded question answering by automatically executing the Google Search tool within Gemini models. Use when the required information is recent (post knowledge cutoff) or requires verifiable citation.
Use when the request mentions "nanobanana", "generate image", "create image", "edit image", "AI drawing", "Gemini image", or "image generation".
Use Google Gemini API for text generation, multimodal analysis, image generation (Nano Banana), function calling, and search grounding. Invoke when user wants to use Gemini, ask Gemini, generate images with Gemini, or analyze content with Gemini.
Consult external LLMs (Gemini, OpenAI/Codex, Qwen) for second opinions, alternative plans, independent reviews, or delegated tasks. Use when a user asks for another model's perspective, wants to compare answers, or requests delegating a subtask to Gemini/Codex/Qwen.
Form a high-level investment committee consisting of three virtual experts modeled after legendary investors (Buffett, Wood, Druckenmiller) to conduct independent multi-round adversarial debates. True independent thinking is achieved through physically isolated Gemini API calls, and final resolutions are formed via voting. Use when evaluating investment decisions, reviewing stock research reports, or seeking multi-perspective analysis on public companies.
Guide for implementing Google Gemini API image generation - create high-quality images from text prompts using gemini-2.5-flash-image model. Use when generating images, creating visual content, or implementing text-to-image features. Supports text-to-image, image editing, multi-image composition, and iterative refinement.
Guide for implementing Google Gemini API document processing - analyze PDFs with native vision to extract text, images, diagrams, charts, and tables. Use when processing documents, extracting structured data, summarizing PDFs, answering questions about document content, or converting documents to structured formats. (project)
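Slide image rendering skill using the Gemini API. Used when the renderer-agent converts prompt files into images. Includes a guide to running the generate_slide_images.py script, environment requirements, output interpretation, and error handling.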