Loading...
Loading...
Found 32 Skills
Generate a high-cut-density action / fight scene by first composing a 16-cell storyboard image, then driving Seedance 2.0 image-to-video off that storyboard. Stacks GPT-Image-2 (character sheet + storyboard), Nano-Banana-2 (environment concept), and Seedance 2.0 i2v.
Universal AI image generation supporting OpenAI DALL·E / gpt-image, Google Gemini Image / Imagen, Replicate (Flux / SDXL / any model), Stability AI, FAL, Ark (Seedream 4.5), Bailian (qwen-image / wanx), and SiliconFlow. Use this skill whenever the user asks to generate, create, draw, illustrate, render, or synthesize images from text prompts or reference images. Typical phrases include "draw a ...", "generate an image of ...", "画一张 ...", "给我来张图", "make a poster of ...", "create an illustration ...", or any mention of image-generation model families like DALL·E, gpt-image, Flux, SDXL, Seedream, Imagen, Gemini image, Kolors, or Wanx. Always use this skill even if the user does not name a specific model — pick a provider based on their EXTEND.md defaults or available API keys in the environment. Do NOT use this skill when the user explicitly mentions 即梦 / Dreamina / Jimeng — those go to happy-dreamina instead.
Generate 2D pixel art game assets, characters, sprite sheets, background removal, and game backgrounds. Trigger for "pixel art character", "sprite sheet", "walk cycle", "game sprites", "isometric sprites", "side-scroller assets", "RPG character sprites", "idle animation", "attack animation", "jump animation", "game background", "parallax background", "isometric map", "2D game art", "pixel art animation". Covers character generation (nano-banana-pro / gpt-image-2), sprite sheet animation (nano/edit or gpt-image-2/edit), background removal (Bria), and background generation (parallax layers or isometric map).
Build with OpenAI's stateless APIs - Chat Completions (GPT-5, GPT-4o), Embeddings, Images (DALL-E 3), Audio (Whisper + TTS), and Moderation. Includes Node.js SDK and fetch-based approaches for Cloudflare Workers. Use when: implementing chat completions with GPT-5/GPT-4o, streaming responses with SSE, using function calling/tools, creating structured outputs with JSON schemas, generating embeddings for RAG (text-embedding-3-small/large), generating images with DALL-E 3, editing images with GPT-Image-1, transcribing audio with Whisper, synthesizing speech with TTS (11 voices), moderating content (11 safety categories), or troubleshooting rate limits (429), invalid API keys (401), function calling failures, streaming parse errors, embeddings dimension mismatches, or token limit exceeded.
Generate publication-quality academic illustrations through a local Codex app-server bridge that uses Codex native image generation. This is a separate experimental alternative to `paper-illustration`, intended for Claude Code users who want a GPT-image-style renderer without modifying the original skill.
Use this skill whenever a user asks to generate, create, draw, render, or edit images with GPT Image 2 / gpt-image-2, text-to-image, reference-image editing, inpainting, posters, typography, Chinese text, UI mockups, diagrams, or gallery prompts. Analyze the user's prompt, search the bundled Reference Gallery/craft files for matching design patterns, confer on direction when useful, then call the packaged `gpt-image` CLI or bundled `scripts/generate.py`. Do not write new image-generation code unless explicitly asked to modify this repo.
Single-image generation skill for posters, key art, and editorial illustrations. Defaults to gpt-image-2 but is provider-agnostic — the same workflow drives Flux, Imagen, or Midjourney via the active upstream tooling. Output is one or more PNG/JPEG files saved to the project folder.
Upgrade a coded website to award-tier, editorially-crafted design using fal.ai. Takes a local HTML file or a dev-server URL, screenshots it, has an opus-4.7 vision model write a gpt-image-2 edit prompt, uses fal-ai/gpt-image-2/edit to produce the redesigned reference image, then opus-4.7 vision writes a Markdown build-spec with a "Hard constraints" section + a tokens.json. Also supports iterate (screenshot implemented site → delta-spec vs reference) and greenfield generate (brief → mockup → single-file HTML). Invoke when the user says "improve the design", "make it world-class", "redesign this landing page", "upgrade this site", "design pass", or points at a local HTML / dev server for a visual review.
Generate, revise, translate, and manage App Store / Google Play marketing screenshots. Full flow: initialize a .shots workspace, scrape App Store metadata, research the product from the repo and listing, identify theme, colors, audience, and competitor space, save a strategy brief, craft benefit-driven headlines, and generate 3-up GPT-Image 2 composites via OpenAI direct or fal.ai before cropping them into upload-ready panels. Supports iPhone, iPad, and Android Phone platforms. Triggers: "app store screenshots", "marketing screenshots", "store listing images", "screenshot generation", "app store assets", "google play screenshots", "shots", ".shots", "revise shots", "change screenshots", "fix panels", "redo screenshots", "translate screenshots", "localize", "scrape app store", "fetch metadata", "import app store". Do NOT use for general image generation, social media graphics, or non-store marketing assets.
AI coding agent skill for GPT Image Playground — a React/TypeScript web app for OpenAI image generation and editing using gpt-image-1 and related APIs.
Codex Pet generator on RunComfy. Build a Codex-compatible Codex Pet spritesheet.webp + pet.json from a single reference image, drop it into `${CODEX_HOME:-$HOME/.codex}/pets/<name>/` and Codex picks it up as a custom Codex Pet next to the 8 built-ins. This skill produces the exact Codex Pet atlas Codex expects (1536x1872 PNG/WebP, 8 cols x 9 rows, 192x208 cells, 9 animation states — idle, running-right, running-left, waving, jumping, failed, waiting, running, review). Calls OpenAI GPT Image 2 edit ONCE via the local RunComfy CLI as `runcomfy run openai/gpt-image-2/edit` to produce a canonical Codex Pet pose, then assembles all 9 animation rows programmatically with ImageMagick micro-transforms — no Codex Pro, no `$imagegen`, no OPENAI_API_KEY required, only RUNCOMFY_TOKEN. Triggers on "codex pet", "create codex pet", "make codex pet", "hatch codex pet", "/hatch image", "desktop pet codex", "codex pets", "spritesheet.webp", or any explicit ask to build a custom pet for OpenAI Codex.
Generate brand-quality product images via mode-specific prompt enhancement on Higgsfield's gpt_image_2 model. The single entry point for any professional brand visual involving a product. Use when: "make a product photo", "studio shot", "lifestyle photo", "in use", "Pinterest pin", "hero banner", "website header", "carousel", "Meta ads", "ad creatives", "model wearing", "virtual try-on", "person holding product", "closeup with hands", "levitating product", "floating", "splash shot", "CGI style", "surreal product", "restyle", "Christmas version", "in [aesthetic] style", or any request involving a product, brand, or paid social creative. Modes: product_shot, lifestyle_scene, closeup_product_with_person, pinterest_pin, hero_banner, social_carousel, ad_creative_pack, virtual_model_tryout, conceptual_product, restyle. Backend assembles the final prompt — never write gpt_image_2 prompts freehand. Always go through this skill. NOT for: raw text-to-image with no brand/product (use higgsfield-generate), branded marketing video with avatars (use higgsfield-generate's Marketing Studio), Soul Character training (use higgsfield-soul-id).