Image Generation and Editing (GPT Image 2 All) Skills
This image generation skill is implemented based on the latest ChatGPT image generation model gpt-image-2-all from APIYI Platform. It can help users generate images through natural language, accessed via APIYI's domestic proxy service, and supports both Node.js and Python runtime environments. gpt-image-2-all is an official reverse ChatGPT image generation model launched on APIYI Platform, priced at a highly competitive $0.03 per image on a pay-per-use basis. It takes approximately 60 to 300 seconds to generate an image, supporting text-to-image generation, single image editing, multi-image fusion, and natural language-based image modification, with high text restoration accuracy, few content restrictions, and native support for Chinese prompts.
Usage Guide
Follow these steps:
Step 1: Requirement Analysis and Parameter Extraction
-
Clarify Intent: Distinguish whether the user needs [Text-to-Image] (generate new images), [Image-to-Image] (edit/modify existing images), or [Multi-Image Fusion].
-
Prompt Analysis:
- Use the user's original complete input: Directly use the user's original full question and requirement description as the main body of the prompt. Avoid rewriting, summarizing, or secondary creation on your own to prevent loss of details.
- Confirm first when supplementation is needed: If information is insufficient (e.g., missing style, number of subjects, shot language, scene details, text content, prohibited elements, etc.), ask the user for confirmation first; after the user confirms, append the supplementary content to the original prompt in an "appended" manner.
- Examples:
- User input: "Help me generate a picture of a cat, with a cute style."
- Correct example: Use the user's input directly as the prompt:
-p "Help me generate a picture of a cat, with a cute style."
- Incorrect example: Unauthorized rewriting to "Generate a cute-style cat picture" will lose the details and tone of the user's original input.
- If details need to be supplemented (e.g., color, background, etc.), ask for confirmation first: "What color do you want the cat to be? Any requirements for the background?" After the user replies, append it to the prompt:
-p "Help me generate a picture of a cat, with a cute style. The cat is orange, and the background is grass."
-
Key Parameter Organization:
- Prompt (Required): The final prompt after analysis (default = the user's original complete and consistent input; only append supplementary information after user confirmation).
- Filename (Optional): Output image filename/path (must include a random identifier to avoid duplication). If not provided, the script will automatically generate a filename with a timestamp. It is recommended to generate a reasonable filename based on the content (e.g., ), avoid using generic names.
- Size/Aspect (Optional): Since the model has no explicit size parameter, the size is controlled by prompt description. It is recommended to describe the size at the beginning of the prompt.
- "Mobile wallpaper" -> Write or at the start of the prompt
- "Computer wallpaper/video cover" -> Write or at the start of the prompt
- "Avatar" -> Write or at the start of the prompt
- Default: If the user does not explicitly specify the image ratio, leave it blank (adaptive by the model).
- Response Format (Optional): Response format, default is (R2 CDN accelerated link), optional (base64 image data).
- Note: The model does not support , , , parameters; passing them may trigger parameter validation errors.
Step 2: Environment Check and Command Execution
-
Check Environment: Confirm whether the
environment variable is set (usually assumed to be set; prompt the user if the operation fails).
-
Build and Run Commands:
- Priority: Node.js version: If Node is available in the environment (the command works), prefer using
scripts/generate_image.js
(zero dependencies, parameters consistent with Python).
- Use Python version if Node is unavailable: Use
scripts/generate_image.py
.
Text-to-Image Command Template (Priority: Node.js):
bash
node scripts/generate_image.js -p "{prompt}" -f "{filename}" [-r {response_format}]
Image-to-Image Command Template (Priority: Node.js):
bash
node scripts/generate_image.js -p "{edit_instruction}" -i "{input_path}" -f "{output_filename}" [-r {response_format}]
Multi-Image Fusion Command Template (Priority: Node.js):
bash
node scripts/generate_image.js -p "Merge the styles of image 1 and image 2" -i ref1.png ref2.png -f "merged.png" [-r {response_format}]
(Optional) Python Version Command Template (When Node is Unavailable):
bash
python scripts/generate_image.py -p "{prompt}" -f "{filename}" [-r {response_format}]
python scripts/generate_image.py -p "{edit_instruction}" -i "{input_path}" -f "{output_filename}" [-r {response_format}]
⏱️ Long-running Task Handling Strategy
1. Pre-task Prompt
Must inform the user before execution:
- "Image generation has started, it is expected to take 60 to 300 seconds"
2. 🎨 Best Practice Example
"Image generation in progress, expected to complete in 60 to 300 seconds...
⏳ Generating..."
Step 3: Result Feedback
- Execution Feedback: Wait for the terminal command to complete execution.
- Success: Inform the user that the image has been generated and indicate the save path.
- Failure:
- If prompted with missing API Key, guide the user to set the environment variable.
- If prompted with network error, suggest the user check the network or try again later.
Command Line Usage Examples
Generate New Images
bash
python scripts/generate_image.py -p "Image description text" -f "output.png" [-r url|b64_json]
Example:
bash
# Basic generation
python scripts/generate_image.py -p "A cute orange cat playing on the grass" -f "cat.png"
# Specify size (describe at the start of the prompt)
python scripts/generate_image.py -p "Horizontal 16:9 Movie aspect ratio, sunset mountain scenery" -f "sunset.png"
# Vertical high-definition image (suitable for mobile wallpaper)
python scripts/generate_image.py -p "Vertical 9:16 Mobile poster, city night view" -f "city.png"
(Optional) Node.js Version Example:
bash
# Basic generation
node scripts/generate_image.js -p "A cute orange cat playing on the grass" -f "cat.png"
# Specify size
node scripts/generate_image.js -p "Horizontal 16:9 Movie aspect ratio, sunset mountain scenery" -f "sunset.png"
Edit Existing Images
bash
python scripts/generate_image.py -p "Editing instruction" -f "output.png" -i "path/to/input.png"
Example:
bash
# Modify style
python scripts/generate_image.py -p "Convert the image to watercolor style" -f "watercolor.png" -i "original.png"
# Add elements
python scripts/generate_image.py -p "Add a rainbow to the sky" -f "rainbow.png" -i "landscape.png"
# Replace background
python scripts/generate_image.py -p "Change the background to a beach" -f "beach-bg.png" -i "portrait.png"
(Optional) Node.js Version Example:
bash
# Modify style
node scripts/generate_image.js -p "Convert the image to watercolor style" -f "watercolor.png" -i "original.png"
# Merge multiple reference images (up to 5)
node scripts/generate_image.js -p "Merge the styles of image 1 and image 2" -i ref1.png ref2.png -f "merged.png"
Command Line Parameter Description
Parameters are consistent between Python and Node.js versions (short parameters are equivalent to long parameters).
| Parameter | Required | Description |
|---|
| / | Yes | Image description (for text-to-image) or editing instruction (for image-to-image). Keep the user's original complete input. |
| / | No | Output image path/filename; if not provided, a PNG filename with timestamp will be automatically generated and saved to the current directory. |
| / | No | Response format: (default, R2 CDN link) or (base64 image data). |
| / | No | Input image path for image-to-image; multiple images can be passed (up to 5). Passing this parameter enters editing mode. |
File Resource Description
| Resource | Description |
|---|
scripts/generate_image.js
| Node.js version (zero dependencies, preferred) |
scripts/generate_image.py
| Python version (alternative) |
| Size and ratio control document, use when needed, load on demand |
references/batch-template.md
| Batch generation configuration template, use when batch generation is needed, load on demand |
Batch Generation
When the user needs to generate multiple images at once (batch generation):
- Load Configuration Template: references/batch-template.md — includes JSON configuration format description and usage examples
- Obtain/Generate JSON File: Users can provide their own JSON file, or describe requirements and let AI generate it based on the needs
- Preprocess Prompt: Ensure each prompt starts with size description (e.g., "Horizontal 16:9"), supplement if necessary
- Execute One by One: Read the prompts array, execute generation commands one by one, and feed back results after each image is completed
- Summarize Feedback: After completion, inform the user of the number of successful images and the list of image paths
Note: Total time for batch tasks = single image time (60-300 seconds) × number of images. Please inform the user of the estimated duration in advance.
Image Ratio Description
Since the gpt-image-2-all model has no size parameter, the size is controlled by prompt description. The following verified stable expressions are recommended:
| Requirement | Recommended Expression |
|---|
| Square | 1024×1024 Square image / 1:1 Square composition |
| Horizontal | Horizontal 16:9 / Wide screen 16:9 Movie aspect ratio |
| Vertical | Vertical 9:16 / Mobile poster 9:16 |
| Ultra-wide banner | Banner 21:9 Ultra-wide screen |
| Classic printing | 4:3 Standard aspect ratio / 3:2 Classic aspect ratio |
Tip: Describe the size/composition at the beginning of the prompt for better model compliance. You can match it with aspect ratio style words (e.g., movie aspect ratio, mobile poster, square composition) to further improve consistency.
Response Format Description
url (Default)
Returns an R2 CDN accelerated link by default, valid for approximately 24 hours. Suitable for direct rendering in web applications. For images that need long-term storage, please transfer and save them to your own object storage immediately after generation.
b64_json
Returns base64 encoded image data (with
prefix included). Suitable for:
- Server-side direct image data processing
- Need to write to local files
- Direct rendering on the frontend
Notes
- API Key must be set, which can be provided via environment variable or command line parameter
- Image generation time: approximately 60 to 300 seconds
- When editing images, the input image will be automatically converted to base64 encoding
- Ensure the output directory has write permission
- The model does not support , , , parameters
- The default response url field is an R2 CDN accelerated link, valid for approximately 24 hours
API Key Setup and Acquisition
How to Obtain API Key
If you don't have an API Key yet, please go to
https://api.apiyi.com to register an account and apply for an API Key.
Acquisition Steps:
- Visit https://api.apiyi.com
- Register/log in to your account
- Create an API Key in the console
- Copy the key and set it as an environment variable or use it in the command line
Set API Key
The script obtains the API Key from the environment variable
.
Set Environment Variable:
bash
# Linux/Mac
export APIYI_API_KEY="your-api-key-here"
# Windows CMD
set APIYI_API_KEY=your-api-key-here
# Windows PowerShell
$env:APIYI_API_KEY="your-api-key-here"
API Endpoint Description
Recommended Endpoint: POST /v1/chat/completions
Conversational endpoint — compared to
and
, the conversational endpoint follows prompts better, and the same endpoint supports both text-to-image generation and reference image-based modification, naturally enabling multi-round iterations.
- Only input text messages → Text-to-image generation
- Add image_url (URL or base64 data URL) to messages → Reference image-based modification
- Retain assistant historical messages and continue asking → Multi-round iterative image modification
Model Information
- Model Name: gpt-image-2-all
- Image Generation Speed: Approximately 60-300 seconds
- Output Resolution: No explicit size parameter, adaptive by the model (recommended to describe in prompt)
- Default Response Format: url (R2 CDN accelerated link, default 1-day validity)
- Optional Response Format: b64_json
- Chinese Prompts: ✅ Natively supported
- Supported Capabilities: Text-to-image generation, single image editing, multi-image fusion, natural language-based image modification
- Price: $0.03 per image
Author Introduction
- Everywhere who loves One Piece
- My WeChat Official Account: Ubiquitous Technology",