Image Generation and Editing (Official Formal Version of GPT Image 2)

This skill generates and edits images from natural-language instructions using gpt-image-2, the official formal version of the GPT image generation model on the Apiyi Platform. Requests go through Apiyi's domestic proxy service, and the skill runs in either a Node.js or a Python environment. The model supports precise size and quality control (up to 4K) and is billed by token.

Usage Guide

Follow these steps:

Step 1: Analyze Requirements and Extract Parameters

Clarify Intent: Distinguish whether the user needs [Text-to-Image] (generate new images), [Image-to-Image] (edit/modify existing images), or [Multi-Image Fusion].
Prompt Analysis:
- Use the user's original complete input: Use the user's full original question and requirement description as the main body of the `-p` prompt. Do not rewrite, summarize, or embellish it on your own, so that no details are lost.
- Confirm before supplementing: If information is insufficient (e.g., missing style, number of subjects, shot language, scene details, text content, or prohibited elements), ask the user for confirmation first. After the user confirms, append the supplementary content to the end of the original prompt rather than rewriting it.
- Examples:
  - User input: "Help me generate a picture of a cat, in a cute style."
  - Correct example: Use the user's input directly as the prompt:
    -p "Help me generate a picture of a cat, in a cute style."
  - Incorrect example: Rewriting it without permission as "Generate a picture of a cat in cute style" loses the details and tone of the user's original input.
  - If additional details are needed (e.g., color, background, etc.), ask for confirmation first: "What color do you want the cat to be? Any requirements for the background?" After the user replies, append it to the prompt:
    -p "Help me generate a picture of a cat, in a cute style. The cat is orange, and the background is grass."
Key Parameter Organization:
- Prompt (Required): The final prompt after analysis (default = the user's original complete and consistent input; only append supplementary information after user confirmation).
- Filename (Optional): Output image filename/path (must include a random identifier to avoid duplication). If not provided, the script will automatically generate a filename with a timestamp. It is recommended to generate a reasonable filename based on content (e.g., `cat_in_garden.png`) instead of using generic names.
- Size (Optional): Output size.
  - Preset values: `1024x1024`, `1536x1024`, `1024x1536`, `2048x2048`, `2048x1152`, `3840x2160`, `2160x3840`
  - Custom sizes are also allowed (requirements: maximum side ≤3840, both sides are multiples of 16, aspect ratio ≤3:1, total pixels 0.65–8.3MP)
  - Default is model-adaptive (auto)
- Quality (Optional): Quality level. `low` (sketch/batch), `medium` (daily use), `high` (final draft/fine text), `auto` (default)
- Output Format (Optional): `png` (default), `jpeg`, `webp`
- Output Compression (Optional): Output compression rate (0-100), only valid for jpeg/webp
- Note: This model uses official formal endpoints, which are different from the reverse version gpt-image-2-all.
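
The custom-size rules listed under Size can be checked before a command is run. The following is a minimal sketch, not part of the skill's scripts: `check_size` is a hypothetical helper, and it assumes that 1 MP means 1,000,000 pixels, which the source does not specify.

```python
import re

MAX_SIDE = 3840
MIN_PIXELS = 650_000    # 0.65 MP, assuming 1 MP = 1,000,000 pixels
MAX_PIXELS = 8_300_000  # 8.3 MP, same assumption


def check_size(size):
    """Return the list of custom-size rules that a WIDTHxHEIGHT string violates."""
    match = re.fullmatch(r"([1-9]\d*)x([1-9]\d*)", size)
    if match is None:
        return [f"{size!r} is not in WIDTHxHEIGHT form"]
    width, height = int(match.group(1)), int(match.group(2))
    problems = []
    if max(width, height) > MAX_SIDE:
        problems.append(f"longest side exceeds {MAX_SIDE}")
    if width % 16 or height % 16:
        problems.append("both sides must be multiples of 16")
    if max(width, height) > 3 * min(width, height):
        problems.append("aspect ratio exceeds 3:1")
    if not MIN_PIXELS <= width * height <= MAX_PIXELS:
        problems.append("total pixels outside 0.65-8.3 MP")
    return problems


if __name__ == "__main__":
    for candidate in ("2048x1152", "1000x1000", "4096x1024"):
        print(candidate, check_size(candidate) or "ok")
```

An empty list means the size satisfies all four rules; every preset value listed above passes under these assumptions.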

Step 2: Environment Check and Command Execution

Check Environment: Confirm whether the `APIYI_API_KEY` environment variable is set (usually assumed to be set; prompt the user if execution fails).

Build and Run Commands:

Prefer the Node.js Version: If Node is available in the environment (the `node` command works), prefer `scripts/generate_image.js` (zero dependencies; parameters are consistent with the Python version).
Use the Python Version if Node is Unavailable: Use `scripts/generate_image.py`.
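
The environment check and the Node-first fallback can be sketched as follows. This is an illustration under assumptions: `pick_script` and `api_key_is_set` are hypothetical helper names, and finding `node` on `PATH` with `shutil.which` is used as an approximation of "the `node` command works".

```python
import os
import shutil


def pick_script(which=shutil.which):
    """Prefer the Node.js script when a `node` executable is on PATH, else use Python."""
    if which("node"):
        return ["node", "scripts/generate_image.js"]
    return ["python", "scripts/generate_image.py"]


def api_key_is_set(env=os.environ):
    """True when APIYI_API_KEY is present and not blank."""
    return bool(env.get("APIYI_API_KEY", "").strip())


if __name__ == "__main__":
    print("runner:", " ".join(pick_script()))
    print("API key set:", api_key_is_set())
```

Both functions take their lookup source as a parameter so the logic can be exercised without changing the real environment.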

Text-to-Image Command Template (Priority Node.js):

```bash
node scripts/generate_image.js -p "{prompt}" -f "{filename}" [-s {size}] [-q {quality}] [-o {output_format}]
```

Image-to-Image Command Template (Priority Node.js):

```bash
node scripts/generate_image.js -p "{edit_instruction}" -i "{input_path}" -f "{output_filename}" [-s {size}] [-q {quality}]
```

Multi-Image Fusion Command Template (Priority Node.js):

```bash
node scripts/generate_image.js -p "Merge the styles of Image 1 and Image 2" -i ref1.png ref2.png -f "merged.png" [-s {size}] [-q {quality}]
```

(Optional) Python Version Command Template (When Node is Unavailable):

```bash
python scripts/generate_image.py -p "{prompt}" -f "{filename}" [-s {size}] [-q {quality}] [-o {output_format}]
python scripts/generate_image.py -p "{edit_instruction}" -i "{input_path}" -f "{output_filename}" [-s {size}] [-q {quality}]
```
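
When the command is assembled programmatically, the templates above map onto an argument list. The sketch below uses only the flags documented in this file; `build_command` is a hypothetical helper, and the flag order simply follows the templates.

```python
def build_command(runner, prompt, filename=None, input_images=(),
                  size=None, quality=None, output_format=None):
    """Assemble the argument list for the generation script from the documented flags."""
    if len(input_images) > 5:
        raise ValueError("at most 5 input images are supported")
    cmd = [*runner, "-p", prompt]
    if input_images:
        # Passing -i switches the script into edit / fusion mode.
        cmd += ["-i", *input_images]
    for flag, value in (("-f", filename), ("-s", size),
                        ("-q", quality), ("-o", output_format)):
        if value:
            cmd += [flag, value]
    return cmd


if __name__ == "__main__":
    print(build_command(["node", "scripts/generate_image.js"],
                        "Help me generate a picture of a cat, in a cute style.",
                        filename="cat.png", size="1024x1024", quality="high"))
```

Handing the resulting list to `subprocess.run(cmd)` rather than a shell string keeps the user's original prompt intact, since no shell quoting or escaping is applied to it.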

⏱️ Long-running Task Processing Strategy

1. Pre-task Prompt

Must inform the user before execution:

"Image generation has started, it is expected to take 120-150 seconds, please wait patiently"

2. 🎨 Best Practice Example

"Image generation in progress, expected to complete in 120-150 seconds...\n⏳ Generating...\n(Complex scenes with high quality + 2K/4K may take longer, please wait patiently)"

Step 3: Result Feedback

Execution Feedback: Wait for the terminal command to complete execution.
Success: Inform the user that the image has been generated and indicate the save path.
Failure:
- If the error reports a missing API Key, guide the user to set the environment variable.
- If the error reports a network problem, suggest that the user check the network or try again later.
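
The feedback rules above can be sketched as a small mapping from the command's exit status and error output to a user-facing message. The exact error text printed by the scripts is not given in this document, so the keyword lists below are guesses, and `describe_result` is a hypothetical helper.

```python
def describe_result(returncode, stderr, output_path):
    """Turn a finished command's exit code and stderr into a message for the user."""
    if returncode == 0:
        return f"Image generated and saved to {output_path}"
    text = stderr.lower()
    # Keyword matching is an assumption; adjust to the scripts' real messages.
    if "api key" in text or "apiyi_api_key" in text:
        return "API key missing: set the APIYI_API_KEY environment variable and retry."
    if any(word in text for word in ("network", "timeout", "timed out", "connection")):
        return "Network error: check the connection or try again later."
    return f"Generation failed (exit code {returncode}): {stderr.strip()}"


if __name__ == "__main__":
    print(describe_result(0, "", "cat.png"))
    print(describe_result(1, "Error: APIYI_API_KEY is not set", "cat.png"))
```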

Command Line Usage Examples

Generate New Images

```bash
python scripts/generate_image.py -p "Image description text" -f "output.png" [-s {size}] [-q {quality}] [-o {output_format}]
```

Example:

```bash
# Basic generation
python scripts/generate_image.py -p "A cute orange cat playing on the grass" -f "cat.png"

# Specify size and quality
python scripts/generate_image.py -p "Sunset mountain landscape" -f "sunset.png" -s "2048x1152" -q "high"

# Vertical HD image (suitable for mobile wallpaper)
python scripts/generate_image.py -p "City night view" -f "city.png" -s "2160x3840" -q "high"

# Output as JPEG
python scripts/generate_image.py -p "Landscape photo" -f "landscape.jpg" -s "3840x2160" -q "high" -o "jpeg"
```

(Optional) Node.js Version Example:

```bash
# Basic generation
node scripts/generate_image.js -p "A cute orange cat playing on the grass" -f "cat.png"

# Specify size and quality
node scripts/generate_image.js -p "Sunset mountain landscape" -f "sunset.png" -s "2048x1152" -q "high"
```

Edit Existing Images

```bash
python scripts/generate_image.py -p "Editing instruction" -f "output.png" -i "path/to/input.png" [-s {size}] [-q {quality}]
```

Example:

```bash
# Modify style
python scripts/generate_image.py -p "Convert the image to watercolor style" -f "watercolor.png" -i "original.png"

# Add elements
python scripts/generate_image.py -p "Add a rainbow to the sky" -f "rainbow.png" -i "landscape.png" -q "high"

# Replace background
python scripts/generate_image.py -p "Change the background to a beach" -f "beach-bg.png" -i "portrait.png" -s "2048x2048"
```

(Optional) Node.js Version Example:

```bash
# Modify style
node scripts/generate_image.js -p "Convert the image to watercolor style" -f "watercolor.png" -i "original.png"

# Multi-reference image fusion (up to 5 images)
node scripts/generate_image.js -p "Put the character from Image 1 into the scene of Image 2" -i ref1.png ref2.png -f "merged.png"
```

Command Line Parameter Description

Parameters are consistent between Python and Node.js versions (short parameters are equivalent to long parameters).

| Parameter | Required | Description |
| --- | --- | --- |
| `-p` / `--prompt` | Yes | Image description (text-to-image) or editing instruction (image-to-image). Retain the user's original complete input. |
| `-f` / `--filename` | No | Output image path/filename; if not provided, a filename with a timestamp is generated automatically. |
| `-s` / `--size` | No | Output size: 1024x1024 / 1536x1024 / 1024x1536 / 2048x2048 / 2048x1152 / 3840x2160 / 2160x3840 or a custom size. |
| `-q` / `--quality` | No | Quality level: low / medium / high / auto (default auto). |
| `-o` / `--output-format` | No | Output format: png (default) / jpeg / webp. |
| `-c` / `--output-compression` | No | Output compression rate (0-100), only valid for jpeg/webp. |
| `-i` / `--input-image` | No | Input image path for image-to-image; multiple images can be passed (up to 5). Passing this parameter enters edit mode. |
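
Step 1 asks for a content-based value for `-f` that also carries a random identifier. One way to build such a name is sketched below; `make_filename` is a hypothetical helper, and its naming pattern is an illustration rather than the format the scripts use when they generate a timestamped name themselves.

```python
import re
import secrets
import time


def make_filename(description, ext="png", now=None):
    """Build '<slug>_<timestamp>_<random>.<ext>' from a short content description."""
    slug = re.sub(r"[^a-z0-9]+", "_", description.lower())[:40].strip("_") or "image"
    stamp = time.strftime("%Y%m%d_%H%M%S", time.localtime(now))
    # Six random hex characters keep repeated runs from overwriting each other.
    return f"{slug}_{stamp}_{secrets.token_hex(3)}.{ext}"


if __name__ == "__main__":
    print(make_filename("Cat in garden"))
```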

File Resource Description

| Resource | Description |
| --- | --- |
| `scripts/generate_image.js` | Node.js version (zero dependencies, preferred) |
| `scripts/generate_image.py` | Python version (alternative) |
| `references/size-guide.md` | Size and aspect ratio control document; load on demand when needed |
| `references/batch-template.md` | Batch generation configuration template; load on demand when batch generation is needed |

Batch Generation

When users need to generate multiple images at once (batch generation):

1. Load the configuration template: references/batch-template.md, which includes the JSON configuration format description and usage examples.
2. Obtain or generate the JSON file: Users can provide their own JSON file, or describe their requirements and let the AI generate one.
3. Execute one by one: Read the prompts array, run the generation command for each entry in turn, and report the result after each image is completed.
4. Summary feedback: After completion, tell the user how many images succeeded and list the image paths.

Note: Total time for batch tasks = single image time (120-150 seconds) × number of images, please inform the user of the estimated duration in advance.
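
As a worked example of the estimate above, the sketch below multiplies the 120-150 second single-image range by the number of images. The helper names are hypothetical, and the figures are only the nominal range quoted in this document; high-quality 2K/4K images may take longer.

```python
import math


def estimate_batch_seconds(image_count, per_image=(120, 150)):
    """Return the (low, high) total duration in seconds for sequential generation."""
    low, high = per_image
    return image_count * low, image_count * high


def format_estimate(image_count):
    """Render the estimate as a short message for the user, in whole minutes."""
    low, high = estimate_batch_seconds(image_count)
    return f"{image_count} images: about {low // 60}-{math.ceil(high / 60)} minutes"


if __name__ == "__main__":
    print(format_estimate(4))  # 4 x 120 s = 480 s, 4 x 150 s = 600 s
```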

Quality Description

| Quality | Description | Applicable Scenario |
| --- | --- | --- |
| low | Sketch/batch generation | Quick preview, multiple iterations |
| medium | Daily use | General usage |
| high | Final draft/fine text | Final output, images containing text |
| auto | Default | Determined by the model |

Output Format Description

| Format | Description | Applicable Scenario |
| --- | --- | --- |
| png | Lossless compression, transparent background | Need a transparent background, retain best quality |
| jpeg | Compressed | Photos, storage-space sensitive |
| webp | Modern format | Web usage, balance of quality and size |

Note: The b64_json field is pure base64, without the `data:image/...;base64,` prefix. Clients need to handle it themselves:

- Write to file: `base64.b64decode(b64_str)` → write to disk
- Render in browser: prepend the prefix `data:image/png;base64,` to the b64 string
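
Both ways of handling the b64_json value can be sketched in a few lines of Python. The helper names `save_b64_image` and `to_data_url` are hypothetical; the only library call involved is the standard `base64.b64decode`.

```python
import base64


def save_b64_image(b64_str, path):
    """Decode a pure-base64 string and write the bytes to disk; returns the byte count."""
    data = base64.b64decode(b64_str)
    with open(path, "wb") as handle:
        handle.write(data)
    return len(data)


def to_data_url(b64_str, mime="image/png"):
    """Prepend the data-URL prefix so a browser can render the image directly."""
    return f"data:{mime};base64,{b64_str}"


if __name__ == "__main__":
    sample = base64.b64encode(b"not a real image").decode("ascii")
    print(to_data_url(sample)[:40])
```

The `mime` argument should match the requested output format, for example `image/jpeg` when the image was generated with `-o jpeg`.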

Notes

- An API Key must be set; it can be provided via an environment variable or a command-line parameter.
- Image generation time: approximately 120-150 seconds; complex scenes with high quality + 2K/4K may take longer.
- When editing images, use multipart/form-data to upload reference images.
- Ensure the output directory has write permission.
- Billed by token (not per image).

API Key Setup and Acquisition

How to Obtain API Key

If you don't have an API Key yet, please go to https://api.apiyi.com to register an account and apply for an API Key.

Acquisition steps:

1. Visit https://api.apiyi.com
2. Register or log in to your account
3. Create an API Key in the console
4. Copy the key and set the environment variable, or use it on the command line

Set API Key

The script obtains the API Key from the `APIYI_API_KEY` environment variable.

Set Environment Variable:

```bash
# Linux/Mac
export APIYI_API_KEY="your-api-key-here"
```

On Windows, set the environment variable in the advanced system settings of your computer, or run one of the following commands:

```
# Windows CMD
set APIYI_API_KEY=your-api-key-here

# Windows PowerShell
$env:APIYI_API_KEY="your-api-key-here"
```

API Endpoint Description

Text-to-Image Endpoint: POST /v1/images/generations

Text-to-image endpoint, uses JSON format for requests.
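
A JSON request for this endpoint might be assembled as in the sketch below. Several details are assumptions not confirmed by this document: the JSON field names are taken to mirror the command-line parameters, authentication is assumed to use a Bearer token, and the base URL is left as a caller-supplied argument because the API host is not stated here. The helper names are hypothetical.

```python
import json
import urllib.request

MODEL = "gpt-image-2"


def build_generation_body(prompt, size=None, quality=None,
                          output_format=None, output_compression=None):
    """Encode the request body, omitting options that were not given."""
    if output_compression is not None and output_format not in ("jpeg", "webp"):
        raise ValueError("output_compression only applies to jpeg/webp")
    payload = {"model": MODEL, "prompt": prompt}
    optional = {"size": size, "quality": quality,
                "output_format": output_format,
                "output_compression": output_compression}
    payload.update({key: value for key, value in optional.items() if value is not None})
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


def make_request(base_url, api_key, body):
    """Build (but do not send) a POST request for /v1/images/generations."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/images/generations",
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        method="POST",
    )


if __name__ == "__main__":
    print(build_generation_body("Sunset mountain landscape", size="2048x1152", quality="high"))
```

Sending the request with `urllib.request.urlopen(request, timeout=...)` would need a generous timeout, given the 120-150 second generation time noted in this document.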

Image-to-Image Endpoint: POST /v1/images/edits

Image-to-image endpoint, uses multipart/form-data format for requests. Upload reference images (up to 5) + instructions for single image editing and multi-image fusion.

The order of reference images is meaningful, and "Image 1/Image 2/Image 3" can be used in the prompt to refer to them.
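
The multipart upload, including the ordering of reference images, can be sketched with the standard library alone. The file field name `image[]` comes from the comparison table in this document; the `model` and `prompt` field names and the per-file content type are assumptions, and `build_edit_form` is a hypothetical helper.

```python
import mimetypes
import uuid


def build_edit_form(prompt, images, boundary=None):
    """Encode a prompt plus 1-5 (filename, bytes) pairs as multipart/form-data.

    Returns (content_type_header, body_bytes). Images are written in the order
    given, so "Image 1" in the prompt refers to images[0].
    """
    if not 1 <= len(images) <= 5:
        raise ValueError("the edits endpoint accepts 1 to 5 reference images")
    boundary = boundary or uuid.uuid4().hex
    chunks = []
    for name, value in (("model", "gpt-image-2"), ("prompt", prompt)):
        chunks.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode("utf-8")
        )
    for filename, data in images:
        mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="image[]"; filename="{filename}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        )
        chunks.append(header.encode("utf-8") + data + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(chunks)


if __name__ == "__main__":
    content_type, body = build_edit_form(
        "Put the character from Image 1 into the scene of Image 2",
        [("ref1.png", b"<bytes of ref1>"), ("ref2.png", b"<bytes of ref2>")],
    )
    print(content_type, len(body))
```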

Model Information

Model Name: gpt-image-2
Image Generation Speed: Approximately 120-150 seconds (4K complex scenes may take longer)
Output Resolution: 1024x1024 / 1536x1024 / 1024x1536 / 2048x2048 / 2048x1152 / 3840x2160 / 2160x3840 or custom
Default Response Format: b64_json (pure base64, no prefix)
Quality Levels: low / medium / high / auto
Output Formats: png / jpeg / webp
Supported Capabilities: Text-to-image, single image editing, multi-image fusion
Billing Method: Billed by token

Comparison of gpt-image-2 (Official Formal Version) and gpt-image-2-all (Official Reverse Version)

| Feature | gpt-image-2 | gpt-image-2-all |
| --- | --- | --- |
| Nature | Official formal version | Official reverse version |
| Billing | Billed by token | Fixed $0.03 per image |
| Endpoints | /v1/images/generations, /v1/images/edits | /v1/chat/completions |
| Reference Image Upload | multipart form-data | base64 data URL |
| Image Download | b64_json (pure base64) | url or b64_json (with prefix) |
| Multi-Image Fusion | image[] array (up to 5 images) | chat multiple image_url |
| Size Control | Explicit size parameter | Prompt description |
| Speed | Approximately 120-150 seconds | Approximately 60-300 seconds |

Author Introduction

LoveOnePiece_Ubiquitous
My WeChat Official Account: Ubiquitous Technology

