Image Generation and Editing
The Image Generation Skill lets users generate and edit images from natural-language prompts. It calls the APIYI domestic proxy service and runs in either a Node.js or a Python environment.
Usage Guide
Follow these steps:
Step 1: Requirement Analysis and Parameter Extraction
- Clarify Intent: Determine whether the user needs [Text-to-Image] (generate a new image) or [Image-to-Image] (edit or modify an existing image).
- Prompt Analysis:
  - Use the user's original, complete input: Use the user's full question and requirement description verbatim as the body of the prompt. Do not rewrite, summarize, or embellish it, since that can lose details.
  - Confirm before supplementing: If information is missing (e.g., style, number of subjects, camera language, scene details, text content, prohibited elements), ask the user first. Once the user confirms, append the new details to the end of the original prompt.
  - Examples:
    - User input: "Help me generate a picture of a cat, in a cute style."
    - Correct: Use the user's input directly as the prompt:
      `-p "Help me generate a picture of a cat, in a cute style."`
    - Incorrect: Rewriting it without permission as "Generate a picture of a cat in cute style" loses the details and tone of the user's original input.
    - If details need to be added (e.g., color, background), ask first: "What color do you want the cat to be? Any requirements for the background?" After the user replies, append the answer to the prompt:
      `-p "Help me generate a picture of a cat, in a cute style. The cat is orange, and the background is grass."`
- Key Parameter Organization:
  - Prompt (Required): The final prompt after analysis. By default this is the user's original input, complete and unchanged; append supplementary information only after the user confirms it.
  - Filename (Optional): Output image filename or path. It must include a random identifier to avoid duplicates. If omitted, the script generates a filename with a timestamp. Prefer a descriptive filename based on the content over a generic name.
  - Aspect Ratio (Optional): Infer the ratio from the user's description. For example:
    - "Mobile wallpaper" -> `9:16`
    - "Computer wallpaper/video cover" -> `16:9`
    - "Avatar" -> `1:1`
    - Default: If the user does not specify an aspect ratio, leave it empty.
  - Resolution (Optional):
    - Use the default resolution (listed as `2K` under Image Parameter Description) unless the user asks otherwise.
    - Use `4K` only for extreme high-definition needs or when the user requests it, and politely tell the user that generation will be slow and to wait patiently.
    - Note: Parameter values must be uppercase (`1K`, `2K`, `4K`).
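The parameter rules above can be sketched in a few helper functions. This is only an illustration, not part of the skill's scripts: the names `build_prompt`, `infer_aspect_ratio`, `make_filename`, and the `RATIO_HINTS` keyword table are hypothetical, and the ratios are taken from the aspect ratio table later in this guide.

```python
import secrets
import time

# Hypothetical keyword table; ratios follow the aspect ratio table in this guide.
RATIO_HINTS = {
    "mobile wallpaper": "9:16",
    "computer wallpaper": "16:9",
    "video cover": "16:9",
    "avatar": "1:1",
}


def build_prompt(original, confirmed_details=()):
    """Keep the user's original text verbatim and append confirmed details."""
    parts = [original.strip()]
    parts.extend(d.strip() for d in confirmed_details if d.strip())
    return " ".join(parts)


def infer_aspect_ratio(text):
    """Return a ratio when a known hint appears, else None (leave it empty)."""
    lowered = text.lower()
    for hint, ratio in RATIO_HINTS.items():
        if hint in lowered:
            return ratio
    return None


def make_filename(stem):
    """Build a content-based PNG filename with a timestamp and a random suffix."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    return f"{stem}-{stamp}-{secrets.token_hex(3)}.png"
```

The original prompt is never edited in place; confirmed details are only appended, which matches the "confirm first, then append" rule.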
Step 2: Environment Check and Command Execution
- Check Environment: Confirm that the `APIYI_API_KEY` environment variable is set (it is usually assumed to be set; prompt the user if execution fails).
- Build and Run Commands:
  - Prefer the Node.js version: If Node is available in the environment (the `node` command works), use `scripts/generate_image.js` (zero dependencies; same parameters as the Python version).
  - Fall back to the Python version if Node is unavailable: use `scripts/generate_image.py`.
Text-to-Image Command Template (Priority: Node.js):
```bash
node scripts/generate_image.js -p "{prompt}" -f "{filename}" [-a {ratio}] [-r {res}]
```
Image-to-Image Command Template (Priority: Node.js):
```bash
node scripts/generate_image.js -p "{edit_instruction}" -i "{input_path}" -f "{output_filename}" [-r {res}]
```
(Optional) Python Version Command Template (When Node is Unavailable):
```bash
python scripts/generate_image.py -p "{prompt}" -f "{filename}" [-a {ratio}] [-r {res}]
python scripts/generate_image.py -p "{edit_instruction}" -i "{input_path}" -f "{output_filename}" [-r {res}]
```
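As a rough sketch of the "Node first, Python otherwise" rule, the templates above could be assembled programmatically as follows. The function `build_command` and its `which` parameter are hypothetical helpers for illustration; only the script paths and the `-p`, `-f`, `-i`, `-a`, `-r` flags come from this guide.

```python
import shutil


def build_command(prompt, filename, ratio=None, res=None, inputs=(),
                  which=shutil.which):
    """Return the argv list for the generation script, preferring Node.js."""
    if which("node"):
        cmd = ["node", "scripts/generate_image.js"]
    else:
        cmd = ["python", "scripts/generate_image.py"]
    cmd += ["-p", prompt, "-f", filename]
    if inputs:
        # Passing input images switches the script to edit mode.
        cmd += ["-i", *inputs]
    if ratio:
        cmd += ["-a", ratio]
    if res:
        # The guide requires uppercase values (1K, 2K, 4K).
        cmd += ["-r", res.upper()]
    return cmd
```

Returning a list rather than a single string means the prompt can be handed to `subprocess.run(cmd)` without shell quoting, so quotes or spaces in the user's original text are passed through unchanged.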
⏱️ Long-running Task Handling Strategy
1. Pre-task Prompt
Must inform the user before execution:
- "Image generation has started, expected to take 25 seconds to 5 minutes"
2. 🎨 Best Practice Examples
- Quick Generation Scenario (1K Resolution)
"Quick Mode: Generate with 1K resolution, expected to complete within 30 seconds"
- High-quality Generation Scenario (2K/4K Resolution)
"High-quality Mode: Generate with 2K resolution, expected to take 1-4 minutes\n⏳ Starting generation... 🔄"
Step 3: Result Feedback
- Execution Feedback: Wait for the terminal command to finish.
- Success: Tell the user that the image has been generated and give the save path.
- Failure:
  - If the error reports a missing API Key, guide the user to set the `APIYI_API_KEY` environment variable.
  - If the error reports a network problem, suggest that the user check the network or try again later.
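The pre-task prompt and the result feedback can be sketched together. This is a minimal illustration under stated assumptions: the function names are hypothetical, and it assumes the script exits with code 0 on success and writes an error message mentioning "API Key" or "network" to stderr, which this guide does not confirm.

```python
def pre_task_message(res=None):
    """Pick the message to show the user before generation starts."""
    if res == "1K":
        return ("Quick Mode: Generate with 1K resolution, "
                "expected to complete within 30 seconds")
    if res in ("2K", "4K"):
        return (f"High-quality Mode: Generate with {res} resolution, "
                "expected to take 1-4 minutes")
    return "Image generation has started, expected to take 25 seconds to 5 minutes"


def result_feedback(returncode, stderr, output_path):
    """Map the finished command to a user-facing message.

    Assumption: exit code 0 means success, and stderr names the failure.
    """
    if returncode == 0:
        return f"Image generated and saved to {output_path}"
    lowered = stderr.lower()
    if "api key" in lowered:
        return "API Key is missing: set the APIYI_API_KEY environment variable and retry."
    if "network" in lowered:
        return "Network error: check your connection or try again later."
    return f"Generation failed: {stderr.strip()}"
```

In practice the two calls would bracket a `subprocess.run(cmd, capture_output=True, text=True)` invocation, with `returncode` and `stderr` read from its result.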
Command Line Usage Examples
Generate New Images
```bash
python scripts/generate_image.py -p "Image description text" -f "output.png" [-a 1:1] [-r 1K]
```
Example:
```bash
# Basic generation
python scripts/generate_image.py -p "A cute orange cat playing on the grass" -f "cat.png"
# Specify aspect ratio and resolution
python scripts/generate_image.py -p "Sunset mountain landscape" -f "sunset.png" -a 16:9 -r 4K
# Vertical high-definition image (suitable for mobile wallpaper)
python scripts/generate_image.py -p "City night view" -f "city.png" -a 9:16 -r 2K
```
(Optional) Node.js Version Example:
```bash
# Basic generation
node scripts/generate_image.js -p "A cute orange cat playing on the grass" -f "cat.png"
# Specify aspect ratio and resolution
node scripts/generate_image.js -p "Sunset mountain landscape" -f "sunset.png" -a 16:9 -r 4K
```
Edit Existing Images
```bash
python scripts/generate_image.py -p "Editing instruction" -f "output.png" -i "path/to/input.png" [-a 1:1] [-r 1K]
```
Example:
```bash
# Modify style
python scripts/generate_image.py -p "Convert the image to watercolor style" -f "watercolor.png" -i "original.png"
# Add elements
python scripts/generate_image.py -p "Add a rainbow to the sky" -f "rainbow.png" -i "landscape.png" -r 2K
# Replace background
python scripts/generate_image.py -p "Change the background to a beach" -f "beach-bg.png" -i "portrait.png" -a 3:4
```
(Optional) Node.js Version Example:
```bash
# Modify style
node scripts/generate_image.js -p "Convert the image to watercolor style" -f "watercolor.png" -i "original.png"
# Multiple reference images (up to 14)
node scripts/generate_image.js -p "Fuse styles by referencing multiple images" -i ref1.png ref2.png ref3.png -f "merged.png"
```
Additional Resources
- Common Usage Scenarios Document: references/scene.md
Command Line Parameter Description
Parameters for Python and Node.js versions are consistent (short parameters are equivalent to long parameters).
| Parameter | Required | Description |
|---|---|---|
| `-p` | Yes | Image description (text-to-image) or editing instruction (image-to-image). Keep the user's original complete input. |
| `-f` | No | Output image path/filename; if omitted, a PNG filename with a timestamp is generated automatically and the image is saved to the current directory. |
| `-a` | No | Image aspect ratio: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, `2:3`, `5:4`, `4:5`, `21:9`. |
| `-r` | No | Image resolution: `1K` / `2K` / `4K` (must be uppercase). If omitted, it is not specified in the request and is determined by the API side. |
| `-i` | No | Input image path for image-to-image; multiple images can be passed (up to 14). Passing this parameter enters edit mode. |
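For readers who want to see how such a parameter set could be parsed, here is a minimal `argparse` sketch. It is not the skill's actual parser: it uses only the short flags that appear in this guide's commands (including `-k` from the API Key section), the `dest` names are hypothetical, and it rejects invalid resolution values outright, which is stricter than the behavior described in the Notes.

```python
import argparse

ASPECT_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4",
                 "3:2", "2:3", "5:4", "4:5", "21:9"]
RESOLUTIONS = ["1K", "2K", "4K"]
MAX_INPUT_IMAGES = 14


def build_parser():
    """Create a parser mirroring the parameter table (short flags only)."""
    parser = argparse.ArgumentParser(description="Generate or edit an image")
    parser.add_argument("-p", dest="prompt", required=True,
                        help="image description or editing instruction")
    parser.add_argument("-f", dest="filename", help="output path/filename")
    parser.add_argument("-a", dest="aspect_ratio", choices=ASPECT_RATIOS)
    parser.add_argument("-r", dest="resolution", choices=RESOLUTIONS)
    parser.add_argument("-i", dest="inputs", nargs="+", default=[],
                        help="input image(s); enables edit mode")
    parser.add_argument("-k", dest="api_key", help="API Key (temporary use)")
    return parser


def parse_args(argv):
    """Parse argv and enforce the 14-image limit from the table."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if len(args.inputs) > MAX_INPUT_IMAGES:
        parser.error(f"at most {MAX_INPUT_IMAGES} input images are supported")
    # Passing any input image switches to edit (image-to-image) mode.
    args.edit_mode = bool(args.inputs)
    return args
```

For example, `parse_args(["-p", "Fuse styles", "-i", "ref1.png", "ref2.png", "-f", "merged.png"])` yields two inputs with `edit_mode` set to `True`.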
Image Parameter Description
aspect_ratio - Image Aspect Ratio
Supports the following 10 ratios:
| Aspect Ratio | Orientation | Applicable Scenario |
|---|---|---|
| 1:1 | Square | Avatars, Instagram posts |
| 16:9 | Landscape | YouTube thumbnails, desktop wallpapers, presentations |
| 9:16 | Portrait | Douyin/TikTok, Instagram Stories, mobile wallpapers |
| 4:3 | Landscape | Classic photos, presentations |
| 3:4 | Portrait | Pinterest, portrait photography |
| 3:2 | Landscape | DSLR standard, print media |
| 2:3 | Portrait | Portrait posters |
| 5:4 | Landscape | Large-format printing, art printing |
| 4:5 | Portrait | Instagram posts, social media |
| 21:9 | Ultra-wide | Cinematic feel, banners, panoramas |
resolution - Image Resolution
Three resolution options: 1K, 2K, 4K
Note: Resolution values must be uppercase (1K, 2K, 4K)
Default: 2K
Notes
- API Key must be set, which can be provided via environment variable or command line parameter
- Resolution parameters must be uppercase (1K/2K/4K); lowercase will default to 1K
- Image generation time: 25 seconds to 5 minutes, depending on resolution and server load
- When editing images, the input image will be automatically converted to base64 encoding
- Ensure the output directory has write permission
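The base64 conversion mentioned in the notes can be illustrated with Python's standard library. This is a sketch only: `encode_image` is a hypothetical name, and how the scripts embed the encoded string in the API request is not shown in this guide.

```python
import base64
from pathlib import Path


def encode_image(path):
    """Read an image file and return its contents as a base64 ASCII string."""
    raw = Path(path).read_bytes()
    return base64.b64encode(raw).decode("ascii")
```

The file is read as raw bytes, so the same function works for PNG, JPEG, or any other input format.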
API Key Setup and Acquisition
How to Get an API Key
If you don't have an API Key yet, visit https://api.apiyi.com to register an account and apply for one.
Acquisition Steps:
- Visit https://api.apiyi.com
- Register/Log in to your account
- Create an API Key in the console
- Copy the key and set the environment variable or use it in the command line
Set Up API Key
The script looks for the API Key in the following order:
- `-k` command line parameter (temporary use)
- `APIYI_API_KEY` environment variable (recommended)
Set Environment Variable (Recommended):
```bash
# Linux/Mac
export APIYI_API_KEY="your-api-key-here"
# Windows CMD (alternatively, set the variable under Advanced System Settings)
set APIYI_API_KEY=your-api-key-here
# Windows PowerShell
$env:APIYI_API_KEY="your-api-key-here"
```
Command Line Parameter Method (Temporary):
```bash
python scripts/generate_image.py -p "A cat" -k "your-api-key-here"
```
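The lookup order described in this section (command line parameter first, then environment variable) might look like the following. `resolve_api_key` and its `env` parameter are hypothetical; only the `-k` flag and the `APIYI_API_KEY` variable name come from this guide.

```python
import os


def resolve_api_key(cli_key=None, env=None):
    """Return the API Key, preferring the -k value over APIYI_API_KEY."""
    env = os.environ if env is None else env
    key = cli_key or env.get("APIYI_API_KEY")
    if not key:
        raise RuntimeError("API Key not found: pass -k or set APIYI_API_KEY")
    return key
```

Accepting `env` as an argument keeps the function easy to test without touching the real process environment.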
About the Author
- Everywhere Who Loves One Piece
- My WeChat Official Account: Ubiquitous Technology