qwencloud-image-generation
[QwenCloud] Generate and edit images using Wan and Qwen Image models. Supports text-to-image, image editing (style transfer, subject consistency, text rendering), and interleaved text-image output. TRIGGER when: user wants to create illustrations, product images, artistic designs, posters, text-to-image generation, edit/transform existing images, apply style transfer, generate images based on reference photos, interleaved text-image content, mentions Wan/Qwen Image models/AI art creation, or explicitly invokes this skill by name (e.g. use qwencloud-image-generation). DO NOT TRIGGER when: user wants to understand/analyze existing images or OCR (use qwencloud-vision), video generation (use qwencloud-video-generation), text-only tasks.
Source: qwencloud/qwencloud-ai
Install: `npx skill4agent add qwencloud/qwencloud-ai qwencloud-image-generation`
Agent setup: If your agent doesn't auto-load skills (e.g. Claude Code), see agent-compatibility.md once per session.
Qwen Image Generation
Generate and edit images using Wan and Qwen Image models. Supports text-to-image, reference-image editing (style
transfer, subject consistency, multi-image composition, text rendering), and interleaved text-image output.
This skill is part of qwencloud/qwencloud-ai.
Skill directory
Use this skill's internal files to execute and learn. Load reference files on demand when the default path fails or you need details.
| Location | Purpose |
|---|---|
| `scripts/image.py` | Default execution: sync/async, upload, download |
| execution-guide.md | Fallback: curl (sync/async), code generation |
| | Prompt formulas, style keywords, negative_prompt, prompt_extend decision |
| api-guide.md | API supplement |
| sources.md | Official documentation URLs |
| agent-compatibility.md | Agent self-check: register skills in project config for agents that don't auto-load |
Security
NEVER output any API key or credential in plaintext. Always use variable references (`$DASHSCOPE_API_KEY` in shell, `os.environ["DASHSCOPE_API_KEY"]` in Python). Any check or detection of credentials must be non-plaintext: report only status (e.g. "set" / "not set", "valid" / "invalid"), never the value. Never display contents of `.env` or config files that may contain secrets.
When the API key is not configured, NEVER ask the user to provide it directly. Instead, help create a `.env` file with a placeholder (`DASHSCOPE_API_KEY=sk-your-key-here`) and instruct the user to replace it with their actual key from the QwenCloud Console. Only write the actual key value if the user explicitly requests it.
Key Compatibility
Scripts require a standard QwenCloud API key (`sk-...`). Coding Plan keys (`sk-sp-...`) cannot be used: image generation models are not available on Coding Plan, and Coding Plan does not support the native QwenCloud API. The script detects `sk-sp-` keys at startup and prints a warning. If qwencloud-ops-auth is installed, see its `references/codingplan.md` for full details.
Mode Selection Guide
| User Wants | Mode | Model |
|---|---|---|
| Generate image from text only | t2i | `wan2.6-t2i` |
| Edit image / apply style transfer based on 1–4 reference images | image-edit | `wan2.6-image` |
| Subject consistency: generate new images maintaining subject from references | image-edit | `wan2.6-image` |
| Multi-image composition: combine style from one image, background from another | image-edit | `wan2.6-image` |
| Single-image editing preserving subject consistency | i2i | `wan2.5-i2i-preview` |
| Multi-image fusion: place object from one image into another scene | i2i | `wan2.5-i2i-preview` |
| Interleaved text-image output (e.g., tutorials, step-by-step guides) | interleave | `wan2.6-image` |
| Fast text-to-image drafts | t2i | `wan2.2-t2i-flash` |
| Edit text within images, precise element manipulation | image-edit | |
| Multi-image fusion with realistic textures | image-edit | |
| Posters / complex Chinese+English text rendering | t2i | |
| Text-to-image with fixed aspect ratios (batch) | t2i | |
Model Selection
Wan Series (default)
| Model | Use Case |
|---|---|
| wan2.6-t2i | Recommended for text-to-image — sync + async, best quality |
| wan2.6-image | Image editing ONLY (NOT for pure text-to-image): requires reference images or `enable_interleave: true` |
| wan2.5-i2i-preview | Image editing — single-image editing with subject consistency, multi-image fusion (up to 3 images), async-only |
| wan2.5-t2i-preview | Preview — free size within constraints |
| wan2.2-t2i-flash | Fast — lower latency |
| wan2.2-t2i-plus | Professional — improved stability |
Qwen Image Series
| Model | Use Case |
|---|---|
| qwen-image-2.0-pro | Fused generation + editing — text rendering, realistic textures, multi-image (1–3 input, 1–6 output) |
| qwen-image-2.0 | Accelerated generation + editing |
| qwen-image-edit-max | Image editing — 1–6 output images |
| qwen-image-edit-plus | Image editing — 1–6 output images |
| qwen-image-edit | Image editing — 1 output image only |
| qwen-image-plus | Text-to-image — fixed resolutions only (async) |
| qwen-image-max | Text-to-image — fixed resolutions only |
Qwen Image editing models (`qwen-image-2.0-pro`, `qwen-image-2.0`, `qwen-image-edit-max/plus/edit`) use the same sync endpoint as `wan2.6-image` (`/multimodal-generation/generation`) with `messages` format. They support text editing in images, element add/delete/replace, style transfer, and multi-image fusion (1–3 input images). Size range: 512x512 to 2048x2048. `qwen-image-2.0-pro` and `qwen-image-2.0` also support pure text-to-image (no reference images needed).
Qwen Image text-to-image models (`qwen-image-plus`, `qwen-image-max`) use a different endpoint (`/text2image/image-synthesis`) with `input.prompt` format (async-only). They support only 5 fixed resolutions: 1664*928, 1472*1104, 1328*1328, 1104*1472, 928*1664.
Choosing between `wan2.6-image` and `wan2.5-i2i-preview` for image editing:
- `wan2.6-image` supports up to 4 images, higher resolution (2K), interleaved text-image output, and sync mode. Use for multi-image style composition, interleaved tutorials.
- `wan2.5-i2i-preview` uses a simpler prompt-only editing interface (no messages format), supports up to 3 images, async-only. Use for straightforward single-image edits and multi-image object fusion.

Selection rules:
- User specified a model → use directly.
- Consult the qwencloud-model-selector skill when model choice depends on requirement, scenario, or pricing.
- Text-to-image (prompt only, no reference images) → always use `wan2.6-t2i` (default). NEVER use `wan2.6-image` for pure text-to-image: it will error without reference images or `enable_interleave: true`.
- Reference images / image editing / interleaved output → `wan2.6-image` (preferred) or `wan2.5-i2i-preview`.
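The selection rules above can be sketched as a small function. This is only an illustration: `choose_model` and its parameters are hypothetical names, not part of the skill's scripts, and the sketch covers just the Wan defaults (it does not consult qwencloud-model-selector or the Qwen Image series).

```python
from typing import Optional, Sequence


def choose_model(reference_images: Sequence[str] = (),
                 interleave: bool = False,
                 user_model: Optional[str] = None) -> str:
    """Pick a default model name following the selection rules listed above."""
    if user_model:
        # A model named by the user is used directly.
        return user_model
    if reference_images or interleave:
        # Editing and interleaved output need wan2.6-image (the preferred choice).
        return "wan2.6-image"
    # Prompt-only text-to-image always goes to wan2.6-t2i.
    return "wan2.6-t2i"
```

Keeping the user override first mirrors the order of the rules, so an explicit request is never silently replaced by a default.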
⚠️ Important: The model list above is a point-in-time snapshot and may be outdated. Model availability changes frequently. Always check the official model list for the authoritative, up-to-date catalog before making model decisions.
Execution
⚠️ Multiple artifacts: When generating multiple files in a single session, you MUST append a numeric suffix to each filename (e.g., `out_1.png`, `out_2.png`) to prevent overwrites.
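A minimal sketch of the numeric-suffix rule, assuming the caller knows in advance how many files it will write. `numbered_outputs` is a hypothetical helper, not something the skill's scripts are known to provide.

```python
from pathlib import Path
from typing import List


def numbered_outputs(base: str, count: int) -> List[Path]:
    """Return `count` paths derived from `base` with _1, _2, ... before the extension."""
    path = Path(base)
    return [path.with_name(f"{path.stem}_{i}{path.suffix}") for i in range(1, count + 1)]
```

For example, `numbered_outputs("output/qwencloud-image-generation/images/out.png", 2)` yields paths ending in `out_1.png` and `out_2.png` in the same directory.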
Prerequisites
- API Key: Check that `DASHSCOPE_API_KEY` (or `QWEN_API_KEY`) is set using a non-plaintext check only (e.g. in shell: `[ -n "$DASHSCOPE_API_KEY" ]`; report only "set" or "not set", never the key value). If not set: run the qwencloud-ops-auth skill if available; otherwise guide the user to obtain a key from QwenCloud Console and set it via `.env` file (`echo 'DASHSCOPE_API_KEY=sk-your-key-here' >> .env` in project root or current directory) or environment variable. The script searches for `.env` in the current working directory and the project root. Skills may be installed independently, so do not assume qwencloud-ops-auth is present.
- Python 3.9+ (stdlib only, no pip install needed)
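The non-plaintext key check can be sketched in Python as follows. `key_status` is a hypothetical helper; it only inspects environment variables (not `.env` files) and is not the script's own detection logic, which is not shown in this document. It returns a status string and never the key itself.

```python
import os
from typing import Mapping, Optional


def key_status(env: Optional[Mapping[str, str]] = None) -> str:
    """Report whether an API key is configured without ever returning its value."""
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY") or env.get("QWEN_API_KEY") or ""
    if not key:
        return "not set"
    if key.startswith("sk-sp-"):
        # Coding Plan keys cannot be used for image generation.
        return "set (Coding Plan key, not usable for image generation)"
    return "set"
```

Passing a mapping instead of reading `os.environ` directly keeps the function easy to exercise without touching real credentials.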
Environment Check
Before first execution, verify Python is available:
```bash
python3 --version  # must be 3.9+
```
If `python3` is not found, try `python --version` or `py -3 --version`. If Python is unavailable or below 3.9, skip to Path 2 (curl) in execution-guide.md.
Default: Run Script
Script path: Scripts are in the `scripts/` subdirectory of this skill's directory (the directory containing this SKILL.md). You MUST first locate this skill's installation directory, then ALWAYS use the full absolute path to execute scripts. Do NOT assume scripts are in the current working directory. Do NOT use `cd` to switch directories before execution.
Execution note: Run all scripts in the foreground: wait for stdout; do not background.
Discovery: Run `python3 <this-skill-dir>/scripts/image.py --help` first to see all available arguments.
```bash
# Text-to-image (wan2.6-t2i, default)
python3 <this-skill-dir>/scripts/image.py \
  --request '{"prompt":"A cozy flower shop with wooden door"}' \
  --output output/qwencloud-image-generation/images/out.png \
  --print-response

# Image editing with reference images (wan2.6-image)
python3 <this-skill-dir>/scripts/image.py \
  --model wan2.6-image \
  --request '{"prompt":"Apply watercolor painting style to this photo","reference_images":["https://example.com/photo.jpg"],"n":1,"size":"1K"}' \
  --output output/qwencloud-image-generation/images/out.png \
  --print-response
```
More examples (interleaved output, wan2.5-i2i, qwen-image-2.0-pro, qwen-image-plus): See execution-guide.md
| Argument | Description |
|---|---|
| `--request` | JSON request body |
| | Load request from file |
| | Force async mode (required for wan2.5 and older; auto-enabled for qwen-image-plus/max and interleaved output) |
| `--model` | Override model |
| `--output` | Save image to path (or directory for multi-image output) |
| `--print-response` | Print response JSON to stdout |
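When the script is driven from Python rather than a shell, building the argument list programmatically avoids JSON quoting mistakes. The sketch below is an assumption-laden illustration: `build_image_command` is a hypothetical helper, and it uses only the flags that appear in the examples above (`--request`, `--output`, `--model`, `--print-response`).

```python
import json
from pathlib import Path
from typing import List, Optional


def build_image_command(skill_dir: str, request: dict, output: str,
                        model: Optional[str] = None) -> List[str]:
    """Build an argv list that runs scripts/image.py via its absolute path."""
    script = Path(skill_dir).resolve() / "scripts" / "image.py"
    cmd = [
        "python3", str(script),
        "--request", json.dumps(request, ensure_ascii=False),
        "--output", output,
        "--print-response",
    ]
    if model:
        cmd += ["--model", model]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`, which blocks until the script exits and so matches the foreground-execution note.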
Verify Result
- Exit code `0` + output contains valid JSON with `output.results` or `output.task_id` → success
- Non-zero exit, HTTP error, empty response, or error JSON → fail
- Async: submission must return `output.task_id`; poll must reach `task_status: SUCCEEDED`
- Post-execution check: Verify the output file exists and has non-zero size (`ls -la <output_path>`)
- MANDATORY stderr signal check: After confirming the result, scan the command's stderr output for `[ACTION_REQUIRED]` or `[UPDATE_AVAILABLE]`. If either signal is present, you MUST follow the instructions in Update Check below before responding to the user.
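The success criteria above can be combined into one check. This is a sketch under an assumption: it treats stdout as containing only the response JSON printed by `--print-response`, which this document does not confirm. `verify_result` is a hypothetical name.

```python
import json
import os


def verify_result(exit_code: int, stdout: str, output_path: str) -> bool:
    """Return True only if exit code, response JSON, and output file all look valid."""
    if exit_code != 0:
        return False
    try:
        payload = json.loads(stdout)
    except json.JSONDecodeError:
        return False  # empty or non-JSON response
    output = payload.get("output") if isinstance(payload, dict) else None
    if not isinstance(output, dict):
        return False
    if "results" not in output and "task_id" not in output:
        return False
    # Post-execution check: the file must exist and be non-empty.
    return os.path.isfile(output_path) and os.path.getsize(output_path) > 0
```

The stderr signal scan is deliberately left out here, since it triggers a separate follow-up rather than deciding success or failure.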
On Failure
If the script fails, match the error output against the diagnostic table below to determine the resolution. If no match, read execution-guide.md for alternative paths: curl commands (Path 2 — sync and async), code generation (Path 3), and autonomous resolution (Path 5).
If Python is not available at all → skip directly to Path 2 (curl) in execution-guide.md.
| Error Pattern | Diagnosis | Resolution |
|---|---|---|
| | Python not on PATH | Try `python` or `py -3` |
| | Script version check failed | Upgrade Python to 3.9+ |
| | Python < 3.9 | Upgrade Python to 3.9+ |
| | Missing API key | Obtain key from QwenCloud Console; add to `.env` |
| | Invalid or mismatched key | Run qwencloud-ops-auth (non-plaintext check only); verify key is valid |
| | SSL cert issue (proxy/corporate) | macOS: run |
| | Network unreachable | Check internet; set |
| | Rate limited | Wait and retry with backoff |
| | Server error | Retry with backoff |
| | Can't write output | Use |
Quick Reference
Request Fields (Common)
| Field | Type | Description |
|---|---|---|
| `prompt` | string | Text description of the image to generate (required) |
| `negative_prompt` | string | Content to avoid in the image (max 500 chars) |
| `size` | string | Resolution |
| | int | Random seed for reproducibility [0, 2147483647] |
| | string | |
| `prompt_extend` | bool | Enable prompt rewriting (default: true; image editing mode only) |
Request Fields (wan2.6-image — Image Editing)
| Field | Type | Description |
|---|---|---|
| `reference_images` | string[] | 1–4 image URLs or local paths for editing mode; 0–1 for interleave mode |
| | string | Single image URL/path (shorthand) |
| | bool | |
| `n` | int | Number of images to generate in editing mode (1–4, default: 1). Billed per image. |
| | int | Max images in interleave mode (1–5, default: 5). Billed per image. |
| | bool | Add "AI Generated" watermark (default: false) |
Other Models (wan2.5-i2i, qwen-image-edit, qwen-image-plus/max)
These models have specific parameter requirements:
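| Model | Key Differences |
|---|---|
| `wan2.5-i2i-preview` | async-only, 1–3 images |
| `qwen-image-edit` series | 1–3 images, n=1–6 (except `qwen-image-edit`: 1 output image only) |
| `qwen-image-plus` / `qwen-image-max` | async-only, n fixed at 1, 5 fixed resolutions only |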
Full parameter tables: See api-guide.md for detailed parameters.
Size Reference (wan2.6-image)
- Editing mode: `1K` (default, ~1280×1280) or `2K` (~2048×2048)
- Interleave mode: pixel dimensions with total pixels in [768×768, 1280×1280]

Common aspect ratios: `1280*1280` (1:1), `960*1280` (3:4), `1280*960` (4:3), `720*1280` (9:16), `1280*720` (16:9)
Response Fields
| Field | Description |
|---|---|
| `image_url` | URL of generated image (24h validity). Use this when chaining to another skill. |
| | Array of all image URLs (multi-image output, wan2.6-image, qwen-image-edit) |
| | Number of generated images |
| `local_path` | Local file path of the downloaded image. Use this for user preview or non-API operations. |
| | Array of local file paths (multi-image output) |
| | Array of |
| | Image dimensions |
| | Seed used |
API Details
- Sync endpoint (wan2.6-t2i, wan2.6-image editing, qwen-image-edit series): `POST /api/v1/services/aigc/multimodal-generation/generation`
- Async endpoint (wan2.6 and older t2i): `POST /api/v1/services/aigc/image-generation/generation` with `X-DashScope-Async: enable`
- Async endpoint (wan2.5-i2i-preview): `POST /api/v1/services/aigc/image2image/image-synthesis` with `X-DashScope-Async: enable`
- Async endpoint (qwen-image-plus, qwen-image-max): `POST /api/v1/services/aigc/text2image/image-synthesis` with `X-DashScope-Async: enable`
- wan2.6-t2i resolution: Total pixels in [1280x1280, 1440x1440], aspect ratio [1:4, 4:1]
- wan2.6-image resolution: Editing mode [768x768, 2048x2048]; Interleave mode [768x768, 1280x1280]; aspect ratio [1:4, 4:1]
- Input images (wan2.6-image): JPEG/JPG/PNG/BMP/WEBP, 240–8000px per dimension, ≤10MB
- Local files: Script auto-uploads to DashScope temp storage (`oss://` URL, 48h TTL). Pass local paths directly; no manual upload step is needed.
- Production: Default temp storage has a 48h TTL and a 100 QPS upload limit, so it is not suitable for production, high-concurrency, or load-testing. To use your own OSS bucket, set `QWEN_TMP_OSS_BUCKET` and `QWEN_TMP_OSS_REGION` in `.env`, run `pip install alibabacloud-oss-v2`, and provide credentials via `QWEN_TMP_OSS_AK_ID`/`QWEN_TMP_OSS_AK_SECRET` or the standard `OSS_ACCESS_KEY_ID`/`OSS_ACCESS_KEY_SECRET`. Use a RAM user with least-privilege (`oss:PutObject` + `oss:GetObject` on target bucket only). If qwencloud-ops-auth is installed, see its `references/custom-oss.md` for the full setup guide.
- Interleaved sync: Requires streaming (`X-DashScope-Sse: enable` + `stream: true`); use async mode via this script instead
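The resolution constraints listed above lend themselves to a pre-flight check before a request is sent. The sketch below covers two cases stated in this document: the wan2.6-t2i pixel and aspect-ratio bounds, and the five fixed resolutions of qwen-image-plus/max. The function names are hypothetical, and the `W*H` string format is assumed from the sizes shown in this document.

```python
from typing import Tuple

# Fixed resolutions for qwen-image-plus / qwen-image-max, as listed in this document.
QWEN_FIXED_SIZES = {"1664*928", "1472*1104", "1328*1328", "1104*1472", "928*1664"}


def parse_size(size: str) -> Tuple[int, int]:
    """Parse 'W*H' (or 'WxH') into integers; raises ValueError on malformed input."""
    width, height = size.lower().replace("x", "*").split("*")
    return int(width), int(height)


def valid_wan26_t2i_size(size: str) -> bool:
    """Total pixels in [1280x1280, 1440x1440] and aspect ratio in [1:4, 4:1]."""
    width, height = parse_size(size)
    pixels_ok = 1280 * 1280 <= width * height <= 1440 * 1440
    ratio_ok = 1 / 4 <= width / height <= 4
    return pixels_ok and ratio_ok


def valid_qwen_t2i_size(size: str) -> bool:
    """qwen-image-plus/max accept only the five fixed resolutions."""
    return size in QWEN_FIXED_SIZES
```

Checking locally turns a likely HTTP 400 into an immediate, explainable failure, though the server remains the authority on what it accepts.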
Cross-Skill Chaining
When using generated images as input for another skill (e.g., video-gen i2v, vision analyze):
- Pass `image_url` directly; do NOT download and re-pass as local path
- All downstream scripts detect URL prefixes (`https://`, `oss://`) and pass them through without re-upload
- Use `local_path` only for user preview or non-API operations (e.g., opening in editor)

| Scenario | Use |
|---|---|
| Feed to another skill (video-gen, vision, image-edit) | `image_url` |
| Show to user / open in editor | `local_path` |
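The chaining rule can be sketched as two small helpers. Both names are hypothetical, and the result dictionary is assumed to carry the `image_url` and `local_path` fields described under Response Fields.

```python
URL_PREFIXES = ("https://", "oss://")


def is_passthrough_url(value: str) -> bool:
    """True for references that downstream scripts pass through without re-upload."""
    return value.startswith(URL_PREFIXES)


def pick_image_ref(result: dict, for_api: bool) -> str:
    """Use image_url when feeding another skill, local_path for preview or editing."""
    return result["image_url"] if for_api else result["local_path"]
```

A caller chaining into video generation would use `pick_image_ref(result, for_api=True)`, keeping the local file for display only.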
Error Handling
| HTTP | Meaning | Action |
|---|---|---|
| 401 | Invalid or missing API key | Run qwencloud-ops-auth if available; else prompt user to set key (non-plaintext check only) |
| 400 | Bad request (invalid prompt, size) | Verify parameters and constraints |
| 429 | Rate limited | Retry with exponential backoff |
| 5xx | Server error | Retry with exponential backoff |
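As an illustration of the retry policy in the table above, the following sketch separates the retry decision from the delay schedule. The base delay, cap, and jitter option are assumptions chosen for the example; this document does not specify them.

```python
import random
from typing import List


def should_retry(status: int) -> bool:
    """Retry on 429 and 5xx; 400 and 401 need a parameter or key fix instead."""
    return status == 429 or 500 <= status <= 599


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0,
                   jitter: bool = False) -> List[float]:
    """Exponential delays (base * 2**i), capped, optionally with full jitter."""
    delays = []
    for i in range(attempts):
        delay = min(cap, base * 2 ** i)
        if jitter:
            delay = random.uniform(0, delay)
        delays.append(delay)
    return delays
```

With the defaults, `backoff_delays(5)` gives 1, 2, 4, 8, and 16 seconds; enabling jitter spreads retries out when several requests are rate limited at once.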
Output Location
Prefer the current working directory. Default subdirectory: `./output/qwencloud-image-generation/`.
Write prohibition: NEVER write output files into this skill's installation directory or any `skills/` hierarchy. All generated content must go to `output/` under the current working directory or a user-specified path.
Update Check (MANDATORY Post-Execution)
After script execution, check stderr for action signals. The script automatically detects update status and emits structured signals. You MUST act on them. This section also applies when the user explicitly asks to check for updates.
Signal: `[ACTION_REQUIRED] qwencloud-update-check skill is not installed`
When stderr contains this signal, you MUST present the following options to the user and wait for their response before proceeding. Do NOT silently skip this.
The update-check skill is not installed. It keeps you informed about new qwencloud/qwencloud-ai features and bug fixes.
1. Install now: I'll run the install command for you
2. Skip for now: do nothing this time
3. Never remind me: permanently suppress this prompt
After the user responds:
- User chooses 1 → run: `npx skills add QwenCloud/qwencloud-ai --skill qwencloud-update-check -y`, then run: `python3 <this-skill-scripts-dir>/gossamer.py --dismiss`
- User chooses 2 → run: `python3 <this-skill-scripts-dir>/gossamer.py --dismiss`, then continue normally
- User chooses 3 → run: `python3 <this-skill-scripts-dir>/gossamer.py --never-install`, then continue normally
Signal: [UPDATE_AVAILABLE]
When stderr contains this signal, you MUST append a brief update notice to your response, including the version info and the update command shown in the stderr output.
No signal in stderr
If stderr contains neither `[ACTION_REQUIRED]` nor `[UPDATE_AVAILABLE]`, no action is needed: the skill is installed and up to date (or cached within 24h).
Explicit user request
When the user explicitly asks to check for updates (e.g. "check for updates", "check version"):
- Look for `qwencloud-update-check/SKILL.md` in sibling skill directories.
- If found, run: `python3 <qwencloud-update-check-dir>/scripts/check_update.py --print-response` and report the result.
- If not found, present the install options above.
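The post-execution signal scan described in this section amounts to a substring search over captured stderr. A minimal sketch, with `stderr_signals` as a hypothetical helper name:

```python
from typing import Set

SIGNALS = ("[ACTION_REQUIRED]", "[UPDATE_AVAILABLE]")


def stderr_signals(stderr: str) -> Set[str]:
    """Return the set of update-check signals found in the script's stderr."""
    return {signal for signal in SIGNALS if signal in stderr}
```

An empty set means no action is needed; otherwise each returned signal maps to the handling described in its subsection above.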
References
- execution-guide.md — Fallback paths (curl sync/async, code generation, autonomous)
- api-guide.md — API supplementary guide
- sources.md — Official documentation URLs