gpt-image-edit

🎨 GPT Image Edit — Pro Pack on RunComfy

runcomfy.com · Edit endpoint · Text-to-image sibling · GitHub

OpenAI GPT Image 2 `/edit` endpoint (ChatGPT Images 2.0 image-to-image) on the RunComfy Model API. Strongest in its class at preserving identity through targeted edits and rewriting embedded text in any script (Latin, kana, CJK, Cyrillic, Arabic).

bash

npx skills add agentspace-so/runcomfy-skills --skill gpt-image-edit -g

When to pick this model (vs siblings)

You want	Use
Edit multilingual / embedded text in image	GPT Image Edit ✓
Identity preservation through translated headline variants	GPT Image Edit ✓
Layout-precise edit (move headline, swap CTA, etc.)	GPT Image Edit ✓
Up to 10 reference images	GPT Image Edit ✓
Batch up to 20 images consistently	Nano Banana Edit
Single-shot precise local edit, source-fidelity-first	Flux Kontext
Generate from scratch with GPT Image 2	sibling `gpt-image-2` skill
Batch SKU galleries with stable identity	Nano Banana Edit

Prerequisites

RunComfy CLI: `npm i -g @runcomfy/cli`
RunComfy account: `runcomfy login` opens a browser device-code flow.
CI / containers: set `RUNCOMFY_TOKEN=<token>` instead of running `runcomfy login`.
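
For scripts that run both on a workstation and in CI, a small pre-flight check can report which of the two sign-in paths above applies. This is a minimal sketch: `auth_mode` is a hypothetical helper name, and it only inspects the `RUNCOMFY_TOKEN` variable; it makes no claim about how the CLI itself resolves credentials.

```shell
# Hypothetical pre-flight helper (not part of the RunComfy CLI).
# Prints "env-token" when RUNCOMFY_TOKEN is set and non-empty,
# otherwise "interactive-login".
auth_mode() {
  if [ -n "${RUNCOMFY_TOKEN:-}" ]; then
    echo "env-token"
  else
    echo "interactive-login"
  fi
}

case "$(auth_mode)" in
  env-token) echo "RUNCOMFY_TOKEN is set; skipping runcomfy login" ;;
  *)         echo "RUNCOMFY_TOKEN is not set; run: runcomfy login" ;;
esac
```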

Endpoints + input schema

openai/gpt-image-2/edit

Field	Type	Required	Default	Notes
`prompt`	string	yes	—	Edit instruction. Lead with preservation, end with the change.
`images`	string[]	yes	—	Up to 10 publicly-fetchable HTTPS URLs. First is primary; rest are auxiliary.
`size`	enum	no	`auto`	`auto` (preserve input), `1024_1024` (1:1), `1024_1536` (2:3 portrait), `1536_1024` (3:2 landscape).

`size=auto` preserves the input ratio and is strongly recommended unless the edit explicitly changes framing.
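
Because the schema accepts only four `size` values, a script can reject anything else locally before submitting a request. The sketch below assumes nothing beyond the values listed in the table; `is_valid_size` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only for the size values listed in the
# schema table (auto plus the three fixed sizes).
is_valid_size() {
  case "$1" in
    auto|1024_1024|1024_1536|1536_1024) return 0 ;;
    *) return 1 ;;
  esac
}

size="1024_1536"
if is_valid_size "$size"; then
  echo "size ok: $size"
else
  echo "unsupported size: $size" >&2
fi
```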

How to invoke

Single-ref preservation edit:

bash

runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Keep the person'\''s face, pose, and brand mark unchanged. Replace the background with a soft warm-grey studio sweep and a gentle floor shadow.",
    "images": ["https://.../portrait.jpg"]
  }' \
  --output-dir <absolute/path>

Multilingual text rewrite (preserve everything except the headline):

bash

runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Keep the photograph, layout, and brand mark exactly as in the input. Replace only the in-image headline. The new headline reads \"今日のおすすめ\" in bold Japanese kana, same position and font weight as before.",
    "images": ["https://.../poster-en.jpg"]
  }' \
  --output-dir <absolute/path>

Multi-ref composition:

bash

runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Compose subject from image 1 into the room from image 2. Match the lighting and color palette of image 2. Keep image 1 subject identity (face, pose, clothing) unchanged.",
    "images": ["https://.../subject.jpg", "https://.../room.jpg"]
  }' \
  --output-dir <absolute/path>
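
Hand-escaping quotes inside a single-quoted JSON string (the `'\''` and `\"` sequences above) is easy to get wrong. One possible alternative is to build the `--input` body with a small helper. This is a sketch under stated assumptions: `json_escape` and `build_input` are hypothetical names, the escaping covers only backslashes and double quotes (so the prompt must not contain newlines or other control characters), and the `example.com` URL is a placeholder.

```shell
# Hypothetical helper: escape backslashes and double quotes for use inside
# a JSON string. Assumes the text has no newlines or control characters.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Hypothetical helper: build_input PROMPT URL [URL...]
# Prints a JSON body with the "prompt" and "images" fields from the schema.
# The body runs in a subshell so its variables do not leak to the caller.
build_input() (
  prompt=$(json_escape "$1")
  shift
  urls=""
  for u in "$@"; do
    urls="${urls:+$urls, }\"$(json_escape "$u")\""
  done
  printf '{"prompt": "%s", "images": [%s]}\n' "$prompt" "$urls"
)

# Demo: print the body only; nothing is sent.
build_input 'Keep the layout unchanged. The new headline reads "コーヒー" in bold Japanese kana.' \
  'https://example.com/poster-en.jpg'
```

The printed body could then be passed as `--input "$(build_input "$prompt" "$url")"` in place of the inline JSON shown in the examples above.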

Prompting — what actually works

Lead with preservation goals. Always:

"Keep [face / pose / clothing / brand / framing] unchanged."

Then state the change. The model honors what's stated up front.

Multilingual text — quote the characters, name the script.

"the headline reads \"コーヒー\" in bold Japanese kana"

"the label says \"АРОМА\" in Cyrillic, white on black"

"the right-margin caption reads \"تخفيض\" in Arabic right-to-left"

Don't paraphrase; quote.

Directional language for spatial edits. Concrete spatial scopes work:

"move the headline from top-right to bottom-center"

"remove the leftmost object only"

"replace the watermark in the bottom-right corner"

Multi-ref numbering. When passing multiple `images`, refer to them by number:

"subject from image 1, lighting from image 2, color palette from image 3"

The model routes cues correctly.

Use `size: "auto"` to preserve the input ratio. Only override when the edit explicitly changes framing (e.g. cropping a 16:9 to 1:1).

Anti-patterns:

Long compound edit instructions ("change A and B and C and D") → drift increases per added scope.
Missing preservation goals → model subtly rewrites the face / brand / framing.
Paraphrasing in-image text instead of quoting it → text comes out different.
Asking for a `size` outside the 3 fixed values + `auto` → 422.
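
The preservation-first ordering described in this section can be enforced mechanically when prompts are assembled in a script. A minimal sketch, assuming a hypothetical `build_prompt` helper that takes the list of things to keep and a single change instruction:

```shell
# Hypothetical helper: build_prompt "WHAT TO KEEP" "THE CHANGE"
# Puts the preservation goal first and the single change last.
build_prompt() {
  printf 'Keep %s unchanged. %s\n' "$1" "$2"
}

build_prompt "the person's face, pose, and brand mark" \
  "Replace the background with a soft warm-grey studio sweep."
```

Passing one change per call keeps each prompt to a single edit scope, which fits the advice against long compound instructions.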

Where it shines

Use case	Why GPT Image Edit
Multilingual ad localization	One source asset → many language variants of the same headline
Brand-safe headline / CTA swaps	Layout precision + preservation language hold the rest stable
Multi-ref composition (subject from one, scene from another)	Numbered refs route cues correctly
Layout-precise repositioning	Directional language ("top-right to bottom-center") honored
Identity preservation across signage edits	Strongest in class for face / brand preservation through targeted edits
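
For the ad-localization case in the table above, one source asset can be paired with a list of headline variants and one prompt generated per variant. The sketch below is a dry run that only prints prompts. The `localized_prompts` name and the variant list are illustrative (the quoted strings are reused from the prompting examples in this document), and each printed prompt would still need to be sent as its own request.

```shell
# Dry run: print one edit prompt per "script|headline" pair.
# The pairs below are illustrative, not a required format.
localized_prompts() {
  while IFS='|' read -r script headline; do
    [ -n "$headline" ] || continue
    printf 'Keep the photograph, layout, and brand mark exactly as in the input. Replace only the in-image headline. The new headline reads "%s" in %s, same position and font weight as before.\n' \
      "$headline" "$script"
  done <<'EOF'
bold Japanese kana|コーヒー
Cyrillic|АРОМА
Arabic right-to-left|تخفيض
EOF
}

localized_prompts
```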

Sample prompts (verified to produce strong results)

Background swap with full preservation (page example):

Turn the background into a bright minimal white-to-soft-gray studio
sweep with gentle floor shadow; add a large headline in-image that
reads "OPEN STUDIO" in a bold clean sans-serif, high contrast, centered;
keep the main person or product, pose, and face identity unchanged

Multilingual variant:

Keep the photograph, layout, lighting, and brand mark exactly as in the
input. Replace only the in-image headline.
The new headline reads "コーヒー" in bold Japanese kana, same position
and font weight as before.

Multi-ref composition:

Compose subject from image 1 into the kitchen from image 2.
Match the warm window light and color palette of image 2.
Keep subject identity (face, pose, clothing) from image 1 unchanged.

Limitations

`size`: 3 fixed values + `auto`; anything else returns 422.
`images`: up to 10; the first is primary, the rest are auxiliary cues.
Long compound prompts drift — split into multiple passes when needed.
For batch consistency across many SKU images, Nano Banana Edit (up to 20) is better.
Photorealism on portraits — Nano Banana Pro wins head-to-head.
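
The `images` limits can also be checked locally before a request is submitted. This is a sketch with a hypothetical `check_images` helper. It verifies only the count (1 to 10) and the `https://` prefix, and cannot tell whether a URL is actually publicly fetchable.

```shell
# Hypothetical pre-flight check mirroring the documented limits:
# between 1 and 10 URLs, each starting with https://.
check_images() {
  if [ "$#" -lt 1 ] || [ "$#" -gt 10 ]; then
    echo "expected 1 to 10 image URLs, got $#" >&2
    return 1
  fi
  for url in "$@"; do
    case "$url" in
      https://?*) ;;
      *) echo "not an HTTPS URL: $url" >&2; return 1 ;;
    esac
  done
  return 0
}

# Placeholder URLs for illustration only.
if check_images "https://example.com/subject.jpg" "https://example.com/room.jpg"; then
  echo "images ok"
fi
```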

Exit codes

code	meaning
0	success
64	bad CLI args
65	bad input JSON / schema mismatch
69	upstream 5xx
75	retryable: timeout / 429
77	not signed in or token rejected

Full reference: docs.runcomfy.com/cli/troubleshooting.
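
Since exit code 75 is documented as retryable, a wrapper can retry on that code alone and pass every other code straight through. The sketch below is one way to do it. `run_with_retry`, `MAX_TRIES`, and `RETRY_DELAY` are hypothetical names introduced here, not CLI features, and the fixed delay is a simplification (a real script might prefer a growing backoff).

```shell
# Hypothetical wrapper: re-run a command while it exits with 75
# ("retryable: timeout / 429" in the table above). Any other exit
# code is returned immediately. MAX_TRIES and RETRY_DELAY are
# illustrative knobs read from the environment.
run_with_retry() {
  rt_max=${MAX_TRIES:-3}
  rt_delay=${RETRY_DELAY:-5}
  rt_try=1
  while :; do
    rt_code=0
    "$@" || rt_code=$?
    if [ "$rt_code" -ne 75 ] || [ "$rt_try" -ge "$rt_max" ]; then
      return "$rt_code"
    fi
    echo "exit 75 (retryable), attempt $rt_try of $rt_max; sleeping ${rt_delay}s" >&2
    sleep "$rt_delay"
    rt_try=$((rt_try + 1))
  done
}

# Usage (not executed here):
#   run_with_retry runcomfy run openai/gpt-image-2/edit \
#     --input "$json" --output-dir <absolute/path>
```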

How it works

The skill invokes `runcomfy run openai/gpt-image-2/edit` with a JSON body matching the schema. The CLI POSTs to `https://model-api.runcomfy.net/v1/models/openai/gpt-image-2/edit`, polls the request, fetches the result, and downloads any `.runcomfy.net` / `.runcomfy.com` URL into `--output-dir`. `Ctrl-C` cancels the remote request before exit.
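
The download rule described above (only URLs under `.runcomfy.net` or `.runcomfy.com`) can be illustrated with a host-suffix check. This sketch is not the CLI's own code: `is_allowed_download_url` is a hypothetical name, requiring `https://` is an added assumption, and the hostnames in the demo are made up.

```shell
# Illustrative only: accept a URL when its host ends in .runcomfy.net or
# .runcomfy.com. Requiring https:// is an assumption of this sketch.
is_allowed_download_url() {
  case "$1" in
    https://*) ;;
    *) return 1 ;;
  esac
  dl_host=${1#https://}        # strip the scheme
  dl_host=${dl_host%%/*}       # strip the path
  dl_host=${dl_host%%"?"*}     # strip a query with no path
  dl_host=${dl_host%%"#"*}     # strip a fragment with no path
  dl_host=${dl_host%%:*}       # strip an optional port
  case "$dl_host" in
    *.runcomfy.net|*.runcomfy.com) return 0 ;;
    *) return 1 ;;
  esac
}

# Made-up hostnames for illustration.
for candidate in "https://cdn.runcomfy.net/out/a.png" "https://runcomfy.net.evil.example/a.png"; do
  if is_allowed_download_url "$candidate"; then
    echo "allowed: $candidate"
  else
    echo "blocked: $candidate"
  fi
done
```

Matching on the parsed host rather than on the whole URL string avoids accepting lookalike URLs that merely contain the allowed domain somewhere in the path or query.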

Security & Privacy

Token storage: `runcomfy login` writes the API token to `~/.config/runcomfy/token.json` with mode 0600 (owner-only read/write). Set the `RUNCOMFY_TOKEN` env var to bypass the file entirely in CI / containers.
Input boundary: the user prompt is passed as a JSON string to the CLI via `--input`. The CLI does NOT shell-expand the prompt; it transmits the JSON body directly to the Model API over HTTPS. No shell injection surface from prompt content.
Third-party content: image / mask / video URLs you pass are fetched by the RunComfy model server, not by the CLI on your machine. Treat external URLs as untrusted; image-based prompt injection is a known risk for any image-edit / video-edit model.
Outbound endpoints: only `model-api.runcomfy.net` (request submission) and `*.runcomfy.net` / `*.runcomfy.com` (download whitelist for generated outputs). No telemetry, no callbacks.
Generated-file size cap: the CLI aborts any single download > 2 GiB to prevent disk-fill from a malicious or runaway model output.
