gpt-image-edit
🎨 GPT Image Edit: Pro Pack on RunComfy
OpenAI GPT Image 2 `/edit` endpoint (ChatGPT Images 2.0 image-to-image) on the RunComfy Model API. Strongest in its class at preserving identity through targeted edits and at rewriting embedded text in any script (Latin, kana, CJK, Cyrillic, Arabic).

```bash
npx skills add agentspace-so/runcomfy-skills --skill gpt-image-edit -g
```

When to pick this model (vs siblings)
| You want | Use |
|---|---|
| Edit multilingual / embedded text in image | GPT Image Edit ✓ |
| Identity preservation through translated headline variants | GPT Image Edit ✓ |
| Layout-precise edit (move headline, swap CTA, etc.) | GPT Image Edit ✓ |
| Up to 10 reference images | GPT Image Edit ✓ |
| Batch up to 20 images consistently | Nano Banana Edit |
| Single-shot precise local edit, source-fidelity-first | Flux Kontext |
| Generate from scratch with GPT Image 2 | sibling |
| Batch SKU galleries with stable identity | Nano Banana Edit |
Prerequisites
- RunComfy CLI: `npm i -g @runcomfy/cli`
- RunComfy account: `runcomfy login` opens a browser device-code flow.
- CI / containers: set `RUNCOMFY_TOKEN=<token>` instead of `runcomfy login`.
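Before a CI job calls the model, it can check which of the two credential sources is available. The sketch below is only an illustration: the function name `runcomfy_auth_mode` is hypothetical, the token file path is the one given in the Security & Privacy section, and the precedence of the environment variable over the file is an assumption.

```bash
# Hypothetical helper, not part of the RunComfy CLI.
# Reports which credential source a job would rely on.
# Assumption: RUNCOMFY_TOKEN takes precedence over the login token file.
runcomfy_auth_mode() {
  if [ -n "${RUNCOMFY_TOKEN:-}" ]; then
    echo "env"
  elif [ -f "${HOME}/.config/runcomfy/token.json" ]; then
    echo "file"
  else
    echo "none"
    return 1
  fi
}
```

A job could call `runcomfy_auth_mode >/dev/null || exit 77` to fail early with the same code the CLI uses for "not signed in".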
Endpoints + input schema
`openai/gpt-image-2/edit`

| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `prompt` | string | yes | - | Edit instruction. Lead with preservation, end with the change. |
| `images` | string[] | yes | - | Up to 10 publicly-fetchable HTTPS URLs. First is primary; rest are auxiliary. |
| `size` | enum | no | | Three fixed values plus `auto`; `size=auto` preserves the input ratio. |
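The constraints on the image URL list (at most 10 entries, HTTPS only) can be checked locally before a request is sent. A minimal sketch, assuming a hypothetical helper name `check_image_urls`; it inspects only the count and the URL scheme and cannot tell whether a URL is actually publicly fetchable.

```bash
# Hypothetical pre-flight check, not part of the RunComfy CLI.
# Returns 0 when there are 1 to 10 URLs and each one starts with https://.
check_image_urls() {
  if [ "$#" -lt 1 ] || [ "$#" -gt 10 ]; then
    echo "expected 1 to 10 image URLs, got $#" >&2
    return 1
  fi
  for url in "$@"; do
    case "$url" in
      https://?*) ;;   # scheme looks right, keep checking the rest
      *) echo "not an HTTPS URL: $url" >&2; return 1 ;;
    esac
  done
  return 0
}
```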
How to invoke
Single-ref preservation edit:

```bash
runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Keep the person'\''s face, pose, and brand mark unchanged. Replace the background with a soft warm-grey studio sweep and a gentle floor shadow.",
    "images": ["https://.../portrait.jpg"]
  }' \
  --output-dir <absolute/path>
```

Multilingual text rewrite (preserve everything except the headline):

```bash
runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Keep the photograph, layout, and brand mark exactly as in the input. Replace only the in-image headline. The new headline reads \"今日のおすすめ\" in bold Japanese kana, same position and font weight as before.",
    "images": ["https://.../poster-en.jpg"]
  }' \
  --output-dir <absolute/path>
```

Multi-ref composition:

```bash
runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Compose subject from image 1 into the room from image 2. Match the lighting and color palette of image 2. Keep image 1 subject identity (face, pose, clothing) unchanged.",
    "images": ["https://.../subject.jpg", "https://.../room.jpg"]
  }' \
  --output-dir <absolute/path>
```
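Hand-quoting the JSON for `--input` becomes fragile once a prompt contains apostrophes or double quotes (note the `'\''` sequence in the first example). The sketch below assembles the same payload shape from plain arguments. The helper names `json_escape` and `build_edit_input` are hypothetical, and the escaping covers only backslashes and double quotes in single-line text; a dedicated JSON tool would be more robust where one is available.

```bash
# Hypothetical helpers, not part of the RunComfy CLI.

# Escape backslashes and double quotes so the text can sit inside a JSON string.
# Assumption: single-line input without control characters.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Usage: build_edit_input "<prompt>" "<image url>" ["<image url>" ...]
# Prints {"prompt": "...", "images": ["...", ...]} on one line.
# Note: uses plain (global) shell variables to stay POSIX-compatible.
build_edit_input() {
  prompt=$(json_escape "$1")
  shift
  images=""
  for url in "$@"; do
    images="${images}${images:+, }\"$(json_escape "$url")\""
  done
  printf '{"prompt": "%s", "images": [%s]}\n' "$prompt" "$images"
}
```

With such a helper, a call might look like `runcomfy run openai/gpt-image-2/edit --input "$(build_edit_input "$prompt" "$url")" --output-dir "$out_dir"`, where `prompt`, `url`, and `out_dir` are placeholder variables.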
Prompting: what actually works
Lead with preservation goals. Always: `"Keep [face / pose / clothing / brand / framing] unchanged."` Then state the change. The model honors what's stated up front.

Multilingual text: quote the characters, name the script. `"the headline reads \"コーヒー\" in bold Japanese kana"`, `"the label says \"АРОМА\" in Cyrillic, white on black"`, `"the right-margin caption reads \"تخفيض\" in Arabic right-to-left"`. Don't paraphrase; quote.

Directional language for spatial edits. Concrete spatial scopes work: `"move the headline from top-right to bottom-center"`, `"remove the leftmost object only"`, `"replace the watermark in the bottom-right corner"`.

Multi-ref numbering. When passing multiple `images`, refer to them by number: `"subject from image 1, lighting from image 2, color palette from image 3"`. The model routes cues correctly.

Use `size: "auto"` to preserve input ratio. Only override when the edit explicitly changes framing (e.g. cropping a 16:9 to 1:1).

Anti-patterns:
- Long compound edit instructions ("change A and B and C and D") → drift increases per added scope.
- Missing preservation goals → model subtly rewrites the face / brand / framing.
- Paraphrasing in-image text instead of quoting it → text comes out different.
- Asking for `size` outside the 3 fixed values + `auto` → 422.
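The ordering rule above (preservation first, change last) and the quote-the-characters rule can be encoded in a small template so that every localized variant follows the same shape. A sketch with hypothetical helper names, `edit_prompt` and `headline_change`; the wording mirrors the sample phrases in this section and is not an official prompt format.

```bash
# Hypothetical prompt templates; the wording follows the examples in this section.

# edit_prompt "<what to keep>" "<the change>"
# Puts the preservation clause first and the change last.
edit_prompt() {
  printf 'Keep %s unchanged. %s\n' "$1" "$2"
}

# headline_change "<exact text>" "<script name>"
# Quotes the target characters verbatim and names the script.
headline_change() {
  printf 'Replace only the in-image headline. The new headline reads "%s" in bold %s, same position and font weight as before.\n' "$1" "$2"
}
```

For example, `edit_prompt "the photograph, layout, and brand mark" "$(headline_change "コーヒー" "Japanese kana")"` yields a single prompt string that states what to keep before it states the headline swap.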
Where it shines
| Use case | Why GPT Image Edit |
|---|---|
| Multilingual ad localization | One source asset → many language variants of the same headline |
| Brand-safe headline / CTA swaps | Layout precision + preservation language hold the rest stable |
| Multi-ref composition (subject from one, scene from another) | Numbered refs route cues correctly |
| Layout-precise repositioning | Directional language ("top-right to bottom-center") honored |
| Identity preservation across signage edits | Strongest in class for face / brand preservation through targeted edits |
Sample prompts (verified to produce strong results)
Background swap with full preservation (page example):

```
Turn the background into a bright minimal white-to-soft-gray studio
sweep with gentle floor shadow; add a large headline in-image that
reads "OPEN STUDIO" in a bold clean sans-serif, high contrast, centered;
keep the main person or product, pose, and face identity unchanged
```

Multilingual variant:

```
Keep the photograph, layout, lighting, and brand mark exactly as in the
input. Replace only the in-image headline.
The new headline reads "コーヒー" in bold Japanese kana, same position
and font weight as before.
```

Multi-ref composition:

```
Compose subject from image 1 into the kitchen from image 2.
Match the warm window light and color palette of image 2.
Keep subject identity (face, pose, clothing) from image 1 unchanged.
```
Limitations
- `size`: 3 fixed values + `auto`; anything else returns 422.
- `images`: up to 10; the first is primary, the rest are auxiliary cues.
- Long compound prompts drift; split into multiple passes when needed.
- For batch consistency across many SKU images, Nano Banana Edit (up to 20) is better.
- Photorealism on portraits: Nano Banana Pro wins head-to-head.
Exit codes
| code | meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |
Full reference: docs.runcomfy.com/cli/troubleshooting.
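Because exit code 75 is marked retryable, a wrapper can retry on that code alone and pass every other code straight through. A minimal sketch: `retry_on_75`, `MAX_TRIES`, and `RETRY_DELAY` are hypothetical names, and the fixed delay is an arbitrary choice rather than a documented recommendation.

```bash
# Hypothetical wrapper, not part of the RunComfy CLI.
# Reruns a command while it exits with 75 (timeout / 429).
# MAX_TRIES and RETRY_DELAY (seconds) are illustrative knobs with arbitrary defaults.
retry_on_75() {
  tries=0
  while :; do
    status=0
    "$@" || status=$?
    tries=$((tries + 1))
    # Stop on success, on any non-retryable code, or when attempts run out.
    if [ "$status" -ne 75 ] || [ "$tries" -ge "${MAX_TRIES:-3}" ]; then
      return "$status"
    fi
    sleep "${RETRY_DELAY:-5}"
  done
}
```

A call might look like `retry_on_75 runcomfy run openai/gpt-image-2/edit --input "$json" --output-dir "$out_dir"`, with `json` and `out_dir` as placeholder variables. Codes such as 64, 65, and 77 are returned immediately, since repeating the same request would not fix them.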
How it works
The skill invokes `runcomfy run openai/gpt-image-2/edit` with a JSON body matching the schema. The CLI POSTs to `https://model-api.runcomfy.net/v1/models/openai/gpt-image-2/edit`, polls the request, fetches the result, and downloads any `.runcomfy.net` / `.runcomfy.com` URL into `--output-dir`. `Ctrl-C` cancels the remote request before exit.

Security & Privacy
- Token storage: `runcomfy login` writes the API token to `~/.config/runcomfy/token.json` with mode 0600 (owner-only read/write). Set the `RUNCOMFY_TOKEN` env var to bypass the file entirely in CI / containers.
- Input boundary: the user prompt is passed as a JSON string to the CLI via `--input`. The CLI does NOT shell-expand the prompt; it transmits the JSON body directly to the Model API over HTTPS. No shell injection surface from prompt content.
- Third-party content: image / mask / video URLs you pass are fetched by the RunComfy model server, not by the CLI on your machine. Treat external URLs as untrusted; image-based prompt injection is a known risk for any image-edit / video-edit model.
- Outbound endpoints: only `model-api.runcomfy.net` (request submission) and `*.runcomfy.net` / `*.runcomfy.com` (download whitelist for generated outputs). No telemetry, no callbacks.
- Generated-file size cap: the CLI aborts any single download > 2 GiB to prevent disk-fill from a malicious or runaway model output.
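The 0600 mode on the token file can be verified from a script, for example on a shared workstation. A sketch that relies only on POSIX `find`; the helper name `token_file_is_private` is hypothetical, and the default path is the one quoted in the token storage bullet.

```bash
# Hypothetical check, not part of the RunComfy CLI.
# Succeeds only if the file exists and its permission bits are exactly 0600.
token_file_is_private() {
  file=${1:-"$HOME/.config/runcomfy/token.json"}
  [ -f "$file" ] || return 1
  # find prints the path only when the mode matches 600 exactly.
  [ -n "$(find "$file" -perm 600)" ]
}
```

Running `token_file_is_private || echo "token file missing or too permissive" >&2` after `runcomfy login` gives a quick sanity check without printing the token itself.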