comfyui-prompt-interview

ComfyUI Prompt Interview

Conduct a guided conversation to draw out the user's complete creative vision, then synthesize a perfect, model-appropriate prompt with all recommended settings.

When to Invoke This Skill

User describes an image or scene idea but hasn't given enough detail for a quality prompt
User says "help me think through what I want to create"
User has a vague concept that needs refinement
User wants a structured prompt but isn't sure what to specify

The Interview Philosophy

Ask, don't interrogate. This is a conversation, not a form. Ask one or two questions at a time. Listen to what the user gives you and follow up on what's missing. Tailor your questions to what they've already shared — don't ask about character details if they're generating a landscape.

Fewer questions = better. Aim for 4-7 exchanges maximum. Ask the most impactful questions first. Stop asking when you have enough to generate an excellent prompt.

Don't ask for what you can infer. If the user says "cinematic portrait of a warrior woman," you don't need to ask if it's a person or whether to include a subject.

Interview Flow

Step 1: Open with the Big Picture

If the user hasn't told you what they want to create, start here:

"What do you want to create? Give me whatever you have — even a rough idea, a mood, or a reference you're inspired by."

If they gave you a starting concept, skip this and go straight to what's missing.

Step 2: Branch by Creation Type

Based on their answer, determine what kind of generation this is:

Type	Key Questions to Ask
Portrait / Character	Identity method? Existing character? Expression, clothing, setting, lighting
Scene / Environment	Location, time of day, mood, weather, foreground/background elements
Product / Object	Angle, background, lighting style, commercial vs. artistic
Abstract / Concept	Dominant colors, shapes, emotional tone, what to avoid
Video	Motion type, camera movement, duration needed, audio?

Step 3: Ask the High-Impact Questions

Ask only what's missing. Use natural conversational language, not a bullet list.

For character/portrait content — ask in order of impact:

Identity (if not specified): "Is this a specific character you have reference images for, or are we designing someone new?"
Expression & mood: "What's the emotion or energy — fierce, serene, playful, haunted?"
Setting: "Where are they, and when? (Time of day, location, interior/exterior)"
Lighting: "Any specific lighting in mind? (Golden hour, dramatic side light, soft studio, neon, candlelight)"
Clothing & details: "What are they wearing, and any other key visual details?"
Camera/composition: "How are we framing this — close-up portrait, three-quarter body, wide establishing shot?"
Style: "Photorealistic, cinematic film, editorial fashion, painterly, or something else?"
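The impact ordering above, combined with the one-or-two-questions-at-a-time rule from the interview philosophy, can be sketched as a small slot-filling helper. This is only an illustration: the slot names, the shortened question texts, and the `next_questions` function are hypothetical and not part of the skill itself.

```python
# Slots in order of impact, mirroring the character/portrait list above.
# Slot names are hypothetical labels; question texts are shortened.
CHARACTER_QUESTIONS = [
    ("identity", "Is this a specific character, or are we designing someone new?"),
    ("expression", "What's the emotion or energy?"),
    ("setting", "Where are they, and when?"),
    ("lighting", "Any specific lighting in mind?"),
    ("clothing", "What are they wearing, and any other key visual details?"),
    ("composition", "How are we framing this?"),
    ("style", "Photorealistic, cinematic film, editorial fashion, or painterly?"),
]


def next_questions(known, questions=CHARACTER_QUESTIONS, limit=2):
    """Return up to `limit` questions for slots not yet filled, in impact order."""
    missing = [text for slot, text in questions if slot not in known]
    return missing[:limit]


# The user already said who the character is and where the scene takes place,
# so the next turn asks about expression and lighting only.
for question in next_questions({"identity", "setting"}):
    print(question)
```

Capping each turn at two questions keeps the exchange conversational, and an empty result signals that it is time to move on to Step 5.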

For scene/environment content — ask in order of impact:

Setting: "Describe the place — what does it look like, and when is it?"
Mood/atmosphere: "What feeling should hit the viewer instantly?"
Lighting: "What's the light source and quality?"
Key elements: "Any specific objects, structures, or details that must be in the shot?"
Style: "Photorealistic, stylized, concept art, painterly?"

For video content — additional questions:

Motion: "What's moving — the subject, the camera, or both?"
Duration: "How long? (A short 3-5s clip and a long 15-60s clip call for different models.)"
Audio: "Do you need sound/music, or silent?"

Step 4: Technical Questions (ask only if not obvious)

These can usually be inferred from context, but ask if unclear:

Aspect ratio: "Square 1:1, 16:9 cinematic, or 9:16 vertical/social?"
Model preference: "Any preference on the generation engine, or should I recommend the best one for this?"
Existing character setup: "Do you have a LoRA trained for this character, or reference images?"
What to avoid: "Anything specific you want to make sure stays OUT of the image?"

Step 5: Confirm and Synthesize

Before generating the prompt, briefly reflect back the vision:

"Got it. Here's how I'm reading this: [1-2 sentence summary of the concept]. Let me build that prompt."

Then immediately generate the full output below.

Output Format

Deliver all four components, clearly separated:

🎯 Positive Prompt

[Craft the positive prompt applying model-specific rules from skills/comfyui-prompt-engineer/SKILL.md]

Key rules:

FLUX / Kontext: Natural language, 50-100 words, no quality tags; if using an identity method, describe the scene, not the face
SDXL: Quality tags first, trigger word second, 50-150 words, weighted syntax supported
SD 1.5: Short and tag-based, 30-80 words
Wan / Video: Concise, motion-focused, 20-50 words
If a LoRA trigger word applies, put it first (for SDXL, directly after the quality tags)
If using InstantID/InfiniteYou: don't describe facial features, let the identity method handle them
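The word-count ranges in the key rules lend themselves to a quick sanity check before a prompt is delivered. The sketch below is a minimal illustration: the family keys (`flux`, `sdxl`, `sd15`, `wan`) and the `check_length` helper are hypothetical names, and counting whitespace-separated tokens is only a rough proxy for word count in tag-based prompts.

```python
# Word-count ranges copied from the key rules above.
# The dictionary keys are hypothetical labels for the model families.
WORD_RANGES = {
    "flux": (50, 100),   # FLUX / Kontext
    "sdxl": (50, 150),   # SDXL
    "sd15": (30, 80),    # SD 1.5
    "wan": (20, 50),     # Wan / Video
}


def check_length(prompt: str, family: str) -> str:
    """Report whether a prompt's word count fits the range for its model family."""
    low, high = WORD_RANGES[family]
    count = len(prompt.split())  # whitespace-separated tokens
    if count < low:
        return f"too short ({count} < {low})"
    if count > high:
        return f"too long ({count} > {high})"
    return "ok"


print(check_length("a warrior stands in ruins at dawn", "wan"))
```

A check like this only catches length problems; the stylistic rules (natural language versus tags, no facial description with an identity method) still need human or model judgment.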

🚫 Negative Prompt

[Select the appropriate negative template and customize it]

Standard templates:

Photorealism:

(worst quality:1.4), (low quality:1.4), blurry, deformed, bad anatomy, bad hands, extra fingers, missing fingers, text, watermark, 3d render, cartoon, anime, plastic skin, airbrushed, oversaturated

FLUX (minimal):

blurry, low quality, distorted, deformed, ugly, watermark, text

Video:

static, frozen, jerky motion, low quality, blurry, distorted face, bad anatomy, glitch, artifacts, flickering
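Selecting a template and customizing it, as described above, might look like the following sketch. The template strings are copied from this section; the dictionary keys and the `build_negative` helper are hypothetical, and the user's "what to avoid" answers from Step 4 are assumed to arrive as a list of plain terms.

```python
# Templates copied verbatim from the standard templates above.
# The keys are hypothetical labels.
NEGATIVE_TEMPLATES = {
    "photorealism": (
        "(worst quality:1.4), (low quality:1.4), blurry, deformed, bad anatomy, "
        "bad hands, extra fingers, missing fingers, text, watermark, 3d render, "
        "cartoon, anime, plastic skin, airbrushed, oversaturated"
    ),
    "flux": "blurry, low quality, distorted, deformed, ugly, watermark, text",
    "video": (
        "static, frozen, jerky motion, low quality, blurry, distorted face, "
        "bad anatomy, glitch, artifacts, flickering"
    ),
}


def build_negative(template: str, extra_terms=()) -> str:
    """Append user-specific terms to a template, skipping ones already present."""
    base = NEGATIVE_TEMPLATES[template]
    existing = {term.strip() for term in base.split(",")}
    additions = [
        term.strip()
        for term in extra_terms
        if term.strip() and term.strip() not in existing
    ]
    return ", ".join([base, *additions])


# "text" is already in the FLUX template, so only "crowd" is appended.
print(build_negative("flux", ["text", "crowd"]))
```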

⚙️ Recommended Settings

Parameter	Value	Reason
Model	[Specific checkpoint]	[Why this model]
Sampler	[e.g., DPM++ 2M Karras]
Steps	[e.g., 25]
CFG Scale	[e.g., 4.5]
Resolution	[e.g., 1016×1016]	[Avoid 1024 watermark artifacts if InstantID]
Identity method	[e.g., InfiniteYou SIM]	[If applicable]
LoRA	[Trigger word + weight]	[If applicable]

🔧 Pipeline Recommendation

[1-3 sentences describing the recommended workflow pattern and why]

Example: "Use Pattern 2 from the character-gen skill: Load your LoRA, add InfiniteYou SIM for identity lock, route through FaceDetailer at 0.35 denoise, then upscale with 4x-UltraSharp. This gives you maximum identity fidelity while keeping the cinematic mood you described."

Decision Rules for Model Selection

Apply these when recommending a model/pipeline:

User's Goal	Recommend
"Photorealistic, single character, reference images available"	InfiniteYou or FLUX Kontext
"Multiple characters, consistent identity"	PuLID Flux II
"Fast iteration, character editing"	FLUX Kontext dev/pro
"Highest quality portrait, trained LoRA exists"	FLUX.1-dev + LoRA
"Wide scene / environment"	FLUX.1-dev or RealVisXL
"Stylized / artistic"	RealVisXL + style LoRA or SDXL
"Video, short clip"	Wan 2.2 1.3B (consumer) or LTX-2 (4K)
"Video, infinite length"	Wan 2.2 I2V A14B + SVI 2.0 Pro
"Talking head"	FramePack + LatentSync 1.6
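The decision table above is essentially a lookup from goal to recommendation, which can be sketched as a dictionary. The recommendation strings are taken from the table; the goal keys and the `recommend` function are hypothetical, and mapping a free-form user request onto one of these keys is left to the interviewer.

```python
# Recommendations copied from the decision table above.
# The goal keys are hypothetical shorthand for the table's left column.
RECOMMENDATIONS = {
    "photoreal_single_with_refs": "InfiniteYou or FLUX Kontext",
    "multi_character_consistent": "PuLID Flux II",
    "fast_iteration_editing": "FLUX Kontext dev/pro",
    "portrait_with_lora": "FLUX.1-dev + LoRA",
    "wide_scene": "FLUX.1-dev or RealVisXL",
    "stylized": "RealVisXL + style LoRA or SDXL",
    "video_short": "Wan 2.2 1.3B (consumer) or LTX-2 (4K)",
    "video_infinite": "Wan 2.2 I2V A14B + SVI 2.0 Pro",
    "talking_head": "FramePack + LatentSync 1.6",
}


def recommend(goal: str) -> str:
    """Look up the recommended model or pipeline for a goal key."""
    try:
        return RECOMMENDATIONS[goal]
    except KeyError:
        options = ", ".join(sorted(RECOMMENDATIONS))
        raise ValueError(f"unknown goal {goal!r}; expected one of: {options}") from None


print(recommend("wide_scene"))
```

Raising on an unknown key, rather than falling back to a default model, keeps a misclassified goal from silently producing a poor recommendation.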

Example Interview

User: "I want to create a dramatic image of a woman warrior standing in ruins at dawn."

VideoAgent:

"Love the concept. A few things that'll lock in the look:

Is this a specific character you want to preserve (reference images, LoRA) or someone new?

What's her energy — defiant and triumphant, exhausted but unbroken, something else?

Any specifics on how she looks — hair, armor style, anything iconic about her?"

User: "New character. Defiant and triumphant. Dark braided hair, ornate bronze armor, early morning light just cresting the horizon behind her."

VideoAgent:

"Got it — framing this as a heroic aftermath shot: she's won, the sun rises behind her, rubble around her feet. One more thing: close portrait, three-quarter body, or full figure with the ruins environment?"

User: "Full figure, show the ruins and the dawn sky."

VideoAgent:

"Perfect. Building that now."

[Delivers complete prompt output]

Integration with Other Skills

After delivering the prompt, offer next steps:

Generate the workflow: "Want me to build the full ComfyUI workflow for this? I'll check your inventory and configure every node."
Refine the prompt: "I can adjust the style, swap the identity method, or rework the negative if anything doesn't feel right."
Save as character profile: "If this becomes a recurring character, I can create a character profile so we always have her settings ready."
