gen-ai-persona-creation

AI Influencer Persona

Turn one sentence into a head-to-toe 4-angle casting card in signature wardrobe, a persona profile, platform-tuned captions, and (optional) a reel with ambient audio. Output: `./<persona-slug>/`

When to Use

See the description above.

Prerequisites

bash

gen-ai whoami            # auth + gen-ai install + Node v22+ check
command -v curl          # ships with macOS / Linux / Git-Bash

If `gen-ai whoami` fails: run `gen-ai login` or set `PICSART_ACCESS_TOKEN` and `PICSART_USER_ID`. No extra media tools needed.
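
A minimal preflight sketch for the checks above. The helper names `auth_mode` and `have_tool` are hypothetical, and the sketch assumes token auth needs both variables set, since the text lists them together. It does not call `gen-ai` itself.

```bash
# Hypothetical helper: reports which auth path looks usable.
# Assumption: token auth needs BOTH variables to be non-empty.
auth_mode() {
  if [ -n "${PICSART_ACCESS_TOKEN:-}" ] && [ -n "${PICSART_USER_ID:-}" ]; then
    echo "token"
  else
    echo "login"
  fi
}

# Returns 0 when the named tool is on PATH.
have_tool() {
  command -v "$1" >/dev/null
}

case "$(auth_mode)" in
  token) echo "auth: token pair is set" ;;
  login) echo "auth: no token pair; use 'gen-ai login' if 'gen-ai whoami' fails" ;;
esac
have_tool curl || echo "missing: curl"
```

Running `gen-ai whoami` remains the real check; this only shows which recovery path applies when it fails.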

How to Run

Use the agent's `terminal` tool to invoke `gen-ai` commands as described in the Procedure below.

Quick Reference

See the Procedure for canonical commands.

Procedure

See sections below for the detailed walkthrough.

Pitfalls

See Common Pitfalls below.

Verification

Run `gen-ai whoami` to confirm authentication, then re-run the failed command with `--debug`.

How the skill calls `gen-ai`

bash

URL=$(gen-ai generate -m <model> -p "<prompt>" --json --no-input | grep -oE 'https?://[^"]+' | head -1)
curl -sSL -o ./<persona-slug>/<file>.<ext> "$URL"

`--download` doesn't work with `--json --no-input`; URL + curl is canonical.

Bash footguns: never add `2>&1` or stderr redirects between `--json --no-input` and the closing `)`. Doing so causes a shell parse error before the command runs (verified). Keep the inner pipe strictly `--json --no-input | grep -oE 'https?://[^"]+' | head -1`. One generation per `URL=$(...)`.
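
To see what the inner pipe does without spending credits, the filter can be run on a canned payload. This is a sketch: `first_url` is a hypothetical wrapper name, and the sample JSON is invented for illustration, since the real `--json` schema is not shown in this document.

```bash
# Same filter as the canonical pipe: first http(s) URL found in the JSON text.
first_url() {
  grep -oE 'https?://[^"]+' | head -1
}

# Invented payload; the actual gen-ai --json output shape is an assumption.
sample='{"status":"ok","results":[{"url":"https://cdn.example.com/a.png"},{"url":"https://cdn.example.com/b.png"}]}'
printf '%s\n' "$sample" | first_url
# prints https://cdn.example.com/a.png
```

Because the pipeline's exit status comes from `head`, an empty `$URL` is the practical failure signal to test before calling `curl`.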

Style routing

Style	Model	For	Cost
`realistic` (default)	`gemini-3.1-flash-image`	photoreal humans + photoreal common pets / anthropomorphic animals	~3 cr
`stylized`	`grok-imagine`	anime, 3D-animated fruit/object/character, illustration	~1 cr

Cross-provider fallback: primary fails → retry with `flux-2-max` (~3 cr, supports `imageUrls`). Both fail → surface error.
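
One possible shape for the fallback wrapper, as a sketch rather than the skill's actual implementation. `generate_with_fallback` and the `GEN_AI` override variable are hypothetical names; the override exists only so the function can be exercised with a stub. Failure is detected by an empty URL, and each attempt keeps the canonical inner pipe unchanged.

```bash
# GEN_AI lets a stub stand in for the real CLI; defaults to the gen-ai binary.
GEN_AI="${GEN_AI:-gen-ai}"

# generate_with_fallback <primary-model> <prompt>
# Prints the first URL found, trying the primary model and then flux-2-max.
generate_with_fallback() {
  primary=$1
  prompt=$2
  for model in "$primary" flux-2-max; do
    # One generation per URL=$(...), inner pipe kept exactly as documented.
    url=$("$GEN_AI" generate -m "$model" -p "$prompt" --json --no-input | grep -oE 'https?://[^"]+' | head -1)
    if [ -n "$url" ]; then
      printf '%s\n' "$url"
      return 0
    fi
    echo "model $model returned no URL" >&2
  done
  echo "primary and fallback both failed" >&2
  return 1
}
```

Usage would look like `URL=$(generate_with_fallback grok-imagine "<prompt>") && curl -sSL -o ./<persona-slug>/casting.png "$URL"`. Flags such as `--aspect-ratio` or `-i` are left out here and would need to be passed through.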

Style inference (read the brief)

Brief contains	Style → opening
Fruit/veggie/food + "character" / "anthropomorphic" / "brainrot"	stylized → fruit/object
Animal name + "pet" / "influencer" / "creator" / breed (NOT "real form" / "four-legged")	realistic → anthropomorphic humanoid pet (default for animal/pet briefs — fluffy biped in cute clothes, matches project's `pets` category vibe)
Animal name + explicit "real form" / "four-legged" / "on all fours" / "real cat / dog / animal"	realistic → real-form quadruped pet (opt-in)
"Anime", "manga", "magical girl", "kawaii", "shoujo", "shonen"	stylized → anime
"3D rendered", "stylized 3D", "claymation", "feature-film animation"	stylized → 3D character
"Illustrated", "painted", "watercolor", "comic book"	stylized → illustration
Human profession + demographic, no style cue	realistic → photoreal human

Both anthropomorphic-humanoid and real-form-quadruped are supported, but anthropomorphic is the default for pet briefs. That matches the Picsart project's `pets` category: fluffy biped influencers in cute clothes (tiny sweaters, mini hoodies, bow ties), with food-themed names (Biscuit, Mochi, Nugget, Bean, Waffles, Tofu, Pickle) and gen-z bios ("professional napper | treat negotiator | certified good boy/girl"). Real-form four-legged is the opt-in for creators who explicitly say so. Style conflict (e.g. "anime fitness coach") → prefer the stylistic cue.

Most creators want stylized. Don't blindly default to realistic.
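
The routing table can be approximated with keyword matching. This is a rough sketch, not how the agent actually reads a brief: `infer_style` is a hypothetical name, and the food and animal word lists are small samples invented for illustration. Explicit style cues are checked first, following the style-conflict rule.

```bash
# infer_style <brief>  ->  "<style> <opening>"
infer_style() {
  # Lowercase, turn punctuation into spaces, pad so whole words can be matched.
  brief=" $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9-' ' ') "
  case "$brief" in
    *anime*|*manga*|*"magical girl"*|*magical-girl*|*kawaii*|*shoujo*|*shonen*)
      echo "stylized anime"; return ;;
    *illustrated*|*painted*|*watercolor*|*"comic book"*)
      echo "stylized illustration"; return ;;
  esac
  case "$brief" in
    *" strawberry "*|*" banana "*|*" avocado "*|*" tomato "*)   # sample food words
      echo "stylized fruit/object"; return ;;
    *3d*|*claymation*|*"feature-film animation"*)
      echo "stylized 3d-character"; return ;;
  esac
  case "$brief" in
    *" puppy "*|*" kitten "*|*" cat "*|*" dog "*|*" bunny "*|*" retriever "*)   # sample animal words
      case "$brief" in
        *"real form"*|*four-legged*|*"on all fours"*|*" real cat "*|*" real dog "*)
          echo "realistic real-form-pet" ;;
        *) echo "realistic anthropomorphic-pet" ;;
      esac
      return ;;
  esac
  echo "realistic human"
}

infer_style "Create a calico kitten content creator, sleepy baby vibe"
infer_style "anime fitness coach"
```

The two calls print `realistic anthropomorphic-pet` and `stylized anime`. Keyword matching misses paraphrases, so it is a floor for the agent's judgment, not a replacement.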

IP-safe wording (mandatory): never name studios / franchises in prompts sent to the model — no "Pixar", "Disney", "Toy Story", "Studio Ghibli", "Marvel", etc. Recognize creator phrasing like "Pixar-style" as a 3D-animated intent (route to stylized 3D) but use generic descriptors in the actual prompt: "3D-animated", "feature-film animation aesthetic", "stylized 3D rendering", "anime cel-shaded illustration". Studio names trigger content policies + downstream IP risk.
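
A prompt can be screened for the names listed above before it is sent. This is a sketch under stated assumptions: `prompt_is_ip_safe` is a hypothetical helper, and the pattern covers only the five names the text gives as examples, so the "etc." is not covered.

```bash
# prompt_is_ip_safe <prompt>  ->  exit 0 when none of the listed names appear
# Case-insensitive substring match; "marvel" would also flag "marvelous".
prompt_is_ip_safe() {
  ! printf '%s\n' "$1" | grep -qiE 'pixar|disney|toy story|studio ghibli|marvel'
}

for p in "Pixar-style strawberry hero" "3D-animated strawberry hero"; do
  if prompt_is_ip_safe "$p"; then
    echo "ok:      $p"
  else
    echo "rewrite: $p"
  fi
done
```

The first prompt is flagged for rewriting and the second passes, which mirrors the rule of recognizing the creator's phrasing but sending only generic descriptors.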

What creators express in their brief (natural language)

The agent extracts intent — no CLI flags to learn:

Reference image ("from /path/photo.png") → adds `-i` to casting-card call
Reel ("add a tiktok reel", "with motion") → triggers Step 4 (~11 extra cr)
Platform ("for tiktok", "instagram reel", "linkedin") → drives reel AR + caption tuning
Style ("anime", "3D", "painted", "photoreal") → routes realistic / stylized
Name ("named Nova") → sets persona name
Character type ("strawberry character", "golden retriever pet", "magical girl") → picks subject opening

Quick start

Plain English. Examples:

"Create a persona for: fitness coach, gen-z, neon vibe" (realistic human)
"Create a fluffy golden puppy pet influencer, sassy queen energy, mini hoodie" (anthropomorphic pet — DEFAULT for pet briefs: fluffy biped in cute clothes)
"Create a calico kitten content creator, sleepy baby vibe, tiny knitted sweater" (anthropomorphic pet)
"Create a real four-legged tortoiseshell cat in a sunlit Tokyo apartment" (real-form pet — opt-in only with explicit "real form / four-legged" cue)
"Make me an anime magical-girl librarian" (stylized)
"Create a strawberry character, brainrot 3D-animated vibe" (stylized fruit)
"Create a persona based on /path/photo.png — indie folk musician" (reference)
"Create a persona for: fitness coach — and add a tiktok reel" (with reel)

Output: `casting.png`, `persona.md`, `_meta.json` (plus `reel-hero.png` and `reel.mp4` if reel requested).

Cost: ~3 cr lean / ~14 cr with reel.

Pipeline

Step 1 — Intent

Bias hard toward "infer and proceed." Only ask if brief is truly thin (1–2 words). Invent missing details (gender, age, ethnicity, vibe), note in `persona.md`, let creator re-roll.

If you must ask, ask exactly ONE direct question. Never enumerate A/B/C/D menus. Never stack multiple questions.

GOOD response (only when brief too thin):

Give me a one-liner — vibe / type / niche.
Examples:
- anthropomorphic pet (default for pet briefs): "fluffy golden puppy influencer, sassy queen, mini hoodie" / "calico kitten creator, sleepy baby, tiny sweater"
- real-form pet (opt-in): "real four-legged tortie cat in a sunlit apartment"
- realistic human: "Berlin art curator, dark academia, mid-thirties"
- stylized: "anime magical-girl librarian" / "anthropomorphic strawberry, brainrot 3D"
Add-ons: "from /path/photo.png" / "add a tiktok reel" / "named Mochi"

BAD: A/B/C/D menus + multiple questions stacked. Don't.

Step 2 — Identity → `persona.md`

Write: name | bio (2–3 sentences) | voice/tone | frozen appearance block (verbatim, reuse in every prompt). Block contains identity DNA only (face geometry, eye/hair/skin, body type, distinguishing marks, wardrobe aesthetic baseline) — NOT per-shot deltas (expression, pose, lighting, scene, specific outfits).

Step 3 — Casting card

One call. 4 head-to-toe angles, plain seamless gray, signature wardrobe, neutral expression. 9:16 portrait with 2×2 grid inside (each panel ≈ 9:16 — fits full body).

bash

URL=$(gen-ai generate -m <style-model> -p "<subject-opening> The image shows the same exact character from four camera angles in a 2x2 portrait grid (9:16 canvas). ALL FOUR PANELS share: identical plain seamless studio gray background — flat uniform fill, no gradient/texture/scene. Identical signature wardrobe — same complete outfit head to feet (or, for common pets, identical simple accessories like collar/bandana/sweater — never humanoid clothing). Identical neutral expression — relaxed mouth. Identical even soft frontal softbox key + subtle fill + soft ground shadow, no rim lights, no colored gels. Identical hair/fur. Same identity in every panel: <frozen appearance block>. Differs only in angle: TOP-LEFT front-facing full body eyes at camera; TOP-RIGHT 3/4 facing camera-right; BOTTOM-LEFT full left profile looking off-left; BOTTOM-RIGHT 3/4 from behind over-the-shoulder. Magazine fashion model sheet composition, thin clean grid lines. The four panels MUST look like consecutive shots from one session — same wardrobe, backdrop, lighting, character; only angle differs. Absolutely no text, no captions, no watermarks, no logos, no UI elements, no phone, no device, no screen, no social media overlays in any panel." --aspect-ratio 9:16 --json --no-input | grep -oE 'https?://[^"]+' | head -1)
curl -sSL -o ./<persona-slug>/casting.png "$URL"

`<style-model>` = `gemini-3.1-flash-image` (realistic) or `grok-imagine` (stylized). Apply fallback wrapper to `flux-2-max`.

Reference image: add `-i <reference-path>` to this same call. Identity via i2i, same prompt + cost.

Subject openings (replace `<subject-opening>` above)

Photoreal human (default) — "Professional fashion photograph head-to-toe casting card / model sheet, shot on 85mm lens, RAW photo, 8k UHD, crisp focus, photorealistic, natural skin texture with visible pores, no AI smoothing."
Anthropomorphic humanoid pet (DEFAULT for pet/animal briefs — fluffy biped in cute clothes, project's `pets` category) — "An anthropomorphic [puppy / kitten / bunny / hamster / duckling / fox cub / baby panda / hedgehog / penguin / monkey] character standing upright on two legs like a human, full body visible head to toe, humanoid body proportions, expressive face, [coat detail — e.g. warm golden honey-colored fur / pure snow white fluffy fur / deep midnight black sleek fur / warm ginger orange fur / chocolate brown fur / shimmering silver grey fur / patchy calico orange-white-black fur / soft cream colored fur], adorable, looking directly at camera, professional fashion photograph, shot on 85mm lens, shallow depth of field, cinematic studio lighting with soft key light, photorealistic, RAW photo, 8k ultra high definition, crisp focus."
  Wardrobe options the agent can pick from when composing the casting card outfit: tiny knitted sweater | mini oversized hoodie | dapper bow tie + collar | flower crown of daisies and roses | tiny stylish sunglasses | flowing superhero cape | stylish bandana around neck | au naturel (no clothing, just fluffy fur).
  Vibe options for expression / pose: Sassy Queen (hand on hip, serving looks, unbothered) | Silly King (goofy, tongue out, awkward funny pose) | Sleepy Baby (drowsy half-asleep, leaning) | Zoomies Mode (excited, arms up, chaotic joy) | Distinguished (regal, arms crossed, noble) | Mischief Maker (sneaky, hands behind back, guilty-not-sorry).
  Suggested name (food-themed, project's pool): Biscuit, Mochi, Nugget, Bean, Waffles, Tofu, Dumpling, Peanut, Pickle, Noodle, Churro, Pretzel, Taco, Maple, Truffle, Sesame, Crouton, Muffin, Cupcake, Boba.
  Suggested bio style (gen-z internet humor, pipe-separated): "professional napper | treat negotiator | certified good boy/girl" / "fluffy & unbothered | snack motivated | full-time cuddle bug" / "chaos gremlin | zoomies champion | will boop for treats".
Real-form quadruped pet (opt-in only — creator explicitly said "real form / four-legged / real cat / on all fours") — "Professional pet portrait photograph head-to-toe model sheet of a [breed] [animal] in their natural anatomical form (four-legged / quadruped, NOT humanoid), full body nose-to-tail visible, shot on 85mm with shallow depth of field, RAW, 8k UHD, photorealistic natural fur with visible individual hairs, no AI smoothing. Pet may wear simple accessories (collar, bandana, harness) but never humanoid clothing — the character is the animal in real anatomical form."
3D-animated anthropomorphic fruit / object — "High quality 3D-animated head-to-toe character sheet of an anthropomorphic [fruit/object] character, feature-film animation aesthetic, [fruit/object] serves as the head on a full human-proportioned athletic body, [skin/surface] texture extending naturally to arms and hands, ultra-high resolution, brainrot character-drama vibe, dramatic cinematic studio lighting with soft fill + subtle ground shadow."
Anime / manga — "High quality anime / manga style head-to-toe character sheet, cel-shaded illustration, clean line art, vibrant saturated colors, soft anime lighting, expressive eyes, [shoujo/shonen/kawaii] aesthetic, magazine character reference sheet composition."
Stylized 3D-animated human / fantasy — "High quality stylized 3D-animated head-to-toe character sheet, feature-film animation aesthetic, soft global illumination, slightly exaggerated proportions, expressive features, character-animation art direction."
Painted / illustrated — "Hand-painted editorial illustration head-to-toe character sheet, [watercolor/gouache/digital painting] aesthetic, painterly brushwork, layered soft light, magazine illustration composition."

Casting-card rules — non-negotiable: identical bg / wardrobe-or-accessories / lighting / expression / hair-fur across all 4 panels — only angle differs | bg flat plain gray | full body (head-to-toe humans/bipeds INCLUDING anthropomorphic humanoid pets, nose-to-tail quadrupeds for real-form pet opt-in) | wardrobe stays same in all panels (same outfit for humans + anthropomorphic pets — yes, anthropomorphic pets wear humanoid clothing like tiny sweaters/mini hoodies/bow ties; only real-form quadruped pets are limited to simple accessories like collar/bandana/harness) | expression and pose match the chosen vibe (Sassy Queen / Silly King / etc. for anthropomorphic pets) — neutral default for humans, eyes at camera (or off per profile/back).

Step 4 — Reel (only if requested)

Two sub-calls. Seedance treats `imageUrls` as first frame (verified), so passing the casting-card grid would open the reel on it. So: generate single-frame reel-hero first, then animate.

Pick the concept first

Don't auto-default to "slow contemplative push-in" — most creator content rewards confident energy.

Hook rule: first second must arrest attention. Platform sensitivity: TikTok / IG Reel / Shorts → punchy; LinkedIn / YouTube → professional / calm; fruit / 3D / anime → lean stylized + confident (calm beats fall flat for them).

Environment / lighting: atmospheric specifics > generic. Replace "in a cafe" with "neon-pink Tokyo coffee shop interior, signage reflections" (punchy) OR "rain-streaked window with candlelight, steam from teacup" (calm). Match energy to concept.

Sub-step 4-i: Reel hero (gemini i2i, target AR, single full-body frame)

bash

URL=$(gen-ai generate -m gemini-3.1-flash-image -i ./<persona-slug>/casting.png -p "<subject-opening from Step 3> Single full-body photograph of the same character from the casting-card reference, head-to-toe in frame. <frozen appearance block>. Wearing the same signature wardrobe shown in casting card. <opening pose / framing for chosen concept>. <atmospheric environment + lighting>. Composition: full body head to toe, framed for video animation in <platform-AR>. Real photograph quality (or stylized rendering per opening). No text, no captions, no watermarks, no logos, no UI, no phone, no device, no screen, no social media overlays." --aspect-ratio <platform-AR> --json --no-input | grep -oE 'https?://[^"]+' | head -1)
curl -sSL -o ./<persona-slug>/reel-hero.png "$URL"

Cost: ~3 cr. Apply fallback to `flux-2-max`.

Reel-hero ≠ final action pose. Gemini tends to preserve the casting card's neutral stance even when prompted for power-pose / mid-action (verified). That's fine — the action lands in the Seedance prompt at 4-ii. Don't re-roll the hero just because the pose looks calmer than expected.

Sub-step 4-ii: Animation (Seedance i2v, audio enabled)

Platform → AR + duration:

Platform	AR	Duration
tiktok / instagram-reel / instagram-story / youtube-shorts	9:16	8s
instagram-feed	1:1 (Seedance has no 4:5; closest universal)	6s
youtube / linkedin / x / twitter	16:9	8–10s
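
The table maps directly onto a `case` statement. In this sketch `platform_params` is a hypothetical helper; it picks 8s for the 16:9 row (the low end of the 8–10s range) and treats an unknown platform as an error, both of which are assumptions.

```bash
# platform_params <platform>  ->  "<aspect-ratio> <seconds>"
platform_params() {
  case "$1" in
    tiktok|instagram-reel|instagram-story|youtube-shorts) echo "9:16 8" ;;
    instagram-feed) echo "1:1 6" ;;
    youtube|linkedin|x|twitter) echo "16:9 8" ;;   # table allows 8-10s; 8 chosen here
    *) return 1 ;;
  esac
}

# Split the pair into the two values used for --aspect-ratio and --duration.
params=$(platform_params instagram-feed)
ar=${params% *}
secs=${params#* }
echo "--aspect-ratio $ar --duration $secs"
# prints --aspect-ratio 1:1 --duration 6
```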

bash

URL=$(gen-ai generate -m seedance-2.0 -i ./<persona-slug>/reel-hero.png -p "<subject-opening>. <frozen appearance block>. Wearing same signature wardrobe. <single action from vocabulary matching the concept — strong language here, this is where action actually lands>. <same atmospheric environment + lighting as hero>. <single camera move from vocabulary>. Audio: <ambient soundscape matching scene — environmental sounds, mood-appropriate underscore; no spoken dialogue, no voiceover, no music vocals>. Single continuous moment, no scene changes, no multiple sequential actions, no fast or chaotic movement. No text, no captions, no watermarks, no logos, no UI, no phone, no device, no screen, no social media overlays." --aspect-ratio <platform-AR> --duration <platform-duration> --generate-audio --json --no-input | grep -oE 'https?://[^"]+' | head -1)
curl -sSL -o ./<persona-slug>/reel.mp4 "$URL"

Cost: 1 cr/sec × duration. Total reel: ~8–13 cr.

Seedance prompt order (verified KLING_RULES): Subject → Action → Environment → Camera → Lighting → Audio. One continuous camera move, one primary action — never chain.

Models we DON'T use for reel: any `startFrame`-only i2v (`seedance-i2v`, `hailuo-2.3-fast`, `runway-gen3a-turbo`, `wan-2.7-i2v`, `luma-flash2-i2v`, `pika-frames`) drifts across the clip; `runway-gen4-ref` returns a still PNG (verified, not a video); `kling-3.0-pro`, `veo-3.1`, `veo-3.1-fast` are `startFrame`-only; multi-image char-ref modes (Kling element / Veo Ingredients) aren't surfaced in the CLI today (roadmap).

Honest constraint: Seedance's `imageUrls` behaves as first frame, not pure char-ref. Single-frame hero + i2v = clean character image opens the reel and animates from there.

Step 5 — Captions, deliver

Append captions to `persona.md`: 3 by default, in persona's voice. Hashtag block ALWAYS leads with `#picsart #picsartcreator`, then platform-specific niche tags.

Platform	Length	Niche tags after Picsart pair
tiktok / youtube-shorts	80–150 chars, single hook	4–6 trending
instagram (reel/story/feed)	150–300 chars, hook + story	6–10
youtube standard	300–500 chars, keyword-dense	3–5 keyword
linkedin	500–1000 chars, professional	3–5 industry
x / twitter	≤280 chars total (incl tags)	1–2
(no platform)	~150 chars, balanced	4–6 generic
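
The length bounds and the fixed hashtag lead can be checked mechanically. A sketch with hypothetical helper names (`caption_max_chars`, `hashtag_block`, `caption_fits`); it uses only the upper bound of each range and counts characters with `${#var}`, which depends on the shell's locale for non-ASCII text.

```bash
# caption_max_chars <platform>  ->  upper bound from the table
caption_max_chars() {
  case "$1" in
    tiktok|youtube-shorts) echo 150 ;;
    instagram*) echo 300 ;;
    youtube) echo 500 ;;
    linkedin) echo 1000 ;;
    x|twitter) echo 280 ;;     # total, including tags
    *) echo 150 ;;             # no platform given
  esac
}

# hashtag_block <niche tags...>  ->  Picsart pair first, then the niche tags
hashtag_block() {
  if [ "$#" -gt 0 ]; then
    echo "#picsart #picsartcreator $*"
  else
    echo "#picsart #picsartcreator"
  fi
}

# caption_fits <platform> <text>  ->  exit 0 when text is within the bound
caption_fits() {
  [ "${#2}" -le "$(caption_max_chars "$1")" ]
}

hashtag_block "#fitness" "#gymtok"
# prints #picsart #picsartcreator #fitness #gymtok
```

For x / twitter the text passed to `caption_fits` should already include the hashtag block, since the 280 limit covers both.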

Print final summary:

✓ Persona "Lena" delivered. Local: ./lena/. Spent: ~3 credits. Files: casting.png, persona.md (+ _meta.json)

Add `reel-hero.png` and `reel.mp4` to the file list if reel was generated.

Cost transparency

Show plan before spending: pull live rates with `gen-ai pricing <model>`, never hardcode. After each step: `✓ <step> (<credits>)`

Plan:
  Casting card (gemini-3.1-flash-image, 1 image)         ~3 cr
[ Reel hero (gemini-3.1-flash-image, 1 image)            ~3 cr ]   reel only
[ Reel animation (seedance-2.0, 8s @ 9:16)               ~8 cr ]
  ────────────────────────────────────────────────────────
  Estimated total                                       ~3 or ~14 cr
Continue? [Y/n]
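
The plan's arithmetic can be sketched as a small function. Rates are passed in as arguments so they can come from `gen-ai pricing <model>` rather than being hardcoded; how to parse that command's output is not shown here because its format is not documented in this text. `estimate_total` is a hypothetical name, and it handles whole-number credits only.

```bash
# estimate_total <image-cr> <video-cr-per-sec> <reel-seconds>
# reel-seconds of 0 means no reel was requested.
estimate_total() {
  image_cr=$1
  per_sec=$2
  secs=$3
  total=$image_cr                                   # casting card
  if [ "$secs" -gt 0 ]; then
    total=$((total + image_cr + per_sec * secs))    # reel hero + animation
  fi
  echo "$total"
}

estimate_total 3 1 0   # lean run: prints 3
estimate_total 3 1 8   # with an 8s reel: prints 14
```

With the approximate rates quoted in the plan (3 cr per image, 1 cr per second), this reproduces the ~3 and ~14 cr totals.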

Output

./<persona-slug>/
├── persona.md       # name, bio, voice, frozen appearance block, captions
├── casting.png      # head-to-toe 4-angle casting card
├── reel-hero.png    # only if reel requested
├── reel.mp4         # only if reel requested (includes ambient audio)
└── _meta.json       # step parameters
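
Before printing the final summary, the expected files can be checked against this layout. A sketch: `check_outputs` is a hypothetical helper, and treating an empty file as missing is an assumption (a failed `curl` can leave a zero-byte file behind).

```bash
# check_outputs <persona-dir> [reel]  ->  lists problems, exit 1 if any
check_outputs() {
  dir=$1
  files="persona.md casting.png _meta.json"
  if [ "${2:-}" = "reel" ]; then
    files="$files reel-hero.png reel.mp4"
  fi
  missing=0
  for f in $files; do
    if [ ! -s "$dir/$f" ]; then      # -s: exists and is non-empty
      echo "missing or empty: $dir/$f"
      missing=1
    fi
  done
  return "$missing"
}
```

Usage would be `check_outputs ./lena` for a lean run or `check_outputs ./lena reel` when a reel was requested.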

Re-rolls

Natural language. Agent reads `_meta.json` and reruns the right step:

"Regenerate Lena with darker hair" (~3 cr)
"Redo the reel with a slow camera push instead of static" (~8 cr)

Confirm spend before re-running.

Limitations (today)

Local-only output (Drive integration tracked v1.1 once new CLI Drive API ships)
One persona per run (multi-persona via `gen-ai generate -m kling-multi-image-v2-1 -i nova/casting.png -i lena/casting.png -p "<scene>"` or future Scene Composer skill)
No premium photoreal tier (`gemini-3-pro-image` deferred)
No premium motion-control reel (Kling Motion Control V3 + creator motion-ref deferred)
No voice / talking-head reel (Picsart-Eleven gender unreliable). Reel ships with Seedance ambient audio: environmental + atmospheric underscore, no synthesized speech
No bespoke music (Seedance underscore via `--generate-audio`; dedicated music pass deferred)
No Kling-element / Veo-Ingredients char-ref video (not surfaced in CLI)
No built-in scene variations (casting card is the character; downstream tools handle scenes)
