text-to-visual

Text to Visual

A single skill that covers every "given text, produce a matching image" workflow that Picsart's `gen-ai` CLI supports. Use it when the user has any kind of written content (a paragraph, a blog draft, a URL) and needs visuals generated from it. It replaces three narrower skills with one entry point and three mode references.
Input: text (paragraph, article, or URL). Output: one or more images matched to the text's tone, topic, and target placement.

When to Use

| Mode | Trigger phrases | Reference |
| --- | --- | --- |
| single | "match a visual to this paragraph", "inline image for this section", "one visual for this content" | `references/modes/single.md` |
| article-set | "illustrate this blog post", "hero + inline visuals for an article", "full visual set for a draft" | `references/modes/article-set.md` |
| og | "OG image for this URL", "open graph preview", "Twitter card image", "dynamic meta image" | `references/modes/og.md` |

If the user wants product photos transformed, that's `product-photo-studio`. If they want video, that's `motion-studio`.
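
The mode lookup in the table can be sketched as a small shell function. This is a minimal illustration, not part of the skill itself: the function name `pick_mode` and the keyword patterns are assumptions derived from the trigger phrases above, and requests for product photos or video are not handled here because they belong to other skills.

```bash
# Hypothetical helper: map a user request to one of the three modes.
# The keyword patterns are illustrative, taken from the trigger phrases in the table.
pick_mode() {
  local request
  # Lowercase the request so matching is case-insensitive.
  request=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$request" in
    *"og image"*|*"open graph"*|*"twitter card"*|*"meta image"*)
      echo "og" ;;
    *"blog post"*|*"article"*|*"visual set"*|*"hero"*)
      echo "article-set" ;;
    *)
      # Anything else is treated as a paragraph- or section-level request.
      echo "single" ;;
  esac
}

pick_mode "Illustrate this blog post"   # prints: article-set
```

Falling back to `single` keeps the sketch simple; a stricter version might ask the user to clarify when no pattern matches.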

Prerequisites

Picsart `gen-ai` CLI installed and authenticated:

```bash
curl -fsSL https://picsart.com/gen-ai-cli/install.sh | bash
gen-ai login
gen-ai whoami
```

Per-mode setup (caching, serverless endpoints, font choices) is documented inside each mode reference.
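
A preflight check can catch a missing install before any generation step runs. The helper below is a sketch under assumptions: `require_cmds` is a hypothetical name, and it only tests that a command is on `PATH`, not that the login succeeded.

```bash
# Hypothetical preflight helper: succeeds only if every named command is on PATH.
require_cmds() {
  local cmd missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing required command: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Possible use before a run:
#   require_cmds gen-ai && gen-ai whoami
```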

How to Run

  1. Identify the mode from the user's request using the table in When to Use.
  2. Load the corresponding mode reference: read `references/modes/<mode>.md`.
  3. Follow the procedure described there — extract text signals, build prompt, generate.
  4. Return to this SKILL.md only when switching modes mid-task.
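
Step 2 amounts to a lookup from mode name to file path. A minimal sketch follows, assuming a hypothetical helper named `mode_reference`; it rejects anything other than the three documented modes.

```bash
# Hypothetical helper: resolve a mode name to its reference file.
mode_reference() {
  case "$1" in
    single|article-set|og)
      echo "references/modes/$1.md" ;;
    *)
      echo "unknown mode: $1" >&2
      return 1 ;;
  esac
}

mode_reference article-set   # prints: references/modes/article-set.md
```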

Quick Reference

```bash
# Single image from a prompt
gen-ai generate --model <model> --prompt "<derived from text>"

# Estimate cost first
gen-ai pricing --model <model> --count <N>

# Browse available models
gen-ai models
```

Prompt-construction patterns (how to derive a prompt from a paragraph, an article, or a URL's metadata) live in the individual mode references.

Procedure

Shared outer loop:
  1. Extract signals: pull subject, tone, palette hints, and target dimensions from the input text.
  2. Build prompt: translate signals into a `gen-ai` prompt; each mode has its own template.
  3. Estimate: run `gen-ai pricing` before committing to a multi-image run.
  4. Generate: invoke `gen-ai generate`. Stream progress.
  5. Place: drop the output into the right slot (PDP, blog frontmatter, OG meta tag, social variant).
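
The outer loop can be sketched as a dry run that prints the commands instead of executing them. Everything here is illustrative: `build_prompt` and `plan_run` are hypothetical helpers, the prompt wording is not one of the mode templates, and only the `gen-ai` flags shown in the Quick Reference are used. The printed lines are a plan for review, so prompts are not shell-escaped.

```bash
# Hypothetical prompt builder: joins extracted signals into one prompt string.
# The field order and wording are illustrative, not a template from a mode reference.
build_prompt() {
  local subject=$1 tone=$2 palette=$3
  printf '%s, %s tone, %s palette' "$subject" "$tone" "$palette"
}

# Dry run: print the commands for one model and one or more prompts.
plan_run() {
  local model=$1
  shift
  # Estimate first when the run covers more than one image.
  if [ "$#" -gt 1 ]; then
    printf 'gen-ai pricing --model %s --count %s\n' "$model" "$#"
  fi
  local prompt
  for prompt in "$@"; do
    printf 'gen-ai generate --model %s --prompt "%s"\n' "$model" "$prompt"
  done
}

hero=$(build_prompt "city bike commuters at dawn" "optimistic" "warm")
inline=$(build_prompt "close-up of a bike lane marking" "calm" "warm")
plan_run "<model>" "$hero" "$inline"
```

With two prompts, the plan prints one pricing line followed by two generate lines; with a single prompt, the pricing line is skipped.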

Pitfalls

  • Don't generate from raw text. Always extract signals first; raw paragraphs produce literal, lifeless images.
  • Match aspect ratio to placement. OG = 1200×630, blog hero = 16:9, social = varies.
  • Cache OG images. Don't regenerate on every page view; see `references/modes/og.md`.
  • Mode-specific pitfalls live inside the individual mode references.
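
Two of the pitfalls above lend themselves to small helpers. The sketch below assumes hypothetical names (`target_size`, `og_cache_path`) and a hypothetical `cache/og/` directory; the actual caching setup is described in `references/modes/og.md` and may differ. The cache key uses `cksum` only for brevity, and a stronger hash would be a safer choice where collisions matter.

```bash
# Illustrative lookup from placement to target size, using the values listed above.
target_size() {
  case "$1" in
    og)        echo "1200x630" ;;
    blog-hero) echo "16:9" ;;
    *)         echo "unknown placement: $1" >&2; return 1 ;;
  esac
}

# Hypothetical cache path for an OG image, keyed by a checksum of the page URL.
og_cache_path() {
  local url=$1 key
  # cksum prints "<crc> <byte count>"; keep only the CRC as the key.
  key=$(printf '%s' "$url" | cksum | cut -d' ' -f1)
  echo "cache/og/$key.png"
}

# Regenerate only on a cache miss (the generation step itself is not shown):
#   path=$(og_cache_path "$url")
#   [ -f "$path" ] || echo "cache miss, generate and save to $path"
```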

Verification

```bash
# Confirm output exists and matches expected dimensions
gen-ai inspect outputs/<run>/<image>.png

# Spot-check the visual matches the source text by re-reading both side by side
```
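
Since the output format of `gen-ai inspect` is not shown here, a dimension check can also be done without it. The sketch below is an independent alternative, not a feature of the CLI: `png_size` is a hypothetical helper that reads the width and height straight from a PNG's IHDR chunk using `od`. It assumes the file really is a PNG and does not validate the signature.

```bash
# Hypothetical check: read width and height from a PNG's IHDR chunk
# (bytes 16-23 of the file, two big-endian 32-bit integers).
png_size() {
  local file=$1
  [ -f "$file" ] || { echo "missing file: $file" >&2; return 1; }
  # od prints the eight bytes as unsigned decimals; set -- splits them into $1..$8.
  set -- $(od -An -tu1 -j16 -N8 "$file")
  [ "$#" -eq 8 ] || { echo "file too short to be a PNG: $file" >&2; return 1; }
  echo "$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))x$(( ($5 << 24) | ($6 << 16) | ($7 << 8) | $8 ))"
}

# Possible use for an OG image:
#   [ "$(png_size "outputs/<run>/<image>.png")" = "1200x630" ] && echo "dimensions ok"
```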

See also

  • `product-photo-studio`: transform existing product photos
  • `motion-studio`: video from text
  • `gen-ai-use`: foundational `gen-ai` CLI reference