sn-infographic


Infographic generation scene skill (tier 1). It relies on the `sn-image-generate`, `sn-image-recognize`, and `sn-text-optimize` tools provided by `sn-image-base` (tier 0).
Features:
  • Evaluation of prompt quality (auto mode)
  • Prompt expansion (force/auto mode)
  • Multiple rounds of image generation and VLM review
  • Output the best result based on quality ranking

Input Specification

| Parameter | Type | Default Value | Description |
|---|---|---|---|
| `user_prompt` | string | Required | The user's original request |
| `max_rounds` | int | 1 | Maximum number of generation rounds |
| `output_mode` | string | friendly | Output mode: friendly / verbose |
| `prompts_expand_mode` | string | auto | Expansion strategy: auto / force / disable |

API Configuration

All API calls in this skill are executed through `sn_agent_runner.py` of the `sn-image-base` skill. Authentication parameters use default values (priority: CLI > environment variables > built-in defaults) and do not need to be passed explicitly.

| Call Type | Tool | Authentication Parameters | Description |
|---|---|---|---|
| LLM | sn-text-optimize (evaluation/expansion) | By default reads `SN_TEXT_API_KEY` -> `SN_CHAT_API_KEY` -> `SN_API_KEY` | Built-in default points to the Sensenova internal network service |
| VLM | sn-image-recognize (image review) | By default reads `SN_VISION_API_KEY` -> `SN_CHAT_API_KEY` -> `SN_API_KEY` | Built-in default points to the Sensenova internal network service |
| Image Generation | sn-image-generate | By default reads `SN_IMAGE_GEN_API_KEY` -> `SN_API_KEY`; `SN_IMAGE_GEN_API_KEY` is only needed for an image-specific override | Uses the image generation configuration of `sn-image-base` by default |

When encountering `MissingApiKeyError` or needing to specify a model: pass the values explicitly via CLI parameters. For the parameter reference, see `$SN_IMAGE_BASE/references/api_spec.md`.

`$SN_IMAGE_BASE` path explanation: `$SN_IMAGE_BASE` is the installation directory of the `sn-image-base` skill (the directory where `SKILL.md` exists). The agent can locate this path by looking up the skill name `sn-image-base` in the list of installed skills.
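The lookup order in the table can be illustrated with a small sketch. It is not the code of `sn_agent_runner.py`; the function name `resolve_api_key`, the `KEY_CHAINS` dictionary, and the call-type labels are hypothetical, and only the environment variable names and the CLI > environment > built-in priority come from the text above.

```python
import os
from typing import Mapping, Optional

# Environment variable fallback chains as listed in the table.
# The dictionary name and its keys ("llm", "vlm", "image") are hypothetical labels.
KEY_CHAINS = {
    "llm": ("SN_TEXT_API_KEY", "SN_CHAT_API_KEY", "SN_API_KEY"),
    "vlm": ("SN_VISION_API_KEY", "SN_CHAT_API_KEY", "SN_API_KEY"),
    "image": ("SN_IMAGE_GEN_API_KEY", "SN_API_KEY"),
}


def resolve_api_key(
    call_type: str,
    cli_value: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    builtin_default: Optional[str] = None,
) -> Optional[str]:
    """Return the first available key: CLI value, then env chain, then built-in default."""
    if cli_value:
        return cli_value
    env = os.environ if env is None else env
    for name in KEY_CHAINS[call_type]:
        value = env.get(name)
        if value:
            return value
    # A caller could raise its own "missing key" error when this is None.
    return builtin_default
```

With both `SN_CHAT_API_KEY` and `SN_API_KEY` set, an `"llm"` call in this sketch picks `SN_CHAT_API_KEY`, because it appears earlier in the chain.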

Architecture: Main Agent + Worker Agent

This skill uses a two-tier agent architecture:

| Role | Responsibility |
|---|---|
| Main Agent | Receives the user request, normalizes parameters, sends the preflight message, starts the Worker, collects results, and sends text and images to the user |
| Worker Agent | Executes the orchestration loop (expand → multiple rounds of generation + review → sort) and returns structured JSON |

Responsibility Boundaries:
  • The Worker Agent does not send any messages to the user directly; it only returns structured JSON
  • The Main Agent is responsible for sending all user-visible messages
  • The Worker Agent's last message must be exactly the JSON string defined in the Return Contract, and nothing else
  • The Worker Agent's internal VLM calls always execute directly, without spawning subagents

Workflow

Main Agent Workflow

  1. Extract `user_prompt`, `max_rounds` (default 1), `output_mode` (default `friendly`), and `prompts_expand_mode` (default `auto`) from the user request
  2. Send the uniform preflight message: "Using sn-infographic skill to generate infographic, please wait..."
  3. Start the Worker Agent (sub-agent), passing in the complete parameters and the working directory
  4. When the Worker Agent returns `status=ok` and `need_main_agent_send=true`:
    • max_rounds = 1: Send a one-sentence description of the image content (no more than 50 characters), then send the single rank=1 image
    • max_rounds > 1, friendly mode: Generate a one-sentence natural-language description (no more than 50 characters) based on `result` and `violations`, send the evaluation text, then send the single rank=1 image
    • max_rounds > 1, verbose mode: Send the complete text summary message, then send all images to the user in rank order
  5. If the Worker Agent returns `status=error`, report the actual content of the `error` field to the user
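The branching in steps 4 and 5 can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name `plan_user_messages` and the action labels it returns are hypothetical, and the sketch follows the literal reading of step 4 in which `max_rounds = 1` ignores `output_mode`.

```python
from typing import Any, Dict, List


def plan_user_messages(worker_result: Dict[str, Any], max_rounds: int) -> List[str]:
    """Return the ordered actions the Main Agent would take (labels are hypothetical)."""
    status = worker_result.get("status")
    if status == "error":
        # Step 5: surface the actual error text, unchanged.
        return ["report_error:" + str(worker_result.get("error", ""))]
    if status != "ok" or not worker_result.get("need_main_agent_send"):
        return []
    if max_rounds > 1 and worker_result.get("output_mode") == "verbose":
        return ["send_full_summary", "send_all_images_in_rank_order"]
    if max_rounds > 1:
        return ["send_evaluation_sentence", "send_rank1_image"]
    # max_rounds = 1: one-sentence description, then the single rank=1 image.
    return ["send_description_sentence", "send_rank1_image"]
```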

Worker Agent Workflow

The Worker Agent receives `user_prompt`, `max_rounds`, `output_mode`, `prompts_expand_mode`, and the working directory of this skill (`SN_IMAGE_INFOG`).

Step 0 — Initialization

  1. Generate `task_id` (using a timestamp, format `YYYYMMDD_HHMMSS`)
  2. Create a uniform temporary directory, `/tmp/openclaw/sn-infographic/<task_id>/`, as `TEMP_DIR`
  3. Initialize an empty `rounds` list
  4. Infer `aspect_ratio` (default `16:9`) and `image_size` (default `2k`) from `user_prompt`, following the rules in `$SKILL_DIR/references/runtime-parameters.md`
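A minimal sketch of the initialization steps, assuming the Worker Agent were scripted in Python. The function name `init_task` and the returned dictionary shape are hypothetical; the inference rules in `runtime-parameters.md` are not reproduced here, so the sketch only fills in the documented defaults.

```python
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, Optional


def init_task(
    base_dir: str = "/tmp/openclaw/sn-infographic",
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Create TEMP_DIR for a new task and return the initial state."""
    now = now or datetime.now()
    task_id = now.strftime("%Y%m%d_%H%M%S")  # YYYYMMDD_HHMMSS
    temp_dir = Path(base_dir) / task_id
    temp_dir.mkdir(parents=True, exist_ok=True)
    return {
        "task_id": task_id,
        "temp_dir": temp_dir,
        "rounds": [],
        # Documented defaults only; real inference follows runtime-parameters.md.
        "aspect_ratio": "16:9",
        "image_size": "2k",
    }
```

Because the timestamp has one-second resolution, two tasks started within the same second would share a directory in this sketch.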

Step 1 — `prompts_expand_mode` Processing

`disable` mode:
  • Skip expansion and use `user_prompt` directly as `expanded_prompt`
  • Assign the variable and write it to the temporary directory:
    ```bash
    EXPANDED_PROMPT="$USER_PROMPT"
    echo "$EXPANDED_PROMPT" > "$TEMP_DIR/expanded-prompt.txt"
    ```
  • Record `prompts_expand_skipped = true`

`force` mode:
  • Go directly to Step 2

`auto` mode:
  1. Call sn-text-optimize for evaluation
  2. Parse the JSON and extract `required_results` and `optional_results`
  3. Decision logic:
    • `required_pass`: every `answer` in `required_results` is `"yes"`
    • `optional_pass`: the number of entries in `optional_results` with `answer="yes"`, divided by the total, is ≥ 0.6
    • `should_expand = not (required_pass and optional_pass)`
  4. If JSON parsing fails, default to `should_expand = true` (conservative strategy)
  5. If `should_expand = false`: skip Step 2, assign the variable, write it to the temporary directory, and record `prompts_expand_skipped = true`:
    ```bash
    EXPANDED_PROMPT="$USER_PROMPT"
    echo "$EXPANDED_PROMPT" > "$TEMP_DIR/expanded-prompt.txt"
    ```
  6. If `should_expand = true`: execute Step 2

Evaluation call (using the `sn-text-optimize` tool of `sn-image-base`):
```bash
python "$SN_IMAGE_BASE/scripts/sn_agent_runner.py" sn-text-optimize \
  --system-prompt-path "$SKILL_DIR/references/evaluation-standard.md" \
  --user-prompt "$USER_PROMPT" \
  --output-format json
```
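The `auto` mode decision can be written out as a short function. This is a sketch, not the skill's own code: the function name `should_expand` is hypothetical, it assumes the JSON object passed in holds `required_results` and `optional_results` at the top level, and it treats an empty `optional_results` list as not passing, a case the text above does not define.

```python
import json


def should_expand(evaluation_stdout: str, optional_threshold: float = 0.6) -> bool:
    """Return True when prompt expansion (Step 2) should run."""
    try:
        data = json.loads(evaluation_stdout)
        required = data["required_results"]
        optional = data["optional_results"]
        required_pass = all(item["answer"] == "yes" for item in required)
        yes_count = sum(1 for item in optional if item["answer"] == "yes")
        # Assumption: an empty optional list counts as a failed check.
        optional_pass = bool(optional) and yes_count / len(optional) >= optional_threshold
    except (ValueError, KeyError, TypeError):
        # Conservative strategy: any parsing problem means "expand".
        return True
    return not (required_pass and optional_pass)
```

The broad `except` clause mirrors the conservative rule: malformed output, missing keys, and unexpected shapes all lead to expansion rather than to skipping it.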

Step 2 — Content Analysis + Layout & Style Selection + Prompt Expansion

2.0 Content Analysis (using the `sn-text-optimize` tool of `sn-image-base`):
```bash
ANALYSIS=$(python "$SN_IMAGE_BASE/scripts/sn_agent_runner.py" sn-text-optimize \
  --system-prompt-path "$SKILL_DIR/references/analysis-framework.md" \
  --user-prompt "$USER_PROMPT" \
  --output-format json)
```
Save the analysis result (stdout) to `analysis.json` in the temporary directory (`$TEMP_DIR/analysis.json`):
```bash
echo "$ANALYSIS" > "$TEMP_DIR/analysis.json"
```
2.1 Layout & Style Selection
  1. Read the analysis result from `$TEMP_DIR/analysis.json`:
```bash
ANALYSIS=$(cat "$TEMP_DIR/analysis.json")
```
  2. Based on `data_type`, `tone`, and `audience`, select `layout` and `style` following the rules in `$SKILL_DIR/references/layout-style-selection.md`
  3. Read the layout/style definition files:
```bash
LAYOUT_DEF=$(cat "$SKILL_DIR/references/layouts/<layout>.md")
STYLE_DEF=$(cat "$SKILL_DIR/references/styles/<style>.md")
```
If a file does not exist, fall back to `hub-spoke` + `corporate-memphis`.
  4. Save the selection result to `$TEMP_DIR/layout-style.json`

Format of `layout-style.json`:
```json
{
  "layout": "<layout>",
  "style": "<style>"
}
```
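The fallback rule in 2.1 can be sketched as a file-existence check. The function name `resolve_layout_style` is hypothetical, and the sketch assumes that a missing file on either side replaces the whole pair with `hub-spoke` + `corporate-memphis`; the text does not say whether only the missing half should be replaced.

```python
from pathlib import Path
from typing import Tuple

# Fallback pair named in the text.
FALLBACK: Tuple[str, str] = ("hub-spoke", "corporate-memphis")


def resolve_layout_style(skill_dir: str, layout: str, style: str) -> Tuple[str, str]:
    """Return (layout, style) if both definition files exist, else the fallback pair."""
    refs = Path(skill_dir) / "references"
    layout_file = refs / "layouts" / f"{layout}.md"
    style_file = refs / "styles" / f"{style}.md"
    if layout_file.is_file() and style_file.is_file():
        return layout, style
    # Assumption: fall back as a pair when either file is missing.
    return FALLBACK
```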
2.2 Structured Content Generation
Read the analysis result and the structured content template, then convert `user_prompt` into design-ready structured content following the template rules:
```bash
ANALYSIS=$(cat "$TEMP_DIR/analysis.json")
LAYOUT_STYLE=$(cat "$TEMP_DIR/layout-style.json")
STRUCTURED_CONTENT_TEMPLATE=$(cat "$SKILL_DIR/references/structured-content-template.md")
```
Follow the three phases defined in the template (High-Level Outline → Section Development → Data Integrity Check), combine the learning objectives, visual opportunities, and key data in `analysis.json`, generate the structured content, and save it to the temporary directory:
```bash
cat > "$TEMP_DIR/structured-content.md" << 'EOF'
<Content generated based on structured-content-template.md format>
EOF
```
Rules: All data must be preserved exactly. Do not rewrite. Do not add information that is not in the source.
2.3 Prompt Expansion (using the `sn-text-optimize` tool of `sn-image-base`):
Read the structured content and the layout/style selection from the temporary directory, dynamically assemble the system prompt, and write it to a temporary file:
```bash
STRUCTURED_CONTENT=$(cat "$TEMP_DIR/structured-content.md")
LAYOUT_STYLE=$(cat "$TEMP_DIR/layout-style.json")
LAYOUT=$(echo "$LAYOUT_STYLE" | jq -r '.layout')
STYLE=$(echo "$LAYOUT_STYLE" | jq -r '.style')
LAYOUT_DEF=$(cat "$SKILL_DIR/references/layouts/${LAYOUT}.md")
STYLE_DEF=$(cat "$SKILL_DIR/references/styles/${STYLE}.md")

cat > "$TEMP_DIR/expand-system-prompt.md" << EOF
$(cat "$SKILL_DIR/references/prompts-expand-system.md")

---

Selected Layout: $LAYOUT

$LAYOUT_DEF

Selected Style: $STYLE

$STYLE_DEF

Output Template Reference

$(cat "$SKILL_DIR/references/base-prompt.md")
EOF
```

Use the content of `structured-content.md` as the user prompt, read the system prompt from the temporary file, and call sn-text-optimize:

```bash
python "$SN_IMAGE_BASE/scripts/sn_agent_runner.py" sn-text-optimize \
  --system-prompt-path "$TEMP_DIR/expand-system-prompt.md" \
  --user-prompt "$STRUCTURED_CONTENT" \
  --output-format json
```
Parse the JSON stdout, extract the `result` field as `expanded_prompt`, and write it to the temporary directory:
```bash
echo "$EXPANDED_PROMPT" > "$TEMP_DIR/expanded-prompt.txt"
```
If parsing fails or truncation is suspected (the returned content is incomplete), notify the user and terminate the workflow.
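Extracting the `result` field from the expansion call's stdout, and stopping on a parse failure, might look like the sketch below. The names `ExpansionError` and `extract_expanded_prompt` are hypothetical, and the only truncation signals checked are invalid JSON and a missing or empty `result`; any further truncation heuristics are left out because the text does not define them.

```python
import json


class ExpansionError(Exception):
    """Raised when the expansion output cannot be used (hypothetical name)."""


def extract_expanded_prompt(stdout: str) -> str:
    """Return the 'result' field of the JSON stdout, or raise ExpansionError."""
    try:
        payload = json.loads(stdout)
    except ValueError as exc:
        # Truncated output typically fails here because the JSON is incomplete.
        raise ExpansionError("expansion output is not valid JSON") from exc
    result = payload.get("result") if isinstance(payload, dict) else None
    if not isinstance(result, str) or not result.strip():
        raise ExpansionError("expansion output has no usable 'result' field")
    return result
```

A caller would catch `ExpansionError`, notify the user, and terminate the workflow, as the rule above requires.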

Step 3 — Image Generation Loop

Execute `round` from `1` to `max_rounds` sequentially:

Generate Image (using the `sn-image-generate` tool of `sn-image-base`):
```bash
python "$SN_IMAGE_BASE/scripts/sn_agent_runner.py" sn-image-generate \
  --prompt "$EXPANDED_PROMPT" \
  --image-size "$IMAGE_SIZE" \
  --aspect-ratio "$ASPECT_RATIO" \
  --save-path "$TEMP_DIR/round_<N>.png" \
  -o json
```
Review Image (only executed when `max_rounds > 1`):
VLM configuration requirements:
  • When `max_rounds > 1`, call the VLM for review
  • Select a VLM model from the OpenClaw configuration as the parameter for image recognition
  • If no suitable VLM model exists in the OpenClaw configuration:
    • Notify the user that the current parameter combination cannot be executed
    • Suggest adding a VLM configuration or changing max_rounds to 1 to avoid VLM calls
  • If the VLM call times out or fails: do not fall back; report the actual error directly
```bash
python "$SN_IMAGE_BASE/scripts/sn_agent_runner.py" sn-image-recognize \
  --system-prompt-path "$SN_IMAGE_INFOG/references/prompts-critic-system.md" \
  --user-prompt "Evaluate the diagram in the image against the rules. Output your assessment." \
  --images "$TEMP_DIR/round_<N>.png" \
  --output-format json
```
The system prompt comes from `references/prompts-critic-system.md`; the user prompt is provided directly.
Save Round Result
```json
{
  "round": 1,
  "image": "$TEMP_DIR/round_1.png",
  "result": "PASS|FAIL",
  "violations_count": 0,
  "violations": [],
  "reasoning": "<Reasoning process, empty string when max_rounds=1>",
  "timing": {
    "image_generation": { "elapsed_seconds": 12.34, "model": "sn_image_model" },
    "vlm_review": { "elapsed_seconds": 5.67, "model": "sensenova-6.7-flash-lite" }
  }
}
```
Note: `elapsed_seconds` is read from the `--output-format json` return of each CLI call; `image_generation.model` is fixed to the hardcoded placeholder `"sn_image_model"` (sn-image-generate does not return the model field); `vlm_review.model` is read from the JSON return of sn-image-recognize. `timing.vlm_review` is omitted when `max_rounds=1`.
Early Termination Check (only executed when `max_rounds > 1`):
  • If `result=PASS`, exit the loop immediately and do not continue generating
  • If `result=FAIL`, continue to the next round (if there are remaining rounds)
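The loop with its early termination check can be sketched with the generation and review calls passed in as plain functions. Everything here is illustrative: `run_rounds`, the callable signatures, and the shape of the review dictionary are hypothetical. The sketch also assumes that `early_terminated` is set only when a PASS leaves rounds unused, and it leaves `result` unset when no review runs, since the text does not say what value it takes for `max_rounds = 1`.

```python
from typing import Any, Callable, Dict, List, Tuple


def run_rounds(
    max_rounds: int,
    generate: Callable[[int], str],
    review: Callable[[str], Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], bool]:
    """Run up to max_rounds rounds; return (round records, early_terminated)."""
    rounds: List[Dict[str, Any]] = []
    early_terminated = False
    for n in range(1, max_rounds + 1):
        image = generate(n)  # stand-in for the sn-image-generate call
        record: Dict[str, Any] = {
            "round": n,
            "image": image,
            "violations_count": 0,
            "violations": [],
            "reasoning": "",  # stays empty when max_rounds = 1
        }
        if max_rounds > 1:
            verdict = review(image)  # stand-in for the sn-image-recognize call
            violations = list(verdict.get("violations", []))
            record.update(
                result=verdict["result"],
                violations=violations,
                violations_count=len(violations),
                reasoning=verdict.get("reasoning", ""),
            )
        rounds.append(record)
        if record.get("result") == "PASS":
            # Assumption: "early" means rounds were still remaining.
            early_terminated = n < max_rounds
            break
    return rounds, early_terminated
```

Errors raised by `review` are deliberately not caught, which matches the rule that a failed VLM call is reported rather than worked around.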

Step 4 — Image Quality Ranking

Sort images by `violations_count` ascending, then by `round` ascending, and return structured JSON to the Main Agent.
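The two-key ordering maps directly onto a tuple sort key. In this sketch the function name `rank_rounds` and the added `rank` field are hypothetical conveniences; only the sort order comes from the text.

```python
from typing import Any, Dict, List


def rank_rounds(rounds: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Order rounds by fewest violations, then earliest round; rank 1 is best."""
    ordered = sorted(rounds, key=lambda r: (r["violations_count"], r["round"]))
    # Copy each record so the caller's list is left untouched.
    return [dict(item, rank=i) for i, item in enumerate(ordered, start=1)]
```

Ties on `violations_count` go to the earlier round, so a later round only outranks an earlier one when it has strictly fewer violations.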

Return Contract

After the Worker Agent completes, its last message must be exactly the following JSON string and nothing else (bare JSON, no code fences, no preceding or trailing text).
Normal Flow:
```json
{
  "status": "ok",
  "need_main_agent_send": true,
  "output_mode": "friendly|verbose",
  "expanded_prompt": "<always present when output_mode=verbose; the original user_prompt when prompts_expand_skipped=true, otherwise the expanded result>",
  "prompts_expand_skipped": true,
  "early_terminated": true,
  "timing": {
    "total_elapsed_seconds": 35.12,
    "prompt_detection": { "elapsed_seconds": 2.11, "model": "sensenova-6.7-flash-lite" },
    "content_analysis": { "elapsed_seconds": 3.22, "model": "sensenova-6.7-flash-lite" },
    "prompt_expand": { "elapsed_seconds": 8.45, "model": "sensenova-6.7-flash-lite" }
  },
  "rounds": [
    {
      "round": 1,
      "image": "$TEMP_DIR/round_1.png",
      "result": "PASS|FAIL",
      "violations_count": 0,
      "violations": [],
      "reasoning": "<Reasoning process, empty string when max_rounds=1>",
      "timing": {
        "image_generation": { "elapsed_seconds": 12.34, "model": "sn_image_model" },
        "vlm_review": { "elapsed_seconds": 5.67, "model": "sensenova-6.7-flash-lite" }
      }
    }
  ]
}
```
Error Flow:
```json
{
  "status": "error",
  "error": "<Actual error information>"
}
```
Rules:
  • `status=ok` must include `need_main_agent_send: true`
  • `expanded_prompt` must be present when `output_mode=verbose`; its value is the original `user_prompt` when `prompts_expand_skipped=true`
  • `prompts_expand_skipped` must be present (value `true`) when expansion is not executed, covering two cases: `prompts_expand_mode=disable`, and `prompts_expand_mode=auto` where the evaluation passes and expansion is skipped
  • `early_terminated` must be present (value `true`) on early termination; it is omitted when execution completes normally
  • `violations` is an array of strings taken from the review results
  • `reasoning` is an empty string when `max_rounds=1`
  • Top-level `timing` contains:
    • `total_elapsed_seconds`: the Worker Agent's wall time from Step 0 to returning the JSON, calculated by the Worker Agent itself
    • `prompt_detection`: the Step 1 evaluation call, containing `elapsed_seconds` and `model` (read from the sn-text-optimize JSON return); omitted when `prompts_expand_mode=disable`
    • `content_analysis`: the Step 2.0 content analysis call, containing `elapsed_seconds` and `model` (read from the sn-text-optimize JSON return); omitted when expansion is skipped
    • `prompt_expand`: the Step 2.3 prompt expansion call, containing `elapsed_seconds` and `model` (read from the sn-text-optimize JSON return); omitted when expansion is skipped
  • `rounds[].timing.image_generation.model` is fixed to the hardcoded placeholder `"sn_image_model"`
  • `rounds[].timing.vlm_review` is omitted when `max_rounds=1`
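A few of these rules can be checked mechanically before the Main Agent acts on the payload. The sketch below is a partial check, not a full schema: the function name `contract_violations` and the message strings are hypothetical, and rules that depend on `max_rounds` or on whether expansion ran are left out because the payload alone does not carry that context.

```python
from typing import Any, Dict, List


def contract_violations(payload: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in a Worker Agent payload."""
    status = payload.get("status")
    if status == "error":
        if isinstance(payload.get("error"), str):
            return []
        return ["error flow requires a string 'error' field"]
    if status != "ok":
        return ["status must be 'ok' or 'error'"]

    problems: List[str] = []
    if payload.get("need_main_agent_send") is not True:
        problems.append("status=ok requires need_main_agent_send: true")
    if payload.get("output_mode") == "verbose" and "expanded_prompt" not in payload:
        problems.append("verbose output requires expanded_prompt")
    for item in payload.get("rounds", []):
        if not all(isinstance(v, str) for v in item.get("violations", [])):
            problems.append("violations must be an array of strings")
        model = item.get("timing", {}).get("image_generation", {}).get("model")
        if model != "sn_image_model":
            problems.append("image_generation.model must be 'sn_image_model'")
    return problems
```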

Output Format

friendly mode (default)

Text Summary:
  • When `max_rounds = 1`: generate a one-sentence description of the image content based on `expanded_prompt`, no more than 50 characters
  • When `max_rounds > 1`: generate a one-sentence description of the image content based on `result` and `violations`, no more than 50 characters:
    • `result=PASS`: describe in a positive tone
    • `result=FAIL` (1-2 violations): gently point out the specific issues
    • `result=FAIL` (3 or more violations): objectively summarize the main issues
Image: the single best image (rank=1)
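The tone rules for `max_rounds > 1` reduce to a small lookup. The function name `summary_tone` and the returned labels are hypothetical, and a FAIL with zero violations is not covered by the rules above, so this sketch lets it fall through to the objective tone as an assumption.

```python
from typing import List


def summary_tone(result: str, violations: List[str]) -> str:
    """Pick the tone for the one-sentence friendly summary."""
    if result == "PASS":
        return "positive"
    if 1 <= len(violations) <= 2:
        return "gentle"
    # 3 or more violations; a FAIL with none also lands here (assumption).
    return "objective"
```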

verbose mode

```
Quality ranking result (high -> low)
---
Expanded prompt: [expanded | not expanded, using original prompt]
<expanded_prompt>
---
#1 round=<n> result=<PASS|FAIL> violations=<n> [early terminated]
#2 round=<n> result=<PASS|FAIL> violations=<n>
...
---
Time statistics: Total <total>s | Prompt evaluation <t>s | Content analysis <t>s | Prompt expansion <t>s | Image generation <t>s×<n> rounds | VLM review <t>s×<n> rounds
---
Images (sent in rank order)
```

Call Relationship

  • Bottom-level dependency: `sn-image-base`, which provides `sn-image-generate`, `sn-image-recognize`, and `sn-text-optimize`

References

  • `references/analysis-framework.md` - Analysis methodology
  • `references/base-prompt.md` - Prompt template
  • `references/evaluation-standard.md` - Evaluation standard
  • `references/layout-style-selection.md` - Layout and style selection rules
  • `references/prompts-expand-system.md` - System prompt for prompt expansion
  • `references/prompts-critic-system.md` - Critic system prompt (used for the VLM image review)
  • `references/runtime-parameters.md` - Runtime parameters
  • `references/structured-content-template.md` - Structured content template
  • `references/layouts/<layout>.md` - Layout definitions (87 layouts)
  • `references/styles/<style>.md` - Style definitions (66 styles)