visual-verdict

<Purpose> Use this skill to compare generated UI screenshots against one or more reference images and return a strict JSON verdict that can drive the next edit iteration. </Purpose>
<Use_When>
  • The task includes visual fidelity requirements (layout, spacing, typography, component styling)
  • You have a generated screenshot and at least one reference image
  • You need deterministic pass/fail guidance before continuing edits </Use_When>
<Inputs>
  • `reference_images[]` (one or more image paths)
  • `generated_screenshot` (current output image)
  • Optional: `category_hint` (e.g., `hackernews`, `sns-feed`, `dashboard`) </Inputs>
<Output_Contract> Return JSON only with this exact shape:
```json
{
  "score": 0,
  "verdict": "revise",
  "category_match": false,
  "differences": ["..."],
  "suggestions": ["..."],
  "reasoning": "short explanation"
}
```
Rules:
  • `score`: integer 0-100
  • `verdict`: short status (`pass`, `revise`, or `fail`)
  • `category_match`: `true` when the generated screenshot matches the intended UI category/style
  • `differences[]`: concrete visual mismatches (layout, spacing, typography, colors, hierarchy)
  • `suggestions[]`: actionable next edits tied to the differences
  • `reasoning`: 1-2 sentence summary </Output_Contract>
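The rules above can be checked mechanically before a verdict is acted on. The following is a minimal sketch, not part of the skill itself: the function name `validate_verdict` is hypothetical, and it assumes that "exact shape" means exactly these six keys with no extras.

```python
import json

# Allowed values for the "verdict" field, per the rules above.
ALLOWED_VERDICTS = {"pass", "revise", "fail"}
# Assumption: "exact shape" is read as "exactly these keys, no extras".
EXPECTED_KEYS = {
    "score", "verdict", "category_match",
    "differences", "suggestions", "reasoning",
}


def validate_verdict(raw: str) -> dict:
    """Parse a verdict string and raise ValueError if it breaks the contract."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("verdict must be a JSON object")
    if set(data) != EXPECTED_KEYS:
        raise ValueError(f"key mismatch: {sorted(set(data) ^ EXPECTED_KEYS)}")

    score = data["score"]
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 100:
        raise ValueError("score must be an integer in 0-100")

    verdict = data["verdict"]
    if not isinstance(verdict, str) or verdict not in ALLOWED_VERDICTS:
        raise ValueError("verdict must be one of pass, revise, fail")

    if not isinstance(data["category_match"], bool):
        raise ValueError("category_match must be a boolean")

    for key in ("differences", "suggestions"):
        items = data[key]
        if not isinstance(items, list) or not all(isinstance(i, str) for i in items):
            raise ValueError(f"{key} must be a list of strings")

    if not isinstance(data["reasoning"], str):
        raise ValueError("reasoning must be a string")
    return data
```

Rejecting a malformed verdict early keeps a bad model response from silently driving the next edit iteration.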
<Threshold_And_Loop>
  • Target pass threshold is 90+.
  • If `score < 90`, continue editing and rerun `$visual-verdict` before any further code edits in the next iteration.
  • Persist the verdict in `.omx/state/{scope}/ralph-progress.json` with both:
    • numeric signal (`score`, threshold pass/fail)
    • qualitative signal (`reasoning`, `suggestions`, `next_actions`) </Threshold_And_Loop>
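The persistence step above could look roughly like the sketch below. The layout of `ralph-progress.json` is not specified here, so the `visual_verdict` key, the `passed` field, and the choice to derive `next_actions` from `suggestions` are all assumptions made for illustration, as is the function name `persist_verdict`.

```python
import json
from pathlib import Path

PASS_THRESHOLD = 90  # "Target pass threshold is 90+"


def persist_verdict(verdict: dict, scope: str, root: Path = Path(".omx/state")) -> dict:
    """Merge the latest verdict into <root>/<scope>/ralph-progress.json."""
    score = verdict["score"]
    passed = score >= PASS_THRESHOLD
    record = {
        # Numeric signal
        "score": score,
        "threshold": PASS_THRESHOLD,
        "passed": passed,
        # Qualitative signal
        "reasoning": verdict["reasoning"],
        "suggestions": list(verdict["suggestions"]),
        # Assumption: below threshold, the suggestions become the next actions.
        "next_actions": [] if passed else list(verdict["suggestions"]),
    }

    path = root / scope / "ralph-progress.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Keep any other state already in the file; only our (hypothetical) key changes.
    state = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    state["visual_verdict"] = record
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    return record
```

Merging into the existing file rather than overwriting it avoids clobbering whatever else the loop stores there.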
<Debug_Visualization> When mismatch diagnosis is hard:
  1. Keep `$visual-verdict` as the authoritative decision.
  2. Use pixel-level diff tooling (pixel diff / pixelmatch overlay) as a secondary debug aid to localize hotspots.
  3. Convert pixel diff hotspots into concrete `differences[]` and `suggestions[]` updates. </Debug_Visualization>
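To illustrate hotspot localization, here is a dependency-free sketch that buckets per-pixel differences into grid cells. It is a simplified stand-in for a real tool such as pixelmatch, not a port of it: images are assumed to be already decoded into equally sized, rectangular rows of RGB tuples, and `diff_hotspots`, `cell`, `tolerance`, and `min_ratio` are hypothetical names and defaults.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels


def diff_hotspots(
    reference: Image,
    generated: Image,
    cell: int = 32,
    tolerance: int = 16,
    min_ratio: float = 0.25,
) -> List[Tuple[int, int, float]]:
    """Return (left, top, changed_ratio) for grid cells that differ noticeably."""
    if not reference or not reference[0]:
        raise ValueError("reference image is empty")
    height, width = len(reference), len(reference[0])
    if len(generated) != height or any(len(row) != width for row in generated):
        raise ValueError("images must have identical dimensions")

    hotspots = []
    for top in range(0, height, cell):
        for left in range(0, width, cell):
            rows = range(top, min(top + cell, height))
            cols = range(left, min(left + cell, width))
            # A pixel counts as changed if any channel differs by more than tolerance.
            changed = sum(
                1
                for y in rows
                for x in cols
                if max(abs(a - b) for a, b in zip(reference[y][x], generated[y][x])) > tolerance
            )
            ratio = changed / (len(rows) * len(cols))
            if ratio >= min_ratio:
                hotspots.append((left, top, ratio))
    # Most-changed cells first, so they can be described first in differences[].
    return sorted(hotspots, key=lambda h: h[2], reverse=True)
```

Each returned cell only says where the images diverge; describing what diverges (spacing, color, typography) and adding it to `differences[]` remains the job of `$visual-verdict`.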
<Example>
```json
{
  "score": 87,
  "verdict": "revise",
  "category_match": true,
  "differences": [
    "Top nav spacing is tighter than reference",
    "Primary button uses smaller font weight"
  ],
  "suggestions": [
    "Increase nav item horizontal padding by 4px",
    "Set primary button font-weight to 600"
  ],
  "reasoning": "Core layout matches, but style details still diverge."
}
```
</Example>