visual-verdict

<Purpose> Use this skill to compare generated UI screenshots against one or more reference images and return a strict JSON verdict that can drive the next edit iteration. </Purpose>

<Use_When>
- The task includes visual fidelity requirements (layout, spacing, typography, component styling)
- You have a generated screenshot and at least one reference image
- You need deterministic pass/fail guidance before continuing edits
</Use_When>

<Inputs>
- `reference_images[]` (one or more image paths)
- `generated_screenshot` (current output image)
- Optional: `category_hint` (e.g., `hackernews`, `sns-feed`, `dashboard`)
</Inputs>

<Output_Contract>
Return JSON only with this exact shape:

```json
{
  "score": 0,
  "verdict": "revise",
  "category_match": false,
  "differences": ["..."],
  "suggestions": ["..."],
  "reasoning": "short explanation"
}
```

Rules:
- `score`: integer 0-100
- `verdict`: short status (`pass`, `revise`, or `fail`)
- `category_match`: `true` when the generated screenshot matches the intended UI category/style
- `differences[]`: concrete visual mismatches (layout, spacing, typography, colors, hierarchy)
- `suggestions[]`: actionable next edits tied to the differences
- `reasoning`: 1-2 sentence summary
</Output_Contract>
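
A caller may want to check a returned verdict against the rules above before acting on it. The following is a minimal sketch, not part of the skill itself: `validate_verdict` is a hypothetical helper name, and rejecting unexpected keys is one reading of "exact shape" rather than a stated requirement.

```python
import json

ALLOWED_VERDICTS = {"pass", "revise", "fail"}
REQUIRED_KEYS = {
    "score", "verdict", "category_match",
    "differences", "suggestions", "reasoning",
}


def validate_verdict(raw: str) -> list[str]:
    """Return contract violations; an empty list means the verdict is well-formed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]

    errors = []
    missing = REQUIRED_KEYS - data.keys()
    extra = data.keys() - REQUIRED_KEYS  # assumption: "exact shape" forbids extras
    if missing:
        errors.append(f"missing keys: {sorted(missing)}")
    if extra:
        errors.append(f"unexpected keys: {sorted(extra)}")

    score = data.get("score")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(score, int) or isinstance(score, bool) or not 0 <= score <= 100:
        errors.append("score must be an integer in 0-100")

    verdict = data.get("verdict")
    if not isinstance(verdict, str) or verdict not in ALLOWED_VERDICTS:
        errors.append("verdict must be one of pass, revise, fail")

    if not isinstance(data.get("category_match"), bool):
        errors.append("category_match must be a boolean")

    for key in ("differences", "suggestions"):
        value = data.get(key)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            errors.append(f"{key} must be a list of strings")

    if not isinstance(data.get("reasoning"), str):
        errors.append("reasoning must be a string")
    return errors
```

Returning a list of violations rather than raising on the first one lets the caller report every problem in a single pass.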

<Threshold_And_Loop>
- Target pass threshold is 90+.
- If `score < 90`, continue editing and rerun `$visual-verdict` before any further code edits in the next iteration.
- Persist the verdict in `.omx/state/{scope}/ralph-progress.json` with both:
  - numeric signal (`score`, threshold pass/fail)
  - qualitative signal (`reasoning`, `suggestions`, `next_actions`)
</Threshold_And_Loop>
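
The threshold check and persistence step could be sketched as follows. The real schema of `ralph-progress.json` is not given here, so the `visual_verdict` key, the `passed` and `threshold` field names, and the rule that `next_actions` mirrors `suggestions` on a miss are all assumptions made for illustration. `persist_verdict` is a hypothetical helper.

```python
import json
from pathlib import Path

PASS_THRESHOLD = 90  # "Target pass threshold is 90+"


def persist_verdict(verdict: dict, scope: str, root: Path = Path(".")) -> dict:
    """Store numeric and qualitative signals under .omx/state/{scope}/ralph-progress.json."""
    score = verdict["score"]
    passed = score >= PASS_THRESHOLD
    record = {
        # Numeric signal.
        "score": score,
        "threshold": PASS_THRESHOLD,
        "passed": passed,
        # Qualitative signal.
        "reasoning": verdict["reasoning"],
        "suggestions": verdict["suggestions"],
        # Assumption: when the threshold is missed, the suggestions become
        # the next actions; the source does not define next_actions.
        "next_actions": [] if passed else list(verdict["suggestions"]),
    }

    path = root / ".omx" / "state" / scope / "ralph-progress.json"
    path.parent.mkdir(parents=True, exist_ok=True)

    # Merge into any existing state instead of overwriting unrelated keys.
    state = {}
    if path.exists():
        state = json.loads(path.read_text(encoding="utf-8"))
    state["visual_verdict"] = record  # hypothetical key name
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    return record
```

A caller would keep looping while `record["passed"]` is false, rerunning `$visual-verdict` at the start of each iteration.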

<Debug_Visualization>
When mismatch diagnosis is hard:
- Keep `$visual-verdict` as the authoritative decision.
- Use pixel-level diff tooling (pixel diff / pixelmatch overlay) as a secondary debug aid to localize hotspots.
- Convert pixel diff hotspots into concrete `differences[]` and `suggestions[]` updates.
</Debug_Visualization>
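
To illustrate what "localizing hotspots" can mean, here is a small pure-Python sketch that ranks grid cells by their share of mismatched pixels. It is not pixelmatch and does not reproduce its algorithm. The `diff_hotspots` name, the grid size, the per-channel tolerance, and the image representation as nested lists of `(r, g, b)` tuples are all assumptions chosen to keep the example self-contained.

```python
def diff_hotspots(reference, generated, grid=4, tolerance=16):
    """Rank grid cells by the fraction of pixels that differ.

    Both images are equal-sized 2-D lists of (r, g, b) tuples.
    Returns [(row, col, ratio), ...] sorted by ratio, highest first,
    listing only cells that contain at least one mismatched pixel.
    """
    height, width = len(reference), len(reference[0])
    if len(generated) != height or any(len(row) != width for row in generated):
        raise ValueError("images must have identical dimensions")

    mismatched = [[0] * grid for _ in range(grid)]
    totals = [[0] * grid for _ in range(grid)]
    for y in range(height):
        for x in range(width):
            # Map the pixel to its grid cell.
            row, col = y * grid // height, x * grid // width
            totals[row][col] += 1
            # A pixel mismatches if any channel differs by more than tolerance.
            if any(abs(a - b) > tolerance
                   for a, b in zip(reference[y][x], generated[y][x])):
                mismatched[row][col] += 1

    cells = [
        (r, c, mismatched[r][c] / totals[r][c])
        for r in range(grid)
        for c in range(grid)
        if mismatched[r][c]
    ]
    return sorted(cells, key=lambda cell: cell[2], reverse=True)
```

A top-ranked cell such as `(0, 0, ...)` points at the upper-left region of the screenshot, which can then be described in words as a `differences[]` entry (for example, a navigation bar spacing mismatch) with a matching `suggestions[]` entry.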

<Example>
```json
{
  "score": 87,
  "verdict": "revise",
  "category_match": true,
  "differences": [
    "Top nav spacing is tighter than reference",
    "Primary button uses smaller font weight"
  ],
  "suggestions": [
    "Increase nav item horizontal padding by 4px",
    "Set primary button font-weight to 600"
  ],
  "reasoning": "Core layout matches, but style details still diverge."
}
```
</Example>
