visual-verdict

<Purpose> Use this skill to compare generated UI screenshots against one or more reference images and return a strict JSON verdict that can drive the next edit iteration. </Purpose>

<Use_When>
- The task includes visual fidelity requirements (layout, spacing, typography, component styling)
- You have a generated screenshot and at least one reference image
- You need deterministic pass/fail guidance before continuing edits
</Use_When>

<Inputs>
- `reference_images[]` (one or more image paths)
- `generated_screenshot` (current output image)
- Optional: `category_hint` (e.g., `hackernews`, `sns-feed`, `dashboard`)
</Inputs>

<Output_Contract> Return JSON only with this exact shape:

```json
{
  "score": 0,
  "verdict": "revise",
  "category_match": false,
  "differences": ["..."],
  "suggestions": ["..."],
  "reasoning": "short explanation"
}
```

Rules:
- `score`: integer 0-100
- `verdict`: short status (`pass`, `revise`, or `fail`)
- `category_match`: `true` when the generated screenshot matches the intended UI category/style
- `differences[]`: concrete visual mismatches (layout, spacing, typography, colors, hierarchy)
- `suggestions[]`: actionable next edits tied to the differences
- `reasoning`: 1-2 sentence summary
</Output_Contract>
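
The rules above can be checked mechanically before a verdict is allowed to drive the next edit. The following is a minimal sketch, not part of the skill itself: the function name `validate_verdict` and the constants `ALLOWED_VERDICTS` and `EXPECTED_KEYS` are hypothetical, and the checks only mirror the field rules listed above.

```python
import json

# Hypothetical names; the allowed values come from the rules listed above.
ALLOWED_VERDICTS = {"pass", "revise", "fail"}
EXPECTED_KEYS = {
    "score", "verdict", "category_match",
    "differences", "suggestions", "reasoning",
}


def validate_verdict(raw: str) -> dict:
    """Parse a verdict string and check it against the output contract."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("verdict must be a JSON object")
    if set(data) != EXPECTED_KEYS:
        raise ValueError(f"unexpected or missing keys: {sorted(set(data) ^ EXPECTED_KEYS)}")

    score = data["score"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 100:
        raise ValueError("score must be an integer in 0-100")
    if data["verdict"] not in ALLOWED_VERDICTS:
        raise ValueError("verdict must be pass, revise, or fail")
    if not isinstance(data["category_match"], bool):
        raise ValueError("category_match must be a boolean")
    for key in ("differences", "suggestions"):
        items = data[key]
        if not isinstance(items, list) or not all(isinstance(i, str) for i in items):
            raise ValueError(f"{key} must be a list of strings")
    if not isinstance(data["reasoning"], str):
        raise ValueError("reasoning must be a string")
    return data
```

Rejecting malformed output early keeps a bad verdict from being mistaken for a low score.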

<Threshold_And_Loop>
- Target pass threshold is 90+.
- If `score < 90`, continue editing and rerun `/oh-my-claudecode:visual-verdict` before any further visual review pass.
- Do not treat the visual task as complete until the next screenshot clears the threshold.
</Threshold_And_Loop>
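
The threshold loop can be sketched as a small driver. This is an illustration under stated assumptions: `run_verdict` and `apply_edits` are hypothetical callables standing in for a skill invocation and an edit step, and the `max_rounds` safety cap is an addition of this sketch, not something the skill defines.

```python
from typing import Callable

PASS_THRESHOLD = 90  # from the rule above: target pass threshold is 90+


def iterate_until_pass(
    run_verdict: Callable[[], dict],
    apply_edits: Callable[[list], None],
    max_rounds: int = 5,  # assumed cap so the sketch always terminates
) -> dict:
    """Edit and re-evaluate until the score clears the threshold."""
    verdict = run_verdict()
    for _ in range(max_rounds):
        if verdict["score"] >= PASS_THRESHOLD:
            break
        # Feed the suggestions into the next edit, then take a fresh verdict.
        apply_edits(verdict["suggestions"])
        verdict = run_verdict()
    return verdict
```

The caller still has to inspect the returned score, because the cap can end the loop before the threshold is reached.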

<Debug_Visualization>
When mismatch diagnosis is hard:
- Keep `$visual-verdict` as the authoritative decision.
- Use pixel-level diff tooling (pixel diff / pixelmatch overlay) as a secondary debug aid to localize hotspots.
- Convert pixel diff hotspots into concrete `differences[]` and `suggestions[]` updates.
</Debug_Visualization>
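
To illustrate what "localizing a hotspot" can mean, here is a dependency-free sketch that finds the bounding box of differing pixels between two same-sized images. It assumes each image is already decoded into rows of RGB tuples; the function name `diff_bounding_box` and the `tolerance` parameter are hypothetical, and a real workflow would more likely rely on a dedicated tool such as pixelmatch.

```python
def diff_bounding_box(reference, generated, tolerance=0):
    """Return (left, top, right, bottom) of differing pixels, or None if none differ.

    Both images are lists of rows, each row a list of (r, g, b) tuples.
    """
    if len(reference) != len(generated) or any(
        len(ref_row) != len(gen_row) for ref_row, gen_row in zip(reference, generated)
    ):
        raise ValueError("images must have the same dimensions")

    xs, ys = [], []
    for y, (ref_row, gen_row) in enumerate(zip(reference, generated)):
        for x, (ref_px, gen_px) in enumerate(zip(ref_row, gen_row)):
            # A pixel counts as different if any channel exceeds the tolerance.
            if any(abs(a - b) > tolerance for a, b in zip(ref_px, gen_px)):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))
```

A box covering, say, the top strip of the screenshot could then be phrased as a `differences[]` entry about the navigation area, with the verdict itself remaining the deciding signal.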

<Example>
```json
{
  "score": 87,
  "verdict": "revise",
  "category_match": true,
  "differences": [
    "Top nav spacing is tighter than reference",
    "Primary button uses smaller font weight"
  ],
  "suggestions": [
    "Increase nav item horizontal padding by 4px",
    "Set primary button font-weight to 600"
  ],
  "reasoning": "Core layout matches, but style details still diverge."
}
```
</Example>

Task: {{ARGUMENTS}}
