critique


Critique Skill · 5-dimension expert review

Produce a single-file HTML "design review report" that scores any artifact across 5 dimensions and proposes actionable fixes. Inspired by the huashu-design expert-critique flow.

When to use

  • After the agent (or user) generates an artifact (deck / prototype / landing page) and the user asks "what's wrong with this?" or "review this"
  • As a self-check loop the agent can run on its own output before emitting it
  • For comparing two variants of the same design

What you produce

A single self-contained <artifact type="text/html"> review report including:
  1. Header — what artifact was reviewed, date, reviewer ("OD · Critique skill"), 1-line verdict
  2. Radar chart (inline SVG, no library) showing the 5 scores
  3. Five dimension cards, each with:
    • Score 0–10 (with band: 0–4 Broken · 5–6 Functional · 7–8 Strong · 9–10 Exceptional)
    • 1-paragraph evidence (cite specific elements / files / lines)
    • One Keep / Fix / Quick-win bullet
  4. Combined action lists at the bottom:
    • Keep — what's working, don't touch
    • Fix — P0 / P1 issues that are visually expensive
    • Quick wins — 5–15 minute tweaks with disproportionate impact
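The radar chart in item 2 needs no library: each score becomes a point on its own axis and the points form one polygon. A minimal Python sketch, assuming a hypothetical helper name (radar_svg), a 320-unit canvas, and placeholder colors, none of which come from the skill itself:

```python
import math

def radar_svg(scores, size=320, max_score=10):
    """Return an inline SVG radar chart with one axis per score."""
    cx = cy = size / 2
    radius = size * 0.4  # leave a margin around the chart
    n = len(scores)

    def point(i, value):
        # Axis 0 points straight up; later axes go clockwise on screen.
        angle = -math.pi / 2 + 2 * math.pi * i / n
        r = radius * value / max_score
        return f"{cx + r * math.cos(angle):.1f},{cy + r * math.sin(angle):.1f}"

    outline = " ".join(point(i, max_score) for i in range(n))
    filled = " ".join(point(i, s) for i, s in enumerate(scores))
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {size} {size}">'
        f'<polygon points="{outline}" fill="none" stroke="#999"/>'
        f'<polygon points="{filled}" fill="#c0392b" fill-opacity="0.3" stroke="#c0392b"/>'
        "</svg>"
    )
```

Calling radar_svg([7, 4, 6, 8, 5]) returns a string that can be pasted directly into the report body; axis labels are left out to keep the sketch short.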

The 5 dimensions

Each dimension is independent — a deck can be 9/10 on Innovation but 4/10 on Hierarchy and the report should say so plainly. Don't average away interesting failures.
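One way to keep the five scores from being averaged away is to never compute a combined number at all and to print each score next to its band. The sketch below assumes hypothetical names (DIMENSIONS, BANDS, band, score_lines); the band boundaries are the ones listed under "What you produce".

```python
DIMENSIONS = [
    "Philosophy consistency",
    "Visual hierarchy",
    "Detail execution",
    "Functionality",
    "Innovation",
]
BANDS = [(0, 4, "Broken"), (5, 6, "Functional"), (7, 8, "Strong"), (9, 10, "Exceptional")]

def band(score: int) -> str:
    """Map a 0-10 score to its band name."""
    for low, high, name in BANDS:
        if low <= score <= high:
            return name
    raise ValueError(f"score out of range: {score}")

def score_lines(scores: dict) -> list:
    """One line per dimension; deliberately no overall average."""
    return [f"{name}: {scores[name]}/10 ({band(scores[name])})" for name in DIMENSIONS]
```

With Innovation at 9 and Visual hierarchy at 4, both lines appear side by side, which is the plain statement the report is supposed to make.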

1. Philosophy consistency

Does the artifact pick a clear direction and stick to it through every micro-decision (chrome / kicker / spacing / accent)?
Evidence to look for:
  • Is there one declared design direction (e.g. Monocle / WIRED / Kinfolk) or is it three styles in a trench coat?
  • Does the chrome / kicker vocabulary stay in one register, or does page 3 say "Vol.04 · Spring" and page 7 say "BUT WAIT 🔥"?
  • Are accent / serif / mono used by the same rule throughout?
0–4 Three styles fighting each other. 5–6 One direction but half the elements drift. 7–8 Coherent, occasional drift on edge pages. 9–10 Every element argues for the same thesis.

2. Visual hierarchy

Can a stranger figure out what to read first, second, third — without being told?
Evidence to look for:
  • Is the largest type clearly the most important thing on each page?
  • Do mono / serif / sans roles match the information's role (meta / body / display)?
  • Lots of "loud" elements competing? Or a clear primary + secondary + tertiary tier?
0–4 Everything shouts. 5–6 Hierarchy works on hero pages but breaks on body. 7–8 Clear tiers, occasional collision. 9–10 Eye moves with zero friction.

3. Detail execution

The 90/10 stuff — alignment, leading, kerning at large sizes, image framing, foot/chrome polish, edge-case spacing.
Evidence to look for:
  • Big-stat pages: does the number sit on a baseline, or float?
  • Left/right column tops aligned in grid-2-7-5?
  • frame-img + caption proportions consistent across pages?
  • Mono labels: same letter-spacing? same uppercase rule?
  • Any orphaned <br> causing 1-character lines?
0–4 Visible tape and string. 5–6 Most pages clean, 1–2 ragged. 7–8 Polished, expert eye finds 2–3 misses. 9–10 Magazine-grade — the kind of detail that makes printed-by-hand typographers nod.

4. Functionality

Does the artifact work for its intended use? Click targets, nav, readability at presentation distance, copy-paste-ability for code blocks, mobile fallback if relevant.
Evidence to look for:
  • Deck: keyboard / wheel / touch nav all working? Iframe scroll fallback?
  • Landing: CTA above the fold? Phone number tappable on mobile?
  • Runbook: code blocks copyable, mono font, no smart quotes?
  • Critical info readable from 4m away (large screen presentation)?
0–4 Visually fine but doesn't accomplish its job. 5–6 Core flow works, edge cases broken. 7–8 Robust through normal use. 9–10 Defensively engineered — handles iframe / fullscreen / paste / print without flinching.
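Some of these checks can be mechanized. As one example, the "no smart quotes" question for runbook code blocks can be approximated with a regular expression. This is a rough sketch with a hypothetical function name; it assumes code lives in <pre> or <code> elements and does not handle every nesting case.

```python
import re

# Left/right single and double curly quotes.
SMART_QUOTES = "\u2018\u2019\u201c\u201d"

def smart_quotes_in_code(html: str) -> list:
    """Return the bodies of <pre>/<code> elements that contain curly quotes."""
    blocks = re.findall(r"<(pre|code)\b[^>]*>(.*?)</\1>", html, flags=re.S | re.I)
    return [body for _tag, body in blocks if any(ch in body for ch in SMART_QUOTES)]
```

Any non-empty result is citable evidence: the offending block can be quoted directly in the Functionality card.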

5. Innovation

Does this push past the median? Is there one element that makes people lean in?
Evidence to look for:
  • One unexpected layout / motion / typographic move that wasn't required?
  • Or 100% safe — could be any deck/landing from any agency?
  • Is the innovation earned (matches direction) or grafted on (random WebGL on a Kinfolk slow-living deck)?
0–4 Generic AI-slop median. 5–6 Competent and unmemorable. 7–8 One memorable moment, the rest solid. 9–10 Multiple moves you'd steal — but each one obviously serves the thesis.

Scoring discipline (read before you score)

  • Always cite evidence — "scored 4 because hero page mixes Playfair display with Inter sans on the same line" beats "feels inconsistent". Numbers without evidence get rejected.
  • Don't average up — if Hierarchy is 5 because page 3 is broken, don't bump to 7 because pages 1 and 2 are fine. The score is the worst sustained band.
  • Don't grade-inflate — a 7 means strong, not acceptable. If every score is 7+, you're not reviewing critically.
  • Innovation is allowed to be low — 5/10 is fine for production deliverables. Don't punish appropriate conservatism.
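Two of these rules lend themselves to a quick self-check. The sketch below makes two assumptions that are not in the skill text: that "worst sustained band" can be approximated by the lowest per-page score, and that the "mean above 8 is suspicious" threshold from the hard rules belongs in the same check. Function names are hypothetical.

```python
from statistics import mean

def worst_sustained(page_scores: dict) -> int:
    """Approximate the dimension score as the lowest per-page score (an assumption)."""
    return min(page_scores.values())

def inflation_warnings(scores: list) -> list:
    """Flag score sets that look grade-inflated."""
    warnings = []
    if all(s >= 7 for s in scores):
        warnings.append("every score is 7+: are you reviewing critically?")
    if mean(scores) > 8:
        warnings.append("overall mean above 8: check yourself")
    return warnings
```

For the Hierarchy example above, worst_sustained({1: 7, 2: 7, 3: 5}) gives 5, so the two clean pages do not pull the score up.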

Workflow

Step 1 — Acquire the artifact

Three modes:
  1. Project file: user said "review the index.html I just made". Open it from the project folder.
  2. Pasted HTML: user pasted code in the chat. Read it from the message.
  3. Generated by you in this turn: you just emitted an artifact above and want to self-critique. Re-read your own <artifact>.
If multiple HTML files exist, ask which one (don't review all).
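For the project-file mode, the "ask which one" rule might look like the sketch below. It assumes the HTML files sit at the top level of the project folder and uses a hypothetical function name; the return value is either a path or a question for the user, never both.

```python
from pathlib import Path

def pick_artifact(project_dir: str):
    """Return (path, None) when exactly one HTML file exists, else (None, question)."""
    html_files = sorted(Path(project_dir).glob("*.html"))
    if not html_files:
        return None, "No HTML file found in the project folder. Which file should I review?"
    if len(html_files) > 1:
        names = ", ".join(f.name for f in html_files)
        return None, f"Found several HTML files ({names}). Which one should I review?"
    return html_files[0], None
```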

Step 2 — Read enough to score

Skim the entire <style>, then read 6–8 representative content blocks. Do not score from frontmatter alone. The score depends on executed design, not declared intent.
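If the artifact is too long to read in full, the sampling could be done mechanically. The sketch below uses Python's standard html.parser and assumes that a "content block" is a top-level <section> element; that tag choice, the class name, and the even-spacing strategy are all assumptions rather than part of the skill.

```python
from html.parser import HTMLParser

class ArtifactReader(HTMLParser):
    """Collect all <style> text and the text of each top-level block element."""

    def __init__(self, block_tag="section"):
        super().__init__()
        self.block_tag = block_tag
        self.styles = []
        self.blocks = []
        self._in_style = False
        self._depth = 0  # nesting depth inside block_tag elements

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self._in_style = True
        elif tag == self.block_tag:
            if self._depth == 0:
                self.blocks.append("")
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False
        elif tag == self.block_tag and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._in_style:
            self.styles.append(data)
        elif self._depth:
            self.blocks[-1] += data

def representative(blocks, k=8):
    """Pick up to k evenly spaced blocks, always including the first and last."""
    if len(blocks) <= k:
        return list(blocks)
    step = (len(blocks) - 1) / (k - 1)
    return [blocks[round(i * step)] for i in range(k)]
```

Even spacing guarantees that body pages get sampled alongside the hero page, which matters because hierarchy and detail problems tend to show up away from the opening pages.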

Step 3 — Score with evidence

For each of the 5 dimensions, write the score and a 30–80 word evidence paragraph that names specific elements. Use line numbers, class names, page numbers.
Example:
Dimension: Detail execution
Score: 6 / 10
Evidence: Stat-cards on page 3 align cleanly (grid-6, 3×2), but on
page 8 the right column foot sits 2vh higher than the left because
.callout has 3vh top margin while the figure doesn't. Image captions
use mono on page 5 but sans on page 7 — pick one.
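The 30–80 word window is easy to enforce before an entry goes into the report. A minimal sketch with hypothetical function names, counting whitespace-separated tokens as words:

```python
def evidence_length_ok(paragraph: str, low: int = 30, high: int = 80) -> bool:
    """True when the paragraph has between low and high whitespace-separated words."""
    return low <= len(paragraph.split()) <= high

def format_entry(dimension: str, score: int, evidence: str) -> str:
    """Render one entry in the three-line layout of the example above."""
    if not evidence_length_ok(evidence):
        raise ValueError("evidence paragraph must be 30-80 words")
    return f"Dimension: {dimension}\nScore: {score} / 10\nEvidence: {evidence}"
```

A word count only catches paragraphs that are too thin or too long; whether the paragraph actually names elements, classes, or pages still has to be judged by reading it.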

Step 4 — Build the action lists

Aggregate the 5 evidence paragraphs into:
  • Keep (3–5 bullets): concrete things working that the user must not break in the next iteration. Cite by class / page / element.
  • Fix (3–6 bullets): must-do, ordered by visual cost saved per minute spent. Each bullet ≤ 1 sentence.
  • Quick wins (3–5 bullets): 5–15 minutes each, high signal-to-noise (e.g. "swap display:flex for grid on page 4 to fix the column drift").
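The ordering rule for Fix and the bullet counts for all three lists can be sketched as follows. The impact number is a hypothetical reviewer estimate (for instance 1 to 10) that the skill does not define; only the bullet ranges come from the list above.

```python
LIMITS = {"Keep": (3, 5), "Fix": (3, 6), "Quick wins": (3, 5)}

def order_fixes(fixes):
    """Sort (description, impact, minutes) tuples by impact per minute, best first."""
    return sorted(fixes, key=lambda fix: fix[1] / fix[2], reverse=True)

def size_errors(lists: dict) -> list:
    """Report any action list whose bullet count falls outside its allowed range."""
    errors = []
    for name, (low, high) in LIMITS.items():
        count = len(lists.get(name, []))
        if not low <= count <= high:
            errors.append(f"{name}: {count} bullets, expected {low}-{high}")
    return errors
```

Dividing impact by minutes puts a 10-minute fix with moderate payoff ahead of an hour-long fix with a slightly larger one, which is the intent of "visual cost saved per minute spent".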

Step 5 — Emit the report HTML

Build a single file:
  • Header: artifact name + reviewer credit + date + 1-line verdict
  • Big radar chart (SVG)
  • 5 dimension cards in a 1-column or 2-column grid
  • Three action lists at the bottom with checkbox affordance
Use the active DESIGN.md tokens if one exists; otherwise default to a neutral light theme (off-white background, near-black text, one accent for radar fill).
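A rough sketch of how the page could be assembled, assuming hypothetical helper names and placeholder color values for the neutral fallback theme (the skill names the roles of the colors, not their values). The radar, cards, and lists arguments are expected to be ready-made HTML strings.

```python
from html import escape

# Hypothetical fallback tokens: off-white background, near-black text, one accent.
DEFAULT_THEME = ":root{--bg:#faf9f6;--text:#111;--accent:#c0392b}body{background:var(--bg);color:var(--text)}"

def action_list(title, items):
    """Render one action list with a checkbox in front of every bullet."""
    rows = "".join(
        f'<li><label><input type="checkbox"> {escape(item)}</label></li>' for item in items
    )
    return f"<section><h2>{escape(title)}</h2><ul>{rows}</ul></section>"

def report_page(title, radar, cards, lists, theme_css=DEFAULT_THEME):
    """Join header, radar chart, dimension cards, and action lists into one file."""
    body = f"<header><h1>{escape(title)}</h1></header>{radar}{''.join(cards)}{''.join(lists)}"
    return (
        "<!doctype html><html><head><meta charset='utf-8'>"
        f"<style>{theme_css}</style></head><body>{body}</body></html>"
    )
```

Escaping the bullet text matters here because evidence often quotes tags such as <br> or <h1>, which would otherwise be rendered instead of shown.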

Output contract

<artifact identifier="critique-<artifact-slug>" type="text/html" title="Critique · <Artifact Title>">
<!doctype html>
<html>...</html>
</artifact>
Write one sentence before the artifact ("Reviewed X across 5 dimensions, see report below.") and stop after </artifact>. Do not paraphrase the report in chat; the user will read the artifact.
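The wrapper above can be produced from the artifact title. This sketch assumes a simple slug rule (lowercase ASCII letters and digits joined by hyphens), which the contract itself does not specify; both function names are hypothetical.

```python
import re
from html import escape

def slugify(title: str) -> str:
    """Lowercase the title and join runs of letters/digits with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def wrap_artifact(title: str, report_html: str) -> str:
    """Wrap the report in the <artifact> envelope shown in the output contract."""
    return (
        f'<artifact identifier="critique-{slugify(title)}" type="text/html" '
        f'title="Critique · {escape(title, quote=True)}">\n'
        f"{report_html}\n"
        "</artifact>"
    )
```

For a title of "Q3 Launch Deck" the identifier becomes critique-q3-launch-deck.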

Hard rules

  • 5 scores, every time — partial reports (e.g. only 3 dimensions) are not allowed.
  • Evidence per score — no "feels off" / "needs work". If you can't cite an element, the score is not justified.
  • Don't grade-inflate — overall mean above 8 is suspicious; check yourself.
  • Don't review your own artifact in the same turn — the user needs to see it first. Self-critique only on explicit request ("now critique what you just made").
  • Single-file HTML only — no external CSS/JS. Inline everything.
  • Radar chart is mandatory — gives the report a recognizable silhouette and lets the user spot weak axes at a glance.
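Three of these rules can be checked mechanically before the report is emitted. The sketch below is a heuristic with a hypothetical function name: it uses regular expressions instead of a real HTML parser, and it treats any <svg> element as the radar chart.

```python
import re

def hard_rule_violations(report_html: str, scores: list) -> list:
    """Return a list of hard-rule problems; an empty list means the checks passed."""
    problems = []
    if len(scores) != 5:
        problems.append(f"expected 5 scores, got {len(scores)}")
    external_css = re.search(r"<link\b[^>]*stylesheet", report_html, re.I)
    external_js = re.search(r"<script\b[^>]*\bsrc\s*=", report_html, re.I)
    if external_css or external_js:
        problems.append("external CSS/JS found: inline everything")
    if "<svg" not in report_html.lower():
        problems.append("radar chart (inline SVG) is missing")
    return problems
```

The evidence and same-turn rules are left out on purpose: they depend on the conversation and on judgment, not on the HTML.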