recursive-improvement

Recursive Self-Improvement Loop

A pattern for generating higher-quality output by iterating against explicit scoring criteria.

The Pattern

generate → evaluate → diagnose → improve → repeat (until passing)
Never ship first-draft output for important content. Run the loop.

How It Works

1. Generate

Create the initial output as you normally would.

2. Evaluate

Score the output against each criterion (1-10). Be brutally honest.

3. Diagnose

For any criterion scoring below threshold:
  • What specifically is weak?
  • Why does it fail?
  • What would "passing" look like?

4. Improve

Rewrite addressing each diagnosed weakness. Don't patch — rebuild the weak sections.

5. Repeat

Re-evaluate. Keep looping until all criteria pass threshold (usually 8/10 minimum).
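
As a rough illustration, the five steps above can be sketched as a small Python function. The `generate`, `evaluate`, and `improve` callables are hypothetical placeholders (a person, a script, or a model call), and the `max_rounds` cap is an added assumption so the loop cannot run forever; neither is prescribed by this guide.

```python
from typing import Callable, Dict

Scores = Dict[str, int]  # criterion name -> score from 1 to 10


def run_loop(
    generate: Callable[[], str],
    evaluate: Callable[[str], Scores],
    improve: Callable[[str, Scores], str],
    threshold: int = 8,   # "usually 8/10 minimum"
    max_rounds: int = 5,  # assumption: safety cap, not part of the pattern
) -> str:
    """Iterate until every criterion scores at or above the threshold."""
    draft = generate()
    for _ in range(max_rounds):
        scores = evaluate(draft)
        # Diagnose: keep only the criteria that fall below the threshold.
        failing = {name: s for name, s in scores.items() if s < threshold}
        if not failing:
            return draft
        # Improve: rewrite with the failing criteria as the brief.
        draft = improve(draft, failing)
    # Round cap reached; the last draft still needs a manual review.
    return draft
```

Passing only the failing criteria to `improve` keeps each rewrite focused on the diagnosed weaknesses rather than on the whole rubric.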

Adversarial Pressure (Optional but Powerful)

After passing criteria, attack the output from a hostile perspective:
  • Skeptical customer: "Why should I believe this? What's the catch?"
  • Distracted scroller: "Would I stop for this? In 2 seconds?"
  • Competitor: "How would a rival tear this apart?"
If it survives, ship it. If not, iterate.
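
One possible way to make this pass repeatable is to keep the three personas as data and check the draft against each. This is only a sketch: `survives` is a hypothetical judge (a human reviewer or a model call) that you would supply yourself.

```python
from typing import Callable, Dict, List

# Persona -> the hostile question it asks, taken from the list above.
PERSONAS: Dict[str, str] = {
    "Skeptical customer": "Why should I believe this? What's the catch?",
    "Distracted scroller": "Would I stop for this? In 2 seconds?",
    "Competitor": "How would a rival tear this apart?",
}


def adversarial_pass(
    draft: str,
    survives: Callable[[str, str, str], bool],  # hypothetical judge
) -> List[str]:
    """Return the personas whose attack the draft does not survive."""
    return [
        name
        for name, challenge in PERSONAS.items()
        if not survives(draft, name, challenge)
    ]
```

An empty list means the draft survived and can ship; any names returned point at the perspective to iterate against.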

Example Criteria by Use Case

Social Content

| Criterion | What to evaluate |
| --- | --- |
| Hook strength | First line grabs attention? Pattern interrupt? |
| Curiosity gap | Creates urge to keep reading? |
| Clarity | One clear idea? No confusion? |
| Voice match | Sounds like the target voice/brand? |
| Engagement potential | People will reply/share/save? |
| Thumb-stop power | Scroller would pause? |
| Value density | Every line earns its place? |
| CTA clarity | Clear what reader should do next? |

Adversarial test: Would a distracted, skeptical user at 11pm engage with this?

Landing Page / Web Copy

| Criterion | What to evaluate |
| --- | --- |
| Headline clarity | Instantly clear what this business does? |
| Value prop strength | Why choose them over competitors? |
| Benefit focus | Features translated to customer benefits? |
| CTA effectiveness | Clear, compelling action? Low friction? |
| Trust signals | Credibility established? Social proof? |
| Readability | Scannable? Short paragraphs? Clear hierarchy? |
| Objection handling | Common concerns addressed? |
| Specificity | Concrete details vs vague claims? |

Adversarial test: Would someone searching on their phone take action within 30 seconds?

Email Copy

| Criterion | What to evaluate |
| --- | --- |
| Subject line | Would this get opened? Stands out in inbox? |
| Opening hook | First sentence earns the second? |
| Single focus | One clear ask per email? |
| Skimmability | Can get the gist in 5 seconds? |
| CTA prominence | Action is obvious and easy? |
| Voice consistency | Matches brand/sender personality? |
| Length appropriate | No fluff, nothing missing? |
| Mobile friendly | Works on small screens? |

Adversarial test: Would a busy person with 200 unread emails act on this?

Ad Copy

| Criterion | What to evaluate |
| --- | --- |
| Thumb-stop power | Pattern interrupt in first 2 seconds? |
| Curiosity gap | Creates need to know more? |
| Emotional trigger | Hits a real pain point or desire? |
| Credibility | Believable? Not too good to be true? |
| CTA strength | Clear next step with low friction? |
| Persona match | Speaks directly to target audience? |
| Differentiation | Stands out from competitor ads? |
| Platform native | Fits the platform's style/format? |

Adversarial test: Would this stop YOUR scroll? Would you click?

When to Use

Always use for:
  • Headlines and hooks
  • CTAs and value props
  • Key landing page sections
  • Social posts (especially threads)
  • Ad copy
  • Important emails
Can skip for:
  • Internal notes
  • First-pass brainstorming
  • Technical documentation
  • Boilerplate content

Building Your Own Criteria

  1. Pick one task you do repeatedly
  2. Write down how YOU evaluate that output — what makes "good" vs "mid"?
  3. Turn each into a pass/fail threshold — be specific ("9/10 minimum" not "make it good")
  4. Add adversarial pressure — who would attack this? What would they say?
  5. Save and reuse — now you have a system, not just a prompt
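
For the "save and reuse" step, one option is to store the criteria as structured data instead of retyping them into each prompt. The sketch below is one possible shape, not a required format: the `Criterion` and `Rubric` names and the JSON layout are illustrative assumptions, and the sample entries are copied from the Email Copy criteria above.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Criterion:
    name: str
    question: str       # what to evaluate
    threshold: int = 8  # minimum passing score out of 10


@dataclass
class Rubric:
    task: str
    criteria: List[Criterion]
    adversarial_test: str

    def to_json(self) -> str:
        """Serialize the rubric so it can be saved and reused."""
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, text: str) -> "Rubric":
        """Rebuild a rubric from text produced by to_json."""
        data = json.loads(text)
        data["criteria"] = [Criterion(**c) for c in data["criteria"]]
        return cls(**data)


email_rubric = Rubric(
    task="Email copy",
    criteria=[
        Criterion("Subject line", "Would this get opened? Stands out in inbox?"),
        Criterion("Single focus", "One clear ask per email?"),
    ],
    adversarial_test="Would a busy person with 200 unread emails act on this?",
)
```

Writing `email_rubric.to_json()` to a file and loading it back with `Rubric.from_json` gives the same rubric on every run, which is the difference between a system and a one-off prompt.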

Quick Loop Template

Output v1

[Initial generation]

Evaluation v1

  • Hook strength: 6/10 - Opens weak, no pattern interrupt
  • Clarity: 8/10 - Clear enough
  • Voice match: 7/10 - Too formal
  [... score all criteria]

Diagnosis

  1. Hook needs a surprising stat or contrarian take
  2. Voice should be more casual, shorter sentences
  3. [...]

Output v2

[Revised version addressing weaknesses]

Evaluation v2

[Re-score — continue until all pass]

---

The loop typically adds 2-3 iterations. Worth it for anything that matters.
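
If the scores are kept as data, the evaluation section of the template can be generated instead of typed by hand. The helper below is a hypothetical sketch: `format_evaluation` and its pass/FAIL tags are illustrative additions, with the 8/10 threshold taken from the Repeat step.

```python
from typing import Dict, Tuple


def format_evaluation(
    version: int,
    results: Dict[str, Tuple[int, str]],  # criterion -> (score, note)
    threshold: int = 8,
) -> str:
    """Render one 'Evaluation vN' section of the quick loop template."""
    lines = [f"Evaluation v{version}", ""]
    for name, (score, note) in results.items():
        status = "pass" if score >= threshold else "FAIL"
        lines.append(f"  • {name}: {score}/10 - {note} [{status}]")
    return "\n".join(lines)


print(
    format_evaluation(
        1,
        {
            "Hook strength": (6, "Opens weak, no pattern interrupt"),
            "Clarity": (8, "Clear enough"),
            "Voice match": (7, "Too formal"),
        },
    )
)
```

The explicit status tag makes it easy to see at a glance which criteria feed the Diagnosis section.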