recursive-improvement

Recursive Self-Improvement Loop

A pattern for generating higher-quality output by iterating against explicit scoring criteria.

The Pattern

generate → evaluate → diagnose → improve → repeat (until passing)

Never ship first-draft output for important content. Run the loop.

How It Works

1. Generate

Create the initial output as you normally would.

2. Evaluate

Score the output against each criterion (1-10). Be brutally honest.

3. Diagnose

For any criterion scoring below the threshold, ask:

What specifically is weak?
Why does it fail?
What would "passing" look like?

4. Improve

Rewrite addressing each diagnosed weakness. Don't patch — rebuild the weak sections.

5. Repeat

Re-evaluate. Keep looping until every criterion meets the threshold (usually 8/10 minimum).
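
The five steps above can be sketched as a small driver function. This is a minimal illustration, not a reference implementation: `generate`, `evaluate`, and `improve` are hypothetical callables you would supply (a person, a script, or a model call), and `max_rounds` is an assumed safety cap that the pattern itself does not prescribe.

```python
from typing import Callable, Dict, Tuple

Scores = Dict[str, int]  # criterion name -> score from 1 to 10


def failing(scores: Scores, threshold: int = 8) -> Scores:
    """Return only the criteria that scored below the threshold."""
    return {name: score for name, score in scores.items() if score < threshold}


def run_loop(
    generate: Callable[[], str],
    evaluate: Callable[[str], Scores],
    improve: Callable[[str, Scores], str],
    threshold: int = 8,
    max_rounds: int = 5,  # assumption: a cap so the loop always terminates
) -> Tuple[str, Scores, int]:
    """Generate once, then evaluate and improve until every criterion passes.

    Returns the final draft, its scores, and the number of improve rounds used.
    """
    draft = generate()            # step 1: generate
    scores = evaluate(draft)      # step 2: evaluate
    rounds = 0
    while failing(scores, threshold) and rounds < max_rounds:
        weak = failing(scores, threshold)  # step 3: diagnose what fails
        draft = improve(draft, weak)       # step 4: rebuild the weak parts
        scores = evaluate(draft)           # step 5: re-score and repeat
        rounds += 1
    return draft, scores, rounds
```

If the cap is reached while criteria still fail, the returned scores show which ones, so the caller can decide whether to keep iterating by hand.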

Adversarial Pressure (Optional but Powerful)

After passing criteria, attack the output from a hostile perspective:

Skeptical customer: "Why should I believe this? What's the catch?"
Distracted scroller: "Would I stop for this? In 2 seconds?"
Competitor: "How would a rival tear this apart?"

If it survives, ship it. If not, iterate.
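
One way to make the adversarial pass repeatable is to keep the personas as data and run each one against the draft. The sketch below assumes a hypothetical `attack` callable that returns a list of objections for a given persona (empty when the draft holds up); how that reviewer is implemented is left open.

```python
from typing import Callable, Dict, List

# Personas and questions taken from the list above.
PERSONAS: Dict[str, str] = {
    "Skeptical customer": "Why should I believe this? What's the catch?",
    "Distracted scroller": "Would I stop for this? In 2 seconds?",
    "Competitor": "How would a rival tear this apart?",
}


def adversarial_pass(
    draft: str,
    attack: Callable[[str, str, str], List[str]],
) -> Dict[str, List[str]]:
    """Run every persona against the draft and collect their objections.

    `attack(draft, persona, question)` is a hypothetical reviewer hook.
    Personas with no objections are left out of the result.
    """
    objections: Dict[str, List[str]] = {}
    for persona, question in PERSONAS.items():
        found = attack(draft, persona, question)
        if found:
            objections[persona] = found
    return objections


def survives(objections: Dict[str, List[str]]) -> bool:
    """The draft ships only if no persona raised an objection."""
    return not objections
```

Any objections that come back can be fed into the next improve step as additional diagnosed weaknesses.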

Example Criteria by Use Case

Social Content

Criterion	What to evaluate
Hook strength	First line grabs attention? Pattern interrupt?
Curiosity gap	Creates urge to keep reading?
Clarity	One clear idea? No confusion?
Voice match	Sounds like the target voice/brand?
Engagement potential	People will reply/share/save?
Thumb-stop power	Scroller would pause?
Value density	Every line earns its place?
CTA clarity	Clear what reader should do next?

Adversarial test: Would a distracted, skeptical user at 11pm engage with this?
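
A criteria table like the one above can be kept as plain data so the same rubric is applied on every pass. The sketch below is one possible shape, with hypothetical helper names (`scoring_sheet`, `check_scores`): it builds a scoring sheet from the table and rejects an evaluation that skips a criterion or uses a score outside 1-10. The same structure would work for the other use cases by swapping in their tables.

```python
from typing import Dict

# Criterion -> what to evaluate, copied from the Social Content table.
SOCIAL_CRITERIA: Dict[str, str] = {
    "Hook strength": "First line grabs attention? Pattern interrupt?",
    "Curiosity gap": "Creates urge to keep reading?",
    "Clarity": "One clear idea? No confusion?",
    "Voice match": "Sounds like the target voice/brand?",
    "Engagement potential": "People will reply/share/save?",
    "Thumb-stop power": "Scroller would pause?",
    "Value density": "Every line earns its place?",
    "CTA clarity": "Clear what reader should do next?",
}


def scoring_sheet(criteria: Dict[str, str], draft: str) -> str:
    """Build a plain-text sheet listing each criterion above the draft."""
    lines = ["Score each criterion from 1 to 10.", ""]
    lines += [f"{name}: {question}" for name, question in criteria.items()]
    lines += ["", "Draft:", draft]
    return "\n".join(lines)


def check_scores(criteria: Dict[str, str], scores: Dict[str, int]) -> None:
    """Raise ValueError if a criterion is unscored or a score is out of range."""
    missing = [name for name in criteria if name not in scores]
    if missing:
        raise ValueError(f"Unscored criteria: {missing}")
    bad = [name for name, score in scores.items() if not 1 <= score <= 10]
    if bad:
        raise ValueError(f"Scores outside 1-10: {bad}")
```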

Landing Page / Web Copy

Criterion	What to evaluate
Headline clarity	Instantly clear what this business does?
Value prop strength	Why choose them over competitors?
Benefit focus	Features translated to customer benefits?
CTA effectiveness	Clear, compelling action? Low friction?
Trust signals	Credibility established? Social proof?
Readability	Scannable? Short paragraphs? Clear hierarchy?
Objection handling	Common concerns addressed?
Specificity	Concrete details vs vague claims?

Adversarial test: Would someone searching on their phone take action within 30 seconds?

Email Copy

Criterion	What to evaluate
Subject line	Would this get opened? Stands out in inbox?
Opening hook	First sentence earns the second?
Single focus	One clear ask per email?
Skimmability	Can get the gist in 5 seconds?
CTA prominence	Action is obvious and easy?
Voice consistency	Matches brand/sender personality?
Length appropriate	No fluff, nothing missing?
Mobile friendly	Works on small screens?

Adversarial test: Would a busy person with 200 unread emails act on this?

Ad Copy

Criterion	What to evaluate
Thumb-stop power	Pattern interrupt in first 2 seconds?
Curiosity gap	Creates need to know more?
Emotional trigger	Hits a real pain point or desire?
Credibility	Believable? Not too good to be true?
CTA strength	Clear next step with low friction?
Persona match	Speaks directly to target audience?
Differentiation	Stands out from competitor ads?
Platform native	Fits the platform's style/format?

Adversarial test: Would this stop YOUR scroll? Would you click?

When to Use

Always use for:

Headlines and hooks
CTAs and value props
Key landing page sections
Social posts (especially threads)
Ad copy
Important emails

Can skip for:

Internal notes
First-pass brainstorming
Technical documentation
Boilerplate content

Building Your Own Criteria

Pick one task you do repeatedly
Write down how YOU evaluate that output — what makes "good" vs "mid"?
Turn each into a pass/fail threshold — be specific ("9/10 minimum" not "make it good")
Add adversarial pressure — who would attack this? What would they say?
Save and reuse — now you have a system, not just a prompt
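
For the "save and reuse" step, one option is to store the rubric as a small JSON file next to your prompts. The sketch below is an assumption about how you might do that; the `Rubric` fields and the function names are hypothetical, not part of the pattern.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Dict


@dataclass
class Rubric:
    """A reusable rubric for one recurring task (hypothetical structure)."""
    task: str                    # the task you do repeatedly
    threshold: int               # minimum passing score, e.g. 9 for "9/10 minimum"
    criteria: Dict[str, str]     # criterion -> what to evaluate
    adversaries: Dict[str, str]  # hostile persona -> what they would say


def save_rubric(rubric: Rubric, path: str) -> None:
    """Write the rubric to a JSON file."""
    text = json.dumps(asdict(rubric), indent=2, ensure_ascii=False)
    Path(path).write_text(text, encoding="utf-8")


def load_rubric(path: str) -> Rubric:
    """Read a rubric back from a JSON file written by save_rubric."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return Rubric(**data)
```

Loading the same file at the start of each session keeps the threshold and the criteria from drifting between runs.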

Quick Loop Template

Output v1

[Initial generation]

Evaluation v1

Hook strength: 6/10 — Opens weak, no pattern interrupt
Clarity: 8/10 — Clear enough
Voice match: 7/10 — Too formal
[... score all criteria]

Diagnosis

Hook needs a surprising stat or contrarian take
Voice should be more casual, shorter sentences
[...]

Output v2

[Revised version addressing weaknesses]

Evaluation v2

[Re-score — continue until all pass]
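
If you keep evaluations in the `Name: N/10` format used in the template, a few lines of code can tell you which criteria still block shipping. This is a rough sketch under that formatting assumption; `parse_evaluation` and `below_threshold` are hypothetical helpers, and lines that do not contain a score (headings, bracketed placeholders) are skipped.

```python
import re
from typing import Dict, List, Tuple

# "Name: N/10", then any separator (spaces, dashes), then an optional note.
_LINE = re.compile(r"^(?P<name>[^:\[\]]+):\s*(?P<score>\d+)/10\W*(?P<note>.*)$")


def parse_evaluation(text: str) -> Dict[str, Tuple[int, str]]:
    """Map each criterion to its (score, note) from an evaluation block."""
    results: Dict[str, Tuple[int, str]] = {}
    for raw in text.splitlines():
        match = _LINE.match(raw.strip())
        if not match:
            continue  # headings, blank lines, and placeholders carry no score
        score = int(match.group("score"))
        if not 1 <= score <= 10:
            raise ValueError(f"Score out of range on line: {raw!r}")
        results[match.group("name").strip()] = (score, match.group("note").strip())
    return results


def below_threshold(
    results: Dict[str, Tuple[int, str]], threshold: int = 8
) -> List[str]:
    """List the criteria that still need another iteration."""
    return [name for name, (score, _) in results.items() if score < threshold]
```

An empty list from `below_threshold` means every scored criterion meets the threshold and the loop can stop.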


---

The loop typically adds 2-3 iterations. Worth it for anything that matters.
