Junior to Senior

Assume the plan in front of you was written by a capable junior: fluent, confident, and trained on the past. Build a senior reviewer that is grounded in two things the junior was not — this codebase as it actually exists and the state of the art as it exists today — and let the senior tear the plan down and rebuild it.
This skill exists because agent-generated plans fail at two altitudes:
  • Fog: the plan describes the high level well ("add caching", "handle auth", "make it scalable") but never commits to the hard parts. No interfaces, no data shapes, no failure handling, no named libraries. An engineer reading it still has to make every real decision themselves.
  • Tunnel — the plan dives into function signatures and file diffs but has no product vision. No statement of who this is for, what success means, what is out of scope, or why this approach beats the boring alternative. It optimizes a local detail while the shape of the feature is still wrong.
Both are altitude failures. The senior's job is to drag the plan to the right altitude and upgrade its substance past the model's training cutoff.

The cardinal rule

Every senior finding needs evidence. A claim about the codebase cites a file and line. A claim about best practice cites a fetched source (official docs, release notes, an RFC, a postmortem) with a date. If web research is unavailable, the finding is labeled [training-data, unverified] so stale knowledge is never laundered as current truth. A senior who argues from vibes is just a louder junior.
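The evidence rule can be sketched as a small tagging helper. This is a minimal illustration, not part of the skill itself: the `Evidence` record, its field names, and the `cite` function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

UNVERIFIED_TAG = "[training-data, unverified]"


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence record: a code location or a dated source."""
    file: Optional[str] = None
    line: Optional[int] = None
    source: Optional[str] = None
    dated: Optional[date] = None

    def is_code_ref(self) -> bool:
        # A codebase claim needs both a file and a line.
        return self.file is not None and self.line is not None

    def is_dated_source(self) -> bool:
        # A best-practice claim needs a source and its date.
        return self.source is not None and self.dated is not None


def cite(claim: str, evidence: Optional[Evidence]) -> str:
    """Attach evidence to a claim, or label it as unverified."""
    if evidence is not None and evidence.is_code_ref():
        return f"{claim} ({evidence.file}:{evidence.line})"
    if evidence is not None and evidence.is_dated_source():
        return f"{claim} ({evidence.source}, {evidence.dated.isoformat()})"
    return f"{claim} {UNVERIFIED_TAG}"
```

A claim with incomplete evidence (for example, a source without a date) falls through to the unverified label rather than passing as cited.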

Phase 0: Capture the junior artifact

Identify exactly what is under review:
  • A plan the agent just produced in this conversation (the default — including your own output from a moment ago).
  • A pasted plan, design doc, RFC, or issue description.
  • A planning document in the repo the user points at.
Freeze it. Quote or restate the artifact in full before reviewing so the review targets a fixed text, not a moving memory of it. If there is no artifact yet, say so and offer to either generate the junior draft first or review the user's existing idea — do not review thin air.
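One way to make "freeze it" mechanical is to store the artifact text with a content hash, so later steps can check they are still reviewing the same text. This is a sketch under assumptions: `FrozenArtifact`, `freeze`, and `is_unchanged` are hypothetical names, and the skill does not prescribe hashing.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FrozenArtifact:
    """Hypothetical snapshot of the plan under review."""
    text: str
    digest: str


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def freeze(text: str) -> FrozenArtifact:
    """Snapshot the artifact; refuse to review an empty one."""
    if not text.strip():
        raise ValueError("no artifact to review")
    return FrozenArtifact(text=text, digest=_sha256(text))


def is_unchanged(frozen: FrozenArtifact, candidate: str) -> bool:
    """True when the candidate text matches the frozen snapshot exactly."""
    return _sha256(candidate) == frozen.digest
```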

Phase 1: Construct the senior

The senior is not a tone of voice. It is a reviewer profile built from research done now. Skipping this phase and going straight to critique produces generic review slop.

1a. Extract the domains

List the 2-5 load-bearing technical domains the plan touches (e.g., "Postgres schema migration", "React server components", "OAuth token refresh", "vector search at 10M rows"). For each, write one sentence on what a staff-level engineer in that domain would refuse to let slide. This list drives all research that follows.

1b. Code research — what is true here

Investigate the repository before judging the plan against it:
  • Existing conventions and architecture the plan must fit (or explicitly break, with justification).
  • Actual versions in lockfiles/manifests — a plan recommending an API that the pinned version doesn't have is a blocker.
  • Prior art: similar features already in the codebase, ADRs, migrations, test patterns.
  • Real constraints the junior plan ignored: build system, deploy targets, performance budgets, existing data.
Use a subagent (e.g. Explore) for broad sweeps so the review context stays clean.
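The pinned-version check could look roughly like the sketch below. It assumes the pinned versions have already been extracted from the lockfile into a plain mapping, and that versions are dotted integers; pre-release tags and version ranges are deliberately not handled. The package names in the usage example are made up.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Parse a dotted numeric version such as '1.10.0' (simplifying assumption)."""
    return tuple(int(part) for part in version.split("."))


def version_blockers(pinned: dict[str, str], required: dict[str, str]) -> list[str]:
    """List packages where the repo's pinned version is below what the plan needs."""
    blockers = []
    for package, minimum in required.items():
        have = pinned.get(package)
        if have is None:
            blockers.append(f"{package}: not pinned in the repo, plan needs >= {minimum}")
        elif parse_version(have) < parse_version(minimum):
            blockers.append(f"{package}: pinned {have}, plan needs >= {minimum}")
    return blockers


# Usage with hypothetical package names:
pinned = {"libfoo": "1.4.2", "libbar": "3.0.0"}
required = {"libfoo": "1.10.0", "libbar": "2.9.9", "libbaz": "0.1.0"}
for blocker in version_blockers(pinned, required):
    print(blocker)
```

Comparing parsed tuples rather than raw strings matters here: as strings, "1.4.2" would sort after "1.10.0".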

1c. Web research — what is true now

For each load-bearing decision in the plan, search for the current state of the art. The junior's knowledge ends at a training cutoff; the senior's must not. Prioritize primary sources (official docs, changelogs, release notes, maintainer posts) and check dates. You are looking for three kinds of delta:
  • Deprecations — the plan's approach is now discouraged or removed.
  • Supersessions — a newer pattern/library/API has clearly won since the cutoff.
  • Hard-won lessons — published postmortems, benchmarks, or security advisories that change the tradeoff.
Query patterns, source-quality ranking, and when to stop are in references/research-playbook.md. If web access is unavailable, proceed on code research alone and mark every best-practice claim [training-data, unverified].
1d. Isolation

When the harness supports subagents, run the senior review in a context-isolated subagent that receives the frozen artifact and the research findings but not the reasoning that produced the junior plan. Self-review in the same context anchors on its own justifications; isolation is what makes the review adversarial rather than confirmatory.

Phase 2: Diagnose the altitude

Before line-by-line critique, classify the artifact: fog, tunnel, or mixed (most real plans fog the hard parts and tunnel on the easy ones — flag each section separately).
Fog test: for every component the plan names, can a competent engineer start tomorrow without making a product or architecture decision themselves?
Tunnel test: does the plan state who this is for, what success looks like, what is explicitly out of scope, and why this approach beat the obvious alternative?
The full diagnostic checklists, the vague-word blacklist ("simple", "scalable", "handle gracefully", "robust", ...), and severity definitions are in references/review-rubric.md.
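A first pass over the vague-word blacklist can be automated. The sketch below uses only the four terms quoted above; the full list lives in the rubric and is not reproduced here. The function name and the sample plan are hypothetical.

```python
import re

# Only the terms quoted in the text; the rubric's full list is not assumed.
VAGUE_TERMS = ("simple", "scalable", "handle gracefully", "robust")


def find_vague_terms(plan: str) -> list[tuple[int, str]]:
    """Return (line number, term) for each blacklisted term found in the plan."""
    hits = []
    for lineno, line in enumerate(plan.splitlines(), start=1):
        for term in VAGUE_TERMS:
            # Whole-word, case-insensitive match.
            if re.search(rf"\b{re.escape(term)}\b", line, flags=re.IGNORECASE):
                hits.append((lineno, term))
    return hits


# Usage with a made-up plan:
plan = "Add a simple cache.\nErrors: handle gracefully.\nUse a 60s TTL on reads."
print(find_vague_terms(plan))
```

A hit is only a prompt for the reviewer: each flagged phrase still needs the concrete question it is avoiding.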

Phase 3: Adversarial review

The senior reviews the frozen artifact against three lenses: codebase reality (1b), current state of the art (1c), and altitude (Phase 2). Rules of engagement:
  • Every vague phrase gets challenged with the concrete question it is hiding from.
  • Every named technology gets a version and a reason; every unnamed one ("a queue", "some cache") gets named or the choice gets flagged as an open decision.
  • Every data shape that crosses a boundary gets written down.
  • Every plan gets asked: what is the rollback, what is the migration, what breaks at 10x.
  • Steelman before attacking: state the strongest version of the junior's choice, then show why it still loses (or concede that it wins — agreeing with the junior when the evidence supports it is a valid senior outcome, not a failure of the skill).
Findings use three severities — blocker (plan fails as written), major (works but meaningfully worse than SOTA or misfit to the repo), minor (polish) — each with evidence and a concrete fix. Definitions and examples: references/review-rubric.md.
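The severity scheme maps naturally onto a small data model. The following is an illustrative sketch, not the rubric's definition: `Severity`, `Finding`, and `render_findings` are hypothetical names, and the line format approximates the labels used in the output template.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    # Declared in reporting order: blockers first, then major, then minor.
    BLOCKER = "B"
    MAJOR = "M"
    MINOR = "m"


@dataclass(frozen=True)
class Finding:
    severity: Severity
    summary: str
    evidence: str  # file:line, or source plus date
    fix: str


def render_findings(findings: list[Finding]) -> list[str]:
    """Render findings grouped by severity, numbered within each group."""
    lines = []
    for severity in Severity:  # Enum iteration follows declaration order
        group = [f for f in findings if f.severity is severity]
        for index, finding in enumerate(group, start=1):
            if not finding.evidence.strip() or not finding.fix.strip():
                raise ValueError(f"finding lacks evidence or fix: {finding.summary}")
            lines.append(
                f"[{severity.value}{index}] {finding.summary} - {finding.evidence} - {finding.fix}"
            )
    return lines
```

Rejecting a finding without evidence or a fix at render time keeps the cardinal rule enforceable rather than aspirational.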

Phase 4: Promote the plan

Critique without a rewrite is just complaining. Produce the senior version of the plan with this shape:
  1. Goal and non-goals — one paragraph of product intent; explicit out-of-scope list.
  2. Decisions — each load-bearing choice with the chosen option, version, rationale, the strongest rejected alternative, and the evidence (file ref or source link).
  3. Design at the right altitude — interfaces, data shapes, and failure handling for the hard parts; deliberately coarse strokes for the routine parts.
  4. Sequencing — milestones with an observable verification step each ("done" must be checkable, not vibes).
  5. Risks and rollback — what is hardest to undo and the escape hatch.
  6. Open questions for a human — product decisions the senior is not allowed to invent. Scoping is the senior's job; product direction is not.
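The "Decisions" item in the list above can be checked for completeness with a simple record. This is a sketch under assumptions: the `Decision` field names are hypothetical and merely mirror the elements listed for each load-bearing choice.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Decision:
    """Hypothetical record for one load-bearing choice in the promoted plan."""
    choice: str
    version: str
    rationale: str
    rejected_alternative: str
    evidence: str  # file ref or source link


def missing_fields(decision: Decision) -> list[str]:
    """Names of fields left blank, in declaration order."""
    return [f.name for f in fields(decision) if not getattr(decision, f.name).strip()]


# Usage with placeholder values:
draft = Decision(
    choice="queue library A",
    version="",
    rationale="already used by the ingest service",
    rejected_alternative="queue library B",
    evidence="",
)
print(missing_fields(draft))
```

A decision with any blank field is a fog symptom: the plan named a choice without committing to the details that make it reviewable.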

Output format

Deliver two artifacts, review first:

Senior Review

Altitude diagnosis: fog | tunnel | mixed, with a one-sentence justification.

Blockers

  • [B1] Finding - evidence (file:line or source+date) - fix.

Major

  • [M1] ...

Minor

  • [m1] ...

What the junior got right

  • Credit where due; preserved in the rewrite.

Promoted Plan (v2)

[Phase 4 structure]

Delta summary

  • 3-6 bullets: what changed from junior to senior and why.

Open questions for you

  • Product decisions that need a human.

Boundaries

  • The senior scopes and upgrades; it does not invent product direction. Genuine product choices go to "Open questions", not into the rewrite.
  • Never silently replace the junior plan — the user sees the review, the rewrite, and the delta, and decides.
  • If research contradicts the user's stated preference, present the evidence and defer; the user may have context the senior lacks.
  • A review with zero blockers and zero majors is a legitimate result. Say "this plan holds" and stop — do not manufacture findings to look rigorous.