verify

Verify

Use the existing infrastructure to prove your own change works before calling it done.

Principles

  • The builder does not grade their own work in the same context; switch into a fresh evaluator context or separate subagent first
  • Run repo guardrails first, then hit the real surface
  • Prefer smoke, integration, contract, or e2e proof over unit tests that mock most of the behavior under test
  • Challenge the changed code for shape as well as behavior; passing tests do not excuse bloated, duplicated, or comment-dependent code
  • Load shared doctrine from the repo's guidance files such as `AGENTS.md`, `CLAUDE.md`, or repo rules before judging the result
  • If the infrastructure is too weak to verify reliably, stop and hand off to `agent-readiness`

Handoffs

  • No stable boot / smoke / interact path, or infrastructure too weak to trust → use `agent-readiness`
  • Need to review existing code, a diff, branch, or PR you are not verifying as the builder → use `review`
  • Main problem is stale AGENTS.md, README, specs, or repo docs → use `docs`

Before You Start

  1. Define the exact change being verified and the expected user-visible behavior
  2. Switch into an independent evaluator context before judging your own work
  3. Load the target repo's guidance files such as `AGENTS.md`, `CLAUDE.md`, or repo rules, when present
  4. Confirm you can boot and interact with the real surface
  5. Pick the smallest check set that can disprove the change honestly

Workflow

1. Run deterministic guardrails first

  • Prefer the repo's built-in entrypoint: `make verify`, `just verify`, `pnpm test`, `cargo test`, or the nearest targeted equivalent
  • When choosing tests, prefer the strongest cheap proof available: smoke, integration, contract, or e2e checks beat mock-heavy unit suites that mainly replay implementation details
  • Swallow boring success output and surface only failures, anomalies, and exact commands
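The guardrail step above can be sketched as two small helpers: one picks an entrypoint, the other keeps only failures. This is a minimal sketch, not the skill's actual tooling. The marker-file mapping and the names `pickGuardrailCommand`, `surfaceFailures`, and `CommandResult` are assumptions made for illustration.

```typescript
// Hypothetical mapping from marker files to the entrypoints listed above.
// A marker file does not prove the target exists (a Makefile may lack `verify`),
// so confirm the command before relying on it.
interface Entrypoint {
  marker: string;
  command: string;
}

const ENTRYPOINTS: readonly Entrypoint[] = [
  { marker: "Makefile", command: "make verify" },
  { marker: "justfile", command: "just verify" },
  { marker: "package.json", command: "pnpm test" },
  { marker: "Cargo.toml", command: "cargo test" },
];

// Return the first entrypoint whose marker file is present, in the order above.
function pickGuardrailCommand(repoFiles: readonly string[]): string | undefined {
  for (const entry of ENTRYPOINTS) {
    if (repoFiles.indexOf(entry.marker) !== -1) return entry.command;
  }
  return undefined;
}

// Hypothetical record of one finished command.
interface CommandResult {
  command: string;
  exitCode: number;
  output: string;
}

// Drop successful runs; keep each failure with its exact command and output.
function surfaceFailures(results: readonly CommandResult[]): string[] {
  return results
    .filter((r) => r.exitCode !== 0)
    .map((r) => `FAILED (exit ${r.exitCode}): ${r.command}\n${r.output.trim()}`);
}
```

An empty array from `surfaceFailures` means there is nothing to report beyond the commands that ran.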

2. Exercise the real surface

  • UI → run the browser automation, navigate the changed flow, and capture screenshots
  • API → hit the local endpoint with a real request such as `curl http://127.0.0.1:3000/health`
  • CLI → run the shipped command such as `node dist/cli.js --help` or the repo's packaged entrypoint
  • state/config → verify round trips, restart behavior, and config boot paths
Follow references/evidence-rules.md when collecting proof.
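For the state/config case, a round-trip check might look like the sketch below: write the settings out, read them back as a restart would, and compare every entry. The line-based `key=value` codec is a hypothetical stand-in for whatever the repo actually persists, and `serializeSettings`, `parseSettings`, and `survivesRoundTrip` are illustrative names.

```typescript
// Hypothetical line-based codec, standing in for the repo's real persistence format.
function serializeSettings(settings: ReadonlyMap<string, string>): string {
  const lines: string[] = [];
  settings.forEach((value, key) => {
    lines.push(`${key}=${value}`);
  });
  return lines.join("\n");
}

function parseSettings(text: string): Map<string, string> {
  const settings = new Map<string, string>();
  for (const line of text.split("\n")) {
    const eq = line.indexOf("=");
    // Lines without a key before "=" are skipped.
    if (eq > 0) settings.set(line.slice(0, eq), line.slice(eq + 1));
  }
  return settings;
}

// Write then read back, as a restart would, and compare every entry.
function survivesRoundTrip(settings: ReadonlyMap<string, string>): boolean {
  const restored = parseSettings(serializeSettings(settings));
  if (restored.size !== settings.size) return false;
  let same = true;
  settings.forEach((value, key) => {
    if (restored.get(key) !== value) same = false;
  });
  return same;
}
```

With this codec a value containing a newline fails the round trip, which is the kind of defect a happy-path check alone tends to miss.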

3. Run a code-shape pass on the changed files

  • Focus on code touched in the current task unless the changes obviously exposed a broader local mess
  • Ask whether the solution matches the repo's language, framework, and design patterns rather than merely working
  • Remove duplication, dead branches, unused helpers, and unnecessary abstractions when they do not protect a real boundary
  • Treat `any`, unsafe `as`, boundary-leaking `unknown`, and non-null assertions as safety failures unless the repo explicitly allows them
  • Check that failures are classified intentionally and surfaced with useful recovery guidance, while preserving codes or diagnostics for operators
  • Prefer code that explains itself; comments should survive only when they carry durable context the code cannot make obvious
  • Read the changed files as if a brand new agent inherited them tomorrow and had to extend the flow without prior context
Use references/simplification.md for the exact simplification questions.
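The type-safety item above can be illustrated with a boundary parser that narrows an untrusted value instead of casting it. This is a sketch under assumptions: the `RetryState` shape and its `retryCount` field are hypothetical, as are the helper names.

```typescript
// Hypothetical payload shape; the field name is an assumption for illustration.
interface RetryState {
  retryCount: number;
}

// Type guard: proves the value is a non-null object before any property access.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Narrow untrusted input at the boundary and return undefined on any mismatch,
// rather than writing `raw as RetryState` or reaching through a non-null assertion.
function parseRetryState(raw: unknown): RetryState | undefined {
  if (!isRecord(raw)) return undefined;
  const count = raw["retryCount"];
  if (typeof count !== "number" || !Number.isInteger(count) || count < 0) {
    return undefined;
  }
  return { retryCount: count };
}
```

Callers past this boundary see only `RetryState | undefined`, so `unknown` does not leak inward and the mismatch case has to be handled explicitly.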

4. Probe adjacent risk

  • Check the main happy path
  • Check at least one failure path or edge case
  • Check that at least one exercised failure path returns or logs a useful, actionable error instead of a vague or swallowed failure
  • Re-test any config, persistence, or restart-sensitive behavior touched by the change
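One way to check that an exercised failure path is actionable is to give failures a classified shape and assert on it. The sketch below assumes a hypothetical `ClassifiedFailure` type, an invented code `E_RETRY_LIMIT_INVALID`, and an example boundary `parseRetryLimit`. A real repo would define its own categories and codes.

```typescript
// Hypothetical failure categories; a real repo defines its own.
type FailureKind = "invalid-input" | "transient" | "internal";

interface ClassifiedFailure {
  kind: FailureKind;
  code: string; // stable diagnostic code kept for operators
  message: string; // what went wrong
  recovery: string; // what the caller can do next
}

type Result<T> = { ok: true; value: T } | { ok: false; failure: ClassifiedFailure };

// Example boundary: parse a retry limit and classify the failure instead of swallowing it.
function parseRetryLimit(input: string): Result<number> {
  const value = Number(input);
  // Number("") is 0, so blank input is rejected explicitly.
  if (input.trim() === "" || !Number.isInteger(value) || value < 0) {
    return {
      ok: false,
      failure: {
        kind: "invalid-input",
        code: "E_RETRY_LIMIT_INVALID", // invented code for illustration
        message: `retry limit must be a non-negative integer, got "${input}"`,
        recovery: "set the retry limit to a whole number such as 3 and rerun",
      },
    };
  }
  return { ok: true, value };
}

// A failure is actionable when it carries both a code and recovery guidance.
function isActionable(failure: ClassifiedFailure): boolean {
  return failure.code.trim() !== "" && failure.recovery.trim() !== "";
}
```

During the probe, feed the boundary a bad input such as `"abc"` and confirm the returned failure passes `isActionable`.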

5. Synthesize the verdict

Produce one clear outcome:
  • ship it
  • needs review
  • blocked
If blocked because the infrastructure is weak, say so explicitly and hand off to `agent-readiness`.
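The three outcomes can be modeled as a small decision function. The thresholds here are an assumed rule, not something the source specifies: weak infrastructure blocks, failed guardrails or any medium or high finding needs review, and anything else ships. `VerificationSummary`, `Outcome`, and `synthesizeVerdict` are hypothetical names.

```typescript
type Verdict = "ship it" | "needs review" | "blocked";
type Severity = "low" | "medium" | "high";

interface VerificationSummary {
  infrastructureTrusted: boolean; // boot, smoke, and interact paths are reliable
  guardrailsPassed: boolean;
  findings: readonly Severity[];
}

interface Outcome {
  verdict: Verdict;
  note: string;
}

// Assumed decision rule (not from the source): weak infrastructure blocks,
// failed guardrails or any medium/high finding needs review, otherwise ship.
function synthesizeVerdict(summary: VerificationSummary): Outcome {
  if (!summary.infrastructureTrusted) {
    return {
      verdict: "blocked",
      note: "infrastructure too weak to verify; hand off to agent-readiness",
    };
  }
  const serious = summary.findings.filter((s) => s !== "low").length;
  if (!summary.guardrailsPassed || serious > 0) {
    return {
      verdict: "needs review",
      note: `${serious} medium or high finding(s); guardrails passed: ${summary.guardrailsPassed}`,
    };
  }
  return { verdict: "ship it", note: "guardrails passed and no medium or high findings" };
}
```

The function always returns exactly one verdict, and the blocked case names the handoff in its note.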

Output

After verification, report:
  • verdict
  • change verified
  • surfaces exercised
  • code-shape findings: clarity, duplication, dead code, unsafe type escapes, error classification, recovery messaging, comments, or maintainability debt in the changed files
  • top findings by severity
  • exact evidence: commands, screenshots, traces, responses, or file references
  • readiness gaps or doc drift discovered during verification
  • recommended follow-up: `agent-readiness`, `docs`, or implementation
Example:
```text
verdict: needs review
change verified: retry banner after transient API failure
surfaces exercised: pnpm test test/retry.spec.ts, curl http://127.0.0.1:3000/api/retry
code-shape finding: low — retry counter update is split across two helpers with identical branching; merge into one explicit path
finding: medium — the UI recovers, but the retry count is not persisted across refresh
evidence: local API returned 200 after retry; browser screenshot after refresh shows count reset to 0
recommended follow-up: implementation
```
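If the report is assembled in code, a typed shape helps keep sections from going missing. The following is a sketch: the field names mirror the list above, but the type names, the severity ordering helper, and the choice of which sections count as required are all assumptions.

```typescript
type Severity = "low" | "medium" | "high";

interface Finding {
  severity: Severity;
  detail: string;
}

// Hypothetical report shape mirroring the sections listed above.
interface VerificationReport {
  verdict: "ship it" | "needs review" | "blocked";
  changeVerified: string;
  surfacesExercised: readonly string[];
  codeShapeFindings: readonly Finding[];
  findings: readonly Finding[];
  evidence: readonly string[];
  readinessGaps: readonly string[];
  recommendedFollowUp: "agent-readiness" | "docs" | "implementation";
}

const SEVERITY_RANK: Record<Severity, number> = { high: 0, medium: 1, low: 2 };

// Order findings so the most severe come first, without mutating the input.
function bySeverity(findings: readonly Finding[]): Finding[] {
  return findings.slice().sort((a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity]);
}

// Assumption: a report with no stated change, surface, or evidence is incomplete.
function missingSections(report: VerificationReport): string[] {
  const missing: string[] = [];
  if (report.changeVerified.trim() === "") missing.push("change verified");
  if (report.surfacesExercised.length === 0) missing.push("surfaces exercised");
  if (report.evidence.length === 0) missing.push("exact evidence");
  return missing;
}
```

Run `missingSections` before sending the report. A non-empty result suggests the verification itself is incomplete, so a formatting fix alone would not address it.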

References

  • references/verification.md — evaluator pattern, targeted real-surface checks, and cost trade-offs
  • references/evidence-rules.md — what counts as proof and how to report it
  • references/simplification.md — clarity, dedupe, and "fresh-agent readability" checks for changed code