verify
Use the existing infrastructure to prove your own change works before calling it done.
Principles
- The builder does not grade their own work in the same context; switch into a fresh evaluator context or separate subagent first
- Run repo guardrails first, then hit the real surface
- Prefer smoke, integration, contract, or e2e proof over unit tests that mock most of the behavior under test
- Challenge the changed code for shape as well as behavior; passing tests do not excuse bloated, duplicated, or comment-dependent code
- Load shared doctrine from the repo's guidance files such as `AGENTS.md`, `CLAUDE.md`, or repo rules before judging the result
- If the infrastructure is too weak to verify reliably, stop and hand off to `agent-readiness`
Handoffs
- No stable boot / smoke / interact path, or infrastructure too weak to trust → use `agent-readiness`
- Need to review existing code, a diff, branch, or PR you are not verifying as the builder → use `review`
- Main problem is stale AGENTS.md, README, specs, or repo docs → use `docs`
Before You Start
- Define the exact change being verified and the expected user-visible behavior
- Switch into an independent evaluator context before judging your own work
- Load the target repo's guidance files such as `AGENTS.md`, `CLAUDE.md`, or repo rules, when present
- Confirm you can boot and interact with the real surface
- Pick the smallest check set that can disprove the change honestly
Workflow
1. Run deterministic guardrails first
- Prefer the repo's built-in entrypoint: `make verify`, `just verify`, `pnpm test`, `cargo test`, or the nearest targeted equivalent
- When choosing tests, prefer the strongest cheap proof available: smoke, integration, contract, or e2e checks beat mock-heavy unit suites that mainly replay implementation details
- Swallow boring success output and surface only failures, anomalies, and exact commands
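As a rough illustration of the guardrail step, the sketch below picks one of the entrypoints named above and reports only failures. The marker-file mapping (`Makefile`, `justfile`, `package.json`, `Cargo.toml`) and the names `pickEntrypoint`, `surfaceFailures`, and `CommandResult` are assumptions made for this example. A real repo may have a `Makefile` without a `verify` target, so treat the result as a first guess to confirm against the repo's guidance files.

```typescript
// Hypothetical mapping from root marker files to the entrypoints named above.
// The marker-file names are assumptions, not something the skill prescribes.
const ENTRYPOINTS: ReadonlyArray<{ marker: string; command: string }> = [
  { marker: "Makefile", command: "make verify" },
  { marker: "justfile", command: "just verify" },
  { marker: "package.json", command: "pnpm test" },
  { marker: "Cargo.toml", command: "cargo test" },
];

/** Return the first entrypoint whose marker file is present in the repo root. */
function pickEntrypoint(rootFiles: readonly string[]): string | undefined {
  for (const { marker, command } of ENTRYPOINTS) {
    if (rootFiles.some((file) => file === marker)) return command;
  }
  return undefined;
}

interface CommandResult {
  command: string;
  exitCode: number;
  output: string;
}

/** Swallow successful runs; keep the exact command and output of each failure. */
function surfaceFailures(results: readonly CommandResult[]): string[] {
  return results
    .filter((r) => r.exitCode !== 0)
    .map((r) => `FAILED (exit ${r.exitCode}): ${r.command}\n${r.output.trim()}`);
}
```

Keeping the selection and the reporting as pure functions means the part that actually spawns processes can stay thin and repo-specific.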
2. Exercise the real surface
- UI → run the browser automation, navigate the changed flow, and capture screenshots
- API → hit the local endpoint with a real request such as `curl http://127.0.0.1:3000/health`
- CLI → run the shipped command such as `node dist/cli.js --help` or the repo's packaged entrypoint
- state/config → verify round trips, restart behavior, and config boot paths
Follow references/evidence-rules.md when collecting proof.
3. Run a code-shape pass on the changed files
- Focus on code touched in the current task unless the changes obviously exposed a broader local mess
- Ask whether the solution matches the repo's language, framework, and design patterns rather than merely working
- Remove duplication, dead branches, unused helpers, and unnecessary abstractions when they do not protect a real boundary
- Treat `any`, unsafe `as`, boundary-leaking `unknown`, and non-null assertions as safety failures unless the repo explicitly allows them
- Check that failures are classified intentionally and surfaced with useful recovery guidance, while preserving codes or diagnostics for operators
- Prefer code that explains itself; comments should survive only when they carry durable context the code cannot make obvious
- Read the changed files as if a brand new agent inherited them tomorrow and had to extend the flow without prior context
Use references/simplification.md for the exact simplification questions.
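To make the type-safety and error-classification checks in this pass concrete, here is a minimal sketch of what a reviewer might accept in place of an unsafe cast. The `RetryConfig` shape, the `CONFIG_INVALID` code, and the function names are hypothetical and chosen only for illustration. The point is that `unknown` is narrowed at the boundary and each failure carries both a stable code and a recovery hint.

```typescript
interface RetryConfig {
  maxRetries: number;
}

// Shapes this pass would flag (left as comments so the block stays type-safe):
//   const cfg = JSON.parse(raw) as RetryConfig;  // unsafe `as`: nothing checks the shape
//   const n = (input as any).maxRetries!;        // `any` plus a non-null assertion

type ParseResult =
  | { ok: true; value: RetryConfig }
  | { ok: false; code: "CONFIG_INVALID"; message: string };

/** Narrow an unknown value to a plain object before reading its fields. */
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

/** Build a classified failure: a stable code for operators plus a recovery hint. */
function invalid(message: string): ParseResult {
  return { ok: false, code: "CONFIG_INVALID", message };
}

/** Parse at the boundary so `unknown` never leaks into the rest of the code. */
function parseRetryConfig(raw: string): ParseResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return invalid("retry config is not valid JSON; fix the syntax and reload");
  }
  const maxRetries = isRecord(parsed) ? parsed.maxRetries : undefined;
  if (typeof maxRetries !== "number" || !isFinite(maxRetries) || maxRetries < 0) {
    return invalid('maxRetries must be a non-negative number, e.g. {"maxRetries": 3}');
  }
  return { ok: true, value: { maxRetries } };
}
```

Callers then branch on `ok` and never need a cast or a non-null assertion to read `value`.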
4. Probe adjacent risk
- Check the main happy path
- Check at least one failure path or edge case
- Check that at least one exercised failure path returns or logs a useful, actionable error instead of a vague or swallowed failure
- Re-test any config, persistence, or restart-sensitive behavior touched by the change
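One way to keep this probe honest is to record each check and then list what is still missing. The sketch below does that under some assumptions: `ProbeResult`, `probeGaps`, and the small deny-list of vague messages are hypothetical, and a real evaluator would judge "actionable" by reading the error text rather than by string matching alone.

```typescript
interface ProbeResult {
  name: string;
  kind: "happy" | "failure";
  /** Error text the surface returned or logged, if any. */
  errorText?: string;
}

// Hypothetical deny-list of messages too vague to act on.
const VAGUE_ERRORS = ["error", "failed", "unknown error", "something went wrong"];

/** Crude heuristic: an error is actionable if it is non-empty and not on the deny-list. */
function isActionable(errorText: string | undefined): boolean {
  if (errorText === undefined) return false;
  const normalized = errorText.trim().toLowerCase();
  return normalized.length > 0 && !VAGUE_ERRORS.some((vague) => vague === normalized);
}

/** List which of the probes described above have not been covered yet. */
function probeGaps(results: readonly ProbeResult[]): string[] {
  const gaps: string[] = [];
  if (!results.some((r) => r.kind === "happy")) {
    gaps.push("no happy-path check");
  }
  const failures = results.filter((r) => r.kind === "failure");
  if (failures.length === 0) {
    gaps.push("no failure path or edge case check");
  } else if (!failures.some((r) => isActionable(r.errorText))) {
    gaps.push("no failure path produced an actionable error");
  }
  return gaps;
}
```

An empty result from `probeGaps` only says the minimum probes were run; it does not replace re-testing config, persistence, or restart behavior touched by the change.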
5. Synthesize the verdict
Produce one clear outcome:
- `ship it`
- `needs review`
- `blocked`
If blocked because the infrastructure is weak, say so explicitly and hand off to `agent-readiness`.
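The verdict step can be sketched as a small decision function. Only the three outcomes and the `agent-readiness` handoff come from the text; the severity scale and the threshold (anything above `low` means `needs review`) are assumptions for this example, as are the type and function names.

```typescript
type Verdict = "ship it" | "needs review" | "blocked";
type Severity = "low" | "medium" | "high";

interface Finding {
  severity: Severity;
  summary: string;
}

interface VerificationState {
  /** False when there is no stable boot / smoke / interact path to trust. */
  infrastructureTrusted: boolean;
  findings: readonly Finding[];
}

interface Outcome {
  verdict: Verdict;
  reason: string;
  handoff?: "agent-readiness";
}

/** Reduce the collected state to exactly one outcome. */
function synthesizeVerdict(state: VerificationState): Outcome {
  if (!state.infrastructureTrusted) {
    return {
      verdict: "blocked",
      reason: "infrastructure too weak to verify reliably",
      handoff: "agent-readiness",
    };
  }
  // Assumed threshold: any finding above "low" needs another reviewer.
  const serious = state.findings.filter((f) => f.severity !== "low");
  if (serious.length > 0) {
    return {
      verdict: "needs review",
      reason: serious.map((f) => f.summary).join("; "),
    };
  }
  return { verdict: "ship it", reason: "no findings above low severity" };
}
```

Checking infrastructure first mirrors the rule above: a weak harness is reported as `blocked` with an explicit handoff instead of being folded into the findings.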
Output
After verification, report:
- verdict
- change verified
- surfaces exercised
- code-shape findings: clarity, duplication, dead code, unsafe type escapes, error classification, recovery messaging, comments, or maintainability debt in the changed files
- top findings by severity
- exact evidence: commands, screenshots, traces, responses, or file references
- readiness gaps or doc drift discovered during verification
- recommended follow-up: `agent-readiness`, `docs`, or implementation
Example:
```text
verdict: needs review
change verified: retry banner after transient API failure
surfaces exercised: pnpm test test/retry.spec.ts, curl http://127.0.0.1:3000/api/retry
code-shape finding: low — retry counter update is split across two helpers with identical branching; merge into one explicit path
finding: medium — the UI recovers, but the retry count is not persisted across refresh
evidence: local API returned 200 after retry; browser screenshot after refresh shows count reset to 0
recommended follow-up: implementation
```
References
- references/verification.md — evaluator pattern, targeted real-surface checks, and cost trade-offs
- references/evidence-rules.md — what counts as proof and how to report it
- references/simplification.md — clarity, dedupe, and "fresh-agent readability" checks for changed code