agent-readiness
Make a repo ready for autonomous agent work.
Principles
- Environment > instruction: infrastructure matters more than the prompt
- Mechanical enforcement > prose: hooks, CI, health checks, and scripts beat wishes
- Separate builder from judge: `agent-readiness` builds the rig, `verify` proves your own change, `review` critiques existing code
- Real behavior > mocked confidence: smoke, integration, and e2e checks beat large suites that mostly mock the seam under test
- Smallest useful layer first: add layers in order, stop when the repo becomes reliably verifiable
- Progressive disclosure: keep the core workflow here, load patterns on demand
Handoffs
- Need to review existing code, a diff, branch, or PR → use `review`
- Need to prove your own completed change works on real surfaces → use `verify`
- Need to update AGENTS.md, README.md, specs, or repo docs → use `docs`
The 7-Layer Stack
- Boot — single command starts the app
- Smoke — a fast proof the app is alive
- Interact — agent can exercise the real surface
- E2e — key user flows work end to end
- Enforce — hooks, CI gates, lint rules, or mechanical checks
- Observe — logs, health endpoints, traces, machine-readable signals
- Isolate — worktrees or containers do not collide
Concrete examples:
- Boot: `pnpm dev`, `cargo run`, or `docker compose up`
- Smoke: `curl http://127.0.0.1:3000/health`
- Interact/E2e: `pnpm exec playwright test`
- Observe: structured logs or a machine-readable health endpoint
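The examples above stop at Observe. For Isolate, one possible approach is to give each worktree its own port so that parallel agents do not compete for the same one. The sketch below is an assumption rather than something this guide prescribes: `port_for_dir`, the base port 3000, and the 1000-port range are all hypothetical choices.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Map a directory path to a port in the range 3000-3999.
# The base port and range are arbitrary choices for this sketch.
port_for_dir() {
  local dir="$1" sum
  # cksum prints "<crc> <byte-count>"; keep only the CRC.
  sum="$(printf '%s' "$dir" | cksum | cut -d ' ' -f 1)"
  echo $(( 3000 + sum % 1000 ))
}

# Derive the port from the current worktree and expose it to child processes.
PORT="$(port_for_dir "$PWD")"
export PORT
echo "Using PORT=$PORT for $PWD"
```

Two paths can still hash to the same port, so this lowers the chance of a collision without eliminating it.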
Workflow
1. Audit
Grade the repo across these dimensions:
- bootable
- testable
- observable
- verifiable
For each, report:
- status: `pass` / `partial` / `fail`
- evidence: file or command
- gap: what is missing
Grade against references/grading.md. The lowest dimension sets the overall grade.
Example output:
text
bootable: partial — `pnpm dev` starts the app after manual env setup
testable: fail — only mocked tests under test/
observable: partial — health endpoint exists, structured logs missing
verifiable: fail — no stable smoke or interaction script
overall grade: D
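The rule that the lowest dimension sets the overall result can be sketched in shell. This is a minimal illustration only: `status_rank` and `lowest_status` are hypothetical helper names, and the mapping from statuses to a letter grade such as D is left to references/grading.md.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Rank a status so that a lower number means a worse result.
status_rank() {
  case "$1" in
    fail)    echo 0 ;;
    partial) echo 1 ;;
    pass)    echo 2 ;;
    *)       echo "unknown status: $1" >&2; return 1 ;;
  esac
}

# Print the worst status among the arguments.
lowest_status() {
  [ "$#" -gt 0 ] || { echo "usage: lowest_status STATUS..." >&2; return 1; }
  local worst="pass" worst_rank=2 s rank
  for s in "$@"; do
    rank="$(status_rank "$s")" || return 1
    if [ "$rank" -lt "$worst_rank" ]; then
      worst="$s"
      worst_rank="$rank"
    fi
  done
  echo "$worst"
}

# Statuses for bootable, testable, observable, verifiable from the example output.
lowest_status partial fail partial fail   # prints "fail"
```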
2. Setup
Build missing layers in this order:
Boot → Smoke → Interact → E2e → Enforce → Observe → Isolate
Each step should be independently useful. Stop once the repo is reliably verifiable; do not build a cathedral because you got excited.
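As a rough sketch of the ordering rule above, the helper below reports the first layer in the stack that a repo does not have yet. `next_layer` and the lowercase layer labels are hypothetical. Deciding when the repo is reliably verifiable, and therefore when to stop, remains a judgment call that the script does not make.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Layer order as given in the setup step.
LAYERS=(boot smoke interact e2e enforce observe isolate)

# Print the first layer, in order, that is not among the arguments.
# Prints nothing if every layer is already present.
next_layer() {
  local layer have found
  for layer in "${LAYERS[@]}"; do
    found=0
    for have in "$@"; do
      if [ "$have" = "$layer" ]; then
        found=1
        break
      fi
    done
    if [ "$found" -eq 0 ]; then
      echo "$layer"
      return 0
    fi
  done
  return 0
}

# A repo with boot, smoke, and e2e but no interaction script.
next_layer boot smoke e2e   # prints "interact"
```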
When readiness work includes agent entrypoints, keep `AGENTS.md` as the canonical authored guide and place `CLAUDE.md` beside it as a symlink to `AGENTS.md` rather than maintaining two separate guidance files.
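A minimal sketch of that symlink setup, assuming a POSIX-like shell with `ln`. `link_agent_guides` is a hypothetical helper name. It refuses to overwrite a hand-written CLAUDE.md, which is a design choice of this sketch and not a rule from this guide.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Make CLAUDE.md a relative symlink to AGENTS.md inside the given directory.
link_agent_guides() {
  local dir="${1:-.}"
  if [ ! -f "$dir/AGENTS.md" ]; then
    echo "ERROR: $dir/AGENTS.md not found" >&2
    return 1
  fi
  # Leave a regular CLAUDE.md alone so hand-written content is never lost.
  if [ -e "$dir/CLAUDE.md" ] && [ ! -L "$dir/CLAUDE.md" ]; then
    echo "ERROR: $dir/CLAUDE.md exists and is not a symlink" >&2
    return 1
  fi
  # -s symbolic, -f replace an existing link, -n do not follow an existing link.
  # The relative target keeps the link valid wherever the repo is cloned.
  ( cd "$dir" && ln -sfn AGENTS.md CLAUDE.md )
}
```

Calling `link_agent_guides /path/to/repo` once is enough; after that, edits to AGENTS.md show up under both names.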
Boot (create a single-command entry point):
bash
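#!/usr/bin/env bash
set -euo pipefail

# Start the app in the background and remember its PID.
<your-boot-command> &
APP_PID=$!
HEALTH_URL="http://localhost:${PORT:-3000}/health"

# Poll the health endpoint once a second for up to 30 seconds.
for _ in $(seq 1 30); do
  curl -sf "$HEALTH_URL" > /dev/null 2>&1 && break
  sleep 1
done

# Final check: report the failure and stop the app if it never became healthy.
curl -sf "$HEALTH_URL" > /dev/null 2>&1 || {
  echo "ERROR: App failed to start" >&2; kill "$APP_PID" 2>/dev/null || true; exit 1
}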
echo "App is ready"
Smoke (fast proof the app is alive, under 5 seconds):
bash
curl -sf http://localhost:3000/health | jq . # HTTP service
./dist/my-cli --version # CLI tool
npx playwright test smoke.spec.ts # UI app
Enforce (pre-push hook to catch failures before CI):
bash
#!/usr/bin/env bash
# .git-hooks/pre-push
set -euo pipefail
<your-lint-command>
<your-smoke-command>
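Git does not look in `.git-hooks/` by default, so a hook stored there has to be wired up once per clone. One way to do that, assuming the hook lives at `.git-hooks/pre-push`, is Git's `core.hooksPath` setting. `install_hooks` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Point Git at the repo's versioned hooks directory.
install_hooks() {
  local repo="${1:-.}"
  if [ ! -f "$repo/.git-hooks/pre-push" ]; then
    echo "ERROR: $repo/.git-hooks/pre-push not found" >&2
    return 1
  fi
  # Git only runs hooks that are executable.
  chmod +x "$repo/.git-hooks/pre-push"
  # A relative hooksPath is resolved from the directory where the hook runs,
  # normally the top of the working tree.
  git -C "$repo" config core.hooksPath .git-hooks
}
```

Running `install_hooks .` from the repo root is enough. The setting is local to the clone, so each new clone or worktree setup script would need to repeat it.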
See [references/setup-patterns.md](references/setup-patterns.md) for e2e, observability, isolation, and containerized stack patterns.
3. Improve
Tighten weak or flaky layers:
- remove mock-only confidence theater
- prefer smoke, integration, and e2e checks over mock-heavy suites that self-verify implementation details
- replace one-off checks with reusable scripts or hooks
- add dead-code or unused-symbol enforcement where the stack supports it
- add logs and health signals agents can query
- make parallel work safe when agent collisions are real
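For the point about logs that agents can query, one option is to emit one JSON object per line so that a tool can filter on fields instead of parsing free text. The sketch below is illustrative: `json_escape`, `log_json`, and the field names `ts`, `level`, and `msg` are assumptions, and only backslashes and double quotes are escaped.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Escape backslashes and double quotes so a value is safe inside a JSON string.
# Control characters such as newlines are not handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Print one JSON log line with a UTC timestamp, a level, and a message.
log_json() {
  local level="$1" msg="$2" ts
  ts="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  printf '{"ts":"%s","level":"%s","msg":"%s"}\n' \
    "$ts" "$(json_escape "$level")" "$(json_escape "$msg")"
}

log_json info 'smoke check passed'
```

A script that already prints ad hoc status text could call `log_json` instead, which keeps the output greppable and still readable by a person.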
4. Hand Off
When the repo reaches C+ and can be judged honestly, hand off to `verify` or `review`.
If changes created doc drift, hand off to `docs`.
Anti-Patterns
- Mock-only tests: pass by construction, verify nothing
- Mock-heavy unit suites as the main proof: agents love them because they are easy to satisfy, not because they prove the system works
- Self-evaluation: builder grading its own work
- Docs-only fixes disguised as readiness work
- Routine PR review here: that's `review`
- Perfect infrastructure upfront: iterate from real failure modes
Output
After readiness work, report:
- grade before and after
- dimensions with evidence
- files changed
- remaining gaps ranked by impact
- verify readiness
- recommended next handoff: `verify`, `review`, `docs`, or human review
References
- references/grading.md — agent-readiness grading scale with mechanical criteria
- references/setup-patterns.md — boot, smoke, e2e, observability, and isolation patterns
- references/industry-examples.md — external patterns and justification for readiness investment