agent-readiness

Agent-Readiness

Make a repo ready for autonomous agent work.

Principles

Environment > instruction — infrastructure matters more than the prompt
Mechanical enforcement > prose — hooks, CI, health checks, and scripts beat wishes
Separate builder from judge: `agent-readiness` builds the rig, `verify` proves your own change, `review` critiques existing code
Real behavior > mocked confidence — smoke, integration, and e2e checks beat large suites that mostly mock the seam under test
Smallest useful layer first — add layers in order, stop when the repo becomes reliably verifiable
Progressive disclosure — keep the core workflow here, load patterns on demand

Handoffs

Need to review existing code, a diff, branch, or PR → use `review`
Need to prove your own completed change works on real surfaces → use `verify`
Need to update AGENTS.md, README.md, specs, or repo docs → use `docs`

The 7-Layer Stack

Boot — single command starts the app
Smoke — a fast proof the app is alive
Interact — agent can exercise the real surface
E2e — key user flows work end to end
Enforce — hooks, CI gates, lint rules, or mechanical checks
Observe — logs, health endpoints, traces, machine-readable signals
Isolate — worktrees or containers do not collide

Concrete examples:

Boot: `pnpm dev`, `cargo run`, or `docker compose up`
Smoke: `curl http://127.0.0.1:3000/health`
Interact/E2e: `pnpm exec playwright test`
Observe: structured logs or a machine-readable health endpoint
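
The examples above stop before Isolate. One way to keep parallel worktrees from colliding on a port is to derive the port from the worktree path. This is a sketch under assumptions, not part of the original workflow: the function name `worktree_port`, the base port 3000, and the 1000-port range are all hypothetical, and two paths can still hash to the same port.

```bash
# Hypothetical helper: map a worktree path to a stable port in 3000-3999.
# cksum is POSIX, so the same path always yields the same port.
worktree_port() {
  local path="$1"
  local sum
  # cksum on stdin prints "<checksum> <byte count>"; keep the checksum.
  sum=$(printf '%s' "$path" | cksum | cut -d ' ' -f 1)
  echo $(( 3000 + sum % 1000 ))
}
```

Usage might look like `PORT=$(worktree_port "$PWD") pnpm dev`, assuming the app reads its port from `PORT`.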

Workflow

1. Audit

Grade the repo across these dimensions:

bootable
testable
observable
verifiable

For each, report:

status: `pass` / `partial` / `fail`
evidence: file or command
gap: what is missing

Grade against references/grading.md. The lowest-scoring dimension sets the overall grade.

Example output:

text

bootable: partial — `pnpm dev` starts the app after manual env setup
testable: fail — only mocked tests under test/
observable: partial — health endpoint exists, structured logs missing
verifiable: fail — no stable smoke or interaction script
overall grade: D
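
The "lowest dimension sets the overall grade" rule can be applied mechanically. A minimal sketch, assuming the three status values are ordered fail < partial < pass. The function name `lowest_status` is hypothetical, and the mapping from statuses to letter grades lives in references/grading.md and is not reproduced here.

```bash
# Hypothetical helper: print the worst status among its arguments.
# Ordering assumed: fail < partial < pass.
lowest_status() {
  local worst="pass" status
  for status in "$@"; do
    case "$status" in
      fail)    echo "fail"; return 0 ;;   # nothing is worse than fail
      partial) worst="partial" ;;
      pass)    ;;
      *)       echo "unknown status: $status" >&2; return 1 ;;
    esac
  done
  echo "$worst"
}

# Statuses taken from the example output above.
lowest_status partial fail partial fail   # prints: fail
```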

2. Setup

Build missing layers in this order:

Boot → Smoke → Interact → E2e → Enforce → Observe → Isolate

Each step should be independently useful. Stop once the repo is reliably verifiable; do not build a cathedral because you got excited.

When readiness work includes agent entrypoints, keep `AGENTS.md` as the canonical authored guide and place `CLAUDE.md` beside it as a symlink to `AGENTS.md` rather than maintaining two separate guidance files.
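
The symlink arrangement can be sketched as follows, assuming AGENTS.md already exists in the current directory and any existing CLAUDE.md there is safe to replace. The function name `link_agent_guides` is hypothetical.

```bash
# Hypothetical helper: make CLAUDE.md a relative symlink to AGENTS.md.
link_agent_guides() {
  if [ ! -f AGENTS.md ]; then
    echo "AGENTS.md not found; write it first" >&2
    return 1
  fi
  # -s: symbolic link, -f: replace an existing CLAUDE.md (assumed safe here)
  ln -sf AGENTS.md CLAUDE.md
}
```

Run it from the repo root. Afterwards `readlink CLAUDE.md` should print `AGENTS.md`.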

Boot — create a single-command entry point:

bash

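#!/usr/bin/env bash
set -euo pipefail
# Start the app in the background and remember its PID.
<your-boot-command> &
APP_PID=$!
HEALTH_URL="http://localhost:${PORT:-3000}/health"
# Poll the health endpoint for up to 30 seconds.
for _ in $(seq 1 30); do
  curl -sf "$HEALTH_URL" > /dev/null 2>&1 && break
  sleep 1
done
# Final check: stop the app and fail if it never became healthy.
curl -sf "$HEALTH_URL" > /dev/null 2>&1 || {
  echo "ERROR: App failed to start" >&2; kill "$APP_PID" 2>/dev/null || true; exit 1
}
echo "App is ready"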

Smoke — fast proof the app is alive (< 5 seconds):

bash

curl -sf http://localhost:3000/health | jq .   # HTTP service
./dist/my-cli --version                         # CLI tool
npx playwright test smoke.spec.ts               # UI app
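
The under-5-second budget can be enforced rather than hoped for. A minimal sketch, assuming GNU coreutils `timeout` is available. The function name `smoke_within` is hypothetical.

```bash
# Hypothetical helper: run a smoke command and fail if it errors
# or exceeds the time budget (in seconds).
smoke_within() {
  local budget="$1"
  shift
  if timeout "$budget" "$@" > /dev/null 2>&1; then
    echo "smoke ok: $*"
  else
    echo "smoke FAILED (error or over ${budget}s): $*" >&2
    return 1
  fi
}
```

For the HTTP case above, usage might be `smoke_within 5 curl -sf http://localhost:3000/health`.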

Enforce — pre-push hook to catch failures before CI:

bash

#!/usr/bin/env bash
# .git-hooks/pre-push
set -euo pipefail
<your-lint-command>
<your-smoke-command>

See [references/setup-patterns.md](references/setup-patterns.md) for e2e, observability, isolation, and containerized stack patterns.
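
Git does not run hooks from `.git-hooks/` by default, so the pre-push hook above needs to be wired in once per clone. A minimal sketch, assuming Git 2.9 or newer (which introduced `core.hooksPath`) and that it runs from the repo root. The function name `install_hooks` is hypothetical.

```bash
# Hypothetical helper: point this clone's Git at the tracked hooks directory.
install_hooks() {
  # Hooks must be executable or Git will not run them.
  chmod +x .git-hooks/pre-push
  # Stored in the local repo config, so each clone runs this once.
  git config core.hooksPath .git-hooks
}
```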

3. Improve

Tighten weak or flaky layers:

remove mock-only confidence theater
prefer smoke, integration, and e2e checks over mock-heavy suites that self-verify implementation details
replace one-off checks with reusable scripts or hooks
add dead-code or unused-symbol enforcement where the stack supports it
add logs and health signals agents can query
make parallel work safe when agent collisions are real
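
For the log signals in the list above, one option is to emit one JSON object per line so an agent can filter with standard tools. A minimal sketch: the function name `log_event`, the field names, and the event name are hypothetical, and it assumes the values contain no characters that need JSON escaping.

```bash
# Hypothetical helper: write one JSON log line to stdout.
# Assumes level and event contain no quotes, backslashes, or newlines.
log_event() {
  local level="$1" event="$2"
  printf '{"ts":"%s","level":"%s","event":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" "$event"
}

log_event info boot_ready
```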

4. Hand Off

When the repo reaches C+ and can be judged honestly, hand off to `verify` or `review`. If changes created doc drift, hand off to `docs`.

Anti-Patterns

Mock-only tests — pass by construction, verify nothing
Mock-heavy unit suites as the main proof — agents love them because they are easy to satisfy, not because they prove the system works
Self-evaluation — builder grading its own work
Docs-only fixes disguised as readiness work
Routine PR review here: that's `review`
Perfect infrastructure upfront — iterate from real failure modes

Output

After readiness work, report:

grade before and after
dimensions with evidence
files changed
remaining gaps ranked by impact
whether the repo is ready for `verify`
recommended next handoff: `verify`, `review`, `docs`, or human review

References

references/grading.md — agent-readiness grading scale with mechanical criteria
references/setup-patterns.md — boot, smoke, e2e, observability, and isolation patterns
references/industry-examples.md — external patterns and justification for readiness investment
