agent-dx-cli-scale

Agent DX CLI Scale

Use this skill to evaluate any CLI against the principles of agent-first design. Score each of the seven axes from 0 to 3, then sum the scores for a total between 0 and 21.

Human DX optimizes for discoverability and forgiveness. Agent DX optimizes for predictability and defense-in-depth. — You Need to Rewrite Your CLI for AI Agents

Scoring Axes

1. Machine-Readable Output

Can an agent parse the CLI's output without heuristics?

Score	Criteria
0	Human-only output (tables, color codes, prose). No structured format available.
1	`--output json` or equivalent exists but is incomplete or inconsistent across commands.
2	Consistent JSON output across all commands. Errors also return structured JSON.
3	NDJSON streaming for paginated results. Structured output is the default in non-TTY (piped) contexts.
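
As an illustration of the score-3 behavior, here is a minimal Python sketch that switches to NDJSON whenever the output stream is not a TTY and also emits errors as JSON. The function names `emit` and `emit_error` and the error envelope shape are assumptions made for this example, not part of any particular CLI.

```python
import json
import sys
from typing import Iterable, TextIO


def emit(records: Iterable[dict], stream: TextIO = sys.stdout) -> None:
    """Write records as NDJSON when piped, as a readable listing on a TTY."""
    if stream.isatty():
        # Interactive terminal: a human-friendly rendering is acceptable here.
        for record in records:
            stream.write("  ".join(f"{k}={v}" for k, v in record.items()) + "\n")
    else:
        # Piped or captured: one compact JSON object per line (NDJSON).
        for record in records:
            stream.write(json.dumps(record, separators=(",", ":")) + "\n")


def emit_error(code: str, message: str, stream: TextIO = sys.stderr) -> None:
    """Errors are structured too, so an agent never has to parse prose."""
    # The {"error": {...}} envelope is a hypothetical shape for this sketch.
    stream.write(json.dumps({"error": {"code": code, "message": message}}) + "\n")
```

Because each line is a complete JSON document, an agent can process results page by page without buffering the whole response.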

2. Raw Payload Input

Can an agent send the full API payload without translation through bespoke flags?

Score	Criteria
0	Only bespoke flags. No way to pass structured input.
1	Accepts `--json` or stdin JSON for some commands, but most require flags.
2	All mutating commands accept a raw JSON payload that maps directly to the underlying API schema.
3	Raw payload is first-class alongside convenience flags. The agent can use the API schema as documentation with zero translation loss.
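
One way to make the raw payload first-class alongside convenience flags is sketched below in Python. The command name `example-cli create`, the `--title` flag, and the convention that `--json -` reads from stdin are all hypothetical choices for this example.

```python
import argparse
import json
import sys
from typing import Sequence, TextIO


def build_payload(argv: Sequence[str], stdin: TextIO = sys.stdin) -> dict:
    """Build the request body from a raw JSON payload plus optional flags."""
    parser = argparse.ArgumentParser(prog="example-cli create")  # hypothetical command
    parser.add_argument("--json", dest="raw", help="raw API payload, or '-' for stdin")
    parser.add_argument("--title", help="convenience flag for the 'title' field")
    args = parser.parse_args(argv)

    payload: dict = {}
    if args.raw == "-":
        payload = json.load(stdin)
    elif args.raw is not None:
        payload = json.loads(args.raw)
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")

    # Convenience flags only override single fields; everything else in the
    # raw payload passes through untouched, so nothing is lost in translation.
    if args.title is not None:
        payload["title"] = args.title
    return payload
```

In this sketch the payload is forwarded as-is, so fields the CLI author never anticipated still reach the API.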

3. Schema Introspection

Can an agent discover what the CLI accepts at runtime without pre-stuffed documentation?

Score	Criteria
0	Only `--help` text. No machine-readable schema.
1	`--help --json` or a `describe` command for some surfaces, but incomplete.
2	Full schema introspection for all commands — params, types, required fields — as JSON.
3	Live, runtime-resolved schemas (e.g., from a discovery document) that always reflect the current API version. Includes scopes, enums, and nested types.
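
A score-2 `describe` command could be backed by something like the following Python sketch, which keeps a declarative description of each command and serializes it to JSON. The `Param` and `Command` classes and their field names are assumptions for illustration; a score-3 CLI would resolve this data from a live discovery document instead of hard-coding it.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Param:
    name: str
    type: str
    required: bool = False
    enum: Optional[List[str]] = None  # allowed values, if the field is an enum


@dataclass
class Command:
    name: str
    params: List[Param] = field(default_factory=list)


def describe(command: Command) -> str:
    """Return the command's parameters, types, and required fields as JSON."""
    return json.dumps(asdict(command), indent=2, sort_keys=True)
```

For example, `describe(Command("items.create", [Param("title", "string", required=True)]))` yields a JSON document an agent can read instead of scraping `--help` text.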

4. Context Window Discipline

Does the CLI help agents control response size to protect their context window?

Score	Criteria
0	Returns full API responses with no way to limit fields or paginate.
1	Supports `--fields` or field masks on some commands.
2	Field masks on all read commands. Pagination with `--page-all` or equivalent.
3	Streaming pagination (NDJSON per page). Explicit guidance in context/skill files on field mask usage. The CLI actively protects the agent from token waste.
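
To show what a field mask buys the agent, here is a small Python sketch that trims a response down to the requested fields. The comma-separated, dot-delimited mask syntax (for example `id,owner.name`) is an assumption for this example; real APIs define their own mask grammar.

```python
def apply_field_mask(record: dict, mask: str) -> dict:
    """Keep only the fields named in a mask such as "id,owner.name"."""
    result: dict = {}
    for path in (p.strip() for p in mask.split(",") if p.strip()):
        keys = path.split(".")
        # Walk the record to find the requested value.
        value = record
        for key in keys:
            if not isinstance(value, dict) or key not in value:
                break  # unknown field: skip it rather than fail
            value = value[key]
        else:
            # Every key resolved: copy the value into the same nested position.
            target = result
            for key in keys[:-1]:
                target = target.setdefault(key, {})
            target[keys[-1]] = value
    return result
```

Applied before printing, a mask like this keeps large bodies and unused metadata out of the agent's context window.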

5. Input Hardening

Does the CLI defend against the specific ways agents fail (hallucinations, not typos)?

Score	Criteria
0	No input validation beyond basic type checks.
1	Validates some inputs, but does not cover agent-specific hallucination patterns (path traversals, embedded query params, double encoding).
2	Rejects control characters, path traversals ( `../` ), percent-encoded segments ( `%2e` ), and embedded query params ( `?` , `#` ) in resource IDs.
3	Comprehensive hardening: all of the above, plus output path sandboxing to CWD, HTTP-layer percent-encoding, and an explicit security posture — "The agent is not a trusted operator."
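
The checks named in the score-2 and score-3 rows can be sketched in Python roughly as follows. The function names and error messages are hypothetical, and the rules are deliberately strict: the sketch rejects suspicious input outright instead of trying to repair it. `Path.is_relative_to` requires Python 3.9 or later.

```python
import re
import unicodedata
from pathlib import Path

_PERCENT_ENCODED = re.compile(r"%[0-9A-Fa-f]{2}")


def validate_resource_id(value: str) -> str:
    """Return the ID unchanged if it looks safe, otherwise raise ValueError."""
    if not value:
        raise ValueError("resource ID is empty")
    if any(unicodedata.category(ch) == "Cc" for ch in value):
        raise ValueError("resource ID contains a control character")
    if ".." in value.replace("\\", "/").split("/"):
        raise ValueError("resource ID contains a path traversal segment")
    if _PERCENT_ENCODED.search(value):
        raise ValueError("resource ID contains a percent-encoded segment")
    if "?" in value or "#" in value:
        raise ValueError("resource ID contains an embedded query or fragment")
    return value


def sandbox_output_path(path: str, root: Path) -> Path:
    """Resolve an output path and refuse anything that escapes the root."""
    root = root.resolve()
    resolved = (root / path).resolve()
    if not resolved.is_relative_to(root):
        raise ValueError(f"output path escapes {root}")
    return resolved
```

A CLI would typically call `sandbox_output_path(user_path, Path.cwd())` so that hallucinated paths cannot write outside the working directory.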

6. Safety Rails

Can agents validate before acting, and are responses sanitized against prompt injection?

Score	Criteria
0	No dry-run mode. No response sanitization.
1	`--dry-run` exists for some mutating commands.
2	`--dry-run` for all mutating commands. Agent can validate requests without side effects.
3	Dry-run plus response sanitization (e.g., via Model Armor) to defend against prompt injection embedded in API data. The full request→response loop is defended.
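
The dry-run half of this axis can be sketched in a few lines of Python: the command builds exactly the request it would send, and only the final step differs. The `delete_item` function, the `/items/{id}` path, and the injected `send` callable are assumptions for this example; response sanitization is out of scope here because it depends on the sanitizing service in use.

```python
from typing import Callable


def delete_item(item_id: str, send: Callable[[dict], dict], dry_run: bool = False) -> dict:
    """Delete an item, or report the request that would be sent."""
    # Build the request identically in both modes so dry-run output is faithful.
    request = {"method": "DELETE", "path": f"/items/{item_id}"}
    if dry_run:
        # No side effects: the transport is never called.
        return {"dry_run": True, "request": request}
    return {"dry_run": False, "request": request, "response": send(request)}
```

Keeping request construction shared between both modes is the design point: a dry run that takes a different code path can pass while the real call fails.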

7. Agent Knowledge Packaging

Does the CLI ship knowledge in formats agents can consume at conversation start?

Score	Criteria
0	Only `--help` and a docs site. No agent-specific context files.
1	A `CONTEXT.md` or `AGENTS.md` with basic usage guidance.
2	Structured skill files (YAML frontmatter + Markdown) covering per-command or per-API-surface workflows and invariants.
3	Comprehensive skill library encoding agent-specific guardrails ("always use --dry-run", "always use --fields"). Skills are versioned, discoverable, and follow a standard like OpenClaw.

Interpreting the Total

Range	Rating	Description
0–5	Human-only	Built for humans. Agents will struggle with parsing, hallucinate inputs, and lack safety rails.
6–10	Agent-tolerant	Agents can use it, but they'll waste tokens, make avoidable errors, and require heavy prompt engineering to compensate.
11–15	Agent-ready	Solid agent support. Structured I/O, input validation, and some introspection. A few gaps remain.
16–21	Agent-first	Purpose-built for agents. Full schema introspection, comprehensive input hardening, safety rails, and packaged agent knowledge.
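
The arithmetic is simple enough to do by hand, but a short Python sketch makes the bands unambiguous. The axis keys and the `rate` function are names chosen for this example; the thresholds come straight from the table above.

```python
from typing import Dict, Tuple

AXES = (
    "machine_readable_output",
    "raw_payload_input",
    "schema_introspection",
    "context_window_discipline",
    "input_hardening",
    "safety_rails",
    "agent_knowledge_packaging",
)

# Lower bound of each band, highest first.
BANDS = ((16, "Agent-first"), (11, "Agent-ready"), (6, "Agent-tolerant"), (0, "Human-only"))


def rate(scores: Dict[str, int]) -> Tuple[int, str]:
    """Sum the seven axis scores (0 to 3 each) and map the total to a rating."""
    if set(scores) != set(AXES):
        raise ValueError("scores must cover exactly the seven axes")
    for axis, score in scores.items():
        if score not in (0, 1, 2, 3):
            raise ValueError(f"{axis}: score must be 0, 1, 2, or 3")
    total = sum(scores.values())
    rating = next(label for floor, label in BANDS if total >= floor)
    return total, rating
```

For instance, a CLI scoring 1 on five axes and 3 on two totals 11, which lands at the bottom of the Agent-ready band.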

Bonus: Multi-Surface Readiness

Not scored, but note whether the CLI exposes multiple agent surfaces from the same binary:

MCP (stdio JSON-RPC) — typed tool invocation, no shell escaping
Extension / plugin install — agent treats the CLI as a native capability
Headless auth — env vars for tokens/credentials, no browser redirect required
