Create CLI
Design CLI surface area (syntax + behavior), agent-aware, human-friendly.
Phase 1 — Prepare
Read
${CLAUDE_PLUGIN_ROOT}/skills/create-cli/references/cli-guidelines.md
. Apply it as the default CLI rubric, including the Agent Ergonomics section.
For new CLI designs, also read
${CLAUDE_PLUGIN_ROOT}/skills/create-cli/references/language-selection.md
to inform the language recommendation in Phase 2. Skip it for audits — the language is already chosen.
Proceed when cli-guidelines.md is loaded.
Phase 2 — Clarify
Determine whether this is a new design or an audit from the user's trigger.
New design
Ask, then proceed with best-guess defaults if user is unsure:
- Command name + one-sentence purpose.
- Primary consumer: agent/LLM, human at a terminal, scripted automation, or mixed.
- Input sources: args vs stdin; files vs URLs; secrets (never via flags).
- Output contract: human text by default, for structured output, exit codes.
- Interactivity: prompts allowed? need ? confirmations for destructive ops?
- Config model: flags/env/config-file; precedence; XDG vs repo-local.
- Language & distribution: ask for the user's preferred implementation language, or offer to
recommend one. Ask whether a single binary (no runtime needed on target machine) is required,
or whether a runtime dependency is acceptable. Apply language-selection.md to recommend if
the user is unsure. Platform: macOS/Linux/Windows.
If an existing CLI spec or tool description is provided, read it first — skip questions already answered by it.
Audit
Ask:
- CLI name and source location (repo path, or provide output).
- Primary consumer: agent, human, or mixed.
- Known pain points or specific areas to focus on.
Then explore the codebase: use Glob/Grep to find command definitions, flag registrations, output formatting, and error handling. Run
via Bash to capture actual behavior.
Proceed when answers are confirmed or user is unsure — use best-guess defaults.
Phase 3 — Conventions
Apply the conventions from cli-guidelines.md (loaded in Phase 1), including the Agent Ergonomics section. The rules below are the key conventions to enforce — cli-guidelines.md provides the full rubric for edge cases.
If primary consumer is human-only, the Errors and Reduce Tool Calls subsections are optional — apply them only if the user wants script-friendliness.
Output
- Default output is human-readable text. gives structured JSON. Explicit is better than implicit — no TTY sniffing, no surprises.
- List commands in mode use NDJSON (one JSON object per line) — enables streaming and piping without buffering. For paginated results with metadata, a JSON object with an array is acceptable. If the CLI extends an existing ecosystem that uses JSON arrays (kubectl, aws, gh), match the ecosystem convention.
- Primary data to stdout; diagnostics/errors to stderr.
- Suppress ANSI codes, progress spinners, and decorative output when is passed or when stdout is not a TTY.
Errors (agent/mixed consumers only)
- When is active, emit error objects on stderr:
{"error": "<snake_case_code>", "message": "...", "hint": "<exact CLI invocation or null>"}
— so agent callers can route recovery logic without parsing free-text stderr. The field must be an executable command, not prose.
- Exit codes: success, runtime error, invalid usage; add command-specific codes only when genuinely useful.
Flags
- always shows help; ignores other args.
- prints version to stdout.
- preferred for structured output. / acceptable when the CLI needs multiple output formats (yaml, table, csv) under a single flag. Pick one and apply consistently.
- Consistent flag names across all subcommands for the same concept (, , ) — agents learn the naming pattern once and apply it everywhere without guessing.
- Prompts only when stdin is a TTY; disables prompts. acceptable if the ecosystem already uses it.
- Destructive operations: interactive confirmation; non-interactive requires .
- Respect , ; provide .
- Handle Ctrl-C: exit fast; bounded cleanup; crash-only when possible.
Reduce Tool Calls (agent/mixed consumers only)
- Compound output: operations return enough data to avoid a follow-up call. returns the new resource's ID and key fields. echoes what was removed.
- Rich JSON defaults: in mode, return full objects not just IDs.
- Bounded lists: list commands default to a safe limit (e.g., 50 items) with to adjust. In JSON mode, include (bool) and optionally for keyset pagination. Unbounded output wastes tokens and risks context overflow for agent callers.
- Idempotent by default: where possible, commands are safe to repeat; document preconditions explicitly — agents rely on safe retries for error recovery without human intervention.
Apply all applicable conventions, then proceed to Phase 4.
Phase 4 — Deliver
Audits
Evaluate the existing CLI against every Phase 3 subsection. For each convention, state: what the CLI does today, whether it conforms, and what to change. Also check:
- Flag naming consistency across subcommands.
- Help text quality (examples present, common flags first, fits one screen).
- Config precedence (flags > env > project config > user config > defaults).
- Destructive-op safety (confirmations, --force, --dry-run).
- Shell completion availability.
Produce a gap report organized by severity: Breaking (requires API change), Major (agent-breaking or convention violation), Minor (cosmetic/polish). Each finding: current behavior, convention violated, recommended fix with migration risk (none/low/breaking).
New designs
Produce a compact spec the user can implement. Include all relevant sections:
- Command tree + USAGE synopsis.
- Args/flags table (types, defaults, required/optional, examples).
- Subcommand semantics (what each does; idempotence; state changes).
- Output rules: stdout vs stderr; for structured output; /.
- Error + exit code map (top failure modes).
- Safety rules: , confirmations, , .
- Config/env rules + precedence (flags > env > project config > user config > system).
- Shell completion story (if relevant): install/discoverability; generation command or bundled scripts.
- 5–10 example invocations (common flows; include piped/stdin examples).
Use this skeleton, dropping irrelevant sections:
- Language & distribution: · · single binary · for CI
(Omit if language was not determined.)
- Name:
- One-liner:
- USAGE:
mycmd [global flags] <subcommand> [args]
- Subcommands:
- Global flags:
- / (define exactly)
- (structured JSON output; NDJSON for list commands)
- I/O contract:
- Exit codes:
- success
- generic failure
- invalid usage (parse/validation)
- (add command-specific codes only when actually useful)
- Env/config:
- env vars:
- config file path + precedence:
- Examples:
See
${CLAUDE_PLUGIN_ROOT}/skills/create-cli/examples/example-cli-spec.md
for a complete worked example.
If the spec is destined for a skill body or CLAUDE.md, omit unused sections entirely (do not mark them "N/A") and limit examples to ≤5 invocations that each demonstrate multiple patterns.
Phase 5 — Verify
For new specs: confirm the spec covers all applicable sections from the Phase 4 skeleton. Verify the examples section demonstrates at least:
output, error recovery (if agent/mixed consumer), and one piped/stdin usage.
For audits: confirm the gap report addresses every Phase 3 subsection and includes at least one example invocation showing the recommended fix for each Major finding.
Skill is complete when verification passes.
Notes
- Once language is selected (Phase 2), include the idiomatic parsing library in the spec (see language-selection.md). If language remains undetermined, omit the library.
- If the request is "design parameters", do not drift into implementation.