claude-code-audit

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Claude Code Audit

Claude Code 审计

What this is

这是什么

A disciplined framework for auditing a large sample of the user's Claude Code sessions to find step-change improvements — the 2–3 changes that would transform their workflow, not the 20 that would polish it.
The default behavior when asked "how can I improve my workflow" is to produce a long list of marginal suggestions. That output is worse than useless: it buries real leverage under pleasant-sounding noise, and the user does nothing with it. This skill exists to prevent that outcome.
一个严谨的框架,用于审计大量用户Claude Code会话样本,以找到阶跃式改进——即能彻底改变工作流的2-3项变化,而非仅做优化的20项小调整。
当被问及「如何改进我的工作流」时,常规做法是生成一长串边际性建议。这种输出弊大于利:它会将真正的可优化点淹没在听起来不错的噪音中,用户最终不会采取任何行动。本技能的存在就是为了避免这种结果。

The guarantee

保证

By the end of this audit you will have produced:
  1. A short prioritized report — the few findings that would actually move the needle, with quantified evidence.
  2. Drafted implementations — ready-to-install skills, CLAUDE.md rules, hook scripts, or settings diffs for the top recommendations. Not suggestions. Actual drafts.
  3. A paper trail — the extracted data, sampled session list, and subagent findings, so the user can verify the synthesis and revisit it later.
If you end the audit having produced only observations, you have failed.
审计结束时,你将产出:
  1. 一份简短的优先级报告——包含少数真正能带来显著改变的发现,并附有量化证据。
  2. 可直接落地的实现草案——针对顶级建议的可安装技能、CLAUDE.md规则、钩子脚本或设置差异。不是建议,是实际的草案。
  3. 纸质记录——提取的数据、采样会话列表和子代理发现,方便用户验证合成结果并日后回顾。
若审计结束时仅产出观察结果,即为失败。

Step-change vs marginal — hold this distinction the whole way

阶跃式 vs 边际性——全程坚守这一区分

MarginalStep change
"You should use skill X more often""Skill X never triggers on prompts like these — its description excludes the phrasing you actually use. Here's a new description."
"You correct Claude a lot about testing style""You've given the same correction 14 times in 30 days. The root pattern is Claude defaults to unit tests when you want integration tests. Here's a CLAUDE.md paragraph that eliminates the correction loop."
"Consider adding tests earlier""19 of your 60 sessions have a 'test afterward' → 'fix regression' → 'fix another regression' loop that averages 35 min and 80k tokens. A pre-commit hook that runs your existing test script would short-circuit it."
The left column is observation. The right column is diagnosis plus prescription plus evidence. You produce the right column.
边际性阶跃式
"你应该更频繁地使用技能X""技能X从未在这类提示下触发——其描述排除了你实际使用的措辞。以下是新的描述。"
"你经常纠正Claude的测试风格""30天内你已给出相同纠正14次。根本模式是Claude默认使用单元测试,而你需要的是集成测试。以下是一段可消除纠正循环的CLAUDE.md段落。"
"考虑更早添加测试""你的60次会话中有19次存在「事后测试」→「修复回归问题」→「修复另一个回归问题」的循环,平均耗时35分钟,消耗80k tokens。一个运行现有测试脚本的预提交钩子可打破这一循环。"
左列是观察结果,右列是诊断+处方+证据。你需要产出右列内容。

Where session data lives

会话数据存储位置

  • Root:
    ~/.claude/projects/<encoded-project-path>/<session-id>.jsonl
  • Each line is one event:
    user
    ,
    assistant
    ,
    attachment
    (hook result),
    system
    ,
    file-history-snapshot
    ,
    last-prompt
    .
  • The user is at
    ~/.claude/
    with global CLAUDE.md, skills, settings, and hooks. You'll cross-reference against these during synthesis.
  • See
    references/session_format.md
    for the JSONL schema and which fields matter.
  • 根目录:
    ~/.claude/projects/<encoded-project-path>/<session-id>.jsonl
  • 每一行代表一个事件:
    user
    assistant
    attachment
    (钩子结果)、
    system
    file-history-snapshot
    last-prompt
  • 用户全局的CLAUDE.md、技能、设置和钩子位于
    ~/.claude/
    。合成过程中你会交叉引用这些内容。
  • 查看
    references/session_format.md
    获取JSONL schema及关键字段说明。

Setup — working directory

设置——工作目录

Before starting, set
SKILL_DIR
and
AUDIT_DIR
:
  • SKILL_DIR=~/.claude/skills/claude-code-audit
    (where the scripts live — never modify this)
  • AUDIT_DIR=~/claude-audit/<date>
    (where the audit outputs go — create a fresh subdir per run, or use a path the user provided)
All
python $SKILL_DIR/scripts/...
invocations below assume these are set. Run from any
cwd
; the scripts take absolute paths. Create
AUDIT_DIR
before stage 1.
开始前,设置
SKILL_DIR
AUDIT_DIR
  • SKILL_DIR=~/.claude/skills/claude-code-audit
    (脚本存放目录——请勿修改)
  • AUDIT_DIR=~/claude-audit/<date>
    (审计输出目录——每次运行创建新子目录,或使用用户指定路径)
以下所有
python $SKILL_DIR/scripts/...
调用均假设已设置上述变量。可在任意
cwd
运行;脚本使用绝对路径。在第一阶段前创建
AUDIT_DIR

The five stages

五个阶段

Create a TaskList for these so the user can see progress. This is a long-running analysis — 20 to 60 minutes depending on sample size.
创建任务列表,方便用户查看进度。这是一项耗时较长的分析——根据样本大小,需20至60分钟。

Stage 1 — Inventory

阶段1——盘点

bash
python $SKILL_DIR/scripts/inventory.py --out $AUDIT_DIR/inventory.json
Catalogs every session: path, project, timestamp, size, rough message/turn/tool counts, token totals, version, git branch. Cheap, deterministic. Read the summary it prints.
bash
python $SKILL_DIR/scripts/inventory.py --out $AUDIT_DIR/inventory.json
记录所有会话:路径、项目、时间戳、大小、大致消息/轮次/工具数量、token总数、版本、git分支。操作简单、结果确定。阅读其打印的摘要。

Stage 2 — Sample

阶段2——采样

bash
python $SKILL_DIR/scripts/sample.py --inventory $AUDIT_DIR/inventory.json --count 60 --days 60 --out $AUDIT_DIR/sample.json
Stratified sample weighted toward recency but diverse across projects and session sizes. Print the distribution. Tell the user the shape of the sample in one sentence ("60 sessions: 40 from last 2 weeks, 20 older; spanning X projects") and proceed unless they push back.
bash
python $SKILL_DIR/scripts/sample.py --inventory $AUDIT_DIR/inventory.json --count 60 --days 60 --out $AUDIT_DIR/sample.json
分层采样,侧重近期会话,但涵盖不同项目和会话规模。打印分布情况。用一句话告知用户样本情况(如“60次会话:40次来自过去2周,20次更早;覆盖X个项目”),除非用户反对,否则继续。

Stage 3 — Extract

阶段3——提取

bash
python $SKILL_DIR/scripts/extract.py --sample $AUDIT_DIR/sample.json --out $AUDIT_DIR/extracted/
For each sampled session, writes a JSON with: user prompts (real ones, not command wrappers), assistant tool-use sequence, error counts, skill/command invocations, token usage, approximate wall-clock duration, first-prompt intent, "correction" markers (user messages that follow a failed tool or that use corrective language), and session outcome hints.
Then aggregate:
bash
python $SKILL_DIR/scripts/extract.py --aggregate $AUDIT_DIR/extracted/ --out $AUDIT_DIR/aggregate.json
You now have the quantitative layer. Skim it. Flag any surprises — surprises are leads.
bash
python $SKILL_DIR/scripts/extract.py --sample $AUDIT_DIR/sample.json --out $AUDIT_DIR/extracted/
对每个采样会话,生成包含以下内容的JSON:用户提示(真实提示,而非命令包装器)、助手工具使用序列、错误计数、技能/命令调用、token使用量、大致耗时、首次提示意图、“纠正”标记(跟随失败工具的用户消息或使用纠正性语言的消息)、会话结果提示。
然后进行聚合:
bash
python $SKILL_DIR/scripts/extract.py --aggregate $AUDIT_DIR/extracted/ --out $AUDIT_DIR/aggregate.json
现在你已获得量化层面的数据。浏览数据,标记任何意外发现——意外发现是线索。

Stage 4 — Deep read via parallel subagents

阶段4——通过并行子代理深度解读

Scripts miss the qualitative signal: frustration, confusion, circularity, breakthrough moments, mental-model mismatches. Subagents handle this.
Split the sample into batches of 8–10 sessions. Launch all batches in a single message (parallel, not sequential). For each batch, spawn an
Explore
subagent with the briefing from
references/subagent_brief.md
plus:
  • The list of session file paths in its batch.
  • The path to
    $AUDIT_DIR/aggregate.json
    so it has global context.
  • The path to
    $SKILL_DIR/references/step_change_patterns.md
    as its lens.
  • The user's current skills dir (
    ~/.claude/skills
    ), CLAUDE.md (
    ~/.claude/CLAUDE.md
    ), and settings.json (
    ~/.claude/settings.json
    ), for cross-reference.
  • Tell it to use
    python $SKILL_DIR/scripts/render.py <session-path>
    to get readable transcripts. Using render.py is critical — raw JSONL will blow its context.
Each subagent returns a structured JSON of findings. Save to
$AUDIT_DIR/batch-<N>-findings.json
.
脚本会遗漏定性信号:挫败感、困惑、循环往复、突破时刻、心智模型不匹配。子代理可处理这些内容。
将样本分成8-10次会话的批次。在单条消息中启动所有批次(并行,而非顺序)。对每个批次,生成一个
Explore
子代理,附带
references/subagent_brief.md
中的说明,以及:
  • 批次中的会话文件路径列表。
  • $AUDIT_DIR/aggregate.json
    的路径,以便获取全局上下文。
  • $SKILL_DIR/references/step_change_patterns.md
    的路径,作为分析视角。
  • 用户当前的技能目录(
    ~/.claude/skills
    )、CLAUDE.md(
    ~/.claude/CLAUDE.md
    )和settings.json(
    ~/.claude/settings.json
    ),用于交叉引用。
  • 告知其使用
    python $SKILL_DIR/scripts/render.py <session-path>
    获取可读的会话记录。使用render.py至关重要——原始JSONL会超出上下文限制。
每个子代理返回结构化的JSON结果。保存至
$AUDIT_DIR/batch-<N>-findings.json

Stage 5 — Synthesize

阶段5——合成

This is where the skill earns its name.
  1. Read every batch-findings file and the aggregate.
  2. Cluster findings across batches. A pattern that appears in one session is noise; a pattern in eight is signal.
  3. Rank clusters by frequency × severity × fix leverage. Leverage = how many future sessions the fix would affect. A fix that helps one niche workflow is not leverage; a fix that removes friction from every coding session is.
  4. Apply
    references/step_change_patterns.md
    as a filter — is each top cluster a marginal tweak or a step change? If it's marginal, either promote it to something bigger (find the underlying doctrine) or drop it.
  5. Write the report using
    references/report_template.md
    .
  6. Draft the fixes. Every top-tier recommendation gets a real draft, not a description of a draft. New skill? Write the SKILL.md. New rule? Write the paragraph, word for word. New hook? Write the script. Settings change? Show the diff.
这是本技能的核心价值所在。
  1. 阅读所有批次结果文件和聚合数据。
  2. 跨批次聚类发现。仅出现在一次会话中的模式是噪音;出现在八次会话中的模式是信号。
  3. 频率×严重程度×修复杠杆率对聚类结果排序。杠杆率=修复措施将影响多少未来会话。仅帮助 niche 工作流的修复不具备杠杆率;消除每次编码会话中摩擦的修复才是高杠杆率。
  4. references/step_change_patterns.md
    为过滤器——每个顶级聚类是边际调整还是阶跃式改进?如果是边际调整,要么将其升级为更具影响力的改进(找到底层原则),要么舍弃。
  5. 使用
    references/report_template.md
    撰写报告。
  6. 起草修复方案。每个顶级建议都要有真实的草案,而非草案描述。新技能?撰写SKILL.md。新规则?逐字逐句撰写段落。新钩子?撰写脚本。设置变更?展示差异。

Non-negotiables

不可协商的规则

You draft the fix. If you find yourself writing "the user could consider adding a skill for X", stop and write the skill. If you write "a CLAUDE.md rule about Y would help", stop and write the exact paragraph. This is the single most important thing this skill does. Skip it and the audit is worthless.
Short over complete. A 3-item report the user acts on beats a 30-item report they skim and forget. Cut everything that isn't leverage.
Numbers, not adjectives. "Frequently", "often", "sometimes" are lies. Say "17 of 60 sessions" or "23 corrections across the sample". The scripts give you exact counts — use them.
Cite the sessions. Every claim in the report names specific session IDs as evidence. The user must be able to open a session and verify the pattern.
Audit Claude, not the user. If the user keeps correcting the same thing, Claude's defaults are wrong for them — not the other way around. Don't scold the user for their prompting. Find the pattern in Claude's behavior that forces the correction, and write the fix to Claude's configuration.
你必须起草修复方案。如果你发现自己在写“用户可以考虑添加X技能”,请停下来直接撰写该技能。如果你写“关于Y的CLAUDE.md规则会有所帮助”,请停下来撰写确切的段落。这是本技能最重要的工作。跳过这一步,审计毫无价值。
简短优于完整。一份用户会采取行动的3项建议报告,胜过一份用户浏览后遗忘的30项建议报告。砍掉所有非高杠杆率的内容。
用数字,不用形容词。“频繁”“经常”“有时”都是模糊表述。要说“60次会话中的17次”或“样本中23次纠正”。脚本会给出精确计数——请使用它们。
引用会话。报告中的每个主张都要列出具体的会话ID作为证据。用户必须能够打开会话并验证模式。
审计Claude,而非用户。如果用户不断纠正同一问题,说明Claude的默认设置不适合他们——而非用户的问题。不要指责用户的提示方式。找出Claude导致用户必须纠正的行为模式,并修改Claude的配置来解决问题。

Common failure modes

常见失败模式

  • Observation dump. Listing 40 patterns with no priority. Fix: be brutal in cutting.
  • Self-congratulation. Only describing things Claude did well. Fix: deliberately hunt for failures, frustrations, and circling.
  • Vague prescriptions. "Consider improving X." Fix: you draft the improvement or you don't include it.
  • Project tunnel vision. Reporting only on the most-active project. Fix: sample forces diversity; honor it in the write-up.
  • Ignoring the meta-level. Focusing only on in-session behavior and missing cross-session patterns (sessions ending prematurely, same context being rebuilt every time, etc.). Fix:
    references/step_change_patterns.md
    explicitly calls these out; reread before synthesis.
  • 观察结果堆砌。列出40个无优先级的模式。解决方法:果断删减。
  • 自我吹捧。仅描述Claude做得好的地方。解决方法:刻意寻找失败、挫败和循环往复的情况。
  • 模糊的建议。“考虑改进X”。解决方法:要么起草改进方案,要么不纳入。
  • 项目视野狭隘。仅报告最活跃的项目。解决方法:采样确保多样性;在撰写时尊重这一点。
  • 忽略元层面。仅关注会话内行为,忽略跨会话模式(会话提前结束、每次都要重建相同上下文等)。解决方法:
    references/step_change_patterns.md
    明确列出了这些情况;合成前重新阅读。

Output layout

输出目录结构

$AUDIT_DIR/
  inventory.json            # stage 1
  sample.json               # stage 2
  extracted/<session>.json  # stage 3
  aggregate.json            # stage 3
  batch-<N>-findings.json   # stage 4
  report.md                 # stage 5 — the deliverable
  drafts/                   # stage 5 — the implementations
    skills/<skill-name>/SKILL.md
    claude-md-additions.md
    hooks/<hook-name>.sh
    settings-diff.json
Default
$AUDIT_DIR
to
~/claude-audit/<YYYY-MM-DD>/
unless the user specifies otherwise. Everything stays on disk so the user can rerun, verify, or share.
$AUDIT_DIR/
  inventory.json            # 阶段1
  sample.json               # 阶段2
  extracted/<session>.json  # 阶段3
  aggregate.json            # 阶段3
  batch-<N>-findings.json   # 阶段4
  report.md                 # 阶段5 — 交付物
  drafts/                   # 阶段5 — 实现方案
    skills/<skill-name>/SKILL.md
    claude-md-additions.md
    hooks/<hook-name>.sh
    settings-diff.json
默认
$AUDIT_DIR
~/claude-audit/<YYYY-MM-DD>/
,除非用户指定其他路径。所有内容都保存在磁盘上,方便用户重新运行、验证或分享。

Reference files

参考文件

  • references/session_format.md
    — JSONL schema and field meanings
  • references/step_change_patterns.md
    — taxonomy of leverage points; the lens for synthesis
  • references/report_template.md
    — required output structure
  • references/subagent_brief.md
    — prompt you pass to stage-4 subagents
  • references/session_format.md
    — JSONL schema及字段含义
  • references/step_change_patterns.md
    — 可优化点分类;合成时的分析视角
  • references/report_template.md
    — 要求的输出结构
  • references/subagent_brief.md
    — 传递给阶段4子代理的提示词