skill-comply: Automated Compliance Measurement
Measures whether coding agents actually follow skills, rules, or agent definitions by:
- Auto-generating expected behavioral sequences (specs) from any .md file
- Auto-generating scenarios with decreasing prompt strictness (supportive → neutral → competing)
- Running and capturing tool call traces via `claude -p` stream-json
- Classifying tool calls against spec steps using LLM (not regex)
- Checking temporal ordering deterministically
- Generating self-contained reports with spec, prompts, and timelines
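The trace-capture step above can be sketched as a minimal parser for `claude -p --output-format stream-json` output. This is an illustrative sketch, assuming each line of the stream is a JSON object and that tool invocations appear as `tool_use` content blocks inside assistant messages; the field names reflect that format in general, not this tool's actual source.

```python
import json

def extract_tool_calls(stream_lines):
    """Pull (tool_name, input) pairs, in order, from stream-json lines."""
    calls = []
    for line in stream_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Assistant messages carry content blocks; tool invocations
        # appear as blocks with type == "tool_use".
        message = event.get("message") or {}
        for block in message.get("content", []):
            if block.get("type") == "tool_use":
                calls.append((block["name"], block.get("input", {})))
    return calls

# Fabricated two-line trace for illustration:
trace = [
    '{"type":"assistant","message":{"content":[{"type":"tool_use","name":"Grep","input":{"pattern":"def run"}}]}}',
    '{"type":"assistant","message":{"content":[{"type":"tool_use","name":"Edit","input":{"file_path":"scripts/run.py"}}]}}',
]
print(extract_tool_calls(trace))
```

The ordered list of calls produced here is what the LLM classifier would then label against spec steps.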
Supported Targets
- Skills (`skills/*/SKILL.md`): Workflow skills like search-first, TDD guides
- Rules (`rules/common/*.md`): Mandatory rules like testing.md, security.md, git-workflow.md
- Agent definitions (`agents/*.md`): Whether an agent gets invoked when expected (internal workflow verification not yet supported)
When to Activate
- User runs `/skill-comply <path>`
- User asks "is this rule actually being followed?"
- After adding new rules/skills, to verify agent compliance
- Periodically as part of quality maintenance
Usage

```bash
# Full run
uv run python -m scripts.run ~/.claude/rules/common/testing.md

# Dry run (no cost, spec + scenarios only)
uv run python -m scripts.run --dry-run ~/.claude/skills/search-first/SKILL.md

# Custom models
uv run python -m scripts.run --gen-model haiku --model sonnet <path>
```
Key Concept: Prompt Independence
Measures whether a skill/rule is followed even when the prompt doesn't explicitly support it.
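To picture the three strictness levels, here are entirely hypothetical scenario prompts for a "write tests first" rule; the real prompts are auto-generated by the tool, so these serve only to show how the gradient pulls progressively harder against the rule.

```python
# Hypothetical prompts, ordered from most to least supportive of the rule.
scenarios = {
    "supportive": "Add a config-file parser. Remember to write the tests first.",
    "neutral": "Add a config-file parser.",
    "competing": "Add a config-file parser as fast as possible; skip anything non-essential.",
}

# A rule is prompt-independent if compliance holds even in the
# "competing" scenario, where the prompt works against the rule.
for level, prompt in scenarios.items():
    print(f"{level}: {prompt}")
```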
Report Contents
Reports are self-contained and include:
- Expected behavioral sequence (auto-generated spec)
- Scenario prompts (what was asked at each strictness level)
- Compliance scores per scenario
- Tool call timelines with LLM classification labels
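The deterministic temporal-ordering check behind these timelines can be sketched as an ordered-subsequence test: given the spec's expected step order and the classified labels of the actual tool calls, verify the steps occur in order, allowing unrelated calls in between. The step labels here are illustrative, not taken from the tool's spec format.

```python
def follows_order(expected_steps, observed_labels):
    """True if expected_steps appear in observed_labels as an ordered
    subsequence (other calls may be interleaved between them)."""
    it = iter(observed_labels)
    # "step in it" consumes the iterator up to the first match, so each
    # step must be found *after* the previous one.
    return all(step in it for step in expected_steps)

# Spec: search before editing, test after editing.
spec = ["search", "edit", "test"]
print(follows_order(spec, ["search", "read", "edit", "test"]))  # True
print(follows_order(spec, ["edit", "search", "test"]))          # False
```

Because the check is a pure function of the spec and the labeled timeline, it needs no LLM call; only the labeling step does.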
Advanced (optional)
For users familiar with hooks, reports also include hook promotion recommendations for steps with low compliance. This is informational — the main value is the compliance visibility itself.