skill-comply


skill-comply: Automated Compliance Measurement


Measures whether coding agents actually follow skills, rules, or agent definitions by:
  1. Auto-generating expected behavioral sequences (specs) from any .md file
  2. Auto-generating scenarios with decreasing prompt strictness (supportive → neutral → competing)
  3. Running `claude -p` and capturing tool call traces via stream-json
  4. Classifying tool calls against spec steps using LLM (not regex)
  5. Checking temporal ordering deterministically
  6. Generating self-contained reports with spec, prompts, and timelines
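Step 5, the deterministic temporal-ordering check, can be sketched as follows. This is a minimal illustration under assumed data shapes, not the tool's actual implementation: the spec step IDs and the pre-classified call labels are hypothetical.

```python
def check_temporal_order(spec_steps, classified_calls):
    """Verify that classified tool calls respect the spec's step order.

    spec_steps: ordered step IDs from the generated spec,
        e.g. ["search", "write_test", "implement"] (hypothetical names).
    classified_calls: the step ID the LLM classifier assigned to each
        tool call, in chronological order (unmatched calls excluded).
    Returns the calls that occurred after a later spec step already ran.
    """
    order = {step: i for i, step in enumerate(spec_steps)}
    latest_seen = -1
    violations = []
    for call in classified_calls:
        idx = order[call]
        if idx < latest_seen:
            violations.append(call)  # this step came after a later one
        latest_seen = max(latest_seen, idx)
    return violations


# A run that implements before writing a test violates TDD ordering:
print(check_temporal_order(
    ["search", "write_test", "implement"],
    ["search", "implement", "write_test"],
))  # -> ['write_test']
```

Because this check is a pure comparison of indices, it needs no LLM call: only the classification of each tool call (step 4) is probabilistic, while the ordering verdict is reproducible.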

Supported Targets


  • Skills (`skills/*/SKILL.md`): Workflow skills like search-first, TDD guides
  • Rules (`rules/common/*.md`): Mandatory rules like testing.md, security.md, git-workflow.md
  • Agent definitions (`agents/*.md`): Whether an agent gets invoked when expected (internal workflow verification not yet supported)
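The three target kinds above are distinguishable from the file path alone. A minimal sketch of that dispatch, assuming the directory layout shown in the list (the function name is hypothetical, not part of the tool's API):

```python
from pathlib import Path


def detect_target_type(path: str) -> str:
    """Infer which compliance target a markdown file represents.

    Mirrors the documented layout: skills/*/SKILL.md, rules/common/*.md,
    and agents/*.md. Raises for anything outside those patterns.
    """
    p = Path(path)
    if p.name == "SKILL.md":
        return "skill"
    if "rules" in p.parts:
        return "rule"
    if "agents" in p.parts:
        return "agent"
    raise ValueError(f"unrecognized compliance target: {path}")


print(detect_target_type("skills/search-first/SKILL.md"))  # skill
print(detect_target_type("rules/common/testing.md"))       # rule
```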

When to Activate


  • User runs `/skill-comply <path>`
  • User asks "is this rule actually being followed?"
  • After adding new rules/skills, to verify agent compliance
  • Periodically as part of quality maintenance

Usage



Full run


```bash
uv run python -m scripts.run ~/.claude/rules/common/testing.md
```

Dry run (no cost, spec + scenarios only)


```bash
uv run python -m scripts.run --dry-run ~/.claude/skills/search-first/SKILL.md
```

Custom models


```bash
uv run python -m scripts.run --gen-model haiku --model sonnet <path>
```

Key Concept: Prompt Independence


Measures whether a skill/rule is followed even when the prompt doesn't explicitly support it.
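One way to turn the three per-scenario scores into a single robustness figure is to take the minimum: a rule that is only followed when the prompt supports it scores high on the supportive scenario but low on the competing one, and the minimum exposes that gap. This aggregation is an assumption for illustration, not necessarily how the tool scores reports:

```python
def prompt_independence(scores: dict) -> float:
    """Summarize compliance across prompt-strictness levels.

    scores maps each scenario level to a compliance fraction in [0, 1].
    The minimum measures how robust compliance is to prompt pressure.
    """
    for level in ("supportive", "neutral", "competing"):
        if level not in scores:
            raise KeyError(f"missing scenario: {level}")
    return min(scores.values())


# High supportive compliance but low competing compliance means the
# rule is prompt-dependent, so the summary score stays low:
print(prompt_independence({"supportive": 0.9, "neutral": 0.8, "competing": 0.4}))  # 0.4
```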

Report Contents


Reports are self-contained and include:
  1. Expected behavioral sequence (auto-generated spec)
  2. Scenario prompts (what was asked at each strictness level)
  3. Compliance scores per scenario
  4. Tool call timelines with LLM classification labels
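The four report ingredients above map naturally onto a small data model. A sketch of one possible shape, with hypothetical field names (the tool's actual report schema may differ):

```python
from dataclasses import dataclass, field


@dataclass
class ScenarioResult:
    strictness: str                  # "supportive" | "neutral" | "competing"
    prompt: str                      # the scenario prompt that was issued
    compliance: float                # fraction of spec steps satisfied
    timeline: list = field(default_factory=list)  # tool calls + LLM labels


@dataclass
class Report:
    target: str                      # path to the .md file under test
    spec: list                       # expected behavioral sequence
    scenarios: list = field(default_factory=list)  # one ScenarioResult each
```

Keeping the spec and the prompts inside the report object is what makes the report self-contained: a reader can judge each score without access to the original run.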

Advanced (optional)


For users familiar with hooks, reports also include hook promotion recommendations for steps with low compliance. This is informational — the main value is the compliance visibility itself.