audit-agents-skills


Audit Agents/Skills/Commands (Advanced Skill)


Comprehensive quality audit system for Claude Code agents, skills, and commands. Provides quantitative scoring, comparative analysis, and production readiness grading based on industry best practices.

Purpose


Problem: Manual validation of agents/skills is error-prone and inconsistent. According to the LangChain Agent Report 2026, 29.5% of organizations deploy agents without systematic evaluation, and 18% of teams cite "agent bugs" as their top challenge.
Solution: Automated quality scoring across 16 weighted criteria with production readiness thresholds (80% = Grade B minimum for production deployment).
Key Features:
  • Quantitative scoring (32 points for agents/skills, 20 for commands)
  • Weighted criteria (Identity 3x, Prompt 2x, Validation 1x, Design 2x)
  • Production readiness grading (A-F scale with 80% threshold)
  • Comparative analysis vs reference templates
  • JSON/Markdown dual output for programmatic integration
  • Fix suggestions for failing criteria


Modes


| Mode | Usage | Output |
|------|-------|--------|
| Quick Audit | Top-5 critical criteria only | Fast pass/fail (3-5 min for 20 files) |
| Full Audit | All 16 criteria per file | Detailed scores + recommendations (10-15 min) |
| Comparative | Full + benchmark vs templates | Analysis + gap identification (15-20 min) |

Default: Full Audit (recommended for first run)


Methodology


Why These Criteria?


The 16-criteria framework is derived from:
  1. Claude Code Best Practices (Ultimate Guide line 4921: Agent Validation Checklist)
  2. Industry Data (LangChain Agent Report 2026: evaluation gaps)
  3. Production Failures (Community feedback on hardcoded paths, missing error handling)
  4. Composition Patterns (Skills should reference other skills, agents should be modular)

Scoring Philosophy


Weight Rationale:
  • Identity (3x): If users can't find/invoke the agent, quality is irrelevant (discoverability > quality)
  • Prompt (2x): Determines reliability and accuracy of outputs
  • Validation (1x): Improves robustness but is secondary to core functionality
  • Design (2x): Impacts long-term maintainability and scalability
Grade Standards:
  • A (90-100%): Production-ready, minimal risk
  • B (80-89%): Good, meets production threshold
  • C (70-79%): Needs improvement before production
  • D (60-69%): Significant gaps, not production-ready
  • F (<60%): Critical issues, requires major refactoring
Industry Alignment: The 80% threshold aligns with software engineering best practices for production deployment (e.g., code coverage >80%, security scan pass rates).
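The grade bands translate directly into a threshold lookup. A minimal sketch (function names are illustrative, not part of the skill's interface):

```python
def score_to_grade(score):
    """Map a 0-100 score to the A-F grade bands defined above."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"

def production_ready(score):
    """80% threshold: Grade B or better gates production deployment."""
    return score >= 80

print(score_to_grade(82.5))  # B
```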


Workflow


Phase 1: Discovery


  1. Scan directories:
    .claude/agents/
    .claude/skills/
    .claude/commands/
    examples/agents/      (if exists)
    examples/skills/      (if exists)
    examples/commands/    (if exists)
  2. Classify files by type (agent/skill/command)
  3. Load reference templates (for Comparative mode):
    guide/examples/agents/     (benchmark files)
    guide/examples/skills/     (benchmark files)
    guide/examples/commands/   (benchmark files)
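Steps 1-2 above amount to a path-based scan. A minimal sketch, assuming the directory layout listed (the helper name `discover` is illustrative):

```python
from pathlib import Path

# Directory layout as listed in Phase 1
SCAN_DIRS = {
    "agent": [".claude/agents", "examples/agents"],
    "skill": [".claude/skills", "examples/skills"],
    "command": [".claude/commands", "examples/commands"],
}

def discover(project_root):
    """Scan known directories and classify each markdown file by type."""
    found = []
    root = Path(project_root)
    for file_type, dirs in SCAN_DIRS.items():
        for d in dirs:
            base = root / d
            if not base.is_dir():
                continue  # e.g. examples/ may not exist
            for path in base.rglob("*.md"):
                found.append({"path": str(path), "type": file_type})
    return found
```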

Phase 2: Scoring Engine


Load scoring criteria from `scoring/criteria.yaml`:

```yaml
agents:
  max_points: 32
  categories:
    identity:
      weight: 3
      criteria:
        - id: A1.1
          name: "Clear name"
          points: 3
          detection: "frontmatter.name exists and is descriptive"
        # ... (16 total criteria)
```

For each file:
  1. Parse frontmatter (YAML)
  2. Extract content sections
  3. Run detection patterns (regex, keyword search)
  4. Calculate score: `(points / max_points) × 100`
  5. Assign grade (A-F)
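Steps 4-5 can be sketched as follows, assuming the detection patterns have already produced a pass/fail result per criterion (data shapes mirror `criteria.yaml`; the helper name is illustrative):

```python
def score_file(criteria_results, max_points):
    """criteria_results: list of (points_possible, passed) per criterion."""
    obtained = sum(pts for pts, passed in criteria_results if passed)
    pct = obtained / max_points * 100
    grade = next(g for g, cutoff in
                 [("A", 90), ("B", 80), ("C", 70), ("D", 60), ("F", 0)]
                 if pct >= cutoff)
    return {"points_obtained": obtained, "points_max": max_points,
            "score": round(pct, 1), "grade": grade}
```

With 25 of 32 points obtained this yields roughly 78%, Grade C, matching the example file in Phase 3.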

Phase 3: Comparative Analysis (Comparative Mode Only)


For each project file:
  1. Find closest matching template (by description similarity)
  2. Compare scores per criterion
  3. Identify gaps: `template_score - project_score`
  4. Flag significant gaps (>10 points difference)

Example:

```
Project file: .claude/agents/debugging-specialist.md (Score: 78%, Grade C)
Closest template: examples/agents/debugging-specialist.md (Score: 94%, Grade A)

Gaps:
- Anti-hallucination measures: -2 points (template has, project missing)
- Edge cases documented: -1 point (template has 5 examples, project has 1)
- Integration documented: -1 point (template references 3 skills, project none)

Total gap: 16 points (explains C vs A difference)
```
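The per-criterion comparison in steps 2-4 can be sketched as a dictionary diff (criterion IDs and the helper name are illustrative):

```python
def find_gaps(project_scores, template_scores, flag_threshold=10):
    """Compare per-criterion points and flag a significant total gap.

    Both arguments map criterion id -> points obtained.
    """
    gaps = {cid: template_scores[cid] - project_scores.get(cid, 0)
            for cid in template_scores}
    total_gap = sum(gaps.values())
    return {
        "per_criterion": {cid: g for cid, g in gaps.items() if g > 0},
        "total_gap": total_gap,
        "significant": total_gap > flag_threshold,
    }
```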

Phase 4: Report Generation


Markdown Report (`audit-report.md`):
  • Summary table (overall + by type)
  • Individual scores with top issues
  • Detailed breakdown per file (collapsible)
  • Prioritized recommendations

JSON Output (`audit-report.json`):

```json
{
  "metadata": {
    "project_path": "/path/to/project",
    "audit_date": "2026-02-07",
    "mode": "full",
    "version": "1.0.0"
  },
  "summary": {
    "overall_score": 82.5,
    "overall_grade": "B",
    "total_files": 15,
    "production_ready_count": 10,
    "production_ready_percentage": 66.7
  },
  "by_type": {
    "agents": { "count": 5, "avg_score": 85.2, "grade": "B" },
    "skills": { "count": 8, "avg_score": 78.9, "grade": "C" },
    "commands": { "count": 2, "avg_score": 92.0, "grade": "A" }
  },
  "files": [
    {
      "path": ".claude/agents/debugging-specialist.md",
      "type": "agent",
      "score": 78.1,
      "grade": "C",
      "points_obtained": 25,
      "points_max": 32,
      "failed_criteria": [
        {
          "id": "A2.4",
          "name": "Anti-hallucination measures",
          "points_lost": 2,
          "recommendation": "Add section on source verification"
        }
      ]
    }
  ],
  "top_issues": [
    {
      "issue": "Missing error handling",
      "affected_files": 8,
      "impact": "Runtime failures unhandled",
      "priority": "high"
    }
  ]
}
```
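The `summary` block can be derived mechanically from the per-file entries. A sketch assuming the JSON schema shown above (the helper name is illustrative):

```python
def summarize(files, threshold=80):
    """Aggregate per-file scores into the report's summary block."""
    ready = [f for f in files if f["score"] >= threshold]
    overall = sum(f["score"] for f in files) / len(files)
    return {
        "overall_score": round(overall, 1),
        "total_files": len(files),
        "production_ready_count": len(ready),
        "production_ready_percentage": round(len(ready) / len(files) * 100, 1),
    }
```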

Phase 5: Fix Suggestions (Optional)


For each failing criterion, generate an actionable fix:

```markdown
File: .claude/agents/debugging-specialist.md

Issue: Missing anti-hallucination measures (2 points lost)
Fix: Add this section after "Methodology":

## Source Verification

- Always cite sources for technical claims
- Use phrases: "According to [documentation]...", "Based on [tool output]..."
- If uncertain, state: "I don't have verified information on..."
- Never invent: statistics, version numbers, API signatures, stack traces
```

Detection: Grep for keywords: "verify", "cite", "source", "evidence"

---

Scoring Criteria


See `scoring/criteria.yaml` for complete definitions. Summary:

Agents (32 points max)


| Category | Weight | Criteria Count | Max Points |
|----------|--------|----------------|------------|
| Identity | 3x | 4 | 12 |
| Prompt Quality | 2x | 4 | 8 |
| Validation | 1x | 4 | 4 |
| Design | 2x | 4 | 8 |
Key Criteria:
  • Clear name (3 pts): Not generic like "agent1"
  • Description with triggers (3 pts): Contains "when"/"use"
  • Role defined (2 pts): "You are..." statement
  • 3+ examples (1 pt): Usage scenarios documented
  • Single responsibility (2 pts): Focused, not "general purpose"
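The identity criteria above lend themselves to simple frontmatter checks. A sketch (the generic-name list and helper names are illustrative assumptions, not the skill's actual detection rules):

```python
# Hypothetical placeholder names that fail the "Clear name" criterion
GENERIC_NAMES = {"agent", "agent1", "helper", "assistant", "test"}

def check_clear_name(frontmatter):
    """Clear name: exists, not a placeholder, reasonably descriptive."""
    name = (frontmatter or {}).get("name", "")
    return bool(name) and name.lower() not in GENERIC_NAMES and len(name) >= 4

def check_triggers(frontmatter):
    """Description with triggers: says when to invoke the agent."""
    desc = (frontmatter or {}).get("description", "").lower()
    return any(kw in desc for kw in ("when", "use", "trigger"))
```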

Skills (32 points max)


| Category | Weight | Criteria Count | Max Points |
|----------|--------|----------------|------------|
| Structure | 3x | 4 | 12 |
| Content | 2x | 4 | 8 |
| Technical | 1x | 4 | 4 |
| Design | 2x | 4 | 8 |

Key Criteria:
  • Valid SKILL.md (3 pts): Proper naming
  • Name valid (3 pts): Lowercase, 1-64 chars, no spaces
  • Methodology described (2 pts): Workflow section exists
  • No hardcoded paths (1 pt): No `/Users/`, `/home/`
  • Clear triggers (2 pts): "When to use" section
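The name and hardcoded-path criteria are mechanical enough to express as regex checks. A sketch of the rules listed above (allowing digits and hyphens in names is an assumption):

```python
import re

# Lowercase, 1-64 chars, no spaces; hyphens/digits assumed allowed
NAME_RE = re.compile(r"[a-z0-9-]{1,64}")
HARDCODED_PATH_RE = re.compile(r"(/Users/|/home/)")

def check_skill_name(name):
    return bool(NAME_RE.fullmatch(name))

def find_hardcoded_paths(content):
    """Return each hardcoded path prefix found in the file content."""
    return HARDCODED_PATH_RE.findall(content)
```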

Commands (20 points max)


| Category | Weight | Criteria Count | Max Points |
|----------|--------|----------------|------------|
| Structure | 3x | 4 | 12 |
| Quality | 2x | 4 | 8 |

Key Criteria:
  • Valid frontmatter (3 pts): name + description
  • Argument hint (3 pts): Required if the command uses `$ARGUMENTS`
  • Step-by-step workflow (3 pts): Numbered sections
  • Error handling (2 pts): Mentions failure modes
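The argument-hint criterion is conditional: it only applies when the command body uses `$ARGUMENTS`. A sketch (the frontmatter key `argument-hint` is an assumption about the command format):

```python
def check_argument_hint(frontmatter, body):
    """Pass if the command has no $ARGUMENTS, or documents a hint when it does."""
    if "$ARGUMENTS" not in body:
        return True  # criterion not applicable; treated as a pass
    return bool((frontmatter or {}).get("argument-hint"))
```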


Detection Patterns


Frontmatter Parsing


```python
import yaml
import re

def parse_frontmatter(content):
    match = re.search(r'^---\n(.*?)\n---', content, re.DOTALL)
    if match:
        return yaml.safe_load(match.group(1))
    return None
```

Keyword Detection


```python
def has_keywords(text, keywords):
    text_lower = text.lower()
    return any(kw in text_lower for kw in keywords)
```

Example


```python
has_trigger = has_keywords(description, ['when', 'use', 'trigger'])
has_error_handling = has_keywords(content, ['error', 'failure', 'fallback'])
```

Overlap Detection (Duplication Check)


```python
def jaccard_similarity(text1, text2):
    words1 = set(text1.lower().split())
    words2 = set(text2.lower().split())
    intersection = words1 & words2
    union = words1 | words2
    return len(intersection) / len(union) if union else 0
```

Flag if similarity > 0.5 (50% keyword overlap)


```python
if jaccard_similarity(desc1, desc2) > 0.5:
    issues.append("High overlap with another file")
```

Token Counting (Approximate)


```python
def estimate_tokens(text):
    # Rough estimate: 1 token ≈ 0.75 words
    word_count = len(text.split())
    return int(word_count * 1.3)
```

Check budget


```python
tokens = estimate_tokens(file_content)
if tokens > 5000:
    issues.append("File too large (>5K tokens)")
```

---

Industry Context


Source: LangChain Agent Report 2026 (public report, pages 14-22)
Key Findings:
  • 29.5% of organizations deploy agents without systematic evaluation
  • 18% cite "agent bugs" as their primary challenge
  • Only 12% use automated quality checks (88% manual or none)
  • 43% report difficulty maintaining agent quality over time
  • Top issues: Hallucinations (31%), poor error handling (28%), unclear triggers (22%)
Implications:
  1. Automation gap: Most teams rely on manual checklists (error-prone at scale)
  2. Quality debt: Agents deployed without validation accumulate technical debt
  3. Maintenance burden: 43% struggle with quality over time (no tracking system)
This skill addresses:
  • Automation: Replaces manual checklists with quantitative scoring
  • Tracking: JSON output enables trend analysis over time
  • Standards: 80% threshold provides clear production gate


Output Examples


Quick Audit (Top-5 Criteria)


```markdown
# Quick Audit: Agents/Skills/Commands

Files: 15 (5 agents, 8 skills, 2 commands)
Critical Issues: 3 files fail top-5 criteria

## Top-5 Criteria (Pass/Fail)

| File | Valid Name | Has Triggers | Error Handling | No Hardcoded Paths | Examples |
|------|------------|--------------|----------------|--------------------|----------|
| agent1.md | | | | | |
| skill2/ | | | | | |

## Action Required

1. Add error handling: 5 files
2. Remove hardcoded paths: 3 files
3. Add usage examples: 4 files
```

Full Audit


See Phase 4: Report Generation above for full structure.

Comparative (Full + Benchmarks)


```markdown
# Comparative Audit

## Project vs Templates

| File | Project Score | Template Score | Gap | Top Missing |
|------|---------------|----------------|-----|-------------|
| debugging-specialist.md | 78% (C) | 94% (A) | -16 pts | Anti-hallucination, edge cases |
| testing-expert/ | 85% (B) | 91% (A) | -6 pts | Integration docs |

## Recommendations

Focus on these gaps to reach template quality:
1. Anti-hallucination measures (8 files): Add source verification sections
2. Edge case documentation (5 files): Add failure scenario examples
3. Integration documentation (4 files): List compatible agents/skills
```

---

Usage


Basic (Full Audit)


```bash
# In Claude Code
Use skill: audit-agents-skills

# Specify path
Use skill: audit-agents-skills for ~/projects/my-app
```

With Options


```bash
# Quick audit (fast)
Use skill: audit-agents-skills with mode=quick

# Comparative (benchmark analysis)
Use skill: audit-agents-skills with mode=comparative

# Generate fixes
Use skill: audit-agents-skills with fixes=true

# Custom output path
Use skill: audit-agents-skills with output=~/Desktop/audit.json
```

JSON Output Only


```bash
# For programmatic integration
Use skill: audit-agents-skills with format=json output=audit.json
```

---

Integration with CI/CD


Pre-commit Hook


```bash
#!/bin/bash
# .git/hooks/pre-commit
# Run quick audit on changed agent/skill/command files

changed_files=$(git diff --cached --name-only | grep -E "^\.claude/(agents|skills|commands)/")
if [ -n "$changed_files" ]; then
  echo "Running quick audit on changed files..."
  # Run audit (requires Claude Code CLI wrapper)
  # Exit with 1 if any file scores <80%
fi
```

GitHub Actions


```yaml
name: Audit Agents/Skills
on: [pull_request]
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run quality audit
        run: |
          # Run audit skill
          # Parse JSON output
          # Fail if overall_score < 80
```
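The "parse JSON output, fail below 80" step could be filled in with a small gate script. A minimal sketch assuming the `audit-report.json` schema from Phase 4 (the script itself is illustrative):

```python
import json
import sys

def gate(report_path, threshold=80):
    """Return non-zero if the audit's overall score is below the threshold."""
    with open(report_path) as f:
        report = json.load(f)
    score = report["summary"]["overall_score"]
    if score < threshold:
        print(f"FAIL: overall score {score} < {threshold}")
        return 1
    print(f"PASS: overall score {score}")
    return 0

if __name__ == "__main__":
    sys.exit(gate(sys.argv[1] if len(sys.argv) > 1 else "audit-report.json"))
```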

Comparison: Command vs Skill


| Aspect | Command (`/audit-agents-skills`) | Skill (this file) |
|--------|----------------------------------|-------------------|
| Scope | Current project only | Multi-project, comparative |
| Output | Markdown report | Markdown + JSON |
| Speed | Fast (5-10 min) | Slower (10-20 min with comparative) |
| Depth | Standard 16 criteria | Same + benchmark analysis |
| Fix suggestions | Via `--fix` flag | Built-in with recommendations |
| Programmatic | Terminal output | JSON for CI/CD integration |
| Best for | Quick checks, dev workflow | Deep audits, quality tracking |

Recommendation: Use the command for daily checks, the skill for release gates and quality tracking.

Maintenance


Updating Criteria


Edit `scoring/criteria.yaml`:

```yaml
agents:
  categories:
    identity:
      criteria:
        - id: A1.5  # New criterion
          name: "API versioning specified"
          points: 3
          detection: "mentions API version or compatibility"
```

Version bump: Increment `version` in frontmatter when criteria change.

Adding File Types


To support new file types (e.g., "workflows"):
  1. Add to `scoring/criteria.yaml`:

```yaml
workflows:
  max_points: 24
  categories: [...]
```

  2. Update detection logic (file path patterns)
  3. Update report templates

Related


  • Command version: `.claude/commands/audit-agents-skills.md`
  • Agent Validation Checklist: guide line 4921 (manual 16 criteria)
  • Skill Validation: guide line 5491 (spec documentation)
  • Reference templates: `examples/agents/`, `examples/skills/`, `examples/commands/`

Changelog


v1.0.0 (2026-02-07):
  • Initial release
  • 16-criteria framework (agents/skills/commands)
  • 3 audit modes (quick/full/comparative)
  • JSON + Markdown output
  • Fix suggestions
  • Industry context (LangChain 2026 report)

Skill ready for use: `audit-agents-skills`