# Audit Agents/Skills/Commands (Advanced Skill)
Comprehensive quality audit system for Claude Code agents, skills, and commands. Provides quantitative scoring, comparative analysis, and production readiness grading based on industry best practices.
## Purpose
Problem: Manual validation of agents/skills is error-prone and inconsistent. According to the LangChain Agent Report 2026, 29.5% of organizations deploy agents without systematic evaluation, and "agent bugs" rank as the top challenge (cited by 18% of teams).
Solution: Automated quality scoring across 16 weighted criteria with production readiness thresholds (80% = Grade B minimum for production deployment).
Key Features:
- Quantitative scoring (32 points for agents/skills, 20 for commands)
- Weighted criteria (Identity 3x, Prompt 2x, Validation 1x, Design 2x)
- Production readiness grading (A-F scale with 80% threshold)
- Comparative analysis vs reference templates
- JSON/Markdown dual output for programmatic integration
- Fix suggestions for failing criteria
## Modes
| Mode | Usage | Output |
|---|---|---|
| Quick Audit | Top-5 critical criteria only | Fast pass/fail (3-5 min for 20 files) |
| Full Audit | All 16 criteria per file | Detailed scores + recommendations (10-15 min) |
| Comparative | Full + benchmark vs templates | Analysis + gap identification (15-20 min) |
Default: Full Audit (recommended for first run)
## Methodology

### Why These Criteria?
The 16-criteria framework is derived from:
- Claude Code Best Practices (Ultimate Guide line 4921: Agent Validation Checklist)
- Industry Data (LangChain Agent Report 2026: evaluation gaps)
- Production Failures (Community feedback on hardcoded paths, missing error handling)
- Composition Patterns (Skills should reference other skills, agents should be modular)
### Scoring Philosophy
Weight Rationale:
- Identity (3x): If users can't find/invoke the agent, quality is irrelevant (discoverability > quality)
- Prompt (2x): Determines reliability and accuracy of outputs
- Validation (1x): Improves robustness but is secondary to core functionality
- Design (2x): Impacts long-term maintainability and scalability
Grade Standards:
- A (90-100%): Production-ready, minimal risk
- B (80-89%): Good, meets production threshold
- C (70-79%): Needs improvement before production
- D (60-69%): Significant gaps, not production-ready
- F (<60%): Critical issues, requires major refactoring
Industry Alignment: The 80% threshold aligns with software engineering best practices for production deployment (e.g., code coverage >80%, security scan pass rates).
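As a sketch, the grade bands and the production gate above reduce to a small lookup (function names are illustrative, not the skill's actual implementation):

```python
def assign_grade(score_pct: float) -> str:
    """Map a 0-100 score onto the A-F scale described above."""
    bands = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]
    for threshold, grade in bands:
        if score_pct >= threshold:
            return grade
    return "F"

def is_production_ready(score_pct: float) -> bool:
    # 80% = Grade B minimum for production deployment
    return score_pct >= 80.0
```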
## Workflow

### Phase 1: Discovery
- Scan directories:
  - `.claude/agents/`
  - `.claude/skills/`
  - `.claude/commands/`
  - `examples/agents/` (if exists)
  - `examples/skills/` (if exists)
  - `examples/commands/` (if exists)
- Classify files by type (agent/skill/command)
- Load reference templates (for Comparative mode):
  - `guide/examples/agents/` (benchmark files)
  - `guide/examples/skills/` (benchmark files)
  - `guide/examples/commands/` (benchmark files)
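The classification step can be sketched as a directory-based lookup (a minimal illustration; the skill's real detection logic may differ):

```python
from pathlib import Path

def classify(path: str):
    """Classify a file as agent/skill/command by its parent directory."""
    parts = Path(path).parts
    for kind in ("agents", "skills", "commands"):
        if kind in parts:
            return kind[:-1]  # singular: agent / skill / command
    return None
```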
### Phase 2: Scoring Engine
Load scoring criteria from `scoring/criteria.yaml`:

```yaml
agents:
  max_points: 32
  categories:
    identity:
      weight: 3
      criteria:
        - id: A1.1
          name: "Clear name"
          points: 3
          detection: "frontmatter.name exists and is descriptive"
        # ... (16 total criteria)
```

For each file:
- Parse frontmatter (YAML)
- Extract content sections
- Run detection patterns (regex, keyword search)
- Calculate score: `(points / max_points) × 100`
- Assign grade (A-F)
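The per-file calculation can be sketched as follows (the `(points_available, passed)` shape is an assumption for illustration; the real criteria live in `scoring/criteria.yaml`):

```python
def score_file(results, max_points):
    """Compute the percentage score for one file.

    results: list of (points_available, passed) tuples, one per criterion.
    """
    obtained = sum(points for points, passed in results if passed)
    return round(obtained / max_points * 100, 1)
```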
### Phase 3: Comparative Analysis (Comparative Mode Only)
For each project file:
- Find closest matching template (by description similarity)
- Compare scores per criterion
- Identify gaps: `template_score - project_score`
- Flag significant gaps (>10 points difference)

Example:

```
Project file: .claude/agents/debugging-specialist.md (Score: 78%, Grade C)
Closest template: examples/agents/debugging-specialist.md (Score: 94%, Grade A)

Gaps:
- Anti-hallucination measures: -2 points (template has, project missing)
- Edge cases documented: -1 point (template has 5 examples, project has 1)
- Integration documented: -1 point (template references 3 skills, project none)

Total gap: 16 points (explains C vs A difference)
```
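The gap check in this example reduces to a simple comparison (a sketch; `compare_to_template` is a hypothetical helper name):

```python
def compare_to_template(project_score, template_score, threshold=10):
    """Return the score gap vs the template and whether it is significant.

    A gap above `threshold` points is flagged, per the >10-point rule above.
    """
    gap = template_score - project_score
    return gap, gap > threshold
```

For the example above, `compare_to_template(78, 94)` yields a 16-point gap, which is flagged as significant.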
### Phase 4: Report Generation
Markdown Report (`audit-report.md`):
- Summary table (overall + by type)
- Individual scores with top issues
- Detailed breakdown per file (collapsible)
- Prioritized recommendations

JSON Output (`audit-report.json`):

```json
{
  "metadata": {
    "project_path": "/path/to/project",
    "audit_date": "2026-02-07",
    "mode": "full",
    "version": "1.0.0"
  },
  "summary": {
    "overall_score": 82.5,
    "overall_grade": "B",
    "total_files": 15,
    "production_ready_count": 10,
    "production_ready_percentage": 66.7
  },
  "by_type": {
    "agents": { "count": 5, "avg_score": 85.2, "grade": "B" },
    "skills": { "count": 8, "avg_score": 78.9, "grade": "C" },
    "commands": { "count": 2, "avg_score": 92.0, "grade": "A" }
  },
  "files": [
    {
      "path": ".claude/agents/debugging-specialist.md",
      "type": "agent",
      "score": 78.1,
      "grade": "C",
      "points_obtained": 25,
      "points_max": 32,
      "failed_criteria": [
        {
          "id": "A2.4",
          "name": "Anti-hallucination measures",
          "points_lost": 2,
          "recommendation": "Add section on source verification"
        }
      ]
    }
  ],
  "top_issues": [
    {
      "issue": "Missing error handling",
      "affected_files": 8,
      "impact": "Runtime failures unhandled",
      "priority": "high"
    }
  ]
}
```
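Because the JSON schema is stable, downstream tooling can gate on it directly; a sketch using the field names above (`production_gate` is a hypothetical helper, not part of the skill):

```python
import json

def production_gate(report_path, threshold=80.0):
    """Return (passed, failing_files) from an audit-report.json."""
    with open(report_path) as f:
        report = json.load(f)
    failing = [entry["path"] for entry in report["files"]
               if entry["score"] < threshold]
    passed = report["summary"]["overall_score"] >= threshold and not failing
    return passed, failing
```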
### Phase 5: Fix Suggestions (Optional)
For each failing criterion, generate an actionable fix:

```markdown
File: .claude/agents/debugging-specialist.md
Issue: Missing anti-hallucination measures (2 points lost)
Fix:
Add this section after "Methodology":

## Source Verification

- Always cite sources for technical claims
- Use phrases: "According to [documentation]...", "Based on [tool output]..."
- If uncertain, state: "I don't have verified information on..."
- Never invent: statistics, version numbers, API signatures, stack traces
```

Detection: Grep for keywords: "verify", "cite", "source", "evidence"
---

## Scoring Criteria
See `scoring/criteria.yaml` for complete definitions. Summary:

### Agents (32 points max)
| Category | Weight | Criteria Count | Max Points |
|---|---|---|---|
| Identity | 3x | 4 | 12 |
| Prompt Quality | 2x | 4 | 8 |
| Validation | 1x | 4 | 4 |
| Design | 2x | 4 | 8 |
Key Criteria:
- Clear name (3 pts): Not generic like "agent1"
- Description with triggers (3 pts): Contains "when"/"use"
- Role defined (2 pts): "You are..." statement
- 3+ examples (1 pt): Usage scenarios documented
- Single responsibility (2 pts): Focused, not "general purpose"
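The "Clear name" check can be approximated with a pattern test (a sketch; the generic-name list is illustrative, and the real rules live in `scoring/criteria.yaml`):

```python
import re

def has_clear_name(frontmatter):
    """Reject missing names and generic ones like 'agent1' or 'skill2'."""
    name = (frontmatter or {}).get("name", "")
    if not name:
        return False
    # Generic patterns considered non-descriptive (illustrative list)
    return not re.fullmatch(r"(agent|skill|command|test|temp)\d*", name.lower())
```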
### Skills (32 points max)
| Category | Weight | Criteria Count | Max Points |
|---|---|---|---|
| Structure | 3x | 4 | 12 |
| Content | 2x | 4 | 8 |
| Technical | 1x | 4 | 4 |
| Design | 2x | 4 | 8 |
Key Criteria:
- Valid SKILL.md (3 pts): Proper naming
- Name valid (3 pts): Lowercase, 1-64 chars, no spaces
- Methodology described (2 pts): Workflow section exists
- No hardcoded paths (1 pt): No `/Users/`, `/home/`
- Clear triggers (2 pts): "When to use" section
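The hardcoded-path criterion can be sketched as a grep-style scan (the pattern list is illustrative, not exhaustive):

```python
import re

# Match user-specific absolute paths such as /Users/alice or /home/bob
HARDCODED_PATH = re.compile(r"/(Users|home)/\w+")

def has_hardcoded_paths(content):
    return bool(HARDCODED_PATH.search(content))
```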
### Commands (20 points max)
| Category | Weight | Criteria Count | Max Points |
|---|---|---|---|
| Structure | 3x | 4 | 12 |
| Quality | 2x | 4 | 8 |
Key Criteria:
- Valid frontmatter (3 pts): name + description
- Argument hint (3 pts): If uses `$ARGUMENTS`
- Step-by-step workflow (3 pts): Numbered sections
- Error handling (2 pts): Mentions failure modes
## Detection Patterns

### Frontmatter Parsing
```python
import yaml
import re

def parse_frontmatter(content):
    match = re.search(r'^---\n(.*?)\n---', content, re.DOTALL)
    if match:
        return yaml.safe_load(match.group(1))
    return None
```

### Keyword Detection
```python
def has_keywords(text, keywords):
    text_lower = text.lower()
    return any(kw in text_lower for kw in keywords)

# Example
has_trigger = has_keywords(description, ['when', 'use', 'trigger'])
has_error_handling = has_keywords(content, ['error', 'failure', 'fallback'])
```

### Overlap Detection (Duplication Check)
```python
def jaccard_similarity(text1, text2):
    words1 = set(text1.lower().split())
    words2 = set(text2.lower().split())
    intersection = words1 & words2
    union = words1 | words2
    return len(intersection) / len(union) if union else 0

# Flag if similarity > 0.5 (50% keyword overlap)
if jaccard_similarity(desc1, desc2) > 0.5:
    issues.append("High overlap with another file")
```

### Token Counting (Approximate)
```python
def estimate_tokens(text):
    # Rough estimate: 1 token ≈ 0.75 words (i.e., tokens ≈ words × 1.3)
    word_count = len(text.split())
    return int(word_count * 1.3)

# Check budget
tokens = estimate_tokens(file_content)
if tokens > 5000:
    issues.append("File too large (>5K tokens)")
```

---

## Industry Context
Source: LangChain Agent Report 2026 (public report, page 14-22)
Key Findings:
- **29.5%** of organizations deploy agents without systematic evaluation
- **18%** cite "agent bugs" as their primary challenge
- Only **12%** use automated quality checks (88% manual or none)
- **43%** report difficulty maintaining agent quality over time
- Top issues: Hallucinations (31%), poor error handling (28%), unclear triggers (22%)
Implications:
- Automation gap: Most teams rely on manual checklists (error-prone at scale)
- Quality debt: Agents deployed without validation accumulate technical debt
- Maintenance burden: 43% struggle with quality over time (no tracking system)
This skill addresses:
- Automation: Replaces manual checklists with quantitative scoring
- Tracking: JSON output enables trend analysis over time
- Standards: 80% threshold provides clear production gate
## Output Examples

### Quick Audit (Top-5 Criteria)
```markdown
# Quick Audit: Agents/Skills/Commands

Files: 15 (5 agents, 8 skills, 2 commands)
Critical Issues: 3 files fail top-5 criteria

## Top-5 Criteria (Pass/Fail)

| File | Valid Name | Has Triggers | Error Handling | No Hardcoded Paths | Examples |
|---|---|---|---|---|---|
| agent1.md | ✅ | ✅ | ❌ | ✅ | ❌ |
| skill2/ | ✅ | ❌ | ✅ | ❌ | ✅ |

## Action Required

- Add error handling: 5 files
- Remove hardcoded paths: 3 files
- Add usage examples: 4 files
```

### Full Audit
See Phase 4: Report Generation above for full structure.
### Comparative (Full + Benchmarks)
```markdown
# Comparative Audit

## Project vs Templates

| File | Project Score | Template Score | Gap | Top Missing |
|---|---|---|---|---|
| debugging-specialist.md | 78% (C) | 94% (A) | -16 pts | Anti-hallucination, edge cases |
| testing-expert/ | 85% (B) | 91% (A) | -6 pts | Integration docs |

## Recommendations

Focus on these gaps to reach template quality:
- Anti-hallucination measures (8 files): Add source verification sections
- Edge case documentation (5 files): Add failure scenario examples
- Integration documentation (4 files): List compatible agents/skills
```

---

## Usage
### Basic (Full Audit)
```bash
# In Claude Code
Use skill: audit-agents-skills

# Specify path
Use skill: audit-agents-skills for ~/projects/my-app
```

### With Options

```bash
# Quick audit (fast)
Use skill: audit-agents-skills with mode=quick

# Comparative (benchmark analysis)
Use skill: audit-agents-skills with mode=comparative

# Generate fixes
Use skill: audit-agents-skills with fixes=true

# Custom output path
Use skill: audit-agents-skills with output=~/Desktop/audit.json
```

### JSON Output Only

```bash
# For programmatic integration
Use skill: audit-agents-skills with format=json output=audit.json
```

---

## Integration with CI/CD
### Pre-commit Hook
```bash
#!/bin/bash
# .git/hooks/pre-commit
# Run quick audit on changed agent/skill/command files

changed_files=$(git diff --cached --name-only | grep -E "^\.claude/(agents|skills|commands)/")
if [ -n "$changed_files" ]; then
  echo "Running quick audit on changed files..."
  # Run audit (requires Claude Code CLI wrapper)
  # Exit with 1 if any file scores <80%
fi
```

### GitHub Actions
```yaml
name: Audit Agents/Skills
on: [pull_request]
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run quality audit
        run: |
          # Run audit skill
          # Parse JSON output
          # Fail if overall_score < 80
```

## Comparison: Command vs Skill
| Aspect | Command ( | Skill (this file) |
|---|---|---|
| Scope | Current project only | Multi-project, comparative |
| Output | Markdown report | Markdown + JSON |
| Speed | Fast (5-10 min) | Slower (10-20 min with comparative) |
| Depth | Standard 16 criteria | Same + benchmark analysis |
| Fix suggestions | Via | Built-in with recommendations |
| Programmatic | Terminal output | JSON for CI/CD integration |
| Best for | Quick checks, dev workflow | Deep audits, quality tracking |
Recommendation: Use command for daily checks, skill for release gates and quality tracking.
## Maintenance

### Updating Criteria
Edit `scoring/criteria.yaml`:

```yaml
agents:
  categories:
    identity:
      criteria:
        - id: A1.5  # New criterion
          name: "API versioning specified"
          points: 3
          detection: "mentions API version or compatibility"
```

Version bump: Increment `version` in frontmatter when criteria change.

### Adding File Types
To support new file types (e.g., "workflows"):

- Add to `scoring/criteria.yaml`:

  ```yaml
  workflows:
    max_points: 24
    categories: [...]
  ```

- Update detection logic (file path patterns)
- Update report templates
## Related

- Command version: `.claude/commands/audit-agents-skills.md`
- Agent Validation Checklist: guide line 4921 (manual 16 criteria)
- Skill Validation: guide line 5491 (spec documentation)
- Reference templates: `examples/agents/`, `examples/skills/`, `examples/commands/`
## Changelog
v1.0.0 (2026-02-07):
- Initial release
- 16-criteria framework (agents/skills/commands)
- 3 audit modes (quick/full/comparative)
- JSON + Markdown output
- Fix suggestions
- Industry context (LangChain 2026 report)
Skill ready for use: `audit-agents-skills`