skill-validator
Skill Validator
Validate any skill against production-level quality criteria.
Before Implementation
| Source | Gather |
|---|---|
| Skill Directory | SKILL.md, references/, scripts/, assets/ |
| Skill Type | Builder, Guide, Automation, Analyzer, or Validator |
| Conversation | Validation purpose (audit, improvement, review) |
What This Skill Does NOT Do
- Test skills in production environments
- Automatically fix identified issues
- Validate skill runtime behavior (only structure/content)
- Replace human judgment on domain accuracy
Validation Workflow
Phase 1: Gather Context
- Read the skill's SKILL.md completely
- Identify skill type from frontmatter description:
  - Builder skill (creates artifacts)
  - Guide skill (provides instructions)
  - Automation skill (executes workflows)
  - Analyzer skill (extracts insights)
  - Validator skill (enforces quality)
  - Hybrid skill (combination of above)
- Read all reference files in the `references/` directory
- Check for assets/ and scripts/ directories
- Note frontmatter fields (`name`, `description`, `allowed-tools`, `model`)
Phase 2: Apply Criteria
Evaluate against 9 criteria categories. Each criterion scores 0-3:
- 0: Missing/Absent
- 1: Present but inadequate
- 2: Adequate implementation
- 3: Excellent implementation
Criteria Categories
1. Structure & Anatomy (Weight: 12%)
| Criterion | What to Check |
|---|---|
| SKILL.md exists | Root file present |
| Line count | <500 lines (context is precious) |
| Frontmatter complete | |
| Name constraints | 1-64 chars; lowercase alphanumeric + hyphens; no consecutive hyphens; can't start/end with hyphen; must match directory name |
| Description format | [What] + [When] format; ≤1024 chars |
| Description style | Third-person: "This skill should be used when..." |
| No extraneous files | No README.md, CHANGELOG.md, LICENSE in skill dir |
| Progressive disclosure | Details in |
| Asset organization | Templates in |
| Large file guidance | If references >10k words, grep patterns in SKILL.md |
Fail condition: Missing SKILL.md or >800 lines = automatic fail
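The name, description, and line-count rules in the table above can be checked mechanically. The following is a minimal sketch, assuming the frontmatter has already been parsed into plain strings; the function names and the `"over-budget"` label for files between 500 and 800 lines are hypothetical and not part of the skill itself.

```python
import re

# Lowercase alphanumeric groups joined by single hyphens. This shape rules out
# leading hyphens, trailing hyphens, and consecutive hyphens in one pattern.
NAME_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def check_name(name: str, directory_name: str) -> list[str]:
    """Return the name-constraint violations found (empty list if none)."""
    problems = []
    if not 1 <= len(name) <= 64:
        problems.append("name must be 1-64 characters")
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name must be lowercase alphanumeric with single interior hyphens")
    if name != directory_name:
        problems.append("name must match directory name")
    return problems


def check_description(description: str) -> list[str]:
    """Return description violations; only the length limit is checked here."""
    problems = []
    if len(description) > 1024:
        problems.append("description exceeds 1024 characters")
    return problems


def line_count_status(line_count: int) -> str:
    """Classify SKILL.md length: under 500 passes, over 800 is an automatic fail."""
    if line_count > 800:
        return "fail"
    if line_count < 500:
        return "ok"
    return "over-budget"  # hypothetical label for the 500-800 range
```

The [What] + [When] format and third-person style of the description still need a human or model reading; only the length limit lends itself to a simple check.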
2. Content Quality (Weight: 15%)
| Criterion | What to Check |
|---|---|
| Conciseness | No verbose explanations; context is a public good |
| Imperative form | Instructions use "Do X" not "You should do X" |
| Appropriate freedom | Constraints where needed, flexibility where safe |
| Scope clarity | Clear what skill does AND does not do |
| No hallucination risk | No instructions that encourage making up info |
| Output specification | Clear expected outputs defined |
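The imperative-form criterion can be given a rough first pass with a text scan. This is a heuristic sketch only: the phrase list is an assumption, and a hit is a prompt for review rather than a verdict.

```python
import re

# Hypothetical phrase list; extend or replace it to suit the skill under review.
NON_IMPERATIVE = re.compile(r"\byou (?:should|must|need to|can)\b", re.IGNORECASE)


def find_non_imperative(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that contain second-person phrasing."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if NON_IMPERATIVE.search(line):
            hits.append((number, line.strip()))
    return hits
```

Running it over the text of a SKILL.md gives a list of lines to rephrase as "Do X".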
3. User Interaction (Weight: 12%)
| Criterion | What to Check |
|---|---|
| Clarification triggers | Asks questions before acting on ambiguity |
| Required vs optional | Distinguishes must-know from nice-to-know |
| Graceful handling | What to do when user doesn't answer |
| No over-asking | Doesn't ask obvious or inferrable questions |
| Question pacing | Avoids too many questions in single message |
| Context awareness | Uses available context before asking |
Key pattern to look for:
```markdown
## Required Clarifications
- Question about X
- Question about Y

## Optional Clarifications
- Question about Z (if relevant)

Note: Avoid asking too many questions in a single message.
```
4. Documentation & References (Weight: 10%)
| Criterion | What to Check |
|---|---|
| Source URLs | Official documentation links provided |
| Reference files | Complex details in |
| Fetch guidance | Instructions to fetch docs for unlisted patterns |
| Version awareness | Notes about checking for latest patterns |
| Example coverage | Good/bad examples for key patterns |
Key pattern to look for:
```markdown
| Resource | URL | Use For |
|----------|-----|---------|
| Official Docs | https://... | Complex cases |
```
5. Domain Standards (Weight: 10%)
| Criterion | What to Check |
|---|---|
| Best practices | Follows domain conventions (e.g., WCAG, OWASP) |
| Enforcement mechanism | Checklists, validation steps, must-verify items |
| Anti-patterns | Lists what NOT to do |
| Quality gates | Output checklist before delivery |
Key pattern to look for:
```markdown
## Must Follow
- Requirement 1
- Requirement 2

## Must Avoid
- Antipattern 1
- Antipattern 2
```
6. Technical Robustness (Weight: 8%)
| Criterion | What to Check |
|---|---|
| Error handling | Guidance for failure scenarios |
| Security considerations | Input validation, secrets handling if relevant |
| Dependencies | External tools/APIs documented |
| Edge cases | Common edge cases addressed |
| Testability | Can outputs be verified? |
7. Maintainability (Weight: 8%)
| Criterion | What to Check |
|---|---|
| Modularity | References are self-contained topics |
| Update path | Easy to update when standards change |
| No hardcoded values | Uses placeholders/variables where appropriate |
| Clear organization | Logical section ordering |
8. Zero-Shot Implementation (Weight: 12%)
Skills should enable single-interaction implementation with embedded expertise.
| Criterion | What to Check |
|---|---|
| Before Implementation section | Context gathering guidance present |
| Codebase context | Guidance to scan existing structure/patterns |
| Conversation context | Uses discussed requirements/decisions |
| Embedded expertise | Domain knowledge in |
| User-only questions | Only asks for USER requirements, not domain knowledge |
Key pattern to look for:
```markdown
## Before Implementation

Gather context to ensure successful implementation:

| Source | Gather |
|---|---|
| Codebase | Existing structure, patterns, conventions |
| Conversation | User's specific requirements |
| Skill References | Domain patterns from |
| User Guidelines | Project-specific conventions |
```
**Red flag**: Skill instructs to "research" or "discover" domain knowledge at runtime instead of embedding it.
9. Reusability (Weight: 13%)
Skills should handle variations, not single requirements.
| Criterion | What to Check |
|---|---|
| Handles variations | Not hardcoded to single use case |
| Variable elements | Clarifications capture what VARIES |
| Constant patterns | Domain best practices encoded as constants |
| Not requirement-specific | Avoids hardcoded data, tools, configs |
| Abstraction level | Appropriate generalization for domain |
Good example:
```markdown
"Create visualizations - adaptable to data shape, chart type, library"
```
Bad example (too specific):
```markdown
"Create bar chart with sales data using Recharts"
```
Key check: Does the skill work for multiple use cases within its domain?
Type-Specific Validation
After scoring general criteria, verify type-specific requirements:
| Type | Must Have |
|---|---|
| Builder | Clarifications, Output Spec, Domain Standards, Output Checklist |
| Guide | Workflow Steps, Examples (Good/Bad), Official Docs links |
| Automation | Scripts in |
| Analyzer | Analysis Scope, Evaluation Criteria, Output Format, Synthesis |
| Validator | Quality Criteria, Scoring Rubric, Thresholds, Remediation |
Scoring: Deduct 10 points if type-specific requirements missing for identified type.
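The type-specific deduction can be sketched as a set difference between required and present items. The example below assumes the deduction is a flat 10 points whenever anything is missing, which is one reading of the rule above. It omits the Automation type because that row is incomplete in the table. All names are hypothetical.

```python
# Must-have items per skill type, copied from the table above.
# Automation is left out because its row is incomplete in the source.
TYPE_REQUIREMENTS = {
    "Builder": {"Clarifications", "Output Spec", "Domain Standards", "Output Checklist"},
    "Guide": {"Workflow Steps", "Examples (Good/Bad)", "Official Docs links"},
    "Analyzer": {"Analysis Scope", "Evaluation Criteria", "Output Format", "Synthesis"},
    "Validator": {"Quality Criteria", "Scoring Rubric", "Thresholds", "Remediation"},
}

TYPE_DEDUCTION = 10  # assumed flat, not multiplied by the number of missing items


def type_specific_deduction(skill_type: str, present: set[str]) -> tuple[int, list[str]]:
    """Return (points to deduct, sorted list of missing requirements)."""
    missing = sorted(TYPE_REQUIREMENTS[skill_type] - present)
    return (TYPE_DEDUCTION if missing else 0), missing
```

Returning the missing items alongside the deduction lets the report list them under its critical issues.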
Scoring Guide
Category Scores
Calculate each category score:
```
Category Score = (Sum of criterion scores) / (Max possible) * 100
```
Overall Score
```
Overall = Σ(Category Score × Weight)
```
Rating Thresholds
| Score | Rating | Meaning |
|---|---|---|
| 90-100 | Production | Ready for wide use |
| 75-89 | Good | Minor improvements needed |
| 60-74 | Adequate | Functional but needs work |
| 40-59 | Developing | Significant gaps |
| 0-39 | Incomplete | Major rework required |
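The scoring arithmetic can be sketched in a few functions. This assumes each criterion has a maximum of 3 points (from the 0-3 scale) and that a score between two table bands, such as 89.5, falls into the lower band. The function names are hypothetical.

```python
# Category weights as listed in the criteria headings; they sum to 1.0.
WEIGHTS = {
    "Structure & Anatomy": 0.12,
    "Content Quality": 0.15,
    "User Interaction": 0.12,
    "Documentation & References": 0.10,
    "Domain Standards": 0.10,
    "Technical Robustness": 0.08,
    "Maintainability": 0.08,
    "Zero-Shot Implementation": 0.12,
    "Reusability": 0.13,
}

MAX_PER_CRITERION = 3


def category_score(criterion_scores: list[int]) -> float:
    """(Sum of criterion scores) / (Max possible) * 100."""
    max_possible = MAX_PER_CRITERION * len(criterion_scores)
    return sum(criterion_scores) / max_possible * 100


def overall_score(category_scores: dict[str, float], deduction: float = 0.0) -> float:
    """Weighted sum of category scores, minus any type-specific deduction."""
    weighted = sum(category_scores[name] * weight for name, weight in WEIGHTS.items())
    return max(0.0, weighted - deduction)


def rating(score: float) -> str:
    """Map an overall score to a rating using the lower bound of each band."""
    if score >= 90:
        return "Production"
    if score >= 75:
        return "Good"
    if score >= 60:
        return "Adequate"
    if score >= 40:
        return "Developing"
    return "Incomplete"
```

For example, a category with criterion scores 3, 2, 1, 0 earns 6 of 12 possible points, so its category score is 50.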
Output Format
Generate validation report:
```markdown
# Skill Validation Report: [skill-name]

Rating: [Production/Good/Adequate/Developing/Incomplete]
Overall Score: [X]/100

## Summary
[2-3 sentence assessment]

## Category Scores
| Category | Score | Weight | Weighted |
|---|---|---|---|
| Structure & Anatomy | X/100 | 12% | X |
| Content Quality | X/100 | 15% | X |
| User Interaction | X/100 | 12% | X |
| Documentation | X/100 | 10% | X |
| Domain Standards | X/100 | 10% | X |
| Technical Robustness | X/100 | 8% | X |
| Maintainability | X/100 | 8% | X |
| Zero-Shot Implementation | X/100 | 12% | X |
| Reusability | X/100 | 13% | X |
| Type-Specific Deduction | -X | - | -X |

## Critical Issues (if any)
- [Issue requiring immediate fix]

## Improvement Recommendations
- High Priority: [Specific action]
- Medium Priority: [Specific action]
- Low Priority: [Specific action]

## Strengths
- [What skill does well]
```
---
Quick Validation Checklist
For rapid assessment, check these critical items:
Structure & Frontmatter
- SKILL.md <500 lines
- Frontmatter: name (≤64 chars, lowercase, hyphens) + description (≤1024 chars)
- Description uses third-person style ("This skill should be used when...")
- No README.md/CHANGELOG.md in skill directory
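The file-level items in this list can be checked directly against a skill directory. The sketch below is one way to do it, assuming the directory exists and SKILL.md is UTF-8 text. The function name and result keys are hypothetical, and LICENSE is included as extraneous to match the Structure & Anatomy table.

```python
from pathlib import Path

# Files the criteria treat as extraneous inside a skill directory.
EXTRANEOUS = {"README.md", "CHANGELOG.md", "LICENSE"}


def quick_structure_check(skill_dir: Path) -> dict[str, bool]:
    """Check SKILL.md presence, its line budget, and extraneous top-level files."""
    skill_md = skill_dir / "SKILL.md"
    exists = skill_md.is_file()
    line_count = len(skill_md.read_text(encoding="utf-8").splitlines()) if exists else 0
    extraneous = sorted(p.name for p in skill_dir.iterdir() if p.name in EXTRANEOUS)
    return {
        "skill_md_exists": exists,
        "under_500_lines": exists and line_count < 500,
        "no_extraneous_files": not extraneous,
    }
```

The frontmatter and description-style items still need the file contents to be parsed and read.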
Content & Interaction
- Has clarification questions (Required vs Optional)
- Has output specification
- Has official documentation links
Zero-Shot & Reusability
- Has "Before Implementation" section (context gathering)
- Domain expertise embedded in `references/` (not runtime discovery)
- Handles variations (not requirement-specific)
Type-Specific (check based on skill type)
- Builder: Clarifications + Output Spec + Standards + Checklist
- Guide: Workflow + Examples + Docs
- Automation: Scripts + Dependencies + Error Handling
- Analyzer: Scope + Criteria + Output Format
- Validator: Criteria + Scoring + Thresholds + Remediation
If 10+ checked: Likely Production (90+)
If 7-9 checked: Likely Good (75-89)
If 5-6 checked: Likely Adequate (60-74)
If <5 checked: Needs significant work
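The mapping from checked items to a likely rating is a simple threshold lookup. A minimal sketch follows, with a hypothetical function name.

```python
def likely_rating(checked: int) -> str:
    """Map the number of checked quick-checklist items to a likely rating."""
    if checked >= 10:
        return "Likely Production (90+)"
    if checked >= 7:
        return "Likely Good (75-89)"
    if checked >= 5:
        return "Likely Adequate (60-74)"
    return "Needs significant work"
```

This estimate is only a shortcut; the weighted scoring in the Scoring Guide remains the authoritative result.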
Reference Files
| File | When to Read |
|---|---|
| | Deep evaluation of specific criterion |
| | Example validations for calibration |
| | Common fixes for common issues |
Usage Examples
Validate a skill
```
Validate the chatgpt-widget-creator skill against production criteria
```
Quick audit
```
Quick validation check on mcp-builder skill
```
Focused review
```
Check if skill-creator skill has proper user interaction patterns
```