Loading...
Loading...
Evaluates agent skills against Anthropic's best practices. Use when asked to review, evaluate, assess, or audit a skill for quality. Analyzes SKILL.md structure, naming conventions, description quality, content organization, and identifies anti-patterns. Produces actionable improvement recommendations.
npx skill4agent add gotalab/skillport skill-evaluatorscripts/validate_skill.py <skill-path>scripts/validate_skill.py <path/to/skill>| Score | Criteria |
|---|---|
| 5 | Gerund form (-ing), clear purpose, memorable |
| 4 | Descriptive, follows conventions |
| 3 | Acceptable but could be clearer |
| 2 | Vague or misleading |
| 1 | Violates naming rules |
processing-pdfsanalyzing-spreadsheetsbuilding-dashboardspdfmy-skillClaudeHelperanthropic-tools| Score | Criteria |
|---|---|
| 5 | Clear functionality + specific activation triggers + third person |
| 4 | Good description with some triggers |
| 3 | Adequate but missing triggers or vague |
| 2 | Too brief or unclear purpose |
| 1 | Missing or unhelpful |
| Score | Criteria |
|---|---|
| 5 | Concise, assumes Claude intelligence, actionable instructions |
| 4 | Generally good, minor verbosity |
| 3 | Some unnecessary explanations or redundancy |
| 2 | Overly verbose or confusing |
| 1 | Bloated, explains obvious concepts |
| Score | Criteria |
|---|---|
| 5 | Excellent progressive disclosure, clear navigation, optimal length |
| 4 | Good organization, appropriate file splits |
| 3 | Acceptable but could be better organized |
| 2 | Poor organization, missing references, or bloated SKILL.md |
| 1 | No structure, everything dumped in SKILL.md |
| Score | Criteria |
|---|---|
| 5 | Perfect match: high freedom for flexible tasks, low for fragile operations |
| 4 | Generally appropriate freedom levels |
| 3 | Acceptable but could be better calibrated |
| 2 | Mismatched: too rigid or too loose |
| 1 | Completely wrong freedom level for the task type |
# Skill Evaluation Report: [skill-name]
## Summary
- **Overall Score**: X.X/5.0
- **Recommendation**: [Ready for publication / Needs minor improvements / Needs major revision]
## Dimension Scores
| Dimension | Score | Weight | Weighted |
|-----------|-------|--------|----------|
| Naming | X/5 | 10% | X.XX |
| Description | X/5 | 20% | X.XX |
| Content Quality | X/5 | 30% | X.XX |
| Structure | X/5 | 25% | X.XX |
| Degrees of Freedom | X/5 | 10% | X.XX |
| Anti-Patterns | X/5 | 5% | X.XX |
| **Total** | | 100% | **X.XX** |
## Strengths
- [List 2-3 things done well]
## Areas for Improvement
- [List specific issues with actionable fixes]
## Anti-Patterns Found
- [List any anti-patterns detected]
## Recommendations
1. [Priority 1 fix]
2. [Priority 2 fix]
3. [Priority 3 fix]
## Pre-Publication Checklist
- [ ] Description is specific with activation triggers
- [ ] SKILL.md under 500 lines
- [ ] One-level-deep file references
- [ ] Forward slashes in all paths
- [ ] No time-sensitive information
- [ ] Consistent terminology
- [ ] Concrete examples provided
- [ ] Scripts handle errors explicitly
- [ ] All configuration values justified
- [ ] Required packages listed
- [ ] Tested with Haiku, Sonnet, Opus| Score Range | Rating | Action |
|---|---|---|
| 4.5 - 5.0 | Excellent | Ready for publication |
| 4.0 - 4.4 | Good | Minor improvements recommended |
| 3.0 - 3.9 | Acceptable | Several improvements needed |
| 2.0 - 2.9 | Needs Work | Major revision required |
| 1.0 - 1.9 | Poor | Fundamental redesign needed |