Found 29 Skills
Scaffolds eval.yaml test files for agent skills in the dotnet/skills repository. Use when creating skill tests, writing evaluation scenarios, defining assertions and rubrics, or setting up test fixture files. Handles eval.yaml generation, fixture organization, and overfitting avoidance. Do not use for running or debugging existing tests, or for authoring skills.
Deep test, analyze, and audit Claude skills. Use this skill whenever the user wants to test a skill's behavior, analyze how it uses the Claude API, inspect inputs/outputs from scripts, or run security and code review audits against skill scripts. Trigger on: "test my skill", "analyze this skill", "audit skill scripts", "review skill for security issues", "what does this skill actually do when it runs", "inspect API calls from skill", "run a skill through its paces", "check my skill for bugs or vulnerabilities". Also trigger when the user shows you a SKILL.md and asks you to evaluate, critique, or stress-test it.
Testing framework for evaluating Databricks skills. Use when building test cases for skills, running skill evaluations, comparing skill versions, or creating ground truth datasets with the Generate-Review-Promote (GRP) pipeline. Triggers include "test skill", "evaluate skill", "skill regression", "ground truth", "GRP pipeline", "skill quality", and "skill metrics".
Validate skill files for structural compliance and behavioral correctness. Three modes: static (linter), spec (behavioral), audit (coverage report).
[Hyper] Test Codex/agent skills for intended triggering and behavior with realistic positive, negative, boundary, and edge-case scenarios. Use when validating a skill folder, SKILL.md, rules/references/scripts/assets, trigger precision, workflow correctness, or regression coverage before shipping skill changes.
Run tests from skill examples and generate a report (project)
AI-powered E2E testing for any app. Test 8 platforms with natural language — no test code needed.
Minimal validation for crawl-and-skill workflow readiness.
Validates and scores Claude Code skill packages for quality, completeness, and best practices compliance. Tests Python scripts, checks YAML frontmatter, and generates quality reports. Use when creating new skills, validating skill packages, or auditing skill quality.
Sample skill for testing the skill-tester validation pipeline. Demonstrates proper skill structure with scripts, references, and assets.
For trying out the Day 1 Skill. Use for "/day1-test-skill" or "테스트 스킬" ("test skill") requests.
Guide for creating effective AI agent skills. Use when users want to create a new skill (or update an existing skill) that extends an AI agent's capabilities with specialized knowledge, workflows, or tool integrations. Works with any agent that supports the SKILL.md format (Claude Code, Cursor, Roo, Cline, Windsurf, etc.). Triggers on "create skill", "new skill", "package knowledge", "skill for".