Search Results: plugin-eval

Found 2 Skills

vercel-plugin-eval

Run live eval sessions against the vercel-plugin to verify hook behavior, skill injection, dedup correctness, and coverage. Launches real Claude Code sessions via WezTerm, monitors debug logs, and produces a structured coverage report.

🇺🇸|EnglishTranslated

Tools & Utilitieswshobson/agents

evaluation-methodology

PluginEval quality methodology — dimensions, rubrics, statistical methods, and scoring formulas. Use this skill when understanding how plugin quality is measured, when interpreting a low score on a specific dimension, when deciding how to improve a skill's triggering accuracy or orchestration fitness, when calibrating scoring thresholds for your marketplace, or when explaining quality badges to external partners like Neon.

🇺🇸|EnglishTranslated