Run live eval sessions against the vercel-plugin to verify hook behavior, skill injection, dedup correctness, and coverage. Launches real Claude Code sessions via WezTerm, monitors debug logs, and produces a structured coverage report.
npx skill4agent add vercel/vercel-plugin vercel-plugin-eval

`claude --print`, `-p`, `--dangerously-skip-permissions`, `/tmp/`, `~/dev/vercel-plugin-testing/`, `settings.local.json`, `npx add-plugin`, `CLAUDE_PLUGIN_ROOT`, `bash -c`, `/bin/zsh -ic`, `x`

# 1. Create test dir & install plugin (with timestamp)
TS=$(date +%Y%m%d-%H%M)
SLUG="my-eval-$TS"
mkdir -p ~/dev/vercel-plugin-testing/$SLUG
cd ~/dev/vercel-plugin-testing/$SLUG
npx add-plugin https://github.com/vercel/vercel-plugin -s project -y
# 2. Launch session via WezTerm
wezterm cli spawn --cwd ~/dev/vercel-plugin-testing/$SLUG -- /bin/zsh -ic \
"unset CLAUDECODE; VERCEL_PLUGIN_LOG_LEVEL=debug x '<PROMPT>' --settings .claude/settings.json; exec zsh"
# 3. Find debug log (wait ~25s for session start)
find ~/.claude/debug -name "*.txt" -mmin -2 -exec grep -l "$SLUG" {} +
LOG=~/.claude/debug/<session-id>.txt
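
Step 3 can also be scripted instead of waiting and searching by hand. The sketch below wraps the same `find` / `grep -l` search in a polling loop. `find_session_log`, its argument order, and the 60-second default timeout are hypothetical choices made for illustration; they are not part of the plugin.

```shell
# Hypothetical helper: poll a debug-log directory until a recent log
# mentioning the slug appears, then print its path.
# Usage: find_session_log <slug> [debug_dir] [timeout_seconds]
find_session_log() {
  _slug=$1
  _debug_dir=${2:-"$HOME/.claude/debug"}
  _timeout=${3:-60}
  _waited=0
  while [ "$_waited" -le "$_timeout" ]; do
    # Same search as the manual step: .txt logs modified in the last
    # two minutes that contain the slug.
    _match=$(find "$_debug_dir" -name "*.txt" -mmin -2 \
      -exec grep -l -- "$_slug" {} + 2>/dev/null | head -n 1)
    if [ -n "$_match" ]; then
      printf '%s\n' "$_match"
      return 0
    fi
    sleep 1
    _waited=$((_waited + 1))
  done
  return 1
}
```

With the helper defined, `LOG=$(find_session_log "$SLUG")` sets `LOG` once the session has written its first lines, and the non-zero exit status signals that no matching log showed up within the timeout.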
# SessionStart (3 hooks)
grep "SessionStart.*success" "$LOG"
# PreToolUse skill injection
grep -c "executePreToolHooks" "$LOG" # total calls
grep -c "provided additionalContext" "$LOG" # injections
# UserPromptSubmit
grep "UserPromptSubmit.*success" "$LOG"
# PostToolUse validate + shadcn font-fix
grep "posttooluse-validate.*provided" "$LOG"
grep "PostToolUse:Bash.*success" "$LOG"
# SessionEnd cleanup
grep "SessionEnd" "$LOG"
TMPDIR=$(node -e "import {tmpdir} from 'os'; console.log(tmpdir())" --input-type=module)
CLAIMDIR="$TMPDIR/vercel-plugin-<session-id>-seen-skills.d"
# Claim files = one per skill, atomic O_EXCL
ls "$CLAIMDIR"
# Compare: injections should equal claims
inject_meta=$(grep -c "skillInjection:" "$LOG")
claims=$(ls "$CLAIMDIR" 2>/dev/null | wc -l | tr -d ' ')
echo "Injections: $((inject_meta / 3)) | Claims: $claims"

`skillInjection:`

grep "VALIDATION" "$LOG" | head -10

| Scenario Type | Skills Exercised |
|---|---|
| AI chat app | ai-sdk, ai-gateway, nextjs, ai-elements |
| Durable workflow | workflow, ai-sdk, vercel-queues |
| Monorepo | turborepo, turbopack, nextjs |
| Edge auth + routing | routing-middleware, auth, sign-in-with-vercel |
| Chat bot (multi-platform) | chat-sdk, ai-sdk, vercel-storage |
| Feature flags + CRM | vercel-flags, vercel-queues, ai-sdk |
| Email pipeline | email, satori, ai-sdk, vercel-storage |
| Marketplace/payments | payments, marketplace, cms |
| Kitchen sink | micro, ncc, all niche skills |
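
One way to turn a row of the scenario table into a pass/fail check is to compare the skills the scenario should exercise against the claim files left in the session's claim directory. This is a minimal sketch under a loud assumption: that each claim file is named exactly after its skill (the source only says there is one claim file per skill). `missing_skills` is a hypothetical helper, not something the plugin ships.

```shell
# Hypothetical helper: print every expected skill that has no claim file.
# ASSUMPTION: each claim file in the directory is named exactly after its skill.
# Usage: missing_skills <claim_dir> <skill>...
missing_skills() {
  _claim_dir=$1
  shift
  for _skill in "$@"; do
    if [ ! -e "$_claim_dir/$_skill" ]; then
      printf '%s\n' "$_skill"
    fi
  done
}
```

For the AI chat app row, `missing_skills "$CLAIMDIR" ai-sdk ai-gateway nextjs ai-elements` prints nothing when all four skills were claimed, and otherwise lists the ones that were never injected.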

`ai-elements`, `v0-dev`, `vercel-firewall`, `marketplace`, `geist`, `json-render`, `components/chat-*.tsx`, `.notes/COVERAGE.md`

rm -rf ~/dev/vercel-plugin-testing