Search Results: llm-as-judge

Found 44 Skills

AI & Machine Learning | glennguilloux/context-eng...

agent-evaluation

Evaluate and improve Claude Code commands, skills, and agents. Use when testing prompt effectiveness, validating context engineering choices, or measuring improvement quality.

Code Quality | neolabhq/context-engineer...

critique

Comprehensive multi-perspective review using specialized judges with debate and consensus building

AI & Machine Learning | neolabhq/context-engineer...

do-in-steps

Execute complex tasks through sequential sub-agent orchestration with intelligent model selection and meta-judge → LLM-as-a-judge verification

AI & Machine Learning | neolabhq/context-engineer...

sadd:judge

Launch a sub-agent judge to evaluate results produced in the current conversation

AI & Machine Learning | neolabhq/context-engineer...

judge

Launch a meta-judge then a judge sub-agent to evaluate results produced in the current conversation

Code Quality | neolabhq/context-engineer...

reflexion:critique

Comprehensive multi-perspective review using specialized judges with debate and consensus building

AI & Machine Learning | neolabhq/context-engineer...

sadd:do-and-judge

Execute a task with sub-agent implementation and LLM-as-a-judge verification, with an automatic retry loop

AI & Machine Learning | neolabhq/context-engineer...

customaize-agent:agent-evaluation

Evaluate and improve Claude Code commands, skills, and agents. Use when testing prompt effectiveness, validating context engineering choices, or measuring improvement quality.
