Loading...
Found 1 Skills
Iterate on RAG systems with structured evals instead of eyeballing. This skill should be used when the user is tuning a RAG pipeline — changing retrieval prompts, swapping models, adjusting chunking, or debugging poor answers — and wants a cheap, ranked set of experiments with cost tracking and structured feedback on the stack. Also use when the user asks "how do I know if my RAG is working?", "this RAG eval is burning money", or "what should I try next on retrieval?".