Full evaluation workflow - launch a run, watch progress, and summarize results. Use for end-to-end agent testing.
```shell
npx skill4agent add coval-ai/coval-external-skills quick-eval
```

$ARGUMENTS

```shell
coval agents list
coval test-sets list
coval personas list
```

```shell
coval runs launch \
  --agent-id <agent_id> \
  --persona-id <persona_id> \
  --test-set-id <test_set_id> \
  --name "Quick Eval - $(date +%Y%m%d-%H%M)"
```

```shell
coval runs watch <run_id>
```

```shell
coval runs get <run_id> --format json
coval simulations list --run-id <run_id> --format json
```

## Evaluation Complete
**Run:** <run_id>
**Agent:** <agent_name>
**Test Set:** <test_set_name>
**Duration:** X minutes
### Results
- Total Simulations: N
- Completed: N
- Failed: N
### Sample Simulations
[List 3-5 simulation IDs for review]
### Next Steps
- View full results: `coval simulations list --run-id <run_id>`
- Download audio: `coval simulations audio <sim_id> -o recording.wav`
- Get transcript: `coval simulations get <sim_id>`
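
The summary template above can be filled in mechanically once the values are known. The following is a minimal sketch, not part of the skill itself: `render_summary` is a hypothetical helper, and it deliberately does not call the `coval` CLI or parse its JSON, since the output schema is not shown here. It assumes you have already read the run ID, names, duration, and counts from `coval runs get` and `coval simulations list`, and pass them in as arguments.

```shell
# Hypothetical helper (an assumption, not a coval command): prints the
# summary template with the supplied values substituted in.
# Arguments: run_id agent_name test_set_name minutes total completed failed
render_summary() {
  cat <<EOF
## Evaluation Complete
**Run:** $1
**Agent:** $2
**Test Set:** $3
**Duration:** $4 minutes
### Results
- Total Simulations: $5
- Completed: $6
- Failed: $7
### Next Steps
- View full results: \`coval simulations list --run-id $1\`
EOF
}
```

A call might look like `render_summary <run_id> "<agent_name>" "<test_set_name>" 4 10 9 1`, with the placeholders replaced by the values from your own run. Keeping the rendering separate from the CLI calls means the helper stays valid even if the JSON fields differ from what you expect.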