Search Results: ai-testing

Found 21 Skills

alicloud-ai-multimodal-qwen-vl-test

Minimal image-understanding smoke test for Model Studio Qwen VL.

stably-sdk-rules

AI rules for writing tests with Stably Playwright SDK. Use this skill when writing or modifying Playwright tests with Stably AI features. Covers when to use Playwright vs Stably methods, plus minimal patterns for aiAssert, extract, getLocatorsByAI, agent.act, Inbox, and Google auth.

AI & Machine Learning | promptfoo/promptfoo

promptfoo-evals

Write, refine, run, and QA promptfoo evaluation suites: promptfooconfig.yaml, prompts, providers, vars, tests, assertions, model-graded rubrics, transforms, datasets, exports, and CI gates. Use for non-redteam eval coverage, regression tests, or new eval matrices. Do not use for adversarial redteam plugin or strategy setup.

AI & Machine Learning | axiomhq/skills

writing-evals

Scaffolds evaluation suites for the Axiom AI SDK. Generates eval files, scorers, flag schemas, and config from natural-language descriptions. Use when creating evals, writing scorers, setting up flag schemas, or configuring axiom.config.ts.

16 scripts/Attention

AI & Machine Learning | coval-ai/coval-external-s...

quick-eval

Full evaluation workflow - launch a run, watch progress, and summarize results. Use for end-to-end agent testing.

AI & Machine Learning | nvidia/skills

onboard-gb200-1node-tests

Onboard 1-node GitHub MR functional tests for GB200 from existing mr-scoped 2-node tests.

AI & Machine Learning | garrytan/gbrain

skillify

The meta skill. Turn any raw feature into a properly-skilled, tested, resolvable unit of agent capability. Cross-modal eval is the recommended Phase 3 quality gate: 3 frontier models from different providers critique the output, you iterate to quality, THEN write tests that lock in the proven-good behavior.

AI & Machine Learning | marcohefti/zero-context-l...

zcl

Orchestrator workflow for running ZeroContext Lab (ZCL) attempts/suites with deterministic artifacts, trace-backed evidence, and fast post-mortems (shim support for "agent only types tool name").

Testing & QA | aradotso/devtools-skills

browser-devtools-mcp-vscode

VSCode extension for Browser DevTools MCP Server, enabling AI-driven browser automation, debugging, and testing via Playwright and the Model Context Protocol.
