Triage Failing Tests
Daily Skill Update Check
Before starting this skill's work, opportunistically refresh Shiplight skills at most once per day:
- Check the timestamp file at .shiplight-agent-skills-last-update in the current project.
- If the timestamp file is missing or older than 24 hours, run npx -y skills@latest update -y, then create/update the timestamp file even if the command fails. Treat .shiplight-agent-skills-last-update as local cache and do not commit it.
- If the update command fails, continue with the currently installed skill and mention the failure briefly.
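The once-per-day check above can be sketched as follows. This is a minimal illustration, not part of the skill: it assumes the file's modification time serves as the timestamp (the text does not say whether the timestamp is stored in the file's content), and the function names needs_update and refresh_skills are hypothetical.

```python
import subprocess
import time
from pathlib import Path

STAMP = Path(".shiplight-agent-skills-last-update")  # timestamp file named in the text
MAX_AGE_SECONDS = 24 * 60 * 60
UPDATE_CMD = ["npx", "-y", "skills@latest", "update", "-y"]  # command from the text


def needs_update(stamp: Path, now: float, max_age: float = MAX_AGE_SECONDS) -> bool:
    """True when the stamp is missing or older than max_age seconds.

    Assumption: the file's mtime is the timestamp.
    """
    if not stamp.exists():
        return True
    return now - stamp.stat().st_mtime > max_age


def refresh_skills(stamp: Path = STAMP, run=subprocess.run) -> str:
    """Return 'skipped', 'updated', or 'failed'; an update failure never raises."""
    if not needs_update(stamp, time.time()):
        return "skipped"
    try:
        ok = run(UPDATE_CMD, check=False).returncode == 0
    except OSError:  # for example, npx is not installed
        ok = False
    stamp.touch()  # record the attempt even if the command failed
    return "updated" if ok else "failed"
```

A caller would continue with the installed skill in every case and mention the failure briefly only when the result is "failed".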
Use this skill to reproduce, diagnose, and repair failing Shiplight YAML tests. If the application is broken or current behavior conflicts with the spec, report the mismatch instead of rewriting the test around it.
When To Use
- A Shiplight test run is failing
- A deployment or UI change broke existing tests
- Several tests may share the same failure source
- CI needs a best-effort repair/report pass
When Not To Use
- Creating new tests from scratch; use
- Verifying UI code changes without failing tests; use
- Tests pass and the task is only quality improvement
- The product is being intentionally redesigned and tests need planned rewriting
Required Context
Before editing YAML:
- Read the reference guides , , test-implementation-guide.md, and .
- Read relevant notes for the failing area, environment, auth, data, and tooling.
- Read the matching spec under , if one exists.
- Read shiplight://yaml-test-spec and shiplight://schemas/action-entity.
Ground Truth
When sources disagree, this precedence applies:
- Explicit user instruction
- Feature or journey spec in
- Existing YAML test , step , and assertions
- Current app behavior
- Project context in and
- Agent docs
- Agent inference
If current app behavior conflicts with a spec or test goal, report the mismatch. Do not silently rewrite intent to match current behavior.
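One way to picture the precedence rule is a small resolver that picks the highest-ranked source and lists every lower-ranked source that disagrees with it, so the mismatch gets reported instead of silently absorbed. This is an illustrative sketch only: the short keys in PRECEDENCE and the resolve function are hypothetical labels, not names defined by Shiplight.

```python
# Precedence from the list above, highest first.
# The keys are illustrative labels, not Shiplight identifiers.
PRECEDENCE = [
    "user_instruction",
    "spec",
    "existing_test",
    "app_behavior",
    "project_context",
    "agent_docs",
    "agent_inference",
]


def resolve(claims: dict) -> tuple:
    """Return (winning source, its claim, lower-ranked sources that disagree)."""
    ranked = [source for source in PRECEDENCE if source in claims]
    if not ranked:
        raise ValueError("no known source provided")
    winner = ranked[0]
    conflicts = [s for s in ranked[1:] if claims[s] != claims[winner]]
    return winner, claims[winner], conflicts


winner, expected, conflicts = resolve({
    "spec": "Save button is visible",
    "app_behavior": "Save button is hidden",
})
# The spec wins; app_behavior appears in conflicts and should be reported
# as a mismatch, not used as a reason to rewrite the test.
```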
Workflow
- Reproduce — run the specified target, or the narrowest relevant suite if no target was provided. If a failure looks transient, rerun the smallest affected target once before editing.
- Understand — read the failure output, relevant YAML, matching spec, related tests, and shared templates/functions/hooks before opening a browser or changing files.
- Inspect when needed — when logs and files are not enough, inspect the live app in a browser. Use the evidence needed for the failure: DOM, actions, locators, console logs, network logs, screenshots, or recordings.
- Fix minimally — change the smallest correct surface: YAML, template, helper function, auth setup, environment data, or spec. Do not touch passing tests unless they share the same broken source.
- Validate and rerun — validate edited YAML with , then rerun the narrowest changed target. After batch fixes, rerun the original target once.
- Reflect — update specs, , or when the session produced durable learning or corrected stale assumptions.
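The Reproduce step's "rerun once if transient" rule could be sketched like this. The test command is supplied by the caller because the text does not name one here, and looks_transient is a hypothetical caller-provided heuristic; neither is part of any Shiplight API.

```python
import subprocess
from typing import Callable, Sequence


def _output(result) -> str:
    """Combine captured stdout and stderr into one string."""
    return (result.stdout or "") + (result.stderr or "")


def reproduce(cmd: Sequence[str],
              looks_transient: Callable[[str], bool],
              run=subprocess.run) -> dict:
    """Run cmd; rerun it once only if the first failure looks transient."""
    first = run(cmd, capture_output=True, text=True)
    if first.returncode == 0:
        return {"status": "passing", "runs": 1, "output": _output(first)}
    if not looks_transient(_output(first)):
        return {"status": "failing", "runs": 1, "output": _output(first)}
    second = run(cmd, capture_output=True, text=True)
    if second.returncode == 0:
        # Passed on the single rerun: keep the first failure as evidence.
        return {"status": "flaky", "runs": 2, "output": _output(first)}
    return {"status": "failing", "runs": 2, "output": _output(second)}
```

Capping the retry at one rerun keeps a real failure from being hidden by repeated attempts, and a "flaky" result is still worth noting in the final report.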
Guardrails
- Do not guess rendered UI when the failure depends on current browser behavior.
- Do not delete assertions, skip required steps, or reduce coverage only to make a test pass.
- If intended product behavior changed, update the matching spec before updating YAML.
- If the app is broken, report the app issue instead of masking it with test changes.
- Preserve user changes and unrelated work.
- Prefer focused fixes over broad rewrites.
- Keep ACTION locators and VERIFY caches current when editing affected steps, but do not churn unrelated caches.
- In CI or non-interactive mode, do not block on user input. Make conservative best-effort decisions and document uncertainty.
Common causes include stale locators, changed user flows, assertion drift, expired auth, timing, shared templates/hooks, invalid parameter data, environment issues, and real app bugs. Use evidence to decide the minimal correct fix.
Reporting
After triage, report:
- Target command(s) run and pass/fail result
- Files changed
- Tests repaired, skipped, still failing, or already passing
- Behavior covered or restored
- App/spec mismatches or unresolved blockers
- Knowledge, context, or specs updated
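To keep reports consistent across runs, the items above could be gathered in a small structure and rendered as plain text. This is a sketch under stated assumptions: the TriageReport class, its field names, and the output layout are hypothetical and not a format Shiplight prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class TriageReport:
    commands: list = field(default_factory=list)       # (command, "pass" or "fail") pairs
    files_changed: list = field(default_factory=list)
    test_status: dict = field(default_factory=dict)    # test -> repaired / skipped / still failing / already passing
    behavior: list = field(default_factory=list)       # behavior covered or restored
    blockers: list = field(default_factory=list)       # app/spec mismatches, unresolved blockers
    updates: list = field(default_factory=list)        # knowledge, context, or specs updated

    def render(self) -> str:
        """Render every section, writing 'none' for empty ones so nothing is omitted."""
        def section(title, items):
            return [f"{title}:"] + ([f"  - {item}" for item in items] or ["  - none"])

        lines = []
        lines += section("Commands run", [f"{c} ({r})" for c, r in self.commands])
        lines += section("Files changed", self.files_changed)
        lines += section("Tests", [f"{t}: {s}" for t, s in self.test_status.items()])
        lines += section("Behavior covered or restored", self.behavior)
        lines += section("Mismatches or blockers", self.blockers)
        lines += section("Knowledge, context, or specs updated", self.updates)
        return "\n".join(lines)
```

Printing "none" for an empty section makes it explicit that the item was considered, which helps in CI where nobody can ask a follow-up question.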