Systematic QA Testing
Modes
Full (default)
Systematic page-by-page testing, 8-category health score, full issue documentation.
Quick
30-second smoke test of critical paths only: login, main nav, primary action.
Regression
Diff current state against saved baseline, report new/resolved issues.
Browser Automation Detection
Detect available automation in priority order:
- Playwright MCP server — check if Playwright tools are available in the current tool list
- agent-browser skill — check if the agent-browser skill is loaded
- Direct CLI tools — check for , , or binaries on PATH
- Manual fallback — instruct the user to navigate and report observations
Use the highest-priority method available. State which method is in use at the start of the report.
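The priority order above can be sketched as a simple cascade. This is a minimal illustration, not the skill's actual detection logic: the function name, the return labels, and the idea that a Playwright tool can be recognized by "playwright" appearing in its name are all assumptions. The CLI binary names are passed in by the caller because this document does not list them.

```python
import shutil
from typing import Iterable


def detect_automation(
    tool_names: Iterable[str],
    loaded_skills: Iterable[str],
    cli_candidates: Iterable[str],
) -> str:
    """Return a label for the highest-priority automation method available."""
    # 1. Playwright MCP server (assumption: its tools have "playwright" in the name).
    if any("playwright" in name.lower() for name in tool_names):
        return "playwright-mcp"
    # 2. agent-browser skill.
    if "agent-browser" in set(loaded_skills):
        return "agent-browser"
    # 3. Direct CLI tools: first candidate binary found on PATH.
    for binary in cli_candidates:
        if shutil.which(binary):
            return f"cli:{binary}"
    # 4. Manual fallback: the user navigates and reports observations.
    return "manual"
```

The returned label is what would be stated at the start of the report.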
Workflow
Phase 1: Initialize
- Detect mode from user prompt. Default to full if unspecified.
- Detect application URL:
  - Check references/project-detection.md for framework port conventions (e.g., Next.js → 3000, Vite → 5173, Django → 8000).
  - If not detectable, ask the user.
- Detect available browser automation method (priority list above).
- If regression mode: load previous baseline from .
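As a rough sketch of the first two initialization steps, the snippet below picks a mode from the prompt and maps a framework to a default URL. The three ports come from the examples above. The keyword matching, the function names, and the use of http://localhost as the host are assumptions made for illustration; the real conventions live in references/project-detection.md.

```python
import re
from typing import Optional

# Port conventions taken from the examples above; the full list is in
# references/project-detection.md.
FRAMEWORK_PORTS = {"next.js": 3000, "vite": 5173, "django": 8000}


def detect_mode(prompt: str) -> str:
    """Pick a mode from the user prompt, defaulting to full."""
    text = prompt.lower()
    # Keyword choices are assumptions, not part of the skill definition.
    if re.search(r"\bregression\b", text):
        return "regression"
    if re.search(r"\b(quick|smoke)\b", text):
        return "quick"
    return "full"


def default_url(framework: str) -> Optional[str]:
    """Return a local URL for a known framework, or None to ask the user."""
    port = FRAMEWORK_PORTS.get(framework.lower())
    # Assumption: the app is served on localhost over plain HTTP.
    return f"http://localhost:{port}" if port else None
```

A `None` result corresponds to the "ask the user" branch.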
Phase 2: Authenticate (if needed)
- Navigate to root URL and check if a login wall is present.
- If credentials were provided: authenticate and store session.
- If not: ask the user for test credentials, or skip auth-gated pages and note the gap in the report.
Phase 3: Orient
- Navigate to root URL.
- Map the primary navigation structure — collect all top-level nav links.
- Classify each page: static, form, list, detail, dashboard.
- Build a test plan ordered by page category (forms and dashboards first — highest defect density).
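The ordering step can be sketched as a stable sort on a category rank. Only "forms and dashboards first" comes from the text above; the relative order of the remaining categories and the names `CATEGORY_RANK` and `build_test_plan` are assumptions.

```python
# Lower rank is tested earlier. Forms and dashboards come first per the
# workflow; the order of the other categories is an assumption.
CATEGORY_RANK = {"form": 0, "dashboard": 0, "list": 1, "detail": 1, "static": 2}


def build_test_plan(pages: list[tuple[str, str]]) -> list[str]:
    """Order (url, category) pairs by category rank and return the URLs."""
    unknown = sorted({cat for _, cat in pages if cat not in CATEGORY_RANK})
    if unknown:
        raise ValueError(f"unclassified page categories: {unknown}")
    # sorted() is stable, so pages keep their navigation order within a rank.
    ordered = sorted(pages, key=lambda page: CATEGORY_RANK[page[1]])
    return [url for url, _ in ordered]
```

Because the sort is stable, two pages in the same rank are tested in the order they were discovered in the navigation.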
Phase 4: Explore (Full mode)
For each page, run the per-page checklist below. In quick mode, run only the items marked with (Q).
Visual Scan
- (Q) Layout renders correctly — no overlap, no overflow
- Images load — no broken image tags
- Typography consistent — no visible font fallbacks
- Responsive: check at desktop (1280px) and mobile (375px) widths
Interactive Elements
- (Q) All buttons and links are clickable and responsive
- Hover states present where expected
- Focus indicators visible for keyboard navigation
- Disabled states visually distinct
Forms
- (Q) Required field validation fires on empty submit
- Error messages display on invalid input
- Success feedback on valid submission
- Form resubmission handled — no duplicate submissions on double-click
Navigation
- (Q) All nav links resolve — no 404s
- Back button works as expected
- Deep links work — direct URL access returns correct page
- Breadcrumbs accurate (if present)
State Management
- Loading states displayed during async operations
- Empty states handled — no blank pages when data is absent
- Error states recoverable — retry or back options present
- Data persists across navigation — no lost form data on back/forward
Console
- (Q) No JavaScript errors in console
- No failed network requests (4xx/5xx)
- No mixed content warnings
- No deprecation warnings in hot paths
Responsiveness
- Mobile layout usable — no horizontal scroll at 375px
- Touch targets >= 44px
- Text readable without zoom (>= 16px body text)
Phase 5: Document
For each issue found, classify using references/issue-taxonomy.md:
- Severity: critical (blocks usage), major (degrades experience), minor (cosmetic/polish)
- Category: functional, visual, accessibility, performance, content, navigation, security, console
- Evidence: screenshot description or reproduction steps
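One way to keep issue records consistent with this classification is a small validated record type. The sketch below is illustrative only: the class and field names are assumptions, and the authoritative taxonomy is references/issue-taxonomy.md. The ID is accepted as an opaque string because the ID format is not specified here.

```python
from dataclasses import dataclass

SEVERITIES = ("critical", "major", "minor")
CATEGORIES = (
    "functional", "visual", "accessibility", "performance",
    "content", "navigation", "security", "console",
)


@dataclass(frozen=True)
class Issue:
    issue_id: str      # opaque unique ID; format is defined elsewhere
    severity: str      # one of SEVERITIES
    category: str      # one of CATEGORIES
    description: str
    evidence: str      # screenshot description or reproduction steps

    def __post_init__(self) -> None:
        # Reject values outside the taxonomy and records without evidence.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if not self.evidence.strip():
            raise ValueError("evidence is required")
```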
Assign each issue a unique ID.
Compute the health score using the weights defined below and detailed in references/report-template.md.
Phase 6: Wrap Up
- Generate structured report following references/report-template.md.
- Save to .qa-reports/<YYYY-MM-DD>-<mode>.json.
- If full mode: save baseline for future regression comparison.
- Present summary: health score, critical/major/minor counts, top 3 priority fixes.
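The save step might look like the following sketch, which builds the .qa-reports/<YYYY-MM-DD>-<mode>.json path and writes the report as JSON. The function names and the `root` parameter are assumptions, and the report is treated as an arbitrary dict because its real structure is defined in references/report-template.md.

```python
import json
from datetime import date
from pathlib import Path


def report_path(mode: str, day: date, root: Path = Path(".")) -> Path:
    """Build .qa-reports/<YYYY-MM-DD>-<mode>.json under the given root."""
    return root / ".qa-reports" / f"{day.isoformat()}-{mode}.json"


def save_report(report: dict, mode: str, day: date, root: Path = Path(".")) -> Path:
    """Write the report as indented JSON, creating .qa-reports/ if needed."""
    path = report_path(mode, day, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(report, indent=2), encoding="utf-8")
    return path
```

Passing the date explicitly, rather than calling `date.today()` inside the function, keeps the path deterministic when testing.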
Health Score
Weighted average across 8 categories, scored 0-100.
| Category | Weight |
|---|---|
| Console errors | 15% |
| Broken links | 10% |
| Functional | 20% |
| UX/Usability | 15% |
| Accessibility | 15% |
| Visual | 10% |
| Performance | 10% |
| Content | 5% |
Scoring per category: start at 100, deduct per issue by severity:
- Critical: -30
- Major: -15
- Minor: -5
Floor at 0. Final health score = weighted sum of category scores.
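The scoring rule can be sketched in a few lines. The weights and deductions are the ones listed above; the function names and the input shape (a mapping from category name to a list of severities) are assumptions for illustration.

```python
# Weights in percent, copied from the table above (they sum to 100).
WEIGHTS = {
    "Console errors": 15, "Broken links": 10, "Functional": 20,
    "UX/Usability": 15, "Accessibility": 15, "Visual": 10,
    "Performance": 10, "Content": 5,
}
DEDUCTIONS = {"critical": 30, "major": 15, "minor": 5}


def category_score(severities: list[str]) -> int:
    """Start at 100, deduct per issue by severity, floor at 0."""
    return max(0, 100 - sum(DEDUCTIONS[s] for s in severities))


def health_score(issues_by_category: dict[str, list[str]]) -> float:
    """Weighted sum of category scores; categories with no issues score 100."""
    total = sum(
        weight * category_score(issues_by_category.get(category, []))
        for category, weight in WEIGHTS.items()
    )
    return total / 100


# One critical and one major functional issue, two minor console issues.
print(health_score({
    "Functional": ["critical", "major"],
    "Console errors": ["minor", "minor"],
}))  # prints 89.5
```

In that example, Functional scores 100 - 30 - 15 = 55 and Console errors scores 100 - 5 - 5 = 90, so the result is (20 × 55 + 15 × 90 + 65 × 100) / 100 = 89.5.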
Quick Mode Behavior
Run only items marked (Q) in the Phase 4 checklist. Skip health score computation — report pass/fail per critical path. Target completion: 30 seconds of actual testing time.
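Selecting the quick-mode subset amounts to filtering on the (Q) marker. The sketch below assumes checklist items are available as plain strings in the form used in Phase 4; the function name is hypothetical.

```python
Q_MARKER = "(Q)"


def select_items(items: list[str], mode: str) -> list[str]:
    """Return the checklist items to run, with the (Q) marker stripped."""
    if mode == "quick":
        # Quick mode keeps only the items flagged with (Q).
        items = [item for item in items if item.startswith(Q_MARKER)]
    return [
        item[len(Q_MARKER):].strip() if item.startswith(Q_MARKER) else item
        for item in items
    ]


console_checks = [
    "(Q) No JavaScript errors in console",
    "No failed network requests (4xx/5xx)",
    "No mixed content warnings",
]
print(select_items(console_checks, "quick"))
# prints ['No JavaScript errors in console']
```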
Regression Mode Behavior
- Load the most recent baseline from .
- Run full mode.
- Diff issues by ID and description similarity.
- Report: new issues, resolved issues, persistent issues.
- Save updated baseline.
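The diff step could be approximated as follows, using the standard library's `difflib.SequenceMatcher` for description similarity. This is a sketch under stated assumptions: issues are represented as an ID-to-description mapping, the 0.8 threshold is an arbitrary choice, and the first sufficiently similar baseline issue is taken as the match rather than the best one.

```python
from difflib import SequenceMatcher


def diff_issues(
    baseline: dict[str, str],
    current: dict[str, str],
    threshold: float = 0.8,  # assumed cutoff for "similar" descriptions
) -> dict[str, list[str]]:
    """Classify issue IDs as new, resolved, or persistent."""
    unmatched = dict(baseline)  # baseline issues not yet paired with a current one
    new, persistent = [], []
    for cid, cdesc in current.items():
        if cid in unmatched:
            match = cid  # same ID: treat as the same issue
        else:
            # Otherwise fall back to the first similar description.
            match = next(
                (
                    bid for bid, bdesc in unmatched.items()
                    if SequenceMatcher(None, cdesc.lower(), bdesc.lower()).ratio() >= threshold
                ),
                None,
            )
        if match is None:
            new.append(cid)
        else:
            persistent.append(cid)
            del unmatched[match]
    # Whatever is left in the baseline was not seen in the current run.
    return {"new": new, "resolved": list(unmatched), "persistent": persistent}
```

Matching on similarity as well as ID guards against an issue being reported as both resolved and new when only its ID changed between runs.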