ln-310-multi-agent-validator
Validates Stories/Tasks or context via parallel multi-agent review (Codex + Gemini). Merges findings, debates, applies fixes. GO/NO-GO verdict.
NPX Install: npx skill4agent add levnikolaevich/claude-code-skills ln-310-multi-agent-validator
Paths: File paths (shared/, references/, ../ln-*) are relative to skills repo root. If not found at CWD, locate this SKILL.md directory and go up one level for repo root.
Multi-Agent Validator
Validates Stories/Tasks (mode=story) or arbitrary context (mode=context) with parallel multi-agent review and critical verification.
Inputs
| Input | Required | Source | Description |
|---|---|---|---|
| | mode=story | args, git branch, kanban, user | Story to process |
| | mode=context | args | File paths to review |
| | No | args or auto | Short label for file naming (default: |
| | No | args | Areas to focus on (default: all) |
| | No | args | Human-readable title (default: |
| | No | args or auto-detect | Technology stack override (e.g., |
Mode detection: "context {file1} {file2}..." → mode=context. Anything else → mode=story.
Resolution (mode=story): Story Resolution Chain. Status filter: Backlog
Purpose & Scope
- mode=story: Validate Story plus child Tasks against industry standards and project patterns. Calculate Penalty Points, auto-fix violations, delegate to ln-002 for documentation. Approve Story (Backlog -> Todo).
- mode=context: Review plans, documents, architecture proposals via multi-agent review + MCP Ref research. Advisory output only (no status changes).
- Both modes: Launch external agents (Codex + Gemini) in parallel with own validation. Merge findings, critically verify, debate, apply accepted changes.
- Support Plan Mode: show audit results, wait for approval, then fix
When to Use
- mode=story: Reviewing Stories before approval (Backlog -> Todo), validating implementation path, ensuring standards fit
- mode=context: Reviewing plans, decisions, documents, architecture proposals for independent second opinion
- Optimizing or correcting proposed approaches with multi-agent verification
Penalty Points System
Goal: Quantitative assessment of Story/Tasks quality. Before score = raw quality; After score = post-fix quality.
| Severity | Points | Description |
|---|---|---|
| CRITICAL | 10 | RFC/OWASP/security violations |
| HIGH | 5 | Outdated libraries, architecture issues |
| MEDIUM | 3 | Best practices violations |
| LOW | 1 | Structural/cosmetic issues |
Workflow:
- Audit: Calculate penalty points for all 27 criteria (Before)
- Fix: Auto-fix fixable violations; FLAGGED items keep their penalty
- Report: Before → After (0 if all fixed; >0 if FLAGGED remain)
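The Before/After arithmetic can be sketched as follows. This is a minimal illustration, not the skill's implementation: the `Violation` class, the `penalty_total` function, and the three sample violations are hypothetical; only the point values come from the severity table.

```python
from dataclasses import dataclass

# Point values taken from the severity table above.
SEVERITY_POINTS = {"CRITICAL": 10, "HIGH": 5, "MEDIUM": 3, "LOW": 1}


@dataclass
class Violation:
    criterion: int         # criterion number, e.g. 5 for Standards
    severity: str          # one of the SEVERITY_POINTS keys
    fixed: bool = False    # True once auto-fix succeeded
    flagged: bool = False  # True when auto-fix is impossible


def penalty_total(violations, after_fix=False):
    """Sum penalty points; after fixing, only unfixed violations still count."""
    return sum(
        SEVERITY_POINTS[v.severity]
        for v in violations
        if not (after_fix and v.fixed)
    )


# Hypothetical audit result: two fixable violations and one FLAGGED item.
violations = [
    Violation(criterion=5, severity="CRITICAL", fixed=True),
    Violation(criterion=6, severity="HIGH", fixed=True),
    Violation(criterion=20, severity="MEDIUM", flagged=True),  # penalty stays
]
before = penalty_total(violations)
after = penalty_total(violations, after_fix=True)
print(f"Before: {before} -> After: {after}")
```

A FLAGGED item is never marked fixed, so its points remain in the After total.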
Mode Detection
Detect operating mode at startup:
Plan Mode Active:
- Phase 1-2: Full audit (discovery + research + penalty calculation)
- Phase 3: Show results + fix plan -> WAIT for user approval
- Phase 4-6: After approval -> execute fixes
Normal Mode:
- Phase 1-6: Standard workflow without stopping
- Automatically fix and approve
Plan Mode: Progress Tracking with TodoWrite
In every mode, the skill MUST create a detailed todo checklist tracking ALL phases and steps.
Rules:
- Create todos IMMEDIATELY before Phase 1
- Each phase step = separate todo item
- Mark in_progress before starting step, completed after finishing
Todo Template (~21 items):
Phase 1: Discovery & Loading
- Auto-discover configuration (Team ID, docs)
- Load Story metadata (ID, title, status, labels)
- Load Tasks metadata (1-8 implementation tasks)
Phase 2: Research & Audit
- Extract technical domains from Story/Tasks
- Delegate documentation creation to ln-002
- Research via MCP Ref (RFC, OWASP, library versions)
- Verify technical claims (Anti-Hallucination)
- Pre-mortem Analysis (complex Stories)
- Calculate Penalty Points (27 criteria)
Phase 3: Audit Results & Fix Plan
- Display Penalty Points table and fix plan
- Wait for user approval (Plan Mode only)
Phase 4: Auto-Fix (11 groups)
- Fix Structural violations (#1-#4, #24)
- Fix Standards violations (#5)
- Fix Solution violations (#6, #21)
- Fix Workflow violations (#7-#13)
- Fix Quality violations (#14-#15)
- Fix Dependencies violations (#18-#19/#19b)
- Fix Cross-Reference violations (#25-#26)
- Fix Risk violations (#20)
- Fix Pre-mortem violations (#27)
- Fix Verification violations (#22)
- Fix Traceability violations (#16-#17)
Phase 1b: Agent Launch (mode=story)
- Health check: agent availability
- Build prompt from review_base.md + modes/story.md (per shared workflow "Step: Build Prompt")
- Launch codex-review + gemini-review as background tasks
Phase 1b: Agent Launch (mode=context)
- Health check: agent availability
- Build prompt from review_base.md + modes/context.md (per shared workflow "Step: Build Prompt")
- Launch codex-review + gemini-review as background tasks
Phase 5: Merge + Critical Verification (MANDATORY)
- Wait for agent results (process-as-arrive)
- Re-read lines modified in Phase 4 auto-fix (agents saw pre-fix state)
- Dedup against Claude's findings + review history
- Critical Verification + Debate per shared workflow
- Apply accepted suggestions
- Save review summary to .agent-review/review_history.md
Phase 6: Approve & Notify (mode=story only)
- Set Story/Tasks to Todo status in Linear
- Update kanban_board.md with APPROVED marker
- Add Linear comment with validation summary
- Display tabular output to terminal
Workflow
Phase 0: Tools Config
MANDATORY READ: Load shared/references/tools_config_guide.md, shared/references/storage_mode_detection.md, and shared/references/input_resolution_pattern.md
Extract: task_provider = Task Management → Provider (linear | file).
All subsequent phases use task_provider to select operations per storage_mode_detection.md.
Phase 1: Discovery & Loading
Step 1: Resolve storyId (per input_resolution_pattern.md):
- IF args provided → use args
- ELSE IF git branch matches feature/{id}-* → extract id
- ELSE IF kanban has exactly 1 Story in [Backlog] → suggest
- ELSE → AskUserQuestion: show Stories from kanban filtered by [Backlog]
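The resolution chain above might look like this in code. It is a sketch under assumptions: the function name, the return shape, and the sample IDs such as `US004` are hypothetical, and the branch regex assumes the `{id}` part contains no hyphen. The authoritative rules live in input_resolution_pattern.md.

```python
import re


def resolve_story_id(args_id, git_branch, backlog_story_ids):
    """Return (story_id, source); story_id is None when the user must choose."""
    if args_id:
        return args_id, "args"
    # Branch pattern feature/{id}-*; assumes the id itself has no hyphen.
    match = re.match(r"feature/([^-/]+)-", git_branch or "")
    if match:
        return match.group(1), "git branch"
    if len(backlog_story_ids) == 1:
        return backlog_story_ids[0], "kanban (suggested)"
    return None, "ask user"


print(resolve_story_id(None, "feature/US004-login", ["US001", "US002"]))
```

Each rule only fires when every earlier rule failed, which mirrors the IF / ELSE IF ordering of the list.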
Step 2: Configuration & Metadata Loading
- Auto-discover configuration: Team ID (docs/tasks/kanban_board.md), project docs (CLAUDE.md), epic from Story.project
- Load metadata only: Story ID/title/status/labels, child Task IDs/titles/status/labels
  - IF task_provider = linear: get_issue(storyId) + list_issues(parentId=storyId)
  - IF task_provider = file: Read story.md + Glob("docs/tasks/epics/*/stories/*/tasks/*.md")
- Expect 1-8 implementation tasks; record parentId for filtering
- Rationale: keep loading light; full descriptions arrive in Phase 2
Phase 1b: Agent Launch
MANDATORY READ: Load shared/references/agent_review_workflow.md, shared/references/agent_delegation_pattern.md
mode=story:
- Health Check (per shared workflow "Step: Health Check"):
  - Read docs/environment_state.json → exclude agents with disabled: true
  - Run python shared/agents/agent_runner.py --health-check for remaining agents
  - If 0 agents available → skip agent review, proceed with Claude-only validation
- Get references:
  - IF task_provider = linear: get_issue(storyId) → Story URL, list_issues(parent=storyId) → Task URLs
  - IF task_provider = file: Read story.md, Glob tasks → paths
- Build prompt: Assemble from shared/agents/prompt_templates/review_base.md + modes/story.md (per shared workflow "Step: Build Prompt"), replace {story_ref}, {task_refs}. Save to .agent-review/{identifier}_storyreview_prompt.md
- Launch BOTH agents as background tasks (per shared workflow "Step: Run Agents")
mode=context:
- Health Check: same as above
- Resolve identifier: If not provided, generate review_YYYYMMDD_HHMMSS
- Materialize context (if needed): If context is from chat → write to .agent-review/context/{identifier}_context.md
- Build prompt: Assemble from shared/agents/prompt_templates/review_base.md + modes/context.md (per shared workflow "Step: Build Prompt"), replace {review_title}, {context_refs}, {focus_areas}. Save to .agent-review/{identifier}_contextreview_prompt.md
- Launch BOTH agents as background tasks
Agents now run in background. Claude proceeds to foreground work.
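For illustration, the default identifier and the prompt file locations named in the steps above could be derived like this. The function names are hypothetical; the timestamp format and path patterns follow the text.

```python
from datetime import datetime


def default_identifier(now=None):
    """Build the fallback identifier review_YYYYMMDD_HHMMSS."""
    now = now or datetime.now()
    return now.strftime("review_%Y%m%d_%H%M%S")


def prompt_path(identifier, mode):
    """Return where the assembled prompt is saved for the given mode."""
    suffix = "storyreview" if mode == "story" else "contextreview"
    return f".agent-review/{identifier}_{suffix}_prompt.md"


identifier = default_identifier(datetime(2026, 2, 3, 14, 5, 9))
print(prompt_path(identifier, "context"))
```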
Foreground: mode=context (skip Phases 2-4, run MCP Ref research instead)
MANDATORY READ: Load references/context_review_pipeline.md (weight table, stack detection, safety rules) and shared/references/research_tool_fallback.md
While agents run in background, Claude performs foreground research:
a) Load Review Memory: per shared workflow "Step: Load Review Memory"
b) Applicability Check: scan context_files for technology decision signals (infrastructure, API/protocol, security, library/framework choices). No signals → skip MCP Ref, proceed to Phase 5.
c) Stack Detection: detect query_prefix from: tech_stack input > docs/tools_config.md > indicator files (*.csproj, package.json, etc.)
d) Extract Topics (3-5): parse context_files for technology decisions, score by weight, take top 3-5
e) MCP Ref Research: per research_tool_fallback.md chain (Ref → Context7 → WebSearch). Query: "{query_prefix} {topic} RFC standard best practices {current_year}"
f) Compare & Correct: if MCP Ref contradicts plan statement (high confidence), apply surgical Edit with inline rationale "(per {RFC/standard}: ...)". Max 5 corrections per run. In Plan Mode → output to chat, skip edits until approved.
g) Save Findings: write to .agent-review/context/{identifier}_mcp_ref_findings.md (per references/mcp_ref_findings_template.md). Display: "MCP Ref: {N} topics validated, {M} corrections, {K} confirmed"
Then proceed to Phase 5 (Merge).
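Steps d) and e) can be sketched as below. The topic weights and the "Node.js" prefix are invented for the example (the real weight table is in references/context_review_pipeline.md), and both function names are hypothetical; only the query template comes from the text.

```python
from datetime import date


def top_topics(weighted_topics, limit=5):
    """Return the highest-weight topics, at most `limit` of them."""
    ranked = sorted(weighted_topics.items(), key=lambda item: item[1], reverse=True)
    return [topic for topic, _ in ranked[:limit]]


def build_query(query_prefix, topic, year=None):
    """Fill the research query template for one topic."""
    year = year or date.today().year
    return f"{query_prefix} {topic} RFC standard best practices {year}"


# Hypothetical weights for three extracted topics.
topics = top_topics({"rate limiting": 3, "logging": 1, "OAuth": 5}, limit=2)
queries = [build_query("Node.js", topic, year=2026) for topic in topics]
print(queries)
```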
Phase 2: Research & Audit (mode=story only)
PREREQUISITE: Phase 1b (Agent Launch) must have completed before Phase 2. If agents were not launched and health check was not run, go back to Phase 1b.
MANDATORY READ: Load references/phase2_research_audit.md for complete research and audit procedure:
- Domain extraction from Story/Tasks
- Documentation delegation to ln-002 (guides/manuals/ADRs)
- MCP research (RFC/OWASP/library versions via Ref + Context7)
- Anti-Hallucination verification (evidence-based claims)
- Pre-mortem Analysis (Tigers → #20, Elephants → #24)
- Penalty Points calculation (27 criteria, see Auto-Fix Actions Reference in same file)
Always execute for every Story - no exceptions.
Phase 3: Audit Results & Fix Plan
Display audit results:
- Penalty Points table (criterion, severity, points, description)
- Total: X penalty points
- Fix Plan: list of fixes for each criterion
Mode handling:
- IF Plan Mode: Show results + "After your approval, changes will be applied" -> WAIT
- ELSE (Normal Mode): Proceed to Phase 4 immediately
Phase 4: Auto-Fix
Execute fixes for ALL 27 criteria on the spot.
- Execution order (11 groups):
  - Structural (#1-#4, #24): Story/Tasks template compliance + AC completeness/specificity + Assumption Registry
  - Standards (#5): RFC/OWASP compliance FIRST (before YAGNI/KISS!)
  - Solution (#6, #21): Library versions, alternative solutions
  - Workflow (#7-#13): Test strategy, docs integration, size, cleanup, YAGNI, KISS, task order
  - Quality (#14-#15): Documentation complete, hardcoded values
  - Dependencies (#18-#19/#19b): Story/Task independence (no forward deps), parallel group validity
  - Cross-Reference (#25-#26): AC overlap with siblings, task duplication across Stories
  - Risk (#20): Implementation risk analysis (after dependencies resolved, before traceability)
  - Pre-mortem (#27): Tiger/Paper Tiger/Elephant classification (complex Stories)
  - Verification (#22): AC verify methods exist for all task ACs (test/command/inspect)
  - Traceability (#16-#17): Story-Task alignment, AC coverage quality (LAST, after all fixes)
- Use Auto-Fix Actions table below as authoritative checklist
- Zero out penalty points as fixes applied
- Test Strategy section must exist but remain empty (testing handled separately)
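One way to enforce the group order is a lookup table keyed by criterion. This is only a sketch: `FIX_GROUPS` and `fix_order` are hypothetical names, criteria are represented as strings so that #19b fits, and a criterion not listed in any group raises a KeyError.

```python
# Group order and criterion numbers as listed in the execution order above.
FIX_GROUPS = [
    ("Structural", ["1", "2", "3", "4", "24"]),
    ("Standards", ["5"]),
    ("Solution", ["6", "21"]),
    ("Workflow", ["7", "8", "9", "10", "11", "12", "13"]),
    ("Quality", ["14", "15"]),
    ("Dependencies", ["18", "19", "19b"]),
    ("Cross-Reference", ["25", "26"]),
    ("Risk", ["20"]),
    ("Pre-mortem", ["27"]),
    ("Verification", ["22"]),
    ("Traceability", ["16", "17"]),
]

# Map each criterion to the index of its group.
GROUP_INDEX = {
    criterion: index
    for index, (_, criteria) in enumerate(FIX_GROUPS)
    for criterion in criteria
}


def fix_order(violated):
    """Sort violated criteria by group; the stable sort keeps input order within a group."""
    return sorted(violated, key=lambda criterion: GROUP_INDEX[criterion])


print(fix_order(["16", "13", "5", "1"]))
```

Sorting by group index rather than criterion number is what puts Traceability (#16-#17) last even though its numbers are lower than those of Risk or Pre-mortem.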
Phase 5: Merge + Critical Verification (MANDATORY — DO NOT SKIP)
MANDATORY STEP: This phase merges agent results (launched in Phase 1b) with Claude's own findings. Agents were already running in background during Phases 2-4 (mode=story) or during foreground research (mode=context).
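The background-launch and later-merge pattern described here can be illustrated with Python futures. Everything in this sketch is an assumption made for illustration: the real agents are external processes that write result files, `run_agent` is a stand-in, suggestions are compared by exact text, and duplicates between the two agents are also dropped.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_agent(name, delay, suggestions):
    """Hypothetical stand-in for an external agent run."""
    time.sleep(delay)
    return name, suggestions


def merge_as_arrive(futures, own_findings, history):
    """Collect suggestions as each agent finishes, skipping already-known items."""
    seen = set(own_findings) | set(history)
    to_evaluate = []
    for future in as_completed(futures):
        name, suggestions = future.result()
        for suggestion in suggestions:
            if suggestion in seen:
                continue  # covered by own findings, history, or the other agent
            seen.add(suggestion)
            to_evaluate.append((name, suggestion))
    return to_evaluate


with ThreadPoolExecutor(max_workers=2) as pool:
    # Phase 1b: launch both agents in the background.
    futures = [
        pool.submit(run_agent, "codex", 0.05, ["add rate limit", "pin version"]),
        pool.submit(run_agent, "gemini", 0.0, ["pin version", "add retry"]),
    ]
    # Foreground work (Phases 2-4) would run here and produce own findings.
    own_findings = ["add rate limit"]
    # Phase 5: process results in completion order.
    pending = merge_as_arrive(futures, own_findings, history=[])

print(sorted(text for _, text in pending))
```

The items left in `pending` are the ones that would still go through evaluation and, on disagreement, debate.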
MANDATORY READ: Load shared/references/agent_review_workflow.md (Critical Verification + Debate), shared/references/agent_review_memory.md
- Wait for agent results: read result files as they arrive (process-as-arrive pattern)
- Parse agent suggestions from both agents' result files
- MERGE: Claude's own findings (Phase 2-4 violations for mode=story, MCP Ref findings for mode=context) + Agent suggestions
- If agent suggestion targets lines modified in Phase 4 auto-fix, re-read affected lines before evaluation (agents saw pre-fix state, files are now post-fix)
- For EACH agent suggestion:
  - Dedup against Claude's own findings (skip if already covered)
  - Dedup against review history (skip if already addressed)
  - Claude Evaluation: is it real? Actionable? Applies to our context?
  - MCP Ref enhancement (mode=context): agent suggestion contradicts MCP Ref finding → DISAGREE with citation; aligns → AGREE; not covered → standard evaluation
  - AGREE → accept. DISAGREE → debate (Challenge + Follow-Up per shared workflow)
- Apply accepted suggestions:
  - mode=story → apply to Story/Tasks text
  - mode=context → output to chat as advisory
- Save review summary to .agent-review/review_history.md
- If verdict = SKIPPED (no agents at health check) → proceed to Phase 6 unchanged
- Display: "Agent Review: codex ({accepted}/{total}), gemini ({accepted}/{total}), {N} suggestions applied"
Phase 6: Approve & Notify (mode=story only)
mode=context: Skip Phase 6. Return suggestions as advisory output. Done.
- Set Story + all Tasks to Todo; update kanban_board.md with APPROVED marker
  - IF task_provider = linear: save_issue({id, state: "Todo"}) for Story + each Task
  - IF task_provider = file: Edit **Status:** line to Todo in story.md + each task file
- Add validation summary comment:
  - IF task_provider = linear: create_comment({issueId, body}) on Story
  - IF task_provider = file: Write comment to docs/tasks/epics/.../comments/{ISO-timestamp}.md
  - Content: Penalty Points table (Before -> After = 0), Auto-Fixes Applied, Documentation Created (via ln-002), Standards Compliance Evidence
- Display tabular output (Unicode box-drawing) to terminal with Before/After scores
- Recommended next step: ln-400-story-executor to start Story execution
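For the file provider, the status edit amounts to rewriting one line of markdown. The sketch below assumes the status is stored on a line beginning with `**Status:**`; the `set_status` helper and the sample document are hypothetical.

```python
import re

# Matches a whole line that starts with the bold Status label.
STATUS_LINE = re.compile(r"^\*\*Status:\*\*.*$", re.MULTILINE)


def set_status(markdown, new_status="Todo"):
    """Rewrite the first **Status:** line; raise if the document has none."""
    updated, count = STATUS_LINE.subn(
        lambda match: f"**Status:** {new_status}", markdown, count=1
    )
    if count == 0:
        raise ValueError("no **Status:** line found")
    return updated


story = "# Story\n**Status:** Backlog\nBody text\n"
print(set_status(story))
```

Raising on a missing status line avoids silently reporting a Story as approved when nothing was changed.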
Auto-Fix Actions Reference
MANDATORY READ: Load references/phase2_research_audit.md for complete 27-criteria table with:
- Structural (#1-#4, #24): Story/Task template compliance, Assumption Registry
- Standards (#5): RFC/OWASP compliance
- Solution (#6, #21): Library versions, alternatives
- Workflow (#7-#13): Test strategy, docs, size, YAGNI/KISS, task order
- Quality (#14-#15): Documentation, hardcoded values
- Dependencies (#18-#19/#19b): No forward dependencies
- Cross-Reference (#25-#26): AC overlap, task duplication across sibling Stories
- Risk (#20): Implementation risk analysis
- Pre-mortem (#27): Tiger/Paper Tiger/Elephant classification
- Traceability (#16-#17): Story-Task alignment, AC coverage
Maximum Penalty: 110 points (sum of all 27 criteria; #20 capped at 15; #25 max 1 CRITICAL = 10)
Final Assessment Model
Two-stage assessment: Before (raw audit) and After (post auto-fix).
| Metric | Before | After | Meaning |
|---|---|---|---|
| Penalty Points | Raw audit total | Remaining after fixes | 0 = all fixed; >0 = unfixable items |
| Readiness Score | | | Quality confidence (1-10) |
| Anti-Hallucination | — | VERIFIED / FLAGGED | Technical claims verified |
| AC Coverage | — | N/N (target 100%) | All ACs mapped to Tasks |
| Gate | — | GO / NO-GO | Final verdict |
GO/NO-GO Decision
| Gate | Condition |
|---|---|
| GO | After Penalty Points = 0 AND no FLAGGED criteria |
| NO-GO | After Penalty Points > 0 OR any criterion FLAGGED as unfixable |
FLAGGED criteria: If auto-fix is impossible (MCP Ref unavailable, external dependency), penalty stays — it is NOT zeroed out. User must resolve manually before re-validation.
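The gate reduces to a single condition. A minimal sketch, with a hypothetical function name and inputs:

```python
def gate(after_points, flagged_criteria):
    """GO only when no penalty points remain and nothing is FLAGGED."""
    return "GO" if after_points == 0 and not flagged_criteria else "NO-GO"


print(gate(0, []))      # all violations fixed
print(gate(3, ["20"]))  # one FLAGGED criterion keeps its penalty
```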
Anti-Hallucination Verification
Verify technical claims have evidence:
| Claim Type | Verification |
|---|---|
| RFC/Standard reference | MCP Ref search confirms existence |
| Library version | Context7 query confirms version |
| Security requirement | OWASP/CWE reference exists |
| Performance claim | Benchmark/doc reference |
Status: VERIFIED (all claims sourced) or FLAGGED (unverified claims listed)
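As a small illustration of the status rule, assume each claim is recorded together with the evidence found for it (or None when nothing was found). The data shape and function name are hypothetical.

```python
def anti_hallucination_status(claims):
    """claims: list of (claim_text, evidence_or_None) pairs."""
    unverified = [text for text, evidence in claims if not evidence]
    status = "FLAGGED" if unverified else "VERIFIED"
    return status, unverified


claims = [
    ("Uses RFC 7231 status codes", "MCP Ref search result"),
    ("Handles 10k requests per second", None),  # no benchmark or doc reference
]
print(anti_hallucination_status(claims))
```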
Task-AC Coverage Matrix
Output explicit mapping:
| AC | Task(s) | Coverage |
|----|---------|----------|
| AC1: Given/When/Then | T-001, T-002 | ✅ |
| AC2: Given/When/Then | T-003 | ✅ |
| AC3: Given/When/Then | - | ❌ UNCOVERED |
Coverage: {covered}/{total} ACs (target: 100%)
Self-Audit Protocol (Mandatory)
Verify all 27 criteria (#1-#27) from Auto-Fix Actions pass with concrete evidence (doc path, MCP result, Linear update) before proceeding to Phase 6.
Critical Rules
- All 27 criteria MUST be verified with concrete evidence (doc path, MCP result, Linear update) before Phase 6 (Self-Audit Protocol)
- Fix execution order is strict: Structural -> Standards -> Solution -> Workflow -> Quality -> Dependencies -> Cross-Reference -> Risk -> Pre-mortem -> Verification -> Traceability (standards before YAGNI/KISS)
- If auto-fix succeeds, zero out that criterion's penalty. If auto-fix is impossible (e.g., MCP Ref unavailable, external dependency), mark as FLAGGED with reason — penalty stays, Gate = NO-GO, user must resolve manually
- Test Strategy section must exist but remain empty (testing handled separately by other skills)
- In Plan Mode, MUST stop after Phase 3 and wait for user approval before applying any fixes
Definition of Done
- Phases 1-6 completed: metadata loaded, research done, penalties calculated, fixes applied, agent review done, Story approved.
- Penalty Points After = 0 (all violations fixed and none FLAGGED). Readiness Score After = 10.
- Anti-Hallucination: VERIFIED (all claims sourced via MCP).
- AC Coverage: 100% (each AC mapped to ≥1 Task).
- Agent Review: agents launched in background before Phase 2, results merged in Phase 5, suggestions verified + debated, accepted applied (or SKIPPED if no agents).
- Story/Tasks set to Todo; kanban updated; Linear comment with Final Assessment posted.
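The AC Coverage item in this checklist (each AC mapped to at least one Task) can be checked mechanically. The sketch assumes the Task-AC matrix is available as a mapping from AC label to task IDs; `ac_coverage` is a hypothetical helper, and the sample data mirrors the matrix shown earlier.

```python
def ac_coverage(ac_to_tasks):
    """Return (covered, total, uncovered_acs) for an AC -> tasks mapping."""
    uncovered = [ac for ac, tasks in ac_to_tasks.items() if not tasks]
    total = len(ac_to_tasks)
    return total - len(uncovered), total, uncovered


matrix = {
    "AC1": ["T-001", "T-002"],
    "AC2": ["T-003"],
    "AC3": [],  # no task covers this AC
}
covered, total, uncovered = ac_coverage(matrix)
print(f"Coverage: {covered}/{total} ACs, uncovered: {uncovered}")
```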
Example Workflow
Story: "Create user management API with rate limiting"
- Phase 1: Load metadata (5 Tasks, status Backlog)
- Phase 1b: Health check → launch codex-review + gemini-review in background
- Phase 2:
  - Domain extraction: REST API, Rate Limiting
  - Delegate ln-002: creates Guide-05 (REST patterns), Guide-06 (Rate Limiting)
  - MCP Ref: RFC 7231 compliance, OWASP API Security
  - Context7: Express v4.19 is the latest (Story currently specifies v4.17)
  - Penalty Points: 18 total (version=5, missing docs=5, structure=3, standards=5)
- Phase 3:
  - Show Penalty Points table
  - IF Plan Mode: "18 penalty points found. Fix plan ready. Approve?"
- Phase 4:
  - Fix #6: Update Express v4.17 -> v4.19
  - Fix #5: Add RFC 7231 compliance notes
  - Fix #13: Add Guide-05, Guide-06 references
  - Fix #17: Docs already created by ln-002
  - All fixes applied, Penalty Points = 0
- Phase 5: Merge agent results (launched in Phase 1b) + Claude's findings → verify, debate, apply
- Phase 6: Story -> Todo, tabular report
Template Loading
Templates: story_template.md, task_template_implementation.md
Loading Logic:
- Check if docs/templates/{template}.md exists in target project
- IF NOT EXISTS:
  a. Create docs/templates/ directory if missing
  b. Copy shared/templates/{template}.md → docs/templates/{template}.md
  c. Replace placeholders in the LOCAL copy:
    - {{TEAM_ID}} → from docs/tasks/kanban_board.md
    - {{DOCS_PATH}} → "docs" (standard)
- Use LOCAL copy (docs/templates/{template}.md) for all validation operations
Rationale: Templates are copied to target project on first use, ensuring:
- Project independence (no dependency on skills repository)
- Customization possible (project can modify local templates)
- Placeholder replacement happens once at copy time
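The copy-on-first-use logic might be sketched as follows. The `load_template` function and the "TEAM-1" value are hypothetical, and the sketch takes the team ID as a parameter instead of parsing docs/tasks/kanban_board.md. The demo runs in a temporary directory so it does not touch a real project.

```python
import shutil
import tempfile
from pathlib import Path


def load_template(project_root, skills_root, template, team_id):
    """Return the local template path, copying and filling it on first use."""
    local = Path(project_root) / "docs" / "templates" / template
    if not local.exists():
        local.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(Path(skills_root) / "shared" / "templates" / template, local)
        # Placeholders are replaced once, at copy time, in the LOCAL copy only.
        text = local.read_text(encoding="utf-8")
        text = text.replace("{{TEAM_ID}}", team_id).replace("{{DOCS_PATH}}", "docs")
        local.write_text(text, encoding="utf-8")
    return local


with tempfile.TemporaryDirectory() as tmp:
    skills = Path(tmp) / "skills"
    project = Path(tmp) / "project"
    source_dir = skills / "shared" / "templates"
    source_dir.mkdir(parents=True)
    (source_dir / "story_template.md").write_text(
        "Team: {{TEAM_ID}}, docs in {{DOCS_PATH}}/", encoding="utf-8"
    )
    local = load_template(project, skills, "story_template.md", team_id="TEAM-1")
    rendered = local.read_text(encoding="utf-8")

print(rendered)
```

Because an existing local copy is returned untouched, project-specific edits to the template survive later runs.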
Reference Files
- Tools config: shared/references/tools_config_guide.md
- Storage mode operations: shared/references/storage_mode_detection.md
- AC validation rules: shared/references/ac_validation_rules.md
- Plan mode behavior: shared/references/plan_mode_pattern.md
- Final Assessment: references/readiness_scoring.md (GO/NO-GO rules, Readiness Score calculation)
- Templates (centralized): shared/templates/story_template.md, shared/templates/task_template_implementation.md
- Local copies: docs/templates/ (in target project)
- Validation Checklists (Progressive Disclosure):
  - references/structural_validation.md (criteria #1-#4)
  - references/standards_validation.md (criterion #5)
  - references/solution_validation.md (criterion #6)
  - references/workflow_validation.md (criteria #7-#13)
  - references/quality_validation.md (criteria #14-#15)
  - references/dependency_validation.md (criteria #18-#19/#19b)
  - references/risk_validation.md (criterion #20)
  - references/cross_reference_validation.md (criteria #25-#26)
  - references/premortem_validation.md (criterion #27)
  - references/traceability_validation.md (criteria #16-#17)
  - references/domain_patterns.md (pattern registry for ln-002 delegation)
  - references/penalty_points.md (penalty system details)
- Prevention checklist: shared/references/creation_quality_checklist.md (creator-facing mapping of 27 criteria)
- MANDATORY READ: shared/templates/linear_integration.md, shared/references/research_tool_fallback.md
- Agent review workflow: shared/references/agent_review_workflow.md
- Agent delegation pattern: shared/references/agent_delegation_pattern.md
- Agent review memory: shared/references/agent_review_memory.md
- Review templates: shared/agents/prompt_templates/review_base.md + modes/story.md, modes/context.md
- Challenge template: shared/agents/prompt_templates/challenge_review.md
- MCP Ref findings template: references/mcp_ref_findings_template.md
- Context review pipeline: references/context_review_pipeline.md (weight table, stack detection, safety rules for mode=context)
Version: 7.0.0
Last Updated: 2026-02-03