## Dynamic Context
- Args: $ARGUMENTS
- Branch: !`git branch --show-current`
- Config: !`test -f .claude/autonomous-tests.json && echo "YES" || echo "NO — requires autonomous-tests config"`
- Pending fixes: !`find docs/_autonomous/pending-fixes -name '*.md' 2>/dev/null | wc -l | tr -d ' '`
- Fix results: !`find docs/_autonomous/fix-results -name '*.md' 2>/dev/null | wc -l | tr -d ' '`
- Test results: !`find docs/_autonomous/test-results -name '*.md' 2>/dev/null | wc -l | tr -d ' '`
- Agent Teams: !`python3 -c "import json;s=json.load(open('$HOME/.claude/settings.json'));print('ENABLED' if s.get('env',{}).get('CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS')=='1' else 'DISABLED')" 2>/dev/null || echo "DISABLED — settings not found"`
## Role

Project-agnostic autonomous fix runner. Reads findings from autonomous-tests output, lets the user select items to fix, plans and executes fixes via Agent Teams, verifies results, and updates documentation to enable re-testing — creating a bidirectional test-fix loop.
## Arguments: $ARGUMENTS
| Arg | Meaning |
|---|---|
| (empty) | Default: interactive selection via AskUserQuestion |
| | Select all fixable items (V, F, T prefixes) |
| | Pre-select items with Severity = Critical |
| | Pre-select items with Severity = Critical or High |
| | Pre-select all security/vulnerability items (V-prefix) |
| | Target a specific pending-fixes or test-results file |
Print resolved scope, then proceed without waiting.
## Phase 0 — Configuration

### Step 0: Prerequisites Check
Read `~/.claude/settings.json` and check two things:

- Agent teams feature flag: verify `env.CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS` is `"1"`. If missing or not `"1"`, STOP and tell the user: "Agent teams are required for this skill but not enabled. Run: `bash <skill-dir>/scripts/setup-hook.sh`. This enables the `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS` flag and the required hooks in your settings." Do not proceed until the flag is confirmed enabled.
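For reference, a minimal sketch of the settings entry the flag check expects, assuming the standard `~/.claude/settings.json` layout that the Dynamic Context one-liner reads:

```json
{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  }
}
```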
- Hooks (informational): if the `ExitPlanMode` or `AskUserQuestion` hooks are not present in global settings, inform the user: "This skill includes ExitPlanMode and AskUserQuestion as skill-scoped hooks, so they work automatically during runs of this skill. To also enable them globally, run the setup script above." Then continue — do not block on this.
### Step 1: Config Validation
This skill reuses `.claude/autonomous-tests.json` — no separate config file.

- Run `test -f .claude/autonomous-tests.json && echo "CONFIG_EXISTS" || echo "CONFIG_MISSING"` in Bash.
- If `CONFIG_MISSING`, STOP: "No autonomous-tests config found. Run autonomous-tests first to set up your project and generate test findings."
- Read the config and validate the required fields.
- Verify config trust: compute the SHA-256 hash (same method as autonomous-tests) and check it against `~/.claude/trusted-configs/`. If untrusted, show the config for confirmation (redact values).
- Ensure `documentation.fixResults` is set: if missing, add `"fixResults": "docs/_autonomous/fix-results"` to the config and save.
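The trust check above might be sketched as follows; the `trusted-configs` layout (one marker file named by the config's hash) is an assumption here, and the skill defines the exact method:

```shell
# Hash the config and look for a matching marker under ~/.claude/trusted-configs/.
# Falls back to a throwaway file so the sketch runs anywhere.
CONFIG=.claude/autonomous-tests.json
[ -f "$CONFIG" ] || { CONFIG=$(mktemp); echo '{}' > "$CONFIG"; }

HASH=$(python3 -c "import hashlib,sys;print(hashlib.sha256(open(sys.argv[1],'rb').read()).hexdigest())" "$CONFIG")

if [ -f "$HOME/.claude/trusted-configs/$HASH" ]; then
  echo "TRUSTED"
else
  echo "UNTRUSTED"   # show the redacted config and ask for confirmation
fi
```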
### Step 2: Findings Scan
Scan the configured documentation directories:

- `documentation.pendingFixes` → pending-fixes documents
- `documentation.testResults` → test-results documents
- `documentation.fixResults` → prior fix-results (for context)

If no pending-fixes exist and no test-results contain failed-test or vulnerability entries, STOP: "No findings to fix. Run autonomous-tests first to generate test results."
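A sketch of the relevant `.claude/autonomous-tests.json` section, assuming the default paths used throughout this document (the full schema belongs to autonomous-tests):

```json
{
  "documentation": {
    "pendingFixes": "docs/_autonomous/pending-fixes",
    "testResults": "docs/_autonomous/test-results",
    "fixResults": "docs/_autonomous/fix-results"
  }
}
```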
## Phase 1 — Finding Presentation (User Selection Gate)
Parse all finding documents following the rules in `references/finding-parser.md`. Build a structured summary with four categories:

- Vulnerabilities (V-prefix): Items with a security-related Category, or sourced from `### API Response Security` subsections. Each shows:
  - OWASP category
  - Severity
  - Regulatory impact (LGPD/GDPR/CCPA/HIPAA)
  - Exploitability assessment
  - Compliance risk level
- Bugs (F-prefix): From pending-fixes, non-security categories.
- Failed Tests (T-prefix): From test-results.
- Informational: Guided tests (G), autonomous tests (A) — counts only, not selectable.
Argument-based pre-selection:

- Select-all argument → all V, F, T items
- Critical argument → items with Severity = Critical
- High argument → items with Severity = Critical or High
- Security argument → all V-prefix items
- File argument → items from the specified file only

If no argument pre-selects items (empty args or default), present findings via AskUserQuestion (forced by hook — works even in dontAsk/bypass mode). Let the user choose which items to fix.
CRITICAL: Do NOT read any source code during this phase. No file reads, no grepping, no code exploration. Only parse the finding documents. Source code reading happens in Phase 2, after the user has selected items.
## Phase 2 — Plan Mode
Enter plan mode (use `/plan`). The plan MUST start with a "Context Reload" section as Step 0 containing:

- Instruction to re-read this skill file (the SKILL.md that launched this session)
- Instruction to read the config: `.claude/autonomous-tests.json`
- Instruction to read the templates file from this skill
- The resolved scope arguments
- The current branch name
- The selected items (IDs, titles, sources)
- Key context from the finding documents

This ensures that when context is cleared after plan approval, the executing agent can fully reconstruct the session state.
For each selected item, read the relevant source code and build a Fix Context Document:
- Read the files referenced in the finding (endpoint files, model files, test files)
- Trace the code path: input → processing → output
- Identify the root cause (not just the symptom)
- Design the fix
Vulnerability items (V-prefix) get enhanced context:
- Trace full input → processing → output path for the affected endpoint/handler
- Identify ALL user-controlled inputs reaching the vulnerable code
- Check for related vulnerability patterns in same file/module (e.g., if SQL injection found, check all query construction in the file)
- Assess regulatory exposure (which data protection laws apply to the exposed data)
- Security-aware remediation design: fixes must address root causes, not mask symptoms — enforce proper serialization/DTO filtering, add validation/sanitization layers, introduce rate limiting or protective guards where needed
Dependency analysis: Determine which items can be fixed independently vs. which form dependent chains:
- Independent: Items that affect different files, modules, DB collections, or endpoints with no overlap
- Dependent: Items that share files, modify the same function, or where one fix might affect another
Execution strategy:
- Independent groups → parallel (one agent per group via Agent Teams)
- Dependent chains → sequential (single agent handles the chain in order)
- Always use for agents
Wait for user approval.
## Phase 3 — Execution
Create a fix team via Agent Teams and spawn fix agents as teammates.
Standard fix agent instructions (all items):
- Read the Fix Context Document for your assigned items
- Re-read the source files to confirm current state
- Implement the fix addressing the root cause
- Run existing unit tests if configured ( from config)
- Verify the fix with targeted checks (API calls, DB queries, log inspection)
- Report results (RESOLVED/PARTIAL/UNABLE with details)
Vulnerability fix agent instructions (V-prefix items — in addition to standard):
- Remove or redact sensitive data from API responses (enforce DTO/serializer filtering)
- Add input validation and sanitization at the boundary
- Implement rate limiting, file size validation, content-type validation where applicable
- Add circuit breakers for external service interactions
- Harden error responses (no stack traces, internal metadata, or debug info in responses)
- Verify the fix doesn't introduce new attack vectors
- Check for the same vulnerability pattern in related files/endpoints
- Test with variant attack payloads (not just the original vector)
Execution flow:
- Create tasks for each item/group via — include: Fix Context Document, source file paths, fix instructions, verification steps
- Assign tasks to agents via with
- Independent groups run in parallel; dependent chains run sequentially through a single agent
- Never fix in the main conversation — always delegate to agents
- After all agents complete, shut down teammates via with
## Phase 4 — Verification
After all agents report, verify results:
Standard verification (all items):
- Confirm files were modified as expected
- Run unit tests if configured
- Check that the original issue is resolved (re-execute the failing scenario)
Security-specific verification (V-prefix items):
- Re-test the original attack vector (must be blocked)
- Test variant payloads (different injection strings, encoding bypasses, alternative file types)
- Verify no auth bypass or privilege escalation introduced
- Verify error responses are hardened (no internal metadata leakage)
- Verify sensitive data removal from API responses
- Check rate limiting is enforced (if added)
Mark each item as: RESOLVED (fix works, verified), PARTIAL (partially addressed, needs more work), or UNABLE (cannot fix autonomously, needs human intervention).
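As an illustration of variant-payload re-testing, a sketch like the following could probe an injection fix; the base URL, endpoint, parameter, and payloads are hypothetical, not taken from the findings:

```shell
# Re-test an injection vector with several payload variants.
# BASE and the /api/search endpoint are illustrative assumptions.
BASE="http://localhost:3000"
for p in "' OR '1'='1" "%27%20OR%201%3D1--" '"; DROP TABLE users;--'; do
  code=$(curl -s -o /dev/null -w '%{http_code}' -G "$BASE/api/search" --data-urlencode "q=$p")
  echo "payload=$p -> HTTP $code"   # a hardened endpoint should reject these (4xx)
done
```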
## Phase 5 — Documentation Update
Generate docs in the directories from config (create the directories if needed). Get the filename timestamp by running `date -u +"%Y-%m-%d-%H-%M-%S"` in Bash (never guess the time).

Read the templates file for the exact output structure before writing.
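The timestamp step can be sketched as below; the filename pattern is an assumption, since the exact template comes from the skill's templates file:

```shell
# UTC timestamp for output filenames (never guess the time).
TS=$(date -u +"%Y-%m-%d-%H-%M-%S")

# Hypothetical filename pattern for the fix-results document.
OUT="docs/_autonomous/fix-results/${TS}-fix-results.md"
echo "$OUT"
```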
Fix-results document: Always generated. Write to the `documentation.fixResults` path. Contains Fix Cycle Metadata, per-item results, and next steps.
Resolution blocks: For each item sourced from pending-fixes, append a resolution block to the corresponding fix entry in the original pending-fixes document.
Test-results updates: For each T-prefix item, append a fix-applied status line to the corresponding entry in the test-results document.
V-prefix items get a subsection in the fix-results document containing:
- OWASP category
- Attack vector (realistic exploitation scenario)
- Regulatory/compliance impact (which laws, what penalties)
- Mitigation description (what the fix does and why it works)
- Related patterns checked (other files/endpoints verified)
- Residual risk (if any)
## Phase 6 — Loop Signal
Summarize the fix cycle and signal readiness for re-testing:
```markdown
## Fix Cycle Complete
- Items attempted: {N}
- Resolved: {N}
- Partial: {N}
- Unable: {N}

Re-run autonomous-tests to verify: `/autonomous-tests`
```
If any items are marked for re-testing in the fix-results document, inform the user that autonomous-tests will prioritize re-testing these items on the next run.
Vulnerability warning: If any V-prefix items remain PARTIAL or UNABLE, emit a prominent warning with security priority ranking:
```markdown
⚠️ UNRESOLVED SECURITY FINDINGS

The following security items could not be fully resolved and require manual attention:

Priority order (highest risk first):
1. Data leaks — {list V-prefix items if any}
2. Credential exposure — {list V-prefix items if any}
3. Privilege escalation — {list V-prefix items if any}
4. Denial-of-service risks — {list V-prefix items if any}
5. Compliance violations — {list V-prefix items if any}
```
## Rules
- Never modify production data or connect to production services
- Never expose credentials, keys, or tokens in documentation output
- Always enter plan mode before executing fixes (Phase 2)
- Always delegate fixes to Agent Teams — never fix in main conversation
- NEVER use the tool directly for execution. ALWAYS use → → spawn agents with parameter → → . Plain calls bypass team coordination and task tracking. The tool without is PROHIBITED during Phase 3.
- Always spawn agents with for maximum reasoning capability
- Always present findings for user selection before reading source code (Phase 1 before Phase 2)
- AskUserQuestion hook ensures user selection even in dontAsk/bypass mode
- Security fixes must address root causes — not mask symptoms
- Use UTC timestamps everywhere — always obtain them from `date -u` in Bash, never guess
- Reuse `.claude/autonomous-tests.json` — never create a separate config
- Never activate Docker MCPs where
- V-prefix items always get enhanced security context, verification, and documentation