Implementation-plus-audit loop using parallel agent teams, with structured simplify, harden, and document passes. Spawns implementation agents to do the work, then audit agents to find complexity, security gaps, and spec deviations, and loops until the code compiles cleanly, all tests pass, and the auditors report zero issues -- or the loop cap is reached. Use when: implementing features from a spec or plan, hardening existing code, fixing a batch of issues, or any multi-file task that benefits from a build-verify-fix cycle.
Install:

```
npx skill4agent add pskoett/pskoett-ai-skills agent-teams-simplify-and-harden
```

or:

```
npx skills add pskoett/pskoett-ai-skills/agent-teams-simplify-and-harden
```

```
┌──────────────────────────────────────────────────────────┐
│ TEAM LEAD (you)                                          │
│                                                          │
│ Phase 1: IMPLEMENT (+ document pass on fix rounds)       │
│   ┌──────────┐  ┌──────────┐  ┌──────────┐               │
│   │ impl-1   │  │ impl-2   │  │ impl-3   │  ...          │
│   │ (general │  │ (general │  │ (general │               │
│   │ purpose) │  │ purpose) │  │ purpose) │               │
│   └──────────┘  └──────────┘  └──────────┘               │
│        │             │             │                     │
│        ▼             ▼             ▼                     │
│   ┌─────────────────────────────────────┐                │
│   │       Verify: compile + tests       │                │
│   └─────────────────────────────────────┘                │
│                     │                                    │
│ Phase 2: SIMPLIFY & HARDEN AUDIT                         │
│   ┌──────────┐  ┌──────────┐  ┌──────────┐               │
│   │ simplify │  │ harden   │  │ spec     │  ...          │
│   │ auditor  │  │ auditor  │  │ auditor  │               │
│   │ (Explore)│  │ (Explore)│  │ (Explore)│               │
│   └──────────┘  └──────────┘  └──────────┘               │
│        │             │             │                     │
│        ▼             ▼             ▼                     │
│   Exit conditions met?                                   │
│     YES → Produce summary. Ship it.                      │
│     NO  → back to Phase 1 with findings as tasks         │
│           (max 3 audit rounds)                           │
└──────────────────────────────────────────────────────────┘
```

TeamCreate:
```yaml
team_name: "<project>-harden"
description: "Implement and harden <description>"
```

TaskCreate for each unit of work:

```yaml
subject: "Implement <specific thing>"
description: "Detailed requirements, file paths, acceptance criteria"
activeForm: "Implementing <thing>"
```

To express dependencies between tasks:

```
TaskUpdate: { taskId: "2", addBlockedBy: ["1"] }
```

Implementation agents are `general-purpose` subagents. Task tool (spawn teammate):
```yaml
subagent_type: general-purpose
team_name: "<project>-harden"
name: "impl-<area>"
mode: bypassPermissions
prompt: |
  You are an implementation agent on the <project>-harden team.
  Your name is impl-<area>.
  Check TaskList for your assigned tasks and complete them.
  After completing each task, mark it completed and check for more.
  Quality gates:
  - Code must compile cleanly (substitute your project's compile
    command, e.g. bunx tsc --noEmit, cargo build, go build ./...)
  - Tests must pass (substitute your project's test command,
    e.g. bun test, pytest, go test ./...)
  - Follow existing code patterns and conventions
  When all your tasks are done, notify the team lead.
```

Collect the list of files modified in this session:

```
git diff --name-only <base-branch>   # or: git diff --name-only HEAD~N
```

Auditors run as `Explore` subagents:

| Auditor | Focus | Mindset |
|---|---|---|
| simplify-auditor | Code clarity and unnecessary complexity | "Is there a simpler way to express this?" |
| harden-auditor | Security and resilience gaps | "If someone malicious saw this, what would they try?" |
| spec-auditor | Implementation vs spec/plan completeness | "Does the code match what was asked for?" |
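The scope list pasted into each auditor prompt can be assembled from git output. A minimal sketch (the file names below are stand-ins; in practice pipe in `git diff --name-only <base-branch>`):

```shell
# Turn a modified-file list into the "Files to review" block
# for the auditor prompts.
scope_list() {
  sort -u | sed 's/^/  - /'   # dedupe and indent as list items
}

printf 'src/auth.ts\nsrc/db.ts\nsrc/auth.ts\n' | scope_list
```

Deduplication matters on fix rounds, where the same files tend to be touched again.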
Task tool (spawn teammate):

```yaml
subagent_type: Explore
team_name: "<project>-harden"
name: "simplify-auditor"
prompt: |
  You are a simplify auditor on the <project>-harden team.
  Your name is simplify-auditor.
  Your job is to find unnecessary complexity -- NOT fix it. You are
  read-only.

  SCOPE: Only review the following files (modified in this session).
  Do NOT flag issues in other files, even if you notice them.
  Files to review:
  <paste file list here>

  Fresh-eyes start (mandatory): Before reporting findings, re-read all
  listed changed code with "fresh eyes" and actively look for obvious
  bugs, errors, confusing logic, brittle assumptions, naming issues,
  and missed hardening opportunities.

  Review each file and check for:
  1. Dead code and scaffolding -- debug logs, commented-out attempts,
     unused imports, temporary variables left from iteration
  2. Naming clarity -- function names, variables, and parameters that
     don't read clearly when seen fresh
  3. Control flow -- nested conditionals that could be flattened, early
     returns that could replace deep nesting, boolean expressions that
     could be simplified
  4. API surface -- public methods/functions that should be private,
     more exposure than necessary
  5. Over-abstraction -- classes, interfaces, or wrapper functions not
     justified by current scope. Agents tend to over-engineer.
  6. Consolidation -- logic spread across multiple functions/files that
     could live in one place

  For each finding, categorize as:
  - **Cosmetic** (dead code, unused imports, naming, control flow,
    visibility reduction) -- low risk, easy fix
  - **Refactor** (consolidation, restructuring, abstraction changes)
    -- only flag when genuinely necessary, not just "slightly better."
    The bar: would a senior engineer say the current state is clearly
    wrong, not just imperfect?

  For each finding report:
  1. File and line number
  2. Category (cosmetic or refactor)
  3. What's wrong
  4. What it should be (specific fix, not vague)
  5. Severity: high / medium / low

  If you notice issues outside the scoped files, list them separately
  under "Out-of-scope observations" at the end.
  Be thorough within scope. Check every listed file.
  When done, send your complete findings to the team lead.
  If you find ZERO in-scope issues, say so explicitly.
```

Task tool (spawn teammate):
```yaml
subagent_type: Explore
team_name: "<project>-harden"
name: "harden-auditor"
prompt: |
  You are a security/harden auditor on the <project>-harden team.
  Your name is harden-auditor.
  Your job is to find security and resilience gaps -- NOT fix them.
  You are read-only.

  SCOPE: Only review the following files (modified in this session).
  Do NOT flag issues in other files, even if you notice them.
  Files to review:
  <paste file list here>

  Fresh-eyes start (mandatory): Before reporting findings, re-read all
  listed changed code with "fresh eyes" and actively look for obvious
  bugs, errors, confusing logic, brittle assumptions, naming issues,
  and missed hardening opportunities.

  Review each file and check for:
  1. Input validation -- unvalidated external inputs (user input, API
     params, file paths, env vars), type coercion issues, missing
     bounds checks, unconstrained string lengths
  2. Error handling -- non-specific catch blocks, errors logged without
     context, swallowed exceptions, sensitive data in error messages
  3. Injection vectors -- SQL injection, XSS, command injection, path
     traversal, template injection in string-building code
  4. Auth and authorization -- endpoints or functions missing auth,
     incorrect permission checks, privilege escalation risks
  5. Secrets and credentials -- hardcoded secrets, API keys, tokens,
     credentials in log output, unparameterized connection strings
  6. Data exposure -- internal state in error output, stack traces in
     responses, PII in logs, database schemas leaked
  7. Dependency risk -- new dependencies that are unmaintained, poorly
     versioned, or have known vulnerabilities
  8. Race conditions -- unsynchronized shared resources, TOCTOU
     vulnerabilities in concurrent code

  For each finding, categorize as:
  - **Patch** (adding validation, escaping output, removing a secret)
    -- straightforward fix
  - **Security refactor** (restructuring auth flow, replacing a
    vulnerable pattern) -- requires structural changes

  For each finding report:
  1. File and line number
  2. Category (patch or security refactor)
  3. What's wrong
  4. Severity: critical / high / medium / low
  5. Attack vector (if applicable)
  6. Specific fix recommendation

  If you notice issues outside the scoped files, list them separately
  under "Out-of-scope observations" at the end.
  Be thorough within scope. Check every listed file.
  When done, send your complete findings to the team lead.
  If you find ZERO in-scope issues, say so explicitly.
```

Task tool (spawn teammate):
```yaml
subagent_type: Explore
team_name: "<project>-harden"
name: "spec-auditor"
prompt: |
  You are a spec auditor on the <project>-harden team.
  Your name is spec-auditor.
  Your job is to find gaps between implementation and spec/plan --
  NOT fix them. You are read-only.

  SCOPE: Only review the following files (modified in this session).
  Do NOT flag issues in other files, even if you notice them.
  Files to review:
  <paste file list here>

  Fresh-eyes start (mandatory): Before reporting findings, re-read all
  listed changed code with "fresh eyes" and actively look for obvious
  bugs, errors, confusing logic, brittle assumptions, and
  implementation/spec mismatches before running the spec checklist.

  Review each file against the spec/plan and check for:
  1. Missing features -- spec requirements that have no corresponding
     implementation
  2. Incorrect behavior -- logic that contradicts what the spec
     describes (wrong conditions, wrong outputs, wrong error handling)
  3. Incomplete implementation -- features that are partially built
     but missing edge cases, error paths, or configuration the spec
     requires
  4. Contract violations -- API shapes, response formats, status
     codes, or error messages that don't match the spec
  5. Test coverage -- untested code paths, missing edge case tests,
     assertions that don't verify enough, happy-path-only testing
  6. Acceptance criteria gaps -- spec conditions that aren't verified
     by any test

  For each finding, categorize as:
  - **Missing** -- feature or behavior not implemented at all
  - **Incorrect** -- implemented but wrong
  - **Incomplete** -- partially implemented, gaps remain
  - **Untested** -- implemented but no test coverage

  For each finding report:
  1. File and line number (or "N/A -- not implemented")
  2. Category (missing, incorrect, incomplete, untested)
  3. What the spec requires (quote or reference the spec)
  4. What the implementation does (or doesn't do)
  5. Severity: critical / high / medium / low

  If you notice issues outside the scoped files, list them separately
  under "Out-of-scope observations" at the end.
  Be thorough within scope. Cross-reference every spec requirement.
  When done, send your complete findings to the team lead.
  If you find ZERO in-scope issues, say so explicitly.
```

Document pass (on fix rounds): after fixing your assigned issues, add up to 5 single-line comments across the files you touched on non-obvious decisions:
- Logic that needs more than 5 seconds of "why does this exist?" thought
- Workarounds or hacks, with context and a TODO for removal conditions
- Performance choices and why the current approach was picked
Do NOT comment on the audit fixes themselves -- only on decisions from the original implementation that lack explanation.
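As an illustration, a document-pass comment on a workaround might read like this (shell comment syntax; the helper and the failure reason are invented for the example):

```shell
# Hypothetical helper standing in for a flaky operation.
fetch_manifest() { true; }

# TODO(remove-workaround): retry because the upstream mirror fails
# intermittently; drop this loop once the mirror is replaced.
for attempt in 1 2 3; do
  fetch_manifest && break
  sleep 1
done
```

The comment carries the "why" and the removal condition, which is exactly what the document pass asks for.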
Mark workarounds with `// TODO` or `// FIXME` so their removal conditions are easy to find later.

## Hardening Summary
**Audit rounds completed:** 2 of 3 max
**Exit reason:** Clean audit (all auditors reported zero findings)
### Findings by round
Round 1:
- simplify-auditor: 4 cosmetic, 1 refactor (rejected -- style preference)
- harden-auditor: 2 patches, 1 security refactor (approved)
- spec-auditor: 1 missing feature
Round 2:
- simplify-auditor: 0 findings
- harden-auditor: 0 findings
- spec-auditor: 0 findings
### Actions taken
- Fixed: 6 findings (4 cosmetic, 2 patches, 1 security refactor, 1 missing feature -- rejected refactor excluded)
- Skipped: 1 refactor proposal (reason: style preference, not a defect)
- Document pass: 3 comments added across 2 files
### Unresolved
- None
### Out-of-scope observations
- <any out-of-scope items auditors flagged, for future reference>

Shutdown: send each agent a SendMessage with type: shutdown_request, then TeamDelete.

| Codebase / Task Size | Impl Agents | Audit Agents |
|---|---|---|
| Small (< 10 files) | 1-2 | 2 (simplify + harden) |
| Medium (10-30 files) | 2-3 | 2-3 |
| Large (30+ files) | 3-5 | 3 (simplify + harden + spec) |
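The sizing table can be sketched as a helper, here using the upper bound of each row (thresholds and counts taken from the table; adjust to taste):

```shell
# Pick agent counts from the sizing table (upper bounds per row).
team_size() {
  files=$1
  if [ "$files" -lt 10 ]; then
    echo "impl=2 audit=2"     # small
  elif [ "$files" -le 30 ]; then
    echo "impl=3 audit=3"     # medium
  else
    echo "impl=5 audit=3"     # large
  fi
}

team_size 12   # prints impl=3 audit=3
```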
Example run:

1. Read spec, identify 8 features to implement
2. TeamCreate: "feature-harden"
3. TaskCreate x8 (one per feature)
4. Spawn 3 impl agents, assign ~3 tasks each
5. Wait → all done → compile clean → tests pass
6. Collect modified file list (git diff --name-only)
7. Spawn 3 auditors: simplify-auditor, harden-auditor, spec-auditor
8. Simplify-auditor finds 4 cosmetic + 1 refactor proposal
9. Harden-auditor finds 2 patches + 1 security refactor
10. Spec-auditor finds 1 missing feature
11. Team lead evaluates refactors (approve security refactor,
reject simplify refactor), creates fix + document tasks
12. Spawn 2 impl agents for fixes
13. Wait → compile clean → tests pass
14. Round 2: Spawn 3 fresh auditors
15. Auditors find 0 issues → exit condition met
16. Produce hardening summary
17. Shutdown agents, TeamDelete
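The walkthrough above reduces to this control loop (a sketch only; `run_impl`, `run_gate`, and `run_audit` are placeholders for spawning impl agents, running compile+tests, and collecting auditor findings):

```shell
max_rounds=3                 # audit-round cap from the workflow
round=1
run_impl()  { :; }           # placeholder: spawn impl agents, wait
run_gate()  { true; }        # placeholder: compile + tests
run_audit() { echo 0; }      # placeholder: total auditor findings

while [ "$round" -le "$max_rounds" ]; do
  run_impl
  run_gate || { echo "gate failed in round $round"; break; }
  findings=$(run_audit)
  if [ "$findings" -eq 0 ]; then
    echo "clean audit after round $round"
    break
  fi
  round=$((round + 1))       # feed findings back as tasks, go again
done
```

With the stub functions above, the loop exits cleanly after round 1; in a real session each placeholder blocks until the corresponding phase finishes.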