/council — Multi-Model Consensus Council
Spawn parallel judges with different perspectives, consolidate into consensus. Works for any task — validation, research, brainstorming.
Quick Start
```bash
/council --quick validate recent                        # fast inline check
/council validate this plan                             # validation (2 agents)
/council brainstorm caching approaches                  # brainstorm
/council validate the implementation                    # validation (critique triggers map here)
/council research kubernetes upgrade strategies         # research
/council research the CI/CD pipeline bottlenecks        # research (analyze triggers map here)
/council --preset=security-audit validate the auth system   # preset personas
/council --deep --explorers=3 research upgrade automation   # deep + explorers
/council --debate validate the auth system              # adversarial 2-round review
/council --deep --debate validate the migration plan    # thorough + debate
/council                                                # infers from context
```
Council works independently — no RPI workflow, no ratchet chain, no CLI required. Zero setup beyond plugin install.
Modes
| Mode | Agents | Execution Backend | Use Case |
|---|---|---|---|
| `--quick` | 0 (inline) | Self | Fast single-agent check, no spawning |
| default | 2 | Runtime-native (Codex sub-agents preferred; Claude teams fallback) | Independent judges (no perspective labels) |
| `--deep` | 3 | Runtime-native | Thorough review |
| `--mixed` | 3+3 | Runtime-native + Codex CLI | Cross-vendor consensus |
| `--debate` | 2+ | Runtime-native | Adversarial refinement (2 rounds) |
```bash
/council --quick validate recent    # inline single-agent check, no spawning
/council recent                     # 2 runtime-native judges
/council --deep recent              # 3 runtime-native judges
/council --mixed recent             # runtime-native + Codex CLI
```
Spawn Backend Selection (MANDATORY)
Council must auto-select a backend using capability detection:
- If Codex experimental sub-agents (`spawn_agent`) are available, use them
- Else if Claude native teams are available, use them
- Else use `Task(run_in_background=true)` fallback
This keeps council universal across Claude and Codex sessions.
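The fallback chain above can be sketched as follows. The tool names used for detection are illustrative assumptions; the actual probe depends on what the runtime exposes:

```python
def select_backend(available_tools):
    """Pick a spawn backend by capability detection.

    Preference order mirrors the doc: Codex experimental sub-agents,
    then Claude native teams, then the background Task fallback.
    The tool names checked here are illustrative assumptions.
    """
    if "spawn_agent" in available_tools:    # Codex sub-agents
        return "codex_subagents"
    if "SendMessage" in available_tools:    # Claude native teams
        return "claude_teams"
    return "background_fallback"            # Task(run_in_background=true)
```

The returned string matches the backend labels used in Phase 1a of the execution flow.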
When to Use
Use `--debate` for high-stakes or ambiguous reviews where judges are likely to disagree:
- Security audits, architecture decisions, migration plans
- Reviews where multiple valid perspectives exist
- Cases where a missed finding has real consequences
Skip `--debate` for routine validation where consensus is expected. Debate adds R2 latency (judges stay alive and process a second round via backend messaging).
Incompatibilities:
- `--quick` and `--debate` cannot be combined. `--quick` runs inline with no spawning; `--debate` requires multi-agent rounds. If both are passed, exit with error: "Error: --quick and --debate are incompatible."
- `--debate` is only supported with validate mode. Brainstorm and research do not produce PASS/WARN/FAIL verdicts. If combined, exit with error: "Error: --debate is only supported with validate mode."
Task Types
| Type | Trigger Words | Perspective Focus |
|---|---|---|
| validate | validate, check, review, assess, critique, feedback, improve | Is this correct? What's wrong? What could be better? |
| brainstorm | brainstorm, explore, options, approaches | What are the alternatives? Pros/cons? |
| research | research, investigate, deep dive, explore deeply, analyze, examine, evaluate, compare | What can we discover? What are the properties, trade-offs, and structure? |
Natural language works — the skill infers task type from your prompt.
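The trigger-word mapping in the table above can be sketched as a simple keyword match. Checking research triggers first ensures "explore deeply" maps to research rather than brainstorm's bare "explore" (the ordering and default are assumptions):

```python
TRIGGERS = {
    "validate": ["validate", "check", "review", "assess", "critique",
                 "feedback", "improve"],
    "brainstorm": ["brainstorm", "explore", "options", "approaches"],
    "research": ["research", "investigate", "deep dive", "explore deeply",
                 "analyze", "examine", "evaluate", "compare"],
}

def infer_task_type(prompt, default="validate"):
    """Return the first task type whose trigger words appear in the prompt.

    Research is checked first so multi-word triggers like "explore deeply"
    win over brainstorm's bare "explore".
    """
    text = prompt.lower()
    for task in ("research", "brainstorm", "validate"):
        if any(word in text for word in TRIGGERS[task]):
            return task
    return default
```

With no trigger match the sketch falls back to validation, the most conservative mode.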
Architecture
Execution Flow
┌─────────────────────────────────────────────────────────────────┐
│ Phase 1: Build Packet (JSON) │
│ - Task type (validate/brainstorm/research) │
│ - Target description │
│ - Context (files, diffs, prior decisions) │
│ - Perspectives to assign │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Phase 1a: Select spawn backend │
│ codex_subagents | claude_teams | background_fallback │
│ Team lead = spawner (this agent) │
└─────────────────────────────────────────────────────────────────┘
│
┌─────────────────┴─────────────────┐
▼ ▼
┌───────────────────────┐ ┌───────────────────────┐
│ RUNTIME-NATIVE JUDGES│ │ CODEX AGENTS │
│ (spawn_agent or teams)│ │ (Bash tool, parallel)│
│ │ │ Agent 1 (independent │
│ Agent 1 (independent │ │ or with preset) │
│ or with preset) │ │ Agent 2 │
│ Agent 2 │ │ Agent 3 │
│ Agent 3 (--deep only)│ │ (--mixed only) │
│ (--deep/--mixed only)│ │ │
│ │ │ Output: JSON + MD │
│ Write files, then │ │ Files: .agents/ │
│ wait()/SendMessage to │ │ council/codex-* │
│ lead │ │ │
│ Files: .agents/ │ └───────────────────────┘
│ council/claude-* │ │
└───────────────────────┘ │
│ │
└─────────────────┬─────────────────┘
▼
┌─────────────────────────────────────────────────────────────────┐
│ Phase 2: Consolidation (Team Lead) │
│ - Receive completion from backend channel (wait/SendMessage) │
│ - Read all agent output files │
│ - If schema_version is missing from a judge's output, treat │
│ as version 0 (backward compatibility) │
│ - Compute consensus verdict │
│ - Identify shared findings │
│ - Surface disagreements with attribution │
│ - Generate Markdown report for human │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Phase 3: Cleanup │
│ - Cleanup backend resources (close_agent / TeamDelete / none) │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Output: Markdown Council Report │
│ - Consensus: PASS/WARN/FAIL │
│ - Shared findings │
│ - Disagreements (if any) │
│ - Recommendations │
└─────────────────────────────────────────────────────────────────┘
Graceful Degradation
| Failure | Behavior |
|---|---|
| 1 of N agents times out | Proceed with N-1, note in report |
| All Codex CLI agents fail | Proceed with runtime-native judges only, note degradation |
| All agents fail | Return error, suggest retry |
| Codex CLI not installed | Skip Codex CLI judges, continue runtime-native mode (warn user) |
| Codex sub-agents unavailable | Fall back to Claude teams |
| Native teams unavailable | Fall back to `Task(run_in_background=true)` fire-and-forget |
| Output dir missing | Create automatically |
Timeout: 120s per agent (configurable in seconds via environment variable; see Configuration).
Minimum quorum: At least 1 agent must respond for a valid council. If 0 agents respond, return error.
Pre-Flight Checks
- Runtime-native backend: select via capability detection (codex_subagents -> claude_teams -> `Task(run_in_background=true)`).
- Codex CLI judges (--mixed only): check that the Codex CLI is installed, test model availability, and test required feature support. Downgrade mixed mode when unavailable.
- Agent count: verify `judges * (1 + explorers) <= MAX_AGENTS (12)`
- Output dir: ensure `.agents/council/` exists (created automatically if missing)
Quick Mode (`--quick`)
Single-agent inline validation. No subprocess spawning, no Task tool, no Codex. The current agent performs a structured self-review using the same output schema as a full council.
When to use: Routine checks, mid-implementation sanity checks, pre-commit quick scan.
Execution: gather context (files, diffs) -> perform a structured self-review inline using the council output_schema (verdict, confidence, findings, recommendation) -> write the report to `.agents/council/YYYY-MM-DD-quick-<target>.md`, labeled `Mode: quick (single-agent)`.
Limitations: No cross-perspective disagreement, no cross-vendor insights, lower confidence ceiling. Not suitable for security audits or architecture decisions.
Packet Format (JSON)
The packet sent to each agent. File contents are included inline — agents receive the actual code/plan text in the packet, not just paths. This ensures both Claude and Codex agents can analyze without needing file access.
```json
{
  "council_packet": {
    "version": "1.0",
    "mode": "validate | brainstorm | research",
    "target": "Implementation of user authentication system",
    "context": {
      "files": [
        {
          "path": "src/auth/jwt.py",
          "content": "<file contents inlined here>"
        },
        {
          "path": "src/auth/middleware.py",
          "content": "<file contents inlined here>"
        }
      ],
      "diff": "git diff output if applicable",
      "spec": {
        "source": "bead na-0042 | plan doc | none",
        "content": "The spec/bead description text (optional — included when wrapper provides it)"
      },
      "prior_decisions": [
        "Using JWT, not sessions",
        "Refresh tokens required"
      ]
    },
    "perspective": "skeptic (only when --preset or --perspectives used)",
    "perspective_description": "What could go wrong? (only when --preset or --perspectives used)",
    "output_schema": {
      "verdict": "PASS | WARN | FAIL",
      "confidence": "HIGH | MEDIUM | LOW",
      "key_insight": "Single sentence summary",
      "findings": [
        {
          "severity": "critical | significant | minor",
          "category": "security | architecture | performance | style",
          "description": "What was found",
          "location": "file:line if applicable",
          "recommendation": "How to address"
        }
      ],
      "recommendation": "Concrete next step",
      "schema_version": 1
    }
  }
}
```
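Phase 1 packet assembly can be sketched as below, with file contents read and inlined so judges need no file access. The helper name and defaults are hypothetical; the field layout follows the schema above:

```python
import json
from pathlib import Path

OUTPUT_SCHEMA = {
    "verdict": "PASS | WARN | FAIL",
    "confidence": "HIGH | MEDIUM | LOW",
    "key_insight": "Single sentence summary",
    "findings": [],
    "recommendation": "Concrete next step",
    "schema_version": 1,
}

def build_packet(mode, target, paths, prior_decisions=(), diff=""):
    """Assemble a council_packet with file contents inlined, so both
    Claude and Codex judges can analyze without file access."""
    files = [{"path": str(p), "content": Path(p).read_text()} for p in paths]
    return {
        "council_packet": {
            "version": "1.0",
            "mode": mode,
            "target": target,
            "context": {
                "files": files,
                "diff": diff,
                "prior_decisions": list(prior_decisions),
            },
            "output_schema": OUTPUT_SCHEMA,
        }
    }
```

Because everything is inlined, the packet serializes to a single JSON string that can be passed verbatim to any backend.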
Perspectives
Perspectives & Presets: use the Read tool on `skills/council/references/personas.md` for persona definitions, preset configurations, and custom perspective details.
Auto-Escalation: when `--preset` or `--perspectives` specifies more perspectives than the current judge count, automatically escalate the judge count to match. An explicit judge-count override flag takes precedence over auto-escalation.
Explorer Sub-Agents
Explorer Details: use the Read tool on `skills/council/references/explorers.md` for explorer architecture, prompts, sub-question generation, and timeout configuration.
Summary: judges can spawn explorer sub-agents (`--explorers`, max 5) for parallel deep-dive research. Total agents = `judges * (1 + explorers)`, capped at MAX_AGENTS=12.
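The cap formula above can be sketched as a pre-flight check (function names are illustrative):

```python
MAX_AGENTS = 12

def check_agent_cap(judges, explorers_per_judge):
    """Verify judges * (1 + explorers) <= MAX_AGENTS, per the pre-flight
    rules; each judge plus its explorers counts toward the cap."""
    total = judges * (1 + explorers_per_judge)
    if total > MAX_AGENTS:
        raise ValueError(f"{total} agents exceeds MAX_AGENTS={MAX_AGENTS}")
    return total
```

For example, `--deep --explorers=3` gives 3 judges with 3 explorers each: 3 * (1 + 3) = 12, exactly at the cap.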
Debate Phase (`--debate`)
Debate Protocol: use the Read tool on `skills/council/references/debate-protocol.md` for the full debate execution flow, R1-to-R2 verdict injection, timeout handling, and cost analysis.
Summary: two-round adversarial review. R1 produces independent verdicts. R2 sends the other judges' verdicts via backend messaging (Codex sub-agent messaging or Claude `SendMessage`) for steel-manning and revision. Only supported with validate mode.
Agent Prompts
Agent Prompts: use the Read tool on `skills/council/references/agent-prompts.md` for judge prompts (default and perspective-based), the consolidation prompt, and the debate R2 message template.
Consensus Rules
| Condition | Verdict |
|---|---|
| All PASS | PASS |
| Any FAIL | FAIL |
| Mixed PASS/WARN | WARN |
| All WARN | WARN |
Disagreement handling:
- If Claude says PASS and Codex says FAIL → DISAGREE (surface both)
- Severity-weighted: Security FAIL outweighs style WARN
DISAGREE resolution: When vendors disagree, the spawner presents both positions with reasoning and defers to the user. No automatic tie-breaking — cross-vendor disagreement is a signal worth human attention.
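The rules table and disagreement handling above reduce to a small fold over judge verdicts. A sketch (severity weighting omitted; function names are illustrative):

```python
def consensus(verdicts):
    """Compute the council verdict per the rules table:
    any FAIL -> FAIL, all PASS -> PASS, any other mix -> WARN."""
    if not verdicts:
        raise ValueError("no verdicts to consolidate")
    if "FAIL" in verdicts:
        return "FAIL"
    if all(v == "PASS" for v in verdicts):
        return "PASS"
    return "WARN"

def cross_vendor_disagree(by_vendor):
    """True when one vendor's judges pass what another vendor's fail,
    e.g. {"claude": ["PASS"], "codex": ["FAIL"]} -> surface DISAGREE."""
    summaries = {consensus(v) for v in by_vendor.values()}
    return "PASS" in summaries and "FAIL" in summaries
```

When `cross_vendor_disagree` is true, the spawner presents both positions and defers to the user rather than tie-breaking automatically.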
Output Format
Report Templates: use the Read tool on `skills/council/references/output-format.md` for the full report templates (validate, brainstorm, research) and debate report additions (verdict shifts, convergence detection).
All reports write to `.agents/council/YYYY-MM-DD-<type>-<target>.md`.
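Report path construction from the naming scheme above can be sketched as follows; the slug rules for the target description are an assumption:

```python
import re
from datetime import date
from pathlib import Path

def report_path(task_type, target, when=None):
    """Build .agents/council/YYYY-MM-DD-<type>-<target>.md.

    The target description is slugged to lowercase alphanumerics joined
    by hyphens (slug rules are an assumption, not specified by the doc).
    """
    when = when or date.today()
    slug = re.sub(r"[^a-z0-9]+", "-", target.lower()).strip("-")
    return Path(".agents/council") / f"{when.isoformat()}-{task_type}-{slug}.md"
```

For example, a validate run against "Auth System!" on 2026-02-06 would land at `.agents/council/2026-02-06-validate-auth-system.md` under these assumptions.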
Configuration
Partial Completion
Minimum quorum: 1 agent. Recommended: 80% of judges. On timeout, proceed with remaining judges and note in report. On user cancellation, shutdown all judges and generate partial report with INCOMPLETE marker.
Environment Variables
| Variable | Default | Description |
|---|---|---|
| | 120 | Agent timeout in seconds |
| | gpt-5.3-codex | Default Codex model for --mixed |
| | opus | Claude model for agents |
| | sonnet | Model for explorer sub-agents |
| | 60 | Explorer timeout in seconds |
| | 90 | Maximum wait time for R2 debate completion after debate messages are sent. Shorter than R1 since judges already have context. |
Flags
| Flag | Description |
|---|---|
| `--deep` | 3 Claude agents instead of 2 |
| `--mixed` | Add 3 Codex agents |
| `--debate` | Enable adversarial debate round (2 rounds via backend messaging, same agents). Incompatible with `--quick`. |
| | Override timeout in seconds (default: 120) |
| `--perspectives` | Custom perspective names |
| `--preset` | Built-in persona preset (security-audit, architecture, research, ops, code-review, plan-review, retrospective) |
| | Override agent count per vendor (e.g., 4 Claude, or 4+4 with --mixed). Subject to MAX_AGENTS=12 cap. |
| `--explorers` | Explorer sub-agents per judge (default: 0, max: 5). Max effective value depends on judge count. Total agents capped at 12. |
| | Override explorer model (default: sonnet) |
CLI Spawning Commands
CLI Spawning: use the Read tool on `skills/council/references/cli-spawning.md` for team setup, Claude/Codex agent spawning, parallel execution, debate R2 commands, cleanup, and model selection.
Examples
```bash
/council validate recent                                          # 2 judges, recent commits
/council --deep --preset=architecture research the auth system    # 3 judges with architecture personas
/council --mixed validate this plan                               # 3 Claude + 3 Codex
/council --deep --explorers=3 research upgrade patterns           # 12 agents (3 judges x 4)
/council --preset=security-audit --deep validate the API          # attacker, defender, compliance
/council brainstorm caching strategies for the API                # 2 judges explore options
/council research Redis vs Memcached for session storage          # 2 judges assess trade-offs
/council validate the implementation plan in PLAN.md              # structured plan feedback
```
Migration from /judge
The `/judge` skill is deprecated. Use `/council`.
Runtime-Native Architecture
Council uses runtime-native spawning as primary:
- Codex sessions: experimental sub-agents (`spawn_agent`, `wait`, `close_agent`, and sub-agent messaging)
- Claude sessions: native teams (`SendMessage`, `TeamDelete`, shared task state)
- Fallback: `Task(run_in_background=true)`
Deliberation Protocol
The `--debate` flag implements the deliberation protocol pattern:
Independent assessment → evidence exchange → position revision → convergence analysis
Runtime-native backends make this pattern first-class:
- R1: Judges spawn as sub-agents/teammates, assess independently, return verdicts to lead
- R2: the team lead sends the other judges' verdicts via sub-agent messaging (Codex) or `SendMessage` (Claude). Judges wake from idle with full R1 context.
- Consolidation: Team lead reads all output files, computes consensus
- Cleanup: `close_agent` (Codex) or `shutdown_request` + `TeamDelete` (Claude)
Communication Rules
- Judges → team lead only. Judges never message each other directly. This prevents anchoring.
- Team lead → judges. Only the team lead sends follow-ups (via backend messaging).
- No shared task mutation by judges. Team lead manages coordination state.
Ralph Wiggum Compliance
Council maintains fresh-context isolation (Ralph Wiggum pattern) with one documented exception: `--debate` reuses judge context across R1 and R2. This is intentional. Judges persist within a single atomic council invocation; they do NOT persist across separate council calls. The rationale:
- Judges benefit from their own R1 analytical context (reasoning chain, not just the verdict JSON) when evaluating other judges' positions in R2
- Re-spawning with only the verdict summary (~200 tokens) would lose the judge's working memory of WHY they reached their verdict
- The exception is bounded: max 2 rounds, within one invocation, with explicit cleanup (close_agent or shutdown_request + TeamDelete)
Without `--debate`, council is fully Ralph-compliant: each judge is a fresh spawn, executes once, writes output, and terminates.
Fallback
If the runtime-native backend is unavailable, fall back to `Task(run_in_background=true)` fire-and-forget. In fallback mode:
- `--debate` reverts to R2 re-spawning with truncated R1 verdicts
- The debate report must include `**Fidelity:** degraded (fallback — R1 verdicts truncated for R2 re-spawn)` in the header so users know results may be lower fidelity
- Non-debate mode works identically (judges write files, team lead reads them)
Judge Naming
Convention: `council-YYYYMMDD-<target>` (e.g., `council-20260206-auth-system`).
Judge names: use numbered judge names for independent judges, or the persona name when using presets/perspectives (e.g., `skeptic`). Use the same logical names across both Codex and Claude backends.
See Also
- Code validation skill — complexity scoring + council (uses the spec when one is found)
- `skills/pre-mortem/SKILL.md` — Plan validation (uses council, always 3 judges)
- `skills/post-mortem/SKILL.md` — Work wrap-up (uses council, always 3 judges + retro)
- Multi-agent orchestration skill
- `skills/standards/SKILL.md` — Language-specific coding standards
- Codebase exploration skill (complementary to council research mode)