cross-eval


/cs:cross-eval — Multi-Model Consensus

Command: /cs:cross-eval <memo-or-brief>
Runs the same memo through multiple model providers and reconciles divergences. Use for high-stakes, irreversible decisions where single-model bias is too costly: M&A, major fundraises, layoffs, strategic pivots, regulatory commitments.
Adapted from gstack's /codex cross-review pattern, generalized to business memos instead of code PRs.

When to Run

  • Before signing a term sheet
  • Before announcing a layoff
  • Before committing to a regulated market
  • Before any decision that would take more than 6 months of company time to reverse
  • When the boardroom vote was split or had a CRITICAL dissent

Models Used (graceful degradation)

The command tries to invoke each available model in order:
  1. Claude (primary, always available) — the boardroom's native voice
  2. Codex / OpenAI (if OPENAI_API_KEY is set or the codex CLI is available)
  3. Gemini (if GEMINI_API_KEY is set or the gemini CLI is available)
If only Claude is available, the command runs in Claude-only adversarial mode (same model, different system prompts; see Graceful Degradation) and clearly labels the output as single-model.
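
The probing order above can be sketched roughly as follows. This is a minimal illustration, not the command's actual implementation: the function names `detect_models` and `run_mode` and the mode labels are hypothetical, and the check simply treats a provider as available when its API key is set or its CLI is on the PATH.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def detect_models(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return available models in invocation order (hypothetical helper)."""
    models = ["claude"]  # primary, always available
    if env.get("OPENAI_API_KEY") or which("codex"):
        models.append("codex")
    if env.get("GEMINI_API_KEY") or which("gemini"):
        models.append("gemini")
    return models


def run_mode(models: List[str]) -> str:
    """Label the run so single-model output is never mistaken for consensus."""
    # The label strings are assumptions made for this sketch.
    return "multi-model" if len(models) > 1 else "single-model (adversarial)"
```

Passing `env` and `which` as parameters keeps the probe testable without touching the real environment.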

Workflow

  1. Read the memo / brief
  2. Probe environment for available model CLIs / API keys
  3. For each available model:
    • Send the memo with this prompt prefix:
      "You are an independent C-suite reviewer. The following is a board memo from another company's boardroom. Identify the top 3 concerns, the top 3 supports, and your vote (APPROVE / REJECT / DEFER). Do not deferentially agree — assume the memo's reasoning is flawed until proven otherwise."
  4. Collect three independent reviews
  5. Reconcile: where do they agree? Where do they diverge?
  6. Surface the divergences as questions for the founder
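
Steps 3 and 4 might look something like the sketch below. It assumes each provider is wrapped in a plain callable that takes the full prompt and returns the review text; `collect_reviews` and the `reviewers` mapping are hypothetical names, and the prompt prefix is passed in rather than hard-coded. A provider that fails is recorded as skipped so the run can degrade gracefully instead of aborting.

```python
from typing import Callable, Dict, Tuple


def collect_reviews(
    memo: str,
    prompt_prefix: str,
    reviewers: Dict[str, Callable[[str], str]],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Send the same prefixed memo to every reviewer (hypothetical helper).

    Returns (reviews, skipped): review text per model, and the error
    message for each model that could not be reached.
    """
    prompt = f"{prompt_prefix}\n\n{memo}"
    reviews: Dict[str, str] = {}
    skipped: Dict[str, str] = {}
    for name, ask in reviewers.items():
        try:
            reviews[name] = ask(prompt)
        except Exception as exc:  # any provider failure becomes a noted fallback
            skipped[name] = str(exc)
    return reviews, skipped
```

Every model receives an identical prompt, so differences between reviews come from the models rather than from the framing.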

Output Format

Saved to ~/.claude/cross-eval/YYYY-MM-DD-<slug>.md as a markdown file with the layout below.
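
The filename pattern could be built along these lines. How the slug is derived is not specified here, so `slugify` is an assumption (lowercased memo title with runs of non-alphanumeric characters collapsed to hyphens), and both function names are hypothetical.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional


def slugify(title: str) -> str:
    """Assumed slug rule: lowercase, non-alphanumerics collapsed to hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "memo"  # fallback name is an assumption


def report_path(title: str, on: date, root: Optional[Path] = None) -> Path:
    """Build <root>/YYYY-MM-DD-<slug>.md, defaulting to ~/.claude/cross-eval."""
    if root is None:
        root = Path.home() / ".claude" / "cross-eval"
    return root / f"{on.isoformat()}-{slugify(title)}.md"
```

The report body then follows the template below.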

Cross-Eval: <memo title>

Date: YYYY-MM-DD
Memo reviewed: <link>
Models invoked: Claude / Codex / Gemini (or noted fallbacks)

Vote Tally

| Model | Vote | Confidence |
|-------|------|------------|
| Claude | APPROVE | High |
| Codex | DEFER | Med |
| Gemini | APPROVE | Low |

Consensus Concerns (≥2 models flagged)

  1. <concern> — flagged by Claude + Codex
  2. <concern> — flagged by all 3

Divergent Concerns (1 model flagged)

  • <Codex only:> <concern> — worth a second look
  • <Gemini only:> <concern> — likely noise, but check

Consensus Supports (≥2 models endorsed)

  1. <support>
  2. <support>

Recommendation

  • 🟢 GO if 2+ models APPROVE and no CRITICAL concerns from any model
  • 🟡 PAUSE if any model is DEFER or any concern is CRITICAL
  • 🔴 STOP if 2+ models REJECT

Open Questions for Founder

  1. <question raised by divergence>
  2. <question raised by divergence>
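
The consensus thresholds and the Recommendation rules in the template above can be sketched as follows. Two points are assumptions rather than documented behavior. First, concerns are matched by exact string, whereas real reviews would need some form of semantic grouping. Second, the GO and PAUSE rules overlap (two APPROVE votes plus one DEFER satisfies both), so the sketch assumes the most conservative matching rule wins: STOP, then PAUSE, then GO. The `Review` type and function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Review:
    model: str
    vote: str  # "APPROVE", "REJECT", or "DEFER"
    concerns: List[str] = field(default_factory=list)
    critical: bool = False  # True if this model rated any concern CRITICAL


def split_concerns(
    reviews: List[Review],
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Group concerns into consensus (>=2 models) and divergent (1 model)."""
    flagged_by: Dict[str, List[str]] = defaultdict(list)
    for review in reviews:
        for concern in set(review.concerns):  # count each model once per concern
            flagged_by[concern].append(review.model)
    consensus = {c: m for c, m in flagged_by.items() if len(m) >= 2}
    divergent = {c: m for c, m in flagged_by.items() if len(m) == 1}
    return consensus, divergent


def recommend(reviews: List[Review]) -> str:
    """Apply the rules with assumed precedence STOP > PAUSE > GO."""
    votes = [r.vote for r in reviews]
    if votes.count("REJECT") >= 2:
        return "STOP"
    if "DEFER" in votes or any(r.critical for r in reviews):
        return "PAUSE"
    if votes.count("APPROVE") >= 2:
        return "GO"
    return "PAUSE"  # assumption: anything not covered by the rules pauses
```

Under this precedence, the sample tally in the template (APPROVE, DEFER, APPROVE) yields PAUSE rather than GO.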

Why This Matters

Single-model recommendations have systematic biases. Claude tends toward helpfulness and may underweight risk. Codex (OpenAI) tends to be more cautious on emerging-market and regulatory topics. Gemini tends to be more cautious on technical scale claims. Disagreement is signal, not noise.
This is the safety net before irreversibility, not a replacement for outside counsel or a real board.

Graceful Degradation

If only Claude is available, the output is labeled as follows:
**Models available:** Claude only
**Mode:** ADVERSARIAL — running 3 independent Claude passes with different system prompts:
  1. Standard reviewer
  2. Devil's advocate (must find 3 critical concerns)
  3. Steelman (must find 3 strongest reasons to approve)

This is weaker than true multi-model. Treat the result as suggestive, not conclusive.
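
A rough sketch of the three adversarial passes is shown below. The system prompt wording is illustrative only (the actual prompts are not given here), and `ask` stands in for whatever function sends a system prompt plus memo to Claude; both `ADVERSARIAL_ROLES` and `adversarial_passes` are hypothetical names.

```python
from typing import Callable, Dict

# Illustrative system prompts for the three roles; the wording is an assumption.
ADVERSARIAL_ROLES: Dict[str, str] = {
    "standard": "Review the memo as an independent reviewer.",
    "devils_advocate": "Act as devil's advocate. You must find 3 critical concerns.",
    "steelman": "Steelman the memo. You must find the 3 strongest reasons to approve.",
}


def adversarial_passes(
    memo: str,
    ask: Callable[[str, str], str],
) -> Dict[str, str]:
    """Run one independent pass per role against the same model."""
    return {role: ask(system, memo) for role, system in ADVERSARIAL_ROLES.items()}
```

Each pass is a separate call with no shared history, which is what makes the passes independent even though the model is the same.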

Routing

  • /cs:decide if consensus is GO
  • /cs:freeze if consensus is PAUSE
  • /cs:boardroom (re-run) if consensus is STOP
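
The routing above amounts to a small lookup table. A minimal sketch, with the hypothetical names `NEXT_COMMAND` and `next_command`:

```python
# Maps each recommendation to the follow-up command listed above.
NEXT_COMMAND = {
    "GO": "/cs:decide",
    "PAUSE": "/cs:freeze",
    "STOP": "/cs:boardroom",  # re-run the boardroom
}


def next_command(recommendation: str) -> str:
    """Return the follow-up command; raises KeyError on an unknown outcome."""
    return NEXT_COMMAND[recommendation.strip().upper()]
```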

Related

  • Skills: board-meeting, executive-mentor
  • Inspiration: gstack's /codex cross-review pattern (adapted to business memos)

Version: 1.0.0