cross-eval
/cs:cross-eval: Multi-Model Consensus
Command: /cs:cross-eval <memo-or-brief>
Runs the same memo through multiple model providers and reconciles divergences. Use for high-stakes, irreversible decisions where single-model bias is too costly: M&A, major fundraises, layoffs, strategic pivots, regulatory commitments.
Adapted from gstack's /codex cross-review pattern, generalized to business memos instead of code PRs.
When to Run
- Before signing a term sheet
- Before announcing a layoff
- Before committing to a regulated market
- Before any decision where reversing costs > 6 months of company time
- When the boardroom vote was split or had a CRITICAL dissent
Models Used (graceful degradation)
The command tries to invoke each available model in order:
- Claude (primary, always available): the boardroom's native voice
- Codex / OpenAI (if OPENAI_API_KEY or the codex CLI is available)
- Gemini (if GEMINI_API_KEY or the gemini CLI is available)
If only Claude is available, the command runs Claude-only in adversarial mode (same model, different system prompts) and clearly labels the output as single-model.
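The availability check described above could be sketched roughly as follows. This is a minimal illustration, not the command's actual implementation: the `PROVIDERS` table, `available_models`, and `mode_label` are hypothetical names, and the sketch assumes a provider counts as available when either its API key is set or its CLI is found on the PATH.

```python
import os
import shutil

# Hypothetical provider table: (display name, API key variable, CLI name).
# The variable and CLI names follow the list above; the structure is assumed.
PROVIDERS = [
    ("Codex", "OPENAI_API_KEY", "codex"),
    ("Gemini", "GEMINI_API_KEY", "gemini"),
]


def available_models(env=None, which=shutil.which):
    """Return model names in invocation order; Claude is always first."""
    env = os.environ if env is None else env
    models = ["Claude"]
    for name, key, cli in PROVIDERS:
        # Available if the API key is set OR the CLI is on the PATH.
        if env.get(key) or which(cli):
            models.append(name)
    return models


def mode_label(models):
    """Label the run so single-model output is never mistaken for consensus."""
    return "multi-model" if len(models) > 1 else "single-model (adversarial)"
```

Passing `env` and `which` as parameters keeps the probe easy to exercise without touching the real environment.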
Workflow
- Read the memo / brief
- Probe environment for available model CLIs / API keys
- For each available model:
  - Send the memo with this prompt prefix:
    "You are an independent C-suite reviewer. The following is a board memo from another company's boardroom. Identify the top 3 concerns, the top 3 supports, and your vote (APPROVE / REJECT / DEFER). Do not deferentially agree; assume the memo's reasoning is flawed until proven otherwise."
- Collect three independent reviews
- Reconcile: where do they agree? Where do they diverge?
- Surface the divergences as questions for the founder
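The reconcile step above might look something like the sketch below. It assumes each review has already been parsed into a dictionary with a `concerns` list, and it matches concerns by normalized exact text. Both are simplifying assumptions: real model output would likely need fuzzier matching, and the `reconcile` function and its data shape are hypothetical.

```python
from collections import defaultdict


def reconcile(reviews):
    """Split concerns into consensus (>=2 models) and divergent (1 model).

    reviews: {model_name: {"concerns": [str, ...]}}  (assumed shape)
    Returns (consensus, divergent) where consensus maps each concern to
    the list of models that flagged it, and divergent maps each concern
    to the single model that flagged it.
    """
    flagged = defaultdict(list)
    for model, review in reviews.items():
        for concern in review["concerns"]:
            key = concern.strip().lower()  # naive normalization (assumption)
            if model not in flagged[key]:  # count each model once per concern
                flagged[key].append(model)
    consensus = {c: m for c, m in flagged.items() if len(m) >= 2}
    divergent = {c: m[0] for c, m in flagged.items() if len(m) == 1}
    return consensus, divergent
```

The divergent entries are the raw material for the founder questions: each one is a point where only one reviewer saw a problem.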
Output Format
Saved to ~/.claude/cross-eval/YYYY-MM-DD-<slug>.md:
Cross-Eval: <memo title>
Date: YYYY-MM-DD
Memo reviewed: <link>
Models invoked: Claude / Codex / Gemini (or noted fallbacks)
Vote Tally
| Model | Vote | Confidence |
|---|---|---|
| Claude | APPROVE | High |
| Codex | DEFER | Med |
| Gemini | APPROVE | Low |
Consensus Concerns (≥2 models flagged)
- <concern> — flagged by Claude + Codex
- <concern> — flagged by all 3
Divergent Concerns (1 model flagged)
- <Codex only:> <concern> — worth a second look
- <Gemini only:> <concern> — likely noise, but check
Consensus Supports (≥2 models endorsed)
- <support>
- <support>
Recommendation
- 🟢 GO if 2+ models APPROVE and no CRITICAL concerns from any model
- 🟡 PAUSE if any model is DEFER or any concern is CRITICAL
- 🔴 STOP if 2+ models REJECT
Open Questions for Founder
- <question raised by divergence>
- <question raised by divergence>
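The Recommendation rules in the template can overlap: the sample tally (APPROVE, DEFER, APPROVE) satisfies both GO and PAUSE. The source does not state a precedence, so the sketch below assumes the most conservative ordering (STOP, then PAUSE, then GO) and falls back to PAUSE when no rule matches. The `recommend` function and `ROUTES` table are hypothetical names; the route targets are the commands listed in the Routing section.

```python
# Follow-up command for each outcome, per the Routing section.
ROUTES = {
    "GO": "/cs:decide",
    "PAUSE": "/cs:freeze",
    "STOP": "/cs:boardroom",  # re-run the boardroom
}


def recommend(votes, has_critical):
    """Map votes to GO / PAUSE / STOP.

    votes: list of "APPROVE" / "REJECT" / "DEFER" strings, one per model.
    has_critical: True if any model raised a CRITICAL concern.
    Precedence (STOP > PAUSE > GO) is an assumption, not from the source.
    """
    if votes.count("REJECT") >= 2:
        return "STOP"
    if "DEFER" in votes or has_critical:
        return "PAUSE"
    if votes.count("APPROVE") >= 2:
        return "GO"
    return "PAUSE"  # assumed default when no rule applies


outcome = recommend(["APPROVE", "DEFER", "APPROVE"], has_critical=False)
print(outcome, ROUTES[outcome])
```

Under this ordering the sample tally resolves to PAUSE, which matches the intent of treating any DEFER as a reason to slow down.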
Why This Matters
Single-model recommendations have systematic biases. Claude trends helpful and may under-weight risk. Codex (OpenAI) trends more cautious on emerging-market and regulatory topics. Gemini trends more cautious on technical scale claims. Disagreement is signal, not noise.
This is the safety net before irreversibility — not a replacement for outside counsel or a real board.
Graceful Degradation
If only Claude is available:
```markdown
**Models available:** Claude only
**Mode:** ADVERSARIAL (3 independent Claude passes with different system prompts):
1. Standard reviewer
2. Devil's advocate (must find 3 critical concerns)
3. Steelman (must find 3 strongest reasons to approve)
This is weaker than true multi-model. Treat the result as suggestive, not conclusive.
```
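As a rough illustration of the Claude-only fallback, the three passes could be driven from a small table of roles and system prompts. The prompt wording, the `ADVERSARIAL_PASSES` table, and the `call_model` callback are all hypothetical; the source specifies only the three roles and their constraints.

```python
# Role name -> system prompt. The wording is illustrative, not from the source.
ADVERSARIAL_PASSES = [
    ("Standard reviewer", "Review the memo on its merits."),
    ("Devil's advocate", "You must find 3 critical concerns."),
    ("Steelman", "You must find the 3 strongest reasons to approve."),
]


def run_adversarial(memo, call_model):
    """Run one pass per role and return {role: review_text}.

    call_model(system_prompt, memo) -> str is supplied by the caller;
    it stands in for however the model is actually invoked.
    """
    return {role: call_model(system, memo) for role, system in ADVERSARIAL_PASSES}
```

Each pass should be a fresh call with no shared conversation history, otherwise the later roles can anchor on the earlier output.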
Routing
- /cs:decide: if consensus is GO
- /cs:freeze: if consensus is PAUSE
- /cs:boardroom (re-run): if consensus is STOP
Related
- Skills: board-meeting, executive-mentor
- Inspiration: gstack's /codex cross-review pattern (adapted to business memos)
Version: 1.0.0