idea-discovery


Workflow 1: Idea Discovery Pipeline


Orchestrate a complete idea discovery workflow for: $ARGUMENTS

Overview


This skill chains sub-skills into a single automated pipeline:
/research-lit → /idea-creator → /novelty-check → /research-review → /research-refine-pipeline
  (survey)      (brainstorm)    (verify novel)    (critical feedback)  (refine method + plan experiments)
Each phase builds on the previous one's output. The final deliverables are a validated IDEA_REPORT.md with ranked ideas, plus a refined proposal (refine-logs/FINAL_PROPOSAL.md) and an experiment plan (refine-logs/EXPERIMENT_PLAN.md) for the top idea.

Constants


  • PILOT_MAX_HOURS = 2 — Skip any pilot experiment estimated to take > 2 hours per GPU. Flag as "needs manual pilot" in the report.
  • PILOT_TIMEOUT_HOURS = 3 — Hard timeout: kill any running pilot that exceeds 3 hours. Collect partial results if available.
  • MAX_PILOT_IDEAS = 3 — Run pilots for at most 3 top ideas in parallel. Additional ideas are validated on paper only.
  • MAX_TOTAL_GPU_HOURS = 8 — Total GPU budget across all pilots. If exceeded, skip remaining pilots and note in report.
  • AUTO_PROCEED = true — If the user doesn't respond at a checkpoint, automatically proceed with the best option after presenting results. Set to false to always wait for explicit user confirmation.
  • REVIEWER_MODEL = gpt-5.4 — Model used via Codex MCP. Must be an OpenAI model (e.g., gpt-5.4, o3, gpt-4o). Passed to sub-skills.
  • ARXIV_DOWNLOAD = false — When true, /research-lit downloads the top relevant arXiv PDFs during Phase 1. When false (default), it only fetches metadata. Passed through to /research-lit.
💡 These are defaults. Override by telling the skill, e.g., /idea-discovery "topic" — pilot budget: 4h per idea, 20h total or /idea-discovery "topic" — arxiv download: true.
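The pilot budget rules above can be sketched as a small Bash gate. This is a minimal sketch, not part of the skill itself; the idea names and GPU-hour estimates are hypothetical, and the constants mirror the defaults listed above:

```shell
# Pilot budget gate: enforce PILOT_MAX_HOURS, MAX_PILOT_IDEAS, MAX_TOTAL_GPU_HOURS.
PILOT_MAX_HOURS=2
MAX_PILOT_IDEAS=3
MAX_TOTAL_GPU_HOURS=8

selected=()
budget=0
# Hypothetical ranked ideas as name:estimated-GPU-hours pairs (best first).
for entry in "ideaA:1" "ideaB:3" "ideaC:2" "ideaD:4" "ideaE:2" "ideaF:1"; do
  name=${entry%%:*}
  est=${entry##*:}
  if (( ${#selected[@]} >= MAX_PILOT_IDEAS )); then
    echo "$name: paper-only validation (MAX_PILOT_IDEAS reached)"
  elif (( est > PILOT_MAX_HOURS )); then
    echo "$name: needs manual pilot (est ${est}h > ${PILOT_MAX_HOURS}h per GPU)"
  elif (( budget + est > MAX_TOTAL_GPU_HOURS )); then
    echo "$name: skipped (total ${MAX_TOTAL_GPU_HOURS}h GPU budget exhausted)"
  else
    selected+=("$name")
    (( budget += est ))
    echo "$name: run pilot (budget used: ${budget}h)"
  fi
done
```

With these sample estimates, ideaA, ideaC, and ideaE get pilots (5h of the 8h budget), ideaB and ideaD are flagged for manual pilots, and ideaF falls back to paper-only validation.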
Pipeline


Phase 1: Literature Survey


Invoke /research-lit to map the research landscape:
/research-lit "$ARGUMENTS"
What this does:
  • Search arXiv, Google Scholar, Semantic Scholar for recent papers
  • Build a landscape map: sub-directions, approaches, open problems
  • Identify structural gaps and recurring limitations
  • Output a literature summary (saved to working notes)
🚦 Checkpoint: Present the landscape summary to the user. Ask:
📚 Literature survey complete. Here's what I found:
- [key findings, gaps, open problems]

Does this match your understanding? Should I adjust the scope before generating ideas?
(If no response, I'll proceed with the top-ranked direction.)
  • User approves (or no response + AUTO_PROCEED=true) → proceed to Phase 2 with best direction.
  • User requests changes (e.g., "focus more on X", "ignore Y", "too broad") → refine the search with updated queries, re-run /research-lit with adjusted scope, and present again. Repeat until the user is satisfied.
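The checkpoint-and-proceed behavior above can be sketched as a small Bash function. This is an illustrative sketch only; the 30-second wait and the message strings are assumptions, not prescribed by the skill:

```shell
# Checkpoint sketch: wait briefly for a reply; if none arrives and
# AUTO_PROCEED is true, continue with the top-ranked direction.
checkpoint() {
  local auto_proceed=$1 reply
  if read -t 30 -r reply && [ -n "$reply" ]; then
    echo "user feedback: $reply"
  elif [ "$auto_proceed" = true ]; then
    echo "no response; proceeding with the top-ranked direction"
  else
    echo "waiting for explicit confirmation"
  fi
}

# Example: the user replies with a scope adjustment.
checkpoint true <<< "focus more on X"   # prints: user feedback: focus more on X
```

With no input on stdin, the same call falls through to the AUTO_PROCEED branch, matching the "(If no response, I'll proceed...)" behavior described above.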

Phase 2: Idea Generation + Filtering + Pilots


Invoke /idea-creator with the landscape context:
/idea-creator "$ARGUMENTS"
What this does:
  • Brainstorm 8-12 concrete ideas via GPT-5.4 xhigh
  • Filter by feasibility, compute cost, quick novelty search
  • Deep validate top ideas (full novelty check + devil's advocate)
  • Run parallel pilot experiments on available GPUs (top 2-3 ideas)
  • Rank by empirical signal
  • Output IDEA_REPORT.md
🚦 Checkpoint: Present the ranked ideas from IDEA_REPORT.md to the user. Ask:
💡 Generated X ideas, filtered to Y, piloted Z. Top results:

1. [Idea 1] — Pilot: POSITIVE (+X%)
2. [Idea 2] — Pilot: WEAK POSITIVE (+Y%)
3. [Idea 3] — Pilot: NEGATIVE, eliminated

Which ideas should I validate further? Or should I regenerate with different constraints?
(If no response, I'll proceed with the top-ranked ideas.)
  • User picks ideas (or no response + AUTO_PROCEED=true) → proceed to Phase 3 with top-ranked ideas.
  • User unhappy with all ideas → collect feedback ("what's missing?", "what direction do you prefer?"), update the prompt with user's constraints, and re-run Phase 2 (idea generation). Repeat until the user selects at least 1 idea.
  • User wants to adjust scope → go back to Phase 1 with refined direction.
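The hard pilot timeout from the constants can be enforced with coreutils `timeout`. The sketch below scales the limit down to one second and uses a stand-in pilot script so the kill-and-collect-partials behavior is visible; in the real pipeline the limit would be "${PILOT_TIMEOUT_HOURS}h":

```shell
# Stand-in pilot: emits a partial result, then runs "too long".
cat << 'EOF' > pilot_demo.sh
echo "partial result: step 1 done"
sleep 5   # stands in for a long-running pilot
echo "final result"
EOF

# Kill the pilot when it exceeds the (scaled-down) timeout; exit code 124
# means "killed by timeout". Partial output survives in the log.
timeout 1 bash pilot_demo.sh > pilot_demo.log 2>&1
if [ "$?" -eq 124 ]; then
  echo "pilot timed out; collecting partial results:"
  cat pilot_demo.log
fi
```

The partial line written before the kill is exactly the "collect partial results if available" case described for PILOT_TIMEOUT_HOURS.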

Phase 3: Deep Novelty Verification


For each top idea (positive pilot signal), run a thorough novelty check:
/novelty-check "[top idea 1 description]"
/novelty-check "[top idea 2 description]"
What this does:
  • Multi-source literature search (arXiv, Scholar, Semantic Scholar)
  • Cross-verify with GPT-5.4 xhigh
  • Check for concurrent work (last 3-6 months)
  • Identify closest existing work and differentiation points
Update IDEA_REPORT.md with deep novelty results. Eliminate any idea that turns out to be already published.

Phase 4: External Critical Review


For the surviving top idea(s), get brutal feedback:
/research-review "[top idea with hypothesis + pilot results]"
What this does:
  • GPT-5.4 xhigh acts as a senior reviewer (NeurIPS/ICML level)
  • Scores the idea, identifies weaknesses, suggests minimum viable improvements
  • Provides concrete feedback on experimental design
Update IDEA_REPORT.md with reviewer feedback and the revised plan.

Phase 4.5: Method Refinement + Experiment Planning


After review, refine the top idea into a concrete proposal and plan experiments:
/research-refine-pipeline "[top idea description + pilot results + reviewer feedback]"
What this does:
  • Freeze a Problem Anchor to prevent scope drift
  • Iteratively refine the method via GPT-5.4 review (up to 5 rounds, until score ≥ 9)
  • Generate a claim-driven experiment roadmap with ablations, budgets, and run order
  • Output: refine-logs/FINAL_PROPOSAL.md, refine-logs/EXPERIMENT_PLAN.md, refine-logs/EXPERIMENT_TRACKER.md
🚦 Checkpoint: Present the refined proposal summary:
🔬 Method refined and experiment plan ready:
- Problem anchor: [anchored problem]
- Method thesis: [one sentence]
- Dominant contribution: [what's new]
- Must-run experiments: [N blocks]
- First 3 runs to launch: [list]

Proceed to implementation? Or adjust the proposal?
  • User approves (or AUTO_PROCEED=true) → proceed to Final Report.
  • User requests changes → pass feedback to /research-refine for another round.
  • Lite mode: If the reviewer score is < 6 or the pilot was weak, run /research-refine only (skip /experiment-plan) and note remaining risks in the report.

Phase 5: Final Report


Finalize IDEA_REPORT.md with all accumulated information:

Idea Discovery Report


Direction: $ARGUMENTS
Date: [today]
Pipeline: research-lit → idea-creator → novelty-check → research-review → research-refine-pipeline

Executive Summary


[2-3 sentences: best idea, key evidence, recommended next step]

Literature Landscape


[from Phase 1]

Ranked Ideas


[from Phase 2, updated with Phase 3-4 results]

🏆 Idea 1: [title] — RECOMMENDED


  • Pilot: POSITIVE (+X%)
  • Novelty: CONFIRMED (closest: [paper], differentiation: [what's different])
  • Reviewer score: X/10
  • Next step: implement full experiment → /auto-review-loop

Idea 2: [title] — BACKUP


...

Eliminated Ideas


[ideas killed at each phase, with reasons]

Refined Proposal


  • Proposal: refine-logs/FINAL_PROPOSAL.md
  • Experiment plan: refine-logs/EXPERIMENT_PLAN.md
  • Tracker: refine-logs/EXPERIMENT_TRACKER.md

Next Steps


  • /run-experiment to deploy experiments from the plan
  • /auto-review-loop to iterate until submission-ready
  • Or invoke /research-pipeline for the complete end-to-end flow

Key Rules


  • Large file handling: If the Write tool fails due to file size, immediately retry using Bash (cat << 'EOF' > file) to write in chunks. Do NOT ask the user for permission — just do it silently.
  • Don't skip phases. Each phase filters and validates — skipping leads to wasted effort later.
  • Checkpoint between phases. Briefly summarize what was found before moving on.
  • Kill ideas early. It's better to kill 10 bad ideas in Phase 3 than to implement one and fail.
  • Empirical signal > theoretical appeal. An idea with a positive pilot outranks a "sounds great" idea without evidence.
  • Document everything. Dead ends are just as valuable as successes for future reference.
  • Be honest with the reviewer. Include negative results and failed pilots in the review prompt.
  • Feishu notifications are optional. If ~/.claude/feishu.json exists, send checkpoint at each phase transition and pipeline_done at the final report. If absent/off, skip silently.
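The large-file fallback in the first rule can be sketched as chunked heredoc writes: the first heredoc creates the file, later ones append. The report content here is placeholder text, not the real template:

```shell
# Chunked write sketch: `>` creates the file, `>>` appends further chunks.
cat << 'EOF' > IDEA_REPORT.md
# Idea Discovery Report
Direction: example topic
EOF

cat << 'EOF' >> IDEA_REPORT.md

## Executive Summary
Best idea, key evidence, recommended next step.
EOF

wc -l < IDEA_REPORT.md   # prints: 5
```

Quoting the delimiter ('EOF') keeps the chunk literal, so $ARGUMENTS-style placeholders in the report body are written as-is rather than expanded by the shell.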

Composing with Workflow 2


After this pipeline produces a validated top idea:
/idea-discovery "direction"         ← you are here (Workflow 1, includes method refinement + experiment planning)
/run-experiment                     ← deploy experiments from the plan
/auto-review-loop "top idea"        ← Workflow 2: iterate until submission-ready

Or use /research-pipeline for the full end-to-end flow.