idea-discovery-robot
Robotics Idea Discovery Pipeline
Orchestrate a robotics-specific idea discovery workflow for: $ARGUMENTS
Overview
This skill chains four sub-skills into a single automated pipeline:
/research-lit → /idea-creator (robotics framing) → /novelty-check → /research-review
(survey) (filter + pilot plan) (verify novel) (critical feedback)
But every phase must be grounded in robotics-specific constraints:
- Embodiment: arm, mobile manipulator, drone, humanoid, quadruped, autonomous car, etc.
- Task family: grasping, insertion, locomotion, navigation, manipulation, rearrangement, multi-step planning
- Observation + action interface: RGB/RGB-D/tactile/language; torque/velocity/waypoints/end-effector actions
- Simulator / benchmark availability: simulation-first by default
- Real robot constraints: hardware availability, reset cost, safety, operator time
- Evaluation quality: success rate plus failure cases, safety violations, intervention count, latency, sample efficiency
- Sim2real story: whether the idea can stay in sim, needs offline logs, or truly requires hardware
The goal is not to produce flashy demos. The goal is to produce ideas that are:
- benchmarkable
- falsifiable
- feasible with available robotics infrastructure
- interesting even if the answer is negative
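The evaluation bar above (success rate plus failure accounting) can be made concrete. A minimal sketch, assuming episode logs are plain dicts — the field names (`success`, `safety_violations`, `interventions`, `latency_s`) are illustrative assumptions, not a standard log schema:

```python
# Sketch: aggregate robotics eval metrics beyond raw success rate.
# Field names are assumed, not a standard; adapt to your harness.

def summarize_episodes(episodes: list[dict]) -> dict:
    """Summarize episode logs into the metrics this pipeline requires."""
    n = len(episodes)
    if n == 0:
        return {}
    return {
        "success_rate": sum(e["success"] for e in episodes) / n,
        "safety_violation_rate": sum(e["safety_violations"] > 0 for e in episodes) / n,
        "interventions_per_episode": sum(e["interventions"] for e in episodes) / n,
        "mean_latency_s": sum(e["latency_s"] for e in episodes) / n,
        # keep failures explicit so they can be inspected, not just counted
        "failure_episodes": [i for i, e in enumerate(episodes) if not e["success"]],
    }
```

Reporting `failure_episodes` alongside the rates is what makes a negative result inspectable rather than just a low number.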
Constants
- MAX_PILOT_IDEAS = 3 — Validate at most 3 top ideas deeply
- PILOT_MODE = sim-first — Prefer simulation or offline-log pilots before any hardware execution
- REAL_ROBOT_PILOTS = explicit approval only — Never assume physical robot access or approval
- AUTO_PROCEED = true — If user does not respond at checkpoints, proceed with the best sim-first option
- REVIEWER_MODEL = gpt-5.4 — External reviewer model via Codex MCP
- TARGET_VENUES = CoRL, RSS, ICRA, IROS, RA-L — Default novelty and reviewer framing
Override inline, e.g. /idea-discovery-robot "bimanual manipulation" — only sim ideas, no real robot, or /idea-discovery-robot "drone navigation" — focus on CoRL/RSS, 2 pilot ideas max
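The constants above can be read as plain configuration plus one checkpoint rule. A minimal sketch — the `checkpoint_decision` helper is an illustrative assumption, not part of the skill's API:

```python
# Sketch: pipeline constants as config, mirroring the defaults above.
CONFIG = {
    "MAX_PILOT_IDEAS": 3,
    "PILOT_MODE": "sim-first",
    "REAL_ROBOT_PILOTS": "explicit approval only",
    "AUTO_PROCEED": True,
    "REVIEWER_MODEL": "gpt-5.4",
    "TARGET_VENUES": ["CoRL", "RSS", "ICRA", "IROS", "RA-L"],
}

def checkpoint_decision(user_reply=None):
    """What happens at a checkpoint: follow the user if they answered,
    otherwise auto-proceed with the sim-first option (or wait)."""
    if user_reply is not None:
        return "follow-user"
    return "proceed-sim-first" if CONFIG["AUTO_PROCEED"] else "wait"
```

The point of the helper is that silence never stalls the pipeline: with AUTO_PROCEED on, a missing reply resolves to the sim-first default.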
Execution Rule
Follow the phases in order. Do not stop after a checkpoint unless:
- the user explicitly says to stop, or
- the user asks to change scope and re-run an earlier phase
If AUTO_PROCEED=true and the user does not respond, continue immediately to the next phase using the strongest sim-first, benchmark-grounded option.
Phase 0: Frame the Robotics Problem
Before generating ideas, extract or infer this Robotics Problem Frame from $ARGUMENTS and local project context:
- Embodiment
- Task family
- Environment type: tabletop, warehouse, home, outdoor, aerial, driving, legged terrain
- Observation modalities
- Action interface / controller abstraction
- Learning regime: RL, imitation, behavior cloning, world model, planning, VLA/VLM, classical robotics, hybrid
- Available assets: simulator, benchmark suite, teleop data, offline logs, existing codebase, real hardware
- Compute budget
- Safety constraints
- Desired contribution type: method, benchmark, diagnosis, systems, sim2real, data curation
If some fields are missing, make explicit assumptions and default to:
- simulation-first
- public benchmark preferred
- no real robot execution
Write this frame into working notes before moving on. Every later decision should reference it.
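The frame above is a fixed schema with sim-first defaults, so it can be sketched as a small dataclass. The class name and field names are illustrative assumptions that mirror the list above, not a defined interface:

```python
# Sketch: the Robotics Problem Frame with sim-first defaults applied
# when fields are missing. Names are assumptions mirroring the list above.
from dataclasses import dataclass, field

@dataclass
class RoboticsProblemFrame:
    embodiment: str = "unspecified"
    task_family: str = "unspecified"
    environment: str = "unspecified"
    observations: list = field(default_factory=lambda: ["RGB"])
    action_interface: str = "end-effector delta pose"
    learning_regime: str = "unspecified"
    assets: list = field(default_factory=list)
    compute_budget: str = "unspecified"
    safety_constraints: str = "unspecified"
    contribution_type: str = "method"
    # defaults for missing execution fields, per the rules above
    simulation_first: bool = True
    prefer_public_benchmark: bool = True
    real_robot_execution: bool = False
```

Writing the frame down as a single object makes "every later decision should reference it" checkable: each phase can read the same fields instead of re-inferring them.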
Phase 1: Robotics Literature Survey
Invoke:
/research-lit "$ARGUMENTS — focus venues: CoRL, RSS, ICRA, IROS, RA-L, TRO, Science Robotics"
Then reorganize the findings using a robotics lens instead of a generic ML lens.
Build a Robotics Landscape Matrix
For each relevant paper, classify:
| Axis | Examples |
|---|---|
| Embodiment | single-arm, mobile manipulator, humanoid, drone, quadruped |
| Task | pick-place, insertion, navigation, locomotion, long-horizon rearrangement |
| Learning setup | RL, BC, IL, offline RL, world model, planning, diffusion policy |
| Observation | RGB, RGB-D, proprioception, tactile, language |
| Action abstraction | torque, joint velocity, end-effector delta pose, waypoint planner |
| Eval regime | pure sim, sim+real, real-only, offline benchmark |
| Benchmark | ManiSkill, RLBench, Isaac Lab, Habitat, Meta-World, CALVIN, LIBERO, custom |
| Metrics | success rate, collision rate, intervention count, path length, latency, energy |
| Main bottleneck | sample inefficiency, brittleness, reset cost, perception drift, sim2real gap |
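Once papers are classified along these axes, the matrix is just a grouping problem: counting how often each bottleneck recurs surfaces under-addressed cells. A minimal sketch — the paper entries and dict keys are invented examples, not survey data:

```python
# Sketch: group surveyed papers along the matrix axes to surface
# recurring bottlenecks. Entries below are invented placeholders.
from collections import Counter

def bottleneck_counts(papers):
    """Count how often each main bottleneck recurs across the survey."""
    return Counter(p["bottleneck"] for p in papers)

papers = [
    {"embodiment": "single-arm", "benchmark": "RLBench", "bottleneck": "sim2real gap"},
    {"embodiment": "quadruped", "benchmark": "Isaac Lab", "bottleneck": "brittleness"},
    {"embodiment": "single-arm", "benchmark": "ManiSkill", "bottleneck": "sim2real gap"},
]
```

A bottleneck that recurs across embodiments and benchmarks but is never fixed is exactly the kind of gap the checkpoint below should report.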
Search Priorities
When refining the survey, prioritize:
- recent work from CoRL, RSS, ICRA, IROS, RA-L
- recent arXiv papers from the last 6-12 months
- benchmark papers and follow-up reproductions
- negative-result or diagnosis papers if they reveal system bottlenecks
What to Look For
Do not stop at "who got the best success rate." Explicitly identify:
- recurring failure modes papers do not fix
- benchmarks that are saturated or misleading
- places where embodiment changes invalidate prior conclusions
- methods that only work with privileged observations
- ideas whose reported gains come from reset engineering, reward shaping, or hidden infrastructure
- task families where evaluation quality is weak even if performance numbers look high
Checkpoint: Present the landscape to the user in robotics terms:
🤖 Robotics survey complete. I grouped the field by embodiment, benchmark, action interface, and sim2real setup.
Main gaps:
1. [...]
2. [...]
3. [...]
Should I generate ideas under this framing, or should I narrow to a specific robot / benchmark / modality?
- User approves (or no response + AUTO_PROCEED=true) → proceed to Phase 2 with the best robotics frame.
- User requests changes (e.g. narrower embodiment, different benchmark family, no sim2real, no hardware) → refine the robotics frame, re-run Phase 1, and present again.
Phase 2: Robotics-Specific Idea Generation and Filtering
Generate ideas only after the robotics frame is explicit.
Invoke the existing idea generator, but pass the Robotics Problem Frame and landscape matrix into the prompt so it does not produce generic ML ideas:
/idea-creator "$ARGUMENTS — robotics frame: [paste Robotics Problem Frame] — focus venues: CoRL, RSS, ICRA, IROS, RA-L — benchmark-specific ideas only — sim-first pilots — no real-robot execution without explicit approval — require failure metrics and baseline clarity"
Then rewrite and filter the output using the robotics-specific rules below.
Each candidate idea must include:
- One-sentence summary
- Target embodiment
- Target benchmark / simulator / dataset
- Core bottleneck being addressed
- Minimum sim-first pilot
- Mandatory metrics
- Expected failure mode if the idea does not work
- Whether the idea truly needs real hardware
Good Robotics Idea Patterns
Prefer ideas that:
- expose a real bottleneck in perception-action coupling
- improve robustness under embodiment or environment shift
- reduce operator time, reset cost, or demonstration cost
- strengthen sim2real transfer with measurable mechanisms
- improve recovery, retry behavior, or failure detection
- create a better benchmark, diagnostic, or evaluation protocol
- test an assumption the community repeats but rarely measures
Weak Robotics Idea Patterns
Downrank ideas that are mostly:
- "apply a foundation model / VLM / diffusion model to robot X" with no new bottleneck analysis
- demo-driven but not benchmarkable
- dependent on inaccessible hardware, custom sensors, or massive private datasets
- impossible to evaluate without a months-long infrastructure build
- only interesting if everything works perfectly
Filtering Rules
For each idea, reject or heavily downrank if:
- no concrete simulator or benchmark is available
- no credible baseline exists
- no measurable metric beyond "looks better"
- real robot execution is required but hardware access is unclear
- the setup depends on privileged observations that make the claim weak
- the expected contribution disappears if evaluation is made fair
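These rejection rules are mechanical enough to express as a boolean filter over candidate-idea records. A minimal sketch, where the dict keys (`benchmark`, `baselines`, `metrics`, `needs_real_robot`, `hardware_access`, `privileged_obs_only`) are assumed names, not a defined schema:

```python
# Sketch: the rejection rules above as a hard filter over idea dicts.
# Keys are illustrative assumptions; "heavily downrank" is collapsed
# into reject here for simplicity.

def passes_filter(idea):
    if not idea.get("benchmark"):        # no concrete simulator or benchmark
        return False
    if not idea.get("baselines"):        # no credible baseline
        return False
    if not idea.get("metrics"):          # nothing measurable beyond "looks better"
        return False
    if idea.get("needs_real_robot") and not idea.get("hardware_access"):
        return False                     # hardware required but access unclear
    if idea.get("privileged_obs_only"):  # claim collapses under fair evaluation
        return False
    return True
```

In practice a scored ranking (downrank rather than reject) may fit better for borderline cases; the hard filter just makes the minimum bar explicit.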
Checkpoint: Present the ranked robotics ideas before novelty checking:
💡 Robotics ideas generated. Top candidates:
1. [Idea 1] — Embodiment: [...] — Benchmark: [...] — Pilot: sim/offline — Risk: LOW/MEDIUM/HIGH
2. [Idea 2] — Embodiment: [...] — Benchmark: [...] — Pilot: sim/offline — Risk: LOW/MEDIUM/HIGH
3. [Idea 3] — requires hardware / weak benchmark / high risk
Should I carry the top sim-first ideas into novelty checking and external review?
(If no response, I'll continue with the strongest benchmark-grounded ideas.)
- User picks ideas (or no response + AUTO_PROCEED=true) → proceed to Phase 3 with the top sim-first ideas, then continue to Phase 4 and Phase 5.
- User wants different constraints → update the robotics frame and re-run Phase 2.
- User wants narrower scope → go back to Phase 1 with a tighter embodiment / task / benchmark focus.
Phase 3: Feasibility and Pilot Design
For the top ideas, design a minimal validation package.
If the repository already contains a usable simulator, benchmark harness, or offline dataset pipeline, you may validate the top 1-3 ideas there. If not, do not force execution. Produce a concrete pilot plan instead.
By default, pilots should be one of:
- simulation pilot
- offline log / dataset pilot
- analysis-only pilot using existing benchmark outputs
Only propose a real-robot pilot if the user explicitly wants that.
For each surviving idea, specify:
- Embodiment:
- Benchmark / simulator:
- Baselines:
- Pilot type: sim / offline / real
- Compute estimate:
- Human/operator time:
- Success metrics:
- Failure metrics:
- Safety concerns:
- What result would count as positive signal:
- What negative result would still be publishable:
Real Robot Rule
Never auto-proceed to physical robot testing. If an idea needs hardware:
- mark it as needs physical validation
- design the sim or offline precursor first
- ask for explicit user confirmation before any real-robot step
If no cheap sim/offline pilot exists, keep the idea in the report but label it high execution risk.
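The gate above can be sketched as one helper: hardware-dependent ideas fall back to a sim precursor unless the user has explicitly approved real-robot work. The function and key names are illustrative assumptions:

```python
# Sketch: the real-robot gate. `user_approved_hardware` stands in for
# an explicit user confirmation; key names are assumptions.

def pilot_type(idea, user_approved_hardware=False):
    """Pick the pilot type for an idea, never defaulting to hardware."""
    if idea.get("needs_real_robot"):
        if not user_approved_hardware:
            # hardware needed but not approved: run the sim/offline precursor
            return "sim-precursor (needs physical validation)"
        return "real"
    return "sim" if idea.get("simulator") else "offline"
```

Note the asymmetry: `"real"` is only reachable through an explicit approval flag, matching the REAL_ROBOT_PILOTS constant.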
After Phase 3, continue to Phase 4 even if you only produced a pilot plan rather than running a pilot. Lack of immediate execution is not a reason to stop the workflow.
Phase 4: Deep Novelty Verification
For each top idea, run:
/novelty-check "[idea description with embodiment + task family + benchmark + sensor stack + controller/policy class + sim2real angle + target venues: CoRL/RSS/ICRA/IROS/RA-L]"
Robotics novelty checks must include:
- embodiment
- task family
- benchmark / simulator
- sensor stack
- controller / policy type
- sim2real or safety angle if relevant
Be especially skeptical of ideas that are just:
- old method + new benchmark
- VLA/VLM + standard manipulation benchmark
- sim2real claim without new transfer mechanism
If the method is not novel but the finding or evaluation protocol is, say that explicitly.
Phase 5: External Robotics Review
Invoke:
/research-review "[top idea with robotics framing, embodiment, benchmark, baselines, pilot plan, evaluation metrics, and sim2real/hardware risks — review as CoRL/RSS/ICRA reviewer]"
Frame the reviewer as a senior CoRL / RSS / ICRA reviewer. Ask them to focus on:
- whether the contribution is really new for robotics, not just ML
- the minimum benchmark package needed for credibility
- whether the sim2real story is justified
- missing baselines or failure analyses
- whether the idea survives realistic infrastructure constraints
Update the report with the reviewer's minimum viable evidence package.
Phase 6: Final Report
Write or update IDEA_REPORT.md with a robotics-specific structure so it stays compatible with downstream workflows.
Robotics Idea Discovery Report
Direction: $ARGUMENTS
Date: [today]
Pipeline: research-lit → idea-creator (robotics framing) → novelty-check → research-review
Robotics Problem Frame
- Embodiment:
- Task family:
- Observation / action interface:
- Available assets:
- Constraints:
Landscape Matrix
[grouped by embodiment, benchmark, and bottleneck]
Ranked Ideas
Idea 1: [title] — RECOMMENDED
- Embodiment:
- Benchmark / simulator:
- Bottleneck addressed:
- Pilot type: sim / offline / real
- Positive signal:
- Novelty:
- Reviewer score:
- Hardware risk:
- Next step:
Eliminated Ideas
- [idea] — killed because benchmark unclear / hardware inaccessible / novelty weak / no fair evaluation
Evidence Package for the Top Idea
- Required baselines:
- Required metrics:
- Required failure cases:
- Whether real robot evidence is mandatory:
Next Steps
- Implement sim-first pilot
- Run /novelty-check on the final idea wording
- Only after approval: consider hardware validation
Key Rules
- Simulation first. Hardware is never the default.
- Benchmark specificity is mandatory. No benchmark, no serious idea.
- Evaluation must include failures. Success rate alone is not enough.
- Embodiment matters. Do not assume a result on one robot transfers to another.
- Avoid foundation-model theater. Novel terminology is not novelty.
- Infrastructure realism matters. Operator time, reset burden, and safety count as research constraints.
- If the contribution is mainly diagnostic or evaluative, say so. That can still be publishable.
Composing with Later Work
After this workflow identifies a strong robotics idea:
/idea-discovery-robot "direction" ← you are here
implement sim-first pilot
/run-experiment ← if infrastructure exists
/auto-review-loop "top robotics idea"
If no simulator or benchmark is available yet, stop at the report and ask the user to choose whether to build infrastructure or pivot to a more executable idea.