ai-research-explore

Purpose

Use this as the Rigor Explore compatible skill slug after the researcher explicitly authorizes candidate-only work on top of a durable current_research anchor. The installed slug remains ai-research-explore for compatibility. Rigor Explore is for meaningful and potentially novel deep learning research candidates while preserving scientific rigor, comparability, reproducibility, and auditable collaboration. Novelty and significance remain hypotheses before literature contrast, ablation evidence, and fair comparison. The skill does not promise autonomous discovery, global benchmark completeness, novelty proof, or trusted reproduction success.
Start from the shared operating principles in ../../references/agent-operating-principles.md, then load ../../references/research-rigor-principles.md for research claims and ../../references/deep-learning-experiment-principles.md when experiment details affect comparability or reproducibility.

Fit

Use this skill only when the request has both:
  • Explicit exploration authorization such as candidate-only work, isolated branch or worktree, sweep, several variants, or exploratory ranking.
  • A durable current_research context such as a branch, commit, checkpoint, run record, or already-trained local model state.
Keep narrow code-only requests on explore-code. Keep narrow run-only requests on explore-run. Keep passive repository analysis on analyze-project. Keep README-first reproduction on ai-research-reproduction.
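The routing rule above can be sketched as a small function. This is an illustration only, not part of the skill: the Request fields and the route_request helper are hypothetical names, and checking the narrow scopes before the two explore conditions is an assumption. Only the skill slugs come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    # All fields are hypothetical; the skill does not define a request schema.
    authorizes_exploration: bool  # e.g. candidate-only work, sweep, isolated worktree
    has_durable_anchor: bool      # e.g. branch, commit, checkpoint, run record
    scope: str = "campaign"       # "code", "run", "analysis", "reproduction", or "campaign"


# Narrow requests stay on their dedicated skills.
NARROW_ROUTES = {
    "code": "explore-code",
    "run": "explore-run",
    "analysis": "analyze-project",
    "reproduction": "ai-research-reproduction",
}


def route_request(req: Request) -> Optional[str]:
    """Return the skill slug to use, or None when no skill fits yet."""
    if req.scope in NARROW_ROUTES:
        return NARROW_ROUTES[req.scope]
    # Both conditions are required; either one alone is not enough.
    if req.authorizes_exploration and req.has_durable_anchor:
        return "ai-research-explore"
    return None
```

Returning None rather than a default slug keeps the "missing authorization or missing anchor" case visible to the caller, who can then ask the researcher for what is missing.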
仅当请求同时满足以下两个条件时,方可使用本技能:
  • 明确的探索授权,例如仅候选方案工作、隔离分支或工作树、参数扫描、多种变体或探索性排名。
  • 稳定的
    current_research
    上下文,例如分支、提交记录、检查点、运行记录或已训练完成的本地模型状态。
纯代码类窄范围请求请使用
explore-code
。纯运行类窄范围请求请使用
explore-run
。被动仓库分析请使用
analyze-project
。以README优先的复现工作请使用
ai-research-reproduction

Research Rhythm

Use a two-loop rhythm:
  • Outer loop: understand the repository, freeze task/dataset/evaluation/budget, preserve user ideas, map sources, gate ideas, and decide whether the next experiment is worth running.
  • Inner loop: make one bounded candidate change or run, smoke-check it, collect evidence, rank it against the current anchor, and either stop or return to the outer loop with the new evidence.
This rhythm is a guide, not a rigid autonomous loop. Stop at explicit blockers, unclear scientific meaning, exhausted budget, missing anchor/evaluation, or a human checkpoint.
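One way to picture the two loops is the sketch below. It assumes a single scalar metric where higher is better, a run budget counted in candidate runs, and a hypothetical run_candidate callable that returns None when it hits a blocker such as a failed smoke check. None of these names come from the skill itself.

```python
from typing import Callable, Iterable, Optional


def explore(
    ideas: Iterable[str],
    anchor_score: float,
    run_candidate: Callable[[str], Optional[float]],
    budget: int,
) -> dict:
    """Outer loop picks the next idea; inner loop runs one bounded candidate."""
    evidence = []  # (idea, score) pairs collected by the inner loop
    stop_reason = "ideas exhausted"
    for idea in ideas:                   # outer loop: decide what is worth running
        if budget <= 0:
            stop_reason = "budget exhausted"
            break
        budget -= 1
        score = run_candidate(idea)      # inner loop: one bounded change or run
        if score is None:                # explicit blocker: stop instead of pushing on
            stop_reason = f"blocker on {idea!r}"
            break
        evidence.append((idea, score))
    # Rank against the anchor using observed evidence only.
    better = sorted((e for e in evidence if e[1] > anchor_score), key=lambda e: -e[1])
    return {"evidence": evidence, "better_than_anchor": better, "stop_reason": stop_reason}
```

The function always returns a stop reason, which mirrors the point that the rhythm is a guide with explicit exits and not a loop that runs until something improves.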

Workflow

  1. Confirm current_research and explicit explore-lane authorization.
  2. Accept either legacy variant_spec or higher-level research_campaign.
  3. In campaign mode, freeze the task, dataset, benchmark, evaluation source, SOTA reference, and budget before candidate work.
  4. Build only the repo-understanding artifacts needed for the current campaign, usually through analyze-project.
  5. Run bounded, cache-first source lookup when source support matters; prefer local curated literature such as Zotero if available, then seed sources, repo-local locators, public locators, or optional web lookup. Treat lookup as source resolution, not an open-ended literature search.
  6. Preserve researcher-provided ideas, optionally add a small bounded set of single-variable seed ideas, and rank ideas with explicit gates and score breakdowns.
  7. Prefer one clear candidate at a time. Use explore-code for bounded code adaptation and explore-run for short-cycle trials or sweeps.
  8. Use minimal-run-and-audit or run-train only when the exploratory plan requires real execution evidence.
  9. Write candidate-only outputs to analysis_outputs/, sources/, and explore_outputs/ as appropriate; never present exploratory gains as trusted reproduction success. Include SCIENTIFIC_CHANGELOG.md and COMPARABILITY_REPORT.md for candidate scientific meaning and comparison boundaries.
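The cache-first lookup in step 5 could look roughly like the sketch below. The tier order follows the text; the tier names, the resolver callables, the cache shape, and the allow_web flag are all hypothetical and stand in for whatever the real scripts use.

```python
from typing import Callable, Dict, Optional, Tuple

# Priority order taken from step 5; the tier names themselves are made up here.
LOOKUP_ORDER = ["local_library", "seed_sources", "repo_locators", "public_locators", "web"]


def resolve_source(
    key: str,
    resolvers: Dict[str, Callable[[str], Optional[str]]],
    cache: Dict[str, Tuple[str, str]],
    allow_web: bool = False,
) -> Optional[Tuple[str, str]]:
    """Return (tier, location) for a source key, checking the cache first."""
    if key in cache:
        return cache[key]
    for tier in LOOKUP_ORDER:
        if tier == "web" and not allow_web:
            continue  # web lookup is optional
        resolver = resolvers.get(tier)
        if resolver is None:
            continue  # tier not available in this environment
        location = resolver(key)
        if location is not None:
            cache[key] = (tier, location)
            return cache[key]
    return None  # unresolved: report it rather than widening the search
```

Stopping at the first tier that resolves the key, and returning None otherwise, keeps the lookup bounded: it resolves known sources and does not turn into an open-ended literature search.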

Ranking and Evidence

  • Before execution, prioritize candidates by expected gain, cost, success likelihood, patch surface, dependency drag, evaluation risk, and rollback ease.
  • After execution, rank by real evidence first: command status, observed metrics, artifacts, changed paths, smoke results, and reproducibility notes.
  • Keep researcher-provided evaluation_source and sota_reference frozen for the campaign; do not claim they are globally complete.
  • If the top ideas are too close or the implementation cannot be decomposed into auditable units, stop for a checkpoint instead of silently choosing.
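A pre-execution ranking with a score breakdown and a "too close" checkpoint might be sketched as follows. The factor names follow the first bullet above; the weights, the 0 to 1 scale for each factor, and the tie margin are assumptions chosen for illustration, not values the skill prescribes. Post-execution ranking would use observed evidence and is not covered here.

```python
# Positive weights reward a factor, negative weights penalize it (assumed values).
WEIGHTS = {
    "expected_gain": 2.0,
    "success_likelihood": 1.5,
    "rollback_ease": 0.5,
    "cost": -1.0,
    "patch_surface": -0.5,
    "dependency_drag": -0.5,
    "evaluation_risk": -1.0,
}


def score_idea(factors):
    """Return (total, breakdown) so the ranking stays auditable."""
    breakdown = {name: WEIGHTS[name] * factors.get(name, 0.0) for name in WEIGHTS}
    return sum(breakdown.values()), breakdown


def rank_ideas(ideas, tie_margin=0.1):
    """Rank ideas by pre-execution score; flag a checkpoint when the top two are too close."""
    scored = sorted(
        ((name, *score_idea(factors)) for name, factors in ideas.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    needs_checkpoint = len(scored) > 1 and scored[0][1] - scored[1][1] < tie_margin
    return scored, needs_checkpoint


ideas = {
    "wider-mlp": {"expected_gain": 0.5, "cost": 0.5},
    "new-lr-schedule": {"expected_gain": 0.5, "cost": 0.25},
}
ranked, needs_checkpoint = rank_ideas(ideas)
```

Keeping the per-factor breakdown next to the total lets a reviewer see why one idea outranked another, and the needs_checkpoint flag hands near-ties back to the researcher instead of picking silently.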

Campaign Inputs

research_campaign is preferred for Rigor Explore campaigns, but it should stay minimal. The durable core is:
  • current_research
  • task_family
  • dataset
  • benchmark
  • evaluation_source
  • sota_reference
  • compute_budget
Use candidate_ideas, variant_spec, research_lookup, idea_policy, idea_generation, source_constraints, feasibility_policy, baseline_gate, and execution_policy as optional guidance, not as fields the agent must fill for every campaign. See references/research-campaign-spec.md for the advanced schema and artifact expectations.
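As a rough illustration, a minimal campaign and a check for its durable core could be written as below. The field names come from the list above; the plain-dict representation, the placeholder values, and the missing_core_fields helper are assumptions, and the actual schema lives in references/research-campaign-spec.md.

```python
CORE_FIELDS = (
    "current_research",
    "task_family",
    "dataset",
    "benchmark",
    "evaluation_source",
    "sota_reference",
    "compute_budget",
)


def missing_core_fields(campaign: dict) -> list:
    """List core fields that are absent or empty; optional fields are never required."""
    return [name for name in CORE_FIELDS if not campaign.get(name)]


# Every value is a placeholder to be filled in by the researcher.
campaign = {
    "current_research": "<branch, commit, or checkpoint>",
    "task_family": "<task family>",
    "dataset": "<dataset name>",
    "benchmark": "<benchmark name>",
    "evaluation_source": "<where the evaluation protocol comes from>",
    "sota_reference": "<researcher-provided reference result>",
    "compute_budget": "<budget>",
    # Optional guidance such as candidate_ideas can be added, but is not checked.
    "candidate_ideas": ["<idea>"],
}

missing = missing_core_fields(campaign)
```

A non-empty result is a reason to stop and ask before candidate work starts, since the core fields are the ones frozen for the whole campaign.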

Reference Loading

  • Load references/ai-research-explore-policy.md for lane safety and candidate semantics.
  • Load references/research-campaign-spec.md only when a campaign file is present or the user asks for Rigor Explore campaign governance.
  • Load ../../references/explore-variant-spec.md for run-level variant matrix details.
  • Load ../../references/research-rigor-principles.md before making novelty, contribution, SOTA, or comparability statements.
  • Load ../../references/deep-learning-experiment-principles.md when training, evaluation, baseline, ablation, metric, checkpoint, or dataset details matter.
  • Use scripts/orchestrate_explore.py and scripts/write_outputs.py for the existing deterministic artifact workflow.
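The conditional loading rules above can be summarized in a short sketch. The file paths are the ones listed; the function name, the boolean flags, and the choice to always load the policy file first are assumptions made for the example.

```python
def references_to_load(
    *,
    campaign_file_present: bool = False,
    governance_requested: bool = False,
    run_level_variants: bool = False,
    making_claims: bool = False,
    experiment_details_matter: bool = False,
) -> list:
    """Return the reference files to load, in the order the rules list them."""
    # Assumed to be loaded in every case, for lane safety and candidate semantics.
    refs = ["references/ai-research-explore-policy.md"]
    if campaign_file_present or governance_requested:
        refs.append("references/research-campaign-spec.md")
    if run_level_variants:
        refs.append("../../references/explore-variant-spec.md")
    if making_claims:  # novelty, contribution, SOTA, or comparability statements
        refs.append("../../references/research-rigor-principles.md")
    if experiment_details_matter:  # training, evaluation, baseline, ablation, ...
        refs.append("../../references/deep-learning-experiment-principles.md")
    return refs
```

Loading only what the current request needs keeps the working context small, which is the point of making most of these references conditional.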