research-wiki


Research Wiki: Persistent Research Knowledge Base


Subcommand: $ARGUMENTS

Overview


The research wiki is a persistent, per-project knowledge base that accumulates structured knowledge across the entire ARIS research lifecycle. Unlike one-off literature surveys that are used and forgotten, the wiki compounds — every paper read, idea tested, experiment run, and review received makes the wiki smarter.
Inspired by Karpathy's LLM Wiki pattern: compile knowledge once, keep it current, don't re-derive on every query.

Core Concepts


Four Entity Types


| Entity | Directory | Node ID format | What it represents |
|---|---|---|---|
| Paper | `papers/` | `paper:<slug>` | A published or preprint research paper |
| Idea | `ideas/` | `idea:<id>` | A research idea (proposed, tested, or failed) |
| Experiment | `experiments/` | `exp:<id>` | A concrete experiment run with results |
| Claim | `claims/` | `claim:<id>` | A testable scientific claim with evidence status |

Typed Relationships (`graph/edges.jsonl`)
| Edge type | From → To | Meaning |
|---|---|---|
| `extends` | paper → paper | Builds on prior work |
| `contradicts` | paper → paper | Disagrees with results/claims |
| `addresses_gap` | paper\|idea → gap | Targets a known field gap |
| `inspired_by` | idea → paper | Idea sourced from this paper |
| `tested_by` | idea\|claim → exp | Tested in this experiment |
| `supports` | exp → claim\|idea | Experiment confirms claim |
| `invalidates` | exp → claim\|idea | Experiment disproves claim |
| `supersedes` | paper → paper | Newer work replaces older |

Edges are stored in `graph/edges.jsonl` only. The `## Connections` section on each page is auto-generated from the graph; never hand-edit it.
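
The From → To constraints in the edge table can be checked mechanically before anything is written to the graph. The following is a minimal sketch, not the helper's actual implementation: the function names (`node_kind`, `validate_edge`, `append_edge`) are hypothetical, and the JSONL field names (`from`, `to`, `type`, `evidence`) are an assumption modeled on the `add_edge` command-line flags.

```python
import json
from pathlib import Path

# Allowed (source kinds, target kinds) per edge type, copied from the edge table.
EDGE_RULES = {
    "extends": ({"paper"}, {"paper"}),
    "contradicts": ({"paper"}, {"paper"}),
    "addresses_gap": ({"paper", "idea"}, {"gap"}),
    "inspired_by": ({"idea"}, {"paper"}),
    "tested_by": ({"idea", "claim"}, {"exp"}),
    "supports": ({"exp"}, {"claim", "idea"}),
    "invalidates": ({"exp"}, {"claim", "idea"}),
    "supersedes": ({"paper"}, {"paper"}),
}


def node_kind(node_id: str) -> str:
    """Return the prefix of a canonical node ID, e.g. 'paper' for 'paper:x'."""
    kind, sep, rest = node_id.partition(":")
    if not sep or not rest:
        raise ValueError(f"not a canonical node ID: {node_id!r}")
    return kind


def validate_edge(src: str, dst: str, edge_type: str) -> None:
    """Raise ValueError if the edge does not match the From -> To table."""
    if edge_type not in EDGE_RULES:
        raise ValueError(f"unknown edge type: {edge_type!r}")
    allowed_src, allowed_dst = EDGE_RULES[edge_type]
    if node_kind(src) not in allowed_src or node_kind(dst) not in allowed_dst:
        raise ValueError(f"{edge_type}: {src} -> {dst} is not allowed")


def append_edge(edges_path: Path, src: str, dst: str,
                edge_type: str, evidence: str = "") -> None:
    """Validate, then append one JSON object per line (field names are assumed)."""
    validate_edge(src, dst, edge_type)
    record = {"from": src, "to": dst, "type": edge_type, "evidence": evidence}
    with edges_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Rejecting a malformed edge at write time keeps `graph/edges.jsonl` usable as the single source from which the `## Connections` sections are generated.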

Wiki Directory Structure


research-wiki/
  index.md               # categorical index (auto-generated)
  log.md                 # append-only timeline
  gap_map.md             # field gaps with stable IDs (G1, G2, ...)
  query_pack.md          # compressed summary for /idea-creator (auto-generated, max 8000 chars)
  papers/
    <slug>.md            # one page per paper
  ideas/
    <idea_id>.md         # one page per idea
  experiments/
    <exp_id>.md          # one page per experiment
  claims/
    <claim_id>.md        # one page per testable claim
  graph/
    edges.jsonl          # materialized current relationship graph

Subcommands


/research-wiki init


Initialize the wiki for the current project:
  1. Create the `research-wiki/` directory structure
  2. Create empty `index.md`, `log.md`, and `gap_map.md`
  3. Create empty `graph/edges.jsonl`
  4. Log: "Wiki initialized"
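
The four init steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the shipped implementation: `init_wiki` is a hypothetical name, and the timestamped log line format is invented for the example (the source only specifies the message "Wiki initialized").

```python
from datetime import datetime, timezone
from pathlib import Path

SUBDIRS = ("papers", "ideas", "experiments", "claims", "graph")
EMPTY_FILES = ("index.md", "log.md", "gap_map.md", "graph/edges.jsonl")


def init_wiki(project_root: Path) -> Path:
    """Create the research-wiki/ skeleton; safe to call on an existing wiki."""
    wiki = project_root / "research-wiki"
    for sub in SUBDIRS:
        (wiki / sub).mkdir(parents=True, exist_ok=True)
    for rel in EMPTY_FILES:
        (wiki / rel).touch(exist_ok=True)  # creates if missing, never truncates
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    with (wiki / "log.md").open("a", encoding="utf-8") as fh:
        fh.write(f"- {stamp} Wiki initialized\n")  # log line format is assumed
    return wiki
```

Using `touch` and append mode means a repeated init adds a log entry but leaves existing pages and edges untouched.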

/research-wiki ingest "<paper title>" — arxiv: <id>


Add a paper to the wiki. This subcommand is a thin wrapper around the canonical helper `python3 tools/research_wiki.py ingest_paper …`, which is the single implementation of paper ingest in ARIS (per `shared-references/integration-contract.md`: one helper, no copies). The helper does all of the following:
  1. Fetch metadata: queries the arXiv Atom API when `--arxiv-id` is given
  2. Generate slug: `<first_author_last_name><year>_<keyword>`
  3. Check dedup: skips an existing page unless `--update-on-exist` is set
  4. Create page: `papers/<slug>.md` with the schema below
  5. Rebuild `index.md` and `query_pack.md`
  6. Append to `log.md`
Edge extraction (step 5/8 in the old manual flow) is not part of `ingest_paper`; do it as a follow-up, calling `add_edge` once per relationship identified:
```bash
# arXiv-known paper
python3 tools/research_wiki.py ingest_paper research-wiki/ \
    --arxiv-id 2501.12345 --thesis "One-line claim from abstract."

# Venue paper with no arXiv mirror
python3 tools/research_wiki.py ingest_paper research-wiki/ \
    --title "Attention Is All You Need" \
    --authors "Ashish Vaswani, Noam Shazeer, …" --year 2017 --venue "NeurIPS"

# Manual edge after ingest
python3 tools/research_wiki.py add_edge research-wiki/ \
    --from "paper:vaswani2017_attention_all_you" \
    --to "paper:chen2025_factorized_gap" \
    --type "extends" --evidence "Section 3.2: adapts the encoder block …"
```

Other skills (`/research-lit`, `/arxiv`, `/alphaxiv`, `/deepxiv`,
`/semantic-scholar`, `/exa-search`) call the same helper directly in
their own last step. They do not re-route through `/research-wiki
ingest` as a subcommand, so they need no LLM roundtrip.
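
The slug format `<first_author_last_name><year>_<keyword>` can be illustrated with a short sketch. The stopword list, the three-word keyword limit, and the name `make_slug` are assumptions chosen so that the example reproduces the `vaswani2017_attention_all_you` slug used in the commands above; the real helper may derive the keyword differently.

```python
import re

# Assumed stopword list and keyword length; the real helper may differ.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "for", "and", "in", "on", "to", "with"}
MAX_KEYWORDS = 3


def make_slug(first_author: str, year: int, title: str) -> str:
    """Build '<first_author_last_name><year>_<keyword>' from paper metadata.

    Expects a non-empty author name; the last whitespace-separated token
    is treated as the last name.
    """
    last_name = re.sub(r"[^a-z]", "", first_author.split()[-1].lower())
    words = re.findall(r"[a-z0-9]+", title.lower())
    keywords = [w for w in words if w not in STOPWORDS][:MAX_KEYWORDS]
    return f"{last_name}{year}_{'_'.join(keywords)}"


print(make_slug("Ashish Vaswani", 2017, "Attention Is All You Need"))
# vaswani2017_attention_all_you
```

A deterministic slug is what makes the dedup check possible: ingesting the same paper twice yields the same `papers/<slug>.md` path.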

/research-wiki sync — arxiv-ids <id1>,<id2>,...


Batch backfill: ingest one or more arXiv IDs that were read earlier without being ingested (e.g., because `research-wiki/` was set up after the reading happened, or a hook didn't fire).

```bash
# Explicit list
python3 tools/research_wiki.py sync research-wiki/ \
    --arxiv-ids 2310.06770,1706.03762

# From a file (one id per line, # comments ok)
python3 tools/research_wiki.py sync research-wiki/ --from-file ids.txt
```

Dedup is handled per-id; already-ingested papers are skipped silently.
This is the recommended **manual repair** step (see integration
contract §5 Backfill). `sync` does not scan session traces; callers
declare the ids explicitly.
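
The `--from-file` input format (one id per line, `#` comments allowed) is simple enough to sketch. The function below is hypothetical and only illustrates the parsing; treating text after a `#` on the same line as a trailing comment, and dropping repeated ids, are assumptions beyond what the source states.

```python
from pathlib import Path


def read_arxiv_ids(path: Path) -> list[str]:
    """Return unique arXiv IDs in file order, ignoring blanks and # comments."""
    seen: set[str] = set()
    ids: list[str] = []
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.split("#", 1)[0].strip()  # drop full-line and trailing comments
        if line and line not in seen:
            seen.add(line)
            ids.append(line)
    return ids
```

Because dedup against the wiki is handled per-id by `sync` itself, a file like this can be re-run safely after adding new ids to the end.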

**Paper page schema** (exactly what `ingest_paper` emits; do not
handwrite alternative fields, because `lint` will flag drift):

```markdown
---
type: paper
node_id: paper:<slug>
title: "<full title>"
authors: ["First A. Author", "Second B. Author"]
year: 2025
venue: "arXiv"
external_ids:
  arxiv: "2501.12345"
  doi: null
  s2: null
tags: ["tag1", "tag2"]
added: 2026-04-07T10:12:00Z
---

# <full title>

## One-line thesis
[Single sentence capturing the paper's core contribution]

## Problem / Gap

## Method

## Key Results

## Assumptions

## Limitations / Failure Modes

## Reusable Ingredients
[Techniques, datasets, or insights that could be repurposed]

## Open Questions

## Claims
[Reference claim pages: claim:C1, claim:C2, etc.]

## Connections
[AUTO-GENERATED from graph/edges.jsonl — do not edit manually]

## Relevance to This Project
[Why this paper matters for our specific research direction]
```

_Additionally, when the paper was ingested via `--arxiv-id` and the arXiv
API returned an abstract, the helper appends an `## Abstract (original)`
section after `Relevance to This Project` containing the raw abstract
text as a blockquote. Manual ingests (no `--arxiv-id`) do not include
this section._

/research-wiki query "<topic>"


Generate `query_pack.md`, a compressed, context-window-friendly summary.
Fixed budget (max 8000 chars / ~2000 tokens):

| Section | Budget | Content |
|---|---|---|
| Project direction | 300 chars | From CLAUDE.md or RESEARCH_BRIEF.md |
| Top 5 gaps | 1200 chars | From gap_map.md, ranked by: unresolved + linked ideas + failed experiments |
| Paper clusters | 1600 chars | 3-5 clusters by tag overlap, 2-3 sentences each |
| Failed ideas | 1400 chars | Always included; highest anti-repetition value |
| Top papers | 1800 chars | 8-12 pages ranked by: linked gaps, linked ideas, centrality, relevance flag |
| Active chains | 900 chars | limitation → opportunity relationship chains |
| Open unknowns | 500 chars | Unresolved questions across the wiki |

Pruning priority (when over budget): low-ranked papers > cluster detail > chain detail. Never prune failed ideas or top gaps first.
Key rule: Read from short fields only (frontmatter, one-line thesis, gap summary, failure note). Do not summarize full page bodies every time.
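
The fixed budget lends itself to deterministic assembly. The sketch below is one possible reading, with hypothetical names (`render`, `build_query_pack`): each section is clipped to its budget, and if the rendered pack still exceeds the cap, the prunable sections are shortened in the stated priority order. It assumes each section's text is already ordered most-important-first, so clipping the tail drops the lowest-ranked content. The per-section budgets sum to 7,700 characters, which leaves 300 for headings and separators.

```python
# Per-section budgets in characters, copied from the budget table.
BUDGETS = {
    "Project direction": 300,
    "Top 5 gaps": 1200,
    "Paper clusters": 1600,
    "Failed ideas": 1400,
    "Top papers": 1800,
    "Active chains": 900,
    "Open unknowns": 500,
}
MAX_CHARS = 8000
# Sections shortened first when over budget: papers, then clusters, then chains.
PRUNE_ORDER = ("Top papers", "Paper clusters", "Active chains")


def render(sections: dict[str, str]) -> str:
    """Join non-empty sections under '## <name>' headings (layout is assumed)."""
    return "\n\n".join(f"## {name}\n{text}" for name, text in sections.items() if text)


def build_query_pack(sections: dict[str, str]) -> str:
    """Clip each section to its budget, then prune in priority order if needed."""
    clipped = {name: sections.get(name, "")[:limit] for name, limit in BUDGETS.items()}
    for name in PRUNE_ORDER:
        overflow = len(render(clipped)) - MAX_CHARS
        if overflow <= 0:
            break
        clipped[name] = clipped[name][: max(0, len(clipped[name]) - overflow)]
    return render(clipped)
```

Failed ideas and top gaps never appear in `PRUNE_ORDER`, which mirrors the rule that they are the last things to lose space.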

/research-wiki update <node_id> — <field>: <value>


Update a specific entity:
/research-wiki update paper:chen2025 — relevance: core
/research-wiki update idea:001 — outcome: negative
/research-wiki update claim:C1 — status: invalidated
After any update: rebuild `query_pack.md` and update `log.md`.
/research-wiki lint


Health-check the wiki:
  1. Orphan pages: entities with zero edges
  2. Stale claims: claims with `status: reported` older than 14 days
  3. Contradictions: claims with both `supports` and `invalidates` edges
  4. Missing connections: papers sharing 2+ tags but no explicit relationship
  5. Dead ideas: `stage: proposed` ideas that were never tested
  6. Sparse pages: pages with 3+ empty sections
Output a `LINT_REPORT.md` with suggested fixes.
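
Two of the lint checks, orphan pages and contradictions, depend only on the edge graph and can be sketched without reading any page bodies. The function names are hypothetical, and the JSONL field names (`from`, `to`, `type`) are an assumption modeled on the `add_edge` flags rather than a documented schema.

```python
import json
from collections import defaultdict


def load_edges(lines):
    """Parse JSONL edge records; 'from', 'to', 'type' field names are assumed."""
    return [json.loads(line) for line in lines if line.strip()]


def find_orphans(node_ids, edges):
    """Nodes that appear in no edge, either as source or as target."""
    connected = {e["from"] for e in edges} | {e["to"] for e in edges}
    return sorted(set(node_ids) - connected)


def find_contradictions(edges):
    """Claims that receive both a 'supports' and an 'invalidates' edge."""
    verdicts = defaultdict(set)
    for e in edges:
        if e["type"] in ("supports", "invalidates"):
            verdicts[e["to"]].add(e["type"])
    return sorted(
        target for target, kinds in verdicts.items()
        if target.startswith("claim:") and len(kinds) == 2
    )
```

The remaining checks (stale claims, dead ideas, sparse pages) would additionally need page frontmatter and section contents, which this sketch does not cover.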

/research-wiki stats


Quick overview:
📚 Research Wiki Stats
Papers: 28 (12 core, 10 related, 6 peripheral)
Ideas: 7 (2 active, 3 failed, 1 partial, 1 succeeded)
Experiments: 12
Claims: 15 (5 supported, 3 invalidated, 7 reported)
Edges: 64
Gaps: 8 (3 unresolved)
Last updated: 2026-04-07T10:12:00Z
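
The top-level counts in the stats output follow directly from the directory layout: one page per entity, one line per edge. A minimal sketch, assuming a hypothetical `wiki_counts` function; the per-status breakdowns (core/related, supported/invalidated, and so on) would require reading frontmatter and are left out.

```python
from pathlib import Path

ENTITY_DIRS = {
    "Papers": "papers",
    "Ideas": "ideas",
    "Experiments": "experiments",
    "Claims": "claims",
}


def wiki_counts(wiki: Path) -> dict[str, int]:
    """Count entity pages per directory and non-empty lines in edges.jsonl."""
    counts = {
        label: len(list((wiki / sub).glob("*.md")))
        for label, sub in ENTITY_DIRS.items()
    }
    edges_file = wiki / "graph" / "edges.jsonl"
    lines = edges_file.read_text(encoding="utf-8").splitlines() if edges_file.is_file() else []
    counts["Edges"] = sum(1 for line in lines if line.strip())
    return counts
```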

Integration with Existing Workflows


All paper-reading skills follow the same integration contract (see `shared-references/integration-contract.md`):
  • single predicate: `[ -d research-wiki/ ]`
  • single canonical helper: `python3 tools/research_wiki.py ingest_paper …`
  • concrete artifact: `papers/<slug>.md` + `log.md` entry
  • backfill: `sync --arxiv-ids …`
  • diagnostic: `tools/verify_wiki_coverage.sh`

Hook 1: After `/research-lit` finds papers

```
# At end of research-lit, after synthesis:
if research-wiki/ exists:
    for paper in top_relevant_papers (limit 8-12):
        python3 tools/research_wiki.py ingest_paper research-wiki/ \
            --arxiv-id <id> [--thesis "..."] [--tags "..."]
    for each explicit relation to existing wiki paper:
        python3 tools/research_wiki.py add_edge research-wiki/ \
            --from "paper:<slug>" --to "<target>" \
            --type <extends|contradicts|addresses_gap|...> \
            --evidence "..."
    log "research-lit ingested N papers"
```

Each paper-reading skill ships its own Step "Update Research Wiki (if
active)" that calls the same helper once per paper it touched. The
business logic is not duplicated; only the loop over that skill's
specific result set differs.

Hook 2: `/idea-creator` reads AND writes wiki

Before ideation:
if research-wiki/query_pack.md exists (and < 7 days old):
    prepend query_pack to landscape context
    treat failed ideas as banlist
    treat top gaps as search seeds
    still run fresh literature search for last 3-6 months
After ideation (THIS IS CRITICAL — without it, ideas/ stays empty):
for idea in all_generated_ideas (recommended + killed):
    /research-wiki upsert_idea(idea)
    for paper_id in idea.based_on:
        add_edge(idea.node_id, paper_id, "inspired_by")
    for gap_id in idea.target_gaps:
        add_edge(idea.node_id, gap_id, "addresses_gap")
rebuild query_pack
log "idea-creator wrote N ideas to wiki"
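
The "exists and is less than 7 days old" precondition on `query_pack.md` can be expressed as a small predicate. This sketch assumes that age is measured from the file's modification time, which the source does not specify; `query_pack_is_fresh` is a hypothetical name.

```python
import time
from pathlib import Path
from typing import Optional

MAX_AGE_SECONDS = 7 * 24 * 60 * 60


def query_pack_is_fresh(wiki: Path, now: Optional[float] = None) -> bool:
    """True if query_pack.md exists and was modified less than 7 days ago."""
    pack = wiki / "query_pack.md"
    if not pack.is_file():
        return False
    now = time.time() if now is None else now
    return now - pack.stat().st_mtime < MAX_AGE_SECONDS
```

When the predicate is false, a caller could either regenerate the pack or proceed with a fresh literature search alone.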

Hook 3: After `/result-to-claim` verdict

```
# Create experiment page
exp_id = upsert_experiment(experiment_data)

# Update each claim's status
for claim_id in resolved_claims:
    if verdict == "yes":
        set_claim_status(claim_id, "supported")
        add_edge(exp_id, claim_id, "supports")
    elif verdict == "partial":
        set_claim_status(claim_id, "partial")
        add_edge(exp_id, claim_id, "supports")  # partial
    else:
        set_claim_status(claim_id, "invalidated")
        add_edge(exp_id, claim_id, "invalidates")

# Update idea outcome
update_idea(active_idea_id, outcome=verdict)

# If failed, record WHY for future ideation
if verdict in ("no", "partial"):
    update_idea failure_notes with specific metrics and reasons

rebuild query_pack
log "result-to-claim: exp_id updated, verdict=..."
```

Re-ideation Trigger


After significant wiki updates, suggest re-running `/idea-creator`:
  • ≥5 new papers ingested since last ideation
  • ≥3 new failed/partial ideas since last ideation
  • New contradiction discovered in the graph
  • New gap identified that no existing idea addresses
The system suggests but does not auto-trigger; the user decides.
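
The four trigger conditions amount to a simple threshold check. In the sketch below, `WikiDelta` and `reideation_reasons` are hypothetical names, and how the counts since the last ideation are obtained (for example, by scanning `log.md`) is left open. The function only returns reasons; consistent with the rule above, it never launches `/idea-creator` itself.

```python
from dataclasses import dataclass


@dataclass
class WikiDelta:
    """Changes since the last /idea-creator run (hypothetical container)."""
    new_papers: int = 0
    new_failed_or_partial_ideas: int = 0
    new_contradictions: int = 0
    unaddressed_new_gaps: int = 0


def reideation_reasons(delta: WikiDelta) -> list[str]:
    """Return the triggered conditions; an empty list means no suggestion."""
    reasons = []
    if delta.new_papers >= 5:
        reasons.append(f"{delta.new_papers} new papers ingested")
    if delta.new_failed_or_partial_ideas >= 3:
        reasons.append(f"{delta.new_failed_or_partial_ideas} new failed/partial ideas")
    if delta.new_contradictions > 0:
        reasons.append("new contradiction in the graph")
    if delta.unaddressed_new_gaps > 0:
        reasons.append("new gap with no idea addressing it")
    return reasons
```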

Key Rules


  • One source of truth for relationships: `graph/edges.jsonl`. Page `Connections` sections are auto-generated views.
  • Canonical node IDs everywhere: `paper:<slug>`, `idea:<id>`, `exp:<id>`, `claim:<id>`, `gap:<id>`. Never use raw titles or inconsistent shorthands.
  • Failed ideas are the most valuable memory. Never prune them from query_pack.
  • query_pack.md is hard-budgeted at 8000 chars. Deterministic generation, not open-ended summarization.
  • Append to log.md for every mutation. The log is the audit trail.
  • Reviewer independence applies. When the wiki is read by cross-model review skills, pass file paths only; do not summarize wiki content for the reviewer.

Acknowledgements


Inspired by Karpathy's LLM Wiki — "compile knowledge once, keep it current, don't re-derive on every query."