litreview

Litreview — Academic Literature Orientation

Portability: Requires a Consensus MCP connection, Node.js with the `docx` package for document generation, and (in CLI) `bash_tool`. Works in Claude Code CLI natively. In Claude.ai with Consensus MCP + Code Execution, the workflow is supported.

Produce a launching pad — not a finished literature review, but an orientation document that gives a researcher entering an unfamiliar field everything they need to start reading and searching with confidence. Think: what a generous colleague who knows the field would tell you over coffee.

Agent Integrity Rules (Research-Pack Convention)

Inherited from the research-pack convention; locked verbatim per PR #657's cross-skill consistency audit.

Source discipline. Only cite Consensus-returned papers from THIS session. Training knowledge labeled `[Not from Consensus — model knowledge]` and excluded from cited count. Sparse results stated explicitly, never silently filled.
Counting discipline. Three numbers tracked: searches executed / unique papers received (deduplicated) / papers cited. Every cited paper has a retrievable Consensus URL from this session. Use `scripts/citation_tracker.py` for deterministic counts.
Tool constraints. Consensus per-query cap depends on plan tier. Detect at first search, report at checkpoint. Rate limit is 1 query/sec — sequential execution mandatory.
Retry policy. On failure → wait 3s → retry once → log. After 3 consecutive failures: stop, alert user, share what was collected.
Plan-tier detection. Parse first-search response for "Showing top 10" / "upgrade" → free tier (10/search). 20 returned → Pro (20/search). Calculate theoretical ceiling and surface at checkpoint so user can recalibrate.

See `references/search_budget_allocation.md` for the sequential-execution rationale + plan-tier signals.
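
The plan-tier detection and ceiling calculation described above can be sketched as follows. This is a minimal illustration, not the skill's actual code: the function names, the `"unknown"` fallback, and the shape of the inputs (raw response text plus a result count) are assumptions.

```python
def detect_plan_tier(response_text: str, papers_returned: int) -> tuple[str, int]:
    """Return (tier, per-search cap) inferred from the first search response."""
    lowered = response_text.lower()
    if "showing top 10" in lowered or "upgrade" in lowered:
        return "free", 10
    if papers_returned == 20:
        return "pro", 20
    # Assumption: when neither signal matches, report what was observed
    # instead of guessing a tier.
    return "unknown", papers_returned


def theoretical_ceiling(planned_searches: int, per_search_cap: int) -> int:
    """Upper bound on papers received, before deduplication."""
    return planned_searches * per_search_cap


tier, cap = detect_plan_tier("Showing top 10 results. Upgrade for more.", 10)
print(tier, cap, theoretical_ceiling(10, cap))  # prints: free 10 100
```

Surfacing the ceiling alongside the tier at the checkpoint lets the user see whether the chosen depth can realistically cover the topic.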

Error Handling

| Failure | Behavior |
| --- | --- |
| Consensus rate-limit hit | Wait 3s, retry once, log outcome |
| Search returns 0 results | Note explicitly; "either niche terminology or genuine gap"; never silently fill |
| Plan-tier cap detected | Log tier; report at checkpoint; surface in audit |
| 3 consecutive failures | Stop searching, alert user, share what's collected, ask how to proceed |
| Sub-area returns thin results (<5 papers) | Flag in audit; suggest manual PubMed/Scholar supplementation |
| User wants to adjust sub-areas | Update table, re-confirm before searching |
| DOCX validation fails | Unpack XML, fix, repack |
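
The retry and stop behavior in the table can be sketched as a sequential loop. This is an illustrative sketch only: `run_sequential`, `SearchHalted`, and the injected `search` callable are hypothetical names, and it assumes a query counts as failed only once its single retry has also failed.

```python
import time


class SearchHalted(Exception):
    """Raised after three consecutive failed queries; carries partial results."""

    def __init__(self, collected):
        super().__init__("3 consecutive failures; stopping")
        self.collected = collected


def run_sequential(queries, search, sleep=time.sleep, log=print):
    """Run queries one at a time, with a single retry per query."""
    collected = {}
    consecutive_failures = 0
    for query in queries:
        succeeded = False
        for attempt in (1, 2):  # first try, then one retry
            try:
                collected[query] = search(query)
                succeeded = True
                break
            except Exception as exc:  # sketch: any error counts as a failure
                log(f"attempt {attempt} failed for {query!r}: {exc}")
                if attempt == 1:
                    sleep(3)  # wait 3s before the single retry
        consecutive_failures = 0 if succeeded else consecutive_failures + 1
        if consecutive_failures >= 3:
            # Stop and hand back whatever was collected so far.
            raise SearchHalted(collected)
        sleep(1)  # stay within 1 query/sec
    return collected
```

Passing `sleep` and `log` as parameters keeps the loop testable without real delays; the caller catches `SearchHalted`, alerts the user, and shares `collected`.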

Phase 0: Grill-Me Intake (3 forcing questions, one at a time)

Each question carries an explicit "why I'm asking" rationale. Stop condition: at most 3 questions before Phase 1.

Q1 (root) — Research question specificity

State the research question in 1–2 sentences. Specific is better — "How do LLMs perform on clinical reasoning tasks compared to physicians?" beats "AI in medicine". Vague questions produce vague reviews.

Why I'm asking: The reconnaissance search hinges on precise terminology. Vague questions produce thin recon results that don't yield a useful framework breakdown.

Refuse mush. Re-ask once with examples if user is too broad. If still vague, deliver with explicit "broad-scope orientation, not depth review" caveat.

Q2 (depends on Q1) — Framework hint

Framework — pick one or say "you pick":

PICO (Population / Intervention / Comparison / Outcome — most clinical questions)

SPIDER (Sample / Phenomenon / Design / Evaluation / Research-type — social/qualitative)

Decomposition (Problem / Solution / Evaluation / Limitations — technology-focused)

Hybrid (you pick which components from which framework)

You pick — analyze Q1 and recommend

Why I'm asking: PICO is the default for ~70% of clinical questions but maps poorly to qualitative work or technology evaluation. Picking upfront saves the recon search from suggesting a misaligned framework.

Forcing choice with default ("you pick"). The skill surfaces its own framework recommendation after the recon search so the user can override. Use `scripts/framework_recommender.py` for the heuristic.

See `references/framework_selection.md` for PICO / SPIDER / Decomposition canon.
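
To make the idea of a framework heuristic concrete, here is one possible keyword-scoring sketch. It is not the logic of `scripts/framework_recommender.py`, which the source does not describe; the keyword lists, the tie-to-Hybrid rule, and the PICO fallback are all assumptions chosen for illustration.

```python
import re

# Hypothetical keyword lists; a real heuristic would likely be richer.
FRAMEWORK_KEYWORDS = {
    "PICO": ("patient", "treatment", "therapy", "clinical", "trial", "outcome"),
    "SPIDER": ("experience", "perception", "attitude", "qualitative", "interview"),
    "Decomposition": ("algorithm", "model", "system", "benchmark", "architecture"),
}


def recommend_framework(question: str) -> str:
    """Suggest a framework by counting keyword matches in the question."""
    words = re.findall(r"[a-z]+", question.lower())
    scores = {
        name: sum(word in keywords for word in words)
        for name, keywords in FRAMEWORK_KEYWORDS.items()
    }
    best = max(scores.values())
    if best == 0:
        return "PICO"  # assumption: fall back to the stated default
    winners = [name for name, score in scores.items() if score == best]
    # Assumption: a tie between frameworks suggests a Hybrid mapping.
    return winners[0] if len(winners) == 1 else "Hybrid"


print(recommend_framework(
    "How do LLMs perform on clinical reasoning tasks compared to physicians?"
))  # prints: PICO
```

Whatever the heuristic returns is only a recommendation; the user can still override it at the checkpoint.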

Q3 (depends on Q1) — Tentative depth

Tentative depth — pick one. Final confirmation comes after the framework breakdown:

Quick scan (5 searches)

Standard review (10 searches)

Deep dive (20 searches)

Why I'm asking: I ask this twice — once now to calibrate the recon search emphasis, once after the framework breakdown to confirm. Tentative answer affects which sub-areas to surface first; final answer drives search budget allocation.

Forcing choice. Re-asked at the post-Phase-2 checkpoint after the user has seen the framework breakdown.

Stop condition: 3 questions max before Phase 1. The post-Phase-2 checkpoint is its own grill-me moment (framework table + sub-area-adjustment + depth-reconfirmation).

Phase 1: Initial Reconnaissance

One broad Consensus search to map themes, terminology, methodological distinctions.

Query: broad version of Q1 (terminology variants are okay; first search casts wide)

Record: `citation_tracker.py --action record_search --session NAME --query "..."`

Record received count: `citation_tracker.py --action record_papers_received --session NAME --count N`

Detect plan tier from response: "Showing top 10" / "upgrade" → free; 20 returned → Pro

Synthesize for the checkpoint:

Themes that surfaced
Terminology variations (e.g., "LLM" vs "large language model" vs "GPT-style model")
Methodological distinctions (clinical trials vs benchmark eval vs case study)
Coverage gaps (sub-questions absent from recon results)
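
The three counts recorded from this phase onward (searches executed, unique papers received, papers cited) could be kept in a structure like the one below. This is a hypothetical in-memory sketch, not the implementation of `citation_tracker.py`; the class name, method names, and placeholder URL strings are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class SessionTally:
    """In-memory three-count audit for one session (illustrative only)."""

    searches_executed: int = 0
    received_urls: set = field(default_factory=set)
    cited_urls: set = field(default_factory=set)

    def record_search(self, paper_urls):
        """Count one search and deduplicate its papers by URL."""
        self.searches_executed += 1
        self.received_urls.update(paper_urls)

    def cite(self, url):
        """Allow a citation only for a paper received in this session."""
        if url not in self.received_urls:
            raise ValueError(f"not returned by a search this session: {url}")
        self.cited_urls.add(url)

    def counts(self):
        """Return (searches executed, unique papers received, papers cited)."""
        return (self.searches_executed, len(self.received_urls), len(self.cited_urls))


tally = SessionTally()
tally.record_search(["url-a", "url-b"])  # placeholder URLs
tally.record_search(["url-b", "url-c"])  # url-b is a duplicate
tally.cite("url-a")
print(tally.counts())  # prints: (2, 3, 1)
```

Rejecting citations for URLs that were never received enforces the source-discipline rule mechanically rather than by convention.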

Phase 2: Framework Selection + Sub-area Generation

Choose framework (from Q2 OR override based on recon):

PICO — most clinical questions (~70% default)
SPIDER — social / qualitative
Decomposition — technology focus (Problem / Solution / Evaluation / Limitations)
Hybrid — explicit cross-framework mapping

Generate 4-5 sub-area questions mapped to framework components. Each becomes a targeted Phase 3 search.

Checkpoint (grill-me forcing-options moment)

After Phase 2, halt and present:

3-4 sentence recon summary

What themes surfaced
Terminology landscape
Evidence landscape characterization

Framework breakdown table

| Framework Component | How It Maps to This Topic | Proposed Sub-area to Explore |
| --- | --- | --- |
| (Component 1) | ... | Sub-area 1 |
| (Component 2) | ... | Sub-area 2 |
| (Component 3) | ... | Sub-area 3 |
| (Component 4) | ... | Sub-area 4 |
| Cross-cutting theme | ... | Sub-area 5 |

Depth re-confirmation (forcing choice)

Surface the practical constraint: detected plan tier + theoretical ceiling.

Quick scan (5 searches × ~10 results each = ~50 papers max)
Standard review (10 searches × ~10 = ~100 papers)
Deep dive (20 searches × ~10 = ~200 papers)

Sub-area forcing options

"Looks good — proceed with these sub-areas"
"Adjust: add sub-area on [X]"
"Adjust: remove and replace [Y] with [Z]"
"Restart with different framework"

Why I'm asking (the rationale)

A wrong framework or sub-area set wastes the search budget. This is the last cheap moment to correct course.

Wait for user response before Phase 3. Refuse to start Phase 3 without explicit user choice.

Phase 3: Targeted Searches

Sequential (1 query/sec), budget per depth tier. See `references/search_budget_allocation.md` for full canon.

Quick scan (5 searches)

5 sub-area searches (one per sub-area)
Skip era-gated + review-specific

Standard review (10 searches)

5 sub-area searches
2 review article searches (top 2 sub-areas): "systematic review [topic]" / "meta-analysis [topic]"
2 era-gated searches (most important sub-area): `year_max: 2015` + `year_min: 2021`
1 follow-up on highest-cited paper using its key terms + `year_min` after publication

Deep dive (20 searches)

5 sub-area searches
5 review article searches (one per sub-area)
4 era-gated searches (top 2 sub-areas, old + new each)
3 follow-ups on top 3 highest-cited papers
3 spare for emerging threads (surprising findings to chase)

Throughout: 1 q/sec rate limit. Sequential. Confirm response before next call. Record each via `citation_tracker.py`.
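
The three depth tiers above can be written down as a small budget table, which also makes it easy to check that each tier sums to its advertised total. The dictionary keys and the `plan_searches` helper are hypothetical names used only for this sketch.

```python
# Search counts per tier, taken from the lists above; key names are assumptions.
SEARCH_BUDGETS = {
    "quick": {"sub_area": 5},
    "standard": {"sub_area": 5, "review": 2, "era_gated": 2, "follow_up": 1},
    "deep": {"sub_area": 5, "review": 5, "era_gated": 4, "follow_up": 3, "spare": 3},
}
EXPECTED_TOTALS = {"quick": 5, "standard": 10, "deep": 20}


def plan_searches(depth: str) -> list[str]:
    """Expand a tier's budget into an ordered list of search kinds."""
    budget = SEARCH_BUDGETS[depth]
    total = sum(budget.values())
    if total != EXPECTED_TOTALS[depth]:
        raise ValueError(f"{depth} budget sums to {total}, expected {EXPECTED_TOTALS[depth]}")
    return [kind for kind, count in budget.items() for _ in range(count)]


for depth in SEARCH_BUDGETS:
    print(depth, len(plan_searches(depth)))
# prints:
# quick 5
# standard 10
# deep 20
```

Keeping the allocation as data rather than prose means a changed tier definition fails loudly if its parts no longer add up.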

Cross-Search Intelligence

Three trackers across ALL search results. Run `scripts/cross_search_aggregator.py --session NAME` after Phase 3 completes:

Repeat-hit papers — same paper appearing in 3+ sub-area searches = likely foundational
Recurring authors — same author in multiple searches = dominant research group; top 3-5 most frequent matter
Citation-per-year heuristic — a 2023 paper with 150 citations >> 2008 paper with 150 citations. Use for seminal-work identification.

These feed the "Start Here" + "Key Research Groups" + "Bibliography" DOCX sections.
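
The three trackers can be sketched with a few counters. This is an illustration under stated assumptions, not the code of `cross_search_aggregator.py`: the paper dictionaries (`url`, `authors`, `year`, `citations`), the "at least 2 searches" cutoff for recurring authors, and the age formula are all hypothetical.

```python
from collections import Counter


def citations_per_year(citations: int, year: int, current_year: int) -> float:
    """Citations divided by paper age, counting the publication year as year 1."""
    age = max(1, current_year - year + 1)
    return citations / age


def aggregate(searches, current_year):
    """Summarize repeat hits, recurring authors, and citation-rate ranking."""
    paper_hits, author_hits = Counter(), Counter()
    by_url = {}
    for results in searches:
        # Count each paper and each author at most once per search.
        paper_hits.update({paper["url"] for paper in results})
        author_hits.update({a for paper in results for a in paper["authors"]})
        by_url.update({paper["url"]: paper for paper in results})
    ranked = sorted(
        by_url,
        key=lambda u: citations_per_year(
            by_url[u]["citations"], by_url[u]["year"], current_year
        ),
        reverse=True,
    )
    return {
        "repeat_hits": sorted(u for u, n in paper_hits.items() if n >= 3),
        "recurring_authors": [a for a, n in author_hits.most_common(5) if n >= 2],
        "by_citation_rate": ranked,
    }


print(citations_per_year(150, 2023, 2025))  # prints: 50.0
```

With the same 150 citations, a 2008 paper evaluated in 2025 scores about 8.3 per year under this formula, which is why the recent paper ranks higher.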

Phase 4: DOCX Research Guide

Generate via Node.js + `docx` library. 8 sections (see `references/docx_8_sections.md` for full spec):

1. Topic Overview: single tight paragraph (4-6 sentences)
2. Start Here (Priority Reading Order): 5-7 papers ordered best recent review → foundational → 2-3 frontier → gap/controversy. Each: hyperlinked title + authors/year + 1-sentence contribution + 1-sentence "what to look for"
3. How the Field Got Here: chronological narrative (1-2 paragraphs) + timeline table (5-8 milestones: Year / Milestone / Significance) + terminology evolution note
4. Sub-area Guides (one per sub-area, 4 parts each)
   - 4a. What the Research Shows (2-3 sentence synthesis with inline citations)
   - 4b. Key Papers (3-5 hyperlinked papers with citation count, year, 1-sentence importance)
   - 4c. Key Search Terms (6-10 keywords, synonyms, MeSH, historical terms)
   - 4d. Boolean Search Strings (2-3 ready-to-paste strings)
5. Key Research Groups: top 3-5 authors/groups with affiliations, sub-area coverage, representative paper link (from cross-search aggregator)
6. Open Questions & Gaps: three categories (methodological / population-context / conceptual-theoretical). Each gap explains why it matters.
7. Bibliography: alphabetical by first author. Every entry has a clickable "View on Consensus" link. Every inline citation matches a bibliography entry.
8. Audit Log: search summary table (#, query, filters, papers returned, status), counts block, coverage notes including detected tier and theoretical ceiling

DOCX Technical Requirements

Document the key `docx` library patterns:

Page: US Letter, 1-inch margins
Lists: `LevelFormat.BULLET` (never unicode bullets)
Hyperlinks: `ExternalHyperlink` with `style: "Hyperlink"`, full URL (never truncated)
Tables: dual widths (`columnWidths` + cell `width`), `ShadingType.CLEAR`
Validation step after save (`python scripts/office/validate.py output.docx`)

Reference the docx skill for setup patterns and best practices.

Output

research_guide_<topic-slug>_<YYYY-MM-DD>.docx

Plus:

Chat summary block: "Saved: <path>. Audit: N searches × M unique papers / K cited. Plan tier: <tier>."
Audit log printed inline if user asks for it
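
The file name and chat summary line could be assembled as below. This is a sketch only: the slug rule (lowercase, non-alphanumerics collapsed to hyphens) is an assumption, since the source gives the `<topic-slug>` placeholder without defining it, and both function names are hypothetical.

```python
import re
from datetime import date


def output_filename(topic: str, on: date) -> str:
    """Build research_guide_<topic-slug>_<YYYY-MM-DD>.docx."""
    # Assumed slug rule: lowercase, runs of non-alphanumerics become one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return f"research_guide_{slug}_{on.isoformat()}.docx"


def summary_line(path: str, searches: int, unique: int, cited: int, tier: str) -> str:
    """Format the chat summary block from the three audit counts."""
    return (
        f"Saved: {path}. Audit: {searches} searches × {unique} unique papers "
        f"/ {cited} cited. Plan tier: {tier}."
    )


name = output_filename("LLMs in Clinical Reasoning", date(2025, 1, 31))
print(name)  # prints: research_guide_llms-in-clinical-reasoning_2025-01-31.docx
```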

Tooling

| Script | Role |
| --- | --- |
| `scripts/citation_tracker.py` | JSON-backed three-count audit at `~/.litreview_sessions/<session>.json` |
| `scripts/framework_recommender.py` | Heuristic PICO/SPIDER/Decomposition suggestion from research question |
| `scripts/cross_search_aggregator.py` | Repeat-hits + recurring-authors + citation-per-year ranking after Phase 3 |

References

`references/framework_selection.md`: PICO / SPIDER / Decomposition canon (7+ sources)
`references/search_budget_allocation.md`: depth tiers + cross-search intelligence + sequential execution rationale (7+ sources)
`references/docx_8_sections.md`: research guide DOCX spec + technical requirements (7+ sources)

Anti-Patterns To Reject

Parallelizing Consensus calls
Skipping the interactive checkpoint (running all searches without user confirmation)
Padding thin results with training knowledge
Defaulting to non-PICO framework without justification
Citing papers in chat that didn't come from Consensus this session
Hardcoding plan tier instead of detecting from first response
Skipping era-gated searches in standard/deep budgets
Skipping cross-search intelligence (repeat-hits, recurring authors)
Truncating Consensus URLs in hyperlinks

Version: 1.0.0
Source spec: `megaprompts/09-litreview-megaprompt.md`
Build pattern: Path B (direct conversion). Sibling of `pulse` (research-pack shape).
