network-meta-analysis-appraisal
Network Meta-Analysis Comprehensive Appraisal
Overview
This skill enables systematic, reproducible appraisal of network meta-analysis (NMA) papers through:
- Automated PDF intelligence - Extract text, tables, and statistical content from NMA PDFs
- Semantic evidence matching - Map 200+ checklist criteria to PDF content using AI similarity
- Triple-validation methodology - Two independent concurrent appraisals + meta-review consensus
- Comprehensive frameworks - PRISMA-NMA, NICE DSU TSD 7, ISPOR-AMCP-NPC, CINeMA integration
- Professional reports - Generate markdown checklists and structured YAML outputs
The skill transforms a complex, time-intensive manual process (~6-8 hours) into a systematic, partially-automated workflow (~2-3 hours).
When to Use This Skill
Apply this skill when:
- Conducting peer review for journal submissions containing NMA
- Evaluating evidence for clinical guideline development
- Assessing NMA for health technology assessment (HTA)
- Reviewing NMA for reimbursement/formulary decisions
- Training on systematic NMA critical appraisal methodology
- Comparing Bayesian vs Frequentist NMA approaches
Workflow: PDF to Appraisal Report
Follow this sequential 5-step workflow for comprehensive appraisal:
Step 1: Setup & Prerequisites
**Install Required Libraries:**
```bash
cd scripts/
pip install -r requirements.txt
```
**Download Semantic Model (first time only):**
```bash
python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')"
```
**Verify Checklist Availability:**
Confirm all 8 checklist sections are in `references/checklist_sections/`:
- SECTION I - STUDY RELEVANCE and APPLICABILITY.md
- SECTION II - REPORTING TRANSPARENCY and COMPLETENESS - PRISMA-NMA.md
- SECTION III - METHODOLOGICAL RIGOR - NICE DSU TSD 7.md
- SECTION IV - CREDIBILITY ASSESSMENT - ISPOR-AMCP-NPC.md
- SECTION V - CERTAINTY OF EVIDENCE - CINeMA Framework.md
- SECTION VI - SYNTHESIS and OVERALL JUDGMENT.md
- SECTION VII - APPRAISER INFORMATION.md
- SECTION VIII - APPENDICES.md
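The presence of these files can also be verified programmatically; this is a sketch assuming the repository layout above (`missing_sections` is a hypothetical helper, not one of the shipped scripts):

```python
from pathlib import Path

# The eight checklist sections expected under references/checklist_sections/.
EXPECTED_SECTIONS = [
    "SECTION I - STUDY RELEVANCE and APPLICABILITY.md",
    "SECTION II - REPORTING TRANSPARENCY and COMPLETENESS - PRISMA-NMA.md",
    "SECTION III - METHODOLOGICAL RIGOR - NICE DSU TSD 7.md",
    "SECTION IV - CREDIBILITY ASSESSMENT - ISPOR-AMCP-NPC.md",
    "SECTION V - CERTAINTY OF EVIDENCE - CINeMA Framework.md",
    "SECTION VI - SYNTHESIS and OVERALL JUDGMENT.md",
    "SECTION VII - APPRAISER INFORMATION.md",
    "SECTION VIII - APPENDICES.md",
]

def missing_sections(checklist_dir):
    """Return the names of expected section files absent from checklist_dir."""
    base = Path(checklist_dir)
    return [name for name in EXPECTED_SECTIONS if not (base / name).exists()]
```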
**Select Framework Scope:**
Choose based on appraisal purpose (see `references/frameworks_overview.md` for details):
- `comprehensive`: All 4 frameworks (~200 items, 4-6 hours)
- `reporting`: PRISMA-NMA only (~90 items, 2-3 hours)
- `methodology`: NICE + CINeMA (~30 items, 2-3 hours)
- `decision`: Relevance + ISPOR + CINeMA (~30 items, 2-3 hours)
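The scope names imply a mapping onto the checklist sections; a minimal sketch of that mapping (the section assignments are inferred from the framework names above and should be treated as an assumption, not the skill's actual configuration):

```python
# Scope -> checklist sections, inferred from the framework names (an assumption).
FRAMEWORK_SCOPES = {
    "comprehensive": ["II", "III", "IV", "V"],  # all four frameworks
    "reporting": ["II"],                        # PRISMA-NMA only
    "methodology": ["III", "V"],                # NICE DSU TSD 7 + CINeMA
    "decision": ["I", "IV", "V"],               # relevance + ISPOR-AMCP-NPC + CINeMA
}
```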
Step 2: Extract PDF Content
Run `pdf_intelligence.py` to extract structured content from the NMA paper:
```bash
python scripts/pdf_intelligence.py path/to/nma_paper.pdf --output pdf_extraction.json
```
What This Does:
- Extracts text with section detection (abstract, methods, results, discussion)
- Parses tables using multiple libraries (Camelot, pdfplumber)
- Extracts metadata (title, page count, etc.)
- Calculates extraction quality scores
Outputs:
- `pdf_extraction.json` - structured PDF content for evidence matching
Quality Check:
- Verify `extraction_quality` scores ≥ 0.6 for text_coverage and sections_detected
- Low scores indicate poor PDF quality and may require manual supplementation
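The Quality Check can be automated with a small gate over the extracted JSON; a sketch, assuming `extraction_quality` holds `text_coverage` and `sections_detected` scores as described above (`passes_quality_gate` is a hypothetical helper):

```python
def passes_quality_gate(extraction, threshold=0.6):
    """True if both quality scores meet the 0.6 threshold from the Quality Check step.

    `extraction` is the parsed pdf_extraction.json dict; missing scores count as 0.0.
    """
    quality = extraction.get("extraction_quality", {})
    return all(
        quality.get(key, 0.0) >= threshold
        for key in ("text_coverage", "sections_detected")
    )
```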
Step 3: Match Evidence to Checklist Criteria
**Prepare Checklist Criteria JSON:**
Extract checklist items from the markdown sections into a machine-readable format:
```python
import json
from pathlib import Path

# Example: extract criteria from Section II
criteria = []
section_file = Path("references/checklist_sections/SECTION II - REPORTING TRANSPARENCY and COMPLETENESS - PRISMA-NMA.md")

# Parse markdown table rows in section_file to extract item IDs and criteria text.
# Target format: [{"id": "4.1", "text": "Does the title identify the study as a systematic review and network meta-analysis?"}, ...]

Path("checklist_criteria.json").write_text(json.dumps(criteria, indent=2))
```
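The table-parsing step could look like the sketch below; the regex assumes two-column rows of the form `| 4.1 | criterion text |`, so adapt it to the checklist's actual column layout:

```python
import re

def parse_table_rows(markdown_text):
    """Extract {"id", "text"} dicts from markdown table rows like '| 4.1 | Does the title ... |'.

    The column layout is an assumption; real checklist tables may have more columns.
    """
    criteria = []
    for line in markdown_text.splitlines():
        match = re.match(r"\|\s*(\d+\.\d+)\s*\|\s*([^|]+?)\s*\|", line)
        if match:
            criteria.append({"id": match.group(1), "text": match.group(2)})
    return criteria
```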
**Run Semantic Evidence Matching:**
```bash
python scripts/semantic_search.py pdf_extraction.json checklist_criteria.json --output evidence_matches.json
```
What This Does:
- Encodes each checklist criterion as a semantic vector
- Searches PDF sections for matching paragraphs
- Calculates similarity scores (0.0-1.0)
- Assigns confidence levels (high/moderate/low/unable)
Outputs:
- `evidence_matches.json` - evidence mapped to each criterion with confidence scores
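The confidence bands can be reproduced with a simple threshold map; the cut-off values here are illustrative, reusing the 0.75/0.70/0.45 thresholds that appear elsewhere in this guide rather than the script's actual internals:

```python
def confidence_level(score, high=0.75, moderate=0.70, low=0.45):
    """Map a 0.0-1.0 similarity score to the high/moderate/low/unable bands.

    Cut-offs are illustrative assumptions, not semantic_search.py's real values.
    """
    if score >= high:
        return "high"
    if score >= moderate:
        return "moderate"
    if score >= low:
        return "low"
    return "unable"
```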
Step 4: Conduct Triple-Validation Appraisal
**Manual Appraisal with Evidence Support:**
For each checklist section:
1. Load evidence matches for that section's criteria.
2. Review the PDF content highlighted by semantic search.
3. Apply the triple-validation methodology (see `references/triple_validation_methodology.md`):

   Appraiser #1 (Critical Reviewer):
   - Evidence threshold: 0.75 (high)
   - Stance: skeptical, conservative
   - For each item: assign a rating (✓/⚠/✗/N/A) based on evidence quality

   Appraiser #2 (Methodologist):
   - Evidence threshold: 0.70 (moderate)
   - Stance: emphasis on technical rigor
   - For each item: assign a rating independently

4. Meta-Review Concordance Analysis:
   - Compare ratings between appraisers
   - Calculate agreement levels (perfect/minor/major discordance)
   - Apply a resolution strategy (evidence-weighted by default)
   - Flag major discordances for manual review

**Structure Appraisal Results:**
```json
{
  "pdf_metadata": {...},
  "appraisal": {
    "sections": [
      {
        "id": "section_ii",
        "name": "REPORTING TRANSPARENCY & COMPLETENESS",
        "items": [
          {
            "id": "4.1",
            "criterion": "Title identification...",
            "rating": "✓",
            "confidence": "high",
            "evidence": "The title explicitly states...",
            "source": "methods section",
            "appraiser_1_rating": "✓",
            "appraiser_2_rating": "✓",
            "concordance": "perfect"
          },
          ...
        ]
      },
      ...
    ]
  }
}
```
Save as `appraisal_results.json`.
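The concordance classification between the two appraisers can be sketched as follows; the exact perfect/minor/major banding is an illustrative reading of the scheme above, not the shipped implementation:

```python
def concordance(rating_1, rating_2):
    """Classify agreement between the two appraisers' ratings.

    perfect: identical ratings; minor: adjacent ratings (e.g. ✓ vs ⚠);
    major: opposite ratings or disagreement over N/A. Illustrative only.
    """
    order = {"✓": 2, "⚠": 1, "✗": 0}
    if rating_1 == rating_2:
        return "perfect"
    if "N/A" in (rating_1, rating_2):
        return "major"
    return "minor" if abs(order[rating_1] - order[rating_2]) == 1 else "major"
```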
Step 5: Generate Reports
**Create Markdown and YAML Reports:**
```bash
python scripts/report_generator.py appraisal_results.json --format both --output-dir ./reports
```
Outputs:
- `reports/nma_appraisal_report.md` - human-readable checklist with ratings, evidence, and concordance
- `reports/nma_appraisal_report.yaml` - machine-readable structured data
Report Contents:
- Executive summary with overall quality ratings
- Detailed checklist tables (all 8 sections)
- Concordance analysis summary
- Recommendations for decision-makers and authors
- Evidence citations and confidence scores
Quality Validation:
- Review major discordance items flagged in concordance analysis
- Verify evidence confidence ≥ moderate for ≥50% of items
- Check overall agreement rate ≥ 65%
- Manually review any critical items with low confidence
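The agreement-rate check (≥ 65%) can be computed directly from the structured results; a sketch, assuming each item carries the `concordance` field shown in Step 4:

```python
def agreement_rate(items):
    """Fraction of items where the two appraisers agreed (concordance == "perfect")."""
    if not items:
        return 0.0
    return sum(item["concordance"] == "perfect" for item in items) / len(items)
```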
Methodological Decision Points
Bayesian vs Frequentist Detection
The skill automatically detects statistical approach by scanning for keywords:
Bayesian Indicators: MCMC, posterior, prior, credible interval, WinBUGS, JAGS, Stan, burn-in, convergence diagnostic
Frequentist Indicators: confidence interval, p-value, I², τ², netmeta, prediction interval
Apply appropriate checklist items based on detected approach:
- Item 18.3 (Bayesian specifications) - only if Bayesian detected
- Items on heterogeneity metrics (I², τ²) - primarily Frequentist
- Convergence diagnostics - only Bayesian
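A minimal version of the keyword scan might look like the sketch below (plain substring matching, so a production version would want word-boundary handling, e.g. to avoid matching "Stan" inside "substantial"):

```python
BAYESIAN_KEYWORDS = ["mcmc", "posterior", "prior", "credible interval",
                     "winbugs", "jags", "stan", "burn-in", "convergence diagnostic"]
FREQUENTIST_KEYWORDS = ["confidence interval", "p-value", "i²", "τ²",
                        "netmeta", "prediction interval"]

def detect_approach(text):
    """Classify an NMA paper's statistical approach by keyword counts (naive sketch)."""
    lowered = text.lower()
    bayes = sum(keyword in lowered for keyword in BAYESIAN_KEYWORDS)
    freq = sum(keyword in lowered for keyword in FREQUENTIST_KEYWORDS)
    if bayes and freq:
        return "both"
    if bayes:
        return "bayesian"
    if freq:
        return "frequentist"
    return "unclear"
```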
Handling Missing Evidence
When semantic search returns low confidence (<0.45):
- Manually search PDF for the criterion
- Check supplementary materials (if accessible)
- If truly absent, rate as ⚠ or ✗ depending on item criticality
- Document "No evidence found in main text" in evidence field
Resolution Strategy Selection
Choose concordance resolution strategy based on appraisal purpose:
- Evidence-weighted (default): Most objective, prefers stronger evidence
- Conservative: For high-stakes decisions (regulatory submissions)
- Optimistic: For formative assessments or educational purposes
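The three strategies can be sketched as a single resolver; the tie-breaking semantics here are an illustrative reading of the descriptions above (`resolve` is a hypothetical helper, not part of the shipped scripts):

```python
def resolve(rating_1, evidence_1, rating_2, evidence_2, strategy="evidence-weighted"):
    """Pick a final rating from two appraiser ratings and their evidence scores.

    Illustrative sketch: evidence-weighted keeps the rating backed by stronger
    evidence; conservative keeps the worse rating; optimistic keeps the better one.
    """
    if strategy == "evidence-weighted":
        return rating_1 if evidence_1 >= evidence_2 else rating_2
    order = {"✗": 0, "⚠": 1, "✓": 2}
    if strategy == "conservative":
        return min(rating_1, rating_2, key=lambda r: order.get(r, 1))
    return max(rating_1, rating_2, key=lambda r: order.get(r, 1))  # optimistic
```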
See `references/triple_validation_methodology.md` for detailed guidance.
Resources
scripts/
Production-ready Python scripts for automated tasks:
- pdf_intelligence.py - Multi-library PDF extraction (PyMuPDF, pdfplumber, Camelot)
- semantic_search.py - AI-powered evidence-to-criterion matching
- report_generator.py - Markdown + YAML report generation
- requirements.txt - Python dependencies
Usage: Scripts can be run standalone via CLI or orchestrated programmatically.
references/
Comprehensive documentation for appraisal methodology:
- checklist_sections/ - All 8 integrated checklist sections (PRISMA/NICE/ISPOR/CINeMA)
- frameworks_overview.md - Framework selection guide, rating scales, key references
- triple_validation_methodology.md - Appraiser roles, concordance analysis, resolution strategies
Usage: Load relevant references when conducting specific appraisal steps or interpreting results.
Best Practices
- Always run pdf_intelligence.py first - Extraction quality affects all downstream steps
- Review low-confidence matches manually - Semantic search is not perfect
- Document resolution rationale - For major discordances, explain meta-review decision
- Maintain appraiser independence - Conduct Appraiser #1 and #2 evaluations without cross-reference
- Validate critical items - Manually verify evidence for high-impact methodological criteria
- Use appropriate framework scope - Comprehensive for peer review, targeted for specific assessments
Limitations
- PDF quality dependent: Poor scans or complex layouts reduce extraction accuracy
- Semantic matching not perfect: May miss evidence phrased in unexpected ways
- No external validation: Cannot verify PROSPERO registration or check author COI databases
- Language: Optimized for English-language papers
- Human oversight required: Final appraisal should be reviewed by domain expert