citation-management
Citation Management
Overview
Manage citations systematically throughout the research and writing process. This skill provides tools and strategies for searching academic databases (Google Scholar, PubMed), extracting accurate metadata from multiple sources (CrossRef, PubMed, arXiv), validating citation information, and generating properly formatted BibTeX entries.
This is critical for maintaining citation accuracy, avoiding reference errors, and ensuring reproducible research. The skill integrates with the literature-review skill for comprehensive research workflows.
When to Use This Skill
Use this skill when:
- Searching for specific papers on Google Scholar or PubMed
- Converting DOIs, PMIDs, or arXiv IDs to properly formatted BibTeX
- Extracting complete metadata for citations (authors, title, journal, year, etc.)
- Validating existing citations for accuracy
- Cleaning and formatting BibTeX files
- Finding highly cited papers in a specific field
- Verifying that citation information matches the actual publication
- Building a bibliography for a manuscript or thesis
- Checking for duplicate citations
- Ensuring consistent citation formatting
Core Workflow
Citation management follows a systematic process:
Phase 1: Paper Discovery and Search
Goal: Find relevant papers using academic search engines.
Google Scholar Search
Google Scholar provides the most comprehensive coverage across disciplines.
Basic Search:
```bash
# Search for papers on a topic
python scripts/search_google_scholar.py "CRISPR gene editing" \
  --limit 50 \
  --output results.json

# Search with year filter
python scripts/search_google_scholar.py "machine learning protein folding" \
  --year-start 2020 \
  --year-end 2024 \
  --limit 100 \
  --output ml_proteins.json
```
**Advanced Search Strategies** (see `references/google_scholar_search.md`):
- Use quotation marks for exact phrases: `"deep learning"`
- Search by author: `author:LeCun`
- Search in title: `intitle:"neural networks"`
- Exclude terms: `machine learning -survey`
- Find highly cited papers using sort options
- Filter by date ranges to get recent work
**Best Practices**:
- Use specific, targeted search terms
- Include key technical terms and acronyms
- Filter by recent years for fast-moving fields
- Check "Cited by" to find seminal papers
- Export top results for further analysis
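The operators listed above can be combined programmatically before a query is handed to the search script. The following is a minimal sketch, not part of the skill's scripts; the function name `build_scholar_query` and its parameters are hypothetical.

```python
def build_scholar_query(phrase=None, author=None, in_title=None, exclude=(), keywords=()):
    """Combine Google Scholar search operators into one query string.

    Hypothetical helper for illustration only.
    """
    parts = list(keywords)                        # free-text keywords come first
    if phrase:
        parts.append(f'"{phrase}"')               # exact phrase match
    if author:
        parts.append(f"author:{author}")          # restrict to an author
    if in_title:
        parts.append(f'intitle:"{in_title}"')     # match in the title only
    parts.extend(f"-{term}" for term in exclude)  # excluded terms
    return " ".join(parts)


if __name__ == "__main__":
    print(build_scholar_query(phrase="deep learning", author="LeCun", exclude=["survey"]))
```

The resulting string can be passed as the positional query argument of `scripts/search_google_scholar.py`.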
PubMed Search
PubMed specializes in biomedical and life sciences literature (35+ million citations).
Basic Search:
```bash
# Search PubMed
python scripts/search_pubmed.py "Alzheimer's disease treatment" \
  --limit 100 \
  --output alzheimers.json

# Search with MeSH terms and filters
python scripts/search_pubmed.py \
  --query '"Alzheimer Disease"[MeSH] AND "Drug Therapy"[MeSH]' \
  --date-start 2020 \
  --date-end 2024 \
  --publication-types "Clinical Trial,Review" \
  --output alzheimers_trials.json
```
**Advanced PubMed Queries** (see `references/pubmed_search.md`):
- Use MeSH terms: `"Diabetes Mellitus"[MeSH]`
- Field tags: `"cancer"[Title]`, `"Smith J"[Author]`
- Boolean operators: `AND`, `OR`, `NOT`
- Date filters: `2020:2024[Publication Date]`
- Publication types: `"Review"[Publication Type]`
- Combine with E-utilities API for automation
**Best Practices**:
- Use MeSH Browser to find correct controlled vocabulary
- Construct complex queries in PubMed Advanced Search Builder first
- Include multiple synonyms with OR
- Retrieve PMIDs for easy metadata extraction
- Export to JSON or directly to BibTeX
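Field tags, Boolean operators, and date filters compose mechanically, so a query can be assembled from small helpers rather than typed by hand. This is an illustrative sketch; the helper names (`tag`, `any_of`, `all_of`, `date_range`) are hypothetical and not part of the skill's scripts.

```python
def tag(term, field):
    """Quote a term and attach a PubMed field tag, e.g. "Review"[Publication Type]."""
    return f'"{term}"[{field}]'


def any_of(*clauses):
    """Join synonyms with OR and parenthesize the group."""
    return "(" + " OR ".join(clauses) + ")"


def all_of(*clauses):
    """Require every clause with AND."""
    return " AND ".join(clauses)


def date_range(start_year, end_year):
    """Build a publication-date filter such as 2020:2024[Publication Date]."""
    return f"{start_year}:{end_year}[Publication Date]"


query = all_of(
    any_of(tag("Diabetes Mellitus", "MeSH"), tag("diabetes", "Title")),
    tag("Review", "Publication Type"),
    date_range(2020, 2024),
)

if __name__ == "__main__":
    print(query)
```

The printed string can be supplied to `scripts/search_pubmed.py` through `--query`.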
Phase 2: Metadata Extraction
Goal: Convert paper identifiers (DOI, PMID, arXiv ID) to complete, accurate metadata.
Quick DOI to BibTeX Conversion
For single DOIs, use the quick conversion tool:
```bash
# Convert single DOI
python scripts/doi_to_bibtex.py 10.1038/s41586-021-03819-2

# Convert multiple DOIs from a file
python scripts/doi_to_bibtex.py --input dois.txt --output references.bib

# Different output formats
python scripts/doi_to_bibtex.py 10.1038/nature12345 --format json
```
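One common way to turn a DOI into BibTeX is DOI content negotiation: requesting `https://doi.org/<doi>` with an `Accept: application/x-bibtex` header. Whether `doi_to_bibtex.py` works this way is an assumption; the sketch below uses only the Python standard library, and the function names are hypothetical.

```python
import urllib.parse
import urllib.request


def build_bibtex_request(doi):
    """Build a request asking the DOI resolver for a BibTeX representation."""
    url = "https://doi.org/" + urllib.parse.quote(doi, safe="/")
    return urllib.request.Request(url, headers={"Accept": "application/x-bibtex"})


def fetch_bibtex(doi, timeout=10):
    """Perform the network call; raises urllib.error.URLError on failure."""
    with urllib.request.urlopen(build_bibtex_request(doi), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Requires network access.
    print(fetch_bibtex("10.1038/s41586-021-03819-2"))
```

Separating request construction from the network call keeps the URL and header logic testable offline.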
Comprehensive Metadata Extraction
For DOIs, PMIDs, arXiv IDs, or URLs:
```bash
# Extract from DOI
python scripts/extract_metadata.py --doi 10.1038/s41586-021-03819-2

# Extract from PMID
python scripts/extract_metadata.py --pmid 34265844

# Extract from arXiv ID
python scripts/extract_metadata.py --arxiv 2103.14030

# Extract from URL
python scripts/extract_metadata.py --url "https://www.nature.com/articles/s41586-021-03819-2"

# Batch extraction from file (mixed identifiers)
python scripts/extract_metadata.py --input identifiers.txt --output citations.bib
```
**Metadata Sources** (see `references/metadata_extraction.md`):
1. **CrossRef API**: Primary source for DOIs
   - Comprehensive metadata for journal articles
   - Publisher-provided information
   - Includes authors, title, journal, volume, pages, dates
   - Free, no API key required
2. **PubMed E-utilities**: Biomedical literature
   - Official NCBI metadata
   - Includes MeSH terms, abstracts
   - PMID and PMCID identifiers
   - Free, API key recommended for high volume
3. **arXiv API**: Preprints in physics, math, CS, q-bio
   - Complete metadata for preprints
   - Version tracking
   - Author affiliations
   - Free, open access
4. **DataCite API**: Research datasets, software, other resources
   - Metadata for non-traditional scholarly outputs
   - DOIs for datasets and code
   - Free access
**What Gets Extracted**:
- **Required fields**: author, title, year
- **Journal articles**: journal, volume, number, pages, DOI
- **Books**: publisher, ISBN, edition
- **Conference papers**: booktitle, conference location, pages
- **Preprints**: repository (arXiv, bioRxiv), preprint ID
- **Additional**: abstract, keywords, URL
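Batch extraction from a file of mixed identifiers implies that each line must first be routed to the right metadata source. The sketch below shows one plausible way to classify a line. The patterns are simplified assumptions (old-style arXiv IDs such as `hep-th/9901001` are not covered), and `classify_identifier` is a hypothetical name.

```python
import re

# Simplified patterns, assumed for illustration.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
ARXIV_RE = re.compile(r"^(arxiv:)?\d{4}\.\d{4,5}(v\d+)?$", re.IGNORECASE)
PMID_RE = re.compile(r"^\d{1,8}$")


def classify_identifier(raw):
    """Guess the identifier type of one line from a mixed input file."""
    text = raw.strip()
    if text.lower().startswith(("http://", "https://")):
        return "url"
    if DOI_RE.match(text):
        return "doi"
    if ARXIV_RE.match(text):
        return "arxiv"
    if PMID_RE.match(text):
        return "pmid"
    return "unknown"


if __name__ == "__main__":
    for line in ["10.1038/s41586-021-03819-2", "34265844", "2103.14030"]:
        print(line, "->", classify_identifier(line))
```

Lines classified as `unknown` are best reported back to the user rather than silently skipped.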
Phase 3: BibTeX Formatting
Goal: Generate clean, properly formatted BibTeX entries.
Understanding BibTeX Entry Types
See `references/bibtex_formatting.md` for the complete guide.
Common Entry Types:
- `@article`: Journal articles (most common)
- `@book`: Books
- `@inproceedings`: Conference papers
- `@incollection`: Book chapters
- `@phdthesis`: Dissertations
- `@misc`: Preprints, software, datasets
Required Fields by Type:
```bibtex
@article{citationkey,
  author  = {Last1, First1 and Last2, First2},
  title   = {Article Title},
  journal = {Journal Name},
  year    = {2024},
  volume  = {10},
  number  = {3},
  pages   = {123--145},
  doi     = {10.1234/example}
}

@inproceedings{citationkey,
  author    = {Last, First},
  title     = {Paper Title},
  booktitle = {Conference Name},
  year      = {2024},
  pages     = {1--10}
}

@book{citationkey,
  author    = {Last, First},
  title     = {Book Title},
  publisher = {Publisher Name},
  year      = {2024}
}
```
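A required-field check can be expressed as a small lookup table. The sketch below uses the required sets from the standard BibTeX styles; in the `@article` example above, `volume`, `number`, `pages`, and `doi` are optional. The function names are hypothetical, and `@book` is simplified to require `author` although standard BibTeX also accepts `editor` in its place.

```python
# Required fields per entry type in the standard BibTeX styles.
# Simplification: @book accepts author OR editor; only author is checked here.
REQUIRED_FIELDS = {
    "article": {"author", "title", "journal", "year"},
    "inproceedings": {"author", "title", "booktitle", "year"},
    "book": {"author", "title", "publisher", "year"},
}


def missing_fields(entry_type, fields):
    """Return the required fields that are absent or blank, sorted by name."""
    required = REQUIRED_FIELDS.get(entry_type.lower(), set())
    return sorted(name for name in required if not str(fields.get(name, "")).strip())


def render_entry(entry_type, key, fields):
    """Render one BibTeX entry with brace-delimited values."""
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@{entry_type}{{{key},\n{body}\n}}"


if __name__ == "__main__":
    entry = {"author": "Last, First", "title": "Article Title", "year": "2024"}
    print(missing_fields("article", entry))
    print(render_entry("article", "Last2024", entry))
```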
Formatting and Cleaning
Use the formatter to standardize BibTeX files:
```bash
# Format and clean BibTeX file
python scripts/format_bibtex.py references.bib \
  --output formatted_references.bib

# Sort entries by citation key
python scripts/format_bibtex.py references.bib \
  --sort key \
  --output sorted_references.bib

# Sort by year (newest first)
python scripts/format_bibtex.py references.bib \
  --sort year \
  --descending \
  --output sorted_references.bib

# Remove duplicates
python scripts/format_bibtex.py references.bib \
  --deduplicate \
  --output clean_references.bib

# Validate and report issues
python scripts/format_bibtex.py references.bib \
  --validate \
  --report validation_report.txt
```
**Formatting Operations**:
- Standardize field order
- Consistent indentation and spacing
- Proper capitalization in titles (protected with {})
- Standardized author name format
- Consistent citation key format
- Remove unnecessary fields
- Fix common errors (missing commas, braces)
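To make the `--deduplicate` and `--sort year --descending` operations concrete, here is a minimal sketch over entries represented as plain dictionaries. It is not the implementation in `format_bibtex.py`; the dictionary layout and function names are assumptions for illustration.

```python
def normalize_title(title):
    """Lowercase and drop braces, spaces, and punctuation so near-identical titles compare equal."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def deduplicate(entries):
    """Keep the first entry per DOI, or per normalized title when no DOI exists."""
    seen, unique = set(), []
    for entry in entries:
        doi = entry.get("doi", "").strip().lower()
        marker = ("doi", doi) if doi else ("title", normalize_title(entry.get("title", "")))
        if marker not in seen:
            seen.add(marker)
            unique.append(entry)
    return unique


def sort_by_year(entries, descending=True):
    """Sort entries by numeric year; entries without a year sort as year 0."""
    return sorted(entries, key=lambda e: int(e.get("year", 0)), reverse=descending)


if __name__ == "__main__":
    refs = [
        {"key": "a", "doi": "10.1/X", "year": "2020", "title": "A"},
        {"key": "b", "doi": "10.1/x", "year": "2020", "title": "A (copy)"},
        {"key": "c", "title": "New Work", "year": "2024"},
    ]
    print([e["key"] for e in sort_by_year(deduplicate(refs))])
```

DOIs are compared case-insensitively because DOI names are case-insensitive by specification.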
Phase 4: Citation Validation
Goal: Verify all citations are accurate and complete.
Comprehensive Validation
```bash
# Validate BibTeX file
python scripts/validate_citations.py references.bib

# Validate and fix common issues
python scripts/validate_citations.py references.bib \
  --auto-fix \
  --output validated_references.bib

# Generate detailed validation report
python scripts/validate_citations.py references.bib \
  --report validation_report.json \
  --verbose
```
**Validation Checks** (see `references/citation_validation.md`):
1. **DOI Verification**:
   - DOI resolves correctly via doi.org
   - Metadata matches between BibTeX and CrossRef
   - No broken or invalid DOIs
2. **Required Fields**:
   - All required fields present for entry type
   - No empty or missing critical information
   - Author names properly formatted
3. **Data Consistency**:
   - Year is valid (4 digits, reasonable range)
   - Volume/number are numeric
   - Pages formatted correctly (e.g., 123--145)
   - URLs are accessible
4. **Duplicate Detection**:
   - Same DOI used multiple times
   - Similar titles (possible duplicates)
   - Same author/year/title combinations
5. **Format Compliance**:
   - Valid BibTeX syntax
   - Proper bracing and quoting
   - Citation keys are unique
   - Special characters handled correctly
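The offline part of the Data Consistency checks (year, volume/number, pages) can be sketched in a few lines. This is an illustration only: the error type names `invalid_year`, `non_numeric`, and `bad_page_range` and the year range of 1500 to next year are assumptions, not values taken from `validate_citations.py`.

```python
import re
from datetime import date

PAGES_RE = re.compile(r"^\d+(--\d+)?$")  # "123" or "123--145"


def check_entry(key, fields):
    """Return a list of issue dicts for one entry (Data Consistency checks only)."""
    issues = []
    year = fields.get("year", "")
    # Assumed "reasonable range": 1500 through next calendar year.
    if not (re.fullmatch(r"\d{4}", year) and 1500 <= int(year) <= date.today().year + 1):
        issues.append({"citation_key": key, "error_type": "invalid_year", "field": "year"})
    for name in ("volume", "number"):
        if name in fields and not fields[name].isdigit():
            issues.append({"citation_key": key, "error_type": "non_numeric", "field": name})
    if "pages" in fields and not PAGES_RE.match(fields["pages"]):
        issues.append({"citation_key": key, "error_type": "bad_page_range", "field": "pages"})
    return issues


if __name__ == "__main__":
    print(check_entry("Jones2022", {"year": "22", "volume": "X", "pages": "123-145"}))
```

Checks that need network access (DOI resolution, URL reachability) are deliberately left out of this sketch.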
**Validation Output**:
```json
{
  "total_entries": 150,
  "valid_entries": 145,
  "errors": [
    {
      "citation_key": "Smith2023",
      "error_type": "missing_field",
      "field": "journal",
      "severity": "high"
    },
    {
      "citation_key": "Jones2022",
      "error_type": "invalid_doi",
      "doi": "10.1234/broken",
      "severity": "high"
    }
  ],
  "warnings": [
    {
      "citation_key": "Brown2021",
      "warning_type": "possible_duplicate",
      "duplicate_of": "Brown2021a",
      "severity": "medium"
    }
  ]
}
```
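A report in the shape shown above is easy to triage programmatically. The following sketch counts issues by type and lists the high-severity citation keys. It assumes only the keys visible in the sample report, and `summarize_report` is a hypothetical helper.

```python
import json
from collections import Counter


def summarize_report(report):
    """Count issues by type and list citation keys with high-severity issues."""
    issues = report.get("errors", []) + report.get("warnings", [])
    by_type = Counter(i.get("error_type") or i.get("warning_type") for i in issues)
    high = sorted({i["citation_key"] for i in issues if i.get("severity") == "high"})
    return {"by_type": dict(by_type), "high_severity_keys": high}


if __name__ == "__main__":
    # Path taken from the validation example; adjust as needed.
    with open("validation_report.json", encoding="utf-8") as handle:
        print(json.dumps(summarize_report(json.load(handle)), indent=2))
```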
Phase 5: Integration with Writing Workflow
Building References for Manuscripts
Complete workflow for creating a bibliography:
```bash
# 1. Search for papers on your topic
python scripts/search_pubmed.py \
  '"CRISPR-Cas Systems"[MeSH] AND "Gene Editing"[MeSH]' \
  --date-start 2020 \
  --limit 200 \
  --output crispr_papers.json

# 2. Extract DOIs from search results and convert to BibTeX
python scripts/extract_metadata.py \
  --input crispr_papers.json \
  --output crispr_refs.bib

# 3. Add specific papers by DOI
python scripts/doi_to_bibtex.py 10.1038/nature12345 >> crispr_refs.bib
python scripts/doi_to_bibtex.py 10.1126/science.abcd1234 >> crispr_refs.bib

# 4. Format and clean the BibTeX file
python scripts/format_bibtex.py crispr_refs.bib \
  --deduplicate \
  --sort year \
  --descending \
  --output references.bib

# 5. Validate all citations
python scripts/validate_citations.py references.bib \
  --auto-fix \
  --report validation.json \
  --output final_references.bib

# 6. Review validation report and fix any remaining issues
cat validation.json

# 7. Use in your LaTeX document (this line goes in the .tex file, not the shell)
# \bibliography{final_references}
```
Integration with Literature Review Skill
This skill complements the `literature-review` skill:
Literature Review Skill → Systematic search and synthesis
Citation Management Skill → Technical citation handling
Combined Workflow:
- Use `literature-review` for comprehensive multi-database search
- Use `citation-management` to extract and validate all citations
- Use `literature-review` to synthesize findings thematically
- Use `citation-management` to verify final bibliography accuracy
```bash
# After completing literature review
# Verify all citations in the review document
python scripts/validate_citations.py my_review_references.bib --report review_validation.json

# Format for specific citation style if needed
python scripts/format_bibtex.py my_review_references.bib \
  --style nature \
  --output formatted_refs.bib
```
Search Strategies
Google Scholar Best Practices
Finding Seminal Papers:
- Sort by citation count (most cited first)
- Look for review articles for overview
- Check "Cited by" for impact assessment
- Use citation alerts for tracking new citations
Advanced Operators (full list in `references/google_scholar_search.md`):
```
"exact phrase"      # Exact phrase matching
author:lastname     # Search by author
intitle:keyword     # Search in title only
source:journal      # Search specific journal
-exclude            # Exclude terms
OR                  # Alternative terms
2020..2024          # Year range
```
Example Searches:
```
# Find recent reviews on a topic
"CRISPR" intitle:review 2023..2024

# Find papers by specific author on topic
author:Church "synthetic biology"

# Find highly cited foundational work
"deep learning" 2012..2015 sort:citations

# Exclude surveys and focus on methods
"protein folding" -survey -review intitle:method
```
PubMed Best Practices
Using MeSH Terms:
MeSH (Medical Subject Headings) provides controlled vocabulary for precise searching.
- Find MeSH terms at https://meshb.nlm.nih.gov/search
- Use in queries: `"Diabetes Mellitus, Type 2"[MeSH]`
- Combine with keywords for comprehensive coverage
Field Tags:
```
[Title]             # Search in title only
[Title/Abstract]    # Search in title or abstract
[Author]            # Search by author name
[Journal]           # Search specific journal
[Publication Date]  # Date range
[Publication Type]  # Article type
[MeSH]              # MeSH term
```
Building Complex Queries:
```bash
# Clinical trials on diabetes treatment published recently
"Diabetes Mellitus, Type 2"[MeSH] AND "Drug Therapy"[MeSH]
  AND "Clinical Trial"[Publication Type] AND 2020:2024[Publication Date]

# Reviews on CRISPR in specific journal
"CRISPR-Cas Systems"[MeSH] AND "Nature"[Journal] AND "Review"[Publication Type]

# Specific author's recent work
"Smith AB"[Author] AND cancer[Title/Abstract] AND 2022:2024[Publication Date]
```
**E-utilities for Automation**:
The scripts use NCBI E-utilities API for programmatic access:
- **ESearch**: Search and retrieve PMIDs
- **EFetch**: Retrieve full metadata
- **ESummary**: Get summary information
- **ELink**: Find related articles
See `references/pubmed_search.md` for complete API documentation.
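As an illustration of the ESearch step, the sketch below builds a request URL with the standard E-utilities parameters (`db`, `term`, `retmax`, `retmode`, `datetype`, `mindate`, `maxdate`, `api_key`). The function name and defaults are hypothetical, and this is not necessarily how `search_pubmed.py` constructs its requests.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, retmax=100, mindate=None, maxdate=None, api_key=None):
    """Build an ESearch URL that returns matching PMIDs as JSON."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    if mindate and maxdate:
        # ESearch expects mindate and maxdate together; pdat = publication date.
        params.update({"datetype": "pdat", "mindate": mindate, "maxdate": maxdate})
    if api_key:
        params["api_key"] = api_key  # optional; raises the allowed request rate
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_esearch_url('"CRISPR-Cas Systems"[MeSH]', retmax=5, mindate=2020, maxdate=2024))
```

The PMIDs returned by ESearch can then be passed to EFetch or ESummary to retrieve metadata.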
Tools and Scripts
search_google_scholar.py
Search Google Scholar and export results.
Features:
- Automated searching with rate limiting
- Pagination support
- Year range filtering
- Export to JSON or BibTeX
- Citation count information
Usage:
```bash
# Basic search
python scripts/search_google_scholar.py "quantum computing"

# Advanced search with filters
python scripts/search_google_scholar.py "quantum computing" \
  --year-start 2020 \
  --year-end 2024 \
  --limit 100 \
  --sort-by citations \
  --output quantum_papers.json

# Export directly to BibTeX
python scripts/search_google_scholar.py "machine learning" \
  --limit 50 \
  --format bibtex \
  --output ml_papers.bib
```
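The feature list mentions rate limiting. One simple way to space out successive requests is a minimum-interval limiter, sketched below. This is an assumption about the general technique, not a description of how `search_google_scholar.py` implements it; the clock and sleep functions are injectable so the behavior can be checked without real delays.

```python
import time


class RateLimiter:
    """Enforce a minimum interval between calls (e.g. successive search requests)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    limiter = RateLimiter(min_interval=1.0)
    for page in range(3):
        limiter.wait()  # the first call returns immediately
        print("fetching page", page)
```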
search_pubmed.py
Search PubMed using E-utilities API.
Features:
- Complex query support (MeSH, field tags, Boolean)
- Date range filtering
- Publication type filtering
- Batch retrieval with metadata
- Export to JSON or BibTeX
Usage:
```bash
# Simple keyword search
python scripts/search_pubmed.py "CRISPR gene editing"

# Complex query with filters
python scripts/search_pubmed.py \
  --query '"CRISPR-Cas Systems"[MeSH] AND "therapeutic"[Title/Abstract]' \
  --date-start 2020-01-01 \
  --date-end 2024-12-31 \
  --publication-types "Clinical Trial,Review" \
  --limit 200 \
  --output crispr_therapeutic.json

# Export to BibTeX
python scripts/search_pubmed.py "Alzheimer's disease" \
  --limit 100 \
  --format bibtex \
  --output alzheimers.bib
```
extract_metadata.py
Extract complete metadata from paper identifiers.
Features:
- Supports DOI, PMID, arXiv ID, URL
- Queries CrossRef, PubMed, arXiv APIs
- Handles multiple identifier types
- Batch processing
- Multiple output formats
Usage:
```bash
# Single DOI
python scripts/extract_metadata.py --doi 10.1038/s41586-021-03819-2

# Single PMID
python scripts/extract_metadata.py --pmid 34265844

# Single arXiv ID
python scripts/extract_metadata.py --arxiv 2103.14030

# From URL
python scripts/extract_metadata.py \
  --url "https://www.nature.com/articles/s41586-021-03819-2"

# Batch processing (file with one identifier per line)
python scripts/extract_metadata.py \
  --input paper_ids.txt \
  --output references.bib

# Different output formats
python scripts/extract_metadata.py \
  --doi 10.1038/nature12345 \
  --format json  # or bibtex, yaml
```
validate_citations.py
Validate BibTeX entries for accuracy and completeness.
Features:
- DOI verification via doi.org and CrossRef
- Required field checking
- Duplicate detection
- Format validation
- Auto-fix common issues
- Detailed reporting
Usage:
```bash
# Basic validation
python scripts/validate_citations.py references.bib

# With auto-fix
python scripts/validate_citations.py references.bib \
  --auto-fix \
  --output fixed_references.bib

# Detailed validation report
python scripts/validate_citations.py references.bib \
  --report validation_report.json \
  --verbose

# Only check DOIs
python scripts/validate_citations.py references.bib \
  --check-dois-only
```
format_bibtex.py
Format and clean BibTeX files.
Features:
- Standardize formatting
- Sort entries (by key, year, author)
- Remove duplicates
- Validate syntax
- Fix common errors
- Enforce citation key conventions
Usage:
```bash
# Basic formatting
python scripts/format_bibtex.py references.bib

# Sort by year (newest first)
python scripts/format_bibtex.py references.bib \
  --sort year \
  --descending \
  --output sorted_refs.bib

# Remove duplicates
python scripts/format_bibtex.py references.bib \
  --deduplicate \
  --output clean_refs.bib

# Complete cleanup
python scripts/format_bibtex.py references.bib \
  --deduplicate \
  --sort year \
  --validate \
  --auto-fix \
  --output final_refs.bib
```
undefineddoi_to_bibtex.py
doi_to_bibtex.py
Quick DOI to BibTeX conversion.
Features:
- Fast single DOI conversion
- Batch processing
- Multiple output formats
- Clipboard support
Usage:
```bash
# Single DOI
python scripts/doi_to_bibtex.py 10.1038/s41586-021-03819-2

# Multiple DOIs
python scripts/doi_to_bibtex.py \
  10.1038/nature12345 \
  10.1126/science.abc1234 \
  10.1016/j.cell.2023.01.001

# From file (one DOI per line)
python scripts/doi_to_bibtex.py --input dois.txt --output references.bib

# Copy to clipboard
python scripts/doi_to_bibtex.py 10.1038/nature12345 --clipboard
```
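The text does not say how doi_to_bibtex.py resolves a DOI internally. One widely used mechanism is DOI content negotiation, where https://doi.org/ is asked for a BibTeX record through the Accept header. The sketch below assumes that mechanism; the function names are hypothetical and not taken from the bundled script.

```python
import urllib.request
from urllib.parse import quote

# Media type for BibTeX in DOI content negotiation.
BIBTEX_MIME = "application/x-bibtex"


def build_bibtex_request(doi: str) -> urllib.request.Request:
    """Build (but do not send) a request asking doi.org for a BibTeX record."""
    url = "https://doi.org/" + quote(doi.strip(), safe="/")
    return urllib.request.Request(url, headers={"Accept": BIBTEX_MIME})


def fetch_bibtex(doi: str, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_bibtex_request(doi), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_bibtex("10.1038/s41586-021-03819-2")` should return one BibTeX entry as a string when the registration agency supports this media type. The returned record still deserves the same verification as any other extracted metadata.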
Best Practices
Search Strategy
- Start broad, then narrow:
  - Begin with general terms to understand the field
  - Refine with specific keywords and filters
  - Use synonyms and related terms
- Use multiple sources:
  - Google Scholar for comprehensive coverage
  - PubMed for biomedical focus
  - arXiv for preprints
  - Combine results for completeness
- Leverage citations:
  - Check "Cited by" for seminal papers
  - Review references from key papers
  - Use citation networks to discover related work
- Document your searches:
  - Save search queries and dates
  - Record number of results
  - Note any filters or restrictions applied
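The "Document your searches" advice can be kept lightweight with an append-only log. The following is a minimal sketch, not part of the bundled scripts: the `SearchRecord` fields and the JSON Lines file layout are assumptions chosen for illustration.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import date


@dataclass
class SearchRecord:
    """One executed search: where, what, how many hits, and which filters."""
    database: str
    query: str
    result_count: int
    filters: dict = field(default_factory=dict)
    searched_on: str = field(default_factory=lambda: date.today().isoformat())


def append_search_record(log_path: str, record: SearchRecord) -> None:
    """Append one search as a JSON line so the log stays easy to diff and grep."""
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(record), ensure_ascii=False) + "\n")
```

For example, `append_search_record("searches.jsonl", SearchRecord("PubMed", "deep learning medical imaging", 50, {"date_start": 2020}))` records the query, the hit count, the filter, and today's date in one line.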
Metadata Extraction
- Always use DOIs when available:
  - Most reliable identifier
  - Permanent link to the publication
  - Best metadata source via CrossRef
- Verify extracted metadata:
  - Check author names are correct
  - Verify journal/conference names
  - Confirm publication year
  - Validate page numbers and volume
- Handle edge cases:
  - Preprints: Include repository and ID
  - Preprints later published: Use published version
  - Conference papers: Include conference name and location
  - Book chapters: Include book title and editors
- Maintain consistency:
  - Use consistent author name format
  - Standardize journal abbreviations
  - Use same DOI format (URL preferred)
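Keeping one DOI format is easiest when every DOI passes through a single normalizer. The sketch below converts the common spellings (bare DOI, `doi:` prefix, `dx.doi.org` URL) into the URL form recommended above. The function name and the loose syntax pattern are assumptions for illustration, not the behavior of the bundled scripts.

```python
import re

# Loose syntactic check: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Return the DOI as an https://doi.org/ URL, or raise ValueError."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI: {raw!r}")
    # DOIs are case-insensitive, so lowercasing gives one canonical form.
    return "https://doi.org/" + doi.lower()
```

The pattern only checks the shape of the string; whether the DOI actually resolves still has to be confirmed during validation.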
BibTeX Quality
- Follow conventions:
  - Use meaningful citation keys (FirstAuthor2024keyword)
  - Protect capitalization in titles with {}
  - Use -- for page ranges (not single dash)
  - Include DOI field for all modern publications
- Keep it clean:
  - Remove unnecessary fields
  - No redundant information
  - Consistent formatting
  - Validate syntax regularly
- Organize systematically:
  - Sort by year or topic
  - Group related papers
  - Use separate files for different projects
  - Merge carefully to avoid duplicates
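Two of the conventions above are mechanical enough to automate: the FirstAuthor2024keyword key and the `--` page range. The following is a minimal sketch under stated assumptions: the stopword list, the choice of the first significant title word as the keyword, and the function names are all illustrative and may differ from what format_bibtex.py does.

```python
import re
import unicodedata

# Small illustrative stopword list; extend as needed.
STOPWORDS = {"a", "an", "the", "on", "of", "in", "for", "and", "to", "with"}


def _ascii_word(text: str) -> str:
    """Strip accents and anything that is not an ASCII letter or digit."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^A-Za-z0-9]", "", ascii_only)


def make_citation_key(first_author_surname: str, year: int, title: str) -> str:
    """Build a FirstAuthor2024keyword-style key from the first significant title word."""
    words = [_ascii_word(w).lower() for w in title.split()]
    keyword = next((w for w in words if w and w not in STOPWORDS), "")
    return f"{_ascii_word(first_author_surname)}{year}{keyword}"


def normalize_page_range(pages: str) -> str:
    """Rewrite a hyphen or dash between pages as '--'; single pages are unchanged."""
    return re.sub(r"\s*[-\u2013\u2014]+\s*", "--", pages.strip())
```

Keys built this way can collide when one author publishes two papers in a year with the same leading word, so a uniqueness check is still needed before writing the file.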
Validation
- Validate early and often:
  - Check citations when adding them
  - Validate complete bibliography before submission
  - Re-validate after any manual edits
- Fix issues promptly:
  - Broken DOIs: Find correct identifier
  - Missing fields: Extract from original source
  - Duplicates: Choose best version, remove others
  - Format errors: Use auto-fix when safe
- Manual review for critical citations:
  - Verify key papers cited correctly
  - Check author names match publication
  - Confirm page numbers and volume
  - Ensure URLs are current
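A missing-field check is one of the simpler validations to reason about. The sketch below lists required fields for a few entry types as defined by the standard BibTeX styles and reports what an entry lacks. The dictionary layout (`ENTRYTYPE` and `ID` keys, as produced by bibtexparser 1.x) is an assumption about how entries are represented, and `missing_fields` is a hypothetical helper rather than part of validate_citations.py.

```python
from typing import List

# Required fields per entry type, following the standard BibTeX styles.
# A tuple means "at least one of these".
REQUIRED_FIELDS = {
    "article": ["author", "title", "journal", "year"],
    "inproceedings": ["author", "title", "booktitle", "year"],
    "book": [("author", "editor"), "title", "publisher", "year"],
    "incollection": ["author", "title", "booktitle", "publisher", "year"],
    "phdthesis": ["author", "title", "school", "year"],
}


def missing_fields(entry: dict) -> List[str]:
    """List required fields that are absent or blank in one entry."""
    missing = []
    entry_type = entry.get("ENTRYTYPE", "").lower()
    for requirement in REQUIRED_FIELDS.get(entry_type, []):
        options = requirement if isinstance(requirement, tuple) else (requirement,)
        if not any(str(entry.get(name, "")).strip() for name in options):
            missing.append(" or ".join(options))
    return missing
```

Entry types not listed in the table (for example `misc`) are treated as having no required fields, which keeps the check conservative.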
Common Pitfalls to Avoid
- Single source bias: Only using Google Scholar or PubMed
  - Solution: Search multiple databases for comprehensive coverage
- Accepting metadata blindly: Not verifying extracted information
  - Solution: Spot-check extracted metadata against original sources
- Ignoring DOI errors: Broken or incorrect DOIs in bibliography
  - Solution: Run validation before final submission
- Inconsistent formatting: Mixed citation key styles and field formatting
  - Solution: Use format_bibtex.py to standardize
- Duplicate entries: Same paper cited multiple times with different keys
  - Solution: Use duplicate detection in validation
- Missing required fields: Incomplete BibTeX entries
  - Solution: Validate and ensure all required fields are present
- Outdated preprints: Citing a preprint when a published version exists
  - Solution: Check if preprints have been published, update to journal version
- Special character issues: LaTeX compilation broken by unescaped special characters
  - Solution: Use proper escaping or Unicode in BibTeX
- No validation before submission: Submitting with citation errors
  - Solution: Always run validation as final check
- Manual BibTeX entry: Typing entries by hand
  - Solution: Always extract from metadata sources using scripts
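Duplicates with different keys are hard to spot by eye because the keys are exactly what differs. One way to detect them, sketched below, is to fingerprint each entry by its DOI when present and by a normalized title plus year otherwise. This is an illustrative approach with hypothetical function names; the duplicate detection in the bundled scripts may use different rules.

```python
import re
from collections import defaultdict


def _fingerprint(entry: dict) -> str:
    """Prefer the DOI; fall back to a punctuation-free lowercase title plus year."""
    doi = entry.get("doi", "").strip().lower()
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", doi)
    if doi:
        return "doi:" + doi
    title = entry.get("title", "").replace("{", "").replace("}", "").lower()
    title = re.sub(r"[^a-z0-9]+", " ", title).strip()
    if not title:
        # Nothing to compare on, so treat the entry as unique.
        return "id:" + entry.get("ID", "")
    return f"title:{title}|{entry.get('year', '').strip()}"


def find_duplicates(entries: list) -> list:
    """Return groups of citation keys that appear to describe the same paper."""
    groups = defaultdict(list)
    for entry in entries:
        groups[_fingerprint(entry)].append(entry.get("ID", "?"))
    return [keys for keys in groups.values() if len(keys) > 1]
```

A preprint and its published version usually carry different DOIs, so this check will not pair them; that case still needs the manual check described under "Outdated preprints".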
Example Workflows
Example 1: Building a Bibliography for a Paper
```bash
# Step 1: Find key papers on your topic
python scripts/search_google_scholar.py "transformer neural networks" \
  --year-start 2017 \
  --limit 50 \
  --output transformers_gs.json

python scripts/search_pubmed.py "deep learning medical imaging" \
  --date-start 2020 \
  --limit 50 \
  --output medical_dl_pm.json

# Step 2: Extract metadata from search results
python scripts/extract_metadata.py \
  --input transformers_gs.json \
  --output transformers.bib

python scripts/extract_metadata.py \
  --input medical_dl_pm.json \
  --output medical.bib

# Step 3: Add specific papers you already know
python scripts/doi_to_bibtex.py 10.1038/s41586-021-03819-2 >> specific.bib
python scripts/doi_to_bibtex.py 10.1126/science.aam9317 >> specific.bib

# Step 4: Combine all BibTeX files
cat transformers.bib medical.bib specific.bib > combined.bib

# Step 5: Format and deduplicate
python scripts/format_bibtex.py combined.bib \
  --deduplicate \
  --sort year \
  --descending \
  --output formatted.bib

# Step 6: Validate
python scripts/validate_citations.py formatted.bib \
  --auto-fix \
  --report validation.json \
  --output final_references.bib

# Step 7: Review any issues
cat validation.json | grep -A 3 '"errors"'

# Step 8: Use in LaTeX
# \bibliography{final_references}
```
Example 2: Converting a List of DOIs
```bash
# You have a text file with DOIs (one per line)
# dois.txt contains:
# 10.1038/s41586-021-03819-2
# 10.1126/science.aam9317
# 10.1016/j.cell.2023.01.001

# Convert all to BibTeX
python scripts/doi_to_bibtex.py --input dois.txt --output references.bib

# Validate the result
python scripts/validate_citations.py references.bib --verbose
```
Example 3: Cleaning an Existing BibTeX File
```bash
# You have a messy BibTeX file from various sources
# Clean it up systematically

# Step 1: Format and standardize
python scripts/format_bibtex.py messy_references.bib \
  --output step1_formatted.bib

# Step 2: Remove duplicates
python scripts/format_bibtex.py step1_formatted.bib \
  --deduplicate \
  --output step2_deduplicated.bib

# Step 3: Validate and auto-fix
python scripts/validate_citations.py step2_deduplicated.bib \
  --auto-fix \
  --output step3_validated.bib

# Step 4: Sort by year
python scripts/format_bibtex.py step3_validated.bib \
  --sort year \
  --descending \
  --output clean_references.bib

# Step 5: Final validation report
python scripts/validate_citations.py clean_references.bib \
  --report final_validation.json \
  --verbose

# Review report
cat final_validation.json
```
Example 4: Finding and Citing Seminal Papers
```bash
# Find highly cited papers on a topic
python scripts/search_google_scholar.py "AlphaFold protein structure" \
  --year-start 2020 \
  --year-end 2024 \
  --sort-by citations \
  --limit 20 \
  --output alphafold_seminal.json

# Extract the top 10 by citation count
# (script will have included citation counts in JSON)

# Convert to BibTeX
python scripts/extract_metadata.py \
  --input alphafold_seminal.json \
  --output alphafold_refs.bib

# The BibTeX file now contains the most influential papers
```
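The "Extract the top 10 by citation count" step in this example is described but not shown. A minimal sketch of that filter follows. It assumes the search output is a JSON array of objects with an integer-like `citations` field; that schema and the function names are assumptions, so check the actual JSON written by search_google_scholar.py before relying on it.

```python
import json


def top_cited(results: list, n: int = 10) -> list:
    """Return the n results with the highest citation counts, highest first."""
    # Missing or null counts are treated as zero (assumed field name: "citations").
    return sorted(results, key=lambda r: int(r.get("citations") or 0), reverse=True)[:n]


def filter_results_file(src: str, dst: str, n: int = 10) -> int:
    """Read a JSON list of results, keep the top n, write them out, return the count kept."""
    with open(src, encoding="utf-8") as fh:
        results = json.load(fh)
    kept = top_cited(results, n)
    with open(dst, "w", encoding="utf-8") as fh:
        json.dump(kept, fh, indent=2, ensure_ascii=False)
    return len(kept)
```

The filtered file can then be passed to extract_metadata.py through `--input` in place of the full result set.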
Integration with Other Skills
Literature Review Skill
Citation Management provides the technical infrastructure for Literature Review:
- Literature Review: Multi-database systematic search and synthesis
- Citation Management: Metadata extraction and validation
Combined workflow:
- Use literature-review for systematic search methodology
- Use citation-management to extract and validate citations
- Use literature-review to synthesize findings
- Use citation-management to ensure bibliography accuracy
Scientific Writing Skill
Citation Management ensures accurate references for Scientific Writing:
- Export validated BibTeX for use in LaTeX manuscripts
- Verify citations match publication standards
- Format references according to journal requirements
Venue Templates Skill
Citation Management works with Venue Templates for submission-ready manuscripts:
- Different venues require different citation styles
- Generate properly formatted references
- Validate citations meet venue requirements
Resources
Bundled Resources
References (in references/):
- google_scholar_search.md: Complete Google Scholar search guide
- pubmed_search.md: PubMed and E-utilities API documentation
- metadata_extraction.md: Metadata sources and field requirements
- citation_validation.md: Validation criteria and quality checks
- bibtex_formatting.md: BibTeX entry types and formatting rules
Scripts (in scripts/):
- search_google_scholar.py: Google Scholar search automation
- search_pubmed.py: PubMed E-utilities API client
- extract_metadata.py: Universal metadata extractor
- validate_citations.py: Citation validation and verification
- format_bibtex.py: BibTeX formatter and cleaner
- doi_to_bibtex.py: Quick DOI to BibTeX converter
Assets (in assets/):
- bibtex_template.bib: Example BibTeX entries for all types
- citation_checklist.md: Quality assurance checklist
External Resources
Search Engines:
- Google Scholar: https://scholar.google.com/
- PubMed: https://pubmed.ncbi.nlm.nih.gov/
- PubMed Advanced Search: https://pubmed.ncbi.nlm.nih.gov/advanced/
Metadata APIs:
- CrossRef API: https://api.crossref.org/
- PubMed E-utilities: https://www.ncbi.nlm.nih.gov/books/NBK25501/
- arXiv API: https://arxiv.org/help/api/
- DataCite API: https://api.datacite.org/
Tools and Validators:
- MeSH Browser: https://meshb.nlm.nih.gov/search
- DOI Resolver: https://doi.org/
- BibTeX Format: http://www.bibtex.org/Format/
Citation Styles:
- BibTeX documentation: http://www.bibtex.org/
- LaTeX bibliography management: https://www.overleaf.com/learn/latex/Bibliography_management
Dependencies
Required Python Packages
```bash
# Core dependencies
pip install requests      # HTTP requests for APIs
pip install bibtexparser  # BibTeX parsing and formatting
pip install biopython     # PubMed E-utilities access

# Optional (for Google Scholar)
pip install scholarly     # Google Scholar API wrapper
# or
pip install selenium      # For more robust Scholar scraping
```
Optional Tools
```bash
# For advanced validation
pip install crossref-commons  # Enhanced CrossRef API access
pip install pylatexenc        # LaTeX special character handling
```
Summary
The citation-management skill provides:
- Comprehensive search capabilities for Google Scholar and PubMed
- Automated metadata extraction from DOI, PMID, arXiv ID, URLs
- Citation validation with DOI verification and completeness checking
- BibTeX formatting with standardization and cleaning tools
- Quality assurance through validation and reporting
- Integration with scientific writing workflow
- Reproducibility through documented search and extraction methods
Use this skill to maintain accurate, complete citations throughout your research and ensure publication-ready bibliographies.