Before Starting: Greet the user first 🐕
Academic Paper Reading Assistant (Paper Reader)
Focused on the CV/DL field, with support for Zotero integration and Obsidian note saving.
Step 0: Read Shared Configuration
First read ../_shared/user-config.json, then override the default values with ../_shared/user-config.local.json if it exists.
Explicitly derive these variables and use them consistently in subsequent steps:
Where:
NOTES_PATH = {VAULT_PATH}/{paper_notes_folder}
CONCEPTS_PATH = {NOTES_PATH}/{concepts_folder}
- can only be true if
Use the above variables uniformly in subsequent steps.
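The configuration step above can be sketched as follows. This is a minimal illustration, not the skill's actual loader: it assumes the local file overrides the shared one key by key (a shallow merge), and the `vault_path` key name is hypothetical, since the source only names `paper_notes_folder` and `concepts_folder`.

```python
import json
from pathlib import Path


def load_config(shared_dir):
    """Read user-config.json, then let user-config.local.json override it."""
    shared = Path(shared_dir)
    config = json.loads((shared / "user-config.json").read_text(encoding="utf-8"))
    local = shared / "user-config.local.json"
    if local.exists():
        # Assumption: a shallow, key-by-key override is sufficient.
        config.update(json.loads(local.read_text(encoding="utf-8")))
    return config


def derive_paths(config):
    """Build the path variables used by the later steps."""
    vault = config["vault_path"]  # hypothetical key name
    notes = f"{vault}/{config['paper_notes_folder']}"
    concepts = f"{notes}/{config['concepts_folder']}"
    return {"VAULT_PATH": vault, "NOTES_PATH": notes, "CONCEPTS_PATH": concepts}
```

Calling `derive_paths(load_config("../_shared"))` once at the start keeps every later step working from the same values.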
1. Receive Paper
| Input Method | Example | Processing Method |
|---|---|---|
| PDF Path | | Direct Read |
| arXiv Link | https://arxiv.org/abs/xxxx | WebFetch |
| Zotero Category | "papers in VLA category" | Query Database → List → User Selection |
| Zotero Search | "π0.5 in Zotero" | Search Title → Locate PDF |
| No PDF | Zotero entry has no attachments | Fetch from the web (see below) |
Fetch Process When No PDF is Available
- Run python3 assets/zotero_helper.py info {item_id} to get paper information
- Fetch in priority order: arXiv HTML > arXiv PDF > DOI > WebSearch by title
- Identify arXiv ID: from URL / Zotero extra field / title search
- Recommended: use WebFetch directly on https://arxiv.org/html/{arxiv_id} without downloading
- Skip conditions: neither PDF nor online source available / non-paper content
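The priority order above can be sketched as a function that builds an ordered list of fetch candidates. This is an illustrative sketch only: the function and parameter names are hypothetical, the regular expression covers only new-style arXiv IDs (such as 2401.12345v2), and the title-search fallback for finding the arXiv ID is left out.

```python
import re

# New-style arXiv identifiers only, with an optional version suffix.
ARXIV_ID = re.compile(r"\d{4}\.\d{4,5}(?:v\d+)?")


def find_arxiv_id(*texts):
    """Return the first arXiv ID found in the URL or the Zotero extra field."""
    for text in texts:
        match = ARXIV_ID.search(text or "")
        if match:
            return match.group(0)
    return None


def fetch_candidates(url="", extra="", doi="", title=""):
    """List (source, target) pairs in priority order; an empty list means skip."""
    candidates = []
    arxiv_id = find_arxiv_id(url, extra)
    if arxiv_id:
        candidates.append(("arxiv_html", f"https://arxiv.org/html/{arxiv_id}"))
        candidates.append(("arxiv_pdf", f"https://arxiv.org/pdf/{arxiv_id}"))
    if doi:
        candidates.append(("doi", f"https://doi.org/{doi}"))
    if title:
        candidates.append(("web_search", title))
    return candidates
```

The caller would try each candidate in turn and stop at the first one that returns usable content.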
For detailed Zotero operations, see references/zotero-guide.md.
2. Reading Modes
| Mode | Trigger Phrases | Output |
|---|---|---|
| Quick Summary | "quickly look at", "quick" | 3-5 sentences of core contributions |
| Full Analysis | "detailed analysis", default | Structured notes (using template) |
| Critical Analysis | "critically analyze", "critique" | Evaluation of methodological strengths and weaknesses |
| Knowledge Extraction | "extract formulas", "technical details" | Formulas + algorithm pseudocode |
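The table above amounts to a phrase lookup with Full Analysis as the default. A minimal sketch, assuming simple case-insensitive substring matching and that the first matching row in table order wins (the mode identifiers are hypothetical):

```python
# Trigger phrases copied from the table; mode identifiers are illustrative.
MODE_TRIGGERS = [
    ("quick_summary", ("quickly look at", "quick")),
    ("full_analysis", ("detailed analysis",)),
    ("critical_analysis", ("critically analyze", "critique")),
    ("knowledge_extraction", ("extract formulas", "technical details")),
]


def select_mode(request):
    """Pick a reading mode from the user's request; default to full analysis."""
    text = request.lower()
    for mode, phrases in MODE_TRIGGERS:
        if any(phrase in text for phrase in phrases):
            return mode
    return "full_analysis"
```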
3. Note Generation
Template: Strictly follow assets/paper-note-template.md; do not simplify it on your own.
Core Quality Rules
- No Omissions: All Figures, formulas, and Tables in the paper must be included in the notes
- Inline Concept Links: Technical terms appearing for the first time in the text must be linked using , not just at the end
- No ASCII Flowcharts: Describe architectures using structured Markdown lists +
- Formula Completeness: Each formula must have a name (), LaTeX formula, meaning, and symbol explanation
- Prefer External Image Links: arXiv HTML / project homepage / GitHub; download locally only if none is found
Detailed quality specifications for formulas/images/tables can be found in references/quality-standards.md.
Image Fetch Process (Multi-Source Fallback)
Goal: Ensure the notes include all Figures from the paper. First count the total number of Figures in the paper, then fetch them one by one.
- WebSearch to get the arXiv ID
- Source A — arXiv HTML (preferred):
- WebFetch https://arxiv.org/html/{arxiv_id} to extract titles and img src URLs of all elements
- Count the total number of Figures in the paper and confirm if the extracted quantity is complete
- Source B — Project Homepage (when HTML returns 404 or images are incomplete):
- Find the project homepage URL from the abstract/HTML (common patterns: , , )
- WebFetch the project homepage and extract displayed images (usually includes teaser/demo images)
- Source C — PDF Extraction (when the first two sources fail):
- Use to extract from PDF, filter valid images larger than 10KB
- Embed in notes using external links
- Verification: External links are loadable / local files are >10KB
- URL Deduplication: Before writing, check the URL for duplicate arxiv_id path segments (e.g., 2603.05312v1/2603.05312v1/) and remove duplicates if present. See references/image-troubleshooting.md for details.
ar5iv numbers do not necessarily correspond to Figure numbers; see references/image-troubleshooting.md for troubleshooting.
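The URL deduplication rule can be sketched as below. This is one possible implementation under stated assumptions: it only collapses a path segment when it is a new-style arXiv ID that immediately repeats the previous segment, and the function name is hypothetical.

```python
import re
from urllib.parse import urlsplit, urlunsplit

ARXIV_SEGMENT = re.compile(r"\d{4}\.\d{4,5}(?:v\d+)?")


def dedupe_arxiv_segments(url):
    """Collapse an arXiv ID path segment that is repeated back to back."""
    parts = urlsplit(url)
    cleaned = []
    for segment in parts.path.split("/"):
        is_repeat = (
            cleaned
            and cleaned[-1] == segment
            and ARXIV_SEGMENT.fullmatch(segment) is not None
        )
        if not is_repeat:
            cleaned.append(segment)
    return urlunsplit(parts._replace(path="/".join(cleaned)))
```

For example, a URL whose path contains `2603.05312v1/2603.05312v1/` comes back with a single `2603.05312v1/` segment, while URLs without the duplication are returned unchanged.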
Image Reliability Guarantee (Automatically Executed After Generation)
After saving the notes, run the image accessibility check script, which automatically downloads inaccessible externally linked images to local storage:
bash
python3 ../daily-papers/download_note_images.py "{full note path}"
- Accessible external links remain unchanged; inaccessible ones are automatically downloaded to and replaced with Obsidian wikilinks
- If localization is performed, the frontmatter is automatically updated to
Formula Format
Each formula must include: name (), LaTeX block (with blank lines before and after), meaning, and symbol list.
There must be blank lines before and after the block, otherwise Obsidian will not render it. Split extra-long formulas using .
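The blank-line rule is easy to check mechanically. The sketch below assumes that formulas are written as display-math blocks whose opening and closing `$$` delimiters each sit on their own line; that delimiter style and the function name are assumptions, not something the text above specifies.

```python
def blocks_missing_blank_lines(markdown):
    """Return 1-based line numbers of $$ blocks lacking a blank line before or after."""
    lines = markdown.split("\n")
    fences = [i for i, line in enumerate(lines) if line.strip() == "$$"]
    problems = []
    # Pair delimiters as (open, close); an unpaired trailing delimiter is ignored.
    for open_i, close_i in zip(fences[0::2], fences[1::2]):
        before_ok = open_i == 0 or lines[open_i - 1].strip() == ""
        after_ok = close_i == len(lines) - 1 or lines[close_i + 1].strip() == ""
        if not (before_ok and after_ok):
            problems.append(open_i + 1)
    return problems
```

Running this on a finished note before saving gives a quick list of blocks to fix.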
4. Obsidian Saving
File Naming
Use only the method/model name: (e.g., , no year prefix).
Determining the method name: use the text before the colon in the title, or the name in "We propose XXX" in the Abstract; convert Greek letters to ASCII.
Save to if unsure.
Save Path
Follow Zotero category hierarchy:
{NOTES_PATH}/{zotero_collection_path}/{method_name}.md
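The naming and path rules can be sketched as follows. This is a minimal illustration under assumptions: it covers only the "before the colon in the title" heuristic, the Greek-letter table is a small illustrative subset, and the function names are hypothetical.

```python
# Partial mapping for illustration; extend as needed.
GREEK_TO_ASCII = {
    "α": "alpha", "β": "beta", "γ": "gamma", "δ": "delta",
    "λ": "lambda", "μ": "mu", "π": "pi", "σ": "sigma",
}


def method_name_from_title(title):
    """Take the text before the first colon and spell out Greek letters."""
    if ":" not in title:
        return None  # fall back to the Abstract heuristic
    head = title.split(":", 1)[0].strip()
    return "".join(GREEK_TO_ASCII.get(ch, ch) for ch in head)


def note_path(notes_path, zotero_collection_path, method_name):
    """Build {NOTES_PATH}/{zotero_collection_path}/{method_name}.md."""
    return f"{notes_path}/{zotero_collection_path}/{method_name}.md"
```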
YAML Frontmatter
yaml
---
title: "Paper Title"
method_name: "MethodName"
authors: [Author1, Author2]
year: 2025
venue: arXiv
tags: [tag1, tag2] # lowercase hyphenated, 3-8 tags
zotero_collection: 3-Robotics/1-VLX/VLA
image_source: online
created: YYYY-MM-DD
---
Choosing tags: check the Related Work subheadings and the Abstract keywords. The first tag is the core theme.
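The tag constraints in the frontmatter comment (lowercase, hyphenated, 3-8 tags) can be enforced with a small helper. A sketch under assumptions: the function names are hypothetical, and dropping duplicates while keeping the original order is a design choice made here so the core theme stays first.

```python
import re


def normalize_tag(tag):
    """Lowercase a tag and join its words with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", tag.lower()).strip("-")


def normalize_tags(tags):
    """Normalize, drop duplicates in order, and enforce the 3-8 tag limit."""
    result = []
    for tag in tags:
        clean = normalize_tag(tag)
        if clean and clean not in result:
            result.append(clean)
    if not 3 <= len(result) <= 8:
        raise ValueError(f"expected 3-8 tags, got {len(result)}")
    return result
```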
Automatic Execution After Saving
- Refresh the directory pages only if AUTO_REFRESH_INDEXES=true:
bash
python3 ../_shared/generate_concept_mocs.py
python3 ../_shared/generate_paper_mocs.py
- Perform git operations only if :
- First confirm that exists
- Ensure there are actual staged changes after git add {new file} {paper_notes_folder}/
- Execute the following only if conditions are met:
bash
cd {VAULT_PATH} && git add {new file} {paper_notes_folder}/ && git commit -m "add paper note: {method_name}"
- Push only if and the remote repository is configured
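The staged-changes guard can be sketched in Python with `subprocess`. This is a hedged illustration, not the skill's implementation: the `allow_commit` parameter is a hypothetical stand-in for the configuration condition, and pushing is left out. It relies on `git diff --cached --quiet` exiting with status 1 when the index contains changes.

```python
import subprocess


def commit_message(method_name):
    """Commit message format used by the skill."""
    return f"add paper note: {method_name}"


def has_staged_changes(vault_path):
    """True when the index differs from HEAD (git exits with status 1)."""
    result = subprocess.run(
        ["git", "-C", vault_path, "diff", "--cached", "--quiet"]
    )
    return result.returncode == 1


def commit_note(vault_path, new_file, paper_notes_folder, method_name, allow_commit):
    """Stage the note and commit only if something was actually staged."""
    if not allow_commit:  # hypothetical flag standing in for the config condition
        return False
    subprocess.run(
        ["git", "-C", vault_path, "add", new_file, f"{paper_notes_folder}/"],
        check=True,
    )
    if not has_staged_changes(vault_path):
        return False
    subprocess.run(
        ["git", "-C", vault_path, "commit", "-m", commit_message(method_name)],
        check=True,
    )
    return True
```

Checking the index before committing avoids a failing `git commit` when the note was already committed in an earlier run.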
5. Concept Library Maintenance (Required for Each Paper)
Concept library location:
Process
- Scan all links in the paper notes
- Check if the corresponding concept notes exist for each link (using + )
- Create non-existent concepts (cannot be skipped), and automatically classify them into corresponding subdirectories
Classification rules and templates can be found in references/concept-categories.md.
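The scan-and-check part of the process above can be sketched as follows. It assumes concept links are Obsidian wikilinks of the form `[[Name]]` (optionally with an alias or heading) and that each concept note is a Markdown file named after the concept somewhere under the concept library; both are assumptions, and the function name is hypothetical. Classification into subdirectories is not covered.

```python
import re
from pathlib import Path

# Matches [[Name]], [[Name|alias]], [[Name#heading]]; skips ![[embeds]].
WIKILINK = re.compile(r"(?<!!)\[\[([^\]|#]+)")


def missing_concepts(note_text, concepts_dir):
    """Return linked concept names that have no note under concepts_dir."""
    linked = {name.strip() for name in WIKILINK.findall(note_text)}
    existing = {path.stem for path in Path(concepts_dir).rglob("*.md")}
    return sorted(linked - existing)
```

The length of the returned list is also the number to report after new concept notes are created.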
Self-Check
6. Post-Completion Self-Check (Combined Checklist)
7. Interactive Features
After completing the analysis, ask the user: Would you like an in-depth explanation? Compare with other papers? Save to Obsidian?
After saving, automatically create missing concept notes and report the number of newly added concepts.
8. Batch Processing
Supports batch processing of Zotero categories (recursive subcategories by default). Process: recursively retrieve papers → deduplicate → skip existing notes → process sequentially → summarize.
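The deduplicate and skip-existing steps of the batch flow can be sketched as a planning function. This is illustrative only: each paper is assumed to be a dict with hypothetical `item_id`, `collection_path`, and `method_name` fields, and retrieval from Zotero and the per-paper processing itself are out of scope.

```python
from pathlib import Path


def plan_batch(papers, notes_path):
    """Split papers into those to process and those skipped (note already exists)."""
    seen = set()
    to_process, skipped = [], []
    for paper in papers:
        if paper["item_id"] in seen:
            continue  # same paper reached through another subcategory
        seen.add(paper["item_id"])
        note = Path(notes_path) / paper["collection_path"] / f"{paper['method_name']}.md"
        (skipped if note.exists() else to_process).append(paper)
    return to_process, skipped
```

The two returned lists give the counts needed for the final summary.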
Reference Files (Access As Needed)
references/zotero-guide.md: Zotero query, classification, PDF path retrieval, intelligent category judgment
references/image-troubleshooting.md: ar5iv image number correspondence, PDF extraction alternatives
references/concept-categories.md: 16 subdirectory rules + templates for automatic concept classification
references/quality-standards.md: Detailed quality specifications + self-checklists for formulas/images/tables