research-docs
Research Docs — Document Q&A with Visual Citations
Parse documents with LiteParse, answer a question using the parsed text, and generate an HTML report with source citations highlighted on page images.
Arguments
`$ARGUMENTS`
- First argument (`$0`): Path to the data directory containing documents
- Remaining arguments: The question to answer
If either is missing, ask the user to provide it.
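The argument check above can be sketched as follows. This is a minimal illustration, not part of the skill: `split_arguments` and `missing_arguments` are hypothetical helpers, and the sketch assumes the directory path is the first whitespace-separated token (so a path containing spaces would need different handling).

```python
def split_arguments(arguments: str):
    """Split a raw argument string into (data_dir, question).

    Assumption: the first whitespace-separated token is the data directory
    and everything after it is the question. A missing part is None.
    """
    parts = arguments.strip().split(maxsplit=1)
    data_dir = parts[0] if len(parts) >= 1 else None
    question = parts[1].strip() if len(parts) == 2 else None
    return data_dir, question


def missing_arguments(arguments: str):
    """Return the names of the arguments the user still has to provide."""
    data_dir, question = split_arguments(arguments)
    missing = []
    if not data_dir:
        missing.append("data directory")
    if not question:
        missing.append("question")
    return missing
```

If `missing_arguments` returns a non-empty list, that list names what to ask the user for before moving on to parsing.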
Step 1 — Parse Documents
IMPORTANT: Always use the bundled Python script below for parsing. Do NOT call `lit` or `liteparse` CLI commands directly; use only `generate_report.py`.
Run the bundled parse script to extract text and bounding boxes from all supported files:
```bash
python "${CLAUDE_SKILL_DIR}/scripts/generate_report.py" \
  --skill-dir "${CLAUDE_SKILL_DIR}" \
  --dir "$0" \
  --parse-only \
  --output /tmp/research_docs_parsed.json
```
This discovers and parses all supported files in the directory:
- LiteParse formats: PDF, DOCX, PPTX, XLSX, images (up to 50 files)
- Plaintext: .txt, .md, .rst (read directly)
The output is a JSON file with parsed text and bounding box coordinates for each page.
If the directory has more than 50 files and the user's question targets a specific document not in the first 50, re-run with a narrower `--dir` pointing to a subdirectory, or ask the user which files to focus on.
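To see in advance whether a directory will hit the 50-file limit, a rough pre-check like the one below may help. It is only a sketch and does not reproduce the bundled script's logic: the extension sets, the recursive search, the alphabetical ordering, and the `discover` helper itself are all assumptions made for illustration.

```python
from pathlib import Path

# Assumed extension sets, based on the formats listed above; the bundled
# script may recognise a different set (especially for images).
LITEPARSE_EXTS = {".pdf", ".docx", ".pptx", ".xlsx", ".png", ".jpg", ".jpeg"}
PLAINTEXT_EXTS = {".txt", ".md", ".rst"}
MAX_LITEPARSE_FILES = 50


def discover(data_dir: str) -> dict:
    """Group files under data_dir by how they would likely be handled."""
    liteparse, plaintext = [], []
    # Assumption: files are visited recursively in sorted path order.
    for path in sorted(Path(data_dir).rglob("*")):
        if not path.is_file():
            continue
        ext = path.suffix.lower()
        if ext in LITEPARSE_EXTS:
            liteparse.append(path)
        elif ext in PLAINTEXT_EXTS:
            plaintext.append(path)
    return {
        "liteparse": liteparse[:MAX_LITEPARSE_FILES],
        "skipped": liteparse[MAX_LITEPARSE_FILES:],
        "plaintext": plaintext,
    }
```

A non-empty `skipped` list is the signal to narrow `--dir` or to ask the user which files matter.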
Step 2 — Read Parsed Content
Read `/tmp/research_docs_parsed.json` using the Read tool. Focus on:
- Each file's `name` and `type`
- For LiteParse files: each page's `text` field (skip raw `textItems`; those are for bounding box rendering)
- For plaintext files: the `text` field
- The `summary` object for total counts
Build a mental model of all document content before answering.
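For readers who prefer to inspect the parsed output programmatically, the reading strategy above could look roughly like this. The field names `name`, `text`, `textItems`, `summary`, and `pageNum` come from this document; the top-level `files` key and the per-file `pages` key are assumptions, so check the actual JSON before relying on them.

```python
def collect_page_text(parsed: dict) -> dict:
    """Map (file name, page number) to page text, ignoring textItems.

    Assumptions: files live under a top-level "files" list and LiteParse
    entries carry a "pages" list. Plaintext entries are stored as page 0.
    """
    pages = {}
    for entry in parsed.get("files", []):
        name = entry["name"]
        if "pages" in entry:
            # LiteParse-style entry: one text field per page.
            for page in entry["pages"]:
                pages[(name, page["pageNum"])] = page.get("text", "")
        else:
            # Plaintext entry: a single text field and no pages.
            pages[(name, 0)] = entry.get("text", "")
    return pages
```

Loading the file with `json.load` and passing the result to `collect_page_text` gives a compact view of the content without the bulky `textItems` arrays.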
Step 3 — Answer with Citations
Using the parsed text as context, answer the user's question. Write your response as a JSON file:
```bash
cat > /tmp/research_docs_answer.json << 'ANSWER_EOF'
{
  "question": "<the user's question>",
  "answer": "<your answer in markdown with [N] citation markers>",
  "citations": [
    {
      "file": "<filename e.g. report.pdf>",
      "page": <1-indexed page number>,
      "quote": "<exact verbatim substring from the parsed text>",
      "relevance": "<explanation of what this value/quote means and how it supports the answer>"
    }
  ]
}
ANSWER_EOF
```
Critical rules for the answer:
- Embed inline citation markers like `[1]`, `[2]`, etc. in your answer text, corresponding to the 1-indexed position in the `citations` array
- Place markers at the end of the sentence or claim they support
- Example: `"Reserve Bank credit totaled **$6,613,609 million** [1], with securities held outright at $6,375,679 million [2]."`
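The correspondence between `[N]` markers and the `citations` array can be checked mechanically before the report is generated. The sketch below is an optional sanity check, not something the skill requires; `check_markers` is a hypothetical helper, and it treats any bracketed integer in the answer as a citation marker.

```python
import re


def check_markers(answer_text: str, citation_count: int):
    """Return (out_of_range, unused) citation numbers for an answer.

    out_of_range: markers that point past the end of the citations array.
    unused: citations that no marker in the answer refers to.
    """
    used = {int(n) for n in re.findall(r"\[(\d+)\]", answer_text)}
    valid = set(range(1, citation_count + 1))
    return sorted(used - valid), sorted(valid - used)
```

Both lists should be empty for a well-formed answer; for example, an answer that uses `[1]` and `[3]` with only two citations yields `([3], [2])`.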
Critical rules for what to cite:
- Cite the EVIDENCE, not just the label. The user wants to audit your claims. If you say "revenue was $1.2B", cite the actual number as it appears in the text (for example `1,200,000` in a table stated in thousands), not just the heading "Revenue". You can cite both the value and the label if they're on the same page.
- Cite specific data values: numbers, percentages, dates, dollar amounts, quantities. These are what the user needs to verify.
- Each `relevance` field should explain the "so what": do not just restate the label; explain what this value means in context and how it supports your answer. E.g., instead of "Total revenue figure" write "Total revenue for Q3 2025, representing a 12% year-over-year increase that supports the growth trend discussed above."
- Include 5-15 citations covering all key claims and data points in your answer.
Critical rules for quote format:
- `quote` MUST be copied character-for-character from the parsed text. It is used for bounding box lookup via exact string matching. Do NOT paraphrase, reword, clean up, or fix typos.
- Prefer short, precise quotes: a number like `6,613,609` or a short phrase like `Securities held outright` (under 60 characters). Shorter quotes match bounding boxes much more reliably than long sentences.
- If the text has unusual characters, hyphens, or formatting artifacts, include them exactly as they appear.
- `page` is 1-indexed (matches LiteParse pageNum)
- `file` is just the filename (not the full path)
- For plaintext files (.txt, .md), set `page` to `0` (they have no pages)
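Because quotes are matched by exact string comparison, it can be worth verifying them against the parsed text before generating the report. The following is a minimal sketch under stated assumptions: `check_citations` is a hypothetical helper, and `page_text` is assumed to be a dictionary you have built yourself that maps `(file, page)` to that page's parsed text, with page `0` for plaintext files.

```python
def check_citations(citations: list, page_text: dict, max_len: int = 60) -> list:
    """Return human-readable problems found in the citations list.

    page_text maps (file name, page number) to parsed text; this mapping
    is an assumption of the sketch, not a structure the skill provides.
    """
    problems = []
    for i, cite in enumerate(citations, start=1):
        key = (cite["file"], cite["page"])
        text = page_text.get(key)
        if text is None:
            problems.append(f"[{i}] no parsed text for {key}")
        elif cite["quote"] not in text:
            # Exact, character-for-character containment is required.
            problems.append(f"[{i}] quote is not a verbatim substring")
        elif len(cite["quote"]) > max_len:
            problems.append(f"[{i}] quote is longer than {max_len} characters")
    return problems
```

A quote such as `6375679` would be flagged when the page text reads `6,375,679`, which is exactly the kind of "cleaned up" value the rules above warn against.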
Step 4 — Generate HTML Report
Run the bundled script in generate mode to produce the visual report:
```bash
python "${CLAUDE_SKILL_DIR}/scripts/generate_report.py" \
  --skill-dir "${CLAUDE_SKILL_DIR}" \
  --dir "$0" \
  --answer-file /tmp/research_docs_answer.json \
  --output research_docs_output/
```
This will:
- Parse and screenshot only the cited pages (efficient — not all pages)
- Find bounding boxes for each cited quote
- Generate a self-contained HTML report with the answer, page images, and highlights
- Open the report in the default browser
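The bounding box step helps explain why short, verbatim quotes work best. The sketch below shows one plausible way an exact-match lookup over text items could behave; it is not the bundled script's actual algorithm. `find_boxes` is a hypothetical helper, and the item fields (`text`, `x`, `y`, `width`, `height`) and the space used to join neighbouring items are assumptions.

```python
def find_boxes(quote: str, text_items: list) -> list:
    """Return the text items that cover quote, or [] if nothing matches.

    Each item is assumed to look like
    {"text": str, "x": float, "y": float, "width": float, "height": float}.
    """
    # Case 1: the quote sits inside a single text item.
    for item in text_items:
        if quote in item["text"]:
            return [item]
    # Case 2: the quote spans consecutive items; try the shortest runs first.
    for size in range(2, len(text_items) + 1):
        for start in range(len(text_items) - size + 1):
            window = text_items[start:start + size]
            if quote in " ".join(it["text"] for it in window):
                return window
    return []
```

Under these assumptions, a quote like `6,375,679` resolves to one box, while a long sentence only matches if every joined fragment and separator lines up exactly, which is why long quotes tend to fail.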
Step 5 — Present Results
Tell the user:
- Where the report was saved (the file path printed by the script)
- A brief summary of the answer (2-3 sentences)
- How many citations were found