paper-audit
Paper Audit Skill (论文审核)
Unified academic paper auditing across formats and languages.
Critical Rules
- NEVER modify `\cite{}`, `\ref{}`, `\label{}`, or math environments in LaTeX
- NEVER modify `@cite`, `#cite()`, `#ref()`, `<label>` in Typst
- NEVER fabricate bibliography entries; only verify existing `.bib` / `.yml` files
- NEVER change domain terminology without user confirmation
- Check `FORBIDDEN_TERMS` lists before suggesting any terminology changes
- For PDF input, clearly flag sections where extraction quality is uncertain
- Always distinguish between automated findings and LLM-judgment scores
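The LaTeX rule can be checked mechanically after any suggested edit. The following is a minimal sketch, assuming protected commands can be matched with a flat regular expression (nested braces and math environments are not covered). `protected_tokens` and `preserves_protected` are hypothetical helper names for illustration, not functions from `scripts/audit.py`.

```python
import re
from collections import Counter

# Assumption: \cite, \ref, \eqref and \label with an optional [..] argument
# and a brace group without nested braces.
PROTECTED = re.compile(r"\\(?:cite|ref|eqref|label)(?:\[[^\]]*\])*\{[^{}]*\}")


def protected_tokens(text: str) -> Counter:
    """Count every protected LaTeX command occurrence in the text."""
    return Counter(PROTECTED.findall(text))


def preserves_protected(original: str, revised: str) -> bool:
    """True when the revision keeps exactly the same protected commands."""
    return protected_tokens(original) == protected_tokens(revised)
```

A suggestion whose revised text fails this check would be rejected before it reaches the report; reordering commands within a sentence is allowed, adding or dropping one is not.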
Audit Modes
Mode: self-check (Pre-submission Self-Check)
Trigger keywords: audit, check, self-check, pre-submission, score, review my paper
What it does: Runs all automated checks and generates a structured report with:
- Per-dimension scores (Quality, Clarity, Significance, Originality) on 1-6 scale
- Issue list sorted by severity (Critical > Major > Minor)
- Improvement suggestions per section
- Pre-submission checklist results
CLI: `python scripts/audit.py paper.tex --mode self-check`
Online Bibliography Verification
Add `--online` to enable CrossRef/Semantic Scholar metadata verification:
`python scripts/audit.py paper.tex --mode self-check --online --email user@example.com`
ScholarEval 8-Dimension Assessment
Add `--scholar-eval` to enable the 8-dimension evaluation framework:
`python scripts/audit.py paper.tex --mode self-check --scholar-eval`
Script-evaluable dimensions (Soundness, Clarity, Presentation, partial Reproducibility) are scored automatically. For a complete assessment, supplement with LLM evaluation of Novelty, Significance, Ethics, and Reproducibility. See `SCHOLAR_EVAL_GUIDE.md`.
ScholarEval LLM Assessment Prompt (for `review` mode):
Read the full paper and provide 1-10 scores with evidence in JSON format:
json
{
"novelty": {
"score": "<1-10>",
"evidence": "<Describe originality and distinction from prior work>"
},
"significance": {
"score": "<1-10>",
"evidence": "<Describe potential impact on the field>"
},
"reproducibility_llm": {
"score": "<1-10>",
"evidence": "<Assess experimental description completeness, code/data availability>"
},
"ethics": {
"score": "<1-10>",
"evidence": "<Assess ethical considerations, conflicts of interest, data privacy>"
}
}
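An LLM reply in this shape is worth validating before its scores are merged with the scripted dimensions. The sketch below assumes the reply is a single JSON object with the four keys shown above and that each score may arrive as a number or a quoted string; `validate_llm_scores` is a hypothetical helper, not part of the skill's scripts.

```python
import json

# The four LLM-judged dimensions named in the prompt template.
LLM_DIMENSIONS = ("novelty", "significance", "reproducibility_llm", "ethics")


def validate_llm_scores(raw: str) -> dict:
    """Parse the LLM reply and return {dimension: score}, or raise ValueError."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    scores = {}
    for dim in LLM_DIMENSIONS:
        entry = data.get(dim) if isinstance(data, dict) else None
        if not isinstance(entry, dict):
            raise ValueError(f"missing dimension: {dim}")
        try:
            score = float(entry.get("score"))
        except (TypeError, ValueError):
            raise ValueError(f"non-numeric score for {dim}") from None
        if not 1 <= score <= 10:
            raise ValueError(f"score out of range for {dim}: {score}")
        if not str(entry.get("evidence", "")).strip():
            raise ValueError(f"missing evidence for {dim}")
        scores[dim] = score
    return scores
```

Rejecting replies without evidence keeps LLM-judgment scores distinguishable from automated findings, in line with the Critical Rules.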
Mode: review (Peer Review Simulation)
Trigger keywords: simulate review, peer review, reviewer perspective, what would reviewers say
What it does: Everything in self-check PLUS:
- Paper summary from reviewer perspective
- Strengths analysis
- Weaknesses analysis with severity
- Questions a reviewer would ask
- Accept/reject recommendation with confidence
CLI: `python scripts/audit.py paper.tex --mode review`
Mode: gate (Quality Gate)
Trigger keywords: quality gate, pass/fail, can I submit, ready to submit, advisor check
What it does: Fast mandatory checks only:
- Format validation
- Bibliography integrity
- Figure/table references
- Pre-submission checklist
- Binary PASS/FAIL verdict with blocking issues
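The binary verdict can be sketched as a filter over the collected issues. This assumes each issue is a dict with a `priority` key and that P0 issues are the blocking ones, as the output protocol labels them. `gate_verdict` is a hypothetical name, and the real gate logic in `scripts/audit.py` may differ.

```python
def gate_verdict(issues):
    """Return ("PASS" | "FAIL", blocking_issues) for a list of issue dicts."""
    # Assumption: only P0 issues block submission.
    blocking = [issue for issue in issues if issue.get("priority") == "P0"]
    return ("FAIL" if blocking else "PASS", blocking)
```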
CLI: `python scripts/audit.py paper.tex --mode gate`
Mode: polish (Adversarial Dual-Agent Deep Polish)
Trigger keywords: polish, deep polish, adversarial review, refine writing, improve writing, paragraph polish
What it does:
- Phase 1 (Python): Fast rule-based precheck → .polish-state/precheck.json
- Phase 2 (Critic Agent): LLM adversarial review → per-section logic/expression scores
- Phase 3 (Mentor Agent × N): Per-section polish suggestions → Original vs Revised table
- Outputs: Structured polish report with diff-comment suggestions
Style options (`--style`):
- `A` Plain Precise (default): Short sentences, active voice, technical precision
- `B` Narrative Fluent: Story-driven, transitions, accessible prose
- `C` Formal Academic: Passive voice acceptable, formal register, hedge words
Skip logic: `--skip-logic` bypasses Critic logic scoring; the Mentor runs expression-only polish. Equivalent to the quick `/polish` command.
CLI: `python scripts/audit.py paper.tex --mode polish --style A --journal neurips`
Supported Formats
| Format | Parser | Notes |
|---|---|---|
| LaTeX (.tex) | | Full support — all checks available |
| Typst (.typ) | | Full support — all checks available |
| PDF (.pdf) basic | | Text extraction with font-size heading detection |
| PDF (.pdf) enhanced | | Structured Markdown with table/header preservation |
PDF Limitations: Math formulas may be lost; some checks (format, figures) skip for PDF. Recommend providing source files (.tex/.typ) for maximum accuracy.
Language Support
| Language | Detection | Extra Checks |
|---|---|---|
| English | Auto (default) | Standard suite |
| Chinese | Auto (CJK ratio > 30%) | + consistency check, + GB/T 7714 compliance |
Force with `--lang en` or `--lang zh`.
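The detection rule in the table can be sketched as a character-ratio test. This is an illustration only: it assumes the ratio is computed over non-whitespace characters and counts only the CJK Unified Ideographs block (U+4E00 to U+9FFF). The actual detector may define the ratio differently, and `cjk_ratio` and `detect_language` are hypothetical names.

```python
def cjk_ratio(text: str) -> float:
    """Share of non-whitespace characters that are CJK ideographs."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    cjk = sum(1 for c in chars if "\u4e00" <= c <= "\u9fff")
    return cjk / len(chars)


def detect_language(text: str, forced=None, threshold: float = 0.30) -> str:
    """Return "zh" or "en"; a forced value mirrors --lang en / --lang zh."""
    if forced in ("en", "zh"):
        return forced
    return "zh" if cjk_ratio(text) > threshold else "en"
```

Mixed-language drafts near the threshold are the case where forcing the language explicitly is most useful.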
Check Modules
| Module | Script Source | Dimensions Affected | Applicable Formats |
|---|---|---|---|
| Format Check | | Clarity | .tex, .typ |
| Grammar Analysis | | Clarity | .tex, .typ, .pdf |
| Logic & Coherence | | Quality, Significance | .tex, .typ, .pdf |
| Sentence Complexity | | Clarity | .tex, .typ, .pdf |
| De-AI Detection | | Clarity, Originality | .tex, .typ, .pdf |
| Bibliography | | Quality | .tex, .typ |
| Figure/Table Refs | | Clarity | .tex |
| Reference Integrity | | Clarity, Quality | .tex, .typ |
| Visual Layout | | Clarity | |
| Consistency (ZH) | | Clarity | .tex (Chinese only) |
| GB/T 7714 (ZH) | | Quality | .tex (Chinese only) |
| Pre-submission Checklist | Built-in | All | All formats |
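The "Applicable Formats" column amounts to a lookup from file extension to the modules that can run. The sketch below transcribes that column as given. Visual Layout is omitted because the table lists no formats for it, and `applicable_modules` is a hypothetical helper rather than the orchestrator's actual dispatch code.

```python
from pathlib import Path

ALL_FORMATS = {".tex", ".typ", ".pdf"}

# Transcribed from the table; Visual Layout is left out (no formats listed).
MODULE_FORMATS = {
    "Format Check": {".tex", ".typ"},
    "Grammar Analysis": ALL_FORMATS,
    "Logic & Coherence": ALL_FORMATS,
    "Sentence Complexity": ALL_FORMATS,
    "De-AI Detection": ALL_FORMATS,
    "Bibliography": {".tex", ".typ"},
    "Figure/Table Refs": {".tex"},
    "Reference Integrity": {".tex", ".typ"},
    "Pre-submission Checklist": ALL_FORMATS,
}

# Modules that the table marks as Chinese only.
ZH_ONLY_FORMATS = {
    "Consistency (ZH)": {".tex"},
    "GB/T 7714 (ZH)": {".tex"},
}


def applicable_modules(path: str, lang: str = "en") -> list:
    """List the check modules that apply to this file and language."""
    ext = Path(path).suffix.lower()
    table = dict(MODULE_FORMATS)
    if lang == "zh":
        table.update(ZH_ONLY_FORMATS)
    return [name for name, exts in table.items() if ext in exts]
```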
Scoring System
Based on REVIEWER_PERSPECTIVE.md criteria:
Four Dimensions
- Quality (30%): Technical soundness, well-supported claims
- Clarity (30%): Clear writing, reproducible, good organization
- Significance (20%): Community impact, advances understanding
- Originality (20%): New insights, not obvious extensions
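If the percentages are read as weights, the overall score is a weighted mean of the four dimension scores. This is a sketch under that assumption: the document lists the weights but does not spell out the aggregation formula, and `overall_score` is a hypothetical name.

```python
# Weights taken from the list above; they sum to 1.0.
WEIGHTS = {
    "quality": 0.30,
    "clarity": 0.30,
    "significance": 0.20,
    "originality": 0.20,
}


def overall_score(scores: dict) -> float:
    """Weighted mean of per-dimension scores on the 1-6 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        if not 1.0 <= scores[name] <= 6.0:
            raise ValueError(f"{name} must be within 1-6, got {scores[name]}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
```

For example, scores of 5, 4, 4, and 3 give 0.3·5 + 0.3·4 + 0.2·4 + 0.2·3 = 4.1.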
Six-Point Scale (NeurIPS standard)
| Score | Rating | Meaning |
|---|---|---|
| 5.5-6.0 | Strong Accept | Groundbreaking, technically flawless |
| 4.5-5.4 | Accept | Technically solid, high impact |
| 3.5-4.4 | Borderline Accept | Solid but limited evaluation/novelty |
| 2.5-3.4 | Borderline Reject | Merits but weaknesses outweigh |
| 1.5-2.4 | Reject | Technical flaws, insufficient evaluation |
| 1.0-1.4 | Strong Reject | Fundamental errors or known results |
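The table can be applied as a threshold lookup. One assumption is needed: the printed bands leave small gaps (for example between 5.4 and 5.5), so this sketch treats each band's lower bound as inclusive and lets the band extend up to the next one. `rating_for` is a hypothetical helper name.

```python
# (lower bound, rating), highest band first; bounds come from the table.
RATING_BANDS = [
    (5.5, "Strong Accept"),
    (4.5, "Accept"),
    (3.5, "Borderline Accept"),
    (2.5, "Borderline Reject"),
    (1.5, "Reject"),
    (1.0, "Strong Reject"),
]


def rating_for(score: float) -> str:
    """Map a 1.0-6.0 score to its rating label."""
    if not 1.0 <= score <= 6.0:
        raise ValueError(f"score must be within 1.0-6.0, got {score}")
    for lower, label in RATING_BANDS:
        if score >= lower:
            return label
    raise AssertionError("unreachable: 1.0 is the lowest band")
```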
Output Protocol
All issues follow the unified format:
[MODULE] (Line N) [Severity: Critical|Major|Minor] [Priority: P0|P1|P2]: Issue description
Original: ...
Revised: ...
Rationale: ...
- Severity: Critical (must fix), Major (should fix), Minor (nice to fix)
- Priority: P0 (blocking), P1 (important), P2 (low priority)
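Because every issue header follows one fixed pattern, it can be parsed back into structured data, for example to sort a merged report. This is a minimal sketch assuming the header is a single line exactly as shown above; `Issue`, `parse_issue`, and `sort_issues` are hypothetical names.

```python
import re
from dataclasses import dataclass

ISSUE_RE = re.compile(
    r"^\[(?P<module>[^\]]+)\] \(Line (?P<line>\d+)\) "
    r"\[Severity: (?P<severity>Critical|Major|Minor)\] "
    r"\[Priority: (?P<priority>P[0-2])\]: (?P<description>.+)$"
)
SEVERITY_ORDER = {"Critical": 0, "Major": 1, "Minor": 2}


@dataclass
class Issue:
    module: str
    line: int
    severity: str
    priority: str
    description: str


def parse_issue(header: str) -> Issue:
    """Parse one issue header line; raise ValueError if it is malformed."""
    match = ISSUE_RE.match(header.strip())
    if match is None:
        raise ValueError(f"not a valid issue header: {header!r}")
    return Issue(
        module=match["module"],
        line=int(match["line"]),
        severity=match["severity"],
        priority=match["priority"],
        description=match["description"],
    )


def sort_issues(issues):
    """Order issues Critical > Major > Minor, then by priority and line."""
    return sorted(
        issues, key=lambda i: (SEVERITY_ORDER[i.severity], i.priority, i.line)
    )
```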
Workflow
When a user requests a paper audit:
- Identify the file: locate the .tex, .typ, or .pdf file
- Determine mode: self-check (default), review, or gate based on user intent
- Run the orchestrator: `python scripts/audit.py <file> --mode <mode>`
- Present the report: show the Markdown report to the user
- Discuss findings: help the user address Critical and Major issues first
- Re-audit if needed: run again after fixes to verify improvements
For `review` mode, supplement the automated report with LLM analysis of:
- Overall paper strengths (what works well)
- Key weaknesses (what reviewers would criticize)
- Questions a reviewer would ask
- Missing related work or baselines
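Mode selection and the orchestrator call can be sketched as follows. The keyword lists are copied from the trigger keywords of each mode. The matching strategy (case-insensitive substring match, longest keyword wins, self-check as fallback) is an assumption made for illustration, and `determine_mode` and `build_command` are hypothetical names.

```python
TRIGGERS = {
    "self-check": ["audit", "check", "self-check", "pre-submission", "score",
                   "review my paper"],
    "review": ["simulate review", "peer review", "reviewer perspective",
               "what would reviewers say"],
    "gate": ["quality gate", "pass/fail", "can i submit", "ready to submit",
             "advisor check"],
    "polish": ["polish", "deep polish", "adversarial review", "refine writing",
               "improve writing", "paragraph polish"],
}


def determine_mode(request: str) -> str:
    """Pick the mode whose longest trigger keyword appears in the request."""
    text = request.lower()
    best_mode, best_len = "self-check", 0  # self-check is the default
    for mode, keywords in TRIGGERS.items():
        for keyword in keywords:
            if keyword in text and len(keyword) > best_len:
                best_mode, best_len = mode, len(keyword)
    return best_mode


def build_command(path: str, mode: str) -> list:
    """Argument list for the orchestrator call in the run step."""
    return ["python", "scripts/audit.py", path, "--mode", mode]
```

Preferring the longest match resolves overlaps such as "advisor check" (gate) containing "check" (self-check). Plain substring matching can still misfire on words that merely contain a keyword, so ambiguous requests are better confirmed with the user.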
Polish Mode Workflow
- Run Python precheck
  `python scripts/audit.py <file> --mode polish [--style A|B|C] [--journal <name>] [--skip-logic]`
  Read `.polish-state/precheck.json` from the paper's directory.
- Check hard blockers
  If `precheck.json["blockers"]` is non-empty, display them and STOP. Say: "Fix these Critical issues before polish can proceed:" + list. Do NOT spawn any agent until the user confirms fixes.
- Handle non-IMRaD structure (if `precheck.json["non_imrad"] == true`)
  Show detected sections, ask user: "Proceed with polish on these sections?"
- Spawn Critic Agent via Task
  Subagent type: `general-purpose`
  Prompt template:

      You are an adversarial academic reviewer.
      Paper: {file_path} | Language: {lang} | Journal: {journal} | Style: {style}
      Step 1: Read the paper using the Read tool (file: {file_path}).
      Step 2: The rule-based precheck found these issues:
      {precheck_issues_summary}
      Step 3: Produce a CRITIC REPORT as valid JSON (no markdown fencing):
      {
        "global_verdict": "ready_to_polish" | "needs_revision_first" | "major_restructure_needed",
        "global_rationale": "2-3 sentences",
        "section_verdicts": [
          {
            "section": "<name>",
            "logic_score": 1-5,
            "expression_score": 1-5,
            "blocks_mentor": false,
            "blocking_reason": "",
            "top_issues": [{"type": "logic|expression|argument", "description": "..."}]
          }
        ],
        "cross_section_issues": ["..."]
      }
      blocks_mentor = true ONLY when logic_score <= 2 or section is structurally absent.

  Save the Critic's JSON output to `.polish-state/critic_report.json` using Bash:
  `python -c "import pathlib; pathlib.Path('.polish-state/critic_report.json').write_text('<critic_json_here>', encoding='utf-8')"`
- Display Critic Dashboard and gate
  Render the Critic report as a markdown table (see dashboard format). Show blocked sections. Ask: "How to proceed? [1] Polish all sections (override blocks) [2] Skip blocked sections, polish the rest [3] Stop and revise blocked sections first". Wait for response.
- Spawn Mentor Agents per section (sequential, one at a time)
  For each approved section in IMRaD order:
  Subagent type: `general-purpose`
  Prompt template:

      You are a writing mentor specializing in academic polish.
      CRITICAL RULES (NEVER VIOLATE):
      - Never modify \cite{}, \ref{}, \label{}, \eqref{} in LaTeX
      - Never modify @cite, #cite(), #ref(), <label> in Typst
      - Never modify math environments: $...$, \begin{equation}..., \begin{align}...
      - Never add/remove citations
      - Mark any domain terminology changes as [TERM CHANGE: confirm?]
      Section: {section_name} (lines {start}-{end})
      Target style: {style} ({style_description from POLISH_GUIDE.md})
      Critic scores: Logic {logic_score}/5, Expression {expression_score}/5
      Critic top issues: {top_issues}
      Pre-check expression issues in this section: {filtered_expression_issues}
      Read lines {start}-{end} of {file_path}: Use Read tool with offset={start-1} and limit={end-start+1}.
      Produce MENTOR REPORT in this format:
      ## Section: {section_name}
      ### Polish Suggestions
      [MENTOR] (Line N) [Severity: Major|Minor] [Priority: P1|P2]: description
      Original: <exact original text>
      Revised: <revised text preserving all LaTeX/Typst commands>
      Rationale: <one sentence>
      ### Section Summary
      <2-3 sentences on overall quality and key improvements>

  After each Mentor completes:
  - Display its output
  - Ask: "Section {name} polish done. Accept and continue to next section?"
  - Wait for confirmation before spawning the next Mentor.
- Final status dashboard (after all sections done): See dashboard format below.
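The gating decisions in the steps above can be sketched in a few functions. Only the keys named in this document (`blockers`, `non_imrad`, `section_verdicts`, `section`, `blocks_mentor`) are assumed to exist; the rest of the `precheck.json` layout is unknown. All function names and the returned action strings are hypothetical.

```python
import json
from pathlib import Path


def load_precheck(paper_dir: str) -> dict:
    """Read .polish-state/precheck.json from the paper's directory."""
    path = Path(paper_dir) / ".polish-state" / "precheck.json"
    return json.loads(path.read_text(encoding="utf-8"))


def next_action(precheck: dict) -> str:
    """Decide what to do after the precheck, before any agent is spawned."""
    if precheck.get("blockers"):
        return "stop_blockers"      # show blockers and stop
    if precheck.get("non_imrad") is True:
        return "confirm_sections"   # ask the user before continuing
    return "spawn_critic"


def save_critic_report(paper_dir: str, report: dict) -> Path:
    """Write the Critic's JSON to .polish-state/critic_report.json."""
    state = Path(paper_dir) / ".polish-state"
    state.mkdir(parents=True, exist_ok=True)
    out = state / "critic_report.json"
    out.write_text(json.dumps(report, ensure_ascii=False, indent=2),
                   encoding="utf-8")
    return out


def sections_to_polish(report: dict, choice: int) -> list:
    """Apply the user's gate choice: 1 = all, 2 = skip blocked, 3 = stop."""
    verdicts = report["section_verdicts"]
    if choice == 1:
        return [v["section"] for v in verdicts]
    if choice == 2:
        return [v["section"] for v in verdicts if not v["blocks_mentor"]]
    return []
```

Serialising with `json.dumps` is a design choice for this sketch: it sidesteps the quoting problems that arise when a JSON string containing quotes or backslashes is pasted into a `python -c` one-liner.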
Polish Status Dashboard Format
Print at end of each phase and at completion:
╭─ 🔴🔵 paper-audit Polish Mode ──────────────────────────╮
│ 📄 File: {filename} | Style: {A/B/C} | Journal: {venue} │
│ ⚔️ Critic: {global_verdict} │
│ │
│ Section │ Logic │ Expr │ Mentor │ Suggestions │
│ abstract │ 4/5 │ 3/5 │ ✅ Done │ 3 │
│ introduction │ 3/5 │ 2/5 │ ✅ Done │ 7 │
│ method │ BLOCK │ 2/5 │ ⏭️ Skipped │ 0 │
│ experiment │ 4/5 │ 4/5 │ ✅ Done │ 2 │
│ conclusion │ 5/5 │ 3/5 │ ✅ Done │ 4 │
│ │
│ 👉 Next: {explicit next-step instruction} │
╰───────────────────────────────────────────────────────────╯
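The table rows of the dashboard can be generated from the Critic report. In this sketch, a blocked section shows `BLOCK` in the Logic column and is marked Skipped, as in the example row for `method`. The `Pending` status for sections not yet processed is an assumption added for illustration, as are the `dashboard_rows` name and the `suggestion_counts` argument (section name mapped to the number of Mentor suggestions).

```python
def dashboard_rows(section_verdicts, suggestion_counts):
    """Build one dashboard row per section from Critic verdicts."""
    rows = []
    for verdict in section_verdicts:
        name = verdict["section"]
        blocked = verdict["blocks_mentor"]
        logic = "BLOCK" if blocked else f"{verdict['logic_score']}/5"
        expr = f"{verdict['expression_score']}/5"
        if blocked:
            status, count = "⏭️ Skipped", 0
        elif name in suggestion_counts:
            status, count = "✅ Done", suggestion_counts[name]
        else:
            status, count = "Pending", 0  # hypothetical intermediate state
        rows.append(
            f"│ {name:<12} │ {logic:<5} │ {expr:<4} │ {status} │ {count} │"
        )
    return rows
```

Column widths here are fixed for simplicity; emoji render at varying widths in terminals, so exact box alignment would need a width-aware formatter.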