citation-audit

Citation Audit


Verify every `\cite{...}` in a paper against three independent layers:
  1. Existence — the cited paper actually exists at the claimed arXiv ID / DOI / venue.
  2. Metadata correctness — author names, year, venue, and title match canonical sources (DBLP, arXiv, ACL Anthology, Nature, OpenReview, etc.).
  3. Context appropriateness — the cited paper actually supports the claim it is being used to support in the manuscript.
This skill is the fourth layer of \aris{}'s evidence-and-claim assurance, complementing experiment-audit (code), result-to-claim (science verdict), and paper-claim-audit (numerical claims). Together they form a bottom-up integrity stack from raw evaluation code to manuscript bibliography.

When to Use This Skill


Run before submission. The right gating point is:
  • After paper-write has produced the LaTeX draft and bib file
  • After paper-claim-audit has verified numerical claims
  • Before the final paper-compile for submission
Do not run this on a half-written draft — most of the work is in cross-checking each `\cite` against context, which is wasted on placeholder text.

What This Skill Catches


The dangerous citation problems are not wildly fake citations — those are easy to spot. The dangerous ones are:
  • Wrong-context citations: real paper, but the cited claim is not what that paper actually establishes (e.g., citing Self-Refine to support "self-feedback produces correlated errors" — Self-Refine actually argues the opposite).
  • Author hallucinations: anonymous-author placeholders that slipped through, missing co-authors, wrong order.
  • Title drift: arXiv v1 vs v3 with different titles silently merged.
  • Venue confusion: arXiv preprint cited but the official venue is now CVPR/ICML/NeurIPS — using the wrong record.
  • Year mismatch: arXiv 2023 preprint with 2024 conference acceptance, year reported inconsistently.
  • Phantom DOIs: DOI looks real but does not resolve.
  • Self-citation drift: your own prior work cited with year off by one.

Constants


  • REVIEWER_MODEL = gpt-5.4 — Used via Codex MCP. Default for cross-model review with web access.
  • CONTEXT_POLICY = fresh — Each audit run uses a new reviewer thread (REVIEWER_BIAS_GUARD). Never codex-reply.
  • WEB_SEARCH = required — The reviewer must perform real web/DBLP/arXiv lookups, not pattern-match from memory.
  • OUTPUT = CITATION_AUDIT.md — Human-readable per-entry verdict report.
  • STATE = CITATION_AUDIT.json — Machine-readable verdict ledger consumable by downstream tools.

Workflow


Step 1: Discover bib file and section files


Locate:
  • references.bib (or paper.bib / similar) under the paper directory
  • All *.tex files containing `\cite{...}` calls (typically sec/ or sections/)
If multiple bib files exist, audit each separately.
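Discovery can be sketched as a small helper; this is illustrative, assuming a Python implementation and a `paper/` directory layout (function name hypothetical):

```python
from pathlib import Path

def discover(paper_dir="paper"):
    """Locate bib files and the .tex files that contain \\cite calls."""
    root = Path(paper_dir)
    bib_files = sorted(root.rglob("*.bib"))
    # Keep only .tex files that actually invoke \cite{...}
    tex_files = sorted(
        p for p in root.rglob("*.tex")
        if "\\cite{" in p.read_text(encoding="utf-8", errors="ignore")
    )
    return bib_files, tex_files
```

If `discover` returns more than one bib file, run the audit once per file, as the step above requires.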

Step 2: Extract all (cite-key, context) pairs


For each `\cite{key1,key2,...}` invocation in the paper:
  • Record the cite key
  • Record the file + line number
  • Record the surrounding sentence (≥ 1 full sentence around the cite, for context check)
Output a flat list of (key, file, line, surrounding_sentence) tuples. Also build the inverse: for each bib entry, the list of all places it is cited.
Save the extracted contexts to paper/.aris/citation-audit/contexts.txt so the reviewer can read it directly. Use the paper-dir-relative path .aris/citation-audit/contexts.txt when recording the file in audited_input_hashes; do not stage under /tmp or other transient locations that the verifier cannot rehash later.
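The extraction above can be sketched as follows. The regex and the one-line context window are simplifications: a real implementation should capture full sentences and handle `\cite` variants more carefully.

```python
import re
from pathlib import Path

# Matches \cite{...} plus variants like \citet / \citep (simplification).
CITE_RE = re.compile(r"\\cite[a-z]*\{([^}]*)\}")

def extract_contexts(tex_path):
    """Yield (key, file, line, surrounding_sentence) tuples for one .tex file."""
    lines = Path(tex_path).read_text(encoding="utf-8").splitlines()
    for lineno, line in enumerate(lines, start=1):
        for match in CITE_RE.finditer(line):
            # Approximate ">= 1 full sentence" with the adjacent source lines.
            context = " ".join(lines[max(0, lineno - 2):lineno + 1]).strip()
            for key in match.group(1).split(","):
                yield key.strip(), str(tex_path), lineno, context

# Usage sketch (assuming a sections/ directory of .tex files):
# pairs = [t for f in Path("sections").glob("*.tex") for t in extract_contexts(f)]
```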

Step 3: Send each entry to fresh cross-model reviewer


For each bib entry, invoke mcp__codex__codex (NOT codex-reply — fresh thread per entry, or batch with explicit per-entry isolation):
mcp__codex__codex:
  model: gpt-5.4
  config: {"model_reasoning_effort": "xhigh"}
  sandbox: read-only
  prompt: |
    You are auditing a bibliographic entry. Use web/DBLP/arXiv search.

    ## Bib entry
    @article{key2024example,
      author = {...}, title = {...}, journal = {...}, year = {...}, ...
    }

    ## Where this entry is cited in the paper
    [paste extracted contexts]

    For this entry, verify:
    1. EXISTENCE: does this paper exist at the claimed arXiv ID / DOI / venue?
       Output: YES / NO / UNCERTAIN, with the verifying URL.
    2. METADATA: are author names, year, venue, title correct?
       For each, output: correct / wrong: should be ... / typo: ...
    3. CONTEXT: for each use, does the cited paper actually support the surrounding claim?
       Output per-use: SUPPORTS / WEAK / WRONG, with one-sentence reasoning.

    VERDICT: KEEP / FIX / REPLACE / REMOVE
    - KEEP: entry is clean, all uses are appropriate
    - FIX: metadata needs correction; uses are appropriate
    - REPLACE: cite is wrong-context, find a different paper that actually supports the claim
    - REMOVE: entry is hallucinated or unsupportable

    Be honest. If you cannot verify online, say UNCERTAIN; do not guess.
Save the response to .aris/traces/citation-audit/<date>_runNN/<key>.md per the review-tracing protocol.

Step 4: Aggregate verdicts


Build CITATION_AUDIT.json following the schema defined in "Submission Artifact Emission" below (the single authoritative schema for this file). Per-entry ledger data goes under details.per_entry, not under a top-level entries field. The top-level verdict is a single overall value (PASS / WARN / FAIL / NOT_APPLICABLE / BLOCKED / ERROR) derived from per-entry verdicts per the decision table in "Submission Artifact Emission"; the top-level summary is a one-line human-readable string.
Concretely, details carries the per-entry ledger:

```json
"details": {
  "total_entries": 29,
  "counts": { "KEEP": 11, "FIX": 14, "REPLACE": 3, "REMOVE": 1 },
  "per_entry": [
    {
      "key": "lu2024aiscientist",
      "verdict": "KEEP",
      "axis_failures": [],
      "uses": [
        {"file": "sections/1.intro.tex", "line": 11, "verdict": "SUPPORTS"},
        {"file": "sections/6.related.tex", "line": 8, "verdict": "SUPPORTS"}
      ]
    },
    {
      "key": "madaan2023selfrefine",
      "verdict": "FIX",
      "axis_failures": ["CONTEXT"],
      "uses": [
        {"file": "sections/2.overview.tex", "line": 42, "verdict": "WRONG",
         "note": "Self-Refine demonstrates iterative improvement, not correlated errors"},
        {"file": "sections/6.related.tex", "line": 13, "verdict": "SUPPORTS"}
      ]
    }
  ]
}
```
See "Submission Artifact Emission" for the full artifact (top-level fields audit_skill, verdict, reason_code, summary, audited_input_hashes, trace_path, thread_id, reviewer_model, reviewer_reasoning, generated_at, details).
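The aggregation itself is mechanical; a minimal sketch, assuming per-entry dicts shaped like the example above (the counts map is derived rather than hand-maintained):

```python
from collections import Counter

def build_details(per_entry):
    """Assemble the details block of CITATION_AUDIT.json from per-entry verdicts."""
    return {
        "total_entries": len(per_entry),
        # Tally KEEP/FIX/REPLACE/REMOVE from the entries themselves.
        "counts": dict(Counter(e["verdict"] for e in per_entry)),
        "per_entry": per_entry,
    }
```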

Step 5: Generate human-readable report


Write CITATION_AUDIT.md:

```markdown
# Citation Audit Report

Date: 2026-04-19  Bib file: references.bib  Total entries: 29

## Summary

| Verdict | Count |
|---------|-------|
| KEEP    | 11    |
| FIX     | 14    |
| REPLACE | 3     |
| REMOVE  | 1     |

## Priority Fixes (CRITICAL — apply before submission)

### REMOVE: hidden2025aiscientistpitfalls

- Author listed as "Anonymous" — actual authors are Luo, Kasirzadeh, Shah
- Title is incomplete
- ACTION: Replace key with luo2025aiscientistpitfalls, update authors and title

### REPLACE-CONTEXT: madaan2023selfrefine in sec/2.overview.tex:42

- Cited to support: "single-model self-refinement can produce correlated errors"
- Self-Refine paper actually demonstrates iterative IMPROVEMENT, not correlated errors
- ACTION: Rewrite the sentence; cite Self-Refine for "self-feedback loop" framing instead

[... continues for each entry ...]

## All-Clean Entries (no action needed)

[list of KEEP keys]
```

Step 6: Apply fixes (interactive)


For each FIX/REPLACE/REMOVE verdict, prompt the user:

```
Fix [key]?
  Change: <description of change>
  Files affected: references.bib + sec/X.tex:Y
[Apply / Skip / Defer]
```

If AUTO_APPLY = true, apply all FIX-level changes (metadata corrections only). REPLACE and REMOVE always require human approval — they involve content changes.

Step 7: Recompile and verify


```bash
latexmk -C && latexmk -pdf -interaction=nonstopmode main.tex
```

Confirm:
  • No new Citation undefined warnings
  • No Reference undefined warnings
  • Page count unchanged or only minimally affected by metadata fixes
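The undefined-citation check can be automated against the build log; a minimal sketch, assuming the log file is named main.log:

```python
import re
from pathlib import Path

def log_is_clean(log_path="main.log"):
    """True when the LaTeX log reports no undefined citations or references."""
    text = Path(log_path).read_text(encoding="utf-8", errors="ignore")
    # LaTeX emits warnings like: "LaTeX Warning: Citation `key' on page N undefined ..."
    return re.search(r"(Citation|Reference) .* undefined", text) is None
```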

Key Rules


  • Fresh reviewer thread per audit run — never reuse prior review context
  • Web access required — the reviewer must do real lookups, not pattern-match from memory
  • Wrong-context > metadata — a real paper used to support a wrong claim is more dangerous than a typo in an author name
  • REPLACE/REMOVE require human approval — never auto-modify content claims
  • Always emit, never block — this skill always writes CITATION_AUDIT.json with a verdict; the decision to block finalization lives in paper-writing Phase 6 + tools/verify_paper_audits.sh, driven by the assurance level. See "Submission Artifact Emission" below.
  • Run once per submission — the audit is wall-clock expensive (web lookups for each entry); not for every save

Comparison with Other Audit Skills


| Skill | What it audits | What it catches |
|-------|----------------|-----------------|
| /experiment-audit | Evaluation code | Fake ground truth, self-normalized scores, phantom results |
| /result-to-claim | Result-to-claim mapping | Claims unsupported by evidence |
| /paper-claim-audit | Numerical claims in manuscript | Number inflation, best-seed cherry-pick, config mismatch |
| /citation-audit | Bibliographic entries | Hallucinated refs, wrong-context citations, metadata errors |

Together: code → result → numerical claim → cited claim. Each layer has cross-family review with no executor in the validator path.

Known Limitations


  • DBLP coverage gap: very recent papers (< 2 weeks) may not yet be in DBLP. Reviewer should fall back to arXiv.
  • Pre-print vs published: when both exist, reviewer should prefer the published venue (ICML 2024 over arXiv 2401.xxxxx) but flag both.
  • Anthology vs OpenReview: NeurIPS/ICLR papers have OpenReview entries before official proceedings; both are valid sources.
  • Multi-author truncation: bib entries with 6+ authors using and others are conventional and not flagged unless the truncation hides a co-author the user explicitly cares about.

Review Tracing


After each mcp__codex__codex reviewer call, save the trace following shared-references/review-tracing.md. Use tools/save_trace.sh or write files directly to .aris/traces/citation-audit/<date>_run<NN>/. Respect the --- trace: parameter (default: full).

Output Contract


  • CITATION_AUDIT.md (human-readable report) at paper root
  • CITATION_AUDIT.json (machine-readable ledger; schema below) at paper root
  • .aris/traces/citation-audit/<date>_runNN/ (per-entry review traces)
  • Optional: applied fixes to references.bib + sec/*.tex (with --apply flag)

Submission Artifact Emission


This skill always writes paper/CITATION_AUDIT.json, regardless of caller or detector outcome. A paper with no .bib file or no `\cite{...}` usage emits verdict NOT_APPLICABLE; silent skip is forbidden. paper-writing Phase 6 and tools/verify_paper_audits.sh both rely on this artifact existing at a predictable path.
The artifact conforms to the schema in shared-references/assurance-contract.md:

```json
{
  "audit_skill":      "citation-audit",
  "verdict":          "PASS | WARN | FAIL | NOT_APPLICABLE | BLOCKED | ERROR",
  "reason_code":      "all_entries_keep | metadata_drift | wrong_context | hallucinated | ...",
  "summary":          "One-line human-readable verdict summary.",
  "audited_input_hashes": {
    "references.bib":             "sha256:...",
    "main.tex":                   "sha256:...",
    "sections/3.related.tex":     "sha256:..."
  },
  "trace_path":       ".aris/traces/citation-audit/<date>_run<NN>/",
  "thread_id":        "<codex mcp thread id>",
  "reviewer_model":   "gpt-5.4",
  "reviewer_reasoning": "xhigh",
  "generated_at":     "<UTC ISO-8601>",
  "details": {
    "total_entries":  <int>,
    "per_entry":      [ { "key": "madaan2023selfrefine",
                          "verdict": "KEEP | FIX | REPLACE | REMOVE",
                          "axis_failures": [ "CONTEXT" | "METADATA" | "EXISTENCE" ],
                          "note": "..." }, ... ]
  }
}
```

audited_input_hashes scope

Hash the declared input set actually passed to this audit: the .bib file, main.tex, and every sections/*.tex file that supplied citation contexts. Do NOT hash extracted contexts from /tmp or other transient paths — if you need to stage extracted contexts, materialize them under paper/.aris/ so the verifier can rehash reproducibly. Do NOT hash repo-wide unions or the reviewer's self-reported opened subset.
Path convention (must match tools/verify_paper_audits.sh): keys are paths relative to the paper directory (no paper/ prefix — the verifier already resolves relative to the paper dir; prefixing produces paper/paper/... and false-fails as STALE). Use absolute paths for any file outside the paper dir.
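Building the map under these conventions can be sketched as follows (function name hypothetical; keys are the paper-dir-relative paths described above):

```python
import hashlib
from pathlib import Path

def hash_inputs(paper_dir, rel_paths):
    """Map each paper-dir-relative path to sha256:<hex> of its bytes."""
    root = Path(paper_dir)
    return {
        # Keys stay relative: no "paper/" prefix, per the verifier's path convention.
        rel: "sha256:" + hashlib.sha256((root / rel).read_bytes()).hexdigest()
        for rel in rel_paths
    }
```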

Verdict decision table


| Input state | Verdict | reason_code example |
|-------------|---------|---------------------|
| No .bib file or no \cite{...} usage | NOT_APPLICABLE | no_citations |
| .bib file referenced but unreadable / missing | BLOCKED | bib_unreadable |
| Every entry KEEP, all three axes green | PASS | all_entries_keep |
| Only FIX verdicts (metadata drift, no context errors) | WARN | metadata_drift |
| Any REPLACE or REMOVE (wrong-context or hallucinated entry) | FAIL | wrong_context |
| Web lookups timed out / reviewer invocation failed | ERROR | reviewer_error |
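The table rows driven by per-entry verdicts reduce to a small precedence function. A sketch covering only those rows — BLOCKED and ERROR come from input-state and invocation checks, and real reason_codes may be finer-grained (e.g. hallucinated for a REMOVE):

```python
def overall_verdict(per_entry):
    """Derive (verdict, reason_code) from per-entry KEEP/FIX/REPLACE/REMOVE verdicts."""
    if not per_entry:
        return "NOT_APPLICABLE", "no_citations"
    verdicts = {e["verdict"] for e in per_entry}
    # Any wrong-context or hallucinated entry dominates.
    if verdicts & {"REPLACE", "REMOVE"}:
        return "FAIL", "wrong_context"
    # Metadata-only problems downgrade to a warning.
    if "FIX" in verdicts:
        return "WARN", "metadata_drift"
    return "PASS", "all_entries_keep"
```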

Thread independence


Every invocation uses a fresh mcp__codex__codex thread. Never codex-reply. Do not accept prior audit outputs (PROOF_AUDIT, PAPER_CLAIM_AUDIT, EXPERIMENT_LOG) as input — the fresh thread preserves reviewer independence per shared-references/reviewer-independence.md.
This skill never blocks by itself; paper-writing Phase 6 plus the verifier decide whether the verdict blocks finalization based on the assurance level.

See Also


  • /paper-claim-audit — sibling skill for numerical claim verification
  • /experiment-audit — sibling skill for evaluation code integrity
  • /result-to-claim — claim verdict assignment from results
  • shared-references/citation-discipline.md — protocol document for citation hygiene
  • shared-references/reviewer-independence.md — cross-model review constraints