citation-verifier

Citation Verifier

Generate citations/ref.bib and ensure every entry has a traceable verification record in citations/verified.jsonl.
When network access is restricted, prefer a “record now, verify later” workflow: keep URLs/titles consistent and leave a clear verification note.

Input

  • papers/paper_notes.jsonl

Outputs

  • citations/ref.bib
  • citations/verified.jsonl

Workflow (heuristic)

  1. Collect bibkey, title, url, year, authors from papers/paper_notes.jsonl.
  2. Write/refresh citations/ref.bib:
    • Prefer arXiv-style fields when arxiv_id / primary_category exist (eprint, archivePrefix, primaryClass).
  3. Write one verification record per BibTeX entry to citations/verified.jsonl with at least:
    • bibkey, title, url, date
  4. If you cannot verify via network, record a clear notes field (e.g., “auto-generated; needs manual verification”) and/or request human confirmation depending on your policy.
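
The steps above can be sketched in Python as follows. This is a minimal illustration, not the script's actual implementation: the helper names (bibtex_entry, verification_record), the @article entry type, the assumption that authors is a list of strings, and the use of today's ISO date for the date field are all hypothetical choices made for the example.

```python
import json
from datetime import date


def bibtex_entry(note: dict, entry_type: str = "article") -> str:
    """Render one paper note as a BibTeX entry (hypothetical layout)."""
    fields = {
        "title": note.get("title", ""),
        "author": " and ".join(note.get("authors", [])),  # assumes a list of names
        "year": str(note.get("year", "")),
        "url": note.get("url", ""),
    }
    # Prefer arXiv-style fields when arxiv_id / primary_category exist.
    if note.get("arxiv_id"):
        fields["eprint"] = note["arxiv_id"]
        fields["archivePrefix"] = "arXiv"
    if note.get("primary_category"):
        fields["primaryClass"] = note["primary_category"]
    # Skip empty fields so the entry stays compact.
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items() if v)
    return f"@{entry_type}{{{note['bibkey']},\n{body}\n}}"


def verification_record(note: dict, notes: str = "") -> dict:
    """Build one verified.jsonl record with the minimum fields plus notes."""
    return {
        "bibkey": note["bibkey"],
        "title": note.get("title", ""),
        "url": note.get("url", ""),
        "date": date.today().isoformat(),  # assumption: date the record was written
        "notes": notes,
    }


# Illustrative note; the values are made up for the example.
sample = {
    "bibkey": "doe2024example",
    "title": "An Example Paper",
    "authors": ["Jane Doe", "John Roe"],
    "year": 2024,
    "url": "https://example.org/paper",
    "arxiv_id": "0000.00000",
    "primary_category": "cs.CL",
}
print(bibtex_entry(sample))
print(json.dumps(verification_record(sample, notes="auto-generated; needs manual verification")))
```

Writing each record with json.dumps on its own line keeps citations/verified.jsonl appendable, so one record per BibTeX entry can be checked line by line.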

Quality checklist

  • Every BibTeX entry has a corresponding verified.jsonl record.
  • No missing url / date / title in verification records.
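
Both checklist items can be checked mechanically. The sketch below is one possible way to do it, assuming BibTeX keys can be pulled out with a simple regular expression and that the records have already been parsed from JSON lines; check_citations is a hypothetical helper, not part of the script.

```python
import re

REQUIRED = ("url", "date", "title")


def check_citations(bib_text: str, records: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means the checklist passes."""
    problems = []
    # Naive key extraction: "@type{key," at the start of each entry.
    bibkeys = set(re.findall(r"@\w+\{([^,\s]+),", bib_text))
    recorded = {r.get("bibkey") for r in records}
    for key in sorted(bibkeys - recorded):
        problems.append(f"{key}: no verified.jsonl record")
    for r in records:
        # Treat absent, null, and whitespace-only values as missing.
        missing = [f for f in REQUIRED if not str(r.get(f) or "").strip()]
        if missing:
            problems.append(f"{r.get('bibkey', '?')}: missing {', '.join(missing)}")
    return problems
```

In practice, bib_text would come from citations/ref.bib and records from json.loads applied to each non-empty line of citations/verified.jsonl.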

Offline Mode

When network access is restricted, run in offline mode to produce auditable records now, then verify later.
  • Generate offline records: run with --offline; each record is written with verification_status: offline_generated
  • Verify later (when network is available): run with --verify-only

verification_status

  • offline_generated: record was generated without network verification (needs later verification)
  • verified_online: URL/title verified successfully by the script
  • verify_failed: verification was attempted but failed (network error or title mismatch)
  • needs_manual_verification: missing/ambiguous fields (e.g., empty url / title)
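
The four statuses can be read as a small decision procedure. The following sketch is an assumption about how they might be assigned, not the script's logic: classify_status and its parameters are hypothetical, titles are compared after lowercasing and collapsing whitespace, and missing fields are checked before the offline flag.

```python
from typing import Optional


def _norm(title: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(title.lower().split())


def classify_status(record: dict, offline: bool, fetched_title: Optional[str]) -> str:
    """Pick a verification_status for one record.

    fetched_title is the title found at record["url"], or None when the
    request failed (or was never made).
    """
    if not record.get("url") or not record.get("title"):
        return "needs_manual_verification"  # missing/ambiguous fields
    if offline:
        return "offline_generated"  # no network verification attempted
    if fetched_title is None:
        return "verify_failed"  # network error
    if _norm(fetched_title) == _norm(record["title"]):
        return "verified_online"
    return "verify_failed"  # title mismatch
```

Checking for empty fields first means an incomplete record is flagged for manual review even in offline mode; a different policy could reasonably reverse that order.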

Script

Quick Start

  • python .codex/skills/citation-verifier/scripts/run.py --help
  • Offline (record now, verify later):
    python .codex/skills/citation-verifier/scripts/run.py --workspace <workspace_dir> --offline

All Options

  • --offline: do not attempt network verification; write verification_status=offline_generated
  • --verify-only: verify existing citations/verified.jsonl records (does not rewrite BibTeX)
  • --verification-note <text>: stored in the notes field of citations/verified.jsonl records
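
For readers who want to mirror this interface in their own tooling, the flags could be declared with argparse roughly as below. This is a hypothetical sketch: build_parser is an invented name, the help strings paraphrase the list above, and treating --workspace as required is an assumption; the real run.py may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags (sketch only; not the script's source)."""
    parser = argparse.ArgumentParser(prog="run.py")
    parser.add_argument("--workspace", required=True,
                        help="workspace directory containing papers/ and citations/")
    parser.add_argument("--offline", action="store_true",
                        help="skip network verification; write verification_status=offline_generated")
    parser.add_argument("--verify-only", action="store_true",
                        help="verify existing citations/verified.jsonl records; do not rewrite BibTeX")
    parser.add_argument("--verification-note", default="", metavar="<text>",
                        help="text stored in the notes field of each record")
    return parser


args = build_parser().parse_args(
    ["--workspace", "ws", "--offline", "--verification-note", "needs manual verification"]
)
print(args.offline, args.verify_only, args.verification_note)
```

argparse converts the dashes in --verify-only and --verification-note to underscores, so the values are read as args.verify_only and args.verification_note.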

Examples

  • Generate BibTeX + offline verification records:
    • python .codex/skills/citation-verifier/scripts/run.py --workspace <ws> --offline --verification-note "auto-generated; needs manual verification"
  • Later, verify-only (when network is available):
    • python .codex/skills/citation-verifier/scripts/run.py --workspace <ws> --verify-only

Notes

  • Minimal requirement for every verification record: url, date, title.
  • The script sanitizes stray/unbalanced {} in titles to keep bibtex parsing robust.
  • The script escapes LaTeX special chars in text fields (& % $ # _) and rewrites superscript patterns like X^N or X$^N$ as X\textsuperscript{N} to keep LaTeX builds stable.
  • URLs are kept raw in BibTeX url fields (BibTeX styles wrap them with \url{...}); @misc uses howpublished=\url{...}.
  • In offline mode, records are not truly verified; treat offline_generated as a to-do for human/network verification.
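
The brace and escaping rules above can be illustrated with two small functions. This is a sketch under stated assumptions rather than the script's code: drop_unbalanced_braces and escape_latex are hypothetical names, "sanitizing" is interpreted here as deleting braces that have no partner, and the superscript pattern only covers a single word character followed by ^N or $^N$.

```python
import re

# X^N or X$^N$, where X is one word character and N is a run of word characters.
_SUPERSCRIPT = re.compile(r"(\w)(?:\$\^(\w+)\$|\^(\w+))")
# LaTeX specials that are not already preceded by a backslash.
_SPECIALS = re.compile(r"(?<!\\)([&%$#_])")


def drop_unbalanced_braces(title: str) -> str:
    """Remove '{' and '}' characters that have no matching partner."""
    keep = [True] * len(title)
    open_positions = []
    for i, ch in enumerate(title):
        if ch == "{":
            open_positions.append(i)
        elif ch == "}":
            if open_positions:
                open_positions.pop()  # matched pair, keep both
            else:
                keep[i] = False  # closing brace with no opener
    for i in open_positions:
        keep[i] = False  # opener that was never closed
    return "".join(ch for ch, k in zip(title, keep) if k)


def escape_latex(text: str) -> str:
    """Rewrite superscripts, then escape & % $ # _ for LaTeX."""
    # Superscripts go first so the "$" in X$^N$ is consumed before escaping.
    text = _SUPERSCRIPT.sub(
        lambda m: f"{m.group(1)}\\textsuperscript{{{m.group(2) or m.group(3)}}}", text
    )
    return _SPECIALS.sub(r"\\\1", text)


print(drop_unbalanced_braces("A {B}} C {"))
print(escape_latex("O(n^2) & 50%"))
```

The order of the two passes in escape_latex matters: escaping first would turn X$^N$ into X\$^N\$ and the superscript pattern would no longer match it.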

Troubleshooting

Common Issues

Issue: Missing bibkey / missing url in notes

Symptom:
  • citations/ref.bib is missing entries, or verified.jsonl has empty url/title.
Causes:
  • papers/paper_notes.jsonl lacks bibkey / url fields.
Solutions:
  • Ensure each core paper note has a stable bibkey and a canonical url.
  • Rerun citation generation after fixing notes.
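
To locate the offending notes before rerunning, a short scan of the notes file is usually enough. The sketch below assumes one JSON object per line, as the .jsonl extension suggests; notes_missing_fields is a hypothetical helper and the sample lines are made up.

```python
import json


def notes_missing_fields(lines, required=("bibkey", "url")):
    """Yield (line_number, missing_fields) for notes lacking required fields."""
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        note = json.loads(line)
        # Treat absent, null, and whitespace-only values as missing.
        missing = [f for f in required if not str(note.get(f) or "").strip()]
        if missing:
            yield lineno, missing


# In-memory sample; in practice, pass an open handle on papers/paper_notes.jsonl.
sample_lines = [
    '{"bibkey": "doe2024example", "url": "https://example.org/paper"}',
    '{"bibkey": "", "url": "https://example.org/other"}',
    '{"title": "No key and no URL"}',
]
for lineno, missing in notes_missing_fields(sample_lines):
    print(f"line {lineno}: missing {', '.join(missing)}")
```

Because a file handle iterates line by line, the same function works on open("papers/paper_notes.jsonl", encoding="utf-8") without loading the whole file.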

Issue: verification_status=offline_generated

Symptom:
  • Records exist but are not truly verified.
Causes:
  • --offline was used, or network verification was unavailable.
Solutions:
  • When network is available, run --verify-only to upgrade records.
  • Or manually verify and update citations/verified.jsonl with notes.

Recovery Checklist

  • Every BibTeX entry has a matching citations/verified.jsonl record.
  • Verification records include url, date, title.