verifier

Verifier

You help researchers verify claims and quotes in their manuscripts against source materials. Given a draft manuscript and source documents, you systematically confirm that quoted text and attributed claims actually appear in the sources.

Project Integration

This skill reads from `project.yaml` when available:

```yaml
# From project.yaml
type: qualitative  # or quantitative, mixed
paths:
  # For qualitative
  transcripts: data/raw/
  # For quantitative
  raw_data: data/raw/
  scripts: scripts/analysis/
```


**Project type:** This skill works for **all project types**:
- **Qualitative**: Verifies participant quotes against transcripts
- **Quantitative**: Verifies statistical claims against data/scripts
- **Mixed**: Handles both verification types

Updates `progress.yaml` when complete:
```yaml
status:
  verification: done
artifacts:
  verification_report: verification/verification-report.md
```

File Management

This skill uses git to track progress across phases. Before modifying any output file at a new phase:

1. Stage and commit current state: `git add [files] && git commit -m "verifier: Phase N complete"`
2. Then proceed with modifications.

Do NOT create version-suffixed copies (e.g., `-v2`, `-final`, `-working`). The git history serves as the version trail.

What This Skill Does

This is a verification skill that catches errors before they become problems:

1. Extract all direct quotes and verifiable claims from a manuscript
2. Map each item to its purported source (interview, article, document)
3. Verify each item is present in the source using efficient search
4. Escalate to deep reading (haiku agent) when fast search fails
5. Report verification status with issues flagged for review

When to Use This Skill

Use this skill when you have:

- A manuscript with quotes attributed to interview participants or literature
- Source materials available (transcripts, PDFs, or document folder)
- A need to confirm accuracy before submission

Common scenarios:

- Final check before journal submission
- After revisions that moved or edited quotes
- When using quotes from secondary sources
- Quality assurance for interview-based findings sections

Source Types Supported

Interview Transcripts

- Participant quotes with pseudonyms
- Claims about what participants said/did/felt
- Aggregate claims ("Most participants...")
- Paraphrased attributions

Literature Sources

- Direct quotes from cited works
- Paraphrased claims with citations
- Data or statistics attributed to sources
- Theoretical claims linked to specific authors

Verification Levels

| Level | What's Checked | Tolerance |
|---|---|---|
| Exact quote | Verbatim match in source | Must match character-for-character (allowing minor punctuation variance) |
| Near quote | Quote with editorial changes | Marked edits ([...], [sic]) should reflect actual source |
| Paraphrase | Claimed meaning present | Source must support the paraphrased claim |
| Aggregate claim | Pattern across sources | Multiple sources should support the claim |
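
The tolerance rules for the first two levels can be sketched in Python. This is a minimal sketch under stated assumptions, not the skill's own implementation: the helper names `normalize`, `is_exact_match`, and `is_near_match` are hypothetical, and "minor punctuation variance" is interpreted here as ignoring punctuation and whitespace differences while keeping letter case.

```python
import re
import string

# Assumption: "minor punctuation variance" means punctuation may be ignored.
# Curly quotes and the ellipsis character are added to ASCII punctuation.
_PUNCT = string.punctuation + "\u2018\u2019\u201c\u201d\u2026"
_STRIP = str.maketrans("", "", _PUNCT)


def normalize(text: str) -> str:
    """Drop punctuation and collapse whitespace; letter case is preserved."""
    return " ".join(text.translate(_STRIP).split())


def is_exact_match(quote: str, source: str) -> bool:
    """Exact-quote level: the whole quote appears in the source after normalization."""
    return normalize(quote) in normalize(source)


def is_near_match(quote: str, source: str) -> bool:
    """Near-quote level: the segments around [...] or [sic] appear in order."""
    haystack = normalize(source)
    position = 0
    for segment in re.split(r"\[\.\.\.\]|\[sic\]", quote):
        needle = normalize(segment)
        if not needle:
            continue
        found = haystack.find(needle, position)
        if found == -1:
            return False
        position = found + len(needle)
    return True
```

Paraphrase and aggregate claims are not covered by this sketch, since they depend on meaning rather than on string matching.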

Workflow Phases

Phase 0: Intake

Goal: Understand the manuscript and source materials.

Process:

1. Read the manuscript (or specified sections)
2. Identify source type: interviews, literature, or mixed
3. Locate source materials (folder path, Zotero collection, or file list)
4. Confirm verification scope (all quotes, specific sections, etc.)
5. Count approximate items to verify

Output: Appends a `## Scope Summary` section to `verification-report.md`.

Pause: User confirms scope and source locations.

Phase 1: Extraction

Goal: Extract all verifiable items from the manuscript.

Process:

1. Identify direct quotes (text in quotation marks with attribution)
2. Identify paraphrased claims with source attribution
3. Identify aggregate claims about participants or literature
4. For each item, extract:
   - The quote or claim text
   - The attributed source (participant name, author/year)
   - Location in manuscript (section, approximate position)
   - Verification level (exact, near, paraphrase, aggregate)
5. Create extraction database

Output: Appends a `## Verification Items` section to `verification-report.md`.

Pause: User reviews extracted items. Can mark items to skip.
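
The extraction of direct quotes might look roughly like the following. It is only a sketch: the `VerificationItem` record and `extract_quotes` function are hypothetical names, and the regular expression assumes the manuscript writes quotes as `"quoted text" (Attribution)` on a single line. Paraphrases and aggregate claims need a reader's judgment and are not handled here.

```python
import re
from dataclasses import dataclass


@dataclass
class VerificationItem:
    text: str    # the quote or claim text
    source: str  # attributed source (participant name or author/year)
    line: int    # 1-based manuscript line, used as the approximate location
    level: str   # "exact", "near", "paraphrase", or "aggregate"


# Assumed manuscript convention: "quoted text" (Attribution), straight or curly quotes.
QUOTE_PATTERN = re.compile(r'["\u201c]([^"\u201c\u201d]+)["\u201d]\s*\(([^)]+)\)')


def extract_quotes(manuscript: str) -> list[VerificationItem]:
    """Collect every attributed direct quote with its location and level."""
    items = []
    for line_number, line in enumerate(manuscript.splitlines(), start=1):
        for match in QUOTE_PATTERN.finditer(line):
            quote, source = match.group(1).strip(), match.group(2).strip()
            # Marked edits ([...] or [sic]) make this a near quote, not an exact one.
            level = "near" if re.search(r"\[(\.\.\.|sic)\]", quote) else "exact"
            items.append(VerificationItem(quote, source, line_number, level))
    return items
```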

Phase 2: Source Mapping

Goal: Map each item to its specific source document.

Process:

1. For interview quotes: Match participant pseudonym to transcript file
2. For literature: Match citation to PDF/document or Zotero item
3. Flag unmapped items:
   - Participant not found in transcript list
   - Citation not found in source materials
   - Ambiguous source references
4. Create source-to-item mapping

Output: Updates the `## Verification Items` section in `verification-report.md` with source mappings.

Pause: User resolves unmapped items.
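
For interview quotes, the pseudonym-to-file step could be sketched as below. The function name `map_participants` is hypothetical, and the sketch assumes each transcript is a Markdown file named after the pseudonym (for example `maria.md`); other naming schemes would need a different lookup.

```python
from pathlib import Path


def map_participants(
    pseudonyms: list[str], transcript_dir: Path
) -> tuple[dict[str, Path], list[str]]:
    """Return (mapping, unmapped). Assumes files are named <pseudonym>.md."""
    by_stem: dict[str, list[Path]] = {}
    for path in sorted(transcript_dir.glob("*.md")):
        by_stem.setdefault(path.stem.lower(), []).append(path)

    mapping: dict[str, Path] = {}
    unmapped: list[str] = []
    for name in pseudonyms:
        candidates = by_stem.get(name.strip().lower(), [])
        if len(candidates) == 1:
            mapping[name] = candidates[0]
        else:
            # No candidate: participant not found. Several: ambiguous reference.
            unmapped.append(name)
    return mapping, unmapped
```

Anything returned in `unmapped` corresponds to the flagged items that the user resolves at the pause.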

Phase 3: Verification

Goal: Systematically verify each item against its source.

Process: For each item:

1. Fast search (Grep tool):
   - For exact quotes: Search for distinctive phrase (8-15 words)
   - For paraphrases: Search for key terms that must appear
   - If found: Mark VERIFIED with source location
2. If not found, fuzzy search:
   - Try variations (punctuation, spacing, common OCR errors)
   - Search for partial matches (beginning/end of quote)
   - If found: Mark VERIFIED with notes on variation
3. If still not found, deep reading (haiku agent):
   - Spawn haiku agent with source document and search target
   - Agent reads document looking for semantic match
   - Agent returns: FOUND (with location), NOT FOUND, or PARTIAL MATCH
4. Record result:
   - VERIFIED: Exact or acceptable match found
   - PARTIAL: Quote/claim partially matches source
   - NOT FOUND: Could not locate in purported source
   - NEEDS REVIEW: Ambiguous or requires human judgment

Verification strategies by type:

| Type | Fast Search | Deep Read Trigger |
|---|---|---|
| Exact quote | Full phrase grep | No match after fuzzy |
| Near quote | Core phrase grep | Partial match only |
| Paraphrase | Key terms grep | Terms found but context unclear |
| Aggregate | Count matching instances | Need pattern confirmation |

Output: Appends a `## Verification Results` section to `verification-report.md`.

Pause: After each batch of ~20 items to show progress.
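
The escalation order in this phase (fast search, then fuzzy search, then deep reading) can be illustrated with a short Python sketch. Several things here are assumptions rather than part of the skill: the fuzzy step uses `difflib.SequenceMatcher` over sliding word windows instead of grep variations, the 0.9 threshold is an arbitrary starting value, and `deep_read` is a hypothetical callback standing in for the haiku agent. The sketch never returns NEEDS REVIEW, which is left to human judgment.

```python
from difflib import SequenceMatcher
from typing import Callable

# Hypothetical callback standing in for the haiku agent; it is expected to
# return "FOUND", "PARTIAL MATCH", or "NOT FOUND".
DeepRead = Callable[[str, str], str]


def best_window_ratio(quote: str, source: str) -> float:
    """Highest similarity between the quote and any window of the same word count."""
    quote_words, source_words = quote.split(), source.split()
    size = len(quote_words)
    best = 0.0
    for start in range(max(1, len(source_words) - size + 1)):
        window = " ".join(source_words[start:start + size])
        best = max(best, SequenceMatcher(None, quote, window).ratio())
    return best


def verify_item(quote: str, source: str, deep_read: DeepRead, threshold: float = 0.9) -> str:
    # Step 1: fast search, a literal substring test (like a fixed-string grep).
    if quote in source:
        return "VERIFIED"
    # Step 2: fuzzy search; the threshold is an assumed, tunable value.
    if best_window_ratio(quote, source) >= threshold:
        return "VERIFIED"
    # Step 3: escalate to deep reading only after both cheaper steps fail.
    outcome = deep_read(quote, source)
    return {"FOUND": "VERIFIED", "PARTIAL MATCH": "PARTIAL"}.get(outcome, "NOT FOUND")
```

Keeping the agent behind a callback means the expensive step only runs for items that the two cheaper checks could not settle.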

Phase 4: Report

Goal: Complete the verification report with full accounting and recommendations.

Process:

1. Summarize verification results:
   - Total items verified
   - Items by status (verified, partial, not found, needs review)
   - Breakdown by source type
2. Detail issues:
   - NOT FOUND items with context and recommendations
   - PARTIAL matches with specific discrepancies
   - NEEDS REVIEW items with decision prompts
3. Provide fix recommendations:
   - Quote corrections with source text
   - Missing attribution suggestions
   - Items to remove or rewrite

Output: Appends a `## Verification Report` section to `verification-report.md`, completing the document.
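
The summary counts at the top of the report could be produced along these lines. This is an illustrative sketch only: the `summarize` function and the `status` / `source_type` dictionary keys are assumed names, not a format defined by the skill.

```python
from collections import Counter

STATUSES = ["VERIFIED", "PARTIAL", "NOT FOUND", "NEEDS REVIEW"]


def summarize(results: list[dict]) -> str:
    """Build a Markdown summary from dicts with 'status' and 'source_type' keys."""
    by_status = Counter(r["status"] for r in results)
    by_type = Counter(r["source_type"] for r in results)
    lines = ["## Verification Report", "", f"Total items verified: {len(results)}", ""]
    # List every status, including those with zero items, for a full accounting.
    lines += [f"- {status}: {by_status.get(status, 0)}" for status in STATUSES]
    lines += ["", "By source type:"]
    lines += [f"- {kind}: {count}" for kind, count in sorted(by_type.items())]
    return "\n".join(lines)
```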

Verification Search Strategy

Fast Search (Grep)

For a quote like:

"I didn't really think about it until my kids started asking questions" (Maria)

Search strategy:

1. Primary: "didn't really think about it until my kids"
2. Fallback: "think about it until" AND "kids" AND "questions"
3. Fuzzy: "did not really think" OR "didnt really think"
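
The three tiers above map onto plain string tests. The following Python sketch mirrors them for the Maria example; `contraction_variants` and `tiered_search` are hypothetical helper names, and only the `n't` contraction is expanded, which is a simplifying assumption.

```python
import re
from typing import Optional


def contraction_variants(phrase: str) -> list[str]:
    """Spell a phrase three ways: as written, apostrophe dropped, and expanded."""
    variants = [phrase, phrase.replace("'", "")]
    # Only "n't" -> " not" is handled; "can't" and "won't" would need special cases.
    variants.append(re.sub(r"n't\b", " not", phrase))
    return list(dict.fromkeys(variants))  # de-duplicate while keeping order


def tiered_search(source: str, primary: str, all_of: list[str], fuzzy: str) -> Optional[str]:
    """Return the name of the first tier that matches, or None."""
    if primary in source:
        return "primary"
    if all(term in source for term in all_of):
        return "fallback"
    if any(variant in source for variant in contraction_variants(fuzzy)):
        return "fuzzy"
    return None
```

A `fallback` or `fuzzy` hit only shows that similar wording exists, so the matched passage should still be compared against the quote before marking it VERIFIED.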

Deep Reading (Haiku Agent)

When grep fails, spawn an agent:

Task: Verify quote in source
subagent_type: general-purpose
model: haiku
prompt: |
  Read this interview transcript and find if this quote (or close variant) appears.

  QUOTE TO FIND:
  "I didn't really think about it until my kids started asking questions"

  ATTRIBUTED TO: Maria

  TRANSCRIPT:
  [transcript content]

  Return:
  - FOUND: [exact text from transcript] at [location]
  - PARTIAL: [what you found] - differs in [how]
  - NOT FOUND: Quote does not appear in this transcript
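
Because the agent answers in free text, its reply has to be turned back into a status. A minimal parsing sketch follows; `parse_agent_reply` is a hypothetical helper, and treating an unparseable reply as NEEDS REVIEW is an assumption rather than documented behaviour.

```python
import re

# Matches a line such as "FOUND: ..." or "- NOT FOUND: ..." anywhere in the reply.
REPLY_PATTERN = re.compile(
    r"^\s*-?\s*(NOT FOUND|FOUND|PARTIAL)\s*:\s*(.*)$", re.MULTILINE
)


def parse_agent_reply(reply: str) -> tuple[str, str]:
    """Return (status, detail); replies without a recognised label need review."""
    match = REPLY_PATTERN.search(reply)
    if match is None:
        return "NEEDS REVIEW", reply.strip()
    return match.group(1), match.group(2).strip()
```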

Common Issues Caught

| Issue | Detection | Recommendation |
|---|---|---|
| Quote not in transcript | NOT FOUND after deep read | Check attribution or remove quote |
| Quote from wrong participant | Found but different speaker | Correct attribution |
| Quote significantly altered | PARTIAL match | Revise to match source |
| Merged quotes | Parts from multiple places | Split or acknowledge composite |
| Citation to wrong source | Claim not in cited work | Find correct source |
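
The "wrong participant" case can be detected by searching every transcript rather than only the attributed one. The sketch below assumes transcripts are already loaded into a dictionary keyed by pseudonym; the function names and the `WRONG PARTICIPANT` label are hypothetical and not one of the skill's four result statuses.

```python
def find_speakers(quote: str, transcripts: dict[str, str]) -> list[str]:
    """Names of every participant whose transcript contains the quote verbatim."""
    return [name for name, text in transcripts.items() if quote in text]


def check_attribution(quote: str, attributed_to: str, transcripts: dict[str, str]) -> str:
    speakers = find_speakers(quote, transcripts)
    if attributed_to in speakers:
        return "VERIFIED"
    if speakers:
        # Found, but under a different speaker: the attribution needs correcting.
        return f"WRONG PARTICIPANT: found in {', '.join(speakers)}"
    return "NOT FOUND"
```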

File Structure

project/
├── manuscript/
│   └── draft.md                    # Document with quotes/claims
├── sources/
│   ├── interviews/                 # Interview transcripts
│   │   ├── maria.md
│   │   ├── jose.md
│   │   └── ...
│   └── literature/                 # PDF or markdown sources
│       ├── smith-2020.pdf
│       └── ...
├── verification/
│   └── verification-report.md      # Single output: built up across phases; git tracks history

Model Recommendations

| Task | Model | Rationale |
|---|---|---|
| Extraction (Phase 1) | Sonnet | Careful parsing of manuscript |
| Source mapping (Phase 2) | Sonnet | Matching logic |
| Fast search (Phase 3) | Grep tool | No model needed |
| Deep reading (Phase 3) | Haiku | Cost-effective document search |
| Report generation (Phase 4) | Sonnet | Clear synthesis |

Integration with Other Skills

After qual-findings-writer: Run verifier on the findings section to confirm all participant quotes are accurate.

After argument-builder: Verify that literature claims match their sources.

Before peer-reviewer: Clean up verification issues before simulating review.

Before revision-coordinator: Establish quote accuracy baseline before making changes.

Key Reminders

- Exact quotes must be exact: Even small changes should be marked with [...] or [sic]
- Source materials must be accessible: Can't verify against unavailable documents
- Participant names must be consistent: Pseudonym in manuscript must match transcript filename or label
- Deep reading is expensive but thorough: Use haiku agents when grep genuinely fails, not as first resort
- Some items need human judgment: Flag ambiguous cases rather than making calls
- Batch progress: Show verification progress to user in manageable chunks
- Prioritize quotes over paraphrases: Direct quotes are highest risk for errors

Starting the Process

When the user is ready to begin:

1. Ask for the manuscript:
   "Please share the path to your manuscript (or the specific section you want verified)."
2. Identify source type:
   "Are we verifying quotes from interviews, cited literature, or both?"
3. Locate sources:
   "Where are your source materials? I need a folder path for interview transcripts or access to your literature (PDFs, Zotero collection, or document folder)."
4. Confirm scope:
   "Should I verify all quotes and claims, or focus on a specific section (e.g., Findings only)?"
5. Proceed with Phase 0 to assess verification scope.
