sec-filing-evidence-extractor

SEC Filing Evidence Extractor

Extract evidence before drawing conclusions. Favor traceable, section-anchored facts over narrative summaries.

Workflow

  1. Identify filing set and periods.
  2. Prefer at least 8 quarters plus latest annual filing when available.
  3. Extract facts into a normalized evidence table.
  4. Keep every row tied to a filing, period, and location anchor.
  5. Flag uncertain interpretation explicitly instead of forcing classification.
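
Steps 1 and 2 can be sketched in Python as follows. This is a minimal illustration, not part of the skill itself: the Filing class, the select_filing_set function, and the min_quarters parameter are hypothetical names, and counting each 10-Q as one quarter is a simplification (the fourth quarter is normally covered by the 10-K rather than a separate 10-Q).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Filing:
    form: str    # "10-Q" or "10-K" (hypothetical minimal model)
    filed: date  # filing date


def select_filing_set(filings, min_quarters=8):
    """Pick the most recent quarterly filings plus the latest annual filing.

    Returns (selected, missing), where `missing` lists shortfalls so they can
    be reported instead of silently ignored.
    """
    by_recency = sorted(filings, key=lambda f: f.filed, reverse=True)
    quarterly = [f for f in by_recency if f.form == "10-Q"][:min_quarters]
    annual = [f for f in by_recency if f.form == "10-K"][:1]

    missing = []
    if len(quarterly) < min_quarters:
        missing.append(
            f"only {len(quarterly)} of {min_quarters} quarterly filings available"
        )
    if not annual:
        missing.append("no annual filing available")
    return quarterly + annual, missing
```

Returning the shortfalls alongside the selection keeps the "when available" caveat visible: a thin filing set is recorded rather than treated as complete.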

Evidence Table Schema

Use this schema in Markdown or CSV.
  • evidence_id: stable id (for example EV-001)
  • filing: form + date (for example 10-Q 2025-11-03)
  • period: fiscal period covered
  • location: section/footnote anchor
  • topic: revenue, receivables, reserves, CFFO, non-GAAP, acquisitions, etc.
  • fact: concise paraphrase of disclosure
  • numbers: raw numbers with units and sign
  • trend: increase/decrease/flat/volatile/one-off
  • initial_tags: optional candidate tags (EM4, CF2, AA1, etc.)
  • confidence: high|medium|low for extraction accuracy only
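
One way to hold rows of this schema in code is a small dataclass with a CSV writer. The sketch below is an assumption-laden illustration: the EvidenceRow class, the to_csv helper, and the default values for initial_tags and confidence are hypothetical choices, not something the schema prescribes.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields

# Allowed values taken from the schema above.
TRENDS = {"increase", "decrease", "flat", "volatile", "one-off"}
CONFIDENCE = {"high", "medium", "low"}


@dataclass
class EvidenceRow:
    evidence_id: str
    filing: str
    period: str
    location: str
    topic: str
    fact: str
    numbers: str
    trend: str
    initial_tags: str = ""      # optional per the schema; empty default is an assumption
    confidence: str = "medium"  # default is an assumption, not part of the schema

    def __post_init__(self):
        # Reject values outside the schema's enumerations.
        if self.trend not in TRENDS:
            raise ValueError(f"unknown trend: {self.trend!r}")
        if self.confidence not in CONFIDENCE:
            raise ValueError(f"unknown confidence: {self.confidence!r}")


def to_csv(rows):
    """Serialize rows to CSV text with columns in schema order."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(EvidenceRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buf.getvalue()
```

Validating trend and confidence at construction time catches malformed rows before they reach downstream classification.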

Extraction Priorities

Prioritize sections most likely to contain manipulative reporting clues.
  1. Revenue recognition policy and contract assets/liabilities notes
  2. Accounts receivable, allowance, DSO-related disclosures
  3. Capitalized costs, deferred costs, useful lives, impairment
  4. Restructuring, contingencies, reserves, and releases
  5. Statement of cash flows plus supplemental cash disclosures
  6. Non-GAAP reconciliations and KPI definition changes
  7. Acquisition accounting, purchase price allocation, pro forma metrics

Quality Rules

  • Separate disclosure facts from interpretation.
  • Keep quote snippets short; prefer paraphrase with exact numbers.
  • Normalize sign conventions for cash flow and expenses.
  • Capture definition changes in KPIs as separate evidence rows.
  • If wording is ambiguous, add a low extraction confidence row and continue.
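
The sign-normalization rule can be illustrated with a small parser. This is a sketch under stated assumptions: the parse_amount function is hypothetical, it treats both a leading minus and accounting-style parentheses as negative, and it supports only K/M/B suffixes on dollar amounts.

```python
import re

# Multipliers for the unit suffixes this sketch supports (an assumption).
_UNITS = {"K": 1e3, "M": 1e6, "B": 1e9}
_AMOUNT = re.compile(r"\(?(-)?\$?([\d,]+(?:\.\d+)?)\s*([KMB])?\)?")


def parse_amount(text):
    """Convert strings like "$4.2M", "-$22M", or "($22M)" to a signed float.

    Parenthesized amounts and amounts with a leading minus both come back
    negative, so downstream rows share one sign convention.
    """
    s = text.strip()
    m = _AMOUNT.fullmatch(s)
    if m is None:
        raise ValueError(f"unrecognized amount: {text!r}")
    negative = bool(m.group(1)) or (s.startswith("(") and s.endswith(")"))
    value = float(m.group(2).replace(",", "")) * _UNITS.get(m.group(3), 1)
    return -value if negative else value
```

Raising on unrecognized input, rather than guessing, fits the rule about ambiguous wording: the caller can record a low-confidence row and move on.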

Few-Shot Example

The following abbreviated evidence table shows correctly extracted rows from a hypothetical 10-Q. Use this as an anchor for format, granularity, and tone.

| evidence_id | filing | period | location | topic | fact | numbers | trend | initial_tags | confidence |
|---|---|---|---|---|---|---|---|---|---|
| EV-001 | 10-Q 2025-11-03 | Q3 FY2025 | Note 3 – Revenue Recognition | revenue | Company changed from point-in-time to over-time recognition for professional services contracts starting Q3 | n/a | one-off | EM1 | high |
| EV-002 | 10-Q 2025-11-03 | Q3 FY2025 | Balance Sheet, Note 5 | receivables | Gross AR rose 34% YoY while revenue grew 12%; allowance for doubtful accounts was roughly flat at $4.2M (1.8% of AR vs 2.6% prior year) | AR $233M vs $174M; allowance $4.2M vs $4.5M | increase | EM1, EM5 | high |
| EV-003 | 10-Q 2025-11-03 | Q3 FY2025 | Cash Flow Statement | CFFO | CFFO declined to $18M from $47M YoY; $22M increase in AR was the largest working-capital drag | CFFO $18M vs $47M; AR drag -$22M | decrease | CF3 | medium |

Key points illustrated:
  • Each row captures one disclosure fact, not a conclusion.
  • Numbers are exact with units and comparative context.
  • Tags are candidates only; classification happens downstream.
  • The revenue-recognition policy change (EV-001) and the AR buildup (EV-002) are separate rows despite being related, preserving granularity.
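
The percentages quoted in EV-002 can be re-derived from the raw numbers column, which is a useful sanity check before a row is accepted. The snippet below is a worked-arithmetic sketch; the pct helper and variable names are illustrative only.

```python
def pct(part, whole):
    """Return part as a percentage of whole."""
    return part / whole * 100


# Raw figures from EV-002, in $ millions.
ar_now, ar_prior = 233.0, 174.0
allowance_now, allowance_prior = 4.2, 4.5

ar_growth_pct = pct(ar_now - ar_prior, ar_prior)
allowance_ratio_now = pct(allowance_now, ar_now)
allowance_ratio_prior = pct(allowance_prior, ar_prior)

print(f"AR growth YoY: {ar_growth_pct:.0f}%")                  # AR growth YoY: 34%
print(f"Allowance / AR now: {allowance_ratio_now:.1f}%")       # Allowance / AR now: 1.8%
print(f"Allowance / AR prior: {allowance_ratio_prior:.1f}%")   # Allowance / AR prior: 2.6%
```

All three derived figures match the fact column, so the row's paraphrase and its raw numbers are consistent with each other.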

Output Handoff

Return:
  1. Evidence table
  2. Missing-data list (what was unavailable)
  3. Ready-for-classification package keyed by evidence_id
For section anchors and filing hotspots, read references/sec-filing-hotspots.md.
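
The three return items could be bundled as shown in this sketch. The build_handoff function and the dictionary keys evidence_table, missing_data, and classification_package are hypothetical names chosen for illustration; rows are assumed to be plain dictionaries that follow the evidence table schema.

```python
def build_handoff(rows, missing):
    """Bundle the evidence table, missing-data list, and a package keyed by evidence_id."""
    package = {}
    for row in rows:
        eid = row["evidence_id"]
        if eid in package:
            # Ids must be stable and unique for downstream lookups to work.
            raise ValueError(f"duplicate evidence_id: {eid}")
        # Store every field except the key itself under that key.
        package[eid] = {k: v for k, v in row.items() if k != "evidence_id"}
    return {
        "evidence_table": list(rows),
        "missing_data": list(missing),
        "classification_package": package,
    }
```

Keying the package by evidence_id lets the classification step cite any row directly, and the duplicate check guards the "stable id" requirement.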