tooluniverse-structural-variant-analysis

Structural Variant Analysis Workflow

Systematic analysis of structural variants (deletions, duplications, inversions, translocations, complex rearrangements) for clinical genomics interpretation using ACMG-adapted criteria.

KEY PRINCIPLES:

Report-first approach - Create SV_analysis_report.md FIRST, then populate progressively
ACMG-style classification - Pathogenic/Likely Pathogenic/VUS/Likely Benign/Benign with explicit evidence
Evidence grading - Grade all findings by confidence level (★★★/★★☆/★☆☆)
Dosage sensitivity critical - Gene dosage effects drive SV pathogenicity
Breakpoint precision matters - Exact gene disruption vs dosage-only effects
Population context essential - gnomAD SVs for frequency assessment
English-first queries - Always use English terms in tool calls (gene names, disease names), even if the user writes in another language. Only try original-language terms as a fallback. Respond in the user's language

Problem This Skill Solves

Structural variants (SVs) present unique interpretation challenges:

Complex molecular consequences - SVs can cause gene dosage changes, gene disruption, gene fusions, position effects
Size matters - Pathogenicity depends on size, gene content, and breakpoint precision
Limited databases - Fewer curated SVs in ClinVar compared to SNVs
Dosage sensitivity - Haploinsufficiency and triplosensitivity are critical but gene-specific
Population frequency - Large benign CNVs are common; distinguishing pathogenic from benign is challenging

This skill provides: A systematic workflow integrating SV classification, gene content analysis, dosage sensitivity assessment, population frequencies, and ACMG-adapted criteria into clinically actionable interpretations.

Triggers

Use this skill when users:

Ask about structural variant interpretation
Have CNV data from array or sequencing
Ask "is this deletion/duplication pathogenic?"
Need ACMG classification for SVs
Want to assess gene dosage effects
Ask about chromosomal rearrangements
Have large-scale genomic alterations requiring interpretation

Workflow Overview

┌─────────────────────────────────────────────────────────────────┐
│              STRUCTURAL VARIANT INTERPRETATION                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Phase 1: SV IDENTITY & CLASSIFICATION                          │
│  ├── Normalize SV coordinates (hg19/hg38)                       │
│  ├── Determine SV type (DEL/DUP/INV/TRA/CPX)                   │
│  ├── Calculate SV size                                          │
│  └── Assess breakpoint precision                                │
│                                                                  │
│  Phase 2: GENE CONTENT ANALYSIS                                  │
│  ├── Identify genes fully contained in SV                       │
│  ├── Identify genes with breakpoints (disrupted)                │
│  ├── Annotate gene function and disease associations            │
│  ├── Identify regulatory elements affected                      │
│  └── Assess gene orientation (for inversions/translocations)    │
│                                                                  │
│  Phase 3: DOSAGE SENSITIVITY ASSESSMENT                          │
│  ├── ClinGen dosage sensitivity scores                          │
│  │   └─ Haploinsufficiency / Triplosensitivity ratings          │
│  ├── DECIPHER haploinsufficiency predictions                    │
│  ├── pLI scores (gnomAD) for loss-of-function intolerance       │
│  ├── OMIM gene-disease associations (dominant/recessive)        │
│  └── Known dosage-sensitive genes from literature               │
│                                                                  │
│  Phase 4: POPULATION FREQUENCY CONTEXT                           │
│  ├── gnomAD SV database (overlapping SVs)                       │
│  ├── DGV (Database of Genomic Variants)                         │
│  ├── ClinVar (known pathogenic/benign SVs)                      │
│  └── Calculate reciprocal overlap with population SVs           │
│                                                                  │
│  Phase 5: PATHOGENICITY SCORING                                  │
│  ├── Pathogenicity score (0-10 scale)                           │
│  │   ├─ Gene content weight (40%)                               │
│  │   ├─ Dosage sensitivity weight (30%)                         │
│  │   ├─ Population frequency weight (20%)                       │
│  │   └─ Inheritance/phenotype match weight (10%)                │
│  ├── Apply ACMG SV criteria                                     │
│  └── Generate classification recommendation                      │
│                                                                  │
│  Phase 6: LITERATURE & CLINICAL EVIDENCE                         │
│  ├── PubMed: Similar SVs, gene disruption studies               │
│  ├── DECIPHER: Developmental disorder cases                     │
│  ├── Clinical case reports                                      │
│  └── Functional evidence for gene dosage effects                │
│                                                                  │
│  Phase 7: ACMG-ADAPTED CLASSIFICATION                            │
│  ├── Apply SV-specific evidence codes                           │
│  ├── Calculate final classification                             │
│  ├── Identify limiting factors                                  │
│  └── Generate clinical recommendations                          │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
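The Phase 5 weighting in the overview can be read as a weighted sum. The snippet below is a minimal sketch of that arithmetic, assuming each component has already been scored on a 0-10 scale; the component names and the `pathogenicity_score` helper are hypothetical and are not part of ToolUniverse.

```python
# Weights come from the Phase 5 outline; the component keys are illustrative names.
WEIGHTS = {
    "gene_content": 0.40,
    "dosage_sensitivity": 0.30,
    "population_frequency": 0.20,
    "phenotype_match": 0.10,
}


def pathogenicity_score(components):
    """Combine 0-10 component scores into a single weighted 0-10 score."""
    missing = set(WEIGHTS) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= components[name] <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    total = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return round(total, 2)


example = pathogenicity_score({
    "gene_content": 8,
    "dosage_sensitivity": 10,
    "population_frequency": 5,
    "phenotype_match": 0,
})
```

The score is only a triage aid under these assumptions; the classification itself still comes from the ACMG-adapted evidence codes in Phase 7.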

Phase Details

Phase 1: SV Identity & Classification

Goal: Standardize SV notation and classify type

SV Types:

Type	Abbreviation	Description	Molecular Effect
Deletion	DEL	Loss of genomic segment	Haploinsufficiency, gene disruption
Duplication	DUP	Gain of genomic segment	Triplosensitivity, gene dosage imbalance
Inversion	INV	Segment flipped in orientation	Gene disruption at breakpoints, position effects
Translocation	TRA	Segment moved to different chromosome	Gene fusions, disruption, position effects
Complex	CPX	Multiple rearrangement types	Variable effects

Key Information to Capture:

Chromosome(s) involved
Coordinates (start, end) in hg19/hg38
SV size (bp or Mb)
SV type (DEL/DUP/INV/TRA/CPX)
Breakpoint precision (±50bp, ±1kb, etc.)
Inheritance pattern (de novo, inherited, unknown)

Example:

SV: arr[GRCh38] 17q21.31(44039927-44352659)x1
- Type: Deletion (heterozygous)
- Size: 313 kb
- Genes: MAPT, KANSL1 (fully contained)
- Breakpoints: Well-defined (array resolution ±5kb)
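The fields listed above can be captured in a small record type. The following is a sketch only: the regular expression handles the simple array notation used in this example rather than full ISCN, and `SVRecord` and `parse_array_sv` are hypothetical names. It assumes 1-based inclusive coordinates and a diploid baseline of two copies, which does not hold for sex chromosomes in males.

```python
import re
from dataclasses import dataclass

# Matches strings such as: arr[GRCh38] 17q21.31(44039927-44352659)x1
_ARRAY_SV = re.compile(
    r"arr\[(?P<build>[^\]]+)\]\s*(?P<band>\S+?)"
    r"\((?P<start>\d+)[-_](?P<end>\d+)\)x(?P<copies>\d+)"
)


@dataclass
class SVRecord:
    build: str
    band: str
    start: int
    end: int
    copies: int

    @property
    def chrom(self):
        # Leading chromosome token of the cytoband, e.g. "17" from "17q21.31"
        return re.match(r"(\d+|[XY])", self.band).group(1)

    @property
    def sv_type(self):
        # Assumes two copies is the normal state
        if self.copies < 2:
            return "DEL"
        if self.copies > 2:
            return "DUP"
        return "NORMAL"

    @property
    def size_bp(self):
        # Assumes 1-based inclusive coordinates
        return self.end - self.start + 1


def parse_array_sv(text):
    """Parse a simple array-style SV description into an SVRecord."""
    match = _ARRAY_SV.search(text)
    if match is None:
        raise ValueError(f"unrecognized SV notation: {text!r}")
    return SVRecord(
        build=match["build"],
        band=match["band"],
        start=int(match["start"]),
        end=int(match["end"]),
        copies=int(match["copies"]),
    )


sv = parse_array_sv("arr[GRCh38] 17q21.31(44039927-44352659)x1")
size_kb = round(sv.size_bp / 1000)  # 313 for the example above
```

Inversions, translocations, and complex rearrangements need breakpoint pairs rather than a single interval, so they are out of scope for this sketch.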

Phase 2: Gene Content Analysis

Goal: Comprehensive annotation of genes affected by SV

Tools:

Tool	Purpose	Key Data
`Ensembl_lookup_gene`	Gene structure, coordinates	Gene boundaries, exons, transcripts
`NCBI_gene_search`	Gene information	Official symbol, aliases, description
`Gene_Ontology_get_term_info`	Gene function	Biological process, molecular function
`OMIM_search` , `OMIM_get_entry`	Disease associations	Inheritance, clinical features
`DisGeNET_search_gene`	Gene-disease associations	Evidence scores

Gene Categories:

Fully contained genes - Entire gene within SV boundaries
- Deletion: Complete loss of one copy (haploinsufficiency)
- Duplication: Extra copy (triplosensitivity)
Partially disrupted genes - Breakpoint within gene
- Likely loss-of-function for affected allele
- Check if critical domains disrupted
Flanking genes - Within 1 Mb of breakpoints
- May be affected by position effects
- Regulatory disruption possible

Example Gene Content Analysis:

```python
def analyze_gene_content(tu, genes_in_region, sv_start, sv_end, flank_bp=1_000_000):
    """
    Classify and annotate genes relative to an SV.

    genes_in_region: list of dicts with 'symbol', 'start', 'end' for genes on
    the SV's chromosome. How this list is retrieved (for example, an Ensembl
    region lookup) depends on the available tools, so the caller supplies it.
    """
    genes = {
        'fully_contained': [],
        'partially_disrupted': [],
        'flanking': []
    }

    for gene in genes_in_region:
        gene_start = gene['start']
        gene_end = gene['end']

        # Classify gene relationship to SV
        if gene_start >= sv_start and gene_end <= sv_end:
            category = 'fully_contained'
        elif gene_start <= sv_end and gene_end >= sv_start:
            # Any other overlap: at least one breakpoint lies within the gene
            category = 'partially_disrupted'
        elif max(gene_start - sv_end, sv_start - gene_end) < flank_bp:
            # No overlap, but within flank_bp (default 1 Mb) of a breakpoint
            category = 'flanking'
        else:
            continue  # Too far from the SV to report

        genes[category].append(annotate_gene(tu, gene['symbol']))

    return genes


def annotate_gene(tu, gene_symbol):
    """
    Comprehensive gene annotation.
    """
    # OMIM associations
    omim = tu.tools.OMIM_search(
        operation="search",
        query=gene_symbol,
        limit=5
    )

    # DisGeNET associations
    disgenet = tu.tools.DisGeNET_search_gene(
        operation="search_gene",
        gene=gene_symbol,
        limit=10
    )

    # NCBI gene record (provides the gene ID needed for Gene Ontology lookups)
    ncbi = tu.tools.NCBI_gene_search(
        term=gene_symbol,
        organism="human"
    )

    return {
        'symbol': gene_symbol,
        'omim': omim,
        'disgenet': disgenet,
        'ncbi': ncbi
    }
```

Report Section:

2.1 Fully Contained Genes (Complete Dosage Effect)

Gene	Function	Disease Association	Inheritance	Evidence
MAPT	Microtubule-associated protein tau	Frontotemporal dementia (AD)	Autosomal Dominant	★★★
KANSL1	Histone acetyltransferase complex	Koolen-De Vries syndrome (AD)	Autosomal Dominant	★★★

Interpretation: The deletion removes one copy of both genes. KANSL1 haploinsufficiency is the primary driver of pathogenicity.

Sources: OMIM, DisGeNET, Ensembl

2.2 Partially Disrupted Genes (Breakpoint Within Gene)

Gene	Breakpoint Location	Effect	Critical Domains Lost
NF1	Intron 28 of 58	5' portion deleted	Yes - GTPase-activating domain

Interpretation: Breakpoint disrupts NF1 coding sequence, likely resulting in loss-of-function. NF1 is haploinsufficient (causes neurofibromatosis type 1).

2.3 Flanking Genes (Potential Position Effects)

Gene	Distance from SV	Regulatory Risk	Evidence
KCNJ2	450 kb upstream	Low	★☆☆

Note: Position effects are possible but less common. Consider if phenotype unexplained by contained genes.

---


Phase 3: Dosage Sensitivity Assessment

Goal: Determine if affected genes are dosage-sensitive

Tools:

Tool	Purpose	Key Data
`ClinGen_search_dosage_sensitivity`	Gold standard curation	HI/TS scores (0-3)
`ClinGen_search_gene_validity`	Gene-disease validity	Definitive/Strong/Moderate
`gnomad_search` (pLI)	Loss-of-function intolerance	pLI score (0-1)
`DECIPHER_search`	Developmental disorders	Patient phenotypes with similar SVs
`OMIM_get_entry`	Inheritance pattern	AD inheritance can suggest dosage sensitivity; AR usually does not

ClinGen Dosage Sensitivity Scores:

Score	Haploinsufficiency (HI)	Triplosensitivity (TS)	Interpretation
3	Sufficient evidence	Sufficient evidence	Gene IS dosage-sensitive
2	Emerging evidence	Emerging evidence	Likely dosage-sensitive
1	Little evidence	Little evidence	Insufficient evidence
0	No evidence	No evidence	No established dosage sensitivity

pLI Score Interpretation (gnomAD):

pLI Range	Interpretation	LoF Intolerance
≥0.9	Extremely intolerant	High - likely haploinsufficient
0.5-0.9	Moderately intolerant	Moderate
<0.5	Tolerant	Low - likely NOT haploinsufficient
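The two lookup tables above translate directly into small helpers. This is a sketch under stated assumptions: the function names are hypothetical, a pLI of exactly 0.9 is placed in the "extremely intolerant" bin, and only the 0-3 ClinGen scores shown in the table are recognized.

```python
# Labels copied from the ClinGen dosage sensitivity table above
CLINGEN_SCORE_LABELS = {
    3: "Sufficient evidence",
    2: "Emerging evidence",
    1: "Little evidence",
    0: "No evidence",
}


def interpret_clingen_score(score):
    """Map a ClinGen HI/TS score (int or numeric string) to its label."""
    try:
        return CLINGEN_SCORE_LABELS.get(int(score), "Unrecognized score")
    except (TypeError, ValueError):
        return "Unrecognized score"


def interpret_pli(pli):
    """Bin a gnomAD pLI value using the thresholds in the table above."""
    if not 0.0 <= pli <= 1.0:
        raise ValueError("pLI must be between 0 and 1")
    if pli >= 0.9:
        return "Extremely intolerant"
    if pli >= 0.5:
        return "Moderately intolerant"
    return "Tolerant"
```

pLI is a statistical constraint measure, so it works best as supporting evidence alongside a curated ClinGen score rather than as a replacement for one.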

Implementation:

```python
def _first_record(result):
    """Return the first entry of a tool result's 'data' list, or {} if empty."""
    data = result.get('data') if isinstance(result, dict) else None
    return data[0] if isinstance(data, list) and data else {}


def _score(value):
    """Normalize a ClinGen score to a string such as '3' (None if missing)."""
    return None if value is None else str(value).strip()


def assess_dosage_sensitivity(tu, gene_list):
    """
    Assess dosage sensitivity for all genes in SV.
    Returns dosage scores and interpretation.
    """
    dosage_data = []

    for gene_symbol in gene_list:
        # 1. ClinGen dosage sensitivity (gold standard); first record only
        clingen = _first_record(
            tu.tools.ClinGen_search_dosage_sensitivity(gene=gene_symbol)
        )
        hi_score = _score(clingen.get('Haploinsufficiency Score'))
        ts_score = _score(clingen.get('Triplosensitivity Score'))

        # 2. ClinGen gene validity (supports dosage sensitivity)
        validity = _first_record(
            tu.tools.ClinGen_search_gene_validity(gene=gene_symbol)
        )
        validity_level = validity.get('Classification')

        # 3. pLI score from gnomAD is not retrieved here; it may need
        #    myvariant or another tool, depending on what is available.

        # 4. OMIM entries for the inheritance pattern
        omim = tu.tools.OMIM_search(
            operation="search",
            query=gene_symbol,
            limit=3
        )
        omim_entries = []
        for entry in (omim.get('data') or {}).get('entries') or []:
            omim_entries.append(tu.tools.OMIM_get_entry(
                operation="get_entry",
                mim_number=str(entry.get('mimNumber'))
            ))
        # Parsing inheritance depends on the OMIM response layout, so it is
        # left unset here and the raw entries are returned for review.
        inheritance_pattern = None

        # Integrate evidence
        dosage_data.append({
            'gene': gene_symbol,
            'hi_score': hi_score,
            'ts_score': ts_score,
            'validity_level': validity_level,
            'inheritance': inheritance_pattern,
            'omim_entries': omim_entries,
            'is_dosage_sensitive': (hi_score == '3' or ts_score == '3'),
            'evidence_grade': calculate_evidence_grade(hi_score, ts_score, validity_level)
        })

    return dosage_data


def calculate_evidence_grade(hi_score, ts_score, validity):
    """
    Calculate evidence grade for dosage sensitivity.
    Scores are expected as strings ('0'-'3') or None.
    """
    if (hi_score == '3' or ts_score == '3') and validity == 'Definitive':
        return '★★★'  # High confidence
    elif hi_score in ('2', '3') or ts_score in ('2', '3'):
        return '★★☆'  # Moderate confidence
    else:
        return '★☆☆'  # Low confidence
```

Report Section:

3. Dosage Sensitivity Assessment

Haploinsufficient Genes (Deletions/Disruptions)

Gene	ClinGen HI Score	pLI	Validity	Disease	Evidence
KANSL1	3 (Sufficient)	0.99	Definitive	Koolen-De Vries syndrome	★★★
MAPT	2 (Emerging)	0.85	Strong	FTD (rare)	★★☆

Interpretation: KANSL1 has definitive evidence for haploinsufficiency. Deletion of one copy is expected to cause Koolen-De Vries syndrome (intellectual disability, hypotonia, distinctive facial features).

Sources: ClinGen Dosage Sensitivity Map, gnomAD pLI

Triplosensitive Genes (Duplications)

Gene	ClinGen TS Score	Disease Mechanism	Evidence
MECP2	3 (Sufficient)	MECP2 duplication syndrome	★★★
PMP22	3 (Sufficient)	Charcot-Marie-Tooth 1A	★★★

Note: For this deletion, triplosensitivity is not applicable. Listed for reference.

Non-Dosage-Sensitive Genes

Gene	HI Score	TS Score	Interpretation
GENE_X	0	0	No established dosage sensitivity
GENE_Y	1	1	Insufficient evidence

Interpretation: These genes lack evidence for dosage sensitivity. Deletion/duplication less likely to be pathogenic solely due to these genes.

---

Phase 4: Population Frequency Context

Goal: Determine if SV is common in general population (likely benign) or rare (supports pathogenicity)

Tools:

Tool	Purpose	Key Data
`gnomad_search`	Population SV frequencies	Overlapping SVs, frequencies
`ClinVar_search_variants`	Known pathogenic/benign SVs	Classification, review status
`DECIPHER_search`	Patient SVs with phenotypes	Case reports, phenotype similarity

Frequency Interpretation (adapted from ACMG):

SV Frequency	ACMG Code	Interpretation
≥1% in gnomAD SVs	BA1 (Stand-alone Benign)	Too common for rare disease
0.1-1%	BS1 (Strong Benign)	Likely benign common variant
<0.01%	PM2 (Supporting Pathogenic)	Rare, supports pathogenicity
Absent	PM2 (Supporting)	Very rare, supports pathogenicity
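The thresholds in this table can be applied mechanically once a population frequency is available. The helper below is a sketch, not an established API: `frequency_to_acmg` is a hypothetical name, frequencies are passed as fractions (0.01 means 1%), and because the table assigns no code between 0.01% and 0.1%, that range returns no code here.

```python
def frequency_to_acmg(allele_frequency):
    """
    Map a population SV frequency (fraction, or None if absent) to the
    code and interpretation given in the table above.
    """
    if allele_frequency is None or allele_frequency == 0:
        return ("PM2", "Absent from population data, supports pathogenicity")
    if not 0 < allele_frequency <= 1:
        raise ValueError("allele_frequency must be a fraction between 0 and 1")
    if allele_frequency >= 0.01:
        return ("BA1", "Too common for rare disease")
    if allele_frequency >= 0.001:
        return ("BS1", "Likely benign common variant")
    if allele_frequency < 0.0001:
        return ("PM2", "Rare, supports pathogenicity")
    # 0.01% to 0.1%: the table does not assign a code to this range
    return (None, "Frequency alone is not informative")
```

Only frequencies from population SVs that pass the reciprocal overlap threshold should be fed into a check like this; a partially overlapping common CNV says little about the query SV.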

Reciprocal Overlap Calculation:

For proper comparison, calculate reciprocal overlap between query SV and population SV:

Reciprocal Overlap = min(overlap_with_A, overlap_with_B)
where:
  overlap_with_A = (overlap length) / (SV_A length)
  overlap_with_B = (overlap length) / (SV_B length)

Threshold: ≥70% reciprocal overlap = "same" SV
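The formula above can be written as a short function. This sketch assumes both SVs are on the same chromosome, are of the same type, and use coordinates where length is `end - start`; `reciprocal_overlap` and `is_same_sv` are illustrative names.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Return the reciprocal overlap (0.0 to 1.0) between intervals A and B."""
    if a_end <= a_start or b_end <= b_start:
        raise ValueError("each interval must have end > start")
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0  # Intervals do not overlap
    overlap_with_a = overlap / (a_end - a_start)
    overlap_with_b = overlap / (b_end - b_start)
    return min(overlap_with_a, overlap_with_b)


def is_same_sv(a_start, a_end, b_start, b_end, threshold=0.70):
    """Apply the 70% reciprocal overlap threshold from the text above."""
    return reciprocal_overlap(a_start, a_end, b_start, b_end) >= threshold


# A 1,000 bp query against an 800 bp population SV nested inside it
nested = reciprocal_overlap(0, 1000, 100, 900)
```

Taking the minimum of the two fractions is what keeps a small benign CNV nested inside a large deletion from being counted as the same event: the small SV is fully covered, but the large one is not.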

Implementation:

```python
def assess_population_frequency(tu, chrom, sv_start, sv_end, sv_type):
    """
    Check ClinVar and DECIPHER for SVs overlapping the query region.
    gnomAD SV frequencies are not retrieved here (see step 2).
    """
    # 1. Check ClinVar for known pathogenic/benign SVs
    clinvar = tu.tools.ClinVar_search_variants(
        chromosome=str(chrom),
        start=sv_start,
        stop=sv_end,
        variant_type=sv_type.upper()
    )

    known_svs = []
    for variant in clinvar.get('data') or []:
        known_svs.append({
            'database': 'ClinVar',
            'classification': variant.get('clinical_significance'),
            'review_status': variant.get('review_status'),
            'coordinates': f"{variant.get('chromosome')}:{variant.get('start')}-{variant.get('stop')}"
        })

    # 2. gnomAD SVs (if available)
    # Note: gnomAD SV database may not have direct API access via ToolUniverse
    # May need to use genomic coordinate search

    # 3. DECIPHER for similar patient cases
    decipher_search = tu.tools.DECIPHER_search(
        query=f"chr{chrom}:{sv_start}-{sv_end}",
        search_type="region"
    )
    patient_cases = decipher_search.get('data') or []

    return {
        'clinvar_matches': known_svs,
        'decipher_cases': patient_cases,
        'frequency_interpretation': interpret_frequency(known_svs)
    }


def interpret_frequency(known_svs):
    """
    Interpret ClinVar matches. This does not use population frequency,
    so gnomAD SV data should still be checked before finalizing a code.
    """
    # Compare case-insensitively and tolerate missing classifications
    labels = {(sv.get('classification') or '').strip().lower() for sv in known_svs}

    if 'benign' in labels:
        return {
            'acmg_code': 'BA1 or BS1',
            'interpretation': 'Likely benign based on ClinVar benign classification',
            'evidence_grade': '★★★'
        }
    elif 'pathogenic' in labels:
        return {
            'acmg_code': 'PS1',
            'interpretation': 'Pathogenic based on ClinVar pathogenic classification',
            'evidence_grade': '★★★'
        }
    else:
        return {
            'acmg_code': 'PM2',
            'interpretation': 'No benign or pathogenic ClinVar match; confirm rarity in gnomAD SV before applying PM2',
            'evidence_grade': '★★☆'
        }
```

Report Section:

4. Population Frequency Context

ClinVar Matches (Overlapping SVs)

VCV ID	Classification	Size	Overlap	Review Status	Genes
VCV000012345	Pathogenic	320 kb	95% reciprocal	★★★ Reviewed by expert panel	KANSL1, MAPT

Match Found: Query deletion has 95% reciprocal overlap with known pathogenic deletion in ClinVar (VCV000012345). This is the Koolen-De Vries syndrome deletion.

ACMG Code: PS1 (Strong) - Same genomic region as established pathogenic SV

Source: ClinVar via `ClinVar_search_variants`

gnomAD SV Database

Search Result: No overlapping deletions found in gnomAD SV v4.0 (>10,000 genomes)

Interpretation: Absence from gnomAD supports rarity and pathogenic potential.

ACMG Code: PM2 (Moderate) - Absent from population databases

Note: gnomAD SVs queried via browser (no direct API access)

DECIPHER Patient Cases

Case ID	Phenotype	SV Type	Size	Overlap	Similarity
12345	Intellectual disability, hypotonia	DEL	315 kb	98%	High
67890	Developmental delay, facial dysmorphism	DEL	305 kb	92%	High

Phenotype Match: 8/10 DECIPHER patients have intellectual disability and hypotonia, consistent with Koolen-De Vries syndrome.

ACMG Support: PP4 (Supporting) - Patient phenotype consistent with gene's disease association

Source: DECIPHER via `DECIPHER_search`

---

Phase 5: Pathogenicity Scoring

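Goal: Quantitative pathogenicity assessment on a 0-10 scale. Each component is scored in raw points (100 in total) and then scaled so that the four components contribute at most 4, 3, 2, and 1 points respectively.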

Scoring Components:

Gene Content (40 points max):
- 10 points per dosage-sensitive gene (HI/TS score 3)
- 5 points per likely dosage-sensitive gene (score 2)
- 2 points per gene with disease association
- Cap at 40 points
Dosage Sensitivity Evidence (30 points max):
- 30 points: Multiple genes with definitive HI/TS (score 3)
- 20 points: One gene with definitive HI/TS
- 10 points: Genes with emerging evidence (score 2)
- 5 points: Predicted haploinsufficiency (pLI >0.9)
Population Frequency (20 points max):
- 20 points: Absent from gnomAD, DGV
- 10 points: Rare (<0.01%)
- 5 points: Uncommon (0.01-0.1%)
- 0 points: 0.1-1%
- -20 points: Very common (>1%) - likely benign
Clinical Evidence (10 points max):
- 10 points: Matching ClinVar pathogenic SV
- 8 points: DECIPHER cases with matching phenotype
- 5 points: Literature support for gene dosage effects
- 3 points: Phenotype consistent with genes

Pathogenicity Score Interpretation:

Score	Classification	Confidence	Interpretation
9-10	Pathogenic	★★★	High confidence pathogenic
7-8	Likely Pathogenic	★★☆	Strong evidence for pathogenicity
4-6	VUS	★☆☆	Uncertain significance
2-3	Likely Benign	★★☆	Strong evidence for benign
0-1	Benign	★★★	High confidence benign

Implementation:

python

def calculate_pathogenicity_score(gene_content, dosage_data, frequency_data, clinical_data):
    """
    Calculate comprehensive pathogenicity score (0-10 scale).

    Raw points (40 + 30 + 20 + 10) are scaled to 4 + 3 + 2 + 1.
    """
    breakdown = {}
    # Index dosage records by gene symbol; scores are compared as strings
    dosage_by_gene = {d['gene']: str(d.get('hi_score')) for d in dosage_data}

    # 1. Gene content scoring (40 points max)
    gene_score = 0
    for gene in gene_content['fully_contained'] + gene_content['partially_disrupted']:
        hi_score = dosage_by_gene.get(gene['symbol'])
        if hi_score == '3':
            gene_score += 10
        elif hi_score == '2':
            gene_score += 5
        elif gene.get('omim_disease'):
            # Disease-associated genes count even without a dosage record
            gene_score += 2

    gene_score = min(gene_score, 40)  # Cap at 40
    breakdown['gene_content'] = gene_score / 40 * 4  # Scale to 0-4

    # 2. Dosage sensitivity scoring (30 points max)
    # Note: the pLI-based tier (5 points) is not implemented here
    definitive_genes = sum(1 for s in dosage_by_gene.values() if s == '3')
    emerging_genes = sum(1 for s in dosage_by_gene.values() if s == '2')

    if definitive_genes >= 2:
        dosage_score = 30
    elif definitive_genes == 1:
        dosage_score = 20
    elif emerging_genes >= 1:
        dosage_score = 10
    else:
        dosage_score = 0
    breakdown['dosage_sensitivity'] = dosage_score / 30 * 3  # Scale to 0-3

    # 3. Population frequency scoring (20 points max)
    frequency = frequency_data.get('frequency')
    if frequency is None or frequency == 0:
        freq_score = 20  # Absent
    elif frequency < 0.0001:
        freq_score = 10  # Rare (<0.01%)
    elif frequency < 0.001:
        freq_score = 5  # Uncommon (0.01-0.1%)
    elif frequency > 0.01:
        freq_score = -20  # Very common (>1%) - likely benign
    else:
        freq_score = 0  # 0.1-1%

    breakdown['population_frequency'] = freq_score / 20 * 2  # Scale to -2 to 2

    # 4. Clinical evidence scoring (10 points max, highest applicable tier)
    if clinical_data.get('clinvar_pathogenic'):
        clinical_score = 10
    elif clinical_data.get('decipher_matching_phenotype'):
        clinical_score = 8
    elif clinical_data.get('literature_support'):
        clinical_score = 5
    elif clinical_data.get('phenotype_consistent'):
        clinical_score = 3
    else:
        clinical_score = 0
    breakdown['clinical_evidence'] = clinical_score / 10 * 1  # Scale to 0-1

    # Total score, clamped to the 0-10 range
    total_score = max(0, min(10, sum(breakdown.values())))

    return {
        'total_score': round(total_score, 1),
        'breakdown': breakdown,
        'classification': classify_score(total_score)
    }

def classify_score(score):
    """Map score to ACMG-style classification."""
    if score >= 9:
        return 'Pathogenic'
    elif score >= 7:
        return 'Likely Pathogenic'
    elif score >= 4:
        return 'VUS'
    elif score >= 2:
        return 'Likely Benign'
    else:
        return 'Benign'
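
As a quick arithmetic check of the scaling described above, the following minimal sketch converts raw component points into the 0-10 score. The helper name `scale_raw_points` and the `COMPONENT_MAX` table are illustrative assumptions, not part of the skill; only the 40/30/20/10 and 4/3/2/1 weights come from the rubric.

```python
# Weights follow the rubric: 40/30/20/10 raw points map to 4/3/2/1.
COMPONENT_MAX = {
    "gene_content": (40, 4),
    "dosage_sensitivity": (30, 3),
    "population_frequency": (20, 2),
    "clinical_evidence": (10, 1),
}


def scale_raw_points(raw_points):
    """Scale raw component points to the 0-10 score (hypothetical helper)."""
    scaled = {}
    for name, (raw_max, weight) in COMPONENT_MAX.items():
        # Cap each component at its raw maximum before scaling
        capped = min(raw_points.get(name, 0), raw_max)
        scaled[name] = capped / raw_max * weight
    # Clamp the total to the 0-10 range, as the implementation above does
    total = max(0.0, min(10.0, sum(scaled.values())))
    return scaled, round(total, 1)


# One definitive HI gene (10 gene-content points, 20 dosage points),
# absent from population databases (20), ClinVar pathogenic match (10).
scaled, total = scale_raw_points({
    "gene_content": 10,
    "dosage_sensitivity": 20,
    "population_frequency": 20,
    "clinical_evidence": 10,
})
print(scaled, total)
```

Under these assumptions a single definitive HI gene with full frequency and clinical support totals 6.0, which sits in the VUS band of the interpretation table. The score is therefore better read as a triage aid alongside the ACMG evidence codes than as a classification on its own.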

Report Section:


5. Pathogenicity Scoring

Quantitative Assessment (0-10 Scale)

Component	Points	Max	Contribution	Rationale
Gene Content	4.0	4	40%	KANSL1 (HI score 3), MAPT (HI score 2)
Dosage Sensitivity	2.5	3	25%	One definitive HI gene (KANSL1)
Population Frequency	2.0	2	20%	Absent from gnomAD SVs
Clinical Evidence	1.0	1	10%	ClinVar pathogenic match
Total Score	9.5	10	95%

Classification: Pathogenic (★★★ High Confidence)

Interpretation: Score of 9.5/10 indicates high confidence pathogenic SV. Deletion encompasses established haploinsufficient gene (KANSL1), absent from population databases, and matches known pathogenic ClinVar variant.

Score Breakdown Visualization

Gene Content:        ████████████████████████████████████████ 4.0/4
Dosage Sensitivity:  █████████████████████████████████░░░░░░░ 2.5/3
Population Freq:     ████████████████████████████████████████ 2.0/2
Clinical Evidence:   ████████████████████████████████████████ 1.0/1
                     ─────────────────────────────────────────
Total:               ██████████████████████████████████████░░ 9.5/10

Key Drivers of Pathogenicity:

KANSL1 haploinsufficiency (definitive evidence)
Exact match to known pathogenic deletion
Absence from population databases
Phenotype consistency with Koolen-De Vries syndrome

---

Phase 6: Literature & Clinical Evidence

Goal: Find case reports, functional studies, and clinical validation

Tools:

Tool	Purpose	Coverage
`PubMed_search`	Peer-reviewed literature	Comprehensive
`DECIPHER_search`	Patient case database	Developmental disorders
`EuropePMC_search`	European literature	Additional coverage

Search Strategies:

python

def comprehensive_literature_search(tu, genes, sv_type, phenotype):
    """
    Search literature for SV evidence.
    """
    # 1. Gene-specific searches
    literature = []
    for gene in genes:
        # Dosage sensitivity literature
        dosage_papers = tu.tools.PubMed_search(
            query=f'"{gene}" AND (haploinsufficiency OR dosage sensitivity OR deletion syndrome)',
            max_results=20
        )

        # Case reports
        case_papers = tu.tools.PubMed_search(
            query=f'"{gene}" AND deletion AND {phenotype}',
            max_results=15
        )

        literature.append({
            'gene': gene,
            'dosage_papers': dosage_papers,
            'case_reports': case_papers
        })

    # 2. SV-specific searches
    sv_papers = None  # Stays None for SV types without a dedicated query
    if sv_type == 'DEL':
        sv_papers = tu.tools.PubMed_search(
            query=f'deletion AND {" AND ".join(genes[:3])} AND syndrome',
            max_results=25
        )

    # 3. DECIPHER cases
    decipher_cases = []
    for gene in genes:
        cases = tu.tools.DECIPHER_search(
            query=gene,
            search_type="gene"
        )
        decipher_cases.append(cases)

    return {
        'gene_literature': literature,
        'sv_literature': sv_papers,
        'decipher_cases': decipher_cases
    }
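
The query strings used above can be assembled separately from the tool calls, which makes them easier to inspect. The sketch below is one way to do that, assuming a hypothetical helper `build_pubmed_queries`; it quotes multi-word terms so that a phenotype such as "intellectual disability" is searched as a phrase. It makes no network calls and does not depend on ToolUniverse.

```python
def build_pubmed_queries(gene, sv_type, phenotype):
    """Build PubMed query strings for one gene (hypothetical helper)."""
    # Map SV type codes to search terms; unknown types fall back to a generic term
    sv_terms = {"DEL": "deletion", "DUP": "duplication"}
    sv_term = sv_terms.get(sv_type.upper(), "structural variant")

    def quote(term):
        # Quote multi-word terms so they are searched as a phrase
        term = term.strip()
        return f'"{term}"' if " " in term else term

    dosage_query = (
        f"{quote(gene)} AND (haploinsufficiency OR "
        f'"dosage sensitivity" OR "{sv_term} syndrome")'
    )
    case_query = f"{quote(gene)} AND {quote(sv_term)} AND {quote(phenotype)}"
    return {"dosage": dosage_query, "case_reports": case_query}


queries = build_pubmed_queries("KANSL1", "DEL", "intellectual disability")
print(queries["case_reports"])
# KANSL1 AND deletion AND "intellectual disability"
```

Keeping query construction in a pure function lets the same strings be passed to `PubMed_search` or `EuropePMC_search` and logged in the report for reproducibility.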

Report Section:


6. Literature & Clinical Evidence

Key Publications

Study	Finding	Evidence Type	PMID
Koolen et al., 2006	Described 17q21.31 microdeletion syndrome	Original description	16222315
Koolen et al., 2008	KANSL1 haploinsufficiency confirmed	Functional validation	18394581
Zollino et al., 2012	Phenotype characterization (n=52)	Clinical series	22736773

Key Findings:

17q21.31 deletion is recurrent (mediated by LCRs)
KANSL1 haploinsufficiency is primary mechanism
Phenotype: ID (100%), hypotonia (95%), friendly demeanor (85%)
Penetrance: >95% for developmental features

Source: PubMed via `PubMed_search`

DECIPHER Patient Cases (n=45)

Phenotype Frequency in DECIPHER Cohort:

Feature	Frequency	Match to Patient
Intellectual disability	45/45 (100%)	✓ Yes
Hypotonia	42/45 (93%)	✓ Yes
Feeding difficulties	38/45 (84%)	✓ Yes
Distinctive facies	40/45 (89%)	✓ Yes
Friendly personality	35/45 (78%)	Unknown

Phenotype Match: Patient phenotype highly consistent with DECIPHER cohort (4/4 assessable features present).

ACMG Code: PP4 (Supporting) - Patient's clinical features consistent with gene's known phenotype

Source: DECIPHER via `DECIPHER_search`

Functional Evidence for KANSL1 Dosage Sensitivity

Study	Model	Finding	PMID
Koolen et al., 2012	Patient cells	Reduced KANSL1 protein	22736773
Zollino et al., 2015	Mouse model	Kansl1+/- recapitulates phenotype	25607366
Arbogast et al., 2017	Zebrafish	kansl1 knockdown → developmental defects	28666126

Strength of Evidence: ★★★ (High) - Multiple independent studies confirm haploinsufficiency mechanism

ACMG Code: PS3_Moderate - Well-established functional studies showing dosage sensitivity

---

Phase 7: ACMG-Adapted Classification

Goal: Apply ACMG/ClinGen criteria adapted for SVs

SV-Specific ACMG Criteria:

Pathogenic Evidence Codes

Code	Strength	Criteria	SV Application
PVS1	Very Strong	Null variant in HI gene	Complete deletion of HI gene
PS1	Strong	Same SV as known pathogenic	≥70% reciprocal overlap with ClinVar pathogenic
PS2	Strong	De novo (maternity/paternity confirmed)	De novo SV in patient with matching phenotype
PS3	Strong	Functional studies	Gene dosage effects demonstrated
PS4	Strong	Case-control enrichment	SV enriched in cases vs controls
PM1	Moderate	Critical region	Deletion of exons in HI gene
PM2	Moderate	Absent from controls	Not in gnomAD SVs, DGV
PM3	Moderate	Recessive: homozygous or compound het	Both alleles affected (rare for SVs)
PM4	Moderate	Protein length change	In-frame deletion/duplication
PM5	Moderate	Similar SVs pathogenic	Nearby SVs in ClinVar pathogenic
PM6	Moderate	De novo (no confirmation)	De novo SV, phenotype consistent
PP1	Supporting	Segregation in family	SV segregates with phenotype
PP2	Supporting	Gene/pathway relevant	Genes in SV match phenotype
PP3	Supporting	Computational evidence	Multiple predictors support haploinsufficiency
PP4	Supporting	Phenotype consistent	Patient phenotype matches gene-disease
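
The PS1 row relies on reciprocal overlap: the overlap length divided by the length of each SV, taking the smaller of the two fractions. A minimal sketch of that calculation follows. The function names and the example coordinates are illustrative assumptions; both intervals are assumed to be on the same chromosome and genome build.

```python
def reciprocal_overlap(start_a, end_a, start_b, end_b):
    """Return the reciprocal overlap fraction (0.0-1.0) of two intervals."""
    overlap = min(end_a, end_b) - max(start_a, start_b)
    if overlap <= 0:
        return 0.0  # Intervals do not overlap
    len_a = end_a - start_a
    len_b = end_b - start_b
    # The smaller fraction is limiting: both SVs must cover most of each other
    return min(overlap / len_a, overlap / len_b)


def is_same_sv(start_a, end_a, start_b, end_b, threshold=0.70):
    """Apply the >=70% reciprocal overlap threshold used for PS1."""
    return reciprocal_overlap(start_a, end_a, start_b, end_b) >= threshold


# Hypothetical 300 kb query deletion vs. a 300 kb known deletion shifted by 20 kb
ro = reciprocal_overlap(100_000, 400_000, 120_000, 420_000)
print(f"{ro:.1%}", is_same_sv(100_000, 400_000, 120_000, 420_000))
```

Using the minimum of the two fractions prevents a small SV nested inside a much larger one from being treated as the same event, even though the small SV is 100% covered.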

Benign Evidence Codes

Code	Strength	Criteria	SV Application
BA1	Stand-Alone	MAF >5%	SV frequency >5% in gnomAD
BS1	Strong	MAF too high for disease	SV frequency >1%
BS2	Strong	Healthy adult with phenotype-associated genotype	SV in healthy individual (careful - reduced penetrance)
BS3	Strong	Functional studies show no effect	No dosage sensitivity demonstrated
BS4	Strong	Non-segregation	SV doesn't segregate with phenotype
BP1	Supporting	Missense in gene without known LOF	N/A for SVs
BP2	Supporting	Observed in trans with pathogenic	SV + pathogenic SNV = compound het (patient unaffected)
BP4	Supporting	Computational evidence benign	Predictors suggest no haploinsufficiency
BP5	Supporting	Found in case with alt cause	Phenotype explained by different variant
BP7	Supporting	Synonymous with no splice effect	N/A for SVs
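
The frequency-based codes in the two tables (PM2, BA1, BS1) can be expressed as a single lookup. The sketch below assumes a hypothetical helper `frequency_evidence` that takes the SV allele frequency as a fraction (or `None` when no overlapping SV was found) and uses the thresholds from the tables: >5% for BA1 and >1% for BS1.

```python
def frequency_evidence(frequency):
    """Map an SV allele frequency to a frequency-based evidence code, if any."""
    if frequency is None or frequency == 0:
        # Not observed in population databases
        return {"code": "PM2", "strength": "Moderate", "direction": "pathogenic"}
    if frequency > 0.05:
        return {"code": "BA1", "strength": "Stand-Alone", "direction": "benign"}
    if frequency > 0.01:
        return {"code": "BS1", "strength": "Strong", "direction": "benign"}
    # Observed but rare: no frequency-based code applies under these thresholds
    return None


for freq in (None, 0.08, 0.02, 0.0005):
    print(freq, frequency_evidence(freq))
```

Returning `None` for rare-but-present SVs keeps the absence criterion (PM2) distinct from low frequency, which the tables do not assign a code to.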

Classification Algorithm (ACMG SV Criteria):

Classification	Evidence Required
Pathogenic	PVS1 + PS1; OR 2 Strong; OR 1 Strong + 3 Moderate
Likely Pathogenic	1 Very Strong + 1 Moderate; OR 1 Strong + 2 Moderate; OR 3 Moderate
VUS	Criteria not met; OR conflicting evidence
Likely Benign	1 Strong + 1 Supporting; OR 2 Supporting
Benign	BA1; OR BS1 + BS2; OR 2 Strong

Implementation:

python

def apply_acmg_criteria(gene_content, dosage_data, frequency_data, clinical_data, inheritance):
    """
    Apply ACMG SV criteria and calculate classification.
    """
    evidence = {
        'pathogenic': [],
        'benign': []
    }

    # PVS1: Complete deletion of HI gene (the HI gene itself must be fully contained)
    contained = {g['symbol'] for g in gene_content['fully_contained']}
    hi_genes = [d for d in dosage_data if d['hi_score'] == '3' and d['gene'] in contained]
    if hi_genes:
        evidence['pathogenic'].append({
            'code': 'PVS1',
            'strength': 'Very Strong',
            'rationale': f"Complete deletion of haploinsufficient gene(s): {', '.join(g['gene'] for g in hi_genes)}"
        })

    # PS1: Same as known pathogenic SV
    if clinical_data.get('clinvar_pathogenic_match'):
        evidence['pathogenic'].append({
            'code': 'PS1',
            'strength': 'Strong',
            'rationale': f"≥70% overlap with ClinVar pathogenic SV: {clinical_data['clinvar_id']}"
        })

    # PS2: De novo with phenotype match
    if inheritance == 'de_novo' and clinical_data.get('phenotype_match'):
        evidence['pathogenic'].append({
            'code': 'PS2',
            'strength': 'Strong',
            'rationale': "De novo occurrence in patient with consistent phenotype"
        })

    # PS3: Functional studies
    if clinical_data.get('functional_evidence'):
        evidence['pathogenic'].append({
            'code': 'PS3',
            'strength': 'Strong',
            'rationale': "Well-established functional studies demonstrate dosage sensitivity"
        })

    # PM2: Absent from controls
    if frequency_data.get('frequency') == 0 or frequency_data.get('frequency') is None:
        evidence['pathogenic'].append({
            'code': 'PM2',
            'strength': 'Moderate',
            'rationale': "Absent from gnomAD SV database and DGV"
        })

    # PP4: Phenotype consistent
    if clinical_data.get('phenotype_consistent'):
        evidence['pathogenic'].append({
            'code': 'PP4',
            'strength': 'Supporting',
            'rationale': "Patient phenotype highly consistent with gene-disease association"
        })

    # BA1: Common variant (a missing or None frequency is treated as 0)
    frequency = frequency_data.get('frequency') or 0
    if frequency > 0.05:
        evidence['benign'].append({
            'code': 'BA1',
            'strength': 'Stand-Alone',
            'rationale': f"Frequency {frequency:.3f} too high for rare disease"
        })

    # BS1: High frequency
    if 0.01 < frequency <= 0.05:
        evidence['benign'].append({
            'code': 'BS1',
            'strength': 'Strong',
            'rationale': f"Frequency {frequency:.3f} exceeds expected for disease"
        })

    # Calculate classification
    classification = determine_classification(evidence)

    return {
        'evidence': evidence,
        'classification': classification['class'],
        'confidence': classification['confidence']
    }

def determine_classification(evidence):
    """
    Apply ACMG classification rules.
    """
    path = evidence['pathogenic']
    ben = evidence['benign']

    # Count evidence by strength
    very_strong = len([e for e in path if e['strength'] == 'Very Strong'])
    strong_path = len([e for e in path if e['strength'] == 'Strong'])
    moderate_path = len([e for e in path if e['strength'] == 'Moderate'])
    supporting_path = len([e for e in path if e['strength'] == 'Supporting'])

    standalone_ben = len([e for e in ben if e['strength'] == 'Stand-Alone'])
    strong_ben = len([e for e in ben if e['strength'] == 'Strong'])
    supporting_ben = len([e for e in ben if e['strength'] == 'Supporting'])

    # Benign criteria
    benign_class = None
    if standalone_ben >= 1 or strong_ben >= 2:
        benign_class = {'class': 'Benign', 'confidence': '★★★'}
    elif (strong_ben >= 1 and supporting_ben >= 1) or supporting_ben >= 2:
        benign_class = {'class': 'Likely Benign', 'confidence': '★★☆'}

    # Pathogenic criteria (includes the 1 Strong + 3 Moderate rule from the table)
    path_class = None
    if (very_strong >= 1 and strong_path >= 1) or strong_path >= 2 \
            or (strong_path >= 1 and moderate_path >= 3):
        path_class = {'class': 'Pathogenic', 'confidence': '★★★'}
    elif (very_strong >= 1 and moderate_path >= 1) \
            or (strong_path >= 1 and moderate_path >= 2) \
            or (strong_path >= 1 and moderate_path >= 1 and supporting_path >= 1):
        path_class = {'class': 'Likely Pathogenic', 'confidence': '★★☆'}
    elif moderate_path >= 3:
        path_class = {'class': 'Likely Pathogenic', 'confidence': '★☆☆'}

    # BA1 is stand-alone; otherwise conflicting evidence resolves to VUS
    if standalone_ben >= 1:
        return benign_class
    if benign_class and path_class:
        return {'class': 'VUS', 'confidence': '★☆☆'}
    if benign_class or path_class:
        return benign_class or path_class

    # Default to VUS
    return {'class': 'VUS', 'confidence': '★☆☆'}
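
Evidence codes sometimes carry a strength modifier, as in `PS3_Moderate` in the functional evidence section. The sketch below shows one way to derive direction and strength directly from a code string before counting, so that modified codes are tallied at their adjusted strength. The names `parse_code` and `tally` are hypothetical, and the prefix-to-strength mapping is taken from the two evidence tables above.

```python
from collections import Counter

# Default strength by code prefix; "PVS" must be checked before "PS"
PREFIX_STRENGTH = [
    ("PVS", "Very Strong"), ("PS", "Strong"), ("PM", "Moderate"),
    ("PP", "Supporting"), ("BA", "Stand-Alone"), ("BS", "Strong"),
    ("BP", "Supporting"),
]
MODIFIERS = {
    "verystrong": "Very Strong", "strong": "Strong",
    "moderate": "Moderate", "supporting": "Supporting",
}


def parse_code(code):
    """Return (direction, strength) for a code such as 'PM2' or 'PS3_Moderate'."""
    base, _, modifier = code.partition("_")
    direction = "pathogenic" if base.startswith("P") else "benign"
    strength = next(s for prefix, s in PREFIX_STRENGTH if base.startswith(prefix))
    if modifier:
        # A suffix overrides the default strength of the code
        strength = MODIFIERS[modifier.replace(" ", "").lower()]
    return direction, strength


def tally(codes):
    """Count evidence items by (direction, strength)."""
    return Counter(parse_code(code) for code in codes)


counts = tally(["PVS1", "PS1", "PM2", "PP4", "PS3_Moderate"])
print(counts[("pathogenic", "Moderate")])
```

In this example PM2 and the downgraded PS3 both count as Moderate, so the tally reports two Moderate items rather than one Moderate and two Strong.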

Report Section:


7. ACMG-Adapted Classification

Evidence Codes Applied

Pathogenic Evidence:

Code	Strength	Rationale
PVS1	Very Strong	Complete deletion of haploinsufficient gene (KANSL1, HI score 3)
PS1	Strong	≥95% overlap with ClinVar pathogenic deletion (VCV000012345)
PM2	Moderate	Absent from gnomAD SV database (>10,000 genomes)
PP4	Supporting	Patient phenotype consistent with Koolen-De Vries syndrome

Benign Evidence: None

致病性证据:

代码	强度	依据
PVS1	极强	完全缺失单倍剂量不足基因（KANSL1，HI评分3）
PS1	强	与ClinVar致病性缺失（VCV000012345）重叠≥95%
PM2	中等	在gnomAD SV数据库（>10,000个基因组）中未检出
PP4	支持	患者表型与Koolen-De Vries综合征一致

良性证据：无

Evidence Summary

| Pathogenic | Benign |
|------------|--------|
| 1 Very Strong (PVS1) | None |
| 1 Strong (PS1) | |
| 1 Moderate (PM2) | |
| 1 Supporting (PP4) | |

Classification: PATHOGENIC ★★★

Rationale: Meets the ACMG-adapted criteria for Pathogenic (1 Very Strong + 1 Strong). The SV completely deletes an established haploinsufficient gene (KANSL1) and overlaps a known pathogenic ClinVar deletion by ≥95%.

Confidence: ★★★ (High) - Multiple independent lines of strong evidence

Classification Certainty Factors

✅ Strengths:

≥95% overlap with a well-characterized pathogenic deletion
Complete deletion of a definitive HI gene (KANSL1)
Absent from population databases
Phenotype highly consistent with the gene-disease association (Koolen-De Vries syndrome)

⚠ Limitations:

None significant - this is a well-established pathogenic SV

---
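The PS1 row in the example report relies on a ≥95% overlap with a known pathogenic deletion. One common way to quantify such a match is reciprocal overlap, where the shared span must cover the required fraction of both intervals. The sketch below assumes 1-based inclusive coordinates on the same chromosome. The function name is hypothetical, and the comparison interval is invented for illustration rather than taken from a real ClinVar record.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Return the reciprocal overlap fraction (0.0 to 1.0) of two intervals.

    Assumes 1-based inclusive coordinates on the same chromosome.
    """
    overlap = min(a_end, b_end) - max(a_start, b_start) + 1
    if overlap <= 0:
        return 0.0
    len_a = a_end - a_start + 1
    len_b = b_end - b_start + 1
    # The smaller of the two fractions is the reciprocal overlap
    return min(overlap / len_a, overlap / len_b)


patient_sv = (44_039_927, 44_352_659)  # coordinates from the example report
known_sv = (44_045_000, 44_352_659)    # hypothetical comparison interval

ro = reciprocal_overlap(*patient_sv, *known_sv)
supports_ps1 = ro >= 0.95
print(f"reciprocal overlap = {ro:.3f}, meets 95% threshold: {supports_ps1}")
```

Taking the minimum of the two fractions prevents a small known deletion nested inside a much larger patient SV from counting as a match.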

Output Structure

Report File: `SV_analysis_report.md`

Structural Variant Analysis Report: [SV_IDENTIFIER]

Generated: [Date] | Analyst: ToolUniverse SV Interpreter

Executive Summary

| Field | Value |
|-------|-------|
| SV Type | Deletion / Duplication / Inversion / Translocation |
| Coordinates | chr17:44039927-44352659 (GRCh38) |
| Size | 313 kb |
| Gene Content | 2 genes fully contained, 0 partially disrupted |
| Classification | Pathogenic / Likely Pathogenic / VUS / Likely Benign / Benign |
| Pathogenicity Score | X.X / 10 |
| Confidence | ★★★ / ★★☆ / ★☆☆ |
| Key Finding | [One-sentence summary] |

Clinical Action: [Required / Recommended / None]

1. SV Identity & Classification

{SV type, coordinates, size, breakpoint precision, inheritance}

2. Gene Content Analysis

2.1 Fully Contained Genes

{Table of genes with functions, disease associations}

2.2 Partially Disrupted Genes

{Genes with breakpoints, domains affected}

2.3 Flanking Genes

{Genes near breakpoints, position effect risk}

3. Dosage Sensitivity Assessment

3.1 Haploinsufficient Genes

{ClinGen HI scores, pLI, evidence}

3.2 Triplosensitive Genes

{ClinGen TS scores, duplication syndromes}

3.3 Non-Dosage-Sensitive Genes

{Genes without established dosage effects}

4. Population Frequency Context

4.1 ClinVar Matches

{Known pathogenic/benign SVs}

4.2 gnomAD SV Database

{Population frequencies}

4.3 DECIPHER Patient Cases

{Similar SVs, phenotype matching}

5. Pathogenicity Scoring

5.1 Quantitative Assessment

{0-10 score with breakdown}

5.2 Score Components

{Gene content, dosage, frequency, clinical}

6. Literature & Clinical Evidence

6.1 Key Publications

{Functional studies, case series}

6.2 DECIPHER Cohort Analysis

{Phenotype frequencies, matching}

6.3 Functional Evidence

{Gene dosage studies}

7. ACMG-Adapted Classification

7.1 Evidence Codes Applied

{Pathogenic and benign codes with rationale}

7.2 Classification

{Final classification with confidence}

7.3 Certainty Factors

{Strengths and limitations}

8. Clinical Recommendations

8.1 For Affected Individual

{Testing, management, surveillance}

8.2 For Family Members

{Cascade testing, genetic counseling}

8.3 Reproductive Considerations

{Recurrence risk, prenatal testing}

9. Limitations & Uncertainties

{Missing data, conflicting evidence, knowledge gaps}

Data Sources

{All tools and databases queried with results}

---
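The Size field in the Executive Summary follows directly from the Coordinates field. As a minimal sketch, the helper below parses a `chrN:start-end` string and reports the span in kilobases. The function name and the accepted string format are assumptions, and size is taken here as `end - start`.

```python
import re


def sv_size_kb(region):
    """Parse 'chrN:start-end' and return the span (end - start) in kilobases."""
    match = re.fullmatch(r"chr(\w+):(\d+)-(\d+)", region)
    if match is None:
        raise ValueError(f"Unrecognized region: {region!r}")
    start, end = int(match.group(2)), int(match.group(3))
    if end < start:
        raise ValueError("End coordinate precedes start coordinate")
    return (end - start) / 1000


print(f"{sv_size_kb('chr17:44039927-44352659'):.0f} kb")  # prints "313 kb"
```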

Evidence Grading System

| Symbol | Confidence | Criteria |
|--------|------------|----------|
| ★★★ | High | ClinGen definitive, ClinVar expert reviewed, multiple independent studies |
| ★★☆ | Moderate | ClinGen strong/moderate, single good study, DECIPHER cohort support |
| ★☆☆ | Limited | Computational predictions only, case reports, emerging evidence |

Special Scenarios

Scenario 1: Recurrent Microdeletion Syndrome

Additional considerations:

Check for a recurrence mechanism: non-allelic homologous recombination (NAHR) between flanking low-copy repeats (LCRs)
Look for founder effects
Check population-specific frequencies
Account for incomplete penetrance
Account for variable expressivity

Examples: 22q11.2 deletion, 17q21.31 deletion (Koolen-De Vries syndrome)

Scenario 2: Balanced Translocation (No Gene Disruption)

Assessment approach:

If no genes are disrupted: likely benign in most cases
Check for cryptic imbalances
Consider position effects (rare)
Assess reproductive risk (unbalanced rearrangements in offspring)

Classification: Usually VUS or Likely Benign unless offspring are affected

Scenario 3: Complex Rearrangement

Analysis strategy:

Break the rearrangement down into its component SVs
Assess each breakpoint independently
Look for a chromothripsis pattern
Consider cumulative gene dosage effects
Check for DNA repair defects

Scenario 4: Small In-Frame Deletion/Duplication

Special considerations:

May not cause haploinsufficiency
Check whether a critical domain is affected
Look for similar variants in ClinVar
Consider the impact on protein structure
Functional studies may be needed

Quantified Minimums

| Section | Requirement |
|---------|-------------|
| Gene content | All genes in SV region annotated |
| Dosage sensitivity | ClinGen scores for all genes (if available) |
| Population frequency | Check gnomAD SV + ClinVar + DGV |
| Literature search | ≥2 search strategies (PubMed + DECIPHER) |
| ACMG codes | All applicable codes listed |
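These minimums lend themselves to an automated completeness check before a report is finalized. The sketch below is one possible shape for such a check. The `report` dictionary and all of its keys are hypothetical, and the ACMG check only confirms that at least one code is listed, since whether every applicable code is present still needs human review.

```python
REQUIRED_FREQUENCY_SOURCES = {"gnomAD SV", "ClinVar", "DGV"}


def unmet_minimums(report):
    """Return a list of unmet minimums for a draft report (hypothetical dict)."""
    problems = []
    genes = set(report["genes_in_region"])

    unannotated = genes - set(report["annotated_genes"])
    if unannotated:
        problems.append(f"Gene content: not annotated: {sorted(unannotated)}")

    unchecked = genes - set(report["dosage_checked_genes"])
    if unchecked:
        problems.append(f"Dosage sensitivity: ClinGen not queried for: {sorted(unchecked)}")

    missing_sources = REQUIRED_FREQUENCY_SOURCES - set(report["frequency_sources"])
    if missing_sources:
        problems.append(f"Population frequency: missing {sorted(missing_sources)}")

    if len(set(report["literature_strategies"])) < 2:
        problems.append("Literature search: fewer than 2 search strategies")

    # Only checks presence; completeness of applicable codes needs manual review
    if not report["acmg_codes"]:
        problems.append("ACMG codes: none listed")

    return problems


draft = {
    "genes_in_region": ["KANSL1", "MAPT"],
    "annotated_genes": ["KANSL1", "MAPT"],
    "dosage_checked_genes": ["KANSL1"],
    "frequency_sources": ["gnomAD SV", "ClinVar"],
    "literature_strategies": ["PubMed", "DECIPHER"],
    "acmg_codes": ["PVS1", "PS1", "PM2", "PP4"],
}
for problem in unmet_minimums(draft):
    print(problem)
```

For this draft the check flags the missing ClinGen lookup for MAPT and the missing DGV query.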

Tools Reference

Core Tools for SV Analysis

| Tool | Purpose | Required? |
|------|---------|-----------|
| `ClinGen_search_dosage_sensitivity` | HI/TS scores | Required |
| `ClinGen_search_gene_validity` | Gene-disease validity | Required |
| `ClinVar_search_variants` | Known pathogenic/benign SVs | Required |
| `DECIPHER_search` | Patient cases, phenotypes | Highly recommended |
| `Ensembl_lookup_gene` | Gene coordinates, structure | Required |
| `OMIM_search`, `OMIM_get_entry` | Gene-disease associations | Required |
| `DisGeNET_search_gene` | Additional disease associations | Recommended |
| `PubMed_search` | Literature evidence | Recommended |
| `Gene_Ontology_get_term_info` | Gene function | Supporting |

Report File Naming

SV_analysis_[TYPE]_chr[CHR]_[START]_[END]_[GENES].md

Examples:
SV_analysis_DEL_chr17_44039927_44352659_KANSL1_MAPT.md
SV_analysis_DUP_chr22_17400000_17800000_TBX1.md
SV_analysis_INV_chr11_2100000_2400000_complex.md

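The naming pattern can be generated rather than typed by hand. The helper below is a small sketch under stated assumptions: the function name is hypothetical, the SV type is passed as its abbreviation (`DEL`, `DUP`, `INV`), and falling back to `complex` when no gene list is given is a choice made here, based only on the third example.

```python
def report_filename(sv_type, chrom, start, end, genes=None):
    """Build SV_analysis_[TYPE]_chr[CHR]_[START]_[END]_[GENES].md."""
    chrom = str(chrom)
    if chrom.startswith("chr"):
        chrom = chrom[3:]  # accept both "chr17" and "17"
    # Assumption: use "complex" as the label when no genes are supplied
    gene_part = "_".join(genes) if genes else "complex"
    return f"SV_analysis_{sv_type.upper()}_chr{chrom}_{start}_{end}_{gene_part}.md"


print(report_filename("DEL", "chr17", 44039927, 44352659, ["KANSL1", "MAPT"]))
# prints "SV_analysis_DEL_chr17_44039927_44352659_KANSL1_MAPT.md"
```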

Clinical Recommendations Framework

For Pathogenic/Likely Pathogenic SVs

| SV Type | Recommendations |
|---------|-----------------|
| Deletion (HI gene) | Genetic counseling, cascade testing, phenotype-specific surveillance |
| Duplication (TS gene) | Same as deletion; check for dosage-specific syndrome |
| Translocation (disruption) | Assess both breakpoints, consider reproductive counseling |
| Complex | Multidisciplinary evaluation, research enrollment |

For VUS

| Action | Details |
|--------|---------|
| Clinical management | Base on phenotype, not genotype |
| Follow-up | Reinterpret in 1-2 years or when phenotype evolves |
| Research | Functional studies if research-grade samples available |
| Family studies | Segregation analysis can reclassify |

For Benign/Likely Benign

| Action | Details |
|--------|---------|
| Clinical | Not expected to cause rare disease |
| Family | No cascade testing needed (unless recurrent/reproductive risk) |
| Reproductive | Balanced translocation carriers may have offspring risk |

When NOT to Use This Skill

Single nucleotide variants (SNVs) → Use the `tooluniverse-variant-interpretation` skill
Small indels (<50 bp) → Use the variant interpretation skill
Somatic variants in cancer → Different framework needed
Mitochondrial variants → Specialized interpretation required
Repeat expansions → Different mechanism

Use this skill for structural variants ≥50 bp that require dosage sensitivity assessment and ACMG-adapted classification.
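The scope rules above amount to a small routing decision. As a rough sketch, the function below returns which skill applies, or `None` when neither does. The function name, the `kind` labels, and the `somatic` flag are hypothetical; the 50 bp threshold and the skill names come from this document.

```python
SV_SKILL = "tooluniverse-structural-variant-analysis"
SNV_SKILL = "tooluniverse-variant-interpretation"


def route_variant(kind, size_bp=1, somatic=False):
    """Return the skill name to use, or None when neither skill applies."""
    # Somatic, mitochondrial, and repeat-expansion variants need other frameworks
    if somatic or kind in {"mitochondrial", "repeat_expansion"}:
        return None
    # SNVs and indels under 50 bp go to the variant interpretation skill
    if kind == "snv" or size_bp < 50:
        return SNV_SKILL
    return SV_SKILL


print(route_variant("deletion", size_bp=312_732))  # prints the SV skill name
print(route_variant("indel", size_bp=12))          # prints the SNV skill name
```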

See Also

`EXAMPLES.md` - Sample SV interpretations
`README.md` - Quick start guide
`tooluniverse-variant-interpretation` - For SNVs and small indels
ClinGen Dosage Sensitivity Map: https://www.ncbi.nlm.nih.gov/projects/dbvar/clingen/
ACMG SV Guidelines: Riggs et al., Genet Med 2020 (PMID: 31690835)