tooluniverse-structural-variant-analysis

Structural Variant Analysis Workflow

Systematic analysis of structural variants (deletions, duplications, inversions, translocations, complex rearrangements) for clinical genomics interpretation using ACMG-adapted criteria.
KEY PRINCIPLES:
  1. Report-first approach - Create SV_analysis_report.md FIRST, then populate progressively
  2. ACMG-style classification - Pathogenic/Likely Pathogenic/VUS/Likely Benign/Benign with explicit evidence
  3. Evidence grading - Grade all findings by confidence level (★★★/★★☆/★☆☆)
  4. Dosage sensitivity critical - Gene dosage effects drive SV pathogenicity
  5. Breakpoint precision matters - Exact gene disruption vs dosage-only effects
  6. Population context essential - gnomAD SVs for frequency assessment
  7. English-first queries - Always use English terms in tool calls (gene names, disease names), even if the user writes in another language. Only try original-language terms as a fallback. Respond in the user's language

Problem This Skill Solves

Structural variants (SVs) present unique interpretation challenges:
  1. Complex molecular consequences - SVs can cause gene dosage changes, gene disruption, gene fusions, position effects
  2. Size matters - Pathogenicity depends on size, gene content, and breakpoint precision
  3. Limited databases - Fewer curated SVs in ClinVar compared to SNVs
  4. Dosage sensitivity - Haploinsufficiency and triplosensitivity are critical but gene-specific
  5. Population frequency - Large benign CNVs are common; distinguishing pathogenic from benign is challenging
This skill provides: A systematic workflow integrating SV classification, gene content analysis, dosage sensitivity assessment, population frequencies, and ACMG-adapted criteria into clinically actionable interpretations.

Triggers

Use this skill when users:
  • Ask about structural variant interpretation
  • Have CNV data from array or sequencing
  • Ask "is this deletion/duplication pathogenic?"
  • Need ACMG classification for SVs
  • Want to assess gene dosage effects
  • Ask about chromosomal rearrangements
  • Have large-scale genomic alterations requiring interpretation

Workflow Overview

┌─────────────────────────────────────────────────────────────────┐
│              STRUCTURAL VARIANT INTERPRETATION                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Phase 1: SV IDENTITY & CLASSIFICATION                          │
│  ├── Normalize SV coordinates (hg19/hg38)                       │
│  ├── Determine SV type (DEL/DUP/INV/TRA/CPX)                   │
│  ├── Calculate SV size                                          │
│  └── Assess breakpoint precision                                │
│                                                                  │
│  Phase 2: GENE CONTENT ANALYSIS                                  │
│  ├── Identify genes fully contained in SV                       │
│  ├── Identify genes with breakpoints (disrupted)                │
│  ├── Annotate gene function and disease associations            │
│  ├── Identify regulatory elements affected                      │
│  └── Assess gene orientation (for inversions/translocations)    │
│                                                                  │
│  Phase 3: DOSAGE SENSITIVITY ASSESSMENT                          │
│  ├── ClinGen dosage sensitivity scores                          │
│  │   └─ Haploinsufficiency / Triplosensitivity ratings          │
│  ├── DECIPHER haploinsufficiency predictions                    │
│  ├── pLI scores (gnomAD) for loss-of-function intolerance       │
│  ├── OMIM gene-disease associations (dominant/recessive)        │
│  └── Known dosage-sensitive genes from literature               │
│                                                                  │
│  Phase 4: POPULATION FREQUENCY CONTEXT                           │
│  ├── gnomAD SV database (overlapping SVs)                       │
│  ├── DGV (Database of Genomic Variants)                         │
│  ├── ClinVar (known pathogenic/benign SVs)                      │
│  └── Calculate reciprocal overlap with population SVs           │
│                                                                  │
│  Phase 5: PATHOGENICITY SCORING                                  │
│  ├── Pathogenicity score (0-10 scale)                           │
│  │   ├─ Gene content weight (40%)                               │
│  │   ├─ Dosage sensitivity weight (30%)                         │
│  │   ├─ Population frequency weight (20%)                       │
│  │   └─ Inheritance/phenotype match weight (10%)                │
│  ├── Apply ACMG SV criteria                                     │
│  └── Generate classification recommendation                      │
│                                                                  │
│  Phase 6: LITERATURE & CLINICAL EVIDENCE                         │
│  ├── PubMed: Similar SVs, gene disruption studies               │
│  ├── DECIPHER: Developmental disorder cases                     │
│  ├── Clinical case reports                                      │
│  └── Functional evidence for gene dosage effects                │
│                                                                  │
│  Phase 7: ACMG-ADAPTED CLASSIFICATION                            │
│  ├── Apply SV-specific evidence codes                           │
│  ├── Calculate final classification                             │
│  ├── Identify limiting factors                                  │
│  └── Generate clinical recommendations                          │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
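The Phase 5 weighting in the overview can be sketched as a weighted sum. This is a minimal sketch, not the skill's own implementation: it assumes each component has already been scored on a 0-10 scale (the overview gives only the weights), and the names `WEIGHTS` and `pathogenicity_score` are hypothetical.

```python
# Weights taken from the Phase 5 outline; component names are illustrative.
WEIGHTS = {
    "gene_content": 0.40,
    "dosage_sensitivity": 0.30,
    "population_frequency": 0.20,
    "phenotype_match": 0.10,
}


def pathogenicity_score(components):
    """Combine per-component scores (assumed 0-10 each) into one 0-10 score."""
    missing = set(WEIGHTS) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= components[name] <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    total = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return round(total, 2)


example = {
    "gene_content": 8,
    "dosage_sensitivity": 9,
    "population_frequency": 5,
    "phenotype_match": 0,
}
print(pathogenicity_score(example))
```

Because the weights sum to 1, the combined score stays on the same 0-10 scale as its inputs. How each component score is derived is left open here.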

Phase Details

Phase 1: SV Identity & Classification

Goal: Standardize SV notation and classify type
SV Types:
| Type | Abbreviation | Description | Molecular Effect |
| --- | --- | --- | --- |
| Deletion | DEL | Loss of genomic segment | Haploinsufficiency, gene disruption |
| Duplication | DUP | Gain of genomic segment | Triplosensitivity, gene dosage imbalance |
| Inversion | INV | Segment flipped in orientation | Gene disruption at breakpoints, position effects |
| Translocation | TRA | Segment moved to different chromosome | Gene fusions, disruption, position effects |
| Complex | CPX | Multiple rearrangement types | Variable effects |
Key Information to Capture:
  • Chromosome(s) involved
  • Coordinates (start, end) in hg19/hg38
  • SV size (bp or Mb)
  • SV type (DEL/DUP/INV/TRA/CPX)
  • Breakpoint precision (±50bp, ±1kb, etc.)
  • Inheritance pattern (de novo, inherited, unknown)
Example:
SV: arr[GRCh38] 17q21.31(44039927-44352659)x1
- Type: Deletion (heterozygous)
- Size: 313 kb
- Genes: MAPT, KANSL1 (fully contained)
- Breakpoints: Well-defined (array resolution ±5kb)
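The fields in the example above can be pulled out of the array notation mechanically. The following is a rough sketch under stated assumptions: the function name `parse_array_notation` and the regular expression are illustrative, only the simple single-segment form shown above is handled, and the copy-number mapping assumes an autosome in a diploid genome (sex chromosomes need the patient's sex).

```python
import re

# Matches strings such as: arr[GRCh38] 17q21.31(44039927-44352659)x1
# Both '-' and '_' are accepted between the two coordinates.
_ARRAY_RE = re.compile(
    r"^arr\[(?P<build>[^\]]+)\]\s*"
    r"(?P<chrom>\d{1,2}|X|Y)(?P<band>[pq][\d.]+)?"
    r"\((?P<start>\d+)[-_](?P<end>\d+)\)x(?P<copies>\d+)$"
)


def parse_array_notation(text):
    """Parse a single-segment array CNV description into a dict."""
    match = _ARRAY_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized notation: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    copies = int(match["copies"])
    if end < start:
        raise ValueError("end coordinate precedes start coordinate")
    # Assumption: autosomal, diploid baseline of two copies.
    if copies < 2:
        sv_type = "DEL"
    elif copies > 2:
        sv_type = "DUP"
    else:
        sv_type = None  # normal copy number, no CNV
    return {
        "build": match["build"],
        "chrom": match["chrom"],
        "band": match["band"],
        "start": start,
        "end": end,
        "copies": copies,
        "sv_type": sv_type,
        "size_bp": end - start + 1,  # inclusive coordinates
    }


sv = parse_array_notation("arr[GRCh38] 17q21.31(44039927-44352659)x1")
print(sv["sv_type"], sv["size_bp"])
```

Inversions, translocations, and complex events have no copy-number suffix, so they would need a different parser.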

Phase 2: Gene Content Analysis

Goal: Comprehensive annotation of genes affected by SV
Tools:
| Tool | Purpose | Key Data |
| --- | --- | --- |
| `Ensembl_lookup_gene` | Gene structure, coordinates | Gene boundaries, exons, transcripts |
| `NCBI_gene_search` | Gene information | Official symbol, aliases, description |
| `Gene_Ontology_get_term_info` | Gene function | Biological process, molecular function |
| `OMIM_search`, `OMIM_get_entry` | Disease associations | Inheritance, clinical features |
| `DisGeNET_search_gene` | Gene-disease associations | Evidence scores |
Gene Categories:
  1. Fully contained genes - Entire gene within SV boundaries
    • Deletion: Complete loss of one copy (haploinsufficiency)
    • Duplication: Extra copy (triplosensitivity)
  2. Partially disrupted genes - Breakpoint within gene
    • Likely loss-of-function for affected allele
    • Check if critical domains disrupted
  3. Flanking genes - Within 1 Mb of breakpoints
    • May be affected by position effects
    • Regulatory disruption possible
Example Gene Content Analysis:
```python
FLANK_BP = 1_000_000  # genes within 1 Mb of a breakpoint count as flanking


def analyze_gene_content(tu, chrom, sv_start, sv_end, sv_type, genes_in_region):
    """
    Classify and annotate all genes in or near the SV region.

    genes_in_region: list of dicts with 'symbol', 'start' and 'end' for genes
    on `chrom` around the SV. How this list is retrieved (for example an
    Ensembl region query) depends on the available tools, so the caller
    supplies it. `chrom` and `sv_type` are not used for classification here.
    """
    genes = {
        'fully_contained': [],
        'partially_disrupted': [],
        'flanking': []
    }

    for gene in genes_in_region:
        gene_start = gene['start']
        gene_end = gene['end']

        # Classify gene relationship to SV
        if gene_start >= sv_start and gene_end <= sv_end:
            category = 'fully_contained'
        elif gene_start <= sv_end and gene_end >= sv_start:
            # Overlaps the SV without being contained: a breakpoint lies in the gene
            category = 'partially_disrupted'
        elif max(gene_start - sv_end, sv_start - gene_end) < FLANK_BP:
            # No overlap, but within 1 Mb of the nearest breakpoint
            category = 'flanking'
        else:
            continue  # too far from the SV to report

        genes[category].append(annotate_gene(tu, gene['symbol']))

    return genes


def annotate_gene(tu, gene_symbol):
    """
    Comprehensive gene annotation.
    """
    # OMIM associations
    omim = tu.tools.OMIM_search(
        operation="search",
        query=gene_symbol,
        limit=5
    )

    # DisGeNET associations
    disgenet = tu.tools.DisGeNET_search_gene(
        operation="search_gene",
        gene=gene_symbol,
        limit=10
    )

    # NCBI gene record; a Gene Ontology lookup needs the gene ID from here first
    ncbi = tu.tools.NCBI_gene_search(
        term=gene_symbol,
        organism="human"
    )

    return {
        'symbol': gene_symbol,
        'omim': omim,
        'disgenet': disgenet,
        'ncbi': ncbi
    }
```
Report Section:

2.1 Fully Contained Genes (Complete Dosage Effect)

| Gene | Function | Disease Association | Inheritance | Evidence |
| --- | --- | --- | --- | --- |
| MAPT | Microtubule-associated protein tau | Frontotemporal dementia (AD) | Autosomal Dominant | ★★★ |
| KANSL1 | Histone acetyltransferase complex | Koolen-De Vries syndrome (AD) | Autosomal Dominant | ★★★ |
Interpretation: Deletion results in haploinsufficiency of two dosage-sensitive genes. KANSL1 haploinsufficiency is the primary cause of pathogenicity.
Sources: OMIM, DisGeNET, Ensembl

2.2 Partially Disrupted Genes (Breakpoint Within Gene)

| Gene | Breakpoint Location | Effect | Critical Domains Lost |
| --- | --- | --- | --- |
| NF1 | Intron 28 of 58 | 5' portion deleted | Yes - GTPase-activating domain |
Interpretation: Breakpoint disrupts NF1 coding sequence, likely resulting in loss-of-function. NF1 is haploinsufficient (causes neurofibromatosis type 1).

2.3 Flanking Genes (Potential Position Effects)

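| Gene | Distance from SV | Regulatory Risk | Evidence |
| --- | --- | --- | --- |
| KCNJ2 | 450 kb upstream | Low | ★☆☆ |

Note: Position effects are possible but less common than direct dosage or disruption effects. Consider them when the phenotype is not explained by the contained genes.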

---

Phase 3: Dosage Sensitivity Assessment

Goal: Determine if affected genes are dosage-sensitive
Tools:
| Tool | Purpose | Key Data |
| --- | --- | --- |
| `ClinGen_search_dosage_sensitivity` | Gold standard curation | HI/TS scores (0-3) |
| `ClinGen_search_gene_validity` | Gene-disease validity | Definitive/Strong/Moderate |
| `gnomad_search` (pLI) | Loss-of-function intolerance | pLI score (0-1) |
| `DECIPHER_search` | Developmental disorders | Patient phenotypes with similar SVs |
| `OMIM_get_entry` | Inheritance pattern | AD is consistent with dosage sensitivity; AR suggests a heterozygous loss alone is tolerated |
ClinGen Dosage Sensitivity Scores:
| Score | Haploinsufficiency (HI) | Triplosensitivity (TS) | Interpretation |
| --- | --- | --- | --- |
| 3 | Sufficient evidence | Sufficient evidence | Gene IS dosage-sensitive |
| 2 | Emerging evidence | Emerging evidence | Likely dosage-sensitive |
| 1 | Little evidence | Little evidence | Insufficient evidence |
| 0 | No evidence | No evidence | No established dosage sensitivity |
pLI Score Interpretation (gnomAD):
| pLI Range | Interpretation | LoF Intolerance |
| --- | --- | --- |
| ≥0.9 | Extremely intolerant | High - likely haploinsufficient |
| 0.5-0.9 | Moderately intolerant | Moderate |
| <0.5 | Tolerant | Low - likely NOT haploinsufficient |
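The two lookup tables above translate directly into small helpers. This is an illustrative sketch only: the function names are hypothetical, a pLI of exactly 0.9 is assigned to the upper band, and ClinGen scores outside 0-3 return `None` because the table above does not cover them.

```python
CLINGEN_SCORE_MEANING = {
    3: "Sufficient evidence",
    2: "Emerging evidence",
    1: "Little evidence",
    0: "No evidence",
}


def interpret_clingen_score(score):
    """Map a ClinGen HI/TS score (int or numeric string) to its label, or None."""
    try:
        return CLINGEN_SCORE_MEANING.get(int(score))
    except (TypeError, ValueError):
        return None  # missing or non-numeric score


def interpret_pli(pli):
    """Map a gnomAD pLI value (0-1) to the interpretation bands above."""
    if pli is None:
        return None
    if not 0.0 <= pli <= 1.0:
        raise ValueError("pLI must be between 0 and 1")
    if pli >= 0.9:
        return "Extremely intolerant"
    if pli >= 0.5:
        return "Moderately intolerant"
    return "Tolerant"


print(interpret_clingen_score("3"), "/", interpret_pli(0.99))
```

Accepting both `3` and `"3"` avoids depending on whether a tool returns scores as integers or strings.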
Implementation:
```python
def _first_record(result):
    """Return the first record in a tool response's 'data' list, or {}."""
    data = result.get('data') if isinstance(result, dict) else None
    return data[0] if isinstance(data, list) and data else {}


def _score(value):
    """Normalize a ClinGen score to a string such as '3', or None if missing."""
    return None if value is None else str(value).strip()


def assess_dosage_sensitivity(tu, gene_list):
    """
    Assess dosage sensitivity for all genes in SV.
    Returns dosage scores and interpretation.
    """
    dosage_data = []

    for gene_symbol in gene_list:
        # 1. ClinGen dosage sensitivity (gold standard); use the first record
        clingen = tu.tools.ClinGen_search_dosage_sensitivity(
            gene=gene_symbol
        )
        entry = _first_record(clingen)
        hi_score = _score(entry.get('Haploinsufficiency Score'))
        ts_score = _score(entry.get('Triplosensitivity Score'))

        # 2. ClinGen gene validity (supports dosage sensitivity)
        validity = tu.tools.ClinGen_search_gene_validity(
            gene=gene_symbol
        )
        validity_level = _first_record(validity).get('Classification')

        # 3. pLI score from gnomAD (if available via gene search)
        # Note: May need to use myvariant or other tools
        # pli_score = get_pli_score(tu, gene_symbol)

        # 4. OMIM inheritance pattern
        omim = tu.tools.OMIM_search(
            operation="search",
            query=gene_symbol,
            limit=3
        )

        # Parsing inheritance out of the OMIM entries is not implemented here,
        # so inheritance_pattern stays None; the raw entries are returned instead.
        inheritance_pattern = None
        omim_entries = []
        for omim_entry in (omim.get('data') or {}).get('entries') or []:
            mim = omim_entry.get('mimNumber')
            omim_entries.append(tu.tools.OMIM_get_entry(
                operation="get_entry",
                mim_number=str(mim)
            ))
            # inheritance_pattern = parse_inheritance(omim_entries[-1])

        # Integrate evidence
        dosage_data.append({
            'gene': gene_symbol,
            'hi_score': hi_score,
            'ts_score': ts_score,
            'validity_level': validity_level,
            'inheritance': inheritance_pattern,
            'omim_entries': omim_entries,
            'is_dosage_sensitive': (hi_score == '3' or ts_score == '3'),
            'evidence_grade': calculate_evidence_grade(hi_score, ts_score, validity_level)
        })

    return dosage_data


def calculate_evidence_grade(hi_score, ts_score, validity):
    """
    Calculate evidence grade for dosage sensitivity.
    Scores are expected as strings ('0'-'3') or None.
    """
    if (hi_score == '3' or ts_score == '3') and validity == 'Definitive':
        return '★★★'  # High confidence
    elif hi_score in ('2', '3') or ts_score in ('2', '3'):
        return '★★☆'  # Moderate confidence
    else:
        return '★☆☆'  # Low confidence
```
Report Section:

3. Dosage Sensitivity Assessment

Haploinsufficient Genes (Deletions/Disruptions)

| Gene | ClinGen HI Score | pLI | Validity | Disease | Evidence |
| --- | --- | --- | --- | --- | --- |
| KANSL1 | 3 (Sufficient) | 0.99 | Definitive | Koolen-De Vries syndrome | ★★★ |
| MAPT | 2 (Emerging) | 0.85 | Strong | FTD (rare) | ★★☆ |
Interpretation: KANSL1 has definitive evidence for haploinsufficiency. Deletion of one copy is expected to cause Koolen-De Vries syndrome (intellectual disability, hypotonia, distinctive facial features).
Sources: ClinGen Dosage Sensitivity Map, gnomAD pLI

Triplosensitive Genes (Duplications)

| Gene | ClinGen TS Score | Disease Mechanism | Evidence |
| --- | --- | --- | --- |
| MECP2 | 3 (Sufficient) | MECP2 duplication syndrome | ★★★ |
| PMP22 | 3 (Sufficient) | Charcot-Marie-Tooth 1A | ★★★ |
Note: For this deletion, triplosensitivity is not applicable. Listed for reference.

Non-Dosage-Sensitive Genes

| Gene | HI Score | TS Score | Interpretation |
| --- | --- | --- | --- |
| GENE_X | 0 | 0 | No established dosage sensitivity |
| GENE_Y | 1 | 1 | Insufficient evidence |

Interpretation: These genes lack evidence for dosage sensitivity. A deletion or duplication is less likely to be pathogenic on account of these genes alone.

---

Phase 4: Population Frequency Context

Goal: Determine if SV is common in general population (likely benign) or rare (supports pathogenicity)
Tools:
| Tool | Purpose | Key Data |
| --- | --- | --- |
| `gnomad_search` | Population SV frequencies | Overlapping SVs, frequencies |
| `ClinVar_search_variants` | Known pathogenic/benign SVs | Classification, review status |
| `DECIPHER_search` | Patient SVs with phenotypes | Case reports, phenotype similarity |
Frequency Interpretation (adapted from ACMG):
| SV Frequency | ACMG Code | Interpretation |
| --- | --- | --- |
| ≥1% in gnomAD SVs | BA1 (Stand-alone Benign) | Too common for rare disease |
| 0.1-1% | BS1 (Strong Benign) | Likely benign common variant |
| <0.01% | PM2 (Supporting Pathogenic) | Rare, supports pathogenicity |
| Absent | PM2 (Supporting) | Very rare, supports pathogenicity |
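The frequency bands above can be expressed as a small lookup. This is a hedged sketch: the function name `frequency_to_acmg_code` is hypothetical, frequencies are passed as fractions (0.01 means 1%), and because the table assigns no code between 0.01% and 0.1%, the function returns `None` for that range instead of guessing.

```python
def frequency_to_acmg_code(frequency):
    """
    Map a population SV frequency (fraction, e.g. 0.01 for 1%) to the
    adapted ACMG code from the table above. `None` as input means the SV
    is absent from the population database.
    """
    if frequency is None or frequency == 0:
        return "PM2"  # absent
    if not 0 < frequency <= 1:
        raise ValueError("frequency must be a fraction between 0 and 1")
    if frequency >= 0.01:
        return "BA1"  # >= 1%
    if frequency >= 0.001:
        return "BS1"  # 0.1% to 1%
    if frequency < 0.0001:
        return "PM2"  # < 0.01%
    return None  # 0.01% to 0.1%: no code assigned in the table


for freq in (0.02, 0.005, 0.0005, 0.00005, None):
    print(freq, frequency_to_acmg_code(freq))
```

A frequency is only meaningful for a population SV that matches the query SV, so this lookup belongs after the reciprocal overlap check.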
Reciprocal Overlap Calculation:
For proper comparison, calculate reciprocal overlap between query SV and population SV:
Reciprocal Overlap = min(overlap_with_A, overlap_with_B)
where:
  overlap_with_A = (overlap length) / (SV_A length)
  overlap_with_B = (overlap length) / (SV_B length)

Threshold: ≥70% reciprocal overlap = "same" SV
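The reciprocal overlap formula above can be sketched in a few lines. The function names are illustrative, and the sketch assumes both SVs are on the same chromosome, are of the same type, and that interval length is computed as `end - start`.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Return the reciprocal overlap (0.0-1.0) of intervals A and B."""
    if a_end <= a_start or b_end <= b_start:
        raise ValueError("each interval must have end > start")
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0  # intervals do not overlap
    overlap_with_a = overlap / (a_end - a_start)
    overlap_with_b = overlap / (b_end - b_start)
    return min(overlap_with_a, overlap_with_b)


def is_same_sv(a_start, a_end, b_start, b_end, threshold=0.70):
    """Apply the 70% reciprocal overlap threshold."""
    return reciprocal_overlap(a_start, a_end, b_start, b_end) >= threshold


# A 100 bp query against a 100 bp population SV shifted by 50 bp
print(reciprocal_overlap(100, 200, 150, 250))
```

Taking the minimum of the two fractions prevents a small SV nested inside a much larger one from being treated as the same event.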
Implementation:
```python
def assess_population_frequency(tu, chrom, sv_start, sv_end, sv_type):
    """
    Check ClinVar and DECIPHER for SVs overlapping the query region.
    gnomAD SV frequencies are not queried here (see note below).
    """
    # Accept both '17' and 'chr17' so the DECIPHER query is not 'chrchr17'
    chrom = str(chrom)
    if chrom.lower().startswith('chr'):
        chrom = chrom[3:]

    # 1. Check ClinVar for known pathogenic/benign SVs
    clinvar = tu.tools.ClinVar_search_variants(
        chromosome=chrom,
        start=sv_start,
        stop=sv_end,
        variant_type=sv_type.upper()
    )

    known_svs = []
    for variant in clinvar.get('data') or []:
        known_svs.append({
            'database': 'ClinVar',
            'classification': variant.get('clinical_significance'),
            'review_status': variant.get('review_status'),
            'coordinates': f"{variant.get('chromosome')}:{variant.get('start')}-{variant.get('stop')}"
        })

    # 2. gnomAD SVs (if available)
    # Note: gnomAD SV database may not have direct API access via ToolUniverse
    # May need to use genomic coordinate search

    # 3. DECIPHER for similar patient cases
    decipher_search = tu.tools.DECIPHER_search(
        query=f"chr{chrom}:{sv_start}-{sv_end}",
        search_type="region"
    )
    patient_cases = decipher_search.get('data') or []

    return {
        'clinvar_matches': known_svs,
        'decipher_cases': patient_cases,
        'frequency_interpretation': interpret_frequency(known_svs)
    }


def interpret_frequency(known_svs):
    """
    Interpret overlapping ClinVar records. Matches are by region only;
    confirm reciprocal overlap before treating a record as the same SV.
    """
    classes = [(sv.get('classification') or '').strip().lower() for sv in known_svs]

    if 'benign' in classes:
        return {
            'acmg_code': 'BA1 or BS1',
            'interpretation': 'Likely benign based on ClinVar benign classification',
            'evidence_grade': '★★★'
        }
    elif 'pathogenic' in classes:
        return {
            'acmg_code': 'PS1',
            'interpretation': 'Pathogenic based on ClinVar pathogenic classification',
            'evidence_grade': '★★★'
        }
    else:
        # No ClinVar match says nothing about population frequency on its own
        return {
            'acmg_code': 'PM2 (pending gnomAD SV check)',
            'interpretation': 'No Benign or Pathogenic ClinVar match; gnomAD SV frequency not checked here',
            'evidence_grade': '★★☆'
        }
```
Report Section:

4. Population Frequency Context

ClinVar Matches (Overlapping SVs)

| VCV ID | Classification | Size | Overlap | Review Status | Genes |
|---|---|---|---|---|---|
| VCV000012345 | Pathogenic | 320 kb | 95% reciprocal | ★★★ Reviewed by expert panel | KANSL1, MAPT |

Match Found: Query deletion has 95% reciprocal overlap with a known pathogenic deletion in ClinVar (VCV000012345). This is the Koolen-De Vries syndrome deletion.
ACMG Code: PS1 (Strong) - Same genomic region as established pathogenic SV
Source: ClinVar via ClinVar_search_variants

gnomAD SV Database

Search Result: No overlapping deletions found in gnomAD SV v4.0 (>10,000 genomes)
Interpretation: Absence from gnomAD supports rarity and pathogenic potential.
ACMG Code: PM2 (Moderate) - Absent from population databases
Note: gnomAD SVs queried via browser (no direct API access)

DECIPHER Patient Cases

| Case ID | Phenotype | SV Type | Size | Overlap | Similarity |
|---|---|---|---|---|---|
| 12345 | Intellectual disability, hypotonia | DEL | 315 kb | 98% | High |
| 67890 | Developmental delay, facial dysmorphism | DEL | 305 kb | 92% | High |

Phenotype Match: 8/10 DECIPHER patients have intellectual disability and hypotonia, consistent with Koolen-De Vries syndrome.
ACMG Support: PP4 (Supporting) - Patient phenotype consistent with gene's disease association
Source: DECIPHER via DECIPHER_search

---

Phase 5: Pathogenicity Scoring

Goal: Quantitative pathogenicity assessment (0-10 scale)
Scoring Components (raw points total 100; each component is rescaled so the final score lies on a 0-10 scale):
  1. Gene Content (40 points max):
    • 10 points per dosage-sensitive gene (HI/TS score 3)
    • 5 points per likely dosage-sensitive gene (score 2)
    • 2 points per gene with disease association
    • Cap at 40 points
  2. Dosage Sensitivity Evidence (30 points max):
    • 30 points: Multiple genes with definitive HI/TS (score 3)
    • 20 points: One gene with definitive HI/TS
    • 10 points: Genes with emerging evidence (score 2)
    • 5 points: Predicted haploinsufficiency (pLI >0.9)
  3. Population Frequency (20 points max):
    • 20 points: Absent from gnomAD, DGV
    • 10 points: Rare (<0.01%)
    • 5 points: Uncommon (0.01-0.1%)
    • 0 points: Common (0.1-1%)
    • -20 points: Very common (>1%) - likely benign
  4. Clinical Evidence (10 points max):
    • 10 points: Matching ClinVar pathogenic SV
    • 8 points: DECIPHER cases with matching phenotype
    • 5 points: Literature support for gene dosage effects
    • 3 points: Phenotype consistent with genes
Pathogenicity Score Interpretation:
| Score | Classification | Confidence | Interpretation |
|---|---|---|---|
| 9 to 10 | Pathogenic | ★★★ | High confidence pathogenic |
| 7 to <9 | Likely Pathogenic | ★★☆ | Strong evidence for pathogenicity |
| 4 to <7 | VUS | ★☆☆ | Uncertain significance |
| 2 to <4 | Likely Benign | ★★☆ | Strong evidence for benign |
| 0 to <2 | Benign | ★★★ | High confidence benign |

Implementation:
python
def calculate_pathogenicity_score(gene_content, dosage_data, frequency_data, clinical_data):
    """
    Calculate comprehensive pathogenicity score (0-10 scale).
    """
    breakdown = {}

    # 1. Gene content scoring (40 points max)
    gene_score = 0
    for gene in gene_content['fully_contained'] + gene_content['partially_disrupted']:
        dosage_info = next((d for d in dosage_data if d['gene'] == gene['symbol']), None)
        hi_score = dosage_info['hi_score'] if dosage_info else None
        if hi_score == '3':
            gene_score += 10
        elif hi_score == '2':
            gene_score += 5
        elif gene.get('omim_disease'):
            # Disease-associated genes score even when no dosage record exists
            gene_score += 2

    gene_score = min(gene_score, 40)  # Cap at 40
    breakdown['gene_content'] = gene_score / 40 * 4  # Scale to 0-4

    # 2. Dosage sensitivity scoring (30 points max)
    dosage_score = 0
    definitive_genes = sum(1 for d in dosage_data if d['hi_score'] == '3')

    if definitive_genes >= 2:
        dosage_score = 30
    elif definitive_genes == 1:
        dosage_score = 20
    else:
        emerging_genes = sum(1 for d in dosage_data if d['hi_score'] == '2')
        dosage_score = emerging_genes * 5

    dosage_score = min(dosage_score, 30)
    breakdown['dosage_sensitivity'] = dosage_score / 30 * 3  # Scale to 0-3

    # 3. Population frequency scoring (20 points max)
    freq = frequency_data.get('frequency')
    freq_score = 0  # Default covers the 0.1%-1% band
    if freq is None or freq == 0:
        freq_score = 20  # Absent (no record, or zero observed carriers)
    elif freq < 0.0001:
        freq_score = 10  # Rare
    elif freq < 0.001:
        freq_score = 5  # Uncommon
    elif freq > 0.01:
        freq_score = -20  # Common - likely benign

    breakdown['population_frequency'] = freq_score / 20 * 2  # Scale to -2 to 2

    # 4. Clinical evidence scoring (10 points max)
    clinical_score = 0
    if clinical_data.get('clinvar_pathogenic'):
        clinical_score = 10
    elif clinical_data.get('decipher_matching_phenotype'):
        clinical_score = 8
    elif clinical_data.get('literature_support'):
        clinical_score = 5

    clinical_score = min(clinical_score, 10)
    breakdown['clinical_evidence'] = clinical_score / 10 * 1  # Scale to 0-1

    # Total score (0-10 scale)
    total_score = breakdown['gene_content'] + breakdown['dosage_sensitivity'] + \
                  breakdown['population_frequency'] + breakdown['clinical_evidence']

    total_score = max(0, min(10, total_score))  # Ensure 0-10 range

    return {
        'total_score': round(total_score, 1),
        'breakdown': breakdown,
        'classification': classify_score(total_score)
    }

def classify_score(score):
    """Map score to ACMG-style classification."""
    if score >= 9:
        return 'Pathogenic'
    elif score >= 7:
        return 'Likely Pathogenic'
    elif score >= 4:
        return 'VUS'
    elif score >= 2:
        return 'Likely Benign'
    else:
        return 'Benign'
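The rescaling step is easy to check by hand. The sketch below is a minimal, self-contained illustration of how raw component points map onto the 0-10 total; the helper name `scale` and the raw point values are hypothetical and chosen only for illustration, not taken from a real case.

```python
def scale(points, max_points, weight):
    """Rescale raw component points onto that component's share of the 0-10 total."""
    return points / max_points * weight

# Hypothetical raw points for one SV: (points, max points, weight on the 0-10 scale)
raw = {
    "gene_content": (25, 40, 4),
    "dosage_sensitivity": (20, 30, 3),
    "population_frequency": (20, 20, 2),
    "clinical_evidence": (8, 10, 1),
}

breakdown = {name: scale(*values) for name, values in raw.items()}
total = max(0, min(10, sum(breakdown.values())))  # clamp to the 0-10 range

for name, value in breakdown.items():
    print(f"{name:<22}{value:.2f}")
print(f"{'total':<22}{total:.2f}")
```

With these inputs the components come to 2.5, 2.0, 2.0, and 0.8, for a total of 7.3, which falls in the Likely Pathogenic band.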
Report Section:

5. Pathogenicity Scoring

Quantitative Assessment (0-10 Scale)

| Component | Points | Max | Contribution | Rationale |
|---|---|---|---|---|
| Gene Content | 4.0 | 4 | 40% | KANSL1 (HI score 3), MAPT (HI score 2) |
| Dosage Sensitivity | 2.5 | 3 | 25% | One definitive HI gene (KANSL1) |
| Population Frequency | 2.0 | 2 | 20% | Absent from gnomAD SVs |
| Clinical Evidence | 1.0 | 1 | 10% | ClinVar pathogenic match |
| Total Score | 9.5 | 10 | 95% | |

Classification: Pathogenic (★★★ High Confidence)
Interpretation: Score of 9.5/10 indicates high confidence pathogenic SV. Deletion encompasses established haploinsufficient gene (KANSL1), absent from population databases, and matches known pathogenic ClinVar variant.

Score Breakdown Visualization

Gene Content:        ████████████████████████████████████████ 4.0/4
Dosage Sensitivity:  █████████████████████████████████░░░░░░░ 2.5/3
Population Freq:     ████████████████████████████████████████ 2.0/2
Clinical Evidence:   ████████████████████████████████████████ 1.0/1
                     ─────────────────────────────────────────
Total:               ██████████████████████████████████████░░ 9.5/10
Key Drivers of Pathogenicity:
  1. KANSL1 haploinsufficiency (definitive evidence)
  2. Exact match to known pathogenic deletion
  3. Absence from population databases
  4. Phenotype consistency with Koolen-De Vries syndrome

---
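The bar chart in the score breakdown can be generated rather than typed by hand, which keeps each bar proportional to its score. The following is a minimal sketch; `render_bar` and `render_breakdown` are hypothetical helper names, and the 40-character width is an assumption matching the bars shown in the report.

```python
def render_bar(value, maximum, width=40):
    """Return a text bar whose filled share equals value / maximum."""
    fraction = max(0.0, min(1.0, value / maximum))  # negative scores render as empty
    filled = round(fraction * width)
    return "█" * filled + "░" * (width - filled)

def render_breakdown(components, width=40):
    """Render one labelled bar per (label, value, maximum) tuple."""
    lines = []
    for label, value, maximum in components:
        name = label + ":"
        lines.append(f"{name:<21}{render_bar(value, maximum, width)} {value:.1f}/{maximum}")
    return "\n".join(lines)

print(render_breakdown([
    ("Gene Content", 4.0, 4),
    ("Dosage Sensitivity", 2.5, 3),
    ("Population Freq", 2.0, 2),
    ("Clinical Evidence", 1.0, 1),
    ("Total", 9.5, 10),
]))
```

Clamping the fraction to the 0-1 range matters here because the population frequency component can be negative for very common SVs.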

Phase 6: Literature & Clinical Evidence

Goal: Find case reports, functional studies, and clinical validation
Tools:
| Tool | Purpose | Coverage |
|---|---|---|
| PubMed_search | Peer-reviewed literature | Comprehensive |
| DECIPHER_search | Patient case database | Developmental disorders |
| EuropePMC_search | European literature | Additional coverage |

Search Strategies:
python
def comprehensive_literature_search(tu, genes, sv_type, phenotype):
    """
    Search literature for SV evidence.
    """
    # 1. Gene-specific searches
    literature = []
    for gene in genes:
        # Dosage sensitivity literature
        dosage_papers = tu.tools.PubMed_search(
            query=f'"{gene}" AND (haploinsufficiency OR dosage sensitivity OR deletion syndrome)',
            max_results=20
        )

        # Case reports
        case_papers = tu.tools.PubMed_search(
            query=f'"{gene}" AND deletion AND {phenotype}',
            max_results=15
        )

        literature.append({
            'gene': gene,
            'dosage_papers': dosage_papers,
            'case_reports': case_papers
        })

    # 2. SV-specific searches
    sv_papers = None  # Stays None for SV types without a dedicated query
    if sv_type == 'DEL':
        sv_papers = tu.tools.PubMed_search(
            query=f'deletion AND {" AND ".join(genes[:3])} AND syndrome',
            max_results=25
        )

    # 3. DECIPHER cases
    decipher_cases = []
    for gene in genes:
        cases = tu.tools.DECIPHER_search(
            query=gene,
            search_type="gene"
        )
        decipher_cases.append(cases)

    return {
        'gene_literature': literature,
        'sv_literature': sv_papers,
        'decipher_cases': decipher_cases
    }
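The query strings used in the search function can be previewed without calling any tool, which helps when checking boolean syntax before spending API calls. This is a minimal sketch that mirrors the three query templates from the function; `build_pubmed_queries` is a hypothetical helper name, not part of ToolUniverse.

```python
def build_pubmed_queries(genes, sv_type, phenotype):
    """Return the PubMed query strings for a gene list, SV type, and phenotype."""
    queries = []
    for gene in genes:
        # Dosage sensitivity literature for this gene
        queries.append(
            f'"{gene}" AND (haploinsufficiency OR dosage sensitivity OR deletion syndrome)'
        )
        # Case reports linking this gene's deletion to the phenotype
        queries.append(f'"{gene}" AND deletion AND {phenotype}')
    if sv_type == "DEL" and genes:
        # Region-level query limited to the first three genes
        queries.append(f'deletion AND {" AND ".join(genes[:3])} AND syndrome')
    return queries

for query in build_pubmed_queries(["KANSL1", "MAPT"], "DEL", "hypotonia"):
    print(query)
```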
Report Section:

6. Literature & Clinical Evidence

Key Publications

| Study | Finding | Evidence Type | PMID |
|---|---|---|---|
| Koolen et al., 2006 | Described 17q21.31 microdeletion syndrome | Original description | 16222315 |
| Koolen et al., 2008 | KANSL1 haploinsufficiency confirmed | Functional validation | 18394581 |
| Zollino et al., 2012 | Phenotype characterization (n=52) | Clinical series | 22736773 |

Key Findings:
  • 17q21.31 deletion is recurrent (mediated by LCRs)
  • KANSL1 haploinsufficiency is primary mechanism
  • Phenotype: ID (100%), hypotonia (95%), friendly demeanor (85%)
  • Penetrance: >95% for developmental features
Source: PubMed via PubMed_search

DECIPHER Patient Cases (n=45)

Phenotype Frequency in DECIPHER Cohort:
| Feature | Frequency | Match to Patient |
|---|---|---|
| Intellectual disability | 45/45 (100%) | ✓ Yes |
| Hypotonia | 42/45 (93%) | ✓ Yes |
| Feeding difficulties | 38/45 (84%) | ✓ Yes |
| Distinctive facies | 40/45 (89%) | ✓ Yes |
| Friendly personality | 35/45 (78%) | Unknown |

Phenotype Match: Patient phenotype highly consistent with DECIPHER cohort (4/4 assessable features present).
ACMG Code: PP4 (Supporting) - Patient's clinical features consistent with gene's known phenotype
Source: DECIPHER via DECIPHER_search

Functional Evidence for KANSL1 Dosage Sensitivity

| Study | Model | Finding | PMID |
|---|---|---|---|
| Koolen et al., 2012 | Patient cells | Reduced KANSL1 protein | 22736773 |
| Zollino et al., 2015 | Mouse model | Kansl1+/- recapitulates phenotype | 25607366 |
| Arbogast et al., 2017 | Zebrafish | kansl1 knockdown → developmental defects | 28666126 |

Strength of Evidence: ★★★ (High) - Multiple independent studies confirm haploinsufficiency mechanism
ACMG Code: PS3_Moderate - Well-established functional studies showing dosage sensitivity

---

Phase 7: ACMG-Adapted Classification

Goal: Apply ACMG/ClinGen criteria adapted for SVs
SV-Specific ACMG Criteria:

Pathogenic Evidence Codes

| Code | Strength | Criteria | SV Application |
|---|---|---|---|
| PVS1 | Very Strong | Null variant in HI gene | Complete deletion of HI gene |
| PS1 | Strong | Same SV as known pathogenic | ≥70% reciprocal overlap with ClinVar pathogenic |
| PS2 | Strong | De novo (maternity/paternity confirmed) | De novo SV in patient with matching phenotype |
| PS3 | Strong | Functional studies | Gene dosage effects demonstrated |
| PS4 | Strong | Case-control enrichment | SV enriched in cases vs controls |
| PM1 | Moderate | Critical region | Deletion of exons in HI gene |
| PM2 | Moderate | Absent from controls | Not in gnomAD SVs, DGV |
| PM3 | Moderate | Recessive: homozygous or compound het | Both alleles affected (rare for SVs) |
| PM4 | Moderate | Protein length change | In-frame deletion/duplication |
| PM5 | Moderate | Similar SVs pathogenic | Nearby SVs in ClinVar pathogenic |
| PM6 | Moderate | De novo (no confirmation) | De novo SV, phenotype consistent |
| PP1 | Supporting | Segregation in family | SV segregates with phenotype |
| PP2 | Supporting | Gene/pathway relevant | Genes in SV match phenotype |
| PP3 | Supporting | Computational evidence | Multiple predictors support haploinsufficiency |
| PP4 | Supporting | Phenotype consistent | Patient phenotype matches gene-disease |
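
The PS1 row depends on reciprocal overlap: the overlapping length divided by each SV's own length, taking the smaller of the two fractions. A minimal sketch of that check follows. The function names and the dictionary layout are hypothetical, coordinates are assumed to be half-open (length = end - start), and the "known" interval is an invented example rather than a real ClinVar record.

```python
def reciprocal_overlap(start_a, end_a, start_b, end_b):
    """Return the reciprocal overlap (0.0-1.0) of two intervals on one chromosome."""
    overlap = min(end_a, end_b) - max(start_a, start_b)
    if overlap <= 0:
        return 0.0  # disjoint or merely adjacent intervals
    # Each SV must be covered by the overlap, so take the smaller fraction
    return min(overlap / (end_a - start_a), overlap / (end_b - start_b))

def is_same_sv(query, known, threshold=0.70):
    """Treat two SVs as the same event when type, chromosome, and overlap agree."""
    if query["chrom"] != known["chrom"] or query["type"] != known["type"]:
        return False
    score = reciprocal_overlap(query["start"], query["end"], known["start"], known["end"])
    return score >= threshold

query = {"chrom": "17", "start": 44039927, "end": 44352659, "type": "DEL"}
known = {"chrom": "17", "start": 44030000, "end": 44350000, "type": "DEL"}  # invented
print(is_same_sv(query, known))
```

Using the minimum of the two fractions prevents a small SV nested inside a much larger one from counting as a match, since the larger SV would be only partly covered.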

Benign Evidence Codes

| Code | Strength | Criteria | SV Application |
|---|---|---|---|
| BA1 | Stand-Alone | MAF >5% | SV frequency >5% in gnomAD |
| BS1 | Strong | MAF too high for disease | SV frequency >1% |
| BS2 | Strong | Healthy adult with phenotype-associated genotype | SV in healthy individual (careful - reduced penetrance) |
| BS3 | Strong | Functional studies show no effect | No dosage sensitivity demonstrated |
| BS4 | Strong | Non-segregation | SV doesn't segregate with phenotype |
| BP1 | Supporting | Missense in gene without known LOF | N/A for SVs |
| BP2 | Supporting | Observed in trans with pathogenic | SV + pathogenic SNV = compound het (patient unaffected) |
| BP4 | Supporting | Computational evidence benign | Predictors suggest no haploinsufficiency |
| BP5 | Supporting | Found in case with alt cause | Phenotype explained by different variant |
| BP7 | Supporting | Synonymous with no splice effect | N/A for SVs |

Classification Algorithm (ACMG SV Criteria):

| Classification | Evidence Required |
|---|---|
| Pathogenic | PVS1 + PS1; OR 2 Strong; OR 1 Strong + 3 Moderate |
| Likely Pathogenic | 1 Very Strong + 1 Moderate; OR 1 Strong + 2 Moderate; OR 3 Moderate |
| VUS | Criteria not met; OR conflicting evidence |
| Likely Benign | 1 Strong + 1 Supporting; OR 2 Supporting |
| Benign | BA1; OR BS1 + BS2; OR 2 Strong |

Implementation:
python
def apply_acmg_criteria(gene_content, dosage_data, frequency_data, clinical_data, inheritance):
    """
    Apply ACMG SV criteria and calculate classification.
    """
    evidence = {
        'pathogenic': [],
        'benign': []
    }

    # PVS1: Complete deletion of HI gene
    # The HI gene itself must be fully contained, not just any gene in the SV
    contained_symbols = {g['symbol'] for g in gene_content['fully_contained']}
    hi_genes = [d for d in dosage_data
                if d['hi_score'] == '3' and d['gene'] in contained_symbols]
    if hi_genes:
        evidence['pathogenic'].append({
            'code': 'PVS1',
            'strength': 'Very Strong',
            'rationale': f"Complete deletion of haploinsufficient gene(s): {', '.join(g['gene'] for g in hi_genes)}"
        })

    # PS1: Same as known pathogenic SV
    if clinical_data.get('clinvar_pathogenic_match'):
        evidence['pathogenic'].append({
            'code': 'PS1',
            'strength': 'Strong',
            # .get avoids a KeyError when the match carries no ClinVar ID
            'rationale': f"≥70% reciprocal overlap with ClinVar pathogenic SV: {clinical_data.get('clinvar_id', 'ID not recorded')}"
        })

    # PS2: De novo with phenotype match
    if inheritance == 'de_novo' and clinical_data.get('phenotype_match'):
        evidence['pathogenic'].append({
            'code': 'PS2',
            'strength': 'Strong',
            'rationale': "De novo occurrence in patient with consistent phenotype"
        })

    # PS3: Functional studies
    if clinical_data.get('functional_evidence'):
        evidence['pathogenic'].append({
            'code': 'PS3',
            'strength': 'Strong',
            'rationale': "Well-established functional studies demonstrate dosage sensitivity"
        })

    # PM2: Absent from controls
    if frequency_data.get('frequency') == 0 or frequency_data.get('frequency') is None:
        evidence['pathogenic'].append({
            'code': 'PM2',
            'strength': 'Moderate',
            'rationale': "Absent from gnomAD SV database and DGV"
        })

    # PP4: Phenotype consistent
    if clinical_data.get('phenotype_consistent'):
        evidence['pathogenic'].append({
            'code': 'PP4',
            'strength': 'Supporting',
            'rationale': "Patient phenotype highly consistent with gene-disease association"
        })

    # A missing or None frequency means "not observed"; use 0 for the benign checks
    freq = frequency_data.get('frequency') or 0

    # BA1: Common variant
    if freq > 0.05:
        evidence['benign'].append({
            'code': 'BA1',
            'strength': 'Stand-Alone',
            'rationale': f"Frequency {freq:.3f} too high for rare disease"
        })

    # BS1: High frequency
    if 0.01 < freq <= 0.05:
        evidence['benign'].append({
            'code': 'BS1',
            'strength': 'Strong',
            'rationale': f"Frequency {freq:.3f} exceeds expected for disease"
        })

    # Calculate classification
    classification = determine_classification(evidence)

    return {
        'evidence': evidence,
        'classification': classification['class'],
        'confidence': classification['confidence']
    }

def determine_classification(evidence):
    """
    Apply ACMG classification rules.
    """
    path = evidence['pathogenic']
    ben = evidence['benign']

    # Count evidence by strength
    very_strong = len([e for e in path if e['strength'] == 'Very Strong'])
    strong_path = len([e for e in path if e['strength'] == 'Strong'])
    moderate_path = len([e for e in path if e['strength'] == 'Moderate'])
    supporting_path = len([e for e in path if e['strength'] == 'Supporting'])

    standalone_ben = len([e for e in ben if e['strength'] == 'Stand-Alone'])
    strong_ben = len([e for e in ben if e['strength'] == 'Strong'])
    supporting_ben = len([e for e in ben if e['strength'] == 'Supporting'])

    # Benign criteria; stand-alone evidence (BA1) decides the call by itself
    if standalone_ben >= 1:
        return {'class': 'Benign', 'confidence': '★★★'}
    benign_call = None
    if strong_ben >= 2:
        benign_call = {'class': 'Benign', 'confidence': '★★★'}
    elif (strong_ben >= 1 and supporting_ben >= 1) or supporting_ben >= 2:
        benign_call = {'class': 'Likely Benign', 'confidence': '★★☆'}

    # Pathogenic criteria
    pathogenic_call = None
    if (very_strong >= 1 and strong_path >= 1) or strong_path >= 2 \
            or (strong_path >= 1 and moderate_path >= 3):
        pathogenic_call = {'class': 'Pathogenic', 'confidence': '★★★'}
    elif (very_strong >= 1 and moderate_path >= 1) \
            or (strong_path >= 1 and moderate_path >= 2) \
            or (strong_path >= 1 and moderate_path >= 1 and supporting_path >= 1):
        pathogenic_call = {'class': 'Likely Pathogenic', 'confidence': '★★☆'}
    elif moderate_path >= 3:
        pathogenic_call = {'class': 'Likely Pathogenic', 'confidence': '★☆☆'}

    # Conflicting benign and pathogenic calls resolve to VUS
    if benign_call and pathogenic_call:
        return {'class': 'VUS', 'confidence': '★☆☆'}
    if benign_call or pathogenic_call:
        return benign_call or pathogenic_call

    # Default to VUS
    return {'class': 'VUS', 'confidence': '★☆☆'}
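
As a quick check of the combining rules, the sketch below restates the classification table in a compact, self-contained form and applies it to a few evidence sets. The function name `combine` is hypothetical, and the rules are limited to the combinations listed in the table, so this is an illustration rather than a replacement for the function above.

```python
from collections import Counter

def combine(pathogenic_strengths, benign_strengths=()):
    """Classify from lists of evidence strengths using the table's combining rules."""
    p, b = Counter(pathogenic_strengths), Counter(benign_strengths)
    vs, s, m = p["Very Strong"], p["Strong"], p["Moderate"]

    if b["Stand-Alone"] >= 1:
        return "Benign"  # BA1 decides the call by itself

    benign = None
    if b["Strong"] >= 2:
        benign = "Benign"
    elif (b["Strong"] >= 1 and b["Supporting"] >= 1) or b["Supporting"] >= 2:
        benign = "Likely Benign"

    pathogenic = None
    if (vs >= 1 and s >= 1) or s >= 2 or (s >= 1 and m >= 3):
        pathogenic = "Pathogenic"
    elif (vs >= 1 and m >= 1) or (s >= 1 and m >= 2) or m >= 3:
        pathogenic = "Likely Pathogenic"

    if benign and pathogenic:
        return "VUS"  # conflicting evidence
    return benign or pathogenic or "VUS"

# One Very Strong, one Strong, one Moderate, one Supporting (e.g. PVS1, PS1, PM2, PP4)
print(combine(["Very Strong", "Strong", "Moderate", "Supporting"]))
print(combine(["Moderate", "Supporting"]))
print(combine(["Strong", "Strong"], ["Supporting", "Supporting"]))
```

The three calls print Pathogenic, VUS, and VUS: the first meets the Very Strong plus Strong rule, the second meets no rule, and the third has qualifying evidence on both sides.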
Report Section:

7. ACMG-Adapted Classification

Evidence Codes Applied

Pathogenic Evidence:
| Code | Strength | Rationale |
|---|---|---|
| PVS1 | Very Strong | Complete deletion of haploinsufficient gene (KANSL1, HI score 3) |
| PS1 | Strong | ≥95% overlap with ClinVar pathogenic deletion (VCV000012345) |
| PM2 | Moderate | Absent from gnomAD SV database (>10,000 genomes) |
| PP4 | Supporting | Patient phenotype consistent with Koolen-De Vries syndrome |

Benign Evidence: None

Evidence Summary

| Pathogenic | Benign |
|---|---|
| 1 Very Strong (PVS1) | None |
| 1 Strong (PS1) | |
| 1 Moderate (PM2) | |
| 1 Supporting (PP4) | |

Classification: PATHOGENIC ★★★

Rationale: Meets the ACMG-adapted criteria for Pathogenic (1 Very Strong + 1 Strong). Complete deletion of an established haploinsufficient gene (KANSL1), with ≥95% overlap with a known pathogenic deletion.
Confidence: ★★★ (High) - Multiple independent lines of strong evidence

Classification Certainty Factors

Strengths:
  • ≥95% overlap with a well-characterized pathogenic deletion
  • Complete deletion of a definitive HI gene (KANSL1)
  • Absent from population databases
  • Phenotype highly consistent with the gene-disease association (Koolen-De Vries syndrome)
Limitations:
  • None significant - this is a well-established pathogenic SV

---

Output Structure

Report File:
SV_analysis_report.md

Structural Variant Analysis Report: [SV_IDENTIFIER]

Generated: [Date] | Analyst: ToolUniverse SV Interpreter

Executive Summary

| Field | Value |
|-------|-------|
| SV Type | Deletion / Duplication / Inversion / Translocation |
| Coordinates | chr17:44039927-44352659 (GRCh38) |
| Size | 313 kb |
| Gene Content | 2 genes fully contained, 0 partially disrupted |
| Classification | Pathogenic / Likely Pathogenic / VUS / Likely Benign / Benign |
| Pathogenicity Score | X.X / 10 |
| Confidence | ★★★ / ★★☆ / ★☆☆ |
| Key Finding | [One-sentence summary] |

Clinical Action: [Required / Recommended / None]

1. SV Identity & Classification

{SV type, coordinates, size, breakpoint precision, inheritance}

2. Gene Content Analysis

2.1 Fully Contained Genes

{Table of genes with functions, disease associations}

2.2 Partially Disrupted Genes

{Genes with breakpoints, domains affected}

2.3 Flanking Genes

{Genes near breakpoints, position effect risk}

3. Dosage Sensitivity Assessment

3.1 Haploinsufficient Genes

{ClinGen HI scores, pLI, evidence}

3.2 Triplosensitive Genes

{ClinGen TS scores, duplication syndromes}

3.3 Non-Dosage-Sensitive Genes

{Genes without established dosage effects}

4. Population Frequency Context

4.1 ClinVar Matches

{Known pathogenic/benign SVs}

4.2 gnomAD SV Database

{Population frequencies}

4.3 DECIPHER Patient Cases

{Similar SVs, phenotype matching}

5. Pathogenicity Scoring

5.1 Quantitative Assessment

{0-10 score with breakdown}

5.2 Score Components

{Gene content, dosage, frequency, clinical}

6. Literature & Clinical Evidence

6.1 Key Publications

{Functional studies, case series}

6.2 DECIPHER Cohort Analysis

{Phenotype frequencies, matching}

6.3 Functional Evidence

{Gene dosage studies}

7. ACMG-Adapted Classification

7.1 Evidence Codes Applied

{Pathogenic and benign codes with rationale}

7.2 Classification

{Final classification with confidence}

7.3 Certainty Factors

{Strengths and limitations}

8. Clinical Recommendations

8.1 For Affected Individual

{Testing, management, surveillance}

8.2 For Family Members

{Cascade testing, genetic counseling}

8.3 Reproductive Considerations

{Recurrence risk, prenatal testing}

9. Limitations & Uncertainties

{Missing data, conflicting evidence, knowledge gaps}

Data Sources

{All tools and databases queried with results}

---

Evidence Grading System

| Symbol | Confidence | Criteria |
|--------|------------|----------|
| ★★★ | High | ClinGen definitive, ClinVar expert reviewed, multiple independent studies |
| ★★☆ | Moderate | ClinGen strong/moderate, single good study, DECIPHER cohort support |
| ★☆☆ | Limited | Computational predictions only, case reports, emerging evidence |
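
The grading table can be applied mechanically once each finding is tagged with its evidence sources. The following is a minimal sketch, not part of the skill itself: the source labels (`clingen_definitive`, `case_report`, and so on) and the `grade_evidence` helper are hypothetical names chosen for illustration, and the "best available source wins" rule is an assumption about how mixed evidence should be graded.

```python
from enum import Enum


class Confidence(Enum):
    HIGH = "★★★"
    MODERATE = "★★☆"
    LIMITED = "★☆☆"


# Hypothetical source labels mapped onto the three tiers of the grading table.
TIER_BY_SOURCE = {
    "clingen_definitive": Confidence.HIGH,
    "clinvar_expert_reviewed": Confidence.HIGH,
    "multiple_independent_studies": Confidence.HIGH,
    "clingen_strong": Confidence.MODERATE,
    "clingen_moderate": Confidence.MODERATE,
    "single_good_study": Confidence.MODERATE,
    "decipher_cohort": Confidence.MODERATE,
    "computational_prediction": Confidence.LIMITED,
    "case_report": Confidence.LIMITED,
    "emerging_evidence": Confidence.LIMITED,
}

RANK = {Confidence.LIMITED: 0, Confidence.MODERATE: 1, Confidence.HIGH: 2}


def grade_evidence(sources):
    """Return the highest tier supported by any recognized source.

    Assumption: the strongest source determines the grade. Unrecognized
    or missing sources fall back to LIMITED.
    """
    tiers = [TIER_BY_SOURCE[s] for s in sources if s in TIER_BY_SOURCE]
    if not tiers:
        return Confidence.LIMITED
    return max(tiers, key=lambda tier: RANK[tier])


if __name__ == "__main__":
    print(grade_evidence(["case_report", "clingen_strong"]).value)
```

A finding backed only by computational predictions stays at ★☆☆, which matches the "computational predictions only" wording in the table.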

Special Scenarios

Scenario 1: Recurrent Microdeletion Syndrome

Additional considerations:
  • Check for recurrence mechanism (LCRs, NAHR)
  • Look for founder effects
  • Population-specific frequencies
  • Incomplete penetrance
  • Variable expressivity
Example: 22q11.2 deletion, 17q21.31 deletion (Koolen-De Vries)

Scenario 2: Balanced Translocation (No Gene Disruption)

Assessment approach:
  • If no genes disrupted: Likely benign (in most cases)
  • Check for cryptic imbalances
  • Consider position effects (rare)
  • Reproductive risk (unbalanced offspring)
Classification: Usually VUS or Likely Benign unless offspring affected

Scenario 3: Complex Rearrangement

Analysis strategy:
  • Break down into component SVs
  • Assess each breakpoint independently
  • Look for chromothripsis pattern
  • Consider cumulative gene dosage effects
  • Check for DNA repair defects

Scenario 4: Small In-Frame Deletion/Duplication

Special considerations:
  • May not cause haploinsufficiency
  • Check if critical domain affected
  • Look for similar variants in ClinVar
  • Consider protein structural impact
  • May need functional studies

Quantified Minimums

| Section | Requirement |
|---------|-------------|
| Gene content | All genes in SV region annotated |
| Dosage sensitivity | ClinGen scores for all genes (if available) |
| Population frequency | Check gnomAD SV + ClinVar + DGV |
| Literature search | ≥2 search strategies (PubMed + DECIPHER) |
| ACMG codes | All applicable codes listed |
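
These minimums lend themselves to a pre-submission check. The sketch below assumes the analysis state is tracked in a plain dictionary; the key names (`genes`, `annotated_genes`, `dosage_checked_genes`, `frequency_sources`, `literature_strategies`, `acmg_codes`) and the `check_minimums` function are hypothetical and not part of any ToolUniverse API. Whether every *applicable* ACMG code was listed cannot be verified mechanically, so the check only confirms that the field was filled in.

```python
REQUIRED_FREQUENCY_SOURCES = {"gnomAD SV", "ClinVar", "DGV"}
MIN_LITERATURE_STRATEGIES = 2


def check_minimums(report):
    """Return a list of unmet minimums; an empty list means all checks pass."""
    problems = []
    genes = set(report.get("genes", []))

    # Gene content: every gene in the SV region must be annotated.
    unannotated = genes - set(report.get("annotated_genes", []))
    if unannotated:
        problems.append("Gene content: not annotated: " + ", ".join(sorted(unannotated)))

    # Dosage sensitivity: every gene must have been looked up in ClinGen,
    # even if no score turned out to be available.
    unchecked = genes - set(report.get("dosage_checked_genes", []))
    if unchecked:
        problems.append("Dosage sensitivity: not checked: " + ", ".join(sorted(unchecked)))

    # Population frequency: all three resources must have been queried.
    missing = REQUIRED_FREQUENCY_SOURCES - set(report.get("frequency_sources", []))
    if missing:
        problems.append("Population frequency: not queried: " + ", ".join(sorted(missing)))

    # Literature search: at least two distinct strategies.
    if len(set(report.get("literature_strategies", []))) < MIN_LITERATURE_STRATEGIES:
        problems.append("Literature search: fewer than 2 search strategies")

    # ACMG codes: the section must be present (it may legitimately be empty).
    if "acmg_codes" not in report:
        problems.append("ACMG codes: section missing")

    return problems
```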

Tools Reference

Core Tools for SV Analysis

| Tool | Purpose | Required? |
|------|---------|-----------|
| `ClinGen_search_dosage_sensitivity` | HI/TS scores | Required |
| `ClinGen_search_gene_validity` | Gene-disease validity | Required |
| `ClinVar_search_variants` | Known pathogenic/benign SVs | Required |
| `DECIPHER_search` | Patient cases, phenotypes | Highly recommended |
| `Ensembl_lookup_gene` | Gene coordinates, structure | Required |
| `OMIM_search`, `OMIM_get_entry` | Gene-disease associations | Required |
| `DisGeNET_search_gene` | Additional disease associations | Recommended |
| `PubMed_search` | Literature evidence | Recommended |
| `Gene_Ontology_get_term_info` | Gene function | Supporting |

Report File Naming

SV_analysis_[TYPE]_chr[CHR]_[START]_[END]_[GENES].md

Examples:
SV_analysis_DEL_chr17_44039927_44352659_KANSL1_MAPT.md
SV_analysis_DUP_chr22_17400000_17800000_TBX1.md
SV_analysis_INV_chr11_2100000_2400000_complex.md
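
A small helper can keep file names consistent with this pattern. This is an illustrative sketch only: `report_filename` is a hypothetical function, and the choices to accept chromosomes with or without the `chr` prefix and to reject reversed coordinates are assumptions rather than documented behavior.

```python
def report_filename(sv_type, chrom, start, end, genes):
    """Build SV_analysis_[TYPE]_chr[CHR]_[START]_[END]_[GENES].md.

    `genes` is a sequence of gene symbols, or a single label such as
    ["complex"] when no specific genes are named.
    """
    if start > end:
        raise ValueError("start must not exceed end")
    if not genes:
        raise ValueError("at least one gene symbol or label is required")

    # Accept "chr17" or "17" and normalize to the bare chromosome name.
    chrom = str(chrom)
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]

    gene_part = "_".join(genes)
    return f"SV_analysis_{sv_type.upper()}_chr{chrom}_{start}_{end}_{gene_part}.md"


if __name__ == "__main__":
    print(report_filename("DEL", "chr17", 44039927, 44352659, ["KANSL1", "MAPT"]))
```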

Clinical Recommendations Framework

For Pathogenic/Likely Pathogenic SVs

| SV Type | Recommendations |
|---------|-----------------|
| Deletion (HI gene) | Genetic counseling, cascade testing, phenotype-specific surveillance |
| Duplication (TS gene) | Same as deletion; check for dosage-specific syndrome |
| Translocation (disruption) | Assess both breakpoints, consider reproductive counseling |
| Complex | Multidisciplinary evaluation, research enrollment |

For VUS

| Action | Details |
|--------|---------|
| Clinical management | Base on phenotype, not genotype |
| Follow-up | Reinterpret in 1-2 years or when phenotype evolves |
| Research | Functional studies if research-grade samples available |
| Family studies | Segregation analysis can reclassify |

For Benign/Likely Benign

| Action | Details |
|--------|---------|
| Clinical | Not expected to cause rare disease |
| Family | No cascade testing needed (unless recurrent/reproductive risk) |
| Reproductive | Balanced translocation carriers may have offspring risk |

When NOT to Use This Skill

  • Single nucleotide variants (SNVs) → Use tooluniverse-variant-interpretation skill
  • Small indels (<50 bp) → Use variant interpretation skill
  • Somatic variants in cancer → Different framework needed
  • Mitochondrial variants → Specialized interpretation required
  • Repeat expansions → Different mechanism
Use this skill for structural variants ≥50 bp requiring dosage sensitivity assessment and ACMG-adapted classification.
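
The scope rules above amount to a short routing decision. The sketch below is one way to express it, under stated assumptions: `route_variant`, its parameters, and the `"out_of_scope"` return value are hypothetical names for illustration, and variants without a meaningful size (for example translocations, passed as `size_bp=None`) are assumed to belong to this skill.

```python
SV_MIN_SIZE_BP = 50  # threshold stated above: structural variants are >= 50 bp

SV_SKILL = "tooluniverse-structural-variant-analysis"
SMALL_VARIANT_SKILL = "tooluniverse-variant-interpretation"


def route_variant(kind, size_bp=None, somatic=False, mitochondrial=False):
    """Pick the skill for a variant, following the exclusion list above.

    kind: free-text variant type such as "snv", "deletion", "translocation",
          or "repeat_expansion" (labels are illustrative).
    size_bp: variant length in base pairs, or None when size does not apply.
    """
    kind = kind.lower()

    # Somatic, mitochondrial, and repeat-expansion variants need other frameworks.
    if somatic or mitochondrial or kind == "repeat_expansion":
        return "out_of_scope"

    # SNVs and indels under 50 bp go to the variant interpretation skill.
    if kind == "snv":
        return SMALL_VARIANT_SKILL
    if size_bp is not None and size_bp < SV_MIN_SIZE_BP:
        return SMALL_VARIANT_SKILL

    return SV_SKILL
```

For the 17q21.31 example used throughout, `route_variant("deletion", 312733)` selects the structural variant skill, while a 12 bp deletion is sent to variant interpretation.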

See Also

  • EXAMPLES.md - Sample SV interpretations
  • README.md - Quick start guide
  • tooluniverse-variant-interpretation - For SNVs and small indels
  • ClinGen Dosage Sensitivity Map: https://www.ncbi.nlm.nih.gov/projects/dbvar/clingen/
  • ACMG SV Guidelines: Riggs et al., Genet Med 2020 (PMID: 31690835)