tooluniverse-protein-therapeutic-design


Therapeutic Protein Designer

AI-guided de novo protein design using RFdiffusion backbone generation, ProteinMPNN sequence optimization, and structure validation for therapeutic protein development.
KEY PRINCIPLES:
  1. Structure-first design - Generate backbone geometry before sequence
  2. Target-guided - Design binders with target structure in mind
  3. Iterative validation - Predict structure to validate designs
  4. Developability-aware - Consider aggregation, immunogenicity, expression
  5. Evidence-graded - Grade designs by confidence metrics
  6. Actionable output - Provide sequences ready for experimental testing
  7. English-first queries - Always use English terms in tool calls (protein names, target names), even if the user writes in another language. Try original-language terms only as a fallback. Respond in the user's language.

When to Use

Apply when user asks:
  • "Design a protein binder for [target]"
  • "Create a therapeutic protein against [protein/epitope]"
  • "Design a protein scaffold with [property]"
  • "Optimize this protein sequence for [function]"
  • "Design a de novo enzyme for [reaction]"
  • "Generate protein variants for [target binding]"

Critical Workflow Requirements

1. Report-First Approach (MANDATORY)

  1. Create the report file FIRST:
    • File name: [TARGET]_protein_design_report.md
    • Initialize with section headers
    • Add placeholder: [Designing...]
  2. Progressively update as designs are generated
  3. Output separate files:
    • [TARGET]_designed_sequences.fasta - All designed sequences
    • [TARGET]_top_candidates.csv - Ranked candidates with metrics
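
The report-first step can be sketched as follows. This is a minimal illustration that assumes the report is written to the local filesystem; the function name `init_report` and the `#`/`##` heading markup are assumptions, while the section titles follow the report template later in this document.

```python
from pathlib import Path

# Section titles taken from the report template in this document.
REPORT_SECTIONS = [
    "Executive Summary",
    "1. Target Characterization",
    "2. Backbone Generation",
    "3. Sequence Design",
    "4. Structure Validation",
    "5. Developability Assessment",
    "6. Final Candidates",
    "7. Experimental Recommendations",
    "8. Data Sources",
]


def init_report(target, out_dir="."):
    """Create [TARGET]_protein_design_report.md with placeholder sections."""
    path = Path(out_dir) / f"{target}_protein_design_report.md"
    lines = [f"# Therapeutic Protein Design Report: {target}", ""]
    for section in REPORT_SECTIONS:
        # Every section starts with the placeholder and is filled in later.
        lines += [f"## {section}", "", "[Designing...]", ""]
    path.write_text("\n".join(lines), encoding="utf-8")
    return path
```

Calling `init_report("PD-L1")` before any tool call gives later phases a file to update in place.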

2. Design Documentation (MANDATORY)

Every design MUST include:
markdown

Design: Binder_001

Sequence: MVLSPADKTN...
Length: 85 amino acids
Target: PD-L1 (UniProt: Q9NZQ7)
Method: RFdiffusion → ProteinMPNN → ESMFold validation

Quality Metrics:

| Metric | Value | Interpretation |
|---|---|---|
| pLDDT | 88.5 | High confidence |
| pTM | 0.82 | Good fold |
| ProteinMPNN score | -2.3 | Favorable |
| Predicted binding | Strong | Based on interface pLDDT |

Source: NVIDIA NIM via `NvidiaNIM_rfdiffusion`, `NvidiaNIM_proteinmpnn`, `NvidiaNIM_esmfold`

---

Phase 0: Tool Verification

NVIDIA NIM Tools Required

| Tool | Purpose | API Key Required |
|---|---|---|
| `NvidiaNIM_rfdiffusion` | Backbone generation | Yes |
| `NvidiaNIM_proteinmpnn` | Sequence design | Yes |
| `NvidiaNIM_esmfold` | Fast structure validation | Yes |
| `NvidiaNIM_alphafold2` | High-accuracy validation | Yes |
| `NvidiaNIM_esm2_650m` | Sequence embeddings | Yes |

Parameter Verification

| Tool | WRONG Parameter | CORRECT Parameter |
|---|---|---|
| `NvidiaNIM_rfdiffusion` | `num_steps` | `diffusion_steps` |
| `NvidiaNIM_proteinmpnn` | `pdb` | `pdb_string` |
| `NvidiaNIM_esmfold` | `seq` | `sequence` |
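
One way to guard against the wrong names is to normalize keyword arguments before each call. The sketch below is illustrative: `normalize_params` and `PARAM_FIXES` are hypothetical names, and the mapping contains only the three corrections listed in the table above.

```python
# Wrong -> correct parameter names, copied from the table above.
PARAM_FIXES = {
    "NvidiaNIM_rfdiffusion": {"num_steps": "diffusion_steps"},
    "NvidiaNIM_proteinmpnn": {"pdb": "pdb_string"},
    "NvidiaNIM_esmfold": {"seq": "sequence"},
}


def normalize_params(tool_name, params):
    """Return a copy of params with known-wrong names renamed."""
    fixes = PARAM_FIXES.get(tool_name, {})
    fixed = {}
    for name, value in params.items():
        correct = fixes.get(name, name)
        if correct in fixed:
            # Both the wrong and the correct name were supplied.
            raise ValueError(f"{tool_name}: '{name}' duplicates '{correct}'")
        fixed[correct] = value
    return fixed
```

For example, `normalize_params("NvidiaNIM_esmfold", {"seq": "MVLS"})` returns `{"sequence": "MVLS"}`; unknown tools and unlisted parameters pass through unchanged.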

Workflow Overview

Phase 1: Target Characterization
├── Get target structure (PDB, EMDB cryo-EM, or AlphaFold)
├── Identify binding epitope
├── Analyze existing binders
├── Check EMDB for membrane protein structures (NEW)
└── OUTPUT: Target profile
Phase 2: Backbone Generation (RFdiffusion)
├── Define design constraints
├── Generate multiple backbones
├── Filter by geometry quality
└── OUTPUT: Candidate backbones
Phase 3: Sequence Design (ProteinMPNN)
├── Design sequences for each backbone
├── Sample multiple sequences per backbone
├── Score by ProteinMPNN likelihood
└── OUTPUT: Designed sequences
Phase 4: Structure Validation
├── Predict structure (ESMFold/AlphaFold2)
├── Compare to designed backbone
├── Assess fold quality (pLDDT, pTM)
└── OUTPUT: Validated designs
Phase 5: Developability Assessment
├── Aggregation propensity
├── Expression likelihood
├── Immunogenicity prediction
└── OUTPUT: Developability scores
Phase 6: Report Synthesis
├── Ranked candidate list
├── Experimental recommendations
├── Next steps
└── OUTPUT: Final report

Phase 1: Target Characterization

1.1 Get Target Structure

python
def get_target_structure(tu, target_id):
    """Get a target structure from PDB, then EMDB, then fall back to prediction.

    target_id is a UniProt accession (for example, Q9NZQ7).
    """

    def by_resolution(entry):
        # Lower values mean better resolution; entries without one sort last.
        resolution = entry.get('resolution')
        return resolution if resolution is not None else float('inf')

    # Try PDB first (X-ray/NMR)
    pdb_results = tu.tools.PDB_search_by_uniprot(uniprot_id=target_id)

    if pdb_results:
        # Get highest resolution structure
        best_pdb = min(pdb_results, key=by_resolution)
        structure = tu.tools.PDB_get_structure(pdb_id=best_pdb['pdb_id'])
        return {'source': 'PDB', 'pdb_id': best_pdb['pdb_id'],
                'resolution': best_pdb.get('resolution'), 'structure': structure}

    # Try EMDB for cryo-EM structures (valuable for membrane proteins)
    protein_info = tu.tools.UniProt_get_protein_by_accession(accession=target_id)
    emdb_results = tu.tools.emdb_search(
        query=protein_info['proteinDescription']['recommendedName']['fullName']['value']
    )

    if emdb_results:
        # Get highest resolution cryo-EM entry
        best_emdb = min(emdb_results, key=by_resolution)
        # Get associated PDB model if available
        emdb_details = tu.tools.emdb_get_entry(entry_id=best_emdb['emdb_id'])
        if emdb_details.get('pdb_ids'):
            structure = tu.tools.PDB_get_structure(pdb_id=emdb_details['pdb_ids'][0])
            return {'source': 'EMDB cryo-EM', 'emdb_id': best_emdb['emdb_id'],
                    'pdb_id': emdb_details['pdb_ids'][0],
                    'resolution': best_emdb.get('resolution'), 'structure': structure}

    # Fallback to AlphaFold prediction
    sequence = tu.tools.UniProt_get_protein_sequence(accession=target_id)
    structure = tu.tools.NvidiaNIM_alphafold2(
        sequence=sequence['sequence'],
        algorithm="mmseqs2"
    )
    return {'source': 'AlphaFold2 (predicted)', 'structure': structure}

1.1b EMDB for Membrane Proteins (NEW)

When to prioritize EMDB: Membrane proteins, large complexes, and targets where conformational states matter.
python
def get_cryoem_structures(tu, target_name):
    """Get cryo-EM structures for membrane proteins/complexes."""

    # Search EMDB; treat a missing result as an empty list
    emdb_results = tu.tools.emdb_search(
        query=f"{target_name} membrane OR receptor"
    ) or []

    structures = []
    # Only the first five hits are inspected
    for entry in emdb_results[:5]:
        details = tu.tools.emdb_get_entry(entry_id=entry['emdb_id'])
        structures.append({
            'emdb_id': entry['emdb_id'],
            'resolution': entry.get('resolution', 'N/A'),
            'title': entry.get('title', 'N/A'),
            'conformational_state': details.get('state', 'Unknown'),
            'pdb_models': details.get('pdb_ids', [])
        })

    return structures
Output for Report:
markdown

1.1b Cryo-EM Structures (EMDB)

| EMDB ID | Resolution | PDB Model | Conformation |
|---|---|---|---|
| EMD-12345 | 2.8 Å | 7ABC | Active state |
| EMD-23456 | 3.1 Å | 8DEF | Inactive state |

Note: Cryo-EM structures can capture physiologically relevant conformations of membrane protein targets.
Source: EMDB

1.2 Identify Binding Epitope

python
def identify_epitope(tu, target_structure, epitope_residues=None):
    """Identify or validate binding epitope."""

    if epitope_residues:
        # User-specified epitope
        return {'residues': epitope_residues, 'source': 'user-defined'}

    # Find surface-exposed regions.
    # analyze_surface is a structural-analysis helper that this document
    # does not define; it is expected to return candidate epitope residues.
    return analyze_surface(target_structure)
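
`analyze_surface` is called above but never defined. The following is only a rough sketch of what such a helper might look like, under several assumptions: the structure is available as PDB-format text, and a residue counts as exposed when few other Cα atoms lie nearby. The `radius` and `max_neighbors` defaults are arbitrary illustrative values, and a real workflow would more likely compute solvent-accessible surface area with a dedicated tool.

```python
import math


def analyze_surface(pdb_text, radius=10.0, max_neighbors=8):
    """Flag residues whose C-alpha has few neighbours (crude exposure proxy)."""
    ca_atoms = []
    for line in pdb_text.splitlines():
        # Fixed-column PDB ATOM records; the atom name sits in columns 13-16.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            residue = (line[21], int(line[22:26]))  # (chain ID, residue number)
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            ca_atoms.append((residue, xyz))

    exposed = []
    for residue, xyz in ca_atoms:
        neighbors = sum(
            1 for other, other_xyz in ca_atoms
            if other != residue and math.dist(xyz, other_xyz) <= radius
        )
        if neighbors <= max_neighbors:
            exposed.append(residue[1])
    return {"residues": exposed, "source": "neighbor-count heuristic"}
```

The return value mirrors the user-defined branch (`residues` plus `source`), so downstream code can treat both cases alike.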

1.3 Output for Report

markdown

1. Target Characterization

1.1 Target Information

| Property | Value |
|---|---|
| Target | PD-L1 (Programmed death-ligand 1) |
| UniProt | Q9NZQ7 |
| Structure source | PDB: 4ZQK (2.0 Å resolution) |
| Binding epitope | IgV domain, residues 19-127 |
| Known binders | Atezolizumab, durvalumab, avelumab |

1.2 Epitope Analysis

| Residue Range | Type | Surface Area | Druggability |
|---|---|---|---|
| 54-68 | Loop | 850 Ų | High |
| 115-125 | Beta strand | 420 Ų | Medium |
| 19-30 | N-terminus | 380 Ų | Medium |

Selected Epitope: Residues 54-68 (PD-1 binding interface)
Source: PDB 4ZQK, surface analysis

---

Phase 2: Backbone Generation

2.1 RFdiffusion Design

python
def generate_backbones(tu, design_params):
    """Generate de novo backbones using RFdiffusion."""
    
    backbones = tu.tools.NvidiaNIM_rfdiffusion(
        diffusion_steps=design_params.get('steps', 50),
        # Additional parameters depending on design type
    )
    
    return backbones

2.2 Design Modes

| Mode | Use Case | Key Parameters |
|---|---|---|
| Unconditional | De novo scaffold | `diffusion_steps` only |
| Binder design | Target-guided binder | `target_structure`, `hotspot_residues` |
| Motif scaffolding | Functional motif embedding | `motif_sequence`, `motif_structure` |
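
The mode table can be turned into a small argument builder that fails early when a required input is missing. This is a sketch, not the tool's documented interface: the mode keys and `build_rfdiffusion_params` are hypothetical, and the parameter names are simply those listed in the table, which the NIM endpoint may or may not accept verbatim.

```python
# Required inputs per design mode, taken from the table above.
MODE_REQUIREMENTS = {
    "unconditional": [],
    "binder": ["target_structure", "hotspot_residues"],
    "motif_scaffolding": ["motif_sequence", "motif_structure"],
}


def build_rfdiffusion_params(mode, diffusion_steps=50, **inputs):
    """Assemble keyword arguments for one RFdiffusion design mode."""
    if mode not in MODE_REQUIREMENTS:
        raise ValueError(f"Unknown design mode: {mode!r}")
    missing = [key for key in MODE_REQUIREMENTS[mode] if key not in inputs]
    if missing:
        raise ValueError(f"{mode} design is missing: {', '.join(missing)}")
    params = {"diffusion_steps": diffusion_steps}
    # Copy only the inputs this mode needs; extras are ignored.
    params.update({key: inputs[key] for key in MODE_REQUIREMENTS[mode]})
    return params
```

The result can be unpacked into the tool call, for example `tu.tools.NvidiaNIM_rfdiffusion(**params)`.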

2.3 Output for Report

markdown

2. Backbone Generation

2.1 Design Parameters

| Parameter | Value |
|---|---|
| Method | RFdiffusion via NVIDIA NIM |
| Design mode | Unconditional scaffold generation |
| Diffusion steps | 50 |
| Number generated | 10 backbones |

2.2 Generated Backbones

| Backbone | Length | Topology | Quality |
|---|---|---|---|
| BB_001 | 85 aa | 3-helix bundle | Good |
| BB_002 | 92 aa | Beta sandwich | Good |
| BB_003 | 78 aa | Alpha-beta | Good |
| BB_004 | 88 aa | All-alpha | Moderate |
| BB_005 | 95 aa | Mixed | Good |

Selected for sequence design: BB_001, BB_002, BB_003, BB_005 (top 4)
Source: NVIDIA NIM via `NvidiaNIM_rfdiffusion`

---

Phase 3: Sequence Design

3.1 ProteinMPNN Design

python
def design_sequences(tu, backbone_pdb, num_sequences=8):
    """Design sequences for backbone using ProteinMPNN."""
    
    sequences = tu.tools.NvidiaNIM_proteinmpnn(
        pdb_string=backbone_pdb,
        num_sequences=num_sequences,
        temperature=0.1  # Lower = more conservative
    )
    
    return sequences

3.2 Sampling Parameters

| Parameter | Conservative | Moderate | Diverse |
|---|---|---|---|
| Temperature | 0.1 | 0.2 | 0.5 |
| Sequences per backbone | 4 | 8 | 16 |
| Use case | Validated scaffold | Exploration | Diversity |
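
The three sampling regimes can be kept as named presets so that a run records which one it used. A minimal sketch, assuming the preset names below (they are illustrative labels, not tool parameters); the values come straight from the table above.

```python
# Presets copied from the sampling table above.
SAMPLING_PRESETS = {
    "conservative": {"temperature": 0.1, "num_sequences": 4},
    "moderate": {"temperature": 0.2, "num_sequences": 8},
    "diverse": {"temperature": 0.5, "num_sequences": 16},
}


def sampling_params(preset="moderate"):
    """Return a copy of the ProteinMPNN sampling settings for a preset."""
    if preset not in SAMPLING_PRESETS:
        raise ValueError(f"Unknown preset: {preset!r}")
    return dict(SAMPLING_PRESETS[preset])


def total_sequences(preset, num_backbones):
    """Number of sequences a run will produce across all backbones."""
    return sampling_params(preset)["num_sequences"] * num_backbones
```

With four selected backbones, the moderate preset yields `total_sequences("moderate", 4) == 32` sequences to validate.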

3.3 Output for Report

markdown

3. Sequence Design

3.1 Design Parameters

| Parameter | Value |
|---|---|
| Method | ProteinMPNN via NVIDIA NIM |
| Temperature | 0.1 (conservative) |
| Sequences per backbone | 8 |
| Total sequences | 32 |

3.2 Designed Sequences (Top 10 by Score)

| Rank | Backbone | Sequence ID | Length | MPNN Score | Predicted pI |
|---|---|---|---|---|---|
| 1 | BB_001 | Seq_001_A | 85 | -1.89 | 6.2 |
| 2 | BB_002 | Seq_002_C | 92 | -1.95 | 5.8 |
| 3 | BB_001 | Seq_001_B | 85 | -2.01 | 7.1 |
| 4 | BB_003 | Seq_003_A | 78 | -2.08 | 6.5 |
| 5 | BB_005 | Seq_005_B | 95 | -2.12 | 5.4 |

3.3 Top Sequence: Seq_001_A

>Seq_001_A (85 aa, MPNN score: -1.89)
MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSH
GSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKL
Source: NVIDIA NIM via `NvidiaNIM_proteinmpnn`

---

Phase 4: Structure Validation

4.1 ESMFold Validation

python
import numpy as np

def validate_structure(tu, sequence):
    """Validate designed sequence by structure prediction."""

    # Fast validation with ESMFold
    predicted = tu.tools.NvidiaNIM_esmfold(sequence=sequence)

    # Extract quality metrics. extract_plddt and extract_ptm are helpers
    # not defined in this document; they depend on the response format.
    mean_plddt = float(np.mean(extract_plddt(predicted)))
    ptm = extract_ptm(predicted)

    return {
        'structure': predicted,
        'mean_plddt': mean_plddt,
        'ptm': ptm,
        # Thresholds match the validation criteria: mean pLDDT > 70, pTM > 0.7
        'passes': mean_plddt > 70 and ptm > 0.7
    }

4.2 Validation Criteria

| Metric | Threshold | Interpretation |
|---|---|---|
| Mean pLDDT | >70 | Confident fold |
| pTM | >0.7 | Good global topology |
| RMSD to backbone | <2 Å | Design recapitulated |
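
The thresholds above can be applied with a few lines of plain Python. The sketch below assumes the predicted structure is PDB-format text with per-residue pLDDT stored on a 0-100 scale in the B-factor column, which is the usual convention for AlphaFold and ESMFold PDB output; the actual NIM response format may differ, and both function names are hypothetical.

```python
def mean_plddt_from_pdb(pdb_text):
    """Average the B-factor column (columns 61-66) over C-alpha atoms."""
    values = [
        float(line[60:66])
        for line in pdb_text.splitlines()
        if line.startswith("ATOM") and line[12:16].strip() == "CA"
    ]
    if not values:
        raise ValueError("No C-alpha ATOM records found")
    return sum(values) / len(values)


def passes_validation(mean_plddt, ptm, rmsd=None):
    """Apply the thresholds from the table above; RMSD is checked if given."""
    ok = mean_plddt > 70 and ptm > 0.7
    if rmsd is not None:
        ok = ok and rmsd < 2.0
    return ok
```

Using the example values from this document, `passes_validation(88.5, 0.85, 1.2)` is true and `passes_validation(68.2, 0.65, 2.8)` is false.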

4.3 Output for Report

markdown

4. Structure Validation

4.1 Validation Results

| Sequence | pLDDT | pTM | RMSD to Design | Status |
|---|---|---|---|---|
| Seq_001_A | 88.5 | 0.85 | 1.2 Å | ✓ PASS |
| Seq_002_C | 82.3 | 0.79 | 1.5 Å | ✓ PASS |
| Seq_001_B | 85.1 | 0.82 | 1.3 Å | ✓ PASS |
| Seq_003_A | 79.8 | 0.76 | 1.8 Å | ✓ PASS |
| Seq_005_B | 68.2 | 0.65 | 2.8 Å | ✗ FAIL |

4.2 Top Validated Design: Seq_001_A

| Region | Residues | pLDDT | Interpretation |
|---|---|---|---|
| Helix 1 | 1-28 | 92.3 | Very high confidence |
| Loop 1 | 29-35 | 78.4 | Moderate confidence |
| Helix 2 | 36-58 | 91.8 | Very high confidence |
| Loop 2 | 59-65 | 75.2 | Moderate confidence |
| Helix 3 | 66-85 | 90.1 | Very high confidence |

Overall: Well-folded 3-helix bundle with high confidence core
Source: NVIDIA NIM via `NvidiaNIM_esmfold`

---

Phase 5: Developability Assessment

5.1 Aggregation Propensity

python
def assess_aggregation(sequence):
    """Assess aggregation propensity."""

    # The scoring method is not specified in this document.
    # compute_aggregation_score and find_hydrophobic_patches are placeholders
    # for whichever predictor is used; the score is expected on a 0-1 scale.
    score = compute_aggregation_score(sequence)    # aggregation-prone motifs
    patches = find_hydrophobic_patches(sequence)   # hydrophobic patches

    return {
        'aggregation_score': score,
        'hydrophobic_patches': patches,
        'risk_level': 'Low' if score < 0.5 else 'Medium' if score < 0.7 else 'High'
    }
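
Hydrophobic patches can be approximated from sequence alone. The sketch below is an illustrative heuristic, not the scoring method behind `aggregation_score`: the hydrophobic residue set and the minimum run length of five are assumptions, and a structure-based surface analysis would be more faithful.

```python
# Residues treated as hydrophobic for this heuristic (an assumption).
HYDROPHOBIC = set("AVILMFWY")


def find_hydrophobic_patches(sequence, min_length=5):
    """Return 1-based (start, end) spans of consecutive hydrophobic residues."""
    patches, start = [], None
    for i, aa in enumerate(sequence.upper(), start=1):
        if aa in HYDROPHOBIC:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_length:
                patches.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(sequence) + 1 - start >= min_length:
        patches.append((start, len(sequence)))
    return patches


def hydrophobic_fraction(sequence):
    """Fraction of residues in the hydrophobic set (0.0 for an empty string)."""
    if not sequence:
        return 0.0
    return sum(aa in HYDROPHOBIC for aa in sequence.upper()) / len(sequence)
```

The number of spans returned by `find_hydrophobic_patches` is the kind of count that the hydrophobic-patch thresholds in the developability metrics refer to.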

5.2 Developability Metrics

| Metric | Favorable | Marginal | Unfavorable |
|---|---|---|---|
| Aggregation score | <0.5 | 0.5-0.7 | >0.7 |
| Isoelectric point | 5-9 | 4-5 or 9-10 | <4 or >10 |
| Hydrophobic patches | <3 | 3-5 | >5 |
| Cysteine count | 0 or even | Odd | Multiple unpaired |
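
The metric table maps directly onto a few classification helpers. This is a sketch with hypothetical function names; how values exactly on a boundary (for example pI 5.0 or 9.0) are binned is an assumption, since the table's ranges meet at those points. The "multiple unpaired" cysteine case needs structural information and is not covered here.

```python
def classify_aggregation(score):
    """Bin an aggregation score using the table's 0.5 and 0.7 cut-offs."""
    if score < 0.5:
        return "Favorable"
    if score <= 0.7:
        return "Marginal"
    return "Unfavorable"


def classify_pi(pi):
    """Bin an isoelectric point: 5-9 favorable, 4-5 or 9-10 marginal."""
    if 5 <= pi <= 9:
        return "Favorable"
    if 4 <= pi <= 10:
        return "Marginal"
    return "Unfavorable"


def classify_patches(count):
    """Bin a hydrophobic patch count: <3 favorable, 3-5 marginal."""
    if count < 3:
        return "Favorable"
    if count <= 5:
        return "Marginal"
    return "Unfavorable"


def classify_cysteines(sequence):
    """Zero or an even number of cysteines is favorable; odd is marginal."""
    count = sequence.upper().count("C")
    return "Favorable" if count % 2 == 0 else "Marginal"
```

For a design with aggregation score 0.32 and pI 6.2, both `classify_aggregation(0.32)` and `classify_pi(6.2)` return `"Favorable"`.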

5.3 Output for Report

markdown

5. Developability Assessment

5.1 Developability Scores

| Design | Aggregation | pI | Cysteines | Expression | Overall |
|---|---|---|---|---|---|
| Seq_001_A | 0.32 (Low) | 6.2 | 0 | High | ★★★ |
| Seq_002_C | 0.45 (Low) | 5.8 | 2 (paired) | Medium | ★★☆ |
| Seq_001_B | 0.38 (Low) | 7.1 | 0 | High | ★★★ |
| Seq_003_A | 0.58 (Med) | 6.5 | 0 | Medium | ★★☆ |

5.2 Recommendations

Best candidate for expression: Seq_001_A
  • Low aggregation propensity
  • Near-neutral pI of 6.2 (within the favorable 5-9 range)
  • No cysteines (no risk of mispaired disulfides)
  • Predicted high E. coli expression
Source: Sequence analysis

---

Report Template

markdown

Therapeutic Protein Design Report: [TARGET]

Generated: [Date] | Query: [Original query] | Status: In Progress

Executive Summary

[Designing...]

1. Target Characterization

1.1 Target Information

[Designing...]

1.2 Binding Epitope

[Designing...]

2. Backbone Generation

2.1 Design Parameters

[Designing...]

2.2 Generated Backbones

[Designing...]

3. Sequence Design

3.1 ProteinMPNN Results

[Designing...]

3.2 Top Sequences

[Designing...]

4. Structure Validation

4.1 ESMFold Validation

[Designing...]

4.2 Quality Metrics

[Designing...]

5. Developability Assessment

5.1 Scores

[Designing...]

5.2 Recommendations

[Designing...]

6. Final Candidates

6.1 Ranked List

[Designing...]

6.2 Sequences for Testing

[Designing...]

7. Experimental Recommendations

[Designing...]

8. Data Sources

[Will be populated...]

---

Evidence Grading

| Tier | Symbol | Criteria |
|---|---|---|
| T1 | ★★★ | pLDDT >85, pTM >0.8, low aggregation, neutral pI |
| T2 | ★★☆ | pLDDT >75, pTM >0.7, acceptable developability |
| T3 | ★☆☆ | pLDDT >70, pTM >0.65, developability concerns |
| T4 | ☆☆☆ | Failed validation or major developability issues |
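
The tier criteria can be expressed as a small grading function. This is a sketch under one stated assumption: developability is summarized beforehand as one of four labels (`"good"`, `"acceptable"`, `"concerns"`, `"major"`), a simplification that this document does not itself define. `grade_design` is a hypothetical name.

```python
def grade_design(plddt, ptm, developability):
    """Return (tier, symbol) for a design.

    developability is one of 'good', 'acceptable', 'concerns', 'major'
    (labels assumed for this sketch).
    """
    if developability != "major":
        if plddt > 85 and ptm > 0.8 and developability == "good":
            return "T1", "★★★"
        if plddt > 75 and ptm > 0.7 and developability in ("good", "acceptable"):
            return "T2", "★★☆"
        if plddt > 70 and ptm > 0.65:
            return "T3", "★☆☆"
    # Failed validation or major developability issues
    return "T4", "☆☆☆"
```

With the example metrics from this document, a design at pLDDT 88.5 and pTM 0.85 with good developability grades as T1, while one at pLDDT 68.2 and pTM 0.65 grades as T4.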

Completeness Checklist

Phase 1: Target

  • Target structure obtained (PDB, EMDB, or predicted)
  • Binding epitope identified
  • Existing binders noted

Phase 2: Backbones

  • ≥5 backbones generated
  • Top 3-5 selected for sequence design
  • Selection criteria documented

Phase 3: Sequences

  • ≥8 sequences per backbone designed
  • MPNN scores reported
  • Top 10 sequences listed

Phase 4: Validation

  • All sequences validated by ESMFold
  • pLDDT and pTM reported
  • Pass/fail criteria applied
  • ≥3 passing designs

Phase 5: Developability

  • Aggregation assessed
  • pI calculated
  • Expression likelihood predicted
  • Final ranking completed

Phase 6: Deliverables

  • Ranked candidate list
  • FASTA file with sequences
  • Experimental recommendations
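
The FASTA and CSV deliverables can be written with the standard library. The sketch below assumes each candidate is a dict with `id`, `sequence`, `plddt`, and `ptm` keys and ranks by pLDDT; the key names, the CSV columns, and the ranking rule are all illustrative assumptions, and only the two file names come from this document.

```python
import csv
from pathlib import Path


def write_deliverables(target, candidates, out_dir="."):
    """Write [TARGET]_designed_sequences.fasta and [TARGET]_top_candidates.csv."""
    out = Path(out_dir)
    fasta_path = out / f"{target}_designed_sequences.fasta"
    csv_path = out / f"{target}_top_candidates.csv"

    with open(fasta_path, "w", encoding="utf-8") as fh:
        for cand in candidates:
            fh.write(f">{cand['id']}\n")
            seq = cand["sequence"]
            # Wrap sequence lines at 60 characters.
            for i in range(0, len(seq), 60):
                fh.write(seq[i:i + 60] + "\n")

    fields = ["rank", "id", "length", "plddt", "ptm"]
    ranked = sorted(candidates, key=lambda c: c["plddt"], reverse=True)
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fields)
        writer.writeheader()
        for rank, cand in enumerate(ranked, start=1):
            writer.writerow({
                "rank": rank,
                "id": cand["id"],
                "length": len(cand["sequence"]),
                "plddt": cand["plddt"],
                "ptm": cand["ptm"],
            })
    return fasta_path, csv_path
```

The FASTA file keeps every designed sequence in input order, while the CSV holds the ranked view.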

Fallback Chains

| Primary Tool | Fallback 1 | Fallback 2 |
|---|---|---|
| `NvidiaNIM_rfdiffusion` | Manual backbone design | Scaffold from PDB |
| `NvidiaNIM_proteinmpnn` | Rosetta ProteinMPNN | Manual sequence design |
| `NvidiaNIM_esmfold` | `NvidiaNIM_alphafold2` | AlphaFold DB |
| PDB structure | `NvidiaNIM_alphafold2` | AlphaFold DB |
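
A fallback chain can be run with a small helper that tries each option in order and records why earlier ones failed. This is a generic sketch: `call_with_fallback` is a hypothetical name, and each step is a zero-argument callable that wraps the real tool call.

```python
def call_with_fallback(steps):
    """Try (name, callable) pairs in order; return the first usable result."""
    errors = []
    for name, func in steps:
        try:
            result = func()
        except Exception as exc:
            # Record the failure and move on to the next option.
            errors.append(f"{name}: {exc}")
            continue
        if result is not None:
            return {"source": name, "result": result, "errors": errors}
        errors.append(f"{name}: returned no result")
    raise RuntimeError("All fallbacks failed: " + "; ".join(errors))
```

For structure validation, the steps might be `("NvidiaNIM_esmfold", lambda: tu.tools.NvidiaNIM_esmfold(sequence=seq))` followed by the AlphaFold2 call; the returned `source` records which tool produced the result for the report's data sources section.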

Tool Reference

See TOOLS_REFERENCE.md for complete tool documentation.