binder-design

Binder Design Tool Selection

Decision tree

De novo binder design?
├─ Standard target → BoltzGen (recommended)
│   All-atom output (no separate ProteinMPNN step needed)
│   Better for ligand/small molecule binding
│   Single-step design (backbone + sequence + side chains)
├─ Need diversity/exploration → RFdiffusion + ProteinMPNN
│   Maximum backbone diversity
│   Two-step: backbone then sequence
├─ Integrated validation → BindCraft
│   Built-in AF2 validation
│   End-to-end pipeline
├─ Ligand binding → BoltzGen ✓
│   All-atom diffusion handles ligand context
├─ Peptide/nanobody → Germinal
│   VHH/nanobody design
│   Germline-aware optimization
└─ Antibody/Nanobody
    └─ VHH design → germinal skill
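
The decision tree above can be sketched as a small lookup function. This is only an illustration: the `DesignGoal` fields and the order in which the branches are checked are assumptions, since the tree itself does not state a precedence.

```python
from dataclasses import dataclass


@dataclass
class DesignGoal:
    """Hypothetical description of a campaign; field names are illustrative."""
    antibody_or_nanobody: bool = False
    ligand_binding: bool = False
    needs_backbone_diversity: bool = False
    wants_builtin_validation: bool = False


def choose_tool(goal: DesignGoal) -> str:
    """Map a design goal to a tool from the decision tree (assumed precedence)."""
    if goal.antibody_or_nanobody:
        return "Germinal"
    if goal.ligand_binding:
        return "BoltzGen"
    if goal.needs_backbone_diversity:
        return "RFdiffusion + ProteinMPNN"
    if goal.wants_builtin_validation:
        return "BindCraft"
    # Standard target: the recommended default.
    return "BoltzGen"
```

For example, `choose_tool(DesignGoal(needs_backbone_diversity=True))` selects the two-step RFdiffusion route, while an empty `DesignGoal()` falls through to BoltzGen.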

Tool comparison

| Tool | Strengths | Weaknesses | Best For |
|------|-----------|------------|----------|
| BoltzGen | All-atom, single-step, ligand-aware | Higher GPU requirement | Standard (recommended) |
| BindCraft | End-to-end, built-in AF2 validation | Less diverse | Production campaigns |
| RFdiffusion | High diversity, fast | Requires ProteinMPNN | Exploration, diversity |
| Germinal | Nanobody/VHH design | Specialized | Antibody optimization |

Recommended Pipeline: BoltzGen → Chai → QC

BoltzGen provides all-atom design with built-in side-chain packing:
Target → BoltzGen → Validate → Filter
 (pdb)  (all-atom)   (chai)     (qc)

1. Target preparation

Fetch structure from PDB

Use pdb skill for guidance

- Trim to binding region + 10 Å buffer
- Remove waters and ligands
- Renumber chains if needed
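
The trimming steps above can be sketched in plain Python. This is a minimal illustration, not part of the pdb skill: the function names are hypothetical, it assumes a standard fixed-column PDB file, and it keeps whole residues that have any atom within the buffer of the chosen binding-site residues. Reading only `ATOM` records drops waters and ligands, which are stored as `HETATM`.

```python
import math


def parse_atoms(pdb_text):
    """Yield (line, chain, resseq, xyz) for each ATOM record (fixed-column PDB)."""
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            chain = line[21]
            resseq = int(line[22:26])
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            yield line, chain, resseq, xyz


def trim_target(pdb_text, chain, site_residues, buffer=10.0):
    """Keep residues with any atom within `buffer` Å of the binding-site residues."""
    atoms = list(parse_atoms(pdb_text))
    site_xyz = [xyz for _, c, r, xyz in atoms if c == chain and r in site_residues]
    keep = set()
    for _, c, r, xyz in atoms:
        if any(math.dist(xyz, s) <= buffer for s in site_xyz):
            keep.add((c, r))
    kept = [line for line, c, r, _ in atoms if (c, r) in keep]
    return "\n".join(kept + ["END"])
```

Trimming by distance can leave short disconnected fragments, so the result is worth inspecting before it is used as a design target.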

2. Hotspot selection

  • Choose 3-6 exposed residues
  • Prefer charged/aromatic residues
  • Cluster spatially (within 10-15 Å)
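
The count and clustering criteria above are easy to check numerically. The sketch below assumes you already have one representative coordinate per hotspot (for example the Cα atom) in a dictionary; the function names and defaults are illustrative. Solvent exposure is not checked here, since that needs a separate surface-area calculation.

```python
import math
from itertools import combinations


def hotspot_spread(coords):
    """Return the largest pairwise distance (Å) among hotspot coordinates."""
    return max(
        (math.dist(a, b) for a, b in combinations(coords.values(), 2)),
        default=0.0,
    )


def check_hotspots(coords, max_spread=15.0, min_n=3, max_n=6):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not min_n <= len(coords) <= max_n:
        problems.append(f"expected {min_n}-{max_n} hotspots, got {len(coords)}")
    spread = hotspot_spread(coords)
    if spread > max_spread:
        problems.append(f"hotspots span {spread:.1f} Å, above {max_spread} Å")
    return problems
```

A call such as `check_hotspots({"A45": (0, 0, 0), "A67": (6, 0, 0), "A89": (0, 8, 0)})` returns an empty list because the three points lie within 10 Å of each other.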

3. Design with BoltzGen (Recommended)

First, create a YAML config file (e.g., binder.yaml):
```yaml
entities:
  - protein:
      id: B
      sequence: 70..100

  - file:
      path: target.cif
      include:
        - chain:
            id: A
      binding_types:
        - chain:
            id: A
            binding: 45,67,89
```
Then run:
```bash
modal run modal_boltzgen.py \
  --input-yaml binder.yaml \
  --protocol protein-anything \
  --num-designs 50
```
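
When several hotspot sets or binder length ranges are being compared, the config can be rendered from a template rather than edited by hand. The sketch below only reproduces the YAML layout shown in this section; the function name and its parameters are hypothetical, and no BoltzGen fields beyond those in the example are assumed.

```python
from textwrap import dedent

# Template mirrors the binder.yaml example in this section.
TEMPLATE = dedent("""\
    entities:
      - protein:
          id: {binder_id}
          sequence: {min_len}..{max_len}

      - file:
          path: {target_path}
          include:
            - chain:
                id: {target_chain}
          binding_types:
            - chain:
                id: {target_chain}
                binding: {hotspots}
    """)


def render_config(target_path, target_chain, hotspots,
                  min_len=70, max_len=100, binder_id="B"):
    """Return the YAML text for one design run."""
    return TEMPLATE.format(
        binder_id=binder_id,
        min_len=min_len,
        max_len=max_len,
        target_path=target_path,
        target_chain=target_chain,
        hotspots=",".join(str(h) for h in hotspots),
    )
```

Writing `render_config("target.cif", "A", [45, 67, 89])` to binder.yaml should give the same file as the hand-written example.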
Why BoltzGen?
  • All-atom output (no separate ProteinMPNN step needed)
  • Better for ligand/small molecule binding
  • Single-step design (backbone + sequence + side chains)

4. Alternative: RFdiffusion Pipeline

For maximum diversity or when backbone-only is preferred:

Step 1: Backbone generation

```bash
modal run modal_rfdiffusion.py \
  --pdb target.pdb \
  --contigs "A1-150/0 70-100" \
  --hotspot "A45,A67,A89" \
  --num-designs 500
```
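
In RFdiffusion contig syntax, `A1-150` keeps target chain A residues 1 to 150, `/0 ` marks a chain break, and `70-100` asks for a new binder chain of 70 to 100 residues. A small helper can assemble these strings and catch hotspots that fall outside the kept target range. The function name is hypothetical; the flags are the ones used in the command above.

```python
def rfdiffusion_args(pdb, chain, first, last, hotspots,
                     min_len=70, max_len=100, num_designs=500):
    """Build the argument list for the backbone-generation command."""
    outside = [r for r in hotspots if not first <= r <= last]
    if outside:
        raise ValueError(f"hotspots outside {chain}{first}-{last}: {outside}")
    # Target segment, chain break, then the binder length range.
    contigs = f"{chain}{first}-{last}/0 {min_len}-{max_len}"
    hotspot = ",".join(f"{chain}{r}" for r in hotspots)
    return [
        "--pdb", pdb,
        "--contigs", contigs,
        "--hotspot", hotspot,
        "--num-designs", str(num_designs),
    ]
```

The returned list can be appended to `["modal", "run", "modal_rfdiffusion.py"]` and passed to `subprocess.run`.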

Step 2: Sequence design

```bash
modal run modal_ligandmpnn.py \
  --pdb-path backbone.pdb \
  --num-seq-per-target 16 \
  --sampling-temp 0.1
```

5. Validation

```bash
modal run modal_chai1.py \
  --input-faa sequences.fasta \
  --out-dir predictions/
```

6. Filtering

Apply standard thresholds:
  • pLDDT > 0.80
  • ipTM > 0.50
  • PAE_interface < 10 Å
  • scRMSD < 2.0 Å
See protein-qc skill for details.
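
As a rough sketch, the thresholds above can be applied to a list of per-design metric dictionaries. The key names (`plddt`, `iptm`, `pae_interface`, `sc_rmsd`) are assumptions for illustration; map them to whatever your validation step actually writes, and rescale pLDDT to 0-1 if it is reported on a 0-100 scale.

```python
# Thresholds copied from the list above.
THRESHOLDS = {
    "plddt_min": 0.80,          # 0-1 scale
    "iptm_min": 0.50,
    "pae_interface_max": 10.0,  # Å
    "sc_rmsd_max": 2.0,         # Å
}


def passes_qc(metrics, t=THRESHOLDS):
    """Return True if one design clears every threshold."""
    return (
        metrics["plddt"] > t["plddt_min"]
        and metrics["iptm"] > t["iptm_min"]
        and metrics["pae_interface"] < t["pae_interface_max"]
        and metrics["sc_rmsd"] < t["sc_rmsd_max"]
    )


def filter_designs(designs, t=THRESHOLDS):
    """Keep passing designs, best ipTM first."""
    passing = [d for d in designs if passes_qc(d, t)]
    return sorted(passing, key=lambda d: d["iptm"], reverse=True)
```

Sorting by ipTM is one arbitrary ranking choice; a composite score may suit a given campaign better.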

Number of designs

| Stage | Count | Purpose |
|-------|-------|---------|
| Backbone generation | 500-1000 | Diversity |
| Sequences per backbone | 8-16 | Sequence space |
| AF2 predictions | All | Validation |
| After filtering | 50-200 | Candidates |
| Experimental testing | 10-50 | Final selection |
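
The counts in this table multiply through the funnel: 500 backbones at 16 sequences each gives 8000 sequences to predict. The sketch below makes that arithmetic explicit. The `pass_rate` argument is a planning assumption, not a measured value; getting 50-200 candidates from 8000 sequences corresponds to a pass rate of roughly 0.6% to 2.5%.

```python
def funnel(n_backbones, seqs_per_backbone, pass_rate):
    """Return (sequences to predict, expected candidates after filtering)."""
    n_sequences = n_backbones * seqs_per_backbone
    n_candidates = round(n_sequences * pass_rate)
    return n_sequences, n_candidates


def implied_pass_rate(n_candidates, n_sequences):
    """Fraction of predicted sequences that must pass to reach a target count."""
    return n_candidates / n_sequences
```

For example, `funnel(500, 16, 0.01)` gives 8000 sequences and about 80 candidates.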

Common mistakes

Wrong hotspots

  • Using buried residues
  • Too many hotspots (over-constrains the design)
  • Wrong chain/residue numbers
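
Wrong chain or residue numbers are cheap to catch before launching a run, especially after trimming or renumbering the target. The sketch below assumes a fixed-column PDB file and hotspot labels in the form of a one-letter chain ID followed by a residue number, such as `A45`; the function names are illustrative.

```python
def residues_in_pdb(pdb_text):
    """Return the set of (chain, residue number) pairs present in ATOM records."""
    return {
        (line[21], int(line[22:26]))
        for line in pdb_text.splitlines()
        if line.startswith("ATOM")
    }


def missing_hotspots(pdb_text, hotspots):
    """Return the hotspot labels (e.g. 'A45') that are absent from the structure."""
    present = residues_in_pdb(pdb_text)
    return [h for h in hotspots if (h[0], int(h[1:])) not in present]
```

An empty result means every hotspot maps to a residue in the file; it does not say whether that residue is exposed.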

Insufficient diversity

  • Too few designs generated
  • Low temperature in ProteinMPNN
  • Not exploring multiple backbones

Poor target preparation

  • Including full protein instead of binding region
  • Missing important structural features
  • Wrong protonation states

Timeline guide

| Step | Compute Time |
|------|--------------|
| RFdiffusion (500 designs) | 2-4 hours |
| ProteinMPNN (8000 sequences) | 1-2 hours |
| AF2 prediction (8000 sequences) | 12-24 hours |
| Filtering and analysis | 1-2 hours |

Total: 1-2 days of compute
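
Summing the per-step ranges gives 16 to 32 hours when the steps run one after another, which is in line with the stated 1-2 days. The sketch below just adds up the figures from the table; the dictionary name is illustrative, and real wall-clock time also depends on queueing and parallelism.

```python
# (low, high) compute time in hours, copied from the timeline table.
STEP_HOURS = {
    "RFdiffusion (500 designs)": (2, 4),
    "ProteinMPNN (8000 sequences)": (1, 2),
    "AF2 prediction (8000 sequences)": (12, 24),
    "Filtering and analysis": (1, 2),
}


def total_hours(steps):
    """Return (low, high) total hours assuming the steps run sequentially."""
    low = sum(lo for lo, _ in steps.values())
    high = sum(hi for _, hi in steps.values())
    return low, high
```

The prediction step dominates both ends of the range, so it is the first place to look for savings.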