binder-design
Binder Design Tool Selection
Decision tree
De novo binder design?
│
├─ Standard target → BoltzGen (recommended)
│    All-atom output (no separate ProteinMPNN step needed)
│    Better for ligand/small molecule binding
│    Single-step design (backbone + sequence + side chains)
│
├─ Need diversity/exploration → RFdiffusion + ProteinMPNN
│    Maximum backbone diversity
│    Two-step: backbone then sequence
│
├─ Integrated validation → BindCraft
│    Built-in AF2 validation
│    End-to-end pipeline
│
├─ Ligand binding → BoltzGen ✓
│    All-atom diffusion handles ligand context
│
├─ Peptide/nanobody → Germinal
│    VHH/nanobody design
│    Germline-aware optimization
│
└─ Antibody/Nanobody
     └─ VHH design → germinal skill
Tool comparison
| Tool | Strengths | Weaknesses | Best For |
|---|---|---|---|
| BoltzGen | All-atom, single-step, ligand-aware | Higher GPU requirement | Standard (recommended) |
| BindCraft | End-to-end, built-in AF2 validation | Less diverse | Production campaigns |
| RFdiffusion | High diversity, fast | Requires ProteinMPNN | Exploration, diversity |
| Germinal | Nanobody/VHH design | Specialized | Antibody optimization |
Recommended Pipeline: BoltzGen → Chai → QC
BoltzGen provides all-atom design with built-in side-chain packing:
Target → BoltzGen → Validate → Filter
(pdb)    (all-atom) (chai)     (qc)
1. Target preparation
Fetch the structure from the PDB (see the pdb skill for guidance), then:
- Trim to the binding region plus a 10 Å buffer
- Remove waters and ligands
- Renumber chains if needed
2. Hotspot selection
- Choose 3-6 exposed residues
- Prefer charged/aromatic residues
- Cluster spatially (within 10-15 Å of each other)
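The count and clustering rules above can be checked mechanically before a design run. The following is a minimal sketch, not part of any of the tools named here: `check_hotspots` and its inputs are hypothetical, it assumes you have already extracted one CA coordinate per hotspot, and it interprets "within 10-15 Å" as a 15 Å cap on the largest pairwise distance. It does not check solvent exposure, which needs a separate SASA calculation.

```python
from itertools import combinations
from math import dist


def max_pairwise_distance(coords):
    """Return the largest distance (in Å) between any two points."""
    return max(dist(a, b) for a, b in combinations(coords, 2))


def check_hotspots(hotspots, min_count=3, max_count=6, max_spread=15.0):
    """Check hotspot count and spatial spread.

    hotspots maps a residue label (e.g. "A45") to its CA coordinate (x, y, z) in Å.
    Returns a list of warning strings; an empty list means the set passes.
    """
    warnings = []
    n = len(hotspots)
    if not (min_count <= n <= max_count):
        warnings.append(f"{n} hotspots; expected {min_count}-{max_count}")
    if n >= 2:
        spread = max_pairwise_distance(list(hotspots.values()))
        if spread > max_spread:
            warnings.append(f"hotspots span {spread:.1f} Å; expected <= {max_spread} Å")
    return warnings


# Hypothetical coordinates, for illustration only
example = {"A45": (0.0, 0.0, 0.0), "A67": (6.0, 0.0, 0.0), "A89": (0.0, 8.0, 0.0)}
print(check_hotspots(example))
```

An empty list means the set satisfies both rules; each warning names the rule that failed.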
3. Design with BoltzGen (Recommended)
First, create a YAML config file (e.g., `binder.yaml`):
```yaml
entities:
  - protein:
      id: B
      sequence: 70..100
  - file:
      path: target.cif
      include:
        - chain:
            id: A
      binding_types:
        - chain:
            id: A
            binding: 45,67,89
```
Then run:
```bash
modal run modal_boltzgen.py \
  --input-yaml binder.yaml \
  --protocol protein-anything \
  --num-designs 50
```
Why BoltzGen?
- All-atom output (no separate ProteinMPNN step needed)
- Better for ligand/small molecule binding
- Single-step design (backbone + sequence + side chains)
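When running several targets or hotspot sets, it can help to generate the config in this section from parameters instead of editing it by hand. The sketch below is an assumption-laden illustration: `binder_yaml` is a hypothetical helper, and the indentation it emits is inferred from the key order of the config in this section, so check the result against the BoltzGen documentation before relying on it.

```python
def binder_yaml(target_path, target_chain, hotspots,
                binder_id="B", min_len=70, max_len=100):
    """Render a design spec with the same keys as the config in this section.

    hotspots is a list of residue numbers on target_chain.
    The nesting below is an assumption inferred from the key order.
    """
    binding = ",".join(str(r) for r in hotspots)
    lines = [
        "entities:",
        "  - protein:",
        f"      id: {binder_id}",
        f"      sequence: {min_len}..{max_len}",  # binder length range
        "  - file:",
        f"      path: {target_path}",
        "      include:",
        "        - chain:",
        f"            id: {target_chain}",
        "      binding_types:",
        "        - chain:",
        f"            id: {target_chain}",
        f"            binding: {binding}",  # hotspot residues
    ]
    return "\n".join(lines) + "\n"


print(binder_yaml("target.cif", "A", [45, 67, 89]), end="")
```

Writing the returned string to `binder.yaml` gives a file that can be passed to `--input-yaml` as in the command above.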
4. Alternative: RFdiffusion Pipeline
For maximum diversity or when backbone-only is preferred:
```bash
# Step 1: Backbone generation
modal run modal_rfdiffusion.py \
  --pdb target.pdb \
  --contigs "A1-150/0 70-100" \
  --hotspot "A45,A67,A89" \
  --num-designs 500

# Step 2: Sequence design
modal run modal_ligandmpnn.py \
  --pdb-path backbone.pdb \
  --num-seq-per-target 16 \
  --sampling-temp 0.1
```
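In the RFdiffusion command, the contig string `A1-150/0 70-100` reads as: keep target chain A residues 1-150, insert a chain break (`/0`), then generate a binder of 70-100 residues. The sketch below assembles that string and the hotspot list from parameters and rejects hotspots outside the kept range, which guards against the wrong chain/residue numbers mistake. `rfdiffusion_command` is a hypothetical helper; only the flags already shown in this section are used.

```python
import shlex


def rfdiffusion_command(pdb, chain, start, end, hotspots,
                        min_len=70, max_len=100, num_designs=500):
    """Build the backbone-generation command as a shell-safe string."""
    outside = [r for r in hotspots if not (start <= r <= end)]
    if outside:
        raise ValueError(f"hotspots outside {chain}{start}-{end}: {outside}")
    # Target segment, chain break, then the binder length range
    contigs = f"{chain}{start}-{end}/0 {min_len}-{max_len}"
    hotspot = ",".join(f"{chain}{r}" for r in hotspots)
    args = [
        "modal", "run", "modal_rfdiffusion.py",
        "--pdb", pdb,
        "--contigs", contigs,
        "--hotspot", hotspot,
        "--num-designs", str(num_designs),
    ]
    return shlex.join(args)  # quotes arguments that contain spaces


print(rfdiffusion_command("target.pdb", "A", 1, 150, [45, 67, 89]))
```

Using `shlex.join` keeps the contig string quoted as a single argument even though it contains a space.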
5. Validation
```bash
modal run modal_chai1.py \
  --input-faa sequences.fasta \
  --out-dir predictions/
```
6. Filtering
Apply standard thresholds:
- pLDDT > 0.80
- ipTM > 0.50
- PAE_interface < 10 Å
- scRMSD < 2.0 Å
See the protein-qc skill for details.
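The four thresholds can be applied as a single pass/fail test per design. This is a minimal sketch under stated assumptions: the metric key names and the example records are hypothetical, and pLDDT is taken on a 0-1 scale as in the list above (divide by 100 first if your predictor reports 0-100).

```python
# Thresholds from the list above; the key names are assumptions
THRESHOLDS = {
    "plddt": (">", 0.80),
    "iptm": (">", 0.50),
    "pae_interface": ("<", 10.0),  # Å
    "sc_rmsd": ("<", 2.0),         # Å
}


def passes(metrics):
    """Return True only if every metric clears its threshold (strict inequalities)."""
    for key, (op, cutoff) in THRESHOLDS.items():
        value = metrics[key]
        if op == ">" and not value > cutoff:
            return False
        if op == "<" and not value < cutoff:
            return False
    return True


# Hypothetical records, for illustration only
designs = [
    {"name": "design_001", "plddt": 0.91, "iptm": 0.72, "pae_interface": 6.3, "sc_rmsd": 1.1},
    {"name": "design_002", "plddt": 0.85, "iptm": 0.41, "pae_interface": 7.9, "sc_rmsd": 1.4},
    {"name": "design_003", "plddt": 0.80, "iptm": 0.66, "pae_interface": 5.2, "sc_rmsd": 0.9},
]

kept = [d["name"] for d in designs if passes(d)]
print(kept)
```

Here only `design_001` survives: `design_002` fails on ipTM, and `design_003` fails because a pLDDT of exactly 0.80 does not satisfy the strict `>`.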
Number of designs
| Stage | Count | Purpose |
|---|---|---|
| Backbone generation | 500-1000 | Diversity |
| Sequences per backbone | 8-16 | Sequence space |
| AF2 predictions | All | Validation |
| After filtering | 50-200 | Candidates |
| Experimental testing | 10-50 | Final selection |
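The stage counts multiply through: 500 backbones with 16 sequences each give 8,000 sequences to predict. The sketch below works through that arithmetic. `funnel` is a hypothetical helper, and the 2% pass rate is an illustrative assumption chosen to land inside the 50-200 candidate range, not a measured value.

```python
def funnel(backbones, seqs_per_backbone, pass_rate, n_test):
    """Estimate how many designs remain at each stage of the campaign."""
    sequences = backbones * seqs_per_backbone
    candidates = round(sequences * pass_rate)
    return {
        "backbones": backbones,
        "sequences": sequences,
        "predictions": sequences,  # every sequence is structure-predicted
        "candidates": candidates,
        "tested": min(n_test, candidates),
    }


# 2% pass rate is an assumption for illustration
for stage, count in funnel(500, 16, 0.02, 24).items():
    print(f"{stage:12s} {count}")
```

If the estimated candidate count falls below the number you plan to test, generate more backbones rather than loosening the filters.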
Common mistakes
Wrong hotspots
- Using buried residues
- Too many hotspots (over-constrains the design)
- Wrong chain/residue numbers
Insufficient diversity
- Too few designs generated
- Sampling temperature set too low in ProteinMPNN
- Not exploring multiple backbones
Poor target preparation
- Including the full protein instead of only the binding region
- Missing important structural features
- Wrong protonation states
Timeline guide
| Step | Compute Time |
|---|---|
| RFdiffusion (500 designs) | 2-4 hours |
| ProteinMPNN (8000 sequences) | 1-2 hours |
| AF2 prediction (8000 sequences) | 12-24 hours |
| Filtering and analysis | 1-2 hours |
Total: 1-2 days of compute