boltz

Boltz Structure Prediction

Prerequisites

| Requirement | Minimum | Recommended |
|-------------|---------|-------------|
| Python | 3.10+ | 3.11 |
| CUDA | 12.0+ | 12.1+ |
| GPU VRAM | 24 GB | 48 GB (L40S) |
| RAM | 32 GB | 64 GB |

How to run

First time? See Installation Guide to set up Modal and biomodals.

Option 1: Modal

```bash
cd biomodals
modal run modal_boltz.py \
  --input-faa complex.fasta \
  --out-dir predictions/
```
GPU: L40S (48 GB) | Timeout: 1800 s (default)

Option 2: Local installation

```bash
pip install boltz

# The input file is a positional argument; --out_dir sets the output directory.
boltz predict complex.fasta \
  --out_dir predictions/
```

Key parameters

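| Parameter | Default | Range | Description |
|-----------|---------|-------|-------------|
| `--recycling_steps` | 3 | 1-10 | Recycling iterations |
| `--sampling_steps` | 200 | 50-500 | Diffusion steps |
| `--use_msa_server` | true | bool | Use MSA server |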
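
When predictions are launched from a script, it can help to check the two numeric parameters against the ranges in the table before starting a run. The following is a minimal sketch: the helper name `sampling_args` and the `PARAM_RANGES` dictionary are hypothetical, and the ranges are copied from the table rather than taken from the Boltz source.

```python
# Ranges copied from the parameter table; this helper is illustrative only.
PARAM_RANGES = {
    "recycling_steps": (1, 10),
    "sampling_steps": (50, 500),
}


def sampling_args(recycling_steps=3, sampling_steps=200):
    """Return CLI arguments for the two numeric parameters after a range check."""
    values = {
        "recycling_steps": recycling_steps,
        "sampling_steps": sampling_steps,
    }
    args = []
    for name, value in values.items():
        low, high = PARAM_RANGES[name]
        if not low <= value <= high:
            raise ValueError(
                f"{name}={value} is outside the documented range {low}-{high}"
            )
        args += [f"--{name}", str(value)]
    return args


print(sampling_args(recycling_steps=5))
# ['--recycling_steps', '5', '--sampling_steps', '200']
```

The returned list can be appended to whatever command list is passed to `subprocess.run`.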

FASTA Format

```
>protein_A
MKTAYIAKQRQISFVK...
>protein_B
MVLSPADKTNVKAAWG...
```
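
A quick pre-flight check on the FASTA file can catch two of the failures listed under Error interpretation (single-chain input and non-standard residues) before any GPU time is spent. This is a sketch under stated assumptions: `parse_fasta` and `check_complex` are hypothetical helpers, not part of Boltz, and the check only accepts the 20 standard amino-acid letters.

```python
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_complex(records):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    if len(records) < 2:
        problems.append(f"expected 2+ chains, found {len(records)}")
    for header, seq in records:
        bad = sorted(set(seq) - STANDARD_AA)
        if not seq:
            problems.append(f"{header}: empty sequence")
        elif bad:
            problems.append(f"{header}: non-standard residues {''.join(bad)}")
    return problems


# Toy two-chain input; the sequences are placeholders, not real targets.
toy = ">protein_A\nMKTAYIAK\n>protein_B\nMVLSPADK\n"
print(check_complex(parse_fasta(toy)))  # []
```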

Output format

```
predictions/
├── model_0.cif       # Best model (CIF format)
├── confidence.json   # pLDDT, pTM, ipTM
└── pae.npy           # PAE matrix
```
Note: Boltz outputs CIF format. Convert to PDB if needed:
```python
from Bio.PDB import MMCIFParser, PDBIO

# Parse the predicted mmCIF file (QUIET suppresses parser warnings).
parser = MMCIFParser(QUIET=True)
structure = parser.get_structure("model", "model_0.cif")

# Write the same structure back out in PDB format.
io = PDBIO()
io.set_structure(structure)
io.save("model_0.pdb")
```

Comparison

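| Feature | Boltz-1 | Boltz-2 | AF2-Multimer |
|---------|---------|---------|--------------|
| MSA-free mode | Yes | Yes | No |
| Diffusion | Yes | Yes | No |
| Speed | Fast | Faster | Slower |
| Open source | Yes | Yes | Yes |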

Sample output

Successful run

```
$ boltz predict complex.fasta --out_dir predictions/
[INFO] Loading Boltz-1 weights...
[INFO] Predicting structure...
[INFO] Saved model to predictions/model_0.cif
```

predictions/confidence.json:
```json
{
  "ptm": 0.78,
  "iptm": 0.65,
  "plddt": 0.81
}
```
What good output looks like:
  • pTM: > 0.7 (confident global structure)
  • ipTM: > 0.5 (confident interface)
  • pLDDT: > 0.7 (confident per-residue)
  • CIF file: ~100-500 KB for a typical complex
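
The thresholds above can be applied programmatically when screening many predictions. The sketch below assumes the flat `confidence.json` layout shown in the sample; `assess_confidence` and `THRESHOLDS` are illustrative names. A missing key such as `iptm` is reported as `None` rather than raising.

```python
import json

# Cut-offs taken from the "good output" list above.
THRESHOLDS = {"ptm": 0.7, "iptm": 0.5, "plddt": 0.7}


def assess_confidence(scores):
    """Map each metric to True/False against its threshold, or None if absent."""
    return {
        name: (scores[name] > cutoff if name in scores else None)
        for name, cutoff in THRESHOLDS.items()
    }


# Values from the sample confidence.json shown above.
sample = json.loads('{"ptm": 0.78, "iptm": 0.65, "plddt": 0.81}')
print(assess_confidence(sample))
# {'ptm': True, 'iptm': True, 'plddt': True}
```

For a real run, replace the inline string with `json.load(open("predictions/confidence.json"))`.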

Decision tree

Should I use Boltz?
├─ What are you predicting?
│  ├─ Protein-protein complex → Boltz ✓ or Chai or ColabFold
│  ├─ Protein + ligand → Boltz ✓ or Chai
│  └─ Single protein → Use ESMFold (faster)
├─ Need MSA?
│  ├─ No / want speed → Boltz ✓
│  └─ Yes / maximum accuracy → ColabFold
└─ Why Boltz over Chai?
   ├─ Open weights preference → Boltz ✓
   ├─ Boltz-2 speed → Boltz ✓
   └─ DNA/RNA support → Consider Chai

Typical performance

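| Campaign size | Time (L40S) | Cost (Modal) | Notes |
|---------------|-------------|--------------|-------|
| 100 complexes | 30-45 min | ~$8 | Standard validation |
| 500 complexes | 2-3 h | ~$35 | Large campaign |
| 1000 complexes | 4-6 h | ~$70 | Comprehensive |

Per-complex: ~15-30 s for a typical binder-target complex.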
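
For campaign sizes not listed in the table, a back-of-the-envelope estimate can be derived from the per-complex figure. This sketch assumes serial execution on a single GPU and a flat cost of about $0.07 per complex, which is simply the table's ~$35 divided by 500; the function name and defaults are illustrative, and real runs also include startup overhead.

```python
def estimate_campaign(n_complexes, sec_per_complex=(15, 30), usd_per_complex=0.07):
    """Return a rough (min, max) runtime in minutes and an approximate cost in USD."""
    fast, slow = sec_per_complex
    return {
        "minutes": (n_complexes * fast / 60, n_complexes * slow / 60),
        "usd": round(n_complexes * usd_per_complex, 2),
    }


print(estimate_campaign(100))
# {'minutes': (25.0, 50.0), 'usd': 7.0}
```

The 25-50 minute range for 100 complexes brackets the 30-45 minutes quoted in the table.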

Verify

```bash
find predictions -name "*.cif" | wc -l  # Should match input count
```
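
The same check can be done from Python when it needs to gate a downstream step. This is a minimal sketch: `count_predictions` is a hypothetical helper, and the expected count is supplied by the caller.

```python
from pathlib import Path


def count_predictions(pred_dir, expected):
    """Return (found, missing) for .cif files found recursively under pred_dir."""
    found = sum(1 for p in Path(pred_dir).rglob("*.cif") if p.is_file())
    return found, expected - found


if __name__ == "__main__":
    # 100 is a placeholder; use the actual number of submitted complexes.
    found, missing = count_predictions("predictions", expected=100)
    print(f"{found} structures found, {missing} missing")
```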

Troubleshooting

  • Low confidence: Increase recycling_steps
  • OOM errors: Use MSA-free mode or an A100-80GB
  • Slow prediction: Reduce sampling_steps

Error interpretation

| Error | Cause | Fix |
|-------|-------|-----|
| `RuntimeError: CUDA out of memory` | Complex too large | Use `--use_msa_server false` or a larger GPU |
| `KeyError: 'iptm'` | Single chain only | Ensure FASTA has 2+ chains |
| `FileNotFoundError: weights` | Missing model | Run `boltz download` first |
| `ValueError: invalid residue` | Non-standard AA | Check for modified residues in sequence |

Boltz-1 vs Boltz-2

| Aspect | Boltz-1 | Boltz-2 |
|--------|---------|---------|
| Speed | Fast | ~2x faster |
| Accuracy | Good | Improved |
| Ligands | Basic | Better support |
| Release | 2024 | 2025 |

Next: `protein-qc` for filtering and ranking.