boltz

Boltz Structure Prediction

Prerequisites

Requirement	Minimum	Recommended
Python	3.10+	3.11
CUDA	12.0+	12.1+
GPU VRAM	24GB	48GB (L40S)
RAM	32GB	64GB

How to run

First time? See the Installation Guide to set up Modal and biomodals.

Option 1: Modal

bash

cd biomodals
modal run modal_boltz.py \
  --input-faa complex.fasta \
  --out-dir predictions/

GPU: L40S (48GB) | Timeout: 1800s default

Option 2: Local installation

bash

pip install boltz

boltz predict complex.fasta \
  --out_dir predictions/

Key parameters

Parameter	Default	Range	Description
`--recycling_steps`	3	1-10	Recycling iterations
`--sampling_steps`	200	50-500	Diffusion steps
`--use_msa_server`	true	bool	Use MSA server

FASTA Format

>protein_A
MKTAYIAKQRQISFVK...
>protein_B
MVLSPADKTNVKAAWG...

Output format

predictions/
├── model_0.cif       # Best model (CIF format)
├── confidence.json   # pLDDT, pTM, ipTM
└── pae.npy          # PAE matrix

Note: Boltz outputs CIF format. Convert to PDB if needed:

python

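from Bio.PDB import MMCIFParser, PDBIO

# Parse the predicted mmCIF model; QUIET=True suppresses parser warnings
parser = MMCIFParser(QUIET=True)
structure = parser.get_structure("model", "model_0.cif")

# Write the same structure back out in PDB format
io = PDBIO()
io.set_structure(structure)
io.save("model_0.pdb")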

Comparison

Feature	Boltz-1	Boltz-2	AF2-Multimer
MSA-free mode	Yes	Yes	No
Diffusion	Yes	Yes	No
Speed	Fast	Faster	Slower
Open source	Yes	Yes	Yes

Sample output

Successful run

$ boltz predict complex.fasta --out_dir predictions/
[INFO] Loading Boltz-1 weights...
[INFO] Predicting structure...
[INFO] Saved model to predictions/model_0.cif

predictions/confidence.json:
{
  "ptm": 0.78,
  "iptm": 0.65,
  "plddt": 0.81
}

What good output looks like:

pTM: > 0.7 (confident global structure)
ipTM: > 0.5 (confident interface)
pLDDT: > 0.7 (confident per-residue)
CIF file: ~100-500 KB for typical complex
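
The thresholds above can be applied automatically when screening many predictions. The following is a minimal sketch, assuming `confidence.json` has the flat layout shown in the sample (`ptm`, `iptm`, `plddt` as top-level keys). The names `THRESHOLDS`, `check_confidence`, and `passes` are illustrative and not part of Boltz.

```python
import json
from pathlib import Path

# Cutoffs copied from the "good output" list above.
THRESHOLDS = {"ptm": 0.7, "iptm": 0.5, "plddt": 0.7}


def check_confidence(scores, thresholds=THRESHOLDS):
    """Return {metric: bool}; a metric that is missing counts as a failure."""
    return {
        name: scores.get(name) is not None and scores[name] > cutoff
        for name, cutoff in thresholds.items()
    }


def passes(path):
    """Load one confidence.json file and report whether every metric passes."""
    scores = json.loads(Path(path).read_text())
    return all(check_confidence(scores).values())


# Values taken from the sample confidence.json above.
example = {"ptm": 0.78, "iptm": 0.65, "plddt": 0.81}
print(check_confidence(example))
# {'ptm': True, 'iptm': True, 'plddt': True}
```

Treating a missing metric as a failure means a single-chain run, which has no `iptm`, is flagged rather than silently accepted.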

Decision tree

Should I use Boltz?
│
├─ What are you predicting?
│  ├─ Protein-protein complex → Boltz ✓ or Chai or ColabFold
│  ├─ Protein + ligand → Boltz ✓ or Chai
│  └─ Single protein → Use ESMFold (faster)
│
├─ Need MSA?
│  ├─ No / want speed → Boltz ✓
│  └─ Yes / maximum accuracy → ColabFold
│
└─ Why Boltz over Chai?
   ├─ Open weights preference → Boltz ✓
   ├─ Boltz-2 speed → Boltz ✓
   └─ DNA/RNA support → Consider Chai

Typical performance

Campaign Size	Time (L40S)	Cost (Modal)	Notes
100 complexes	30-45 min	~$8	Standard validation
500 complexes	2-3h	~$35	Large campaign
1000 complexes	4-6h	~$70	Comprehensive

Per-complex: ~15-30s for typical binder-target complex.
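
For campaign sizes not listed in the table, a rough estimate can be extrapolated from the per-complex figure. The sketch below assumes linear scaling with no start-up overhead. The default of about $0.07 per complex is back-calculated from the 500- and 1000-complex rows and is an assumption, not a published Modal price. Because it uses the full 15-30 s range, its upper bound is looser than the table (for example, about 8.3 h rather than 6 h for 1000 complexes).

```python
def estimate_campaign(n_complexes, sec_per_complex=(15, 30), usd_per_complex=0.07):
    """Return ((min_minutes, max_minutes), approx_cost_usd) for a campaign.

    sec_per_complex comes from the per-complex figure above; usd_per_complex
    is an assumed value derived from the table (~$35 / 500 complexes).
    """
    low, high = sec_per_complex
    minutes = (n_complexes * low / 60, n_complexes * high / 60)
    return minutes, n_complexes * usd_per_complex


(min_minutes, max_minutes), cost = estimate_campaign(250)
print(f"250 complexes: {min_minutes:.0f}-{max_minutes:.0f} min, ~${cost:.0f}")
```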

Verify

bash

find predictions -name "*.cif" | wc -l  # Should match input count
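
The same check can be scripted so that a pipeline stops when predictions are missing. This is a minimal sketch, assuming one `.cif` file is expected per input complex, as the comment above implies. `count_cif_files` and `verify_counts` are hypothetical helper names.

```python
from pathlib import Path


def count_cif_files(pred_dir):
    """Count .cif files under pred_dir recursively, like `find -name "*.cif"`."""
    return sum(1 for _ in Path(pred_dir).rglob("*.cif"))


def verify_counts(pred_dir, n_inputs):
    """Return (ok, n_found), where ok is True when the counts match."""
    n_found = count_cif_files(pred_dir)
    return n_found == n_inputs, n_found


# Example usage (requires an existing predictions/ directory):
# ok, n_found = verify_counts("predictions", n_inputs=100)
# if not ok:
#     raise SystemExit(f"expected 100 models, found {n_found}")
```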

Troubleshooting

Low confidence: Increase recycling_steps
OOM errors: Use MSA-free mode or A100-80GB
Slow prediction: Reduce sampling_steps

Error interpretation

Error	Cause	Fix
`RuntimeError: CUDA out of memory`	Complex too large	Use `--use_msa_server false` or larger GPU
`KeyError: 'iptm'`	Single chain only	Ensure FASTA has 2+ chains
`FileNotFoundError: weights`	Missing model	Run `boltz download` first
`ValueError: invalid residue`	Non-standard AA	Check for modified residues in sequence
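
Two of the errors above (a single-chain input and non-standard residues) can be caught before any GPU time is spent. The following is a minimal pre-flight sketch, assuming plain FASTA input like the example in the FASTA Format section. `read_fasta` and `preflight` are illustrative helpers, not Boltz functions, and records with duplicate headers overwrite each other.

```python
# The 20 standard amino acids in one-letter code.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def read_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line.upper()
    return records


def preflight(text):
    """Return a list of problems found; an empty list means none were found."""
    records = read_fasta(text)
    problems = []
    if len(records) < 2:
        problems.append(f"only {len(records)} chain(s); ipTM needs 2+")
    for name, seq in records.items():
        bad = sorted(set(seq) - STANDARD_AA)
        if bad:
            problems.append(f"{name}: non-standard residues {''.join(bad)}")
    return problems


print(preflight(">protein_A\nMKTAYIAKQRQISFVK\n>protein_B\nMVLSPADKTNVKAAWG\n"))
# []
print(preflight(">protein_A\nMKTX\n"))
# ['only 1 chain(s); ipTM needs 2+', 'protein_A: non-standard residues X']
```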

Boltz-1 vs Boltz-2

Aspect	Boltz-1	Boltz-2
Speed	Fast	~2x faster
Accuracy	Good	Improved
Ligands	Basic	Better support
Release	Late 2024	2025

Next: protein-qc for filtering and ranking.
