protein-qc

Protein Design Quality Control

Critical Limitation

Individual metrics have weak predictive power for binding. Research shows:

ROC AUC for individual structure metrics such as ipTM and PAE: 0.64-0.66 (random is 0.50)
Metrics are pre-screening filters, not affinity predictors
Composite scoring is essential for meaningful ranking

These thresholds filter out poor designs but do NOT predict binding affinity.

QC Organization

QC is organized by purpose and level:

Purpose	What it assesses	Key metrics
Binding	Interface quality, binding geometry	ipTM, PAE, SC, dG, dSASA
Expression	Manufacturability, solubility	Instability, GRAVY, pI, cysteines
Structural	Fold confidence, consistency	pLDDT, pTM, scRMSD

Each category has two levels:

Metric-level: Calculated values with thresholds (pLDDT > 0.85)
Design-level: Pattern/motif detection (odd cysteines, NG sites)

Quick Reference: All Thresholds

Category	Metric	Standard	Stringent	Source
Structural	pLDDT	> 0.85	> 0.90	AF2/Chai/Boltz
	pTM	> 0.70	> 0.80	AF2/Chai/Boltz
	scRMSD	< 2.0 Å	< 1.5 Å	Design vs pred
Binding	ipTM	> 0.50	> 0.60	AF2/Chai/Boltz
	PAE_interaction	< 12 Å	< 10 Å	AF2/Chai/Boltz
	Shape Comp (SC)	> 0.50	> 0.60	PyRosetta
	interface_dG	< -10	< -15	PyRosetta
Expression	Instability	< 40	< 30	BioPython
	GRAVY	< 0.4	< 0.2	BioPython
	ESM2 PLL	> 0.0	> 0.2	ESM2
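
As a rough illustration, the table above can be encoded as a lookup that reports which metrics a design misses at either level. This is a minimal sketch: the dictionary layout and the `failed_metrics` helper are hypothetical, and the key names assume the column names used in this document's pandas examples (`shape_complementarity`, `instability_index`, `esm2_pll_normalized`) plus the table labels for the rest.

```python
import operator

# (comparison, standard cutoff, stringent cutoff), copied from the table above.
# The layout and key names are illustrative, not part of any tool's API.
THRESHOLDS = {
    "pLDDT": (operator.gt, 0.85, 0.90),
    "pTM": (operator.gt, 0.70, 0.80),
    "scRMSD": (operator.lt, 2.0, 1.5),
    "ipTM": (operator.gt, 0.50, 0.60),
    "PAE_interaction": (operator.lt, 12.0, 10.0),
    "shape_complementarity": (operator.gt, 0.50, 0.60),
    "interface_dG": (operator.lt, -10.0, -15.0),
    "instability_index": (operator.lt, 40.0, 30.0),
    "GRAVY": (operator.lt, 0.4, 0.2),
    "esm2_pll_normalized": (operator.gt, 0.0, 0.2),
}


def failed_metrics(design, level="standard"):
    """Return names of metrics present in `design` that miss their cutoff."""
    column = {"standard": 1, "stringent": 2}[level]
    failed = []
    for name, spec in THRESHOLDS.items():
        if name not in design:
            continue  # metrics that were not computed are skipped, not failed
        compare, cutoff = spec[0], spec[column]
        if not compare(design[name], cutoff):
            failed.append(name)
    return failed


example = {"pLDDT": 0.88, "ipTM": 0.55, "scRMSD": 1.8, "PAE_interaction": 11.0}
print(failed_metrics(example, "standard"))   # []
print(failed_metrics(example, "stringent"))  # ['pLDDT', 'scRMSD', 'ipTM', 'PAE_interaction']
```

Skipping missing metrics is a design choice for the sketch; a stricter pipeline might treat a missing value as a failure.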

Design-Level Checks (Expression)

Pattern	Risk	Action
Odd cysteine count	Unpaired cysteine (free thiol)	Redesign
NG/NS/NT motifs	Deamidation	Flag/avoid
K/R >= 3 consecutive	Proteolysis	Flag
>= 6 hydrophobic run	Aggregation	Redesign

See: references/binding-qc.md, references/expression-qc.md, references/structural-qc.md
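
The pattern checks in the table above can be sketched with regular expressions. This is a minimal sketch, not the implementation in the referenced files: the `design_level_flags` helper is hypothetical, and the set of residues counted as hydrophobic (`AVILMFW`) is an assumption, since the source does not define it.

```python
import re

# Assumption: residues treated as hydrophobic for the ">= 6 run" check.
HYDROPHOBIC = "AVILMFW"

PATTERNS = {
    "deamidation_motif": re.compile(r"N[GST]"),        # NG / NS / NT
    "basic_run": re.compile(r"[KR]{3,}"),              # K/R >= 3 consecutive
    "hydrophobic_run": re.compile(rf"[{HYDROPHOBIC}]{{6,}}"),
}


def design_level_flags(sequence):
    """Return a dict of flagged patterns for a one-letter amino acid sequence."""
    seq = sequence.upper()
    flags = {}
    cysteines = seq.count("C")
    if cysteines % 2 == 1:
        flags["odd_cysteine_count"] = cysteines
    for name, pattern in PATTERNS.items():
        hits = [(m.start(), m.group()) for m in pattern.finditer(seq)]
        if hits:
            flags[name] = hits  # (0-based start position, matched residues)
    return flags


print(design_level_flags("MKTCNGSAKRKLLVVAILDE"))
```

For the toy sequence above, the function reports one cysteine, an `NG` motif at position 4, a `KRK` run at position 8, and the hydrophobic run `LLVVAIL` at position 11.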

Sequential Filtering Pipeline

python

import pandas as pd

designs = pd.read_csv('designs.csv')

# Stage 1: Structural confidence
designs = designs[designs['pLDDT'] > 0.85]

# Stage 2: Self-consistency
designs = designs[designs['scRMSD'] < 2.0]

# Stage 3: Binding quality
# Note: the PAE cutoff here (10 Å) is the stringent value; the standard cutoff is 12 Å.
designs = designs[(designs['ipTM'] > 0.5) & (designs['PAE_interaction'] < 10)]

# Stage 4: Sequence plausibility
designs = designs[designs['esm2_pll_normalized'] > 0.0]

# Stage 5: Expression checks (design-level)
designs = designs[designs['cysteine_count'] % 2 == 0]  # Even cysteines
designs = designs[designs['instability_index'] < 40]

Composite Scoring (Required for Ranking)

Individual metrics alone are too weak. Use composite scoring:

python

def composite_score(row):
    return (
        0.30 * row['pLDDT'] +
        0.20 * row['ipTM'] +
        0.20 * (1 - row['PAE_interaction'] / 20) +
        0.15 * row['shape_complementarity'] +
        0.15 * row['esm2_pll_normalized']
    )

designs['score'] = designs.apply(composite_score, axis=1)
top_designs = designs.nlargest(100, 'score')

For advanced composite scoring, see references/composite-scoring.md.
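
One detail worth checking in a weighted sum like the one above is that the PAE term, `1 - PAE_interaction / 20`, goes negative when PAE exceeds 20 Å. The sketch below shows one possible way to handle that by clipping the term to the 0-1 range. The clipping and the `composite_score_clipped` name are assumptions for illustration, not part of the original scoring; the weights are the ones given above.

```python
# Weights copied from the composite score above; they sum to 1.0.
WEIGHTS = {
    "pLDDT": 0.30,
    "ipTM": 0.20,
    "PAE_interaction": 0.20,
    "shape_complementarity": 0.15,
    "esm2_pll_normalized": 0.15,
}


def clip01(value):
    """Clamp a value to the closed interval [0, 1]."""
    return max(0.0, min(1.0, value))


def composite_score_clipped(row, weights=WEIGHTS):
    """Same weighted sum, but the PAE term can no longer go below zero.

    `row` can be a dict or a pandas Series with the metric names as keys.
    """
    pae_term = clip01(1 - row["PAE_interaction"] / 20)  # assumption: clip at 0
    return (
        weights["pLDDT"] * row["pLDDT"]
        + weights["ipTM"] * row["ipTM"]
        + weights["PAE_interaction"] * pae_term
        + weights["shape_complementarity"] * row["shape_complementarity"]
        + weights["esm2_pll_normalized"] * row["esm2_pll_normalized"]
    )


good = {"pLDDT": 0.9, "ipTM": 0.6, "PAE_interaction": 10.0,
        "shape_complementarity": 0.6, "esm2_pll_normalized": 0.2}
poor_interface = dict(good, PAE_interaction=30.0)

print(round(composite_score_clipped(good), 2))            # 0.61
print(round(composite_score_clipped(poor_interface), 2))  # 0.51
```

Without clipping, the second design would score 0.41 rather than 0.51, so whether to clip depends on how strongly a very poor interface PAE should be penalized.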

Tool-Specific Filtering

BindCraft Filter Levels

Level	Use Case	Stringency
Default	Standard design	Most stringent
Relaxed	Need more designs	Higher failure rate
Peptide	Designs < 30 AA	~5-10x lower success

BoltzGen Filtering

bash

boltzgen run ... \
  --budget 60 \
  --alpha 0.01 \
  --filter_biased true \
  --refolding_rmsd_threshold 2.0 \
  --additional_filters 'ALA_fraction<0.3'

`alpha=0.0`: Quality-only ranking
`alpha=0.01`: Default (slight diversity)
`alpha=1.0`: Diversity-only
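
To build intuition for what a quality/diversity weight like `alpha` does, here is a toy greedy selection that blends the two. This is not BoltzGen's algorithm, which the text above does not describe; the `select` function, the position-wise `identity` measure, and the blending formula are all assumptions made for illustration.

```python
def identity(a, b):
    """Fraction of matching positions, compared up to the shorter length."""
    n = min(len(a), len(b))
    return sum(x == y for x, y in zip(a, b)) / n if n else 0.0


def select(designs, n_select, alpha):
    """Greedily pick designs by (1 - alpha) * quality + alpha * novelty.

    designs: list of (sequence, quality) pairs with quality in [0, 1].
    Novelty is 1 minus the highest identity to anything already chosen.
    """
    remaining = list(designs)
    chosen = []
    while remaining and len(chosen) < n_select:
        def blended(item):
            seq, quality = item
            closest = max((identity(seq, s) for s, _ in chosen), default=0.0)
            return (1 - alpha) * quality + alpha * (1.0 - closest)
        best = max(remaining, key=blended)
        chosen.append(best)
        remaining.remove(best)
    return chosen


pool = [("AAAA", 0.9), ("AAAT", 0.8), ("GGGG", 0.5)]
print(select(pool, 2, alpha=0.0))   # [('AAAA', 0.9), ('AAAT', 0.8)]
print(select(pool, 2, alpha=1.0))   # [('AAAA', 0.9), ('GGGG', 0.5)]
```

With `alpha=0.0` the near-duplicate `AAAT` is picked because only quality counts; with `alpha=1.0` the dissimilar `GGGG` is picked despite its lower quality.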

Design-Level Severity Scoring

For pattern-based checks, use severity scoring:

Severity Level	Score	Action
LOW	0-15	Proceed
MODERATE	16-35	Review flagged issues
HIGH	36-60	Redesign recommended
CRITICAL	61+	Redesign required
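
The bands above translate directly into a lookup. The sketch below only maps an already computed score to its level and action; how individual pattern hits contribute points to that score is not specified here, so it is left out. The `severity_level` name is hypothetical.

```python
# Upper bound (inclusive), level, action; taken from the table above.
SEVERITY_BANDS = [
    (15, "LOW", "Proceed"),
    (35, "MODERATE", "Review flagged issues"),
    (60, "HIGH", "Redesign recommended"),
]


def severity_level(score):
    """Map a non-negative severity score to (level, action)."""
    if score < 0:
        raise ValueError("severity score must be non-negative")
    for upper, level, action in SEVERITY_BANDS:
        if score <= upper:
            return level, action
    return "CRITICAL", "Redesign required"


for s in (0, 16, 60, 61):
    print(s, severity_level(s))
```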

Experimental Correlation

Metric	AUC	Use
ipTM	~0.64	Pre-screening
PAE	~0.65	Pre-screening
ESM2 PLL	~0.72	Best single metric
Composite	~0.75+	Always use

Key insight: these metrics work as filters (eliminating likely failures), not as predictors (ranking successes).
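
If experimental binder/non-binder labels are available for a past campaign, the AUC of any single metric can be checked directly. The sketch below computes ROC AUC as the probability that a randomly chosen binder scores higher than a randomly chosen non-binder (ties count half). The `roc_auc` helper and the toy numbers are made up for illustration and are not data from the source.

```python
def roc_auc(scores, labels):
    """ROC AUC via pairwise comparison of positives against negatives.

    scores: metric values where higher should mean "more likely to bind".
    labels: 1/True for experimental binders, 0/False otherwise.
    For lower-is-better metrics such as PAE, pass the negated values.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with invented values.
iptm = [0.8, 0.6, 0.5, 0.4, 0.7]
binds = [1, 0, 1, 0, 0]
print(round(roc_auc(iptm, binds), 3))  # 0.667
```

The pairwise loop is quadratic in the number of designs, which is fine for a few thousand rows; larger sets would call for a rank-based implementation.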

Campaign Health Assessment

Quick assessment of your design campaign:

Pass Rate	Status	Interpretation
> 15%	Excellent	Above average, proceed
10-15%	Good	Normal, proceed
5-10%	Marginal	Below average, review issues
< 5%	Poor	Significant problems, diagnose
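
The table above can be applied programmatically once an overall pass rate is known. In this sketch the `campaign_status` name is hypothetical, and the handling of exact boundary values (15% counts as Good, 10% as Good, 5% as Marginal) is an assumption, since the table does not say which band a boundary belongs to.

```python
def campaign_status(pass_rate):
    """Classify a pass rate given as a fraction between 0 and 1."""
    if not 0.0 <= pass_rate <= 1.0:
        raise ValueError("pass_rate must be a fraction between 0 and 1")
    if pass_rate > 0.15:
        return "Excellent", "Above average, proceed"
    if pass_rate >= 0.10:
        return "Good", "Normal, proceed"
    if pass_rate >= 0.05:
        return "Marginal", "Below average, review issues"
    return "Poor", "Significant problems, diagnose"


passed, total = 37, 500
status, interpretation = campaign_status(passed / total)
print(f"{passed / total:.1%}: {status} ({interpretation})")  # 7.4%: Marginal (Below average, review issues)
```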

Failure Recovery Trees

Too Few Pass pLDDT Filter (< 5% with pLDDT > 0.85)

Low pLDDT across campaign
├── Check scRMSD distribution
│   ├── High scRMSD (>2.5Å): Backbone issue
│   │   └── Fix: Regenerate backbones with lower noise_scale (0.5-0.8)
│   └── Low scRMSD but low pLDDT: Disordered regions
│       └── Fix: Check design length, simplify topology
├── Try more sequences per backbone
│   └── modal run modal_proteinmpnn.py --num-seq-per-target 32 --sampling-temp 0.1
├── Use SolubleMPNN instead of ProteinMPNN
│   └── Better for expression-optimized sequences
└── Consider different design tool
    └── BindCraft (integrated design) may work better

Too Few Pass ipTM Filter (< 5% with ipTM > 0.5)

Low ipTM across campaign
├── Review hotspot selection
│   ├── Are hotspots surface-exposed? (SASA > 20Å²)
│   ├── Are hotspots conserved? (check MSA)
│   └── Try 3-6 different hotspot combinations
├── Increase binder length (more contact area)
│   └── Try 80-100 AA instead of 60-80 AA
├── Check interface geometry
│   ├── Is target flat? → Try helical binders
│   └── Is target concave? → Try smaller binders
└── Try all-atom design tool
    └── BoltzGen (all-atom, better packing)

High scRMSD (> 50% with scRMSD > 2.0Å)

Sequences don't specify intended structure
├── ProteinMPNN issue
│   ├── Lower temperature: --sampling-temp 0.1
│   ├── Increase sequences: --num-seq-per-target 32
│   └── Check fixed_positions aren't over-constraining
├── Backbone geometry issue
│   ├── Backbones may be unusual/strained
│   ├── Regenerate with lower noise_scale (0.5-0.8)
│   └── Reduce diffuser.T to 30-40
└── Try different sequence design
    └── ColabDesign (AF2 gradient-based) may work better

Everything Passes But No Experimental Hits

In silico metrics don't predict affinity
├── Generate MORE designs (10x current)
│   └── Computational metrics have high false positive rate
├── Increase diversity
│   ├── Higher ProteinMPNN temperature (0.2-0.3)
│   ├── Different backbone topologies
│   └── Different hotspot combinations
├── Try different design approach
│   ├── BindCraft (different algorithm)
│   ├── ColabDesign (AF2 hallucination)
│   └── BoltzGen (all-atom diffusion)
└── Check if target is druggable
    └── Some targets are inherently difficult

Too Many Designs Pass (> 50%)

Suspiciously high pass rate
├── Check if thresholds are too lenient
│   └── Use stringent thresholds: pLDDT > 0.90, ipTM > 0.60
├── Verify prediction quality
│   ├── Are predictions actually running? Check output files
│   └── Are complexes being predicted, not just monomers?
├── Check for data issues
│   ├── Same sequence being predicted multiple times?
│   └── Wrong FASTA format (missing chain separator)?
└── Apply diversity filter
    └── Cluster at 70% identity, take top per cluster
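
The last branch, clustering at 70% identity and keeping the top design per cluster, can be sketched as a greedy pass over designs sorted by score. This is a simplified illustration: the `top_per_cluster` helper is hypothetical, and identity is computed position by position without alignment, which is only reasonable for designs of similar length. Dedicated tools such as MMseqs2 or CD-HIT use alignment-based identity and would be the usual choice for real campaigns.

```python
def identity(a, b):
    """Position-wise identity, normalized by the longer sequence (no alignment)."""
    longer = max(len(a), len(b))
    return sum(x == y for x, y in zip(a, b)) / longer if longer else 1.0


def top_per_cluster(designs, min_identity=0.70):
    """Keep the best-scoring design of each greedy cluster.

    designs: list of (name, sequence, score) tuples, higher score is better.
    A design starts a new cluster only if it is below `min_identity`
    to every representative kept so far.
    """
    representatives = []
    for design in sorted(designs, key=lambda d: d[2], reverse=True):
        if all(identity(design[1], rep[1]) < min_identity for rep in representatives):
            representatives.append(design)
    return representatives


designs = [
    ("d1", "ACDEFGHIKL", 0.90),
    ("d2", "ACDEFGHIKV", 0.80),  # 90% identical to d1, same cluster
    ("d3", "WWWWWGHIKL", 0.70),  # 50% identical to d1, new cluster
]
print([name for name, _, _ in top_per_cluster(designs)])  # ['d1', 'd3']
```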

Diagnostic Commands

Quick Campaign Assessment

python

import pandas as pd

df = pd.read_csv('designs.csv')

# Pass rates at each stage
print(f"Total designs: {len(df)}")
print(f"pLDDT > 0.85: {(df['pLDDT'] > 0.85).mean():.1%}")
print(f"ipTM > 0.50: {(df['ipTM'] > 0.50).mean():.1%}")
print(f"scRMSD < 2.0: {(df['scRMSD'] < 2.0).mean():.1%}")
print(f"All filters: {((df['pLDDT'] > 0.85) & (df['ipTM'] > 0.5) & (df['scRMSD'] < 2.0)).mean():.1%}")

# Identify top issue
if (df['pLDDT'] > 0.85).mean() < 0.1:
    print("ISSUE: Low pLDDT - check backbone or sequence quality")
elif (df['ipTM'] > 0.50).mean() < 0.1:
    print("ISSUE: Low ipTM - check hotspots or interface geometry")
elif (df['scRMSD'] < 2.0).mean() < 0.5:
    print("ISSUE: High scRMSD - sequences don't specify backbone")