protein-qc

Protein Design Quality Control

Critical Limitation

Individual metrics have weak predictive power for binding. Research shows:
  • Individual metric ROC AUC: 0.64-0.66 for ipTM and PAE (only slightly better than the 0.5 of a random ranking)
  • Metrics are pre-screening filters, not affinity predictors
  • Composite scoring is essential for meaningful ranking
These thresholds filter out poor designs but do NOT predict binding affinity.

QC Organization

QC is organized by purpose and level:
| Purpose | What it assesses | Key metrics |
|---|---|---|
| Binding | Interface quality, binding geometry | ipTM, PAE, SC, dG, dSASA |
| Expression | Manufacturability, solubility | Instability, GRAVY, pI, cysteines |
| Structural | Fold confidence, consistency | pLDDT, pTM, scRMSD |
Each category has two levels:
  • Metric-level: Calculated values with thresholds (e.g., pLDDT > 0.85)
  • Design-level: Pattern/motif detection (e.g., odd cysteine count, NG sites)
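
To make the metric level concrete, here is a minimal sketch of one such calculated value, GRAVY (grand average of hydropathy), computed with the standard Kyte-Doolittle scale. The function name `gravy` is illustrative; the threshold table attributes this metric to BioPython, and a hand-rolled version is shown only so the arithmetic is visible.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Mean hydropathy per residue; raises KeyError on non-standard letters."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


value = gravy("MKTAYIAKQR")          # made-up example sequence
print(f"GRAVY = {value:.2f}, passes standard cutoff: {value < 0.4}")
```

A lower GRAVY means a more hydrophilic sequence, which is why the thresholds treat smaller values as better for solubility.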

Quick Reference: All Thresholds

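| Category | Metric | Standard | Stringent | Source |
|---|---|---|---|---|
| Structural | pLDDT | > 0.85 | > 0.90 | AF2/Chai/Boltz |
| | pTM | > 0.70 | > 0.80 | AF2/Chai/Boltz |
| | scRMSD | < 2.0 Å | < 1.5 Å | Design vs pred |
| Binding | ipTM | > 0.50 | > 0.60 | AF2/Chai/Boltz |
| | PAE_interaction | < 12 Å | < 10 Å | AF2/Chai/Boltz |
| | Shape Comp (SC) | > 0.50 | > 0.60 | PyRosetta |
| | interface_dG | < -10 | < -15 | PyRosetta |
| Expression | Instability | < 40 | < 30 | BioPython |
| | GRAVY | < 0.4 | < 0.2 | BioPython |
| | ESM2 PLL | > 0.0 | > 0.2 | ESM2 |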
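
One way to keep these cutoffs in a single place is a small lookup table plus a checker, sketched below. The values are copied from the table; the dictionary layout, the function name `failed_metrics`, and the key names `pTM`, `GRAVY`, and `interface_dG` are assumptions made for illustration (the other keys follow the column names used in this document's pandas examples).

```python
# metric: (direction, standard cutoff, stringent cutoff)
THRESHOLDS = {
    "pLDDT": (">", 0.85, 0.90),
    "pTM": (">", 0.70, 0.80),
    "scRMSD": ("<", 2.0, 1.5),
    "ipTM": (">", 0.50, 0.60),
    "PAE_interaction": ("<", 12.0, 10.0),
    "shape_complementarity": (">", 0.50, 0.60),
    "interface_dG": ("<", -10.0, -15.0),
    "instability_index": ("<", 40.0, 30.0),
    "GRAVY": ("<", 0.4, 0.2),
    "esm2_pll_normalized": (">", 0.0, 0.2),
}


def failed_metrics(design, level="standard"):
    """Return the names of metrics present in `design` that miss their cutoff."""
    index = {"standard": 1, "stringent": 2}[level]
    failures = []
    for metric, spec in THRESHOLDS.items():
        if metric not in design:
            continue  # metrics that were not computed are skipped, not failed
        value, cutoff = design[metric], spec[index]
        passed = value > cutoff if spec[0] == ">" else value < cutoff
        if not passed:
            failures.append(metric)
    return failures


# Made-up example design.
example = {"pLDDT": 0.88, "ipTM": 0.55, "scRMSD": 1.8, "PAE_interaction": 11.0}
print(failed_metrics(example))               # standard level
print(failed_metrics(example, "stringent"))  # stringent level
```

Returning the failing metric names rather than a single boolean makes it easier to see which filter is removing most designs.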

Design-Level Checks (Expression)

| Pattern | Risk | Action |
|---|---|---|
| Odd cysteine count | Unpaired disulfides | Redesign |
| NG/NS/NT motifs | Deamidation | Flag/avoid |
| K/R >= 3 consecutive | Proteolysis | Flag |
| >= 6 hydrophobic run | Aggregation | Redesign |
See: references/binding-qc.md, references/expression-qc.md, references/structural-qc.md
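
The four pattern checks in the table can be sketched with regular expressions as follows. The function name, the flag labels, and in particular the set of residues counted as hydrophobic (`AVILMFWY`) are assumptions; the table does not define them.

```python
import re

# Assumption: which residues count as "hydrophobic" is not specified above.
HYDROPHOBIC = "AVILMFWY"


def design_level_flags(seq):
    """Return a list of pattern flags raised by a protein sequence."""
    seq = seq.upper()
    flags = []
    if seq.count("C") % 2 == 1:
        flags.append("odd_cysteine_count")   # at least one unpaired cysteine
    if re.search(r"N[GST]", seq):
        flags.append("deamidation_motif")    # NG, NS, or NT
    if re.search(r"[KR]{3,}", seq):
        flags.append("basic_run")            # 3 or more consecutive K/R
    if re.search(rf"[{HYDROPHOBIC}]{{6,}}", seq):
        flags.append("hydrophobic_run")      # 6 or more consecutive hydrophobics
    return flags


print(design_level_flags("MKTAYIAKQR"))       # made-up clean sequence
print(design_level_flags("ACNGKRKLLVVAAI"))   # made-up sequence hitting every check
```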

Sequential Filtering Pipeline

python
import pandas as pd

designs = pd.read_csv('designs.csv')

# Stage 1: Structural confidence
designs = designs[designs['pLDDT'] > 0.85]

# Stage 2: Self-consistency
designs = designs[designs['scRMSD'] < 2.0]

# Stage 3: Binding quality
# (the PAE cutoff of 10 here is the stringent value; the standard cutoff is 12)
designs = designs[(designs['ipTM'] > 0.5) & (designs['PAE_interaction'] < 10)]

# Stage 4: Sequence plausibility
designs = designs[designs['esm2_pll_normalized'] > 0.0]

# Stage 5: Expression checks (design-level)
designs = designs[designs['cysteine_count'] % 2 == 0]  # Even cysteines
designs = designs[designs['instability_index'] < 40]
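
When tuning a campaign it helps to know how many designs survive each stage, not just the final count. The sketch below applies the same five stages to plain dictionaries (for example rows from `csv.DictReader` after numeric conversion) and records the survivor count per stage. The function name `filter_with_report` and the example rows are hypothetical.

```python
# Stage names and cutoffs mirror the sequential pipeline; the data layout
# (a list of dicts) is an assumption chosen to keep the sketch dependency-free.
STAGES = [
    ("structural confidence", lambda d: d["pLDDT"] > 0.85),
    ("self-consistency", lambda d: d["scRMSD"] < 2.0),
    ("binding quality",
     lambda d: d["ipTM"] > 0.5 and d["PAE_interaction"] < 10),
    ("sequence plausibility", lambda d: d["esm2_pll_normalized"] > 0.0),
    ("expression checks",
     lambda d: d["cysteine_count"] % 2 == 0 and d["instability_index"] < 40),
]


def filter_with_report(designs):
    """Apply the stages in order; return (survivors, [(stage, n_left), ...])."""
    survivors = list(designs)
    report = []
    for name, keep in STAGES:
        survivors = [d for d in survivors if keep(d)]
        report.append((name, len(survivors)))
    return survivors, report


# Made-up rows: one good design, one poorly folded, one with a weak interface.
rows = [
    {"id": "good", "pLDDT": 0.90, "scRMSD": 1.2, "ipTM": 0.70,
     "PAE_interaction": 6.0, "esm2_pll_normalized": 0.3,
     "cysteine_count": 2, "instability_index": 30.0},
    {"id": "bad_fold", "pLDDT": 0.70, "scRMSD": 3.1, "ipTM": 0.60,
     "PAE_interaction": 8.0, "esm2_pll_normalized": 0.1,
     "cysteine_count": 0, "instability_index": 35.0},
    {"id": "bad_interface", "pLDDT": 0.90, "scRMSD": 1.0, "ipTM": 0.40,
     "PAE_interaction": 14.0, "esm2_pll_normalized": 0.2,
     "cysteine_count": 0, "instability_index": 25.0},
]

survivors, report = filter_with_report(rows)
for stage, n_left in report:
    print(f"{stage}: {n_left} remaining")
```

A sharp drop at one stage points to the matching entry in the failure recovery trees.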

Composite Scoring (Required for Ranking)

Individual metrics alone are too weak to rank designs. Rank with a composite score instead:
python
def composite_score(row):
    return (
        0.30 * row['pLDDT'] +
        0.20 * row['ipTM'] +
        0.20 * (1 - row['PAE_interaction'] / 20) +
        0.15 * row['shape_complementarity'] +
        0.15 * row['esm2_pll_normalized']
    )

designs['score'] = designs.apply(composite_score, axis=1)
top_designs = designs.nlargest(100, 'score')
For advanced composite scoring, see references/composite-scoring.md.
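
One caveat with the weighted sum: the PAE term `1 - PAE_interaction / 20` goes negative when PAE exceeds 20 Å, and a negative `esm2_pll_normalized` also subtracts from the score. The variant below is a sketch that clamps every term to [0, 1] before weighting. Treating all five terms as 0-1 quantities is an assumption, and the names `WEIGHTS`, `clamp01`, and `composite_score_clamped` are illustrative.

```python
# Weights copied from composite_score; they sum to 1.0.
WEIGHTS = {
    "pLDDT": 0.30,
    "ipTM": 0.20,
    "PAE_interaction": 0.20,
    "shape_complementarity": 0.15,
    "esm2_pll_normalized": 0.15,
}


def clamp01(x):
    """Limit a value to the closed interval [0, 1]."""
    return max(0.0, min(1.0, x))


def composite_score_clamped(row):
    """Weighted sum of the five terms, each clamped to [0, 1] first."""
    terms = {
        "pLDDT": row["pLDDT"],
        "ipTM": row["ipTM"],
        "PAE_interaction": 1 - row["PAE_interaction"] / 20,  # lower PAE is better
        "shape_complementarity": row["shape_complementarity"],
        "esm2_pll_normalized": row["esm2_pll_normalized"],
    }
    return sum(WEIGHTS[name] * clamp01(value) for name, value in terms.items())


# Made-up example row; works with a dict or a pandas Series.
row = {"pLDDT": 0.90, "ipTM": 0.60, "PAE_interaction": 8.0,
       "shape_complementarity": 0.65, "esm2_pll_normalized": 0.20}
print(round(composite_score_clamped(row), 4))
```

Because the function only indexes by column name, it can be passed to `designs.apply(..., axis=1)` in the same way as `composite_score`.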

Tool-Specific Filtering

BindCraft Filter Levels

| Level | Use Case | Stringency |
|---|---|---|
| Default | Standard design | Most stringent |
| Relaxed | Need more designs | Higher failure rate |
| Peptide | Designs < 30 AA | ~5-10x lower success |

BoltzGen Filtering

bash
boltzgen run ... \
  --budget 60 \
  --alpha 0.01 \
  --filter_biased true \
  --refolding_rmsd_threshold 2.0 \
  --additional_filters 'ALA_fraction<0.3'
  • alpha=0.0: Quality-only ranking
  • alpha=0.01: Default (slight diversity)
  • alpha=1.0: Diversity-only

Design-Level Severity Scoring

For pattern-based checks, use severity scoring:
| Severity Level | Score | Action |
|---|---|---|
| LOW | 0-15 | Proceed |
| MODERATE | 16-35 | Review flagged issues |
| HIGH | 36-60 | Redesign recommended |
| CRITICAL | 61+ | Redesign required |
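
The mapping from score to level and action is simple to encode. In the sketch below, the score bands come from the table, but the per-issue penalty values in `PENALTIES` are hypothetical placeholders: this document does not say how many points each pattern contributes, so substitute your own values.

```python
ACTIONS = {
    "LOW": "Proceed",
    "MODERATE": "Review flagged issues",
    "HIGH": "Redesign recommended",
    "CRITICAL": "Redesign required",
}

# HYPOTHETICAL penalties, for illustration only (not from the source).
PENALTIES = {
    "odd_cysteine_count": 40,
    "deamidation_motif": 10,
    "basic_run": 10,
    "hydrophobic_run": 25,
}


def severity_level(score):
    """Map a severity score to the level bands in the table."""
    if score <= 15:
        return "LOW"
    if score <= 35:
        return "MODERATE"
    if score <= 60:
        return "HIGH"
    return "CRITICAL"


def severity_score(flags):
    """Sum the (hypothetical) penalties of all raised flags."""
    return sum(PENALTIES[flag] for flag in flags)


flags = ["deamidation_motif", "basic_run"]
score = severity_score(flags)
level = severity_level(score)
print(score, level, ACTIONS[level])
```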

Experimental Correlation

| Metric | AUC | Use |
|---|---|---|
| ipTM | ~0.64 | Pre-screening |
| PAE | ~0.65 | Pre-screening |
| ESM2 PLL | ~0.72 | Best single metric |
| Composite | ~0.75+ | Always use |
Key insight: Metrics work as filters (eliminating failures), not as predictors (ranking successes).
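
If you have experimental outcomes for a past campaign, you can estimate a metric's ROC AUC yourself. The sketch below uses the rank interpretation of AUC: the probability that a randomly chosen binder scores higher than a randomly chosen non-binder, with ties counted as one half. The function name and the example numbers are made up for illustration.

```python
def roc_auc(scores, labels):
    """ROC AUC from pairwise comparisons; labels are truthy for binders."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up ipTM values and binding outcomes (1 = confirmed binder).
iptm = [0.80, 0.60, 0.55, 0.40, 0.70, 0.30]
bound = [1, 0, 1, 0, 0, 0]
print(roc_auc(iptm, bound))
```

For metrics where lower is better, such as PAE or scRMSD, pass the negated values so that a higher score still means a better design. The pairwise loop is quadratic in the number of designs, which is fine for a few thousand rows.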

Campaign Health Assessment

Quick assessment of your design campaign:
| Pass Rate | Status | Interpretation |
|---|---|---|
| > 15% | Excellent | Above average, proceed |
| 10-15% | Good | Normal, proceed |
| 5-10% | Marginal | Below average, review issues |
| < 5% | Poor | Significant problems, diagnose |
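
The table translates directly into a small helper, sketched below. The function name is illustrative, and the handling of exact boundary values is an assumption: the table lists 10% in both the Good and Marginal bands, so this sketch assigns exactly 10% to Good and exactly 5% to Marginal.

```python
def campaign_status(pass_rate):
    """Classify a campaign from its overall pass rate, given as a fraction."""
    if pass_rate > 0.15:
        return "Excellent"
    if pass_rate >= 0.10:
        return "Good"
    if pass_rate >= 0.05:
        return "Marginal"
    return "Poor"


# Made-up pass rates for illustration.
for rate in (0.20, 0.12, 0.07, 0.02):
    print(f"{rate:.0%}: {campaign_status(rate)}")
```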

Failure Recovery Trees

Too Few Pass pLDDT Filter (< 5% with pLDDT > 0.85)

Low pLDDT across campaign
├── Check scRMSD distribution
│   ├── High scRMSD (>2.5Å): Backbone issue
│   │   └── Fix: Regenerate backbones with lower noise_scale (0.5-0.8)
│   └── Low scRMSD but low pLDDT: Disordered regions
│       └── Fix: Check design length, simplify topology
├── Try more sequences per backbone
│   └── modal run modal_proteinmpnn.py --num-seq-per-target 32 --sampling-temp 0.1
├── Use SolubleMPNN instead of ProteinMPNN
│   └── Better for expression-optimized sequences
└── Consider different design tool
    └── BindCraft (integrated design) may work better

Too Few Pass ipTM Filter (< 5% with ipTM > 0.5)

Low ipTM across campaign
├── Review hotspot selection
│   ├── Are hotspots surface-exposed? (SASA > 20Ų)
│   ├── Are hotspots conserved? (check MSA)
│   └── Try 3-6 different hotspot combinations
├── Increase binder length (more contact area)
│   └── Try 80-100 AA instead of 60-80 AA
├── Check interface geometry
│   ├── Is target flat? → Try helical binders
│   └── Is target concave? → Try smaller binders
└── Try all-atom design tool
    └── BoltzGen (all-atom, better packing)

High scRMSD (> 50% with scRMSD > 2.0Å)

Sequences don't specify intended structure
├── ProteinMPNN issue
│   ├── Lower temperature: --sampling-temp 0.1
│   ├── Increase sequences: --num-seq-per-target 32
│   └── Check fixed_positions aren't over-constraining
├── Backbone geometry issue
│   ├── Backbones may be unusual/strained
│   ├── Regenerate with lower noise_scale (0.5-0.8)
│   └── Reduce diffuser.T to 30-40
└── Try different sequence design
    └── ColabDesign (AF2 gradient-based) may work better

Everything Passes But No Experimental Hits

In silico metrics don't predict affinity
├── Generate MORE designs (10x current)
│   └── Computational metrics have high false positive rate
├── Increase diversity
│   ├── Higher ProteinMPNN temperature (0.2-0.3)
│   ├── Different backbone topologies
│   └── Different hotspot combinations
├── Try different design approach
│   ├── BindCraft (different algorithm)
│   ├── ColabDesign (AF2 hallucination)
│   └── BoltzGen (all-atom diffusion)
└── Check if target is druggable
    └── Some targets are inherently difficult

Too Many Designs Pass (> 50%)

Suspiciously high pass rate
├── Check if thresholds are too lenient
│   └── Use stringent thresholds: pLDDT > 0.90, ipTM > 0.60
├── Verify prediction quality
│   ├── Are predictions actually running? Check output files
│   └── Are complexes being predicted, not just monomers?
├── Check for data issues
│   ├── Same sequence being predicted multiple times?
│   └── Wrong FASTA format (missing chain separator)?
└── Apply diversity filter
    └── Cluster at 70% identity, take top per cluster
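
The last branch (cluster at 70% identity, keep the top design per cluster) can be approximated with a greedy pass, sketched below. Real campaigns would normally use a dedicated clustering tool such as MMseqs2 or CD-HIT; here `difflib.SequenceMatcher.ratio()` stands in as a rough proxy for sequence identity, which is an assumption and not a true alignment-based identity. Function names and example sequences are made up.

```python
from difflib import SequenceMatcher


def identity(a, b):
    """Rough similarity in [0, 1]; a stand-in for alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def top_per_cluster(designs, min_identity=0.70):
    """Greedy selection over (sequence, score) pairs.

    Designs are visited from best to worst score; a design is kept only if it
    is less than `min_identity` similar to every design already kept, so each
    kept design is the top scorer of its cluster. Exact duplicates are dropped
    as a side effect (their similarity is 1.0).
    """
    kept = []
    for seq, score in sorted(designs, key=lambda d: d[1], reverse=True):
        if all(identity(seq, rep) < min_identity for rep, _ in kept):
            kept.append((seq, score))
    return kept


# Made-up sequences: the second differs from the first by one residue.
designs = [
    ("MKTAYIAKQRQISFVKSHFSRE", 0.70),
    ("MKTAYIAKQRQISFVKSHFSRQ", 0.80),
    ("GGSGGSGGSGGSGGSGGSGGSG", 0.60),
]
for seq, score in top_per_cluster(designs):
    print(score, seq)
```

The pass is quadratic in the number of designs, so it suits a shortlist rather than a full campaign.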

Diagnostic Commands

Quick Campaign Assessment

python
import pandas as pd

df = pd.read_csv('designs.csv')

# Pass rates at each stage
print(f"Total designs: {len(df)}")
print(f"pLDDT > 0.85: {(df['pLDDT'] > 0.85).mean():.1%}")
print(f"ipTM > 0.50: {(df['ipTM'] > 0.50).mean():.1%}")
print(f"scRMSD < 2.0: {(df['scRMSD'] < 2.0).mean():.1%}")
print(f"All filters: {((df['pLDDT'] > 0.85) & (df['ipTM'] > 0.5) & (df['scRMSD'] < 2.0)).mean():.1%}")

# Identify top issue
if (df['pLDDT'] > 0.85).mean() < 0.1:
    print("ISSUE: Low pLDDT - check backbone or sequence quality")
elif (df['ipTM'] > 0.50).mean() < 0.1:
    print("ISSUE: Low ipTM - check hotspots or interface geometry")
elif (df['scRMSD'] < 2.0).mean() < 0.5:
    print("ISSUE: High scRMSD - sequences don't specify backbone")