alphafold-database
AlphaFold Database
Programmatic access to DeepMind's AlphaFold Protein Structure Database (200M+ predicted structures).
Quick Reference
```python
# Fetch structure via Biopython
from Bio.PDB import alphafold_db
predictions = list(alphafold_db.get_predictions("P00520"))
alphafold_db.download_cif_for(predictions[0], directory="./output")

# Direct API call
import requests
resp = requests.get("https://alphafold.ebi.ac.uk/api/prediction/P00520")
entry_id = resp.json()[0]['entryId']  # AF-P00520-F1

# Download structure file
structure_url = f"https://alphafold.ebi.ac.uk/files/{entry_id}-model_v4.cif"
```
When to Use
- Obtain 3D coordinates for proteins without experimental structures
- Assess prediction quality via pLDDT and PAE metrics
- Download structure files (mmCIF, PDB) for visualization or docking
- Retrieve proteome-scale datasets for computational analysis
Key Concepts
| Term | Description |
|---|---|
| UniProt Accession | Protein identifier (e.g., |
| AlphaFold ID | Format: |
| pLDDT | Per-residue confidence (0-100); >90 = very high confidence, <50 = very low confidence (often disordered) |
| PAE | Predicted Aligned Error; <5 Å = high confidence in relative domain positions |
See `references/confidence-scores.md` for detailed interpretation guidance.
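The pLDDT thresholds can also be applied in code. The following is a minimal sketch, not part of the AlphaFold tooling: the function names `plddt_band` and `summarize_bands` are hypothetical, and the intermediate boundary at 70 follows the commonly used four-band colour scheme rather than anything stated in the table above.

```python
def plddt_band(score: float) -> str:
    """Map a per-residue pLDDT score (0-100) to a confidence band."""
    if not 0 <= score <= 100:
        raise ValueError(f"pLDDT must be in [0, 100], got {score}")
    if score > 90:
        return "very high"
    if score > 70:   # assumed boundary, not given in the table
        return "confident"
    if score >= 50:
        return "low"
    return "very low"


def summarize_bands(scores):
    """Count how many residues fall into each confidence band."""
    counts = {"very high": 0, "confident": 0, "low": 0, "very low": 0}
    for score in scores:
        counts[plddt_band(score)] += 1
    return counts


# Illustrative scores, not taken from a real prediction
example_scores = [95.2, 88.0, 64.5, 31.9, 91.0]
band_counts = summarize_bands(example_scores)
print(band_counts)
```

A per-band count like this gives a quick impression of how much of a model is usable before looking at individual regions.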
File Types
| File | URL Pattern | Contents |
|---|---|---|
| Coordinates | | Atomic positions (mmCIF) |
| Confidence | | Per-residue pLDDT array |
| PAE Matrix | | Inter-residue error |
Base URL: `https://alphafold.ebi.ac.uk/files/`
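File URLs can be assembled from the base URL and an entry ID. The sketch below is illustrative only: the coordinate suffix matches the Quick Reference, while the confidence and PAE suffixes, the `SUFFIXES` mapping, and the `file_urls` helper are assumptions that should be checked against the API response before use.

```python
BASE_URL = "https://alphafold.ebi.ac.uk/files/"

# Assumed file-name suffixes. Only the coordinate pattern appears in this
# document; the other two are assumptions to verify against the API.
SUFFIXES = {
    "coordinates": "model_v{version}.cif",
    "confidence": "confidence_v{version}.json",
    "pae": "predicted_aligned_error_v{version}.json",
}


def file_urls(entry_id: str, version: int = 4) -> dict:
    """Build one download URL per file type for an AlphaFold entry ID."""
    return {
        kind: f"{BASE_URL}{entry_id}-{pattern.format(version=version)}"
        for kind, pattern in SUFFIXES.items()
    }


urls = file_urls("AF-P00520-F1")
for kind, url in urls.items():
    print(f"{kind}: {url}")
```

The version defaults to 4 to match the Quick Reference; a newer database release would need a different value.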
Core Operations
Fetch Structure Metadata
```python
import requests

uniprot_id = "P00520"  # UniProt accession to look up
resp = requests.get(f"https://alphafold.ebi.ac.uk/api/prediction/{uniprot_id}")
resp.raise_for_status()  # fail early on HTTP errors
metadata = resp.json()[0]  # first prediction record
af_id = metadata['entryId']
```
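Indexing the response with `[0]` fails if the API returns an empty list. The following sketch shows one defensive way to handle the parsed JSON; `first_entry` and `parse_entry_id` are hypothetical helpers, and the sample payload is a hand-written stand-in that contains only the `entryId` field used in this document.

```python
import re


def first_entry(payload):
    """Return the first prediction record from a parsed API response."""
    if not isinstance(payload, list) or not payload:
        raise LookupError("no AlphaFold prediction in response")
    entry = payload[0]
    if "entryId" not in entry:
        raise KeyError("prediction record has no 'entryId' field")
    return entry


def parse_entry_id(entry_id: str):
    """Split an ID such as 'AF-P00520-F1' into (accession, fragment number)."""
    match = re.fullmatch(r"AF-(?P<accession>.+)-F(?P<fragment>\d+)", entry_id)
    if match is None:
        raise ValueError(f"unexpected AlphaFold ID: {entry_id!r}")
    return match.group("accession"), int(match.group("fragment"))


# Hand-written stand-in for resp.json(); a real response has more fields
sample_payload = [{"entryId": "AF-P00520-F1"}]
entry = first_entry(sample_payload)
accession, fragment = parse_entry_id(entry["entryId"])
print(accession, fragment)
```

Keeping the parsing separate from the HTTP call makes it possible to exercise the logic without network access.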
Download All Files
Use `scripts/alphafold_utils.py`:
```python
from scripts.alphafold_utils import download_alphafold_files
paths = download_alphafold_files("AF-P04637-F1", output_dir="./data")
```
Analyze Confidence
```python
from scripts.alphafold_utils import get_plddt_scores
stats = get_plddt_scores("AF-P04637-F1")
print(f"Average pLDDT: {stats['mean']:.1f}")
```
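For readers without the helper script, summary statistics can be computed directly from a confidence file. This is a rough sketch and not the implementation of `get_plddt_scores`: the `plddt_stats` function is hypothetical, and the JSON layout with a `confidenceScore` array is an assumption about the confidence file format.

```python
import json
from statistics import mean


def plddt_stats(confidence: dict) -> dict:
    """Summarize per-residue pLDDT scores from a parsed confidence file."""
    scores = confidence["confidenceScore"]  # assumed key name
    if not scores:
        raise ValueError("confidence file contains no scores")
    return {
        "mean": mean(scores),
        "min": min(scores),
        "max": max(scores),
        # Fraction of residues at or above the assumed 70 cut-off
        "frac_confident": sum(s >= 70 for s in scores) / len(scores),
    }


# Hand-written sample in the assumed layout, not real prediction data
sample = json.loads(
    '{"residueNumber": [1, 2, 3, 4], "confidenceScore": [92.5, 81.0, 47.5, 66.0]}'
)
sample_stats = plddt_stats(sample)
print(f"Average pLDDT: {sample_stats['mean']:.1f}")
```

Reporting the fraction of confident residues alongside the mean helps when a high average hides a long disordered tail.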
Bulk Proteome Access
```bash
# Google Cloud Storage
gsutil ls gs://public-datasets-deepmind-alphafold-v4/
gsutil -m cp "gs://public-datasets-deepmind-alphafold-v4/proteomes/proteome-tax_id-9606-*.tar" ./
```
See `references/bulk-access.md` for BigQuery queries and batch processing.
Caveats
- Predictions, not experiments: Verify critical findings experimentally
- Confidence matters: Always check pLDDT before using regions
- Single chains only: No multimers or complexes
- No ligands: Missing cofactors, ions, PTMs
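One way to act on the pLDDT caveat is to restrict downstream analysis to contiguous confident stretches. The sketch below assumes a plain list of per-residue scores; the 70 cut-off and the `confident_regions` function are illustrative choices, not part of the AlphaFold API.

```python
def confident_regions(scores, threshold=70.0, min_length=1):
    """Return 1-based inclusive (start, end) ranges where pLDDT >= threshold."""
    regions = []
    start = None  # start index of the region currently being tracked
    for index, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = index
        elif start is not None:
            # Region ended at the previous residue
            if index - start >= min_length:
                regions.append((start, index - 1))
            start = None
    # Close a region that runs to the end of the chain
    if start is not None and len(scores) - start + 1 >= min_length:
        regions.append((start, len(scores)))
    return regions


# Illustrative scores, not taken from a real prediction
chain_scores = [95, 91, 40, 35, 88, 72, 71, 20]
all_regions = confident_regions(chain_scores)
long_regions = confident_regions(chain_scores, min_length=3)
print(all_regions, long_regions)
```

Raising `min_length` filters out short confident islands that are unlikely to form a usable domain on their own.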
Setup
```bash
pip install biopython requests numpy matplotlib pandas scipy
# Optional: pip install google-cloud-bigquery gsutil
```
Links
- Database: https://alphafold.ebi.ac.uk/
- API Docs: https://alphafold.ebi.ac.uk/api-docs
- Biopython: https://biopython.org/docs/dev/api/Bio.PDB.alphafold_db.html