opentargets-database

Open Targets Database

Overview

The Open Targets Platform is a comprehensive resource for systematic identification and prioritization of potential therapeutic drug targets. It integrates publicly available datasets including human genetics, omics, literature, and chemical data to build and score target-disease associations.
Key capabilities:
  • Query target (gene) annotations including tractability, safety, expression
  • Search for disease-target associations with evidence scores
  • Retrieve evidence from multiple data types (genetics, pathways, literature, etc.)
  • Find known drugs for diseases and their mechanisms
  • Access drug information including clinical trial phases and adverse events
  • Evaluate target druggability and therapeutic potential
Data access: The platform provides a GraphQL API, web interface, data downloads, and Google BigQuery access. This skill focuses on the GraphQL API for programmatic access.

When to Use This Skill

This skill should be used when:
  • Target discovery: Finding potential therapeutic targets for a disease
  • Target assessment: Evaluating tractability, safety, and druggability of genes
  • Evidence gathering: Retrieving supporting evidence for target-disease associations
  • Drug repurposing: Identifying existing drugs that could be repurposed for new indications
  • Competitive intelligence: Understanding clinical precedence and drug development landscape
  • Target prioritization: Ranking targets based on genetic evidence and other data types
  • Mechanism research: Investigating biological pathways and gene functions
  • Biomarker discovery: Finding genes differentially expressed in disease
  • Safety assessment: Identifying potential toxicity concerns for drug targets

Core Workflow

1. Search for Entities

Start by finding the identifiers for targets, diseases, or drugs of interest.

**For targets (genes):**
```python
from scripts.query_opentargets import search_entities

# Search by gene symbol or name
results = search_entities("BRCA1", entity_types=["target"])
# Returns: [{"id": "ENSG00000012048", "name": "BRCA1", ...}]
```

**For diseases:**
```python
# Search by disease name
results = search_entities("alzheimer", entity_types=["disease"])
# Returns: [{"id": "EFO_0000249", "name": "Alzheimer disease", ...}]
```

**For drugs:**
```python
# Search by drug name
results = search_entities("aspirin", entity_types=["drug"])
# Returns: [{"id": "CHEMBL25", "name": "ASPIRIN", ...}]
```

**Identifiers used:**
- Targets: Ensembl gene IDs (e.g., `ENSG00000157764`)
- Diseases: EFO (Experimental Factor Ontology) IDs (e.g., `EFO_0000249`)
- Drugs: ChEMBL IDs (e.g., `CHEMBL25`)
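
Because each entity type has a recognizable identifier shape, an ID can be sanity-checked before it is passed to a helper function. The sketch below is illustrative only: `classify_identifier` and `ID_PATTERNS` are hypothetical names, not part of `scripts/query_opentargets.py`, and the patterns cover only the three prefixes listed above, so any other ontology prefix is reported as unrecognized.

```python
import re

# Patterns derived from the identifier list above. Assumption: only these
# three prefixes are handled; anything else is treated as unrecognized.
ID_PATTERNS = {
    "target": re.compile(r"^ENSG\d{11}$"),
    "disease": re.compile(r"^EFO_\d+$"),
    "drug": re.compile(r"^CHEMBL\d+$"),
}


def classify_identifier(identifier):
    """Return 'target', 'disease', 'drug', or None for an unrecognized ID."""
    for entity_type, pattern in ID_PATTERNS.items():
        if pattern.match(identifier):
            return entity_type
    return None


for example in ("ENSG00000157764", "EFO_0000249", "CHEMBL25", "BRCA1"):
    print(example, "->", classify_identifier(example))
```

A gene symbol such as `BRCA1` is not an identifier, so it comes back as `None`, which is a cue to run `search_entities()` first.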

2. Query Target Information

Retrieve comprehensive target annotations to assess druggability and biology.
```python
from scripts.query_opentargets import get_target_info

target_info = get_target_info("ENSG00000157764", include_diseases=True)

# Access key fields:
# - approvedSymbol: HGNC gene symbol
# - approvedName: Full gene name
# - tractability: Druggability assessments across modalities
# - safetyLiabilities: Known safety concerns
# - geneticConstraint: Constraint scores from gnomAD
# - associatedDiseases: Top disease associations with scores
```

**Key annotations to review:**
- **Tractability:** Small molecule, antibody, PROTAC druggability predictions
- **Safety:** Known toxicity concerns from multiple databases
- **Genetic constraint:** pLI and LOEUF scores indicating essentiality
- **Disease associations:** Diseases linked to the target with evidence scores

Refer to `references/target_annotations.md` for detailed information about all target features.

3. Query Disease Information

Get disease details and associated targets/drugs.
```python
from scripts.query_opentargets import get_disease_info

disease_info = get_disease_info("EFO_0000249", include_targets=True)

# Access fields:
# - name: Disease name
# - description: Disease description
# - therapeuticAreas: High-level disease categories
# - associatedTargets: Top targets with association scores
```

4. Retrieve Target-Disease Evidence

Get detailed evidence supporting a target-disease association.
```python
from scripts.query_opentargets import get_target_disease_evidence

# Get all evidence
evidence = get_target_disease_evidence(
    ensembl_id="ENSG00000157764",
    efo_id="EFO_0000249"
)

# Filter by evidence type
genetic_evidence = get_target_disease_evidence(
    ensembl_id="ENSG00000157764",
    efo_id="EFO_0000249",
    data_types=["genetic_association"]
)

# Each evidence record contains:
# - datasourceId: Specific data source (e.g., "gwas_catalog", "chembl")
# - datatypeId: Evidence category (e.g., "genetic_association", "known_drug")
# - score: Evidence strength (0-1)
# - studyId: Original study identifier
# - literature: Associated publications
```

**Major evidence types:**
1. **genetic_association:** GWAS, rare variants, ClinVar, gene burden
2. **somatic_mutation:** Cancer Gene Census, IntOGen, cancer biomarkers
3. **known_drug:** Clinical precedence from approved/clinical drugs
4. **affected_pathway:** CRISPR screens, pathway analyses, gene signatures
5. **rna_expression:** Differential expression from Expression Atlas
6. **animal_model:** Mouse phenotypes from IMPC
7. **literature:** Text-mining from Europe PMC

Refer to `references/evidence_types.md` for detailed descriptions of all evidence types and interpretation guidelines.
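
To see which evidence types support an association, the records can be grouped by `datatypeId`. This is a minimal sketch, assuming each record is a dict with the `datatypeId` and `score` fields described above; `summarize_by_datatype` is a hypothetical helper, and the sample records are made up for illustration rather than taken from the API.

```python
from collections import defaultdict


def summarize_by_datatype(evidence_records):
    """Group records by datatypeId; report count and best score per type."""
    grouped = defaultdict(list)
    for record in evidence_records:
        grouped[record["datatypeId"]].append(record["score"])
    return {
        datatype: {"count": len(scores), "max_score": max(scores)}
        for datatype, scores in grouped.items()
    }


# Made-up records for illustration only (not real Open Targets data).
sample_evidence = [
    {"datasourceId": "gwas_catalog", "datatypeId": "genetic_association", "score": 0.8},
    {"datasourceId": "gwas_catalog", "datatypeId": "genetic_association", "score": 0.4},
    {"datasourceId": "chembl", "datatypeId": "known_drug", "score": 0.6},
]

summary = summarize_by_datatype(sample_evidence)
for datatype, stats in sorted(summary.items()):
    print(datatype, stats)
```

Reporting the count next to the best score makes it easier to tell one strong record apart from many weak ones.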

5. Find Known Drugs

Identify drugs used for a disease and their targets.
```python
from scripts.query_opentargets import get_known_drugs_for_disease

drugs = get_known_drugs_for_disease("EFO_0000249")

# drugs contains:
# - uniqueDrugs: Total number of unique drugs
# - uniqueTargets: Total number of unique targets
# - rows: List of drug-target-indication records with:
#   - drug: {name, drugType, maximumClinicalTrialPhase}
#   - targets: Genes targeted by the drug
#   - phase: Clinical trial phase for this indication
#   - status: Trial status (active, completed, etc.)
#   - mechanismOfAction: How drug works
```

**Clinical phases:**
- Phase 4: Approved drug
- Phase 3: Late-stage clinical trials
- Phase 2: Mid-stage trials
- Phase 1: Early safety trials
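
The phase list above can be used to filter the `rows` returned for a disease. The sketch below assumes each row is a dict with a numeric `phase` and a nested `drug` dict containing `name`, as described in the field list; `drugs_at_or_above_phase` and `PHASE_LABELS` are hypothetical names, and the rows use placeholder drug names rather than real results.

```python
# Labels copied from the clinical phase list above.
PHASE_LABELS = {
    4: "Approved drug",
    3: "Late-stage clinical trials",
    2: "Mid-stage trials",
    1: "Early safety trials",
}


def drugs_at_or_above_phase(rows, min_phase):
    """Return sorted unique drug names whose indication phase >= min_phase."""
    names = set()
    for row in rows:
        phase = row.get("phase")
        # Rows without a phase are skipped rather than guessed at.
        if phase is not None and phase >= min_phase:
            names.add(row["drug"]["name"])
    return sorted(names)


# Placeholder rows for illustration only.
sample_rows = [
    {"drug": {"name": "DRUG_A"}, "phase": 4},
    {"drug": {"name": "DRUG_B"}, "phase": 2},
    {"drug": {"name": "DRUG_A"}, "phase": 3},
    {"drug": {"name": "DRUG_C"}, "phase": None},
]

late_stage = drugs_at_or_above_phase(sample_rows, min_phase=3)
print(late_stage, "->", PHASE_LABELS[3], "or later")
```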

6. Get Drug Information

Retrieve detailed drug information including mechanisms and indications.
```python
from scripts.query_opentargets import get_drug_info

drug_info = get_drug_info("CHEMBL25")

# Access:
# - name, synonyms: Drug identifiers
# - drugType: Small molecule, antibody, etc.
# - maximumClinicalTrialPhase: Development stage
# - mechanismsOfAction: Target and action type
# - indications: Diseases with trial phases
# - withdrawnNotice: If withdrawn, reasons and countries
```

7. Get All Associations for a Target

Find all diseases associated with a target, optionally filtering by score.
```python
from scripts.query_opentargets import get_target_associations

# Get associations with score >= 0.5
associations = get_target_associations(
    ensembl_id="ENSG00000157764",
    min_score=0.5
)

# Each association contains:
# - disease: {id, name}
# - score: Overall association score (0-1)
# - datatypeScores: Breakdown by evidence type
```

**Association scores:**
- Range: 0-1 (higher = stronger evidence)
- Aggregate evidence across all data types using harmonic sum
- NOT confidence scores but relative ranking metrics
- Under-studied diseases may have lower scores despite good evidence
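
To make the harmonic sum concrete: scores are sorted in descending order and the score at rank i is divided by i squared, so additional weaker evidence adds progressively less. The sketch below illustrates that idea only. The function names are hypothetical, and the normalization ceiling (`max_terms=1000`) is an assumption, so the output should not be expected to reproduce platform scores.

```python
def harmonic_sum(scores):
    """Sum of score / rank**2 over scores sorted from highest to lowest."""
    ranked = sorted(scores, reverse=True)
    return sum(score / (rank ** 2) for rank, score in enumerate(ranked, start=1))


def normalized_harmonic_sum(scores, max_terms=1000):
    """Scale the harmonic sum into 0-1 by its theoretical maximum.

    Assumption: the maximum is reached when max_terms scores all equal 1.0.
    """
    top_scores = sorted(scores, reverse=True)[:max_terms]
    ceiling = sum(1 / (rank ** 2) for rank in range(1, max_terms + 1))
    return harmonic_sum(top_scores) / ceiling


# Two evidence scores: the stronger one counts fully, the weaker one at 1/4.
print(harmonic_sum([0.5, 1.0]))
print(round(normalized_harmonic_sum([0.5, 1.0]), 3))
```

With the inputs `[0.5, 1.0]` the raw sum is 1.0 + 0.5/4 = 1.125. The diminishing weights are why a disease with few studies can score lower than a well-studied one even when its individual evidence is strong.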

GraphQL API Details

For custom queries beyond the provided helper functions, use the GraphQL API directly or modify `scripts/query_opentargets.py`.
Key information:
  • Endpoint: `https://api.platform.opentargets.org/api/v4/graphql`
  • Interactive browser: `https://api.platform.opentargets.org/api/v4/graphql/browser`
  • No authentication required
  • Request only needed fields to minimize response size
  • Use pagination for large result sets: `page: {size: N, index: M}`
Refer to `references/api_reference.md` for:
  • Complete endpoint documentation
  • Example queries for all entity types
  • Error handling patterns
  • Best practices for API usage
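
A direct call to the endpoint might look like the sketch below, which uses only the Python standard library. The query text, the field names (`associatedDiseases`, `rows`, `count`), and the helpers `build_payload` and `post_graphql` are assumptions made for illustration; verify the schema in the interactive browser before relying on them.

```python
import json
import urllib.request

API_URL = "https://api.platform.opentargets.org/api/v4/graphql"

# Assumed query shape: request only the fields needed and page the results.
ASSOCIATIONS_QUERY = """
query targetAssociations($ensemblId: String!, $size: Int!, $index: Int!) {
  target(ensemblId: $ensemblId) {
    approvedSymbol
    associatedDiseases(page: {size: $size, index: $index}) {
      count
      rows {
        disease { id name }
        score
      }
    }
  }
}
"""


def build_payload(ensembl_id, size=25, index=0):
    """Build the JSON body for one page of results."""
    return {
        "query": ASSOCIATIONS_QUERY,
        "variables": {"ensemblId": ensembl_id, "size": size, "index": index},
    }


def post_graphql(payload, url=API_URL, timeout=30):
    """POST the payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_payload("ENSG00000157764", size=10, index=0)
print(json.dumps(payload["variables"]))
# To fetch the page: response = post_graphql(payload)
```

To walk a large result set, increase `index` by one per request until a page returns fewer than `size` rows.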

Best Practices

Target Prioritization Strategy

When prioritizing drug targets:
  1. Start with genetic evidence: Human genetics (GWAS, rare variants) provides strongest disease relevance
  2. Check tractability: Prefer targets with clinical or discovery precedence
  3. Assess safety: Review safety liabilities, expression patterns, and genetic constraint
  4. Evaluate clinical precedence: Known drugs indicate druggability and therapeutic window
  5. Consider multiple evidence types: Convergent evidence from different sources increases confidence
  6. Validate mechanistically: Pathway evidence and biological plausibility
  7. Review literature manually: For critical decisions, examine primary publications

Evidence Interpretation

Strong evidence indicators:
  • Multiple independent evidence sources
  • High genetic association scores (especially GWAS with L2G > 0.5)
  • Clinical precedence from approved drugs
  • ClinVar pathogenic variants with disease match
  • Mouse models with relevant phenotypes
Caution flags:
  • Single evidence source only
  • Text-mining as sole evidence (requires manual validation)
  • Conflicting evidence across sources
  • High essentiality + ubiquitous expression (poor therapeutic window)
  • Multiple safety liabilities
Score interpretation:
  • Scores rank relative strength, not absolute confidence
  • Under-studied diseases have lower scores despite potentially valid targets
  • Weight expert-curated sources higher than computational predictions
  • Check evidence breakdown, not just overall score
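
Two of the caution flags above can be checked mechanically from evidence records. This is a minimal sketch, assuming records carry the `datasourceId` and `datatypeId` fields described earlier; `caution_flags` is a hypothetical helper, the sample records are made up, and the remaining flags (conflicting evidence, essentiality, safety liabilities) still need manual review.

```python
def caution_flags(evidence_records):
    """Return caution flags that can be derived from the records alone."""
    flags = []
    sources = {record["datasourceId"] for record in evidence_records}
    datatypes = {record["datatypeId"] for record in evidence_records}
    if len(sources) == 1:
        flags.append("single evidence source only")
    if datatypes == {"literature"}:
        flags.append("text-mining as sole evidence")
    return flags


# Made-up records for illustration only.
text_mining_only = [
    {"datasourceId": "europepmc", "datatypeId": "literature", "score": 0.3},
    {"datasourceId": "europepmc", "datatypeId": "literature", "score": 0.2},
]
convergent = [
    {"datasourceId": "gwas_catalog", "datatypeId": "genetic_association", "score": 0.7},
    {"datasourceId": "chembl", "datatypeId": "known_drug", "score": 0.9},
]

print(caution_flags(text_mining_only))
print(caution_flags(convergent))
```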

Common Workflows

Workflow 1: Target Discovery for a Disease
  1. Search for disease → get EFO ID
  2. Query disease info with `include_targets=True`
  3. Review top targets sorted by association score
  4. For promising targets, get detailed target info
  5. Examine evidence types supporting each association
  6. Assess tractability and safety for prioritized targets
Workflow 2: Target Validation
  1. Search for target → get Ensembl ID
  2. Get comprehensive target info
  3. Check tractability (especially clinical precedence)
  4. Review safety liabilities and genetic constraint
  5. Examine disease associations to understand biology
  6. Look for chemical probes or tool compounds
  7. Check known drugs targeting gene for mechanism insights
Workflow 3: Drug Repurposing
  1. Search for disease → get EFO ID
  2. Get known drugs for disease
  3. For each drug, get detailed drug info
  4. Examine mechanisms of action and targets
  5. Look for related disease indications
  6. Assess clinical trial phases and status
  7. Identify repurposing opportunities based on mechanism
Workflow 4: Competitive Intelligence
  1. Search for target of interest
  2. Get associated diseases with evidence
  3. For each disease, get known drugs
  4. Review clinical phases and development status
  5. Identify competitors and their mechanisms
  6. Assess clinical precedence and market landscape

Resources

Scripts

`scripts/query_opentargets.py`: Helper functions for common API operations:
  • `search_entities()` - Search for targets, diseases, or drugs
  • `get_target_info()` - Retrieve target annotations
  • `get_disease_info()` - Retrieve disease information
  • `get_target_disease_evidence()` - Get supporting evidence
  • `get_known_drugs_for_disease()` - Find drugs for a disease
  • `get_drug_info()` - Retrieve drug details
  • `get_target_associations()` - Get all associations for a target
  • `execute_query()` - Execute custom GraphQL queries

References

`references/api_reference.md`: Complete GraphQL API documentation including:
  • Endpoint details and authentication
  • Available query types (target, disease, drug, search)
  • Example queries for all common operations
  • Error handling and best practices
  • Data licensing and citation requirements
`references/evidence_types.md`: Comprehensive guide to evidence types and data sources:
  • Detailed descriptions of all 7 major evidence types
  • Scoring methodologies for each source
  • Evidence interpretation guidelines
  • Strengths and limitations of each evidence type
  • Quality assessment recommendations
`references/target_annotations.md`: Complete target annotation reference:
  • 12 major annotation categories explained
  • Tractability assessment details
  • Safety liability sources
  • Expression, essentiality, and constraint data
  • Interpretation guidelines for target prioritization
  • Red flags and green flags for target assessment

Data Updates and Versioning

The Open Targets Platform is updated quarterly with new data releases. The current release (as of October 2025) is available at the API endpoint.
Release information: Check https://platform-docs.opentargets.org/release-notes for the latest updates.
Citation: When using Open Targets data, cite: Ochoa, D. et al. (2025) Open Targets Platform: facilitating therapeutic hypotheses building in drug discovery. Nucleic Acids Research, 53(D1):D1467-D1477.

Limitations and Considerations

  1. API is for exploratory queries: For systematic analyses of many targets/diseases, use data downloads or BigQuery
  2. Scores are relative, not absolute: Association scores rank evidence strength but don't predict clinical success
  3. Under-studied diseases score lower: Novel or rare diseases may have strong evidence but lower aggregate scores
  4. Evidence quality varies: Weight expert-curated sources higher than computational predictions
  5. Requires biological interpretation: Scores and evidence must be interpreted in biological and clinical context
  6. No authentication required: All data is freely accessible, but cite appropriately