tooluniverse-drug-target-validation


Drug Target Validation Pipeline


Validate drug target hypotheses using multi-dimensional computational evidence before committing to wet-lab work. Produces a quantitative Target Validation Score (0-100) with priority tier classification and GO/NO-GO recommendation.
KEY PRINCIPLES:
  1. Report-first approach - Create report file FIRST, then populate progressively
  2. Target disambiguation FIRST - Resolve all identifiers before analysis
  3. Evidence grading - Grade all evidence as T1 (experimental) to T4 (computational)
  4. Disease-specific - Tailor analysis to disease context when provided
  5. Modality-aware - Consider small molecule vs biologics tractability
  6. Safety-first - Prominently flag safety concerns early
  7. Quantitative scoring - Every dimension scored numerically (0-100 composite)
  8. Negative results documented - "No data" is data; empty sections are failures
  9. Source references - Every statement must cite tool/database
  10. Completeness checklist - Mandatory section showing analysis coverage
  11. English-first queries - Always use English terms in tool calls. Respond in user's language

When to Use This Skill

Apply when users:
  • Ask "Is [target] a good drug target for [disease]?"
  • Need target validation or druggability assessment
  • Want to compare targets for drug discovery prioritization
  • Ask about safety risks of modulating a target
  • Need chemical starting points for target validation
  • Ask about pathway context for a target
  • Need a GO/NO-GO recommendation for a target
  • Want a comprehensive target dossier for investment decisions
NOT for (use other skills instead):
  • General target biology overview -> Use tooluniverse-target-research
  • Drug compound profiling -> Use tooluniverse-drug-research
  • Variant interpretation -> Use tooluniverse-variant-interpretation
  • Disease research -> Use tooluniverse-disease-research

Input Parameters

| Parameter | Required | Description | Example |
|---|---|---|---|
| target | Yes | Gene symbol, protein name, or UniProt ID | `EGFR`, `P00533`, `Epidermal growth factor receptor` |
| disease | No | Disease/indication for context | `Non-small cell lung cancer`, `Pancreatic cancer` |
| modality | No | Preferred therapeutic modality | `small molecule`, `antibody`, `protein therapeutic`, `PROTAC` |

Target Validation Scoring System

Score Components (Total: 0-100)

Disease Association (0-30 points):
  • Genetic evidence: 0-10 (GWAS, rare variants, somatic mutations)
  • Literature evidence: 0-10 (publications, clinical studies)
  • Pathway evidence: 0-10 (disease pathway involvement)
Druggability (0-25 points):
  • Structural tractability: 0-10 (structure quality, binding pockets)
  • Chemical matter: 0-10 (known compounds, bioactivity data)
  • Target class: 0-5 (validated target family bonus)
Safety Profile (0-20 points):
  • Tissue expression selectivity: 0-5 (expression in critical tissues)
  • Genetic validation: 0-10 (knockout phenotypes, human genetics)
  • Known adverse events: 0-5 (safety signals from modulators)
Clinical Precedent (0-15 points):
  • Approved drugs: 15 (strong precedent, validated target)
  • Clinical trials: 10 (moderate precedent)
  • Preclinical compounds: 5 (weak precedent)
  • None: 0 (novel target)
Validation Evidence (0-10 points):
  • Functional studies: 0-5 (CRISPR, siRNA, biochemical)
  • Disease models: 0-5 (animal models, patient data)

Priority Tiers

| Score | Tier | Recommendation |
|---|---|---|
| 80-100 | Tier 1 | Highly validated - proceed with confidence |
| 60-79 | Tier 2 | Good target - needs focused validation |
| 40-59 | Tier 3 | Moderate risk - significant validation needed |
| 0-39 | Tier 4 | High risk - consider alternatives |
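
The component caps and tier cut-offs above can be combined into a small helper. This is a minimal sketch, not part of the skill itself: the function names `composite_score` and `priority_tier` and the dictionary keys are hypothetical, and clamping each dimension to its cap is an assumption about how out-of-range inputs should be handled.

```python
# Maximum points per dimension, copied from the score components above.
MAX_POINTS = {
    "disease_association": 30,
    "druggability": 25,
    "safety": 20,
    "clinical_precedent": 15,
    "validation_evidence": 10,
}


def composite_score(points):
    """Sum the dimension scores, clamping each to its allowed range (assumption)."""
    total = 0
    for dimension, cap in MAX_POINTS.items():
        total += max(0, min(cap, points.get(dimension, 0)))
    return total


def priority_tier(score):
    """Map a 0-100 composite score to the priority tier table."""
    if score >= 80:
        return "Tier 1"
    if score >= 60:
        return "Tier 2"
    if score >= 40:
        return "Tier 3"
    return "Tier 4"


# Illustrative numbers only, not real results for any target.
example = {
    "disease_association": 28,
    "druggability": 25,
    "safety": 12,
    "clinical_precedent": 15,
    "validation_evidence": 9,
}
total = composite_score(example)
print(total, priority_tier(total))  # prints: 89 Tier 1
```

Keeping the caps in one dictionary means the total can never exceed 100, even if a per-dimension rubric awards bonus points.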

Evidence Grading System

| Tier | Symbol | Criteria | Examples |
|---|---|---|---|
| T1 | [T1] | Direct mechanistic, human clinical proof | FDA-approved drug, crystal structure with mechanism, patient mutation |
| T2 | [T2] | Functional studies, model organism | siRNA phenotype, mouse KO, biochemical assay, CRISPR screen |
| T3 | [T3] | Association, screen hits, computational | GWAS hit, DepMap essentiality, expression correlation |
| T4 | [T4] | Mention, review, text-mined, predicted | Review article, database annotation, AlphaFold prediction |

Phase 0: Target Disambiguation & ID Resolution (ALWAYS FIRST)

Objective: Resolve the target to ALL needed identifiers before any analysis.

Resolution Strategy

```python
# Step 1: Determine input type and get initial identifiers
# If gene symbol (e.g., "EGFR"):
mygene = tu.tools.MyGene_query_genes(
    query="EGFR", species="human",
    fields="symbol,name,ensembl.gene,uniprot.Swiss-Prot,entrezgene"
)
# Extract: ensembl_id, uniprot_id, entrez_id, symbol, name

# If UniProt ID (e.g., "P00533"):
uniprot = tu.tools.UniProt_get_entry_by_accession(accession="P00533")
# Extract: gene names, Ensembl xrefs, function

# Step 2: Resolve Ensembl ID and get versioned ID for GTEx
ensembl = tu.tools.ensembl_lookup_gene(gene_id=ensembl_id, species="homo_sapiens")
# CRITICAL: species parameter is REQUIRED
# CRITICAL: Response is wrapped in {status, data, url, content_type} - access via ensembl['data']
ensembl_data = ensembl.get('data', ensembl) if isinstance(ensembl, dict) else ensembl
# Extract: version for versioned_id (e.g., "ENSG00000146648.18")

# Step 3: Get Ensembl cross-references
xrefs = tu.tools.ensembl_get_xrefs(id=ensembl_id)
# Extract: HGNC, UniProt, EntrezGene mappings

# Step 4: Get OpenTargets target info
ot_target = tu.tools.OpenTargets_get_target_id_description_by_name(targetName="EGFR")
# Verify ensemblId matches

# Step 5: Get ChEMBL target ID
chembl_targets = tu.tools.ChEMBL_search_targets(pref_name__contains="EGFR", organism="Homo sapiens", limit=5)
# Extract: target_chembl_id for later use

# Step 6: Get UniProt function summary
function_info = tu.tools.UniProt_get_function_by_accession(accession=uniprot_id)
# Returns list of strings (NOT dict)

# Step 7: Get alternative names for collision detection
alt_names = tu.tools.UniProt_get_alternative_names_by_accession(accession=uniprot_id)
```
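
Step 2 calls for unwrapping the response envelope and building a versioned gene ID for GTEx. The sketch below shows one way to do that on a plain dictionary. The helper name `versioned_ensembl_id` is hypothetical, and the `id` and `version` field names are assumptions about the lookup payload, so check them against a real response before relying on this.

```python
def versioned_ensembl_id(response):
    """Build "<gene id>.<version>" from an ensembl_lookup_gene-style response.

    Assumes the payload carries "id" and "version" keys (not confirmed by the
    skill text). Returns None when no gene ID can be found.
    """
    # Unwrap the {status, data, url, content_type} envelope when present.
    data = response.get("data", response) if isinstance(response, dict) else {}
    if not isinstance(data, dict):
        return None
    gene_id = data.get("id")
    version = data.get("version")
    if gene_id is None:
        return None
    # Fall back to the unversioned ID when no version is reported.
    return f"{gene_id}.{version}" if version is not None else gene_id


wrapped = {"status": "success", "data": {"id": "ENSG00000146648", "version": 18}}
print(versioned_ensembl_id(wrapped))  # prints: ENSG00000146648.18
```

The same function accepts an already-unwrapped dictionary, which mirrors the defensive `ensembl.get('data', ensembl)` pattern used in Step 2.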

Identifier Resolution Output

```markdown
## 1. Target Identity

| Database | Identifier | Verified |
|---|---|---|
| Gene Symbol | EGFR | Yes |
| Full Name | Epidermal growth factor receptor | Yes |
| Ensembl | ENSG00000146648 | Yes |
| Ensembl (versioned) | ENSG00000146648.18 | Yes |
| UniProt | P00533 | Yes |
| Entrez Gene | 1956 | Yes |
| ChEMBL | CHEMBL203 | Yes |
| HGNC | HGNC:3236 | Yes |

Protein Function: [from UniProt_get_function_by_accession]
Subcellular Location: [from UniProt_get_subcellular_location_by_accession]
Target Class: [from OpenTargets_get_target_classes_by_ensemblID]
```

Known Parameter Corrections

| Tool | WRONG Parameter | CORRECT Parameter |
|---|---|---|
| `ensembl_lookup_gene` | `id` | `gene_id` (+ `species="homo_sapiens"` REQUIRED) |
| `Reactome_map_uniprot_to_pathways` | `uniprot_id` | `id` |
| `ensembl_get_xrefs` | `gene_id` | `id` |
| `GTEx_get_median_gene_expression` | `gencode_id` only | `gencode_id` + `operation="median"` |
| `OpenTargets_*` | `ensemblID` (uppercase) | `ensemblId` (camelCase) |
| `OpenTargets_get_publications_*` | `ensemblId` | `entityId` |
| `OpenTargets_get_associated_drugs_by_target_ensemblID` | `ensemblId` only | `ensemblId` + `size` (REQUIRED) |
| `MyGene_query_genes` | `q` | `query` |
| `PubMed_search_articles` | returns `{articles: [...]}` | returns plain list of dicts |
| `UniProt_get_function_by_accession` | returns dict | returns list of strings |
| `HPA_get_rna_expression_by_source` | `ensembl_id` | `gene_name` + `source_type` + `source_name` (ALL required) |
| `alphafold_get_prediction` | `uniprot_accession` | `qualifier` |
| `drugbank_get_safety_*` | simple params | `query`, `case_sensitive`, `exact_match`, `limit` (ALL required) |

Phase 1: Disease Association Evidence (0-30 points)

Objective: Quantify the strength of target-disease association from genetic, literature, and pathway evidence.

1A. OpenTargets Disease Associations (Primary)

```python
# Get ALL disease associations for target
diseases = tu.tools.OpenTargets_get_diseases_phenotypes_by_target_ensembl(ensemblId=ensembl_id)

# If specific disease provided, get detailed evidence
if disease_name:
    disease_info = tu.tools.OpenTargets_get_disease_id_description_by_name(diseaseName=disease_name)
    efo_id = disease_info.get('id')  # e.g., "EFO_0003060"
    evidence = tu.tools.OpenTargets_target_disease_evidence(
        efoId=efo_id, ensemblId=ensembl_id
    )

    # Get evidence by data source for detailed breakdown
    datasource_evidence = tu.tools.OpenTargets_get_evidence_by_datasource(
        efoId=efo_id, ensemblId=ensembl_id,
        datasourceIds=["ot_genetics_portal", "eva", "gene2phenotype", "genomics_england", "uniprot_literature"],
        size=100
    )
```

1B. GWAS Genetic Evidence

```python
# GWAS associations for target gene
gwas_snps = tu.tools.gwas_get_snps_for_gene(mapped_gene=gene_symbol, size=50)

# If specific disease, search for trait-specific associations
if disease_name:
    gwas_studies = tu.tools.gwas_search_studies(query=disease_name, size=20)
```

1C. Constraint Scores (gnomAD)

```python
# Genetic constraint - intolerance to loss of function
constraints = tu.tools.gnomad_get_gene_constraints(gene_symbol=gene_symbol)
# Extract: pLI, LOEUF, missense_z, pRec
# High pLI (>0.9) = highly intolerant to LoF = likely essential
```

1D. Literature Evidence

```python
# PubMed for target-disease association
articles = tu.tools.PubMed_search_articles(
    query=f'"{gene_symbol}" AND "{disease_name}" AND (target OR therapeutic OR inhibitor)',
    limit=50
)
# PubMed_search_articles returns a plain list of dicts

# OpenTargets publications
pubs = tu.tools.OpenTargets_get_publications_by_target_ensemblID(entityId=ensembl_id)
```

Scoring Logic - Disease Association

Genetic Evidence (0-10):
  - GWAS hits for specific disease: +3 per significant locus (max 6)
  - Rare variant evidence (ClinVar pathogenic): +2
  - Somatic mutations in disease: +2
  - pLI > 0.9 (essential gene): +2

Literature Evidence (0-10):
  - >100 publications on target+disease: 10
  - 50-100 publications: 7
  - 10-50 publications: 5
  - 1-10 publications: 3
  - 0 publications: 0

Pathway Evidence (0-10):
  - OpenTargets overall score > 0.8: 10
  - Score 0.5-0.8: 7
  - Score 0.2-0.5: 4
  - Score < 0.2: 1

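
The disease-association rubric can be written as three small functions. This is a sketch under stated assumptions, with hypothetical function names: the genetic sub-score is capped at 10 because the listed increments can sum to 12, and bin edges shared by two ranges (10, 50, 100 publications; scores of 0.2, 0.5, 0.8) are resolved here by treating each range's lower bound as inclusive, which the rubric does not specify.

```python
def genetic_evidence_score(gwas_loci, clinvar_pathogenic, somatic_mutations, pli=None):
    """Genetic evidence, 0-10. The final cap at 10 is an assumption."""
    score = min(6, 3 * max(0, gwas_loci))  # +3 per significant locus, max 6
    if clinvar_pathogenic:
        score += 2
    if somatic_mutations:
        score += 2
    if pli is not None and pli > 0.9:
        score += 2
    return min(10, score)


def literature_evidence_score(n_publications):
    """Literature evidence, 0-10, from the target+disease publication count."""
    if n_publications > 100:
        return 10
    if n_publications >= 50:   # boundary handling is an assumption
        return 7
    if n_publications >= 10:
        return 5
    if n_publications >= 1:
        return 3
    return 0


def pathway_evidence_score(overall_score):
    """Pathway evidence, 1-10, from the OpenTargets overall association score."""
    if overall_score > 0.8:
        return 10
    if overall_score >= 0.5:   # boundary handling is an assumption
        return 7
    if overall_score >= 0.2:
        return 4
    return 1


# Illustrative inputs only.
phase1_total = (
    genetic_evidence_score(2, True, True, pli=0.95)
    + literature_evidence_score(120)
    + pathway_evidence_score(0.85)
)
print(phase1_total)  # prints: 30
```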

Phase 2: Druggability Assessment (0-25 points)

Objective: Assess whether the target is amenable to therapeutic intervention.

2A. OpenTargets Tractability

```python
# Tractability assessment across modalities
tractability = tu.tools.OpenTargets_get_target_tractability_by_ensemblID(ensemblId=ensembl_id)
# Returns: label, modality (SM, AB, PR, OC), value (boolean/score)
# Modalities: Small Molecule, Antibody, PROTAC, Other Clinical
```

2B. Target Class & Family

```python
# Target classification (kinase, GPCR, ion channel, etc.)
target_classes = tu.tools.OpenTargets_get_target_classes_by_ensemblID(ensemblId=ensembl_id)

# Pharos target development level
pharos = tu.tools.Pharos_get_target(gene=gene_symbol)
# TDL: Tclin (approved drug) > Tchem (compounds) > Tbio (biology) > Tdark (unknown)

# DGIdb druggability categories
druggability = tu.tools.DGIdb_get_gene_druggability(genes=[gene_symbol])
```

2C. Structural Tractability

```python
# PDB structures available
if uniprot_id:
    uniprot_entry = tu.tools.UniProt_get_entry_by_accession(accession=uniprot_id)
    # Extract PDB cross-references from entry

# AlphaFold prediction
alphafold = tu.tools.alphafold_get_prediction(qualifier=uniprot_id)
alphafold_summary = tu.tools.alphafold_get_summary(qualifier=uniprot_id)

# For top PDB structures, analyze binding pockets
# ProteinsPlus DoGSiteScorer for pocket detection
for pdb_id in top_pdb_ids[:3]:
    pockets = tu.tools.ProteinsPlus_predict_binding_sites(pdb_id=pdb_id)
    # Returns predicted druggable pockets with scores
```

2D. Chemical Probes & Enabling Packages

```python
# Chemical probes (validated tool compounds)
probes = tu.tools.OpenTargets_get_chemical_probes_by_target_ensemblID(ensemblId=ensembl_id)

# Target Enabling Packages (TEPs)
teps = tu.tools.OpenTargets_get_target_enabling_packages_by_ensemblID(ensemblId=ensembl_id)
```

Scoring Logic - Druggability

Structural Tractability (0-10):
  - High-res co-crystal structure with ligand: 10
  - PDB structure available, pockets detected: 7
  - AlphaFold only, confident pocket prediction: 5
  - AlphaFold low confidence / no structure: 2
  - No structural data: 0

Chemical Matter (0-10):
  - Known drug-like compounds (IC50 < 100nM): 10
  - Tool compounds (IC50 < 1uM): 7
  - HTS hits only (IC50 > 1uM): 4
  - No known ligands: 0

Target Class Bonus (0-5):
  - Validated druggable family (kinase, GPCR, nuclear receptor): 5
  - Enzyme, ion channel: 4
  - Protein-protein interaction, transporter: 2
  - Novel/unknown class: 0

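
Two of the druggability sub-scores reduce to simple lookups. The sketch below uses hypothetical names (`chemical_matter_score`, `target_class_bonus`) and makes two assumptions the rubric leaves open: the potency category is inferred from the best IC50 alone, and a compound at exactly 1 uM falls into the lowest potency band.

```python
# Target class bonus, copied from the rubric above (keys lower-cased).
TARGET_CLASS_BONUS = {
    "kinase": 5,
    "gpcr": 5,
    "nuclear receptor": 5,
    "enzyme": 4,
    "ion channel": 4,
    "protein-protein interaction": 2,
    "transporter": 2,
}


def chemical_matter_score(best_ic50_nm):
    """Chemical matter, 0-10, from the most potent known IC50 in nM.

    Pass None when no ligands are known.
    """
    if best_ic50_nm is None:
        return 0
    if best_ic50_nm < 100:
        return 10
    if best_ic50_nm < 1000:
        return 7
    return 4  # includes exactly 1 uM (assumption)


def target_class_bonus(target_class):
    """Target class bonus, 0-5; unlisted classes score 0."""
    return TARGET_CLASS_BONUS.get(target_class.strip().lower(), 0)


print(chemical_matter_score(0.3), target_class_bonus("Kinase"))  # prints: 10 5
```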

Phase 3: Known Modulators & Chemical Matter (Feeds into Phase 2 scoring)

Objective: Identify existing chemical starting points for target validation.

3A. ChEMBL Bioactivity

```python
# Search for ChEMBL target
chembl_targets = tu.tools.ChEMBL_search_targets(
    pref_name__contains=gene_symbol,
    organism="Homo sapiens",
    limit=10
)

# Get activities for best matching target
target_chembl_id = chembl_targets[0]['target_chembl_id']
activities = tu.tools.ChEMBL_get_target_activities(
    target_chembl_id__exact=target_chembl_id,
    limit=100
)
# Parse: compound IDs, pChEMBL values, activity types (IC50, Ki, Kd)
# Filter: potent compounds (pChEMBL >= 6.0 = IC50 <= 1uM)
```
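
The potency filter relies on the definition of pChEMBL as the negative base-10 logarithm of the molar potency, so pChEMBL 6.0 corresponds to 1 uM (1000 nM). A minimal sketch of the conversion and the filter follows. The helper names are hypothetical, and the `pchembl_value` key is an assumption about how each activity record is shaped in the tool's output.

```python
def pchembl_to_nm(pchembl):
    """Convert a pChEMBL value (-log10 of molar potency) to nanomolar."""
    return 10 ** (9 - pchembl)


def potent_activities(activities, threshold=6.0):
    """Keep activity records whose pChEMBL value is at or above the threshold.

    Records with a missing or non-numeric value are skipped rather than
    treated as inactive.
    """
    kept = []
    for record in activities:
        raw = record.get("pchembl_value")  # assumed field name
        try:
            value = float(raw)
        except (TypeError, ValueError):
            continue
        if value >= threshold:
            kept.append(record)
    return kept


# Made-up records for illustration only.
sample = [
    {"molecule_chembl_id": "A", "pchembl_value": "9.5"},
    {"molecule_chembl_id": "B", "pchembl_value": "5.2"},
    {"molecule_chembl_id": "C", "pchembl_value": None},
]
print([r["molecule_chembl_id"] for r in potent_activities(sample)])  # prints: ['A']
```

Parsing with `float()` covers the case where values arrive as strings, which is common for JSON-derived bioactivity records.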

3B. BindingDB Ligands

```python
# Experimental binding data
ligands = tu.tools.BindingDB_get_ligands_by_uniprot(
    uniprot=uniprot_id,
    affinity_cutoff=10000  # nM
)
# Returns: SMILES, affinity_type (Ki/IC50/Kd), affinity value, PMID
```

3C. PubChem Bioassays

```python
# HTS screening data
assays = tu.tools.PubChem_search_assays_by_target_gene(gene_symbol=gene_symbol)

# Get details for top assays
for aid in assay_ids[:5]:
    summary = tu.tools.PubChem_get_assay_summary(aid=str(aid))
    targets = tu.tools.PubChem_get_assay_targets(aid=str(aid))
    actives = tu.tools.PubChem_get_assay_active_compounds(aid=str(aid))
```

3D. Known Drugs Targeting This Protein

```python
# OpenTargets known drugs
drugs = tu.tools.OpenTargets_get_associated_drugs_by_target_ensemblID(
    ensemblId=ensembl_id,
    size=25
)

# ChEMBL drug mechanisms
drug_mechanisms = tu.tools.ChEMBL_search_mechanisms(
    target_chembl_id=target_chembl_id,
    limit=50
)

# Drug interaction databases
dgidb = tu.tools.DGIdb_get_gene_info(genes=[gene_symbol])
```

Report Format - Chemical Matter

```markdown
## 4. Known Modulators & Chemical Matter

### 4.1 Approved Drugs

| Drug | ChEMBL ID | Mechanism | Phase | Indication | Source |
|---|---|---|---|---|---|
| Erlotinib | CHEMBL553 | Inhibitor | 4 | NSCLC | [T1] OpenTargets |
| Gefitinib | CHEMBL939 | Inhibitor | 4 | NSCLC | [T1] OpenTargets |

### 4.2 ChEMBL Bioactivity Summary

Total Activities: 12,456 datapoints across 2,341 assays
Most Potent Compound: CHEMBL413456 (IC50 = 0.3 nM) [T1]
Chemical Series: 8 distinct scaffolds with pChEMBL >= 7.0
Selectivity Data: Available for 45 compounds (kinase panel)

### 4.3 BindingDB Ligands

Total Ligands: 856 with measured affinity
Best Affinity: 0.1 nM (Ki)
Affinity Distribution: <1nM: 23, 1-10nM: 89, 10-100nM: 234, 100nM-1uM: 510

### 4.4 Chemical Probes

| Probe | Source | Potency | Selectivity | Use |
|---|---|---|---|---|
| SGC-1234 | SGC | IC50=5nM | >100x | In vitro |

---
```

Phase 4: Clinical Precedent (0-15 points)

Objective: Assess clinical validation from approved drugs and clinical trials.

4A. FDA-Approved Drugs

```python
# FDA label information
# The FDA label tools are keyed by drug name, so pass a known modulator
# (e.g., one found in Phase 3D) rather than the gene symbol.
fda_moa = tu.tools.FDA_get_mechanism_of_action_by_drug_name(drug_name=known_drug_name)
fda_indications = tu.tools.FDA_get_indications_by_drug_name(drug_name=known_drug_name)

# DrugBank pharmacology
drugbank_targets = tu.tools.drugbank_get_targets_by_drug_name_or_drugbank_id(
    query=known_drug_name, case_sensitive=False, exact_match=False, limit=10
)

# DrugBank safety info
drugbank_safety = tu.tools.drugbank_get_safety_by_drug_name_or_drugbank_id(
    query=known_drug_name, case_sensitive=False, exact_match=False, limit=10
)
```

4B. Clinical Trials

```python
# Active clinical trials targeting this protein
trials = tu.tools.search_clinical_trials(
    query_term=gene_symbol,
    intervention=gene_symbol,
    pageSize=50
)

# If specific disease context
if disease_name:
    disease_trials = tu.tools.search_clinical_trials(
        query_term=gene_symbol,
        condition=disease_name,
        pageSize=50
    )
```

4C. Failed Programs (Learn from Failures)

```python
# Drug warnings and withdrawals
for drug_chembl_id in known_drug_ids:
    warnings = tu.tools.OpenTargets_get_drug_warnings_by_chemblId(chemblId=drug_chembl_id)
    adverse = tu.tools.OpenTargets_get_drug_adverse_events_by_chemblId(chemblId=drug_chembl_id)
```

Scoring Logic - Clinical Precedent

Clinical Precedent (0-15):
  - FDA-approved drug for SAME disease: 15
  - FDA-approved drug for DIFFERENT disease: 12
  - Phase 3 clinical trial: 10
  - Phase 2 clinical trial: 7
  - Phase 1 clinical trial: 5
  - Preclinical compounds only: 3
  - No clinical development: 0

Adjustment factors:
  - Failed clinical program for safety: -3
  - Drug withdrawal: -5
  - Multiple approved drugs (validated class): +2

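
The base score and adjustment factors can be combined as follows. This is a sketch with hypothetical names (`clinical_precedent_score` and the stage keys). It assumes the adjustments are additive and that the result is clamped to the 0-15 range, since the rubric does not say what happens when a bonus pushes the score past 15 or penalties push it below 0.

```python
# Base points by most advanced development stage, copied from the rubric above.
BASE_POINTS = {
    "approved_same_disease": 15,
    "approved_different_disease": 12,
    "phase3": 10,
    "phase2": 7,
    "phase1": 5,
    "preclinical": 3,
    "none": 0,
}


def clinical_precedent_score(stage, safety_failure=False, withdrawal=False,
                             multiple_approved=False):
    """Clinical precedent, 0-15, with the listed adjustment factors applied."""
    score = BASE_POINTS[stage]
    if safety_failure:       # failed clinical program for safety
        score -= 3
    if withdrawal:           # drug withdrawal
        score -= 5
    if multiple_approved:    # multiple approved drugs (validated class)
        score += 2
    return max(0, min(15, score))  # clamping is an assumption


print(clinical_precedent_score("approved_different_disease", withdrawal=True))  # prints: 7
```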

Phase 5: Safety & Toxicity Considerations (0-20 points)

Objective: Identify safety risks from expression, genetics, and known adverse events.

5A. OpenTargets Safety Profile

```python
safety = tu.tools.OpenTargets_get_target_safety_profile_by_ensemblID(ensemblId=ensembl_id)
# Returns: safety liabilities, adverse effects, experimental toxicity
```

5B. Expression in Critical Tissues

```python
# GTEx tissue expression (identifies essential organ expression)
gtex = tu.tools.GTEx_get_median_gene_expression(
    operation="median",
    gencode_id=ensembl_versioned_id
)
# If empty, try unversioned ID

# HPA expression
# NOTE: HPA_get_rna_expression_by_source requires gene_name, source_type, source_name
hpa = tu.tools.HPA_search_genes_by_query(search_query=gene_symbol)
hpa_details = tu.tools.HPA_get_comprehensive_gene_details_by_ensembl_id(ensembl_id=ensembl_id)

# Check expression in safety-critical tissues
# Heart, liver, kidney, brain, bone marrow = high risk if target is expressed
```
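
The "if empty, try unversioned ID" fallback for GTEx can be isolated in a helper that takes the lookup function as an argument, which keeps it testable without network access. The name `gtex_median_expression` and the `fetch` parameter are hypothetical. In practice `fetch` would be `tu.tools.GTEx_get_median_gene_expression`, and treating any falsy result as "empty" is an assumption about the response shape.

```python
def gtex_median_expression(fetch, versioned_id):
    """Query GTEx with the versioned gene ID, then retry without the version.

    `fetch` is any callable accepting operation= and gencode_id= keyword
    arguments, for example tu.tools.GTEx_get_median_gene_expression.
    """
    result = fetch(operation="median", gencode_id=versioned_id)
    if result:
        return result
    # Strip the ".<version>" suffix, e.g. "ENSG00000146648.18" -> "ENSG00000146648".
    unversioned_id = versioned_id.split(".", 1)[0]
    if unversioned_id == versioned_id:
        return result  # nothing to strip, so no second attempt
    return fetch(operation="median", gencode_id=unversioned_id)


# Stand-in for the real tool: only answers for unversioned IDs.
calls = []

def fake_fetch(operation, gencode_id):
    calls.append(gencode_id)
    return [{"tissue": "Lung"}] if "." not in gencode_id else []

print(gtex_median_expression(fake_fetch, "ENSG00000146648.18"))  # prints: [{'tissue': 'Lung'}]
```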

5C. Knockout Phenotypes

```python
# Mouse model phenotypes
mouse_models = tu.tools.OpenTargets_get_biological_mouse_models_by_ensemblID(ensemblId=ensembl_id)

# Genetic constraint (proxy for essentiality)
constraints = tu.tools.gnomad_get_gene_constraints(gene_symbol=gene_symbol)
# High pLI = essential gene = potential safety concern
```

5D. Known Adverse Events from Target Modulation

```python
# For known drugs targeting this protein
for drug_name in known_drug_names:
    fda_adr = tu.tools.FDA_get_adverse_reactions_by_drug_name(drug_name=drug_name)
    fda_warnings = tu.tools.FDA_get_warnings_and_cautions_by_drug_name(drug_name=drug_name)
    fda_boxed = tu.tools.FDA_get_boxed_warning_info_by_drug_name(drug_name=drug_name)
    fda_contraindications = tu.tools.FDA_get_contraindications_by_drug_name(drug_name=drug_name)
```

5E. Homologs & Off-Target Risks

python
# Paralogs (close family members that might be hit)
homologs = tu.tools.OpenTargets_get_target_homologues_by_ensemblID(ensemblId=ensembl_id)
# Paralogs with high sequence identity = selectivity challenge

Scoring Logic - Safety

Tissue Expression Selectivity (0-5):
  - Target restricted to disease tissue: 5
  - Low expression in heart/liver/kidney/brain: 4
  - Moderate expression in 1-2 critical tissues: 2
  - High expression in multiple critical tissues: 0

Genetic Validation (0-10):
  - Mouse KO viable, no severe phenotype: 10
  - Mouse KO viable with mild phenotype: 7
  - Mouse KO has concerning phenotype: 3
  - Mouse KO lethal: 0
  - No KO data, low pLI (<0.5): 5
  - No KO data, high pLI (>0.9): 2

Known Adverse Events (0-5):
  - No known safety signals: 5
  - Mild, manageable ADRs: 3
  - Serious ADRs reported: 1
  - Black box warning or drug withdrawal: 0
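The rubric above amounts to three lookups and a sum. The following is a minimal sketch, not the skill's actual implementation: the category keys (for example `ko_viable_mild_phenotype`) and the function name `score_safety` are hypothetical labels introduced here, and only the point values come from the rubric.

```python
# Point values are copied from the rubric; the key names are hypothetical labels.
EXPRESSION_POINTS = {
    "restricted_to_disease_tissue": 5,
    "low_in_critical_tissues": 4,
    "moderate_in_1_2_critical_tissues": 2,
    "high_in_multiple_critical_tissues": 0,
}
GENETIC_POINTS = {
    "ko_viable_no_severe_phenotype": 10,
    "ko_viable_mild_phenotype": 7,
    "ko_concerning_phenotype": 3,
    "ko_lethal": 0,
    "no_ko_low_pli": 5,    # no KO data, pLI < 0.5
    "no_ko_high_pli": 2,   # no KO data, pLI > 0.9
}
ADVERSE_POINTS = {
    "no_known_signals": 5,
    "mild_manageable_adrs": 3,
    "serious_adrs": 1,
    "boxed_warning_or_withdrawal": 0,
}


def score_safety(expression, genetic, adverse):
    """Return the three safety sub-scores and their 0-20 total."""
    parts = {
        "safety_expression": EXPRESSION_POINTS[expression],
        "safety_genetic": GENETIC_POINTS[genetic],
        "safety_adverse": ADVERSE_POINTS[adverse],
    }
    return parts, sum(parts.values())
```

Keying on explicit categories forces the analyst to pick one rubric row per dimension, which keeps the justification for each sub-score visible in the report. Note that the rubric does not assign a value for pLI between 0.5 and 0.9 without KO data, so the sketch has no key for that case.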

Phase 6: Pathway Context & Network Analysis

Objective: Understand the target's role in biological networks and disease pathways.

6A. Reactome Pathways

python
# Map target to pathways
pathways = tu.tools.Reactome_map_uniprot_to_pathways(id=uniprot_id)

# Get pathway details for top pathways
# (top_pathways is assumed to be a ranked selection from `pathways`; the ranking step is not shown)
for pathway in top_pathways[:5]:
    detail = tu.tools.Reactome_get_pathway(id=pathway['stId'])
    reactions = tu.tools.Reactome_get_pathway_reactions(id=pathway['stId'])

6B. Protein-Protein Interactions

python
# STRING network
string_ppi = tu.tools.STRING_get_protein_interactions(
    protein_ids=[gene_symbol],
    species=9606,
    confidence_score=0.7
)
# Higher confidence = more reliable

# IntAct interactions (experimental)
intact_ppi = tu.tools.intact_get_interactions(identifier=uniprot_id)

# OpenTargets interactions
ot_ppi = tu.tools.OpenTargets_get_target_interactions_by_ensemblID(ensemblId=ensembl_id)

6C. Functional Enrichment

python
# GO annotations
go_terms = tu.tools.OpenTargets_get_target_gene_ontology_by_ensemblID(ensemblId=ensembl_id)

# Direct GO query
go_annotations = tu.tools.GO_get_annotations_for_gene(gene_id=gene_symbol)

# STRING functional enrichment of interaction partners
enrichment = tu.tools.STRING_functional_enrichment(
    protein_ids=[gene_symbol],
    species=9606
)

Report Format - Pathway Context

markdown

7. Pathway Context & Network Analysis

7.1 Key Pathways

| Pathway | Reactome ID | Relevance to Disease | Evidence |
|---|---|---|---|
| EGFR signaling | R-HSA-177929 | Driver pathway in NSCLC | [T1] |
| RAS-RAF-MEK-ERK | R-HSA-5673001 | Downstream effector | [T1] |
| PI3K-AKT signaling | R-HSA-2219528 | Resistance mechanism | [T2] |

7.2 Protein-Protein Interactions

Total Interactors: 45 (STRING confidence > 0.7)
Key Interactors: GRB2, SHC1, PLCG1, PIK3CA, STAT3

7.3 Pathway Redundancy Assessment

Compensation Risk: MODERATE
  • Parallel pathways: HER2, HER3 can compensate
  • Feedback loops: RAS activation bypasses EGFR
  • Downstream convergence: MEK/ERK shared with other RTKs

---

Phase 7: Validation Evidence (0-10 points)

Objective: Assess existing functional validation data.

7A. DepMap Essentiality (CRISPR/RNAi)

python
# Gene essentiality in cancer cell lines
deps = tu.tools.DepMap_get_gene_dependencies(gene_symbol=gene_symbol)
# Negative scores = essential (cells die upon KO)
# Score < -0.5: moderately essential
# Score < -1.0: strongly essential
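The thresholds in the comments above can be applied mechanically once per-cell-line scores are in hand. This is a minimal sketch that assumes the scores have already been extracted into a plain list of floats; the return shape of `DepMap_get_gene_dependencies` is not shown in this document, so that extraction step is left out. `classify_dependency` and `summarize_dependencies` are hypothetical helper names.

```python
def classify_dependency(score):
    """Map one dependency score to the essentiality labels used above."""
    if score < -1.0:
        return "strongly essential"
    if score < -0.5:
        return "moderately essential"
    return "not essential by these thresholds"


def summarize_dependencies(scores):
    """Count how many cell lines fall into each essentiality class."""
    counts = {
        "strongly essential": 0,
        "moderately essential": 0,
        "not essential by these thresholds": 0,
    }
    for score in scores:
        counts[classify_dependency(score)] += 1
    return counts
```

A per-class count makes it easier to tell a pan-essential gene (most lines strongly essential, a potential safety concern) from a selectively essential one (a few dependent lines).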

7B. Literature Validation Evidence

python
# Search for functional studies
validation_papers = tu.tools.PubMed_search_articles(
    query=f'"{gene_symbol}" AND (CRISPR OR siRNA OR knockdown OR knockout OR "loss of function") AND "{disease_name}"',
    limit=30
)

# Search for biomarker studies
biomarker_papers = tu.tools.PubMed_search_articles(
    query=f'"{gene_symbol}" AND (biomarker OR "target engagement" OR "pharmacodynamic")',
    limit=20
)

7C. Animal Model Evidence

python
# Mouse phenotypes from OpenTargets (already retrieved in Phase 5)
# Reuse mouse_models data

# CTD gene-disease associations (complementary)
ctd_diseases = tu.tools.CTD_get_gene_diseases(input_terms=gene_symbol)

Scoring Logic - Validation Evidence

Functional Studies (0-5):
  - CRISPR KO shows disease-relevant phenotype: 5
  - siRNA knockdown shows phenotype: 4
  - Biochemical assay validates mechanism: 3
  - Overexpression study only: 2
  - No functional data: 0

Disease Models (0-5):
  - Patient-derived xenograft (PDX) response: 5
  - Genetically engineered mouse model: 4
  - Cell line model: 3
  - In silico model only: 1
  - No model data: 0

Phase 8: Structural Insights

Objective: Leverage structural biology for druggability and mechanism understanding.

8A. PDB Structures

python
# Get PDB entries from UniProt cross-references
uniprot_entry = tu.tools.UniProt_get_entry_by_accession(accession=uniprot_id)
# Parse: uniProtKBCrossReferences where database == "PDB"

# Get details for each PDB
for pdb_id in pdb_ids[:10]:
    metadata = tu.tools.get_protein_metadata_by_pdb_id(pdb_id=pdb_id)
    quality = tu.tools.pdbe_get_entry_quality(pdb_id=pdb_id)
    summary = tu.tools.pdbe_get_entry_summary(pdb_id=pdb_id)
    experiment = tu.tools.pdbe_get_entry_experiment(pdb_id=pdb_id)
    molecules = tu.tools.pdbe_get_entry_molecules(pdb_id=pdb_id)
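The "Parse" step above is where `pdb_ids` comes from. A minimal sketch of that step, assuming the tool returns a dict shaped like a UniProt REST JSON entry, in which each cross-reference carries `database` and `id` keys; if the tool wraps the entry in an outer object, it would need unwrapping first. `extract_pdb_ids` is a hypothetical helper name.

```python
def extract_pdb_ids(uniprot_entry):
    """Collect PDB accession codes from a UniProt entry's cross-references."""
    cross_refs = uniprot_entry.get("uniProtKBCrossReferences", [])
    return [
        ref["id"]
        for ref in cross_refs
        if ref.get("database") == "PDB" and "id" in ref
    ]
```

An empty list is a legitimate result (no experimental structure) and should be recorded in the report rather than treated as a tool failure.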

8B. AlphaFold Prediction

python
alphafold = tu.tools.alphafold_get_prediction(qualifier=uniprot_id)
alphafold_info = tu.tools.alphafold_get_summary(qualifier=uniprot_id)
# Check pLDDT scores for confidence

8C. Binding Pocket Analysis

python
# ProteinsPlus DoGSiteScorer for best PDB structure
pockets = tu.tools.ProteinsPlus_predict_binding_sites(pdb_id=best_pdb_id)
# Returns: pocket locations, druggability scores, volume, surface

# Interaction diagram for co-crystal structures
if has_ligand:
    diagram = tu.tools.ProteinsPlus_generate_interaction_diagram(pdb_id=pdb_id)

8D. Domain Architecture

python
# InterPro domains
domains = tu.tools.InterPro_get_protein_domains(uniprot_accession=uniprot_id)

# Domain details for key domains
for domain in domains[:5]:
    detail = tu.tools.InterPro_get_domain_details(entry_id=domain['accession'])

---

Phase 9: Literature Deep Dive

Objective: Comprehensive literature analysis with collision-aware search, so that gene symbols shared with unrelated terms do not pollute the results.

9A. Collision Detection

python
# Detect naming collisions before literature search
test_results = tu.tools.PubMed_search_articles(
    query=f'"{gene_symbol}"[Title]',
    limit=20
)
# PubMed returns plain list of dicts
# Check if >20% of results are off-topic (no biology terms)
# If collision detected, add filters: AND (protein OR gene OR receptor OR kinase)
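The 20% check described in the comments can be sketched as follows. This assumes each result dict exposes a `title` key, which is not confirmed by this document, and uses the four filter words from the comment as the biology-term list. `off_topic_fraction` and `build_query` are hypothetical helper names.

```python
BIOLOGY_TERMS = ("protein", "gene", "receptor", "kinase")


def off_topic_fraction(articles, terms=BIOLOGY_TERMS):
    """Fraction of articles whose title contains none of the biology terms."""
    if not articles:
        return 0.0
    off_topic = sum(
        1
        for article in articles
        if not any(term in article.get("title", "").lower() for term in terms)
    )
    return off_topic / len(articles)


def build_query(gene_symbol, articles, threshold=0.2):
    """Add the disambiguating filter only when a naming collision is likely."""
    query = f'"{gene_symbol}"'
    if off_topic_fraction(articles) > threshold:
        query += " AND (protein OR gene OR receptor OR kinase)"
    return query
```

Substring matching is deliberately crude (for example, "general" matches "gene"), so the fraction is a screening signal rather than a precise measure.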

9B. Publication Metrics

python
# Total publications
total = tu.tools.PubMed_search_articles(
    query=f'"{gene_symbol}" AND (protein OR gene)',
    limit=1
)
# Check total_count field

# Recent publications (5-year trend)
recent = tu.tools.PubMed_search_articles(
    query=f'"{gene_symbol}" AND (protein OR gene) AND ("2021"[PDAT] : "2026"[PDAT])',
    limit=50
)

# Drug-focused publications
drug_pubs = tu.tools.PubMed_search_articles(
    query=f'"{gene_symbol}" AND (drug OR therapeutic OR inhibitor OR antibody)',
    limit=30
)

# EuropePMC for broader coverage
epmc = tu.tools.EuropePMC_search_articles(
    query=f'"{gene_symbol}" AND drug target',
    limit=30
)

9C. Key Reviews and Landmark Papers

python
# Reviews for target overview
reviews = tu.tools.PubMed_search_articles(
    query=f'"{gene_symbol}" AND drug target AND review[pt]',
    limit=10
)

# OpenAlex for citation metrics
openalex_works = tu.tools.openalex_search_works(
    query=f'{gene_symbol} drug target',
    limit=20
)

---

Phase 10: Validation Roadmap (Synthesis)

Objective: Generate actionable recommendations based on all evidence.
This phase synthesizes all previous phases into:
  1. Target Validation Score (0-100)
  2. Priority Tier (1-4)
  3. GO/NO-GO Recommendation
  4. Recommended Experiments
  5. Tool Compounds for Testing
  6. Biomarker Strategy
  7. Key Risks & Mitigations

Score Calculation

python
def calculate_validation_score(phase_results):
    """
    Calculate Target Validation Score (0-100).

    Components:
    - Disease Association: 0-30
    - Druggability: 0-25
    - Safety: 0-20
    - Clinical Precedent: 0-15
    - Validation Evidence: 0-10
    """
    score = {
        'disease_genetic': 0,      # 0-10
        'disease_literature': 0,   # 0-10
        'disease_pathway': 0,      # 0-10
        'drug_structural': 0,      # 0-10
        'drug_chemical': 0,        # 0-10
        'drug_class': 0,           # 0-5
        'safety_expression': 0,    # 0-5
        'safety_genetic': 0,       # 0-10
        'safety_adverse': 0,       # 0-5
        'clinical': 0,             # 0-15
        'validation_functional': 0, # 0-5
        'validation_models': 0,    # 0-5
    }

    # ... scoring logic from each phase ...

    total = sum(score.values())

    if total >= 80:
        tier = "Tier 1"
        recommendation = "GO - Highly validated target"
    elif total >= 60:
        tier = "Tier 2"
        recommendation = "CONDITIONAL GO - Needs focused validation"
    elif total >= 40:
        tier = "Tier 3"
        recommendation = "CAUTION - Significant validation needed"
    else:
        tier = "Tier 4"
        recommendation = "NO-GO - Consider alternatives"

    return total, tier, recommendation, score
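
The elided scoring logic has to keep each component within its stated range for the total to stay within 0-100. One way to guard that is sketched below, under the assumption that each phase hands back a raw number per component. `COMPONENT_MAX`, `clamp_components`, and `assign_tier` are hypothetical names; the caps and tier cut-offs are those given in the function above.

```python
# Per-component caps, copied from the comments in calculate_validation_score.
COMPONENT_MAX = {
    "disease_genetic": 10,
    "disease_literature": 10,
    "disease_pathway": 10,
    "drug_structural": 10,
    "drug_chemical": 10,
    "drug_class": 5,
    "safety_expression": 5,
    "safety_genetic": 10,
    "safety_adverse": 5,
    "clinical": 15,
    "validation_functional": 5,
    "validation_models": 5,
}


def clamp_components(raw):
    """Force every component into [0, cap]; missing components score 0."""
    return {
        name: max(0, min(raw.get(name, 0), cap))
        for name, cap in COMPONENT_MAX.items()
    }


def assign_tier(total):
    """Apply the 80/60/40 cut-offs to a 0-100 total."""
    if total >= 80:
        return "Tier 1"
    if total >= 60:
        return "Tier 2"
    if total >= 40:
        return "Tier 3"
    return "Tier 4"
```

Treating a missing component as 0 is a conservative choice made for this sketch; it matches the principle that absent data should be documented, not silently skipped.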

Report Template

File: [TARGET]_[DISEASE]_validation_report.md

markdown

Drug Target Validation Report: [TARGET]

Target: [Gene Symbol] ([Full Name])
Disease Context: [Disease Name] (if provided)
Modality: [Small molecule / Antibody / etc.] (if specified)
Generated: [Date]
Status: In Progress

Executive Summary

Target Validation Score: [XX/100]
Priority Tier: [Tier X] - [Description]
Recommendation: [GO / CONDITIONAL GO / CAUTION / NO-GO]

Key Findings:
  • [1-sentence disease association strength with evidence grade]
  • [1-sentence druggability assessment]
  • [1-sentence safety profile]
  • [1-sentence clinical precedent]

Critical Risks:
  • [Top risk 1]
  • [Top risk 2]

Validation Scorecard

| Dimension | Score | Max | Assessment | Key Evidence |
|---|---|---|---|---|
| Disease Association | | 30 | | |
| - Genetic evidence | | 10 | | |
| - Literature evidence | | 10 | | |
| - Pathway evidence | | 10 | | |
| Druggability | | 25 | | |
| - Structural tractability | | 10 | | |
| - Chemical matter | | 10 | | |
| - Target class | | 5 | | |
| Safety Profile | | 20 | | |
| - Expression selectivity | | 5 | | |
| - Genetic validation | | 10 | | |
| - Known ADRs | | 5 | | |
| Clinical Precedent | | 15 | | |
| Validation Evidence | | 10 | | |
| - Functional studies | | 5 | | |
| - Disease models | | 5 | | |
| TOTAL | XX | 100 | [Tier] | |

1. Target Identity

[Researching...]

2. Disease Association Evidence

2.1 OpenTargets Disease Associations

[Researching...]

2.2 GWAS Genetic Evidence

[Researching...]

2.3 Constraint Scores (gnomAD)

[Researching...]

2.4 Literature Evidence

[Researching...]

3. Druggability Assessment

3.1 Tractability (OpenTargets)

[Researching...]

3.2 Target Classification

[Researching...]

3.3 Structural Tractability

[Researching...]

3.4 Chemical Probes & Enabling Packages

[Researching...]

4. Known Modulators & Chemical Matter

4.1 Approved/Clinical Drugs

[Researching...]

4.2 ChEMBL Bioactivity

[Researching...]

4.3 BindingDB Ligands

[Researching...]

4.4 PubChem Bioassays

[Researching...]

4.5 Chemical Probes

[Researching...]

5. Clinical Precedent

5.1 FDA-Approved Drugs

[Researching...]

5.2 Clinical Trial Landscape

[Researching...]

5.3 Failed Programs & Lessons

[Researching...]

6. Safety & Toxicity Profile

6.1 OpenTargets Safety Liabilities

[Researching...]

6.2 Expression in Critical Tissues

[Researching...]

6.3 Knockout Phenotypes

[Researching...]

6.4 Known Adverse Events

[Researching...]

6.5 Paralog & Off-Target Risks

[Researching...]

7. Pathway Context & Network Analysis

7.1 Biological Pathways

[Researching...]

7.2 Protein-Protein Interactions

[Researching...]

7.3 Functional Enrichment

[Researching...]

7.4 Pathway Redundancy Assessment

[Researching...]

8. Validation Evidence

8.1 Target Essentiality (DepMap)

[Researching...]

8.2 Functional Studies

[Researching...]

8.3 Animal Models

[Researching...]

8.4 Biomarker Potential

[Researching...]

9. Structural Insights

9.1 Experimental Structures (PDB)

[Researching...]

9.2 AlphaFold Prediction

[Researching...]

9.3 Binding Pocket Analysis

[Researching...]

9.4 Domain Architecture

[Researching...]

10. Literature Landscape

10.1 Publication Metrics

[Researching...]

10.2 Key Publications

[Researching...]

10.3 Research Trend

[Researching...]

11. Validation Roadmap

11.1 Recommended Validation Experiments

[Researching...]

11.2 Tool Compounds for Testing

[Researching...]

11.3 Biomarker Strategy

[Researching...]

11.4 Clinical Biomarker Candidates

[Researching...]

11.5 Disease Models to Test

[Researching...]

12. Risk Assessment

12.1 Key Risks

[Researching...]

12.2 Mitigation Strategies

[Researching...]

12.3 Competitive Landscape

[Researching...]

13. Completeness Checklist

[To be populated post-audit...]

14. Data Sources & Methodology

[Will be populated as research progresses...]

---

Completeness Checklist (MANDATORY)

Before finalizing, verify:

markdown

13. Completeness Checklist

Phase Coverage

  • Phase 0: Target disambiguation (all IDs resolved)
  • Phase 1: Disease association (OT + GWAS + gnomAD + literature)
  • Phase 2: Druggability (tractability + class + structure + probes)
  • Phase 3: Chemical matter (ChEMBL + BindingDB + PubChem + drugs)
  • Phase 4: Clinical precedent (FDA + trials + failures)
  • Phase 5: Safety (OT safety + expression + KO + ADRs + paralogs)
  • Phase 6: Pathway context (Reactome + STRING + GO)
  • Phase 7: Validation evidence (DepMap + literature + models)
  • Phase 8: Structural insights (PDB + AlphaFold + pockets + domains)
  • Phase 9: Literature (collision-aware + metrics + key papers)
  • Phase 10: Validation roadmap (score + recommendations)

Data Quality

  • All scores justified with specific data
  • Evidence grades (T1-T4) assigned to key claims
  • Negative results documented (not left blank)
  • Failed tools with fallbacks documented
  • Source citations for all data points

Scoring

  • All 12 score components calculated
  • Total score summed correctly
  • Priority tier assigned
  • GO/NO-GO recommendation justified
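
Part of this audit can be automated by scanning the report for template placeholders that were never replaced. A minimal sketch, assuming the report file uses `#`-style markdown headings (the heading syntax is not visible in this rendering); `PLACEHOLDERS` lists the three placeholder strings from the report template, and `unfinished_sections` is a hypothetical helper name.

```python
PLACEHOLDERS = (
    "[Researching...]",
    "[To be populated post-audit...]",
    "[Will be populated as research progresses...]",
)


def unfinished_sections(report_text):
    """Return headings whose section still contains a template placeholder."""
    unfinished = []
    heading = None
    for line in report_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Remember the most recent heading, without its leading '#' marks.
            heading = stripped.lstrip("#").strip()
        elif stripped in PLACEHOLDERS and heading not in unfinished:
            unfinished.append(heading)
    return unfinished
```

An empty return value only shows that no placeholder remains; it does not confirm that each section is justified with data, which still needs a manual pass against the checklist.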

---

Fallback Chains

| Primary Tool | Fallback 1 | Fallback 2 | If All Fail |
|---|---|---|---|
| `OpenTargets_get_diseases_phenotypes_*` | `CTD_get_gene_diseases` | PubMed search | Note in report |
| `GTEx_get_median_gene_expression` (versioned) | GTEx (unversioned) | `HPA_search_genes_by_query` | Document gap |
| `ChEMBL_get_target_activities` | `BindingDB_get_ligands_by_uniprot` | `DGIdb_get_gene_info` | Note in report |
| `gnomad_get_gene_constraints` | `OpenTargets_get_target_constraint_info_*` | - | Note as unavailable |
| `Reactome_map_uniprot_to_pathways` | `OpenTargets_get_target_gene_ontology_*` | - | Use GO only |
| `STRING_get_protein_interactions` | `intact_get_interactions` | OpenTargets interactions | Note in report |
| `ProteinsPlus_predict_binding_sites` | `alphafold_get_prediction` | Literature pockets | Note as limited |
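
Each row of the table is an ordered list of attempts that ends in a documented gap. A generic runner for such a chain might look like the sketch below. It assumes every attempt can be wrapped as a zero-argument callable and that an empty or falsy result counts as a failure; `run_with_fallbacks` is a hypothetical helper, not part of ToolUniverse.

```python
def run_with_fallbacks(chain, is_empty=lambda result: not result):
    """Try each (name, callable) in order; return the first usable result.

    Returns (tool_name, result, failures). tool_name and result are None when
    every tool failed, in which case `failures` is what goes into the report.
    """
    failures = []
    for name, call in chain:
        try:
            result = call()
        except Exception as exc:  # tool errors are recorded, not raised
            failures.append((name, str(exc)))
            continue
        if is_empty(result):
            failures.append((name, "empty result"))
            continue
        return name, result, failures
    return None, None, failures
```

Returning the failure list alongside the result supports the checklist item that failed tools and their fallbacks be documented even when a later tool succeeds.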

Modality-Specific Considerations

Small Molecule Focus

  • Emphasize: binding pockets, ChEMBL compounds, Lipinski compliance
  • Key tractability: OpenTargets SM tractability bucket
  • Structure: co-crystal structures with small molecule ligands
  • Chemical matter: IC50/Ki/Kd data from ChEMBL/BindingDB

Antibody Focus

  • Emphasize: extracellular domains, cell surface expression, glycosylation
  • Key tractability: OpenTargets AB tractability bucket
  • Structure: ectodomain structures, epitope mapping
  • Expression: surface expression in disease vs normal tissue

PROTAC Focus

  • Emphasize: intracellular targets, surface lysines, E3 ligase proximity
  • Key tractability: OpenTargets PROTAC tractability
  • Structure: full-length structures for linker design
  • Chemical matter: known binders + E3 ligase binders

Quick Reference: Verified Tool Parameters

| Tool | Parameters | Notes |
|---|---|---|
| `ensembl_lookup_gene` | `gene_id`, `species` | `species="homo_sapiens"` REQUIRED; response wrapped in `{status, data, url, content_type}` |
| `OpenTargets_get_*_by_ensemblID` | `ensemblId` | camelCase, NOT `ensemblID` |
| `OpenTargets_get_publications_by_target_ensemblID` | `entityId` | NOT `ensemblId` |
| `OpenTargets_get_associated_drugs_by_target_ensemblID` | `ensemblId`, `size` | `size` is REQUIRED |
| `OpenTargets_target_disease_evidence` | `efoId`, `ensemblId` | Both REQUIRED |
| `GTEx_get_median_gene_expression` | `operation`, `gencode_id` | `operation="median"` REQUIRED |
| `HPA_get_rna_expression_by_source` | `gene_name`, `source_type`, `source_name` | ALL 3 required |
| `PubMed_search_articles` | `query`, `limit` | Returns plain list, NOT `{articles:[]}` |
| `UniProt_get_function_by_accession` | `accession` | Returns list of strings |
| `alphafold_get_prediction` | `qualifier` | NOT `uniprot_accession` |
| `drugbank_get_safety_*` | `query`, `case_sensitive`, `exact_match`, `limit` | ALL required |
| `STRING_get_protein_interactions` | `protein_ids`, `species` | `protein_ids` is array; `species=9606` |
| `Reactome_map_uniprot_to_pathways` | `id` | NOT `uniprot_id` |
| `ChEMBL_get_target_activities` | `target_chembl_id__exact` | Note double underscore |
| `search_clinical_trials` | `query_term` | REQUIRED parameter |
| `gnomad_get_gene_constraints` | `gene_symbol` | NOT `gene_id` |
| `DepMap_get_gene_dependencies` | `gene_symbol` | NOT `gene_id` |
| `BindingDB_get_ligands_by_uniprot` | `uniprot`, `affinity_cutoff` | Affinity in nM |
| `Pharos_get_target` | `gene` or `uniprot` | Both optional, but one is needed |
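
The notes above describe two response shapes: some tools wrap the payload in a `{status, data, url, content_type}` envelope, while others (for example `PubMed_search_articles`) return a plain list. A small normalizer lets downstream code treat both the same way. This is a sketch under that stated assumption, and `unwrap` is a hypothetical helper name.

```python
def unwrap(response):
    """Return the payload whether or not the tool wrapped it in an envelope."""
    if isinstance(response, dict) and "data" in response:
        return response["data"]
    return response
```

For instance, `unwrap(tu.tools.ensembl_lookup_gene(gene_id=..., species="homo_sapiens"))` would yield the inner `data` object, while a PubMed result list passes through unchanged.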

Example Execution: EGFR for NSCLC

Phase 0 Result

  • Symbol: EGFR, Ensembl: ENSG00000146648, UniProt: P00533, ChEMBL: CHEMBL203

Expected Scores (EGFR for NSCLC)

  • Disease Association: ~28/30 (strong genetic + pathway + literature)
  • Druggability: ~24/25 (kinase, many structures, abundant compounds)
  • Safety: ~14/20 (widely expressed but manageable toxicity)
  • Clinical Precedent: 15/15 (multiple approved drugs)
  • Validation Evidence: ~9/10 (extensive functional data)
  • Total: ~90/100 = Tier 1

Example for Novel Target (e.g., understudied kinase)

  • Disease Association: ~8/30 (limited GWAS, few publications)
  • Druggability: ~15/25 (kinase family bonus, AlphaFold structure)
  • Safety: ~12/20 (limited data, unknown KO phenotype)
  • Clinical Precedent: 0/15 (no clinical development)
  • Validation Evidence: ~2/10 (minimal functional data)
  • Total: ~37/100 = Tier 4