tooluniverse-small-molecule-discovery

Small Molecule Discovery Skill

Systematic small molecule identification, characterization, and sourcing using PubChem, ChEMBL, BindingDB, ADMET-AI, SwissADME, eMolecules, and Enamine. Covers the full pipeline from compound name to structure, activity, ADMET properties, and commercial procurement.

Domain Reasoning

Drug-likeness is not a binary property. Lipinski's Rule of 5 was derived from orally administered, passively absorbed drugs and has many well-known exceptions: natural products, macrocycles, PROTACs, and many approved drugs violate one or more rules. The relevant question is not "does this pass Ro5?" but "does this compound's physicochemical profile match the requirements of the target, the intended route of administration, and the therapeutic context?" Focus on the specific requirements, not rigid rules.

LOOK UP, DON'T GUESS

Compound identity (CID, ChEMBL ID, SMILES): call `PubChem_get_CID_by_compound_name` and `ChEMBL_search_molecules`; do not assume IDs from memory.
ADMET properties: run `SwissADME_calculate_adme` or `ADMETAI_predict_*` on the actual SMILES; do not estimate logP, TPSA, or bioavailability.
Binding affinities against a target: query `ChEMBL_search_activities` or `BindingDB_get_ligands_by_uniprot`; never cite IC50 values from memory.
Commercial availability: check `eMolecules_search` or `Enamine_search_catalog`; do not assume availability.

KEY PRINCIPLES:

Resolve identity first - Always get CID and ChEMBL ID before research
SMILES required for property prediction - Extract canonical SMILES from PubChem early
English names in tools - Use IUPAC or common English names; avoid abbreviations in tool calls
BindingDB is often unavailable - Fall back to ChEMBL activities when BindingDB times out
eMolecules/Enamine return URLs - These tools generate search URLs, not direct data; note this to user

COMPUTE, DON'T DESCRIBE

When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
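
As a minimal sketch of this principle, the snippet below computes a per-target summary instead of describing one. The rows are made-up placeholders standing in for what `ChEMBL_search_activities` might return; the field names `target_name` and `pchembl_value` follow the return fields listed in this document, but the exact record shape is an assumption. It uses only the standard library; a pandas `groupby` would do the same job.

```python
from collections import defaultdict
from statistics import median

# Placeholder rows (not real data); real rows would come from ChEMBL_search_activities.
records = [
    {"target_name": "Target A", "pchembl_value": 7.2},
    {"target_name": "Target A", "pchembl_value": 8.0},
    {"target_name": "Target B", "pchembl_value": 5.5},
    {"target_name": "Target B", "pchembl_value": None},  # missing values are skipped
]


def summarize_by_target(rows):
    """Return {target: (n_measurements, median_pchembl)}, ignoring missing values."""
    grouped = defaultdict(list)
    for row in rows:
        value = row.get("pchembl_value")
        if value is not None:
            grouped[row["target_name"]].append(float(value))
    return {target: (len(vals), median(vals)) for target, vals in grouped.items()}


summary = summarize_by_target(records)
for target, (n, med) in sorted(summary.items()):
    print(f"{target}: n={n}, median pChEMBL={med:.2f}")
```

Reporting the printed numbers, rather than the plan to compute them, is the point of the rule.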

When to Use

"Find information about compound X"
"What is the drug-likeness of this SMILES?"
"Show binding affinities for EGFR inhibitors"
"Search for compounds similar to imatinib"
"Is this compound commercially available?"
"What are the ADMET properties of this molecule?"
"Find ChEMBL activities for target Y"
"Predict targets for this small molecule"

Key Tools

| Tool | Purpose | Key Params |
|------|---------|------------|
| `PubChem_get_CID_by_compound_name` | Name to CID lookup | `compound_name` |
| `PubChem_get_CID_by_SMILES` | SMILES to CID lookup | `smiles` |
| `PubChem_get_compound_properties_by_CID` | MW, formula, SMILES, InChIKey | `cid`, `properties` |
| `PubChem_search_compounds_by_similarity` | Find structurally similar compounds | `smiles`, `threshold` (0-100) |
| `PubChem_search_compounds_by_substructure` | Substructure search | `smiles` |
| `PubChem_get_compound_synonyms_by_CID` | All names/synonyms | `cid` |
| `ChEMBL_search_molecules` | Search ChEMBL by name or ID | `query` |
| `ChEMBL_get_molecule` | Full ChEMBL molecule record | `chembl_id` |
| `ChEMBL_search_similar_molecules` | Similarity search in ChEMBL | `query` (SMILES or ChEMBL ID) |
| `ChEMBL_search_activities` | Binding affinities and assay data | `molecule_chembl_id`, `target_chembl_id`, `pchembl_value__gte` |
| `ChEMBL_get_drug_mechanisms` | MOA for approved drugs | `drug_chembl_id` or `drug_name` |
| `ChEMBL_search_targets` | Find targets by name | `query`, `organism` |
| `ChEMBL_get_target_activities` | All ligands for a target | `target_chembl_id` |
| `SwissADME_calculate_adme` | Physicochemical + ADMET properties | `operation="calculate_adme"`, `smiles` |
| `SwissADME_check_druglikeness` | Lipinski, Veber, Egan rules | `operation="check_druglikeness"`, `smiles` |
| `ADMETAI_predict_physicochemical_properties` | MW, logP, TPSA, HBD/HBA | `smiles` (list) |
| `ADMETAI_predict_bioavailability` | Oral bioavailability prediction | `smiles` (list) |
| `ADMETAI_predict_BBB_penetrance` | Blood-brain barrier permeability | `smiles` (list) |
| `ADMETAI_predict_toxicity` | hERG, DILI, mutagenicity | `smiles` (list) |
| `ADMETAI_predict_CYP_interactions` | CYP450 inhibition/substrate | `smiles` (list) |
| `SwissTargetPrediction_predict` | Predict protein targets for compound | `operation="predict"`, `smiles` |
| `eMolecules_search` | Find commercially available compounds | `query` (name or keyword) |
| `eMolecules_search_smiles` | Structure-based commercial search | `smiles` |
| `eMolecules_get_vendors` | Find vendors for a specific compound | `compound_id` |
| `Enamine_search_catalog` | Search Enamine screening library | `query` |
| `Enamine_search_smiles` | Search Enamine by structure | `smiles` |
| `Enamine_get_libraries` | List Enamine compound libraries | (none required) |

Workflow

Phase 1: Compound Identification

Step 1: Name -> CID (PubChem canonical identity)

PubChem_get_CID_by_compound_name(compound_name="imatinib")

-> CID: 5291

Step 2: Get SMILES and properties (needed for all downstream tools)

PubChem_get_compound_properties_by_CID(
    cid="5291",
    properties="MolecularFormula,MolecularWeight,CanonicalSMILES,InChIKey,IUPACName"
)

-> canonical SMILES, InChIKey (global identifier)

Step 3: Get ChEMBL ID (for activity data)

ChEMBL_search_molecules(query="imatinib")

-> ChEMBL ID (e.g., "CHEMBL941")

Step 4: Get all synonyms (brand names, INN, etc.)

PubChem_get_compound_synonyms_by_CID(cid="5291")


**ID resolution priority**:
1. Start with PubChem CID (most universal)
2. Get ChEMBL ID (for bioactivity data)
3. Use canonical SMILES for structure-based searches and ADMET
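
The resolution order can be sketched as a small record that is filled in step by step. The three lookup callables are hypothetical wrappers around `PubChem_get_CID_by_compound_name`, `ChEMBL_search_molecules`, and `PubChem_get_compound_properties_by_CID`; how the tools are actually invoked and what they return is not specified here, so stubs stand in for them.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CompoundIdentity:
    name: str
    cid: Optional[str] = None
    chembl_id: Optional[str] = None
    smiles: Optional[str] = None


def resolve_identity(
    name: str,
    get_cid: Callable[[str], Optional[str]],
    get_chembl_id: Callable[[str], Optional[str]],
    get_smiles: Callable[[str], Optional[str]],
) -> CompoundIdentity:
    """Resolve identifiers in priority order: CID, then ChEMBL ID, then SMILES."""
    identity = CompoundIdentity(name=name)
    identity.cid = get_cid(name)
    identity.chembl_id = get_chembl_id(name)
    # SMILES is looked up by CID, so it is only attempted once a CID exists.
    if identity.cid is not None:
        identity.smiles = get_smiles(identity.cid)
    return identity


# Stub lookups for illustration only; real values must come from the tools.
identity = resolve_identity(
    "imatinib",
    get_cid=lambda name: "5291",
    get_chembl_id=lambda name: "CHEMBL941",
    get_smiles=lambda cid: "CANONICAL_SMILES",
)
print(identity)
```

Keeping the lookups injectable makes it easy to swap in the ChEMBL-first fallback when a name is missing from PubChem.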

Phase 2: Structure-Based Search

Similarity search (find analogs):

PubChem_search_compounds_by_similarity(
    smiles="CANONICAL_SMILES",
    threshold=85   # Tanimoto threshold 0-100; 85 = highly similar
)

Returns: list of CIDs of similar compounds

ChEMBL_search_similar_molecules(query="CHEMBL941") # Or SMILES

Returns: ChEMBL entries sorted by similarity


**Substructure search** (find compounds containing a scaffold):

PubChem_search_compounds_by_substructure(smiles="SCAFFOLD_SMILES")


Returns: CIDs of compounds containing the scaffold
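
To make the 0-100 `threshold` concrete, here is a sketch of the Tanimoto coefficient on two toy fingerprints represented as sets of "on" bit positions. The bit sets are invented for illustration, and the fingerprint type and scoring details PubChem uses internally are not described in this document, so this shows the general idea only.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared 'on' bits divided by all 'on' bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def passes_threshold(bits_a, bits_b, threshold=85):
    """Compare on the 0-100 scale used by the `threshold` parameter."""
    return tanimoto(bits_a, bits_b) * 100 >= threshold


# Toy fingerprints: six shared bits, one unique bit each.
query = {1, 4, 7, 9, 12, 15, 20}
analog = {1, 4, 7, 9, 12, 15, 21}

print(tanimoto(query, analog))          # 6 shared / 8 total = 0.75
print(passes_threshold(query, analog))  # False at threshold 85
```

Even a single differing bit on each side drops a seven-bit toy fingerprint to 0.75, which is why a threshold of 85 returns only close analogs.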

Phase 3: Bioactivity and Binding Affinity

Get all activities for a compound (across all targets):

ChEMBL_search_activities(
    molecule_chembl_id="CHEMBL941",
    pchembl_value__gte=6,   # pIC50/Ki >= 6 = IC50/Ki <= 1 µM
    limit=50
)

Returns: assay_type, target_name, pchembl_value, units


**Get all ligands for a target**:


First find target ChEMBL ID

ChEMBL_search_targets(query="EGFR", organism="Homo sapiens")

-> target_chembl_id, e.g., "CHEMBL203"

ChEMBL_get_target_activities(target_chembl_id="CHEMBL203")

Returns: all compounds with binding data against this target


**BindingDB** (when available — often times out):

BindingDB_get_ligands_by_uniprot(uniprot_id="P00533") # EGFR


Returns: Ki, IC50, Kd data with literature references

Note: BindingDB REST API is frequently unavailable; fall back to ChEMBL


**pChEMBL Value interpretation**:
| pChEMBL | IC50 / Ki | Affinity |
|---------|-----------|---------|
| >= 9 | <= 1 nM | Very potent |
| >= 7 | <= 100 nM | Potent |
| >= 6 | <= 1 µM | Moderate |
| >= 5 | <= 10 µM | Weak |
| < 5 | > 10 µM | Inactive |
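
The table follows from the definition of pChEMBL as the negative base-10 logarithm of the molar IC50/Ki/Kd, so a value in nM converts as 9 minus log10 of that value. The sketch below shows the conversion in both directions and maps a value onto the labels in the table; the function names are illustrative, not part of any tool.

```python
import math


def pchembl_from_nm(value_nm):
    """Convert a concentration in nM to pChEMBL (-log10 of the molar value)."""
    if value_nm <= 0:
        raise ValueError("concentration must be positive")
    return 9.0 - math.log10(value_nm)


def nm_from_pchembl(pchembl):
    """Convert a pChEMBL value back to a concentration in nM."""
    return 10 ** (9.0 - pchembl)


def affinity_label(pchembl):
    """Map a pChEMBL value onto the labels of the interpretation table."""
    if pchembl >= 9:
        return "Very potent"
    if pchembl >= 7:
        return "Potent"
    if pchembl >= 6:
        return "Moderate"
    if pchembl >= 5:
        return "Weak"
    return "Inactive"


value = pchembl_from_nm(50)  # 50 nM is roughly pChEMBL 7.3
print(round(value, 2), affinity_label(value))
```

The same arithmetic explains the filter `pchembl_value__gte=6`: it keeps measurements at 1 µM (1000 nM) or better.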


Phase 4: Drug-likeness and ADMET

SwissADME (comprehensive, requires SMILES string — not list):

SwissADME_calculate_adme(
    operation="calculate_adme",
    smiles="CANONICAL_SMILES"
)

Returns: physicochemical, lipophilicity, water solubility, pharmacokinetics, drug-likeness scores (Lipinski, Veber, Egan, Muegge), PAINS alerts

SwissADME_check_druglikeness(
    operation="check_druglikeness",
    smiles="CANONICAL_SMILES"
)

Returns: Lipinski/Veber/Egan pass/fail + lead-likeness


**ADMET-AI** (ML-based, requires SMILES as a list):

ADMETAI_predict_physicochemical_properties(smiles=["CANONICAL_SMILES"])
ADMETAI_predict_bioavailability(smiles=["CANONICAL_SMILES"])
ADMETAI_predict_BBB_penetrance(smiles=["CANONICAL_SMILES"])
ADMETAI_predict_toxicity(smiles=["CANONICAL_SMILES"])
ADMETAI_predict_CYP_interactions(smiles=["CANONICAL_SMILES"])

**Note**: ADMET-AI requires `pip install tooluniverse[ml]`. If unavailable, use SwissADME as fallback.

**Key drug-likeness rules**:
- **Lipinski Ro5**: MW <= 500, logP <= 5, HBD <= 5, HBA <= 10 (oral drugs)
- **Veber**: TPSA <= 140 Å², rotatable bonds <= 10 (oral bioavailability)
- **Lead-like**: MW <= 350, logP <= 3, HBD <= 3, HBA <= 6 (fragment/lead)
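
Because drug-likeness is not binary, it can help to report which criteria are violated per rule set rather than a single pass/fail. The sketch below applies the thresholds listed above to a dictionary of descriptors. The dictionary keys and the example values are assumptions for illustration; real values should come from SwissADME or ADMET-AI output.

```python
def rule_violations(props):
    """Return {rule_set: [violated criteria]} using the thresholds listed above."""
    rules = {
        "Lipinski Ro5": [("MW", 500), ("logP", 5), ("HBD", 5), ("HBA", 10)],
        "Veber": [("TPSA", 140), ("rotatable_bonds", 10)],
        "Lead-like": [("MW", 350), ("logP", 3), ("HBD", 3), ("HBA", 6)],
    }
    report = {}
    for rule_name, criteria in rules.items():
        # Each criterion is an upper bound; a value above it counts as a violation.
        report[rule_name] = [
            f"{key} {props[key]} > {limit}"
            for key, limit in criteria
            if props[key] > limit
        ]
    return report


# Placeholder descriptor values, not taken from any real compound.
example = {"MW": 420.5, "logP": 3.8, "HBD": 2, "HBA": 7,
           "TPSA": 95.0, "rotatable_bonds": 6}

report = rule_violations(example)
for rule_name, violations in report.items():
    print(rule_name, "->", violations or "no violations")
```

A compound like this placeholder satisfies Ro5 and Veber yet exceeds three lead-like limits, which is informative for a lead-optimization context and irrelevant for an oral-drug check.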


Phase 5: Target Prediction

When you have a novel compound and want to predict targets:

SwissTargetPrediction_predict(
    operation="predict",
    smiles="CANONICAL_SMILES"
)

Returns: predicted protein targets with probability scores

Note: SwissTargetPrediction uses structural similarity to known drug-target pairs and may time out for complex molecules.

Phase 6: Commercial Availability

eMolecules (aggregates 200+ suppliers — returns search URL, not direct data):

eMolecules_search(query="compound_name")

-> Returns search_url to visit on eMolecules.com

eMolecules_search_smiles(smiles="CANONICAL_SMILES")

-> Returns URL for exact/similar structure search


**Enamine** (37B+ make-on-demand compounds — returns URL when API unavailable):

Enamine_search_catalog(query="compound_name")


-> If API available: returns catalog entries with catalog_id, price

-> If API unavailable: returns search_url for manual search

Enamine_search_smiles(smiles="CANONICAL_SMILES")

-> Exact or similarity structure search

Enamine_get_libraries()

-> Lists available Enamine screening collections


**Note**: eMolecules and Enamine APIs frequently return search URLs rather than live data. Present these to the user as "search here" links.
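
One way to handle both outcomes is a small formatter that distinguishes live catalog entries from a search link. The `search_url` key is named in this document; the `results` key and the overall result shape are assumptions, and the URL below is a placeholder.

```python
def describe_availability(result):
    """Turn a vendor-tool result into a single user-facing line."""
    entries = result.get("results") or []  # hypothetical key for live catalog data
    if entries:
        return f"{len(entries)} catalog entries returned"
    url = result.get("search_url")
    if url:
        return f"No live data returned; search here: {url}"
    return "No availability information returned"


# Placeholder results for illustration.
url_only = {"search_url": "https://example.com/search?q=compound_name"}
with_data = {"results": [{"catalog_id": "PLACEHOLDER-1", "price": None}]}

print(describe_availability(url_only))
print(describe_availability(with_data))
```

Stating explicitly that no live data came back avoids implying that a compound is purchasable when only a link was generated.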

---


Tool Parameter Reference

| Tool | Required Params | Notes |
|------|-----------------|-------|
| `PubChem_get_CID_by_compound_name` | `compound_name` | Returns list of CIDs; take first or most relevant |
| `PubChem_get_CID_by_SMILES` | `smiles` | Use canonical SMILES |
| `PubChem_get_compound_properties_by_CID` | `cid`, `properties` | `cid` as string; `properties` comma-separated |
| `PubChem_search_compounds_by_similarity` | `smiles` | `threshold` (int 0-100, default 90) |
| `PubChem_search_compounds_by_substructure` | `smiles` | Returns CIDs matching scaffold |
| `ChEMBL_search_molecules` | `query` | Name, ChEMBL ID, or InChIKey |
| `ChEMBL_get_molecule` | `chembl_id` | Full format: "CHEMBL941" not "941" |
| `ChEMBL_search_similar_molecules` | `query` | SMILES or ChEMBL ID |
| `ChEMBL_search_activities` | `molecule_chembl_id` OR `target_chembl_id` | Use `pchembl_value__gte=6` to keep activities at 1 µM or better (use 7 for potent compounds only) |
| `ChEMBL_get_drug_mechanisms` | `drug_chembl_id` or `drug_name` | For approved drugs only |
| `ChEMBL_search_targets` | `query` | Add `organism="Homo sapiens"` to filter human |
| `ChEMBL_get_target_activities` | `target_chembl_id` | Returns all ligands for target |
| `SwissADME_calculate_adme` | `operation="calculate_adme"`, `smiles` | SMILES as string (not list) |
| `SwissADME_check_druglikeness` | `operation="check_druglikeness"`, `smiles` | SMILES as string |
| `ADMETAI_predict_*` | `smiles` | Must be a list: `["SMILES"]` not `"SMILES"` |
| `SwissTargetPrediction_predict` | `operation="predict"`, `smiles` | May time out |
| `eMolecules_search` | `query` | Returns search URL (no live data) |
| `eMolecules_search_smiles` | `smiles` | Canonical SMILES |
| `eMolecules_get_vendors` | `compound_id` | eMolecules internal ID |
| `Enamine_search_catalog` | `query` | Returns URL when API unavailable |
| `Enamine_search_smiles` | `smiles` | `search_type`: "exact", "similarity", "substructure" |
| `Enamine_get_compound` | `enamine_id` | Enamine-specific catalog ID |
| `BindingDB_get_ligands_by_uniprot` | `uniprot_id` | Frequently unavailable; use ChEMBL as fallback |
| `BindingDB_get_targets_by_compound` | `smiles` | SMILES-based target lookup |

Common Patterns

Pattern 1: Full Compound Profile

Input: Compound name (e.g., "imatinib")
Flow:
  1. PubChem_get_CID_by_compound_name -> CID
  2. ChEMBL_search_molecules -> ChEMBL ID
  3. PubChem_get_compound_properties_by_CID -> canonical SMILES + physicochemical props
  4. SwissADME_calculate_adme / ADMETAI_predict_* -> ADMET profile
  5. ChEMBL_search_activities(molecule_chembl_id) -> binding data
  6. ChEMBL_get_drug_mechanisms -> MOA (if approved drug)
Output: Complete compound profile with identity, ADMET, and activity data

Pattern 2: Analog Discovery

Input: Reference compound SMILES
Flow:
  1. PubChem_search_compounds_by_similarity(smiles, threshold=85) -> similar CIDs
  2. ChEMBL_search_similar_molecules(query=smiles) -> ChEMBL analogs
  3. For each hit: PubChem_get_compound_properties_by_CID -> properties
  4. SwissADME_check_druglikeness -> filter by drug-likeness
Output: Ranked list of analogs with properties and drug-likeness results; run ChEMBL_search_activities on each analog if activity data is needed

Pattern 3: Target-Based Compound Search

Input: Target name (e.g., "EGFR")
Flow:
  1. ChEMBL_search_targets(query="EGFR", organism="Homo sapiens") -> target_chembl_id
  2. ChEMBL_get_target_activities(target_chembl_id) -> all ligands with Ki/IC50
  3. Filter by pchembl_value >= 7 (potent compounds)
  4. For top hits: SwissADME_check_druglikeness -> assess drug-likeness
  5. eMolecules_search(query=compound_name) -> check commercial availability
Output: Prioritized list of potent, drug-like, commercially available compounds
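
Step 3 of this flow can be sketched as a filter that keeps the best pChEMBL value per molecule, drops anything below the cutoff, and ranks the rest. The rows and IDs are placeholders, and the record shape (`molecule_chembl_id`, `pchembl_value`) is an assumption about what `ChEMBL_get_target_activities` returns; values are coerced with `float()` in case they arrive as strings.

```python
def top_potent_ligands(activities, min_pchembl=7.0, limit=10):
    """Best pChEMBL per molecule at or above the cutoff, ranked descending."""
    best = {}
    for row in activities:
        value = row.get("pchembl_value")
        if value is None:
            continue  # measurements without a pChEMBL value cannot be ranked
        value = float(value)
        mol = row["molecule_chembl_id"]
        if value >= min_pchembl and value > best.get(mol, float("-inf")):
            best[mol] = value
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Placeholder rows with made-up identifiers.
rows = [
    {"molecule_chembl_id": "CHEMBL_A", "pchembl_value": 7.5},
    {"molecule_chembl_id": "CHEMBL_A", "pchembl_value": "8.1"},
    {"molecule_chembl_id": "CHEMBL_B", "pchembl_value": 6.9},
    {"molecule_chembl_id": "CHEMBL_C", "pchembl_value": 9.0},
    {"molecule_chembl_id": "CHEMBL_D", "pchembl_value": None},
]

hits = top_potent_ligands(rows)
print(hits)  # [('CHEMBL_C', 9.0), ('CHEMBL_A', 8.1)]
```

Deduplicating by molecule before the drug-likeness and vendor checks keeps the number of downstream tool calls small.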

Pattern 4: ADMET Risk Assessment

Input: Novel compound SMILES
Flow:
  1. SwissADME_calculate_adme(operation="calculate_adme", smiles) -> full ADMET
  2. ADMETAI_predict_toxicity(smiles=[smiles]) -> hERG, DILI, mutagenicity
  3. ADMETAI_predict_CYP_interactions(smiles=[smiles]) -> drug-drug interaction risk
  4. ADMETAI_predict_BBB_penetrance(smiles=[smiles]) -> CNS penetration
Output: ADMET risk profile with flagged liabilities
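
The final "flagged liabilities" step can be sketched as a threshold over predicted probabilities. Everything specific here is an assumption: the endpoint names, the idea that each prediction is a probability between 0 and 1, and the 0.5 cutoff, which is arbitrary and should be tuned per endpoint. The numbers are placeholders, not predictions for any real compound.

```python
def flag_liabilities(predictions, cutoff=0.5):
    """Return (endpoint, probability) pairs at or above the cutoff, highest first."""
    flagged = [(name, prob) for name, prob in predictions.items() if prob >= cutoff]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Placeholder probabilities keyed by hypothetical endpoint names.
predictions = {
    "hERG": 0.72,
    "DILI": 0.31,
    "mutagenicity": 0.08,
    "CYP3A4_inhibition": 0.55,
}

flags = flag_liabilities(predictions)
for name, prob in flags:
    print(f"FLAG {name}: {prob:.2f}")
```

Reporting the probability next to each flag lets the reader judge borderline calls instead of trusting a hard label.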

Fallback Chains

| Primary | Fallback | When |
|---------|----------|------|
| `BindingDB_get_ligands_by_uniprot` | `ChEMBL_get_target_activities` | BindingDB API unavailable |
| `ADMETAI_predict_*` | `SwissADME_calculate_adme` | ml dependencies not installed |
| `Enamine_search_catalog` | Search URL returned by the tool, for manual search | API returns HTTP 500 (common) |
| `SwissTargetPrediction_predict` | `ChEMBL_search_similar_molecules` + known targets | Prediction times out |
| `PubChem_get_CID_by_compound_name` | `ChEMBL_search_molecules(query=name)` | Name not in PubChem |
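
A fallback chain like the ones in this table can be sketched as a helper that tries each source in order and records which one answered. The two stub functions simulate a BindingDB timeout and a ChEMBL response; they are placeholders, since the way the tools signal failure (exception, error field, empty result) is not specified in this document.

```python
def call_with_fallback(steps):
    """Try (label, callable) pairs in order; return (label, result) from the first success."""
    errors = []
    for label, func in steps:
        try:
            return label, func()
        except Exception as exc:  # broad on purpose: timeouts, HTTP errors, missing deps
            errors.append(f"{label}: {exc}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))


def bindingdb_stub():
    raise TimeoutError("simulated BindingDB timeout")


def chembl_stub():
    return [{"molecule_chembl_id": "CHEMBL_PLACEHOLDER", "pchembl_value": 7.0}]


source, data = call_with_fallback([
    ("BindingDB", bindingdb_stub),
    ("ChEMBL", chembl_stub),
])
print(source)  # ChEMBL
```

Returning the label alongside the data makes it easy to tell the user which database the binding values came from.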

Limitations

BindingDB: REST API frequently times out; ChEMBL is the reliable alternative for binding data
Enamine API: Returns HTTP 500 often; tool provides search URL as fallback
eMolecules: No public API; tool generates search URLs only
ADMET-AI: Requires `pip install tooluniverse[ml]`; not always available in base install
SwissTargetPrediction: Web scraping-based; may time out for complex molecules
SMILES format: ADMET-AI requires a list `["SMILES"]`; SwissADME requires a string `"SMILES"`
ChEMBL IDs: Always use full format `"CHEMBL941"`, never just `"941"`
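
The last two format pitfalls are easy to guard against with small normalizers applied before each tool call. These helpers are illustrative and not part of ToolUniverse; their names and error behavior are assumptions.

```python
def as_smiles_list(smiles):
    """Shape expected by ADMETAI_predict_*: a list of SMILES strings."""
    return [smiles] if isinstance(smiles, str) else list(smiles)


def as_smiles_string(smiles):
    """Shape expected by SwissADME tools: a single SMILES string."""
    if isinstance(smiles, str):
        return smiles
    items = list(smiles)
    if len(items) != 1:
        raise ValueError("SwissADME tools take exactly one SMILES string")
    return items[0]


def normalize_chembl_id(value):
    """Return the full 'CHEMBL<digits>' form, accepting a bare number as input."""
    text = str(value).strip().upper()
    if text.isdigit():
        return "CHEMBL" + text
    if text.startswith("CHEMBL") and text[6:].isdigit():
        return text
    raise ValueError(f"not a ChEMBL ID: {value!r}")


print(as_smiles_list("CANONICAL_SMILES"))      # ['CANONICAL_SMILES']
print(as_smiles_string(["CANONICAL_SMILES"]))  # CANONICAL_SMILES
print(normalize_chembl_id("941"))              # CHEMBL941
```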
