tooluniverse-chemical-sourcing


Chemical Compound Sourcing & Procurement

Pipeline for identifying, sourcing, and purchasing chemical compounds from commercial vendors. Resolves compound identity through PubChem/ChEMBL, searches multiple vendor databases (ZINC, Enamine, eMolecules, Mcule), compares pricing and availability, and identifies purchasable analogs when exact compounds are unavailable.
Guiding principles:
  1. Identity first -- confirm the compound's structure (SMILES, InChI) before searching vendors; names can be ambiguous
  2. Multi-vendor comparison -- always check multiple sources; pricing and stock vary significantly
  3. Analog fallback -- if the exact compound is unavailable, search for close analogs
  4. Purity and quantity awareness -- note catalog purity grades and minimum order quantities
  5. Structure over name -- vendor searches by SMILES/InChI are more reliable than name searches
  6. English-first queries -- use English compound names in tool calls

LOOK UP, DON'T GUESS

When uncertain about any scientific fact, SEARCH databases first rather than reasoning from memory. A database-verified answer is always more reliable than a guess.

COMPUTE, DON'T DESCRIBE

When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.

When to Use

Typical triggers:
  • "Where can I buy [compound]?"
  • "Find commercial sources for [SMILES]"
  • "Compare prices for [compound] across vendors"
  • "Is [compound] commercially available?"
  • "Find purchasable analogs of [compound]"
  • "I need [quantity] of [compound] -- who sells it?"
  • "Search ZINC/Enamine for [compound]"
Not this skill: For ADMET/toxicity assessment, use tooluniverse-admet-prediction. For drug-target interaction analysis, use tooluniverse-drug-target-validation.

Core Databases

Database | Scope | Best For
ZINC | 230M+ purchasable compounds; aggregates vendors | Broadest coverage; substructure/similarity search; free
Enamine | ~4M in-stock, 30B+ REAL (make-on-demand) | Large in-stock library; fast delivery; building blocks
eMolecules | Multi-vendor aggregator; 8M+ compounds | Cross-vendor comparison; pricing transparency
Mcule | 40M+ compounds; one-stop purchasing | Integrated ordering; quote generation
PubChem | 110M+ compounds; identity resolution | Authoritative compound identification; CID lookup
ChEMBL | 2.4M+ bioactive molecules | Bioactivity context for sourced compounds

Workflow Overview

Phase 0: Compound Identity Resolution
  Name/SMILES/CAS -> PubChem CID -> canonical SMILES
    |
Phase 1: Vendor Search
  Query ZINC, Enamine, eMolecules, Mcule
    |
Phase 2: Price & Availability Comparison
  Catalog numbers, pricing, stock status, purity
    |
Phase 3: Analog Search (if needed)
  Similarity search for purchasable alternatives
    |
Phase 4: Bioactivity Context (optional)
  ChEMBL activity data for sourced compounds
    |
Phase 5: Order Summary
  Consolidated vendor comparison table

Phase Details

Phase 0: Compound Identity Resolution

Objective: Establish unambiguous compound identity before vendor searches.
Tools:
  • PubChem_get_CID_by_compound_name -- resolve name to CID
    • Input: name (compound name)
    • Output: {IdentifierList: {CID: [...]}}
  • PubChem_get_compound_properties_by_CID -- get SMILES, MW, formula
    • Input: cid (PubChem CID), properties (comma-separated list)
    • Output: {CID, MolecularWeight, ConnectivitySMILES, IUPACName}
  • ChEMBL_get_molecule -- get ChEMBL compound details
    • Input: molecule_chembl_id (ChEMBL ID) or search by name
    • Output: SMILES, molecular properties, synonyms
Workflow:
  1. If user provides a name: resolve to PubChem CID, then get SMILES
  2. If user provides SMILES: use directly (optionally verify via PubChem)
  3. If user provides CAS number: search PubChem by name (CAS numbers work as search terms)
  4. Record: canonical SMILES, molecular weight, molecular formula, IUPAC name
Important: PubChem ConnectivitySMILES (not CanonicalSMILES) is the correct property name. Always confirm the SMILES matches the intended compound before proceeding.
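Two small checks from this phase can be sketched in Python. The first validates a CAS Registry Number by its check digit before it is used as a search term; the second pulls the first CID out of the {IdentifierList: {CID: [...]}} shape listed above. The function names is_valid_cas and first_cid are illustrative and not part of ToolUniverse, and the response shape is assumed to match the tool listing.

```python
import re

# CAS numbers look like 50-78-2: 2-7 digits, 2 digits, 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(text: str) -> bool:
    """Return True if text is a well-formed CAS number with a correct check digit."""
    match = CAS_PATTERN.match(text.strip())
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    check = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == check


def first_cid(response: dict):
    """Return the first CID from a name-lookup response, or None if there is none.

    Assumes the {IdentifierList: {CID: [...]}} shape shown in the tool listing.
    """
    cids = response.get("IdentifierList", {}).get("CID", [])
    return cids[0] if cids else None
```

A failed check digit usually means a typo in the CAS number, which is worth catching before a name search silently returns nothing.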

Phase 1: Vendor Search

Objective: Search all available vendor databases for the target compound.
Tools:
  • ZINC_search_compounds -- search ZINC by name or SMILES
    • Input: query (name or SMILES), optional catalog, limit
    • Output: ZINC IDs, vendor info, purchasability status
  • ZINC_get_compound -- get detailed compound info from ZINC
    • Input: zinc_id (ZINC identifier)
    • Output: vendors, catalogs, pricing, SMILES
  • Enamine_search_catalog -- search Enamine catalog
    • Input: query (name or SMILES), optional catalog_type, limit
    • Output: catalog numbers, availability, pricing
  • Enamine_get_compound -- get Enamine compound details
    • Input: compound_id (Enamine catalog number)
    • Output: structure, pricing, stock status, delivery time
  • eMolecules_search -- search across multiple vendors
    • Input: query (name or SMILES), optional limit
    • Output: vendor list, catalog numbers, pricing
  • eMolecules_get_compound -- get eMolecules compound details
    • Input: compound_id (eMolecules ID)
    • Output: vendors, pricing tiers, purity
  • Mcule_get_compound -- search Mcule database
    • Input: query (name or SMILES), optional limit
    • Output: Mcule IDs, availability, pricing
  • Mcule_get_compound -- get Mcule compound details
    • Input: compound_id (Mcule ID)
    • Output: pricing, delivery, purity, catalog number
Workflow:
  1. Search all four vendor databases in parallel using SMILES (preferred) or name
  2. For each hit, retrieve detailed compound info (pricing, stock, purity)
  3. Deduplicate results by matching SMILES across vendors
  4. Flag any structural mismatches (vendor compound differs from target)
Tip: SMILES-based searches are more precise than name searches. If name search returns too many results, switch to SMILES.
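Workflow steps 3 and 4 can be sketched as a single pass over the combined hits. This is a minimal illustration and not the skill's implementation: the hit fields (smiles, vendor, catalog) are hypothetical, and the comparison is plain string equality, so it assumes every SMILES was already canonicalized by the same toolkit.

```python
def dedupe_and_flag(hits, target_smiles):
    """Split vendor hits into target matches and structural mismatches.

    hits: list of dicts with hypothetical keys "smiles", "vendor", "catalog".
    Listings that repeat the same (smiles, vendor, catalog) are dropped,
    since aggregators often report the same vendor item more than once.
    """
    seen = set()
    matches, mismatches = [], []
    for hit in hits:
        identity = (hit["smiles"], hit["vendor"], hit["catalog"])
        if identity in seen:
            continue  # same listing reported by more than one database
        seen.add(identity)
        if hit["smiles"] == target_smiles:
            matches.append(hit)
        else:
            mismatches.append(hit)  # vendor structure differs from the target
    return matches, mismatches


# Example with made-up vendors; the fourth hit is salicylic acid, not aspirin.
ASPIRIN = "CC(=O)OC1=CC=CC=C1C(=O)O"
example_hits = [
    {"smiles": ASPIRIN, "vendor": "VendorA", "catalog": "A-001"},
    {"smiles": ASPIRIN, "vendor": "VendorA", "catalog": "A-001"},
    {"smiles": ASPIRIN, "vendor": "VendorB", "catalog": "B-123"},
    {"smiles": "C1=CC=C(C(=C1)C(=O)O)O", "vendor": "VendorC", "catalog": "C-9"},
]
matches, mismatches = dedupe_and_flag(example_hits, ASPIRIN)
```

Mismatches are kept, not discarded, so they can be reported as flagged entries or reused as analog candidates.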

Phase 2: Price & Availability Comparison

Objective: Create a comparison table across vendors.
Compile from Phase 1 results:
Field | Description
Vendor | Company name
Catalog # | Vendor-specific identifier
Quantity | Available pack sizes
Price | Per unit or per mg
Purity | Stated purity grade (>95%, >98%, etc.)
Stock | In-stock vs make-on-demand
Delivery | Estimated delivery time
Rank vendors by: (1) in-stock availability, (2) price per mg, (3) purity grade, (4) delivery time.
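The ranking rule above maps naturally onto a tuple sort key. The sketch below assumes each offer has been normalized into a small record; the Offer class and its field names are hypothetical and simply mirror the comparison table.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    vendor: str
    pack_mg: float       # pack size in mg
    price: float         # price for the whole pack
    purity_pct: float    # stated purity, e.g. 98.0
    in_stock: bool
    delivery_days: int

    @property
    def price_per_mg(self) -> float:
        return self.price / self.pack_mg


def rank_offers(offers):
    """Sort by (1) in-stock first, (2) price per mg, (3) purity, (4) delivery time."""
    return sorted(
        offers,
        key=lambda o: (not o.in_stock, o.price_per_mg, -o.purity_pct, o.delivery_days),
    )


# Made-up offers: VendorC is cheapest per mg but make-on-demand, so it ranks last.
offers = [
    Offer("VendorA", pack_mg=100, price=50.0, purity_pct=95.0, in_stock=True, delivery_days=5),
    Offer("VendorB", pack_mg=50, price=20.0, purity_pct=98.0, in_stock=True, delivery_days=7),
    Offer("VendorC", pack_mg=100, price=10.0, purity_pct=95.0, in_stock=False, delivery_days=21),
]
ranked = rank_offers(offers)
```

Comparing on price per mg and not on pack price keeps different pack sizes from distorting the ranking.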

Phase 3: Analog Search

Objective: When the exact compound is unavailable, find purchasable structural analogs.
Triggered when:
  • No vendors carry the target compound
  • The compound is prohibitively expensive
  • The user explicitly requests analogs
Approach:
  1. Use ZINC or Enamine similarity search (if supported by the tool's search mode)
  2. Search by substructure using the compound's core scaffold SMILES
  3. Filter analogs by: Tanimoto similarity >= 0.7, commercial availability, reasonable price
  4. Present analogs with structural differences highlighted
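The similarity filter in step 3 can be illustrated with a pure-Python Tanimoto coefficient over fingerprint bit sets. This is a sketch under stated assumptions: fingerprints are represented as sets of "on" bit indices, and in practice they would come from a cheminformatics toolkit or from the vendor's own similarity search. The function names are illustrative.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    # Two empty fingerprints are treated as dissimilar here (a convention choice).
    return len(a & b) / union if union else 0.0


def filter_analogs(target_bits: set, candidates: dict, threshold: float = 0.7):
    """Return (name, similarity) pairs at or above threshold, most similar first."""
    scored = [(name, tanimoto(target_bits, bits)) for name, bits in candidates.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy fingerprints: X shares 9 of 10 bits, Z shares 8 of 11, Y shares 7 of 11.
target = set(range(1, 11))
candidates = {
    "X": set(range(1, 10)),
    "Y": set(range(1, 8)) | {11},
    "Z": set(range(1, 9)) | {11},
}
analogs = filter_analogs(target, candidates)
```

Availability and price filters would then be applied to the surviving candidates using the Phase 1 and Phase 2 data.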

Phase 4: Bioactivity Context (Optional)

Objective: Provide biological activity data for context when sourcing compounds for research.
Tools:
  • ChEMBL_get_molecule
    -- get bioactivity summary
    • Input: compound identifier
    • Output: known targets, activity values, assay data
Useful when:
  • User is sourcing compounds for a specific biological assay
  • Comparing analogs that might have different activity profiles
  • Verifying the compound has published bioactivity data

Phase 5: Decision & Order Summary

Vendor selection decision matrix — don't just list vendors, recommend one:
Scenario | Best Vendor Strategy | Why
Need it this week | In-stock vendor with fastest shipping | Make-on-demand takes 2-4 weeks minimum
Budget-constrained | Cheapest per mg, accept lower purity (>95%) | Academic budgets are tight; >95% is fine for screening
High-throughput screen | ZINC/Enamine for large libraries; mg quantities | Price per compound matters more than purity
Assay validation | Highest purity (>98%) from reputable vendor | False positives from impurities waste months
Building blocks for synthesis | Enamine (largest building block catalog) | Purpose-built for medicinal chemistry
Exact compound unavailable | Analog search → check bioactivity (ChEMBL) → source best analog | Tanimoto > 0.85 likely retains activity; 0.7-0.85 may have different SAR
Red flags when sourcing:
  • Vendor has no published purity data → request CoA before ordering
  • Price is 10x lower than other vendors → may be a different salt form or impure
  • "In stock" but delivery estimate is 4+ weeks → likely not actually in stock
  • SMILES in vendor catalog differs from target SMILES → wrong compound
Generate a final sourcing report:
  1. Compound Identity -- name, SMILES, MW, CAS (if known), PubChem CID
  2. Vendor Comparison Table -- all vendors with pricing, stock, purity, delivery time
  3. Recommended Source -- specific vendor with reasoning (not just cheapest)
  4. Analogs (if searched) -- alternative compounds with similarity scores and bioactivity comparison
  5. Notes -- special handling, storage conditions, salt form, stereochemistry considerations
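The red flags listed in this phase can be checked mechanically before the report is written. The sketch below is one possible reading and not a fixed rule set: the offer keys are hypothetical, "10x lower than other vendors" is interpreted as ten times below the median of the other vendors' per-mg prices, and "4+ weeks" as 28 days or more.

```python
from statistics import median


def red_flags(offer: dict, other_prices_per_mg: list, target_smiles: str) -> list:
    """Return human-readable warnings for a single vendor offer.

    offer uses hypothetical keys: price_per_mg, purity_pct, in_stock,
    delivery_days, smiles.
    """
    flags = []
    if offer.get("purity_pct") is None:
        flags.append("no purity data: request CoA before ordering")
    # Assumption: compare against the median of the other vendors' prices.
    if other_prices_per_mg and offer["price_per_mg"] * 10 <= median(other_prices_per_mg):
        flags.append("price 10x below other vendors: check salt form and purity")
    if offer.get("in_stock") and offer.get("delivery_days", 0) >= 28:
        flags.append("listed in stock but delivery is 4+ weeks")
    if offer.get("smiles") != target_smiles:
        flags.append("catalog SMILES differs from target")
    return flags


ASPIRIN = "CC(=O)OC1=CC=CC=C1C(=O)O"
suspicious = {"price_per_mg": 0.04, "purity_pct": None, "in_stock": True,
              "delivery_days": 30, "smiles": ASPIRIN}
clean = {"price_per_mg": 0.45, "purity_pct": 98.0, "in_stock": True,
         "delivery_days": 5, "smiles": ASPIRIN}
others = [0.5, 0.4, 0.6]
```

Flags returned this way can go straight into the Notes section of the sourcing report next to the affected vendor.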

Common Analysis Patterns

Pattern | Description | Key Phases
Quick Availability Check | Is this compound purchasable? | 0, 1
Full Vendor Comparison | Compare all sources with pricing | 0, 1, 2, 5
Analog Discovery | Compound unavailable; find alternatives | 0, 1, 3, 5
Building Block Sourcing | Find reagents for synthesis | 0, 1, 2
Hit-to-Lead Sourcing | Source screening hits with bioactivity context | 0, 1, 2, 4, 5

Edge Cases & Fallbacks

  • Name ambiguity: A single name can map to more than one structure, and one compound can go by several names (e.g., "aspirin" and "acetylsalicylic acid" are the same compound). Always resolve to SMILES first
  • Stereochemistry: Vendors may sell racemic mixtures vs specific enantiomers. Check SMILES stereochemistry carefully
  • Salt forms: The same drug may be sold as different salts (HCl, maleate, etc.). Note the specific form
  • No vendors found: Compound may be available through custom synthesis. Note this in the report
  • Make-on-demand: Enamine REAL compounds require synthesis (2-4 weeks). Distinguish from in-stock items

Interpretation Framework

Evidence Grade | Criteria | Action
A -- High confidence | In-stock at 2+ vendors, purity >=98%, CoA available | Order directly
B -- Moderate confidence | Single vendor or make-on-demand, purity >=95% | Request CoA, verify structure
C -- Low confidence | No stock, purity unstated, or price outlier (>5x median) | Custom synthesis or analog search
Interpreting vendor results:
  • A 10x price difference between vendors for the same compound usually indicates different salt forms, purity grades, or packaging sizes rather than genuine cost differences -- always compare on a per-mg, same-purity basis.
  • Purity of >=95% is sufficient for primary screening; >=98% is recommended for dose-response and SAR studies; >=99% is needed for reference standards and pharmacokinetic work.
  • "In-stock" status in aggregator databases can be stale by weeks -- confirm real-time availability with the vendor before committing to a timeline.
Synthesis questions to address in the final report:
  1. Do all vendor SMILES resolve to the same canonical structure (including stereochemistry and salt form)?
  2. Is the price-per-mg consistent with the compound's synthetic complexity, or does an outlier suggest a catalog error?
  3. For analogs: does the structural change fall outside the pharmacophore, preserving expected activity?
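The evidence grades can be turned into a small decision function. The table does not cover every combination (for example, two in-stock vendors at 96% purity), so this sketch fills the gaps with assumptions: anything that misses grade A but has stated purity of at least 95% is graded B, and purity below 95% is graded C. Parameter names are illustrative.

```python
from typing import Optional


def evidence_grade(
    in_stock_vendors: int,
    make_on_demand: bool,
    purity_pct: Optional[float],
    coa_available: bool,
    price_ratio_to_median: float,
) -> str:
    """Assign sourcing evidence grade A, B, or C.

    price_ratio_to_median is this offer's per-mg price divided by the
    median per-mg price across vendors.
    """
    no_source = in_stock_vendors == 0 and not make_on_demand
    # Grade C conditions: no stock, purity unstated, or price outlier.
    if no_source or purity_pct is None or price_ratio_to_median > 5:
        return "C"
    if in_stock_vendors >= 2 and purity_pct >= 98 and coa_available:
        return "A"
    if purity_pct >= 95:
        return "B"
    return "C"  # assumption: stated purity below 95% is low confidence
```

For instance, evidence_grade(2, False, 98.5, True, 1.0) falls in grade A, while a make-on-demand compound at 95% purity with no in-stock vendor falls in grade B.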

Limitations

  • Pricing accuracy: Database prices may be outdated; actual quotes from vendors are authoritative
  • Regional availability: Some vendors ship only to specific regions; check shipping policies
  • Quantity limits: Academic vs commercial pricing may differ; some vendors require institutional accounts
  • Controlled substances: Some compounds have regulatory restrictions; this skill does not check legal status
  • No direct ordering: This skill finds sources but does not place orders; users need vendor accounts