pubchem-database

PubChem Database

Prerequisites

  1. uv: Read the uv skill and follow its Setup instructions to ensure uv is installed and on PATH.
  2. User Notification: If LICENSE_NOTIFICATION.txt does not already exist in this skill directory then (1) prominently notify the user to check the terms at https://pubchem.ncbi.nlm.nih.gov/docs/citation-guidelines and https://pubchem.ncbi.nlm.nih.gov/docs/pug-rest, then (2) create the file recording the notification text and timestamp.
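The notification check in step 2 can be sketched in Python as follows. This is a minimal illustration, not part of the skill: the function name `ensure_license_notification`, the wording of `NOTICE`, and the layout of the recorded file are assumptions. Only the file name and the two URLs come from the text above.

```python
from datetime import datetime, timezone
from pathlib import Path

# Notification wording is an assumption; only the URLs come from the skill text.
NOTICE = (
    "Please review PubChem's terms before using this skill:\n"
    "https://pubchem.ncbi.nlm.nih.gov/docs/citation-guidelines\n"
    "https://pubchem.ncbi.nlm.nih.gov/docs/pug-rest\n"
)


def ensure_license_notification(skill_dir: Path) -> bool:
    """Notify once per skill directory.

    Returns True if the user was notified on this call, False if the
    marker file already existed and nothing was done.
    """
    marker = skill_dir / "LICENSE_NOTIFICATION.txt"
    if marker.exists():
        return False
    # Step (1): show the notice to the user.
    print(NOTICE)
    # Step (2): record the notification text and a UTC timestamp.
    timestamp = datetime.now(timezone.utc).isoformat()
    marker.write_text(f"{NOTICE}\nNotified at: {timestamp}\n", encoding="utf-8")
    return True
```

Calling the function a second time with the same directory returns `False` and leaves the existing file untouched.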

Quick Start

PubChem queries run through a Python wrapper script that respects PubChem's terms of service and handles the JSON parsing. The script is designed so that several agents can use the APIs safely.
Example: Resolve a chemical name to its Compound ID (CID)
bash
uv run scripts/pubchem_api.py resolve --name "aspirin" --output result.json
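When the command is launched from Python rather than typed into a shell, the same call can be assembled as an argument list. The sketch below uses only the subcommand and flags shown above; the helper names `build_resolve_command` and `run_resolve` are hypothetical.

```python
import subprocess
from typing import List

SCRIPT = "scripts/pubchem_api.py"  # path as written in the command above


def build_resolve_command(name: str, output: str = "result.json") -> List[str]:
    """Return the argv list for resolving a chemical name to a CID."""
    return ["uv", "run", SCRIPT, "resolve", "--name", name, "--output", output]


def run_resolve(name: str, output: str = "result.json") -> str:
    """Run the wrapper and return the path of the JSON file it was asked to write."""
    # check=True raises CalledProcessError if the script exits with a non-zero status.
    subprocess.run(build_resolve_command(name, output), check=True)
    return output
```

Passing a list to `subprocess.run` avoids shell quoting, so names containing spaces or punctuation reach the script unchanged.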

Core Rules

  • Use the Wrapper: ALWAYS query the database through the provided helper scripts rather than accessing it directly. The scripts automatically enforce the required rate limit.
  • Read the generated JSON output file, and process it with jq or code.
  • Verify Facts: ALWAYS verify information retrieved from memory with a database query if the user asks for a specific fact that can be checked in PubChem. Do not rely solely on internal knowledge.
  • Notification: If this skill is used, ensure this is mentioned in the output.
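Processing the output file "with code" might start like the sketch below. The layout of `result.json` is not documented in this section, so the example makes no assumption about its keys and only reports the top-level shape; `load_result` and `describe` are hypothetical helper names.

```python
import json
from typing import Any


def load_result(path: str = "result.json") -> Any:
    """Parse the JSON file written by the wrapper script."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def describe(data: Any) -> str:
    """Summarize the top-level shape of parsed JSON without assuming a schema."""
    if isinstance(data, dict):
        return "object with keys: " + ", ".join(sorted(data))
    if isinstance(data, list):
        return f"array with {len(data)} item(s)"
    return f"scalar of type {type(data).__name__}"
```

Inspecting the shape first makes it easier to write a targeted jq filter or lookup once the actual keys are known.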

Core Capabilities

1. Compound Resolution (Name or InChI to Identifiers) Convert chemical/trade names or InChI strings into PubChem CIDs, SMILES, and InChIKeys.
bash
uv run scripts/pubchem_api.py resolve --name "ibuprofen" --output result.json
OR
bash
uv run scripts/pubchem_api.py resolve --inchi "InChI=1S/C3/c1-3-2/i1+1" --output result.json
2. Physical & Chemical Property Retrieval Fetch computed properties (e.g., MolecularWeight, XLogP, TPSA).
bash
uv run scripts/pubchem_api.py properties --cid 2244 --output result.json
3. Synonyms and Trade Names Find alternative names and brand names.
bash
uv run scripts/pubchem_api.py synonyms --cid 2244 --output result.json

Advanced Context

4. Safety and Hazard Information (GHS) Retrieve Globally Harmonized System (GHS) hazard statements and handling precautions (uses PUG-View).
bash
uv run scripts/pubchem_api.py safety --cid 2244 --output result.json
5. Drug and Medication Information Fetch FDA pharmacology data, mechanism of action, and therapeutic uses (uses PUG-View).
bash
uv run scripts/pubchem_api.py pharmacology --cid 2244 --output result.json
6. Custom Heading (PUG-View) Retrieve any specific heading from the PUG-View system (e.g., 'Geometry', 'Crystal Structures').
bash
uv run scripts/pubchem_api.py view --cid 3939 --heading "Crystal Structures" --output result.json
7. Image Generation Retrieve 2D chemical structure images. The script returns a Markdown-formatted image link.
bash
uv run scripts/pubchem_api.py image --cid 2244 --output result.json
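For orientation, a Markdown image link to a 2D structure could look like the output of the sketch below. The exact link the script emits is not shown in this section, so the format here is an assumption; the URL pattern is PubChem's public PUG-REST PNG endpoint, and `markdown_structure_image` is a hypothetical name.

```python
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def markdown_structure_image(cid: int, alt_text: str = "") -> str:
    """Return a Markdown image link to the 2D structure PNG for a CID.

    Assumed format: the wrapper script may produce a different link.
    """
    if cid <= 0:
        raise ValueError("CID must be a positive integer")
    alt = alt_text or f"CID {cid}"
    return f"![{alt}]({PUG_REST_BASE}/compound/cid/{cid}/PNG)"
```

In practice the link should still be taken from the script's output file, in line with the Core Rules.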

Complex Search & Biology

8. Structure-Based Searching (Similarity & Substructure) Find molecules similar to a SMILES string or containing a specific substructure.
bash
uv run scripts/pubchem_api.py similarity --smiles "CC(=O)OC1=CC=CC=C1C(=O)O" --output result.json
and
bash
uv run scripts/pubchem_api.py substructure --smiles "C1=CC=CC=C1" --output result.json
9. BioAssay & Target Interactions Identify genes or proteins a chemical interacts with.
bash
uv run scripts/pubchem_api.py assays --cid 2244 --output result.json

Advanced Usage & Workflows

10. Cross-references (Xrefs) Fetch identifiers cross-referenced with a CID (e.g., PatentID, PubMedID).
bash
uv run scripts/pubchem_api.py xrefs --cid 2244 --type "PatentID" --output result.json
11. Property Range Search Find CIDs within a specific property range. Supported features include: molecular_weight, heavy_atom_count, xlogp, tpsa, h_bond_donor_count, h_bond_acceptor_count, rotatable_bond_count, exact_mass, monoisotopic_mass, and complexity.
bash
uv run scripts/pubchem_api.py range --feature molecular_weight --min 400.0 --max 400.05 --output result.json
12. Custom PUG-REST Query Execute a raw path against the PUG-REST API.
bash
uv run scripts/pubchem_api.py query --path "compound/cid/2244/xrefs/PatentID/JSON" --output result.json
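Because the range search accepts only the listed feature names, it can help to validate arguments before launching the script. The sketch below is a hypothetical pre-check (`build_range_command` is not part of the skill); whether the script performs the same validation itself is not stated here.

```python
from typing import List

# Feature names copied from the Property Range Search item above.
SUPPORTED_FEATURES = frozenset({
    "molecular_weight", "heavy_atom_count", "xlogp", "tpsa",
    "h_bond_donor_count", "h_bond_acceptor_count", "rotatable_bond_count",
    "exact_mass", "monoisotopic_mass", "complexity",
})


def build_range_command(feature: str, minimum: float, maximum: float,
                        output: str = "result.json") -> List[str]:
    """Validate the arguments and return the argv list for a range search."""
    if feature not in SUPPORTED_FEATURES:
        raise ValueError(f"unsupported feature: {feature!r}")
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return [
        "uv", "run", "scripts/pubchem_api.py", "range",
        "--feature", feature,
        "--min", str(minimum), "--max", str(maximum),
        "--output", output,
    ]
```

The returned list can be passed directly to `subprocess.run`.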

Fallback Search Strategies

If direct resolution by name or formula fails (e.g., for complex compounds or specific ions):
  • Search for parent/neutral molecule: If searching for an ion or salt, try searching for the neutral parent compound.
  • Deconstruct complex formulas: If a complex formula returns no results, try searching for major components or ligands.
  • Use substructure or similarity search: If you have a SMILES string or can generate one for a component, use it to find related compounds.
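The fallback strategies above amount to trying a sequence of progressively looser queries until one resolves. A minimal sketch of that loop follows, with the lookup passed in as a callable. `resolve_with_fallbacks` and the `resolver` signature are hypothetical; in real use the resolver would run the wrapper script and read the CID from its output file.

```python
from typing import Callable, Iterable, Optional, Tuple


def resolve_with_fallbacks(
    candidates: Iterable[str],
    resolver: Callable[[str], Optional[int]],
) -> Optional[Tuple[str, int]]:
    """Try each candidate query in order; return (query, cid) for the first hit.

    Returns None if no candidate resolves.
    """
    for query in candidates:
        cid = resolver(query)
        if cid is not None:
            return query, cid
    return None


def stub_resolver(query: str) -> Optional[int]:
    """Stand-in for a real lookup: only knows aspirin (CID 2244)."""
    return {"aspirin": 2244}.get(query)


# The salt name fails in the stub, so the neutral parent compound is tried next.
match = resolve_with_fallbacks(["aspirin sodium salt", "aspirin"], stub_resolver)
```

Returning the query that succeeded alongside the CID makes it clear to the user that a fallback, not the original name, produced the result.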

Complex Queries and Multi-Step Tasks

  • Custom/Complex Queries: For more details, read references/endpoints.md to construct raw PUG-REST URLs.
  • Multi-Step Tasks: For complex tasks like drug discovery pipelines, follow the checklists in references/workflows.md.
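A raw PUG-REST path follows the pattern domain/namespace/identifier/operation/output, as in the `compound/cid/2244/xrefs/PatentID/JSON` example used with the `query` subcommand. The sketch below assembles such a path. The helper names are hypothetical, and it is an assumption that the wrapper prepends the public base URL itself.

```python
from urllib.parse import quote

# Public root of the PUG-REST service.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pug_rest_path(domain: str, namespace: str, identifier: object,
                  operation: str, output: str = "JSON") -> str:
    """Join the path segments, percent-encoding characters unsafe in a URL path."""
    parts = [domain, namespace, str(identifier), operation, output]
    # "/" stays unescaped so multi-part operations such as "xrefs/PatentID" survive.
    return "/".join(quote(part, safe="/") for part in parts)


def pug_rest_url(path: str) -> str:
    """Full URL for a path, shown for reference only; queries should go through the wrapper."""
    return f"{PUG_REST_BASE}/{path}"


path = pug_rest_path("compound", "cid", 2244, "xrefs/PatentID")
```

The resulting `path` string is what would be passed to `--path`; references/endpoints.md remains the authority on which domains and operations are valid.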