pubchem-database
PubChem Database
Prerequisites
- uv: Read the uv skill and follow its Setup instructions to ensure uv is installed and on PATH.
- User Notification: If LICENSE_NOTIFICATION.txt does not already exist in this skill directory, then (1) prominently notify the user to check the terms at https://pubchem.ncbi.nlm.nih.gov/docs/citation-guidelines and https://pubchem.ncbi.nlm.nih.gov/docs/pug-rest, and (2) create the file, recording the notification text and timestamp.
Quick Start
PubChem queries run through a Python wrapper script that respects the PubChem
terms of service and handles complex JSON parsing. The script also makes it
safe for multiple agents to use the APIs.
Example: Resolve a chemical name to its Compound ID (CID)
```bash
uv run scripts/pubchem_api.py resolve --name "aspirin" --output result.json
```
Core Rules
- Use the Wrapper: ALWAYS execute the provided helper scripts to query the database rather than accessing the database directly. The scripts automatically enforce the required rate limit gracefully.
- Read the generated JSON output file, and process it with jq or code.
- Verify Facts: ALWAYS verify information retrieved from memory with a database query if the user asks for a specific fact that can be checked in PubChem. Do not rely solely on internal knowledge.
- Notification: If this skill is used, ensure this is mentioned in the output.
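The first two rules can be combined in a small Python helper. This is a minimal sketch, not part of the skill: the function names `build_command` and `run_and_load` are hypothetical, the flags are the ones shown in this document, and it assumes the script exits with a non-zero status on failure. The layout of the JSON output is not described here, so the sketch returns the parsed data without interpreting it.

```python
import json
import subprocess
from pathlib import Path


def build_command(subcommand, output, **options):
    """Build the argv list for one wrapper call, e.g. resolve --name aspirin."""
    cmd = ["uv", "run", "scripts/pubchem_api.py", subcommand]
    for flag, value in options.items():
        cmd += [f"--{flag}", str(value)]
    return cmd + ["--output", str(output)]


def run_and_load(subcommand, output="result.json", **options):
    """Run the wrapper, then parse the JSON file it wrote."""
    # check=True raises CalledProcessError on a non-zero exit (assumed behaviour).
    subprocess.run(build_command(subcommand, output, **options), check=True)
    # The JSON schema is not documented here, so return the parsed data as-is.
    return json.loads(Path(output).read_text(encoding="utf-8"))
```

For example, `run_and_load("resolve", name="aspirin")` would run the Quick Start command and return the parsed contents of result.json.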
Core Capabilities
**1. Compound Resolution (Name or InChI to Identifiers)** Convert chemical/trade
names or InChI strings into PubChem CIDs, SMILES, and InChIKeys.
```bash
uv run scripts/pubchem_api.py resolve --name "ibuprofen" --output result.json
```
OR
```bash
uv run scripts/pubchem_api.py resolve --inchi "InChI=1S/C3/c1-3-2/i1+1" --output result.json
```
**2. Physical & Chemical Property Retrieval** Fetch computed properties (e.g.,
MolecularWeight, XLogP, TPSA).
```bash
uv run scripts/pubchem_api.py properties --cid 2244 --output result.json
```
**3. Synonyms and Trade Names** Find alternative names and brand names.
```bash
uv run scripts/pubchem_api.py synonyms --cid 2244 --output result.json
```
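When the input may be either a name or an InChI string, the choice between `--name` and `--inchi` can be made automatically, because every InChI string begins with the prefix `InChI=`. The helper below is a hypothetical sketch (the name `resolve_args` is not part of the skill); it only builds the two arguments to append to the `resolve` command.

```python
def resolve_args(identifier: str) -> list:
    """Return the resolve flag and value for a name or an InChI string."""
    text = identifier.strip()
    # Standard and non-standard InChI strings both begin with "InChI=".
    flag = "--inchi" if text.startswith("InChI=") else "--name"
    return [flag, text]
```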
Advanced Context
**4. Safety and Hazard Information (GHS)** Retrieve Globally Harmonized System
(GHS) hazard statements and handling precautions (uses PUG-View).
```bash
uv run scripts/pubchem_api.py safety --cid 2244 --output result.json
```
**5. Drug and Medication Information** Fetch FDA pharmacology data, mechanism of
action, and therapeutic uses (uses PUG-View).
```bash
uv run scripts/pubchem_api.py pharmacology --cid 2244 --output result.json
```
**6. Custom Heading (PUG-View)** Retrieve any specific heading from the PUG-View
system (e.g., 'Geometry', 'Crystal Structures').
```bash
uv run scripts/pubchem_api.py view --cid 3939 --heading "Crystal Structures" --output result.json
```
**7. Image Generation** Retrieve 2D chemical structure images. The script
returns a Markdown-formatted image link.
```bash
uv run scripts/pubchem_api.py image --cid 2244 --output result.json
```
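If only the image URL is needed, it can be pulled out of a Markdown image link of the form `![alt](url)`. The sketch below is illustrative: the helper name `image_url` is hypothetical, and which field of result.json carries the link is not documented here, so the function works on any string that contains the link.

```python
import re
from typing import Optional

# Matches Markdown image syntax: ![alt text](url)
MD_IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def image_url(markdown: str) -> Optional[str]:
    """Return the URL of the first Markdown image link in the text, if any."""
    match = MD_IMAGE.search(markdown)
    return match.group(2) if match else None
```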
Complex Search & Biology
**8. Structure-Based Searching (Similarity & Substructure)** Find molecules
similar to a SMILES string or containing a specific substructure.
```bash
uv run scripts/pubchem_api.py similarity --smiles "CC(=O)OC1=CC=CC=C1C(=O)O" --output result.json
```
and
```bash
uv run scripts/pubchem_api.py substructure --smiles "C1=CC=CC=C1" --output result.json
```
**9. BioAssay & Target Interactions** Identify genes or proteins a chemical
interacts with.
```bash
uv run scripts/pubchem_api.py assays --cid 2244 --output result.json
```
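SMILES strings often contain characters that a shell treats specially, such as parentheses and `#`. When the wrapper is driven from Python, passing the arguments as a list avoids shell parsing altogether, and `shlex.join` gives a safely quoted form for logging. This is a sketch under that assumption; `similarity_command` is a hypothetical helper that uses only the flags shown above.

```python
import shlex


def similarity_command(smiles: str, output: str = "result.json") -> list:
    """Build the argv list for a similarity search (flags as shown above)."""
    return ["uv", "run", "scripts/pubchem_api.py", "similarity",
            "--smiles", smiles, "--output", output]


cmd = similarity_command("CC(=O)OC1=CC=CC=C1C(=O)O")
# shlex.join quotes any argument the shell would otherwise interpret.
printable = shlex.join(cmd)
```

The list `cmd` can be handed directly to `subprocess.run`, while `printable` is suitable for logs or for pasting into a terminal.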
Advanced Usage & Workflows
**10. Cross-references (Xrefs)** Fetch identifiers cross-referenced with a CID
(e.g., PatentID, PubMedID).
```bash
uv run scripts/pubchem_api.py xrefs --cid 2244 --type "PatentID" --output result.json
```
**11. Property Range Search** Find CIDs within a specific property range.
Supported features include: molecular_weight, heavy_atom_count, xlogp, tpsa,
h_bond_donor_count, h_bond_acceptor_count, rotatable_bond_count, exact_mass,
monoisotopic_mass, and complexity.
```bash
uv run scripts/pubchem_api.py range --feature molecular_weight --min 400.0 --max 400.05 --output result.json
```
**12. Custom PUG-REST Query** Execute a raw path against the PUG-REST API.
```bash
uv run scripts/pubchem_api.py query --path "compound/cid/2244/xrefs/PatentID/JSON" --output result.json
```
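For the property range search, it can help to check the feature name and bounds before calling the wrapper. The sketch below assumes the ten feature names listed under item 11 are the complete set; `RANGE_FEATURES` and `range_command` are hypothetical names, and how the script itself reacts to invalid input is not described in this document.

```python
RANGE_FEATURES = {
    "molecular_weight", "heavy_atom_count", "xlogp", "tpsa",
    "h_bond_donor_count", "h_bond_acceptor_count", "rotatable_bond_count",
    "exact_mass", "monoisotopic_mass", "complexity",
}


def range_command(feature: str, low: float, high: float,
                  output: str = "result.json") -> list:
    """Validate the inputs, then build the argv list for a range search."""
    if feature not in RANGE_FEATURES:
        raise ValueError(f"unsupported feature: {feature!r}")
    if low > high:
        raise ValueError("min must not exceed max")
    return ["uv", "run", "scripts/pubchem_api.py", "range",
            "--feature", feature, "--min", str(low), "--max", str(high),
            "--output", output]
```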
Fallback Search Strategies
If direct resolution by name or formula fails (e.g., for complex compounds or
specific ions):
- Search for parent/neutral molecule: If searching for an ion or salt, try searching for the neutral parent compound.
- Deconstruct complex formulas: If a complex formula returns no results, try searching for major components or ligands.
- Use substructure or similarity search: If you have a SMILES string or can generate one for a component, use it to find related compounds.
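The first two strategies amount to trying a list of candidate names in order and stopping at the first one that resolves. A minimal sketch of that loop follows; `first_resolved` is a hypothetical helper, and the `resolve` callable stands in for whatever function runs the wrapper and returns a parsed result (or nothing when the lookup fails).

```python
from typing import Callable, Iterable, Optional


def first_resolved(candidates: Iterable[str],
                   resolve: Callable[[str], Optional[dict]]) -> Optional[tuple]:
    """Try each candidate name in order; return (name, result) for the first hit."""
    for name in candidates:
        result = resolve(name)
        if result:  # treat None or an empty result as "not found"
            return name, result
    return None
```

A caller would list the most specific name first (for example, the salt), followed by the neutral parent compound and then any major components.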
Complex Queries and Multi-Step Tasks
- Custom/Complex Queries: For more details, read references/endpoints.md to construct raw PUG-REST URLs.
- Multi-Step Tasks: For complex tasks like drug discovery pipelines, follow the checklists in references/workflows.md.
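A raw PUG-REST path has three parts: the input specification (domain, namespace, identifiers), the operation, and the output format. Since the Core Rules require going through the wrapper, the sketch below builds only the path string to pass to `query --path`, not a full URL. The helper name `pug_rest_path` is hypothetical, and references/endpoints.md remains the authority on which combinations are valid.

```python
from urllib.parse import quote


def pug_rest_path(domain, namespace, identifier, operation, output_format="JSON"):
    """Join the PUG-REST parts: input (domain/namespace/identifier), operation, output."""
    # Percent-encode the identifier; keep commas for comma-separated CID lists.
    ident = quote(str(identifier), safe=",")
    return "/".join([domain, namespace, ident, operation.strip("/"), output_format])


path = pug_rest_path("compound", "cid", 2244, "xrefs/PatentID")
# path == "compound/cid/2244/xrefs/PatentID/JSON"
```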