hmdb-database

HMDB Database

Overview

The Human Metabolome Database (HMDB) is a comprehensive, freely available resource containing detailed information about small molecule metabolites found in the human body.

When to Use This Skill

Use this skill for metabolomics research, clinical chemistry, biomarker discovery, and metabolite identification tasks.

Database Contents

HMDB version 5.0 (current as of 2025) contains:

220,945 metabolite entries covering both water-soluble and lipid-soluble compounds
8,610 protein sequences for enzymes and transporters involved in metabolism
130+ data fields per metabolite including:
- Chemical properties (structure, formula, molecular weight, InChI, SMILES)
- Clinical data (biomarker associations, diseases, normal/abnormal concentrations)
- Biological information (pathways, reactions, locations)
- Spectroscopic data (NMR, MS, MS-MS spectra)
- External database links (KEGG, PubChem, MetaCyc, ChEBI, PDB, UniProt, GenBank)

Core Capabilities

1. Web-Based Metabolite Searches

Access HMDB through the web interface at https://www.hmdb.ca/ for:

Text Searches:

Search by metabolite name, synonym, or identifier (HMDB ID)
Example HMDB IDs: HMDB0000001, HMDB0001234
Search by disease associations or pathway involvement
Query by biological specimen type (urine, serum, CSF, saliva, feces, sweat)
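HMDB IDs in the current format are the prefix HMDB followed by seven digits, as in the examples above; older papers and files often use a five-digit form such as HMDB00001. The following is a minimal Python sketch for normalizing IDs before a text search. The function name `normalize_hmdb_id` is illustrative and not part of any HMDB tool.

```python
import re

# Accept the legacy 5-digit form and the current 7-digit form.
_HMDB_ID = re.compile(r"^HMDB(\d{5}|\d{7})$", re.IGNORECASE)


def normalize_hmdb_id(raw: str) -> str:
    """Return the 7-digit form of an HMDB ID, or raise ValueError."""
    match = _HMDB_ID.match(raw.strip())
    if match is None:
        raise ValueError(f"not an HMDB ID: {raw!r}")
    # Left-pad the numeric part with zeros to seven digits.
    return "HMDB" + match.group(1).zfill(7)
```

For example, `normalize_hmdb_id("HMDB00001")` returns `"HMDB0000001"`, so legacy and current IDs resolve to the same key.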

Structure-Based Searches:

Use ChemQuery for structure and substructure searches
Search by molecular weight or molecular weight range
Use SMILES or InChI strings to find compounds

Spectral Searches:

LC-MS spectral matching
GC-MS spectral matching
NMR spectral searches for metabolite identification

Advanced Searches:

Combine multiple criteria (name, properties, concentration ranges)
Filter by biological locations or specimen types
Search by protein/enzyme associations

2. Accessing Metabolite Information

When retrieving metabolite data, HMDB provides:

Chemical Information:

Systematic name, traditional names, and synonyms
Chemical formula and molecular weight
Structure representations (2D/3D, SMILES, InChI, MOL file)
Chemical taxonomy and classification

Biological Context:

Metabolic pathways and reactions
Associated enzymes and transporters
Subcellular locations
Biological roles and functions

Clinical Relevance:

Normal concentration ranges in biological fluids
Biomarker associations with diseases
Clinical significance
Toxicity information when applicable

Analytical Data:

Experimental and predicted NMR spectra
MS and MS-MS spectra
Retention times and chromatographic data
Reference peaks for identification

3. Downloadable Datasets

HMDB offers bulk data downloads at https://www.hmdb.ca/downloads in multiple formats:

Available Formats:

XML: Complete metabolite, protein, and spectra data
SDF: Metabolite structure files for cheminformatics
FASTA: Protein and gene sequences
TXT: Raw spectra peak lists
CSV/TSV: Tabular data exports

Dataset Categories:

All metabolites or filtered by specimen type
Protein/enzyme sequences
Experimental and predicted spectra (NMR, GC-MS, MS-MS)
Pathway information

Best Practices:

Download XML format for comprehensive data including all fields
Use SDF format for structure-based analysis and cheminformatics workflows
Parse CSV/TSV formats for integration with data analysis pipelines
Check version dates to ensure up-to-date data (current: v5.0, 2023-07-01)

Usage Requirements:

Free for academic and non-commercial research
Commercial use requires explicit permission (contact samackay@ualberta.ca)
Cite HMDB publication when using data
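The full metabolite XML download is large, so it is usually better to stream it than to load it whole. The sketch below uses Python's standard `xml.etree.ElementTree.iterparse` for that. The namespace `http://www.hmdb.ca` and the tag names `metabolite`, `accession`, `name`, and `chemical_formula` are assumptions about the file layout; check them against the file you actually downloaded. The embedded `SAMPLE` record stands in for a real file.

```python
import io
import xml.etree.ElementTree as ET

NS = "{http://www.hmdb.ca}"  # assumed namespace; check the root element of your file


def iter_metabolites(source):
    """Yield one dict per <metabolite> element without loading the whole file."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == NS + "metabolite":
            yield {
                "accession": elem.findtext(NS + "accession"),
                "name": elem.findtext(NS + "name"),
                "formula": elem.findtext(NS + "chemical_formula"),
            }
            elem.clear()  # drop the finished record's children to free memory


# Tiny stand-in for a downloaded file (assumed layout).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<hmdb xmlns="http://www.hmdb.ca">
  <metabolite>
    <accession>HMDB0000001</accession>
    <name>1-Methylhistidine</name>
    <chemical_formula>C7H11N3O2</chemical_formula>
  </metabolite>
</hmdb>"""

records = list(iter_metabolites(io.BytesIO(SAMPLE)))
```

With a real download, pass the file path or an open binary file object to `iter_metabolites` instead of the `BytesIO` wrapper.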

4. Programmatic API Access

API Availability: HMDB does not provide a public REST API. Programmatic access requires contacting the development team:

Academic/Research groups: Contact eponine@ualberta.ca (Eponine) or samackay@ualberta.ca (Scott)
Commercial organizations: Contact samackay@ualberta.ca (Scott) for customized API access

Alternative Programmatic Access:

R/Bioconductor: Use the `hmdbQuery` package for R-based queries
- Install: `BiocManager::install("hmdbQuery")`
- Provides HTTP-based querying functions
Downloaded datasets: Parse XML or CSV files locally for programmatic analysis
Web scraping: Not recommended; contact team for proper API access instead

5. Common Research Workflows

Metabolite Identification in Untargeted Metabolomics:

Obtain experimental MS or NMR spectra from samples
Use HMDB spectral search tools to match against reference spectra
Verify candidates by checking molecular weight, retention time, and MS-MS fragmentation
Review biological plausibility (expected in specimen type, known pathways)
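The molecular-weight check in the identification workflow above can be sketched offline as a mass match with a ppm tolerance. This is an illustration only, not HMDB's own search algorithm. It assumes positive-mode [M+H]+ ions, and the three-entry `REFERENCE` table stands in for monoisotopic masses taken from a downloaded dataset.

```python
PROTON_MASS = 1.007276  # Da, added to the neutral mass for an [M+H]+ ion

# Illustrative reference table: (name, monoisotopic mass in Da).
REFERENCE = [
    ("L-Alanine", 89.04768),
    ("D-Glucose", 180.06339),
    ("Citric acid", 192.02700),
]


def match_mz(observed_mz, reference=REFERENCE, ppm=10.0):
    """Return (name, ppm_error) pairs whose [M+H]+ m/z lies within `ppm` of observed_mz."""
    hits = []
    for name, mono_mass in reference:
        theoretical = mono_mass + PROTON_MASS
        error_ppm = (observed_mz - theoretical) / theoretical * 1e6
        if abs(error_ppm) <= ppm:
            hits.append((name, error_ppm))
    # Best match (smallest absolute error) first.
    return sorted(hits, key=lambda hit: abs(hit[1]))


candidates = match_mz(181.0707)
```

A mass match only narrows the candidate list; retention time and MS-MS fragmentation are still needed to confirm an identification.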

Biomarker Discovery:

Search HMDB for metabolites associated with disease of interest
Review concentration ranges in normal vs. disease states
Identify metabolites with strong differential abundance
Examine pathway context and biological mechanisms
Cross-reference with literature via PubMed links
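One simple way to rank metabolites by differential abundance is the log2 ratio of the midpoints of the disease and normal concentration ranges. The sketch below assumes both ranges are in the same unit. The metabolite names and values are hypothetical placeholders, not HMDB data, and the cutoff of 1.0 (a two-fold change) is an arbitrary choice.

```python
import math


def log2_fold_change(normal_range, disease_range):
    """log2 ratio of range midpoints (disease over normal); both in the same unit."""
    normal_mid = sum(normal_range) / 2
    disease_mid = sum(disease_range) / 2
    return math.log2(disease_mid / normal_mid)


# Hypothetical concentration ranges in uM: name -> (normal, disease).
ranges = {
    "metabolite_A": ((10.0, 30.0), (60.0, 100.0)),
    "metabolite_B": ((5.0, 15.0), (6.0, 14.0)),
}

# Keep metabolites whose midpoint changes at least two-fold in either direction.
candidates = {}
for name, (normal, disease) in ranges.items():
    change = log2_fold_change(normal, disease)
    if abs(change) >= 1.0:
        candidates[name] = change
```

Midpoints ignore sample size and variance, so this is a screening step before proper statistical testing on measured data.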

Pathway Analysis:

Identify metabolites of interest from experimental data
Look up HMDB entries for each metabolite
Extract pathway associations and enzymatic reactions
Use linked SMPDB (Small Molecule Pathway Database) for pathway diagrams
Identify pathway enrichment for biological interpretation
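The enrichment step is commonly done as an over-representation test. The sketch below computes a hypergeometric upper-tail p-value for a single pathway. It is a generic statistical illustration, not a method prescribed by HMDB or SMPDB, and the choice of background set is left to the analyst.

```python
from math import comb


def enrichment_p_value(hits, selected, pathway_size, background):
    """P(X >= hits) for a hypergeometric draw.

    `selected` metabolites of interest are drawn from `background` metabolites,
    of which `pathway_size` belong to the pathway; `hits` of the selected ones
    fall in the pathway.
    """
    upper = min(selected, pathway_size)
    total = comb(background, selected)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```

When many pathways are tested, the resulting p-values need a multiple-testing correction before interpretation.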

Database Integration:

Download HMDB data in XML or CSV format
Parse and extract relevant fields for local database
Link with external IDs (KEGG, PubChem, ChEBI) for cross-database queries
Build local tools or pipelines incorporating HMDB reference data
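As a minimal sketch of the integration steps above, parsed records can be loaded into a local SQLite table keyed by HMDB ID, with external IDs stored as columns for cross-database lookups. The table layout, function names, and the KEGG and ChEBI values in the sample record are placeholders invented for illustration.

```python
import sqlite3


def build_local_db(records, path=":memory:"):
    """Create (or update) a metabolites table from an iterable of dicts."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metabolites ("
        "hmdb_id TEXT PRIMARY KEY, name TEXT, kegg_id TEXT, chebi_id TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO metabolites "
        "VALUES (:hmdb_id, :name, :kegg_id, :chebi_id)",
        records,
    )
    conn.commit()
    return conn


def hmdb_id_for_kegg(conn, kegg_id):
    """Return the HMDB ID linked to a KEGG compound ID, or None if absent."""
    row = conn.execute(
        "SELECT hmdb_id FROM metabolites WHERE kegg_id = ?", (kegg_id,)
    ).fetchone()
    return row[0] if row else None


# Placeholder external IDs; real values come from the parsed HMDB download.
sample = [
    {"hmdb_id": "HMDB0000001", "name": "1-Methylhistidine",
     "kegg_id": "C99999", "chebi_id": "CHEBI:0"},
]
conn = build_local_db(sample)
```

Here `hmdb_id_for_kegg(conn, "C99999")` returns `"HMDB0000001"`. Passing a file path instead of `":memory:"` keeps the database between runs.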

Related HMDB Resources

Best Practices

Data Quality:

Verify metabolite identifications with multiple evidence types (spectra, structure, properties)
Check experimental vs. predicted data quality indicators
Review citations and evidence for biomarker associations

Version Tracking:

Note HMDB version used in research (current: v5.0)
Databases are updated periodically with new entries and corrections
Re-query for updates when publishing to ensure current information

Citation:

Always cite HMDB in publications using the database
Reference specific HMDB IDs when discussing metabolites
Acknowledge data sources for downloaded datasets

Performance:

For large-scale analysis, download complete datasets rather than repeated web queries
Use appropriate file formats (XML for comprehensive data, CSV for tabular analysis)
Consider local caching of frequently accessed metabolite information

Reference Documentation

See `references/hmdb_data_fields.md` for detailed information about available data fields and their meanings.
