hmdb-database
HMDB Database
Overview
The Human Metabolome Database (HMDB) is a comprehensive, freely available resource containing detailed information about small molecule metabolites found in the human body.
When to Use This Skill
This skill should be used when performing metabolomics research, clinical chemistry, biomarker discovery, or metabolite identification tasks.
Database Contents
HMDB version 5.0 (current as of 2025) contains:
- 220,945 metabolite entries covering both water-soluble and lipid-soluble compounds
- 8,610 protein sequences for enzymes and transporters involved in metabolism
- 130+ data fields per metabolite including:
  - Chemical properties (structure, formula, molecular weight, InChI, SMILES)
  - Clinical data (biomarker associations, diseases, normal/abnormal concentrations)
  - Biological information (pathways, reactions, locations)
  - Spectroscopic data (NMR, MS, MS-MS spectra)
  - External database links (KEGG, PubChem, MetaCyc, ChEBI, PDB, UniProt, GenBank)
Core Capabilities
1. Web-Based Metabolite Searches
Access HMDB through the web interface at https://www.hmdb.ca/ for:
Text Searches:
- Search by metabolite name, synonym, or identifier (HMDB ID)
- Example HMDB IDs: HMDB0000001, HMDB0001234
- Search by disease associations or pathway involvement
- Query by biological specimen type (urine, serum, CSF, saliva, feces, sweat)
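HMDB accessions like the examples above consist of `HMDB` followed by seven digits. Older files and publications often carry a shorter five-digit form, so it can help to normalize identifiers before searching or joining tables. The following is a minimal sketch that assumes the two forms differ only by zero padding; the function name `normalize_hmdb_id` is illustrative and not part of any HMDB tool.

```python
import re

# "HMDB" followed by one to seven digits, case-insensitive.
_HMDB_ID = re.compile(r"^HMDB(\d{1,7})$", re.IGNORECASE)


def normalize_hmdb_id(raw: str) -> str:
    """Return the accession in seven-digit form, e.g. HMDB0000001.

    Assumption: shorter legacy IDs differ only by zero padding.
    Raises ValueError for strings that do not look like HMDB IDs.
    """
    match = _HMDB_ID.match(raw.strip())
    if match is None:
        raise ValueError(f"not an HMDB accession: {raw!r}")
    return "HMDB" + match.group(1).zfill(7)


if __name__ == "__main__":
    print(normalize_hmdb_id("HMDB0001234"))
    print(normalize_hmdb_id(" hmdb00001 "))
```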
Structure-Based Searches:
- Use ChemQuery for structure and substructure searches
- Search by molecular weight or molecular weight range
- Use SMILES or InChI strings to find compounds
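A mass-range search can also be reproduced locally on downloaded data. The sketch below is one possible approach, not HMDB's own search logic: it computes a monoisotopic mass from a simple molecular formula and keeps candidates within a parts-per-million tolerance. The helper names, the 10 ppm default, and the restriction to C, H, N, O, P, and S are all assumptions made for illustration.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0, "H": 1.00782503, "N": 14.00307401,
    "O": 15.99491462, "P": 30.97376200, "S": 31.97207117,
}


def monoisotopic_mass(formula: str) -> float:
    """Sum isotope masses for a simple formula such as 'C6H12O6'.

    Handles only elements listed in MONOISOTOPIC; no brackets or charges.
    """
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def within_ppm(observed: float, theoretical: float, ppm: float = 10.0) -> bool:
    """True if the observed mass lies within +/- ppm of the theoretical mass."""
    return abs(observed - theoretical) / theoretical * 1e6 <= ppm


def match_by_mass(observed, candidates, ppm=10.0):
    """Return names of (name, formula) candidates whose mass matches `observed`."""
    return [name for name, formula in candidates
            if within_ppm(observed, monoisotopic_mass(formula), ppm)]


if __name__ == "__main__":
    candidates = [("glucose", "C6H12O6"), ("alanine", "C3H7NO2")]
    print(match_by_mass(180.0634, candidates))
```

For real LC-MS data the observed value is usually an adduct m/z rather than a neutral mass, so an adduct correction would be needed before matching.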
Spectral Searches:
- LC-MS spectral matching
- GC-MS spectral matching
- NMR spectral searches for metabolite identification
Advanced Searches:
- Combine multiple criteria (name, properties, concentration ranges)
- Filter by biological locations or specimen types
- Search by protein/enzyme associations
2. Accessing Metabolite Information
When retrieving metabolite data, HMDB provides:
Chemical Information:
- Systematic name, traditional names, and synonyms
- Chemical formula and molecular weight
- Structure representations (2D/3D, SMILES, InChI, MOL file)
- Chemical taxonomy and classification
Biological Context:
- Metabolic pathways and reactions
- Associated enzymes and transporters
- Subcellular locations
- Biological roles and functions
Clinical Relevance:
- Normal concentration ranges in biological fluids
- Biomarker associations with diseases
- Clinical significance
- Toxicity information when applicable
Analytical Data:
- Experimental and predicted NMR spectra
- MS and MS-MS spectra
- Retention times and chromatographic data
- Reference peaks for identification
3. Downloadable Datasets
HMDB offers bulk data downloads at https://www.hmdb.ca/downloads in multiple formats:
Available Formats:
- XML: Complete metabolite, protein, and spectra data
- SDF: Metabolite structure files for cheminformatics
- FASTA: Protein and gene sequences
- TXT: Raw spectra peak lists
- CSV/TSV: Tabular data exports
Dataset Categories:
- All metabolites or filtered by specimen type
- Protein/enzyme sequences
- Experimental and predicted spectra (NMR, GC-MS, MS-MS)
- Pathway information
Best Practices:
- Download XML format for comprehensive data including all fields
- Use SDF format for structure-based analysis and cheminformatics workflows
- Parse CSV/TSV formats for integration with data analysis pipelines
- Check version dates to ensure up-to-date data (current: v5.0, 2023-07-01)
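The complete XML export is large, so streaming it is usually safer than loading it whole. The sketch below shows one way to do that with the standard library. The element names (`metabolite`, `accession`, `name`, `chemical_formula`) and the namespace in the sample are assumptions about the file layout and should be checked against the file actually downloaded; the inline sample is hand-written for illustration only.

```python
import io
import xml.etree.ElementTree as ET


def local(tag: str) -> str:
    """Strip an XML namespace prefix: '{http://...}name' -> 'name'."""
    return tag.rsplit("}", 1)[-1]


def iter_metabolites(source, fields=("accession", "name", "chemical_formula")):
    """Yield one dict per <metabolite> element, keeping only `fields`.

    `source` is a path or a binary file object. Only direct children of
    <metabolite> are read, and each element is cleared after use to keep
    memory use low on large files.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local(elem.tag) != "metabolite":
            continue
        record = {}
        for child in elem:
            key = local(child.tag)
            if key in fields and key not in record:
                record[key] = (child.text or "").strip()
        yield record
        elem.clear()


# Hand-written sample in the assumed layout, used only to exercise the parser.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<hmdb xmlns="http://www.hmdb.ca">
  <metabolite>
    <accession>HMDB0000001</accession>
    <name>1-Methylhistidine</name>
    <chemical_formula>C7H11N3O2</chemical_formula>
  </metabolite>
</hmdb>"""

records = list(iter_metabolites(io.BytesIO(SAMPLE)))

if __name__ == "__main__":
    for record in records:
        print(record)
```

Matching on the local tag name rather than a hard-coded namespace keeps the parser working whether or not the file declares a default namespace.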
Usage Requirements:
- Free for academic and non-commercial research
- Commercial use requires explicit permission (contact samackay@ualberta.ca)
- Cite HMDB publication when using data
4. Programmatic API Access
API Availability:
HMDB does not provide a public REST API. Programmatic access requires contacting the development team:
- Academic/Research groups: Contact eponine@ualberta.ca (Eponine) or samackay@ualberta.ca (Scott)
- Commercial organizations: Contact samackay@ualberta.ca (Scott) for customized API access
Alternative Programmatic Access:
- R/Bioconductor: Use the hmdbQuery package for R-based queries
  - Install: BiocManager::install("hmdbQuery")
  - Provides HTTP-based querying functions
- Downloaded datasets: Parse XML or CSV files locally for programmatic analysis
- Web scraping: Not recommended; contact team for proper API access instead
5. Common Research Workflows
Metabolite Identification in Untargeted Metabolomics:
- Obtain experimental MS or NMR spectra from samples
- Use HMDB spectral search tools to match against reference spectra
- Verify candidates by checking molecular weight, retention time, and MS-MS fragmentation
- Review biological plausibility (expected in specimen type, known pathways)
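When reference peak lists have been downloaded, the spectral matching step can be approximated offline. The sketch below uses a generic binned cosine similarity, a common scoring choice in mass spectrometry; it is not HMDB's own matching algorithm, and the bin width and function names are assumptions for illustration.

```python
import math
from collections import defaultdict


def bin_peaks(peaks, bin_width=0.01):
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return binned


def cosine_score(query, reference, bin_width=0.01):
    """Cosine similarity between two peak lists, from 0.0 to 1.0."""
    q = bin_peaks(query, bin_width)
    r = bin_peaks(reference, bin_width)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(q[k] * r[k] for k in q.keys() & r.keys())
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in r.values())))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Synthetic peaks, not taken from any real spectrum.
    query = [(100.0, 10.0), (150.0, 5.0)]
    reference = [(100.0, 8.0), (150.0, 6.0), (175.0, 1.0)]
    print(round(cosine_score(query, reference), 3))
```

A high score only nominates a candidate; molecular weight, retention time, and biological plausibility still need to be checked as described in the steps above.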
Biomarker Discovery:
- Search HMDB for metabolites associated with disease of interest
- Review concentration ranges in normal vs. disease states
- Identify metabolites with strong differential abundance
- Examine pathway context and biological mechanisms
- Cross-reference with literature via PubMed links
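Differential abundance in the third step is often summarized as a log2 fold change between disease and control groups. The sketch below is a deliberately simple version that compares group means and ranks metabolites by effect size; it includes no significance test, and the function names and input layout are assumptions for illustration.

```python
import math
from statistics import mean


def log2_fold_change(disease, control):
    """log2 of mean(disease) / mean(control); positive means elevated in disease."""
    return math.log2(mean(disease) / mean(control))


def rank_by_abs_fold_change(measurements):
    """Sort metabolite names by |log2 fold change|, largest first.

    `measurements` maps name -> (disease_values, control_values),
    with all concentrations positive and in the same units.
    """
    changes = {name: log2_fold_change(d, c)
               for name, (d, c) in measurements.items()}
    return sorted(changes, key=lambda name: abs(changes[name]), reverse=True)


if __name__ == "__main__":
    # Synthetic values chosen only to exercise the functions.
    demo = {
        "metabolite_a": ([8.0, 8.0], [2.0, 2.0]),
        "metabolite_b": ([3.0, 3.0], [3.0, 3.0]),
    }
    print(rank_by_abs_fold_change(demo))
```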
Pathway Analysis:
- Identify metabolites of interest from experimental data
- Look up HMDB entries for each metabolite
- Extract pathway associations and enzymatic reactions
- Use linked SMPDB (Small Molecule Pathway Database) for pathway diagrams
- Identify pathway enrichment for biological interpretation
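Pathway enrichment in the last step is commonly tested with a one-sided hypergeometric (over-representation) test. The sketch below implements that test with exact integer arithmetic from the standard library; the choice of test and the parameter names are assumptions, since the workflow above does not prescribe a specific statistic.

```python
from math import comb


def enrichment_p_value(hits: int, selected: int,
                       pathway_size: int, background: int) -> float:
    """P(X >= hits) under a hypergeometric model.

    background:   metabolites that could have been detected
    pathway_size: background metabolites annotated to the pathway
    selected:     metabolites of interest drawn from the background
    hits:         metabolites of interest that fall in the pathway
    """
    upper = min(selected, pathway_size)
    total = comb(background, selected)
    # Sum the probability mass for every outcome at least as extreme as `hits`.
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Synthetic counts: 2 of 2 selected metabolites hit a 2-member pathway
    # in a background of 4.
    print(enrichment_p_value(hits=2, selected=2, pathway_size=2, background=4))
```

When many pathways are tested, the resulting p-values should be corrected for multiple testing before interpretation.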
Database Integration:
- Download HMDB data in XML or CSV format
- Parse and extract relevant fields for local database
- Link with external IDs (KEGG, PubChem, ChEBI) for cross-database queries
- Build local tools or pipelines incorporating HMDB reference data
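The integration steps above can be sketched with SQLite from the standard library. The table layout, column names, and function names below are assumptions for illustration, and the records are placeholders rather than real HMDB mappings; in practice they would come from the parsed XML or CSV download.

```python
import sqlite3


def build_index(records, db_path=":memory:"):
    """Create a small lookup table keyed by HMDB accession."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metabolite ("
        "hmdb_id TEXT PRIMARY KEY, name TEXT, kegg_id TEXT, chebi_id TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO metabolite "
        "VALUES (:hmdb_id, :name, :kegg_id, :chebi_id)",
        records,
    )
    conn.commit()
    return conn


def hmdb_for_kegg(conn, kegg_id):
    """Return HMDB accessions linked to a KEGG compound ID."""
    rows = conn.execute(
        "SELECT hmdb_id FROM metabolite WHERE kegg_id = ? ORDER BY hmdb_id",
        (kegg_id,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    # Placeholder records; every ID below is made up for the example.
    placeholder = [
        {"hmdb_id": "HMDB_EXAMPLE_1", "name": "placeholder A",
         "kegg_id": "KEGG_EXAMPLE_1", "chebi_id": "CHEBI_EXAMPLE_1"},
        {"hmdb_id": "HMDB_EXAMPLE_2", "name": "placeholder B",
         "kegg_id": "KEGG_EXAMPLE_2", "chebi_id": None},
    ]
    conn = build_index(placeholder)
    print(hmdb_for_kegg(conn, "KEGG_EXAMPLE_1"))
```

Keying the table on the HMDB accession and using `INSERT OR REPLACE` lets the same script refresh the index when a newer HMDB release is downloaded.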
Related HMDB Resources
The HMDB ecosystem includes related databases:
- DrugBank: ~2,832 drug compounds with pharmaceutical information
- T3DB (Toxin and Toxin Target Database): ~3,670 toxic compounds
- SMPDB (Small Molecule Pathway Database): Pathway diagrams and maps
- FooDB: ~70,000 food component compounds
These databases share similar structure and identifiers, enabling integrated queries across human metabolome, drug, toxin, and food databases.
Best Practices
Data Quality:
- Verify metabolite identifications with multiple evidence types (spectra, structure, properties)
- Check experimental vs. predicted data quality indicators
- Review citations and evidence for biomarker associations
Version Tracking:
- Note HMDB version used in research (current: v5.0)
- Databases are updated periodically with new entries and corrections
- Re-query for updates when publishing to ensure current information
Citation:
- Always cite HMDB in publications using the database
- Reference specific HMDB IDs when discussing metabolites
- Acknowledge data sources for downloaded datasets
Performance:
- For large-scale analysis, download complete datasets rather than repeated web queries
- Use appropriate file formats (XML for comprehensive data, CSV for tabular analysis)
- Consider local caching of frequently accessed metabolite information
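Local caching can be as simple as a JSON file keyed by accession. The sketch below wraps any lookup function (for example, one that reads from a parsed download) so that repeated requests for the same metabolite are served from disk. The class name `MetaboliteCache` and the `loader` callback are hypothetical and not part of any HMDB tooling.

```python
import json
from pathlib import Path


class MetaboliteCache:
    """Tiny JSON-file cache keyed by HMDB accession."""

    def __init__(self, path, loader):
        self.path = Path(path)
        self.loader = loader  # callable: accession -> JSON-serializable dict
        # Reuse entries from a previous run if the cache file already exists.
        if self.path.exists():
            self.data = json.loads(self.path.read_text())
        else:
            self.data = {}

    def get(self, accession):
        """Return the cached record, calling the loader only on a miss."""
        if accession not in self.data:
            self.data[accession] = self.loader(accession)
            self.path.write_text(json.dumps(self.data))
        return self.data[accession]
```

Storing the HMDB version alongside the cached records makes it easier to invalidate the cache when the database is updated.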
Reference Documentation
See references/hmdb_data_fields.md for detailed information about available data fields and their meanings.