pyopenms
PyOpenMS
Overview
PyOpenMS provides Python bindings to the OpenMS library for computational mass spectrometry, enabling analysis of proteomics and metabolomics data. Use it to handle mass spectrometry file formats, process spectral data, detect features, identify peptides and proteins, and perform quantitative analysis.
Installation
Install using uv:

```bash
uv pip install pyopenms
```

Verify installation:

```python
import pyopenms
print(pyopenms.__version__)
```
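If a script should report a missing installation with a clear message instead of an `ImportError` traceback, the check can be made without importing the package. This is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of PyOpenMS.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if `package` can be found on the current import path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    if is_installed("pyopenms"):
        print("pyopenms is available")
    else:
        print("pyopenms not found; install it with: uv pip install pyopenms")
```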
Core Capabilities
PyOpenMS organizes functionality into these domains:
1. File I/O and Data Formats
Handle mass spectrometry file formats and convert between representations.
Supported formats: mzML, mzXML, TraML, mzTab, FASTA, pepXML, protXML, mzIdentML, featureXML, consensusXML, idXML
Basic file reading:

```python
import pyopenms as ms

# Read mzML file
exp = ms.MSExperiment()
ms.MzMLFile().load("data.mzML", exp)

# Access spectra
for spectrum in exp:
    mz, intensity = spectrum.get_peaks()
    print(f"Spectrum: {len(mz)} peaks")
```

**For detailed file handling**: See `references/file_io.md`
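When a workflow receives files of several formats, one way to choose a loader is to dispatch on the file extension. The sketch below is an illustration only: `LOADERS` and `loader_name` are hypothetical names, and the table lists just the three loader classes that appear in this document (`MzMLFile`, `FeatureXMLFile`, `IdXMLFile`). Loader names for the other supported formats should be checked against the PyOpenMS documentation before being added.

```python
from pathlib import Path

# Extension -> loader class name. Only loaders used elsewhere in this
# document are listed; extend the table after verifying other class names.
LOADERS = {
    ".mzml": "MzMLFile",
    ".featurexml": "FeatureXMLFile",
    ".idxml": "IdXMLFile",
}


def loader_name(path):
    """Return the loader class name for `path`, matching the extension case-insensitively."""
    suffix = Path(path).suffix.lower()
    try:
        return LOADERS[suffix]
    except KeyError:
        raise ValueError(f"No loader registered for '{suffix}'") from None


print(loader_name("data.mzML"))
print(loader_name("features.featureXML"))
```

With PyOpenMS imported as `ms`, `getattr(ms, loader_name(path))()` would then create the loader instance. Each loader fills a different container (an `MSExperiment`, a `FeatureMap`, or identification lists), so the caller still has to supply the matching target object.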
2. Signal Processing
Process raw spectral data with smoothing, filtering, centroiding, and normalization.
Basic spectrum processing:

```python
# Smooth spectrum with Gaussian filter
gaussian = ms.GaussFilter()
params = gaussian.getParameters()
params.setValue("gaussian_width", 0.1)
gaussian.setParameters(params)
gaussian.filterExperiment(exp)
```

**For algorithm details**: See `references/signal_processing.md`
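To make the smoothing step concrete, here is a library-independent sketch of Gaussian smoothing over a list of peaks. It is not the OpenMS implementation. The function name `gaussian_smooth` and the `sigma` parameter are assumptions for illustration, and `sigma` is not claimed to be equivalent to the `gaussian_width` parameter used above.

```python
import math


def gaussian_smooth(mz, intensity, sigma):
    """Replace each intensity with a Gaussian-weighted average of all peaks.

    Weights depend on the m/z distance to the peak being smoothed, so
    unevenly spaced data is handled without resampling.
    """
    smoothed = []
    for center in mz:
        weights = [math.exp(-0.5 * ((m - center) / sigma) ** 2) for m in mz]
        total = sum(weights)
        smoothed.append(sum(w * i for w, i in zip(weights, intensity)) / total)
    return smoothed


mz = [100.0, 100.1, 100.2]
spike = [0.0, 10.0, 0.0]
print(gaussian_smooth(mz, spike, sigma=0.1))
```

The quadratic loop keeps the sketch short. A production filter would restrict the window to a few multiples of `sigma` around each point.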
3. Feature Detection
Detect and link features across spectra and samples for quantitative analysis.
```python
# Detect features in centroided data.
# This uses the FeatureFinder interface; newer pyOpenMS releases may expose feature finding differently.
features = ms.FeatureMap()  # output container
ff = ms.FeatureFinder()
params = ff.getParameters("centroided")  # default parameters for this algorithm
ff.run("centroided", exp, features, params, ms.FeatureMap())  # last argument: seeds
```

**For complete workflows**: See `references/feature_detection.md`
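As a rough illustration of what feature detection means on centroided data, the sketch below chains peaks with nearly the same m/z across consecutive scans into traces and keeps only traces seen in enough scans. It is a deliberately simplified stand-in, not the algorithm OpenMS runs. The function name `detect_traces` and the tolerance values are assumptions.

```python
def detect_traces(scans, mz_tol=0.01, min_scans=3):
    """Group peaks into m/z traces across consecutive scans.

    scans: list of (rt, [(mz, intensity), ...]) sorted by retention time.
    Returns a list of traces, each a list of (rt, mz, intensity).
    """
    open_traces = []  # traces that were extended in the previous scan
    finished = []
    for rt, peaks in scans:
        next_open = []
        used = set()
        for trace in open_traces:
            last_mz = trace[-1][1]
            best = None
            # Pick the closest unused peak within the m/z tolerance.
            for idx, (mz, _) in enumerate(peaks):
                if idx in used or abs(mz - last_mz) > mz_tol:
                    continue
                if best is None or abs(mz - last_mz) < abs(peaks[best][0] - last_mz):
                    best = idx
            if best is None:
                finished.append(trace)  # the trace ends when a scan has no match
            else:
                used.add(best)
                trace.append((rt, peaks[best][0], peaks[best][1]))
                next_open.append(trace)
        # Every unmatched peak starts a new trace.
        for idx, (mz, inten) in enumerate(peaks):
            if idx not in used:
                next_open.append([(rt, mz, inten)])
        open_traces = next_open
    finished.extend(open_traces)
    return [t for t in finished if len(t) >= min_scans]


scans = [
    (1.0, [(500.000, 10.0)]),
    (2.0, [(500.002, 50.0), (300.000, 5.0)]),
    (3.0, [(500.001, 20.0)]),
]
for trace in detect_traces(scans):
    apex = max(trace, key=lambda p: p[2])
    print(f"trace of {len(trace)} scans, apex at RT {apex[0]} with intensity {apex[2]}")
```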
4. Peptide and Protein Identification
Integrate with search engines and process identification results.
Supported engines: Comet, Mascot, MSGFPlus, XTandem, OMSSA, MyriMatch
Basic identification workflow:

```python
# Load identification data
protein_ids = []
peptide_ids = []
ms.IdXMLFile().load("identifications.idXML", protein_ids, peptide_ids)

# Apply FDR estimation to the peptide identifications
fdr = ms.FalseDiscoveryRate()
fdr.apply(peptide_ids)
```

**For detailed workflows**: See `references/identification.md`
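For readers unfamiliar with the idea behind FDR control, the following sketch shows one common target-decoy formulation: rank hits by score, estimate the FDR at each rank as decoys divided by targets, and convert that to q-values. It is an illustration under stated assumptions (higher score is better, decoys are flagged explicitly) and does not describe the internals of `FalseDiscoveryRate`. The name `q_values` is hypothetical.

```python
def q_values(hits):
    """Compute q-values for (score, is_decoy) hits; higher score means a better hit.

    Returns a list of (score, is_decoy, q_value) sorted from best to worst score.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))
    # The q-value is the smallest FDR at which a hit would still be accepted,
    # i.e. the minimum FDR from this rank down to the end of the list.
    qvals = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvals.append(running_min)
    qvals.reverse()
    return [(score, is_decoy, q) for (score, is_decoy), q in zip(ranked, qvals)]


hits = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, True)]
accepted = [h for h in q_values(hits) if not h[1] and h[2] <= 0.01]
print(f"{len(accepted)} target hits pass a 1% q-value threshold")
```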
5. Metabolomics Analysis
Perform untargeted metabolomics preprocessing and analysis.
Typical workflow:
- Load and process raw data
- Detect features
- Align retention times across samples
- Link features to consensus map
- Annotate with compound databases
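The linking step in the workflow above can be pictured as grouping features from different samples that agree in m/z and retention time. The sketch below is a greedy, library-independent illustration of that idea, not the linking algorithm OpenMS provides. The name `link_features`, the dictionary layout, and the tolerances are assumptions, and retention times are assumed to have been aligned already.

```python
def link_features(samples, mz_tol=0.01, rt_tol=5.0):
    """Greedily link features across samples into consensus groups.

    samples: dict mapping sample name -> list of (mz, rt, intensity).
    Returns a list of dicts with the group's reference m/z and RT (taken
    from the first feature assigned to it) and per-sample intensities.
    """
    consensus = []
    for name, features in samples.items():
        for mz, rt, inten in features:
            for group in consensus:
                if (name not in group["intensities"]
                        and abs(group["mz"] - mz) <= mz_tol
                        and abs(group["rt"] - rt) <= rt_tol):
                    group["intensities"][name] = inten
                    break
            else:
                # No compatible group was found, so start a new one.
                consensus.append({"mz": mz, "rt": rt, "intensities": {name: inten}})
    return consensus


samples = {
    "A": [(180.063, 60.0, 1.0e5), (255.232, 300.0, 2.0e4)],
    "B": [(180.064, 62.0, 8.0e4)],
}
for group in link_features(samples):
    print(group["mz"], group["rt"], sorted(group["intensities"]))
```

Groups that contain an intensity for only some samples correspond to missing values in the resulting quantification table.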
**For complete metabolomics workflows**: See `references/metabolomics.md`
Data Structures
PyOpenMS uses these primary objects:
- MSExperiment: Collection of spectra and chromatograms
- MSSpectrum: Single mass spectrum with m/z and intensity pairs
- MSChromatogram: Chromatographic trace
- Feature: Detected chromatographic peak with quality metrics
- FeatureMap: Collection of features
- PeptideIdentification: Search results for peptides
- ProteinIdentification: Search results for proteins
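The containment relationship between the first two objects in this list can be sketched with plain dataclasses. These classes are simplified analogies, not the PyOpenMS types: `SketchSpectrum`, `SketchExperiment`, and their methods are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SketchSpectrum:
    """One spectrum: scan metadata plus parallel m/z and intensity lists."""
    ms_level: int
    rt: float
    mz: List[float] = field(default_factory=list)
    intensity: List[float] = field(default_factory=list)

    def base_peak(self) -> Tuple[float, float]:
        """Return the (m/z, intensity) pair with the highest intensity."""
        return max(zip(self.mz, self.intensity), key=lambda p: p[1])


@dataclass
class SketchExperiment:
    """A run: an ordered collection of spectra."""
    spectra: List[SketchSpectrum] = field(default_factory=list)

    def at_level(self, level: int) -> List[SketchSpectrum]:
        """Return only the spectra recorded at the given MS level."""
        return [s for s in self.spectra if s.ms_level == level]


run = SketchExperiment([
    SketchSpectrum(1, 0.5, [100.0, 200.0], [10.0, 30.0]),
    SketchSpectrum(2, 0.6, [150.0], [5.0]),
])
print(len(run.at_level(1)), run.spectra[0].base_peak())
```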
**For detailed documentation**: See `references/data_structures.md`
Common Workflows
Quick Start: Load and Explore Data
```python
import pyopenms as ms

# Load mzML file
exp = ms.MSExperiment()
ms.MzMLFile().load("sample.mzML", exp)

# Get basic statistics
print(f"Number of spectra: {exp.getNrSpectra()}")
print(f"Number of chromatograms: {exp.getNrChromatograms()}")

# Examine first spectrum
spec = exp.getSpectrum(0)
print(f"MS level: {spec.getMSLevel()}")
print(f"Retention time: {spec.getRT()}")
mz, intensity = spec.get_peaks()
print(f"Peaks: {len(mz)}")
```
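A natural next step after inspecting a single spectrum is a per-run summary. The sketch below relies only on the three accessors used above (`getMSLevel`, `getRT`, `get_peaks`), so it should accept any iterable of spectrum-like objects. `summarize` is a hypothetical helper, and `FakeSpectrum` is a stand-in so the example runs without PyOpenMS or a data file.

```python
from collections import Counter


def summarize(spectra):
    """Count spectra per MS level and compute a total ion current (TIC) trace."""
    levels = Counter()
    tic = []  # one (retention time, summed intensity) pair per spectrum
    for spec in spectra:
        levels[spec.getMSLevel()] += 1
        _, intensity = spec.get_peaks()
        tic.append((spec.getRT(), float(sum(intensity))))
    return levels, tic


class FakeSpectrum:
    """Stand-in exposing only the three methods that summarize() calls."""

    def __init__(self, level, rt, mz, intensity):
        self._level, self._rt, self._mz, self._intensity = level, rt, mz, intensity

    def getMSLevel(self):
        return self._level

    def getRT(self):
        return self._rt

    def get_peaks(self):
        return self._mz, self._intensity


demo = [
    FakeSpectrum(1, 0.5, [100.0, 200.0], [10.0, 30.0]),
    FakeSpectrum(2, 0.6, [150.0], [5.0]),
    FakeSpectrum(1, 1.0, [100.0], [20.0]),
]
levels, tic = summarize(demo)
print(dict(levels))
print(tic)
```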
Parameter Management
Most algorithms use a parameter system:
```python
import pyopenms as ms

# Get algorithm parameters
algo = ms.GaussFilter()
params = algo.getParameters()

# View available parameters
for param in params.keys():
    print(f"{param}: {params.getValue(param)}")

# Modify parameters
params.setValue("gaussian_width", 0.2)
algo.setParameters(params)
```
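Because every algorithm follows the same get, modify, set pattern, the pattern can be wrapped in a small helper that also rejects misspelled parameter names. This is a sketch: `apply_overrides` is a hypothetical helper, the bytes-or-str handling of keys is a defensive assumption about differences between releases, and `FakeParam` and `FakeAlgo` are stand-ins so the example runs without PyOpenMS.

```python
def apply_overrides(algo, overrides):
    """Set several parameters at once, rejecting names the algorithm does not know."""
    params = algo.getParameters()
    # Assumption: keys may arrive as bytes or str, so normalize them to str.
    known = {k.decode() if isinstance(k, bytes) else k for k in params.keys()}
    unknown = sorted(set(overrides) - known)
    if unknown:
        raise KeyError(f"Unknown parameter(s): {', '.join(unknown)}")
    for name, value in overrides.items():
        params.setValue(name, value)
    algo.setParameters(params)
    return algo


class FakeParam:
    """Stand-in for a parameter object with keys/getValue/setValue."""

    def __init__(self, values):
        self._values = dict(values)

    def keys(self):
        return list(self._values)

    def getValue(self, name):
        return self._values[name]

    def setValue(self, name, value):
        self._values[name] = value


class FakeAlgo:
    """Stand-in for an algorithm exposing getParameters/setParameters."""

    def __init__(self):
        self._params = FakeParam({"gaussian_width": 0.2, "ppm_tolerance": 10.0})

    def getParameters(self):
        return FakeParam(self._params._values)  # hand out a copy

    def setParameters(self, params):
        self._params = params


algo = apply_overrides(FakeAlgo(), {"gaussian_width": 0.1})
print(algo.getParameters().getValue("gaussian_width"))
```

Failing on unknown names catches typos such as `gausian_width` before they reach the algorithm.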
Export to Pandas
Convert data to pandas DataFrames for analysis:
```python
import pyopenms as ms
import pandas as pd

# Load feature map
fm = ms.FeatureMap()
ms.FeatureXMLFile().load("features.featureXML", fm)

# Convert to DataFrame
df = fm.get_df()
print(df.head())
```
Integration with Other Tools
PyOpenMS integrates with:
- Pandas: Export data to DataFrames
- NumPy: Work with peak arrays
- Scikit-learn: Machine learning on MS data
- Matplotlib/Seaborn: Visualization
- R: Via rpy2 bridge
Resources
- Official documentation: https://pyopenms.readthedocs.io
- OpenMS documentation: https://www.openms.org
- GitHub: https://github.com/OpenMS/OpenMS
References
- `references/file_io.md` - Comprehensive file format handling
- `references/signal_processing.md` - Signal processing algorithms
- `references/feature_detection.md` - Feature detection and linking
- `references/identification.md` - Peptide and protein identification
- `references/metabolomics.md` - Metabolomics-specific workflows
- `references/data_structures.md` - Core objects and data structures