pyopenms
pyOpenMS
Overview
pyOpenMS is an open-source Python library for mass spectrometry data analysis in proteomics and metabolomics. Process LC-MS/MS data, perform peptide identification, detect and quantify features, and integrate with common proteomics tools (Comet, Mascot, MSGF+, Percolator, MSstats) using Python bindings to the OpenMS C++ library.
When to Use This Skill
This skill should be used when:
- Processing mass spectrometry data (mzML, mzXML files)
- Performing peak picking and feature detection in LC-MS data
- Conducting peptide and protein identification workflows
- Quantifying metabolites or proteins
- Integrating proteomics or metabolomics tools into Python pipelines
- Working with OpenMS tools and file formats
Core Capabilities
1. File I/O and Data Import/Export
Handle diverse mass spectrometry file formats efficiently:
**Supported Formats:**
- mzML/mzXML: Primary raw MS data formats (profile or centroid)
- FASTA: Protein/peptide sequence databases
- mzTab: Standardized reporting format for identification and quantification
- mzIdentML: Peptide and protein identification data
- TraML: Transition lists for targeted experiments
- pepXML/protXML: Search engine results

**Reading mzML Files:**
```python
import pyopenms as oms

# Load MS data
exp = oms.MSExperiment()
oms.MzMLFile().load("input_data.mzML", exp)

# Access basic information
print(f"Number of spectra: {exp.getNrSpectra()}")
print(f"Number of chromatograms: {exp.getNrChromatograms()}")
```

**Writing mzML Files:**
```python
# Save processed data
oms.MzMLFile().store("output_data.mzML", exp)
```

**File Encoding:** pyOpenMS automatically handles Base64 encoding, zlib compression, and Numpress compression internally.

2. MS Data Structures and Manipulation
Work with core mass spectrometry data structures. See `references/data_structures.md` for comprehensive details.

**MSSpectrum** - Individual mass spectrum:
```python
# Create spectrum with metadata
spectrum = oms.MSSpectrum()
spectrum.setRT(205.2)    # Retention time in seconds
spectrum.setMSLevel(2)   # MS2 spectrum

# Set peak data (m/z, intensity arrays)
mz_array = [100.5, 200.3, 300.7, 400.2]
intensity_array = [1000, 5000, 3000, 2000]
spectrum.set_peaks((mz_array, intensity_array))

# Add precursor information for MS2
precursor = oms.Precursor()
precursor.setMZ(450.5)
precursor.setCharge(2)
spectrum.setPrecursors([precursor])
```

**MSExperiment** - Complete LC-MS/MS run:
```python
# Create experiment and add spectra
exp = oms.MSExperiment()
exp.addSpectrum(spectrum)

# Access spectra
first_spectrum = exp.getSpectrum(0)
for spec in exp:
    print(f"RT: {spec.getRT()}, MS Level: {spec.getMSLevel()}")
```

**MSChromatogram** - Extracted ion chromatogram:
```python
# Create chromatogram
chrom = oms.MSChromatogram()
chrom.set_peaks(([10.5, 11.2, 11.8], [1000, 5000, 3000]))  # RT, intensity
exp.addChromatogram(chrom)
```

**Efficient Peak Access:**
```python
# Get peaks as numpy arrays for fast processing
mz_array, intensity_array = spectrum.get_peaks()

# Modify and set back
intensity_array *= 2  # Double all intensities
spectrum.set_peaks((mz_array, intensity_array))
```

3. Chemistry and Peptide Handling
Perform chemical calculations for proteomics and metabolomics. See `references/chemistry.md` for detailed examples.

**Molecular Formulas and Mass Calculations:**
```python
# Create empirical formula
formula = oms.EmpiricalFormula("C6H12O6")  # Glucose
print(f"Monoisotopic mass: {formula.getMonoWeight()}")
print(f"Average mass: {formula.getAverageWeight()}")

# Formula arithmetic
water = oms.EmpiricalFormula("H2O")
dehydrated = formula - water

# Isotope-specific formulas
heavy_carbon = oms.EmpiricalFormula("(13)C6H12O6")
```

**Isotopic Distributions:**
```python
# Generate coarse isotope pattern (unit mass resolution)
coarse_gen = oms.CoarseIsotopePatternGenerator()
pattern = coarse_gen.run(formula)

# Generate fine structure (high resolution)
# The argument is a probability cutoff for isotopologues, not a mass resolution in Da
fine_gen = oms.FineIsotopePatternGenerator(0.01)
fine_pattern = fine_gen.run(formula)
```

**Amino Acids and Residues:**
```python
# Access residue information
res_db = oms.ResidueDB()
leucine = res_db.getResidue("Leucine")
print(f"L monoisotopic mass: {leucine.getMonoWeight()}")
print(f"L formula: {leucine.getFormula()}")
print(f"L pKa: {leucine.getPka()}")
```
**Peptide Sequences:**
```python
# Create peptide sequence
peptide = oms.AASequence.fromString("PEPTIDE")
print(f"Peptide mass: {peptide.getMonoWeight()}")
print(f"Formula: {peptide.getFormula()}")

# Add modifications
modified = oms.AASequence.fromString("PEPTIDEM(Oxidation)")
print(f"Modified mass: {modified.getMonoWeight()}")

# Theoretical fragmentation (singly charged b- and y-ions)
b_type = oms.Residue.ResidueType.BIon
y_type = oms.Residue.ResidueType.YIon
ions = []
for i in range(1, peptide.size()):
    b_ion = peptide.getPrefix(i)
    y_ion = peptide.getSuffix(i)
    # Passing the ion type and charge yields fragment-ion masses; without them
    # the prefix/suffix would be weighed as a complete neutral peptide
    ions.append(('b', i, b_ion.getMonoWeight(b_type, 1)))
    ions.append(('y', i, y_ion.getMonoWeight(y_type, 1)))
```
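The b- and y-ion arithmetic behind the fragmentation step can be sketched without pyOpenMS. This is a minimal illustration, assuming singly charged ions and standard monoisotopic residue masses. `RESIDUE_MASS`, `fragment_mz`, and the constants are names invented for this sketch, and the mass table covers only the residues that occur in PEPTIDE.

```python
# Monoisotopic residue masses in Da (only the residues needed for "PEPTIDE")
RESIDUE_MASS = {
    "P": 97.05276,
    "E": 129.04259,
    "T": 101.04768,
    "I": 113.08406,
    "D": 115.02694,
}
WATER = 18.010565   # H2O, added to y-ions (C-terminal fragments)
PROTON = 1.007276   # charge carrier for singly charged ions


def fragment_mz(sequence):
    """Return (label, m/z) pairs for singly charged b- and y-ions."""
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    ions = []
    for i in range(1, len(masses)):
        b = sum(masses[:i]) + PROTON            # N-terminal fragment
        y = sum(masses[-i:]) + WATER + PROTON   # C-terminal fragment
        ions.append((f"b{i}", round(b, 4)))
        ions.append((f"y{i}", round(y, 4)))
    return ions


fragments = fragment_mz("PEPTIDE")
for label, mz in fragments:
    print(label, mz)
```

A useful sanity check is that each complementary pair (b_i and y_(n-i)) sums to the neutral peptide mass plus two protons.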
**Protein Digestion:**
```python
# Enzymatic digestion
dig = oms.ProteaseDigestion()
dig.setEnzyme("Trypsin")
dig.setMissedCleavages(2)

protein_seq = oms.AASequence.fromString("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEK")
peptides = []
dig.digest(protein_seq, peptides)

for pep in peptides:
    print(f"{pep.toString()}: {pep.getMonoWeight():.2f} Da")
```
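What a tryptic digest with missed cleavages produces can be illustrated in plain Python. This sketch assumes the common simplified trypsin rule (cleave after K or R unless the next residue is P). `tryptic_digest` is a hypothetical helper, not a pyOpenMS function, and the enzyme definition pyOpenMS applies may differ in detail.

```python
import re


def tryptic_digest(protein, missed_cleavages=0):
    """Return peptides from cleaving after K/R (not before P)."""
    # Zero-width split points: preceded by K or R, not followed by P
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]
    peptides = []
    for start in range(len(pieces)):
        # Join 1..(missed_cleavages + 1) consecutive fully cleaved pieces
        for extra in range(missed_cleavages + 1):
            end = start + extra + 1
            if end > len(pieces):
                break
            peptides.append("".join(pieces[start:end]))
    return peptides


# First ten residues of the protein used in the pyOpenMS example
fully_cleaved = tryptic_digest("MKTAYIAKQR")
with_one_missed = tryptic_digest("MKTAYIAKQR", missed_cleavages=1)
print(fully_cleaved)
print(with_one_missed)
```

Each additional allowed missed cleavage adds the peptides that span one more uncleaved site, which is why the peptide list grows quickly with `setMissedCleavages(2)` on a full-length protein.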
**Modifications:**
```python
# Access modification database
mod_db = oms.ModificationsDB()
oxidation = mod_db.getModification("Oxidation")
print(f"Oxidation mass diff: {oxidation.getDiffMonoMass()}")
print(f"Residues: {oxidation.getResidues()}")
```

4. Signal Processing and Filtering
Apply algorithms to process and filter MS data. See `references/algorithms.md` for comprehensive coverage.

**Spectral Smoothing:**
```python
# Gaussian smoothing
gauss_filter = oms.GaussFilter()
params = gauss_filter.getParameters()
params.setValue("gaussian_width", 0.2)
gauss_filter.setParameters(params)
gauss_filter.filterExperiment(exp)

# Savitzky-Golay filter
sg_filter = oms.SavitzkyGolayFilter()
sg_filter.filterExperiment(exp)
```

**Peak Filtering:**
```python
# Keep only N largest peaks per spectrum
n_largest = oms.NLargest()
params = n_largest.getParameters()
params.setValue("n", 100)  # Keep top 100 peaks
n_largest.setParameters(params)
n_largest.filterExperiment(exp)

# Threshold filtering
threshold_filter = oms.ThresholdMower()
params = threshold_filter.getParameters()
params.setValue("threshold", 1000.0)  # Remove peaks below 1000 intensity
threshold_filter.setParameters(params)
threshold_filter.filterExperiment(exp)

# Window-based filtering
window_filter = oms.WindowMower()
params = window_filter.getParameters()
params.setValue("windowsize", 50.0)  # 50 m/z windows
params.setValue("peakcount", 10)     # Keep 10 highest per window
window_filter.setParameters(params)
window_filter.filterExperiment(exp)
```
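To make the effect of the N-largest and threshold filters concrete, here is a rough plain-Python equivalent operating on the peak lists from the MSSpectrum example. The helper names are invented for illustration, and keeping a peak that sits exactly at the threshold is an assumption of this sketch rather than documented pyOpenMS behavior.

```python
def keep_n_largest(mz, intensity, n):
    """Keep the n most intense peaks, returned in ascending m/z order."""
    ranked = sorted(zip(mz, intensity), key=lambda peak: peak[1], reverse=True)[:n]
    kept = sorted(ranked)  # restore m/z ordering
    return [m for m, _ in kept], [i for _, i in kept]


def remove_below_threshold(mz, intensity, threshold):
    """Drop peaks whose intensity is below the threshold."""
    kept = [(m, i) for m, i in zip(mz, intensity) if i >= threshold]
    return [m for m, _ in kept], [i for _, i in kept]


mz_array = [100.5, 200.3, 300.7, 400.2]
intensity_array = [1000, 5000, 3000, 2000]

top2 = keep_n_largest(mz_array, intensity_array, 2)
above = remove_below_threshold(mz_array, intensity_array, 2000)
print(top2)
print(above)
```

The window-based filter follows the same idea as `keep_n_largest`, applied separately within each m/z window.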
**Spectrum Normalization:**
```python
normalizer = oms.Normalizer()
normalizer.filterExperiment(exp)
```

**MS Level Filtering:**
```python
# Keep only MS2 spectra
exp.filterMSLevel(2)

# Filter by retention time range
exp.filterRT(100.0, 500.0)  # Keep RT between 100-500 seconds

# Filter by m/z range
exp.filterMZ(400.0, 1500.0)  # Keep m/z between 400-1500
```

5. Feature Detection and Quantification
Detect and quantify features in LC-MS data:

**Peak Picking (Centroiding):**
```python
# Convert profile data to centroid
picker = oms.PeakPickerHiRes()
params = picker.getParameters()
params.setValue("signal_to_noise", 1.0)
picker.setParameters(params)

exp_centroided = oms.MSExperiment()
picker.pickExperiment(exp, exp_centroided)
```

**Feature Detection:**
```python
# Detect features across LC-MS runs
feature_finder = oms.FeatureFinderMultiplex()
features = oms.FeatureMap()
feature_finder.run(exp, features, params)

print(f"Found {features.size()} features")
for feature in features:
    print(f"m/z: {feature.getMZ():.4f}, RT: {feature.getRT():.2f}, "
          f"Intensity: {feature.getIntensity():.0f}")
```

**Feature Linking (Map Alignment):**
```python
# Link features across multiple samples
feature_grouper = oms.FeatureGroupingAlgorithmQT()
consensus_map = oms.ConsensusMap()

# Provide multiple feature maps from different samples
feature_maps = [features1, features2, features3]
feature_grouper.group(feature_maps, consensus_map)
```

6. Peptide Identification Workflows
Integrate with search engines and process identification results:

**Database Searching:**
```python
# Prepare parameters for search engine
params = oms.Param()
params.setValue("database", "uniprot_human.fasta")
params.setValue("precursor_mass_tolerance", 10.0)  # ppm
params.setValue("fragment_mass_tolerance", 0.5)    # Da
params.setValue("enzyme", "Trypsin")
params.setValue("missed_cleavages", 2)

# Variable modifications
params.setValue("variable_modifications", ["Oxidation (M)", "Phospho (STY)"])

# Fixed modifications
params.setValue("fixed_modifications", ["Carbamidomethyl (C)"])
```

**FDR Control:**
```python
# False discovery rate estimation
fdr = oms.FalseDiscoveryRate()
fdr_threshold = 0.01  # 1% FDR

# Apply to peptide identifications
protein_ids = []
peptide_ids = []
oms.IdXMLFile().load("search_results.idXML", protein_ids, peptide_ids)
fdr.apply(peptide_ids)  # Annotate peptide identifications with FDR-based scores
```
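The idea behind target-decoy FDR estimation can be sketched in plain Python. This is an illustrative version only: it assumes higher scores are better and estimates FDR as decoys divided by targets at each score cutoff, which is not necessarily the exact formula `FalseDiscoveryRate` uses. `q_values` and the example scores are made up for the sketch.

```python
def q_values(psms):
    """psms: list of (score, is_decoy). Returns (score, is_decoy, q_value) sorted by score."""
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Running FDR estimate at each score cutoff
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: smallest FDR at this cutoff or any more permissive one
    qs = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qs.append(running_min)
    qs.reverse()

    return [(score, is_decoy, q) for (score, is_decoy), q in zip(ranked, qs)]


# Hypothetical PSM scores; True marks a decoy hit
example_psms = [(10, False), (9, False), (8, True), (7, False), (6, False), (5, True)]
scored = q_values(example_psms)
accepted = [s for s, is_decoy, q in scored if not is_decoy and q <= 0.25]
print(scored)
print(accepted)
```

Filtering on the q-value rather than the raw running FDR keeps the accepted set monotonic in score: a hit is never rejected while a worse-scoring hit is accepted.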
7. Metabolomics Workflows
Analyze small molecule data:

**Adduct Detection:**
```python
# Common metabolite adducts
adducts = ["[M+H]+", "[M+Na]+", "[M+K]+", "[M-H]-", "[M+Cl]-"]

# Feature annotation with adducts
for feature in features:
    mz = feature.getMZ()
    # Calculate neutral mass for each adduct hypothesis
    for adduct in adducts:
        # Annotation logic
        pass
```
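One way to fill in the neutral-mass step is a lookup of the mass shift each adduct adds to the neutral molecule. The sketch below assumes singly charged ions and uses commonly tabulated monoisotopic shifts (electron mass accounted for). `ADDUCT_SHIFT` and `neutral_mass_hypotheses` are hypothetical names, not pyOpenMS API.

```python
# Mass shift (Da) from neutral M to the observed singly charged ion
ADDUCT_SHIFT = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
    "[M-H]-": -1.007276,
    "[M+Cl]-": 34.969402,
}


def neutral_mass_hypotheses(mz):
    """Return the neutral mass implied by each adduct for an observed m/z (charge 1)."""
    return {adduct: mz - shift for adduct, shift in ADDUCT_SHIFT.items()}


# Example: an ion observed at m/z 203.0526 (the value expected for sodiated glucose)
hypotheses = neutral_mass_hypotheses(203.0526)
for adduct, mass in hypotheses.items():
    print(f"{adduct}: {mass:.4f}")
```

In an annotation loop, each hypothesis would then be compared against candidate formula masses within a ppm tolerance; only the adducts matching the acquisition polarity need to be considered.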
**Isotope Pattern Matching:**
```python
# Compare experimental to theoretical isotope patterns
experimental_pattern = []  # Extract from feature
theoretical = coarse_gen.run(formula)

# Calculate similarity score
# (compare_isotope_patterns is a user-supplied scoring function, not part of pyOpenMS)
similarity = compare_isotope_patterns(experimental_pattern, theoretical)
```
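The `compare_isotope_patterns` call is left undefined in the snippet. One possible choice of score is cosine similarity, sketched here under the assumption that both patterns have already been reduced to plain intensity lists aligned by isotope index (M, M+1, M+2, ...). The numbers in the example are illustrative values, not measured data.

```python
import math


def compare_isotope_patterns(experimental, theoretical):
    """Cosine similarity between two isotope intensity lists (1.0 = same shape)."""
    n = max(len(experimental), len(theoretical))
    # Pad the shorter pattern with zeros so both cover the same isotope indices
    a = list(experimental) + [0.0] * (n - len(experimental))
    b = list(theoretical) + [0.0] * (n - len(theoretical))
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


experimental_pattern = [100.0, 6.9, 1.4]    # illustrative intensities
theoretical_pattern = [0.92, 0.064, 0.013]  # illustrative relative abundances
similarity = compare_isotope_patterns(experimental_pattern, theoretical_pattern)
print(f"Similarity: {similarity:.4f}")
```

Cosine similarity ignores absolute scale, so raw intensities can be compared directly with relative abundances without normalizing first.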
8. Quality Control and Visualization
Monitor data quality and visualize results:

**Basic Statistics:**
```python
# Calculate TIC (Total Ion Current)
tic_values = []
rt_values = []
for spectrum in exp:
    if spectrum.getMSLevel() == 1:
        tic = sum(spectrum.get_peaks()[1])  # Sum intensities
        tic_values.append(tic)
        rt_values.append(spectrum.getRT())

# Base peak chromatogram
bpc_values = []
for spectrum in exp:
    if spectrum.getMSLevel() == 1:
        max_intensity = max(spectrum.get_peaks()[1]) if spectrum.size() > 0 else 0
        bpc_values.append(max_intensity)
```

**Plotting (with pyopenms.plotting or matplotlib):**
```python
import matplotlib.pyplot as plt

# Plot TIC
plt.figure(figsize=(10, 4))
plt.plot(rt_values, tic_values)
plt.xlabel('Retention Time (s)')
plt.ylabel('Total Ion Current')
plt.title('TIC')
plt.show()

# Plot single spectrum
spectrum = exp.getSpectrum(0)
mz, intensity = spectrum.get_peaks()
plt.stem(mz, intensity, basefmt=' ')
plt.xlabel('m/z')
plt.ylabel('Intensity')
plt.title(f'Spectrum at RT {spectrum.getRT():.2f}s')
plt.show()
```

Common Workflows
Complete LC-MS/MS Processing Pipeline
```python
import pyopenms as oms

# 1. Load data
exp = oms.MSExperiment()
oms.MzMLFile().load("raw_data.mzML", exp)

# 2. Filter and smooth
exp.filterMSLevel(1)  # Keep only MS1 for feature detection
gauss = oms.GaussFilter()
gauss.filterExperiment(exp)

# 3. Peak picking
picker = oms.PeakPickerHiRes()
exp_centroid = oms.MSExperiment()
picker.pickExperiment(exp, exp_centroid)

# 4. Feature detection
ff = oms.FeatureFinderMultiplex()
features = oms.FeatureMap()
ff.run(exp_centroid, features, oms.Param())

# 5. Export results
oms.FeatureXMLFile().store("features.featureXML", features)
print(f"Detected {features.size()} features")
```
Theoretical Peptide Mass Calculation
```python
import pyopenms as oms

PROTON_MASS = 1.007276  # Da

# Calculate masses for peptide with modifications
peptide = oms.AASequence.fromString("PEPTIDEK")
print(f"Unmodified [M+H]+: {peptide.getMonoWeight() + PROTON_MASS:.4f}")

# With modification
modified = oms.AASequence.fromString("PEPTIDEM(Oxidation)K")
print(f"Oxidized [M+H]+: {modified.getMonoWeight() + PROTON_MASS:.4f}")

# Calculate for different charge states
for z in [1, 2, 3]:
    mz = (peptide.getMonoWeight() + z * PROTON_MASS) / z
    print(f"[M+{z}H]^{z}+: {mz:.4f}")
```
Installation
Ensure pyOpenMS is installed before using this skill:
```bash
# Via conda (recommended)
conda install -c bioconda pyopenms

# Via pip
pip install pyopenms
```

Integration with Other Tools
pyOpenMS integrates seamlessly with:
- Search Engines: Comet, Mascot, MSGF+, MSFragger, Sage, SpectraST
- Post-processing: Percolator, MSstats, Epifany
- Metabolomics: SIRIUS, CSI:FingerID
- Data Analysis: Pandas, NumPy, SciPy for downstream analysis
- Visualization: Matplotlib, Seaborn for plotting
Resources
references/
Detailed documentation on core concepts:
- data_structures.md - Comprehensive guide to MSExperiment, MSSpectrum, MSChromatogram, and peak data handling
- algorithms.md - Complete reference for signal processing, filtering, feature detection, and quantification algorithms
- chemistry.md - In-depth coverage of chemistry calculations, peptide handling, modifications, and isotope distributions
Load these references when needing detailed information about specific pyOpenMS capabilities.
Best Practices
最佳实践
- File Format: Always use mzML for raw MS data (standardized, well-supported)
- Peak Access: Use `get_peaks()` and `set_peaks()` with numpy arrays for efficient processing
- Parameters: Always check and configure algorithm parameters via `getParameters()` and `setParameters()`
- Memory: For large datasets, process spectra iteratively rather than loading entire experiments
- Validation: Check data integrity (MS levels, RT ordering, precursor information) after loading
- Modifications: Use standard modification names from UniMod database
- Units: RT in seconds, m/z in Thomson (Da/charge), intensity in arbitrary units
Common Patterns
**Algorithm Application Pattern:**
```python
# 1. Instantiate algorithm
algorithm = oms.SomeAlgorithm()

# 2. Get and configure parameters
params = algorithm.getParameters()
params.setValue("parameter_name", value)
algorithm.setParameters(params)

# 3. Apply to data
algorithm.filterExperiment(exp)  # or .process(), .run(), depending on algorithm
```

**File I/O Pattern:**
```python
# Read
data_container = oms.DataContainer()  # MSExperiment, FeatureMap, etc.
oms.FileHandler().load("input.format", data_container)

# Process
# ... manipulate data_container ...

# Write
oms.FileHandler().store("output.format", data_container)
```

Getting Help
- Documentation: https://pyopenms.readthedocs.io/
- API Reference: Browse class documentation for detailed method signatures
- OpenMS Website: https://www.openms.org/
- GitHub Issues: https://github.com/OpenMS/OpenMS/issues