pyOpenMS


Overview


pyOpenMS is an open-source Python library for mass spectrometry data analysis in proteomics and metabolomics. Process LC-MS/MS data, perform peptide identification, detect and quantify features, and integrate with common proteomics tools (Comet, Mascot, MSGF+, Percolator, MSstats) using Python bindings to the OpenMS C++ library.

When to Use This Skill


This skill should be used when:
  • Processing mass spectrometry data (mzML, mzXML files)
  • Performing peak picking and feature detection in LC-MS data
  • Conducting peptide and protein identification workflows
  • Quantifying metabolites or proteins
  • Integrating proteomics or metabolomics tools into Python pipelines
  • Working with OpenMS tools and file formats

Core Capabilities


1. File I/O and Data Import/Export


Handle the common mass spectrometry file formats:

**Supported Formats:**
  • mzML/mzXML: Primary raw MS data formats (profile or centroid)
  • FASTA: Protein/peptide sequence databases
  • mzTab: Standardized reporting format for identification and quantification
  • mzIdentML: Peptide and protein identification data
  • TraML: Transition lists for targeted experiments
  • pepXML/protXML: Search engine results

**Reading mzML Files:**
```python
import pyopenms as oms

# Load MS data
exp = oms.MSExperiment()
oms.MzMLFile().load("input_data.mzML", exp)

# Access basic information
print(f"Number of spectra: {exp.getNrSpectra()}")
print(f"Number of chromatograms: {exp.getNrChromatograms()}")
```

**Writing mzML Files:**
```python
# Save processed data
oms.MzMLFile().store("output_data.mzML", exp)
```

**File Encoding:** pyOpenMS automatically handles Base64 encoding, zlib compression, and Numpress compression internally.

2. MS Data Structures and Manipulation


Work with core mass spectrometry data structures. See references/data_structures.md for comprehensive details.

**MSSpectrum** - Individual mass spectrum:
```python
import pyopenms as oms

# Create spectrum with metadata
spectrum = oms.MSSpectrum()
spectrum.setRT(205.2)    # Retention time in seconds
spectrum.setMSLevel(2)   # MS2 spectrum

# Set peak data (m/z, intensity arrays)
mz_array = [100.5, 200.3, 300.7, 400.2]
intensity_array = [1000, 5000, 3000, 2000]
spectrum.set_peaks((mz_array, intensity_array))

# Add precursor information for MS2
precursor = oms.Precursor()
precursor.setMZ(450.5)
precursor.setCharge(2)
spectrum.setPrecursors([precursor])
```

**MSExperiment** - Complete LC-MS/MS run:
```python
# Create experiment and add spectra
exp = oms.MSExperiment()
exp.addSpectrum(spectrum)

# Access spectra
first_spectrum = exp.getSpectrum(0)
for spec in exp:
    print(f"RT: {spec.getRT()}, MS Level: {spec.getMSLevel()}")
```

**MSChromatogram** - Extracted ion chromatogram:
```python
# Create chromatogram
chrom = oms.MSChromatogram()
chrom.set_peaks(([10.5, 11.2, 11.8], [1000, 5000, 3000]))  # RT, intensity
exp.addChromatogram(chrom)
```

**Efficient Peak Access:**
```python
# Get peaks as numpy arrays for fast processing
mz_array, intensity_array = spectrum.get_peaks()

# Modify and set back
intensity_array *= 2  # Double all intensities
spectrum.set_peaks((mz_array, intensity_array))
```

3. Chemistry and Peptide Handling


Perform chemical calculations for proteomics and metabolomics. See references/chemistry.md for detailed examples.

**Molecular Formulas and Mass Calculations:**
```python
import pyopenms as oms

# Create empirical formula
formula = oms.EmpiricalFormula("C6H12O6")  # Glucose
print(f"Monoisotopic mass: {formula.getMonoWeight()}")
print(f"Average mass: {formula.getAverageWeight()}")

# Formula arithmetic
water = oms.EmpiricalFormula("H2O")
dehydrated = formula - water

# Isotope-specific formulas
heavy_carbon = oms.EmpiricalFormula("(13)C6H12O6")
```

**Isotopic Distributions:**
```python
# Generate coarse isotope pattern (unit mass resolution)
coarse_gen = oms.CoarseIsotopePatternGenerator()
pattern = coarse_gen.run(formula)

# Generate fine structure (high resolution)
# The argument is a probability cutoff that stops the calculation,
# not a mass resolution in Da
fine_gen = oms.FineIsotopePatternGenerator(0.01)
fine_pattern = fine_gen.run(formula)
```

**Amino Acids and Residues:**
```python
# Access residue information
res_db = oms.ResidueDB()
leucine = res_db.getResidue("Leucine")
print(f"L monoisotopic mass: {leucine.getMonoWeight()}")
print(f"L formula: {leucine.getFormula().toString()}")  # toString() gives the formula text
print(f"L pKa: {leucine.getPka()}")
```

**Peptide Sequences:**
```python
# Create peptide sequence
peptide = oms.AASequence.fromString("PEPTIDE")
print(f"Peptide mass: {peptide.getMonoWeight()}")
print(f"Formula: {peptide.getFormula().toString()}")

# Add modifications
modified = oms.AASequence.fromString("PEPTIDEM(Oxidation)")
print(f"Modified mass: {modified.getMonoWeight()}")

# Theoretical fragmentation (singly charged b and y ions)
# Passing the ion type and charge returns the fragment ion mass rather than
# the neutral mass of the prefix/suffix peptide.
ions = []
for i in range(1, peptide.size()):
    b_ion = peptide.getPrefix(i)
    y_ion = peptide.getSuffix(i)
    ions.append(('b', i, b_ion.getMonoWeight(oms.Residue.ResidueType.BIon, 1)))
    ions.append(('y', i, y_ion.getMonoWeight(oms.Residue.ResidueType.YIon, 1)))
```
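As a cross-check on the fragment masses, the b and y ion arithmetic can be sketched without pyOpenMS. This is a minimal illustration, assuming singly charged ions, unmodified residues, and the standard monoisotopic residue masses listed in the table; `by_ions`, `RESIDUE_MASS`, `WATER`, and `PROTON` are names chosen here for illustration and are not part of pyOpenMS.

```python
# Monoisotopic residue masses in Da (only the residues needed for "PEPTIDE")
RESIDUE_MASS = {
    "P": 97.05276,
    "E": 129.04259,
    "T": 101.04768,
    "I": 113.08406,
    "D": 115.02694,
}
WATER = 18.010565   # H2O, added to C-terminal (y) fragments
PROTON = 1.007276   # charge carrier for a 1+ ion


def by_ions(sequence):
    """Return (type, index, m/z) tuples for singly charged b and y ions."""
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    ions = []
    for i in range(1, len(masses)):
        b_mz = sum(masses[:i]) + PROTON            # first i residues
        y_mz = sum(masses[-i:]) + WATER + PROTON   # last i residues
        ions.append(("b", i, b_mz))
        ions.append(("y", i, y_mz))
    return ions


for ion_type, index, mz in by_ions("PEPTIDE"):
    print(f"{ion_type}{index}: {mz:.4f}")
```

The values should agree with the library results to within rounding of the residue masses; any larger difference usually points to a modification or a different ion type.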

**Protein Digestion:**
```python
# Enzymatic digestion
dig = oms.ProteaseDigestion()
dig.setEnzyme("Trypsin")
dig.setMissedCleavages(2)

protein_seq = oms.AASequence.fromString("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEK")
peptides = []
dig.digest(protein_seq, peptides)

for pep in peptides:
    print(f"{pep.toString()}: {pep.getMonoWeight():.2f} Da")
```
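To make the cleavage rule and the effect of missed cleavages concrete, here is a small pure-Python sketch. It assumes the usual trypsin rule (cut after K or R unless the next residue is P) and applies no length filtering, so its output may differ from what `ProteaseDigestion` returns; `tryptic_peptides` is a hypothetical helper, not a pyOpenMS function.

```python
import re


def tryptic_peptides(sequence, missed_cleavages=0):
    """Digest a protein string in silico, allowing up to N missed cleavages."""
    # Zero-width split: after K or R, but not when followed by P
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]
    peptides = []
    for start in range(len(pieces)):
        for extra in range(missed_cleavages + 1):
            end = start + extra + 1
            if end <= len(pieces):
                # Joining neighbouring pieces models a skipped cleavage site
                peptides.append("".join(pieces[start:end]))
    return peptides


print(tryptic_peptides("MKTAYIAKQR", missed_cleavages=1))
```

Each additional allowed missed cleavage adds the peptides formed by joining one more adjacent fragment, which is why the peptide count grows quickly with this setting.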

**Modifications:**
```python
# Access modification database
mod_db = oms.ModificationsDB()
oxidation = mod_db.getModification("Oxidation")
print(f"Oxidation mass diff: {oxidation.getDiffMonoMass()}")
# getOrigin() is assumed here to report the target residue; check the
# ResidueModification API of the installed release
print(f"Residue: {oxidation.getOrigin()}")
```

4. Signal Processing and Filtering


Apply algorithms to process and filter MS data. See references/algorithms.md for comprehensive coverage.

**Spectral Smoothing:**
```python
# Gaussian smoothing
gauss_filter = oms.GaussFilter()
params = gauss_filter.getParameters()
params.setValue("gaussian_width", 0.2)
gauss_filter.setParameters(params)
gauss_filter.filterExperiment(exp)

# Savitzky-Golay filter
sg_filter = oms.SavitzkyGolayFilter()
sg_filter.filterExperiment(exp)
```

**Peak Filtering:**
```python
# Keep only N largest peaks per spectrum
# (the spectrum filters are applied to a whole experiment with filterPeakMap())
n_largest = oms.NLargest()
params = n_largest.getParameters()
params.setValue("n", 100)  # Keep top 100 peaks
n_largest.setParameters(params)
n_largest.filterPeakMap(exp)

# Threshold filtering
threshold_filter = oms.ThresholdMower()
params = threshold_filter.getParameters()
params.setValue("threshold", 1000.0)  # Remove peaks below 1000 intensity
threshold_filter.setParameters(params)
threshold_filter.filterPeakMap(exp)

# Window-based filtering
window_filter = oms.WindowMower()
params = window_filter.getParameters()
params.setValue("windowsize", 50.0)  # 50 m/z windows
params.setValue("peakcount", 10)     # Keep 10 highest per window
window_filter.setParameters(params)
window_filter.filterPeakMap(exp)
```
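What the top-N and threshold filters do to a single peak list can be sketched on plain Python lists. This is only an illustration of the idea, not the library implementation; `keep_n_largest` and `drop_below` are hypothetical helper names, and the sample peaks reuse the values from the MSSpectrum example.

```python
def keep_n_largest(mz, intensity, n):
    """Keep the n most intense peaks, preserving their original m/z order."""
    ranked = sorted(range(len(intensity)), key=lambda i: intensity[i], reverse=True)
    keep = sorted(ranked[:n])
    return [mz[i] for i in keep], [intensity[i] for i in keep]


def drop_below(mz, intensity, threshold):
    """Remove peaks whose intensity is below the threshold."""
    kept = [(m, i) for m, i in zip(mz, intensity) if i >= threshold]
    return [m for m, _ in kept], [i for _, i in kept]


mz_values = [100.5, 200.3, 300.7, 400.2]
intensities = [1000, 5000, 3000, 2000]

print(keep_n_largest(mz_values, intensities, 2))
print(drop_below(mz_values, intensities, 2000))
```

Top-N filtering bounds the number of peaks per spectrum regardless of absolute intensity, while thresholding depends on the intensity scale of the instrument, so the two are often combined.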

**Spectrum Normalization:**
```python
normalizer = oms.Normalizer()
normalizer.filterPeakMap(exp)
```

**MS Level Filtering:**
```python
# Keep only MS2 spectra (rebuild the spectrum list from the spectra to keep)
exp.setSpectra([s for s in exp if s.getMSLevel() == 2])

# Filter by retention time range
exp.setSpectra([s for s in exp if 100.0 <= s.getRT() <= 500.0])  # Keep RT between 100-500 seconds

# Filter by m/z range
filtered = []
for s in exp:
    mz, intensity = s.get_peaks()
    keep = (mz >= 400.0) & (mz <= 1500.0)  # Keep m/z between 400-1500
    s.set_peaks((mz[keep], intensity[keep]))
    filtered.append(s)
exp.setSpectra(filtered)
```

5. Feature Detection and Quantification


Detect and quantify features in LC-MS data:

**Peak Picking (Centroiding):**
```python
# Convert profile data to centroid
picker = oms.PeakPickerHiRes()
params = picker.getParameters()
params.setValue("signal_to_noise", 1.0)
picker.setParameters(params)

exp_centroided = oms.MSExperiment()
# Recent releases expect a third boolean argument (check_spectrum_type);
# older releases take only the input and output experiments
picker.pickExperiment(exp, exp_centroided, True)
```

**Feature Detection:**
```python
# Detect features across LC-MS runs
# NOTE: class name and run() signature are as given in the source; feature
# finder classes differ between pyOpenMS releases, so verify them against
# the installed version.
feature_finder = oms.FeatureFinderMultiplex()
ff_params = feature_finder.getParameters()  # start from the algorithm's defaults

features = oms.FeatureMap()
feature_finder.run(exp_centroided, features, ff_params)

print(f"Found {features.size()} features")
for feature in features:
    print(f"m/z: {feature.getMZ():.4f}, RT: {feature.getRT():.2f}, "
          f"Intensity: {feature.getIntensity():.0f}")
```

**Feature Linking (Map Alignment):**
```python
# Link features across multiple samples
feature_grouper = oms.FeatureGroupingAlgorithmQT()
consensus_map = oms.ConsensusMap()

# Provide multiple feature maps from different samples
# (features1..features3 are placeholders for one FeatureMap per sample)
feature_maps = [features1, features2, features3]
feature_grouper.group(feature_maps, consensus_map)
```

6. Peptide Identification Workflows


Integrate with search engines and process identification results:

**Database Searching:**
```python
# Prepare parameters for search engine
params = oms.Param()
params.setValue("database", "uniprot_human.fasta")
params.setValue("precursor_mass_tolerance", 10.0)  # ppm
params.setValue("fragment_mass_tolerance", 0.5)    # Da
params.setValue("enzyme", "Trypsin")
params.setValue("missed_cleavages", 2)

# Variable modifications
params.setValue("variable_modifications", ["Oxidation (M)", "Phospho (STY)"])

# Fixed modifications
params.setValue("fixed_modifications", ["Carbamidomethyl (C)"])
```

**FDR Control:**
```python
# False discovery rate estimation
fdr = oms.FalseDiscoveryRate()
fdr_threshold = 0.01  # 1% FDR, used when filtering the scored hits afterwards

# Apply to peptide identifications
protein_ids = []
peptide_ids = []
oms.IdXMLFile().load("search_results.idXML", protein_ids, peptide_ids)

# Score the peptide identifications; the search must have included decoys
fdr.apply(peptide_ids)
```
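The idea behind target-decoy FDR estimation can be sketched in plain Python. This is a generic illustration that assumes a higher score is better and estimates FDR as decoys divided by targets at each score cutoff; it is not necessarily the calculation `FalseDiscoveryRate` performs internally, and `q_values` and the sample scores are made up for the example.

```python
def q_values(psms):
    """psms: list of (score, is_decoy). Returns one q-value per PSM, in input order."""
    order = sorted(range(len(psms)), key=lambda i: psms[i][0], reverse=True)

    # FDR estimate at each rank when accepting everything down to that rank
    fdr_at_rank = []
    targets = decoys = 0
    for i in order:
        if psms[i][1]:
            decoys += 1
        else:
            targets += 1
        fdr_at_rank.append(decoys / max(targets, 1))

    # q-value: the lowest FDR at which this PSM would still be accepted
    result = [0.0] * len(psms)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdr_at_rank[rank])
        result[order[rank]] = running_min
    return result


psms = [(9.1, False), (8.7, False), (7.9, True), (7.5, False), (6.2, True)]
qs = q_values(psms)
accepted = [p for p, q in zip(psms, qs) if not p[1] and q <= 0.01]
print(qs)
print(accepted)
```

Taking the running minimum makes q-values monotonic in score, so filtering at a 1% threshold never rejects a hit that scores better than an accepted one.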

7. Metabolomics Workflows


Analyze small molecule data:

**Adduct Detection:**
```python
# Common metabolite adducts
adducts = ["[M+H]+", "[M+Na]+", "[M+K]+", "[M-H]-", "[M+Cl]-"]

# Feature annotation with adducts
for feature in features:
    mz = feature.getMZ()
    # Calculate neutral mass for each adduct hypothesis
    for adduct in adducts:
        # Annotation logic
        pass
```
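One way to fill in the annotation step is to subtract each adduct's mass shift from the observed m/z. The sketch below assumes singly charged adducts and uses commonly quoted monoisotopic shifts (electron mass already accounted for); `ADDUCT_SHIFT` and `neutral_mass_hypotheses` are illustrative names, not pyOpenMS objects.

```python
# Mass added to the neutral molecule M by each singly charged adduct, in Da
ADDUCT_SHIFT = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
    "[M-H]-": -1.007276,
    "[M+Cl]-": 34.969402,
}


def neutral_mass_hypotheses(mz):
    """Return the neutral mass implied by each adduct for an observed m/z (|z| = 1)."""
    return {adduct: mz - shift for adduct, shift in ADDUCT_SHIFT.items()}


# Example: an ion observed at the m/z expected for protonated glucose
for adduct, mass in neutral_mass_hypotheses(181.070665).items():
    print(f"{adduct}: neutral mass {mass:.4f}")
```

In practice only the adducts matching the acquisition polarity are tested, and each candidate neutral mass is then matched against a compound database within a ppm tolerance.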

**Isotope Pattern Matching:**
```python
# Compare experimental to theoretical isotope patterns
experimental_pattern = []  # Extract from feature
theoretical = coarse_gen.run(formula)

# Calculate similarity score
# (compare_isotope_patterns is a placeholder for a user-defined scoring function)
similarity = compare_isotope_patterns(experimental_pattern, theoretical)
```
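The scoring helper is left undefined above. One simple choice is cosine similarity between the two intensity vectors, sketched here on plain lists. It assumes both patterns have already been aligned peak by peak (monoisotopic peak first) and have equal length; `cosine_similarity` and the sample intensities are hypothetical.

```python
import math


def cosine_similarity(observed, expected):
    """Cosine of the angle between two aligned intensity vectors (0 to 1 for non-negative data)."""
    if len(observed) != len(expected):
        raise ValueError("patterns must have the same number of peaks")
    dot = sum(o * e for o, e in zip(observed, expected))
    norm = math.sqrt(sum(o * o for o in observed)) * math.sqrt(sum(e * e for e in expected))
    return dot / norm if norm > 0 else 0.0


observed = [100.0, 6.8, 1.4]       # made-up measured intensities
expected = [0.923, 0.064, 0.013]   # made-up theoretical abundances
print(f"similarity: {cosine_similarity(observed, expected):.4f}")
```

Cosine similarity ignores overall scale, so measured intensities and theoretical relative abundances can be compared without normalizing either one first.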

8. Quality Control and Visualization


Monitor data quality and visualize results:

**Basic Statistics:**
```python
# Calculate TIC (Total Ion Current)
tic_values = []
rt_values = []
for spectrum in exp:
    if spectrum.getMSLevel() == 1:
        tic = sum(spectrum.get_peaks()[1])  # Sum intensities
        tic_values.append(tic)
        rt_values.append(spectrum.getRT())

# Base peak chromatogram
bpc_values = []
for spectrum in exp:
    if spectrum.getMSLevel() == 1:
        max_intensity = max(spectrum.get_peaks()[1]) if spectrum.size() > 0 else 0
        bpc_values.append(max_intensity)
```

**Plotting (with pyopenms.plotting or matplotlib):**
```python
import matplotlib.pyplot as plt

# Plot TIC
plt.figure(figsize=(10, 4))
plt.plot(rt_values, tic_values)
plt.xlabel('Retention Time (s)')
plt.ylabel('Total Ion Current')
plt.title('TIC')
plt.show()

# Plot single spectrum
spectrum = exp.getSpectrum(0)
mz, intensity = spectrum.get_peaks()
plt.stem(mz, intensity, basefmt=' ')
plt.xlabel('m/z')
plt.ylabel('Intensity')
plt.title(f'Spectrum at RT {spectrum.getRT():.2f}s')
plt.show()
```

Common Workflows


Complete LC-MS/MS Processing Pipeline


```python
import pyopenms as oms

# 1. Load data
exp = oms.MSExperiment()
oms.MzMLFile().load("raw_data.mzML", exp)

# 2. Filter and smooth
exp.setSpectra([s for s in exp if s.getMSLevel() == 1])  # Keep only MS1 for feature detection
gauss = oms.GaussFilter()
gauss.filterExperiment(exp)

# 3. Peak picking
picker = oms.PeakPickerHiRes()
exp_centroid = oms.MSExperiment()
picker.pickExperiment(exp, exp_centroid, True)  # older releases take only two arguments

# 4. Feature detection
# NOTE: feature finder class and run() signature are as given in the source;
# verify them against the installed pyOpenMS release
ff = oms.FeatureFinderMultiplex()
features = oms.FeatureMap()
ff.run(exp_centroid, features, oms.Param())

# 5. Export results
oms.FeatureXMLFile().store("features.featureXML", features)
print(f"Detected {features.size()} features")
```

Theoretical Peptide Mass Calculation


```python
import pyopenms as oms

PROTON = 1.007276  # proton mass in Da

# Calculate masses for peptide with modifications
peptide = oms.AASequence.fromString("PEPTIDEK")
print(f"Unmodified [M+H]+: {peptide.getMonoWeight() + PROTON:.4f}")

# With modification
modified = oms.AASequence.fromString("PEPTIDEM(Oxidation)K")
print(f"Oxidized [M+H]+: {modified.getMonoWeight() + PROTON:.4f}")

# Calculate for different charge states
for z in [1, 2, 3]:
    mz = (peptide.getMonoWeight() + z * PROTON) / z
    print(f"[M+{z}H]^{z}+: {mz:.4f}")
```
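The charge-state arithmetic used above, m/z = (M + z × 1.007276) / z, can be wrapped in a pair of small helpers that also go the other way, from an observed m/z back to the neutral mass. This is a minimal sketch for protonated positive ions only; `mz_from_mass` and `mass_from_mz` are names chosen for illustration.

```python
PROTON = 1.007276  # proton mass in Da


def mz_from_mass(neutral_mass, charge):
    """m/z of an [M+zH]z+ ion for a given neutral monoisotopic mass."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON) / charge


def mass_from_mz(mz, charge):
    """Neutral monoisotopic mass implied by an observed m/z and charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON)


for z in (1, 2, 3):
    print(f"z={z}: m/z {mz_from_mass(1000.0, z):.4f}")
```

Because the two functions are exact inverses, converting an observed precursor m/z to a neutral mass and back is a quick sanity check on the assigned charge state.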

Installation


Ensure pyOpenMS is installed before using this skill:
```bash
# Via conda (recommended)
conda install -c bioconda pyopenms

# Via pip
pip install pyopenms
```

Integration with Other Tools


pyOpenMS can be combined with:
  • Search Engines: Comet, Mascot, MSGF+, MSFragger, Sage, SpectraST
  • Post-processing: Percolator, MSstats, Epifany
  • Metabolomics: SIRIUS, CSI:FingerID
  • Data Analysis: Pandas, NumPy, SciPy for downstream analysis
  • Visualization: Matplotlib, Seaborn for plotting

Resources


references/


Detailed documentation on core concepts:
  • data_structures.md - Comprehensive guide to MSExperiment, MSSpectrum, MSChromatogram, and peak data handling
  • algorithms.md - Complete reference for signal processing, filtering, feature detection, and quantification algorithms
  • chemistry.md - In-depth coverage of chemistry calculations, peptide handling, modifications, and isotope distributions
Load these references when needing detailed information about specific pyOpenMS capabilities.

Best Practices


  1. File Format: Always use mzML for raw MS data (standardized, well-supported)
  2. Peak Access: Use get_peaks() and set_peaks() with numpy arrays for efficient processing
  3. Parameters: Always check and configure algorithm parameters via getParameters() and setParameters()
  4. Memory: For large datasets, process spectra iteratively rather than loading entire experiments
  5. Validation: Check data integrity (MS levels, RT ordering, precursor information) after loading
  6. Modifications: Use standard modification names from UniMod database
  7. Units: RT in seconds, m/z in Thomson (Da/charge), intensity in arbitrary units
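The validation advice in item 5 can be sketched as a small check that runs right after loading. It relies only on `getRT()`, `getMSLevel()`, and `getPrecursors()`, so it should accept an `MSExperiment` or any iterable of spectrum-like objects; `validate_spectra` is a hypothetical helper, and the three checks are examples rather than a complete integrity test.

```python
def validate_spectra(spectra):
    """Return a list of human-readable problems found in a sequence of spectra."""
    problems = []
    previous_rt = None
    for index, spec in enumerate(spectra):
        rt = spec.getRT()
        level = spec.getMSLevel()
        if level < 1:
            problems.append(f"spectrum {index}: invalid MS level {level}")
        if previous_rt is not None and rt < previous_rt:
            problems.append(f"spectrum {index}: RT {rt} is earlier than the previous spectrum")
        if level >= 2 and len(spec.getPrecursors()) == 0:
            problems.append(f"spectrum {index}: MS{level} spectrum has no precursor")
        previous_rt = rt
    return problems
```

A typical call would be `problems = validate_spectra(exp)` after loading; an empty list means none of the three checks failed.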

Common Patterns


**Algorithm Application Pattern:**
```python
# 1. Instantiate algorithm (SomeAlgorithm is a placeholder name)
algorithm = oms.SomeAlgorithm()

# 2. Get and configure parameters
params = algorithm.getParameters()
params.setValue("parameter_name", value)
algorithm.setParameters(params)

# 3. Apply to data
algorithm.filterExperiment(exp)  # or .filterPeakMap(), .run(), etc., depending on algorithm
```

**File I/O Pattern:**
```python
# Read
# Pair each container with its file class, e.g. MSExperiment with MzMLFile
# or FeatureMap with FeatureXMLFile
data_container = oms.MSExperiment()
oms.MzMLFile().load("input.mzML", data_container)

# Process
# ... manipulate data_container ...

# Write
oms.MzMLFile().store("output.mzML", data_container)
```

Getting Help
