# FlowIO: Flow Cytometry Standard File Handler
## Overview

FlowIO is a lightweight Python library for reading and writing Flow Cytometry Standard (FCS) files. Parse FCS metadata, extract event data, and create new FCS files with minimal dependencies. The library supports FCS versions 2.0, 3.0, and 3.1, making it ideal for backend services, data pipelines, and basic cytometry file operations.
## When to Use This Skill

This skill should be used when:
- Parsing FCS files or extracting their metadata
- Converting flow cytometry data to NumPy arrays
- Exporting event data to FCS format
- Splitting multi-dataset FCS files into separate datasets
- Extracting channel information (scatter, fluorescence, time)
- Validating or inspecting cytometry files
- Pre-processing data ahead of advanced analysis

**Related Tools:** For advanced flow cytometry analysis including compensation, gating, and FlowJo/GatingML support, recommend the FlowKit library as a companion to FlowIO.
## Installation

```bash
uv pip install flowio
```

Requires Python 3.9 or later.
## Quick Start

### Basic File Reading

```python
from flowio import FlowData

# Read FCS file
flow_data = FlowData('experiment.fcs')

# Access basic information
print(f"FCS Version: {flow_data.version}")
print(f"Events: {flow_data.event_count}")
print(f"Channels: {flow_data.pnn_labels}")

# Get event data as NumPy array
events = flow_data.as_array()  # Shape: (events, channels)
```
### Creating FCS Files

```python
import numpy as np
from flowio import create_fcs

# Prepare data
data = np.array([[100, 200, 50], [150, 180, 60]])  # 2 events, 3 channels
channels = ['FSC-A', 'SSC-A', 'FL1-A']

# Create FCS file
create_fcs('output.fcs', data, channels)
```
## Core Workflows
### Reading and Parsing FCS Files

The FlowData class provides the primary interface for reading FCS files.

**Standard Reading:**

```python
from flowio import FlowData

# Basic reading
flow = FlowData('sample.fcs')

# Access attributes
version = flow.version              # '3.0', '3.1', etc.
event_count = flow.event_count      # Number of events
channel_count = flow.channel_count  # Number of channels
pnn_labels = flow.pnn_labels        # Short channel names
pns_labels = flow.pns_labels        # Descriptive stain names

# Get event data
events = flow.as_array()                      # Preprocessed (gain, log scaling applied)
raw_events = flow.as_array(preprocess=False)  # Raw data
```

**Memory-Efficient Metadata Reading:**

When only metadata is needed (no event data):

```python
# Only parse TEXT segment, skip DATA and ANALYSIS
flow = FlowData('sample.fcs', only_text=True)

# Access metadata
metadata = flow.text          # Dictionary of TEXT segment keywords
print(metadata.get('$DATE'))  # Acquisition date
print(metadata.get('$CYT'))   # Instrument name
```

**Handling Problematic Files:**

Some FCS files have offset discrepancies or errors:

```python
# Ignore offset discrepancies between HEADER and TEXT sections
flow = FlowData('problematic.fcs', ignore_offset_discrepancy=True)

# Use HEADER offsets instead of TEXT offsets
flow = FlowData('problematic.fcs', use_header_offsets=True)

# Ignore offset errors entirely
flow = FlowData('problematic.fcs', ignore_offset_error=True)
```

**Excluding Null Channels:**

```python
# Exclude specific channels during parsing
flow = FlowData('sample.fcs', null_channel_list=['Time', 'Null'])
```
### Extracting Metadata and Channel Information

FCS files contain rich metadata in the TEXT segment.

**Common Metadata Keywords:**

```python
flow = FlowData('sample.fcs')

# File-level metadata
text_dict = flow.text
acquisition_date = text_dict.get('$DATE', 'Unknown')
instrument = text_dict.get('$CYT', 'Unknown')
data_type = flow.data_type  # 'I', 'F', 'D', 'A'

# Channel metadata
for i in range(flow.channel_count):
    pnn = flow.pnn_labels[i]  # Short name (e.g., 'FSC-A')
    pns = flow.pns_labels[i]  # Descriptive name (e.g., 'Forward Scatter')
    pnr = flow.pnr_values[i]  # Range/max value
    print(f"Channel {i}: {pnn} ({pns}), Range: {pnr}")
```

**Channel Type Identification:**

FlowIO automatically categorizes channels:

```python
# Get indices by channel type
scatter_idx = flow.scatter_indices  # e.g., [0, 1] for FSC, SSC
fluoro_idx = flow.fluoro_indices    # e.g., [2, 3, 4] for FL channels
time_idx = flow.time_index          # Index of the time channel (or None)

# Access specific channel types
events = flow.as_array()
scatter_data = events[:, scatter_idx]
fluorescence_data = events[:, fluoro_idx]
```

**ANALYSIS Segment:**

If present, access processed results:

```python
if flow.analysis:
    analysis_keywords = flow.analysis  # Dictionary of ANALYSIS keywords
    print(analysis_keywords)
```

### Creating New FCS Files
Generate FCS files from NumPy arrays or other data sources.

**Basic Creation:**

```python
import numpy as np
from flowio import create_fcs

# Create event data (rows=events, columns=channels)
events = np.random.rand(10000, 5) * 1000

# Define channel names
channel_names = ['FSC-A', 'SSC-A', 'FL1-A', 'FL2-A', 'Time']

# Create FCS file
create_fcs('output.fcs', events, channel_names)
```

**With Descriptive Channel Names:**

```python
# Add optional descriptive names (PnS)
channel_names = ['FSC-A', 'SSC-A', 'FL1-A', 'FL2-A', 'Time']
descriptive_names = ['Forward Scatter', 'Side Scatter', 'FITC', 'PE', 'Time']
create_fcs('output.fcs',
           events,
           channel_names,
           opt_channel_names=descriptive_names)
```

**With Custom Metadata:**

```python
# Add TEXT segment metadata
metadata = {
    '$SRC': 'Python script',
    '$DATE': '19-OCT-2025',
    '$CYT': 'Synthetic Instrument',
    '$INST': 'Laboratory A'
}
create_fcs('output.fcs',
           events,
           channel_names,
           opt_channel_names=descriptive_names,
           metadata=metadata)
```

**Note:** FlowIO exports as FCS 3.1 with single-precision floating-point data.

### Exporting Modified Data
Modify existing FCS files and re-export them.

**Approach 1: Using the write_fcs() Method:**

```python
from flowio import FlowData

# Read original file
flow = FlowData('original.fcs')

# Write with updated metadata
flow.write_fcs('modified.fcs', metadata={'$SRC': 'Modified data'})
```

**Approach 2: Extract, Modify, and Recreate:**

For modifying event data:

```python
from flowio import FlowData, create_fcs

# Read and extract data
flow = FlowData('original.fcs')
events = flow.as_array(preprocess=False)

# Modify event data
events[:, 0] = events[:, 0] * 1.5  # Scale first channel

# Create new FCS file with modified data
create_fcs('modified.fcs',
           events,
           flow.pnn_labels,
           opt_channel_names=flow.pns_labels,
           metadata=flow.text)
```

### Handling Multi-Dataset FCS Files
Some FCS files contain multiple datasets in a single file.

**Detecting Multi-Dataset Files:**

```python
from flowio import FlowData, MultipleDataSetsError

try:
    flow = FlowData('sample.fcs')
except MultipleDataSetsError:
    print("File contains multiple datasets")
    # Use read_multiple_data_sets() instead
```

**Reading All Datasets:**

```python
from flowio import read_multiple_data_sets

# Read all datasets from file
datasets = read_multiple_data_sets('multi_dataset.fcs')
print(f"Found {len(datasets)} datasets")

# Process each dataset
for i, dataset in enumerate(datasets):
    print(f"\nDataset {i}:")
    print(f"  Events: {dataset.event_count}")
    print(f"  Channels: {dataset.pnn_labels}")
    # Get event data for this dataset
    events = dataset.as_array()
    print(f"  Shape: {events.shape}")
    print(f"  Mean values: {events.mean(axis=0)}")
```

**Reading a Specific Dataset:**

```python
from flowio import FlowData

# Read first dataset (nextdata_offset=0)
first_dataset = FlowData('multi.fcs', nextdata_offset=0)

# Read second dataset using the NEXTDATA offset from the first
next_offset = int(first_dataset.text['$NEXTDATA'])
if next_offset > 0:
    second_dataset = FlowData('multi.fcs', nextdata_offset=next_offset)
```

### Data Preprocessing
FlowIO applies standard FCS preprocessing transformations when `preprocess=True`.

**Preprocessing Steps:**
- **Gain Scaling:** Scale values by the PnG (gain) keyword
- **Logarithmic Transformation:** Apply the PnE exponential transformation if present
  - Formula: `value = a * 10^(b * raw_value)` where PnE = "a,b"
- **Time Scaling:** Convert time values to appropriate units

**Controlling Preprocessing:**

```python
# Preprocessed data (default)
preprocessed = flow.as_array(preprocess=True)

# Raw data (no transformations)
raw = flow.as_array(preprocess=False)
```
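To make the PnE formula concrete, here is a minimal numeric sketch that applies the transformation by hand. This is an illustration of the formula as stated above, not FlowIO's implementation; the PnE string and the raw channel values are made up.

```python
# Hypothetical PnE keyword value, decomposed as PnE = "a,b".
pne = "4.0,0.1"
a, b = (float(part) for part in pne.split(','))

# Made-up raw channel values.
raw_values = [0.0, 1.0, 2.0]

# Apply value = a * 10^(b * raw_value) to each raw value.
scaled = [a * 10 ** (b * raw) for raw in raw_values]

print(scaled[0])  # a * 10^0 = 4.0
```

With `preprocess=True`, FlowIO performs this kind of scaling for you on every channel that carries a PnE keyword.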
### Error Handling

Handle common FlowIO exceptions appropriately.

```python
from flowio import (
    FlowData,
    FCSParsingError,
    DataOffsetDiscrepancyError,
    MultipleDataSetsError
)

try:
    flow = FlowData('sample.fcs')
    events = flow.as_array()
except FCSParsingError as e:
    print(f"Failed to parse FCS file: {e}")
    # Try with relaxed parsing
    flow = FlowData('sample.fcs', ignore_offset_error=True)
except DataOffsetDiscrepancyError as e:
    print(f"Offset discrepancy detected: {e}")
    # Use the ignore_offset_discrepancy parameter
    flow = FlowData('sample.fcs', ignore_offset_discrepancy=True)
except MultipleDataSetsError as e:
    print(f"Multiple datasets detected: {e}")
    # Use read_multiple_data_sets instead
    from flowio import read_multiple_data_sets
    datasets = read_multiple_data_sets('sample.fcs')
except Exception as e:
    print(f"Unexpected error: {e}")
```

## Common Use Cases
### Inspecting FCS File Contents

Quick exploration of FCS file structure:

```python
from flowio import FlowData

flow = FlowData('unknown.fcs')

print("=" * 50)
print(f"File: {flow.name}")
print(f"Version: {flow.version}")
print(f"Size: {flow.file_size:,} bytes")
print("=" * 50)

print(f"\nEvents: {flow.event_count:,}")
print(f"Channels: {flow.channel_count}")

print("\nChannel Information:")
for i, (pnn, pns) in enumerate(zip(flow.pnn_labels, flow.pns_labels)):
    ch_type = "scatter" if i in flow.scatter_indices else \
              "fluoro" if i in flow.fluoro_indices else \
              "time" if i == flow.time_index else "other"
    print(f"  [{i}] {pnn:10s} | {pns:30s} | {ch_type}")

print("\nKey Metadata:")
for key in ['$DATE', '$BTIM', '$ETIM', '$CYT', '$INST', '$SRC']:
    value = flow.text.get(key, 'N/A')
    print(f"  {key:15s}: {value}")
```

### Batch Processing Multiple Files
Process a directory of FCS files:

```python
from pathlib import Path
from flowio import FlowData
import pandas as pd

# Find all FCS files
fcs_files = list(Path('data/').glob('*.fcs'))

# Extract summary information
summaries = []
for fcs_path in fcs_files:
    try:
        flow = FlowData(str(fcs_path), only_text=True)
        summaries.append({
            'filename': fcs_path.name,
            'version': flow.version,
            'events': flow.event_count,
            'channels': flow.channel_count,
            'date': flow.text.get('$DATE', 'N/A')
        })
    except Exception as e:
        print(f"Error processing {fcs_path.name}: {e}")

# Create summary DataFrame
df = pd.DataFrame(summaries)
print(df)
```

### Converting FCS to CSV
Export event data to CSV format:

```python
from flowio import FlowData
import pandas as pd

# Read FCS file
flow = FlowData('sample.fcs')

# Convert to DataFrame
df = pd.DataFrame(
    flow.as_array(),
    columns=flow.pnn_labels
)

# Add metadata as attributes
df.attrs['fcs_version'] = flow.version
df.attrs['instrument'] = flow.text.get('$CYT', 'Unknown')

# Export to CSV
df.to_csv('output.csv', index=False)
print(f"Exported {len(df)} events to CSV")
```

### Filtering Events and Re-exporting
Apply filters and save filtered data:

```python
from flowio import FlowData, create_fcs
import numpy as np

# Read original file
flow = FlowData('sample.fcs')
events = flow.as_array(preprocess=False)

# Apply filtering (example: threshold on first channel)
fsc_idx = 0
threshold = 500
mask = events[:, fsc_idx] > threshold
filtered_events = events[mask]
print(f"Original events: {len(events)}")
print(f"Filtered events: {len(filtered_events)}")

# Create new FCS file with filtered data
create_fcs('filtered.fcs',
           filtered_events,
           flow.pnn_labels,
           opt_channel_names=flow.pns_labels,
           metadata={**flow.text, '$SRC': 'Filtered data'})
```

### Extracting Specific Channels
Extract and process specific channels:

```python
from flowio import FlowData
import numpy as np

flow = FlowData('sample.fcs')
events = flow.as_array()

# Extract fluorescence channels only
fluoro_indices = flow.fluoro_indices
fluoro_data = events[:, fluoro_indices]
fluoro_names = [flow.pnn_labels[i] for i in fluoro_indices]
print(f"Fluorescence channels: {fluoro_names}")
print(f"Shape: {fluoro_data.shape}")

# Calculate statistics per channel
for i, name in enumerate(fluoro_names):
    channel_data = fluoro_data[:, i]
    print(f"\n{name}:")
    print(f"  Mean: {channel_data.mean():.2f}")
    print(f"  Median: {np.median(channel_data):.2f}")
    print(f"  Std Dev: {channel_data.std():.2f}")
```

## Best Practices
- **Memory Efficiency:** Use `only_text=True` when event data is not needed
- **Error Handling:** Wrap file operations in try-except blocks for robust code
- **Multi-Dataset Detection:** Check for `MultipleDataSetsError` and use the appropriate function
- **Preprocessing Control:** Explicitly set the `preprocess` parameter based on analysis needs
- **Offset Issues:** If parsing fails, try the `ignore_offset_discrepancy=True` parameter
- **Channel Validation:** Verify channel counts and names match expectations before processing
- **Metadata Preservation:** When modifying files, preserve the original TEXT segment keywords
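The error-handling and offset tips above can be combined into one small wrapper. This is a pattern sketch, not part of FlowIO's API: the `loader` parameter stands in for `FlowData` so the fallback logic is demonstrable in isolation; in real use you would pass `flowio.FlowData` as the loader.

```python
def read_with_fallback(path, loader, **kwargs):
    """Try strict parsing first; on failure, retry with relaxed offsets."""
    try:
        return loader(path, **kwargs)
    except Exception:
        # Mirrors the 'Offset Issues' tip: retry ignoring offset discrepancies.
        return loader(path, ignore_offset_discrepancy=True, **kwargs)

# Demonstration with a stand-in loader that fails unless relaxed parsing is on.
def fussy_loader(path, ignore_offset_discrepancy=False, **kwargs):
    if not ignore_offset_discrepancy:
        raise ValueError("offset discrepancy")
    return f"parsed {path}"

result = read_with_fallback('sample.fcs', fussy_loader)
print(result)  # parsed sample.fcs
```

A broad `except Exception` is used here for brevity; in production code, catch the specific exceptions shown in the Error Handling section.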
## Advanced Topics
### Understanding FCS File Structure

FCS files consist of four segments:
- HEADER: FCS version and byte offsets for the other segments
- TEXT: Key-value metadata pairs (delimiter-separated)
- DATA: Raw event data (binary/float/ASCII format)
- ANALYSIS (optional): Results from data processing

Access these segments via FlowData attributes:
- `flow.header` - HEADER segment
- `flow.text` - TEXT segment keywords
- `flow.events` - DATA segment (as bytes)
- `flow.analysis` - ANALYSIS segment keywords (if present)
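To make the HEADER layout concrete, here is a hand-rolled sketch that pulls the version string and TEXT-segment offsets out of a synthetic 58-byte header. The byte positions follow the FCS standard's fixed-width ASCII header fields; the offsets in the synthetic header are made up, and in practice FlowData parses all of this for you.

```python
def parse_fcs_header(header_bytes):
    """Parse version and TEXT-segment offsets from an FCS HEADER segment."""
    version = header_bytes[0:6].decode('ascii')  # e.g. 'FCS3.1'
    text_start = int(header_bytes[10:18])        # ASCII digits, space-padded
    text_end = int(header_bytes[18:26])
    return version, text_start, text_end

# Minimal synthetic header: version + 4 spaces, then 8-byte offset fields
# (TEXT start, TEXT end), padded out to the full 58 bytes.
hdr = b'FCS3.1    ' + b'      64' + b'    1023' + b' ' * 32
print(parse_fcs_header(hdr))  # ('FCS3.1', 64, 1023)
```

This is the information FlowIO uses to locate the TEXT segment before any keywords can be read.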
### Detailed API Reference

For comprehensive API documentation including all parameters, methods, exceptions, and FCS keyword reference, consult the detailed reference file:

Read: `references/api_reference.md`

The reference includes:
- Complete FlowData class documentation
- All utility functions (read_multiple_data_sets, create_fcs)
- Exception classes and handling
- FCS file structure details
- Common TEXT segment keywords
- Extended example workflows

When working with complex FCS operations or encountering unusual file formats, load this reference for detailed guidance.
## Integration Notes

**NumPy Arrays:** All event data is returned as NumPy ndarrays with shape (events, channels)

**Pandas DataFrames:** Easily convert to DataFrames for analysis:

```python
import pandas as pd
df = pd.DataFrame(flow.as_array(), columns=flow.pnn_labels)
```

**FlowKit Integration:** For advanced analysis (compensation, gating, FlowJo support), use the FlowKit library, which builds on FlowIO's parsing capabilities

**Web Applications:** FlowIO's minimal dependencies make it ideal for web backend services processing FCS uploads
## Troubleshooting

**Problem:** "Offset discrepancy error"
**Solution:** Use the `ignore_offset_discrepancy=True` parameter

**Problem:** "Multiple datasets error"
**Solution:** Use the `read_multiple_data_sets()` function instead of the FlowData constructor

**Problem:** Out of memory with large files
**Solution:** Use `only_text=True` for metadata-only operations, or process events in chunks

**Problem:** Unexpected channel counts
**Solution:** Check for null channels; use the `null_channel_list` parameter to exclude them

**Problem:** Cannot modify event data in place
**Solution:** FlowIO doesn't support direct modification; extract the data, modify it, then use `create_fcs()` to save
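The "process events in chunks" advice above can be sketched as a running accumulation over slices of the event array. This is an illustrative pattern, not a FlowIO feature; the array below is a made-up stand-in for data obtained from `flow.as_array()`.

```python
import numpy as np

# Stand-in event data: 6 events, 2 channels (values made up).
events = np.arange(12.0).reshape(6, 2)

# Accumulate per-channel sums chunk by chunk instead of all at once.
chunk_size = 2
totals = np.zeros(events.shape[1])
count = 0
for start in range(0, len(events), chunk_size):
    chunk = events[start:start + chunk_size]
    totals += chunk.sum(axis=0)
    count += len(chunk)

means = totals / count
print(means)  # identical to events.mean(axis=0)
```

The same accumulation works across many files: keep only the running totals and counts per file, never the concatenated event arrays.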
## Summary

FlowIO provides essential FCS file handling capabilities for flow cytometry workflows. Use it for parsing, metadata extraction, and file creation. For simple file operations and data extraction, FlowIO is sufficient. For complex analysis including compensation and gating, integrate with FlowKit or other specialized tools.