tooluniverse-image-analysis

Microscopy Image Analysis and Quantitative Imaging Data


Production-ready skill for analyzing microscopy-derived measurement data using pandas, numpy, scipy, statsmodels, and scikit-image. Designed for BixBench imaging questions covering colony morphometry, cell counting, fluorescence quantification, regression modeling, and statistical comparisons.

IMPORTANT: This skill handles complex multi-workflow analysis. Most implementation details have been moved to `references/` for progressive disclosure. This document focuses on high-level decision-making and workflow orchestration.


When to Use This Skill


Apply when users:

Have microscopy measurement data (area, circularity, intensity, cell counts) in CSV/TSV
Ask about colony morphometry (bacterial swarming, biofilm, growth assays)
Need statistical comparisons of imaging measurements (t-test, ANOVA, Dunnett's, Mann-Whitney)
Ask about cell counting statistics (NeuN, DAPI, marker counts)
Need effect size calculations (Cohen's d) and power analysis
Want regression models (polynomial, spline) fitted to dose-response or ratio data
Ask about model comparison (R-squared, F-statistic, AIC/BIC)
Need Shapiro-Wilk normality testing on imaging data
Want confidence intervals for peak predictions from fitted models
Questions mention imaging software output (ImageJ, CellProfiler, QuPath)
Need fluorescence intensity quantification or colocalization analysis
Ask about image segmentation results (counts, areas, shapes)

BixBench Coverage: 21 questions across 4 projects (bix-18, bix-19, bix-41, bix-54)

NOT for (use other skills instead):

Phylogenetic analysis → Use `tooluniverse-phylogenetics`
RNA-seq differential expression → Use `tooluniverse-rnaseq-deseq2`
Single-cell scRNA-seq → Use `tooluniverse-single-cell`
Statistical regression only (no imaging context) → Use `tooluniverse-statistical-modeling`


Core Principles


Data-first approach - Load and inspect all CSV/TSV measurement data before analysis
Question-driven - Parse the exact statistic, comparison, or model requested
Statistical rigor - Proper effect sizes, multiple comparison corrections, model selection
Imaging-aware - Understand ImageJ/CellProfiler measurement columns (Area, Circularity, Round, Intensity)
Workflow flexibility - Support both pre-quantified data (CSV) and raw image processing
Precision - Match expected answer format (integer, range, decimal places)
Reproducible - Use standard Python/scipy equivalents to R functions


Required Python Packages

```python
# Core (MUST be installed)
import pandas as pd
import numpy as np
from scipy import stats
from scipy.interpolate import BSpline, make_interp_spline
import statsmodels.api as sm
from statsmodels.formula.api import ols
from statsmodels.stats.power import TTestIndPower
from patsy import dmatrix, bs, cr

# Optional (for raw image processing)
import skimage
import cv2
import tifffile
```

**Installation**:
```bash
pip install pandas numpy scipy statsmodels patsy scikit-image opencv-python-headless tifffile
```

High-Level Workflow Decision Tree


START: User question about microscopy data
│
├─ Q1: What type of data is available?
│  │
│  ├─ PRE-QUANTIFIED DATA (CSV/TSV with measurements)
│  │  └─ Workflow: Load → Parse question → Statistical analysis
│  │     Pattern: Most common BixBench pattern (bix-18, bix-19, bix-41, bix-54)
│  │     See: Section "Quantitative Data Analysis" below
│  │
│  └─ RAW IMAGES (TIFF, PNG, multi-channel)
│     └─ Workflow: Load → Segment → Measure → Analyze
│        See: references/image_processing.md
│
├─ Q2: What type of analysis is needed?
│  │
│  ├─ STATISTICAL COMPARISON
│  │  ├─ Two groups → t-test or Mann-Whitney
│  │  ├─ Multiple groups → ANOVA or Dunnett's test
│  │  ├─ Two factors → Two-way ANOVA
│  │  └─ Effect size → Cohen's d, power analysis
│  │  See: references/statistical_analysis.md
│  │
│  ├─ REGRESSION MODELING
│  │  ├─ Dose-response → Polynomial (quadratic, cubic)
│  │  ├─ Ratio optimization → Natural spline
│  │  └─ Model comparison → R-squared, F-statistic, AIC/BIC
│  │  See: references/statistical_analysis.md
│  │
│  ├─ CELL COUNTING
│  │  ├─ Fluorescence (DAPI, NeuN) → Threshold + watershed
│  │  ├─ Brightfield → Adaptive threshold
│  │  └─ High-density → CellPose or StarDist (external)
│  │  See: references/cell_counting.md
│  │
│  ├─ COLONY SEGMENTATION
│  │  ├─ Swarming assays → Otsu threshold + morphology
│  │  ├─ Biofilms → Li threshold + fill holes
│  │  └─ Growth assays → Time-lapse tracking
│  │  See: references/segmentation.md
│  │
│  └─ FLUORESCENCE QUANTIFICATION
│     ├─ Intensity measurement → regionprops
│     ├─ Colocalization → Pearson/Manders
│     └─ Multi-channel → Channel-wise quantification
│     See: references/fluorescence_analysis.md
│
└─ Q3: When to use scikit-image vs OpenCV?
   ├─ scikit-image: Scientific analysis, measurements, regionprops
   ├─ OpenCV: Fast processing, real-time, large batches
   └─ Both: Often interchangeable for basic operations
   See: references/image_processing.md "Library Selection Guide"


Quantitative Data Analysis Workflow

Phase 0: Question Parsing and Data Discovery

CRITICAL FIRST STEP: Before writing ANY code, identify what data files are available and what the question is asking for.

```python
import glob
import os

import pandas as pd

# Discover data files (the '**' pattern with recursive=True searches subdirectories)
data_dir = "."
csv_files = glob.glob(os.path.join(data_dir, '**', '*.csv'), recursive=True)
tsv_files = glob.glob(os.path.join(data_dir, '**', '*.tsv'), recursive=True)
img_files = glob.glob(os.path.join(data_dir, '**', '*.tif*'), recursive=True)

# Load and inspect first measurement file
if csv_files:
    df = pd.read_csv(csv_files[0])
    print(f"Shape: {df.shape}")
    print(f"Columns: {list(df.columns)}")
    print(df.head())
    print(df.describe())
```


**Common Column Names**:
- Area: Colony or cell area in pixels or calibrated units
- Circularity: 4*pi*area/perimeter^2, range [0,1], 1.0 = perfect circle
- Round: Roundness = 4*area/(pi*major_axis^2)
- Genotype/Strain: Biological grouping variable
- Ratio: Co-culture mixing ratio (e.g., "1:3", "5:1")
- NeuN/DAPI/GFP: Cell marker counts or intensities
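
The two shape descriptors in the list above can be checked by hand. The following is a minimal sketch, assuming hypothetical inputs `area`, `perimeter`, and `major_axis` in consistent units; the function names are illustrative and are not part of ImageJ or of this skill's scripts.

```python
import math

def circularity(area, perimeter):
    """4*pi*area / perimeter^2; equals 1.0 for a perfect circle."""
    return 4 * math.pi * area / perimeter ** 2

def roundness(area, major_axis):
    """4*area / (pi*major_axis^2); equals 1.0 for a perfect circle."""
    return 4 * area / (math.pi * major_axis ** 2)

# Circle of radius 10: both descriptors should come out at 1.
r = 10.0
circle_circ = circularity(math.pi * r ** 2, 2 * math.pi * r)
circle_round = roundness(math.pi * r ** 2, 2 * r)

# 20 x 5 rectangle: an elongated outline gives a lower circularity.
rect_circ = circularity(20 * 5, 2 * (20 + 5))

print(circle_circ, circle_round, rect_circ)
```

Both values are unitless, so they are unaffected by whether the area is reported in pixels or calibrated units, as long as area and length measurements use the same scale.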


Phase 1: Grouped Statistics


```python
import numpy as np

def grouped_summary(df, group_cols, measure_col):
    """Calculate summary statistics by group."""
    summary = df.groupby(group_cols)[measure_col].agg(
        Mean='mean',
        SD='std',
        Median='median',
        Min='min',
        Max='max',
        N='count'
    ).reset_index()
    # Standard error of the mean from the sample SD and group size
    summary['SEM'] = summary['SD'] / np.sqrt(summary['N'])
    return summary

# Example: Colony morphometry by genotype
area_summary = grouped_summary(df, 'Genotype', 'Area')
circ_summary = grouped_summary(df, 'Genotype', 'Circularity')
```

For detailed statistical functions, see: **references/statistical_analysis.md**

Phase 2: Statistical Testing


Decision guide:

Normality test needed? → Shapiro-Wilk
Two groups comparison? → t-test or Mann-Whitney
Multiple groups vs control? → Dunnett's test
Multiple groups, all comparisons? → Tukey HSD
Two factors? → Two-way ANOVA
Effect size? → Cohen's d
Sample size planning? → Power analysis

See: references/statistical_analysis.md for complete implementations
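
One way the first two rows of the decision guide might be combined is sketched below. It assumes a 0.05 normality cutoff and uses Welch's t-test for the parametric branch; both choices, the helper name `compare_two_groups`, and the synthetic data are assumptions for illustration, not the implementation in `references/statistical_analysis.md`.

```python
import numpy as np
from scipy import stats

def compare_two_groups(a, b, alpha=0.05):
    """Shapiro-Wilk on each group, then Welch t-test or Mann-Whitney U."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    # Treat the data as normal only if neither group rejects normality.
    normal = stats.shapiro(a).pvalue > alpha and stats.shapiro(b).pvalue > alpha
    if normal:
        res = stats.ttest_ind(a, b, equal_var=False)
        test = "Welch t-test"
    else:
        res = stats.mannwhitneyu(a, b, alternative="two-sided")
        test = "Mann-Whitney U"
    return {"test": test, "statistic": float(res.statistic), "p_value": float(res.pvalue)}

# Synthetic example data (not from BixBench): two groups about 2 SD apart.
rng = np.random.default_rng(0)
ctrl = rng.normal(100, 10, size=30)
treated = rng.normal(120, 10, size=30)
out = compare_two_groups(ctrl, treated)
print(out["test"], out["p_value"])
```

If a question names a specific test, use that test directly rather than letting a normality check pick one.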


Phase 3: Regression Modeling


When to use each model:

Polynomial (quadratic/cubic): Smooth dose-response, clear peak
Natural spline: Flexible, non-parametric, handles complex patterns
Linear: Simple relationships, checking for trends

Model comparison metrics:

R-squared: Overall fit (higher = better, but it never decreases when terms are added)
Adjusted R-squared: Penalizes complexity
F-statistic p-value: Overall model significance
AIC/BIC: Compare models, including non-nested ones (lower = better)

See: references/statistical_analysis.md for complete implementations
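
To make the comparison metrics concrete, here is a minimal NumPy-only sketch that fits linear, quadratic, and cubic polynomials and reports R-squared, adjusted R-squared, and AIC. The helper `fit_polynomial` and the synthetic dose-response data are assumptions; the AIC uses the Gaussian log-likelihood, which should agree with what an ordinary least squares fit in statsmodels reports, though that has not been checked against the reference implementation.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; returns R^2, adjusted R^2, and AIC."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n, k = len(y), degree + 1              # k = number of fitted coefficients
    coefs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coefs, x)
    rss = float(np.sum(resid ** 2))
    tss = float(np.sum((y - y.mean()) ** 2))
    r2 = 1 - rss / tss
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)
    # Gaussian log-likelihood evaluated at the MLE of the error variance (rss / n)
    llf = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    aic = 2 * k - 2 * llf
    return {"degree": degree, "r2": r2, "adj_r2": adj_r2, "aic": aic}

# Synthetic data (not from BixBench): a noisy quadratic with its peak at x = 5.
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 40)
y = 5 + 4 * x - 0.4 * x ** 2 + rng.normal(0, 1, size=x.size)

fits = {d: fit_polynomial(x, y, d) for d in (1, 2, 3)}
for d, f in fits.items():
    print(d, round(f["r2"], 3), round(f["adj_r2"], 3), round(f["aic"], 1))
```

Because plain R-squared cannot decrease as the degree grows, adjusted R-squared or AIC is the safer basis for choosing between the quadratic and cubic fits.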


Raw Image Processing Workflow

When Processing Raw Images

Workflow: Load → Preprocess → Segment → Measure → Export

```python
# Quick start for cell counting
from scripts.segment_cells import count_cells_in_image

result = count_cells_in_image(
    image_path="cells.tif",
    channel=0,  # DAPI channel
    min_area=50
)
print(f"Found {result['count']} cells")
```

Segmentation Method Selection


Decision guide:

Cell Type	Density	Best Method	Notes
Nuclei (DAPI)	Low-Medium	Otsu + watershed	Standard approach
Nuclei (DAPI)	High	CellPose/StarDist	Handles touching
Colonies	Well-separated	Otsu threshold	Fast, reliable
Colonies	Touching	Watershed	Edge detection
Cells (phase)	Any	Adaptive threshold	Handles uneven illumination
Fluorescence	Low signal	Li threshold	More sensitive

See: references/segmentation.md and references/cell_counting.md for detailed protocols
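
For the "well-separated colonies" row, the Otsu route can be sketched without scikit-image. The block below is an illustration only: `otsu_threshold` is a hand-written NumPy stand-in for `skimage.filters.threshold_otsu`, `count_objects` is a hypothetical helper, the image is synthetic, and no watershed step is included, so touching objects would be merged.

```python
import numpy as np
from scipy import ndimage

def otsu_threshold(image, nbins=256):
    """Histogram threshold that maximizes the between-class variance."""
    counts, edges = np.histogram(image.ravel(), bins=nbins)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(counts)                      # pixels in bins up to i
    w1 = np.cumsum(counts[::-1])[::-1]          # pixels in bins from i upward
    m0 = np.cumsum(counts * centers) / np.maximum(w0, 1)
    m1 = (np.cumsum((counts * centers)[::-1]) / np.maximum(w1[::-1], 1))[::-1]
    between = w0[:-1] * w1[1:] * (m0[:-1] - m1[1:]) ** 2
    return centers[np.argmax(between)]

def count_objects(image, min_area=50):
    """Threshold, label connected components, and drop small specks."""
    mask = image > otsu_threshold(image)
    labels, n = ndimage.label(mask)
    areas = ndimage.sum(mask, labels, index=np.arange(1, n + 1))
    return int(np.sum(areas >= min_area))

# Synthetic image: dim background, two bright disks, one 2x2 debris speck.
img = np.full((100, 100), 0.1)
yy, xx = np.ogrid[:100, :100]
img[(yy - 30) ** 2 + (xx - 30) ** 2 <= 10 ** 2] = 0.9
img[(yy - 70) ** 2 + (xx - 70) ** 2 <= 10 ** 2] = 0.9
img[5:7, 5:7] = 0.9

n_colonies = count_objects(img, min_area=50)
print(n_colonies)
```

The `min_area` filter plays the same role as the `min_area=50` argument in the quick-start call: it keeps debris from being counted as objects.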


Library Selection: scikit-image vs OpenCV


Use scikit-image when:

Scientific measurements needed (area, perimeter, intensity)
regionprops for object properties
Publication-quality analysis
Easier syntax for scientists

Use OpenCV when:

Processing large image batches
Speed is critical
Real-time processing
Advanced computer vision features

Both work for:

Thresholding, filtering, morphological operations
Basic image transformations
Most segmentation tasks

See: references/image_processing.md "Library Selection Guide"


Common BixBench Patterns

Pattern 1: Colony Morphometry (bix-18)

Question type: "Mean circularity of genotype with largest area?"

Data: CSV with Genotype, Area, Circularity columns

Workflow:

Load CSV → group by Genotype
Calculate mean Area per genotype
Identify genotype with max mean Area
Report mean Circularity for that genotype

See: references/segmentation.md "Colony Morphometry Analysis"
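
The four workflow steps above can be sketched in a few lines of pandas. The data frame here is invented for illustration and is not BixBench data; only the column names follow the pattern description.

```python
import pandas as pd

# Illustrative measurements (made up): two colonies per genotype.
df = pd.DataFrame({
    "Genotype": ["WT", "WT", "mutA", "mutA", "mutB", "mutB"],
    "Area": [1000, 1200, 3000, 3400, 800, 900],
    "Circularity": [0.90, 0.88, 0.40, 0.50, 0.95, 0.93],
})

# Per-genotype means of both measurements
means = df.groupby("Genotype")[["Area", "Circularity"]].mean()

# Genotype with the largest mean area, then its mean circularity
top_genotype = means["Area"].idxmax()
answer = means.loc[top_genotype, "Circularity"]
print(top_genotype, round(answer, 3))
```

Selecting on the mean area per genotype, rather than on the single largest colony, matches step 3 of the workflow; check the question wording in case it asks for the maximum instead.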


Pattern 2: Cell Counting Statistics (bix-19)


Question type: "Cohen's d for NeuN counts between conditions?"

Data: CSV with Condition, NeuN_count, Sex, Hemisphere columns

Workflow:

Load CSV → filter by hemisphere/sex if needed
Split by Condition (KD vs CTRL)
Calculate Cohen's d with pooled SD
Report effect size

See: references/statistical_analysis.md "Effect Size Calculations"
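
A minimal sketch of Cohen's d with a pooled standard deviation follows. The helper name `cohens_d` and the counts are assumptions for illustration; the sign depends on which group is passed first, so state the direction when reporting.

```python
import numpy as np

def cohens_d(group1, group2):
    """Cohen's d = (mean1 - mean2) / pooled SD, using sample variances (ddof=1)."""
    g1, g2 = np.asarray(group1, float), np.asarray(group2, float)
    n1, n2 = len(g1), len(g2)
    pooled_var = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
    return (g1.mean() - g2.mean()) / np.sqrt(pooled_var)

# Invented NeuN counts for two conditions (not BixBench data)
kd = [12, 14, 16, 18, 20]
ctrl = [22, 24, 26, 28, 30]
d = cohens_d(kd, ctrl)
print(f"{d:.3f}")
```

Here both groups have variance 10 and means 16 and 26, so d is -10 divided by the square root of 10.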


Pattern 3: Multi-Group Comparison (bix-41)


Question type: "Dunnett's test: How many ratios equivalent to control?"

Data: CSV with multiple co-culture ratios, Area, Circularity

Workflow:

Create Strain_Ratio labels
Run Dunnett's test for Area (vs control)
Run Dunnett's test for Circularity (vs control)
Count groups NOT significant in BOTH tests

See: references/statistical_analysis.md "Dunnett's Test"


Pattern 4: Regression Optimization (bix-54)


Question type: "Peak frequency from natural spline model?"

Data: CSV with co-culture frequencies and Area measurements

Workflow:

Convert ratio strings to frequencies
Fit natural spline model (df=4)
Find peak via grid search
Report peak frequency + confidence interval

See: references/statistical_analysis.md "Regression Modeling"
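
Steps 1 and 3 of this workflow can be sketched independently of the model. The block below assumes that a ratio string "a:b" maps to the frequency a / (a + b) of the first strain, and it swaps in a quadratic `numpy.polyfit` model as a plain stand-in for the natural spline so the example stays dependency-free. The helper names and the data are hypothetical, and the confidence interval step is not covered.

```python
import numpy as np

def ratio_to_frequency(ratio):
    """'a:b' -> a / (a + b), taken here as the first strain's frequency."""
    a, b = (float(part) for part in ratio.split(":"))
    return a / (a + b)

def grid_search_peak(predict, lo=0.0, hi=1.0, n_points=1001):
    """Evaluate predict() on an even grid and return (x_at_max, max_value)."""
    grid = np.linspace(lo, hi, n_points)
    preds = predict(grid)
    best = int(np.argmax(preds))
    return float(grid[best]), float(preds[best])

# Invented data lying exactly on a parabola with its peak at frequency 0.6
ratios = ["1:9", "1:3", "1:1", "3:1", "9:1"]
freqs = np.array([ratio_to_frequency(r) for r in ratios])
areas = 1000 - 2000 * (freqs - 0.6) ** 2

# Stand-in model: quadratic polynomial instead of a natural spline
coefs = np.polyfit(freqs, areas, 2)
peak_freq, peak_area = grid_search_peak(lambda f: np.polyval(coefs, f))
print(peak_freq, peak_area)
```

The grid resolution bounds the precision of the reported peak: with 1001 points on [0, 1] the answer is resolved to 0.001, so refine the grid if more decimal places are requested.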


Quick Reference Table


Task	Primary Tool	Reference
Load measurement CSV	pandas.read_csv()	This file
Group statistics	df.groupby().agg()	This file
T-test	scipy.stats.ttest_ind()	statistical_analysis.md
ANOVA	statsmodels.ols + anova_lm()	statistical_analysis.md
Dunnett's test	scipy.stats.dunnett()	statistical_analysis.md
Cohen's d	Custom function (pooled SD)	statistical_analysis.md
Power analysis	statsmodels TTestIndPower	statistical_analysis.md
Polynomial regression	statsmodels.OLS + poly features	statistical_analysis.md
Natural spline	patsy.cr() + statsmodels.OLS	statistical_analysis.md
Cell segmentation	skimage.filters + watershed	cell_counting.md
Colony segmentation	skimage.filters.threshold_otsu	segmentation.md
Fluorescence quantification	skimage.measure.regionprops	fluorescence_analysis.md
Colocalization	Pearson/Manders	fluorescence_analysis.md
Image loading	tifffile, skimage.io	image_processing.md
Batch processing	scripts/batch_process.py	scripts/


Example Scripts


Ready-to-use scripts in the `scripts/` directory:

segment_cells.py - Cell/nuclei counting with watershed
measure_fluorescence.py - Multi-channel intensity quantification
batch_process.py - Process folders of images
colony_morphometry.py - Measure colony area/circularity
statistical_comparison.py - Group comparison statistics

Usage:

```bash
# Count cells in image
python scripts/segment_cells.py cells.tif --channel 0 --min-area 50

# Batch process folder
python scripts/batch_process.py input_folder/ output.csv --analysis cell_count
```

---

Detailed Reference Guides


For complete implementations and protocols:

references/statistical_analysis.md - All statistical tests, regression models
references/cell_counting.md - Cell/nuclei counting protocols
references/segmentation.md - Colony and object segmentation
references/fluorescence_analysis.md - Intensity quantification, colocalization
references/image_processing.md - Image loading, preprocessing, library selection
references/troubleshooting.md - Common issues and solutions


Important Notes

Matching R Statistical Functions

Some BixBench questions use R for analysis. Python equivalents:

R's Dunnett test (`multcomp::glht`) → `scipy.stats.dunnett()` (scipy ≥ 1.11)
R's natural spline (`ns(x, df=4)`) → `patsy.cr(x, knots=...)` with explicit quantile knots
R's t-test (`t.test()`) → `scipy.stats.ttest_ind()` (pass `equal_var=False` to match R's default Welch test)
R's ANOVA (`aov()`) → `statsmodels.formula.api.ols()` + `sm.stats.anova_lm()`

See: references/statistical_analysis.md for exact parameter matching
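
One mismatch worth illustrating: R's `t.test()` defaults to Welch's unequal-variance test, while `scipy.stats.ttest_ind()` defaults to the pooled-variance Student test. The sketch below uses invented data and recomputes the Welch statistic by hand to show which SciPy call corresponds to which R call.

```python
import numpy as np
from scipy import stats

# Invented samples with visibly different spreads
a = np.array([5.1, 4.9, 5.6, 5.8, 6.0, 5.3])
b = np.array([6.9, 7.4, 5.2, 8.8, 6.1, 9.5, 7.7])

student = stats.ttest_ind(a, b, equal_var=True)   # R: t.test(a, b, var.equal = TRUE)
welch = stats.ttest_ind(a, b, equal_var=False)    # R: t.test(a, b)

# Welch statistic by hand: mean difference over the unpooled standard error
manual_t = (a.mean() - b.mean()) / np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
print(student.pvalue, welch.pvalue, manual_t)
```

When a reference answer was produced in R without `var.equal = TRUE`, the `equal_var=False` call is the one expected to reproduce it.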


Answer Formatting


BixBench expects specific formats:

"to the nearest thousand": `int(round(val, -3))`
Percentages: Usually integer or 1-2 decimal places
Cohen's d: 3 decimal places
Sample sizes: Always integer (ceiling)
Ratios: String format "5:1"
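
The formatting rules above can be collected into small helpers. This is a sketch with hypothetical function names; the only behavior relied on is Python's built-in `round`, `math.ceil`, and f-string formatting.

```python
import math

def to_nearest_thousand(val):
    """Round to the nearest thousand and return an int."""
    # Python rounds exact ties to the even neighbor, so 2500 becomes 2000.
    return int(round(val, -3))

def ceil_sample_size(n):
    """Sample sizes are rounded up to the next whole subject."""
    return math.ceil(n)

def format_effect_size(d):
    """Cohen's d reported to 3 decimal places."""
    return f"{d:.3f}"

def format_ratio(a, b):
    """Ratios are reported as strings such as '5:1'."""
    return f"{a}:{b}"

print(to_nearest_thousand(82456.7), ceil_sample_size(63.2),
      format_effect_size(-3.16227766), format_ratio(5, 1))
```

The tie-handling comment matters only when a value sits exactly halfway between two thousands, which is rare for measured data but worth knowing if an answer is off by 1000.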


Completeness Checklist

Getting Help

Start with decision tree at top of this file
Check relevant reference guide for detailed protocol
Use example scripts as templates
See troubleshooting guide for common issues
All statistical implementations in statistical_analysis.md
