Bulk WGCNA analysis with omicverse
Overview
Activate this skill for users who want to reproduce the WGCNA workflow from `t_wgcna.ipynb`. It guides you through loading expression data, configuring PyWGCNA, constructing weighted gene co-expression networks, and inspecting modules of interest.
Instructions
- Prepare the environment
  - Import `omicverse as ov`, `scanpy as sc`, `matplotlib.pyplot as plt`, and `pandas as pd`.
  - Set plotting defaults via `ov.plot_set()`.
- Load and filter expression data
  - Read expression matrices (e.g., from `expressionList.csv`).
  - Calculate the median absolute deviation (MAD) of each gene with `from statsmodels import robust` and `gene_mad = data.apply(robust.mad)`.
  - Keep the top variable genes (e.g., `data = data.T.loc[gene_mad.sort_values(ascending=False).index[:2000]]`).
- Initialise PyWGCNA
  - Create `pyWGCNA_5xFAD = ov.bulk.pyWGCNA(name=..., species='mus musculus', geneExp=data.T, outputPath='', save=True)`.
  - Confirm `pyWGCNA_5xFAD.geneExpr` looks correct before proceeding.
- Preprocess the dataset
  - Run `pyWGCNA_5xFAD.preprocess()` to drop low-expression genes and problematic samples.
- Construct the co-expression network
  - Evaluate soft-threshold power: `pyWGCNA_5xFAD.calculate_soft_threshold()`.
  - Build adjacency and TOM matrices via `calculating_adjacency_matrix()` and `calculating_TOM_similarity_matrix()`.
- Detect gene modules
  - Generate dendrograms and modules: `calculate_geneTree()`, `calculate_dynamicMods(kwargs_function={'cutreeHybrid': {...}})`.
  - Derive module eigengenes with `calculate_gene_module(kwargs_function={'moduleEigengenes': {'softPower': 8}})`.
  - Visualise adjacency/TOM heatmaps using `plot_matrix(save=False)` if needed.
- Inspect specific modules
  - Extract genes from modules with `get_sub_module([...], mod_type='module_color')`.
  - Build sub-networks using `get_sub_network(mod_list=[...], mod_type='module_color', correlation_threshold=0.2)` and plot them via `plot_sub_network(...)`.
- Update sample metadata for downstream analyses
  - Load sample annotations with `updateSampleInfo(path='.../sampleInfo.csv', sep=',')`.
  - Assign colour maps for metadata categories with `setMetadataColor(...)`.
- Analyse module–trait relationships
  - Run `analyseWGCNA()` to compute module–trait statistics.
  - Plot module eigengene heatmaps and bar charts with `plotModuleEigenGene(module, metadata, show=True)` and `barplotModuleEigenGene(...)`.
- Find hub genes
  - Identify top hubs per module using `top_n_hub_genes(moduleName='lightgreen', n=10)`.
- Troubleshooting tips
  - For large datasets, consider setting `save=False` to avoid writing many intermediate files.
  - If module detection fails, confirm that enough genes remain after MAD filtering, then adjust `deepSplit` or `softPower`.
  - Ensure metadata categories have assigned colours before plotting eigengene heatmaps.
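The network-building steps above can be chained roughly as follows. This is a minimal sketch, not the tutorial's own code: the wrapper name `run_wgcna_sketch` and its parameters are hypothetical, `pd.read_csv` with samples as rows and genes as columns is an assumption about the input file, and the `cutreeHybrid` options are left for the caller to supply because the steps above elide them. The method names on the WGCNA object are the ones listed in the steps.

```python
def run_wgcna_sketch(expr_csv, name, cutree_kwargs,
                     species='mus musculus', soft_power=8, n_top_genes=2000):
    """Hypothetical wrapper that chains the calls listed in the steps above."""
    # Imports are deferred so that defining this function does not
    # require omicverse or statsmodels to be installed.
    import pandas as pd
    import omicverse as ov
    from statsmodels import robust

    ov.plot_set()

    # Assumption: rows are samples, columns are genes, first column is the index.
    data = pd.read_csv(expr_csv, index_col=0)

    # Rank genes by median absolute deviation and keep the most variable ones.
    gene_mad = data.apply(robust.mad)
    top_genes = gene_mad.sort_values(ascending=False).index[:n_top_genes]
    data = data.T.loc[top_genes]

    wgcna = ov.bulk.pyWGCNA(name=name, species=species,
                            geneExp=data.T, outputPath='', save=True)
    wgcna.preprocess()

    # Soft threshold, adjacency and TOM.
    wgcna.calculate_soft_threshold()
    wgcna.calculating_adjacency_matrix()
    wgcna.calculating_TOM_similarity_matrix()

    # Dendrogram, dynamic modules and module eigengenes.
    wgcna.calculate_geneTree()
    wgcna.calculate_dynamicMods(kwargs_function={'cutreeHybrid': cutree_kwargs})
    wgcna.calculate_gene_module(
        kwargs_function={'moduleEigengenes': {'softPower': soft_power}})
    return wgcna
```

Returning the object lets the later steps (sample metadata, module–trait analysis, hub genes) be run interactively, where their arguments depend on the dataset.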
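To see what the MAD filter in the loading step selects, here is a small standard-library illustration on made-up numbers. The helper names `mad` and `top_variable_genes` and the toy values are hypothetical. It also uses the unscaled MAD, whereas statsmodels' `robust.mad` by default divides by a normalising constant; that rescales every gene equally and so should not change the ranking.

```python
from statistics import median


def mad(values):
    """Unscaled median absolute deviation of a sequence of numbers."""
    values = list(values)
    centre = median(values)
    return median(abs(v - centre) for v in values)


def top_variable_genes(expr_by_gene, n):
    """Return the n gene names with the largest MAD, most variable first."""
    ranked = sorted(expr_by_gene, key=lambda g: mad(expr_by_gene[g]), reverse=True)
    return ranked[:n]


# Toy expression values (one list of per-sample values per gene).
toy = {
    "geneA": [5.0, 5.1, 4.9, 5.0],  # nearly flat across samples
    "geneB": [1.0, 9.0, 2.0, 8.0],  # highly variable
    "geneC": [3.0, 4.0, 5.0, 6.0],  # moderately variable
}

print(top_variable_genes(toy, 2))
```

Genes that barely vary across samples carry little co-expression signal, so if too few genes survive this filter, module detection has little to work with.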
Examples
- "Build a WGCNA network on the 5xFAD dataset, visualise modules, and extract hub genes from the lightgreen module."
- "Load sample metadata, update colours for sex and genotype, and plot module eigengene heatmaps."
- "Create a sub-network plot for the gold module using a correlation threshold of 0.2."
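As a rough guide, the first and third prompts map onto the module-inspection calls from the instructions as sketched below. The helper name `inspect_example_modules` is hypothetical, and `wgcna` is assumed to be an object on which the network and modules have already been computed. Plotting via `plot_sub_network(...)` is left out because its arguments are not given here.

```python
def inspect_example_modules(wgcna):
    """Hypothetical helper running the calls behind two of the example prompts."""
    # Top 10 hub genes of the lightgreen module.
    hubs = wgcna.top_n_hub_genes(moduleName='lightgreen', n=10)

    # Sub-network of the gold module at a correlation threshold of 0.2.
    gold_net = wgcna.get_sub_network(mod_list=['gold'],
                                     mod_type='module_color',
                                     correlation_threshold=0.2)
    return hubs, gold_net
```

The module names come from the prompts themselves; a given run may not produce modules with these colours, so check the detected module list first.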
References
- Tutorial notebook: `t_wgcna.ipynb`
- Tutorial dataset: `data/5xFAD_paper/`
- Quick copy/paste commands: `reference.md`