data-viz-plots
Data Visualization (Universal)
Overview
This skill enables you to create professional scientific visualizations including scatter plots, line charts, heatmaps, violin plots, and more. Unlike cloud-hosted solutions, this skill uses the matplotlib and seaborn Python libraries and executes locally in your environment, making it compatible with ALL LLM providers including GPT, Gemini, Claude, DeepSeek, and Qwen.
When to Use This Skill
- Create publication-quality figures for papers and presentations
- Generate exploratory data analysis (EDA) plots
- Visualize gene expression, QC metrics, or clustering results
- Create multi-panel figures combining different plot types
- Export high-resolution images for reports
- Customize plot aesthetics (colors, fonts, styles)
How to Use
Step 1: Import Required Libraries
```python
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
from matplotlib import gridspec
import matplotlib.patches as mpatches

# Set style for publication-quality plots
sns.set_style("whitegrid")
plt.rcParams['figure.dpi'] = 150
plt.rcParams['savefig.dpi'] = 300
plt.rcParams['font.size'] = 10
```
Step 2: Basic Scatter Plot
```python
# Create figure and axis
fig, ax = plt.subplots(figsize=(6, 5))

# Scatter plot (x_data and y_data are assumed to be defined already)
ax.scatter(x_data, y_data, s=20, alpha=0.6, c='steelblue', edgecolors='k', linewidths=0.5)

# Labels and title
ax.set_xlabel('Gene Expression (log2)', fontsize=12)
ax.set_ylabel('Cell Count', fontsize=12)
ax.set_title('Expression vs. Cell Count', fontsize=14, fontweight='bold')

# Grid and styling
ax.grid(alpha=0.3)
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

# Save figure
plt.tight_layout()
plt.savefig('scatter_plot.png', dpi=300, bbox_inches='tight')
plt.show()
print("✅ Scatter plot saved to: scatter_plot.png")
```
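The scatter example assumes `x_data` and `y_data` already exist. For a quick trial without real data, the following is a minimal sketch that generates synthetic stand-ins. The seed, sample size, and distribution parameters are illustrative assumptions, not values from any real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the example is reproducible

n_points = 200  # hypothetical sample size
# Synthetic "log2 expression" values centred around 5
x_data = rng.normal(loc=5.0, scale=1.5, size=n_points)
# Synthetic "cell counts" that loosely increase with expression, kept non-negative
y_data = np.clip(20 * x_data + rng.normal(scale=15, size=n_points), 0, None)
```

Both arrays can then be passed straight to `ax.scatter(x_data, y_data, ...)`.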
Step 3: Line Plot with Multiple Series
```python
fig, ax = plt.subplots(figsize=(8, 5))

# Plot multiple lines
ax.plot(time_points, group1_values, marker='o', label='Group 1', color='#E74C3C', linewidth=2)
ax.plot(time_points, group2_values, marker='s', label='Group 2', color='#3498DB', linewidth=2)
ax.plot(time_points, group3_values, marker='^', label='Group 3', color='#2ECC71', linewidth=2)

# Styling
ax.set_xlabel('Time Point', fontsize=12)
ax.set_ylabel('Expression Level', fontsize=12)
ax.set_title('Gene Expression Over Time', fontsize=14, fontweight='bold')
ax.legend(frameon=True, loc='best', fontsize=10)
ax.grid(alpha=0.3, linestyle='--')
plt.tight_layout()
plt.savefig('line_plot.png', dpi=300, bbox_inches='tight')
plt.show()
```
Step 4: Box Plot and Violin Plot
```python
# Prepare data (long-form DataFrame)
# df should have columns: 'cluster', 'expression', 'gene', etc.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

# Box plot
# Note: seaborn >= 0.13 warns when palette is passed without hue;
# there, add hue='cluster', legend=False to both calls.
sns.boxplot(data=df, x='cluster', y='expression', palette='Set2', ax=ax1)
ax1.set_title('Box Plot: Expression by Cluster', fontsize=12, fontweight='bold')
ax1.set_xlabel('Cluster', fontsize=11)
ax1.set_ylabel('Expression Level', fontsize=11)
ax1.tick_params(axis='x', rotation=45)

# Violin plot
sns.violinplot(data=df, x='cluster', y='expression', palette='muted', ax=ax2, inner='quartile')
ax2.set_title('Violin Plot: Expression Distribution', fontsize=12, fontweight='bold')
ax2.set_xlabel('Cluster', fontsize=11)
ax2.set_ylabel('Expression Level', fontsize=11)
ax2.tick_params(axis='x', rotation=45)
plt.tight_layout()
plt.savefig('box_violin_plot.png', dpi=300, bbox_inches='tight')
plt.show()
```
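Step 4 expects a long-form DataFrame with one row per observation. If the data starts out as a wide table with one column per cluster, it can be reshaped with `DataFrame.melt`. The sketch below uses a synthetic table; the column names and values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical wide table: one column per cluster, 50 expression values each
wide = pd.DataFrame({
    'Cluster 0': rng.normal(2.0, 0.5, 50),
    'Cluster 1': rng.normal(3.0, 0.8, 50),
    'Cluster 2': rng.normal(1.0, 0.3, 50),
})

# melt() stacks the columns into (cluster, expression) pairs
df = wide.melt(var_name='cluster', value_name='expression')
print(df.head())
```

The resulting `df` has the `cluster` and `expression` columns that the box and violin plots read.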
Step 5: Heatmap
```python
# Prepare data matrix (rows=genes, columns=samples or clusters)
# gene_expression_matrix: pandas DataFrame or numpy array
fig, ax = plt.subplots(figsize=(8, 6))

# Create heatmap
sns.heatmap(
    gene_expression_matrix,
    cmap='viridis',
    cbar_kws={'label': 'Expression'},
    xticklabels=True,
    yticklabels=True,
    linewidths=0.5,
    linecolor='gray',
    ax=ax
)
ax.set_title('Gene Expression Heatmap', fontsize=14, fontweight='bold')
ax.set_xlabel('Samples', fontsize=12)
ax.set_ylabel('Genes', fontsize=12)
plt.tight_layout()
plt.savefig('heatmap.png', dpi=300, bbox_inches='tight')
plt.show()
```
Step 6: Bar Plot with Error Bars
```python
fig, ax = plt.subplots(figsize=(7, 5))

# Data
categories = ['Cluster 0', 'Cluster 1', 'Cluster 2', 'Cluster 3']
means = [120, 85, 200, 150]
errors = [15, 10, 25, 20]

# Bar plot
bars = ax.bar(categories, means, yerr=errors, capsize=5,
              color=['#E74C3C', '#3498DB', '#2ECC71', '#F39C12'],
              edgecolor='black', linewidth=1.2, alpha=0.8)

# Labels
ax.set_ylabel('Cell Count', fontsize=12)
ax.set_title('Cell Counts by Cluster', fontsize=14, fontweight='bold')
ax.set_ylim(0, max(means) * 1.3)

# Add value labels above the error bars so the two do not overlap
for bar, mean, err in zip(bars, means, errors):
    height = bar.get_height()
    ax.text(bar.get_x() + bar.get_width()/2., height + err + 5,
            f'{mean}', ha='center', va='bottom', fontsize=10)
plt.tight_layout()
plt.savefig('bar_plot.png', dpi=300, bbox_inches='tight')
plt.show()
```
Advanced Features
Multi-Panel Figure
```python
# Create complex layout
fig = plt.figure(figsize=(12, 8))
gs = gridspec.GridSpec(2, 3, figure=fig, hspace=0.3, wspace=0.3)

# Panel A: Scatter
ax1 = fig.add_subplot(gs[0, :2])
ax1.scatter(x_data, y_data, c=cluster_labels, cmap='tab10', s=10, alpha=0.6)
ax1.set_title('A. UMAP Projection', fontsize=12, fontweight='bold', loc='left')
ax1.set_xlabel('UMAP1')
ax1.set_ylabel('UMAP2')

# Panel B: Violin (single violin, so use one color from Set2 instead of a palette)
ax2 = fig.add_subplot(gs[0, 2])
sns.violinplot(data=df, y='expression', color=sns.color_palette('Set2')[0], ax=ax2)
ax2.set_title('B. Expression', fontsize=12, fontweight='bold', loc='left')

# Panel C: Heatmap (matrix is assumed to hold z-scored values)
ax3 = fig.add_subplot(gs[1, :])
sns.heatmap(matrix, cmap='coolwarm', center=0, ax=ax3, cbar_kws={'label': 'Z-score'})
ax3.set_title('C. Gene Expression Heatmap', fontsize=12, fontweight='bold', loc='left')
plt.savefig('multi_panel_figure.png', dpi=300, bbox_inches='tight')
plt.show()
```
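Panel C labels its colorbar "Z-score" and centers the colormap at 0, which assumes `matrix` has already been standardized. One common way to get there is row-wise z-scoring, sketched below on a synthetic matrix. The gene and sample names are hypothetical, and a row with zero variance would produce NaN values that need separate handling.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Hypothetical expression matrix: 5 genes (rows) x 4 samples (columns)
expr = pd.DataFrame(
    rng.gamma(shape=2.0, scale=3.0, size=(5, 4)),
    index=[f'gene_{i}' for i in range(5)],
    columns=[f'sample_{j}' for j in range(4)],
)

# Row-wise z-score: subtract each gene's mean, divide by its standard deviation
row_mean = expr.mean(axis=1)
row_std = expr.std(axis=1, ddof=0)
matrix = expr.sub(row_mean, axis=0).div(row_std, axis=0)
```

After this step every gene has mean 0 and unit standard deviation across samples, so a diverging colormap such as `coolwarm` with `center=0` reads as "above or below that gene's average".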
Custom Color Palette
```python
# Define custom colors
custom_palette = ['#E74C3C', '#3498DB', '#2ECC71', '#F39C12', '#9B59B6']

# Use in seaborn
sns.set_palette(custom_palette)

# Or create color dict for specific mapping
color_dict = {
    'T cells': '#E74C3C',
    'B cells': '#3498DB',
    'Monocytes': '#2ECC71',
    'NK cells': '#F39C12'
}

# Use in scatter plot (df is assumed to have 'celltype', 'x', and 'y' columns)
fig, ax = plt.subplots(figsize=(6, 5))
for cell_type, color in color_dict.items():
    mask = df['celltype'] == cell_type
    ax.scatter(df.loc[mask, 'x'], df.loc[mask, 'y'],
               c=color, label=cell_type, s=20, alpha=0.7)
ax.legend()
```
Density Plot
```python
from scipy.stats import gaussian_kde

fig, ax = plt.subplots(figsize=(8, 6))

# Calculate density (x_data and y_data are assumed to be NumPy arrays)
xy = np.vstack([x_data, y_data])
z = gaussian_kde(xy)(xy)

# Sort points by density for better visualization
idx = z.argsort()
x, y, z = x_data[idx], y_data[idx], z[idx]

# Scatter with density colors
scatter = ax.scatter(x, y, c=z, s=20, cmap='viridis', alpha=0.6, edgecolors='none')
plt.colorbar(scatter, ax=ax, label='Density')
ax.set_xlabel('UMAP1', fontsize=12)
ax.set_ylabel('UMAP2', fontsize=12)
ax.set_title('Density Scatter Plot', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.savefig('density_plot.png', dpi=300, bbox_inches='tight')
plt.show()
```
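Evaluating `gaussian_kde(xy)(xy)` compares every point against every other point, so it can become slow for large embeddings. One possible workaround, not part of the original recipe, is to fit the KDE on a random subset and still evaluate it at all points. The sketch below uses synthetic data, and the `max_fit_points` cap is a hypothetical value to tune.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)

# Synthetic 2-D embedding standing in for x_data / y_data
x_data = rng.normal(size=5000)
y_data = 0.5 * x_data + rng.normal(scale=0.8, size=5000)
xy = np.vstack([x_data, y_data])

# Fit the KDE on a random subset, then evaluate it at every point
max_fit_points = 1000  # hypothetical cap; tune for your data
n_fit = min(max_fit_points, xy.shape[1])
fit_idx = rng.choice(xy.shape[1], size=n_fit, replace=False)
z = gaussian_kde(xy[:, fit_idx])(xy)
```

The resulting `z` has one density value per point and can replace the full-data estimate in the plotting code, at the cost of a slightly coarser density estimate.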
Common Use Cases
QC Metrics Visualization
```python
# Assuming adata.obs has QC columns: n_genes, n_counts, percent_mito
fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Plot 1: Histogram of genes per cell
axes[0].hist(adata.obs['n_genes'], bins=50, color='steelblue', edgecolor='black', alpha=0.7)
axes[0].axvline(adata.obs['n_genes'].median(), color='red', linestyle='--', label='Median')
axes[0].set_xlabel('Genes per Cell', fontsize=11)
axes[0].set_ylabel('Frequency', fontsize=11)
axes[0].set_title('Genes per Cell Distribution', fontsize=12, fontweight='bold')
axes[0].legend()

# Plot 2: Scatter UMI vs Genes
axes[1].scatter(adata.obs['n_counts'], adata.obs['n_genes'],
                s=5, alpha=0.5, c='coral')
axes[1].set_xlabel('UMI Counts', fontsize=11)
axes[1].set_ylabel('Genes Detected', fontsize=11)
axes[1].set_title('UMIs vs Genes', fontsize=12, fontweight='bold')

# Plot 3: Violin plot of mitochondrial percentage
sns.violinplot(y=adata.obs['percent_mito'], ax=axes[2], color='lightgreen')
axes[2].axhline(y=20, color='red', linestyle='--', label='20% threshold')
axes[2].set_ylabel('Mitochondrial %', fontsize=11)
axes[2].set_title('Mitochondrial Content', fontsize=12, fontweight='bold')
axes[2].legend()
plt.tight_layout()
plt.savefig('qc_metrics.png', dpi=300, bbox_inches='tight')
plt.show()
```
UMAP/tSNE Visualization
```python
# Assuming adata.obsm['X_umap'] exists and adata.obs['clusters'] exists
fig, ax = plt.subplots(figsize=(8, 7))

# Get unique clusters
clusters = adata.obs['clusters'].unique()
n_clusters = len(clusters)

# Generate colors
colors = plt.cm.tab20(np.linspace(0, 1, n_clusters))

# Plot each cluster
for i, cluster in enumerate(clusters):
    # Convert to a plain boolean array before indexing the NumPy embedding
    mask = (adata.obs['clusters'] == cluster).to_numpy()
    ax.scatter(
        adata.obsm['X_umap'][mask, 0],
        adata.obsm['X_umap'][mask, 1],
        c=[colors[i]],
        label=f'Cluster {cluster}',
        s=10,
        alpha=0.7,
        edgecolors='none'
    )
ax.set_xlabel('UMAP1', fontsize=12)
ax.set_ylabel('UMAP2', fontsize=12)
ax.set_title('UMAP Projection by Cluster', fontsize=14, fontweight='bold')
ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left', frameon=True, fontsize=9)
plt.tight_layout()
plt.savefig('umap_clusters.png', dpi=300, bbox_inches='tight')
plt.show()
```
Gene Expression Dot Plot
```python
from matplotlib.colors import Normalize

# genes: list of gene names
# clusters: list of cluster IDs
# Create matrix: rows=genes, columns=clusters with mean expression and % expressing
fig, ax = plt.subplots(figsize=(10, 6))

# Prepare data
# dot_size_matrix: % cells expressing (0-100), NumPy array
# color_matrix: mean expression level, NumPy array
for i, gene in enumerate(genes):
    for j, cluster in enumerate(clusters):
        # Size proportional to % expressing
        size = dot_size_matrix[i, j] * 5  # Scale factor
        # Color by expression level
        color_val = color_matrix[i, j]
        ax.scatter(j, i, s=size, c=[color_val], cmap='Reds',
                   vmin=0, vmax=color_matrix.max(),
                   edgecolors='black', linewidths=0.5)

# Labels
ax.set_xticks(range(len(clusters)))
ax.set_xticklabels(clusters, rotation=45, ha='right')
ax.set_yticks(range(len(genes)))
ax.set_yticklabels(genes)
ax.set_xlabel('Cluster', fontsize=12)
ax.set_ylabel('Gene', fontsize=12)
ax.set_title('Marker Gene Expression', fontsize=14, fontweight='bold')

# Colorbar
norm = Normalize(vmin=0, vmax=color_matrix.max())
sm = plt.cm.ScalarMappable(cmap='Reds', norm=norm)
sm.set_array([])
cbar = plt.colorbar(sm, ax=ax, pad=0.02)
cbar.set_label('Mean Expression', rotation=270, labelpad=15)
plt.tight_layout()
plt.savefig('gene_dotplot.png', dpi=300, bbox_inches='tight')
plt.show()
```
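The dot plot takes `dot_size_matrix` and `color_matrix` as given. One way they might be derived from a long-form table of per-cell measurements is sketched below. The table layout, the gene names, and the "expression > 0 counts as expressing" rule are all assumptions for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)

genes = ['CD3E', 'MS4A1', 'LYZ']   # hypothetical marker genes
clusters = ['0', '1', '2']          # hypothetical cluster IDs

# Synthetic long-form table: one row per (cell, gene) measurement
n_cells = 300
cell_cluster = rng.choice(clusters, size=n_cells)
long_df = pd.DataFrame({
    'cluster': np.repeat(cell_cluster, len(genes)),
    'gene': np.tile(genes, n_cells),
    'expression': rng.poisson(1.0, size=n_cells * len(genes)).astype(float),
})
long_df['expressed'] = long_df['expression'] > 0

grouped = long_df.groupby(['gene', 'cluster'])
# Mean expression per gene/cluster -> color of each dot
mean_expr = grouped['expression'].mean().unstack('cluster')
# Percentage of cells with non-zero expression -> size of each dot
pct_expr = (grouped['expressed'].mean() * 100).unstack('cluster')

# Reorder to match the genes/clusters lists and convert to NumPy arrays
color_matrix = mean_expr.loc[genes, clusters].to_numpy()
dot_size_matrix = pct_expr.loc[genes, clusters].to_numpy()
```

Both arrays end up with shape `(len(genes), len(clusters))`, matching the `[i, j]` indexing in the plotting loop.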
Volcano Plot (DEG Analysis)
```python
# Assuming deg_df has columns: gene, log2FC, pvalue
fig, ax = plt.subplots(figsize=(8, 7))

# Calculate -log10(pvalue)
deg_df['-log10_pvalue'] = -np.log10(deg_df['pvalue'])

# Classify genes
deg_df['significant'] = 'Not Significant'
deg_df.loc[(deg_df['log2FC'] > 1) & (deg_df['pvalue'] < 0.05), 'significant'] = 'Up-regulated'
deg_df.loc[(deg_df['log2FC'] < -1) & (deg_df['pvalue'] < 0.05), 'significant'] = 'Down-regulated'

# Plot
for category, color in zip(['Not Significant', 'Up-regulated', 'Down-regulated'],
                           ['gray', 'red', 'blue']):
    mask = deg_df['significant'] == category
    ax.scatter(deg_df.loc[mask, 'log2FC'],
               deg_df.loc[mask, '-log10_pvalue'],
               c=color, label=category, s=20, alpha=0.6, edgecolors='none')

# Threshold lines
ax.axvline(x=1, color='black', linestyle='--', linewidth=1, alpha=0.5)
ax.axvline(x=-1, color='black', linestyle='--', linewidth=1, alpha=0.5)
ax.axhline(y=-np.log10(0.05), color='black', linestyle='--', linewidth=1, alpha=0.5)

# Labels
ax.set_xlabel('log2 Fold Change', fontsize=12)
ax.set_ylabel('-log10(p-value)', fontsize=12)
ax.set_title('Volcano Plot: Differential Expression', fontsize=14, fontweight='bold')
ax.legend(frameon=True, loc='upper right')
plt.tight_layout()
plt.savefig('volcano_plot.png', dpi=300, bbox_inches='tight')
plt.show()
```
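If a p-value has underflowed to exactly 0, `-np.log10` returns infinity and that gene cannot be placed on the y-axis. A minimal sketch of one way to guard against this is to clip p-values to the smallest positive normal float before taking the log. The table below is made up for illustration, and a tighter floor may suit a plot better since the clipped point lands very high on the axis.

```python
import numpy as np
import pandas as pd

# Hypothetical DEG table; the second p-value has underflowed to exactly 0
deg_df = pd.DataFrame({
    'gene': ['A', 'B', 'C'],
    'log2FC': [2.1, -1.8, 0.2],
    'pvalue': [1e-8, 0.0, 0.4],
})

# Replace zeros with the smallest positive normal float so the log stays finite
floor = np.finfo(float).tiny
deg_df['-log10_pvalue'] = -np.log10(deg_df['pvalue'].clip(lower=floor))
```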
Best Practices
- Figure Size: Use appropriate dimensions for the target medium (papers: 6-8 inches wide; posters: larger)
- DPI: Save at 300 DPI for publications, 150 DPI for presentations
- Colors: Use colorblind-friendly palettes (e.g., `viridis`, `Set2`, `tab10`)
- Fonts: Keep font sizes readable (titles: 12-14pt, labels: 10-12pt, ticks: 8-10pt)
- Transparency: Use alpha for overlapping points to show density
- Layout: Always call `plt.tight_layout()` before saving to prevent label clipping
- File Format: PNG for general use, SVG for vector graphics (editable in Illustrator)
- Close Figures: Call `plt.close()` after saving to free memory when generating many plots
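The layout, file-format, and close-figure practices combine naturally when plots are generated in a loop. The following is a minimal sketch under stated assumptions: the plot names, the synthetic data, the temporary output directory, and the `Agg` backend (chosen so it runs without a display) are all illustrative.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(5)
out_dir = Path(tempfile.mkdtemp())  # hypothetical output location

saved = []
for name in ['gene_a', 'gene_b', 'gene_c']:  # hypothetical plot names
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.hist(rng.normal(size=500), bins=30, color='steelblue', alpha=0.7)
    ax.set_title(name)
    fig.tight_layout()              # before saving, to avoid clipped labels
    for ext in ('png', 'svg'):      # raster for general use, vector for editing
        path = out_dir / f'{name}.{ext}'
        fig.savefig(path, dpi=300, bbox_inches='tight')
        saved.append(path)
    plt.close(fig)                  # release the figure's memory

print(f'{len(saved)} files written; open figures: {len(plt.get_fignums())}')
```

Closing each figure by handle (`plt.close(fig)`) rather than relying on the current figure keeps the loop safe even if other figures are open.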
Troubleshooting
Issue: "Figure too cluttered with many points"
Solution: Use transparency and smaller point sizes
```python
ax.scatter(x, y, s=5, alpha=0.3, edgecolors='none')
```
Issue: "Legend overlaps with data"
Solution: Place legend outside the plot area
```python
ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
```
Issue: "Labels are cut off in saved figure"
Solution: Use `bbox_inches='tight'`
```python
plt.savefig('plot.png', dpi=300, bbox_inches='tight')
```
Issue: "Colors don't match between plots"
Solution: Define color palette once and reuse
```python
PALETTE = {'Group A': '#E74C3C', 'Group B': '#3498DB'}
# Use PALETTE in all plots
```
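To make "reuse the palette" concrete, here is a sketch that feeds one `PALETTE` dict to both a seaborn call and a plain matplotlib call. The DataFrame is synthetic, its column names are assumptions, and the `Agg` backend is set only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, assumed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

PALETTE = {'Group A': '#E74C3C', 'Group B': '#3498DB'}

rng = np.random.default_rng(6)
df = pd.DataFrame({
    'group': np.repeat(list(PALETTE), 40),
    'x': rng.normal(size=80),
    'value': rng.normal(size=80),
})

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# seaborn: a dict palette maps each hue level to its color
sns.scatterplot(data=df, x='x', y='value', hue='group', palette=PALETTE, ax=ax1)

# matplotlib: look up the same dict for each bar
means = df.groupby('group')['value'].mean()
ax2.bar(means.index, means.values, color=[PALETTE[g] for g in means.index])
fig.tight_layout()
```

Because both panels read from the same dict, changing a color in `PALETTE` updates every plot that uses it.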
Issue: "Heatmap text too small"
Solution: Adjust figure size or font size
```python
fig, ax = plt.subplots(figsize=(12, 10))
# annot_kws only takes effect when cell annotations are enabled with annot=True
sns.heatmap(data, annot=True, ax=ax, annot_kws={'fontsize': 8})
```
Technical Notes
- Libraries: Uses `matplotlib` and `seaborn` (widely supported, stable)
- Execution: Runs locally in the agent's sandbox
- Compatibility: Works with ALL LLM providers (GPT, Gemini, Claude, DeepSeek, Qwen, etc.)
- File Formats: Supports PNG, PDF, SVG, JPEG
- Performance: Typical plot generation takes <1 second for standard plots, 2-5 seconds for complex multi-panel figures
- Memory: Keep figure count reasonable; close figures after saving if generating many plots
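Which output formats are actually available depends on the installed backend and its dependencies. A small sketch for checking this at runtime follows; the `Agg` backend is an assumption so the check runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, assumed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# Maps file extension -> human-readable format description for this backend
supported = fig.canvas.get_supported_filetypes()
for ext in ('png', 'pdf', 'svg', 'jpeg'):
    status = 'available' if ext in supported else 'not available'
    print(f'{ext}: {status}')
plt.close(fig)
```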
References
- Matplotlib documentation: https://matplotlib.org/stable/contents.html
- Seaborn documentation: https://seaborn.pydata.org/
- Matplotlib gallery: https://matplotlib.org/stable/gallery/index.html
- Seaborn gallery: https://seaborn.pydata.org/examples/index.html