Seaborn - Statistical Data Visualization
Seaborn helps you explore and understand your data through beautiful, informative statistical plots. It automates complex tasks like calculating confidence intervals, aggregating data, and creating faceted grids.
When to Use
- Visualizing complex relationships between multiple variables (relplot)
- Examining univariate and bivariate distributions (displot, kdeplot)
- Comparing categories with statistical summaries (catplot, boxplot, violinplot)
- Visualizing linear regression models and their uncertainty (regplot, lmplot)
- Creating heatmaps and cluster maps for large matrices
- Building multi-plot grids based on data subsets (FacetGrid)
- Setting high-level aesthetic themes for Matplotlib figures
Reference Documentation
Official docs: https://seaborn.pydata.org/
Example gallery: https://seaborn.pydata.org/examples/index.html
Search patterns: `sns.load_dataset`, `sns.relplot`, `sns.catplot`, `sns.set_theme`, `sns.heatmap`
Core Principles
Figure-Level vs. Axes-Level Functions
| Function Type | Examples | Key Characteristic |
|---|---|---|
| Figure-Level | relplot, displot, catplot | Creates its own figure (FacetGrid). Best for subplots (col, row). |
| Axes-Level | scatterplot, histplot, boxplot | Plots onto a specific ax. Best for integration with Matplotlib layouts. |
Use Seaborn For
- Statistical analysis and exploratory data analysis (EDA).
- Working directly with Pandas DataFrames in "tidy" (long-form) format.
- Automatic calculation of 95% confidence intervals (error bars).
- Rapidly changing visual themes and color palettes.
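For readers whose data starts out in wide form, here is a minimal sketch of reshaping it into the tidy (long-form) layout mentioned above. The table and its column names (`subject`, `day1`, `day2`, `score`) are invented for illustration.

```python
import pandas as pd

# Hypothetical wide-form table: one column per measurement day
wide = pd.DataFrame({
    "subject": ["s1", "s2"],
    "day1": [5.0, 6.0],
    "day2": [5.5, 6.5],
})

# Reshape to long-form: one row per (subject, day) observation
tidy = wide.melt(id_vars="subject", var_name="day", value_name="score")
print(tidy)
```

The result can then be passed as `data=tidy` with column names such as `x="day"`, `y="score"`, `hue="subject"`.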
Do NOT Use For
- Very low-level custom graphics (use Matplotlib).
- Interactive web visualizations (use Plotly).
- 3D plotting (use Matplotlib mplot3d or PyVista).
- Network graphs (use NetworkX with Matplotlib).
Quick Reference
Installation
```bash
pip install seaborn
```
Standard Imports
```python
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Apply the default theme
sns.set_theme()
```
Basic Pattern - Tidy Data Mapping
```python
import seaborn as sns
import matplotlib.pyplot as plt  # needed for plt.show()

# Load an example dataset
tips = sns.load_dataset("tips")

# Create a scatter plot with semantic mapping
sns.relplot(
    data=tips,
    x="total_bill", y="tip",
    hue="smoker", style="time", size="size",
)
plt.show()
```
Critical Rules
✅ DO
- Use Tidy Data - Ensure your DataFrame is in "long-form" (one row per observation).
- Prefer Figure-Level Functions - Use relplot/displot/catplot for better default layouts and faceting.
- Use the data= parameter - Always pass the DataFrame to keep code clean.
- Set Themes - Use `sns.set_theme(style="whitegrid", palette="muted")` early in your script.
- Leverage hue - Use semantic color mapping to add extra dimensions to 2D plots.
- Context matters - Use `sns.set_context("paper")` for publications or "talk" for presentations.
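To make the theme and context rules a little more concrete, the following sketch inspects what the "paper" and "talk" contexts would change. It relies on `sns.plotting_context`, which returns the parameters a context applies and can also be used as a context manager; printing `font.size` is just one convenient way to see the difference.

```python
import seaborn as sns

# Set the theme once, near the top of the script
sns.set_theme(style="whitegrid", palette="muted")

# plotting_context() returns the rcParams a context would apply,
# which makes the difference between contexts easy to inspect
paper = sns.plotting_context("paper")
talk = sns.plotting_context("talk")
print(paper["font.size"], talk["font.size"])

# Used as a context manager, it applies a context only temporarily
with sns.plotting_context("talk"):
    pass  # plots created here would use the larger "talk" scaling
```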
❌ DON'T
- Pass 1D arrays manually - Avoid calls like `sns.scatterplot(x=x_array, y=y_array)` when the data already lives in a DataFrame; they bypass the Pandas integration.
- Rely on the Index - Unlike Matplotlib, Seaborn mostly ignores the DataFrame index; put the values you want to plot in columns instead.
- Overcrowd plots - Too many semantic mappings (hue, size, style) make graphs unreadable.
- Forget Matplotlib - Axes-level functions return a Matplotlib Axes and figure-level functions return a FacetGrid that wraps Matplotlib Axes; use methods such as `ax.set_title()` to tweak them.
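A short sketch of what "use the returned object" can look like in practice. The DataFrame and the column names (`x`, `y`, `kind`) are hypothetical, and the Agg backend is chosen only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption for this sketch
import pandas as pd
import seaborn as sns

# Hypothetical toy data
df = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "y": [1, 3, 2, 5],
    "kind": ["a", "b", "a", "b"],
})

# Axes-level call: the return value is a Matplotlib Axes
ax = sns.scatterplot(data=df, x="x", y="y", hue="kind")
ax.set_title("Axes-level title")
ax.set_xlabel("X value")

# Figure-level call: the return value is a FacetGrid; its Axes live in g.axes
g = sns.relplot(data=df, x="x", y="y", col="kind")
g.set_titles("kind = {col_name}")
for facet_ax in g.axes.flat:
    facet_ax.axhline(0, color="gray", linewidth=0.5)  # any Matplotlib call works here
```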
Anti-Patterns (NEVER)
```python
import seaborn as sns
import matplotlib.pyplot as plt

# df is a placeholder DataFrame with columns 'species', 'x', 'y', and 'val'

# ❌ BAD: Iterating through groups to plot manually
for s in df['species'].unique():
    subset = df[df['species'] == s]
    plt.scatter(subset['x'], subset['y'], label=s)

# ✅ GOOD: Let Seaborn handle grouping and legend
sns.scatterplot(data=df, x='x', y='y', hue='species')

# ❌ BAD: Mixing Seaborn and Matplotlib titles incorrectly
sns.displot(data=df, x='val')
plt.title("My Title")  # ⚠️ Might apply to the wrong axis in a FacetGrid!

# ✅ GOOD: Use the returned object
g = sns.displot(data=df, x='val')
g.set_axis_labels("Value", "Count")
g.figure.suptitle("Correct Global Title", y=1.05)
```
Relational Plots (relplot)
Scatter and Line Plots
```python
# Multi-faceted scatter plot
sns.relplot(
    data=tips, x="total_bill", y="tip",
    col="time", hue="day", style="sex",
    kind="scatter"
)

# Line plot with automatic aggregation (mean + 95% CI by default)
fmri = sns.load_dataset("fmri")
sns.relplot(
    data=fmri, x="timepoint", y="signal",
    hue="event", style="region",
    kind="line", errorbar="sd"  # "sd" for standard deviation instead of CI
)
```
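The automatic aggregation in the line plot can be pictured with plain pandas. The sketch below is only an approximation of what the line and the `errorbar="sd"` band summarize, using synthetic data with hypothetical column names rather than the fmri dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical repeated measurements: 20 observations at each of 5 timepoints
timepoints = np.repeat(np.arange(5), 20)
df = pd.DataFrame({
    "timepoint": timepoints,
    "signal": rng.normal(loc=timepoints, scale=1.0),
})

# Roughly what kind="line" with errorbar="sd" summarizes at each x value:
# the mean as the line, and one standard deviation either side as the band
summary = df.groupby("timepoint")["signal"].agg(["mean", "std"])
summary["lower"] = summary["mean"] - summary["std"]
summary["upper"] = summary["mean"] + summary["std"]
print(summary)
```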
Distribution Plots (displot)
Histograms and KDEs
```python
penguins = sns.load_dataset("penguins")

# Histogram with Kernel Density Estimate
sns.displot(data=penguins, x="flipper_length_mm", hue="species", kde=True)

# Bivariate distribution (KDE contours)
sns.displot(data=penguins, x="bill_length_mm", y="bill_depth_mm", hue="species", kind="kde")

# Empirical Cumulative Distribution (ECDF)
sns.displot(data=penguins, x="flipper_length_mm", hue="species", kind="ecdf")
```
Categorical Plots (catplot)
Comparisons and Distribution within categories
```python
# Boxplot (Show quartiles and outliers)
sns.catplot(data=tips, x="day", y="total_bill", kind="box")

# Violin plot (Show density and quartiles)
sns.catplot(data=tips, x="day", y="total_bill", hue="sex", kind="violin", split=True)

# Swarm plot (Show every point without overlap)
sns.catplot(data=tips, x="day", y="total_bill", kind="swarm")

# Bar plot (Show mean and error bars; ("pi", 95) is a 95% percentile interval)
sns.catplot(data=tips, x="day", y="total_bill", kind="bar", errorbar=("pi", 95))
```
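The bar plot above uses `errorbar=("pi", 95)` rather than the default confidence interval, and the two answer different questions. The sketch below contrasts them with NumPy on synthetic values; the hand-rolled bootstrap is an approximation of the idea, not seaborn's exact internals.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical values for a single category
values = rng.normal(loc=20.0, scale=5.0, size=500)

# ("pi", 95): percentile interval, the range holding the middle 95% of the observations
pi_low, pi_high = np.percentile(values, [2.5, 97.5])

# ("ci", 95): bootstrap confidence interval for the mean (the bar plot default)
boot_means = np.array([
    rng.choice(values, size=values.size, replace=True).mean()
    for _ in range(1000)
])
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])

print(f"pi width: {pi_high - pi_low:.2f}, ci width: {ci_high - ci_low:.2f}")
```

The percentile interval describes the spread of the data, so it stays wide as the sample grows; the confidence interval describes uncertainty about the mean, so it narrows.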
Regression Plots
Visualizing Linear Trends
```python
# Simple regression with scatter
sns.regplot(data=tips, x="total_bill", y="tip")

# Faceted regression
sns.lmplot(data=tips, x="total_bill", y="tip", col="smoker", hue="time")

# Logistic regression (for binary data; requires statsmodels)
# df is a placeholder DataFrame with a numeric "variable" and a 0/1 "binary_outcome" column
sns.lmplot(data=df, x="variable", y="binary_outcome", logistic=True)
```
Matrix Plots
Heatmaps and Clustering
```python
flights = sns.load_dataset("flights").pivot(index="month", columns="year", values="passengers")

# Heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(flights, annot=True, fmt="d", cmap="YlGnBu")

# Cluster map (Hierarchical clustering)
sns.clustermap(flights, standard_scale=1, cmap="mako")
```
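Unlike most of Seaborn, the matrix plots take a wide-form 2D table, which is why the flights data is pivoted first. A minimal sketch of that reshaping step on a made-up long-form table (the values are invented, not the real flights data):

```python
import pandas as pd

# Hypothetical long-form records with the same columns as a tidy passenger table
long_df = pd.DataFrame({
    "year": [2001, 2001, 2002, 2002],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "passengers": [112, 118, 115, 126],
})

# heatmap() expects a 2D matrix, so pivot to rows=month, columns=year
matrix = long_df.pivot(index="month", columns="year", values="passengers")
print(matrix)
```

Note that `fmt="d"` with `annot=True` assumes integer cells; if the pivot introduces missing values the matrix becomes float and a format such as `".0f"` is the safer choice.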
Grid Objects (Advanced)
Custom Multi-plot Layouts
```python
# JointPlot (Scatter + Marginals)
sns.jointplot(data=penguins, x="bill_length_mm", y="bill_depth_mm", hue="species", kind="kde")

# PairPlot (All-against-all relations)
sns.pairplot(data=penguins, hue="species", corner=True)

# Custom FacetGrid
g = sns.FacetGrid(tips, col="time", row="sex")
g.map(sns.scatterplot, "total_bill", "tip")
```
Styling and Aesthetics
Themes and Palettes
```python
# Set overall look
sns.set_style("darkgrid")  # white, dark, whitegrid, ticks
sns.set_context("talk")    # paper, notebook, talk, poster

# Custom palettes
sns.set_palette("husl")  # Set global palette
my_pal = sns.color_palette("rocket", as_cmap=True)  # Get palette as a Matplotlib colormap

# Viewing a palette
sns.palplot(sns.color_palette("Set2"))
```
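It can help to see what the palette functions actually return. The sketch below inspects a discrete palette and a continuous colormap; the palette names are the ones used above and `n_colors=4` is an arbitrary choice for illustration.

```python
import seaborn as sns
from matplotlib.colors import Colormap

# A discrete palette behaves like a list of RGB tuples
discrete = sns.color_palette("Set2", n_colors=4)
print(len(discrete), discrete[0])

# as_hex() gives strings that can be passed to any Matplotlib color argument
hex_codes = discrete.as_hex()

# as_cmap=True returns a continuous Matplotlib colormap instead of a list
cmap = sns.color_palette("rocket", as_cmap=True)
```

Discrete palettes suit `hue` with categorical data, while a colormap is what functions such as `sns.heatmap(..., cmap=cmap)` expect.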
Practical Workflows
1. Exploratory Data Analysis (EDA) Pipeline
```python
def initial_eda(df, target_col):
    """Generate basic visual summary of a dataset."""
    # 1. Distribution of target (a KDE overlay is only meaningful for numeric data)
    is_numeric = pd.api.types.is_numeric_dtype(df[target_col])
    sns.displot(data=df, x=target_col, kde=is_numeric)
    # 2. Pairwise relations of numeric features
    sns.pairplot(data=df, hue=target_col if df[target_col].nunique() < 10 else None)
    # 3. Correlation heatmap
    plt.figure(figsize=(12, 10))
    sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='coolwarm', fmt=".2f")

iris = sns.load_dataset("iris")
initial_eda(iris, "species")
```
2. Scientific Result Comparison
```python
def plot_experiment_results(df):
    """Plot results of an experiment with multiple conditions."""
    g = sns.catplot(
        data=df, kind="bar",
        x="condition", y="metric", hue="group",
        palette="viridis", alpha=.6, height=6
    )
    g.despine(left=True)
    g.set_axis_labels("Experimental Condition", "Accuracy (%)")
    g.legend.set_title("User Group")
    return g
```
3. Time-Series Trends by Category
```python
def plot_trends(df, time_col, val_col, cat_col):
    """Visualizes trends over time with confidence intervals."""
    plt.figure(figsize=(12, 6))
    sns.lineplot(
        data=df, x=time_col, y=val_col, hue=cat_col,
        marker="o", err_style="bars"
    )
    plt.xticks(rotation=45)
    plt.tight_layout()
```
Common Pitfalls and Solutions
Legend Outside the Plot
```python
# ❌ Problem: Legend covers data in narrow plots
# ✅ Solution: Move legend manually using Matplotlib logic
ax = sns.scatterplot(data=tips, x="total_bill", y="tip", hue="day")  # returns a Matplotlib Axes
sns.move_legend(ax, "upper left", bbox_to_anchor=(1, 1))
```
Slow Performance with Large Data
```python
# ❌ Problem: sns.pairplot(large_df) hangs
# ✅ Solution: Sample data or use simpler plots
sns.pairplot(df.sample(1000), hue='category')

# OR use hist instead of scatter
sns.jointplot(data=df, x='x', y='y', kind="hist")
```
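One caveat with the sampling approach: `df.sample(1000)` raises an error when the frame has fewer than 1000 rows, and an unseeded sample changes the plot on every run. A hedged sketch of a more defensive version follows, using a synthetic frame and hypothetical column names in place of `df`.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
# Hypothetical large table standing in for `df`
df = pd.DataFrame({
    "x": rng.normal(size=50_000),
    "y": rng.normal(size=50_000),
    "category": rng.choice(["a", "b", "c"], size=50_000),
})

# Cap the sample size so the call also works on small frames,
# and fix random_state so the plot is reproducible
n = min(len(df), 1000)
sampled = df.sample(n=n, random_state=0)

# Optional: sample the same fraction within each category to keep group proportions
stratified = df.groupby("category").sample(frac=n / len(df), random_state=0)
```

Either `sampled` or `stratified` can then be passed to `sns.pairplot(..., hue="category")`.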
Overlapping Labels
```python
# ❌ Problem: Categorical labels on X-axis overlap
# ✅ Solution: Rotate labels using Matplotlib
ax = sns.boxplot(data=df, x='very_long_category_name', y='value')  # returns a Matplotlib Axes
plt.setp(ax.get_xticklabels(), rotation=45, horizontalalignment='right')
```
Best Practices
- Use tidy data format - Ensure your DataFrame is in long-form (one row per observation)
- Prefer figure-level functions - Use `relplot`, `displot`, and `catplot` for better default layouts and faceting
- Always use the `data=` parameter - Pass the DataFrame directly to keep code clean and readable
- Set themes early - Use `sns.set_theme()` at the beginning of your script for consistent styling
- Leverage semantic mappings - Use `hue`, `size`, and `style` to add dimensions to your plots
- Choose appropriate context - Use `sns.set_context("paper")` for publications or "talk" for presentations
- Remember Seaborn returns Matplotlib objects - Use Matplotlib methods like `ax.set_title()` for fine-tuning
- Don't overcrowd plots - Limit semantic mappings to maintain readability
- Use figure-level functions for faceting - They handle subplot layouts automatically
- Sample large datasets - Use `df.sample()` before plotting to improve performance with big data

Seaborn makes statistical visualization a joy by providing high-level abstractions that produce beautiful, publication-quality graphics with minimal effort.