Data Visualization
Create compelling visualizations to explore and communicate data insights.
Quick Start
Matplotlib Basics
```python
import matplotlib.pyplot as plt

# Line plot (x and y are equal-length sequences of numbers)
plt.figure(figsize=(10, 6))
plt.plot(x, y, marker='o', linestyle='-', color='blue', label='Series 1')
plt.xlabel('X Label')
plt.ylabel('Y Label')
plt.title('Title')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

# Bar chart (categories holds the labels, values the bar heights)
plt.bar(categories, values, color='skyblue', edgecolor='black')
plt.xlabel('Categories')
plt.ylabel('Values')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
```
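The snippets above assume that `x`, `y`, `categories`, and `values` already exist. As a minimal, self-contained sketch, the version below fills them with made-up placeholder numbers (not real data) and uses Matplotlib's object-oriented interface, so each chart sits on its own `Axes`. The `Agg` backend line is an assumption that lets the sketch run without a display; remove it and call `plt.show()` to open a window.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove for interactive use
import matplotlib.pyplot as plt

# Placeholder data, invented purely for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]
categories = ["A", "B", "C"]
values = [10, 24, 17]

# One figure, two side-by-side axes
fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(12, 5))

# Line plot on the left axes
ax_line.plot(x, y, marker="o", linestyle="-", color="blue", label="Series 1")
ax_line.set_xlabel("X Label")
ax_line.set_ylabel("Y Label")
ax_line.set_title("Line plot")
ax_line.legend()
ax_line.grid(True, alpha=0.3)

# Bar chart on the right axes
bars = ax_bar.bar(categories, values, color="skyblue", edgecolor="black")
ax_bar.set_xlabel("Categories")
ax_bar.set_ylabel("Values")
ax_bar.set_title("Bar chart")
ax_bar.tick_params(axis="x", labelrotation=45)

fig.tight_layout()
```

Working on explicit `Axes` objects rather than the implicit "current figure" tends to scale better once a script draws more than one chart.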
Seaborn for Statistical Plots
```python
import seaborn as sns

# Set style
sns.set_style("whitegrid")

# Distribution
sns.histplot(data=df, x='value', kde=True, bins=30)

# Box plot
sns.boxplot(data=df, x='category', y='value')

# Violin plot
sns.violinplot(data=df, x='category', y='value')

# Heatmap (numeric_only=True skips text columns such as 'category',
# which would otherwise raise an error in pandas 2.0 and later)
corr = df.corr(numeric_only=True)
sns.heatmap(corr, annot=True, cmap='coolwarm', center=0)

# Pairplot
sns.pairplot(df, hue='target', diag_kind='kde')
```
Exploratory Data Analysis
```python
import matplotlib.pyplot as plt
import seaborn as sns

# Quick overview
df.info()
df.describe()

# Missing values
df.isnull().sum()

# Value counts
df['category'].value_counts().plot(kind='bar')

# Distribution
df.hist(figsize=(12, 10), bins=30)
plt.tight_layout()
plt.show()

# Correlation matrix (numeric columns only)
plt.figure(figsize=(10, 8))
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='coolwarm',
            center=0, square=True)
plt.title('Correlation Matrix')
plt.show()
```
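The missing-value check above returns raw counts only. One possible extension, sketched here with a hypothetical helper named `missing_report` and a tiny made-up frame, adds the share of rows affected and sorts the worst columns first.

```python
import numpy as np
import pandas as pd


def missing_report(frame: pd.DataFrame) -> pd.DataFrame:
    """Count and percentage of missing values per column, worst first."""
    counts = frame.isnull().sum()
    report = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(frame) * 100).round(1),
    })
    return report.sort_values("missing", ascending=False)


# Tiny made-up frame, for illustration only
sample = pd.DataFrame({
    "value": [1.0, np.nan, 3.0, np.nan],
    "category": ["A", "B", None, "A"],
    "year": [2020, 2021, 2022, 2023],
})

report = missing_report(sample)
print(report)
```

Seeing the percentage next to the count makes it easier to judge whether a column can be imputed or should be dropped.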
Interactive Visualizations with Plotly
```python
import plotly.express as px
import plotly.graph_objects as go

# Interactive scatter
fig = px.scatter(df, x='feature1', y='target',
                 color='category', size='value',
                 hover_data=['name', 'date'],
                 title='Interactive Scatter Plot')
fig.show()

# Time series
fig = px.line(df, x='date', y='value', color='category',
              title='Time Series')
fig.update_xaxes(rangeslider_visible=True)
fig.show()

# 3D scatter
fig = px.scatter_3d(df, x='x', y='y', z='z',
                    color='category', size='value')
fig.show()
```
Dashboard with Plotly Dash
```python
import dash
import plotly.express as px
from dash import dcc, html
from dash.dependencies import Input, Output

# df is assumed to have 'category', 'year', 'date', and 'sales' columns
app = dash.Dash(__name__)

app.layout = html.Div([
    html.H1('Sales Dashboard'),
    dcc.Dropdown(
        id='category-dropdown',
        options=[{'label': cat, 'value': cat}
                 for cat in df['category'].unique()],
        value=df['category'].unique()[0]
    ),
    dcc.Graph(id='sales-graph'),
    dcc.RangeSlider(
        id='year-slider',
        min=df['year'].min(),
        max=df['year'].max(),
        value=[df['year'].min(), df['year'].max()],
        marks={str(year): str(year)
               for year in df['year'].unique()}
    )
])


@app.callback(
    Output('sales-graph', 'figure'),
    [Input('category-dropdown', 'value'),
     Input('year-slider', 'value')]
)
def update_graph(selected_category, year_range):
    # Keep rows for the chosen category within the selected year range
    filtered_df = df[
        (df['category'] == selected_category) &
        (df['year'] >= year_range[0]) &
        (df['year'] <= year_range[1])
    ]
    fig = px.line(filtered_df, x='date', y='sales')
    return fig


if __name__ == '__main__':
    # app.run is the current entry point; the older app.run_server
    # spelling is deprecated in recent Dash releases
    app.run(debug=True)
```
Subplots
```python
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Top left
axes[0, 0].hist(data1, bins=30)
axes[0, 0].set_title('Histogram')

# Top right
axes[0, 1].scatter(x, y)
axes[0, 1].set_title('Scatter')

# Bottom left
axes[1, 0].plot(x, y)
axes[1, 0].set_title('Line Plot')

# Bottom right
axes[1, 1].boxplot([data1, data2, data3])
axes[1, 1].set_title('Box Plot')

plt.tight_layout()
plt.show()
```
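The grid above relies on `data1`, `data2`, `data3`, `x`, and `y` being defined elsewhere. The sketch below fills them with synthetic placeholder values (seeded random numbers and a sine curve, chosen only for illustration) and uses `axes.flat` to apply titles and a light grid to every panel in one loop.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Synthetic placeholder data
rng = np.random.default_rng(42)
data1, data2, data3 = (rng.normal(loc=m, size=100) for m in (0, 1, 2))
x = np.linspace(0, 10, 50)
y = np.sin(x)

fig, axes = plt.subplots(2, 2, figsize=(12, 10))
axes[0, 0].hist(data1, bins=30)
axes[0, 1].scatter(x, y)
axes[1, 0].plot(x, y)
axes[1, 1].boxplot([data1, data2, data3])

# axes.flat walks the 2x2 grid row by row
titles = ["Histogram", "Scatter", "Line Plot", "Box Plot"]
for ax, title in zip(axes.flat, titles):
    ax.set_title(title)
    ax.grid(True, alpha=0.3)

fig.suptitle("Four views of placeholder data")
fig.tight_layout()
```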
Visualization Best Practices
- Choose the right chart type:
  - Comparison: Bar chart
  - Distribution: Histogram, box plot
  - Relationship: Scatter plot
  - Time series: Line chart
  - Composition: Pie chart, stacked bar
- Design principles:
  - Clear labels and titles
  - Appropriate color schemes
  - Remove chart junk
  - Consistent formatting
  - Accessibility (color-blind friendly)
- Common pitfalls to avoid:
  - Misleading axes (non-zero baseline)
  - Too many colors
  - 3D charts (distort perception)
  - Pie charts with many categories
  - Dual y-axes (confusing)
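To see why a non-zero baseline misleads, it helps to compare how tall two bars are drawn with and without it. The sketch below is a small arithmetic illustration with a hypothetical helper, `apparent_ratio`, and made-up values of 95 and 100.

```python
def apparent_ratio(small, large, baseline=0.0):
    """Ratio of drawn bar heights when the value axis starts at `baseline`."""
    if baseline >= small:
        raise ValueError("baseline must be below the smaller value")
    return (large - baseline) / (small - baseline)


# Made-up values: the real difference is about 5 percent
true_ratio = apparent_ratio(95, 100)                     # axis starts at 0
truncated_ratio = apparent_ratio(95, 100, baseline=90)   # axis starts at 90

print(f"zero baseline: {true_ratio:.2f}x, baseline at 90: {truncated_ratio:.2f}x")
# zero baseline: 1.05x, baseline at 90: 2.00x
```

With the axis cut at 90, one bar is drawn twice as tall as the other even though the values differ by roughly 5 percent.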
Color Palettes
```python
# Seaborn palettes
sns.color_palette("viridis", as_cmap=True)
sns.color_palette("coolwarm", as_cmap=True)
sns.color_palette("Set2")

# Custom colors
colors = ['#FF6B6B', '#4ECDC4', '#45B7D1', '#FFA07A']
```
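The custom `colors` list above is defined but not yet applied to a chart. One way to use it, sketched here with Matplotlib and placeholder series, is to set it as the color cycle of an `Axes` so each new line picks up the next color in order.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove for interactive use
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex

colors = ['#FF6B6B', '#4ECDC4', '#45B7D1', '#FFA07A']

fig, ax = plt.subplots(figsize=(8, 5))

# Must be set before plotting: each new line takes the next color
ax.set_prop_cycle(color=colors)

# Placeholder series, one per custom color
x = [0, 1, 2, 3]
lines = []
for i in range(len(colors)):
    (line,) = ax.plot(x, [v + i for v in x], label=f"Series {i + 1}")
    lines.append(line)

ax.legend()
print([to_hex(line.get_color()) for line in lines])
```

Seaborn users could reach a similar result with `sns.set_palette(colors)`, which changes the default cycle for all later plots.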
Export Figures
```python
# High-resolution PNG
plt.savefig('figure.png', dpi=300, bbox_inches='tight')

# Vector format (PDF, SVG)
plt.savefig('figure.pdf', bbox_inches='tight')
plt.savefig('figure.svg', bbox_inches='tight')
```
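The `plt.savefig` calls above act on whichever figure is current. A slightly more explicit sketch, assuming a throwaway temporary folder as the destination, saves from the figure object itself and loops over the formats. In a script, saving before any `plt.show()` call is the safer order, since the figure may already be closed once the window is dismissed.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove for interactive use
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot([0, 1, 2], [0, 1, 4])  # placeholder data
ax.set_title("Export sketch")

# Temporary folder (assumption); swap in your own output directory
out_dir = Path(tempfile.mkdtemp())

saved = []
for ext in ("png", "pdf", "svg"):
    path = out_dir / f"figure.{ext}"
    # dpi mainly affects raster output such as PNG
    fig.savefig(path, dpi=300, bbox_inches="tight")
    saved.append(path)

plt.close(fig)  # release the figure once the files are written
print([p.name for p in saved])
```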