Streamlit Data Application Skill
Build beautiful, interactive data applications with pure Python. Transform data scripts into shareable web apps in minutes with widgets, charts, and layouts.
When to Use This Skill
USE Streamlit when:
- Rapid prototyping - Need to build a data app quickly
- Internal tools - Creating tools for your team
- Data exploration - Interactive exploration of datasets
- Demo applications - Showcasing data science projects
- ML model demos - Building interfaces for model inference
- Simple dashboards - Quick insights without complex setup
- Python-only development - No JavaScript/frontend knowledge required
DON'T USE Streamlit when:
- Complex interactivity - Need fine-grained callback control (use Dash)
- Enterprise deployment - Require advanced authentication/scaling (use Dash Enterprise)
- Custom components - Heavy custom JavaScript requirements
- High-traffic production - Thousands of concurrent users
- Real-time streaming - Sub-second update requirements
Prerequisites
```bash
# Basic installation
pip install streamlit

# With common extras
pip install streamlit plotly pandas polars

# Using uv (recommended)
uv pip install streamlit plotly pandas polars altair

# Verify installation
streamlit hello
```
Core Capabilities
1. Basic Application Structure
**Minimal App (app.py):**
```python
import streamlit as st
import pandas as pd
import polars as pl

# Page configuration (must be first Streamlit command)
st.set_page_config(
    page_title="My Data App",
    page_icon="📊",
    layout="wide",
    initial_sidebar_state="expanded"
)

# Title and header
st.title("My Data Application")
st.header("Welcome to the Dashboard")
st.subheader("Data Analysis Section")

# Text elements
st.text("This is plain text")
st.markdown("**Bold** and *italic* text with links")
st.caption("This is a caption for additional context")
st.code("print('Hello, Streamlit!')", language="python")

# Display data
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["NYC", "LA", "Chicago"]
})

st.dataframe(df)  # Interactive table
st.table(df)      # Static table
st.json({"key": "value", "list": [1, 2, 3]})

# Metrics
col1, col2, col3 = st.columns(3)
col1.metric("Revenue", "$1.2M", "+12%")
col2.metric("Users", "10,234", "-2%")
col3.metric("Conversion", "3.2%", "+0.5%")
```
**Run the app:**
```bash
streamlit run app.py
```
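The delta strings passed to `st.metric` in the minimal app are hard-coded. A minimal sketch of deriving them from two numbers follows. `format_delta` is a hypothetical helper (not part of Streamlit), and it assumes a percentage change against the previous value is the delta you want to show.

```python
from typing import Optional


def format_delta(current: float, previous: float) -> Optional[str]:
    """Return a signed percentage string such as '+12.0%', or None if undefined."""
    if previous == 0:
        # Percentage change has no meaning against a zero baseline
        return None
    change = (current - previous) / abs(previous) * 100
    # The '+' format flag forces an explicit sign for positive values
    return f"{change:+.1f}%"


print(format_delta(110, 100))
print(format_delta(98, 100))
```

In an app the result would be passed as the third argument, for example `col1.metric("Revenue", "$1.2M", format_delta(current, previous))`. Returning `None` for a zero baseline simply leaves the delta out.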
2. Widgets and User Input
**Input Widgets:**
```python
import streamlit as st
import pandas as pd
from datetime import datetime, date

# Text inputs
name = st.text_input("Enter your name", value="User")
bio = st.text_area("Tell us about yourself", height=100)
password = st.text_input("Password", type="password")

# Numeric inputs
age = st.number_input("Age", min_value=0, max_value=120, value=25, step=1)
price = st.slider("Price Range", 0.0, 100.0, (25.0, 75.0))  # Range slider
rating = st.slider("Rating", 1, 5, 3)

# Selection widgets
option = st.selectbox("Choose an option", ["Option A", "Option B", "Option C"])
options = st.multiselect("Select multiple", ["Red", "Green", "Blue"], default=["Red"])
radio_choice = st.radio("Pick one", ["Small", "Medium", "Large"], horizontal=True)

# Boolean inputs
agree = st.checkbox("I agree to the terms")
toggle = st.toggle("Enable feature")

# Date and time
selected_date = st.date_input("Select a date", value=date.today())
date_range = st.date_input(
    "Date range",
    value=(date(2025, 1, 1), date.today()),
    format="YYYY-MM-DD"
)
selected_time = st.time_input("Select a time")

# File upload (restricted to CSV because the file is parsed with pd.read_csv)
uploaded_file = st.file_uploader("Upload a CSV file", type=["csv"])
if uploaded_file is not None:
    df = pd.read_csv(uploaded_file)
    st.write(f"Loaded {len(df)} rows")

# Color picker
color = st.color_picker("Pick a color", "#00FF00")

# Buttons
if st.button("Click me"):
    st.write("Button clicked!")

# Download button
@st.cache_data
def get_data():
    return pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})

csv = get_data().to_csv(index=False)
st.download_button(
    label="Download CSV",
    data=csv,
    file_name="data.csv",
    mime="text/csv"
)
```
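The download button above gets its text from a DataFrame. When the data is a plain list of dicts instead, the standard library can produce the same kind of CSV string. The sketch below assumes every row shares the keys of the first row; `rows_to_csv` is a hypothetical helper name.

```python
import csv
import io


def rows_to_csv(rows: list[dict]) -> str:
    """Serialize a list of same-keyed dicts to CSV text (header taken from the first row)."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv([{"x": 1, "y": 4}, {"x": 2, "y": 5}])
print(csv_text)
```

The resulting string can be handed to `st.download_button(label="Download CSV", data=csv_text, file_name="data.csv", mime="text/csv")` in the same way as the pandas output.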
3. Layout and Organization
**Columns:**
```python
import streamlit as st

# Equal columns
col1, col2, col3 = st.columns(3)

with col1:
    st.header("Column 1")
    st.write("Content for column 1")

with col2:
    st.header("Column 2")
    st.metric("Metric", "100")

with col3:
    st.header("Column 3")
    st.button("Action")

# Unequal columns
left, right = st.columns([2, 1])  # 2:1 ratio

with left:
    st.write("Wider column")

with right:
    st.write("Narrower column")
```
**Sidebar:**
```python
import streamlit as st

# Sidebar content
st.sidebar.title("Navigation")
st.sidebar.header("Filters")

# Sidebar widgets
category = st.sidebar.selectbox("Category", ["All", "A", "B", "C"])
min_value = st.sidebar.slider("Minimum Value", 0, 100, 25)
show_raw = st.sidebar.checkbox("Show raw data")

# Using 'with' syntax
with st.sidebar:
    st.header("Settings")
    theme = st.radio("Theme", ["Light", "Dark"])
    st.divider()
    st.caption("App v1.0.0")
```
**Tabs:**
```python
import streamlit as st

tab1, tab2, tab3 = st.tabs(["📈 Chart", "📊 Data", "⚙️ Settings"])

with tab1:
    st.header("Chart View")
    # Add chart here

with tab2:
    st.header("Data View")
    # Add dataframe here

with tab3:
    st.header("Settings")
    # Add settings here
```
**Expanders and Containers:**
```python
import streamlit as st

# Expander (collapsible section)
with st.expander("Click to expand"):
    st.write("Hidden content revealed!")
    st.code("print('Hello')")

# Container (grouping elements)
with st.container():
    st.write("This is inside a container")
    col1, col2 = st.columns(2)
    col1.write("Left")
    col2.write("Right")

# Container with border
with st.container(border=True):
    st.write("Content with border")

# Empty placeholder (for dynamic updates)
placeholder = st.empty()
placeholder.text("Initial text")
# Later: placeholder.text("Updated text")
```
4. Data Visualization
**Plotly Integration:**
```python
import streamlit as st
import plotly.express as px
import plotly.graph_objects as go
import pandas as pd

# Sample data
df = pd.DataFrame({
    "date": pd.date_range("2025-01-01", periods=100),
    "value": [i + (i % 7) * 5 for i in range(100)],
    "category": ["A", "B", "C", "D"] * 25
})

# Plotly Express charts
fig = px.line(df, x="date", y="value", color="category", title="Time Series")
st.plotly_chart(fig, use_container_width=True)

# Scatter plot
fig_scatter = px.scatter(
    df, x="date", y="value",
    color="category", size="value",
    hover_data=["category"]
)
st.plotly_chart(fig_scatter, use_container_width=True)

# Bar chart
category_totals = df.groupby("category")["value"].sum().reset_index()
fig_bar = px.bar(category_totals, x="category", y="value", title="Category Totals")
st.plotly_chart(fig_bar, use_container_width=True)

# Graph Objects for more control
fig_go = go.Figure()
fig_go.add_trace(go.Scatter(
    x=df["date"],
    y=df["value"],
    mode="lines+markers",
    name="Values"
))
fig_go.update_layout(title="Custom Plotly Chart", hovermode="x unified")
st.plotly_chart(fig_go, use_container_width=True)
```
**Built-in Charts:**
```python
import streamlit as st
import pandas as pd
import numpy as np

# Sample data
chart_data = pd.DataFrame(
    np.random.randn(20, 3),
    columns=["A", "B", "C"]
)

# Simple line chart
st.line_chart(chart_data)

# Area chart
st.area_chart(chart_data)

# Bar chart
st.bar_chart(chart_data)

# Scatter chart (Streamlit 1.26+)
scatter_data = pd.DataFrame({
    "x": np.random.randn(100),
    "y": np.random.randn(100),
    "size": np.random.rand(100) * 100
})
st.scatter_chart(scatter_data, x="x", y="y", size="size")

# Map
map_data = pd.DataFrame({
    "lat": np.random.randn(100) / 50 + 37.76,
    "lon": np.random.randn(100) / 50 - 122.4
})
st.map(map_data)
```
**Matplotlib Integration:**
```python
import streamlit as st
import matplotlib.pyplot as plt
import numpy as np

# Create matplotlib figure
fig, ax = plt.subplots(figsize=(10, 6))
x = np.linspace(0, 10, 100)
ax.plot(x, np.sin(x), label="sin(x)")
ax.plot(x, np.cos(x), label="cos(x)")
ax.legend()
ax.set_title("Matplotlib Chart")

# Display in Streamlit
st.pyplot(fig)
```
5. Caching for Performance
**Cache Data (for expensive data operations):**
```python
import streamlit as st
import pandas as pd
import polars as pl
import time

@st.cache_data
def load_data(file_path: str) -> pd.DataFrame:
    """Load and cache data. Cache key: file_path."""
    time.sleep(2)  # Simulate slow load
    return pd.read_csv(file_path)

@st.cache_data(ttl=3600)  # Cache expires after 1 hour
def fetch_api_data(endpoint: str) -> dict:
    """Fetch data from API with time-based cache."""
    import requests
    response = requests.get(endpoint)
    return response.json()

@st.cache_data(show_spinner="Loading data...")
def load_with_spinner(path: str) -> pl.DataFrame:
    """Show custom spinner while loading."""
    return pl.read_parquet(path)

# Using cached functions
df = load_data("data/sales.csv")  # First call: slow
df = load_data("data/sales.csv")  # Second call: instant (cached)

# Clear cache programmatically
if st.button("Clear cache"):
    st.cache_data.clear()
```
**Cache Resources (for global resources):**
```python
import streamlit as st
from sqlalchemy import create_engine

@st.cache_resource
def get_database_connection():
    """Cache database connection (singleton pattern)."""
    return create_engine("postgresql://user:pass@localhost/db")

@st.cache_resource
def load_ml_model():
    """Cache ML model (loaded once and shared across sessions)."""
    import joblib
    return joblib.load("model.pkl")

# Use cached resources
engine = get_database_connection()
model = load_ml_model()
```
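To make the idea behind `ttl=3600` concrete, here is a simplified, standard-library-only sketch of a cache keyed by function arguments with time-based expiry. It is an illustration of the concept under stated assumptions, not how Streamlit implements `st.cache_data`. `ttl_cache` is a hypothetical decorator; positional hashable arguments are assumed.

```python
import time
from functools import wraps


def ttl_cache(ttl_seconds: float):
    """Cache results per argument tuple; recompute once an entry is older than ttl_seconds."""
    def decorator(func):
        store = {}  # maps args -> (timestamp, result)

        @wraps(func)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit is not None and now - hit[0] < ttl_seconds:
                return hit[1]  # fresh entry: skip the expensive call
            result = func(*args)
            store[args] = (now, result)
            return result

        wrapper.clear = store.clear  # rough analogue of clearing the cache
        return wrapper
    return decorator


calls = []  # records every real (uncached) execution


@ttl_cache(ttl_seconds=60)
def load(path: str) -> str:
    calls.append(path)
    return f"contents of {path}"


load("data/sales.csv")
load("data/sales.csv")  # same argument: served from the cache
load("data/other.csv")  # different argument: a new cache entry
print(len(calls))
```

The second call with the same path never reaches the function body, which is the behaviour the "First call: slow / Second call: instant" comments above describe.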
6. Session State
**Managing State:**
```python
import streamlit as st

# Initialize state
if "counter" not in st.session_state:
    st.session_state.counter = 0

if "messages" not in st.session_state:
    st.session_state.messages = []

# Display current state
st.write(f"Counter: {st.session_state.counter}")

# Update state with buttons
col1, col2, col3 = st.columns(3)

if col1.button("Increment"):
    st.session_state.counter += 1
    st.rerun()

if col2.button("Decrement"):
    st.session_state.counter -= 1
    st.rerun()

if col3.button("Reset"):
    st.session_state.counter = 0
    st.rerun()

# State with widgets
st.text_input("Name", key="user_name")
st.write(f"Hello, {st.session_state.user_name}!")

# State callback
def on_change():
    st.session_state.processed = st.session_state.raw_input.upper()

st.text_input("Raw input", key="raw_input", on_change=on_change)
if "processed" in st.session_state:
    st.write(f"Processed: {st.session_state.processed}")
```
**Form State:**
```python
import streamlit as st

# Forms prevent rerunning on every widget change
with st.form("my_form"):
    st.write("Submit all at once:")

    name = st.text_input("Name")
    age = st.number_input("Age", min_value=0, max_value=120)
    color = st.selectbox("Favorite color", ["Red", "Green", "Blue"])

    # Every form needs a submit button
    submitted = st.form_submit_button("Submit")

if submitted:
    st.success(f"Thanks {name}! You're {age} and like {color}.")
```
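Session state matters because Streamlit reruns the script from top to bottom on each interaction. The sketch below models that with a plain function and a dict standing in for `st.session_state`. `run_script` and its `clicked` parameter are hypothetical names used only for illustration.

```python
from typing import Optional


def run_script(session_state: dict, clicked: Optional[str] = None) -> str:
    """One top-to-bottom run of a counter app; `clicked` names the pressed button, if any."""
    # Initialize once: later runs find the key and skip this branch
    if "counter" not in session_state:
        session_state["counter"] = 0

    if clicked == "Increment":
        session_state["counter"] += 1
    elif clicked == "Decrement":
        session_state["counter"] -= 1
    elif clicked == "Reset":
        session_state["counter"] = 0

    local_total = 0  # an ordinary variable: recreated on every run
    local_total += 1
    return f"Counter: {session_state['counter']}, local_total: {local_total}"


state = {}  # stands in for st.session_state, which outlives individual reruns
run_script(state)
run_script(state, clicked="Increment")
print(run_script(state, clicked="Increment"))  # Counter: 2, local_total: 1
```

The counter survives because it lives in the state dict, while `local_total` starts over on every run. The same reasoning explains the `if "counter" not in st.session_state` guard.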
7. Multi-Page Applications
**Directory Structure:**
```
my_app/
├── app.py              # Main entry point (optional)
├── pages/
│   ├── 1_📊_Dashboard.py
│   ├── 2_📈_Analytics.py
│   └── 3_⚙️_Settings.py
└── utils/
    └── helpers.py
```
**Main App (app.py):**
```python
import streamlit as st

st.set_page_config(
    page_title="Multi-Page App",
    page_icon="🏠",
    layout="wide"
)

st.title("Welcome to My App")
st.write("Use the sidebar to navigate between pages.")

# Shared state initialization
if "user" not in st.session_state:
    st.session_state.user = None
```
**Page 1 (pages/1_📊_Dashboard.py):**
```python
import streamlit as st

st.set_page_config(page_title="Dashboard", page_icon="📊")

st.title("📊 Dashboard")
st.write("This is the dashboard page")

# Access shared state
if st.session_state.get("user"):
    st.write(f"Welcome back, {st.session_state.user}!")
```
**Page 2 (pages/2_📈_Analytics.py):**
```python
import streamlit as st

st.set_page_config(page_title="Analytics", page_icon="📈")

st.title("📈 Analytics")
st.write("This is the analytics page")

# Add analytics content
```
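The numeric prefixes in the `pages/` file names control sidebar order, and the rest of the name becomes the page label. The sketch below approximates that naming convention. The regular expression and the helper `page_order_and_label` are assumptions for illustration, not Streamlit's actual parser, and behaviour may differ between versions.

```python
import re

# Assumed pattern: optional leading digits, optional separators, then the label
PAGE_FILE_PATTERN = re.compile(r"^(\d*)[_ -]*(.*)\.py$")


def page_order_and_label(filename: str) -> tuple:
    """Split a page file name into (sort order, sidebar label)."""
    match = PAGE_FILE_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a page script: {filename}")
    number, rest = match.groups()
    # Files without a numeric prefix are sorted after the numbered ones
    order = int(number) if number else float("inf")
    label = rest.replace("_", " ").strip()
    return order, label


files = ["3_⚙️_Settings.py", "1_📊_Dashboard.py", "2_📈_Analytics.py"]
for order, label in sorted(page_order_and_label(name) for name in files):
    print(order, label)
```

Under these assumptions, `1_📊_Dashboard.py` yields order `1` and label `📊 Dashboard`, which matches the directory layout shown above.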
8. Advanced Features
**Status and Progress:**
```python
import streamlit as st
import time

# Progress bar
progress = st.progress(0, text="Processing...")
for i in range(100):
    time.sleep(0.01)
    progress.progress(i + 1, text=f"Processing... {i+1}%")

# Spinner
with st.spinner("Loading data..."):
    time.sleep(2)
st.success("Done!")

# Status messages
st.success("Operation successful!")
st.info("This is informational")
st.warning("Warning: Check your inputs")
st.error("An error occurred")
st.exception(ValueError("Example exception"))

# Toast notifications
st.toast("Data saved!", icon="✅")

# Balloons and snow
st.balloons()
st.snow()
```
**Chat Interface:**
```python
import streamlit as st
import time

st.title("Chat Demo")

# Initialize chat history
if "messages" not in st.session_state:
    st.session_state.messages = []

# Display chat history
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# Chat input
if prompt := st.chat_input("What's on your mind?"):
    # Add user message
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    # Generate response
    with st.chat_message("assistant"):
        response = f"You said: {prompt}"
        st.markdown(response)

    st.session_state.messages.append({"role": "assistant", "content": response})
```
**Data Editor:**
```python
import streamlit as st
import pandas as pd

# Editable dataframe
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Active": [True, False, True]
})

edited_df = st.data_editor(
    df,
    num_rows="dynamic",  # Allow adding/deleting rows
    column_config={
        "Name": st.column_config.TextColumn("Name", required=True),
        "Age": st.column_config.NumberColumn("Age", min_value=0, max_value=120),
        "Active": st.column_config.CheckboxColumn("Active")
    }
)

if st.button("Save changes"):
    st.write("Saved:", edited_df)
```
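The chat example keeps its history as a list of `{"role", "content"}` dicts in session state. Separating that bookkeeping from the UI calls makes it easy to test on its own. The sketch below uses hypothetical helpers `add_turn` and `trim_history`; the cap of 20 messages is an arbitrary assumption.

```python
def add_turn(messages: list, prompt: str) -> list:
    """Append the user's prompt and an echo reply, mirroring the demo above."""
    messages.append({"role": "user", "content": prompt})
    messages.append({"role": "assistant", "content": f"You said: {prompt}"})
    return messages


def trim_history(messages: list, max_messages: int = 20) -> list:
    """Keep only the most recent messages so the history cannot grow without bound."""
    return messages[-max_messages:]


history = []
add_turn(history, "hello")
add_turn(history, "how are you?")
for message in trim_history(history, max_messages=2):
    print(message["role"], "->", message["content"])
```

In the app, `history` would be `st.session_state.messages`, and the rendering loop would iterate over the trimmed list.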
Complete Examples
Example 1: Sales Dashboard
示例1:销售仪表板
python
import streamlit as st
import pandas as pd
import polars as pl
import plotly.express as px
import plotly.graph_objects as go
from datetime import datetime, timedeltapython
import streamlit as st
import pandas as pd
import polars as pl
import plotly.express as px
import plotly.graph_objects as go
from datetime import datetime, timedeltaPage config
页面配置
st.set_page_config(
page_title="Sales Dashboard",
page_icon="📊",
layout="wide"
)
st.set_page_config(
page_title="Sales Dashboard",
page_icon="📊",
layout="wide"
)
Custom CSS
自定义CSS
st.markdown("""
<style>
.metric-card {
background-color: #f0f2f6;
padding: 20px;
border-radius: 10px;
text-align: center;
}
</style>
""", unsafe_allow_html=True)
st.markdown("""
<style>
.metric-card {
background-color: #f0f2f6;
padding: 20px;
border-radius: 10px;
text-align: center;
}
</style>
""", unsafe_allow_html=True)
Title
标题
st.title("📊 Sales Analytics Dashboard")
st.title("📊 销售分析仪表板")
Sidebar filters
侧边栏过滤器
st.sidebar.header("Filters")
st.sidebar.header("过滤器")
Date range filter
日期范围过滤器
date_range = st.sidebar.date_input(
"Date Range",
value=(datetime.now() - timedelta(days=30), datetime.now()),
format="YYYY-MM-DD"
)
date_range = st.sidebar.date_input(
"日期范围",
value=(datetime.now() - timedelta(days=30), datetime.now()),
format="YYYY-MM-DD"
)
Category filter
类别过滤器
categories = st.sidebar.multiselect(
"Categories",
options=["Electronics", "Clothing", "Food", "Home", "Sports"],
default=["Electronics", "Clothing", "Food"]
)
categories = st.sidebar.multiselect(
"类别",
options=["电子产品", "服装", "食品", "家居", "运动"],
default=["电子产品", "服装", "食品"]
)
Region filter
区域过滤器
regions = st.sidebar.multiselect(
"Regions",
options=["North", "South", "East", "West"],
default=["North", "South", "East", "West"]
)
regions = st.sidebar.multiselect(
"区域",
options=["北部", "南部", "东部", "西部"],
default=["北部", "南部", "东部", "西部"]
)
Load and filter data
加载并过滤数据
@st.cache_data
def load_sales_data():
"""Generate sample sales data."""
import numpy as np
np.random.seed(42)
dates = pd.date_range(start="2024-01-01", end="2025-12-31", freq="D")
n = len(dates) * 10 # Multiple records per day
return pd.DataFrame({
"date": np.random.choice(dates, n),
"category": np.random.choice(
["Electronics", "Clothing", "Food", "Home", "Sports"], n
),
"region": np.random.choice(["North", "South", "East", "West"], n),
"revenue": np.random.uniform(100, 5000, n),
"units": np.random.randint(1, 50, n),
"customer_id": np.random.randint(1000, 9999, n)
})@st.cache_data
def load_sales_data():
"""生成示例销售数据。"""
import numpy as np
np.random.seed(42)
dates = pd.date_range(start="2024-01-01", end="2025-12-31", freq="D")
n = len(dates) * 10 # 每天多条记录
return pd.DataFrame({
"date": np.random.choice(dates, n),
"category": np.random.choice(
["电子产品", "服装", "食品", "家居", "运动"], n
),
"region": np.random.choice(["北部", "南部", "东部", "西部"], n),
"revenue": np.random.uniform(100, 5000, n),
"units": np.random.randint(1, 50, n),
"customer_id": np.random.randint(1000, 9999, n)
})Load data
加载数据
df = load_sales_data()
df = load_sales_data()
Apply filters
应用过滤器
filtered_df = df[
(df["date"] >= pd.Timestamp(date_range[0])) &
(df["date"] <= pd.Timestamp(date_range[1])) &
(df["category"].isin(categories)) &
(df["region"].isin(regions))
]
filtered_df = df[
(df["date"] >= pd.Timestamp(date_range[0])) &
(df["date"] <= pd.Timestamp(date_range[1])) &
(df["category"].isin(categories)) &
(df["region"].isin(regions))
]
KPI Metrics Row
KPI指标行
st.subheader("Key Performance Indicators")
col1, col2, col3, col4 = st.columns(4)
total_revenue = filtered_df["revenue"].sum()
total_units = filtered_df["units"].sum()
total_orders = len(filtered_df)
unique_customers = filtered_df["customer_id"].nunique()
col1.metric(
"Total Revenue",
f"${total_revenue:,.0f}",
delta=f"+{(total_revenue * 0.12):,.0f} vs last period"
)
col2.metric(
"Units Sold",
f"{total_units:,}",
delta=f"+{int(total_units * 0.08):,}"
)
col3.metric(
"Orders",
f"{total_orders:,}",
delta=f"+{int(total_orders * 0.05):,}"
)
col4.metric(
"Unique Customers",
f"{unique_customers:,}",
delta=f"+{int(unique_customers * 0.03):,}"
)
st.subheader("关键绩效指标")
col1, col2, col3, col4 = st.columns(4)
total_revenue = filtered_df["revenue"].sum()
total_units = filtered_df["units"].sum()
total_orders = len(filtered_df)
unique_customers = filtered_df["customer_id"].nunique()
col1.metric(
"总销售额",
f"${total_revenue:,.0f}",
delta=f"+{(total_revenue * 0.12):,.0f} 较上期"
)
col2.metric(
"销售数量",
f"{total_units:,}",
delta=f"+{int(total_units * 0.08):,}"
)
col3.metric(
"订单数",
f"{total_orders:,}",
delta=f"+{int(total_orders * 0.05):,}"
)
col4.metric(
"独立客户数",
f"{unique_customers:,}",
delta=f"+{int(unique_customers * 0.03):,}"
)
Charts Row
图表行
st.subheader("Revenue Analysis")
col1, col2 = st.columns(2)
with col1:
# Revenue trend
daily_revenue = filtered_df.groupby("date")["revenue"].sum().reset_index()
fig_trend = px.line(
daily_revenue,
x="date",
y="revenue",
title="Daily Revenue Trend"
)
fig_trend.update_layout(hovermode="x unified")
st.plotly_chart(fig_trend, use_container_width=True)
with col2:
# Revenue by category
category_revenue = filtered_df.groupby("category")["revenue"].sum().reset_index()
fig_category = px.pie(
category_revenue,
values="revenue",
names="category",
title="Revenue by Category"
)
st.plotly_chart(fig_category, use_container_width=True)
st.subheader("销售额分析")
col1, col2 = st.columns(2)
with col1:
# 销售额趋势
daily_revenue = filtered_df.groupby("date")["revenue"].sum().reset_index()
fig_trend = px.line(
daily_revenue,
x="date",
y="revenue",
title="每日销售额趋势"
)
fig_trend.update_layout(hovermode="x unified")
st.plotly_chart(fig_trend, use_container_width=True)
with col2:
# 按类别销售额
category_revenue = filtered_df.groupby("category")["revenue"].sum().reset_index()
fig_category = px.pie(
category_revenue,
values="revenue",
names="category",
title="按类别销售额分布"
)
st.plotly_chart(fig_category, use_container_width=True)
Second charts row
第二行图表
col1, col2 = st.columns(2)
with col1:
# Regional comparison
regional_data = filtered_df.groupby("region").agg({
"revenue": "sum",
"units": "sum"
}).reset_index()
fig_region = px.bar(
regional_data,
x="region",
y="revenue",
color="region",
title="Revenue by Region"
)
st.plotly_chart(fig_region, use_container_width=True)with col2:
# Category by region heatmap
pivot_data = filtered_df.pivot_table(
values="revenue",
index="category",
columns="region",
aggfunc="sum"
)
fig_heatmap = px.imshow(
pivot_data,
title="Revenue Heatmap: Category vs Region",
color_continuous_scale="Blues",
text_auto=".0f"
)
st.plotly_chart(fig_heatmap, use_container_width=True)col1, col2 = st.columns(2)

# Data Table
st.subheader("Detailed Data")

with st.expander("View Raw Data"):
    # Aggregated summary
    summary = filtered_df.groupby(["date", "category", "region"]).agg({
        "revenue": "sum",
        "units": "sum",
        "customer_id": "nunique"
    }).reset_index()
    summary.columns = ["Date", "Category", "Region", "Revenue", "Units", "Customers"]

    st.dataframe(
        summary.sort_values("Date", ascending=False),
        use_container_width=True,
        height=400
    )

    # Download button
    csv = summary.to_csv(index=False)
    st.download_button(
        label="Download CSV",
        data=csv,
        file_name=f"sales_data_{datetime.now().strftime('%Y%m%d')}.csv",
        mime="text/csv"
    )

# Footer
st.markdown("---")
st.caption(f"Last updated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
Example 2: Data Explorer Tool
python
import streamlit as st
import pandas as pd
import polars as pl
import plotly.express as px

st.set_page_config(page_title="Data Explorer", page_icon="🔍", layout="wide")
st.title("🔍 Interactive Data Explorer")

# File upload
uploaded_file = st.file_uploader(
    "Upload your data file",
    type=["csv", "xlsx", "parquet"],
    help="Supported formats: CSV, Excel, Parquet"
)

@st.cache_data
def load_uploaded_file(file, file_type):
    """Load uploaded file based on type."""
    if file_type == "csv":
        return pd.read_csv(file)
    elif file_type == "xlsx":
        return pd.read_excel(file)
    elif file_type == "parquet":
        return pd.read_parquet(file)

if uploaded_file is not None:
    # Determine file type
    file_type = uploaded_file.name.split(".")[-1].lower()

    # Load data
    with st.spinner("Loading data..."):
        df = load_uploaded_file(uploaded_file, file_type)

    st.success(f"Loaded {len(df):,} rows and {len(df.columns)} columns")

    # Data overview tabs
    tab1, tab2, tab3, tab4 = st.tabs([
        "📋 Overview",
        "📊 Visualize",
        "🔢 Statistics",
        "🔍 Filter & Export"
    ])

    with tab1:
        st.subheader("Data Preview")
        st.dataframe(df.head(100), use_container_width=True)

        col1, col2 = st.columns(2)
        with col1:
            st.write("**Column Types:**")
            type_df = pd.DataFrame({
                "Column": df.columns,
                "Type": df.dtypes.astype(str),
                "Non-Null": df.notna().sum(),
                "Null %": (df.isna().sum() / len(df) * 100).round(2)
            })
            st.dataframe(type_df, use_container_width=True)
        with col2:
            st.write("**Data Shape:**")
            st.write(f"- Rows: {len(df):,}")
            st.write(f"- Columns: {len(df.columns)}")
            st.write(f"- Memory: {df.memory_usage(deep=True).sum() / 1024**2:.2f} MB")

    with tab2:
        st.subheader("Quick Visualizations")

        # Get column types
        numeric_cols = df.select_dtypes(include=["number"]).columns.tolist()
        categorical_cols = df.select_dtypes(include=["object", "category"]).columns.tolist()

        chart_type = st.selectbox(
            "Chart Type",
            ["Histogram", "Scatter", "Line", "Bar", "Box Plot"]
        )

        if chart_type == "Histogram":
            col = st.selectbox("Select column", numeric_cols)
            bins = st.slider("Number of bins", 10, 100, 30)
            fig = px.histogram(df, x=col, nbins=bins, title=f"Distribution of {col}")
            st.plotly_chart(fig, use_container_width=True)

        elif chart_type == "Scatter":
            col1, col2 = st.columns(2)
            x_col = col1.selectbox("X axis", numeric_cols)
            y_col = col2.selectbox("Y axis", numeric_cols, index=min(1, len(numeric_cols)-1))
            color_col = st.selectbox("Color by (optional)", ["None"] + categorical_cols)
            fig = px.scatter(
                df, x=x_col, y=y_col,
                color=None if color_col == "None" else color_col,
                title=f"{y_col} vs {x_col}"
            )
            st.plotly_chart(fig, use_container_width=True)

        elif chart_type == "Line":
            x_col = st.selectbox("X axis", df.columns.tolist())
            y_cols = st.multiselect("Y axis (select multiple)", numeric_cols)
            if y_cols:
                fig = px.line(df, x=x_col, y=y_cols, title="Line Chart")
                st.plotly_chart(fig, use_container_width=True)

        elif chart_type == "Bar":
            cat_col = st.selectbox("Category", categorical_cols if categorical_cols else df.columns.tolist())
            val_col = st.selectbox("Value", numeric_cols)
            agg_func = st.selectbox("Aggregation", ["sum", "mean", "count", "median"])
            agg_data = df.groupby(cat_col)[val_col].agg(agg_func).reset_index()
            fig = px.bar(agg_data, x=cat_col, y=val_col, title=f"{agg_func.title()} of {val_col} by {cat_col}")
            st.plotly_chart(fig, use_container_width=True)

        elif chart_type == "Box Plot":
            val_col = st.selectbox("Value", numeric_cols)
            group_col = st.selectbox("Group by (optional)", ["None"] + categorical_cols)
            fig = px.box(
                df, y=val_col,
                x=None if group_col == "None" else group_col,
                title=f"Distribution of {val_col}"
            )
            st.plotly_chart(fig, use_container_width=True)

    with tab3:
        st.subheader("Statistical Summary")

        # Numeric statistics
        if numeric_cols:
            st.write("**Numeric Columns:**")
            st.dataframe(df[numeric_cols].describe(), use_container_width=True)

        # Categorical statistics
        if categorical_cols:
            st.write("**Categorical Columns:**")
            for col in categorical_cols[:5]:  # Limit to first 5
                with st.expander(f"{col} value counts"):
                    st.dataframe(
                        df[col].value_counts().head(20).reset_index(),
                        use_container_width=True
                    )

        # Correlation matrix
        if len(numeric_cols) > 1:
            st.write("**Correlation Matrix:**")
            corr = df[numeric_cols].corr()
            fig = px.imshow(
                corr,
                text_auto=".2f",
                color_continuous_scale="RdBu_r",
                aspect="auto"
            )
            st.plotly_chart(fig, use_container_width=True)

    with tab4:
        st.subheader("Filter & Export")

        # Dynamic filtering
        st.write("**Apply Filters:**")
        filtered_df = df.copy()

        for col in df.columns[:10]:  # Limit columns for UI
            if df[col].dtype in ["int64", "float64"]:
                min_val, max_val = float(df[col].min()), float(df[col].max())
                if min_val < max_val:
                    range_val = st.slider(
                        f"{col} range",
                        min_val, max_val, (min_val, max_val),
                        key=f"filter_{col}"
                    )
                    filtered_df = filtered_df[
                        (filtered_df[col] >= range_val[0]) &
                        (filtered_df[col] <= range_val[1])
                    ]
            elif df[col].dtype == "object" and df[col].nunique() < 20:
                selected = st.multiselect(
                    f"{col}",
                    options=df[col].unique().tolist(),
                    default=df[col].unique().tolist(),
                    key=f"filter_{col}"
                )
                filtered_df = filtered_df[filtered_df[col].isin(selected)]

        st.write(f"**Filtered data: {len(filtered_df):,} rows**")
        st.dataframe(filtered_df.head(100), use_container_width=True)

        # Export
        col1, col2 = st.columns(2)
        with col1:
            csv = filtered_df.to_csv(index=False)
            st.download_button(
                "Download as CSV",
                csv,
                "filtered_data.csv",
                "text/csv"
            )
        with col2:
            # Write the workbook to an in-memory buffer so it can be downloaded
            # (the original snippet created a writer but never offered the file)
            import io
            excel_buffer = io.BytesIO()
            with pd.ExcelWriter(excel_buffer, engine="openpyxl") as writer:
                filtered_df.to_excel(writer, index=False)
            st.download_button(
                "Download as Excel",
                excel_buffer.getvalue(),
                "filtered_data.xlsx",
                "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
            )

else:
    st.info("Please upload a data file to get started.")

    # Sample data option
    if st.button("Load sample data"):
        import numpy as np
        np.random.seed(42)
        sample_df = pd.DataFrame({
            "date": pd.date_range("2025-01-01", periods=100),
            "category": np.random.choice(["A", "B", "C"], 100),
            "value": np.random.randn(100) * 100 + 500,
            "count": np.random.randint(1, 100, 100)
        })
        sample_df.to_csv("/tmp/sample_data.csv", index=False)
        st.success("Sample data created! Upload '/tmp/sample_data.csv'")
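
The explorer above derives the file type with `uploaded_file.name.split(".")[-1]`, which returns the whole name when there is no extension and lets `load_uploaded_file` fall through and return `None`. A minimal sketch of a stricter check follows. The names `SUPPORTED_TYPES` and `detect_file_type` are hypothetical helpers, not part of Streamlit or the original example.

```python
from pathlib import Path

# Hypothetical mapping from file suffix to the loader keys used in the example
SUPPORTED_TYPES = {".csv": "csv", ".xlsx": "xlsx", ".parquet": "parquet"}


def detect_file_type(filename: str) -> str:
    """Return the loader key for a filename, or raise if it is unsupported."""
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported file type: {suffix or '(no extension)'}")
    return SUPPORTED_TYPES[suffix]


print(detect_file_type("Sales.Q1.CSV"))
```

In the app this could replace the `split` call, with the `ValueError` caught and shown through `st.error` so an unexpected upload fails with a readable message.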
Example 3: ML Model Demo
python
import streamlit as st
import pandas as pd
import numpy as np
import plotly.express as px
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report

st.set_page_config(page_title="ML Demo", page_icon="🤖", layout="wide")
st.title("🤖 Machine Learning Model Demo")

# Sidebar
st.sidebar.header("Model Configuration")

# Model parameters
n_estimators = st.sidebar.slider("Number of trees", 10, 200, 100)
max_depth = st.sidebar.slider("Max depth", 1, 20, 5)
test_size = st.sidebar.slider("Test size", 0.1, 0.5, 0.2)

# Generate sample data
@st.cache_data
def generate_sample_data(n_samples=1000):
    np.random.seed(42)

    # Features
    X = pd.DataFrame({
        "feature_1": np.random.randn(n_samples),
        "feature_2": np.random.randn(n_samples),
        "feature_3": np.random.uniform(0, 100, n_samples),
        "feature_4": np.random.choice(["A", "B", "C"], n_samples)
    })

    # Target: positive when feature_1 > 0 or feature_3 > 50 (deterministic, no noise added)
    y = (
        (X["feature_1"] > 0).astype(int) +
        (X["feature_3"] > 50).astype(int)
    ) >= 1
    y = y.astype(int)

    return X, y

X, y = generate_sample_data()

# Convert categorical
X_encoded = pd.get_dummies(X, columns=["feature_4"])

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(
    X_encoded, y, test_size=test_size, random_state=42
)

# Train model
@st.cache_resource
def train_model(n_est, depth, split_size, _X_train, _y_train):
    # Underscore-prefixed arguments are excluded from the cache key, so
    # split_size is passed only to invalidate the cache when the split changes.
    model = RandomForestClassifier(
        n_estimators=n_est,
        max_depth=depth,
        random_state=42
    )
    model.fit(_X_train, _y_train)
    return model

# Training
with st.spinner("Training model..."):
    model = train_model(n_estimators, max_depth, test_size, X_train, y_train)

# Predictions
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

# Results
st.subheader("Model Performance")

col1, col2, col3 = st.columns(3)
col1.metric("Accuracy", f"{accuracy:.2%}")
col2.metric("Training Samples", f"{len(X_train):,}")
col3.metric("Test Samples", f"{len(X_test):,}")

# Feature importance
st.subheader("Feature Importance")

importance_df = pd.DataFrame({
    "Feature": X_encoded.columns,
    "Importance": model.feature_importances_
}).sort_values("Importance", ascending=True)

fig = px.bar(
    importance_df,
    x="Importance",
    y="Feature",
    orientation="h",
    title="Feature Importance"
)
st.plotly_chart(fig, use_container_width=True)

# Interactive prediction
st.subheader("Make a Prediction")

with st.form("prediction_form"):
    col1, col2 = st.columns(2)

    with col1:
        f1 = st.number_input("Feature 1", value=0.0)
        f2 = st.number_input("Feature 2", value=0.0)

    with col2:
        f3 = st.number_input("Feature 3", min_value=0.0, max_value=100.0, value=50.0)
        f4 = st.selectbox("Feature 4", ["A", "B", "C"])

    submitted = st.form_submit_button("Predict")

if submitted:
    # Prepare input
    input_data = pd.DataFrame({
        "feature_1": [f1],
        "feature_2": [f2],
        "feature_3": [f3],
        "feature_4": [f4]
    })
    input_encoded = pd.get_dummies(input_data, columns=["feature_4"])

    # Align columns
    for col in X_encoded.columns:
        if col not in input_encoded.columns:
            input_encoded[col] = 0
    input_encoded = input_encoded[X_encoded.columns]

    # Predict
    prediction = model.predict(input_encoded)[0]
    proba = model.predict_proba(input_encoded)[0]

    st.success(f"Prediction: **{'Positive' if prediction == 1 else 'Negative'}**")
    st.write(f"Confidence: {max(proba):.2%}")
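
The "Align columns" step in the prediction form adds each missing dummy column in a loop and then reorders. One possible shorthand is `DataFrame.reindex`, sketched below on a small hypothetical frame. `encode_single_row` and the sample values are illustrative assumptions, not part of the demo above.

```python
import pandas as pd

# Hypothetical training frame; only its encoded column order matters here
train = pd.DataFrame({"feature_3": [10.0, 60.0, 90.0], "feature_4": ["A", "B", "C"]})
train_encoded = pd.get_dummies(train, columns=["feature_4"])


def encode_single_row(row: dict, reference_columns) -> pd.DataFrame:
    """One-hot encode one input row and align it to the training columns."""
    encoded = pd.get_dummies(pd.DataFrame([row]), columns=["feature_4"])
    # reindex adds missing dummy columns (filled with 0), drops unknown ones,
    # and puts the columns in the same order as the training data
    return encoded.reindex(columns=reference_columns, fill_value=0)


aligned = encode_single_row({"feature_3": 42.0, "feature_4": "B"}, train_encoded.columns)
print(list(aligned.columns))
```

Note that `reindex` silently drops any column that is not in the reference, so a category unseen during training becomes an all-zero row rather than an error.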
Deployment Patterns
Streamlit Cloud Deployment
```text
# requirements.txt
streamlit>=1.32.0
pandas>=2.0.0
polars>=0.20.0
plotly>=5.18.0
numpy>=1.24.0
```
```toml
# .streamlit/config.toml
[theme]
primaryColor="#1f77b4"
backgroundColor="#ffffff"
secondaryBackgroundColor="#f0f2f6"
textColor="#262730"
font="sans serif"

[server]
maxUploadSize=200
enableXsrfProtection=true
```
Docker Deployment
```dockerfile
# Dockerfile
FROM python:3.11-slim

WORKDIR /app

# The slim base image does not ship curl, which the HEALTHCHECK below needs
RUN apt-get update && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8501

HEALTHCHECK CMD curl --fail http://localhost:8501/_stcore/health

ENTRYPOINT ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
```
```bash
# Build and run
docker build -t my-streamlit-app .
docker run -p 8501:8501 my-streamlit-app
```
Best Practices
1. Use Caching Appropriately
python
# GOOD: Cache data loading
@st.cache_data
def load_data():
    return pd.read_csv("data.csv")

# GOOD: Cache resources (DB connections, models)
@st.cache_resource
def get_model():
    return load_model("model.pkl")  # load_model stands in for your own loader

# AVOID: Caching with unhashable arguments
# Prefix the argument with an underscore to exclude it from the cache key
@st.cache_data
def process_data(_db_connection, query):
    return _db_connection.execute(query)
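
The underscore prefix has a consequence that is easy to miss: the skipped argument no longer influences which cached result is returned. The toy memoizer below illustrates only that keying idea. It is not Streamlit's implementation, and `toy_cache` and `run_query` are hypothetical names.

```python
import functools
import inspect


def toy_cache(func):
    """Toy memoizer that ignores underscore-prefixed parameters in its key."""
    store = {}
    params = list(inspect.signature(func).parameters)

    @functools.wraps(func)
    def wrapper(*args):
        # Build the key only from arguments whose parameter name
        # does not start with an underscore
        key = tuple(a for name, a in zip(params, args) if not name.startswith("_"))
        if key not in store:
            store[key] = func(*args)
        return store[key]

    return wrapper


calls = []


@toy_cache
def run_query(_connection, query):
    calls.append((_connection, query))
    return f"{_connection}:{query}"


first = run_query("conn-A", "SELECT 1")
second = run_query("conn-B", "SELECT 1")  # different connection, same cache key
print(first, second, len(calls))
```

Both calls return the result computed with the first connection, because the key contains only `query`. If a skipped argument can change the result, pass something hashable that identifies it (an ID, a version, a parameter value) as a separate non-underscore argument.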
2. Organize Large Apps
python
# utils/data.py
def load_data():
    pass

# utils/charts.py
def create_chart(df):
    pass

# app.py
from utils.data import load_data
from utils.charts import create_chart
3. Handle State Carefully
python
# GOOD: Initialize state at the top
if "data" not in st.session_state:
    st.session_state.data = None

# GOOD: Use callbacks for complex updates
def on_filter_change():
    st.session_state.filtered_data = apply_filter(st.session_state.data)

st.selectbox("Filter", options, on_change=on_filter_change)
4. Optimize Performance
python
# Use containers for layout stability
placeholder = st.empty()

# Batch widget updates in forms
with st.form("filters"):
    # Multiple widgets
    st.form_submit_button()

# Use columns for responsive layout
cols = st.columns([1, 2, 1])
Troubleshooting
Common Issues
**Issue: App reruns on every interaction**
```python
# Use forms to batch inputs
with st.form("my_form"):
    input1 = st.text_input("Input")
    submit = st.form_submit_button()
```
**Issue: Slow data loading**
```python
# Add caching
@st.cache_data(ttl=3600)
def load_data():
    return pd.read_csv("large_file.csv")
```
**Issue: Memory issues with large files**
```python
# Limit the number of rows read
@st.cache_data
def load_large_file(path, nrows=10000):
    return pd.read_csv(path, nrows=nrows)
```
**Issue: Widget state lost on rerun**
```python
# Persist in session state
if "value" not in st.session_state:
    st.session_state.value = default_value

# Use key parameter
st.text_input("Name", key="user_name")
```
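
For the large-file memory issue, `nrows` only truncates the input. When every row has to be processed, one option is the `chunksize` parameter of `pd.read_csv`, which yields the file as a sequence of smaller DataFrames. The sketch below assumes a simple running total is enough; `sum_column_in_chunks` and the in-memory CSV are hypothetical stand-ins for a real file.

```python
import io

import pandas as pd


def sum_column_in_chunks(source, column, chunksize=1000):
    """Sum one numeric column while holding only one chunk in memory."""
    total = 0.0
    rows = 0
    # With chunksize set, read_csv returns an iterator of DataFrames
    for chunk in pd.read_csv(source, chunksize=chunksize):
        total += chunk[column].sum()
        rows += len(chunk)
    return total, rows


# Hypothetical in-memory CSV standing in for a large file on disk
csv_text = "revenue\n" + "\n".join(str(i) for i in range(1, 11))
total, rows = sum_column_in_chunks(io.StringIO(csv_text), "revenue", chunksize=3)
print(total, rows)
```

Only the aggregated result needs to be cached or displayed, so the full dataset never has to sit in memory or in Streamlit's cache.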
Version History
- 1.0.0 (2026-01-17): Initial release
  - Basic app structure and widgets
  - Layout and organization patterns
  - Data visualization integration
  - Caching strategies
  - Session state management
  - Multi-page applications
  - Complete dashboard examples
  - Deployment patterns
  - Best practices and troubleshooting
Resources
- Official Docs: https://docs.streamlit.io/
- Gallery: https://streamlit.io/gallery
- Components: https://streamlit.io/components
- Cloud: https://streamlit.io/cloud
- GitHub: https://github.com/streamlit/streamlit
Build beautiful data apps with pure Python - no frontend experience required!