data-analysis
<objective>
Enable executive-grade data analysis for VC, PE, and C-suite presentations. Covers data ingestion from any format, SaaS metrics calculations (MRR, LTV, CAC, churn), cohort retention analysis, McKinsey-quality visualizations with Plotly, and Streamlit dashboards.
</objective>
<quick_start>
Universal data loader:
```python
df = load_data("file.csv")  # Supports CSV, Excel, JSON, Parquet, PDF, PPTX
```
SaaS metrics:
```python
metrics = calculate_saas_metrics(df)  # MRR, ARR, LTV, CAC, churn
retention = cohort_retention_analysis(df)  # Retention matrix
```
McKinsey-style charts: use action titles ("Q4 Revenue Exceeded Target by 23%"), not descriptive titles.
</quick_start>
<success_criteria>
Analysis is successful when:
- Data loaded and cleaned (dropna, dedup, type conversion)
- Metrics calculated correctly (MRR, ARR, LTV:CAC, churn, cohort retention)
- Charts follow McKinsey principles: action titles, data-ink ratio >80%, one message per chart
- Executive colors used (#003366 primary, #2E7D32 positive, #C62828 negative)
- Streamlit dashboard runs without errors
- NO OPENAI: Use Claude for narrative generation if needed
</success_criteria>
<core_content>
Executive-grade data analysis for VC, PE, C-suite presentations using pandas, polars, Plotly, Altair, and Streamlit.
Quick Reference
| Task | Tools | Output |
|---|---|---|
| Data ingestion | pandas, polars, pdfplumber, python-pptx | DataFrame |
| Wrangling | pandas/polars transforms | Clean dataset |
| Analysis | numpy, scipy, statsmodels | Insights |
| Visualization | Plotly, Altair, Seaborn | Charts |
| Dashboards | Streamlit, DuckDB | Interactive apps |
| Presentations | Plotly export, PDF generation | Investor-ready |
Data Ingestion Patterns
Universal Data Loader
```python
import pandas as pd
from pathlib import Path


def load_data(file_path: str, conn=None) -> pd.DataFrame:
    """Load data from any supported format, chosen by file extension.

    conn: database connection or SQLAlchemy engine; only needed for .sql
    files, whose contents are run as a query.
    """
    path = Path(file_path)
    suffix = path.suffix.lower()
    loaders = {
        '.csv': lambda p: pd.read_csv(p),
        '.xlsx': lambda p: pd.read_excel(p, engine='openpyxl'),
        '.xls': lambda p: pd.read_excel(p, engine='xlrd'),
        '.json': lambda p: pd.read_json(p),
        '.parquet': lambda p: pd.read_parquet(p),
        '.sql': lambda p: pd.read_sql(p.read_text(), conn),
        # The three helpers below must be defined before load_data is called.
        '.md': lambda p: parse_markdown_tables(p),
        '.pdf': lambda p: extract_pdf_tables(p),
        # Note: extract_pptx_tables returns a list of DataFrames, one per table.
        '.pptx': lambda p: extract_pptx_tables(p),
    }
    if suffix not in loaders:
        raise ValueError(f"Unsupported format: {suffix}")
    return loaders[suffix](path)
```
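The loader dispatches `.md` files to `parse_markdown_tables`, which this document references but never defines. The following is a minimal sketch of what such a helper might look like, not the original implementation. It assumes pipe-delimited tables with a dash separator row, rows of equal width, and that only the first table in the file is wanted; `parse_markdown_table_text` is a hypothetical name introduced here so the parsing logic can be exercised without a file.

```python
from pathlib import Path

import pandas as pd


def parse_markdown_tables(md_path) -> pd.DataFrame:
    """Hypothetical helper: parse the first pipe-delimited table in a Markdown file."""
    lines = Path(md_path).read_text(encoding="utf-8").splitlines()
    return parse_markdown_table_text(lines)


def parse_markdown_table_text(lines: list[str]) -> pd.DataFrame:
    """Parse the first contiguous run of lines that start with '|'."""
    rows = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("|"):
            # Remove the outer pipes, then split the remaining text into cells.
            rows.append([cell.strip() for cell in stripped.strip("|").split("|")])
        elif rows:
            break  # the first table has ended
    if len(rows) < 2:
        return pd.DataFrame()
    header, body = rows[0], rows[1:]
    # Drop the separator row (cells made only of dashes, colons and spaces).
    body = [r for r in body if not all(c and set(c) <= set("-: ") for c in r)]
    return pd.DataFrame(body, columns=header)
```

All cells come back as strings, so numeric columns still need type conversion before analysis.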
PDF Table Extraction
```python
import pandas as pd
import pdfplumber


def extract_pdf_tables(pdf_path: str) -> pd.DataFrame:
    """Extract tables from PDF using pdfplumber."""
    all_tables = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            for table in page.extract_tables():
                # Skip empty tables and tables that contain a header row only.
                if table and len(table) > 1:
                    # The first row is treated as the header; cells are strings or None.
                    df = pd.DataFrame(table[1:], columns=table[0])
                    all_tables.append(df)
    # Tables are stacked row-wise; columns missing from a table are filled with NaN.
    return pd.concat(all_tables, ignore_index=True) if all_tables else pd.DataFrame()
```
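Tables extracted from a PDF arrive as text, so currency and percentage columns need converting before any metric is computed. The sketch below shows one way to do that. It assumes US-style formatting (`$1,234`, `12%`, parentheses for negatives) and unique column names; `coerce_numeric_columns` and its `min_valid` threshold are hypothetical and not part of pdfplumber or pandas.

```python
import pandas as pd


def coerce_numeric_columns(df: pd.DataFrame, min_valid: float = 0.8) -> pd.DataFrame:
    """Convert text columns that look numeric; leave the others untouched."""
    out = df.copy()
    for col in out.columns:
        text = out[col].astype(str).str.strip()
        # Accounting-style negatives: (1,200) -> -1,200
        text = text.str.replace(r"^\((.*)\)$", r"-\1", regex=True)
        # Remove currency symbols, thousands separators and percent signs.
        # Percent signs are stripped, so "12%" becomes 12, not 0.12.
        text = text.str.replace(r"[$,%]", "", regex=True)
        numbers = pd.to_numeric(text, errors="coerce")
        # Only replace the column when most non-empty cells parsed as numbers.
        non_empty = out[col].notna().sum()
        if non_empty and numbers.notna().sum() / non_empty >= min_valid:
            out[col] = numbers
    return out


raw = pd.DataFrame({
    "segment": ["SMB", "Enterprise"],
    "mrr": ["$1,200", "(50)"],
    "churn": ["5%", "12%"],
})
clean = coerce_numeric_columns(raw)
```

The threshold keeps label columns such as `segment` as text while still converting columns that contain a few unparseable cells.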
PowerPoint Data Extraction
```python
import pandas as pd
from pptx import Presentation


def extract_pptx_tables(pptx_path: str) -> list[pd.DataFrame]:
    """Extract all tables from PowerPoint, one DataFrame per table."""
    prs = Presentation(pptx_path)
    tables = []
    for slide in prs.slides:
        for shape in slide.shapes:
            if shape.has_table:
                # Read every cell as text; the first row is used as the header.
                data = [[cell.text for cell in row.cells] for row in shape.table.rows]
                df = pd.DataFrame(data[1:], columns=data[0])
                tables.append(df)
    return tables
```
Data Wrangling Patterns
Polars for Performance (up to 30x faster than pandas, depending on the workload)
```python
import polars as pl

# Lazy evaluation for large datasets: nothing is read until .collect()
df = (
    pl.scan_csv("large_file.csv")
    .filter(pl.col("revenue") > 0)
    .with_columns([
        (pl.col("revenue") / pl.col("customers")).alias("arpu"),
        pl.col("date").str.to_date().alias("date_parsed"),
    ])
    .group_by("segment")
    .agg([
        pl.col("revenue").sum().alias("total_revenue"),
        pl.col("customers").mean().alias("avg_customers"),
    ])
    .collect()
)
```
Common Transformations
```python
import pandas as pd


def prepare_for_analysis(df: pd.DataFrame) -> pd.DataFrame:
    """Standard data prep pipeline."""
    return (df
        .dropna(subset=['key_column'])  # 'key_column' is a placeholder for the required field
        .drop_duplicates()
        .assign(
            date=lambda x: pd.to_datetime(x['date']),
            revenue=lambda x: pd.to_numeric(x['revenue'], errors='coerce'),
            # Evaluated after `date` above, so it sees the parsed datetimes.
            month=lambda x: x['date'].dt.to_period('M'),
        )
        .sort_values('date')
        .reset_index(drop=True)
    )
```
SaaS Metrics Calculations
Core Metrics
```python
import pandas as pd


def calculate_saas_metrics(df: pd.DataFrame) -> dict:
    """Calculate key SaaS metrics for investor reporting.

    Expects columns: month, mrr, status, customer_id, is_new (boolean),
    sales_cost, marketing_cost.
    """
    # MRR / ARR: ARR annualizes the most recent month's MRR
    mrr = df.groupby('month')['mrr'].sum()
    arr = mrr.iloc[-1] * 12

    # Growth rates: month-over-month change in the latest month
    mrr_growth = mrr.pct_change().iloc[-1]

    # Churn: share of MRR on rows marked 'churned' (revenue churn)
    churned = df[df['status'] == 'churned']['mrr'].sum()
    total_mrr = df['mrr'].sum()
    churn_rate = churned / total_mrr if total_mrr > 0 else 0

    # CAC & LTV
    total_sales_marketing = df['sales_cost'].sum() + df['marketing_cost'].sum()
    new_customers = df[df['is_new']]['customer_id'].nunique()
    cac = total_sales_marketing / new_customers if new_customers > 0 else 0
    avg_revenue_per_customer = df.groupby('customer_id')['mrr'].mean().mean()
    # Lifespan = 1 / churn, treating churn_rate as a monthly rate;
    # falls back to 36 months when no churn is observed.
    avg_lifespan_months = 1 / churn_rate if churn_rate > 0 else 36
    ltv = avg_revenue_per_customer * avg_lifespan_months
    ltv_cac_ratio = ltv / cac if cac > 0 else 0
    cac_payback_months = cac / avg_revenue_per_customer if avg_revenue_per_customer > 0 else 0

    return {
        'mrr': mrr.iloc[-1],
        'arr': arr,
        'mrr_growth': mrr_growth,
        'churn_rate': churn_rate,
        'cac': cac,
        'ltv': ltv,
        'ltv_cac_ratio': ltv_cac_ratio,
        'cac_payback_months': cac_payback_months,
    }
```
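To make the unit-economics formulas concrete, here is a small worked example in plain Python. It applies the same CAC, LTV and payback arithmetic as `calculate_saas_metrics` to scalar inputs. The function name `unit_economics` and all input figures are invented for illustration, and the sketch assumes every input is positive.

```python
def unit_economics(
    arpa_monthly: float,
    monthly_churn: float,
    sales_marketing_spend: float,
    new_customers: int,
) -> dict:
    """Apply the CAC / LTV / payback formulas to scalar inputs."""
    cac = sales_marketing_spend / new_customers
    lifespan_months = 1 / monthly_churn
    ltv = arpa_monthly * lifespan_months
    return {
        "cac": cac,
        "lifespan_months": lifespan_months,
        "ltv": ltv,
        "ltv_cac_ratio": ltv / cac,
        "cac_payback_months": cac / arpa_monthly,
    }


# Illustrative inputs: $500 average monthly revenue per customer, 2.5% monthly
# churn, and $300,000 of sales and marketing spend that won 50 new customers.
result = unit_economics(500, 0.025, 300_000, 50)
print(result)
```

With these inputs CAC is $6,000, expected lifespan is 40 months, LTV is $20,000, LTV:CAC is roughly 3.3x, and CAC payback is 12 months.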
Cohort Analysis
```python
import pandas as pd


def cohort_retention_analysis(df: pd.DataFrame) -> pd.DataFrame:
    """Build cohort retention matrix for investor reporting."""
    df = df.copy()  # avoid mutating the caller's DataFrame

    # Assign cohort (first purchase month); 'date' must be a datetime column
    df['cohort'] = df.groupby('customer_id')['date'].transform('min').dt.to_period('M')
    df['period'] = df['date'].dt.to_period('M')
    # Months elapsed between the cohort month and the activity month
    df['cohort_age'] = (df['period'] - df['cohort']).apply(lambda x: x.n)

    # Build retention matrix
    cohort_data = df.groupby(['cohort', 'cohort_age']).agg({
        'customer_id': 'nunique',
        'revenue': 'sum'
    }).reset_index()

    # Pivot for visualization
    cohort_counts = cohort_data.pivot(
        index='cohort',
        columns='cohort_age',
        values='customer_id'
    )

    # Calculate retention percentages relative to each cohort's month-0 size
    cohort_sizes = cohort_counts.iloc[:, 0]
    retention = cohort_counts.divide(cohort_sizes, axis=0) * 100
    return retention
```
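The shape of the retention matrix is easier to see on a toy dataset. The sketch below is a condensed restatement of the same steps (cohort assignment, cohort age, unique-customer counts, division by month-0 size) on invented data. It computes cohort age from year and month arithmetic rather than period subtraction, which is an alternative chosen here for brevity.

```python
import pandas as pd

# Toy transaction log (invented data): three customers, two cohorts.
tx = pd.DataFrame({
    "customer_id": ["a", "a", "a", "b", "b", "c", "c"],
    "date": pd.to_datetime([
        "2024-01-05", "2024-02-03", "2024-03-09",  # a: active in months 0, 1, 2
        "2024-01-20", "2024-03-15",                # b: active in months 0 and 2
        "2024-02-11", "2024-03-02",                # c: joins in February
    ]),
})

first_month = tx.groupby("customer_id")["date"].transform("min").dt.to_period("M")
active_month = tx["date"].dt.to_period("M")
tx["cohort"] = first_month
# Whole months between the cohort month and the activity month.
tx["cohort_age"] = (active_month.dt.year - first_month.dt.year) * 12 + (
    active_month.dt.month - first_month.dt.month
)

counts = (
    tx.groupby(["cohort", "cohort_age"])["customer_id"]
    .nunique()
    .unstack("cohort_age")
)
retention = counts.divide(counts[0], axis=0) * 100
print(retention.round(0))
```

The January cohort shows 50% in month 1 and 100% in month 2 because customer `b` skipped February and returned in March. Retention measured this way counts activity in each month, so a row does not have to decrease monotonically.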
Executive Visualization
McKinsey/BCG Chart Principles
```yaml
mckinsey_style:
  colors:
    primary: "#003366"   # Deep blue
    accent: "#0066CC"    # Bright blue
    positive: "#2E7D32"  # Green
    negative: "#C62828"  # Red
    neutral: "#757575"   # Gray
  typography:
    title: "Georgia, serif"
    body: "Arial, sans-serif"
    size_title: 18
    size_body: 12
  principles:
    - "One message per chart"
    - "Action title (not descriptive)"
    - "Data-ink ratio > 80%"
    - "Remove chartjunk"
    - "Label directly on chart"
```
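One way to apply this style guide consistently is to translate it into a reusable Plotly layout dictionary. The sketch below assumes the style settings have been loaded into a Python dict (written out by hand here so the example has no YAML dependency). `MCKINSEY_STYLE` and `build_exec_layout` are hypothetical names, and the choice of `colorway` entries is an assumption.

```python
MCKINSEY_STYLE = {
    "colors": {
        "primary": "#003366", "accent": "#0066CC", "positive": "#2E7D32",
        "negative": "#C62828", "neutral": "#757575",
    },
    "typography": {
        "title": "Georgia, serif", "body": "Arial, sans-serif",
        "size_title": 18, "size_body": 12,
    },
}


def build_exec_layout(title: str, style: dict = MCKINSEY_STYLE) -> dict:
    """Return keyword arguments suitable for fig.update_layout(**layout)."""
    typo = style["typography"]
    return {
        "title": {
            "text": f"<b>{title}</b>",
            "font": {"family": typo["title"], "size": typo["size_title"]},
            "x": 0,  # left-align the action title
        },
        "font": {"family": typo["body"], "size": typo["size_body"]},
        "plot_bgcolor": "white",
        # Default trace colors, in order of use.
        "colorway": [style["colors"][k] for k in ("primary", "accent", "neutral")],
        "showlegend": False,
    }


layout = build_exec_layout("Q4 Revenue Exceeded Target by 23%")
```

Keeping the style in one dictionary means every chart function can call `fig.update_layout(**layout)` instead of repeating font and color settings.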
Plotly Executive Charts
```python
import plotly.express as px
import plotly.graph_objects as go

EXEC_COLORS = {
    'primary': '#003366',
    'secondary': '#0066CC',  # listed as "accent" in the style guide
    'positive': '#2E7D32',
    'negative': '#C62828',
    'neutral': '#757575',
}


def exec_line_chart(df, x, y, title):
    """McKinsey-style line chart."""
    fig = px.line(df, x=x, y=y)
    fig.update_layout(
        title=dict(
            text=f"<b>{title}</b>",
            font=dict(size=18, family="Georgia"),
            x=0,  # left-align the action title
        ),
        font=dict(family="Arial", size=12),
        plot_bgcolor='white',
        xaxis=dict(showgrid=False, showline=True, linecolor='black'),
        yaxis=dict(showgrid=True, gridcolor='#E0E0E0', showline=True, linecolor='black'),
        margin=dict(l=60, r=40, t=60, b=40),
    )
    fig.update_traces(line=dict(color=EXEC_COLORS['primary'], width=3))
    return fig


def exec_waterfall(values, labels, title):
    """Waterfall chart for revenue/cost breakdown."""
    fig = go.Figure(go.Waterfall(
        orientation="v",
        # Every bar is a relative change except the last, which is a running total.
        measure=["relative"] * (len(values) - 1) + ["total"],
        x=labels,
        y=values,
        connector=dict(line=dict(color="rgb(63, 63, 63)")),
        increasing=dict(marker=dict(color=EXEC_COLORS['positive'])),
        decreasing=dict(marker=dict(color=EXEC_COLORS['negative'])),
        totals=dict(marker=dict(color=EXEC_COLORS['primary'])),
    ))
    fig.update_layout(
        title=dict(text=f"<b>{title}</b>", font=dict(size=18, family="Georgia")),
        font=dict(family="Arial", size=12),
        plot_bgcolor='white',
        showlegend=False,
    )
    return fig
```
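`exec_waterfall` expects parallel `values` and `labels` lists whose last entry is the total. A sketch of how MRR movements might be arranged into that shape follows. The helper name `mrr_waterfall_inputs`, the bar labels and the figures are assumptions made for illustration.

```python
def mrr_waterfall_inputs(
    starting_mrr: float,
    new: float,
    expansion: float,
    contraction: float,
    churn: float,
) -> tuple[list[float], list[str]]:
    """Return (values, labels) for a waterfall that ends in a total bar.

    contraction and churn are passed as positive amounts and plotted as negatives.
    """
    labels = ["Starting MRR", "New", "Expansion", "Contraction", "Churn", "Ending MRR"]
    ending_mrr = starting_mrr + new + expansion - contraction - churn
    # The final entry lines up with the "total" bar; it is included for reference.
    values = [starting_mrr, new, expansion, -contraction, -churn, ending_mrr]
    return values, labels


# Invented figures for illustration.
values, labels = mrr_waterfall_inputs(100_000, 12_000, 5_000, 2_000, 4_000)
print(dict(zip(labels, values)))
```

The two lists can then be passed as `exec_waterfall(values, labels, "Net New MRR of $11K Driven by New Logos")`, with the title written as an action title.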
Cohort Heatmap
```python
import pandas as pd
import plotly.express as px


def cohort_heatmap(retention_df, title="Customer Retention by Cohort"):
    """Publication-quality cohort retention heatmap."""
    x_labels = list(retention_df.columns)
    y_labels = [str(c) for c in retention_df.index]
    fig = px.imshow(
        retention_df.values,
        labels=dict(x="Months Since Acquisition", y="Cohort", color="Retention %"),
        x=x_labels,
        y=y_labels,
        color_continuous_scale='Blues',
        aspect='auto',
    )
    # Keep cohort labels as categories so strings like "2024-01" are not parsed as dates
    fig.update_yaxes(type='category')

    # Add text annotations, positioned by axis label rather than array index
    for i, row in enumerate(retention_df.values):
        for j, val in enumerate(row):
            if not pd.isna(val):
                fig.add_annotation(
                    x=x_labels[j], y=y_labels[i],
                    text=f"{val:.0f}%",
                    showarrow=False,
                    # White text on dark cells, black text on light cells
                    font=dict(color='white' if val > 50 else 'black', size=10)
                )

    fig.update_layout(
        title=dict(text=f"<b>{title}</b>", font=dict(size=18, family="Georgia")),
        font=dict(family="Arial", size=12),
    )
    return fig
```
Streamlit Dashboard Template
```python
import streamlit as st
import pandas as pd
import plotly.express as px

st.set_page_config(page_title="Executive Dashboard", layout="wide")

# Custom CSS for executive styling
st.markdown("""
<style>
.metric-card {
    background: linear-gradient(135deg, #003366, #0066CC);
    padding: 20px;
    border-radius: 10px;
    color: white;
}
.stMetric label { font-family: Georgia, serif; }
</style>
""", unsafe_allow_html=True)

# Data and KPI inputs, using the helpers defined earlier in this document
df = load_data("data.csv")
metrics = calculate_saas_metrics(df)
retention_df = cohort_retention_analysis(df)
mrr, arr = metrics['mrr'], metrics['arr']
mrr_growth = arr_growth = metrics['mrr_growth']  # ARR = 12 x MRR, so both grow at the same rate
ltv_cac, churn = metrics['ltv_cac_ratio'], metrics['churn_rate']
churn_delta = 0.0  # placeholder: period-over-period change in churn, not computed above

# Header
st.title("Executive Dashboard")
st.markdown("---")

# KPI Row
col1, col2, col3, col4 = st.columns(4)
with col1:
    st.metric("MRR", f"${mrr:,.0f}", f"{mrr_growth:+.1%}")
with col2:
    st.metric("ARR", f"${arr:,.0f}", f"{arr_growth:+.1%}")
with col3:
    st.metric("LTV:CAC", f"{ltv_cac:.1f}x", delta_color="normal")
with col4:
    # "inverse" colors a rising churn red and a falling churn green
    st.metric("Churn", f"{churn:.1%}", f"{churn_delta:+.1%}", delta_color="inverse")

# Charts Row
st.markdown("## Revenue Trend")
st.plotly_chart(exec_line_chart(df, 'month', 'revenue', 'MRR Growth Exceeds Target'), use_container_width=True)

# Cohort Analysis
st.markdown("## Cohort Retention")
st.plotly_chart(cohort_heatmap(retention_df), use_container_width=True)
```
Investor Presentation Patterns
Pitch Deck Metrics Sequence
```yaml
investor_metrics_flow:
  1_unit_economics:
    charts: ["CAC vs LTV bar", "LTV:CAC trend line"]
    key_message: "3x+ LTV:CAC proves efficient growth"
  2_mrr_waterfall:
    charts: ["MRR waterfall (new, expansion, churn, contraction)"]
    key_message: "Net revenue retention > 100%"
  3_cohort_retention:
    charts: ["Cohort heatmap", "Revenue retention curve"]
    key_message: "Strong retention = compounding value"
  4_growth_efficiency:
    charts: ["Magic Number", "CAC payback period"]
    key_message: "Efficient growth engine"
  5_projections:
    charts: ["ARR projection with scenarios"]
    key_message: "Clear path to $X ARR"
```
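Two of the metrics in this sequence, net revenue retention and the Magic Number, are not computed elsewhere in this document. The sketch below uses one common definition of each. The function names and all figures are invented for illustration, and a given investor may define these metrics slightly differently.

```python
def net_revenue_retention(
    starting_mrr: float, expansion: float, contraction: float, churn: float
) -> float:
    """NRR for a set of existing customers; new-customer MRR is excluded."""
    return (starting_mrr + expansion - contraction - churn) / starting_mrr


def magic_number(
    revenue_this_quarter: float,
    revenue_last_quarter: float,
    sales_marketing_last_quarter: float,
) -> float:
    """Annualized quarterly revenue growth per dollar of prior-quarter S&M spend."""
    return (revenue_this_quarter - revenue_last_quarter) * 4 / sales_marketing_last_quarter


# Invented figures for illustration.
nrr = net_revenue_retention(100_000, 9_000, 2_000, 4_000)
magic = magic_number(330_000, 300_000, 150_000)
print(f"NRR: {nrr:.0%}, Magic Number: {magic:.2f}")
```

Here expansion outweighs contraction and churn, so NRR is 103%, which supports the "Net revenue retention > 100%" message. The Magic Number is 0.80.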
Action Titles (McKinsey Style)
| ❌ Bad (Descriptive) | ✅ Good (Action) |
|---|---|
| "Revenue by Quarter" | "Q4 Revenue Exceeded Target by 23%" |
| "Customer Acquisition Cost" | "CAC Decreased 40% While Maintaining Quality" |
| "Cohort Analysis" | "90-Day Retention Improved to 85%, Up From 72%" |
| "Market Size" | "TAM of $4.2B with Clear Path to $500M SAM" |
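Action titles for target-based charts can be generated from the numbers behind the chart, which keeps the headline and the data in sync. The following is a minimal sketch for that one pattern. `action_title` is a hypothetical helper, and the "Exceeded" / "Missed" wording is an assumption.

```python
def action_title(metric: str, actual: float, target: float, period: str = "") -> str:
    """Build a title that states the conclusion, not just the chart's subject."""
    change = (actual - target) / target
    verb = "Exceeded" if change >= 0 else "Missed"
    prefix = f"{period} " if period else ""
    return f"{prefix}{metric} {verb} Target by {abs(change):.0%}"


# Illustrative figures: $12.3M actual revenue against a $10.0M target.
print(action_title("Revenue", actual=12.3, target=10.0, period="Q4"))
```

This prints `Q4 Revenue Exceeded Target by 23%`, matching the first example above. Titles that explain a cause or a trade-off still need to be written by hand.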
Quick Commands
```python
# Load and analyze any file
df = load_data("data.csv")
metrics = calculate_saas_metrics(df)
retention = cohort_retention_analysis(df)

# Generate executive charts
fig = exec_line_chart(df, 'month', 'mrr', 'MRR Growth Accelerating')
fig.write_html("mrr_chart.html")
fig.write_image("mrr_chart.png", scale=2)  # static image export requires the kaleido package

# Run Streamlit dashboard (shell command, not Python):
# streamlit run dashboard.py
```
Integration Notes
- Pairs with: revenue-ops-skill (metrics), pricing-strategy-skill (modeling)
- Stack: Python 3.11+, pandas, polars, plotly, altair, streamlit
- Projects: coperniq-forge (ROI calculators), thetaroom (trading analysis)
- NO OPENAI: Use Claude for narrative generation
Reference Files
- reference/chart-gallery.md - 20+ chart templates with code
- reference/saas-metrics.md - Complete SaaS KPI definitions
- reference/streamlit-patterns.md - Production dashboard patterns
- reference/data-wrangling.md - Format-specific extraction guides