data-analysis

<objective> Enable executive-grade data analysis for VC, PE, and C-suite presentations. Covers data ingestion from any format, SaaS metrics calculations (MRR, LTV, CAC, churn), cohort retention analysis, McKinsey-quality visualizations with Plotly, and Streamlit dashboards. </objective>

<quick_start> Universal data loader:

python

df = load_data("file.csv")  # Supports CSV, Excel, JSON, Parquet, PDF, PPTX

SaaS metrics:

python

metrics = calculate_saas_metrics(df)  # MRR, ARR, LTV, CAC, churn
retention = cohort_retention_analysis(df)  # Retention matrix

McKinsey-style charts: use action titles ("Q4 Revenue Exceeded Target by 23%"), not descriptive titles. </quick_start>

<success_criteria> Analysis is successful when:

Data loaded and cleaned (dropna, dedup, type conversion)
Metrics calculated correctly (MRR, ARR, LTV:CAC, churn, cohort retention)
Charts follow McKinsey principles: action titles, data-ink ratio >80%, one message per chart
Executive colors used (#003366 primary, #2E7D32 positive, #C62828 negative)
Streamlit dashboard runs without errors
NO OPENAI: Use Claude for narrative generation if needed </success_criteria>

<core_content> Executive-grade data analysis for VC, PE, C-suite presentations using pandas, polars, Plotly, Altair, and Streamlit.

Quick Reference

Task	Tools	Output
Data ingestion	pandas, polars, pdfplumber, python-pptx	DataFrame
Wrangling	pandas/polars transforms	Clean dataset
Analysis	numpy, scipy, statsmodels	Insights
Visualization	Plotly, Altair, Seaborn	Charts
Dashboards	Streamlit, DuckDB	Interactive apps
Presentations	Plotly export, PDF generation	Investor-ready

Data Ingestion Patterns

Universal Data Loader

python

import pandas as pd
from pathlib import Path

def load_data(file_path: str, conn=None) -> pd.DataFrame:
    """Load data from any common format.

    `conn` is a database connection; it is only needed for .sql files.
    """
    path = Path(file_path)
    suffix = path.suffix.lower()

    # Lambdas defer helper lookup until the matching format is requested.
    loaders = {
        '.csv': lambda p: pd.read_csv(p),
        '.xlsx': lambda p: pd.read_excel(p, engine='openpyxl'),
        '.xls': lambda p: pd.read_excel(p, engine='xlrd'),
        '.json': lambda p: pd.read_json(p),
        '.parquet': lambda p: pd.read_parquet(p),
        '.sql': lambda p: pd.read_sql(p.read_text(), conn),  # file holds the query
        '.md': lambda p: parse_markdown_tables(p),  # helper must be supplied separately
        '.pdf': lambda p: extract_pdf_tables(p),
        '.pptx': lambda p: extract_pptx_tables(p),  # returns a list of DataFrames
    }

    if suffix not in loaders:
        raise ValueError(f"Unsupported format: {suffix}")
    if suffix == '.sql' and conn is None:
        raise ValueError("A database connection (conn) is required for .sql files")

    return loaders[suffix](path)
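The dispatch-table pattern used by `load_data` can be tried without any third-party packages. The sketch below is a simplified stand-in rather than the loader itself: `load_records`, `_load_csv`, `_load_json`, and `LOADERS` are hypothetical names, only two formats are covered, and rows come back as a list of dicts instead of a DataFrame.

```python
import csv
import json
from pathlib import Path


def _load_csv(path: Path) -> list[dict]:
    # csv.DictReader treats the first row as the header.
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def _load_json(path: Path) -> list[dict]:
    # Assumes the file holds a JSON array of objects.
    return json.loads(path.read_text(encoding="utf-8"))


# Map each supported suffix to the function that reads it.
LOADERS = {".csv": _load_csv, ".json": _load_json}


def load_records(file_path: str) -> list[dict]:
    """Pick a loader by file suffix and reject anything unregistered."""
    path = Path(file_path)
    suffix = path.suffix.lower()  # lower() makes ".CSV" match ".csv"
    if suffix not in LOADERS:
        raise ValueError(f"Unsupported format: {suffix}")
    return LOADERS[suffix](path)
```

Supporting another format means registering one more entry in the table, which keeps the branching logic out of the loader function.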

PDF Table Extraction

python

import pandas as pd
import pdfplumber

def extract_pdf_tables(pdf_path: str) -> pd.DataFrame:
    """Extract tables from PDF using pdfplumber."""
    all_tables = []

    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            tables = page.extract_tables()
            for table in tables:
                if table and len(table) > 1:
                    df = pd.DataFrame(table[1:], columns=table[0])
                    all_tables.append(df)

    return pd.concat(all_tables, ignore_index=True) if all_tables else pd.DataFrame()

PowerPoint Data Extraction

python

import pandas as pd
from pptx import Presentation

def extract_pptx_tables(pptx_path: str) -> list[pd.DataFrame]:
    """Extract all tables from PowerPoint."""
    prs = Presentation(pptx_path)
    tables = []

    for slide in prs.slides:
        for shape in slide.shapes:
            if shape.has_table:
                table = shape.table
                data = []
                for row in table.rows:
                    data.append([cell.text for cell in row.cells])
                df = pd.DataFrame(data[1:], columns=data[0])
                tables.append(df)

    return tables

Data Wrangling Patterns

Polars for Performance (up to 30x faster than pandas, depending on workload)

python

import polars as pl

# Lazy evaluation for large datasets
df = (
    pl.scan_csv("large_file.csv")
    .filter(pl.col("revenue") > 0)
    .with_columns([
        (pl.col("revenue") / pl.col("customers")).alias("arpu"),
        pl.col("date").str.to_date().alias("date_parsed"),
    ])
    .group_by("segment")
    .agg([
        pl.col("revenue").sum().alias("total_revenue"),
        pl.col("customers").mean().alias("avg_customers"),
    ])
    .collect()
)

Common Transformations

python

def prepare_for_analysis(df: pd.DataFrame) -> pd.DataFrame:
    """Standard data prep pipeline."""
    return (df
        .dropna(subset=['key_column'])
        .drop_duplicates()
        .assign(
            date=lambda x: pd.to_datetime(x['date']),
            revenue=lambda x: pd.to_numeric(x['revenue'], errors='coerce'),
            month=lambda x: x['date'].dt.to_period('M'),
        )
        .sort_values('date')
        .reset_index(drop=True)
    )

SaaS Metrics Calculations

Core Metrics

python

def calculate_saas_metrics(df: pd.DataFrame) -> dict:
    """Calculate key SaaS metrics for investor reporting."""

    # MRR / ARR
    mrr = df.groupby('month')['mrr'].sum()
    arr = mrr.iloc[-1] * 12

    # Growth rates
    mrr_growth = mrr.pct_change().iloc[-1]

    # Churn
    churned = df[df['status'] == 'churned']['mrr'].sum()
    total_mrr = df['mrr'].sum()
    churn_rate = churned / total_mrr if total_mrr > 0 else 0

    # CAC & LTV
    total_sales_marketing = df['sales_cost'].sum() + df['marketing_cost'].sum()
    new_customers = df[df['is_new']]['customer_id'].nunique()
    cac = total_sales_marketing / new_customers if new_customers > 0 else 0

    avg_revenue_per_customer = df.groupby('customer_id')['mrr'].mean().mean()
    avg_lifespan_months = 1 / churn_rate if churn_rate > 0 else 36
    ltv = avg_revenue_per_customer * avg_lifespan_months

    ltv_cac_ratio = ltv / cac if cac > 0 else 0
    cac_payback_months = cac / avg_revenue_per_customer if avg_revenue_per_customer > 0 else 0

    return {
        'mrr': mrr.iloc[-1],
        'arr': arr,
        'mrr_growth': mrr_growth,
        'churn_rate': churn_rate,
        'cac': cac,
        'ltv': ltv,
        'ltv_cac_ratio': ltv_cac_ratio,
        'cac_payback_months': cac_payback_months,
    }
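To make the LTV and CAC arithmetic above concrete, here is a dependency-free worked example. It is a sketch under simplifying assumptions: `unit_economics` is a hypothetical helper, churn is a constant monthly customer churn rate, and LTV is revenue-based with no gross margin applied, mirroring the `1 / churn_rate` lifespan used in `calculate_saas_metrics`.

```python
def unit_economics(arpa: float, monthly_churn: float, cac: float) -> dict:
    """Derive lifespan, LTV, LTV:CAC, and CAC payback from three inputs.

    arpa: average monthly revenue per account
    monthly_churn: fraction of customers lost per month (0.025 = 2.5%)
    cac: cost to acquire one customer
    """
    if not 0 < monthly_churn <= 1:
        raise ValueError("monthly_churn must be in (0, 1]")

    lifespan_months = 1 / monthly_churn  # expected customer lifetime in months
    ltv = arpa * lifespan_months         # revenue-based LTV, no margin applied

    return {
        "lifespan_months": lifespan_months,
        "ltv": ltv,
        "ltv_cac_ratio": ltv / cac if cac > 0 else 0.0,
        "cac_payback_months": cac / arpa if arpa > 0 else 0.0,
    }


# Illustrative inputs only; these are not figures from any real company.
result = unit_economics(arpa=500.0, monthly_churn=0.025, cac=5000.0)
```

With these illustrative inputs the expected lifetime is 40 months, LTV is 20,000, LTV:CAC is 4.0x, and CAC payback is 10 months.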

Cohort Analysis

python

def cohort_retention_analysis(df: pd.DataFrame) -> pd.DataFrame:
    """Build cohort retention matrix for investor reporting."""

    # Work on a copy so the caller's DataFrame is not mutated
    df = df.copy()

    # Assign cohort (first purchase month)
    df['cohort'] = df.groupby('customer_id')['date'].transform('min').dt.to_period('M')
    df['period'] = df['date'].dt.to_period('M')
    df['cohort_age'] = (df['period'] - df['cohort']).apply(lambda x: x.n)

    # Build retention matrix
    cohort_data = df.groupby(['cohort', 'cohort_age']).agg({
        'customer_id': 'nunique',
        'revenue': 'sum'
    }).reset_index()

    # Pivot for visualization
    cohort_counts = cohort_data.pivot(
        index='cohort',
        columns='cohort_age',
        values='customer_id'
    )

    # Calculate retention percentages
    cohort_sizes = cohort_counts.iloc[:, 0]
    retention = cohort_counts.divide(cohort_sizes, axis=0) * 100

    return retention
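The retention matrix logic can also be followed step by step in plain Python. The sketch below is illustrative only: `cohort_retention` and `month_index` are hypothetical names, the input is assumed to be a list of `(customer_id, date)` activity events, and the result is a nested dict rather than a pivoted DataFrame.

```python
from collections import defaultdict
from datetime import date


def month_index(d: date) -> int:
    """Count months on a single scale so subtraction gives a month gap."""
    return d.year * 12 + (d.month - 1)


def cohort_retention(events: list[tuple[str, date]]) -> dict[str, dict[int, float]]:
    """Return {cohort 'YYYY-MM': {cohort_age_in_months: retention %}}."""
    # The first activity month of each customer defines the cohort.
    first_seen: dict[str, int] = {}
    for customer, day in events:
        m = month_index(day)
        first_seen[customer] = min(m, first_seen.get(customer, m))

    # Collect distinct active customers per (cohort, age) cell.
    active: dict[tuple[int, int], set[str]] = defaultdict(set)
    for customer, day in events:
        cohort = first_seen[customer]
        active[(cohort, month_index(day) - cohort)].add(customer)

    retention: dict[str, dict[int, float]] = defaultdict(dict)
    for (cohort, age), customers in sorted(active.items()):
        size = len(active[(cohort, 0)])  # cohort size = customers active at age 0
        label = f"{cohort // 12:04d}-{cohort % 12 + 1:02d}"
        retention[label][age] = 100 * len(customers) / size
    return dict(retention)


# Made-up activity log: customers a and b start in January, c in February.
events = [
    ("a", date(2024, 1, 5)), ("a", date(2024, 2, 9)), ("a", date(2024, 3, 2)),
    ("b", date(2024, 1, 20)), ("b", date(2024, 3, 15)),
    ("c", date(2024, 2, 1)), ("c", date(2024, 3, 30)),
]
matrix = cohort_retention(events)
```

Because customer b skips February and returns in March, the January cohort dips to 50% at age 1 and recovers to 100% at age 2. This counts activity per month, so reactivated customers raise later cells.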

Executive Visualization

McKinsey/BCG Chart Principles

yaml

mckinsey_style:
  colors:
    primary: "#003366"      # Deep blue
    accent: "#0066CC"       # Bright blue
    positive: "#2E7D32"     # Green
    negative: "#C62828"     # Red
    neutral: "#757575"      # Gray

  typography:
    title: "Georgia, serif"
    body: "Arial, sans-serif"
    size_title: 18
    size_body: 12

  principles:
    - "One message per chart"
    - "Action title (not descriptive)"
    - "Data-ink ratio > 80%"
    - "Remove chartjunk"
    - "Label directly on chart"

Plotly Executive Charts

python

import plotly.express as px
import plotly.graph_objects as go

EXEC_COLORS = {
    'primary': '#003366',
    'secondary': '#0066CC',
    'positive': '#2E7D32',
    'negative': '#C62828',
    'neutral': '#757575',
}

def exec_line_chart(df, x, y, title):
    """McKinsey-style line chart."""
    fig = px.line(df, x=x, y=y)

    fig.update_layout(
        title=dict(
            text=f"<b>{title}</b>",
            font=dict(size=18, family="Georgia"),
            x=0,
        ),
        font=dict(family="Arial", size=12),
        plot_bgcolor='white',
        xaxis=dict(showgrid=False, showline=True, linecolor='black'),
        yaxis=dict(showgrid=True, gridcolor='#E0E0E0', showline=True, linecolor='black'),
        margin=dict(l=60, r=40, t=60, b=40),
    )

    fig.update_traces(line=dict(color=EXEC_COLORS['primary'], width=3))

    return fig

def exec_waterfall(values, labels, title):
    """Waterfall chart for revenue/cost breakdown."""
    fig = go.Figure(go.Waterfall(
        orientation="v",
        measure=["relative"] * (len(values) - 1) + ["total"],
        x=labels,
        y=values,
        connector=dict(line=dict(color="rgb(63, 63, 63)")),
        increasing=dict(marker=dict(color=EXEC_COLORS['positive'])),
        decreasing=dict(marker=dict(color=EXEC_COLORS['negative'])),
        totals=dict(marker=dict(color=EXEC_COLORS['primary'])),
    ))

    fig.update_layout(
        title=dict(text=f"<b>{title}</b>", font=dict(size=18, family="Georgia")),
        font=dict(family="Arial", size=12),
        plot_bgcolor='white',
        showlegend=False,
    )

    return fig
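`exec_waterfall` expects signed values, with the last entry treated as the total. One way to prepare those inputs for an MRR bridge is sketched below; `mrr_bridge` and its parameter names are hypothetical, and the numbers are illustrative rather than real data.

```python
def mrr_bridge(
    start: float,
    new: float,
    expansion: float,
    contraction: float,
    churn: float,
) -> tuple[list[str], list[float], list[str]]:
    """Build labels, signed values, and measures for an MRR waterfall."""
    ending = start + new + expansion - contraction - churn
    labels = ["Starting MRR", "New", "Expansion", "Contraction", "Churn", "Ending MRR"]
    # Losses are passed as negatives so they render as decreasing bars.
    values = [start, new, expansion, -contraction, -churn, ending]
    # Every bar is a relative step except the final total.
    measures = ["relative"] * (len(values) - 1) + ["total"]
    return labels, values, measures


# Illustrative figures only.
labels, values, measures = mrr_bridge(
    start=100_000, new=12_000, expansion=5_000, contraction=2_000, churn=4_000
)
```

The resulting `values` and `labels` can be passed to `exec_waterfall(values, labels, title)`. The `measures` list matches what that function builds internally, so it mainly serves as a cross-check here.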

Cohort Heatmap

python

def cohort_heatmap(retention_df, title="Customer Retention by Cohort"):
    """Publication-quality cohort retention heatmap."""

    fig = px.imshow(
        retention_df.values,
        labels=dict(x="Months Since Acquisition", y="Cohort", color="Retention %"),
        x=list(retention_df.columns),
        y=[str(c) for c in retention_df.index],
        color_continuous_scale='Blues',
        aspect='auto',
    )

    # Add text annotations
    for i, row in enumerate(retention_df.values):
        for j, val in enumerate(row):
            if not pd.isna(val):
                fig.add_annotation(
                    x=j, y=i,
                    text=f"{val:.0f}%",
                    showarrow=False,
                    font=dict(color='white' if val > 50 else 'black', size=10)
                )

    fig.update_layout(
        title=dict(text=f"<b>{title}</b>", font=dict(size=18, family="Georgia")),
        font=dict(family="Arial", size=12),
    )

    return fig

Streamlit Dashboard Template

python

import streamlit as st
import pandas as pd
import plotly.express as px

st.set_page_config(page_title="Executive Dashboard", layout="wide")

# Custom CSS for executive styling
st.markdown("""

""", unsafe_allow_html=True)

# Header
st.title("Executive Dashboard")
st.markdown("---")

# KPI Row
col1, col2, col3, col4 = st.columns(4)

with col1:
    st.metric("MRR", f"${mrr:,.0f}", f"{mrr_growth:+.1%}")
with col2:
    st.metric("ARR", f"${arr:,.0f}", f"{arr_growth:+.1%}")
with col3:
    st.metric("LTV:CAC", f"{ltv_cac:.1f}x", delta_color="normal")
with col4:
    st.metric("Churn", f"{churn:.1%}", f"{churn_delta:+.1%}", delta_color="inverse")

# Charts Row
st.markdown("## Revenue Trend")
st.plotly_chart(exec_line_chart(df, 'month', 'revenue', 'MRR Growth Exceeds Target'), use_container_width=True)

# Cohort Analysis
st.markdown("## Cohort Retention")
st.plotly_chart(cohort_heatmap(retention_df), use_container_width=True)

Investor Presentation Patterns

Pitch Deck Metrics Sequence

yaml

investor_metrics_flow:
  1_unit_economics:
    charts: ["CAC vs LTV bar", "LTV:CAC trend line"]
    key_message: "3x+ LTV:CAC proves efficient growth"

  2_mrr_waterfall:
    charts: ["MRR waterfall (new, expansion, churn, contraction)"]
    key_message: "Net revenue retention > 100%"

  3_cohort_retention:
    charts: ["Cohort heatmap", "Revenue retention curve"]
    key_message: "Strong retention = compounding value"

  4_growth_efficiency:
    charts: ["Magic Number", "CAC payback period"]
    key_message: "Efficient growth engine"

  5_projections:
    charts: ["ARR projection with scenarios"]
    key_message: "Clear path to $X ARR"
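Two of the headline numbers in this sequence, net revenue retention and the Magic Number, come down to short formulas. The sketch below uses their commonly cited definitions; the function and parameter names are hypothetical, and a given deck may define the inputs slightly differently (for example ARR instead of quarterly revenue).

```python
def net_revenue_retention(
    start_mrr: float, expansion: float, contraction: float, churn: float
) -> float:
    """MRR kept from existing customers over a period, as a fraction of the start.

    New-customer MRR is deliberately excluded.
    """
    if start_mrr <= 0:
        raise ValueError("start_mrr must be positive")
    return (start_mrr + expansion - contraction - churn) / start_mrr


def magic_number(
    revenue_this_quarter: float,
    revenue_prev_quarter: float,
    sales_marketing_prev_quarter: float,
) -> float:
    """Annualized net new quarterly revenue per dollar of prior-quarter S&M spend."""
    if sales_marketing_prev_quarter <= 0:
        raise ValueError("sales_marketing_prev_quarter must be positive")
    net_new = revenue_this_quarter - revenue_prev_quarter
    return net_new * 4 / sales_marketing_prev_quarter


# Illustrative figures only.
nrr = net_revenue_retention(start_mrr=100_000, expansion=8_000, contraction=2_000, churn=3_000)
magic = magic_number(1_200_000, 1_000_000, 800_000)
```

With these inputs NRR is 103%, which supports the "Net revenue retention > 100%" message, and the Magic Number is 1.0.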

Action Titles (McKinsey Style)

Bad (Descriptive) → Good (Action)

❌ "Revenue by Quarter"
✅ "Q4 Revenue Exceeded Target by 23%"

❌ "Customer Acquisition Cost"
✅ "CAC Decreased 40% While Maintaining Quality"

❌ "Cohort Analysis"
✅ "90-Day Retention Improved to 85%, Up From 72%"

❌ "Market Size"
✅ "TAM of $4.2B with Clear Path to $500M SAM"
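An action title for an actual-versus-target chart can be generated directly from the numbers. The helper below is a minimal sketch: `action_title` is a hypothetical name, and its wording rules ("Exceeded", "Missed", "Hit") are one possible convention, not a McKinsey standard.

```python
def action_title(metric: str, actual: float, target: float) -> str:
    """Turn an actual-vs-target comparison into an action title."""
    if target == 0:
        raise ValueError("target must be non-zero")

    gap = (actual - target) / abs(target)  # relative distance from target
    if gap > 0:
        return f"{metric} Exceeded Target by {gap:.0%}"
    if gap < 0:
        return f"{metric} Missed Target by {-gap:.0%}"
    return f"{metric} Hit Target"


# Illustrative values: 12.3 actual against a target of 10.0.
title = action_title("Q4 Revenue", actual=12.3, target=10.0)
```

The generated string can be passed as the `title` argument of `exec_line_chart` or `exec_waterfall`.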

Quick Commands

python

# Load and analyze any file
df = load_data("data.csv")
metrics = calculate_saas_metrics(df)
retention = cohort_retention_analysis(df)

# Generate executive charts
fig = exec_line_chart(df, 'month', 'mrr', 'MRR Growth Accelerating')
fig.write_html("mrr_chart.html")
fig.write_image("mrr_chart.png", scale=2)  # static export requires the kaleido package

# Run Streamlit dashboard (shell command, not Python)
# streamlit run dashboard.py

Integration Notes

Pairs with: revenue-ops-skill (metrics), pricing-strategy-skill (modeling)
Stack: Python 3.11+, pandas, polars, plotly, altair, streamlit
Projects: coperniq-forge (ROI calculators), thetaroom (trading analysis)
NO OPENAI: Use Claude for narrative generation

Reference Files

- `reference/chart-gallery.md` - 20+ chart templates with code
- `reference/saas-metrics.md` - Complete SaaS KPI definitions
- `reference/streamlit-patterns.md` - Production dashboard patterns
- `reference/data-wrangling.md` - Format-specific extraction guides