product-analytics
Product Analytics
Frameworks for turning raw product data into ship/extend/kill decisions. Covers A/B testing, cohort retention, funnel analysis, and the statistical foundations needed to make those decisions with confidence.
Quick Reference
| Category | Rules | Impact | When to Use |
|---|---|---|---|
| A/B Test Evaluation | 1 | HIGH | Comparing variants, measuring significance, shipping decisions |
| Cohort Retention | 1 | HIGH | Feature adoption curves, day-N retention, engagement scoring |
| Funnel Analysis | 1 | HIGH | Drop-off diagnosis, conversion optimization, stage mapping |
| Statistical Foundations | 1 | HIGH | p-value interpretation, sample sizing, confidence intervals |
Total: 4 rules across 4 categories
A/B Test Evaluation
Load `rules/ab-test-evaluation.md` for the full framework. Quick pattern:

```markdown
Experiment: [Name]

Hypothesis: If we [change], then [primary metric] will [direction] by [amount]
because [evidence or reasoning].

Sample size: [N per variant], calculated for MDE=[X%], power=80%, alpha=0.05
Duration: [Minimum weeks]; never stop early (peeking bias)

Results:
  Control:   [metric value]  n=[count]
  Treatment: [metric value]  n=[count]
  Lift:      [+/- X%]  p=[value]  95% CI: [lower, upper]

Decision: SHIP / EXTEND / KILL
Rationale: [One sentence grounded in numbers, not gut feel]
```

**Decision rules:**
- **SHIP**: p < 0.05, CI excludes zero on the favorable side, no guardrail regressions
- **EXTEND**: trending positive but underpowered (add runtime, not reanalysis)
- **KILL**: null result or guardrail degradation

See `rules/ab-test-evaluation.md` for sample size formulas, SRM checks, and pitfall list.
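To make the Results line of the template concrete, here is a minimal sketch of how lift, p-value, and confidence interval could be computed for a conversion-rate metric. It assumes a two-sided two-proportion z-test, which is a common choice but not necessarily the method in `rules/ab-test-evaluation.md`. The function name `evaluate_ab` and the counts are hypothetical. Note that the interval is on the absolute difference in conversion rate, while the lift is reported as a relative change.

```python
from math import sqrt
from statistics import NormalDist


def evaluate_ab(control_conversions, control_n,
                treatment_conversions, treatment_n, alpha=0.05):
    """Two-sided two-proportion z-test with a CI on the absolute difference."""
    p_control = control_conversions / control_n
    p_treatment = treatment_conversions / treatment_n
    diff = p_treatment - p_control

    # Pooled standard error: used for the test, which assumes no difference.
    pooled = (control_conversions + treatment_conversions) / (control_n + treatment_n)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / treatment_n))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error: used for the confidence interval.
    se = sqrt(p_control * (1 - p_control) / control_n
              + p_treatment * (1 - p_treatment) / treatment_n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return {
        "relative_lift": diff / p_control,
        "p_value": p_value,
        "ci_abs_diff": (diff - z_crit * se, diff + z_crit * se),
    }


# Hypothetical counts: 10.0% vs 11.0% conversion, 10,000 users per variant.
result = evaluate_ab(1_000, 10_000, 1_100, 10_000)
print(result)
```

With these made-up counts the p-value falls below 0.05 and the interval sits above zero, so the SHIP rule would apply provided guardrail metrics are also healthy.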
Cohort Retention
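Load `rules/cohort-retention.md` for full methodology. Quick pattern:

```sql
-- Day-7 retention by weekly cohort (PostgreSQL syntax).
-- Assumes one row per user per active day, with first_seen and
-- activity_date both stored as DATE values.
SELECT
  DATE_TRUNC('week', first_seen) AS cohort_week,
  COUNT(DISTINCT user_id) AS cohort_size,
  -- Share of the cohort that was active exactly 7 days after first_seen
  COUNT(DISTINCT CASE
          WHEN activity_date = first_seen + INTERVAL '7 days'
          THEN user_id END) * 100.0
    / COUNT(DISTINCT user_id) AS day_7_retention
FROM user_activity
-- Skip users who have not yet reached day 7, so recent cohorts are not deflated
WHERE first_seen <= CURRENT_DATE - INTERVAL '7 days'
GROUP BY 1
ORDER BY 1;
```

Retention benchmarks (SaaS):
- Day 1: 40–60% is healthy
- Day 7: 20–35% is healthy
- Day 30: 10–20% is healthy
- Flat curve after day 30 = product-market fit signal

See `rules/cohort-retention.md` for behavior-based cohorts, feature adoption curves, and engagement scoring.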
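The same day-N calculation can be sketched outside the database, which is handy for checking a query against a small sample. This is an illustrative version only: the function name `day_n_retention` and the event tuples are hypothetical, and it assumes weeks start on Monday, which matches PostgreSQL's `DATE_TRUNC('week', ...)`.

```python
from collections import defaultdict
from datetime import date, timedelta


def day_n_retention(events, n=7):
    """events: iterable of (user_id, activity_date) pairs.

    Returns {cohort_week_start: (cohort_size, retention_percent)}.
    """
    active_days = defaultdict(set)
    for user_id, day in events:
        active_days[user_id].add(day)

    cohorts = defaultdict(lambda: [0, 0])  # week start -> [size, retained]
    for days in active_days.values():
        first_seen = min(days)
        # Monday of the week in which the user was first seen.
        week_start = first_seen - timedelta(days=first_seen.weekday())
        cohorts[week_start][0] += 1
        if first_seen + timedelta(days=n) in days:
            cohorts[week_start][1] += 1

    return {
        week: (size, 100.0 * retained / size)
        for week, (size, retained) in sorted(cohorts.items())
    }


# Hypothetical activity log.
events = [
    ("u1", date(2026, 3, 2)), ("u1", date(2026, 3, 9)),    # back on day 7
    ("u2", date(2026, 3, 3)), ("u2", date(2026, 3, 5)),    # not back on day 7
    ("u3", date(2026, 3, 10)), ("u3", date(2026, 3, 17)),  # back on day 7
]
retention = day_n_retention(events, n=7)
for week, (size, pct) in retention.items():
    print(week, size, f"{pct:.0f}%")
```

Like the SQL pattern, this counts a user as retained only if they were active exactly on day N. A windowed definition (active at any point in days 7 to 13, for example) would give different numbers, so state the definition alongside any benchmark comparison.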
Funnel Analysis
Load `rules/funnel-analysis.md` for full methodology. Quick pattern:

```markdown
Funnel: [Name] ([Date Range])

Stage 1: [Aware / Land]    → [N] users (entry)
Stage 2: [Activate / Sign] → [N] users ([X]% from stage 1)
Stage 3: [Engage / Use]    → [N] users ([X]% from stage 2) ← biggest drop
Stage 4: [Convert / Pay]   → [N] users ([X]% from stage 3)

Overall conversion: [X]%
Biggest drop-off: Stage 2→3 ([X]% loss); investigate first
```

**Optimization order:** Fix the largest drop-off first. A 5-point improvement at a high-volume step can be worth more than a 20-point improvement at a low-volume step, because the gain applies to many more users. Weigh each candidate fix by how many users enter that step.

See `rules/funnel-analysis.md` for segmented funnels, micro-conversion tracking, and prioritization patterns.
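Filling in the funnel template is simple arithmetic, sketched below under the assumption that each stage is an ordered (name, user count) pair. The helper `analyze_funnel`, the `StepReport` type, and the counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StepReport:
    from_stage: str
    to_stage: str
    conversion: float  # share of users who moved on to the next stage
    drop_off: float    # share of users lost at this step


def analyze_funnel(stages):
    """stages: ordered list of (name, user_count) pairs, entry stage first."""
    steps = []
    for (prev_name, prev_n), (name, n) in zip(stages, stages[1:]):
        conversion = n / prev_n if prev_n else 0.0
        steps.append(StepReport(prev_name, name, conversion, 1.0 - conversion))
    entry_n = stages[0][1]
    overall = stages[-1][1] / entry_n if entry_n else 0.0
    worst = max(steps, key=lambda step: step.drop_off)
    return steps, overall, worst


# Hypothetical counts for a four-stage funnel.
funnel = [("Land", 10_000), ("Sign up", 4_000), ("Use", 1_200), ("Pay", 600)]
steps, overall, worst = analyze_funnel(funnel)

for step in steps:
    print(f"{step.from_stage} -> {step.to_stage}: {step.conversion:.0%}")
print(f"Overall conversion: {overall:.1%}")
print(f"Biggest drop-off: {worst.from_stage} -> {worst.to_stage} "
      f"({worst.drop_off:.0%} loss)")
```

In this made-up funnel the largest loss is between sign-up and first use, so that step would be investigated first.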
Statistical Foundations
Plain-English explanations of the stats every PM needs. Load `references/stats-cheat-sheet.md` for formulas and quick lookups.

**p-value in plain English:** The probability that you would see a result this extreme (or more extreme) if the change had zero effect. p=0.03 means that if the change truly did nothing, a result at least this extreme would still show up about 3% of the time. It does NOT mean "3% chance the result is noise" or "97% probability the change works."

**Confidence interval in plain English:** The range of effect sizes that is consistent with the data. "Lift = +8%, 95% CI [+2%, +14%]" means the data are compatible with a real lift anywhere between 2% and 14%; across repeated experiments, intervals built this way contain the true effect about 95% of the time. If the CI includes zero, you cannot claim a win.

**Minimum Detectable Effect (MDE):** The smallest lift you care about detecting. Setting MDE too small forces impractically large sample sizes, because the required sample grows roughly with 1/MDE². Anchor MDE to business value: if a 2% lift is not worth shipping but a 5% lift is, set MDE = 5%.

**Statistical vs practical significance:** A result can be statistically significant (p < 0.05) but practically meaningless (lift = 0.01%). Always check both. A 0.01% lift that costs 6 weeks of eng time is not a win.
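The link between MDE and sample size can be illustrated with the standard normal-approximation formula for comparing two proportions. This is a sketch under that assumption and may differ from the formula in `references/stats-cheat-sheet.md`. The function name `sample_size_per_variant` and the 10% baseline are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Approximate users needed per variant for a two-sided two-proportion test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical baseline conversion of 10%, with two candidate MDE settings.
n_mde_5 = sample_size_per_variant(0.10, 0.05)
n_mde_2 = sample_size_per_variant(0.10, 0.02)
print(f"MDE 5%: {n_mde_5:,} users per variant")
print(f"MDE 2%: {n_mde_2:,} users per variant")
```

Under these assumptions the 2% target needs roughly six times as many users per variant as the 5% target, which is why MDE should be tied to the smallest lift that would actually change the decision.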
Common Pitfalls
- **Peeking**: stopping an experiment early because results look good inflates the false-positive rate. Commit to a runtime before launch.
- **Multiple comparisons**: testing 10 independent metrics at p < 0.05 gives about a 40% chance of at least one false positive (1 − 0.95^10 ≈ 0.40) even when nothing changed. Apply Bonferroni correction or pre-register your primary metric.
- **Sample Ratio Mismatch (SRM)**: if variant group sizes deviate from the expected split by more than chance allows, your experiment is broken. A gap above 1% is a rough warning sign; confirm with a chi-square test on the group counts. Fix before analyzing results.
- **Novelty effect**: new features get inflated engagement in week 1. Run experiments long enough to see settled behavior (minimum 2 full business cycles).
- **Simpson's paradox**: aggregate results can reverse when segmented. Always check results by key segments (device, plan tier, geography).
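Two of the pitfalls above lend themselves to quick numeric checks. The sketch below computes the chance of at least one false positive across several independent metrics, the Bonferroni-adjusted threshold, and an SRM p-value from a chi-square goodness-of-fit test with one degree of freedom. The function names and counts are hypothetical, and the p < 0.001 cutoff used for SRM is a common convention rather than something this document specifies.

```python
from math import erfc, sqrt


def family_wise_error_rate(num_metrics, alpha=0.05):
    """Chance of at least one false positive across independent metrics."""
    return 1 - (1 - alpha) ** num_metrics


def bonferroni_alpha(num_metrics, alpha=0.05):
    """Per-metric threshold that keeps the family-wise rate at or below alpha."""
    return alpha / num_metrics


def srm_p_value(control_n, treatment_n, expected_control_share=0.5):
    """Chi-square goodness-of-fit p-value for a two-group split (1 df)."""
    total = control_n + treatment_n
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control
    chi2 = ((control_n - expected_control) ** 2 / expected_control
            + (treatment_n - expected_treatment) ** 2 / expected_treatment)
    # With one degree of freedom, P(chi-square > x) equals erfc(sqrt(x / 2)).
    return erfc(sqrt(chi2 / 2))


fwer = family_wise_error_rate(10)
per_metric_alpha = bonferroni_alpha(10)
p_skewed = srm_p_value(50_000, 51_500)    # hypothetical skewed 50/50 split
p_balanced = srm_p_value(50_000, 50_100)  # hypothetical healthy 50/50 split

print(f"Chance of >=1 false positive across 10 metrics: {fwer:.0%}")
print(f"Bonferroni per-metric alpha: {per_metric_alpha}")
print(f"SRM suspected (skewed): {p_skewed < 0.001}")
print(f"SRM suspected (balanced): {p_balanced < 0.001}")
```

A test-based check scales with sample size in a way a fixed percentage threshold does not: on very large experiments a gap well under 1% can already indicate an assignment bug.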
Ship / Extend / Kill Framework
| Signal | Decision | Action |
|---|---|---|
| p < 0.05, CI excludes zero, guardrails green | SHIP | Full rollout, update success metrics |
| Positive trend, underpowered (p = 0.05–0.15) | EXTEND | Add runtime, do not peek again |
| p > 0.15, flat or negative | KILL | Revert, document learnings, re-hypothesize |
| Guardrail regression, any p-value | KILL | Immediate revert regardless of primary metric |
| SRM detected | INVALID | Fix assignment bug, restart experiment |
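The table can be read as an ordered set of checks, sketched below as a small function. This is an illustration rather than a canonical implementation: the name `decide` and its parameters are hypothetical, validity and guardrail checks are assumed to take priority over the primary metric, and any positive-trending result with p between 0.05 and 0.15 is treated as EXTEND, which is an assumption about the intended boundary.

```python
def decide(p_value, ci_low, ci_high,
           guardrail_regression=False, srm_detected=False):
    """Map experiment signals to SHIP / EXTEND / KILL / INVALID.

    ci_low and ci_high bound the treatment-minus-control effect.
    """
    if srm_detected:
        return "INVALID"  # assignment is broken; results cannot be trusted
    if guardrail_regression:
        return "KILL"     # revert regardless of the primary metric
    if p_value < 0.05 and ci_low > 0:
        return "SHIP"     # significant, and the CI sits above zero
    trending_positive = (ci_low + ci_high) / 2 > 0
    if trending_positive and p_value <= 0.15:
        return "EXTEND"   # promising but underpowered
    return "KILL"         # flat, negative, or too noisy to pursue


# Hypothetical readouts.
print(decide(0.02, 0.002, 0.018))                             # SHIP
print(decide(0.12, -0.001, 0.011))                            # EXTEND
print(decide(0.40, -0.006, 0.009))                            # KILL
print(decide(0.01, 0.004, 0.020, guardrail_regression=True))  # KILL
print(decide(0.01, 0.004, 0.020, srm_detected=True))          # INVALID
```

Checking SRM and guardrails first means a strong primary-metric result can never override a broken or harmful experiment.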
Related Skills
- `ork:product-frameworks`: OKRs, KPI trees, RICE prioritization, PRD templates
- `ork:metrics-instrumentation`: Event naming, metric definition, alerting setup
- `ork:brainstorm`: Generate hypotheses and experiment ideas
- `ork:assess`: Evaluate product quality and risks
References
- `rules/ab-test-evaluation.md`: Hypothesis, sample size, significance, decision matrix
- `rules/cohort-retention.md`: Cohort types, retention curves, SQL patterns
- `rules/funnel-analysis.md`: Stage mapping, drop-off identification, optimization
- `references/stats-cheat-sheet.md`: Formulas, test selection, power analysis
Version: 1.0.0 (March 2026)