| Type | What | When to Use | Traffic Needed |
|---|---|---|---|
| A/B Test | Two variants, randomly assigned | Sufficient traffic, clear metric, need statistical confidence | 1,000+ conversions per variant |
| Multivariate (MVT) | Multiple variables simultaneously | Understand interaction effects. Only with very high traffic | Much higher than A/B |
| Feature Flag / Progressive Rollout | Release to small %, gradually increase | New feature launches with risk mitigation | N/A (no statistical rigor needed) |
| Phased Rollout | Internal -> beta -> 10% -> 25% -> 50% -> 100% | Major launches with high risk | Monitor guardrails at each phase |
| Fake Door Test | Show non-existent feature, measure click rate | Validate demand before building | Low (measuring interest only) |
| Holdout Test | Keep 5-10% on old experience permanently | Measuring long-term cumulative impact | Months of duration |
We believe that [CHANGE]
will cause [EFFECT]
for [SEGMENT]
because [RATIONALE]
which we will measure by [METRIC]

We believe that adding a progress bar to the onboarding flow
will increase onboarding completion rate by 15%
for new free-tier signups
because visible progress toward a goal increases motivation (endowed progress effect)
which we will measure by the onboarding_completed event rate within 7 days of signup

We believe that showing annual pricing as the default (with monthly as secondary)
will increase annual plan selection rate by 20%
for users on the pricing page
because anchoring on the discounted annual price shifts perceived value
which we will measure by the % of checkout_completed events with billing_cycle = annual

| Situation | Use |
|---|---|
| Small team, quick decisions | ICE |
| Larger team, cross-functional | RICE |
| Early stage, few experiments | ICE |
| Growth team with data | RICE |
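
The two scoring methods can be sketched in a few lines. This is a minimal illustration, not a canonical definition: it assumes ICE inputs on a 1-10 scale multiplied together (some teams average them instead) and the common RICE form of reach × impact × confidence ÷ effort. The function names and example values are hypothetical.

```python
def ice_score(impact: float, confidence: float, ease: float) -> float:
    """ICE total, assuming each input is rated 1-10 and the three are multiplied."""
    return impact * confidence * ease


def rice_score(reach: float, impact: float, confidence: float, effort: float) -> float:
    """RICE total, assuming (reach * impact * confidence) / effort.

    reach: users affected per period; confidence: fraction between 0 and 1;
    effort: person-weeks (any consistent unit works).
    """
    if effort <= 0:
        raise ValueError("effort must be positive")
    return reach * impact * confidence / effort


# Hypothetical backlog entries ranked by ICE, highest first.
backlog = [
    ("Progress bar in onboarding", ice_score(impact=7, confidence=6, ease=8)),
    ("Annual pricing as default", ice_score(impact=8, confidence=5, ease=9)),
]
ranked = sorted(backlog, key=lambda item: item[1], reverse=True)
```

Whichever convention a team picks, the ranking is only meaningful if every idea in the backlog is scored the same way.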
Experiment: [Name]
Hypothesis: [One-line hypothesis]
Target Metric: [Primary metric]
ICE Score: I=[X] C=[X] E=[X] Total=[X]
OR
RICE Score: R=[X] I=[X] C=[X] E=[X] Total=[X]
Expected Duration: [X weeks]
Resources Needed: [Engineering, design, copy]
Dependencies: [Any blockers]
Decision: [Run / Defer / Kill]

Backlog (20-50 scored ideas) -> Designed (3-5 ready) -> Running (2-4 active) -> Analyzing (1-2 awaiting) -> Learnings Documented (decision recorded)

| Baseline Rate | MDE (Relative) | Conversions Per Variant |
|---|---|---|
| 2% | 20% (2% -> 2.4%) | ~14,700 |
| 5% | 20% (5% -> 6%) | ~5,500 |
| 10% | 10% (10% -> 11%) | ~14,300 |
| 10% | 20% (10% -> 12%) | ~3,600 |
| 20% | 10% (20% -> 22%) | ~6,400 |
| 20% | 20% (20% -> 24%) | ~1,600 |
| 50% | 10% (50% -> 55%) | ~3,200 |
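
To size an experiment for a baseline or MDE that the table does not list, a standard two-proportion approximation can be used. The sketch below assumes a two-sided significance level of 0.05 and 80% power, neither of which the table states, and it returns the number of users exposed per variant. Because the table's exact formula and assumptions are unknown, the outputs are not expected to reproduce its rounded figures. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(
    baseline: float,
    relative_mde: float,
    alpha: float = 0.05,   # assumed two-sided significance level
    power: float = 0.80,   # assumed statistical power
) -> int:
    """Approximate users needed per variant to detect a relative lift."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the Bernoulli variances of the two arms.
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return ceil(n)


# Example: 10% baseline, 20% relative MDE (10% -> 12%).
n_users = sample_size_per_variant(baseline=0.10, relative_mde=0.20)
```

Halving the MDE roughly quadruples the required sample, which is why small expected lifts on low-baseline metrics are often impractical to test.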
| Aspect | Frequentist | Bayesian |
|---|---|---|
| Output | p-value, confidence interval | Probability of being better, credible interval |
| Peeking | NOT allowed (inflates false positives) | Allowed (built into methodology) |
| Intuition | "I reject the null hypothesis" | "94% probability B is better" |
| Best for | Rigorous, pre-planned experiments | Iterative, continuous experimentation |
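
The difference in output between the two approaches can be seen by running both on the same data. This is a rough sketch using only the standard library: the frequentist side is a pooled two-proportion z-test, and the Bayesian side assumes uniform Beta(1, 1) priors and estimates the probability by Monte Carlo sampling. The function names and the example counts are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def prob_b_beats_a(conv_a: int, n_a: int, conv_b: int, n_b: int,
                   draws: int = 20_000, seed: int = 0) -> float:
    """Estimate P(rate_B > rate_A) under assumed Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior for each arm is Beta(1 + conversions, 1 + non-conversions).
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        if rate_b > rate_a:
            wins += 1
    return wins / draws


# Hypothetical result: A converts 100 of 1,000 users, B converts 130 of 1,000.
p_value = two_proportion_p_value(100, 1000, 130, 1000)
p_b_better = prob_b_beats_a(100, 1000, 130, 1000)
```

The first number answers "how surprising is this data if there were no difference", while the second is a direct statement about B being better, which is the distinction the Intuition row describes.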
Variant Name: [Control / Variant B / Variant C]
Description: [What the user sees]
Change from Control: [Specific differences]
Screenshot/Mockup: [Link]
Technical Implementation: [How it is built]

| Allocation | Use Case |
|---|---|
| 50/50 | Standard A/B test. Fastest to significance. |
| 70/30 or 80/20 | Limit risk. Larger group gets current experience. |
| 90/10 (Holdout) | Measure long-term cumulative impact. |
| Gradual ramp | 5% -> 25% -> 50% -> 100%. For risky changes. |
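
One common way to implement these splits is deterministic hashing, so that a given user always lands in the same variant without storing assignments. The sketch below is an assumption about how allocation might be built, not something the table prescribes; `assign_variant`, the experiment name, and the weights are all hypothetical.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, weights: dict[str, float]) -> str:
    """Map a user to a variant deterministically; weights should sum to 1."""
    if not weights:
        raise ValueError("weights must not be empty")
    # Salting with the experiment name keeps assignments independent across experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    cumulative = 0.0
    for variant, weight in weights.items():
        cumulative += weight
        if bucket < cumulative:
            return variant
    # Fallback in case floating-point rounding leaves the total just under 1.
    return list(weights)[-1]


# An 80/20 split where the larger group keeps the current experience.
variant = assign_variant("user-123", "pricing-page-v2", {"control": 0.8, "treatment": 0.2})
```

For a gradual ramp, raising the treatment weight while keeping it last in the mapping only moves users from control into treatment, so nobody who has already seen the new experience is switched back.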
Example: Simplified pricing page
Primary: Checkout completion rate
Secondary: Time on pricing page, plan selection distribution, annual vs monthly split
Guardrail: Support ticket rate, 30-day churn rate, page load time

| Guardrails | Primary Metric: Improved | Primary Metric: No Change | Primary Metric: Degraded |
|---|---|---|---|
| OK | SHIP | KILL/ITER | KILL |
| Bad | KILL | KILL | KILL |
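
The decision matrix is small enough to encode directly, which helps keep ship/kill calls consistent across analysts. A minimal sketch follows; the `Outcome` enum and `decide` function are hypothetical names, and the returned labels mirror the matrix cells.

```python
from enum import Enum


class Outcome(Enum):
    IMPROVED = "improved"
    NO_CHANGE = "no_change"
    DEGRADED = "degraded"


def decide(primary: Outcome, guardrails_ok: bool) -> str:
    """Return the matrix decision for a primary-metric outcome and guardrail status."""
    if not guardrails_ok:
        return "KILL"            # any guardrail breach overrides the primary metric
    if primary is Outcome.IMPROVED:
        return "SHIP"
    if primary is Outcome.NO_CHANGE:
        return "KILL/ITER"       # kill, or iterate on the idea
    return "KILL"                # primary metric degraded


decision = decide(Outcome.IMPROVED, guardrails_ok=True)  # "SHIP"
```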
---

| Metric | Target |
|---|---|
| Experiments per month | 4-8 for small teams, 15-30+ for mature programs |
| Win rate | 15-30% (a rate above 50% suggests the experiments are not bold enough) |
| Cumulative impact | Track quarterly compound impact |
| Idea-to-result cycle time | 2-4 weeks |
| Experiment coverage | >50% of key user flows |
| Inconclusive rate | <30% |
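
Win rate and inconclusive rate are simple ratios over concluded experiments, so they can be computed from an experiment log. The sketch below assumes each experiment is recorded with one of three hypothetical labels ("win", "loss", "inconclusive"); the log format is an illustration, not something this document defines.

```python
from collections import Counter


def program_health(outcomes: list[str]) -> dict[str, float]:
    """Compute win rate and inconclusive rate from a list of outcome labels."""
    if not outcomes:
        raise ValueError("no concluded experiments to summarize")
    counts = Counter(outcomes)
    total = len(outcomes)
    return {
        "win_rate": counts["win"] / total,
        "inconclusive_rate": counts["inconclusive"] / total,
    }


# Hypothetical quarter: 10 concluded experiments.
quarter = ["win", "loss", "inconclusive", "loss", "win",
           "loss", "inconclusive", "loss", "inconclusive", "loss"]
health = program_health(quarter)
```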
plg-metrics, product-analytics, growth-modeling