Found 3 Skills
How to read experiment results without fooling yourself. Confidence intervals, p-values, multiple testing, sequential testing, CUPED, heterogeneous treatment effects, ratio metrics, network effects, dashboard reconciliation, and the interpretation failures that produce confidently wrong shipping decisions.
Calculate Wilson Score confidence intervals for ranking items by positive proportion with sample size correction. Use this skill when the user needs to rank products by ratings, sort content by approval rate, or build a 'best rated' list that accounts for sample size — even if they say 'rank by star rating', 'best rated with few reviews', or 'confidence-adjusted rating'.
Produces calibrated three-point PERT estimates (best/likely/worst) with confidence intervals, unknowns, and assumptions. Triggers on: "estimate this", "how long will this take", "effort estimate", "confidence interval", "story points", "t-shirt sizing". NOT for task decomposition; use task-decomposer instead.
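
For readers unfamiliar with the Wilson Score ranking mentioned in the second skill, the idea can be sketched as follows. This is a minimal illustration of the standard Wilson lower bound, not the skill's own implementation; the function name `wilson_lower_bound`, its parameters, and the default `z = 1.96` (roughly 95% confidence) are assumptions made for this example.

```python
import math


def wilson_lower_bound(positive: int, total: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a positive proportion.

    Hypothetical helper for illustration: `positive` is the count of
    positive ratings, `total` is the count of all ratings, and `z` is the
    normal quantile (1.96 is assumed here for about 95% confidence).
    """
    if total == 0:
        # No ratings yet: rank at the bottom rather than divide by zero.
        return 0.0
    p = positive / total
    z2 = z * z
    denominator = 1 + z2 / total
    centre = p + z2 / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z2 / (4 * total * total))
    return (centre - margin) / denominator


# Rank items by the lower bound instead of the raw approval rate.
items = {"A": (5, 5), "B": (90, 100)}
ranked = sorted(items, key=lambda k: wilson_lower_bound(*items[k]), reverse=True)
```

Sorting by the lower bound rather than the raw proportion is what provides the sample size correction: item A has a perfect 5 of 5 record, yet item B with 90 of 100 ranks above it because far more evidence supports its rating.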