Trustworthy Experiments is a framework for running controlled experiments (A/B tests) that produce reliable, actionable results. The core insight: most experiments fail, and many "successful" results are actually false positives.
The key shift: move from asking "Did the experiment show a positive result?" to asking "Can I trust this result enough to act on it?"
Ronny Kohavi, who built experimentation platforms at Microsoft, Amazon, and Airbnb, found that:
66-92% of experiments fail to improve the target metric
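8% of experiments have invalid results due to sample ratio mismatch (SRM) alone, where the observed split of users between variants differs significantly from the designed split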
When only 8% of experiments truly succeed, a statistically significant result at p < 0.05 still carries about a 26% risk of being a false positive
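
The 26% figure can be reproduced with Bayes' rule. The sketch below is illustrative rather than part of the framework itself: the function name `false_positive_risk` is hypothetical, and it assumes 80% statistical power and a two-sided test at alpha = 0.05 in which only the positive tail (alpha / 2) counts as a "win". Neither assumption is stated in the text above, so treat them as inputs to adjust.

```python
def false_positive_risk(success_rate, alpha=0.05, power=0.80, two_sided=True):
    """Estimate P(no real improvement | statistically significant win).

    success_rate: prior share of experiments with a real improvement.
    alpha, power, two_sided: assumed test settings (not from the source text).
    """
    if not 0.0 < success_rate < 1.0:
        raise ValueError("success_rate must be strictly between 0 and 1")

    # In a two-sided test, only the positive tail is read as a win.
    alpha_win = alpha / 2 if two_sided else alpha

    # Significant wins that come from experiments with no real effect.
    false_wins = alpha_win * (1.0 - success_rate)
    # Significant wins that come from experiments with a real effect.
    true_wins = power * success_rate

    return false_wins / (false_wins + true_wins)


if __name__ == "__main__":
    # Base success rate of 8%, as in the finding above.
    print(f"{false_positive_risk(0.08):.1%}")  # prints 26.4%
```

Under these assumptions, the risk falls quickly as the base success rate rises, which is why the same p-value threshold deserves more scrutiny in domains where wins are rare.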