Here’s a simple tool that you can use to test whether the results of your A/B Tests are statistically significant. Happy growth hacking!
Plug in your two variations’ sample sizes (n1, n2) and estimated success rates (p1, p2), and scroll down to Interpreting The Results to understand the results of your test.
The Hypothesis Specification explains how to formulate your A/B Test experiment.
Interpreting The Results
This section will populate after you complete the form above.
Null Hypothesis: Success Rate A ≥ Success Rate B
Alternative Hypothesis: Success Rate A < Success Rate B
Significance Level, α: 5%
Reference Inputs and Computed Statistics
This section will populate after you complete the form above.
The Magic of Statistics
In statistics vernacular, we’re doing a test of “difference in proportions”, or a “two-proportion z-test”.
The data that we’re considering is analogous to a repeated coin toss. You flip the coin, and it either comes up heads or tails. Then you do it again, and again, …
The distribution that this sort of data follows is called a “Binomial Distribution”. It’s characterized by two parameters: sample size (denoted by the variable n, for number of coin flips), and probability of success on any given “coin flip” (denoted by the variable p, for probability of success).
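To make the coin-toss picture concrete, here is a minimal Python sketch of the Binomial Distribution. The function names and the 10-flip fair-coin example are made up for illustration; they are not part of the tool above.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_mean_and_variance(n: int, p: float) -> tuple:
    """Mean n*p and variance n*p*(1-p) of the number of successes."""
    return n * p, n * p * (1 - p)


# Hypothetical example: 10 flips of a fair coin.
n_flips, p_heads = 10, 0.5
prob_five_heads = binomial_pmf(5, n_flips, p_heads)  # 252 / 1024
mean, variance = binomial_mean_and_variance(n_flips, p_heads)
```

Summing `binomial_pmf(k, n, p)` over every k from 0 to n gives 1, which is a quick sanity check that n and p really do pin down the whole distribution.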
Many business applications with a binary (yes/no) outcome follow a Binomial Distribution:
- ad click-through (n = number of ad impressions, p = probability of click-through),
- email open (n = number of emails sent, p = probability of an email being opened),
- website sign-up (n = number of website visitors, p = probability of a visitor signing up),
- checkout conversion (n = number of users who go to the checkout page, p = probability of successful checkout)
As such, we often gather this data in the course of growing our businesses: optimizing our ads, websites, and funnels for conversion.
In order to improve, we iterate and run split tests on different variations of these funnels.
To understand the results of these split tests, we need to use statistical methods like the above difference in proportions test. That way, we can have confidence in moving ahead with the best variations for our sales and marketing funnels.
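If your analytics give you raw counts rather than rates, a small sketch like the following turns them into the sample size (n) and estimated success rate (p) for each variation. The helper name and all of the counts are hypothetical, chosen only for illustration.

```python
def success_rate(successes: int, trials: int) -> float:
    """Estimated success rate p = successes / trials."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return successes / trials


# Made-up counts for illustration only.
n1, signups_a = 1000, 100  # variation A: visitors, sign-ups
n2, signups_b = 1000, 130  # variation B: visitors, sign-ups

p1 = success_rate(signups_a, n1)  # estimated success rate for A
p2 = success_rate(signups_b, n2)  # estimated success rate for B
```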
Here’s the stats theory:
We start with the two parameters that characterize each sample’s distribution: the sample size (n) and the estimated proportion of successes (p). On their own, the two sample Binomial distributions are awkward to compare directly, so we combine them into a single well-behaved statistic, the Z-statistic.
We construct the Z-statistic from our sample data (n1, p1, n2, p2). When the samples are reasonably large and the two true success rates are equal, the Z-statistic approximately follows a standard Gaussian (Normal) Distribution, ~N(μ=0, σ=1). From that distribution we can determine the probability of witnessing data like ours under the Null Hypothesis.
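For reference, here is the usual textbook form of the pooled two-proportion Z-statistic. This page does not show the exact formula the tool uses, so treat this as an assumption rather than a description of the tool. It also assumes that variation A corresponds to (n1, p1) and variation B to (n2, p2).

```latex
\hat{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2},
\qquad
Z = \frac{p_1 - p_2}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}
```

With this sign convention, a negative Z means variation B had the higher observed success rate.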
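From there, we can easily compute the “p-value”: the probability of seeing a result at least as extreme as ours if the Null Hypothesis were true. If the p-value falls below the Significance Level (5%), we reject the Null Hypothesis and conclude that variation B has the higher Success Rate (p).
The further our Z-statistic is from 0 in the direction of the Alternative Hypothesis, the smaller the p-value, and the stronger the evidence against the Null Hypothesis.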
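For the curious, here is a rough Python sketch of the whole calculation, using the standard pooled two-proportion z-test and the one-sided hypothesis specified above. It is not the tool’s actual code. The function name, the sign convention (Z = (p1 − p2) / standard error, with A as variation 1 and B as variation 2), and the example numbers are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(n1, p1, n2, p2, alpha=0.05):
    """One-sided pooled z-test of H0: p1 >= p2 against H1: p1 < p2.

    Returns (z, p_value, reject_null).
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("sample sizes must be positive")
    # Pooled success rate across both variations.
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if std_error == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p1 - p2) / std_error
    # Lower-tail probability: small when B clearly outperforms A.
    p_value = NormalDist().cdf(z)
    return z, p_value, p_value < alpha


# Made-up inputs: A converts 10% of 1000 visitors, B converts 13% of 1000.
z, p_value, reject_null = two_proportion_z_test(1000, 0.10, 1000, 0.13)
```

With these made-up inputs, Z comes out at roughly -2.1 and the p-value just under 0.02, so the Null Hypothesis would be rejected at the 5% Significance Level.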
Luckily for you, the above tool will not require you to even think about the Z-statistic! But it’s probably helpful to have some idea of what’s going on under the hood.
This gets much more complicated when considering one- vs two-tailed tests, and varying the hypothesis specification. So please, for simplicity and to avoid errors in your analysis, make sure that you specify your hypothesis as formulated above.
Hope this helps! Happy growth hacking 😁📈