“How many visitors do I need for my A/B test?” This question springs up in the mind of almost anyone starting out with Conversion Rate Optimization and the internet is littered with people asking just that. This article will provide you with a basic understanding of the mathematics that goes into finding that elusive number. You’ll also get a few nifty tools for calculating the required sample size for your specific A/B split test.
Deciding on the number of visitors needed to achieve statistical significance is not an exact science. Rather, it’s more of a range. Theoretically, you can never say “I need xx visitors to achieve significance”. What you can say is “I need at least xx visitors to be xx% sure that the results are reliable”. Please note that even then the results are not guaranteed to be correct; they’re just reliable to the stated degree.
In statistics, significance means “reliable”, unlike everyday English usage, where it indicates “importance”. What you want to know is whether you can reliably conclude that the winner indicated by an A/B test is actually the winner. The best-practice benchmark for statistical significance is 95%.
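To make the 95% benchmark concrete, here is a minimal sketch of one common way to check significance: a two-proportion z-test. It is not necessarily the method any particular A/B testing tool uses, and the function names and visitor numbers are made up for illustration. A p-value below 0.05 corresponds to the 95% benchmark.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate: the best single estimate if both pages truly convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (rate_b - rate_a) / std_error
    return 2.0 * (1.0 - normal_cdf(abs(z)))


# Hypothetical numbers: 1000 visitors per page, 100 vs. 150 conversions.
p_value = two_proportion_p_value(100, 1000, 150, 1000)
print(f"p-value: {p_value:.4f}")
print("significant at 95%" if p_value < 0.05 else "not significant at 95%")
```

With the same number of visitors, a smaller gap (say 100 vs. 110 conversions) would not clear the 95% bar, which is why the visitor count matters so much.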
Reliability increases as you increase the number of data points. For example, if you split 1000 visitors between two page layouts (the original, called “control”, and the “variation”) and you receive a result, that’s cool. However, if you receive the same result after splitting 2000 visitors between the two page layouts, that’s far more reliable. This is common sense that’s solidly backed by statistics. What statistics will additionally show is that the range of error will go down.
In the image above, the range within which the actual result may lie is given by the ± number. For the variation titled “free download”, it means that the result can be 56% plus or minus 5.9%, and the software is 98% certain that the actual value lies within this range. The range shrinks as more visitors are tested, because the software becomes increasingly certain about the result.
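The shrinking range can be sketched with the textbook normal-approximation margin of error for a proportion. This is an assumption about the general technique, not a reproduction of how the software in the screenshot computes its ± figure; the function name and visitor counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(rate: float, visitors: int, confidence: float = 0.95) -> float:
    """Half-width of the normal-approximation interval around a conversion rate."""
    # z is the two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return z * sqrt(rate * (1.0 - rate) / visitors)


# Hypothetical: the same 56% observed rate at growing sample sizes.
for visitors in (250, 1000, 4000):
    moe = margin_of_error(0.56, visitors)
    print(f"{visitors:>5} visitors: 56% ± {moe * 100:.1f}%")
```

Because the margin scales with one over the square root of the visitor count, you need four times the visitors to cut the range in half.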
The larger your sample size and statistical confidence level are, and the lower your standard error is, the more reliable your test results will be. As a rule of thumb, the more dramatic the difference between the two variations, the fewer visitors (the smaller the sample size) you’ll need to achieve a statistically significant result, and vice versa.
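The rule of thumb above can be put into numbers with a standard sample-size approximation for comparing two proportions. Treat this as a rough sketch under stated assumptions: the 80% statistical power default is a common convention that this article does not discuss, and the function name and conversion rates are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variation(baseline: float, expected: float,
                           confidence: float = 0.95, power: float = 0.80) -> int:
    """Approximate visitors needed in EACH variation to detect the given change."""
    # Critical values for the chosen confidence (two-sided) and power (assumed 80%).
    z_alpha = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the binomial variances of the two conversion rates.
    variance = baseline * (1.0 - baseline) + expected * (1.0 - expected)
    effect = expected - baseline
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical 10% baseline: a small lift vs. a dramatic lift.
small_lift = visitors_per_variation(0.10, 0.12)
big_lift = visitors_per_variation(0.10, 0.15)
print(f"10% -> 12%: about {small_lift} visitors per variation")
print(f"10% -> 15%: about {big_lift} visitors per variation")
```

The dramatic lift needs only a fraction of the visitors that the small lift does, since the required sample size grows with the inverse square of the difference you want to detect.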
Useful tools for calculating sample size and statistical significance
How not to run an A/B test – Evan Miller
Statistical Significance and other A/B testing pitfalls – Cennydd Bowles
Excellent Analytics Tip #1: Statistical Significance – Occam’s Razor
A/B testing Tech Note: determining sample size – 37Signals
Siddharth Deswal works at Visual Website Optimizer, the world’s easiest A/B testing software. He’s been involved with web development for about 8 years now and actively looks to help online businesses discover the value of Conversion Rate Optimization. He tweets about A/B testing, landing pages and effective marketing tips on @wingify