Statistical Significance in Split Testing

Sometimes, you want to change something on your website. You want to try a better headline, or a different page layout, or a different order button. It is highly advisable to try small changes like these every once in a while, because they can improve your conversion rates and profits. But how do you find out if your new page actually performs better than the previous one?

You need to set up what is called a split test. You need to show one page to half of your visitors, and another version of the page to the other half. You can achieve that either with a server-side language like PHP, or with some JavaScript. There are also third-party solutions that allow you to perform split tests without any programming on your part.
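As an illustration of the server-side approach, here is a minimal sketch of how visitors could be divided into two halves. It is written in Python rather than PHP or JavaScript, and the function name `assign_variant`, the experiment label, and the visitor ID format are all hypothetical. It assumes you already have some stable identifier for each visitor, such as a cookie value.

```python
import hashlib


def assign_variant(visitor_id: str, experiment: str = "headline-test") -> str:
    """Return "A" or "B" for a visitor, with the same answer on every visit."""
    # Hash the experiment name together with the visitor ID so that
    # separate experiments split the audience independently of each other.
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # The first byte is effectively uniform over 0..255: even -> A, odd -> B.
    return "A" if digest[0] % 2 == 0 else "B"


if __name__ == "__main__":
    counts = {"A": 0, "B": 0}
    for n in range(10_000):
        counts[assign_variant(f"visitor-{n}")] += 1
    print(counts)  # each group should hold roughly half of the visitors
```

Hashing the ID instead of picking a variant at random on every page view means a returning visitor keeps seeing the same version of the page.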

The traffic is split this way to make sure that the audiences seeing each variant of the page are similar. They should arrive from the same traffic sources, over the same period of time, so that any difference you observe in the test results is caused mainly by the changes you made to the page.

How long do you need to run your test? That is where statistical significance comes in. First of all, you need at least 10 results (clicks, sales, subscriptions, etc.) from each variant. Otherwise the statistical analysis will not work, because small numbers carry a lot of random noise. Next, you need to run some calculations on your test results. A suitable method for analyzing a split test is Pearson’s chi-square test. You can learn it from a college statistics course or from Wikipedia; a full explanation is too long to fit in this article. There are also software applications and online calculators that can run the numbers for you.
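For readers who would rather see the calculation than use a calculator, here is a minimal sketch of Pearson's chi-square test for two variants, using only the Python standard library. The function name `split_test_confidence` and its parameters are made up for this example. It applies no continuity correction, and it reports "confidence" as one minus the p-value, which is an assumption about how the term is used here.

```python
import math


def split_test_confidence(visitors_a, conversions_a, visitors_b, conversions_b):
    """Pearson's chi-square test on a 2x2 table (converted / not converted).

    Returns (chi_square, confidence), where confidence = 1 - p-value.
    """
    observed = [
        [conversions_a, visitors_a - conversions_a],
        [conversions_b, visitors_b - conversions_b],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("each variant needs visitors, and the test needs "
                         "both conversions and non-conversions")

    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in this cell if the variant made no difference.
            expected = row_totals[i] * col_totals[j] / total
            chi_square += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the chi-square tail probability reduces
    # to the complementary error function.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, 1 - p_value


if __name__ == "__main__":
    chi2, confidence = split_test_confidence(1000, 30, 1000, 50)
    print(f"chi-square = {chi2:.2f}, confidence = {confidence:.1%}")
```

Note that the function needs the number of visitors for each variant as well as the number of conversions, so both have to be recorded during the test.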

As a rule of thumb, given the typical online conversion range (1-10%) and 100-1,000 visitors per variant, you need one variant to beat the other by about two to one (for example, 20 sales against 10) to get 95% confidence that the result is not random. If one variant brings 10 sales and the other brings 15, the confidence level is only about 75%. It’s up to you whether that is reliable enough to justify changes to your online business. If the results are even closer, the test cannot really be trusted, because you could quite easily get the same results with two identical pages. You can either continue the test to reach higher statistical significance (you will need 50-100 sales for that), or write it off as inconclusive and try some other change.
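To see why close results cannot be trusted, it helps to simulate a test in which both pages are identical. The sketch below does that; the function name `aa_test_false_alarm_rate` and the figures in it are illustrative assumptions. The 250 visitors per variant and the 5% conversion rate were picked so that each page collects about 12 sales on average.

```python
import random


def aa_test_false_alarm_rate(visitors=250, rate=0.05, min_ratio=1.5,
                             runs=2000, seed=1):
    """Simulate two IDENTICAL pages and return the share of runs in which
    one page still collects at least `min_ratio` times the other's sales."""
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    hits = 0
    for _ in range(runs):
        # Each visitor converts with the same probability on both pages.
        sales_a = sum(rng.random() < rate for _ in range(visitors))
        sales_b = sum(rng.random() < rate for _ in range(visitors))
        low, high = min(sales_a, sales_b), max(sales_a, sales_b)
        if high > low and high >= min_ratio * low:
            hits += 1
    return hits / runs


if __name__ == "__main__":
    # A gap like 10 sales against 15 corresponds to a ratio of 1.5.
    print("ratio 1.5 or more:", aa_test_false_alarm_rate(min_ratio=1.5))
    # A gap like 10 sales against 20 corresponds to a ratio of 2.0.
    print("ratio 2.0 or more:", aa_test_false_alarm_rate(min_ratio=2.0))
```

Under these assumptions, the first figure should come out noticeably higher than the second: a gap like 10 against 15 shows up fairly often by pure chance, while a two-to-one gap is much rarer.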

Knowing the statistical significance of your results allows you to finish your tests sooner, and to have more confidence in the outcome. By trying different changes month after month, you can substantially improve your business performance.


Source by Val Danylchuk
