Significance calculator for A/B and multivariate tests
Would you like to know whether your test results are significant and therefore truly meaningful? With our significance calculator, you can see at a glance which result is significant and which uplift or downlift may have occurred by chance.
The significance calculator is suitable for A/B and multivariate tests. You can compare up to 8 test variants (including control). Simply enter the number of unique visitors and the number of conversions – we will calculate conversion rate, uplift or downlift, as well as confidence and significance.
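The first two outputs follow directly from the raw counts. As a minimal sketch (the visitor and conversion numbers below are hypothetical, not from the article):

```python
# Conversion rate and uplift from raw counts (hypothetical example numbers).
def conversion_rate(conversions, visitors):
    return conversions / visitors

def uplift(rate_variant, rate_control):
    # Relative change versus control; a negative value is a downlift.
    return (rate_variant - rate_control) / rate_control

control = conversion_rate(120, 4000)   # 3.00%
variant = conversion_rate(150, 4000)   # 3.75%
print(f"control: {control:.2%}, variant: {variant:.2%}, "
      f"uplift: {uplift(variant, control):+.1%}")
```

With these numbers the variant converts at 3.75% against 3.00% for control, a +25% relative uplift.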
Confidence is the probability that the measured difference from the original variant (control) did not occur by chance, but is due to the test setup.
Significance is the opposite, i.e., the probability that both variants show no structural differences and that measured differences occurred purely by chance.
Confidence is therefore the inverse probability of significance. Significance and confidence always add up to 100%. For example, if the calculated significance is 20%, the confidence is 80%.
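The article does not name the exact test it runs, but a common choice for comparing two conversion rates is the pooled two-proportion z-test. A sketch under that assumption, using only the standard library (the counts are again hypothetical):

```python
import math

def two_tailed_significance(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns the two-tailed p-value
    ("significance" in the article's terminology)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-tailed tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

significance = two_tailed_significance(120, 4000, 150, 4000)
confidence = 1 - significance  # significance + confidence = 100%
print(f"significance: {significance:.1%}, confidence: {confidence:.1%}")
```

For this example the significance comes out around 6%, so the confidence is around 94%: plausible, but below the 95% level often used as a threshold.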
Is my test result significant?
Significance analysis is a useful tool for evaluating your test results. With our significance calculator, you can find out how likely it is that the results are meaningful or were measured purely by chance.
In general, the lower the calculated significance, the less likely it is that the measured uplift or downlift occurred purely by chance. Equivalently, a high confidence means the probability of a random result is low.
Confidence levels and significance levels
Before starting the test, a confidence level must be selected. This determines how certain you want to be that the test result also applies to the population. If the calculated confidence is above the confidence level, the test is accepted as statistically significant.
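This acceptance rule is a single comparison. A sketch (the 95% default is only an illustrative choice, not prescribed by the article):

```python
def is_significant(confidence, confidence_level=0.95):
    # Accept the result as statistically significant only if the
    # calculated confidence reaches the pre-chosen confidence level.
    return confidence >= confidence_level

print(is_significant(0.937))        # below the 95% level -> False
print(is_significant(0.937, 0.90))  # clears a 90% level  -> True
```

Note that the confidence level must be fixed before the test starts; raising or lowering it after seeing the data invalidates the guarantee it provides.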
The following confidence-level scheme has become established in the social sciences:

- 95% confidence (5% significance level): significant
- 99% confidence (1% significance level): very significant
- 99.9% confidence (0.1% significance level): highly significant
One-tailed test vs. two-tailed test
In principle, there are two ways to test: one-tailed and two-tailed tests. The difference lies in the research question.
The two-tailed test determines whether there is any difference at all between the tested variant and the original variant (control).
A one-tailed test, by contrast, checks only whether the tested variant is better than the original variant.
Please note: our significance calculator always uses a two-tailed evaluation. While a one-tailed calculation leads to a significant result more quickly, it is not entirely unproblematic: if the supposedly better test variant performs worse than the original, this is attributed to chance rather than a structural difference.
With a one-tailed test, you can therefore never find out whether a variant is significantly worse—only that it is not significantly better. Since you should generally not rule out that the tested variant performs worse than the original variant, we recommend always running a two-tailed test.
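The "significant more quickly" effect is easy to see numerically: for the same z-score, the one-tailed p-value is exactly half the two-tailed one. A sketch assuming a normal test statistic (the z-score below is a hypothetical example):

```python
import math

def p_values(z):
    """One-tailed and two-tailed p-values for the same z-score."""
    one_tailed = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z > z): "variant is better"
    two_tailed = math.erfc(abs(z) / math.sqrt(2))   # P(|Z| > |z|): "any difference"
    return one_tailed, two_tailed

one, two = p_values(1.857)
# For a positive z the one-tailed p-value is half the two-tailed one,
# so it crosses a given significance threshold sooner.
print(f"one-tailed: {one:.1%}, two-tailed: {two:.1%}")
```

Here the one-tailed test would already clear a 5% significance level while the two-tailed test would not, which illustrates why the weaker one-tailed criterion should be used with caution.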
Do you still have questions about interpreting your tests? Are your test results not significant? Or are you looking for general information on A/B and multivariate testing?