Significance calculator for A/B and multivariate tests
Would you like to know whether your test results are significant and therefore truly meaningful? With our significance calculator, you can see at a glance which result is significant and which uplift or downlift may have occurred by chance.
The significance calculator is suitable for A/B and multivariate tests. You can compare up to 8 test variants (including control). Simply enter the number of unique visitors and the number of conversions – we will calculate conversion rate, uplift or downlift, as well as confidence and significance.
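The first two outputs follow directly from the raw counts. As a minimal sketch (the visitor and conversion numbers below are hypothetical, not from the article):

```python
# Conversion rate and uplift from raw counts (hypothetical example numbers).
def conversion_rate(conversions, visitors):
    return conversions / visitors

def uplift(rate_variant, rate_control):
    # Relative change versus control; a negative value is a downlift.
    return (rate_variant - rate_control) / rate_control

control = conversion_rate(120, 4000)   # 3.00%
variant = conversion_rate(150, 4000)   # 3.75%
print(f"control: {control:.2%}, variant: {variant:.2%}, "
      f"uplift: {uplift(variant, control):+.1%}")
```

With these numbers the variant converts at 3.75% against 3.00% for control, a +25% relative uplift.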
Confidence is the probability that the measured difference from the original variant (control) did not occur by chance, but is due to the test setup.
Significance is the opposite, i.e., the probability that both variants show no structural differences and that measured differences occurred purely by chance.
Confidence is therefore the inverse probability of significance. Significance and confidence always add up to 100%. For example, if the calculated significance is 20%, the confidence is 80%.
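The article does not name the exact test it runs, but a common choice for comparing two conversion rates is the pooled two-proportion z-test. A sketch under that assumption, using only the standard library (the counts are again hypothetical):

```python
import math

def two_tailed_significance(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns the two-tailed p-value
    ("significance" in the article's terminology)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-tailed tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

significance = two_tailed_significance(120, 4000, 150, 4000)
confidence = 1 - significance  # significance + confidence = 100%
print(f"significance: {significance:.1%}, confidence: {confidence:.1%}")
```

For this example the significance comes out around 6%, so the confidence is around 94%: plausible, but below the 95% level often used as a threshold.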
Is my test result significant?
Significance analysis is a useful tool for evaluating your test results. With our significance calculator, you can find out how likely it is that the results are meaningful or were measured purely by chance.
In general, the lower the calculated significance, the less likely it is that the measured uplift or downlift occurred purely by chance. Equivalently, a high confidence means the probability of a random result is low.
Confidence levels and significance levels
Before starting the test, a confidence level must be selected. This determines how certain you want to be that the test result also applies to the population. If the calculated confidence is above the confidence level, the test is accepted as statistically significant.
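This acceptance rule is a single comparison. A sketch (the 95% default is only an illustrative choice, not prescribed by the article):

```python
def is_significant(confidence, confidence_level=0.95):
    # Accept the result as statistically significant only if the
    # calculated confidence reaches the pre-chosen confidence level.
    return confidence >= confidence_level

print(is_significant(0.937))        # below the 95% level -> False
print(is_significant(0.937, 0.90))  # clears a 90% level  -> True
```

Note that the confidence level must be fixed before the test starts; raising or lowering it after seeing the data invalidates the guarantee it provides.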
The following confidence-level scheme has become established in the social sciences:

- 95% confidence (5% significance level): significant
- 99% confidence (1% significance level): very significant
- 99.9% confidence (0.1% significance level): highly significant
One-tailed test vs. two-tailed test
In principle, there are two ways to test: one-tailed and two-tailed tests. The difference lies in the research question.
The two-tailed test determines whether there is any difference at all between the tested variant and the original variant (control).
A one-tailed test, by contrast, checks only whether the tested variant is better than the original variant.
Please note: our significance calculator always uses a two-tailed evaluation. While a one-tailed calculation leads to a significant result more quickly, it is not entirely unproblematic: if the supposedly better test variant performs worse than the original, this is attributed to chance rather than a structural difference.
With a one-tailed test, you can therefore never find out whether a variant is significantly worse—only that it is not significantly better. Since you should generally not rule out that the tested variant performs worse than the original variant, we recommend always running a two-tailed test.
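The "significant more quickly" effect is easy to see numerically: for the same z-score, the one-tailed p-value is exactly half the two-tailed one. A sketch assuming a normal test statistic (the z-score below is a hypothetical example):

```python
import math

def p_values(z):
    """One-tailed and two-tailed p-values for the same z-score."""
    one_tailed = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z > z): "variant is better"
    two_tailed = math.erfc(abs(z) / math.sqrt(2))   # P(|Z| > |z|): "any difference"
    return one_tailed, two_tailed

one, two = p_values(1.857)
# For a positive z the one-tailed p-value is half the two-tailed one,
# so it crosses a given significance threshold sooner.
print(f"one-tailed: {one:.1%}, two-tailed: {two:.1%}")
```

Here the one-tailed test would already clear a 5% significance level while the two-tailed test would not, which illustrates why the weaker one-tailed criterion should be used with caution.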
Do you still have questions about interpreting your tests? Are your test results not significant? Or are you looking for general information on A/B and multivariate testing?