Whether you are comparing advertising material, landing pages or website layouts, A/B tests show which version performs best. This calculator determines which variant is better and at what confidence level the difference is significant. Below you will also find background information, the formulas for evaluating A/B tests and a worked example.
Evaluate your A/B test with this tool
- If the confidence level is below 95%, the difference between the original variant and the test variant is not statistically significant
- If the confidence level is greater than 95%, the difference is statistically significant
- If the confidence level is greater than 99%, the difference is statistically highly significant
What is an A/B test?
An A/B test is a test method in which two variants of a website, design element or advertising material such as a banner (variant A and variant B) are tested against each other to find out which one better achieves a goal. Over a certain period of time, each visitor to a website is randomly shown one of the two variants, and the conversion rate of each variant is measured. The variant with the higher conversion rate is then selected and implemented. Conversions are usually not measured with a data warehouse or CRM, but with an analytics program such as Google Analytics, etracker, Adobe or Piwik. For this purpose, e-commerce tracking is set up or conversions/events are tracked. Google Analytics offers the possibility to record events automatically.
Formula for A/B tests: Calculate Statistical Significance
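The chi-square test is used for the calculation. The formula is:

$$\chi^2 = \frac{\left(o - \mathit{co} - \frac{o \cdot \mathit{nf}}{n}\right)^2}{\frac{o \cdot \mathit{nf}}{n}} + \frac{\left(v - \mathit{cv} - \frac{v \cdot \mathit{nf}}{n}\right)^2}{\frac{v \cdot \mathit{nf}}{n}} + \frac{\left(\mathit{co} - \frac{o \cdot \mathit{nc}}{n}\right)^2}{\frac{o \cdot \mathit{nc}}{n}} + \frac{\left(\mathit{cv} - \frac{v \cdot \mathit{nc}}{n}\right)^2}{\frac{v \cdot \mathit{nc}}{n}}$$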
The variables are as follows:
- o: Visitors/Impressions of the original version
- v: Visitors/Impressions of the comparison variant
- co: Conversions or Clicks of the original variant
- cv: Conversions or Clicks of the comparison variant
- n: Total number of Visitors or Impressions
- nf: Total number of Visitors or Impressions without conversion
- nc: Total number of Visitors or Impressions with conversion
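As a rough sketch, the calculation with these variables could look like this in Python. The function name `chi_square` is illustrative and not part of the calculator; the argument names mirror the list above, and the example call uses the figures from the worked example in the next section (1000 and 1200 visitors, 40 and 80 conversions). No continuity correction is applied.

```python
def chi_square(o, v, co, cv):
    """Chi-square value for an A/B test, using the variable names from the text.

    o, v   -- visitors of the original / comparison variant
    co, cv -- conversions of the original / comparison variant
    """
    n = o + v        # total number of visitors
    nc = co + cv     # total number of visitors with conversion
    nf = n - nc      # total number of visitors without conversion
    if nc == 0 or nf == 0:
        raise ValueError("expected frequencies would be zero")

    # (measured frequency, expected frequency) for the four cells
    cells = [
        (o - co, o * nf / n),   # original without conversion
        (v - cv, v * nf / n),   # comparison variant without conversion
        (co, o * nc / n),       # original with conversion
        (cv, v * nc / n),       # comparison variant with conversion
    ]
    return sum((measured - expected) ** 2 / expected
               for measured, expected in cells)


print(round(chi_square(1000, 1200, 40, 80), 2))  # 7.52
```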
The Chi-square test simply explained
Each of the four summands in the chi-square formula corresponds to one of the following four groups:
- A: Visitors of the original without conversion
- B: Visitors of the comparison variant without conversion
- C: Visitors of the original with conversion
- D: Visitors of the comparison variant with conversion
To simplify matters, we will only talk about visitors and conversions in the following explanation. The measured frequencies are entered in a cross-table. In the cross-table, the two variants (original and comparison variant) are assigned the two characteristics (visitors with conversion and visitors without conversion) and thus produce the four above-mentioned values:
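| | Original | Comparison variant | Total |
|---|---|---|---|
| Visitors without Conversion | A: 960 | B: 1120 | 2080 |
| Visitors with Conversion | C: 40 | D: 80 | 120 |
| Total | 1000 | 1200 | 2200 |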
The expected frequencies are then calculated by multiplying the number of visitors of the respective variant by the total number of visitors with the respective characteristic (with conversion or without conversion) and dividing the result by the total number of visitors of both variants. The expected frequency assumes that both variants convert at the same rate. To illustrate this, we calculate the expected frequency of field A (visitors to the original without conversion). The original had 960 + 40 = 1000 visitors, 2080 visitors in total did not convert, and both variants together had 2200 visitors:

$$E_A = \frac{1000 \cdot 2080}{2200} \approx 945.45$$
The remaining three expected frequencies are calculated in the same way:
| | Original | Comparison variant | Total |
|---|---|---|---|
| Visitors without Conversion | A: 945.45 | B: 1134.55 | 2080 |
| Visitors with Conversion | C: 54.55 | D: 65.45 | 120 |
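The expected frequencies can be checked with a few lines of Python. This is only a sketch: the helper name `expected_frequencies` and the nested-list layout (rows are the characteristics, columns are the variants) are assumptions made for this illustration.

```python
# Measured frequencies from the example.
# Rows: without conversion, with conversion
# Columns: original, comparison variant
observed = [
    [960, 1120],
    [40, 80],
]


def expected_frequencies(table):
    """Expected frequency per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for row in expected:
    print([round(value, 2) for value in row])
# [945.45, 1134.55]
# [54.55, 65.45]
```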
For each of the four fields, the difference is formed from the measured frequency and the expected frequency, then squared and divided by the expected frequency:
| | Original | Comparison variant |
|---|---|---|
| Visitors without Conversion | A: 0.22 | B: 0.19 |
| Visitors with Conversion | C: 3.88 | D: 3.23 |
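The per-field values can be reproduced as follows. The dictionary keys A to D follow the labels used in the text; everything else (variable names, layout) is illustrative only. The expected frequencies are recomputed at full precision rather than taken from the rounded table.

```python
# Measured frequencies for the four fields of the example.
observed = {"A": 960, "B": 1120, "C": 40, "D": 80}

visitors_original = observed["A"] + observed["C"]      # 1000
visitors_comparison = observed["B"] + observed["D"]    # 1200
without_conversion = observed["A"] + observed["B"]     # 2080
with_conversion = observed["C"] + observed["D"]        # 120
n = visitors_original + visitors_comparison            # 2200

# Expected frequency = variant total * characteristic total / n
expected = {
    "A": visitors_original * without_conversion / n,
    "B": visitors_comparison * without_conversion / n,
    "C": visitors_original * with_conversion / n,
    "D": visitors_comparison * with_conversion / n,
}

# (measured - expected)^2 / expected for each field
contributions = {
    field: (observed[field] - expected[field]) ** 2 / expected[field]
    for field in observed
}

for field, value in contributions.items():
    print(field, round(value, 2))
# A 0.22
# B 0.19
# C 3.88
# D 3.23
```

Fields C and D dominate because the expected number of conversions is small, so the same absolute deviation weighs much more heavily there.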
Finally, all four fields are added together to get the chi-square value:

$$\chi^2 = 0.22 + 0.19 + 3.88 + 3.23 = 7.52$$
Now the calculated chi-square value only has to be compared with the critical values Chi²0.95(1) and Chi²0.99(1). 0.95 and 0.99 are the confidence levels. A confidence level of more than 0.95 is generally recognised as 'statistically significant'. From 0.99 onward, the term 'statistically highly significant' is used. The (1) stands for the degrees of freedom. A four-field matrix always has one degree of freedom.
- Chi²0.95(1) = 3.84
- Chi²0.99(1) = 6.63
The difference in the example is therefore highly significant, since the calculated chi-square value (7.52) is greater than 6.63.
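The comparison with the two critical values can be sketched in Python as shown here. The function names `classify` and `confidence_level` are made up for this illustration. The confidence level relies on the fact that, for one degree of freedom, the chi-square distribution function can be written with the error function as erf(√(x/2)); this shortcut does not hold for other degrees of freedom.

```python
import math

# Critical values for one degree of freedom, as quoted in the text.
CHI2_95 = 3.84
CHI2_99 = 6.63


def classify(chi2):
    """Compare a chi-square value with the two critical values."""
    if chi2 > CHI2_99:
        return "highly significant"
    if chi2 > CHI2_95:
        return "significant"
    return "not significant"


def confidence_level(chi2):
    """Confidence level for a chi-square value with ONE degree of freedom."""
    return math.erf(math.sqrt(chi2 / 2))


chi2 = 7.52  # value from the worked example
print(classify(chi2))                    # highly significant
print(f"{confidence_level(chi2):.2%}")   # a little above 99%
```

Feeding the critical values themselves into `confidence_level` returns approximately 0.95 and 0.99, which is a quick way to check that the two representations agree.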