A/B Testing

As most brands now maintain their own websites, social media marketing, and online marketing, most are familiar with the term A/B testing. Many companies offer A/B testing for website design and optimization, and some brands run A/B tests in-house. In its simplest form, A/B testing compares two items to see which performs better. A/B testing is not a new concept, however: its approach is identical to between-groups experimental design, which is used across many disciplines. Marketing research has performed this kind of study for years, for example in brand packaging, advertisement evaluation, and product evaluation.



A/B testing is the term for a randomized experiment that compares two versions of the same item, typically identical except for one variable. Taking websites as an example, Version A might be the current website and Version B the same website with one key difference drawn from an intended redesign. A/B testing does not mean the brand has only two items to evaluate; it means there is one variable being tested. A/B testing has become increasingly prevalent in website design because changes, however subtle, can be rapidly tested on website visitors and the difference observed in real time. Repeatedly testing minor or significant modifications, and applying statistical analysis to determine which combinations of variables perform better, leads to increased customer retention and acquisition.



While A/B testing has spread to many areas, it has been used for website design since at least 2000, when Google ran its first A/B test to determine the optimal number of search results to display. By 2011, Google had run more than 7,000 A/B tests on its search algorithm. Large websites such as Amazon, Netflix, and eBay all constantly test site changes on users using variations of the A/B methodology.

For websites, A/B testing is straightforward: randomly assign some users to one version of the variable being evaluated and assign other users a different version. From there, measure the difference in click-through and behavior to determine which iteration of the website variable works better. The same applies to emails: send some recipients one email, send others a version with one or more differences, and determine which performs better.
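The "measure the difference and determine which works better" step is usually a two-proportion z-test on click-through rates. The sketch below, using only Python's standard library, is one minimal way to do it; the function name and the visitor/click counts are hypothetical.

```python
from statistics import NormalDist

def ab_test(clicks_a, visitors_a, clicks_b, visitors_b):
    """Two-proportion z-test: does B's click-through rate differ from A's?"""
    p_a = clicks_a / visitors_a
    p_b = clicks_b / visitors_b
    # Pooled click rate under the null hypothesis of no difference
    p_pool = (clicks_a + clicks_b) / (visitors_a + visitors_b)
    se = (p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b)) ** 0.5
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_a, p_b, p_value

# Hypothetical data: 10,000 visitors randomly assigned to each version
p_a, p_b, p_value = ab_test(clicks_a=480, visitors_a=10_000,
                            clicks_b=560, visitors_b=10_000)
print(f"A: {p_a:.1%}  B: {p_b:.1%}  p-value: {p_value:.4f}")
```

A p-value below a chosen threshold (commonly 0.05) suggests the difference in click-through is unlikely to be random noise, so Version B could be declared the winner.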

But A/B testing is not constrained to just one variable. Multiple variables, or even multiple levels of a variable, can be tested simultaneously. With enough sample, an almost limitless number of variables could be tested at one time, as demonstrated in the chart below.
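When more than two versions run at once, a common way to check whether any of them differ is a chi-squared test of homogeneity on clicks versus non-clicks. The sketch below assumes four hypothetical variants with made-up counts; only the standard library is used, so the statistic is compared against a tabulated critical value rather than a computed p-value.

```python
def chi_squared_stat(results):
    """Chi-squared statistic for homogeneity of click-through rates.
    results maps each variant name to a (clicks, visitors) pair."""
    total_clicks = sum(c for c, _ in results.values())
    total_visitors = sum(n for _, n in results.values())
    overall_rate = total_clicks / total_visitors
    stat = 0.0
    for clicks, visitors in results.values():
        # Expected clicks / non-clicks if every variant performed equally
        exp_clicks = overall_rate * visitors
        exp_misses = visitors - exp_clicks
        stat += (clicks - exp_clicks) ** 2 / exp_clicks
        stat += ((visitors - clicks) - exp_misses) ** 2 / exp_misses
    return stat

# Hypothetical four-variant test, 5,000 visitors per variant
results = {"A": (250, 5_000), "B": (265, 5_000),
           "C": (240, 5_000), "D": (310, 5_000)}
stat = chi_squared_stat(results)
# Critical value for df = 4 - 1 = 3 at alpha = 0.05 is 7.815
print(f"chi2 = {stat:.2f}, significant: {stat > 7.815}")
```

A significant result only says that at least one variant differs; pairwise follow-up tests are still needed to identify the winner.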


The main disadvantage of this experimental design is that it can become extremely complex and requires a very large number of participants to generate statistically significant data. Companies and organizations with access to large populations, such as Google and Amazon, can run hundreds of A/B tests simultaneously. Google famously tested 41 shades of blue for ad links when it launched ads on Gmail, running an experiment in which 1% of users saw each of the 41 shades. From the results, Google determined which shade of blue people liked most, as demonstrated by how often they clicked on the links. This discovery led to an extra $200 million a year in ad revenue.
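To see why such large populations are needed, one can estimate the sample size required per version before running a test. The standard power calculation for comparing two proportions can be sketched as follows; the function name and the example rates are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size(p_base, p_variant, alpha=0.05, power=0.8):
    """Visitors needed per version to detect a shift from p_base to
    p_variant with a two-sided test at significance alpha and the
    given statistical power."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)   # critical value for alpha
    z_beta = nd.inv_cdf(power)            # critical value for power
    variance = p_base * (1 - p_base) + p_variant * (1 - p_variant)
    n = (z_alpha + z_beta) ** 2 * variance / (p_base - p_variant) ** 2
    return ceil(n)

# Detecting a subtle lift (5.0% -> 5.2% click-through) takes roughly
# 190,000 visitors per version; a big lift (5% -> 6%) takes far fewer.
print(sample_size(0.050, 0.052))
print(sample_size(0.050, 0.060))
```

The quadratic penalty in the denominator is why only sites with enormous traffic can reliably measure the tiny differences between, say, 41 shades of blue.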

Lastly, it is important to note that even when one variation performs better, there is always the possibility that a variation of that variation could perform better still. This is the basis of iterative A/B testing, which in essence "layers" A/B tests: if Version B performs better, a researcher may then test two different versions of Version B, and so on.



A/B testing is not restricted to the digital realm. While A/B testing in the real world will not provide results as quickly or in the same quantity, it still provides valuable information. It can be applied to traditional marketing research methodologies such as focus groups, surveys, and in-depth interviews. While the sample size may not be statistically valid, especially for qualitative methods, it still allows researchers to form and validate ideas.



The increasing prevalence of A/B testing is part of a shift in business philosophy and practice in which data, rather than instinct, dictates change. Marketing researchers welcome this increase, as they have long been proponents of Evidence Based Practices, of which A/B testing is a part. Evidence Based Practices began as an interdisciplinary approach to clinical treatment and are now expanding into other fields. For marketing purposes, Evidence Based Practices means collecting feedback from customers about services, products, or proposed improvements, and changing those services, products, or improvements depending on the feedback.



However, with A/B testing you lose any ability to determine why a customer likes one thing better than another. That is where other traditional marketing research methodologies come into play. For example, if an advertisement attempts to call a consumer to action, you could test a variety of advertisements to see which performs better; however, this only begins to approximate behavioral intent. Qualitative research that explores why customers chose the way they did can add valuable insight into customer behavior.



XAVIER ALVAREZ is a Project Manager at Q2 Insights, Inc., a research and innovation consulting firm with offices in San Diego and New Orleans. He can be reached at (760) 230-2950 ext. 4 or xavier.alvarez@q2insights.com.

This entry was posted in Data Analysis and tagged on November 28, 2016 by Q2 Insights