slot-machine-multi-armed bandit testing

How to optimize your drip campaigns with multi-armed bandit testing?

Experimenting with your content is a must if you are a marketer. But in some cases the well-known split testing methodology doesn’t give satisfying results, and multi-armed bandit testing is a better fit. This article explains what it is, how it differs from split testing, when it works better and how you can use it in practice.

What is multi-armed bandit testing?

The multi-armed bandit problem has a famous analogy in probability theory: a gambler goes into a casino and sees a row of one-armed bandits (slot machines). The gambler naturally wants to play on the machines that have the biggest chance of winning.

So he needs to decide in which order and how many times to play each machine. Every machine pays out according to its own probability distribution, which the gambler does not know when he starts playing. His goal is to maximize his overall reward by the end of the day.

The same problem applies to optimization in marketing: you have several assumptions to test at the same time. So you “pull the levers” (CTAs, images, copy, headlines, subject lines, etc.), experiment with them and try to maximize your “reward” (conversion rate).
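To make the analogy concrete, here is a minimal sketch that treats each “lever” as a variation with a hidden conversion rate. The function names and the example rates are assumptions made up for illustration; in a real campaign you never see the true rates, only the conversions you observe.

```python
import random

def pull(conversion_rate, rng):
    """Simulate one visitor: return 1 on conversion, 0 otherwise."""
    return 1 if rng.random() < conversion_rate else 0

def simulate(conversion_rates, visitors_per_arm, seed=0):
    """Show each variation to the same number of visitors and
    return the observed conversion rate of every variation."""
    rng = random.Random(seed)
    observed = []
    for rate in conversion_rates:
        conversions = sum(pull(rate, rng) for _ in range(visitors_per_arm))
        observed.append(conversions / visitors_per_arm)
    return observed

# Hypothetical true rates for three subject lines (assumed numbers)
print(simulate([0.04, 0.05, 0.07], visitors_per_arm=1000))
```

The observed rates only approximate the hidden ones, which is exactly why the gambler (or marketer) has to decide how many pulls each lever deserves.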

A/B testing or multi-armed bandit testing?

The great debate

There is a debate between believers in A/B testing and believers in multi-armed bandits. Some argue that A/B testing is better for one reason or another; others try to do the same for multi-armed bandit algorithms.

The funny thing is that both solutions come from academic methodologies, so both call for an academic approach. Academics try to disprove their own statements, but when the debate started in marketing communities, the two sides tried to disprove each other’s approaches instead.

They use academic language and arguments, but not academic debating behaviour… Funny.

So, to be honest: both solutions are really good, just for different use cases.

How does it work?

To understand how it works, you first need to understand simple A/B testing; then you’ll be able to see how bandit testing differs. It involves a little mathematics and statistics, but it can be easily understood without numbers and formulas.

A/B testing


In split testing, you have an A and a B version of a landing page, a website, an email or an ad.

You send 50% of your traffic to variation A and 50% to variation B. You run the test until you reach a significant result, that is, until the sample is large enough to show that the higher conversion rate of one version is unlikely to be accidental. For example, significance at the 97% level means that if the two versions really performed the same, a difference this large would show up by chance no more than about 3% of the time.
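If you want to see what “reaching significance” looks like in numbers, the sketch below uses a two-proportion z-test, which is one common way to compare two conversion rates. The article does not say which test a given tool uses, so treat the choice of test, the function name and the visitor counts as assumptions.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between
    the conversion rates of version A and version B."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the assumption that A and B perform the same
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # No conversions at all (or only conversions): nothing to compare
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Assumed example: 100 of 1000 visitors convert on A, 150 of 1000 on B
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(z, p, p < 0.03)
```

With these assumed numbers z comes out at roughly 3.4 and the p-value is well below 0.03, so the test would count as significant at the 97% level used in the example above.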

In A/B testing, you run the experiment until one version is a statistically significant winner. During this period, each visitor is randomly assigned to version A or B with equal (50%) probability. This is what statisticians call “pure exploration”.

Once you have a winner, the “pure exploitation” part starts. In this stage, you send 100% of your visitors to the winning version.

So you collect a sample, you learn from it, and then you implement what you learned: this is the sequence.

But the problem is that exploration and exploitation are two distinct stages.
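One way to see why two separate stages can be a problem is to estimate what the exploration stage costs. The sketch below is simple arithmetic under the assumption that you somehow knew the true conversion rates; the function name and the numbers are hypothetical.

```python
def exploration_cost(rate_a, rate_b, test_visitors):
    """Expected conversions lost by a 50/50 split, compared with
    sending every test visitor to the better version."""
    best = max(rate_a, rate_b)
    worst = min(rate_a, rate_b)
    # Half of the test traffic sees the weaker version
    return (test_visitors / 2) * (best - worst)

# Assumed example: A converts at 10%, B at 15%, 10,000 visitors in the test
print(exploration_cost(0.10, 0.15, 10_000))
```

With these assumed numbers, the 50/50 stage gives up about 250 conversions before the winner is rolled out to everyone.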

Multi-armed bandit testing


Multi-armed bandit testing differs from A/B/n testing because it holds that:

  • the transition between exploration and exploitation should be smoother
  • the data collection (exploration) phase wastes too much time and resources

While split testing keeps pure exploration and pure exploitation as clearly distinct phases, multi-armed bandit testing mixes the two: it adaptively and continuously balances exploration and exploitation.

As a result, you earn while you learn. As Matt Gershoff said, you learn during the exploration phase, but it has a cost: you lose opportunities. If you can decrease the cost of exploration by immediately putting what you learn into exploitation, you’ll have a higher ROI.

So multi-armed bandit algorithms work toward one goal: to reduce the traffic sent to lower-performing variations as fast as possible. This way the cost of experimentation decreases.
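The article does not name a specific algorithm, so as an illustration here is a minimal sketch of Thompson sampling, one widely used bandit strategy. The class name, the `run` helper and the conversion rates are assumptions for this example, not a description of any particular tool.

```python
import random

class ThompsonSampler:
    """Keeps conversion counts per variation and picks the next one to show."""

    def __init__(self, n_arms, seed=None):
        self.rng = random.Random(seed)
        self.successes = [0] * n_arms
        self.failures = [0] * n_arms

    def choose(self):
        # Draw a plausible conversion rate for each variation from its
        # Beta posterior and show the variation with the highest draw.
        samples = [
            self.rng.betavariate(s + 1, f + 1)
            for s, f in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, converted):
        if converted:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1

def run(true_rates, visitors, seed=0):
    """Simulate visitors and return how much traffic each variation got."""
    rng = random.Random(seed)
    sampler = ThompsonSampler(len(true_rates), seed=seed + 1)
    traffic = [0] * len(true_rates)
    for _ in range(visitors):
        arm = sampler.choose()
        traffic[arm] += 1
        sampler.update(arm, rng.random() < true_rates[arm])
    return traffic

# Assumed true rates: variation 0 converts at 5%, variation 1 at 8%
print(run([0.05, 0.08], visitors=5000))
```

Variations with little data produce widely scattered draws, so they still get shown now and then (exploration), while a variation that keeps converting wins most draws and receives most of the traffic (exploitation).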

What are the situations where bandit testing comes in handy for marketers, and why?

According to Matt Gershoff, if you don’t want to spend time on understanding but only on running the optimization, this is the way to go. But that is just one approach; there are two different use cases where multi-armed bandit testing will perform better.

Short tests

If you have a campaign that lasts only a few days or a week, you can’t conduct A/B tests, because by the time you reach significance the campaign is already over.

Therefore you have to mix the exploration and exploitation periods to get a higher conversion rate. This is why you use multi-armed bandit algorithms.

A very good example is a promotion that lasts just one day: Black Friday. If you ran an A/B test, you wouldn’t have enough time to benefit from it. Even if it reached significance, you would have lost more than half of the day by that time.

A multi-armed bandit algorithm, by contrast, adaptively sends more traffic to the better-performing variations, as Stephen Pavlovich has pointed out.

Long tests

There are three main cases where multi-armed bandit testing will outperform split testing.

Scaling your drip campaigns

When you have a process that doesn’t need big changes and runs continuously over and over again, multi-armed bandit testing is a good fit. You don’t even have to keep an eye on it: it will keep working to improve your conversion rate. It is an ongoing test.

Chains of conversion points

The best example, again, is a drip campaign with 3 emails that share the same goal (which can be anything). Every touch point has a chance to convert the recipient, and the best way to optimize the overall conversion rate of your drip campaign is to automatically send the best version of each of the 3 emails.
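As a rough sketch of this idea, you can keep separate statistics for every email in the sequence and let each step pick its own variation. The helper names are hypothetical, the selection rule is Thompson sampling (chosen here only as an example), and the sketch assumes that the emails can be optimized independently of each other, which may not hold in a real campaign.

```python
import random

def new_step(n_variants):
    """Per-variation [conversions, sends] counters for one email in the drip."""
    return [[0, 0] for _ in range(n_variants)]

def pick_variant(step, rng):
    # Thompson sampling: draw from Beta(conversions + 1, non-conversions + 1)
    draws = [rng.betavariate(c + 1, n - c + 1) for c, n in step]
    return draws.index(max(draws))

def record(step, variant, converted):
    """Store the outcome of one send of the given variation."""
    step[variant][1] += 1
    if converted:
        step[variant][0] += 1

def plan_sequence(campaign, rng):
    """Choose one variation for every email in the drip campaign."""
    return [pick_variant(step, rng) for step in campaign]

# A drip campaign with 3 emails and 2 variations of each (assumed setup)
rng = random.Random(42)
campaign = [new_step(2) for _ in range(3)]
record(campaign[0], 1, True)   # email 1, variation B converted
record(campaign[1], 0, False)  # email 2, variation A did not convert
print(plan_sequence(campaign, rng))
```

Each new subscriber gets a freshly planned sequence, so as results come in, every step of the drip gradually leans toward its own better-performing variation.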

Content and ad distribution

When you have two types of users, one with common behaviour and one that behaves a little differently, you can use multi-armed bandit testing to serve the common users correctly while still experimenting with the other users.

How can you implement bandit testing for your drip campaigns?

If you have a drip campaign that needs continuous improvement, unfortunately there is no ready-made solution for that right now. But Automizy will be the first to implement machine learning for this, in an upcoming feature, to make marketers’ lives easier.

The closed beta will be released this year, in September-October. If you want to be among the first few hundred people to try it, subscribe here now.
In addition, you can watch a quick video overview of how it will work in practice.
