Architecting A/B Experiments

Running experiments on a product is one of the most powerful techniques a company has for learning what its customers respond to best. But when more than two or three experiments run at the same time, it can become frustrating for developers, who find pieces of experiment code scattered around the whole project.
In this article I’ll explain a simple architectural approach that keeps the code clean and flexible.


What’s an A/B test?

Let’s suppose we want to improve a feature of our app so that more users use it. We have a few ideas (involving design, visibility, etc.), but we don’t know which one would get the best response from our users. The solution is to set up an A/B test.
An A/B test is an experiment that randomly allocates each user to one of several variations. Let’s see how it works.

Let’s say we want to know the best price at which to sell our premium membership in order to get the highest revenue.
We decide to set up 3 variations:

  • Original Variation: $15.00;
  • Variation A: $25.00;
  • Variation B: $40.00.

Every user of our app is allocated to one of these variations through a specific API (e.g. the Apptimize API), and the service tells our client which variation the user has been allocated to. At that point we execute one piece of code rather than another, according to that variation.
In the example above, the user sees the price of the variation they have been allocated to. Note that an experiment must always include the original variation, in other words the behaviour that was in place before the experiment started; otherwise we’d never know whether the original was already the best solution or was in fact not good enough.

Great job, the experiment is running! Now we just have to wait for a certain amount of time and collect the results.

OK, a week has gone by and these are the results:

  • Original Variation: 1000 users paid $15.00 -> $15,000.00 in revenue;
  • Variation A: 800 users paid $25.00 -> $20,000.00 in revenue;
  • Variation B: 300 users paid $40.00 -> $12,000.00 in revenue.

Note that neither the lowest nor the highest price won! Variation A, with the price in the middle, produced the highest revenue.
Therefore, we should set the price of the membership to $25.00 and close the experiment.
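
The arithmetic behind that decision is just “paying users times price” per variation, followed by picking the maximum. Here is a minimal sketch of it; MGRevenue and MGBestVariation are hypothetical helper names I’m using for illustration, they are not part of any A/B test framework:

```objc
#import <Foundation/Foundation.h>

// Revenue of one variation: number of paying users times the price they paid.
static double MGRevenue(NSUInteger payingUsers, double price)
{
    return (double)payingUsers * price;
}

// Returns the name of the variation with the highest revenue.
// `results` maps a variation name to @[payingUsers, price].
static NSString *MGBestVariation(NSDictionary<NSString *, NSArray<NSNumber *> *> *results)
{
    __block NSString *bestName = nil;
    __block double bestRevenue = -1.0;

    [results enumerateKeysAndObjectsUsingBlock:^(NSString *name, NSArray<NSNumber *> *row, BOOL *stop) {
        double revenue = MGRevenue(row[0].unsignedIntegerValue, row[1].doubleValue);
        if (revenue > bestRevenue) {
            bestRevenue = revenue;
            bestName = name;
        }
    }];

    return bestName;
}
```

Fed with the three rows above (1000 at 15, 800 at 25, 300 at 40), MGBestVariation picks Variation A. In a real experiment you would of course also want to check that the difference is statistically significant before closing it.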


OMG this is so dirty!

An experiment run through the popular Apptimize platform might look like this:

[Apptimize runTest:@"Experiment name" withBaseline:^{
  // Baseline variant "original"
  // ..code
  // ..code
  // ..code
  } andVariations:@{
    @"variation1": ^{
      // Variant "Variation A"
      // ..code
      // ..code
      // ..code
    }, @"variation2": ^{
      // Variant "Variation B"
      // ..code
      // ..code
      // ..code
    }
  }
];

Can you imagine how dirty your project can become?
Let’s look at the downsides of this approach:

☹ Dirty code
On top of the “spaghetti code”, Xcode doesn’t apply correct syntax highlighting and indentation to code like the snippet above;

☹ Imports
The A/B test framework ends up imported here and there around the whole project;

☹ Nested Experiments
Even though they should be avoided, nested experiments become really hard to manage;

☹ No nil blocks
If you need to execute code only in the variations of the experiment and do nothing in the original variation, you cannot pass nil as the baseline block. You have to pass an empty block instead;

☹ Checking implies running
To check whether a user has been allocated to a variation of an experiment, you need to run the test every time. And every time you run an experiment on a user, they become part of it and affect the final stats. Sometimes, though, you just want to check whether the user has been allocated to the experiment, without running it.
For example, suppose you run an experiment only on users who meet a specific condition, and at some point in the app you need to check whether the current user is in a variation of that experiment without risking allocating them to it. That is not easy to solve, and your code can become dirty and tricky to read;

☹ Not easy to remove
Don’t forget that experiments are temporary, and you will want to remove them easily once they are closed. So you should not overcomplicate your current architecture and code;

☹ Affects unit tests
Experiments also run during unit tests. To avoid this, you need to check the target before launching an experiment, which makes your code even dirtier.


A separate experiment manager

To address the downsides listed in the previous section, we first need to wrap the A/B test framework inside a custom class:

#import <Foundation/Foundation.h>


typedef void (^VariationBlock)(void);

@interface MGExperimentsManager : NSObject

+ (instancetype)sharedInstance;

- (void)setBaselineVariation:(VariationBlock)baselineVariation forExperimentWithName:(NSString *)experimentName;

- (void)setVariation:(VariationBlock)variation withName:(NSString *)variationName forExperimentName:(NSString *)experimentName;

- (void)runTest:(NSString *)experimentName;

- (NSDictionary <NSString *, NSString *> *)testInfo;

@end

As you can see, there are dedicated methods to set the baseline variation and the other variations, plus a method to run a specific test.
Let’s look at a use case:

__weak typeof (self) weakSelf = self;

[self setBaselineVariation:^{
  //...code
  //...code
  //...code
} forExperimentWithName:kMGExperimentName];

[self setVariation:^{
  //...code
  //...code
  //...code  
} withName:@"variation1" forExperimentName:kMGExperimentName];

[self runTest:kMGExperimentName];

A nice thing to highlight is that you can call the three methods above at different moments and from different points in the app. Basically, you can prepare your experiment first and launch it later, keeping the code clean and readable. Even more importantly, the dependency on the framework is confined to the single class that wraps it, which makes your code easier to maintain (e.g. if you want to replace Apptimize with another A/B test framework).
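
To make the “replace the framework” point a bit more concrete, here is a minimal sketch of one way a wrapper could hide the SDK behind a protocol. MGExperimentBackend and MGFixedBackend are hypothetical names of mine: they are not part of Apptimize, nor of the manager class shown in this article. The fixed backend simply runs a variation you choose (or the baseline), which can be handy for previews and tests; a real backend would forward the same call to the A/B test SDK.

```objc
#import <Foundation/Foundation.h>

typedef void (^VariationBlock)(void);

// Hypothetical protocol: the only experiment API the rest of the app would see.
@protocol MGExperimentBackend <NSObject>
- (void)runTest:(NSString *)experimentName
       baseline:(VariationBlock)baseline
     variations:(NSDictionary<NSString *, VariationBlock> *)variations;
@end

// Hypothetical stand-in backend: no network, no random allocation.
// It runs the forced variation if one is set and known, otherwise the baseline.
@interface MGFixedBackend : NSObject <MGExperimentBackend>
@property (nonatomic, copy) NSString *forcedVariationName;
@end

@implementation MGFixedBackend

- (void)runTest:(NSString *)experimentName
       baseline:(VariationBlock)baseline
     variations:(NSDictionary<NSString *, VariationBlock> *)variations
{
    VariationBlock chosen = nil;
    if (self.forcedVariationName) {
        chosen = variations[self.forcedVariationName];
    }
    if (!chosen) {
        chosen = baseline; // unknown or missing name falls back to the baseline
    }
    if (chosen) {
        chosen();
    }
}

@end
```

With something like this in place, swapping the A/B test framework would mean writing one new class that conforms to the protocol, without touching the experiments themselves.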


The following is the implementation file of the MGExperimentsManager class. I won’t go into the details of this implementation, as it’s quite easy to read and it’s not the main point of this article:

#import "MGExperimentsManager.h"
#import <Apptimize/Apptimize.h>


@implementation MGExperimentsManager {
    NSMutableDictionary<NSString *, VariationBlock> *experimentsBaselines_;
    NSMutableDictionary<NSString *, NSMutableDictionary<NSString *, VariationBlock> *> *experimentsVariations_;

}

+ (instancetype)sharedInstance
{
    static MGExperimentsManager *experimentManager;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        experimentManager = [MGExperimentsManager new];
    });

    return experimentManager;
}

- (instancetype)init
{
    self = [super init];
    if (self) {
        experimentsBaselines_ = [NSMutableDictionary new];
        experimentsVariations_ = [NSMutableDictionary new];
    }
    return self;
}

- (void)setBaselineVariation:(nullable VariationBlock)baselineVariation forExperimentWithName:(nonnull NSString *)experimentName
{
    NSParameterAssert(experimentName);

    if (!experimentName) {
        return;
    }

    experimentsBaselines_[experimentName] = baselineVariation;
}

- (void)setVariation:(nullable VariationBlock)variation withName:(nonnull NSString *)variationName forExperimentName:(nonnull NSString *)experimentName
{
    NSParameterAssert(experimentName);
    NSParameterAssert(variationName);

    if (!experimentName || !variationName) {
        return;
    }

    if (!experimentsVariations_[experimentName]) {
        experimentsVariations_[experimentName] = [NSMutableDictionary new];
    }

    experimentsVariations_[experimentName][variationName] = variation;
}

- (void)runTest:(nonnull NSString *)experimentName
{
    NSParameterAssert(experimentName);

    if (!experimentName) {
        return;
    }

    VariationBlock baselineVariation = experimentsBaselines_[experimentName];
    NSDictionary *variations = [experimentsVariations_[experimentName] copy];

    NSAssert(variations, @"There is not even a variation");
    NSAssert([variations count] > 0, @"There is not even a variation");

    if (!variations || [variations count] == 0) {
        return;
    }

    if (!baselineVariation) {
        baselineVariation = ^(){};
    }

    [Apptimize runTest:experimentName withBaseline:baselineVariation andVariations:variations];

    [self mg_removeStoredBlocksForExperimentWithName:experimentName];
}

- (NSDictionary <NSString *, NSString *>*)testInfo
{
    NSDictionary *apptimizeTestInfo = [Apptimize testInfo];

    NSMutableDictionary <NSString *, NSString *>*enrolledTests = [NSMutableDictionary dictionary];

    if (apptimizeTestInfo && apptimizeTestInfo.count > 0) {
        [apptimizeTestInfo enumerateKeysAndObjectsUsingBlock:^(id  _Nonnull key, id  _Nonnull obj, BOOL * _Nonnull stop) {
            // MGExperimentsManagerHelper is a string-formatting helper whose source is not shown in this article
            NSString *testName = [MGExperimentsManagerHelper formattedTestName:key];
            NSString *variation = [MGExperimentsManagerHelper trimmedString:[obj enrolledVariantName]];
            enrolledTests[testName] = variation;
        }];
    }

    return [enrolledTests copy];
}


#pragma mark - Private methods

- (void)mg_removeStoredBlocksForExperimentWithName:(nonnull NSString *)experimentName
{
    NSParameterAssert(experimentName);

    if (!experimentName) {
        return;
    }

    experimentsBaselines_[experimentName] = nil;
    experimentsVariations_[experimentName] = nil;
}

@end


The core of the architecture

Now that we have all the pieces, let’s move on to the final step of the architecture.
First of all, we want to keep our experiments tidy, easy to use, and easy to remove once closed. A good way to do this is to use categories.

With one category on our experiment manager class per experiment, we can keep the experiments well separated from each other. Moreover, wherever we need an experiment, we only have to import the category for the experiment we want to run (or check).
The header file of the experiment’s category should look something like this:

#import "MGExperimentsManager.h"


typedef NS_ENUM(NSInteger, MGPriceVariation) {
    MGPriceVariationNotEnrolled        = -1,
    MGPriceVariationNotAssigned        = 0, //or MGPriceVariationOriginal
    MGPriceVariationAssignedA          = 1,
    MGPriceVariationAssignedB          = 2
};

@interface MGExperimentsManager (PriceExperiment)

@property (readonly) MGPriceVariation priceVariation;

- (MGPriceVariation)runOrRefreshPriceExperimentConsideringUserPremium:(BOOL)isUserPremium;

@end

The enumeration must mirror the variations we configured in the A/B test platform. MGPriceVariationNotAssigned is the default variation, i.e. the baseline: the original configuration that was in place before the experiment was launched.
MGPriceVariationNotEnrolled is instead the default value of the priceVariation property, meaning that the experiment has never been launched in that session.
Note also that a precondition is passed to the method that runs the experiment. In this case, the precondition is that the user must be a free (non-premium, not currently paying) user. Only then is the user allocated to the experiment; otherwise the user is not enrolled and the session does not affect our final stats.
Let’s see how this category is implemented:

#import "MGExperimentsManager+PriceExperiment.h"

#import <objc/runtime.h>


static NSString * const kMGExperimentName = @"Price experiment";

static char const * const kMGPriceVariationProperty = "__kMGPriceVariationProperty";


@interface MGExperimentsManager ()

@property (readwrite) MGPriceVariation priceVariation;

@end


@implementation MGExperimentsManager (PriceExperiment)

- (MGPriceVariation)runOrRefreshPriceExperimentConsideringUserPremium:(BOOL)isUserPremium
{
    if (!isUserPremium) {
        __weak typeof (self) weakSelf = self;

        if (/* Check if is not Unit Test target */) {

            [self setBaselineVariation:^{
                weakSelf.priceVariation = MGPriceVariationNotAssigned;
            } forExperimentWithName:kMGExperimentName];

            [self setVariation:^{
                weakSelf.priceVariation = MGPriceVariationAssignedA;
            } withName:@"variation1" forExperimentName:kMGExperimentName];

            [self setVariation:^{
                weakSelf.priceVariation = MGPriceVariationAssignedB;
            } withName:@"variation2" forExperimentName:kMGExperimentName];

            [self runTest:kMGExperimentName];

        } else {
            weakSelf.priceVariation = MGPriceVariationNotAssigned;
        }

    } else {
        self.priceVariation = MGPriceVariationNotEnrolled;
    }

    return self.priceVariation;
}


#pragma mark - New properties

- (void)setPriceVariation:(MGPriceVariation)priceVariation
{
    objc_setAssociatedObject(self, kMGPriceVariationProperty, @(priceVariation) , OBJC_ASSOCIATION_RETAIN);
}

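- (MGPriceVariation)priceVariation
{
    NSNumber *value = objc_getAssociatedObject(self, kMGPriceVariationProperty);
    // No stored value means the experiment has not been run in this session
    return value ? [value integerValue] : MGPriceVariationNotEnrolled;
}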

@end

It basically does just one thing: each experiment block assigns the corresponding enumeration value.

Note that if the precondition is not met, the value is set to MGPriceVariationNotEnrolled, which means exactly that the experiment has not been run.
Note also the special if condition wrapped around the experiment setup, which prevents the experiment from running on a unit test target and always sets the value to MGPriceVariationNotAssigned.
This class could also contain other properties and methods that support the experiment while it is running, so that “temporary” lines of code don’t end up scattered around the project.
Indeed, once the experiment is finished, we just remove the #import and the related calls from the classes that use it, without spending time (and introducing bugs) hunting for the right pieces of code to remove.
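
The unit test check is left as a placeholder in the listing above, because how you detect a test run depends on your project setup. As one possible sketch, a commonly used heuristic is to look for the XCTestConfigurationFilePath environment variable, which Xcode typically sets in the process that hosts XCTest. Treat that as an assumption to verify on your toolchain rather than a guaranteed API; the function names below are hypothetical too:

```objc
#import <Foundation/Foundation.h>

// Decides from an environment dictionary, so the logic can be tested in isolation.
// Assumption: the test runner sets XCTestConfigurationFilePath in the hosted process.
static BOOL MGEnvironmentIndicatesUnitTests(NSDictionary<NSString *, NSString *> *environment)
{
    return environment[@"XCTestConfigurationFilePath"] != nil;
}

// Convenience wrapper that reads the environment of the current process.
static BOOL MGIsRunningUnitTests(void)
{
    return MGEnvironmentIndicatesUnitTests([NSProcessInfo processInfo].environment);
}
```

The condition in the category would then read if (!MGIsRunningUnitTests()). A preprocessor flag defined only on the test target, or a value injected by the test setup, would work just as well.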


Example of usage

MGPriceVariation priceVariation = [[MGExperimentsManager sharedInstance] runOrRefreshPriceExperimentConsideringUserPremium:user.isPremium];

// The user is premium, so the experiment was not run
if (priceVariation == MGPriceVariationNotEnrolled) {
    return;
}

CGFloat price = 15.; // original price (MGPriceVariationNotAssigned)

switch (priceVariation) {
    case MGPriceVariationAssignedA:
        price = 25.;
        break;
    case MGPriceVariationAssignedB:
        price = 40.;
        break;
    default:
        // MGPriceVariationNotAssigned keeps the original price
        break;
}

// ..code

Simple as that.
Note that once the experiment has been run, you can read the value without going through the A/B test framework again, via the property [MGExperimentsManager sharedInstance].priceVariation.

We may also need to check multiple experiments at the same time, and in those cases this architecture is really helpful.
For example:

BOOL userIsInExperimentA =
[MGExperimentsManager sharedInstance].variationExperimentA != MGExperimentAVariationNotEnrolled;

BOOL userIsInExperimentBVariationA =
[MGExperimentsManager sharedInstance].variationExperimentB == MGExperimentBVariationAssignedA;

if (!userIsInExperimentA && userIsInExperimentBVariationA)  {
    // do something
} else if (userIsInExperimentA) {
    // do something else
} else {
    // do something else
}


This is an example of how much code I removed, using the approach described here, from a class where two nested experiments were running together (highlighted lines):

[Figure: Experiment Manager Refactor]

The code on the left uses the experiment manager approach. The code on the right, about 2.5 times longer, does not.


Conclusion

Happy A/B testing!

Cheers ;)
