Skip to main content

Simple A/B testing for your backend

Project description

Have you ever wanted to A/B test multiple code paths on your backend? Consider the following: management wants you to cut down on server costs. You come to terms with the fact that you really don’t need a 128 GB memory 32 core beast for your coconut store — but how low can you go? Let’s test how a slower coconut catalog page affects conversion! (First, without tracks.)

response = render_coconut_catalog()
test_value = random.random()
if test_value < (1 / 3):
    # control group
    sleep_sec = 0
elif test_value < (2 / 3):
    sleep_sec = 0.5
else:
    sleep_sec = 2
time.sleep(sleep_sec)
cur.execute('INSERT INTO test_runs (user_id, variant) VALUES (%s, %s)', user_id, sleep_sec)
return response

Now, this kinda works, but it’s already a bit messy and can easily get out of hand if you consider that right now:

  • The thing you test is really, really simple

  • You don’t lock a user’s version to make sure they always get the same test variant, making your data more mushy with each page load.

  • You don’t exclude your most important users from testing to avoid hurting conversion among coconut addicts in your initial test run. (Of course you will need to include them later to make an informed decision.)

  • You already have a few more ideas about things you want to A/B test. Maybe a hundred.

  • You can add another 5 engineers to the project and they will probably each implement these tests slightly differently, and that’s no good.

Enter tracks. First we define our variants:

class DelayTracks(tracks.SimpleTrackSet):

    @staticmethod
    def short_track():
        time.sleep(0.5)

    @staticmethod
    def long_track():
        time.sleep(2)

And to use it:

response = render_coconut_catalog()
with DelayTracks() as track:
    track()
    cur.execute('INSERT INTO test_runs (user_id, variant) VALUES (%s, %s)', user_id, track.name)
return response

Already a bit nicer, but look what happens if we go through the list of problems above one by one. Firstly, we can easily make the tests more complex:

class PricingTracks(tracks.SimpleTrackSet):

    @staticmethod
    def expensive_track(response):
        for coconut in response['coconuts']:
            coconut['price'] += 1

    @staticmethod
    def cheap_track(response):
        for coconut in response['coconuts']:
            coconut['price'] -= 1

Didn’t get that much worse, now did it? Let’s take it even further with more variants:

class PricingTracks(tracks.ParamTrackSet):

    params = [{'price_delta': n} for n in range(-5, 6)]
    add_control_track = False  # {'price_delta': 0} is our control group

    @staticmethod
    def track_name(price_delta):
        return 'price_adjusted_by_{0}'.format(price_delta)

    @staticmethod
    def track(response, price_delta):
        for coconut in response['coconuts']:
            coconut['price'] += price_delta

Tons of tests! Let’s move on to the second bullet point. How do we lock the variant served to users? Just change your usage to pass a user key to tracks:

response = render_coconut_catalog()
with DelayTracks(key=user_id) as track:
    track()
    cur.execute('INSERT INTO test_runs (user_id, variant) VALUES (%s, %s)', user_id, track.name)
return response

The key will be serialized to a string and the variant to use will be derived from that string. The key of course can be anything; in most cases it might be the user ID, but you could use a combination of the user’s country and the article ID for instance. (Not sure why you would want this specific example, but you get the point.)

So, with that solved, list item #3, here we come! What if we’re worried about our top customers being mad at us for testing things on them? Easy peasy.

class DelayTracks(tracks.SimpleTrackSet):

    @property
    def is_eligible(self):
        return not self.context['user']['is_vip']

    # code of variants trimmed

response = render_coconut_catalog()
with DelayTracks(context={'user': user_dict}) as track:
    track()  # will always be control for VIPs (even with `add_control_group = False`)
    if track.is_eligible:
        cur.execute('INSERT INTO test_runs (user_id, variant) VALUES (%s, %s)', (user_id, track.name))
return response

And finally, let’s try running the delay and pricing tests at the same time!

tracksets = [DelayTracks, PricingTracks]

response = render_coconut_catalog()
with tracks.MultiTrackSet(tracksets, key=user_id) as multitrack:
    multitrack()
    cur.execute(
        'INSERT INTO test_runs (user_id, test, variant) VALUES (%s, %s)',
        (user_id, multitrack.trackset.name, multitrack.track.name),
    )
return response

So, with this, tracks will gather all tests the user is eligible for, and choose one of them based on the given key. We could also set DelayTracks.weight to 1000 to make that one ten times as likely to be used as the pricing one (the default weight is 100.)

We got pretty far with just a few lines of code, didn’t we?

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

tracks-0.1.1-py2.py3-none-any.whl (7.5 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file tracks-0.1.1-py2.py3-none-any.whl.

File metadata

File hashes

Hashes for tracks-0.1.1-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 fcd114c22951e3fd1d925314525a32919ad102c2c0edb607b94b4bcad846e8dc
MD5 693bad960361dea1421a104b9b12db79
BLAKE2b-256 28364832cfef74182abae15b7a6d044bfb2856d0e1b41d93754a23e26f6dbf20

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page