
Machine learning tools for uplift models

Project description

Introduction

There are currently several packages for uplift modeling (see EconML, GRF, PTE). They tend to focus on interesting ways of estimating heterogeneous treatment effects. However, in their current state they mostly address the single-response, single-treatment scenario. In addition, the metrics they use do not estimate the expected value of the response variables if the models were used in practice (PTE is an exception).

This package attempts to build an automated solution for Uplift modeling that includes the following features:

  1. It allows for multiple treatments. In addition, one can incorporate meta features for each treatment. For example, a particular bonus (treatment) might share several features with other bonuses. Instead of creating a dummy indicator for each bonus, the user can create a vector of categorical or continuous variables to represent the treatment.

  2. ERUPT functionality that estimates model performance on out-of-sample (OOS) data. This metric estimates the expected response if the model's proposed treatments were given to the average user.

  3. Support for multiple responses. This allows estimation of the tradeoffs involved in maximizing or minimizing weighted sums of responses. An example can be found here.
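
As a rough illustration of the first point, the sketch below contrasts a dummy encoding with a meta-feature encoding of treatments. The treatment descriptions (discount percent, free shipping flag) and all variable names are hypothetical and are not taken from the package.

```python
import numpy as np

# Hypothetical treatments: each row describes one bonus by
# (discount_percent, free_shipping_flag). Illustrative values only.
treatment_features = np.array([
    [0.0, 0.0],   # control: no bonus
    [10.0, 0.0],  # 10% discount
    [10.0, 1.0],  # 10% discount plus free shipping
])

# Index of the treatment each of five users received.
assigned = np.array([0, 2, 1, 1, 0])

# Dummy encoding: one indicator column per treatment, no shared structure.
t_dummy = np.eye(len(treatment_features))[assigned]

# Meta-feature encoding: each user gets the feature vector of their treatment,
# so treatments that share a feature (here the 10% discount) share a value.
t_meta = treatment_features[assigned]
```

The quick start passes a 2-D treatment array to fit, so a matrix like t_meta is presumably the kind of input this feature expects; that is an assumption worth checking against the package documentation.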

It does so by fitting a neural network of the form y ∼ f(t, x), where y, x, and t are the response, explanatory, and treatment variables, respectively. If optim_loss=True, an experimental loss function is used to estimate the function (see here). If the treatment was not randomly assigned, propensity scores can be used (see here). The package can also predict counterfactuals for all treatments and calculate ERUPT metrics on out-of-sample data.
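
To make the ERUPT idea concrete, here is a minimal sketch of the estimator, not the package's implementation. It assumes the treatment was randomly assigned: the estimate is the average response among observations whose assigned treatment happens to match the model's proposed treatment. The function name erupt and its optional propensity argument (the probability of the treatment each observation actually received) are hypothetical.

```python
import numpy as np

def erupt(y, t_assigned, t_proposed, propensity=None):
    """Sketch of an ERUPT estimate for a single response.

    With propensity=None, every treatment is assumed to have been assigned
    with equal probability. Otherwise, matched observations are weighted by
    the inverse of the probability of the treatment they received.
    """
    y = np.asarray(y, dtype=float)
    # Keep only observations where the proposal agrees with the assignment.
    match = np.asarray(t_assigned) == np.asarray(t_proposed)
    if propensity is None:
        weights = match.astype(float)
    else:
        weights = match / np.asarray(propensity, dtype=float)
    if weights.sum() == 0:
        raise ValueError("no observation received its proposed treatment")
    # Weighted mean response over the matched observations.
    return float(np.sum(weights * y) / np.sum(weights))

# Observations 0 and 3 received the treatment the model would propose.
estimate = erupt(y=[1, 2, 3, 4], t_assigned=[0, 1, 0, 1], t_proposed=[0, 0, 1, 1])
```

Computing this on held-out data is what makes it an out-of-sample estimate of how the model's treatment policy would perform.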

Quick Start Example

In a Python environment:

import numpy as np
import pandas as pd

from mr_uplift.dataset.data_simulation import get_simple_uplift_data
from mr_uplift.mr_uplift import MRUplift

# Generate simulated data: responses y, explanatory variables x, treatments t
y, x, t = get_simple_uplift_data(10000)
y = pd.DataFrame(y)
y.columns = ['revenue', 'cost', 'noise']
y['profit'] = y['revenue'] - y['cost']

# Build the model and grid search over hyperparameters
uplift_model = MRUplift()
param_grid = dict(num_nodes=[8], dropout=[.1, .5], activation=['relu'],
                  num_layers=[1, 2], epochs=[25], batch_size=[30])
uplift_model.fit(x, y, t.reshape(-1, 1), param_grid=param_grid, n_jobs=1)

# Out-of-sample ERUPT curves
erupt_curves, dists = uplift_model.get_erupt_curves()

# Predict optimal treatments for new observations.
# One weight per response column: revenue, cost, noise, profit.
_, x_new, _ = get_simple_uplift_data(5)
uplift_model.predict_optimal_treatments(
    x_new, objective_weights=np.array([.6, -.4, 0, 0]).reshape(1, -1))
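
The objective_weights argument in the last call weights the four response columns. The sketch below shows one way such weights can turn counterfactual predictions into a treatment choice: score each treatment by the weighted sum of its predicted responses and pick the highest score. It is an assumption about the general technique, not a description of how predict_optimal_treatments works internally; the function choose_treatments and the prediction values are hypothetical.

```python
import numpy as np

def choose_treatments(predictions, objective_weights):
    """Pick, per observation, the treatment with the highest weighted score.

    predictions: array of shape (n_treatments, n_obs, n_responses) holding
    predicted responses under each treatment (counterfactual predictions).
    objective_weights: one weight per response.
    """
    predictions = np.asarray(predictions, dtype=float)
    weights = np.asarray(objective_weights, dtype=float).ravel()
    # Weighted sum over the response axis gives shape (n_treatments, n_obs).
    scores = predictions @ weights
    return scores.argmax(axis=0)

# Two treatments, two observations, responses: revenue, cost, noise, profit.
predictions = np.array([
    [[1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 1.0]],   # treatment 0
    [[3.0, 1.0, 0.0, 2.0], [1.5, 2.0, 0.0, -0.5]],  # treatment 1
])
chosen = choose_treatments(predictions, [.6, -.4, 0, 0])
```

With these weights, treatment 1 is chosen for the first observation because its extra revenue outweighs its cost, while treatment 0 is kept for the second, where the cost dominates.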

Relevant Papers and Blog Posts

For discussion of the metric used to evaluate model performance, see:

ERUPT: Expected Response Under Proposed Treatments

Uplift Modeling with Multiple Treatments and General Response Types

Heterogeneous Treatment Effects and Optimal Targeting Policy Evaluation

A comparison of methods for model selection when estimating individual treatment effects

Inference for the Effectiveness of Personalized Medicine with Software

For tradeoff analysis, see:

Estimating and Visualizing Business Tradeoffs in Uplift Models

Experimental Evaluation of Individualized Treatment Rules

For the optimized loss, see:

Maximizing The ERUPT Metric for Uplift Models

Methods for Individual Treatment Assignment: An Application and Comparison for Playlist Generation

Acknowledgements

Thanks to Evan Harris, Andrew Tilley, Matt Johnson, and Nicole Woytarowicz for internal review before the open-source release. Thanks to James Foley for logo artwork.
