Active Learning for treatment effect estimation

Project description

Automatic Stopping for Batch-mode Experimentation

Created with nbdev by Zoltan Puha

Install

python -m pip install git+https://github.com/puhazoli/asbe

How to use

ASBE builds on the functional view of modAL, where an AL algorithm is assembled from interchangeable pieces. You need the following ingredients:

  • an ITE estimator (ITEEstimator()),
  • an acquisition function,
  • an assignment function.

Additionally, you can add a stopping criterion to your model. Once all of the above are defined, you can construct an ASLearner, which will guide you through the active learning process.

from asbe.base import *
from asbe.models import *
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
import numpy as np
N = 1000
X = np.random.normal(size = N*2).reshape((-1,2))
t = np.random.binomial(n = 1, p = 0.5, size = N)
y = np.random.binomial(n = 1, p = 1/(1+np.exp(X[:, 1]*2 + t*3)))
ite = 1/(1+np.exp(X[:, 1]*2 + t*3)) - 1/(1+np.exp(X[:, 1]*2))
a = BaseITEEstimator(LogisticRegression(solver="lbfgs"))
a.fit(X_training=X, t_training=t, y_training=y)
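The fitted estimator can then produce ITE predictions. As a rough illustration of the single-model ("S-learner") idea, the sketch below reproduces it with plain scikit-learn: one classifier is fit on the features plus the treatment indicator, and the ITE estimate is the difference in predicted outcome probabilities under treatment and control. This is a hand-rolled sketch of the concept, not asbe's internals:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
N = 1000
X = rng.normal(size=(N, 2))
t = rng.binomial(n=1, p=0.5, size=N)
y = rng.binomial(n=1, p=1 / (1 + np.exp(X[:, 1] * 2 + t * 3)))

# One model over the features plus the treatment indicator
m = LogisticRegression(solver="lbfgs").fit(np.column_stack([X, t]), y)

# ITE estimate: P(y=1 | X, t=1) - P(y=1 | X, t=0)
p1 = m.predict_proba(np.column_stack([X, np.ones(N)]))[:, 1]
p0 = m.predict_proba(np.column_stack([X, np.zeros(N)]))[:, 1]
ite_hat = p1 - p0
```

Because treatment lowers the outcome probability in this simulated data, the estimated effects come out negative on average.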

Learning actively

Similarly, you can create a BaseActiveLearner, for which you initialize the dataset and set the preferred modeling options. Let’s see how it works:

  • we will use XBART to model the treatment effect with a one-model approach,
  • we will use expected model change maximization (EMCM),
  • for that, we need an approximate model; we will use the SGDRegressor.

You can call .fit() on the BaseActiveLearner, which by default fits the supplied training data. To select new units from the pool, call the query() method, which returns the selected X and the query_idx of these units. query() takes a no_query argument that tells it how many units to query at once; for sequential AL, set this to 1. Additionally, some query strategies require different treatment effect estimates: EMCM needs uncertainty around the ITE, so we can explicitly tell the BaseITEEstimator to return all the predicted treatment effects. Then we can teach the newly acquired units to the learner by calling the teach() method. The score() method evaluates the given learner.

from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor
from copy import deepcopy
import pandas as pd
X_train, X_test, t_train, t_test, y_train, y_test, ite_train, ite_test = train_test_split(
    X, t, y, ite, test_size=0.8, random_state=1005)
ds = {"X_training": X_train,
      "y_training": y_train,
      "t_training": t_train,
      "ite_training": np.zeros_like(y_train),
      "X_pool": deepcopy(X_test),
      "y_pool": deepcopy(y_test),
      "t_pool": deepcopy(t_test),
      "ite_pool": np.zeros_like(y_test),
      "X_test": X_test,
      "y_test": y_test,
      "t_test": t_test,
      "ite_test": ite_test}
asl = BaseActiveLearner(estimator = BaseITEEstimator(model = RandomForestClassifier(),
                                         two_model=False),
                        acquisition_function=BaseAcquisitionFunction(),
                        assignment_function=BaseAssignmentFunction(),
                        stopping_function = None,
                        dataset=ds)
asl.fit()
X_new, query_idx = asl.query(no_query=10)
asl.teach(query_idx)
preds = asl.predict(asl.dataset["X_test"])
asl.score()
0.34842037641629464
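Putting query and teach together, a full sequential loop looks roughly like the self-contained sketch below. It uses the spread of per-tree predictions from a scikit-learn random forest as a stand-in for "all the predicted treatment effects", and their standard deviation as an uncertainty-style acquisition score. The labels here are random placeholders, and none of this is asbe's actual implementation; it only illustrates the query-then-teach cycle:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X_pool = rng.normal(size=(200, 2))          # unlabeled pool
X_train = rng.normal(size=(20, 2))          # initial training set
ite_train = rng.normal(size=20)             # placeholder "labels"

for step in range(3):
    rf = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_train, ite_train)
    # Per-tree predictions on the pool -> shape (n_trees, n_pool)
    per_tree = np.stack([tree.predict(X_pool) for tree in rf.estimators_])
    # "query": pick the 10 pool units with the highest disagreement across trees
    query_idx = np.argsort(per_tree.std(axis=0))[-10:]
    # "teach": move the queried units from the pool to the training set
    X_train = np.vstack([X_train, X_pool[query_idx]])
    ite_train = np.concatenate([ite_train, rng.normal(size=10)])
    X_pool = np.delete(X_pool, query_idx, axis=0)
```

After three steps of batch size 10, the training set has grown from 20 to 50 units and the pool has shrunk from 200 to 170.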
asl = BaseActiveLearner(estimator = BaseITEEstimator(model = RandomForestClassifier(),
                                         two_model=True),
                        acquisition_function=[BaseAcquisitionFunction(),
                                             BaseAcquisitionFunction(no_query=20)],
                        assignment_function=BaseAssignmentFunction(),
                        stopping_function = None,
                        dataset=ds,
                        al_steps = 3)
resd = pd.DataFrame(asl.simulate(metric="decision"))
resd.plot()
<AxesSubplot:>
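For context on evaluation, a common metric in the ITE literature is PEHE (precision in estimating heterogeneous effects), the root mean squared error between true and estimated effects. asbe's score() and simulate() may use other metrics (such as "decision" above); the following is a generic sketch of PEHE only:

```python
import numpy as np

def pehe(ite_true, ite_pred):
    """Root mean squared error between true and estimated treatment effects."""
    ite_true, ite_pred = np.asarray(ite_true), np.asarray(ite_pred)
    return np.sqrt(np.mean((ite_true - ite_pred) ** 2))

print(pehe([1.0, 0.0], [0.5, 0.5]))  # 0.5
```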

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

asbe-0.1.4.tar.gz (34.5 kB view details)

Uploaded Source

Built Distribution

asbe-0.1.4-py3-none-any.whl (33.7 kB view details)

Uploaded Python 3

File details

Details for the file asbe-0.1.4.tar.gz.

File metadata

  • Download URL: asbe-0.1.4.tar.gz
  • Upload date:
  • Size: 34.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.13

File hashes

Hashes for asbe-0.1.4.tar.gz
Algorithm Hash digest
SHA256 48aa24e6d12fb9266fc03de6f8299d638534e234d99243757800e3812b5fe94a
MD5 281c0471def0cb0b0248febef276abe8
BLAKE2b-256 bd144bad926a3ff2aa4652388ccffafd0a767e6547420bccb6fb66e950e5b688


File details

Details for the file asbe-0.1.4-py3-none-any.whl.

File metadata

  • Download URL: asbe-0.1.4-py3-none-any.whl
  • Upload date:
  • Size: 33.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.13

File hashes

Hashes for asbe-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 48f84776d94cf1988e8fff2671febd86ab501c45f1376a2d439a7179a4ac7e38
MD5 56964590b6745250ccc8b7dbaee96900
BLAKE2b-256 657f959905a221e703aa053854005451fa0cf89e821d7b868482e83c5887caf0

