
Tree-based synthetic control methods

Project description

*** ATTENTION ***

Don't immediately run pip install tbscm; see the Installation section below.

Tree-based Synthetic Control Methods (tbscm)

This package implements the Tree-based Synthetic Control Methods (tbscm) from Mühlbach & Nielsen (2021), see https://arxiv.org/abs/1909.03968.

The method is essentially a nonparametric extension of the classic synthetic control estimator proposed by Alberto Abadie and co-authors.
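
As a brief, schematic reminder of the classic estimator that tbscm generalizes (this is the standard outcome-matching formulation, not the exact objective used by the package): the counterfactual for the treated unit is a convex combination of control outcomes, with weights fitted on pre-treatment data,

\[
\hat{w} \;=\; \arg\min_{w_j \ge 0,\; \sum_j w_j = 1}\;
\big\| \, y_{\text{treated}}^{\text{pre}} - Y_{\text{controls}}^{\text{pre}} \, w \, \big\|_2^2,
\qquad
\hat{y}_{\text{treated}}^{\text{post}}(0) \;=\; Y_{\text{controls}}^{\text{post}} \, \hat{w},
\]

and the estimated average treatment effect is the post-treatment average of the gap between the observed outcome and this predicted counterfactual. Roughly speaking, tbscm keeps the same pre-treatment-fit / post-treatment-prediction logic but replaces the constrained linear combination with a nonparametric, tree-based regression.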

Please contact the authors below if you find any bugs or have any suggestions for improvement. Thank you!

Author: Nicolaj Søndergaard Mühlbach (n.muhlbach at gmail dot com, muhlbach at mit dot edu)

Code dependencies

This code has the following dependencies:

  • Python >=3.6
  • numpy >=1.19
  • pandas >=1.3
  • mlregression >=0.1.6

Note that mlregression in turn depends on

  • scikit-learn >=1
  • scikit-learn-intelex >= 2021.3
  • daal >= 2021.3
  • daal4py >= 2021.3
  • tbb >= 2021.4
  • xgboost >=1.3
  • lightgbm >=3.2

Installation

Before calling pip install tbscm, we recommend installing mlregression first. For installation of mlregression and its dependencies, please visit https://pypi.org/project/mlregression/.
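
For example, assuming a standard pip setup, the recommended order is simply:

pip install mlregression
pip install tbscm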

Usage

We demonstrate the use of tbscm below, using random forests, XGBoost, and LightGBM as underlying regressors.

#------------------------------------------------------------------------------
# Libraries
#------------------------------------------------------------------------------
# Standard
import time, random
import numpy as np

# User
from tbscm.utils import data
from tbscm.synthetic_controls import SyntheticControl as SC
from tbscm.synthetic_controls import TreeBasedSyntheticControl as TBSC
from tbscm.synthetic_controls import ElasticNetSyntheticControl as ENSC

#------------------------------------------------------------------------------
# Settings
#------------------------------------------------------------------------------
# Number of covariates
p = 2
ar_lags = 3

# Number of max models to run
max_n_models = 5

# Data settings
data_settings = {
    # General    
    "T0":500,
    "T1":500,
    "ate":1,
        
    # Errors
    "eps_mean":0,
    "eps_std":1,
    "eps_cov_xx":0, # How the X's covary with each other
    "eps_cov_yy":0.1, # How the X's covary with y
    
    # X
    "X_type":"AR",
    "X_dist":"normal",
    "X_dim":p,
    "mu":0,
    "sigma":1,
    "covariance":0,
    "AR_lags":ar_lags,
    "AR_coefs":1/np.exp(np.arange(1,ar_lags+1)),
    
    # Y=f*
    "f":data.generate_linear_data, # generate_linear_data, generate_friedman_data_1, generate_friedman_data_2,
    }

# Start timer
t0 = time.time()

# Set seed
random.seed(1991)

#------------------------------------------------------------------------------
# Simple example
#------------------------------------------------------------------------------
# Generate data
df = data.simulate_data(**data_settings)

# True ate
ate = data_settings["ate"]

# Extract data
Y = df["Y"]
W = df["W"]
X = df[[col for col in df.columns if "X" in col]]

# Instantiate SC-objects
sc = SC()
tbsc = TBSC(max_n_models=max_n_models)
ensc = ENSC(max_n_models=max_n_models)

# Fit
sc.fit(Y=Y,W=W,X=X)
print(f"Estimated ATE using SC: {np.around(sc.average_treatment_effet_,2)}")

tbsc.fit(Y=Y,W=W,X=X)
print(f"Estimated ATE using TB-SC: {np.around(tbsc.average_treatment_effet_,2)}")

ensc.fit(Y=Y,W=W,X=X)
print(f"Estimated ATE using EN-SC: {np.around(ensc.average_treatment_effet_,2)}")

# Bootstrap
bootstrapped_results = tbsc.bootstrap_ate()
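
The exact object returned by bootstrap_ate() is not documented above; assuming it behaves like an array of bootstrapped ATE draws, a rough interval summary could look like the following (a sketch under that assumption, not part of the documented API):

# Sketch only: assumes `bootstrapped_results` is array-like, one ATE draw per replication
lower, upper = np.percentile(np.asarray(bootstrapped_results), [2.5, 97.5])
print(f"Bootstrapped 95% interval for the ATE: [{lower:.2f}, {upper:.2f}]")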


Download files

Download the file for your platform.

Source Distribution

tbscm-0.0.7.tar.gz (16.4 kB)

Built Distribution

tbscm-0.0.7-py3-none-any.whl (18.1 kB)

File details

Details for the file tbscm-0.0.7.tar.gz.

File metadata

  • Download URL: tbscm-0.0.7.tar.gz
  • Size: 16.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for tbscm-0.0.7.tar.gz:

  • SHA256: 879ee5f02f3a8799cea11c00e617f5d2757a494042bf06ffecc04ca84ff6763a
  • MD5: 8730769ccaf318bdaaaf29aa85405fe4
  • BLAKE2b-256: 527a1d013b10ee6bc4a6f9a18b6bf1fff9d2b609616d33be9d95d04249da586a
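
To check a downloaded archive against the SHA256 digest above, a minimal sketch using only the Python standard library (assuming the sdist was saved to the current working directory under its original name) is:

import hashlib

expected = "879ee5f02f3a8799cea11c00e617f5d2757a494042bf06ffecc04ca84ff6763a"
with open("tbscm-0.0.7.tar.gz", "rb") as f:
    actual = hashlib.sha256(f.read()).hexdigest()
print("SHA256 matches:", actual == expected)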


File details

Details for the file tbscm-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: tbscm-0.0.7-py3-none-any.whl
  • Size: 18.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for tbscm-0.0.7-py3-none-any.whl:

  • SHA256: b25874d3b7adabbb6924b485e9c0ba2fea56bd80d9b8aa92009d510cac94ce58
  • MD5: 154a303e2c3e319677f040996810e339
  • BLAKE2b-256: 738b2784072078faa315d0a75805c85c589e7f38a60705894ef4fa8ab52497e5

