
Tree-based synthetic control methods

Project description

*** ATTENTION ***

Do not immediately run pip install tbscm; see the Installation section below first.

Tree-based Synthetic Control Methods (tbscm)

This package implements the Tree-based Synthetic Control Methods (tbscm) from Mühlbach & Nielsen (2021); see https://arxiv.org/abs/1909.03968.

The method is essentially a nonparametric extension of the classic synthetic control estimator proposed by Alberto Abadie and co-authors.
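
As a rough illustration of the underlying idea (this is not the package's internal implementation, and all names below are purely illustrative), a tree-based synthetic control replaces the constrained linear combination of control units with a flexible regression fit on pre-treatment data, which is then used to predict the post-treatment counterfactual:

#------------------------------------------------------------------------------
# Conceptual sketch (illustrative only, not the tbscm implementation)
#------------------------------------------------------------------------------
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def naive_tree_based_sc(y, X, pre_period):
    """Estimate an ATE by predicting the untreated counterfactual.

    y: outcome series, X: control covariates, pre_period: boolean mask
    marking pre-treatment observations.
    """
    model = RandomForestRegressor(n_estimators=500, random_state=0)
    model.fit(X[pre_period], y[pre_period])             # learn the pre-treatment relationship
    y_counterfactual = model.predict(X[~pre_period])    # predicted untreated outcome post-treatment
    return np.mean(y[~pre_period] - y_counterfactual)   # average treatment effect estimate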

Please contact the authors below if you find any bugs or have any suggestions for improvement. Thank you!

Author: Nicolaj Søndergaard Mühlbach (n.muhlbach at gmail dot com, muhlbach at mit dot edu)

Code dependencies

This code has the following dependencies:

  • Python >=3.6
  • numpy >=1.19
  • pandas >=1.3
  • mlregression >=0.1.6

Note that mlregression in turn depends on

  • scikit-learn >=1
  • scikit-learn-intelex >=2021.3
  • daal >=2021.3
  • daal4py >=2021.3
  • tbb >=2021.4
  • xgboost >=1.3
  • lightgbm >=3.2

Installation

Before running pip install tbscm, we recommend installing mlregression first. For instructions on installing mlregression and its dependencies, please visit https://pypi.org/project/mlregression/.
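
For example, in a typical pip-based environment the install sequence would simply be:

pip install mlregression
pip install tbscm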

Usage

We demonstrate the use of tbscm below, comparing the classic synthetic control (SC), the tree-based synthetic control (TBSC), and the elastic-net synthetic control (ENSC) estimators.

#------------------------------------------------------------------------------
# Libraries
#------------------------------------------------------------------------------
# Standard
import time, random
import numpy as np

# User
from tbscm.utils import data
from tbscm.synthetic_controls import SyntheticControl as SC
from tbscm.synthetic_controls import TreeBasedSyntheticControl as TBSC
from tbscm.synthetic_controls import ElasticNetSyntheticControl as ENSC

#------------------------------------------------------------------------------
# Settings
#------------------------------------------------------------------------------
# Number of covariates
p = 2
ar_lags = 3

# Number of max models to run
max_n_models = 5

# Data settings
data_settings = {
    # General    
    "T0":500,
    "T1":500,
    "ate":1,
        
    # Errors
    "eps_mean":0,
    "eps_std":1,
    "eps_cov_xx":0, # How the X's covary with each other
    "eps_cov_yy":0.1, # How the X's covary with y
    
    # X
    "X_type":"AR",
    "X_dist":"normal",
    "X_dim":p,
    "mu":0,
    "sigma":1,
    "covariance":0,
    "AR_lags":ar_lags,
    "AR_coefs":1/np.exp(np.arange(1,ar_lags+1)),
    
    # Y=f*
    "f":data.generate_linear_data, # generate_linear_data, generate_friedman_data_1, generate_friedman_data_2,
    }

# Start timer
t0 = time.time()

# Set seed
random.seed(1991)

#------------------------------------------------------------------------------
# Simple example
#------------------------------------------------------------------------------
# Generate data
df = data.simulate_data(**data_settings)

# True ate
ate = data_settings["ate"]

# Extract data
Y = df["Y"]
W = df["W"]
X = df[[col for col in df.columns if "X" in col]]

# Instantiate SC-objects
sc = SC()
tbsc = TBSC(max_n_models=max_n_models)
ensc = ENSC(max_n_models=max_n_models)

# Fit
sc.fit(Y=Y,W=W,X=X)
print(f"Estimated ATE using SC: {np.around(sc.average_treatment_effet_,2)}")

tbsc.fit(Y=Y,W=W,X=X)
print(f"Estimated ATE using TB-SC: {np.around(tbsc.average_treatment_effet_,2)}")

ensc.fit(Y=Y,W=W,X=X)
print(f"Estimated ATE using EN-SC: {np.around(ensc.average_treatment_effet_,2)}")

# Bootstrap
bootstrapped_results = tbsc.bootstrap_ate()
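
The timer started above can be used to report the total runtime, and the bootstrapped results can be summarized. Note that the summary below assumes bootstrap_ate() returns an array-like collection of bootstrapped ATE draws; this is an assumption, so inspect the returned object in your own environment first.

# Report runtime of the example
print(f"Elapsed time: {time.time() - t0:.1f} seconds")

# Summarize the bootstrapped ATEs (assumes an array-like of ATE draws)
boot = np.asarray(bootstrapped_results)
print(f"Bootstrap mean ATE: {boot.mean():.2f}")
print(f"Bootstrap 95% interval: {np.percentile(boot, [2.5, 97.5]).round(2)}")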

Download files

Download the file for your platform.

Source Distribution

tbscm-0.0.4.tar.gz (15.9 kB)

Uploaded Source

Built Distribution

tbscm-0.0.4-py3-none-any.whl (17.6 kB)

Uploaded Python 3

File details

Details for the file tbscm-0.0.4.tar.gz.

File metadata

  • Download URL: tbscm-0.0.4.tar.gz
  • Upload date:
  • Size: 15.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for tbscm-0.0.4.tar.gz

  • SHA256: 59f39cfe5a24ee88115d18cd5a8a3ec1fbb96035f3d22a3a8228f2374251d8e5
  • MD5: 9a95ef779b44d0e3996a4af54085c109
  • BLAKE2b-256: 701fc47e7777ccc8324b1a846cc93ab970fe4c7c2d94c94a71237cd2fd717770


File details

Details for the file tbscm-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: tbscm-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 17.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for tbscm-0.0.4-py3-none-any.whl

  • SHA256: 205cb4ae1a289aa10483ec4738d75409656a3ea8d22fd627c9a2b1fe6309cd65
  • MD5: 7c1bf7c87c3f1c8898499fadc3b2e587
  • BLAKE2b-256: 9377e2016260aba275889e13809f788e213c873a80fc94ebf93d8e8ae9cbeb4c

