
Tree-based synthetic control methods

Project description

*** ATTENTION ***

Don't immediately run pip install tbscm; see the Installation section below.

Tree-based Synthetic Control Methods (tbscm)

This package implements the Tree-based Synthetic Control Methods (tbscm) of Mühlbach & Nielsen (2021); see https://arxiv.org/abs/1909.03968.

The method is essentially a nonparametric extension of the classic synthetic control group estimator proposed by Alberto Abadie and co-authors.
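
To make the idea concrete, here is a minimal, self-contained sketch (not the package's implementation; all names and numbers are illustrative): in the pre-treatment period, a tree ensemble learns the mapping from control-unit outcomes to the treated outcome; in the post-treatment period, it predicts the counterfactual, and the average treatment effect is estimated as the mean gap between actual and predicted outcomes.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
T0, T1, p = 200, 100, 5                        # pre-/post-treatment lengths, number of controls
X = rng.normal(size=(T0 + T1, p))              # control-unit outcomes
y = X @ rng.normal(size=p) + rng.normal(scale=0.1, size=T0 + T1)
y[T0:] += 1.0                                  # add a treatment effect of 1 after T0

forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X[:T0], y[:T0])                     # learn the pre-treatment relationship
y0_hat = forest.predict(X[T0:])                # predicted counterfactual after treatment
ate_hat = np.mean(y[T0:] - y0_hat)             # estimated average treatment effect (close to 1)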

Please contact the author below if you find any bugs or have suggestions for improvement. Thank you!

Author: Nicolaj Søndergaard Mühlbach (n.muhlbach at gmail dot com, muhlbach at mit dot edu)

Code dependencies

This code has the following dependencies:

  • Python >=3.6
  • numpy >=1.19
  • pandas >=1.3
  • mlregression >=0.1.6

Note that mlregression in turn depends on

  • scikit-learn >=1
  • scikit-learn-intelex >= 2021.3
  • daal >= 2021.3
  • daal4py >= 2021.3
  • tbb >= 2021.4
  • xgboost >=1.3
  • lightgbm >=3.2

Installation

Before calling pip install tbscm, we recommend installing mlregression first. For installation of mlregression and its dependencies, please visit: https://pypi.org/project/mlregression/.
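
In practice this simply means installing in order, e.g. (shell commands, assuming a working pip):

pip install mlregression
pip install tbscm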

Usage

We demonstrate the use of tbscm below on simulated data, comparing the classic synthetic control (SC), the tree-based synthetic control (TB-SC), and the elastic-net synthetic control (EN-SC).

#------------------------------------------------------------------------------
# Libraries
#------------------------------------------------------------------------------
# Standard
import time, random
import numpy as np

# User
from tbscm.utils import data
from tbscm.synthetic_controls import SyntheticControl as SC
from tbscm.synthetic_controls import TreeBasedSyntheticControl as TBSC
from tbscm.synthetic_controls import ElasticNetSyntheticControl as ENSC

#------------------------------------------------------------------------------
# Settings
#------------------------------------------------------------------------------
# Number of covariates
p = 2
ar_lags = 3

# Number of max models to run
max_n_models = 5

# Data settings
data_settings = {
    # General    
    "T0":500,
    "T1":500,
    "ate":1,
        
    # Errors
    "eps_mean":0,
    "eps_std":1,
    "eps_cov_xx":0, # How the X's covary with each other
    "eps_cov_yy":0.1, # How the X's covary with y
    
    # X
    "X_type":"AR",
    "X_dist":"normal",
    "X_dim":p,
    "mu":0,
    "sigma":1,
    "covariance":0,
    "AR_lags":ar_lags,
    "AR_coefs":1/np.exp(np.arange(1,ar_lags+1)),
    
    # Y=f*
    "f":data.generate_linear_data, # generate_linear_data, generate_friedman_data_1, generate_friedman_data_2,
    }

# Start timer
t0 = time.time()

# Set seed
random.seed(1991)

#------------------------------------------------------------------------------
# Simple example
#------------------------------------------------------------------------------
# Generate data
df = data.simulate_data(**data_settings)

# True ate
ate = data_settings["ate"]

# Extract data
Y = df["Y"]
W = df["W"]
X = df[[col for col in df.columns if "X" in col]]

# Instantiate SC-objects
sc = SC()
tbsc = TBSC(max_n_models=max_n_models)
ensc = ENSC(max_n_models=max_n_models)

# Fit
sc.fit(Y=Y,W=W,X=X)
print(f"Estimated ATE using SC: {np.around(sc.average_treatment_effet_,2)}")

tbsc.fit(Y=Y,W=W,X=X)
print(f"Estimated ATE using TB-SC: {np.around(tbsc.average_treatment_effet_,2)}")

ensc.fit(Y=Y,W=W,X=X)
print(f"Estimated ATE using EN-SC: {np.around(ensc.average_treatment_effet_,2)}")

# Bootstrap
bootstrapped_results = tbsc.bootstrap_ate()

Download files

Download the file for your platform.

Source Distribution

tbscm-0.0.6.tar.gz (16.4 kB)


Built Distribution

tbscm-0.0.6-py3-none-any.whl (18.1 kB)


File details

Details for the file tbscm-0.0.6.tar.gz.

File metadata

  • Download URL: tbscm-0.0.6.tar.gz
  • Upload date:
  • Size: 16.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for tbscm-0.0.6.tar.gz

  • SHA256: 29b5a6b82676db626b81e147d4261bf119dccbe859fd3f32148eb764f8cd04fc
  • MD5: 6c01a7c4a190b77917b5a138bf9f712c
  • BLAKE2b-256: 4454fe7850b515aaf8d672ecadb08a1b692bce7b938295b8ffed71dbc54898a9


File details

Details for the file tbscm-0.0.6-py3-none-any.whl.

File metadata

  • Download URL: tbscm-0.0.6-py3-none-any.whl
  • Upload date:
  • Size: 18.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for tbscm-0.0.6-py3-none-any.whl

  • SHA256: 3819169ff1aecf3a542b6014a5ae6834216892c45fe3b449f8a3b959077c9e77
  • MD5: 904cb005acbed5488116b86bfe358a08
  • BLAKE2b-256: d5072a541037abf3e49a18094bedfeea2f210e1fb44528caeb5e4c0570cdd72c

