Tree-based synthetic control methods

Project description

*** ATTENTION ***

Don't immediately run pip install tbscm; see the Installation section below.

Tree-based Synthetic Control Methods (tbscm)

This package implements the Tree-based Synthetic Control Methods (tbscm) from Mühlbach & Nielsen (2021); see https://arxiv.org/abs/1909.03968.

The method is essentially a nonparametric extension of the classic synthetic control estimator proposed by Alberto Abadie and co-authors.
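In a nutshell (our own notation, a sketch rather than the paper's exact setup): the untreated counterfactual of the treated unit is estimated as

    \hat{Y}_{1t}(0) = \hat{f}(Y_{2t}, \ldots, Y_{J+1,t}),

where the classic estimator restricts \hat{f} to a convex combination \sum_j w_j Y_{jt} with w_j \ge 0 and \sum_j w_j = 1, whereas tbscm fits \hat{f} nonparametrically with tree-based regressors on pre-treatment data.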

Please contact the authors below if you find any bugs or have any suggestions for improvement. Thank you!

Author: Nicolaj Søndergaard Mühlbach (n.muhlbach at gmail dot com, muhlbach at mit dot edu)

Code dependencies

This code has the following dependencies:

  • Python >=3.6
  • numpy >=1.19
  • pandas >=1.3
  • mlregression >=0.1.6

Note that mlregression in turn depends on:

  • scikit-learn >=1
  • scikit-learn-intelex >= 2021.3
  • daal >= 2021.3
  • daal4py >= 2021.3
  • tbb >= 2021.4
  • xgboost >=1.3
  • lightgbm >=3.2

Installation

Before calling pip install tbscm, we recommend installing mlregression. For installation of mlregression and its dependencies, please visit https://pypi.org/project/mlregression/.
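A minimal installation sequence, assuming plain pip can resolve all of mlregression's dependencies on your platform, would be:

# Install mlregression (and its dependencies) first, then tbscm
pip install mlregression
pip install tbscm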

Usage

We demonstrate the use of tbscm below, comparing the ordinary synthetic control (SC), the tree-based synthetic control (TBSC), and the elastic-net synthetic control (ENSC) estimators on simulated data. Under the hood, the tree-based estimator relies on mlregression with regressors such as random forests, xgboost, and lightGBM.

#------------------------------------------------------------------------------
# Libraries
#------------------------------------------------------------------------------
# Standard
import time, random
import numpy as np

# User
from tbscm.utils import data
from tbscm.synthetic_controls import SyntheticControl as SC
from tbscm.synthetic_controls import TreeBasedSyntheticControl as TBSC
from tbscm.synthetic_controls import ElasticNetSyntheticControl as ENSC

#------------------------------------------------------------------------------
# Settings
#------------------------------------------------------------------------------
# Number of covariates
p = 2
ar_lags = 3

# Number of max models to run
max_n_models = 5

# Data settings
data_settings = {
    # General    
    "T0":500,
    "T1":500,
    "ate":1,
        
    # Errors
    "eps_mean":0,
    "eps_std":1,
    "eps_cov_xx":0, # How the X's covary with each other
    "eps_cov_yy":0.1, # How the X's covary with y
    
    # X
    "X_type":"AR",
    "X_dist":"normal",
    "X_dim":p,
    "mu":0,
    "sigma":1,
    "covariance":0,
    "AR_lags":ar_lags,
    "AR_coefs":1/np.exp(np.arange(1,ar_lags+1)),
    
    # Y=f*
    "f":data.generate_linear_data, # generate_linear_data, generate_friedman_data_1, generate_friedman_data_2,
    }

# Start timer
t0 = time.time()

# Set seed
random.seed(1991)

#------------------------------------------------------------------------------
# Simple example
#------------------------------------------------------------------------------
# Generate data
df = data.simulate_data(**data_settings)
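# The simulated frame should contain the outcome Y, a treatment indicator W,
# and covariates X1..Xp (extracted below)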

# True ate
ate = data_settings["ate"]

# Extract data
Y = df["Y"]
W = df["W"]
X = df[[col for col in df.columns if "X" in col]]

# Instantiate SC-objects
sc = SC()
tbsc = TBSC(max_n_models=max_n_models)
ensc = ENSC(max_n_models=max_n_models)

# Fit
sc.fit(Y=Y,W=W,X=X)
print(f"Estimated ATE using SC: {np.around(sc.average_treatment_effet_,2)}")

tbsc.fit(Y=Y,W=W,X=X)
print(f"Estimated ATE using TB-SC: {np.around(tbsc.average_treatment_effet_,2)}")

ensc.fit(Y=Y,W=W,X=X)
print(f"Estimated ATE using EN-SC: {np.around(ensc.average_treatment_effet_,2)}")

# Bootstrap
bootstrapped_results = tbsc.bootstrap_ate()
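
# Inspect the bootstrap output (its exact structure depends on the tbscm version)
print(bootstrapped_results)

# Report total runtime using the timer started above
print(f"Elapsed time: {np.around(time.time() - t0, 2)} seconds")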

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • tbscm-0.0.5.tar.gz (16.4 kB, uploaded as source)

Built Distribution

  • tbscm-0.0.5-py3-none-any.whl (18.1 kB, uploaded for Python 3)

File details

Details for the file tbscm-0.0.5.tar.gz.

File metadata

  • Download URL: tbscm-0.0.5.tar.gz
  • Upload date:
  • Size: 16.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for tbscm-0.0.5.tar.gz:

  • SHA256: ce8514e6e020c50e96a2ce2223267ae18c1a069cb8b8072c87532b7a55d47b03
  • MD5: b69afa11a0e4b7fef854c37f77a48792
  • BLAKE2b-256: be2384e3a5d374a89d5ffada1a425e81eb98f99f4daae1802a61344966d51a89


File details

Details for the file tbscm-0.0.5-py3-none-any.whl.

File metadata

  • Download URL: tbscm-0.0.5-py3-none-any.whl
  • Upload date:
  • Size: 18.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for tbscm-0.0.5-py3-none-any.whl:

  • SHA256: 105291a1216fef4a186bbaa64fa750f6366a54c05e61014e914641a5be0d1409
  • MD5: 59b7391a9ef2a169477f83465aace60a
  • BLAKE2b-256: 874f942f1401d67a6ce55f911fcd2e58379024281dd8340acdd676d9caeb6208

