Simulated Data Generating Process

Project description

Opossum

Package for simulation of data generating process to evaluate causal inference models.

Getting Started

Latest release version: 0.2.0

Download with pip

*nix OS

pip3 install opossum

Windows

pip install opossum --user --upgrade

Build it yourself

git clone https://github.com/jgitr/opossum.git
cd opossum
git checkout master
cd opossum

*nix

python3 main.py

Windows

python main.py

SHA-256: opossum-0.2.0-py3-none-any.whl

19b0b705b37a71fd5deac40720d30da4caa6529a9f1e1e7db648ecd07aea3077

SHA-256: opossum-0.2.0.tar.gz

e13000b1755576a80693dca5f2560cc9b7d3e1391701bdc5cae86f3a9b7eb036
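
The digests above can be checked against a downloaded file before installing it. The following is a minimal sketch using only Python's standard `hashlib`; the helper names `sha256_of` and `matches_published` and the local file path are illustrative assumptions, not part of opossum.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Hypothetical local path; point it at wherever the wheel was downloaded.
    wheel = Path("opossum-0.2.0-py3-none-any.whl")
    expected = "19b0b705b37a71fd5deac40720d30da4caa6529a9f1e1e7db648ecd07aea3077"
    if wheel.exists():
        print("digest matches:", matches_published(wheel, expected))
    else:
        print("file not found:", wheel)
```

Reading in chunks keeps memory use flat regardless of file size; for archives as small as these, reading the whole file at once would work just as well.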

Application

Below is a short description, in code form, of the core functions the package offers. For more detail on how to apply the package, and for insight into the theoretical model it is based on, see this blog post: https://humboldt-wi.github.io/blog/research/applied_predictive_modeling_19/data_generating_process_blogpost/

Default Setting

from opossum import UserInterface
# number of observations N and number of covariates k
N = 10000
k = 50
# initializing class
u = UserInterface(N, k, seed=None, categorical_covariates = None)
# assign treatment and generate treatment effect inside of class object
u.generate_treatment(random_assignment = True, 
                     assignment_prob = 0.5, 
                     constant_pos = True, 
                     constant_neg = False,
                     heterogeneous_pos = False, 
                     heterogeneous_neg = False, 
                     no_treatment = False, 
                     discrete_heterogeneous = False,
                     treatment_option_weights = None, 
                     intensity = 5)
# generate output variable y and return all 4 variables
y, X, assignment, treatment = u.output_data(binary=False, x_y_relation = 'partial_nonlinear_simple')

Choosing covariates

N = 1000
k = 20
# whole dataset is binary
u = UserInterface(N, k, categorical_covariates = 2)
# one quarter of the dataset is binary
u = UserInterface(N, k, categorical_covariates = [5,2])
# dataset consists of 10 continuous covariates, 4 binary covariates, and 3 covariates each with 3 and with 5 categories
u = UserInterface(N, k, categorical_covariates = [10,[2,3,5]])
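
To confirm which mix of covariate types a given setting produced, one option is to count the distinct values in each column of X. The sketch below is not part of opossum: `count_levels`, `summarize_covariates`, and the `max_levels` cutoff are hypothetical helpers, and it assumes X is a row-major table (for a NumPy array, `X.tolist()` gives a suitable input). A toy table stands in for real output.

```python
def count_levels(X):
    """Return the number of distinct values in each column of a row-major table."""
    columns = zip(*X)  # transpose rows into columns
    return [len(set(column)) for column in columns]


def summarize_covariates(X, max_levels=10):
    """Count columns per number of levels; columns above max_levels count as continuous."""
    summary = {}
    for levels in count_levels(X):
        key = levels if levels <= max_levels else "continuous"
        summary[key] = summary.get(key, 0) + 1
    return summary


# Toy stand-in for X: one continuous, one binary, and one three-level column.
toy_X = [
    [0.13, 0, 2],
    [0.87, 1, 0],
    [0.42, 1, 1],
    [0.65, 0, 2],
]
print(summarize_covariates(toy_X, max_levels=3))  # {'continuous': 1, 2: 1, 3: 1}
```

The cutoff is a heuristic: with few observations a continuous column can fall under it, so choose `max_levels` well below N.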

Creating treatment effects

# random treatment assignment resulting in on average 20% treated observations 
u.generate_treatment(random_assignment = True, assignment_prob = 0.2)
# non-random treatment assignment with on average 65% treated observations
u.generate_treatment(random_assignment = False, assignment_prob = 'high')
# generating only a positive heterogeneous treatment effect
u.generate_treatment(constant_pos = False, constant_neg = False, heterogeneous_pos = True, heterogeneous_neg = False, 
                     no_treatment = False, discrete_heterogeneous = False)
# generating a heterogeneous treatment effect that is in 20% of cases negative and 80% positive
u.generate_treatment(treatment_option_weights = [0, 0, 0.8, 0.2, 0, 0]) 
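
A positional list of six weights is easy to get wrong, so it can help to build it from named shares. The sketch below assumes the list follows the order of the keyword arguments shown under Default Setting (which matches the 80%/20% example above) and that the weights should sum to 1; both are assumptions drawn from this README, not confirmed package behaviour. `OPTION_ORDER` and `build_option_weights` are hypothetical names.

```python
import math

# Assumed order of the six entries, taken from the keyword order in this README.
OPTION_ORDER = (
    "constant_pos", "constant_neg", "heterogeneous_pos",
    "heterogeneous_neg", "no_treatment", "discrete_heterogeneous",
)


def build_option_weights(**shares):
    """Turn named shares into a six-entry list, validating names and the total."""
    unknown = set(shares) - set(OPTION_ORDER)
    if unknown:
        raise ValueError(f"unknown treatment options: {sorted(unknown)}")
    weights = [float(shares.get(name, 0)) for name in OPTION_ORDER]
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    if not math.isclose(sum(weights), 1.0):
        raise ValueError(f"weights must sum to 1, got {sum(weights)}")
    return weights


weights = build_option_weights(heterogeneous_pos=0.8, heterogeneous_neg=0.2)
print(weights)  # [0.0, 0.0, 0.8, 0.2, 0.0, 0.0]
```

The resulting list could then be passed as `treatment_option_weights = weights` in the call above.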

Creating output

# Creating continuous y with partial nonlinear relation 
y, X, assignment, treatment = u.output_data(binary=False, x_y_relation = 'partial_nonlinear_simple')
# Creating binary y with underlying linear relation and added interaction terms of X
y, X, assignment, treatment = u.output_data(binary=True, x_y_relation = 'linear_interaction')
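
A quick sanity check on the generated data is to compare the share of treated observations with the requested assignment probability and to compute the raw difference in mean outcomes between the two groups. The sketch below uses only the standard library and toy values in place of real output; it assumes `assignment` is coded 0/1 and has the same length as `y`, which is an assumption about the returned variables. Under random assignment the difference in means estimates the average treatment effect; under non-random assignment it is generally confounded, which is the case causal inference models are meant to handle.

```python
from statistics import mean


def treated_share(assignment):
    """Fraction of observations whose assignment indicator equals 1."""
    return sum(1 for a in assignment if a == 1) / len(assignment)


def difference_in_means(y, assignment):
    """Mean outcome of treated observations minus mean outcome of controls."""
    treated = [yi for yi, a in zip(y, assignment) if a == 1]
    control = [yi for yi, a in zip(y, assignment) if a == 0]
    if not treated or not control:
        raise ValueError("need at least one treated and one control observation")
    return mean(treated) - mean(control)


# Toy stand-ins for the y and assignment returned by u.output_data(...).
toy_y = [3.0, 1.0, 4.0, 2.0]
toy_assignment = [1, 0, 1, 0]
print(treated_share(toy_assignment))               # 0.5
print(difference_in_means(toy_y, toy_assignment))  # 2.0
```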

Download files


Source Distribution

opossum-0.2.1.tar.gz (14.8 kB)

Uploaded Source

Built Distribution

opossum-0.2.1-py3-none-any.whl (35.9 kB)

Uploaded Python 3

File details

Details for the file opossum-0.2.1.tar.gz.

File metadata

  • Download URL: opossum-0.2.1.tar.gz
  • Upload date:
  • Size: 14.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.1.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.3

File hashes

Hashes for opossum-0.2.1.tar.gz
Algorithm    Hash digest
SHA256       881a94c50260960f68b3fd07ec96f7e4b19367a36a95e23484f861b85330e422
MD5          5c9872c001ce150f66b86a4d08e14493
BLAKE2b-256  1f60faa5f317ccae346b8b9aeb3a47fb31adc91a5e44228bec7661a25f5b8c13

File details

Details for the file opossum-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: opossum-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 35.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.1.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.3

File hashes

Hashes for opossum-0.2.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       4797079e7887838a2572e0bc827ef9bc5bc3c3bcc8f217e23f4019294b732905
MD5          45e92e26f680f6017fbbc2f9cc928c64
BLAKE2b-256  cb40211a710f66f127c45d3faa3dc4a137241822a4f9aaccab801d826280ebc7
