Simulated Data Generating Process
Task: Simulate a realistic data set with known treatment effects
Properties of the data generator:
- High-dimensional and non-linear data
- Flexible
- User-friendly
Motivation
- Provide high-quality data for research
- Useful for you to test your models
Model Equations
Y - Outcome Variable
D - Treatment Dummy
Treatment Assignment
Random
Conditioned on covariates
Create assignment vector
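The two assignment modes above can be illustrated with a small NumPy sketch. This is not the package's internal code: the function name `assign_treatment` and the logistic form of the covariate-dependent propensity score are assumptions made for illustration only.

```python
import numpy as np

def assign_treatment(X, random_assignment=True, assignment_prob=0.5, seed=None):
    """Return (assignment_vector, propensity_scores) for covariates X of shape (N, k).

    Hypothetical helper, not part of opossum.
    """
    rng = np.random.default_rng(seed)
    n, k = X.shape
    if random_assignment:
        # Random assignment: every unit has the same treatment probability.
        propensity = np.full(n, assignment_prob)
    else:
        # Assumed form: logistic function of a standardized random linear index
        # of X. The offset centres the propensities around assignment_prob,
        # but only approximately.
        weights = rng.normal(size=k)
        index = X @ weights
        index = (index - index.mean()) / (index.std() + 1e-12)
        offset = np.log(assignment_prob / (1 - assignment_prob))
        propensity = 1 / (1 + np.exp(-(index + offset)))
    # Draw the 0/1 assignment vector from the unit-level propensities.
    assignment = rng.binomial(1, propensity)
    return assignment, propensity
```

With `random_assignment=False`, the returned propensity scores differ across units, so treated and untreated groups no longer have the same covariate distribution.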
Treatment effect
Options 1/2: Constant, positive and negative
Options 3/4: Continuous heterogeneous effect, positive and negative
Option 5: No treatment effect
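As a rough sketch of how the five options could be mixed across units, the snippet below draws one option per observation and looks up the matching effect. The function `mixed_treatment_effect`, the `scale` parameter, and the sigmoid used for the heterogeneous effect are hypothetical choices and may differ from what the package does.

```python
import numpy as np

# Order follows the list above: options 1/2 constant, 3/4 heterogeneous, 5 none.
OPTIONS = ("constant_pos", "constant_neg",
           "heterogeneous_pos", "heterogeneous_neg", "no_treatment")

def mixed_treatment_effect(X, option_weights, scale=1.0, seed=None):
    """Return (effects, chosen_option_index) for covariates X of shape (N, k).

    Hypothetical helper, not part of opossum.
    """
    rng = np.random.default_rng(seed)
    n, k = X.shape
    # Assumed heterogeneity: a covariate-dependent magnitude in (0, 1).
    magnitude = 1 / (1 + np.exp(-(X @ rng.normal(size=k))))
    # One column per option, one row per unit.
    effects = np.column_stack([
        np.full(n, scale),    # constant, positive
        np.full(n, -scale),   # constant, negative
        scale * magnitude,    # heterogeneous, positive
        -scale * magnitude,   # heterogeneous, negative
        np.zeros(n),          # no treatment effect
    ])
    # Pick one option per unit according to the supplied weights.
    choice = rng.choice(len(OPTIONS), size=n, p=option_weights)
    return effects[np.arange(n), choice], choice
```

Passing weights such as `[0, 0, 0.7, 0.1, 0.2]` would give most units a positive heterogeneous effect, a few a negative one, and the rest no effect.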
Composition of dependent variable
Non-linearity:
Option 1: Continuous
Option 2: Binary
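To make the continuous and binary variants concrete, here is a minimal sketch under the assumption of a generic partially linear form, Y = theta * D + g(X) + noise. The function `compose_outcome`, the particular non-linear `g`, and the logistic link for the binary case are illustrative assumptions, not the package's documented equations.

```python
import numpy as np

def compose_outcome(X, D, theta, binary=False, seed=None):
    """Build an outcome vector from covariates X, treatment D and effect theta.

    Hypothetical helper, not part of opossum.
    """
    rng = np.random.default_rng(seed)
    n, k = X.shape
    # Assumed non-linear baseline g(X): squared cosine of a random linear index.
    g = np.cos(X @ rng.normal(size=k)) ** 2
    noise = rng.normal(size=n)
    latent = g + theta * D + noise
    if not binary:
        # Option 1: continuous outcome.
        return latent
    # Option 2: binary outcome, drawn with a logistic success probability.
    prob = 1 / (1 + np.exp(-latent))
    return rng.binomial(1, prob)
```

In the continuous case the treatment effect enters additively, so it can be recovered exactly by comparing the same unit with and without treatment under a fixed seed.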
Application of module/package
from opossum import UserInterface

u = UserInterface(N=10000, k=10, seed=12)
u.generate_treatment(
    random_assignment=True,
    assignment_prob=0.5,
    constant_pos=True,
    constant_neg=False,
    heterogeneous_pos=False,
    heterogeneous_neg=False,
    no_treatment=False,
    # treatment_option_weights default: None; order is
    # [constant_pos, constant_neg, heterogeneous_pos, heterogeneous_neg, no effect]
    treatment_option_weights=[0, 0, 0.7, 0.1, 0.2],
    intensity=5,
)
y, X, assignment_vector, treatment_effect = u.output_data(binary=False)
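Because the true treatment effect is returned alongside the data, a model's estimate can be checked against it. The sketch below shows the simplest such check, a difference in means under random assignment. It assumes `y`, `assignment_vector` and `treatment_effect` are one-dimensional arrays of length N; the helper `difference_in_means` and the synthetic stand-in data are hypothetical and used so the example runs without the package.

```python
import numpy as np

def difference_in_means(y, assignment_vector):
    """Naive average treatment effect estimate: mean(treated) - mean(untreated)."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(assignment_vector).astype(bool)
    return y[d].mean() - y[~d].mean()

# Synthetic stand-in for the generator's outputs (not produced by opossum).
rng = np.random.default_rng(12)
n = 10_000
d_demo = rng.binomial(1, 0.5, size=n)           # random assignment
effect_demo = np.full(n, 0.5)                   # known constant effect
y_demo = rng.normal(size=n) + effect_demo * d_demo

estimate = difference_in_means(y_demo, d_demo)
true_ate = effect_demo.mean()
print(f"estimated ATE: {estimate:.3f}, true ATE: {true_ate:.3f}")
```

The same comparison should work with the arrays returned by `u.output_data(...)`, provided they have the shapes assumed here. With assignment conditioned on covariates, the naive difference in means is generally biased.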
Correlation Matrix of X
%matplotlib notebook
u.plot_covariates_correlation()
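The numbers behind such a plot can also be inspected directly. The following sketch computes a covariate correlation matrix with NumPy, assuming `X` is an (N, k) array with one column per covariate. It is independent of how `plot_covariates_correlation` is implemented, and `covariate_correlation` is a hypothetical helper.

```python
import numpy as np

def covariate_correlation(X):
    """Return the (k, k) Pearson correlation matrix of the columns of X."""
    X = np.asarray(X, dtype=float)
    # rowvar=False treats each column as one variable.
    return np.corrcoef(X, rowvar=False)

# Synthetic stand-in for the covariate matrix (not produced by opossum).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 4))
corr = covariate_correlation(X_demo)
print(corr.round(2))
```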
Distribution of propensity scores according to treatment assignment
Treatment effect options
Customized treatment distribution
Outputs depending on treatment assignment