A synthetic data generation package
Project description
data_generation
Are you looking for a package that can generate high-dimensional synthetic datasets?
Do you need datasets with a grouped structure so that you can test your group-lasso-based formulation?
Look no further: data_generation is precisely what you need.
Usage example
import data_generation as dgen  # assumed import; the original snippet uses dgen without defining it

# Dataset whose groups all have the same size
data_equal = dgen.EqualGroupSize(n_obs=5000, ro=0.2, error_distribution='student_t',
                                 e_df=5, random_state=1, group_size=10, non_zero_groups=3,
                                 non_zero_coef=5, num_groups=7)
x, y, beta, group_index = data_equal.data_generation().values()

# Dataset whose groups have different sizes, described by parallel tuples
data_different = dgen.UnequalGroupSize(n_obs=5000, ro=0.8, error_distribution='normal', e_loc=1, e_scale=4,
                                       random_state=2, tuple_group_size=(2, 4, 6, 8),
                                       tuple_number_of_groups=(5, 10, 15, 20),
                                       tuple_non_zero_coef=(1, 2, 3, 4),
                                       tuple_non_zero_groups=(1, 3, 5, 7))
x, y, beta, group_index = data_different.data_generation().values()
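To make the shape of such a dataset concrete, here is a minimal NumPy sketch of a grouped design with equal group sizes. It is not the package's implementation: the function name sketch_equal_groups is hypothetical, and the AR(1)-style correlation ro**|i-j|, the unit-valued non-zero coefficients, and their placement at the start of the first groups are all assumptions made for illustration. Only the parameter names and the four returned items mirror the usage example above.

```python
import numpy as np


def sketch_equal_groups(n_obs=200, num_groups=7, group_size=10,
                        non_zero_groups=3, non_zero_coef=5,
                        ro=0.2, e_df=5, random_state=1):
    """Hypothetical stand-in for a grouped synthetic dataset generator."""
    rng = np.random.default_rng(random_state)
    n_var = num_groups * group_size

    # Assumption: predictors are Gaussian with correlation ro**|i-j|.
    idx = np.arange(n_var)
    cov = ro ** np.abs(idx[:, None] - idx[None, :])
    x = rng.multivariate_normal(np.zeros(n_var), cov, size=n_obs)

    # Group labels 1..num_groups, each repeated group_size times.
    group_index = np.repeat(np.arange(1, num_groups + 1), group_size)

    # Assumption: the first non_zero_coef coefficients of each of the
    # first non_zero_groups groups are set to 1; the rest stay at 0.
    beta = np.zeros(n_var)
    for g in range(non_zero_groups):
        start = g * group_size
        beta[start:start + non_zero_coef] = 1.0

    # Student-t noise with e_df degrees of freedom.
    error = rng.standard_t(e_df, size=n_obs)
    y = x @ beta + error

    return {"x": x, "y": y, "beta": beta, "group_index": group_index}


x, y, beta, group_index = sketch_equal_groups().values()
print(x.shape, y.shape, int(np.count_nonzero(beta)))
```

Returning a dict whose values unpack in a fixed order mirrors the .values() call in the usage example; this relies on Python dicts preserving insertion order (Python 3.7 and later).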
For a more detailed review and an explanation of the capabilities of this package, we recommend reading the user_guide available in the repository.
Source Distribution
data_generation-0.0.1.tar.gz (5.3 kB)
File details
Details for the file data_generation-0.0.1.tar.gz.
File metadata
- Download URL: data_generation-0.0.1.tar.gz
- Upload date:
- Size: 5.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.4.0.post20200518 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.8.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9d3093d298cd5e61b2c422bc5e927164791248ad24b3e1958e9ceb2b31b2f175
MD5 | 1db33b132dc0bf8cddff40c42d162e6c
BLAKE2b-256 | 28259371333bb2596d0408688b6e92d216ddb8a0ab5985d80e60e0cad301034b