DistVAE Implementation Package for Synthetic Data Generation
DistVAE-Tabular
DistVAE is a novel approach to distributional learning in the VAE framework. It focuses on accurately capturing the underlying distribution of the observed dataset through nonparametric CDF estimation.
We utilize the continuous ranked probability score (CRPS), a strictly proper scoring rule, as the reconstruction loss while preserving the mathematical derivation of the lower bound of the data log-likelihood. Additionally, we introduce a synthetic data generation mechanism that effectively preserves differential privacy.
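To make the CRPS concrete, here is a minimal sketch in plain Python. It is not the package's loss implementation; the function names `pinball_loss`, `crps_from_quantiles`, and `crps_from_samples` are hypothetical and chosen only for illustration. It relies on two standard identities for a forecast distribution F with quantile function Q and an observation y: CRPS(F, y) = 2 ∫₀¹ ρ_α(y − Q(α)) dα, where ρ_α is the pinball (quantile) loss, and CRPS(F, y) = E|X − y| − ½ E|X − X′| for independent X, X′ drawn from F.

```python
def pinball_loss(alpha, u):
    """Pinball (quantile) loss rho_alpha(u) = u * (alpha - 1{u < 0})."""
    return u * (alpha - (1.0 if u < 0 else 0.0))


def crps_from_quantiles(quantile_fn, y, n_grid=1000):
    """Approximate CRPS = 2 * integral over (0, 1) of the pinball loss.

    quantile_fn maps a level alpha in (0, 1) to the forecast quantile Q(alpha).
    The integral is approximated with a midpoint rule on n_grid levels.
    """
    total = 0.0
    for k in range(n_grid):
        alpha = (k + 0.5) / n_grid
        total += pinball_loss(alpha, y - quantile_fn(alpha))
    return 2.0 * total / n_grid


def crps_from_samples(samples, y):
    """Estimate CRPS from forecast samples via E|X - y| - 0.5 * E|X - X'|."""
    n = len(samples)
    obs_term = sum(abs(x - y) for x in samples) / n
    spread_term = sum(abs(a - b) for a in samples for b in samples) / (2.0 * n * n)
    return obs_term - spread_term


if __name__ == "__main__":
    # Uniform(0, 1) forecast, whose quantile function is Q(alpha) = alpha.
    print(crps_from_quantiles(lambda alpha: alpha, y=0.5))
    # The same forecast represented by 200 evenly spaced pseudo-samples.
    grid_samples = [(k + 0.5) / 200 for k in range(200)]
    print(crps_from_samples(grid_samples, y=0.5))
```

For a Uniform(0, 1) forecast and y = 0.5 the exact value is 1/12, so both estimates should land close to 0.0833. The quantile form is the one that pairs naturally with a model that outputs a quantile function or CDF, since the loss can be evaluated directly on a grid of levels.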
1. Installation
Install using pip:
pip install distvae-tabular
2. Usage
from distvae_tabular import distvae
distvae.DistVAE # DistVAE model
distvae.generate_data # generate synthetic data
- See example.ipynb for a detailed example with the loan dataset.
  - Link to download the loan dataset: https://www.kaggle.com/datasets/teertha/personal-loan-modeling
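The usage lines above name the entry points but not their arguments. Before writing a training script, it can help to confirm that the install worked and to read the signatures straight from the installed code. The sketch below uses only the standard library; `describe_public_api` is a hypothetical helper, not part of distvae-tabular, and it makes no assumption about how `DistVAE` or `generate_data` are called or where exactly `generate_data` is defined.

```python
import importlib
import inspect


def describe_public_api(module_name, names):
    """Return {name: signature string} for the requested attributes.

    Returns None if the module cannot be imported (for example, when the
    package is not installed). Names missing from the module are reported
    as "<not found>" rather than raising.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None

    described = {}
    for name in names:
        obj = getattr(module, name, None)
        if obj is None:
            described[name] = "<not found>"
            continue
        try:
            described[name] = str(inspect.signature(obj))
        except (TypeError, ValueError):
            # Some objects (e.g. certain builtins) expose no signature.
            described[name] = "<signature unavailable>"
    return described


if __name__ == "__main__":
    # Module path and names are taken from the usage lines above.
    print(describe_public_api("distvae_tabular.distvae", ["DistVAE", "generate_data"]))
```

If this prints `None`, the package is not importable in the current environment and the pip command from the installation step should be rerun there. Otherwise, the printed signatures show the constructor and generation arguments of the installed version, which can then be checked against example.ipynb.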
Citation
If you use this code or package, please cite our associated paper:
@article{an2024distributional,
title={Distributional learning of variational AutoEncoder: application to synthetic data generation},
author={An, Seunghwan and Jeon, Jong-June},
journal={Advances in Neural Information Processing Systems},
volume={36},
year={2024}
}