Artificial Tabular Data Synthesizers
Project description
ARTSyn is a library containing models and algorithm implementations for synthesizing artificial tabular data. Such synthetic data are often useful in classification and regression tasks that involve imbalanced datasets. Examples include fault/defect detection, intrusion detection, medical diagnosis, and financial prediction.
Most models in ARTSyn support conditional data generation, namely, the generation of data instances that belong to a particular class. The models accept tabular data in CSV format, together with information about the column structure (e.g. which columns hold numeric values, which hold discrete values, and which is the class column). They are then trained to generate additional samples, either from a specific class or without any condition. For the moment, ARTSyn focuses on Generative Adversarial Networks (GANs), but more models and algorithms will be supported in the future.
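To make the idea of class-conditional generation concrete, the following is a minimal sketch that uses only the Python standard library. It does not use ARTSyn's API and it is not a GAN: it swaps in a naive per-column baseline (an independent Gaussian for each numeric column, and a draw from the observed values for each discrete column). The `COLUMNS` dictionary, its key names, the sample CSV, and the helper functions are all hypothetical and serve only to illustrate the inputs described above.

```python
import csv
import io
import random
import statistics

# Hypothetical column description; these key names are illustrative only
# and are NOT ARTSyn's actual configuration format.
COLUMNS = {
    "numeric": ["age", "income"],
    "discrete": ["city"],
    "class": "label",
}

# Made-up toy data standing in for a CSV file on disk.
CSV_TEXT = """age,income,city,label
25,30000,A,0
31,42000,B,0
47,80000,A,1
52,91000,C,1
29,35000,B,0
58,99000,C,1
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, casting numeric columns to float."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        for name in COLUMNS["numeric"]:
            row[name] = float(row[name])
    return rows


def sample_conditional(rows, target_class, n, seed=0):
    """Draw n synthetic rows for one class with a naive per-column model.

    Numeric columns: independent Gaussian fitted to the class's rows.
    Discrete columns: a value drawn from those observed in the class.
    This is a stand-in baseline, not a GAN.
    """
    rng = random.Random(seed)
    subset = [r for r in rows if r[COLUMNS["class"]] == target_class]
    if len(subset) < 2:
        raise ValueError("need at least two rows of the target class")
    synthetic = []
    for _ in range(n):
        new_row = {COLUMNS["class"]: target_class}
        for name in COLUMNS["numeric"]:
            values = [r[name] for r in subset]
            mean = statistics.mean(values)
            spread = statistics.stdev(values)
            new_row[name] = rng.gauss(mean, spread)
        for name in COLUMNS["discrete"]:
            new_row[name] = rng.choice([r[name] for r in subset])
        synthetic.append(new_row)
    return synthetic


rows = load_rows(CSV_TEXT)
# Class labels are strings here because the csv module returns text.
minority = sample_conditional(rows, target_class="1", n=3)
for row in minority:
    print(row)
```

Because each column is sampled independently, this baseline ignores correlations between columns. Capturing those correlations is one reason to use a learned generative model such as a GAN.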
Licence: Apache License, 2.0 (Apache-2.0)
Dependencies: NumPy, Pandas, Matplotlib, Seaborn, joblib, Synthetic Data Vault (SDV), PyTorch, scikit-learn, xgboost, imblearn, Reversible Data Transforms (RDT), tqdm.
GitHub repository: https://github.com/lakritidis/artsyn
Publications:
- L. Akritidis, P. Bozanis, "A Clustering-Based Resampling Technique with Cluster Structure Analysis for Software Defect Detection in Imbalanced Datasets", Information Sciences, vol. 674, pp. 120724, 2024.
- L. Akritidis, A. Fevgas, M. Alamaniotis, P. Bozanis, "Conditional Data Synthesis with Deep Generative Models for Imbalanced Dataset Oversampling", In Proceedings of the 35th IEEE International Conference on Tools with Artificial Intelligence, pp. 444-451, 2023.
- L. Akritidis, P. Bozanis, "A Multi-Dimensional Survey on Learning from Imbalanced Data", Chapter in International Conference on Information, Intelligence, Systems, and Applications, pp. 13-45, 2024.
Download files
Source Distribution
File details
Details for the file artsyn-0.5.1.tar.gz.
File metadata
- Download URL: artsyn-0.5.1.tar.gz
- Upload date:
- Size: 108.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.8.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6bc74ac6bfe867c9c30457db4b5320ca5a93b2cc254f64c9a3684dc34886162d |
| MD5 | 8dc15c479d520726507d3f6c011cec5e |
| BLAKE2b-256 | 1dd097d179457d7b50ba21efae23dcf0f1e5ac15a4ec6349b717b447eb6dc4dc |
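
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `sha256_of_file` and `verify` are illustrative, and the local path passed to `verify` is an assumption about where the file was saved.

```python
import hashlib

# SHA256 digest published for artsyn-0.5.1.tar.gz in the table above.
EXPECTED_SHA256 = "6bc74ac6bfe867c9c30457db4b5320ca5a93b2cc254f64c9a3684dc34886162d"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of_file(path) == expected.lower()
```

For example, `verify("artsyn-0.5.1.tar.gz")` would return `True` only if a file with that name in the current directory matches the published digest.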