Artificial Tabular Data Synthesizers

Project description

ARTSyn is a library of models and algorithm implementations for synthesizing artificial tabular data. Such synthetic data is often useful in classification and regression tasks that involve imbalanced datasets, for example fault/defect detection, intrusion detection, medical diagnosis, and financial prediction.

Most models in ARTSyn support conditional data generation, that is, generating data instances that belong to a particular class. The models accept tabular data in CSV format together with information about the column structure (e.g. which columns hold numeric or discrete values, which column holds the class). They are then trained to generate additional samples, either from a specific class or without any condition. For the moment, ARTSyn focuses on Generative Adversarial Networks (GANs), but more models and algorithms will be supported in the future.
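
To make the inputs described above concrete, the following minimal sketch reads a small CSV table, records which columns are numeric, discrete, or the class column, and counts how many extra samples each minority class would need to match the majority class. It deliberately does not call ARTSyn: the `CSV_TEXT` data, the `COLUMN_ROLES` dictionary layout, and the `class_deficits` helper are all hypothetical and only illustrate the kind of information a conditional generator would be given.

```python
import csv
import io
from collections import Counter

# Hypothetical CSV content standing in for a real file on disk.
CSV_TEXT = """age,income,segment,label
25,31000,A,0
41,52000,B,0
37,47000,A,0
52,61000,C,0
29,35000,B,0
46,58000,C,0
33,29000,A,1
58,33000,B,1
"""

# Hypothetical description of the column structure.
# This is an illustration only, NOT ARTSyn's actual schema format.
COLUMN_ROLES = {
    "numeric": ["age", "income"],
    "discrete": ["segment"],
    "class": "label",
}


def class_deficits(rows, class_col):
    """Return, per minority class, how many samples it lacks
    compared with the most frequent class."""
    counts = Counter(row[class_col] for row in rows)
    majority = max(counts.values())
    return {cls: majority - n for cls, n in counts.items() if n < majority}


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
deficits = class_deficits(rows, COLUMN_ROLES["class"])
print(deficits)  # {'1': 4}
```

In this toy table, class `1` has two rows against six for class `0`, so a conditional generator would be asked for four additional class-`1` samples to balance the dataset.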

Licence: Apache License, 2.0 (Apache-2.0)

Dependencies: NumPy, Pandas, Matplotlib, Seaborn, joblib, Synthetic Data Vault (SDV), PyTorch, scikit-learn, xgboost, imblearn, Reversible Data Transforms (RDT), tqdm.

GitHub repository: https://github.com/lakritidis/artsyn

Publications:

  • L. Akritidis, P. Bozanis, "A Clustering-Based Resampling Technique with Cluster Structure Analysis for Software Defect Detection in Imbalanced Datasets", Information Sciences, vol. 674, pp. 120724, 2024.
  • L. Akritidis, A. Fevgas, M. Alamaniotis, P. Bozanis, "Conditional Data Synthesis with Deep Generative Models for Imbalanced Dataset Oversampling", In Proceedings of the 35th IEEE International Conference on Tools with Artificial Intelligence, pp. 444-451, 2023.
  • L. Akritidis, P. Bozanis, "A Multi-Dimensional Survey on Learning from Imbalanced Data", Chapter in International Conference on Information, Intelligence, Systems, and Applications, pp. 13-45, 2024.

Download files

Source Distribution

artsyn-0.5.1.tar.gz (108.7 kB view details)

Uploaded Source

File details

Details for the file artsyn-0.5.1.tar.gz.

File metadata

  • Download URL: artsyn-0.5.1.tar.gz
  • Upload date:
  • Size: 108.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.8.10

File hashes

Hashes for artsyn-0.5.1.tar.gz
Algorithm Hash digest
SHA256 6bc74ac6bfe867c9c30457db4b5320ca5a93b2cc254f64c9a3684dc34886162d
MD5 8dc15c479d520726507d3f6c011cec5e
BLAKE2b-256 1dd097d179457d7b50ba21efae23dcf0f1e5ac15a4ec6349b717b447eb6dc4dc
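
A downloaded archive can be checked against the SHA256 digest listed above. The sketch below is one way to do that with Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_expected` are illustrative, and the local path passed to them is an assumption about where the file was saved.

```python
import hashlib

# SHA256 digest listed above for artsyn-0.5.1.tar.gz.
EXPECTED_SHA256 = "6bc74ac6bfe867c9c30457db4b5320ca5a93b2cc254f64c9a3684dc34886162d"


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks
    so that large archives do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected one."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("artsyn-0.5.1.tar.gz")` would return `True` only if the file in the current directory (an assumed location) is byte-for-byte the archive whose digest is listed here.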
