Artificial Tabular Data Synthesizers

Project description

ARTSyn is a library of models and algorithm implementations for synthesizing artificial tabular data. Such synthetic data are useful in classification and regression tasks that involve imbalanced datasets, for example fault/defect detection, intrusion detection, medical diagnosis, and financial prediction.

Most models in ARTSyn support conditional data generation, that is, the generation of data instances that belong to a particular class. The models accept tabular data in CSV format, together with information about the column structure (e.g. which columns hold numeric or discrete values and which column holds the class). They are then trained to generate additional samples, either from a specific class or unconditionally. For the moment, ARTSyn focuses on Generative Adversarial Networks (GANs), but more models and algorithms will be supported in the future.
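To make the idea of conditional generation concrete, the following is a minimal sketch that does not use ARTSyn's API and is not a GAN. It assumes a toy in-memory table (a CSV file could be loaded with pandas.read_csv instead) and a hypothetical helper, sample_conditionally, which draws numeric columns from a per-class Gaussian and discrete columns from the per-class empirical frequencies. All column names and values are invented for illustration.

```python
import numpy as np
import pandas as pd


def sample_conditionally(df, class_col, target_class,
                         numeric_cols, discrete_cols, n_samples, seed=0):
    """Toy class-conditional sampler (illustrative only, not ARTSyn's method).

    Each column is sampled independently, so correlations between
    columns are ignored; a trained generative model would capture them.
    """
    rng = np.random.default_rng(seed)

    # Condition on the class: keep only the rows of the requested class.
    subset = df[df[class_col] == target_class]
    if subset.empty:
        raise ValueError(f"no rows with {class_col} == {target_class!r}")

    synthetic = {}

    # Numeric columns: Gaussian with the class-specific mean and std.
    for col in numeric_cols:
        mean = subset[col].mean()
        std = subset[col].std(ddof=0)
        synthetic[col] = rng.normal(mean, std, size=n_samples)

    # Discrete columns: sample values with their class-specific frequencies.
    for col in discrete_cols:
        freqs = subset[col].value_counts(normalize=True)
        synthetic[col] = rng.choice(freqs.index.to_numpy(),
                                    size=n_samples, p=freqs.to_numpy())

    # Every generated row carries the requested class label.
    synthetic[class_col] = [target_class] * n_samples
    return pd.DataFrame(synthetic)


# Hypothetical imbalanced table: class 1 is the minority class.
df = pd.DataFrame({
    "amount": [10.0, 12.0, 11.0, 250.0, 300.0],
    "channel": ["web", "web", "app", "app", "app"],
    "label": [0, 0, 0, 1, 1],
})

# Generate five extra minority-class rows.
synthetic = sample_conditionally(df, "label", 1,
                                 numeric_cols=["amount"],
                                 discrete_cols=["channel"],
                                 n_samples=5)
```

Appending such rows to the original table is the basic oversampling step; the models described above would replace the naive per-column sampler with a trained generator.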

Licence: Apache License, 2.0 (Apache-2.0)

Dependencies: NumPy, Pandas, Matplotlib, Seaborn, joblib, Synthetic Data Vault (SDV), PyTorch, scikit-learn, xgboost, imblearn, Reversible Data Transforms (RDT), tqdm.

GitHub repository: https://github.com/lakritidis/artsyn

Publications:

  • L. Akritidis, P. Bozanis, "A Clustering-Based Resampling Technique with Cluster Structure Analysis for Software Defect Detection in Imbalanced Datasets", Information Sciences, vol. 674, pp. 120724, 2024.
  • L. Akritidis, A. Fevgas, M. Alamaniotis, P. Bozanis, "Conditional Data Synthesis with Deep Generative Models for Imbalanced Dataset Oversampling", In Proceedings of the 35th IEEE International Conference on Tools with Artificial Intelligence, pp. 444-451, 2023.
  • L. Akritidis, P. Bozanis, "A Multi-Dimensional Survey on Learning from Imbalanced Data", Chapter in International Conference on Information, Intelligence, Systems, and Applications, pp. 13-45, 2024.

Download files

Source Distribution

artsyn-0.5.2.tar.gz (108.7 kB view details)

Uploaded Source

File details

Details for the file artsyn-0.5.2.tar.gz.

File metadata

  • Download URL: artsyn-0.5.2.tar.gz
  • Upload date:
  • Size: 108.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.8.10

File hashes

Hashes for artsyn-0.5.2.tar.gz

  • SHA256: b7d199a0e80c89dd703a751910ad4723e4a81a2d4c58d33688d53ee03b1d0e94
  • MD5: 68998392ae6b4dbf912b2dc82c227dc1
  • BLAKE2b-256: 61a19d403329ae920aec749b220a798637c4734f50eb967ee6152388c132b051
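
A downloaded archive can be checked against the published SHA256 digest with Python's standard hashlib module. The sketch below assumes the archive has been saved locally; the helper names sha256_of_file and matches_published_hash are illustrative and not part of ARTSyn.

```python
import hashlib

# SHA256 digest published above for artsyn-0.5.2.tar.gz.
EXPECTED_SHA256 = "b7d199a0e80c89dd703a751910ad4723e4a81a2d4c58d33688d53ee03b1d0e94"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

Calling matches_published_hash("artsyn-0.5.2.tar.gz") on the local copy should return True for an unmodified download; a False result suggests a corrupted or altered file.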
