A collection of Machine Learning techniques for data management, engineering and augmentation.
Project description
DeepCoreML is a collection of Machine Learning techniques for data management, engineering, and augmentation. More specifically, DeepCoreML includes modules for:
- Data management
- Text data preprocessing
- Text representation, vectorization, embeddings
- Dimensionality reduction
- Generative modeling
- Imbalanced datasets
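The description does not show DeepCoreML's own API, so the following is only a library-independent illustration of the class-imbalance problem that the last module targets. It is a minimal sketch of random oversampling written with the Python standard library alone; the function name `random_oversample` and the toy data are hypothetical and are not part of DeepCoreML, whose resampling techniques (clustering-based and generative, per the publications listed on this page) are more elaborate.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class.

    Hypothetical helper for illustration only; not a DeepCoreML function.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # size of the majority class

    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        # Collect the rows that belong to this class.
        rows = [x for x, lbl in zip(X, y) if lbl == label]
        # Draw (with replacement) as many extra rows as needed to reach target.
        for _ in range(target - count):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


# Toy dataset: four samples of class 0, two samples of class 1.
X = [[0.1, 1.0], [0.2, 0.9], [0.3, 1.1], [0.4, 0.8], [5.0, 5.1], [5.2, 4.9]]
y = [0, 0, 0, 0, 1, 1]

X_res, y_res = random_oversample(X, y)
print(Counter(y_res))
```

Random duplication is the simplest baseline: it balances the class counts but adds no new information, which is the limitation that synthetic-sample approaches aim to address.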
Licence: Apache License, 2.0 (Apache-2.0)
Dependencies: NumPy, pandas, Natural Language Toolkit (nltk), Matplotlib, seaborn, Gensim, joblib, Reversible Data Transforms (RDT), bs4, scikit-learn, imblearn, PyTorch, transformers, Synthetic Data Vault
GitHub repository: https://github.com/lakritidis/DeepCoreML
Publications:
- L. Akritidis, P. Bozanis, "A Clustering-Based Resampling Technique with Cluster Structure Analysis for Software Defect Detection in Imbalanced Datasets", Information Sciences, vol. 674, pp. 120724, 2024.
- L. Akritidis, A. Fevgas, M. Alamaniotis, P. Bozanis, "Conditional Data Synthesis with Deep Generative Models for Imbalanced Dataset Oversampling", In Proceedings of the 35th IEEE International Conference on Tools with Artificial Intelligence (ICTAI), pp. 444-451, 2023.
- L. Akritidis, P. Bozanis, "A Multi-Dimensional Survey on Learning from Imbalanced Data", Chapter in Machine Learning Paradigms - Advances in Theory and Applications of Learning from Imbalanced Data, to appear, 2023.
- L. Akritidis, P. Bozanis, "Low Dimensional Text Representations for Sentiment Analysis NLP Tasks", Springer Nature (SN) Computer Science, vol. 4, no. 5, 474, 2023.
Source distribution: DeepCoreML-0.4.0.tar.gz (57.4 kB)