A collection of Machine Learning techniques for data management, engineering and augmentation.
Project description
DeepCoreML is a collection of Machine Learning techniques for data management, engineering, and augmentation. More specifically, DeepCoreML includes modules for:
- Data management
- Text data preprocessing
- Text representation, vectorization, embeddings
- Dimensionality reduction
- Generative modeling
- Imbalanced datasets
Licence: Apache License, 2.0 (Apache-2.0)
Dependencies: NumPy, pandas, Natural Language Toolkit (nltk), Matplotlib, seaborn, Gensim, joblib, Reversible Data Transforms (RDT), bs4, scikit-learn, imblearn, pytorch, transformers, Synthetic Data Vault
GitHub repository: https://github.com/lakritidis/DeepCoreML
Publications:
- L. Akritidis, P. Bozanis, "A Clustering-Based Resampling Technique with Cluster Structure Analysis for Software Defect Detection in Imbalanced Datasets", Information Sciences, vol. 674, pp. 120724, 2024.
- L. Akritidis, A. Fevgas, M. Alamaniotis, P. Bozanis, "Conditional Data Synthesis with Deep Generative Models for Imbalanced Dataset Oversampling", In Proceedings of the 35th IEEE International Conference on Tools with Artificial Intelligence (ICTAI), pp. 444-451, 2023.
- L. Akritidis, P. Bozanis, "A Multi-Dimensional Survey on Learning from Imbalanced Data", Chapter in Machine Learning Paradigms - Advances in Theory and Applications of Learning from Imbalanced Data, to appear, 2023.
- L. Akritidis, P. Bozanis, "Low Dimensional Text Representations for Sentiment Analysis NLP Tasks", Springer Nature (SN) Computer Science, vol. 4, no. 5, 474, 2023.
Download files
Source Distribution
DeepCoreML-0.4.0.tar.gz (57.4 kB)
File details
Details for the file DeepCoreML-0.4.0.tar.gz.
File metadata
- Download URL: DeepCoreML-0.4.0.tar.gz
- Upload date:
- Size: 57.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.8.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | bfc23a32b127407bf468befef188d2767adc0e2dd3b5443110797a2111aa63e2
MD5 | 2c9b3c86fa841bb4eca438aba7f2b109
BLAKE2b-256 | 079fe4c70a2eb0daaa17aa23e712c949189fe3c045d1c2992ced87b7340ea14b
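The published digests can be used to check that a downloaded archive is intact. The following is a minimal sketch using only Python's standard `hashlib` module. It is not part of DeepCoreML: the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local path `DeepCoreML-0.4.0.tar.gz` assumes the archive was saved to the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest copied from the "File hashes" table for DeepCoreML-0.4.0.tar.gz.
EXPECTED_SHA256 = "bfc23a32b127407bf468befef188d2767adc0e2dd3b5443110797a2111aa63e2"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals `expected` (case-insensitive)."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("DeepCoreML-0.4.0.tar.gz")  # assumed local download location
    if archive.exists():
        print("hash OK" if matches_published_hash(archive) else "hash MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

SHA256 is used here rather than MD5 because MD5 is no longer considered collision-resistant; the same pattern works for the BLAKE2b-256 digest with `hashlib.blake2b(digest_size=32)`.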