An implementation of novel oversampling algorithms.
imbalanced-learn-extra
Introduction
imbalanced-learn-extra is a Python package that extends imbalanced-learn. It implements algorithms that are not included in
imbalanced-learn due to their novelty or lower citation count. The current version includes the following:
- A general interface for clustering-based oversampling algorithms.
- The Geometric SMOTE algorithm. It is a geometrically enhanced drop-in replacement for SMOTE that handles numerical as well as categorical features.
Installation
For user installation, imbalanced-learn-extra is currently available on PyPI, and you can
install it via pip:
pip install imbalanced-learn-extra
Development installation requires cloning the repository and then using PDM to install the project as well as the main and development dependencies:
git clone https://github.com/georgedouzas/imbalanced-learn-extra.git
cd imbalanced-learn-extra
pdm install
The SOM clusterer requires optional dependencies:
pip install imbalanced-learn-extra[som]
Usage
All the classes included in imbalanced-learn-extra follow the imbalanced-learn API using the functionality of the base
oversampler. Using scikit-learn convention, the data are represented as follows:
- Input data X: 2D array-like or sparse matrix.
- Targets y: 1D array-like.
The oversamplers implement a fit method to learn from X and y:
oversampler.fit(X, y)
They also implement a fit_resample method to resample X and y:
X_resampled, y_resampled = clustering_based_oversampler.fit_resample(X, y)
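To make the fit / fit_resample convention concrete, here is a minimal sketch that uses only the Python standard library. ToyRandomOverSampler is a hypothetical stand-in written for this illustration; it is not a class from imbalanced-learn-extra and it only duplicates existing minority rows. The package's own oversamplers are expected to be called the same way.

```python
import random
from collections import Counter


class ToyRandomOverSampler:
    """Hypothetical stand-in that mimics the fit / fit_resample convention."""

    def __init__(self, random_state=0):
        self.random_state = random_state

    def fit(self, X, y):
        # Learn how many samples each class needs to match the majority class.
        counts = Counter(y)
        n_majority = max(counts.values())
        self.sampling_strategy_ = {
            label: n_majority - n for label, n in counts.items() if n < n_majority
        }
        return self

    def fit_resample(self, X, y):
        self.fit(X, y)
        rng = random.Random(self.random_state)
        X_resampled, y_resampled = list(X), list(y)
        for label, n_missing in self.sampling_strategy_.items():
            # Duplicate randomly chosen rows of the under-represented class.
            rows = [row for row, target in zip(X, y) if target == label]
            for _ in range(n_missing):
                X_resampled.append(list(rng.choice(rows)))
                y_resampled.append(label)
        return X_resampled, y_resampled


# Four majority samples (class 0) and one minority sample (class 1).
X = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7], [5.0, 5.1]]
y = [0, 0, 0, 0, 1]

oversampler = ToyRandomOverSampler(random_state=0)
X_resampled, y_resampled = oversampler.fit_resample(X, y)
print(Counter(y_resampled))  # both classes now have the same number of samples
```

The original samples are kept in place and the generated ones are appended, so X_resampled and y_resampled stay aligned row by row.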
Citing imbalanced-learn-extra
Publications using clustering-based oversampling:
- G. Douzas, F. Bacao, "Self-Organizing Map Oversampling (SOMO) for imbalanced data set learning", Expert Systems with Applications, vol. 82, pp. 40-52, 2017.
- G. Douzas, F. Bacao, F. Last, "Improving imbalanced learning through a heuristic oversampling method based on k-means and SMOTE", Information Sciences, vol. 465, pp. 1-20, 2018.
- G. Douzas, F. Bacao, F. Last, "G-SOMO: An oversampling approach based on self-organized maps and geometric SMOTE", Expert Systems with Applications, vol. 183, 115230, 2021.
Publications using Geometric-SMOTE:
- Douzas, G., Bacao, F. (2019). Geometric SMOTE: a geometrically enhanced drop-in replacement for SMOTE. Information Sciences, 501, 118-135. https://doi.org/10.1016/j.ins.2019.06.007
- Fonseca, J., Douzas, G., Bacao, F. (2021). Increasing the Effectiveness of Active Learning: Introducing Artificial Data Generation in Active Learning for Land Use/Land Cover Classification. Remote Sensing, 13(13), 2619. https://doi.org/10.3390/rs13132619
- Douzas, G., Bacao, F., Fonseca, J., Khudinyan, M. (2019). Imbalanced Learning in Land Cover Classification: Improving Minority Classes’ Prediction Accuracy Using the Geometric SMOTE Algorithm. Remote Sensing, 11(24), 3040. https://doi.org/10.3390/rs11243040
User Support
If you encounter a bug, have a question, or would like to request a new feature, you can get support through the project’s GitHub issue tracker.
- Report a bug: Open a new issue and describe the problem, including steps to reproduce it and your environment details.
- Request a feature: Open a new issue describing the functionality you’d like to see added.
- Ask a question or request help: Use the Q&A discussion board for general usage questions or clarifications.
Before opening a new issue, please check the existing issues to see if your question or problem has already been addressed.