An implementation of novel oversampling algorithms.

imbalanced-learn-extra

Introduction

imbalanced-learn-extra is a Python package that extends imbalanced-learn. It implements algorithms that are not included in imbalanced-learn because they are new or have a lower citation count. The current version includes the following:

  • A general interface for clustering-based oversampling algorithms.

  • The Geometric SMOTE algorithm, a geometrically enhanced drop-in replacement for SMOTE that handles numerical as well as categorical features.

Installation

For user installation, imbalanced-learn-extra is available on PyPI and can be installed via pip:

pip install imbalanced-learn-extra

Development installation requires cloning the repository and then using PDM to install the project as well as the main and development dependencies:

git clone https://github.com/georgedouzas/imbalanced-learn-extra.git
cd imbalanced-learn-extra
pdm install

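The SOM clusterer requires optional dependencies, which are installed through the som extra (the quotes keep shells such as zsh from interpreting the brackets):

pip install "imbalanced-learn-extra[som]"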

Usage

All the classes included in imbalanced-learn-extra follow the imbalanced-learn API and use the functionality of the base oversampler. Following the scikit-learn convention, the data are represented as follows:

  • Input data X: 2D array-like or sparse matrix.
  • Targets y: 1D array-like.

The oversamplers implement a fit method to learn from X and y:

oversampler.fit(X, y)

They also implement a fit_resample method to resample X and y:

X_resampled, y_resampled = oversampler.fit_resample(X, y)
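
To make the fit / fit_resample contract concrete, here is a minimal, dependency-free sketch. ToyRandomOverSampler is a hypothetical stand-in written only for this illustration; it is not part of imbalanced-learn-extra and simply duplicates minority rows at random, whereas the package's oversamplers generate synthetic samples. The attribute name sampling_strategy_ is likewise an assumption of this sketch.

```python
import random
from collections import Counter


class ToyRandomOverSampler:
    """Hypothetical stand-in that mimics the fit / fit_resample interface.

    Not part of imbalanced-learn-extra: it only duplicates minority rows.
    """

    def __init__(self, random_state=None):
        self.random_state = random_state

    def fit(self, X, y):
        # Learn how many extra samples each class needs to match the majority.
        counts = Counter(y)
        n_majority = max(counts.values())
        self.sampling_strategy_ = {
            label: n_majority - n for label, n in counts.items() if n < n_majority
        }
        return self

    def fit_resample(self, X, y):
        # Fit, then append randomly chosen copies of minority rows.
        self.fit(X, y)
        rng = random.Random(self.random_state)
        X_resampled, y_resampled = list(X), list(y)
        for label, n_extra in self.sampling_strategy_.items():
            rows = [row for row, target in zip(X, y) if target == label]
            for _ in range(n_extra):
                X_resampled.append(list(rng.choice(rows)))
                y_resampled.append(label)
        return X_resampled, y_resampled


# X: 2D array-like (one row per sample); y: 1D array-like of class labels.
X = [[0.0, 0.1], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2], [2.0, 2.1], [2.2, 1.9]]
y = [0, 0, 0, 0, 1, 1]

oversampler = ToyRandomOverSampler(random_state=0)
X_resampled, y_resampled = oversampler.fit_resample(X, y)

print(Counter(y))            # class 0 has 4 samples, class 1 has 2
print(Counter(y_resampled))  # both classes now have 4 samples
```

The original samples come first in the output and the new minority samples are appended after them. An oversampler from the package is called in the same way, with X and y in the format described above.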

Citing imbalanced-learn-extra

Publications using clustering-based oversampling:

Publications using Geometric-SMOTE:

  • Douzas, G., Bacao, F. (2019). Geometric SMOTE: a geometrically enhanced drop-in replacement for SMOTE. Information Sciences, 501, 118-135. https://doi.org/10.1016/j.ins.2019.06.007

  • Fonseca, J., Douzas, G., Bacao, F. (2021). Increasing the Effectiveness of Active Learning: Introducing Artificial Data Generation in Active Learning for Land Use/Land Cover Classification. Remote Sensing, 13(13), 2619. https://doi.org/10.3390/rs13132619

  • Douzas, G., Bacao, F., Fonseca, J., Khudinyan, M. (2019). Imbalanced Learning in Land Cover Classification: Improving Minority Classes’ Prediction Accuracy Using the Geometric SMOTE Algorithm. Remote Sensing, 11(24), 3040. https://doi.org/10.3390/rs11243040

User Support

If you encounter a bug, have a question, or would like to request a new feature, you can get support through the project’s GitHub issue tracker.

  • Report a bug: Open a new issue and describe the problem, including steps to reproduce it and your environment details.
  • Request a feature: Open a new issue describing the functionality you’d like to see added.
  • Ask a question or request help: Use the Q&A discussion board for general usage questions or clarifications.

Before opening a new issue, please check the existing issues to see if your question or problem has already been addressed.
