An implementation of novel oversampling algorithms.

imbalanced-learn-extra

Introduction

imbalanced-learn-extra is a Python package that extends imbalanced-learn. It implements algorithms that are not included in imbalanced-learn because they are too recent or have a lower citation count. The current version includes the following:

  • A general interface for clustering-based oversampling algorithms.

  • The Geometric SMOTE algorithm. It is a geometrically enhanced drop-in replacement for SMOTE that handles numerical as well as categorical features.

Installation

For user installation, imbalanced-learn-extra is available on PyPI, and you can install it via pip:

pip install imbalanced-learn-extra

Development installation requires cloning the repository and then using PDM to install the project as well as the main and development dependencies:

git clone https://github.com/georgedouzas/imbalanced-learn-extra.git
cd imbalanced-learn-extra
pdm install

The SOM clusterer requires optional dependencies:

pip install imbalanced-learn-extra[som]

Usage

All classes included in imbalanced-learn-extra follow the imbalanced-learn API and use the functionality of its base oversampler. Following the scikit-learn convention, the data are represented as follows:

  • Input data X: 2D array-like or sparse matrices.
  • Targets y: 1D array-like.

The oversamplers implement a fit method to learn from X and y:

oversampler.fit(X, y)

They also implement a fit_resample method to resample X and y:

X_resampled, y_resampled = oversampler.fit_resample(X, y)
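
The fit / fit_resample contract described above can be illustrated with a small, dependency-free sketch. The class below, ToyRandomOverSampler, is a hypothetical stand-in written only for this example; it is not part of imbalanced-learn-extra or imbalanced-learn, and it simply duplicates minority samples at random instead of running any of the package's algorithms. The attribute name sampling_strategy_ is likewise illustrative. X is a 2D array-like (a list of rows) and y is a 1D array-like, as listed above.

```python
import random
from collections import Counter


class ToyRandomOverSampler:
    """Hypothetical stand-in that mimics the fit / fit_resample contract.

    It is NOT a class from imbalanced-learn-extra; it only duplicates
    existing minority samples so that the example has no dependencies.
    """

    def __init__(self, random_state=None):
        self.random_state = random_state

    def fit(self, X, y):
        # Learn how many samples each class is missing compared with
        # the majority class.
        counts = Counter(y)
        n_majority = max(counts.values())
        self.sampling_strategy_ = {
            label: n_majority - count for label, count in counts.items()
        }
        return self

    def fit_resample(self, X, y):
        # Fit first, then append randomly chosen copies of existing rows
        # until every class has as many samples as the majority class.
        self.fit(X, y)
        rng = random.Random(self.random_state)
        X_resampled, y_resampled = list(X), list(y)
        for label, n_missing in self.sampling_strategy_.items():
            candidates = [row for row, target in zip(X, y) if target == label]
            for _ in range(n_missing):
                X_resampled.append(rng.choice(candidates))
                y_resampled.append(label)
        return X_resampled, y_resampled


# Input data X: 2D array-like; targets y: 1D array-like.
X = [[0.0, 1.0], [0.2, 0.9], [0.1, 1.1], [0.3, 1.2], [5.0, 5.0], [5.2, 4.9]]
y = [0, 0, 0, 0, 1, 1]

oversampler = ToyRandomOverSampler(random_state=0)
X_resampled, y_resampled = oversampler.fit_resample(X, y)
print(Counter(y_resampled))
```

After resampling, both classes contain four samples. An oversampler from imbalanced-learn-extra would be called with the same two methods, but it would generate new synthetic samples instead of copying existing ones.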

Citing imbalanced-learn-extra

Publications using clustering-based oversampling:

Publications using Geometric-SMOTE:

  • Douzas, G., Bacao, F. (2019). Geometric SMOTE: a geometrically enhanced drop-in replacement for SMOTE. Information Sciences, 501, 118-135. https://doi.org/10.1016/j.ins.2019.06.007

  • Fonseca, J., Douzas, G., Bacao, F. (2021). Increasing the Effectiveness of Active Learning: Introducing Artificial Data Generation in Active Learning for Land Use/Land Cover Classification. Remote Sensing, 13(13), 2619. https://doi.org/10.3390/rs13132619

  • Douzas, G., Bacao, F., Fonseca, J., Khudinyan, M. (2019). Imbalanced Learning in Land Cover Classification: Improving Minority Classes’ Prediction Accuracy Using the Geometric SMOTE Algorithm. Remote Sensing, 11(24), 3040. https://doi.org/10.3390/rs11243040
