
A general interface for clustering based over-sampling algorithms.






For user installation, cluster-over-sampling is available on PyPI and can be installed via pip:

pip install cluster-over-sampling

Development installation requires cloning the repository and then using PDM to install the project along with its main and development dependencies:

git clone
cd cluster-over-sampling
pdm install

The SOM clusterer requires optional dependencies, which are installed with the som extra:

pip install cluster-over-sampling[som]


All the classes included in cluster-over-sampling follow the imbalanced-learn API and build on the functionality of its base oversampler. Following the scikit-learn convention, the data are represented as follows:

  • Input data X: 2D array-like or sparse matrices.
  • Targets y: 1D array-like.

The clustering-based oversamplers implement a fit method to learn from X and y:

clustering_based_oversampler.fit(X, y)

They also implement a fit_resample method to resample X and y:

X_resampled, y_resampled = clustering_based_oversampler.fit_resample(X, y)
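
To make the fit / fit_resample convention concrete, here is a minimal pure-Python sketch. The class ToyClusterOverSampler and its parameters n_clusters and random_state are hypothetical names invented for this illustration; they are not part of cluster-over-sampling, and the logic (a few k-means iterations followed by duplication of minority samples drawn cluster by cluster) is a simplified stand-in rather than the library's algorithm.

```python
import random
from collections import Counter


class ToyClusterOverSampler:
    """Hypothetical toy sampler; NOT part of cluster-over-sampling."""

    def __init__(self, n_clusters=2, random_state=0):
        self.n_clusters = n_clusters
        self.random_state = random_state

    def _nearest(self, row):
        # Index of the closest centroid by squared Euclidean distance.
        dists = [sum((a - b) ** 2 for a, b in zip(row, c)) for c in self.centers_]
        return dists.index(min(dists))

    def fit(self, X, y):
        """Learn cluster centroids and the size of the largest class."""
        rng = random.Random(self.random_state)
        self.centers_ = [list(row) for row in rng.sample(X, self.n_clusters)]
        for _ in range(10):  # a few Lloyd (k-means) iterations
            groups = [[] for _ in self.centers_]
            for row in X:
                groups[self._nearest(row)].append(row)
            # Recompute each centroid; keep the old one if its group is empty.
            self.centers_ = [
                [sum(col) / len(g) for col in zip(*g)] if g else c
                for g, c in zip(groups, self.centers_)
            ]
        self.target_count_ = max(Counter(y).values())
        return self

    def fit_resample(self, X, y):
        """Fit, then duplicate minority samples until classes are balanced."""
        self.fit(X, y)
        rng = random.Random(self.random_state)
        X_res, y_res = [list(r) for r in X], list(y)
        for label, count in Counter(y).items():
            # Group this class's rows by the cluster they fall into.
            by_cluster = {}
            for row, target in zip(X, y):
                if target == label:
                    by_cluster.setdefault(self._nearest(row), []).append(row)
            clusters = list(by_cluster.values())
            for _ in range(self.target_count_ - count):
                # Pick a cluster, then copy one of its rows for this class.
                X_res.append(list(rng.choice(rng.choice(clusters))))
                y_res.append(label)
        return X_res, y_res


# Input data X: 2D array-like; targets y: 1D array-like.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0],
     [5.1, 4.9], [5.2, 5.1], [0.3, 0.3], [4.9, 5.2]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

sampler = ToyClusterOverSampler(n_clusters=2, random_state=0)
X_resampled, y_resampled = sampler.fit_resample(X, y)
```

With the real package, the same two calls would be made on one of its oversampler objects, and X and y would typically be NumPy arrays or sparse matrices rather than nested lists.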


If you use cluster-over-sampling in a scientific publication, we would appreciate citations to any of the following papers:

