LoRAS: An oversampling approach for imbalanced datasets

LoRAS

Localized Random Affine Shadowsampling

This repo provides a Python implementation of Localized Random Affine Shadowsampling (LoRAS), an oversampling technique for imbalanced datasets. The implementation builds on the imbalanced-learn package and aims to be as compatible with it as possible.
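For intuition, the idea behind the name can be sketched in plain NumPy: take a small neighborhood of minority-class points, draw noisy copies of them ("shadowsamples"), and output a random convex (affine) combination of a few of those copies. This is a simplified illustration and not the code used by pyloras. The function name `loras_sketch` and the `std` and `n_generated` arguments are hypothetical; `n_neighbors`, `n_shadow` and `n_affine` are named after the flags of the comparison script mentioned in this README. Neighborhoods here use plain Euclidean distance in the input space, which is an assumption made to keep the sketch short.

```python
import numpy as np


def loras_sketch(X_min, n_neighbors=5, n_shadow=10, n_affine=3,
                 std=0.005, n_generated=100, seed=0):
    """Illustrative LoRAS-style oversampling of one minority class."""
    rng = np.random.default_rng(seed)
    n, d = X_min.shape
    k = min(n_neighbors, n)

    # Neighborhood of each point: its k closest minority points (itself included).
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    neighborhoods = np.argsort(dists, axis=1)[:, :k]

    out = np.empty((n_generated, d))
    for i in range(n_generated):
        # Pick a random minority point and look up its neighborhood.
        hood = X_min[neighborhoods[rng.integers(n)]]
        # Shadowsamples: n_shadow Gaussian-perturbed copies of every neighbor.
        shadows = hood[:, None, :] + rng.normal(0.0, std, size=(k, n_shadow, d))
        shadows = shadows.reshape(-1, d)
        # Random convex combination of n_affine distinct shadowsamples.
        chosen = shadows[rng.choice(len(shadows), size=n_affine, replace=False)]
        weights = rng.dirichlet(np.ones(n_affine))  # non-negative, sums to 1
        out[i] = weights @ chosen
    return out


X_min = np.random.default_rng(1).normal(size=(20, 2))
synthetic = loras_sketch(X_min, n_generated=50)
print(synthetic.shape)  # (50, 2)
```

Because the weights are non-negative and sum to one, each synthetic point lies inside the convex hull of the chosen shadowsamples, so it stays close to the local minority neighborhood.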

Dependencies

  • imbalanced-learn

Installation

Using pip:

$ pip install -U pyloras

Installing from source requires poetry. With poetry installed, run the following shell commands:

$ git clone https://github.com/zoj613/pyloras.git
$ cd pyloras/
$ poetry install
# add package to python's path
$ export PYTHONPATH=$PWD:$PYTHONPATH 

Usage

from collections import Counter
from pyloras import LORAS
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=20000, n_features=5, n_informative=5,
                           n_redundant=0, n_repeated=0, n_classes=3,
                           n_clusters_per_class=1,
                           weights=[0.01, 0.05, 0.94],
                           class_sep=0.8, random_state=0)

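# Seed the sampler for reproducibility; embedding_params configures the
# embedding step (here its perplexity and number of iterations).
lrs = LORAS(random_state=0, embedding_params={'perplexity': 35, 'n_iter': 250})

# Class counts before resampling.
print(sorted(Counter(y).items()))
# [(0, 270), (1, 1056), (2, 18674)]

# Oversample the minority classes, then show the class counts again.
X_resampled, y_resampled = lrs.fit_resample(X, y)
print(sorted(Counter(y_resampled.astype(int)).items()))
# [(0, 18674), (1, 18674), (2, 18674)]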
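The output above shows every class brought up to the size of the largest one. As a rough sketch of the bookkeeping this implies, the snippet below computes how many synthetic samples each minority class needs in order to reach the majority count. The helper `samples_needed` is hypothetical and not part of pyloras or imbalanced-learn; the class counts are the ones printed in the example.

```python
from collections import Counter


def samples_needed(y):
    """Number of samples each class is missing relative to the largest class."""
    counts = Counter(y)
    target = max(counts.values())
    return {label: target - n for label, n in sorted(counts.items()) if n < target}


# Rebuild a label list with the class counts printed in the usage example.
counts = {0: 270, 1: 1056, 2: 18674}
y = [label for label, n in counts.items() for _ in range(n)]

print(samples_needed(y))
# {0: 18404, 1: 17618}
```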

Visualization

Below is a comparison of imbalanced-learn's SMOTE implementation with LORAS, both using default parameters, on the dummy data from the Usage section.

The plots can be reproduced by running:

$ python scripts/compare_oversamplers.py --n_neighbors=<optional> --n_shadow=<optional> --n_affine=<optional>

References

Bej, S., Davtyan, N., Wolfien, M. et al. LoRAS: an oversampling approach for imbalanced datasets. Mach Learn 110, 279–301 (2021). https://doi.org/10.1007/s10994-020-05913-4
