An interface to apply your favourite re-sampler on regression tasks.
Project description
Regression ReSampling
An interface to apply your favourite resampler on regression tasks. Currently supports all resampling techniques present in imblearn
Why does this exist?
While we were working on a regression task, we realized that the target variable was skewed, i.e., most samples were present in a particular range. One can easily solve the skew problem for classification tasks via a slew of resampling techniques (either under or over sampling) but this luxury is unavailable for regression tasks. We therefore decided to create an interface that can repurpose all resampling techniques for classification problems to regression problems!
How to install?
pip install reg_resampler
How to use it?
Parameters
- X - Either a pandas dataframe or numpy matrix. Data to be resampled
- target - Either string (for pandas) or index (for numpy). The target variable to be resampled
- bins=3 - The number of classes that the user wants to generate (Default: 3)
- balanced_binning=False - Decides whether samples are to be distributed roughly equally across all classes (Default: False)
- sampler_obj - Resampling object. Currently, it supports all resampling techniques present in imblearn.
Functions
- fit - Adds classes to each sample and returns an augmented dataframe/numpy matrix (based on input type)
- resample - Performs resampling and returns the resampled dataframe/numpy matrix. It also merges classes when required
Important Note: All functions return the same data type as provided in input
How to import?
from reg_resampler import resampler
How to use?
# Initialize the resampler object
rs = resampler()
# Returned dataframe contains a new column called "classes"
# The function also prints the class distribution
# This is for analysis purpose
df_train_ = rs.fit(df_train, target, bins=7)
# Create a smote (over-sampling) object from imblearn
smote = SMOTE(random_state=27)
# Now resample
# You might recieve warning about class merger for low sample classes
df_train_ = rs.resample(smote)
df_train_.head()
Future Ideas
- Support for more resampling techniques
Feature Request
Drop us an email at atif.hit.hassan@gmail.com or pvsaikrithik@gmail.com if you want any particular feature
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for reg_resampler-1.0.4-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 13ebd9ebb61c8fdcaa396945b8807c5a3160b9f89bf68f56370faf6812f61f22 |
|
MD5 | 98b45dc9e18ca75823fefbf327824eb9 |
|
BLAKE2b-256 | a1e4f6bf205ea6bc624fdb34bc9572bb3a28f469a19f50962f6b38a938cbc52b |