Additions to the imblearn package
Additions to the imbalanced-learn package.
from imblearn import pipeline
from imbutil.combine import MinMaxRandomSampler

# oversample minority classes up to 100 samples and undersample majority classes down to 800
sampler = MinMaxRandomSampler(min_freq=100, max_freq=800)
# inner_clf is any scikit-learn-compatible classifier defined elsewhere
sampling_clf = pipeline.make_pipeline(sampler, inner_clf)
1 Installation
pip install imbutil
MinMaxRandomSampler, like RandomUnderSampler and RandomOverSampler from imbalanced-learn, can technically be used with non-numeric data. However, the current implementation of imbalanced-learn enforces a numeric-data check for all samplers. If you want to bypass this limitation, I have a fork of the project which does not force data to be numeric. You can install it with:
pip install git+https://github.com/shaypal5/imbalanced-learn.git@f6adc562fafdc2198931873799e725e5abdd65a1
2 Basic Use
imbutil additions adhere to the structure of the imblearn package:
2.1 combine
Contains samplers that both under-sample and over-sample:
MinMaxRandomSampler - Randomly samples data to bring all class frequencies into a given range.
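The idea of bringing class frequencies into a range can be illustrated with a minimal pure-Python sketch. This is not imbutil's implementation: the function name min_max_resample, its seed parameter, and the exact sampling strategy (over-sampling with replacement while keeping the originals, under-sampling without replacement) are assumptions made for illustration only. Because it uses plain lists, it also works on non-numeric samples.

```python
import random
from collections import defaultdict


def min_max_resample(X, y, min_freq, max_freq, seed=0):
    """Resample (X, y) so every class count lies in [min_freq, max_freq].

    Hypothetical helper for illustration; not part of imbutil or imblearn.
    """
    rng = random.Random(seed)
    # Group sample indices by class label.
    indices_by_class = defaultdict(list)
    for i, label in enumerate(y):
        indices_by_class[label].append(i)

    chosen = []
    for label, idx in indices_by_class.items():
        if len(idx) < min_freq:
            # Over-sample: keep all originals, draw the shortfall with replacement.
            idx = idx + rng.choices(idx, k=min_freq - len(idx))
        elif len(idx) > max_freq:
            # Under-sample: draw max_freq distinct samples without replacement.
            idx = rng.sample(idx, max_freq)
        chosen.extend(idx)

    return [X[i] for i in chosen], [y[i] for i in chosen]


# Toy data with non-numeric features: 2 "rare", 5 "mid" and 20 "common" samples.
X = [f"doc-{i}" for i in range(27)]
y = ["rare"] * 2 + ["mid"] * 5 + ["common"] * 20
X_res, y_res = min_max_resample(X, y, min_freq=4, max_freq=10)
counts = {label: y_res.count(label) for label in set(y_res)}
```

In this toy run the "rare" class is raised to 4 samples, the "common" class is cut to 10, and the "mid" class, already inside the range, is left untouched.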
3 Contributing
The package author and current maintainer is Shay Palachy (shay.palachy@gmail.com); you are more than welcome to approach him for help. Contributions are very welcome.
3.1 Installing for development
Clone:
git clone git@github.com:shaypal5/imbutil.git
Install in development mode, and with test dependencies:
cd imbutil
pip install -e ".[test]"
3.2 Running the tests
To run the tests use:
cd imbutil
pytest
3.3 Adding documentation
The project is documented using the numpy docstring conventions, which were chosen as they are perhaps the most widespread conventions that are both supported by common tools such as Sphinx and result in human-readable docstrings. When documenting code you add to this project, follow these conventions.
Additionally, if you update this README.rst file, run python setup.py checkdocs to validate that it compiles.
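As a rough illustration of the numpy docstring conventions mentioned in this section, a documented function might look like the sketch below. The function class_frequencies is a hypothetical helper invented for this example and is not part of imbutil.

```python
def class_frequencies(labels):
    """Count how many times each class label occurs.

    Hypothetical helper used only to illustrate the docstring layout.

    Parameters
    ----------
    labels : iterable of hashable
        Class labels, one per sample.

    Returns
    -------
    dict
        Mapping from each distinct label to its number of occurrences.

    Examples
    --------
    >>> class_frequencies(["a", "b", "a"])
    {'a': 2, 'b': 1}
    """
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return counts
```

The Parameters, Returns and Examples sections, each underlined with dashes, are the parts that tools such as Sphinx can pick up when configured for numpy-style docstrings.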
4 Credits
Created by Shay Palachy (shay.palachy@gmail.com).
Download files
File details
Details for the file imbutil-0.0.8.tar.gz.
File metadata
- Download URL: imbutil-0.0.8.tar.gz
- Upload date:
- Size: 20.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.12.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.5.0 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6de5209f5029cce145d0bb729c995e2ba5543230387fc1114890b444e71f0a08 |
| MD5 | 2bac4745943cf889d8291a06fe9a9602 |
| BLAKE2b-256 | 1e6e3089538d8a69e20f40bb7f7a09a8569173901ad403af39e793450e3a8847 |
File details
Details for the file imbutil-0.0.8-py3-none-any.whl.
File metadata
- Download URL: imbutil-0.0.8-py3-none-any.whl
- Upload date:
- Size: 6.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.12.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.5.0 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5a26de7570e8c38857bfe3726f5826df32df5b858d5b9f71bd061fd2778553f1 |
| MD5 | a30e35ccd8ea592e972b3c4478606f80 |
| BLAKE2b-256 | 6fbbee119860e6b22fa6299cbe22815921d536111f15434e7d3d7824a18f1333 |