Balanced splitting utility
balanced-splits
A utility library for splitting datasets in a balanced manner with respect to several features.
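To make "balanced with respect to several features" concrete, here is a minimal sketch of one naive approach: draw many random half/half splits and keep the one whose partitions differ least. This is an illustration only and is not claimed to be how balanced-splits works internally. The helper names `imbalance` and `random_search_split`, the scoring rule, and the `n_trials` and `seed` parameters are all assumptions made for this example.

```python
import numpy as np
import pandas as pd


def imbalance(a: pd.DataFrame, b: pd.DataFrame) -> float:
    """Hypothetical score: 0 means both parts look identical on every column."""
    score = 0.0
    for col in a.columns:
        if pd.api.types.is_numeric_dtype(a[col]):
            # Difference of means, scaled by the standard deviation of the pooled data.
            pooled = pd.concat([a[col], b[col]]).std()
            if pooled > 0:
                score += abs(a[col].mean() - b[col].mean()) / pooled
        else:
            # Largest gap between the category proportions of the two parts.
            pa = a[col].value_counts(normalize=True)
            pb = b[col].value_counts(normalize=True)
            score += pa.sub(pb, fill_value=0.0).abs().max()
    return float(score)


def random_search_split(df: pd.DataFrame, n_trials: int = 200, seed: int = 0):
    """Hypothetical baseline: keep the best of n_trials random half/half splits."""
    rng = np.random.default_rng(seed)
    half = len(df) // 2
    best, best_score = None, float("inf")
    for _ in range(n_trials):
        order = rng.permutation(len(df))
        a, b = df.iloc[order[:half]], df.iloc[order[half:]]
        score = imbalance(a, b)
        if score < best_score:
            best, best_score = (a, b), score
    return best[0], best[1], best_score


# Small synthetic dataset with one numeric and one categorical feature.
data_rng = np.random.default_rng(42)
demo = pd.DataFrame({
    "age": data_rng.normal(45, 7.0, size=100),
    "type": data_rng.choice(["T1", "T2", "T3"], size=100),
})
part_a, part_b, best_score = random_search_split(demo)
print(len(part_a), len(part_b), round(best_score, 3))
```

A score like this sums one term per feature, so a split is only rated well when all features are balanced at once, which is the property the description refers to.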
Installation
pip install balanced-splits
Usage
import numpy as np
import pandas as pd
from balanced_splits.split import optimized_split

# Build a synthetic dataset with two numeric features and one categorical feature.
sample_size = 100
df = pd.DataFrame({
    'age': np.random.normal(loc=45, scale=7., size=sample_size),
    'skill': 1 - np.random.power(4, size=sample_size),
    'type': np.random.choice(['T1', 'T2', 'T3'], size=sample_size)
})

# Split the dataset into two partitions.
A, B = optimized_split(df)

# Print summary statistics and category counts for each partition.
print('Partition 1\n===========\n')
print(A.describe())
print(A['type'].value_counts())
print('\n\n')
print('Partition 2\n===========\n')
print(B.describe())
print(B['type'].value_counts())
Check out the "examples" section for more examples.
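Reading two separate `describe()` printouts makes it hard to compare the partitions at a glance. The sketch below puts the numeric means and the category counts side by side. The helper `compare_partitions` is hypothetical and not part of balanced-splits. So that the snippet runs without the library installed, it uses a plain first-half/second-half split as a stand-in; in practice you would pass the two DataFrames returned by `optimized_split(df)`.

```python
import numpy as np
import pandas as pd


def compare_partitions(a, b, names=("Partition 1", "Partition 2")):
    """Hypothetical helper: side-by-side means and category counts."""
    numeric = a.select_dtypes(include="number").columns
    categorical = a.columns.difference(numeric)

    # One row per numeric column: the mean in each partition and the absolute gap.
    means = pd.DataFrame({names[0]: a[numeric].mean(), names[1]: b[numeric].mean()})
    means["abs_diff"] = (means[names[0]] - means[names[1]]).abs()

    # One table per categorical column: how often each value occurs in each partition.
    counts = {}
    for col in categorical:
        counts[col] = pd.DataFrame({
            names[0]: a[col].value_counts(),
            names[1]: b[col].value_counts(),
        }).fillna(0).astype(int)
    return means, counts


# Same shape of data as in the usage example, with a fixed seed.
rng = np.random.default_rng(0)
sample_size = 100
df = pd.DataFrame({
    "age": rng.normal(loc=45, scale=7.0, size=sample_size),
    "skill": 1 - rng.power(4, size=sample_size),
    "type": rng.choice(["T1", "T2", "T3"], size=sample_size),
})

# Stand-in split (an assumption for this sketch); replace with optimized_split(df).
first_half, second_half = df.iloc[:50], df.iloc[50:]

means, counts = compare_partitions(first_half, second_half)
print(means)
print(counts["type"])
```

Small values in the `abs_diff` column and similar counts in each row of the `type` table indicate that the two partitions are well balanced on those features.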
Source Distribution
balanced-splits-0.2.0.tar.gz (4.4 kB)
Built Distribution
balanced_splits-0.2.0-py3-none-any.whl (6.3 kB)
File details
Details for the file balanced-splits-0.2.0.tar.gz.
File metadata
- Download URL: balanced-splits-0.2.0.tar.gz
- Upload date:
- Size: 4.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | dee302aa2f6d4b4c01617dbf1ff2c5141f298122a5bbc26ed457c45d455984c4
MD5 | 69a43b39042b7c997c8b562df431a1c9
BLAKE2b-256 | d12ce6a2dce1d7670d62c7f5242b20e339c9b0089deb726c151837962d0c73fc
File details
Details for the file balanced_splits-0.2.0-py3-none-any.whl.
File metadata
- Download URL: balanced_splits-0.2.0-py3-none-any.whl
- Upload date:
- Size: 6.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0b098e9c2b0e7e984aec304b61aa57a8b7de25b594b636fa5ea572a9ea0010f6
MD5 | 6e4090df6eec49115fe639ab2004c600
BLAKE2b-256 | aef0e93330681b328a1c62d54d108ad6f83c0d335df6641c9ca1d0aa1c4fb0d1