SOAK splitting utility
SOAK: Same/Other/All K-fold Cross-Validation
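SOAK estimates how similar the learnable patterns are across different subsets of a dataset. It extends standard K-fold cross-validation: for each test fold of a given subset, a model is trained on the remaining folds of the same subset ("same"), of the other subsets ("other"), or of all subsets ("all"). Comparing test error across the three training sets indicates whether data from other subsets helps predict the given subset.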
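To make the three training sets concrete, here is a minimal pure-Python sketch of the splitting idea. It is not soakpy's implementation: the name `soak_split_sketch` is hypothetical, and it assigns folds round-robin within each subset, which is an assumption made only so the result is deterministic. It yields tuples in the same layout as the usage example.

```python
from collections import defaultdict


def soak_split_sketch(subset_vec, n_splits=2):
    """Yield (subset_value, category, fold_id, train_idx, test_idx) tuples.

    Hypothetical illustration only; the round-robin fold assignment used
    here is an assumption and need not match soakpy.
    """
    # Assign fold ids 1..n_splits round-robin within each subset,
    # so every subset contributes samples to every fold.
    seen = defaultdict(int)
    fold_of = []
    for value in subset_vec:
        fold_of.append(seen[value] % n_splits + 1)
        seen[value] += 1

    indices = range(len(subset_vec))
    for fold_id in range(1, n_splits + 1):
        for subset_value in sorted(seen):
            # Test set: samples of this subset that fall in this fold.
            test_idx = [i for i in indices
                        if subset_vec[i] == subset_value and fold_of[i] == fold_id]
            # Training candidates: every sample outside the current fold.
            outside = [i for i in indices if fold_of[i] != fold_id]
            same = [i for i in outside if subset_vec[i] == subset_value]
            other = [i for i in outside if subset_vec[i] != subset_value]
            for category, train_idx in (("same", same), ("other", other), ("all", outside)):
                yield subset_value, category, fold_id, train_idx, test_idx


subset_vec = ["even" if x % 2 == 0 else "odd" for x in range(8)]
splits = list(soak_split_sketch(subset_vec, n_splits=2))
```

In this sketch the "all" training set is the union of "same" and "other", and no training index ever belongs to the fold being tested.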
Usage
import numpy as np
import soakpy
# --- synthetic data ---
X = np.arange(8).reshape(-1, 1)
y = X.ravel()
subset_vec = np.array(['even' if x % 2 == 0 else 'odd' for x in X.ravel()])
# --- iterate over the same/other/all splits ---
for subset_value, category, fold_id, train_idx, test_idx in soakpy.split(subset_vec, n_splits=2):
    print(f"subset: {subset_value:6s} --- category: {category:6s} --- fold: {fold_id}")
    print(f"y_test: {y[test_idx]}")
    print(f"y_train: {y[train_idx]}")
    print("-"*50)
Output:
subset: even --- category: same --- fold: 1
y_test: [0]
y_train: [2 4 6]
--------------------------------------------------
subset: even --- category: other --- fold: 1
y_test: [0]
y_train: [5]
--------------------------------------------------
subset: even --- category: all --- fold: 1
y_test: [0]
y_train: [2 4 5 6]
--------------------------------------------------
subset: odd --- category: same --- fold: 1
y_test: [1 3 7]
y_train: [5]
--------------------------------------------------
subset: odd --- category: other --- fold: 1
y_test: [1 3 7]
y_train: [2 4 6]
--------------------------------------------------
subset: odd --- category: all --- fold: 1
y_test: [1 3 7]
y_train: [2 4 5 6]
--------------------------------------------------
subset: even --- category: same --- fold: 2
y_test: [2 4 6]
y_train: [0]
--------------------------------------------------
subset: even --- category: other --- fold: 2
y_test: [2 4 6]
y_train: [1 3 7]
--------------------------------------------------
subset: even --- category: all --- fold: 2
y_test: [2 4 6]
y_train: [0 1 3 7]
--------------------------------------------------
subset: odd --- category: same --- fold: 2
y_test: [5]
y_train: [1 3 7]
--------------------------------------------------
subset: odd --- category: other --- fold: 2
y_test: [5]
y_train: [0]
--------------------------------------------------
subset: odd --- category: all --- fold: 2
y_test: [5]
y_train: [0 1 3 7]
--------------------------------------------------
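The splits are typically consumed by fitting a model on `train_idx` and scoring it on `test_idx`, then comparing scores across the three categories. The sketch below shows one way this could look; the helper `mean_error_by_category` is hypothetical, and a mean predictor stands in for a real learner. It accepts any iterable of tuples in the layout shown in the usage example, so it does not need soakpy to run.

```python
from collections import defaultdict


def mean_error_by_category(splits, y):
    """Average test MSE of a mean predictor per (subset, category).

    `splits` is any iterable of (subset_value, category, fold_id,
    train_idx, test_idx) tuples. Hypothetical helper for illustration.
    """
    fold_errors = defaultdict(list)
    for subset_value, category, fold_id, train_idx, test_idx in splits:
        # Stand-in "model": predict the mean of the training targets.
        prediction = sum(y[i] for i in train_idx) / len(train_idx)
        # Mean squared error on the held-out test samples.
        mse = sum((y[i] - prediction) ** 2 for i in test_idx) / len(test_idx)
        fold_errors[(subset_value, category)].append(mse)
    # Average the per-fold errors for each (subset, category) pair.
    return {key: sum(errs) / len(errs) for key, errs in fold_errors.items()}


# Two of the splits printed above, copied by hand (y equals the index here).
y = list(range(8))
example_splits = [
    ("even", "same", 1, [2, 4, 6], [0]),
    ("even", "other", 1, [5], [0]),
]
errors = mean_error_by_category(example_splits, y)
```

With a real learner, a lower "other" or "all" error than "same" for a subset would suggest that data from the other subsets carries a pattern useful for that subset.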