Convenient helpers for splitting DataFrames into features/target and creating train/dev/test splits.
data-spliter
A small, well-tested Python package that provides three conveniences on top of scikit-learn:
- x_y_data_spliter: split a DataFrame into feature matrix X and target vector y by column name or index.
- train_test_data_spliter: thin validated wrapper around sklearn.model_selection.train_test_split.
- train_dev_test_data_spliter: split data into three sets (train / dev / test) with sizes expressed as fractions of the full dataset.
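To make the feature/target split concrete, here is a minimal sketch of how selecting the target by name or by position could work in plain pandas. The function name split_x_y and its validation rules are assumptions made for illustration; this page does not show the package's actual implementation, which may differ.

```python
import pandas as pd


def split_x_y(df, column_name=None, column_index=None):
    """Hypothetical stand-in: return (X, y) with the target column removed from X."""
    # Assumed rule: exactly one way of identifying the target must be given.
    if (column_name is None) == (column_index is None):
        raise ValueError("pass exactly one of column_name or column_index")
    if column_index is not None:
        # Positional lookup; negative positions count from the end.
        column_name = df.columns[column_index]
    elif column_name not in df.columns:
        raise KeyError(f"column {column_name!r} not found")
    X = df.drop(columns=[column_name])  # every column except the target
    y = df[column_name]                 # the target as a Series
    return X, y


df = pd.DataFrame(
    {"rooms": [2, 3, 4], "area": [50, 70, 90], "price": [100, 150, 200]}
)
X, y = split_x_y(df, column_name="price")
X_by_pos, y_by_pos = split_x_y(df, column_index=-1)
```

Both calls select the same column here because "price" is the last column of the example frame.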
Installation
pip install data-spliter
Or from source:
git clone https://github.com/Fares-Ayman-1/data-spliter.git
cd data-spliter
pip install -e ".[dev]"
Quick start
import pandas as pd
from data_spliter import x_y_data_spliter, train_test_data_spliter, train_dev_test_data_spliter

df = pd.read_csv("my_data.csv")

# Split features from target (by name or by position)
X, y = x_y_data_spliter(df, column_name="price")
X, y = x_y_data_spliter(df, column_index=-1)

# Train / test split
x_train, x_test, y_train, y_test = train_test_data_spliter(X, y, test_size=0.2)

# Train / dev / test split
x_train, x_dev, x_test, y_train, y_dev, y_test = train_dev_test_data_spliter(
    X, y, dev_size=0.1, test_size=0.2
)
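Because dev_size and test_size are fractions of the full dataset, a 0.1 / 0.2 request should leave 70% of the rows for training. The sketch below illustrates that arithmetic with the standard library only. The helper three_way_indices, its seed parameter, and its validation rule are hypothetical and are not part of the package, which according to its description builds on scikit-learn instead.

```python
import random


def three_way_indices(n_rows, dev_size, test_size, seed=0):
    """Hypothetical helper: shuffle row indices and cut them into train/dev/test."""
    # Assumed validation: both fractions are positive and leave room for training data.
    if not (0 < dev_size < 1 and 0 < test_size < 1 and dev_size + test_size < 1):
        raise ValueError("dev_size and test_size must be in (0, 1) and sum to less than 1")
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Both counts are taken from the full row count, not from what is left over.
    n_test = round(n_rows * test_size)
    n_dev = round(n_rows * dev_size)
    test = indices[:n_test]
    dev = indices[n_test:n_test + n_dev]
    train = indices[n_test + n_dev:]
    return train, dev, test


train, dev, test = three_way_indices(1000, dev_size=0.1, test_size=0.2)

# If two two-way splits are chained instead (test first, then dev), the second
# fraction must be rescaled to the rows that remain after the first split.
relative_dev = 0.1 / (1 - 0.2)
```

With 1000 rows this gives 700 training, 100 dev, and 200 test indices. The rescaled fraction matters when chaining two-way splits: passing 0.1 unchanged to the second split would take 10% of the remaining 800 rows (80) rather than 10% of the full dataset (100).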
Running tests
pytest --cov=data_spliter
License
MIT
Source Distributions
No source distribution files available for this release.
Built Distribution
File details
Details for the file data_spliter-1.0.0-py3-none-any.whl.
File metadata
- Download URL: data_spliter-1.0.0-py3-none-any.whl
- Upload date:
- Size: 2.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4d81346485ef35dca227d43922582064a87b388b8a62d6f893193c52bcb2e77f |
| MD5 | 7e7aa9de253580f159d43fe26e1bd397 |
| BLAKE2b-256 | d4e19e542c764fbca5f349560a4aea78f365995d58adeff3a7cc955ce285129a |