KeelDS is a package for loading datasets from the KEEL repository, with normalization, train/test splitting, and discretization options.
KeelDS: A Python package for loading datasets from the KEEL repository
KeelDS is a Python package that provides easy access to datasets from the KEEL repository, a popular source for machine learning datasets. This package simplifies the process of loading KEEL datasets, offering options for cross-validation and discretization.
Features
- Load KEEL datasets with a single line of code
- Access datasets pre-split into train and test sets
- Discretization option using the Fayyad algorithm (MDLP)
- Support for both balanced and imbalanced datasets
- Easy integration with machine learning workflows
Installation
Dependencies
- Python (>= 3.12)
- pandas (>= 2.2.2)
You can install KeelDS using pip:
pip install keel-ds
Usage
Here's a simple example of how to use KeelDS with a machine learning model:
from keel_ds import load_data
import numpy as np
from catboost import CatBoostClassifier
file_name = 'iris'
folds = load_data(file_name)
evaluations = []
for x_train, y_train, x_test, y_test in folds:
    # Train a fresh model on each fold and score it on that fold's test split
    model = CatBoostClassifier(verbose=False)
    model.fit(x_train, y_train)
    evaluation = model.score(x_test, y_test)
    evaluations.append(evaluation)
print(np.mean(evaluations)) # Output: 0.933333333333
API Reference
load_data(data, imbalanced=False, raw=False)
Load a dataset from the KEEL repository.
- data (str): Name of the dataset to load.
- imbalanced (bool): If True, load from the imbalanced datasets. Default is False.
- raw (bool): If True, return the raw dataset. Default is False.
Returns a list of tuples (x_train, y_train, x_test, y_test) for each fold.
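Because every fold has the same (x_train, y_train, x_test, y_test) shape, the evaluation loop can be wrapped in a small helper. The sketch below is not part of KeelDS: `cross_validate` and `MajorityClassifier` are hypothetical names introduced here, and the folds are synthetic stand-ins, so the block runs without KeelDS installed. It assumes only that the model exposes `fit` and `score`, as in the usage example.

```python
from statistics import mean


def cross_validate(folds, make_model):
    """Fit a fresh model on each fold; return (mean_score, per_fold_scores).

    `folds` is any iterable of (x_train, y_train, x_test, y_test) tuples,
    the shape that load_data is documented to return.
    """
    scores = []
    for x_train, y_train, x_test, y_test in folds:
        model = make_model()  # new model per fold, so no state leaks across folds
        model.fit(x_train, y_train)
        scores.append(model.score(x_test, y_test))
    return mean(scores), scores


class MajorityClassifier:
    """Toy stand-in model (hypothetical): predicts the most frequent training label."""

    def fit(self, x, y):
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def score(self, x, y):
        labels = list(y)
        return sum(1 for label in labels if label == self.label_) / len(labels)


# Synthetic folds in the documented tuple layout (not real KEEL data).
toy_folds = [
    ([[0], [1], [2]], ["a", "a", "b"], [[3], [4]], ["a", "b"]),
    ([[3], [4], [0]], ["a", "b", "a"], [[1], [2]], ["a", "a"]),
]

avg, per_fold = cross_validate(toy_folds, MajorityClassifier)
print(avg, per_fold)
```

With KeelDS and CatBoost installed, the same helper should accept real folds, for example `cross_validate(load_data('iris'), lambda: CatBoostClassifier(verbose=False))`.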
list_data()
List all available datasets.
Returns a dictionary with two keys: 'balanced' and 'imbalanced', each containing a list of available dataset names.
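The catalog returned by list_data() can be used to choose the `imbalanced` flag for load_data. The helper below is a minimal sketch, not part of KeelDS: `is_imbalanced` is a hypothetical name, and the catalog is a hand-written stand-in with the documented two-key layout (the imbalanced dataset name in it is a placeholder, not a real KEEL dataset).

```python
def is_imbalanced(catalog, name):
    """Return the `imbalanced` flag to pass to load_data for `name`.

    `catalog` is a dict with 'balanced' and 'imbalanced' keys, each mapping
    to a list of dataset names, as list_data() is documented to return.
    Balanced takes priority if a name appears under both keys.
    """
    if name in catalog.get("balanced", []):
        return False
    if name in catalog.get("imbalanced", []):
        return True
    raise KeyError(f"unknown dataset: {name!r}")


# Stand-in catalog for illustration only; real names come from list_data().
example_catalog = {
    "balanced": ["iris"],
    "imbalanced": ["placeholder-imbalanced-set"],
}

print(is_imbalanced(example_catalog, "iris"))
print(is_imbalanced(example_catalog, "placeholder-imbalanced-set"))
```

Assuming list_data is importable from keel_ds alongside load_data, a call could then look like `load_data(name, imbalanced=is_imbalanced(list_data(), name))`.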
Contributing
Contributions to KeelDS are welcome! Please feel free to submit a Pull Request.
License
[Add license information here]
Contact
For any queries or issues, please open an issue on the GitHub repository.
Download files
File details
Details for the file keel_ds-0.1.19.tar.gz.
File metadata
- Download URL: keel_ds-0.1.19.tar.gz
- Upload date:
- Size: 26.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.2 CPython/3.12.3 Linux/6.8.0-41-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 473b0f0d764709aa05eaf551565d908ab325da60d6f1159bb05c7292a5bfece2 |
| MD5 | c5350815c08f0584869e3492f8fdcb8a |
| BLAKE2b-256 | 275f1834ebe26da81a80ec2434d0dece7c74c7a50ddf4e4008804b13f71721e7 |
File details
Details for the file keel_ds-0.1.19-py3-none-any.whl.
File metadata
- Download URL: keel_ds-0.1.19-py3-none-any.whl
- Upload date:
- Size: 26.6 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.2 CPython/3.12.3 Linux/6.8.0-41-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c69e0015b5a2cd5a56d5ab2f5fbd1cda2d9e99080a16004099b74148a2ecc6ac |
| MD5 | 344dde719a64e60de65d7ae2b785098a |
| BLAKE2b-256 | 101c8675dc5306927d0ede2be15530de2064c9a4afed9dff155dc98f942b3775 |