Package for data reduction, especially using instance selection algorithms.
InstanceSelection
InstanceSelection is a Python module for reducing the number of instances in datasets used for classification problems. The module was implemented as part of an engineering project.
Installation
pip install data_reduction
Usage
Data loading and preparation
The first step is to load and prepare data using DataPreparation:
data = DataPreparation('iris')
Instance selection with a selected algorithm
Every algorithm requires a DataPreparation instance as a parameter. Once the algorithm object is created, you can reduce the instances and then prepare a report.
alg = DROP1(data, k=3)
alg.reduce_instances()
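To give a feel for what an instance selection algorithm such as DROP1 does, here is a minimal, self-contained sketch of the DROP1 removal rule as it is commonly described: an instance is dropped when its associates (the instances that count it among their k+1 nearest neighbours) are classified at least as well without it. This is not the package's implementation. The function names (`knn_indices`, `knn_correct`, `drop1_sketch`), the Euclidean distance, and the simple majority vote are all assumptions made for illustration.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def knn_indices(points, idx, pool, k):
    """Return the indices of the k instances in `pool` nearest to points[idx]."""
    candidates = [j for j in pool if j != idx]
    candidates.sort(key=lambda j: dist(points[idx], points[j]))
    return candidates[:k]


def knn_correct(points, labels, idx, pool, k):
    """True if a k-NN majority vote over `pool` recovers the label of points[idx]."""
    neighbours = knn_indices(points, idx, pool, k)
    if not neighbours:
        return False
    votes = Counter(labels[j] for j in neighbours)
    return votes.most_common(1)[0][0] == labels[idx]


def drop1_sketch(points, labels, k=3):
    """Brute-force sketch of the DROP1 rule; returns the indices that are kept."""
    kept = list(range(len(points)))
    for p in range(len(points)):
        # Associates of p: kept instances that have p among their k+1 neighbours.
        associates = [
            a for a in kept
            if a != p and p in knn_indices(points, a, kept, k + 1)
        ]
        without_p = [j for j in kept if j != p]
        n_with = sum(knn_correct(points, labels, a, kept, k) for a in associates)
        n_without = sum(knn_correct(points, labels, a, without_p, k) for a in associates)
        # Drop p if its associates are classified no worse without it.
        if n_without >= n_with:
            kept = without_p
    return kept


# Toy data (hypothetical): two well-separated clusters of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
kept = drop1_sketch(points, labels, k=1)
print("Kept indices:", kept)
```

The sketch recomputes all neighbours from scratch at every step, so it only suits small datasets. A practical implementation would maintain neighbour and associate lists incrementally.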
Creating a report
After reduction with the selected algorithm, you can create a report:
rap = Raport(data, alg.red_data, alg.red_lab)
rap.print_raport(c_type = 'knn')
Report output
=============
Classifier: knn
=============
Raport for original dataset
Count of instances: 105
                 precision     recall   f1-score    support
    Iris-setosa     1.0000     1.0000     1.0000         19
Iris-versicolor     1.0000     1.0000     1.0000         13
 Iris-virginica     1.0000     1.0000     1.0000         13
       accuracy                           1.0000         45
      macro avg     1.0000     1.0000     1.0000         45
   weighted avg     1.0000     1.0000     1.0000         45
Cohen's Kappa: 1.00
===
Training time: 0.0008822999999997805
Predicting time: 0.003322799999999848
Raport for reduced dataset
Count of instances: 21
                 precision     recall   f1-score    support
    Iris-setosa     1.0000     1.0000     1.0000         19
Iris-versicolor     0.7647     1.0000     0.8667         13
 Iris-virginica     1.0000     0.6923     0.8182         13
       accuracy                           0.9111         45
      macro avg     0.9216     0.8974     0.8949         45
   weighted avg     0.9320     0.9111     0.9090         45
Cohen's Kappa: 0.86
===
Training time: 0.0006775000000001086
Predicting time: 0.0024793999999999095
Reduction factor: 80.00 %
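The summary figures in the report can be cross-checked by hand. The sketch below assumes the reduction factor is the share of instances removed, which reproduces the 80.00 % shown for 105 original and 21 retained instances. The confusion matrix for the reduced dataset is not printed by the package. It is inferred here from the precision, recall, and support columns (four Iris-virginica samples predicted as Iris-versicolor), and `cohen_kappa` is a hypothetical helper written for this example.

```python
# Instance counts copied from the report.
original_count = 105
reduced_count = 21

# Assumed definition: percentage of instances removed by the algorithm.
reduction_factor = (1 - reduced_count / original_count) * 100
print(f"Reduction factor: {reduction_factor:.2f} %")  # Reduction factor: 80.00 %

# Inferred confusion matrix for the reduced dataset.
# Rows are true classes, columns are predicted classes,
# in the order setosa, versicolor, virginica.
confusion = [
    [19, 0, 0],
    [0, 13, 0],
    [0, 4, 9],
]


def cohen_kappa(matrix):
    """Cohen's kappa from a square confusion matrix (hypothetical helper)."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    return (observed - expected) / (1 - expected)


kappa = cohen_kappa(confusion)
print(f"Cohen's Kappa: {kappa:.2f}")  # Cohen's Kappa: 0.86
```

With these numbers the observed agreement is 41/45 and the chance agreement is 699/2025, which gives a kappa of about 0.864, in line with the 0.86 in the report.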