Package for data reduction, especially using instance selection algorithms.

InstanceSelection

InstanceSelection is a Python module for reducing the number of instances in datasets used in classification problems. The module was developed as part of an engineering project.

Installation

    pip install data_reduction

Usage

Data loading and preparation

The first step is to load and prepare data using DataPreparation:

    data = DataPreparation('iris')

Instance selection with a chosen algorithm

Every algorithm takes an instance of DataPreparation as a required parameter. Once the algorithm object is created, you can reduce the instances and then prepare a report.

    alg = DROP1(data, k=3)
    alg.reduce_instances()  # the reduced data and labels are then available as alg.red_data and alg.red_lab
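
To give a feel for what a DROP1-style reduction does, here is a minimal pure-Python sketch of the decision rule as it is commonly described: an instance is dropped when its associates (the instances that count it among their k nearest neighbors) are classified at least as well without it. This is not the package's implementation; the names `nearest`, `vote` and `drop1_sketch` and the toy data are assumptions made for illustration, and the sketch recomputes neighbors from scratch instead of maintaining neighbor lists.

```python
from collections import Counter
from math import dist


def nearest(idx, pool, X, k):
    """Indices of the k members of `pool` closest to X[idx], excluding idx itself."""
    others = [j for j in pool if j != idx]
    return sorted(others, key=lambda j: dist(X[idx], X[j]))[:k]


def vote(idx, pool, X, y, k):
    """Majority label among the k nearest members of `pool` (None if there are none)."""
    labels = [y[j] for j in nearest(idx, pool, X, k)]
    return Counter(labels).most_common(1)[0][0] if labels else None


def drop1_sketch(X, y, k=3):
    """Return the indices kept by a simplified DROP1-style pass (hypothetical helper)."""
    kept = list(range(len(X)))
    for p in range(len(X)):
        # Associates of p: kept instances that have p among their k nearest neighbors.
        associates = [a for a in kept if a != p and p in nearest(a, kept, X, k)]
        without_p = [j for j in kept if j != p]
        n_with = sum(vote(a, kept, X, y, k) == y[a] for a in associates)
        n_without = sum(vote(a, without_p, X, y, k) == y[a] for a in associates)
        # Drop p when its associates are classified at least as well without it.
        if n_without >= n_with:
            kept = without_p
    return kept


# Toy data (made up for this sketch): two well-separated clusters of six points each.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05), (0.2, 0.1),
     (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1), (5.05, 5.05), (5.2, 5.1)]
y = [0] * 6 + [1] * 6

kept = drop1_sketch(X, y, k=3)
print(f"kept {len(kept)} of {len(X)} instances:", kept)
```

On data like this, interior points are redundant for a k-NN classifier, so the pass removes them until removing more would start to change how the remaining points are classified.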

Creating a report

After reduction with the selected algorithm, you can create a report:

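    # Compare a classifier trained on the original data with one trained on the reduced data
    rap = Raport(data, alg.red_data, alg.red_lab)
    rap.print_raport(c_type='knn')  # c_type names the classifier shown in the report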

Report output

=============
Classifier:   knn
=============
Raport for original dataset
Count of instances:  105
                 precision    recall  f1-score   support

    Iris-setosa     1.0000    1.0000    1.0000        19
Iris-versicolor     1.0000    1.0000    1.0000        13
 Iris-virginica     1.0000    1.0000    1.0000        13

       accuracy                         1.0000        45
      macro avg     1.0000    1.0000    1.0000        45
   weighted avg     1.0000    1.0000    1.0000        45

Cohen's Kappa: 1.00
===
Training time:  0.0008822999999997805
Predicting time:  0.003322799999999848

Raport for reduced dataset
Count of instances:  21
                 precision    recall  f1-score   support

    Iris-setosa     1.0000    1.0000    1.0000        19
Iris-versicolor     0.7647    1.0000    0.8667        13
 Iris-virginica     1.0000    0.6923    0.8182        13

       accuracy                         0.9111        45
      macro avg     0.9216    0.8974    0.8949        45
   weighted avg     0.9320    0.9111    0.9090        45

Cohen's Kappa: 0.86
===
Training time:  0.0006775000000001086
Predicting time:  0.0024793999999999095

Reduction factor: 80.00 %
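
The two summary figures in this output can be checked by hand. The sketch below recomputes the reduction factor from the two instance counts and Cohen's Kappa for the reduced dataset from a confusion matrix. The matrix is not printed in the report; it is inferred here from the per-class precision, recall and support values, and the function names are assumptions made for this example rather than part of the package.

```python
def reduction_factor(n_original, n_reduced):
    """Percentage of training instances removed by the reduction."""
    return 100.0 * (n_original - n_reduced) / n_original


def cohen_kappa(confusion):
    """Cohen's Kappa from a square confusion matrix (rows: true class, columns: predicted)."""
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# Inferred from the reduced-dataset report: all 19 setosa and 13 versicolor test
# instances are recovered, and 4 of the 13 virginica instances are predicted as versicolor.
confusion_reduced = [
    [19, 0, 0],   # Iris-setosa
    [0, 13, 0],   # Iris-versicolor
    [0, 4, 9],    # Iris-virginica
]

print(f"Reduction factor: {reduction_factor(105, 21):.2f} %")  # Reduction factor: 80.00 %
print(f"Cohen's Kappa: {cohen_kappa(confusion_reduced):.2f}")  # Cohen's Kappa: 0.86
```

Both values match the report: (105 - 21) / 105 = 80 %, and 41 of 45 correct predictions against a chance agreement of 699 / 2025 gives a Kappa of about 0.86.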
