Package for data reduction, especially using instance selection algorithms.

InstanceSelection

InstanceSelection is a Python module for reducing the number of instances in datasets used in classification problems. The module was implemented as part of an engineering project.

Installation

    pip install data_reduction

Usage

Data loading and preparation

The first step is to load and prepare data using DataPreparation:

    data = DataPreparation('iris')

Instance selection with a selected algorithm

Every algorithm takes an instance of DataPreparation as its required parameter. Once the algorithm object is created, you can reduce the instances and then prepare a report.

    alg = DROP1(data, k=3)
    alg.reduce_instances()
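To make the reduction step more concrete, here is a minimal brute-force sketch of the DROP1 removal rule (an instance is removed when at least as many of its associates are classified correctly without it as with it). This is not the package's implementation: the function names `nearest`, `vote` and `drop1_sketch` and the toy data are assumptions made for illustration, and neighbours are recomputed from scratch instead of being maintained in lists.

```python
import math
from collections import Counter


def nearest(i, active, X, k):
    """Indices of the k nearest active points to X[i], excluding i itself."""
    others = [j for j in active if j != i]
    others.sort(key=lambda j: math.dist(X[i], X[j]))
    return others[:k]


def vote(i, active, X, y, k):
    """Majority label among the k nearest active neighbours of X[i]."""
    labels = [y[j] for j in nearest(i, active, X, k)]
    return Counter(labels).most_common(1)[0][0]


def drop1_sketch(X, y, k=3):
    """Return the indices kept after one pass of the DROP1 removal rule."""
    kept = list(range(len(X)))
    for p in range(len(X)):
        # Associates of p: kept points that have p among their k nearest neighbours.
        associates = [a for a in kept if a != p and p in nearest(a, kept, X, k)]
        without_p = [j for j in kept if j != p]
        n_with = sum(vote(a, kept, X, y, k) == y[a] for a in associates)
        n_without = sum(vote(a, without_p, X, y, k) == y[a] for a in associates)
        # Drop p if removing it does not hurt the classification of its associates.
        if n_without >= n_with:
            kept = without_p
    return kept


# Hypothetical toy data: two well-separated clusters of five points each.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
     (5, 5), (5, 6), (6, 5), (6, 6), (5.5, 5.5)]
y = ["a"] * 5 + ["b"] * 5

kept = drop1_sketch(X, y, k=3)
print(f"kept {len(kept)} of {len(X)} instances: {kept}")
```

On clean, well-separated data like this the rule keeps only as many points per class as the k=3 vote needs, which is the kind of reduction the report later quantifies.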

Creating a report

After reducing the data with the selected algorithm, you can create a report:

    rap = Raport(data, alg.red_data, alg.red_lab)
    rap.print_raport(c_type='knn')

Report output

    =============
    Classifier:   knn
    =============
    Raport for original dataset
    Count of instances:  105
                     precision    recall  f1-score   support

        Iris-setosa     1.0000    1.0000    1.0000        19
    Iris-versicolor     1.0000    1.0000    1.0000        13
     Iris-virginica     1.0000    1.0000    1.0000        13

           accuracy                         1.0000        45
          macro avg     1.0000    1.0000    1.0000        45
       weighted avg     1.0000    1.0000    1.0000        45

    Cohen's Kappa: 1.00
    ===
    Training time:  0.0008822999999997805
    Predicting time:  0.003322799999999848

    Raport for reduced dataset
    Count of instances:  21
                     precision    recall  f1-score   support

        Iris-setosa     1.0000    1.0000    1.0000        19
    Iris-versicolor     0.7647    1.0000    0.8667        13
     Iris-virginica     1.0000    0.6923    0.8182        13

           accuracy                         0.9111        45
          macro avg     0.9216    0.8974    0.8949        45
       weighted avg     0.9320    0.9111    0.9090        45

    Cohen's Kappa: 0.86
    ===
    Training time:  0.0006775000000001086
    Predicting time:  0.0024793999999999095

    Reduction factor: 80.00 %
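The summary figures for the reduced dataset can be checked by hand. The sketch below assumes that the reduction factor is the share of training instances removed (105 down to 21) and that Cohen's Kappa is computed in the standard way from a confusion matrix. The matrix itself is not printed by the package; it is inferred here from the per-class precision, recall and support values in the report, and the function names are hypothetical.

```python
def reduction_factor(n_original, n_reduced):
    """Percentage of instances removed (assumed definition)."""
    return 100.0 * (1 - n_reduced / n_original)


def cohen_kappa(confusion):
    """Cohen's Kappa from a square confusion matrix (rows: true, columns: predicted)."""
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# Inferred confusion matrix for the reduced dataset
# (class order: setosa, versicolor, virginica): 4 virginica samples
# are predicted as versicolor, everything else is correct.
confusion_reduced = [
    [19, 0, 0],
    [0, 13, 0],
    [0, 4, 9],
]

print(f"Reduction factor: {reduction_factor(105, 21):.2f} %")
print(f"Cohen's Kappa: {cohen_kappa(confusion_reduced):.2f}")
```

With these inputs the two printed values agree with the report: 80.00 % and 0.86.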
