Package for data reduction, especially using instance selection algorithms.

Project description

InstanceSelection

InstanceSelection is a Python module for reducing the number of instances in datasets used for classification problems. The module was implemented as part of an engineering project.

Installation

    pip install data_reduction

Usage

Data loading and preparation

The first step is to load and prepare data using DataPreparation:

    data = DataPreparation('iris')

Instance selection with a selected algorithm

Every algorithm requires an instance of DataPreparation as a parameter. Once the algorithm object is created, you can reduce the instances and then prepare a report.

    alg = DROP1(data, k=3)
    alg.reduce_instances()
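
To give an intuition for what an algorithm such as DROP1 does, here is a minimal brute-force sketch of the DROP1 removal rule: an instance is dropped when at least as many of its associates (instances that count it among their k nearest neighbours) are classified correctly without it as with it. This is not the package's implementation. The helper names `nearest`, `is_correct` and `drop1_sketch` and the toy data are hypothetical, and the sketch recomputes neighbours from scratch instead of maintaining neighbour lists, so it only suits very small datasets.

```python
import math
from collections import Counter


def nearest(i, pool, X, k):
    """Indices of the k instances in pool (excluding i) closest to X[i]."""
    others = [j for j in pool if j != i]
    return sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]


def is_correct(i, pool, X, y, k):
    """True if a k-NN majority vote over pool classifies instance i correctly."""
    neigh = nearest(i, pool, X, k)
    if not neigh:
        return False
    vote = Counter(y[j] for j in neigh).most_common(1)[0][0]
    return vote == y[i]


def drop1_sketch(X, y, k=3):
    """Simplified DROP1-style reduction; returns indices of the kept instances."""
    kept = list(range(len(X)))
    for p in list(kept):
        # Associates of p: kept instances that have p among their k nearest neighbours.
        associates = [a for a in kept if a != p and p in nearest(a, kept, X, k)]
        rest = [j for j in kept if j != p]
        with_p = sum(is_correct(a, kept, X, y, k) for a in associates)
        without_p = sum(is_correct(a, rest, X, y, k) for a in associates)
        # Drop p when removing it does not hurt the classification of its associates.
        if without_p >= with_p:
            kept = rest
    return kept


# Hypothetical toy data: two well-separated clusters of five points each.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
     (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1), (5.05, 5.05)]
y = ["a"] * 5 + ["b"] * 5

kept = drop1_sketch(X, y, k=3)
print(len(kept), "of", len(X), "instances kept")
```

On this toy data, instances are removed from each cluster until removing one more would cause its remaining neighbours to be outvoted by the other class.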

Creating a report

After reducing the data with the selected algorithm, you can create a report:

    rap = Raport(data, alg.red_data, alg.red_lab)
    rap.print_raport(c_type='knn')

Report output

=============
Classifier:   knn
=============
Raport for original dataset
Count of instances:  105
                 precision    recall  f1-score   support

    Iris-setosa     1.0000    1.0000    1.0000        19
Iris-versicolor     1.0000    1.0000    1.0000        13
 Iris-virginica     1.0000    1.0000    1.0000        13

       accuracy                         1.0000        45
      macro avg     1.0000    1.0000    1.0000        45
   weighted avg     1.0000    1.0000    1.0000        45

Cohen's Kappa: 1.00
===
Training time:  0.0008822999999997805
Predicting time:  0.003322799999999848

Raport for reduced dataset
Count of instances:  21
                 precision    recall  f1-score   support

    Iris-setosa     1.0000    1.0000    1.0000        19
Iris-versicolor     0.7647    1.0000    0.8667        13
 Iris-virginica     1.0000    0.6923    0.8182        13

       accuracy                         0.9111        45
      macro avg     0.9216    0.8974    0.8949        45
   weighted avg     0.9320    0.9111    0.9090        45

Cohen's Kappa: 0.86
===
Training time:  0.0006775000000001086
Predicting time:  0.0024793999999999095

Reduction factor: 80.00 %
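
The summary figures above can be checked by hand. The sketch below assumes that the reduction factor is the percentage of training instances removed, and it uses a confusion matrix inferred from the per-class precision, recall and support values in the reduced-dataset report (it is not printed by the package). Variable names are illustrative only.

```python
# Confusion matrix inferred from the reduced-dataset report
# (rows: true class, columns: predicted class).
confusion = [
    [19, 0, 0],   # Iris-setosa
    [0, 13, 0],   # Iris-versicolor
    [0, 4, 9],    # Iris-virginica
]

n = sum(sum(row) for row in confusion)                 # test instances
observed = sum(confusion[i][i] for i in range(3)) / n  # accuracy

row_totals = [sum(row) for row in confusion]
col_totals = [sum(col) for col in zip(*confusion)]
# Agreement expected by chance, from the marginal totals.
expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2

kappa = (observed - expected) / (1 - expected)

# Assumed definition: share of the original training instances removed.
original_count, reduced_count = 105, 21
reduction = 100 * (1 - reduced_count / original_count)

print(f"Accuracy: {observed:.4f}")
print(f"Cohen's Kappa: {kappa:.2f}")
print(f"Reduction factor: {reduction:.2f} %")
```

These values agree with the accuracy, Cohen's Kappa and reduction factor reported for the reduced dataset.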



