
Implementation of the KSU compression algorithm https://www.cs.bgu.ac.il/~karyeh/compression-arxiv.pdf

Project description

KSU Compression Algorithm Implementation

Algorithm 1 from Nearest-Neighbor Sample Compression: Efficiency, Consistency, Infinite Dimensions

Installation

  • With pip: pip install ksu
  • From source:
    • git clone --recursive https://github.com/nimroha/ksu_classifier.git
    • cd ksu_classifier
    • python setup.py install

Usage

This package provides a class KSU(Xs, Ys, metric, [gram, prune, logLevel, n_jobs])

Xs and Ys are the data points and their respective labels, given as numpy arrays.

metric is either a callable to compute the metric or a string that names one of our provided metrics (print ksu.KSU.METRICS.keys() for the full list)
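As an illustration of the callable form, here is a minimal sketch of a custom metric. It assumes the callable receives two points and returns a non-negative number; the exact signature KSU expects is not stated here, so check it against the package source. The name chebyshev is purely illustrative.

```python
def chebyshev(a, b):
    """L-infinity distance: the largest coordinate-wise gap between a and b.

    Assumption: the metric is called with two points (sequences or numpy
    arrays of equal length) and must return a single non-negative number.
    """
    if len(a) != len(b):
        raise ValueError("points must have the same dimension")
    # default=0.0 covers the degenerate case of zero-dimensional points
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)
```

Such a function would be passed in place of the metric name, for example as the third argument to KSU.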

gram (optional, default=None) a precomputed Gramian matrix; it is calculated if not provided.

prune (optional, default=False) a boolean indicating whether to prune the compressed set or not (Algorithm 2 from Near-optimal sample compression for nearest neighbors)

logLevel (optional, default='CRITICAL') a string indicating the logging level (set to 'INFO' or 'DEBUG' to get more information)

n_jobs (optional, default=1) an integer defining how many CPUs to use; pass -1 to use all. For n_jobs below -1, (n_cpus + 1 + n_jobs) CPUs are used, so n_jobs = -2 uses all CPUs but one.
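The n_jobs rule can be sketched as a small helper. This is not the package's own code: resolve_n_jobs is a hypothetical name, and rejecting 0 and clamping the result to at least 1 are assumptions added to keep the sketch well defined.

```python
import os

def resolve_n_jobs(n_jobs, n_cpus=None):
    """Translate an n_jobs setting into a concrete worker count."""
    if n_cpus is None:
        n_cpus = os.cpu_count() or 1
    if n_jobs == 0:
        # Assumption: 0 is treated as invalid; the text above does not cover it.
        raise ValueError("n_jobs must be non-zero")
    if n_jobs > 0:
        return n_jobs
    # Negative values count back from the total: -1 -> all, -2 -> all but one.
    # Assumption: never go below one worker.
    return max(1, n_cpus + 1 + n_jobs)
```

On an 8-CPU machine, n_jobs = -1 resolves to 8 workers and n_jobs = -2 resolves to 7.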


KSU provides a method compressData([delta]), which selects the subset with the lowest estimated error with confidence 1 - delta.

You can then run getClassifier(), which returns a 1-NN classifier (based on sklearn's K-NN) fitted to the compressed data.

Or, run getCompressedSet() to get the compressed data as a tuple of numpy arrays (compressedXs, compressedYs).
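The calls described above can be chained as in the following sketch. The import path of the KSU class is not stated here, so the class is passed in as an argument rather than imported. compress_and_fit is a hypothetical helper, not part of the package, and it uses only the constructor and methods named in this section.

```python
def compress_and_fit(ksu_cls, Xs, Ys, metric, delta=None, **options):
    """Build a KSU instance, compress the data, and return the results.

    ksu_cls -- the KSU class (passed in; its import path is an assumption
               left to the caller)
    options -- optional keyword arguments such as gram, prune, logLevel, n_jobs
    """
    model = ksu_cls(Xs, Ys, metric, **options)
    # delta is optional in compressData([delta]); omit it to use the default.
    if delta is None:
        model.compressData()
    else:
        model.compressData(delta)
    classifier = model.getClassifier()
    compressed_xs, compressed_ys = model.getCompressedSet()
    return classifier, compressed_xs, compressed_ys
```

The compressed set is only meaningful after compressData has run, which is why the helper calls it before the two getters.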


See scripts/ for example usage

Download files


Source Distribution

ksu-0.2.2.tar.gz (13.1 kB)

Built Distribution

ksu-0.2.2-py2.py3-none-any.whl (14.8 kB, Python 2 and Python 3)
