KSU Compression Algorithm Implementation

Implementation of Algorithm 1 from Nearest-Neighbor Sample Compression: Efficiency, Consistency, Infinite Dimensions (https://www.cs.bgu.ac.il/~karyeh/compression-arxiv.pdf)
Installation
- With pip:
  pip install ksu
- From source:
  git clone --recursive https://github.com/nimroha/ksu_classifier.git
  cd ksu_classifier
  python setup.py install
Usage
This package provides a class KSU(Xs, Ys, metric, [gram, prune, logLevel, n_jobs]) (see the construction sketch below the parameter list), where:
- Xs and Ys are the data points and their respective labels, as numpy arrays
- metric is either a callable that computes the metric, or a string naming one of the provided metrics (print ksu.KSU.METRICS.keys() for the full list)
- gram (optional, default=None) is a precomputed Gram matrix; it will be calculated if not provided
- prune (optional, default=False) is a boolean indicating whether to prune the compressed set (Algorithm 2 from Near-optimal sample compression for nearest neighbors)
- logLevel (optional, default='CRITICAL') is a string setting the logging level (set to 'INFO' or 'DEBUG' for more information)
- n_jobs (optional, default=1) is an integer defining how many CPUs to use; pass -1 to use all. For n_jobs below -1, (n_cpus + 1 + n_jobs) CPUs are used, so n_jobs = -2 uses all CPUs but one
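The following is a minimal construction sketch, not taken from the package's own documentation: the import path from ksu.KSU import KSU, the toy data, and the callable Euclidean metric are assumptions for illustration.

```python
import numpy as np
from ksu.KSU import KSU  # assumed import path, inferred from the ksu.KSU.METRICS reference above

# Toy two-class dataset (purely illustrative)
Xs = np.random.rand(200, 5)
Ys = (Xs[:, 0] > 0.5).astype(int)

def euclidean(a, b):
    # Plain L2 distance; any callable taking two points and returning a float can serve as the metric
    return np.linalg.norm(np.asarray(a) - np.asarray(b))

# A metric-name string can be passed instead of a callable;
# the available names are listed by ksu.KSU.METRICS.keys()
ksuInstance = KSU(Xs, Ys, euclidean, logLevel='INFO', n_jobs=-1)
```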
A KSU instance provides a method compressData([delta]), which selects the subset with the lowest estimated error with confidence 1 - delta.

You can then run getClassifier(), which returns a 1-NN classifier (based on sklearn's K-NN) fitted to the compressed data, or run getCompressedSet() to get the compressed data as a tuple of numpy arrays (compressedXs, compressedYs). Both are shown in the sketch below.
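Continuing the sketch above (the instance name, delta value, and prediction call are illustrative assumptions), compression and the two accessors could look like this:

```python
# Select the compressed subset with the lowest estimated error,
# with confidence 1 - delta (here 0.9)
ksuInstance.compressData(0.1)

# 1-NN classifier (sklearn-based) fitted to the compressed data
classifier = ksuInstance.getClassifier()
predictions = classifier.predict(Xs[:10])

# Or retrieve the compressed set directly
compressedXs, compressedYs = ksuInstance.getCompressedSet()
print(compressedXs.shape, compressedYs.shape)
```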
See scripts/ for example usage.