
FINCH - First Integer Neighbor Clustering Hierarchy Algorithm


FINCH Clustering Algorithm

First Integer Neighbor Clustering Hierarchy Algorithm

A Python implementation of the FINCH algorithm from the paper:

Sarfraz, Saquib, Vivek Sharma, and Rainer Stiefelhagen. "Efficient parameter-free clustering using first neighbor relations." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.

In the benchmark below, this implementation is faster than the original implementation and uses less memory (for example, 215 ms versus 1.62 s at 10,000 samples). Note that the code deviates from the paper in one respect: it does not implement Algorithm 2, "Required Number of Clusters Mode".
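To make the idea behind the paper concrete, here is a minimal sketch of a single first-neighbor partition step: every point is linked to its nearest neighbor, and the connected groups that result form the clusters. This is an illustration only, not the code used by finchpy. The function name `first_neighbor_partition` and the sample points are assumptions made for the example, it uses a brute-force neighbor search, and it covers one level only rather than the full hierarchy described in the paper.

```python
import math


def first_neighbor_partition(points):
    """Group points by linking each one to its nearest neighbor (needs >= 2 points)."""
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per cluster

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        # Index of the first (nearest) neighbor of point i, excluding itself.
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        # Linking i to its first neighbor also joins points that share a neighbor.
        parent[find(i)] = find(nearest)

    # Relabel the roots as consecutive integers in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical toy data: two well-separated groups of three points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.4),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.4)]
labels = first_neighbor_partition(points)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The brute-force search is quadratic in the number of points, so the sketch is only suitable for small inputs.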


Installation

The easiest way to install finchpy is with pip:

pip install finchpy

How to use

from finch import FINCH

fin = FINCH()
fin.fit(data)

print(fin.partitions)
  • Demo Notebook: the accompanying notebook demonstrates common features of this package (see Jupyter Notebook).

Class Parameters

--metric        string      The distance metric to use            Default='euclidean'
--n_jobs        int         The number of processes to use        Default=1

Methods

  • fit(X): Applies the FINCH algorithm to X.
  • fit_predict(X): Applies the FINCH algorithm and returns the labels of a reasonable partition, chosen based on the silhouette coefficient.
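As a rough illustration of what choosing a partition by silhouette coefficient can look like, the sketch below scores two candidate labelings and keeps the one with the higher mean silhouette. How finchpy performs this selection internally is not described here, so this is an assumption-laden example rather than the package's code: the helper `silhouette_score`, the candidate names, and the toy data are all hypothetical.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    def mean_dist(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            continue  # singleton cluster contributes 0
        a = mean_dist(i, own)  # mean distance within the point's own cluster
        b = min(mean_dist(i, m) for other, m in clusters.items() if other != lab)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical toy data and two candidate partitions of it.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.4),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.4)]
candidates = {
    "fine": [0, 0, 1, 2, 2, 3],
    "coarse": [0, 0, 0, 1, 1, 1],
}
best = max(candidates, key=lambda name: silhouette_score(points, candidates[name]))
print(best)  # coarse
```

On this data the coarse partition wins because both groups are compact and far apart, which is the situation the silhouette coefficient rewards.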

Benchmark

Here is a comparison of the performance of the finchpy implementation to the original ssarfraz Python implementation:

Hardware: Intel(R) Core(TM) i7-6567U CPU @ 3.30GHz with 16 GB RAM
Timings were measured with %timeit (2 runs of 5 loops each) and memory usage with %memit.

| Samples| ssarfraz CPU | ssarfraz RAM | finchpy CPU | finchpy RAM | 
|------- |------------- |------------- |------------ |-------------|
| 1000   | 32.4 ms      | 109.63 MiB   | 29.3 ms     | 93.02 MiB   |
| 10000  | 1.62 s       | 689.86 MiB   | 215 ms      | 95.99 MiB   |
| 20000  | 7.57 s       | 2069.90 MiB  | 443 ms      | 101.78 MiB  |
| 50000  | -----        | -----        | 1.4 s       | 115.35 MiB  |
| 75000  | -----        | -----        | 2.56 s      | 129.67 MiB  |

Note: pyflann was not used for the ssarfraz code, as it does not support Python 3.
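For readers who want to run a comparable measurement outside IPython, the following sketch uses only the standard library. It is not the script that produced the table above. The `workload` function is a hypothetical stand-in for a clustering call, `timeit.repeat` only approximates %timeit (which reports a mean and standard deviation), and `tracemalloc` tracks Python-level allocations rather than the process memory that %memit reports.

```python
import timeit
import tracemalloc


def workload(n=2000):
    """Hypothetical stand-in for the call being benchmarked."""
    return sum(i * i for i in range(n))


# Two runs of five loops each, mirroring the settings quoted for the benchmark.
loops = 5
timings = timeit.repeat(workload, repeat=2, number=loops)
best_per_loop = min(timings) / loops

# Peak size of Python allocations made while the workload runs once.
tracemalloc.start()
workload()
_, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"best time per loop: {best_per_loop * 1e3:.3f} ms")
print(f"peak traced allocation: {peak_bytes / 2**20:.3f} MiB")
```

Replacing `workload` with a function that fits the clustering on a fixed dataset gives numbers that can be compared across implementations on the same machine.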

License

Released under the MIT License. See the LICENSE file for details. The package was developed by Eren Cakmak from the Data Analysis and Visualization Group, Konstanz, Germany. This work is partly funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy – EXC 2117 – 422037984.
