FINCH - First Integer Neighbor Clustering Hierarchy Algorithm
FINCH Clustering Algorithm
First Integer Neighbor Clustering Hierarchy Algorithm
A Python implementation of the FINCH algorithm from the paper:
Sarfraz, Saquib, Vivek Sharma, and Rainer Stiefelhagen. "Efficient parameter-free clustering using first neighbor relations." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
This implementation is faster than the original implementation (see benchmark below). Further, our code deviates from the paper as it does not implement Algorithm 2, "Required Number of Clusters Mode".
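For readers who have not seen the paper, the first-neighbor linking idea behind FINCH can be sketched in a few lines. This is only an illustration of the first partition, written with brute-force distances and the standard library; it is not finchpy's code, and the function name `first_neighbor_partition` is hypothetical. It assumes at least two points and Euclidean distance.

```python
import math


def first_neighbor_partition(points):
    """Group points by linking each one to its nearest (first) neighbor."""
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per cluster

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        # Index of the closest other point (brute force, O(n) per point).
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        # Linking i to its first neighbor covers the adjacency cases from
        # the paper: j = k(i), k(j) = i, and k(i) = k(j).
        parent[find(i)] = find(nearest)

    # Relabel roots as consecutive cluster ids in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
labels = first_neighbor_partition(points)
print(labels)  # [0, 0, 0, 1, 1]
```

The full algorithm repeats this step on the cluster means to build the hierarchy of partitions; that recursion is omitted here.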
Installation
The easiest way to install finchpy is with pip:
pip install finchpy
How to use
from finch import FINCH
fin = FINCH()
fin.fit(data)
print(fin.partitions)
- Demo Notebook: the following notebook demonstrates common features of this package (see Jupyter Notebook)
Class Parameters
--metric string The used distance metric Default='euclidean'
--n_jobs int The number of processes Default=1
Methods
fit(X)
: Apply the FINCH algorithm.
fit_predict(X)
: Apply the FINCH algorithm and return the labels of a reasonable partition, chosen using the silhouette coefficient.
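To make the silhouette-based choice concrete, here is a minimal sketch of how one partition could be picked from several candidates. It is an assumption about the general technique, not finchpy's internals: the helper names `silhouette_score` and `pick_partition` and the list-of-label-lists input are hypothetical, and the distance is Euclidean.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (brute force)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        # Average distance from point i to the given members, excluding i.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    total = 0.0
    for i, label in enumerate(labels):
        if len(clusters[label]) == 1:
            continue  # singleton clusters contribute 0 by convention
        a = mean_dist(i, clusters[label])  # cohesion within own cluster
        b = min(  # separation from the nearest other cluster
            mean_dist(i, members)
            for other, members in clusters.items()
            if other != label
        )
        scale = max(a, b)
        if scale > 0:
            total += (b - a) / scale
    return total / len(labels)


def pick_partition(points, partitions):
    """Return the label list with the highest silhouette score."""
    return max(partitions, key=lambda labels: silhouette_score(points, labels))


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0), (9.1, 0.0)]
candidates = [
    [0, 0, 1, 1, 2, 2],  # fine partition: three tight pairs
    [0, 0, 0, 0, 1, 1],  # coarse partition: first two pairs merged
]
best = pick_partition(points, candidates)
print(best)  # [0, 0, 1, 1, 2, 2]
```

The brute-force version is quadratic in the number of samples, so for real data a library routine such as scikit-learn's `sklearn.metrics.silhouette_score` is likely the better choice.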
Benchmark
Here is a comparison of the performance of the finchpy implementation to the original ssarfraz Python implementation:
Hardware: Intel(R) Core(TM) i7-6567U CPU @ 3.30GHz with 16 GB RAM
Timings were measured with %timeit using 2 runs of 5 loops each; memory usage was measured with %memit.
| Samples| ssarfraz CPU | ssarfraz RAM | finchpy CPU | finchpy RAM |
|------- |------------- |------------- |------------ |-------------|
| 1000 | 32.4 ms | 109.63 MiB | 29.3 ms | 93.02 MiB |
| 10000 | 1.62 s | 689.86 MiB | 215 ms | 95.99 MiB |
| 20000 | 7.57 s | 2069.90 MiB | 443 ms | 101.78 MiB |
| 50000 | ----- | ----- | 1.4 s | 115.35 MiB |
| 75000 | ----- | ----- | 2.56 s | 129.67 MiB |
pyflann was not used with the ssarfraz code because it does not support Python 3.
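Outside IPython, a comparable measurement can be approximated with the standard library. The sketch below assumes the setup above corresponds to `%timeit -r 2 -n 5`; the `benchmark` helper and the placeholder `workload` are hypothetical. Note that `tracemalloc` tracks only allocations made through Python's allocator, so its numbers are not directly comparable to the process-level figures from %memit.

```python
import timeit
import tracemalloc


def benchmark(func, repeat=2, number=5):
    """Return (best seconds per call, peak traced bytes) for func()."""
    # `repeat` runs of `number` loops each, like `%timeit -r 2 -n 5`.
    totals = timeit.repeat(func, repeat=repeat, number=number)
    best_per_call = min(totals) / number

    # Peak Python-level allocation during a single extra call.
    tracemalloc.start()
    func()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return best_per_call, peak


def workload():
    # Placeholder workload; swap in the clustering call to be measured.
    return sorted(range(50_000), reverse=True)


seconds, peak_bytes = benchmark(workload)
print(f"{seconds * 1e3:.2f} ms per call, peak {peak_bytes / 2**20:.2f} MiB")
```

This reports the fastest run, which may differ from the statistic that %timeit prints.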
License
Released under the MIT License. See the LICENSE file for details. The package was developed by Eren Cakmak from the Data Analysis and Visualization Group, Konstanz, Germany. This work is partly funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy – EXC 2117 – 422037984.