A parameter-free fast clustering algorithm.

First Integer Neighbor Clustering Hierarchy (FINCH) Algorithm

This repository contains our Python and Matlab code for the FINCH clustering algorithm, described in our CVPR 2019 oral paper "Efficient Parameter-free Clustering Using First Neighbor Relations".

@inproceedings{finch,
    author    = {M. Saquib Sarfraz and Vivek Sharma and Rainer Stiefelhagen}, 
    title     = {Efficient Parameter-free Clustering Using First Neighbor Relations}, 
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    pages = {8934--8943},
    year  = {2019}
}

Installation

The project is available on PyPI. To install, run:

pip install finch-clust

Optional: install PyNNDescent to compute first neighbours on large datasets.

To install FINCH with PyNNDescent, run:

pip install "finch-clust[ann]"

Usage:

Typically, you would run:

from finch import FINCH
c, num_clust, req_c = FINCH(data)

You can also set options such as the required number of clusters or the distance metric:

c, num_clust, req_c = FINCH(data, initial_rank=None, req_clust=None, distance='cosine', verbose=True)

Input:

  • data: numpy array (feature vectors in rows)
  • [OPTIONAL]
    • req_clust: required number of clusters
    • distance: one of ['cityblock', 'cosine', 'euclidean', 'l1', 'l2', 'manhattan']. Recommended: 'cosine' (default) or 'euclidean' (for 2D data)
    • initial_rank: Nx1 vector of first-neighbour indices
    • ensure_early_exit: (default: True) may help on unbalanced or large datasets by ensuring the purity of merges and allowing an early exit
    • verbose: print some output while running
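
To illustrate what initial_rank holds, the sketch below computes first-neighbour indices by brute force with NumPy under cosine distance. The helper name first_neighbours is hypothetical and not part of the package. It assumes that a point is never its own first neighbour and that no row is all zeros. It returns a length-N vector, which may need reshaping to match the form FINCH expects.

```python
import numpy as np

def first_neighbours(data):
    """For each row of `data`, return the index of its nearest other row (cosine distance).

    Hypothetical helper for illustration only; not part of finch-clust.
    """
    # Normalise rows so that a dot product equals cosine similarity.
    unit = data / np.linalg.norm(data, axis=1, keepdims=True)
    dist = 1.0 - unit @ unit.T
    # Assumption: a point must not be selected as its own neighbour.
    np.fill_diagonal(dist, np.inf)
    return np.argmin(dist, axis=1)

# Four 2D points forming two obvious pairs.
data = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
initial_rank = first_neighbours(data)
print(initial_rank)
```

A brute-force distance matrix needs O(N^2) memory, which is why an approximate nearest-neighbour library such as PyNNDescent is the optional route for large data.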

Output:

  • c: N x P array; each column contains the cluster labels for one of the P partitions
  • num_clust: number of clusters in each of the P partitions
  • req_c: Labels of required clusters (Nx1). Only set if req_clust is not None.
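
As an illustration of how these outputs fit together, the sketch below picks the partition whose cluster count is closest to a target. The arrays are hand-made stand-ins for c and num_clust, not real FINCH output, and pick_partition is a hypothetical helper that is not part of the package.

```python
import numpy as np

# Hand-made stand-ins for FINCH output: 6 samples, 2 partitions.
# Column 0 has 3 clusters, column 1 has 2 clusters.
c = np.array([[0, 0],
              [0, 0],
              [1, 0],
              [1, 0],
              [2, 1],
              [2, 1]])
num_clust = [3, 2]

def pick_partition(c, num_clust, target):
    """Return the labels of the partition whose cluster count is closest to `target`.

    Hypothetical helper for illustration only; not part of finch-clust.
    """
    idx = int(np.argmin([abs(n - target) for n in num_clust]))
    return c[:, idx]

labels = pick_partition(c, num_clust, target=2)
print(labels)
```

This only selects among the partitions FINCH already produced. When an exact number of clusters is needed, passing req_clust and reading req_c is the route the package itself provides.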

Matlab usage

For usage in Matlab check README in the matlab directory.

The code and the FINCH algorithm are not meant for commercial use. Please contact the author below for licensing information.

M. Saquib Sarfraz (saquib.sarfraz@kit.edu)
