A Python package for common-nearest-neighbours clustering


Common-nearest-neighbours clustering


NOTE

This project is currently under development. The implementation may change in the future. Check the examples and the documentation for up-to-date information.


cnnclustering

The cnnclustering Python package provides a flexible interface to the common-nearest-neighbours clustering algorithm. While the method can be applied to arbitrary data, this implementation was developed with the processing of trajectories from Molecular Dynamics simulations in mind. In this context, the clustering result can serve as a basis for constructing a core-set Markov-state model (cs-MSM) that captures the essential dynamics of the underlying molecular processes. For a tool for cs-MSM estimation, refer to this separate project.

The package provides a main module:

  • cluster: User interface to (hierarchical) common-nearest-neighbours clustering

Further, it contains the modules:

  • plot: Convenience functions to evaluate cluster results
  • _types: Direct access to generic types representing needed cluster components
  • _fit: Direct access to generic clustering procedures

Features:

  • Flexible: Clustering can be done for data sets in different input formats. Easy interfacing with external methods.
  • Convenient: Integrates functionality that is handy in the context of Molecular Dynamics.
  • Fast: Core functionalities implemented in Cython.

Please refer to the following papers for the scientific background (and consider citing if you find the method useful):

  • B. Keller, X. Daura, W. F. van Gunsteren, J. Chem. Phys., 2010, 132, 074110.
  • O. Lemke, B. G. Keller, J. Chem. Phys., 2016, 145, 164104.
  • O. Lemke, B. G. Keller, Algorithms, 2018, 11, 19.

Documentation

The package documentation (under development) is available here.

Install

Refer to the documentation for more details. Install from PyPI:

$ pip install cnnclustering

or clone the development version and install from the local clone:

$ git clone https://github.com/janjoswig/CommonNNClustering.git
$ cd CommonNNClustering
$ pip install .
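
To confirm that either installation route succeeded, the installed version can be queried from Python. This is a minimal sketch that uses only the standard library (Python 3.8+); the helper name installed_version is made up for this example and is not part of cnnclustering.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="cnnclustering"):
    """Return the installed version string of `package`, or None if absent.

    Hypothetical helper for illustration; not part of cnnclustering.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version())
```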

Quickstart

>>> from cnnclustering.cluster import prepare_clustering

>>> # 2D data points (list of lists, 12 points in 2 dimensions)
>>> data_points = [   # point index
...     [0, 0],       # 0
...     [1, 1],       # 1
...     [1, 0],       # 2
...     [0, -1],      # 3
...     [0.5, -0.5],  # 4
...     [2,  1.5],    # 5
...     [2.5, -0.5],  # 6
...     [4, 2],       # 7
...     [4.5, 2.5],   # 8
...     [5, -1],      # 9
...     [5.5, -0.5],  # 10
...     [5.5, -1.5],  # 11
...     ]

>>> clustering = prepare_clustering(data_points)
>>> clustering.fit(radius_cutoff=1.5, cnn_cutoff=1, v=False)
>>> clustering.labels
Labels([1, 1, 1, 1, 1, 0, 0, 0, 0, 2, 2, 2])
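
To make the roles of radius_cutoff and cnn_cutoff in the quickstart above more concrete, here is a pure-Python sketch of the common-nearest-neighbours criterion. It is an illustration under stated assumptions, not the package's Cython implementation: two points are joined when they lie within radius_cutoff of each other and share at least cnn_cutoff neighbours, a point is not counted as its own neighbour, and points that cannot be joined to any other point are labelled 0 (noise). The function name commonnn_labels is hypothetical. Clusters are numbered in the order they are found, which for this data set coincides with the labels shown above.

```python
from collections import deque
from math import dist  # Python 3.8+


def commonnn_labels(points, radius_cutoff, cnn_cutoff):
    """Illustrative common-nearest-neighbours labelling (0 = noise).

    Assumptions (not taken from the cnnclustering source): a point is not
    its own neighbour, and the shared-neighbour test uses ">=".
    """
    n = len(points)
    # Neighbourhood of i: indices of all other points within radius_cutoff.
    neighbours = [
        {j for j in range(n)
         if j != i and dist(points[i], points[j]) <= radius_cutoff}
        for i in range(n)
    ]

    labels = [0] * n
    current = 0
    for start in range(n):
        if labels[start] != 0:
            continue  # already assigned to a cluster
        # Grow a candidate cluster from `start` by breadth-first search.
        members = [start]
        seen = {start}
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in neighbours[i]:
                if j in seen:
                    continue
                # Join i and j if they share enough common neighbours.
                if len(neighbours[i] & neighbours[j]) >= cnn_cutoff:
                    seen.add(j)
                    members.append(j)
                    queue.append(j)
        if len(members) > 1:  # isolated points stay noise
            current += 1
            for m in members:
                labels[m] = current
    return labels


data_points = [
    [0, 0], [1, 1], [1, 0], [0, -1], [0.5, -0.5], [2, 1.5],
    [2.5, -0.5], [4, 2], [4.5, 2.5], [5, -1], [5.5, -0.5], [5.5, -1.5],
]
print(commonnn_labels(data_points, radius_cutoff=1.5, cnn_cutoff=1))
# [1, 1, 1, 1, 1, 0, 0, 0, 0, 2, 2, 2]
```

Points 7 and 8 are within the radius of each other but share no third neighbour, so under these assumptions they remain noise, while point 5 neighbours point 1 without sharing a neighbour with it.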

Alternative scikit-learn implementation

We also provide an alternative implementation of common-nearest-neighbours clustering, written in the style of the scikit-learn API, within scikit-learn-extra.
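
As a rough sketch of what using that alternative might look like, the snippet below calls the CommonNNClustering estimator from sklearn_extra.cluster with its eps and min_samples parameters. It assumes scikit-learn-extra and NumPy are installed and returns None otherwise. The wrapper name cluster_with_sklearn_extra is made up for this example, and treating eps and min_samples as counterparts of radius_cutoff and cnn_cutoff is an assumption: parameter semantics and label conventions (for example, how noise is marked) may differ from cnnclustering, so consult the scikit-learn-extra documentation before comparing results.

```python
def cluster_with_sklearn_extra(points, eps, min_samples):
    """Cluster `points` with scikit-learn-extra, or return None if unavailable.

    Hypothetical wrapper for illustration only.
    """
    try:
        import numpy as np
        from sklearn_extra.cluster import CommonNNClustering
    except ImportError:
        return None  # optional dependency not installed

    model = CommonNNClustering(eps=eps, min_samples=min_samples)
    model.fit(np.asarray(points, dtype=float))
    # labels_ holds one integer label per input point.
    return model.labels_.tolist()


data_points = [
    [0, 0], [1, 1], [1, 0], [0, -1], [0.5, -0.5], [2, 1.5],
    [2.5, -0.5], [4, 2], [4.5, 2.5], [5, -1], [5.5, -0.5], [5.5, -1.5],
]
labels = cluster_with_sklearn_extra(data_points, eps=1.5, min_samples=1)
print(labels)
```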

Download files

Source Distribution

cnnclustering-0.4.2.tar.gz (9.3 MB)

File metadata

  • Size: 9.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.0 CPython/3.8.7

File hashes

Hashes for cnnclustering-0.4.2.tar.gz

Algorithm    Hash digest
SHA256       7a2ff128bac17bc607853a85b998d79c4f6e897a81fd07f3f6148b358ef2d024
MD5          07de1a531d08f28ba1d26eddc7f993ab
BLAKE2b-256  d5b64c5e2fced8d7ee20e7e1e0bc70a2a9a7524a224a3790cc101439c92c9700
