A package for clustering with cluster size constraints.

ccluster: a package for clustering with size constraints

ccluster is a library for clustering with exhaustive or partial cluster size constraints. It currently provides two constrained clustering algorithms: a constrained k-means suited to Euclidean data, and a constrained graph clustering algorithm based on spectral clustering.
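To illustrate what a size constraint does to the k-means objective, here is a minimal brute-force sketch in plain Python. It is not the algorithm ccluster implements; it simply enumerates every two-cluster split of a tiny dataset and compares the best cost with and without a fixed cluster size. The helper names `sse` and `best_two_cluster_cost` are hypothetical.

```python
from itertools import combinations


def sse(points):
    """Sum of squared distances from each point to the mean of the group."""
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    return sum((p[d] - mean[d]) ** 2 for p in points for d in range(dim))


def best_two_cluster_cost(points, size_first):
    """Lowest total SSE over all splits whose first cluster has size_first points."""
    indices = range(len(points))
    best = float("inf")
    for chosen in combinations(indices, size_first):
        first = [points[i] for i in chosen]
        second = [points[i] for i in indices if i not in chosen]
        best = min(best, sse(first) + sse(second))
    return best


points = [[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]]

# Unconstrained: try every possible size for the first cluster.
free = min(best_two_cluster_cost(points, k) for k in range(1, len(points)))
# Constrained: the first cluster must contain exactly 2 points.
constrained = best_two_cluster_cost(points, 2)

print(free, constrained)  # 16.0 73.75
```

Forcing sizes of 2 and 4 raises the best achievable cost because one point has to join the far group. Several splits tie at that cost, so a constrained solver may return any of them.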

For more details, please refer to the documentation.

[Figures: three k-means examples and three spectral clustering examples]

Installation

This package is available on PyPI. It can be installed as follows:

pip install ccluster

Its dependencies are installed automatically. The package can then be imported:

import ccluster
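To confirm which version was installed without importing the package itself, the standard library's `importlib.metadata` can be queried. This is a small sketch; the helper name `installed_version` is ours and not part of ccluster.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("ccluster")
print(found if found is not None else "ccluster is not installed")
```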

Quick-Start

Here is an example of constrained k-means with exhaustive constraints, i.e. a size constraint for every cluster:

>>> from ccluster.size import ConstrainedKMeans
>>> import numpy as np

>>> X = np.array([[1, 2], [1, 4], [1, 0],
...               [10, 2], [10, 4], [10, 0]])
>>> kmeans = ConstrainedKMeans(
...     n_clusters=2,
...     cluster_size=[2, 4],
...     random_state=0,
...     n_init=10).fit(X)
>>> kmeans.labels_
# array([1, 1, 1, 0, 0, 1])
>>> kmeans.predict([[0, 0], [12, 3]], [1, 1])
# array([1, 0], dtype=int32)
>>> kmeans.cluster_centers_
# array([[10. , 3. ],
#        [ 3.25, 1.5 ]])
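The printed result can be checked by hand. The sketch below uses only the standard library and the labels shown above to recompute the cluster sizes and centers; `cluster_sizes` and `centroids` are hypothetical helpers, not ccluster functions.

```python
from collections import Counter

# Data and labels copied from the example above.
X = [[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]]
labels = [1, 1, 1, 0, 0, 1]


def cluster_sizes(labels):
    """Count how many points carry each label."""
    return dict(Counter(labels))


def centroids(points, labels):
    """Mean of the points in each cluster, keyed by label."""
    grouped = {}
    for point, label in zip(points, labels):
        grouped.setdefault(label, []).append(point)
    return {
        label: [sum(coord) / len(members) for coord in zip(*members)]
        for label, members in grouped.items()
    }


sizes = cluster_sizes(labels)
centers = centroids(X, labels)
print(sizes[0], sizes[1])  # 2 4
print(centers[0])          # [10.0, 3.0]
print(centers[1])          # [3.25, 1.5]
```

In this output, cluster 0 holds 2 points and cluster 1 holds 4, matching the requested sizes in order, and the recomputed means agree with `cluster_centers_`.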

It is also possible to specify partial constraints, i.e. to constrain the sizes of only a subset of the clusters and leave the remaining ones free. Here is an example using constrained spectral clustering:

>>> from ccluster.size import ConstrainedSpectralClustering
>>> import numpy as np

>>> X = np.array([[1, 1], [2, 1], [1, 0],
...               [4, 7], [3, 5], [3, 6],
...               [9, 6], [5, 4], [2, 1]])

>>> spectral = ConstrainedSpectralClustering(
...     n_clusters=4,
...     cluster_sizes=[2, 2],
...     random_state=0).fit(X)

>>> spectral.labels_
# array([2, 3, 2, 0, 3, 0, 1, 1, 3])
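A quick way to see the partial constraints in this output is to count the points per label. The sketch below assumes the constraints apply to labels 0, 1, ... in order, which is inferred from the printed labels rather than taken from the ccluster documentation; `satisfies_partial_sizes` is a hypothetical helper.

```python
from collections import Counter

# Labels copied from the spectral clustering example above.
labels = [2, 3, 2, 0, 3, 0, 1, 1, 3]
constrained_sizes = [2, 2]  # sizes requested for the constrained clusters


def satisfies_partial_sizes(labels, constrained_sizes):
    """Check that cluster i has constrained_sizes[i] points for each constrained i.

    Assumption: the i-th constraint applies to the cluster labelled i.
    """
    counts = Counter(labels)
    return all(counts[i] == size for i, size in enumerate(constrained_sizes))


counts = Counter(labels)
print(sorted(counts.items()))  # [(0, 2), (1, 2), (2, 2), (3, 3)]
print(satisfies_partial_sizes(labels, constrained_sizes))  # True
```

Clusters 0 and 1 have exactly 2 points each, while the two unconstrained clusters share the remaining 5 points.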

Download files

Source distribution: ccluster-0.1.3.tar.gz (16.2 kB)

Built distribution: ccluster-0.1.3-py3-none-any.whl (17.4 kB, Python 3)
