A package for clustering with cluster size constraints.

ccluster: a package for clustering with size constraints

ccluster is a library for clustering with exhaustive or partial cluster size constraints. It currently provides two constrained clustering algorithms: a constrained k-means suitable for Euclidean data, and a constrained graph clustering algorithm based on spectral clustering.

For more details, please refer to the documentation.

[Figures: three k-means examples and three spectral clustering examples]

Installation

This package is available on PyPI. It can be installed as follows:

pip install ccluster

Its dependencies are installed automatically. The package can then be imported:

import ccluster
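To confirm that the installation is visible to the current interpreter, a small standard-library check can be used. This is only a sketch: `installed_version` is a hypothetical helper written for illustration, not part of ccluster.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if ccluster is installed in this environment, otherwise None.
print(installed_version("ccluster"))
```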

Quick-Start

Here is an example of constrained $k$-means with exhaustive constraints, i.e. a size constraint is given for every cluster:

>>> from ccluster.size import ConstrainedKMeans
>>> import numpy as np

>>> X = np.array([[1, 2], [1, 4], [1, 0],
...               [10, 2], [10, 4], [10, 0]])
>>> kmeans = ConstrainedKMeans(
...     n_clusters=2,
...     cluster_size=[2, 4],
...     random_state=0,
...     n_init=10).fit(X)
>>> kmeans.labels_
# array([1, 1, 1, 0, 0, 1])  
>>> kmeans.predict([[0, 0], [12, 3]], [1, 1])
# array([1, 0], dtype=int32)  
>>> kmeans.cluster_centers_
# array([[10. , 3. ],  
#        [ 3.25, 1.5 ]])  
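The output above can be checked by hand with NumPy alone. The sketch below copies the data and labels from the example (it does not call ccluster), counts the points per label, and recomputes each cluster mean. The variable names `sizes` and `centers` are chosen here for illustration.

```python
import numpy as np

# Data and labels copied from the quick-start example above.
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])
labels = np.array([1, 1, 1, 0, 0, 1])

# Number of points assigned to each label (array index = cluster label).
sizes = np.bincount(labels, minlength=2)

# Mean of the points in each cluster, in label order.
centers = np.array([X[labels == k].mean(axis=0) for k in range(2)])

print(sizes)    # cluster 0 holds 2 points, cluster 1 holds 4
print(centers)
```

The counts match the requested sizes of 2 and 4, and the recomputed means agree with the reported cluster centers. Note that the point [10, 0] is assigned to the cluster on the left even though it lies closer to the other centroid, since the size constraint takes precedence.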

It is also possible to specify partial constraints, i.e. constraints on a subset of the clusters, leaving the remaining clusters free. Here is an example using constrained spectral clustering:

>>> from ccluster.size import ConstrainedSpectralClustering
>>> import numpy as np

>>> X = np.array([[1, 1], [2, 1], [1, 0],
...               [4, 7], [3, 5], [3, 6],
...               [9, 6], [5, 4], [2, 1]])

>>> spectral = ConstrainedSpectralClustering(
...     n_clusters=4, 
...     cluster_sizes=[2, 2],
...     random_state=0).fit(X)

>>> spectral.labels_
# array([2, 3, 2, 0, 3, 0, 1, 1, 3])
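To see what the partial constraint means for this result, the label counts can be inspected with NumPy. The sketch below uses only the labels printed above and does not call ccluster. It assumes the constrained clusters are the ones with the lowest labels, which is consistent with this output but is not confirmed by the documentation. `satisfies_partial_sizes` is a hypothetical helper.

```python
import numpy as np


def satisfies_partial_sizes(labels, n_clusters, constrained_sizes):
    """Return True if the first len(constrained_sizes) labels have exactly those sizes.

    Assumption: constraints apply to the lowest-numbered labels.
    """
    sizes = np.bincount(labels, minlength=n_clusters)
    k = len(constrained_sizes)
    return bool(np.array_equal(sizes[:k], constrained_sizes))


# Labels copied from the example above.
labels = np.array([2, 3, 2, 0, 3, 0, 1, 1, 3])
sizes = np.bincount(labels, minlength=4)

print(sizes.tolist())                               # [2, 2, 2, 3]
print(satisfies_partial_sizes(labels, 4, [2, 2]))   # True
```

Clusters 0 and 1 each contain exactly two points, as requested, while the two unconstrained clusters absorb the remaining five points.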
