Size Constrained Clustering solver
An implementation of Deterministic Annealing size-constrained clustering. Size-constrained clustering can be treated as an optimization problem; details can be found in the papers listed under Reference.
This is a fork of https://github.com/jingw2/size_constrained_clustering that fixes installation issues and maintains only the Deterministic Annealing clustering.
Installation
Requirements: Python >= 3.6, NumPy >= 1.13
- install from PyPI
pip install light-size-constrained-clustering
Methods
- Deterministic Annealing Algorithm: takes a target cluster size distribution as input and returns the corresponding clusters
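As background, deterministic annealing in its general form replaces hard assignments with soft ones that sharpen as a temperature parameter is lowered. The sketch below illustrates only that soft-assignment step for a single point, in plain Python. The function name `soft_assignments`, its parameters, and the optional positive per-cluster `weights` are assumptions made for illustration; the package's internals may differ.

```python
import math

def soft_assignments(point, centers, temperature, weights=None):
    """Illustrative only: Gibbs association probabilities of one point.

    Each cluster k gets probability proportional to
    weights[k] * exp(-squared_distance(point, centers[k]) / temperature).
    """
    if weights is None:
        weights = [1.0] * len(centers)
    # Squared Euclidean distance from the point to every center.
    sq = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centers]
    # Subtracting the minimum keeps exp() from underflowing for every term.
    shift = min(sq)
    scores = [w * math.exp(-(d - shift) / temperature)
              for w, d in zip(weights, sq)]
    total = sum(scores)
    return [s / total for s in scores]

centers = [(0.0, 0.0), (1.0, 0.0)]
hot = soft_assignments((0.0, 0.0), centers, temperature=1e6)
cold = soft_assignments((0.0, 0.0), centers, temperature=0.01)
print(hot)   # close to uniform at high temperature
print(cold)  # close to a hard assignment at low temperature
```

At high temperature every cluster is almost equally likely; as the temperature drops, the probabilities concentrate on the nearest center.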
Usage:
Deterministic Annealing
# setup
from light_size_constrained_clustering import da
import numpy as np
n_samples = 40  # number of cells in the spot
n_clusters = 4  # number of distinct cell types
distribution = [0.4, 0.3, 0.2, 0.1]  # fraction of each cell type (from deconv)
seed = 17
print(np.sum(distribution))
np.random.seed(seed)
X = np.random.rand(n_samples, 2)
# distribution gives the target fraction of samples in each cluster
model = da.DeterministicAnnealing(n_clusters, distribution=distribution, random_state=seed)
model.fit(X)
centers = model.cluster_centers_
labels = model.labels_
print("Labels:")
print(labels)
print("Elements in cluster 0: ", np.count_nonzero(labels == 0))
print("Elements in cluster 1: ", np.count_nonzero(labels == 1))
print("Elements in cluster 2: ", np.count_nonzero(labels == 2))
print("Elements in cluster 3: ", np.count_nonzero(labels == 3))
If the provided distribution is not respected because the algorithm fails to converge, it can
be enforced with the parameter enforce_cluster_distribution:
model.fit(X, enforce_cluster_distribution=True)
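How the package enforces the distribution is not described here. For intuition only, the sketch below shows one generic way to force exact cluster sizes once centers are known: a greedy assignment that fills clusters in order of increasing point-to-center distance. The helper `greedy_sized_assignment` and its arguments are hypothetical and written in plain Python; this is not claimed to be what `enforce_cluster_distribution` does.

```python
def greedy_sized_assignment(points, centers, sizes):
    """Hypothetical helper: give cluster k exactly sizes[k] points."""
    if sum(sizes) != len(points):
        raise ValueError("sizes must add up to the number of points")
    # All (squared distance, point index, cluster index) triples, nearest first.
    pairs = sorted(
        (sum((a - b) ** 2 for a, b in zip(p, c)), i, k)
        for i, p in enumerate(points)
        for k, c in enumerate(centers)
    )
    labels = [-1] * len(points)
    remaining = list(sizes)
    for _, i, k in pairs:
        # Take the pair only if the point is free and the cluster has room.
        if labels[i] == -1 and remaining[k] > 0:
            labels[i] = k
            remaining[k] -= 1
    return labels

points = [(0, 0), (0, 1), (10, 0), (10, 1)]
centers = [(0, 0), (10, 0)]
labels = greedy_sized_assignment(points, centers, sizes=[3, 1])
print(labels)
```

The greedy rule is simple but not optimal; the min-cost-flow formulations listed under Reference solve the same assignment exactly.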
Cluster sizes are 16, 12, 8 and 4 in the figure above, corresponding to the distribution [0.4, 0.3, 0.2, 0.1].
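These sizes follow from multiplying each fraction by the number of samples (for example, 40 × 0.4 = 16). A minimal sketch of that arithmetic is shown below, assuming a largest-remainder rule for cases where the products are not whole numbers. `target_sizes` is a hypothetical helper, not part of the package, and the package may round differently.

```python
import math

def target_sizes(n_samples, distribution):
    """Hypothetical helper: turn fractions into integer cluster sizes."""
    if abs(sum(distribution) - 1.0) > 1e-6:
        raise ValueError("distribution must sum to 1")
    raw = [n_samples * p for p in distribution]
    # The small epsilon guards against products such as 11.999999999999998.
    sizes = [math.floor(r + 1e-9) for r in raw]
    leftover = n_samples - sum(sizes)
    # Give the remaining samples to the clusters with the largest fractional parts.
    order = sorted(range(len(raw)), key=lambda k: raw[k] - sizes[k], reverse=True)
    for k in order[:leftover]:
        sizes[k] += 1
    return sizes

print(target_sizes(40, [0.4, 0.3, 0.2, 0.1]))  # [16, 12, 8, 4]
```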
Copyright
Copyright (c) 2023 Jing Wang & Albert Pla. Released under the MIT License.
Third-party copyright in this distribution is noted where applicable.
Reference
- Clustering with Capacity and Size Constraints: A Deterministic Approach
- Deterministic Annealing, Clustering and Optimization
- Deterministic Annealing, Constrained Clustering, and Optimization
- Shrinkage Clustering
- Clustering with size constraints
- Data Clustering with Cluster Size Constraints Using a Modified k-means Algorithm
- KMeans Constrained Clustering Inspired by Minimum Cost Flow Problem
- Same Size Kmeans Heuristics Methods
- Google's Operations Research tools' SimpleMinCostFlow
- Cluster KMeans Constrained