A constrained KMeans algorithm.
Constrained KMeans
A modified version of the KMeans algorithm that takes partial information about the data into account. Given a partial list of known labels, init_labels, Constrained KMeans finds a cluster configuration that complies with init_labels.
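init_labels has one entry per data point (its length equals x.shape[0]), so it cannot by itself indicate which entries are actually known. A second array, can_change, therefore acts as a mask: entries equal to 0 mark labels that are known and must be kept, and all other entries mark labels that are free to change.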
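As an illustration of how the two arrays fit together, the sketch below builds them from a small dictionary of known labels. The helper name build_constraints and the choice of 0 as a placeholder for unknown entries are assumptions made for this example; they are not part of the package.

```python
import numpy as np

def build_constraints(n_points, known):
    """Hypothetical helper: turn {index: label} into (init_labels, can_change)."""
    # Placeholder label 0 for unknown points (an assumption; the value is
    # irrelevant wherever can_change is 1).
    init_labels = np.zeros(n_points, dtype=int)
    # 1 means "free to change", 0 means "known, keep as is".
    can_change = np.ones(n_points, dtype=int)
    for idx, label in known.items():
        init_labels[idx] = label
        can_change[idx] = 0
    return init_labels, can_change

# Points 1 and 4 have known labels 3 and 2; the rest are free.
init_labels, can_change = build_constraints(6, {1: 3, 4: 2})
print(init_labels)
print(can_change)
```

Both arrays have the same length as the number of data points, which is what fit_predict in the usage example expects.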
Formally, the output of the algorithm is an array labels such that

    np.all(labels[can_change == 0] == init_labels[can_change == 0])

is True.
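The condition above can be wrapped in a small check that is handy in tests. The function name respects_constraints is hypothetical and chosen only for this sketch.

```python
import numpy as np

def respects_constraints(labels, init_labels, can_change):
    """Return True if every known label (can_change == 0) is preserved."""
    labels = np.asarray(labels)
    init_labels = np.asarray(init_labels)
    known = np.asarray(can_change) == 0  # boolean mask of fixed entries
    return bool(np.all(labels[known] == init_labels[known]))

# Only index 1 is known (label 3); the other entries may take any value.
ok = respects_constraints([1, 3, 0], [0, 3, 0], [1, 0, 1])
bad = respects_constraints([1, 2, 0], [0, 3, 0], [1, 0, 1])
print(ok, bad)
```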
The package can be installed with pip (requires Python >= 3.7):

    pip install ConstrainedKMeans
Example basic usage:

    import numpy as np
    from matplotlib import pyplot as plt
    from ConstrainedKMeans import ConstrainedKMeans as CKM


    def run_test(n_points):
        ckm = CKM(n_clusters=10)
        # Generate a random dataset.
        # For visualization purposes, use 2D data.
        x = np.random.random((n_points, 2))
        # Generate random initial labels in [0, 10).
        init_labels = np.random.randint(0, 10, n_points)
        # can_change is 0 with probability 0.2 and 1 otherwise;
        # the 0s mark the "known" labels.
        can_change = np.random.binomial(1, 0.8, n_points)
        labels = ckm.fit_predict(x, can_change, init_labels)
        # Color each point by its assigned cluster.
        plt.scatter(x[:, 0], x[:, 1], c=labels)
        plt.show()


    if __name__ == '__main__':
        run_test(1000)
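To make the idea concrete, here is a minimal NumPy sketch of one way a constrained variant of Lloyd's algorithm could work: centroids are computed from all points, but only the points that are free to change get reassigned. This is an illustration under that assumption, not the package's actual implementation; the function name constrained_kmeans_sketch and its parameters n_iter and seed are made up for this example.

```python
import numpy as np

def constrained_kmeans_sketch(x, can_change, init_labels, n_clusters,
                              n_iter=20, seed=0):
    """Illustrative constrained Lloyd iteration (not the package's code)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    known = np.asarray(can_change) == 0
    labels = np.asarray(init_labels).copy()
    # Free points start from random assignments; known points keep theirs.
    labels[~known] = rng.integers(0, n_clusters, size=int((~known).sum()))
    for _ in range(n_iter):
        # Update step: each centroid is the mean of its current members.
        centroids = np.empty((n_clusters, x.shape[1]))
        for k in range(n_clusters):
            members = x[labels == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
            else:
                # Empty cluster: fall back to a random data point.
                centroids[k] = x[rng.integers(len(x))]
        # Assignment step: nearest centroid, applied to free points only.
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = labels.copy()
        new_labels[~known] = dists[~known].argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # converged
        labels = new_labels
    return labels

rng = np.random.default_rng(1)
x = rng.random((200, 2))
init_labels = rng.integers(0, 3, 200)
can_change = rng.binomial(1, 0.8, 200)
labels = constrained_kmeans_sketch(x, can_change, init_labels, n_clusters=3)
print(np.all(labels[can_change == 0] == init_labels[can_change == 0]))
```

Because known points are never reassigned, the constraint on the output holds by construction, whatever the centroids converge to.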