A constrained KMeans algorithm.
Constrained KMeans
A modified version of the KMeans algorithm that takes partial information about the data into account. Given a partial list of known labels init_labels, Constrained KMeans finds a cluster configuration that complies with init_labels.
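init_labels must have length x.shape[0], one entry per sample, so it cannot by itself indicate which entries are actually known. A second array, can_change, serves as the mask: entries equal to 0 mark labels that are known and must stay fixed, and all other entries mark labels that the algorithm may change.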
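As an illustration of the two arrays described above, the following sketch builds init_labels and can_change from a partially labelled dataset. The -1 sentinel for "unknown", the variable name partial, and the choice of random placeholders are assumptions made for this example only; the package does not define them.

```python
import numpy as np

# Hypothetical partial labelling: -1 marks "label unknown". This sentinel is
# a convention chosen for this sketch, not something the package defines.
partial = np.array([3, -1, 0, -1, -1, 2])
n_clusters = 4

# can_change is 0 where the label is known (fixed) and 1 where it may change.
can_change = (partial == -1).astype(int)

# init_labels needs one entry per sample, so unknown entries receive a
# placeholder cluster index, drawn at random here.
rng = np.random.default_rng(0)
placeholders = rng.integers(0, n_clusters, partial.shape[0])
init_labels = np.where(partial == -1, placeholders, partial)
```

Only the entries where can_change is 0 carry real information; the placeholder values are free to be overwritten by the clustering.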
Formally, the output of the algorithm is an array labels such that np.all(labels[can_change == 0] == init_labels[can_change == 0]) is True.
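The guarantee above can be illustrated without the package itself. The following is a minimal NumPy sketch of one way a Lloyd-style iteration could honour the constraint. It is not the ConstrainedKMeans implementation; constrained_lloyd is a hypothetical helper written for this example, and the fallback for empty clusters is an arbitrary choice.

```python
import numpy as np


def constrained_lloyd(x, can_change, init_labels, n_clusters, n_iter=20):
    """Illustrative Lloyd-style loop that never alters fixed labels."""
    labels = init_labels.copy()
    free = can_change != 0  # True where a label is allowed to change
    for _ in range(n_iter):
        # Centroid of each cluster; an empty cluster falls back to the data mean.
        centers = np.stack([
            x[labels == k].mean(axis=0) if np.any(labels == k) else x.mean(axis=0)
            for k in range(n_clusters)
        ])
        # Squared distance from every point to every centroid.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        nearest = dists.argmin(axis=1)
        # Reassign only the free points; fixed points keep their label.
        new_labels = np.where(free, nearest, labels)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels


rng = np.random.default_rng(0)
x = rng.random((200, 2))
init_labels = rng.integers(0, 5, 200)
can_change = rng.binomial(1, 0.8, 200)  # 0 with probability 0.2

labels = constrained_lloyd(x, can_change, init_labels, n_clusters=5)

# The condition stated in the text: fixed labels are unchanged.
constraint_holds = np.all(labels[can_change == 0] == init_labels[can_change == 0])
```

Because fixed points still contribute to the centroids, the known labels pull the cluster centres toward them while the free points are reassigned around them.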
Install with pip (requires Python >= 3.7):
pip install ConstrainedKMeans
Example basic usage:
import numpy as np
from matplotlib import pyplot as plt
from ConstrainedKMeans import ConstrainedKMeans as CKM


def run_test(n_points):
    ckm = CKM(n_clusters=10)
    # Generate a random dataset; 2D so it can be plotted directly
    x = np.random.random((n_points, 2))
    # Generate random initial labels in [0, 10)
    init_labels = np.random.randint(0, 10, n_points)
    # Generate 0s with probability 0.2;
    # these mark the "known" labels that must not change
    can_change = np.random.binomial(1, 0.8, n_points)
    labels = ckm.fit_predict(x, can_change, init_labels)
    # Colour each point by its final cluster label
    plt.scatter(x[:, 0], x[:, 1], c=labels)
    plt.show()


if __name__ == '__main__':
    run_test(1000)