The breathing k-means algorithm

An approximation algorithm for the k-means problem that, on average, produces better solutions (lower SSE) and runs faster (less CPU time) than k-means++.

Techreport: (submitted for publication)

Repo (with examples):


The included class BKMeans is subclassed from scikit-learn's KMeans class and has, therefore, the same API. It can be used as a plug-in replacement for scikit-learn's KMeans.

There is one new parameter, which can be ignored (left at its default) for normal usage:

  • m (breathing depth), default: 5

The parameter m can, however, also be tuned to obtain faster solutions (1 < m < 5) or better ones (m > 5). For details, see the techreport above.


Installation

pip install bkmeans

Example 1: running on simple random data set


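import numpy as np
from bkmeans import BKMeans

# generate random data set (1000 points in 2D; size chosen for illustration)
X = np.random.rand(1000, 2)

# create BKMeans instance
bkm = BKMeans(n_clusters=100)

# run the algorithm
bkm.fit(X)

# print SSE (inertia in scikit-learn terms)
print(bkm.inertia_)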
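To make the term concrete: scikit-learn's inertia_ is the sum of squared distances from each sample to its closest cluster center. The following is a minimal sketch of that quantity in plain Python, kept dependency-free on purpose. The helper name `sse` and the toy points are illustrative assumptions, not part of bkmeans.

```python
def sse(points, centroids):
    """Sum of squared Euclidean distances from each point to its nearest centroid."""
    total = 0.0
    for p in points:
        # squared distance from p to every centroid; keep the smallest
        total += min(
            sum((a - b) ** 2 for a, b in zip(p, c))
            for c in centroids
        )
    return total


# toy data: two tight pairs of points, one centroid in the middle of each pair
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids = [(0.0, 1.0), (10.0, 1.0)]

print(sse(points, centroids))  # 4.0 (each point is at squared distance 1)
```

A lower value means the centroids fit the data more tightly, which is the sense in which one solution is "better" than another here.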



Example 2: comparison with k-means++ (multiple runs)


import numpy as np
from sklearn.cluster import KMeans
from bkmeans import BKMeans

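# random 2D data set (size chosen for illustration)
X = np.random.rand(1000, 2)

# number of centroids (value chosen for illustration)
k = 100

for i in range(5):
    # kmeans++
    kmp = KMeans(n_clusters=k)
    kmp.fit(X)

    # breathing k-means
    bkm = BKMeans(n_clusters=k)
    bkm.fit(X)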

    # relative SSE improvement of bkm over km++
    imp = 1 - bkm.inertia_/kmp.inertia_
    print(f"SSE improvement over k-means++: {imp:.2%}")


Output (values vary from run to run, since the data set is random):

SSE improvement over k-means++: 3.38%
SSE improvement over k-means++: 4.16%
SSE improvement over k-means++: 6.14%
SSE improvement over k-means++: 6.79%
SSE improvement over k-means++: 4.76%
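The improvement figure printed above is 1 minus the ratio of the two SSE values, so a positive number means the new solution has lower SSE than the baseline. As a small worked sketch of that arithmetic (the function name `relative_improvement` and the sample SSE values are illustrative assumptions; the five percentages are the ones listed above):

```python
from statistics import mean


def relative_improvement(sse_new, sse_baseline):
    """Relative SSE reduction of a new solution over a baseline solution."""
    if sse_baseline <= 0:
        raise ValueError("baseline SSE must be positive")
    return 1 - sse_new / sse_baseline


# an SSE of 95 against a baseline of 100 is a 5% improvement
print(f"{relative_improvement(95.0, 100.0):.2%}")   # 5.00%

# a higher SSE than the baseline gives a negative value
print(f"{relative_improvement(110.0, 100.0):.2%}")  # -10.00%

# average of the five improvements listed above, in percent
listed = [3.38, 4.16, 6.14, 6.79, 4.76]
avg_listed = mean(listed)
print(f"{avg_listed:.2f}%")
```

Averaging over several runs, as in the last step, gives a steadier picture than any single run because both algorithms depend on random initialization.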
