Equilibrium K-Means (EKMeans) clustering algorithms compatible with scikit-learn

sklekmeans - Equilibrium K-Means for scikit-learn

sklekmeans provides batch and mini-batch implementations of the Equilibrium K-Means (EKMeans) clustering algorithm. The method introduces an equilibrium weighting scheme that can make clustering more robust on imbalanced datasets than standard k-means. The estimators follow the scikit-learn estimator API.

Features

  • Drop-in scikit-learn compatible estimators: EKMeans, MiniBatchEKMeans.
  • Supports Euclidean and Manhattan distances.
  • Heuristic alpha selection via alpha='dvariance'.
  • Mini-batch variant with accumulation or online update modes.
  • Soft memberships (membership) and equilibrium weights (W_).
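To make the two supported distance options concrete, here is a minimal NumPy sketch of point-to-center distances under each metric. The helper name pairwise_distances and its metric argument are hypothetical and not part of sklekmeans; the library may compute distances differently (for example, it may use squared Euclidean distances internally).

```python
import numpy as np

def pairwise_distances(X, centers, metric="euclidean"):
    """Return an (n_samples, n_clusters) matrix of point-to-center distances.

    Hypothetical helper for illustration only; not the sklekmeans API.
    """
    # Broadcast to shape (n_samples, n_clusters, n_features).
    diff = X[:, None, :] - centers[None, :, :]
    if metric == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=2))
    if metric == "manhattan":
        return np.abs(diff).sum(axis=2)
    raise ValueError(f"unknown metric: {metric!r}")

X = np.array([[0.0, 0.0], [3.0, 4.0]])
centers = np.array([[0.0, 0.0], [3.0, 0.0]])

d_euc = pairwise_distances(X, centers, "euclidean")
d_man = pairwise_distances(X, centers, "manhattan")
print(d_euc)  # [[0. 3.] [5. 4.]]
print(d_man)  # [[0. 3.] [7. 4.]]
```

The two metrics agree along an axis and differ on diagonal offsets, which is why the point (3, 4) is 5 away from the origin under Euclidean distance and 7 away under Manhattan distance.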

Installation

The package is available on PyPI. Install the base package:

pip install sklekmeans

Optional extras:

  • With numba acceleration (recommended for speed):
pip install "sklekmeans[speed]"

From source (latest main):

  • Basic installation
git clone https://github.com/ydcnanhe/sklearn-ekmeans.git
cd sklearn-ekmeans
pip install .
  • Or in editable mode
pip install -e .
  • With numba acceleration
pip install -e ".[speed]"
  • Development tools (tests, lint):
pip install -e ".[dev]"
  • Docs build dependencies:
pip install -e ".[docs]"
  • Everything (dev + docs + speed):
pip install -e ".[all]"

Quick Start

from sklekmeans import EKMeans
import numpy as np

X = np.random.rand(200, 2)
ekm = EKMeans(n_clusters=3, random_state=0, alpha='dvariance').fit(X)
print(ekm.cluster_centers_)
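The quick start above uses uniformly random points, which have no cluster structure. To try the imbalanced setting that the project description targets, a toy dataset with very unequal cluster sizes could be substituted. The sketch below uses only NumPy; the function name make_imbalanced_blobs, the cluster sizes, and the centers are all illustrative assumptions, not something shipped with sklekmeans.

```python
import numpy as np

def make_imbalanced_blobs(sizes=(500, 50, 10),
                          centers=((0.0, 0.0), (4.0, 4.0), (-4.0, 4.0)),
                          scale=0.5, seed=0):
    """Draw 2-D Gaussian blobs with deliberately unequal sizes.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    parts, labels = [], []
    for label, (n, center) in enumerate(zip(sizes, centers)):
        # Each blob is an isotropic Gaussian around its center.
        parts.append(rng.normal(loc=center, scale=scale, size=(n, 2)))
        labels.append(np.full(n, label))
    return np.vstack(parts), np.concatenate(labels)

X_imb, y_imb = make_imbalanced_blobs()
print(X_imb.shape)         # (560, 2)
print(np.bincount(y_imb))  # [500  50  10]
```

X_imb can then be passed to EKMeans(...).fit(...) in place of X, and the fitted cluster_centers_ compared against the three centers used to generate the data.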

Mini-batch variant with multiple initializations and selection of the best run:

from sklekmeans import MiniBatchEKMeans
mb = MiniBatchEKMeans(n_clusters=3, batch_size=256, max_epochs=20, n_init=5, random_state=0)
mb.fit(X)
print(mb.cluster_centers_)
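As a rough illustration of what batch_size and max_epochs usually control in mini-batch methods, the sketch below splits one shuffled pass over the data into batches. This is a generic pattern and an assumption about the library's behavior; sklekmeans may sample batches differently, and iter_minibatches is a hypothetical helper.

```python
import numpy as np

def iter_minibatches(n_samples, batch_size, rng):
    """Yield index arrays covering one shuffled pass (epoch) over the data.

    Hypothetical helper for illustration only.
    """
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        # The last batch is shorter when batch_size does not divide n_samples.
        yield order[start:start + batch_size]

rng = np.random.default_rng(0)

sizes_64 = [len(idx) for idx in iter_minibatches(200, 64, rng)]
sizes_256 = [len(idx) for idx in iter_minibatches(200, 256, rng)]
print(sizes_64)   # [64, 64, 64, 8]
print(sizes_256)  # [200]
```

Under this pattern, the 200-sample X from the quick start with batch_size=256 would give a single batch per epoch, so a smaller batch_size is needed to see genuine mini-batch behavior on data of that size.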

Documentation

The latest HTML documentation is hosted on GitHub Pages:

ydcnanhe.github.io/sklearn-ekmeans

If the link returns a 404, the docs CI build may still be running; wait for it to finish and try again.

PyPI project page: https://pypi.org/project/sklekmeans/

Build and publish (maintainers)

Local build of artifacts:

python -m pip install --upgrade build twine
python -m build
python -m twine check dist/*

Publishing to PyPI is automated via GitHub Actions (Trusted Publishing). See PUBLISHING.md.

References

  • [1] Y. He. "An Equilibrium Approach to Clustering: Surpassing Fuzzy C-Means on Imbalanced Data," IEEE Transactions on Fuzzy Systems, 2025.
  • [2] Y. He. "Semi-supervised equilibrium K-means for imbalanced data clustering," Knowledge-Based Systems, p. 113990, 2025.
  • [3] Y. He. "Imbalanced Data Clustering Using Equilibrium K-Means," arXiv, 2024.

License

BSD 3-Clause
