
Equilibrium K-Means (EKMeans) clustering algorithms compatible with scikit-learn


sklekmeans - Equilibrium K-Means for scikit-learn


sklekmeans provides batch and mini-batch implementations of the Equilibrium K-Means (EKMeans) clustering algorithm. The method introduces an equilibrium weighting scheme that can yield improved robustness on imbalanced datasets compared to standard k-means.
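To make the equilibrium weighting concrete, here is a minimal sketch of the per-sample weights, assuming the Boltzmann-operator (smooth minimum) form described in reference [2]. The function name `equilibrium_weights` is hypothetical and is not part of the sklekmeans API; the package's internal implementation may differ.

```python
import math


def equilibrium_weights(distances, alpha):
    """Weights of one sample with respect to each centroid.

    Assumed form (not taken from the sklekmeans source):
        p_k = exp(-alpha * d_k) / sum_i exp(-alpha * d_i)
        w_k = p_k * (1 - alpha * (d_k - sum_i p_i * d_i))
    where d_k is the distance from the sample to centroid k.
    """
    # Shift by the smallest distance so the exponentials cannot overflow;
    # the normalized values p_k are unchanged by this shift.
    d_min = min(distances)
    exps = [math.exp(-alpha * (d - d_min)) for d in distances]
    total = sum(exps)
    p = [e / total for e in exps]

    # Softmax-weighted average distance of this sample to the centroids.
    d_bar = sum(p_k * d_k for p_k, d_k in zip(p, distances))

    return [p_k * (1.0 - alpha * (d_k - d_bar)) for p_k, d_k in zip(p, distances)]


# One sample that is close to the first centroid and far from the other two.
weights = equilibrium_weights([0.5, 2.0, 4.5], alpha=1.0)
print(weights)
```

Under this assumed form the weights of a sample always sum to one, and a sample can receive a negative weight for a centroid that is much farther away than its nearest one. Each centroid would then be updated as the weighted average of the samples.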

Features

  • Drop-in scikit-learn compatible estimators: EKMeans, MiniBatchEKMeans.
  • Supports Euclidean and Manhattan distances.
  • Heuristic alpha selection via alpha='dvariance'.
  • Mini-batch variant with accumulation or online update modes.
  • Soft memberships (membership) and equilibrium weights (W_).

Installation

The package is available on PyPI. Install the base package:

pip install sklekmeans

Optional extras:

  • With numba acceleration (recommended for speed):
pip install "sklekmeans[speed]"

From source (latest main):

  • Basic installation:
git clone https://github.com/ydcnanhe/sklearn-ekmeans.git
cd sklearn-ekmeans
pip install .
  • Or in editable mode:
pip install -e .
  • With numba acceleration (quote the extras so shells such as zsh do not expand the brackets):
pip install -e ".[speed]"
  • Development tools (tests, lint):
pip install -e ".[dev]"
  • Docs build dependencies:
pip install -e ".[docs]"
  • Everything (dev + docs + speed):
pip install -e ".[all]"
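After installing, a quick check like the following can confirm which version is present and whether numba is importable. This is a sketch that uses only the Python standard library; the helper name `describe_install` is hypothetical and not part of sklekmeans.

```python
from importlib import metadata, util


def describe_install(dist_name="sklekmeans", accel_module="numba"):
    """Report the installed version and whether the accelerator module is importable."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        version = None
    return {
        "version": version,
        "accelerated": util.find_spec(accel_module) is not None,
    }


print(describe_install())
```

A `version` of `None` means the package is not installed in the active environment; `accelerated` only reports that numba can be imported, not that sklekmeans is actually using it.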

Quick Start

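import numpy as np
from sklekmeans import EKMeans

# Toy data: 200 points drawn uniformly from the unit square (seeded for reproducibility).
rng = np.random.default_rng(0)
X = rng.random((200, 2))

# alpha='dvariance' picks alpha with the built-in heuristic.
ekm = EKMeans(n_clusters=3, random_state=0, alpha='dvariance').fit(X)
print(ekm.cluster_centers_)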

Mini-batch variant with multiple initializations and selection of the best run:

from sklekmeans import MiniBatchEKMeans

# Reuses X from the previous example; n_init=5 runs five initializations and keeps the best.
mb = MiniBatchEKMeans(
    n_clusters=3, batch_size=256, max_epochs=20, n_init=5, random_state=0
)
mb.fit(X)
print(mb.cluster_centers_)
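Since the method targets imbalanced data, it can be useful to count how many samples a fitted model assigns to each cluster. The helper below is a hypothetical sketch, not part of sklekmeans; it assumes the fitted estimator exposes a scikit-learn style `predict` method that returns one integer label per sample.

```python
from collections import Counter


def cluster_sizes(estimator, X):
    """Return {cluster_label: sample_count} for a fitted clustering estimator.

    Assumes `estimator.predict(X)` yields one integer-like label per sample.
    """
    labels = estimator.predict(X)
    counts = Counter(int(label) for label in labels)
    # Sort by label so the report is stable across runs.
    return dict(sorted(counts.items()))
```

With the estimators fitted in the examples above, this would be called as `cluster_sizes(ekm, X)` or `cluster_sizes(mb, X)`. A cluster with very few samples is not necessarily an error on imbalanced data, so the counts are best read alongside the centers.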

Documentation

The latest HTML documentation is hosted on GitHub Pages:

ydcnanhe.github.io/sklearn-ekmeans

If the link returns a 404, the docs CI build may still be running; try again once it has finished.

PyPI project page: https://pypi.org/project/sklekmeans/

Build and publish (maintainers)

Local build of artifacts:

python -m pip install --upgrade build twine
python -m build
python -m twine check dist/*

Publishing to PyPI is automated via GitHub Actions (Trusted Publishing). See PUBLISHING.md.

References

  • [1] Y. He. An Equilibrium Approach to Clustering: Surpassing Fuzzy C-Means on Imbalanced Data, IEEE Transactions on Fuzzy Systems, 2025.
  • [2] Y. He. Imbalanced Data Clustering Using Equilibrium K-Means, arXiv, 2024.

License

BSD 3-Clause
