A scikit-learn compatible implementation of the CLARANS clustering algorithm

scikit-clarans

A scikit-learn compatible implementation of the CLARANS (Clustering Large Applications based on RANdomized Search) algorithm.

Python 3.8+

CLARANS bridges the gap between the high-quality clusterings of PAM (Partitioning Around Medoids) and the speed needed for large datasets. By replacing exhaustive search with randomized search, it finds high-quality medoids efficiently without exploring the entire graph of candidate solutions.
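To make the randomized search concrete, here is a minimal pure-Python sketch of the general CLARANS idea. It is an illustration only, not this package's implementation: the function names are hypothetical, Euclidean distance is hard-coded, and `maxneighbor` follows the usual description of the algorithm rather than any parameter documented in this README. Each node in the search graph is a set of medoids, and a neighbor differs from it in exactly one medoid.

```python
import math
import random


def total_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)


def clarans_sketch(points, n_clusters, numlocal=3, maxneighbor=50, seed=0):
    """Randomized medoid search; returns (medoid_indices, cost).

    Illustrative only. Assumes len(points) > n_clusters.
    """
    rng = random.Random(seed)
    n = len(points)
    best_medoids, best_cost = None, math.inf

    for _ in range(numlocal):  # independent restarts
        current = rng.sample(range(n), n_clusters)  # random starting node
        current_cost = total_cost(points, current)
        failures = 0
        while failures < maxneighbor:
            # Build a random neighbor: swap one medoid for one non-medoid.
            slot = rng.randrange(n_clusters)
            candidate = rng.choice([i for i in range(n) if i not in current])
            neighbor = current[:slot] + [candidate] + current[slot + 1:]
            neighbor_cost = total_cost(points, neighbor)
            if neighbor_cost < current_cost:
                current, current_cost = neighbor, neighbor_cost
                failures = 0  # improvement found: reset the counter
            else:
                failures += 1
        if current_cost < best_cost:
            best_medoids, best_cost = sorted(current), current_cost

    return best_medoids, best_cost


# Two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
medoids, cost = clarans_sketch(points, n_clusters=2)
print(medoids, round(cost, 3))
```

The search stops at a node once `maxneighbor` consecutive random neighbors fail to improve on it, so only a sample of the neighbors is ever examined. More restarts (`numlocal`) trade run time for a better chance of escaping a poor local minimum.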


Features

  • scikit-learn native: Follows the scikit-learn estimator API, so you can use it just like KMeans or DBSCAN, including in pipelines and cross-validation.
  • Scalable: Designed to handle datasets where standard PAM/k-medoids is too slow.
  • Flexible: Choose from multiple initialization strategies (k-medoids++, build, etc.) and distance metrics (euclidean, manhattan, cosine, etc.).
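As a rough illustration of why the choice of distance metric matters for medoid-based clustering, the sketch below defines the three metrics named above in plain Python and shows that the nearest medoid for a point can change with the metric. The helper names are hypothetical and this is not the package's code; how a metric is selected on the estimator is not shown in this README, so check the documentation for the actual parameter.

```python
import math


def euclidean(a, b):
    """Straight-line distance."""
    return math.dist(a, b)


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine(a, b):
    """One minus the cosine similarity; ignores vector length."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))


def nearest_medoid(point, medoids, metric):
    """Index of the medoid closest to `point` under `metric`."""
    return min(range(len(medoids)), key=lambda i: metric(point, medoids[i]))


point = (3.0, 3.0)
medoids = [(1.0, 1.0), (3.0, 4.0)]

for metric in (euclidean, manhattan, cosine):
    print(metric.__name__, nearest_medoid(point, medoids, metric))
```

Here the point lies in the same direction as the first medoid but is spatially closer to the second, so cosine distance and the two coordinate-based metrics disagree on the assignment.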

Installation

Install from PyPI with pip:

pip install scikit-clarans

Or install from source, from the root of a cloned copy of the repository:

pip install .

For development, install in editable mode with the dev extras:

pip install -e ".[dev]"

Quick Start

CLARANS

from clarans import CLARANS
from sklearn.datasets import make_blobs

# 1. Create dummy data
X, _ = make_blobs(n_samples=1000, centers=5, random_state=42)

# 2. Initialize CLARANS
#    - n_clusters: 5 clusters
#    - numlocal: 3 restarts for better quality
#    - init: 'k-medoids++' for smart starting points
clarans = CLARANS(n_clusters=5, numlocal=3, init='k-medoids++', random_state=42)

# 3. Fit
clarans.fit(X)

# 4. Results
print("Medoid Indices:", clarans.medoid_indices_)
print("Labels:", clarans.labels_)

FastCLARANS

For datasets that fit in memory, FastCLARANS can provide significant speedups by caching pairwise distances:

from clarans import FastCLARANS

fast_model = FastCLARANS(n_clusters=5, numlocal=3, random_state=42)
fast_model.fit(X)
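The idea of caching pairwise distances can be sketched in a few lines of plain Python. This is an assumption-laden illustration of the general technique, not how FastCLARANS is implemented internally, and the function names are hypothetical: the full distance matrix is computed once, after which the cost of any candidate medoid set needs only table lookups.

```python
import math


def build_distance_cache(points):
    """Precompute the symmetric n x n distance matrix once."""
    n = len(points)
    cache = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            cache[i][j] = cache[j][i] = d
    return cache


def cached_cost(cache, medoids):
    """Total cost of a medoid set using lookups only, no distance calls."""
    return sum(min(row[m] for m in medoids) for row in cache)


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 1.0)]
cache = build_distance_cache(points)

# Evaluating several candidate medoid sets reuses the same cached matrix.
for candidate in ([0, 2], [1, 3], [0, 1]):
    print(candidate, cached_cost(cache, candidate))
```

A full matrix like this needs memory proportional to the square of the number of samples, which is why this kind of speedup only applies when the data, and the cache, fit in memory.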

Examples

This repository includes runnable examples in the examples/ folder that show common usage patterns and integrations, plus a Jupyter notebook (examples/clarans_examples.ipynb) with many interactive recipes. Run any example with Python, for example:

python examples/01_quick_start.py

Documentation

For full API reference and usage guides, please see the Documentation.

Contributing

Contributions are welcome! Please check out CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License - see the LICENSE file for details.
