A scikit-learn compatible implementation of the CLARANS clustering algorithm

scikit-clarans

A scikit-learn compatible implementation of the CLARANS (Clustering Large Applications based on RANdomized Search) algorithm.

CLARANS bridges the clustering quality of PAM (Partitioning Around Medoids) and the speed needed for large datasets. Instead of exhaustively evaluating every possible medoid swap, it samples neighboring solutions at random, so it can find high-quality medoids without exploring the entire graph of solutions.
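To make the randomized search concrete, here is a minimal standard-library sketch of the idea. It is an illustration only and not the scikit-clarans implementation: the function names, the `maxneighbor` default, and the naive cost recomputation are assumptions chosen for readability. A "neighbor" is the current medoid set with exactly one medoid swapped for a non-medoid, and the search gives up on a restart after `maxneighbor` consecutive neighbors fail to improve the cost.

```python
import math
import random


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(
        min(math.dist(p, points[m]) for m in medoids)
        for p in points
    )


def clarans_sketch(points, n_clusters, numlocal=3, maxneighbor=50, seed=0):
    """Illustrative CLARANS-style search (hypothetical helper, not the library API)."""
    rng = random.Random(seed)
    n = len(points)
    best_medoids, best_cost = None, math.inf

    for _ in range(numlocal):  # independent restarts
        current = rng.sample(range(n), n_clusters)
        current_cost = total_cost(points, current)
        failures = 0

        while failures < maxneighbor:
            # Build a random neighbor: swap one medoid for one non-medoid.
            slot = rng.randrange(n_clusters)
            candidate = rng.choice([i for i in range(n) if i not in current])
            neighbor = current.copy()
            neighbor[slot] = candidate
            neighbor_cost = total_cost(points, neighbor)

            if neighbor_cost < current_cost:
                current, current_cost = neighbor, neighbor_cost
                failures = 0  # moved downhill, so start counting again
            else:
                failures += 1

        # Keep the best local minimum seen across restarts.
        if current_cost < best_cost:
            best_medoids, best_cost = sorted(current), current_cost

    return best_medoids, best_cost


# Three well-separated groups of 20 two-dimensional points each.
rng = random.Random(1)
points = [
    (cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3))
    for cx, cy in [(0, 0), (5, 5), (0, 5)]
    for _ in range(20)
]
medoids, cost = clarans_sketch(points, n_clusters=3)
print(medoids, round(cost, 3))
```

The sketch recomputes the full cost for every neighbor, which keeps it short but is slower than an incremental update of only the affected points.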


Features

  • Scikit-Learn Native: Use it just like KMeans or DBSCAN. Drop-in compatibility for pipelines and cross-validation.
  • Scalable: Designed to handle datasets where standard PAM/k-medoids is too slow.
  • Flexible: Choose from multiple initialization strategies (k-medoids++, build, etc.) and distance metrics (euclidean, manhattan, cosine, etc.).
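The choice of distance metric can change which medoid a point is assigned to. The sketch below uses textbook definitions of the three metrics named above, written with the standard library only. It does not show how scikit-clarans computes distances internally, and the helper names and sample coordinates are made up for illustration.

```python
import math


def euclidean(a, b):
    """Straight-line distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine(a, b):
    """Cosine distance: 1 minus the cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(point, candidates, metric):
    """Index of the candidate closest to point under the given metric."""
    return min(range(len(candidates)), key=lambda i: metric(point, candidates[i]))


point = (2.0, 2.0)
candidate_medoids = [(10.0, 10.0), (4.0, 4.4), (6.0, 2.0)]

for metric in (euclidean, manhattan, cosine):
    print(metric.__name__, nearest(point, candidate_medoids, metric))
```

With these coordinates each metric selects a different candidate: Euclidean picks index 1, Manhattan picks index 2, and cosine picks index 0, because it ignores magnitude and only compares direction.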

Installation

Install with pip from the root of a local checkout of the repository:

pip install .

For development, install in editable mode with the dev extras:

pip install -e ".[dev]"

Quick Start

CLARANS

from clarans import CLARANS
from sklearn.datasets import make_blobs

# 1. Create dummy data
X, _ = make_blobs(n_samples=1000, centers=5, random_state=42)

# 2. Initialize CLARANS
#    - n_clusters: 5 clusters
#    - numlocal: 3 restarts for better quality
#    - init: 'k-medoids++' for smart starting points
clarans = CLARANS(n_clusters=5, numlocal=3, init='k-medoids++', random_state=42)

# 3. Fit
clarans.fit(X)

# 4. Results
print("Medoid Indices:", clarans.medoid_indices_)
print("Labels:", clarans.labels_)

FastCLARANS

For datasets that fit in memory, FastCLARANS can provide significant speedups by caching pairwise distances:

from clarans import FastCLARANS

# Reuses X from the CLARANS example above.
fast_model = FastCLARANS(n_clusters=5, numlocal=3, random_state=42)
fast_model.fit(X)
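The idea of caching pairwise distances can be sketched as follows. This is a standard-library illustration under assumed names (`pairwise_distances`, `cost_from_cache`), not the FastCLARANS code: distances are computed once into a matrix, and every later cost evaluation becomes a table lookup. The matrix holds n × n entries, which is why the approach suits datasets that fit in memory.

```python
import math


def pairwise_distances(points):
    """Full symmetric distance matrix, computed once up front."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = math.dist(points[i], points[j])
    return dist


def cost_from_cache(dist, medoids):
    """Total distance to the nearest medoid, using lookups only."""
    return sum(min(row[m] for m in medoids) for row in dist)


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
dist = pairwise_distances(points)

# Evaluating candidate medoid sets no longer touches the raw coordinates.
print(cost_from_cache(dist, [0, 3]))
print(cost_from_cache(dist, [1, 4]))
```

Because a randomized search evaluates many candidate swaps, the one-time cost of building the matrix is repaid each time a distance is reused.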

Examples

The examples/ folder contains runnable scripts that show common usage patterns and integrations, along with a Jupyter notebook (examples/clarans_examples.ipynb) of interactive recipes. Run any example with:

python examples/01_quick_start.py

Documentation

For full API reference and usage guides, please see the Documentation.

Contributing

Contributions are welcome! Please check out CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License - see the LICENSE file for details.
