VIASCKDE: internal cluster validity index (KDE-weighted compactness & separation)

VIASCKDE Index

VIASCKDE is a novel internal cluster validity index for arbitrary-shaped clusters based on Kernel Density Estimation (KDE).


Motivation

Traditional validity measures assume roughly spherical clusters and tend to misjudge non-spherical, arbitrarily shaped ones. The VIASCKDE Index was developed to assess clustering quality in exactly those cases. It computes compactness and separation at the point level instead of relying on cluster centroids, and it uses kernel density estimation to give more weight to points in dense regions. As a result, the evaluation does not depend on cluster shape.


Installation

pip install viasckde

Usage

from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import adjusted_rand_score
from viasckde import viasckde_score
import matplotlib.pyplot as plt

# 1. Arbitrary-shaped dataset (moons)
X, y_true = make_moons(n_samples=10000, noise=0.07, random_state=42)
X = StandardScaler().fit_transform(X)

# 2. Cluster with DBSCAN
db = DBSCAN(eps=0.1, min_samples=5)
labels = db.fit_predict(X)

# VIASCKDE Score
viasckde = viasckde_score(X, labels)

# Adjusted Rand Index against the true labels, to check that the VIASCKDE score agrees with an external measure
ari = adjusted_rand_score(y_true, labels)

# print results
print("VIASCKDE Score:", viasckde)
print("ARI Score:", ari)

# to visualize results
plt.figure(figsize=(8, 6))
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap="viridis", s=12)
plt.title(f"Best DBSCAN Clusters (eps=0.1, min_samples=5)\n"
          f"VIASCKDE={viasckde:.4f}, ARI={ari:.4f}")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.grid(True)
plt.show()

Concept

In non-spherical clusters, the distance from a point to the nearest neighbor in the same cluster is often more meaningful than the distance to the cluster centroid. VIASCKDE computes:

  • Compactness: distance to the closest point in the same cluster
  • Separation: distance to the closest point in a different cluster

This point-level computation yields a realistic evaluation of clusters regardless of their shape.
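The two quantities above can be sketched in a few lines of plain Python. This is a minimal illustration, not the package's implementation: the helper name nearest_distances is hypothetical, and the silhouette-style ratio (b - a) / max(a, b) is an assumption about how the two distances might be combined. See the cited paper for the exact definition.

```python
import math

def nearest_distances(points, labels):
    """For each point, return (a, b):
    a = distance to the closest other point in the same cluster (compactness),
    b = distance to the closest point in a different cluster (separation)."""
    results = []
    for i, (p, label_p) in enumerate(zip(points, labels)):
        a = math.inf
        b = math.inf
        for j, (q, label_q) in enumerate(zip(points, labels)):
            if i == j:
                continue  # a point is not its own neighbor
            d = math.dist(p, q)
            if label_q == label_p:
                a = min(a, d)
            else:
                b = min(b, d)
        results.append((a, b))
    return results

# Two well-separated clusters on a line (toy data, no singleton clusters)
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
labels = [0, 0, 0, 1, 1]

distances = nearest_distances(points, labels)
for point, (a, b) in zip(points, distances):
    ratio = (b - a) / max(a, b)  # assumed silhouette-style combination
    print(point, "compactness:", a, "separation:", b, "ratio:", round(ratio, 3))
```

The brute-force double loop is quadratic in the number of points. For a dataset the size of the moons example above, a spatial index such as a k-d tree would be the usual choice.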

Parameters of VIASCKDE Index

viasckde_score takes four parameters, two of which are optional:

  • X: your data array (NumPy-like)
  • labels: predicted cluster labels
  • kernel (optional, passed as krnl): the kernel used for density estimation. The default is krnl='gaussian'; the other options are 'tophat', 'epanechnikov', 'exponential', 'linear', and 'cosine'.
  • bandwidth (optional, passed as b_width): the bandwidth of the kernel density estimate. The default is b_width=0.05.
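To give a feel for what the bandwidth controls, here is a minimal one-dimensional Gaussian kernel density estimate in plain Python. It is an illustrative sketch only: the function gaussian_kde_1d is hypothetical and is not part of the viasckde package, which works on multi-dimensional data and may compute densities differently.

```python
import math

def gaussian_kde_1d(samples, x, bandwidth):
    """Gaussian kernel density estimate at position x from 1-D samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
    )

samples = [0.0, 0.1, 0.2, 1.0]

# Narrow bandwidth: density is concentrated tightly around the samples,
# so the gap between 0.2 and 1.0 gets almost no density.
narrow_in_gap = gaussian_kde_1d(samples, 0.6, bandwidth=0.05)

# Wide bandwidth: density is smoothed out, so the same gap looks populated.
wide_in_gap = gaussian_kde_1d(samples, 0.6, bandwidth=0.5)

print("density at 0.6, bandwidth 0.05:", narrow_in_gap)
print("density at 0.6, bandwidth 0.5: ", wide_in_gap)
```

A small bandwidth such as the default 0.05 makes the density estimate sensitive to local structure, which is one reason the usage example standardizes the features first. The best value depends on the scale of the data.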

Output Range

VIASCKDE returns a score in [-1, +1]:

  • +1: best clustering
  • -1: worst clustering
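One way to see why such a score stays in [-1, +1]: if each point contributes a ratio in [-1, +1] and the density weights are non-negative, any weighted mean of those ratios falls in the same interval. The sketch below assumes this aggregation for illustration. The function weighted_score and the sample numbers are hypothetical, and the exact weighting used by VIASCKDE is defined in the cited paper.

```python
def weighted_score(ratios, weights):
    """Density-weighted mean of per-point ratios; weights must be non-negative."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not all be zero")
    return sum(w * r for w, r in zip(weights, ratios)) / total

# Hypothetical per-point ratios in [-1, 1] and density weights in [0, 1]:
# the poorly placed point (-0.5) lies in a sparse region, so it counts less.
ratios = [0.9, 0.875, -0.5]
weights = [1.0, 0.8, 0.1]

score = weighted_score(ratios, weights)
print("weighted score:", round(score, 4))
```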

Citation

Ali Şenol, "VIASCKDE Index: A Novel Internal Cluster Validity Index for Arbitrary-Shaped Clusters Based on the Kernel Density Estimation", Computational Intelligence and Neuroscience, vol. 2022, Article ID 4059302, 20 pages, 2022. https://doi.org/10.1155/2022/4059302

BibTeX

@article{csenol2022viasckde,
  title={VIASCKDE Index: A Novel Internal Cluster Validity Index for Arbitrary-Shaped Clusters Based on the Kernel Density Estimation},
  author={{\c{S}}enol, Ali},
  journal={Computational Intelligence and Neuroscience},
  volume={2022},
  number={1},
  pages={4059302},
  year={2022},
  publisher={Wiley Online Library},
  doi={10.1155/2022/4059302}
}

License & Author

Author: Assoc. Prof. Dr. Ali Şenol, Computer Engineering Department, Tarsus University

License: MIT
