
Cluster Validity Indices for Noise-Aware Clusterings (e.g. DBSCAN)


Noise-Aware Cluster Validity Indices (NACVI)

This repository contains a Python implementation of internal cluster validity indices designed for noise-aware clusterings (e.g. DBSCAN). The indices explicitly account for unassigned data points (noise), which makes them particularly suitable for realistic, unsupervised settings.

The implementation is based on the following publication:
Lea Eileen Brauner, Frank Höppner, Frank Klawonn:
Cluster Validity for Noise-Aware Clusterings. Intelligent Data Analysis Journal, IOS Press (2025).


Content

The package implements the following NACVIs:

  • sil+: noise-aware Silhouette Coefficient
  • dbi+: noise-aware Davies-Bouldin Index
  • gD33+: noise-aware Dunn index variant
  • sf+: noise-aware Score Function
  • grid+: grid-based noise-validity index
  • nr+: neighbourhood-based noise-validity index

Motivation

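Conventional validity measures treat every data point as a member of some cluster, even when the clustering algorithm (DBSCAN, for example) explicitly labels points as noise. The noise points are then typically scored like ordinary cluster members, which distorts the evaluation.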
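The distortion can be illustrated with a small NumPy sketch. It uses a plain textbook silhouette (not the package's sil+), synthetic data, and the DBSCAN convention that the label -1 marks noise. The helper `mean_silhouette` and all data parameters are assumptions made for this illustration only.

```python
import numpy as np


def mean_silhouette(X, labels):
    """Mean silhouette width; every distinct label is treated as a cluster."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    unique = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # singleton cluster: silhouette is defined as 0
        # a: mean distance to the other members of the point's own cluster.
        a = dist[i, same].sum() / (n_same - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(dist[i, labels == c].mean() for c in unique if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()


# Hypothetical data: two tight clusters plus scattered background noise.
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=(0.0, 0.0), scale=0.2, size=(40, 2))
cluster_b = rng.normal(loc=(5.0, 0.0), scale=0.2, size=(40, 2))
noise = rng.uniform(low=-10.0, high=15.0, size=(20, 2))
X = np.vstack([cluster_a, cluster_b, noise])
labels = np.array([0] * 40 + [1] * 40 + [-1] * 20)  # -1 marks noise

# Naive use: the noise label -1 is silently counted as a third "cluster".
naive = mean_silhouette(X, labels)

# Dropping the noise first scores only the cluster structure,
# but says nothing about how well the noise was separated.
clustered = labels != -1
clusters_only = mean_silhouette(X[clustered], labels[clustered])

print(f"noise treated as a cluster: {naive:.3f}")
print(f"noise removed beforehand:   {clusters_only:.3f}")
```

The naive score is pulled down by the scattered noise points even though the two clusters are well separated, while the second score ignores the noise entirely. Neither variant evaluates the noise assignment itself.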

This package:

  • takes noise into account correctly,
  • enables a separate evaluation of the cluster structure and the noise delimitation,
  • combines both into a single integrated metric, the B+ score.

Installation

pip install nacvi

Usage

examples/usage_miniexample.py contains a minimal usage example with NumPy arrays as inputs.

examples/usage_example.ipynb contains a comprehensive example covering:

  • data generation,
  • execution of the DBSCAN clustering algorithm,
  • visualisation,
  • calculation of the NACVIs.
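As a rough sketch of the first two steps, the snippet below generates synthetic data and runs DBSCAN to obtain a data array and a label array in which -1 marks noise. It assumes scikit-learn is installed. The data set and the `eps` and `min_samples` values are arbitrary choices for illustration and are not taken from the notebook. Visualisation is omitted, and so is the call into nacvi, because its function names are not listed in this description.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Data generation: two dense blobs plus uniformly scattered background points.
rng = np.random.default_rng(42)
blobs, _ = make_blobs(
    n_samples=200,
    centers=[(0.0, 0.0), (6.0, 6.0)],
    cluster_std=0.4,
    random_state=42,
)
background = rng.uniform(low=-4.0, high=10.0, size=(30, 2))
X = np.vstack([blobs, background])

# Clustering: DBSCAN assigns the label -1 to points it considers noise.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)

n_clusters = len(set(labels.tolist()) - {-1})
n_noise = int(np.sum(labels == -1))
print(f"clusters found: {n_clusters}, noise points: {n_noise}")

# X (shape: n_samples x n_features) and labels (shape: n_samples) are the
# NumPy arrays a noise-aware validity index would then be computed from.
```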

