Internal Cluster Validity Indices in Python, compatible with time-series data

PyCVI

PyCVI is a Python package specialized in internal Clustering Validity Indices (CVIs). Internal CVIs are used to select the best clustering among a set of pre-computed clusterings when no external information, such as the labels of the datapoints, is available.
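To make the selection step concrete, here is a minimal pure-Python sketch that scores two pre-computed clusterings with the silhouette index and keeps the best one. It does not use PyCVI's API: the function names `silhouette` and `select_best` and the toy data are assumptions made for this illustration only.

```python
from math import dist


def silhouette(points, labels):
    """Mean silhouette width of one clustering (higher is better)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # convention: a singleton cluster contributes 0
        # a: mean distance to the other members of the same cluster
        # (dist(p, p) == 0, so summing over the whole cluster is safe)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def select_best(points, clusterings):
    """Return the key of the clustering with the highest silhouette."""
    return max(clusterings, key=lambda k: silhouette(points, clusterings[k]))


# Toy data: three well-separated pairs of points (hypothetical).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0), (10.0, 1.0)]
# Pre-computed clusterings, keyed by their number of clusters.
clusterings = {
    2: [0, 0, 1, 1, 1, 1],
    3: [0, 0, 1, 1, 2, 2],
}
best_k = select_best(points, clusterings)
```

No labels are needed at any point: the index only looks at the distances between datapoints and at the cluster assignments.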

All CVIs rely on the definition of a distance between datapoints, and most of them also rely on the notion of a cluster center.

For non-time-series data, the distance used is usually the Euclidean distance and the cluster center is defined as the usual average. Libraries such as scipy, numpy, scikit-learn, etc. offer a large selection of distance measures that are compatible with all their functions.
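The dependence on a distance and a cluster center can be sketched as follows. This is an illustrative pure-Python example, not PyCVI code: `within_cluster_dispersion` and `mean_center` are hypothetical names, and the distance and center are passed as plain callables so that they can be swapped.

```python
from math import dist as euclidean


def mean_center(cluster):
    """Component-wise arithmetic mean of a list of equal-length points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def within_cluster_dispersion(clusters, distance=euclidean, center=mean_center):
    """Sum of squared distances between each point and its cluster center."""
    total = 0.0
    for cluster in clusters:
        c = center(cluster)
        total += sum(distance(p, c) ** 2 for p in cluster)
    return total


# Two toy clusters of 2-D points (hypothetical data).
clusters = [[(0.0, 0.0), (2.0, 0.0)], [(10.0, 10.0), (10.0, 14.0)]]
wcss = within_cluster_dispersion(clusters)
```

Quantities of this kind appear inside several CVIs, which is why changing the type of data means replacing both the `distance` and the `center` arguments consistently.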

For time-series data, however, common distances are Dynamic Time Warping (DTW)¹ or Move-Split-Merge (MSM)[^MSM], and the barycenter of a group of time series is then not defined as the usual mean but as the DTW Barycentric Average (DBA)² or the MSM Barycentric Average (MBA)[^MBA]. Unfortunately, DTW, MSM, DBA and MBA are not compatible with the libraries mentioned above, which, among other reasons, made it necessary to develop additional machine learning libraries specialized in time-series data, such as aeon, sktime and tslearn.
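As an illustration of how DTW differs from a point-wise distance, here is a minimal sketch of the classic dynamic-programming recurrence for univariate series. It assumes the absolute difference as local cost and no warping window; libraries such as aeon may use a different local cost and offer more options, so the values below are not meant to match theirs. The name `dtw_distance` is chosen for this example only.

```python
def dtw_distance(x, y):
    """DTW cost between two univariate series (absolute-difference local cost)."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j]: cheapest alignment of the first i points of x
    # with the first j points of y
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # x advances, y repeats its point
                cost[i][j - 1],      # y advances, x repeats its point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The same shape with a repeated first value: the warping absorbs the delay.
delayed = dtw_distance([0, 0, 1, 2], [0, 1, 2])
# Same length, shifted values: DTW finds a cheaper alignment than point-wise.
shifted = dtw_distance([1, 2, 3], [2, 3, 4])
```

The two series in the first call have different lengths, a case in which the Euclidean distance is not even defined.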

PyCVI therefore implements 12 state-of-the-art internal CVIs and extends them to make them compatible with time-series data and the associated distance and averaging functions. To compute DTW, MSM, DBA, MBA, etc., PyCVI relies on the aeon library.

Documentation

The full documentation is available at pycvi.readthedocs.io.

Features

12 internal CVIs implemented: Hartigan³, Calinski-Harabasz⁴, GapStatistic⁵, Silhouette⁶, ScoreFunction⁷, Maulik-Bandyopadhyay⁸, SD⁹, SDbw¹⁰, Dunn¹¹, Xie-Beni¹², XB*¹³ and Davies-Bouldin¹⁴.
Compute CVI values and select the best clustering based on the results.
Compatible with time series and their distance and averaging functions, such as Dynamic Time Warping (DTW)¹, Move-Split-Merge (MSM)[^MSM], DTW Barycentric Average (DBA)², MSM Barycentric Average (MBA)[^MBA], etc.
Compatible with scikit-learn, scikit-learn-extra, kmedoids, aeon and sktime, for easy integration into any clustering pipeline in Python.
Can compute the clusterings beforehand if provided with a sklearn-like clustering class.
Enables users to define custom CVIs.
Multiple CVIs can easily be combined to select the best clustering based on a majority vote.
Variation of Information¹⁵ implemented (distance between clusterings).
Facilitates the use of time-series distances directly in some of the models implemented in scikit-learn such as AgglomerativeClustering.
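The Variation of Information mentioned in the list above measures how far apart two clusterings of the same datapoints are. The following pure-Python sketch follows the usual definition, VI = H(A) + H(B) - 2 I(A, B), using natural logarithms. It is an illustration only: the function name is an assumption, and PyCVI's own implementation may differ in signature and normalization.

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b):
    """Variation of Information between two clusterings of the same points."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same datapoints")
    n = len(labels_a)
    size_a = Counter(labels_a)                  # cluster sizes in clustering A
    size_b = Counter(labels_b)                  # cluster sizes in clustering B
    overlap = Counter(zip(labels_a, labels_b))  # sizes of the intersections

    vi = 0.0
    for (a, b), n_ab in overlap.items():
        r = n_ab / n  # joint probability of cluster a and cluster b
        vi -= r * (log(r / (size_a[a] / n)) + log(r / (size_b[b] / n)))
    return vi


# Same partition, different label names: the distance is 0.
same = variation_of_information([0, 0, 1, 1], [1, 1, 0, 0])
# Two partitions that cut the data in "orthogonal" ways.
different = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the measure only depends on the overlaps between clusters, it is insensitive to how the clusters are named.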

Install

With uv:

# From PyPI
uv add pycvi-lib
# Alternatively, from github directly
uv add "pycvi-lib @ git+https://github.com/nglm/pycvi.git"

With poetry:

# From PyPI
poetry add pycvi-lib
# Alternatively, from github directly
poetry add git+https://github.com/nglm/pycvi.git

With pip:

# From PyPI
pip install pycvi-lib
# Alternatively, from github directly
pip install git+https://github.com/nglm/pycvi.git

With anaconda:

# activate your environment (replace myEnv with your environment name)
conda activate myEnv
# install pip first in your environment
conda install pip
# install pycvi on your anaconda environment with pip
pip install pycvi-lib

Extra dependencies

In order to run the example scripts, extra dependencies are necessary. The install command is then:

# For uv (the quotes prevent shells such as zsh from expanding the brackets)
uv add "pycvi-lib[examples]"
# For poetry
poetry add "pycvi-lib[examples]"
# For pip and anaconda
pip install "pycvi-lib[examples]"

Alternatively, you can manually install the packages required by the example scripts (matplotlib) in your environment.

Important note: As of now (June 2026), the latest version of scikit-learn-extra (0.3.0) is not compatible with numpy >= 2.0.0. Users who wish to combine scikit-learn-extra with PyCVI must make sure that they are using a compatible version of numpy.
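One possible way to check this constraint from Python is sketched below, using only the standard library. The helper names are hypothetical, and the only rule encoded is the one stated in the note above (numpy below 2.0.0).

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string):
    """Return the leading integer of a version string such as '2.1.0'."""
    return int(version_string.split(".")[0])


def numpy_ok_for_sklearn_extra(numpy_version):
    """True if the given numpy version is below 2.0.0."""
    return major_version(numpy_version) < 2


try:
    installed = version("numpy")
except PackageNotFoundError:
    installed = None  # numpy is not installed in this environment

if installed is not None and not numpy_ok_for_sklearn_extra(installed):
    print(f"numpy {installed} found: scikit-learn-extra 0.3.0 needs numpy < 2.0.0")
```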

If you wish to run the example scripts on your own computer, please follow the instructions detailed in the documentation first: Running example scripts on your computer.

Contribute

Issue Tracker: github.com/nglm/pycvi/issues.
Source Code: github.com/nglm/pycvi.

Support

If you are having issues, please let me know or create an issue.

License

The project is licensed under the MIT license.

How to cite PyCVI

If you are using PyCVI in your work, please cite us by using one of the following entries referring to the JOSS paper "PyCVI: A Python package for internal Cluster Validity Indices, compatible with time-series data" by N. Galmiche:

BibTeX

@article{Galmiche2024,
    author = {Natacha Galmiche},
    title = {PyCVI: A Python package for internal Cluster Validity Indices, compatible with time-series data},
    doi = {10.21105/joss.06841},
    url = {https://doi.org/10.21105/joss.06841},
    year = {2024},
    publisher = {The Open Journal},
    volume = {9},
    number = {102},
    pages = {6841},
    journal = {Journal of Open Source Software}
}

Plain text

Galmiche, N., (2024). PyCVI: A Python package for internal Cluster Validity Indices, compatible with time-series data. Journal of Open Source Software, 9(102), 6841, https://doi.org/10.21105/joss.06841

1. Donald J. Berndt and James Clifford. Using dynamic time warping to find patterns in time series. In Proceedings of the 3rd International Conference on Knowledge Discovery and Data Mining, AAAIWS’94, pages 359–370. AAAI Press, 1994.
2. F. Petitjean, A. Ketterlin, and P. Gançarski, “A global averaging method for dynamic time warping, with applications to clustering,” Pattern Recognition, vol. 44, pp. 678–693, Mar. 2011.
3. D. J. Strauss and J. A. Hartigan, “Clustering algorithms,” Biometrics, vol. 31, p. 793, Sep. 1975.
4. T. Calinski and J. Harabasz, “A dendrite method for cluster analysis,” Communications in Statistics - Theory and Methods, vol. 3, no. 1, pp. 1–27, 1974.
5. R. Tibshirani, G. Walther, and T. Hastie, “Estimating the number of clusters in a data set via the gap statistic,” Journal of the Royal Statistical Society Series B: Statistical Methodology, vol. 63, pp. 411–423, July 2001.
6. P. J. Rousseeuw, “Silhouettes: a graphical aid to the interpretation and validation of cluster analysis,” Journal of Computational and Applied Mathematics, vol. 20, pp. 53–65, 1987.
7. S. Saitta, B. Raphael, and I. F. C. Smith, “A bounded index for cluster validity,” in Machine Learning and Data Mining in Pattern Recognition, pp. 174–187, Springer Berlin Heidelberg, 2007.
8. U. Maulik and S. Bandyopadhyay, “Performance evaluation of some clustering algorithms and validity indices,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, pp. 1650–1654, Dec. 2002.
9. M. Halkidi, M. Vazirgiannis, and Y. Batistakis, “Quality scheme assessment in the clustering process,” in Principles of Data Mining and Knowledge Discovery, pp. 265–276, Springer Berlin Heidelberg, 2000.
10. M. Halkidi and M. Vazirgiannis, “Clustering validity assessment: finding the optimal partitioning of a data set,” in Proceedings 2001 IEEE International Conference on Data Mining, pp. 187–194, IEEE Comput. Soc, 2001.
11. J. C. Dunn, “Well-separated clusters and optimal fuzzy partitions,” Journal of Cybernetics, vol. 4, pp. 95–104, Jan. 1974.
12. X. Xie and G. Beni, “A validity measure for fuzzy clustering,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 841–847, 1991.
13. M. Kim and R. Ramakrishna, “New indices for cluster validity assessment,” Pattern Recognition Letters, vol. 26, pp. 2353–2363, Nov. 2005.
14. D. L. Davies and D. W. Bouldin, “A cluster separation measure,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-1, pp. 224–227, Apr. 1979.
15. M. Meilă, Comparing Clusterings by the Variation of Information, pp. 173–187. Springer Berlin Heidelberg, 2003.
[^MSM] Stefan, Alexandra et al. “The Move-Split-Merge Metric for Time Series.” IEEE Transactions on Knowledge and Data Engineering 25 (2013): 1425-1438.
[^MBA] Christopher Holder, David Guijo-Rubio, and Anthony Bagnall. Barycentre averaging for the move-split-merge time series distance measure. 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (2023).

Release history

1.0.1 (this version): Jul 3, 2026
1.0.0: Jun 25, 2026
0.1.7: Jun 12, 2026
0.1.6: Feb 13, 2026
0.1.5: Oct 23, 2024
0.1.4: Sep 24, 2024
0.1.3: Aug 20, 2024
0.1.2: Jul 2, 2024
0.1.1: Dec 23, 2023
0.1.0: Dec 23, 2023
