HiPart: Hierarchical divisive clustering toolbox

This repository presents HiPart, an open-source, native Python library that provides efficient and interpretable implementations of divisive hierarchical clustering algorithms. HiPart supports interactive visualizations that let the user manipulate the execution steps and intervene directly in the clustering outcome. The package is well suited to Big Data applications, since the implemented clustering methods were designed with computational efficiency in mind. Its dependencies are either Python built-in packages or well-maintained, stable external packages. The software is provided under the MIT license.

Installation

Installing the package requires only Python 3.8 or higher and the following command.

pip install HiPart

Simple Example Execution

The example below shows the simplest way to run the package. In short, it creates a synthetic clustering dataset containing 6 clusters, clusters it with the DePDDP algorithm, and returns only the cluster labels.

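from HiPart.clustering import DePDDP
from sklearn.datasets import make_blobs

# Generate a synthetic dataset of 1500 samples drawn from 6 clusters.
X, y = make_blobs(n_samples=1500, centers=6, random_state=0)

# Cluster the data with DePDDP and keep only the predicted cluster labels.
clustered_class = DePDDP(max_clusters_number=6).fit_predict(X)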
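For intuition about what a divisive hierarchical method does, the following is a minimal sketch of a basic PDDP-style procedure: start with all samples in one cluster, then repeatedly split a cluster in two by the sign of its samples' projections onto the first principal component. This is not HiPart's implementation. The function names `pddp_split` and `divisive_clustering` are hypothetical, the rule of always splitting the largest cluster is an assumption made for brevity, and the algorithms in HiPart may choose both the cluster to split and the split point differently.

```python
import numpy as np


def pddp_split(X):
    """Return a boolean mask that splits the rows of X into two groups."""
    centered = X - X.mean(axis=0)
    # The first right-singular vector of the centered data is the first
    # principal direction.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projections = centered @ vt[0]
    # Basic PDDP rule: split at the mean, i.e. by the sign of the projection.
    return projections <= 0.0


def divisive_clustering(X, max_clusters):
    """Top-down clustering: keep bisecting until max_clusters is reached."""
    labels = np.zeros(len(X), dtype=int)
    for new_label in range(1, max_clusters):
        # Assumption for this sketch: always split the largest cluster.
        target = int(np.argmax(np.bincount(labels)))
        idx = np.flatnonzero(labels == target)
        if len(idx) < 2:
            break
        left = pddp_split(X[idx])
        # Samples on the positive side of the split receive the new label.
        labels[idx[~left]] = new_label
    return labels


# Four well-separated synthetic blobs of 50 samples each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 1.0], [20.0, 0.0], [25.0, 1.0]])
X = np.vstack([c + rng.normal(scale=0.3, size=(50, 2)) for c in centers])

labels = divisive_clustering(X, max_clusters=4)
print(np.bincount(labels))  # number of samples per cluster
```

Because every split only ever looks at one cluster, the sequence of splits forms a binary tree, which is the structure that makes this family of methods easy to inspect step by step.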

The HiPart package offers a comprehensive suite of examples to guide users in utilizing its various algorithms. These examples are conveniently located in the repository's examples directory.

For a general overview of the package's capabilities, see the clustering_example file, which provides complete examples of the package's algorithms in action.

For users interested in KernelPCA methods, the clustering_with_kpca_example file gives a detailed example of how to apply KernelPCA within the HiPart package.
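As background for readers unfamiliar with the technique, the sketch below shows what kernel PCA computes, using NumPy only and an RBF kernel. It does not use HiPart and does not show how HiPart applies KernelPCA internally; the function name `rbf_kernel_pca` and the `gamma` value are assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel_pca(X, gamma, n_components):
    """Project the rows of X onto the leading kernel principal components."""
    # Pairwise squared Euclidean distances, clipped at zero to guard
    # against tiny negative values caused by round-off.
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)
    K = np.exp(-gamma * sq_dists)

    # Center the kernel matrix, which centers the data in feature space.
    n = K.shape[0]
    ones = np.full((n, n), 1.0 / n)
    K_centered = K - ones @ K - K @ ones + ones @ K @ ones

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    top = np.argsort(eigvals)[::-1][:n_components]
    # Scale each eigenvector by the square root of its eigenvalue to get
    # the projected coordinates of the training samples.
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
Z = rbf_kernel_pca(X, gamma=0.5, n_components=2)
print(Z.shape)  # (100, 2)
```

The resulting coordinates play the same role as ordinary principal component projections, but they can capture nonlinear structure in the original data.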

For clustering from similarity or dissimilarity matrices, such as distance matrices, the package includes the clustering_with_distance_matrix_example file, which demonstrates the use of the DePDDP algorithm with a distance matrix.
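As a reminder of what such an input looks like, here is a minimal sketch that builds a pairwise Euclidean distance matrix with NumPy. The helper name `euclidean_distance_matrix` is hypothetical, and the sketch deliberately stops before calling HiPart: the options for passing a distance matrix to DePDDP are not stated in this description, so refer to the example file for them.

```python
import numpy as np


def euclidean_distance_matrix(X):
    """Return the n x n matrix of Euclidean distances between rows of X."""
    # Broadcasting produces every pairwise difference vector at once.
    diffs = X[:, None, :] - X[None, :, :]
    return np.sqrt(np.sum(diffs ** 2, axis=-1))


X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
D = euclidean_distance_matrix(X)
print(D)
```

A valid distance matrix of this kind is square and symmetric with a zero diagonal, which is worth checking before handing it to any clustering routine.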

Lastly, the interactive visualization component is demonstrated in the interactive_visualization_example file, which shows how to launch the interactive visualization and gives instructions for navigating its GUI.

Together, these examples cover the package's main functionality and typical applications.

Documentation

The full documentation of the package can be found here.

Citation

@article{Anagnostou2023HiPart,
  title = {HiPart: Hierarchical Divisive Clustering Toolbox},
  author = {Panagiotis Anagnostou and Sotiris Tasoulis and Vassilis P. Plagianakos and Dimitris Tasoulis},
  year = {2023},
  journal = {Journal of Open Source Software},
  publisher = {The Open Journal},
  volume = {8},
  number = {84},
  pages = {5024},
  doi = {10.21105/joss.05024},
  url = {https://doi.org/10.21105/joss.05024}
} 

Acknowledgments

This project has received funding from the Hellenic Foundation for Research and Innovation (HFRI), under grant agreement No 1901.

Collaborators

Dimitris Tasoulis, Panagiotis Anagnostou, Sotiris Tasoulis, Vassilis Plagianakos
