A 2D clustering algorithms visualization package

License: MIT

ClustViz

2D Clustering Algorithms Visualization

Check out ClustVizGUI, too!

ClustViz aims to visualize every step of each clustering algorithm on 2D input data.

The following algorithms have been examined:

  • OPTICS
  • DBSCAN
  • HDBSCAN
  • SPECTRAL CLUSTERING
  • HIERARCHICAL AGGLOMERATIVE CLUSTERING
    • single linkage
    • complete linkage
    • average linkage
    • Ward's method
  • CURE
  • BIRCH
  • PAM
  • CLARA
  • CLARANS
  • CHAMELEON
  • CHAMELEON2
  • DENCLUE

Instructions

Documentation: click here

Install with

pip install clustviz

To run the BIRCH algorithm, the open source visualization software Graphviz is required. Install Graphviz from the official webpage (https://graphviz.gitlab.io/download/) or with Homebrew, then add its location to the PATH variable as follows (replace the string with the path where you installed Graphviz):

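import os
import sys

# append the Graphviz folder to PATH (adjust it to your install location)
if sys.platform.startswith("win"):
    # on Windows usually
    os.environ["PATH"] += os.pathsep + 'C:/Program Files (x86)/Graphviz2.38/bin'
elif sys.platform == "darwin":
    # on macOS usually
    os.environ["PATH"] += os.pathsep + '/usr/local/bin'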
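To confirm that the PATH change worked before running BIRCH, a quick check like the one below can help. It is only a sketch: it assumes the Graphviz executable is named dot (the standard Graphviz layout program), and find_graphviz_dot with its extra_dirs parameter is a hypothetical helper, not part of clustviz.

```python
import os
import shutil


def find_graphviz_dot(extra_dirs=()):
    """Return the full path of the Graphviz `dot` executable, or None.

    extra_dirs is a hypothetical convenience parameter: folders to search
    in addition to the ones already listed in PATH.
    """
    search_path = os.pathsep.join([os.environ.get("PATH", ""), *extra_dirs])
    return shutil.which("dot", path=search_path)


if __name__ == "__main__":
    dot = find_graphviz_dot()
    if dot:
        print("Graphviz found at:", dot)
    else:
        print("Graphviz not found; check the folder added to PATH")
```

If the function returns None, the folder appended to PATH most likely does not match the actual Graphviz install location.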

To run the CHAMELEON and CHAMELEON2 algorithms, the METIS library is required. To install it on macOS, execute the following commands (partially taken from here):

# download the file using wget (do it from the website if you prefer)
wget http://glaros.dtc.umn.edu/gkhome/fetch/sw/metis/metis-5.1.0.tar.gz
# uncompress it
gunzip metis-5.1.0.tar.gz
# untar it
tar -xvf metis-5.1.0.tar
# remove the tar
rm metis-5.1.0.tar
# go inside the folder
cd metis-5.1.0
# install it using make
make config shared=1
make install
# point METIS_DLL at the shared library that was just installed
export METIS_DLL=/usr/local/lib/libmetis.dylib

To install METIS on Windows, go to conda-metis and follow the instructions.
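A missing or misplaced METIS library is a common reason for CHAMELEON to fail at import time, so it can be worth checking the environment first. The sketch below assumes that the METIS_DLL variable exported above is what the METIS wrapper looks up; metis_library_status is a hypothetical helper written for illustration, not a clustviz function.

```python
import os


def metis_library_status(env=None):
    """Describe whether METIS_DLL points at an existing file.

    env defaults to os.environ; passing a plain dict makes the check
    easy to try out without touching the real environment.
    """
    env = os.environ if env is None else env
    path = env.get("METIS_DLL")
    if not path:
        return "METIS_DLL is not set"
    if not os.path.isfile(path):
        return f"METIS_DLL is set to {path!r}, but no file exists there"
    return f"METIS shared library found at {path!r}"


if __name__ == "__main__":
    print(metis_library_status())
```

Note that a variable set with export only lasts for the current shell session, so the check should be run from the same session (or after adding the export to your shell profile).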

Usage

Let's see a basic example using OPTICS:

from clustviz.optics import OPTICS, plot_clust
from sklearn.datasets import make_blobs

# create a random dataset
X, y = make_blobs(n_samples=30, centers=4, n_features=2, cluster_std=1.8, random_state=42)

# perform OPTICS algorithm, with plotting enabled
ClustDist, CoreDist = OPTICS(X, eps=2, minPTS=3, plot=True, plot_reach=True)

# plot the final clusters
plot_clust(X, ClustDist, CoreDist, eps=2, eps_db=1.9)

For many other examples, take a look at the detailed clustviz_example notebook.
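To give some intuition for the CoreDist output and the eps and minPTS parameters used above, here is a minimal pure-Python sketch of the core distance as defined for OPTICS. It is an illustration under stated assumptions, not the clustviz implementation: core_distances is a hypothetical function, and it assumes the convention in which a point counts as its own neighbour, which may differ from the one clustviz uses.

```python
import math


def core_distances(points, eps, min_pts):
    """Core distance of each point, or None when it is not a core point.

    Assumed convention: a point is included in its own neighbourhood, so
    the core distance is the distance to the min_pts-th closest point
    (itself included), provided that distance does not exceed eps.
    """
    result = []
    for p in points:
        # Sorted distances from p to every point; the first one is 0.0 (p itself).
        dists = sorted(math.dist(p, q) for q in points)
        if len(dists) >= min_pts and dists[min_pts - 1] <= eps:
            result.append(dists[min_pts - 1])
        else:
            result.append(None)
    return result


# Three nearby points and one far-away outlier.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
print(core_distances(points, eps=2, min_pts=3))
```

In this toy dataset the three nearby points are core points, while the isolated point gets None because fewer than min_pts points lie within eps of it.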

Repository structure

  1. The folder data/DOCUMENTS contains the official papers, PowerPoint presentations and other PDFs about the algorithms involved and about clustering in general.

  2. The folder clustviz contains the scripts necessary to run the clustering algorithms.

  3. The notebook data/clustviz_example.ipynb lets the user run every algorithm on 2D datasets; it contains a subsection for every algorithm, with the necessary modules and functions imported and some commented lines of code which can be uncommented to run the algorithms.

  4. The folder docs contains the necessary files to build the documentation using Sphinx and ReadTheDocs.

  5. The folder tests contains pytest tests.

Credits for some algorithms

I did not write the scripts for every algorithm from scratch: in some cases I modified existing Python libraries, in others I adapted scripts from publicly available GitHub repositories. The following list provides all the sources used when I did not write all the code myself:

The other algorithms have been implemented from scratch following the respective papers. Thanks to Darius (https://github.com/dariomonici), the GUI Meister, for the help with PyQt5, used for ClustVizGUI.

Possible improvements

  • add more clustering algorithms
  • comment every code block and improve code quality
  • pymetis doesn't work on Windows, but could be an option for macOS
  • add highlights to docstrings using ``
  • show type hint aliases using Sphinx (open issue)

TravisCI path

  • if Travis CI doesn't trigger, it is probably because .travis.yml isn't properly formatted. Use yamllint to correct it
  • add package update
  • for the deployment phase: brew install ruby, brew install travis
  • added an empty conftest.py in the clustviz folder so that tests run on Windows
