
A Python package for automated dimensionality reduction and clustering (both unsupervised and supervised).


pip install AutoClusterML

from AutoClusterML import AutoCluster

a=AutoCluster.autocluster(data,labels=None,clusters=None)

AutoClusterML

AutoClusterML is a Python module for automated dimensionality reduction and clustering (both unsupervised and supervised).

Features

  • Dimensionality reduction with a user-selected method and number of components

  • Automated implementation of clustering algorithms on the data (with and without the presence of labels).

  • Scatter plots for all clustering algorithms

  • Metric based bar plot comparison of clustering algorithms

  • Contingency matrices for all clustering algorithms

  • Confusion matrices for all clustering algorithms

dim_reducer

x_p=dim_reducer(x,method='PCA',components=2)

This function reduces the dimensionality of the data with the technique named in method, keeping the number of components given in components. It supports 10 techniques: PCA, factor analysis, ICA, Incremental PCA, Kernel PCA, Mini Batch Sparse PCA, NMF, Sparse PCA, Mini Batch NMF, and SVD.
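As a rough illustration of how a method name and a component count can select a reduction technique, the sketch below maps a few of the names above onto scikit-learn estimators. It is an assumption about the general approach, not AutoClusterML's actual implementation; the helper reduce_dimensions and the REDUCERS mapping are hypothetical names introduced only for this example.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis, FastICA, TruncatedSVD

# Hypothetical mapping from a method name to a scikit-learn estimator class.
# Only a subset of the ten techniques is shown.
REDUCERS = {
    "PCA": PCA,
    "factor analysis": FactorAnalysis,
    "ICA": FastICA,
    "SVD": TruncatedSVD,
}


def reduce_dimensions(x, method="PCA", components=2):
    """Project x onto `components` dimensions with the chosen technique."""
    if method not in REDUCERS:
        raise ValueError(f"unknown method: {method!r}")
    reducer = REDUCERS[method](n_components=components)
    return reducer.fit_transform(x)


rng = np.random.default_rng(0)
x = rng.normal(size=(100, 5))  # 100 samples, 5 features
x_p = reduce_dimensions(x, method="PCA", components=2)
print(x_p.shape)
```

Every estimator in the mapping accepts n_components and exposes fit_transform, which is what lets a single function serve all of them.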

autocluster

a=AutoCluster.autocluster(data,labels=None,clusters=None)

This function automatically applies 13 clustering algorithms: kmeans, kmeans-elkan algorithm, bisectingkmeans, bisectingkmeans-elkan algorithm, minibatchkmeans, agglomerative, agglomerative-single linkage, agglomerative-complete linkage, agglomerative-average linkage, birch-0.05 threshold, birch-0.1 threshold, birch-0.5 threshold, and spectral.

If labels are not given, only three metrics are used for evaluation: silhouette, Davies-Bouldin, and Calinski-Harabasz.
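The label-free evaluation described above can be sketched with scikit-learn directly. The example below runs a handful of the listed algorithms on toy data and scores each with the three internal metrics. It is a minimal sketch under assumptions: the toy data, the dictionary layout, and the choice of four algorithms are illustrative and do not reflect how AutoClusterML builds its table.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, Birch, KMeans
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

# Toy data: three well-separated 2-D blobs of 30 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
data = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# A subset of the algorithms named above, keyed by illustrative names.
algorithms = {
    "kmeans": KMeans(n_clusters=3, n_init=10, random_state=0),
    "kmeans-elkan algorithm": KMeans(
        n_clusters=3, n_init=10, algorithm="elkan", random_state=0
    ),
    "agglomerative-single linkage": AgglomerativeClustering(
        n_clusters=3, linkage="single"
    ),
    "birch-0.5 threshold": Birch(n_clusters=3, threshold=0.5),
}

results = {}
for name, model in algorithms.items():
    pred = model.fit_predict(data)
    results[name] = {
        "silhouette": silhouette_score(data, pred),
        "davies-bouldin": davies_bouldin_score(data, pred),
        "calinski-harabasz": calinski_harabasz_score(data, pred),
    }

for name, scores in results.items():
    print(name, scores)
```

When reading the scores, note that higher is better for silhouette (range -1 to 1) and Calinski-Harabasz, while lower is better for Davies-Bouldin.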

If labels are given, the following metrics are used for cluster evaluation: 'accuracy', 'precision', 'recall', 'f1', 'rand', 'adjusted rand', 'mutual info', 'adjusted mutual info', 'fowlkes mallows', 'homogeneity measure', 'v measure', 'silhouette', 'davies-bouldin', 'calinski-harabasz', 'contingency matrix', and 'confusion matrix'.
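Cluster ids are arbitrary, so label-based scores such as accuracy only make sense once each cluster has been matched to a true label. The sketch below shows one simple way to do that using only the standard library: count label/cluster co-occurrences (a contingency table), map each cluster to its majority label, then compute accuracy. The source does not say how AutoClusterML aligns clusters with labels, so the majority-vote rule and all function names here are assumptions made for illustration.

```python
from collections import Counter


def contingency(labels, clusters):
    """Count how often each (true label, cluster id) pair occurs."""
    return Counter(zip(labels, clusters))


def majority_mapping(labels, clusters):
    """Map each cluster id to the true label it overlaps with most."""
    best = {}
    for (label, cluster), count in contingency(labels, clusters).items():
        if cluster not in best or count > best[cluster][1]:
            best[cluster] = (label, count)
    return {cluster: label for cluster, (label, _) in best.items()}


def mapped_accuracy(labels, clusters):
    """Accuracy after relabelling every cluster with its majority label."""
    mapping = majority_mapping(labels, clusters)
    hits = sum(mapping[c] == y for y, c in zip(labels, clusters))
    return hits / len(labels)


labels = ["a", "a", "a", "b", "b", "b"]
clusters = [1, 1, 0, 0, 0, 0]

print(contingency(labels, clusters))
print(majority_mapping(labels, clusters))
print(mapped_accuracy(labels, clusters))
```

In this toy case cluster 1 maps to "a" and cluster 0 maps to "b", so five of the six points are counted as correct. Majority voting can assign the same label to several clusters; a one-to-one assignment would need a different matching rule.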

Finally, the function returns a table that compares the clustering algorithms on the metrics listed above.

plot_confusion_matrix

plot_confusion_matrix(a)

This function takes the comparative table as input and plots the confusion matrix for all clustering algorithms.

plot_scatter_plot

plot_scatter_plot(data,a)

This function takes the data and the comparative table as input and plots a scatter plot for each clustering algorithm.

get_metric_plot

get_metric_plot(a,metric)

This function takes the comparative table and a metric name as input and plots a bar plot comparing all clustering algorithms on that metric.
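For readers who want a sense of what such a comparison looks like, here is a minimal matplotlib sketch of a per-algorithm bar plot for one metric. The layout of the comparative table is not documented here, so the example assumes a plain dictionary of algorithm name to score; the values are made up for illustration, and metric_bar_plot is a hypothetical helper, not part of AutoClusterML.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up scores for illustration only; not real results.
scores = {"kmeans": 0.71, "birch-0.5 threshold": 0.64, "spectral": 0.58}


def metric_bar_plot(scores, metric):
    """Draw one bar per algorithm for the given metric."""
    fig, ax = plt.subplots()
    bars = ax.bar(list(scores), list(scores.values()))
    ax.set_ylabel(metric)
    ax.set_title(f"{metric} by clustering algorithm")
    ax.tick_params(axis="x", rotation=45)  # long names stay readable
    fig.tight_layout()
    return fig, ax, bars


fig, ax, bars = metric_bar_plot(scores, "silhouette")
```

Calling fig.savefig with a file name would write the chart to disk.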

plot_contingency_matrix

plot_contingency_matrix(a)

This function takes the comparative table as input and plots the contingency matrix for all clustering algorithms.
