A GridSearchCV-like object for clustering in sklearn

Project description

Cluster Optimizer

Cluster Optimizer is a simple object that mimics the GridSearchCV object from scikit-learn (sklearn), but for clustering only. Instead of estimating predictive performance on a held-out test fold, it computes unsupervised scores such as silhouette_score or davies_bouldin_score.
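As a minimal sketch of what such unsupervised scores look like, the snippet below fits KMeans on synthetic data and scores the resulting labels with scikit-learn directly. It does not use this package; the dataset and parameter values are assumptions chosen only for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import davies_bouldin_score, silhouette_score

# Synthetic data (illustrative assumption): three compact blobs.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

# Cluster without any ground-truth labels.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Both scores need only the data and the predicted labels.
sil = silhouette_score(X, labels)      # range [-1, 1], higher is better
dbi = davies_bouldin_score(X, labels)  # range [0, inf), lower is better

print(f"silhouette={sil:.3f}, davies_bouldin={dbi:.3f}")
```

Note that the two scores point in opposite directions (higher versus lower is better), which matters when comparing parameter settings.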

The object is instantiated with an sklearn clustering algorithm, e.g. KMeans, HDBSCAN, or similar from sklearn.cluster, and a set of parameter options. Different scoring approaches can be supplied as a list of scoring functions (silhouette_score, davies_bouldin_score, calinski_harabasz_score from sklearn.metrics).

Calling the ClusterOptimizer.optimize() method performs a grid search over the supplied parameter space. The scores from all supplied scoring functions are stored for every parameter combination.
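The idea of such a grid search can be sketched with plain scikit-learn. This is not the package's implementation, and it deliberately avoids the ClusterOptimizer API, whose exact signature is not shown here; the names param_grid, scorers and records are hypothetical.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.model_selection import ParameterGrid

# Synthetic data (illustrative assumption).
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# Hypothetical parameter space and list of scoring functions.
param_grid = {"n_clusters": [2, 3, 4, 5], "init": ["k-means++", "random"]}
scorers = [silhouette_score, davies_bouldin_score, calinski_harabasz_score]

records = []
for params in ParameterGrid(param_grid):  # every combination of the options
    labels = KMeans(n_init=10, random_state=0, **params).fit_predict(X)
    row = dict(params)
    for scorer in scorers:
        # Store each score under the scoring function's name.
        row[scorer.__name__] = scorer(X, labels)
    records.append(row)

print(f"evaluated {len(records)} parameter combinations")
```

Each entry in records holds one parameter combination together with all of its scores, which maps naturally onto one row of a results table.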

The results are available via ClusterOptimizer.results, which should return a pandas DataFrame.
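A results table of this kind might be handled as in the sketch below. The DataFrame here is built by hand with scikit-learn and pandas; its column names are assumptions and may differ from what ClusterOptimizer.results actually returns.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data (illustrative assumption).
X, _ = make_blobs(n_samples=200, centers=3, cluster_std=0.6, random_state=1)

rows = []
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    rows.append({"n_clusters": k, "silhouette_score": silhouette_score(X, labels)})

# One row per parameter setting, one column per parameter or score.
results = pd.DataFrame(rows)

# Rank the settings and pick the row with the highest silhouette score.
ranked = results.sort_values("silhouette_score", ascending=False)
best = results.loc[results["silhouette_score"].idxmax()]

print(ranked)
print("best n_clusters:", int(best["n_clusters"]))
```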

For one or two parameters, the result DataFrame can be used together with seaborn for visualisation.
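For two parameters, one possible visualisation is a heatmap of a single score. The sketch below assumes a results table with hypothetical columns n_clusters, linkage and silhouette_score, built here with AgglomerativeClustering rather than with this package.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script runs headless

import pandas as pd
import seaborn as sns
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data (illustrative assumption).
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)

rows = []
for linkage in ["ward", "complete", "average"]:
    for k in [2, 3, 4, 5]:
        labels = AgglomerativeClustering(n_clusters=k, linkage=linkage).fit_predict(X)
        rows.append(
            {
                "linkage": linkage,
                "n_clusters": k,
                "silhouette_score": silhouette_score(X, labels),
            }
        )
results = pd.DataFrame(rows)

# Reshape to a 2D grid: one parameter per axis, the score as cell value.
table = results.pivot(index="linkage", columns="n_clusters", values="silhouette_score")

ax = sns.heatmap(table, annot=True, fmt=".2f", cmap="viridis")
ax.set_title("Silhouette score by linkage and n_clusters")
# ax.get_figure().savefig("silhouette_heatmap.png") would write the plot to disk.
```

With a single parameter, sns.lineplot(data=results, x="n_clusters", y="silhouette_score") would be the simpler choice.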

Download files

Source Distribution

cluster_optimizer-0.0.1.tar.gz (5.2 kB)

Uploaded Source

Built Distribution

cluster_optimizer-0.0.1-py3-none-any.whl (5.0 kB)

Uploaded Python 3
