A GridSearchCV-like object for clustering in sklearn
Project description
Cluster Optimizer
This is a simple object simulating the GridSearchCV object from scikit-learn (sklearn), but only for clustering. Instead of estimating predictive performance measures using a test fold, it simply calculates unsupervised scores such as the silhouette_score or davies_bouldin_score.
The object is instantiated with an sklearn clustering algorithm, e.g. KMeans, HDBSCAN, or similar from sklearn.cluster, and a set of parameter options. Scoring approaches are supplied as a list of scoring functions (silhouette_score, davies_bouldin_score, calinski_harabasz_score from sklearn.metrics).
Calling the ClusterOptimizer.optimize() method performs a grid search through the supplied parameter space. The scores from all supplied scoring functions are stored for every parameter combination.
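To make the idea concrete, here is a minimal sketch of such a grid search written directly against scikit-learn and pandas. It is not the package's own implementation and does not use the ClusterOptimizer API; the helper name grid_search_clustering, the synthetic data from make_blobs, and the chosen parameter values are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.model_selection import ParameterGrid


def grid_search_clustering(estimator_cls, param_grid, X, scorers):
    """Hypothetical helper: fit one estimator per parameter combination
    and score the resulting labels with every scoring function."""
    rows = []
    for params in ParameterGrid(param_grid):
        labels = estimator_cls(**params).fit_predict(X)
        row = dict(params)
        for scorer in scorers:
            # Each scorer takes (X, labels) and returns a single float.
            row[scorer.__name__] = scorer(X, labels)
        rows.append(row)
    return pd.DataFrame(rows)


# Synthetic data, used only so the sketch runs on its own.
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

results = grid_search_clustering(
    KMeans,
    {"n_clusters": [2, 3, 4, 5], "n_init": [10], "random_state": [0]},
    X,
    [silhouette_score, davies_bouldin_score, calinski_harabasz_score],
)
print(results)
```

Each row of the resulting DataFrame holds one parameter combination together with one column per scoring function. Note that these three scores require at least two distinct cluster labels, so a parameter combination that puts every sample in a single cluster would raise an error in this sketch.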
The results are available as ClusterOptimizer.results, which should return a pandas DataFrame.
For one or two parameters, the result DataFrame can be used together with seaborn for visualisation.
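As a rough sketch of the two-parameter case, the snippet below builds a small results table with scikit-learn directly (not with ClusterOptimizer), pivots it so that one parameter becomes the rows and the other the columns, and offers an optional seaborn heatmap. The column names, the synthetic data, and the helper plot_heatmap are assumptions for illustration; the real results DataFrame may be laid out differently.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.model_selection import ParameterGrid

# Synthetic data, used only so the sketch runs on its own.
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)

# Two varied parameters: the number of clusters and the init strategy.
grid = ParameterGrid({"n_clusters": [2, 3, 4], "init": ["k-means++", "random"]})

rows = []
for params in grid:
    labels = KMeans(n_init=10, random_state=0, **params).fit_predict(X)
    rows.append({**params, "silhouette_score": silhouette_score(X, labels)})
results = pd.DataFrame(rows)

# One parameter per axis, the score as the cell value.
heatmap_data = results.pivot(
    index="n_clusters", columns="init", values="silhouette_score"
)


def plot_heatmap(data):
    """Hypothetical helper: draw the pivoted scores as an annotated heatmap."""
    import seaborn as sns  # imported lazily so the pivot works without seaborn

    return sns.heatmap(data, annot=True, fmt=".2f")
```

Calling plot_heatmap(heatmap_data) draws the grid of scores. With a single varied parameter, a line plot of the score against that parameter (for example with seaborn's lineplot) is usually enough.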
Hashes for cluster_optimizer-0.0.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 88cef9f8e7fc74a54c0c809fc27a37a05cc659c3f6bc346ef4dd5996bf21ec2c
MD5 | d289234da2b46a4c2423a5d0a437fd30
BLAKE2b-256 | d8ed9b17675a9b68fe794e704c0a9e7a39c84237b4d0740e188dcb55f181076a