A GridSearchCV-like object for clustering in sklearn
Cluster Optimizer
Cluster Optimizer is a simple object modelled on the GridSearchCV object from scikit-learn (sklearn), but only for clustering. Clustering has no target labels, so instead of estimating predictive performance on a test fold it calculates unsupervised scores such as silhouette_score or davies_bouldin_score.
The object is instantiated with an sklearn clustering algorithm (e.g. KMeans, HDBSCAN, or similar from sklearn.cluster) and a set of parameter options. Several scoring approaches can be supplied as a list of scoring functions (silhouette_score, davies_bouldin_score, or calinski_harabasz_score from sklearn.metrics).
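To make the scoring functions concrete, here is a minimal sketch that scores a single KMeans fit with all three metrics using plain scikit-learn. It does not use ClusterOptimizer itself; the synthetic data from make_blobs and the chosen parameter values are assumptions made purely for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

# Synthetic data with three blobs (illustrative only, not from the project).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

# Fit one candidate configuration and obtain a cluster label per sample.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Each scoring function takes the data and the predicted labels.
scoring = [silhouette_score, davies_bouldin_score, calinski_harabasz_score]
scores = {fn.__name__: fn(X, labels) for fn in scoring}

for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Note that the metrics point in different directions: higher is better for silhouette_score and calinski_harabasz_score, while lower is better for davies_bouldin_score.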
Calling the ClusterOptimizer.optimize() method performs a grid search over the supplied parameter space. The scores from every supplied scoring function are stored for each parameter combination.
The results can be obtained from ClusterOptimizer.results, which should return a pandas DataFrame.
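The idea of a grid search that records unsupervised scores can be sketched with scikit-learn and pandas alone. The helper below, grid_search_clustering, is a hypothetical stand-in written for this example; it is not the ClusterOptimizer implementation, and its signature and the column layout of the returned DataFrame are assumptions.

```python
import pandas as pd
from sklearn.base import clone
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import davies_bouldin_score, silhouette_score
from sklearn.model_selection import ParameterGrid


def grid_search_clustering(estimator, param_grid, X, scoring):
    """Hypothetical helper: fit every parameter combination and score it."""
    rows = []
    for params in ParameterGrid(param_grid):
        # Work on a fresh copy so the caller's estimator is left untouched.
        model = clone(estimator).set_params(**params)
        labels = model.fit_predict(X)

        # These scores need at least 2 and at most n_samples - 1 clusters.
        n_labels = len(set(labels))
        valid = 2 <= n_labels < len(X)

        row = dict(params)
        for fn in scoring:
            row[fn.__name__] = fn(X, labels) if valid else float("nan")
        rows.append(row)
    return pd.DataFrame(rows)


# Synthetic data (illustrative only).
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

results = grid_search_clustering(
    KMeans(n_init=10, random_state=0),
    {"n_clusters": [2, 3, 4, 5, 6]},
    X,
    [silhouette_score, davies_bouldin_score],
)
print(results)
```

Each row holds one parameter combination followed by one column per scoring function, so the best setting for a given metric can be found with ordinary DataFrame operations such as sorting by that column.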
For one or two parameters, the result DataFrame can be used together with seaborn for visualisation.
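For the two-parameter case, one option is to pivot the scores into a grid and draw it with seaborn's heatmap. The sketch below builds such a table directly with scikit-learn rather than with ClusterOptimizer; the column names (n_clusters, linkage, silhouette_score), the helper plot_score_heatmap, and the output file name are all assumptions chosen for the example.

```python
import pandas as pd
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.model_selection import ParameterGrid

# Synthetic data (illustrative only).
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)

# Two-parameter grid: number of clusters and linkage criterion.
grid = ParameterGrid(
    {"n_clusters": [2, 3, 4, 5], "linkage": ["ward", "complete", "average"]}
)

rows = []
for params in grid:
    labels = AgglomerativeClustering(**params).fit_predict(X)
    rows.append({**params, "silhouette_score": silhouette_score(X, labels)})
results = pd.DataFrame(rows)

# One row per linkage, one column per n_clusters, score in each cell.
pivot = results.pivot(
    index="linkage", columns="n_clusters", values="silhouette_score"
)
print(pivot.round(3))


def plot_score_heatmap(table, path="silhouette_heatmap.png"):
    """Hypothetical helper: draw the pivoted scores as an annotated heatmap."""
    # Imported here so the table-building part runs without plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(6, 3))
    sns.heatmap(table, annot=True, fmt=".2f", cmap="viridis", ax=ax)
    ax.set_title("silhouette_score")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
```

Calling plot_score_heatmap(pivot) writes the figure to disk. For a single parameter, a line plot of the score against that parameter (for example with seaborn's lineplot) is usually enough.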