Automated machine learning toolkit for performing clustering tasks.
autocluster
autocluster is an automated machine learning (AutoML) toolkit for performing clustering tasks.
Report and presentation slides can be found here and here.
Prerequisites
- Python 3.5 or above
- Linux (Windows users can run it under WSL)
How to get started?
- First, install SMAC and its build dependencies:
sudo apt-get install build-essential swig
conda install gxx_linux-64 gcc_linux-64 swig
pip install smac==0.8.0
- Then, install autocluster:
pip install autocluster
How does it work?
- autocluster automatically optimizes the configuration of a clustering problem. By configuration, we mean:
  - choice of dimension reduction algorithm
  - choice of clustering model
  - setting of dimension reduction algorithm's hyperparameters
  - setting of clustering model's hyperparameters
- autocluster provides 3 different approaches to optimize the configuration (with increasing complexity):
  - random optimization
  - Bayesian optimization
  - Bayesian optimization + meta-learning (warmstarting)
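To make the idea of a configuration and of random optimization more concrete, here is a minimal sketch in plain Python. It is not autocluster's implementation or API: the search space, the hyperparameter ranges, and the helper names (`sample_configuration`, `random_search`, `toy_cost`) are all assumptions made for illustration. In practice the cost function would fit the chosen models on the data and return a clustering quality measure.

```python
import random

# Hypothetical search space. The names mirror scikit-learn estimators, but the
# hyperparameter ranges are illustrative and not taken from autocluster.
DIM_REDUCTION_SPACE = {
    "none": {},
    "PCA": {"n_components": (2, 10)},
    "TruncatedSVD": {"n_components": (2, 10)},
}
CLUSTERING_SPACE = {
    "KMeans": {"n_clusters": (2, 20)},
    "Birch": {"n_clusters": (2, 20)},
    "AgglomerativeClustering": {"n_clusters": (2, 20)},
}


def sample_params(ranges, rng):
    """Draw one integer per hyperparameter from its inclusive (low, high) range."""
    return {name: rng.randint(low, high) for name, (low, high) in ranges.items()}


def sample_configuration(rng):
    """Pick the two algorithms first, then only the hyperparameters they need."""
    dim_reduction = rng.choice(sorted(DIM_REDUCTION_SPACE))
    clustering = rng.choice(sorted(CLUSTERING_SPACE))
    return {
        "dim_reduction": dim_reduction,
        "dim_reduction_params": sample_params(DIM_REDUCTION_SPACE[dim_reduction], rng),
        "clustering": clustering,
        "clustering_params": sample_params(CLUSTERING_SPACE[clustering], rng),
    }


def random_search(evaluate, n_trials=50, seed=0):
    """Sample n_trials configurations and keep the one with the lowest cost."""
    rng = random.Random(seed)
    best_config, best_cost = None, float("inf")
    for _ in range(n_trials):
        config = sample_configuration(rng)
        cost = evaluate(config)
        if cost < best_cost:
            best_config, best_cost = config, cost
    return best_config, best_cost


def toy_cost(config):
    """Stand-in objective: distance of n_clusters from an assumed target of 16."""
    return abs(config["clustering_params"]["n_clusters"] - 16)


best_config, best_cost = random_search(toy_cost, n_trials=50, seed=0)
print(best_config, best_cost)
```

Bayesian optimization replaces the uniform sampling step with a model that proposes promising configurations based on the costs seen so far, and warmstarting seeds that search with configurations that performed well on similar datasets.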
Algorithms/Models supported
- List of dimension reduction algorithms in sklearn supported by autocluster's optimizer.
- List of clustering models in sklearn supported by autocluster's optimizer.
Examples
Examples are available in these notebooks.
Experimental results
- This dataset comprises 16 Gaussian clusters in 128-dimensional space with N = 1024 points. The optimal configuration obtained by autocluster (SMAC + Warmstarting) consists of a Truncated SVD dimension reduction model + Birch clustering model.
- This dataset comprises 15 Gaussian clusters in 2-dimensional space with N = 5000 points. The optimal configuration obtained by autocluster (SMAC + Warmstarting) consists of a TSNE dimension reduction model + Agglomerative clustering model.
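For readers who want to see what a configuration like the first one looks like in scikit-learn, the following sketch chains Truncated SVD with Birch. It is only an approximation under stated assumptions: the data is synthetic (generated with `make_blobs` to match the stated shape, not the dataset used in the experiment), and the hyperparameters (`n_components=10`, `n_clusters=16`) are guesses, since the values found by autocluster are not given here.

```python
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics import silhouette_score


def svd_birch_clustering(X, n_components=10, n_clusters=16, seed=0):
    """Reduce dimensionality with Truncated SVD, then cluster with Birch.

    n_components and n_clusters are assumed values, not the ones tuned by
    autocluster.
    """
    reducer = TruncatedSVD(n_components=n_components, random_state=seed)
    X_reduced = reducer.fit_transform(X)
    labels = Birch(n_clusters=n_clusters).fit_predict(X_reduced)
    return X_reduced, labels


# Synthetic stand-in: 16 Gaussian clusters, 128 dimensions, N = 1024 points.
X, _ = make_blobs(n_samples=1024, n_features=128, centers=16, random_state=0)
X_reduced, labels = svd_birch_clustering(X)

# Silhouette score lies in [-1, 1]; higher means better-separated clusters.
score = silhouette_score(X_reduced, labels)
print(f"clusters found: {len(set(labels))}, silhouette: {score:.3f}")
```

The silhouette score is used here only as a convenient label-free quality check; the evaluation metric used in the experiments is not specified above.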
Disclaimer
The project is experimental and still under development.