autocluster

autocluster is an automated machine learning (AutoML) toolkit for performing clustering tasks.

Report and presentation slides can be found here and here.

Prerequisites

  • Python 3.5 or above
  • Linux, or Windows via WSL

How to get started?

  1. First, install SMAC:
    • sudo apt-get install build-essential swig
    • conda install gxx_linux-64 gcc_linux-64 swig
    • pip install smac==0.8.0
  2. Then, install autocluster:
    • pip install autocluster

How does it work?

  • autocluster automatically optimizes the configuration of a clustering problem. By configuration, we mean:

    • choice of dimension reduction algorithm
    • choice of clustering model
    • setting of dimension reduction algorithm's hyperparameters
    • setting of clustering model's hyperparameters
  • autocluster provides 3 different approaches to optimize the configuration (with increasing complexity):

    • random optimization
    • Bayesian optimization
    • Bayesian optimization + meta-learning (warmstarting)
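
To make the idea of a "configuration" concrete, here is a minimal sketch of the simplest of the three approaches, random optimization. It does not use autocluster's API. The search space, the hyperparameter ranges, and the helper names (SEARCH_SPACE, sample_configuration, random_optimization, toy_cost) are all assumptions made for illustration, and the cost function is a placeholder for fitting a real pipeline and scoring the resulting clusters.

```python
import random

# Hypothetical search space. The algorithm names mirror scikit-learn estimators,
# but the hyperparameter ranges are illustrative, not autocluster's own.
SEARCH_SPACE = {
    "dim_reduction": {
        "TruncatedSVD": {"n_components": (2, 10)},
        "PCA": {"n_components": (2, 10)},
    },
    "clustering": {
        "KMeans": {"n_clusters": (2, 20)},
        "Birch": {"n_clusters": (2, 20)},
    },
}


def sample_configuration(space, rng):
    """Pick one algorithm per stage, then draw its integer hyperparameters."""
    config = {}
    for stage, algorithms in space.items():
        name = rng.choice(sorted(algorithms))
        params = {
            param: rng.randint(low, high)  # randint is inclusive on both ends
            for param, (low, high) in algorithms[name].items()
        }
        config[stage] = (name, params)
    return config


def random_optimization(space, evaluate, n_trials, seed=0):
    """Return the lowest-cost configuration among n_trials random draws."""
    rng = random.Random(seed)
    best_config, best_cost = None, float("inf")
    for _ in range(n_trials):
        config = sample_configuration(space, rng)
        cost = evaluate(config)
        if cost < best_cost:
            best_config, best_cost = config, cost
    return best_config, best_cost


def toy_cost(config):
    """Placeholder objective (assumption): pretends that 16 clusters is ideal.

    A real objective would fit the chosen pipeline on the data and return,
    for example, a negated clustering quality score.
    """
    _, cluster_params = config["clustering"]
    return abs(cluster_params["n_clusters"] - 16)


best_config, best_cost = random_optimization(SEARCH_SPACE, toy_cost, n_trials=50)
print(best_config, best_cost)
```

Bayesian optimization keeps the same outer loop but replaces the blind draw in sample_configuration with a model-guided proposal, and warmstarting seeds that model with configurations that worked well on similar datasets.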

Algorithms/Models supported

  • List of dimension reduction algorithms in sklearn supported by autocluster's optimizer.
  • List of clustering models in sklearn supported by autocluster's optimizer.

Examples

Examples are available in these notebooks.

Experimental results

  • This dataset comprises 16 Gaussian clusters in 128-dimensional space with N = 1024 points. The optimal configuration obtained by autocluster (SMAC + Warmstarting) consists of a Truncated SVD dimension reduction model and a Birch clustering model.
  • This dataset comprises 15 Gaussian clusters in 2-dimensional space with N = 5000 points. The optimal configuration obtained by autocluster (SMAC + Warmstarting) consists of a TSNE dimension reduction model and an Agglomerative clustering model.

Links

  • Link to PyPI.
  • Great writeup by Martin Krasser on Bayesian Optimization

Disclaimer

The project is experimental and still under development.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

autocluster-0.5.3.tar.gz (23.4 kB view details)

Uploaded Source

Built Distribution

autocluster-0.5.3-py3-none-any.whl (27.9 kB view details)

Uploaded Python 3

File details

Details for the file autocluster-0.5.3.tar.gz.

File metadata

  • Download URL: autocluster-0.5.3.tar.gz
  • Upload date:
  • Size: 23.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.9.6 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.10.1 urllib3/1.26.12 tqdm/4.64.0 importlib-metadata/4.8.3 keyring/23.4.1 rfc3986/1.5.0 colorama/0.4.5 CPython/3.6.8

File hashes

Hashes for autocluster-0.5.3.tar.gz:

  • SHA256: c2842b656e3c2a6e177224194e48b23a8bcc0b46a9b6215f680dea2f7d85a009
  • MD5: 969b9ac11fa08603585bafec8ee5f708
  • BLAKE2b-256: c61d47352630c57a530bdfa6e1db427890df51fb0858ae99b02cb611ec429a91

See more details on using hashes here.
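
As a rough sketch of how a published digest can be checked, the snippet below recomputes the SHA256 of a downloaded source archive with Python's standard hashlib module and compares it with the value listed for autocluster-0.5.3.tar.gz. The helper name sha256_of and the assumption that the archive sits in the current working directory are illustrative only.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed for autocluster-0.5.3.tar.gz.
EXPECTED_SHA256 = "c2842b656e3c2a6e177224194e48b23a8bcc0b46a9b6215f680dea2f7d85a009"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumption: the archive was downloaded into the current directory.
archive = Path("autocluster-0.5.3.tar.gz")
if archive.exists():
    print("digest matches" if sha256_of(archive) == EXPECTED_SHA256 else "digest MISMATCH")
else:
    print("archive not found; download it first")
```

pip can perform the same check at install time when a requirements file pins the digest with --hash and pip is run with --require-hashes.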

File details

Details for the file autocluster-0.5.3-py3-none-any.whl.

File metadata

  • Download URL: autocluster-0.5.3-py3-none-any.whl
  • Upload date:
  • Size: 27.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.9.6 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.10.1 urllib3/1.26.12 tqdm/4.64.0 importlib-metadata/4.8.3 keyring/23.4.1 rfc3986/1.5.0 colorama/0.4.5 CPython/3.6.8

File hashes

Hashes for autocluster-0.5.3-py3-none-any.whl:

  • SHA256: 548b2c02ca8d402314677a27bcf949520b0f67b6edb8a6bff3bb9aece0a23e09
  • MD5: 00fdeec0405ce734842d1f6a5e9ba2c9
  • BLAKE2b-256: efa280b92e402899623f53f24116c793a14a2e3133c3a5d5221ee97335d05abf

