A package for automatic clustering hyperparameter optimization

Hypercluster

A package for clustering optimization with sklearn.

Requirements:

pandas
numpy
scipy
matplotlib
seaborn
scikit-learn
hdbscan

Optional: snakemake

Install

pip install hypercluster

or

conda install -c bioconda hypercluster

There are currently issues with the bioconda install on Linux. If you run into problems, use the pip install instead.
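To confirm that the install pulled in the requirements listed above, a short check like the one below can help. It is a minimal sketch that uses only the standard library's `importlib.metadata`. The helper name `installed_versions` is hypothetical and not part of hypercluster, and the distribution names are assumed to match the requirement names as written.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the Requirements list above.
REQUIRED = ["pandas", "numpy", "scipy", "matplotlib", "seaborn", "scikit-learn", "hdbscan"]
OPTIONAL = ["snakemake"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if it is missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(REQUIRED + OPTIONAL + ["hypercluster"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

Any package reported as not installed can then be added with pip or conda before running the quickstart.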

Docs

https://hypercluster.readthedocs.io/en/latest/index.html

Examples

https://github.com/liliblu/hypercluster/tree/dev/examples

Quickstart example

import pandas as pd
from sklearn.datasets import make_blobs
import hypercluster

data, labels = make_blobs()
data = pd.DataFrame(data)
labels = pd.Series(labels, index=data.index, name='labels')

# With a single clustering algorithm
clusterer = hypercluster.utilities.AutoClusterer()
clusterer.fit(data).evaluate(
    methods=hypercluster.constants.need_ground_truth + hypercluster.constants.inherent_metrics,
    gold_standard=labels,
)

hypercluster.visualize.visualize_evaluations(clusterer.evaluation_, multiple_clusterers=False)

# With a range of algorithms

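# optimize_clustering is assumed to live in hypercluster.utilities, like AutoClusterer above
evals, labels_df, labels_dict = hypercluster.utilities.optimize_clustering(data)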

hypercluster.visualize.visualize_evaluations(evals)
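
For intuition about what the quickstart automates, the sketch below runs a small hyperparameter sweep by hand using scikit-learn only. It is not hypercluster's implementation. The helper `sweep_kmeans`, the choice of KMeans, the range of `k` values, and the two metrics are all assumptions made for illustration. One metric needs ground-truth labels and one needs only the data, mirroring the split between `need_ground_truth` and `inherent_metrics` in the quickstart. Whether those lists contain exactly these metrics is not confirmed here.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score


def sweep_kmeans(data, gold_standard, k_values=(2, 3, 4, 5, 6)):
    """Fit KMeans once per candidate k and score each resulting labeling."""
    results = {}
    for k in k_values:
        predicted = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(data)
        results[k] = {
            # Inherent metric: needs only the data and the predicted labels.
            "silhouette_score": silhouette_score(data, predicted),
            # Ground-truth metric: compares predicted labels with known labels.
            "adjusted_rand_score": adjusted_rand_score(gold_standard, predicted),
        }
    return results


# Three well-separated blobs, so the sweep has a clear best setting.
data, labels = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 0], [5, 5]],
    cluster_std=0.5,
    random_state=0,
)
scores = sweep_kmeans(data, labels)

# Pick the k with the highest silhouette score, which needs no ground truth.
best_k = max(scores, key=lambda k: scores[k]["silhouette_score"])
print(best_k, scores[best_k])
```

Selecting on an inherent metric and then checking a ground-truth metric is one way to see whether label-free scores agree with known labels on a given dataset.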
