Clustering via hierarchical agglomerative learning

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3.8

Project description

Hierarchical Agglomerative Learning (HAL)

A package for clustering high-dimensional data. It relies heavily on scikit-learn and FFT-accelerated t-SNE.

System requirements

Tested on the latest versions of OS X and Linux.
(Optional) Dynamic plotting requires Chrome, Safari, or Firefox, with ad blockers disabled.

Requirement:

Python 3.6 or later.

Installing (once)

Activate an Anaconda Python 3 environment

source activate YOUR_CONDA_ENVIRONMENT
conda config --add channels conda-forge
conda install cython numpy fftw scipy
pip install hal-x
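
After installing, one quick way to confirm that the environment is usable is to check that the relevant modules can be found. The sketch below uses only the standard library. The helper name `missing_modules` is hypothetical, and the list of import names is an assumption: `hal` is the name imported in the small example further down, and the others are the usual import names for the conda packages listed above.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The requirement stated above is Python 3.6 or later.
    assert sys.version_info >= (3, 6), "Python 3.6 or later is required"

    # Assumed import names (not confirmed by the package documentation).
    required = ["hal", "numpy", "scipy", "sklearn", "Cython"]
    missing = missing_modules(required)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

If any name is reported as missing, re-running the conda and pip commands above inside the activated environment is the first thing to try.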

Updating

For future versions of the package, you can upgrade using:

pip install hal-x --upgrade

Small example

from hal import HAL  # this imports the class HAL() 
from sklearn.datasets import make_blobs
from hal import metric
import numpy as np

# Setting random seed, in case you want to re-run example but keep saved data in info_hal/ 
np.random.seed(0)

# Generate some data. 
X,ytrue = make_blobs(n_samples=10000,n_features=12,centers=10) # 10 gaussians in 12 dimensions, 10000 data points

# The HAL constructor has many optional parameters (documentation coming soon)
model = HAL(clf_type='svm', warm_start=False) # using linear SVMs (fastest) for agglomeration. Other options are 'rf' and 'nb' (random forest, and naive bayes)

# Build the model; this saves data in the directory info_hal/
model.fit(X)

# rendering of results using javascript (with optional feature naming)
feature_name = ['feat_%i'%i for i in range(12)]

# Now that your model is fitted, let's visualize the clustering hierarchy. This will give us an idea of how to choose the cv score for the final prediction
model.plot_tree(feature_name = feature_name)

# In the visualization, we see that cv above ~ 0.86 will yield perfect clustering
# In order to generate the corresponding final clustering labels, use the predict function
ypred95 = model.predict(X, cv=0.95) # Predict with score of 0.95
ypred50 = model.predict(X, cv=0.5) # Predict with score of 0.5

# You can check the accuracy of your predictions against the true labels using the metric functions:
print("Normalized mutual information score (cv=0.95): %.4f" % metric.NMI(ypred95, ytrue))
print("Normalized mutual information score (cv=0.50): %.4f" % metric.NMI(ypred50, ytrue))

# The fitted model information is in directory info_hal. To reload that information for later use, just:
model.load()

# To load t-SNE coordinates:
model.load('tsne')
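
For readers who want to see what a normalized mutual information score measures, the following is a minimal standard-library sketch of one common definition: the mutual information between two labelings divided by the arithmetic mean of their entropies. The function names `entropy` and `nmi` are illustrative, and the source does not say which normalization `metric.NMI` uses, so its values may differ from this sketch.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of cluster labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(labels_a, labels_b):
    """Mutual information normalized by the mean of the two entropies."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    # I(A; B) = sum over observed pairs of p(a, b) * log(p(a, b) / (p(a) * p(b)))
    mi = 0.0
    for (a, b), c in joint.items():
        mi += (c / n) * math.log(c * n / (count_a[a] * count_b[b]))

    denom = 0.5 * (entropy(labels_a) + entropy(labels_b))
    if denom == 0:
        # Both labelings put every point in a single cluster.
        return 1.0
    return mi / denom
```

The score is invariant to relabeling: two labelings that define the same partition score 1 even if the cluster ids differ, which is why it suits comparing predicted clusters against true labels.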


Release history

- 0.994 (this version): Feb 28, 2022
- 0.993: Feb 17, 2022
- 0.992: Sep 21, 2018
- 0.991: Aug 28, 2018
- 0.99: Aug 22, 2018
- 0.98: Aug 6, 2018
- 0.97: Aug 6, 2018
- 0.96: Aug 6, 2018
- 0.95: Aug 5, 2018
- 0.94: Aug 5, 2018
- 0.93: Aug 5, 2018
- 0.91: Aug 2, 2018
- 0.89: Aug 2, 2018
- 0.88: Aug 2, 2018
- 0.87: Jul 31, 2018
- 0.86: Jul 31, 2018
- 0.85: Jul 31, 2018
- 0.84: Jul 31, 2018
- 0.83: Jul 29, 2018
- 0.82: Jul 17, 2018
- 0.81: Jul 17, 2018
- 0.80: Jul 6, 2018
- 0.79: Jul 6, 2018
- 0.78: Jul 6, 2018
- 0.77: Jul 5, 2018
- 0.76: Jun 30, 2018
- 0.75: Jun 30, 2018
- 0.74: Jun 30, 2018
- 0.73: Jun 29, 2018
- 0.72: Jun 28, 2018
- 0.70: Jun 27, 2018
- 0.69: Jun 27, 2018
- 0.68: Jun 27, 2018
- 0.67: Jun 27, 2018
- 0.66: Jun 27, 2018
- 0.65: Jun 27, 2018
- 0.64: Jun 27, 2018
- 0.63: Jun 27, 2018
- 0.62: Jun 26, 2018
- 0.61: Jun 26, 2018
- 0.53: Jun 23, 2018
- 0.52: Jun 23, 2018
- 0.51: Jun 23, 2018
- 0.9: Aug 2, 2018
- 0.6: Jun 26, 2018
- 0.5: Jun 23, 2018

Download files


Source Distribution

hal-x-0.994.tar.gz (38.8 kB)

Uploaded Feb 28, 2022 Source

File details

Details for the file hal-x-0.994.tar.gz.

File metadata

Download URL: hal-x-0.994.tar.gz
Upload date: Feb 28, 2022
Size: 38.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.62.3 importlib-metadata/4.11.2 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.9.10

File hashes

Hashes for hal-x-0.994.tar.gz
Algorithm	Hash digest
SHA256	`0cf619800aa88830d50e76341bc2090f3b449416a433614bd064d646a68a2ca0`
MD5	`45f6e114ba8f6799a166a6cd680c0506`
BLAKE2b-256	`4f2b2beeb654fe194b292dc34137907ddbc0cb4b6967e9e8f38a063612383109`
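
The digests above can be used to check a downloaded archive. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_digest` are illustrative, and the local file path in the usage comment is an assumption about where the archive was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (assumes the archive was saved in the current directory):
# expected = "0cf619800aa88830d50e76341bc2090f3b449416a433614bd064d646a68a2ca0"
# print(matches_digest("hal-x-0.994.tar.gz", expected))
```

A mismatch means the file differs from the one uploaded on Feb 28, 2022, for example because the download was truncated.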
