Hierarchical Uniform Manifold Approximation and Projection
.. -*- mode: rst -*-
|conda_version|_ |conda_downloads|_ |pypi_version|_ |pypi_downloads|_

.. |pypi_version| image:: https://img.shields.io/pypi/v/humap.svg
.. _pypi_version: https://pypi.python.org/pypi/humap/

.. |pypi_downloads| image:: https://pepy.tech/badge/humap
.. _pypi_downloads: https://pepy.tech/project/humap

.. |conda_version| image:: https://anaconda.org/conda-forge/humap/badges/version.svg
.. _conda_version: https://anaconda.org/conda-forge/humap

.. |conda_downloads| image:: https://anaconda.org/conda-forge/humap/badges/downloads.svg
.. _conda_downloads: https://anaconda.org/conda-forge/humap

.. image:: images/humap-2M.gif
   :alt: HUMAP exploration on Fashion MNIST dataset

=====
HUMAP
=====

Hierarchical Uniform Manifold Approximation and Projection (HUMAP) is a technique based on `UMAP <https://github.com/lmcinnes/umap/>`_ for hierarchical dimensionality reduction. HUMAP allows you to:

- Focus on important information while reducing the visual burden when exploring huge datasets;
- Drill down the hierarchy according to information demand.

The details of the algorithm can be found in our paper on `arXiv <https://arxiv.org/abs/2106.07718>`_. This repository also features a C++ UMAP implementation.

Installation
------------

HUMAP is written in C++ for performance and provides an intuitive Python interface. It depends on common machine learning libraries, such as scikit-learn and NumPy. It also needs pybind11 for the interface between C++ and Python.

Requirements:
- Python 3.6 or greater
- numpy
- scipy
- scikit-learn
- pybind11
- pynndescent (for reproducible results)
- Eigen (C++)
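
The Python requirements above can be checked before installing. The snippet below is a minimal sketch, not part of HUMAP: ``missing_modules`` and ``REQUIRED_MODULES`` are hypothetical names introduced here, and the list assumes the usual import names (scikit-learn is imported as ``sklearn``). Eigen is a C++ header library, so this check does not cover it.

.. code:: python

    import importlib.util
    import sys

    # Import names for the Python requirements listed above
    # (assumption: scikit-learn is imported as "sklearn").
    REQUIRED_MODULES = ["numpy", "scipy", "sklearn", "pybind11", "pynndescent"]


    def missing_modules(names):
        """Return the names in `names` that cannot be found for import."""
        return [name for name in names if importlib.util.find_spec(name) is None]


    if __name__ == "__main__":
        if sys.version_info < (3, 6):
            print("Python 3.6 or greater is required")
        missing = missing_modules(REQUIRED_MODULES)
        print("Missing modules:", ", ".join(missing) if missing else "none")
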
If you have these requirements installed, install HUMAP from PyPI:

.. code:: bash

    pip install humap

Alternatively (and preferably), you can install it with conda:

.. code:: bash

    conda install humap

If using pip:

HUMAP depends on `Eigen <https://eigen.tuxfamily.org/>`_. Thus, make sure to place the headers in ``/usr/local/include`` if using Unix or ``C:\Eigen`` if using Windows.

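
Before running ``pip install``, it may help to confirm that the Eigen headers are where the build will look for them. The following is a rough sketch under a stated assumption: that the headers are laid out so that ``Eigen/Dense`` sits directly under the include directory. ``has_eigen_headers`` is a hypothetical helper, not part of HUMAP, and the exact layout expected by the build may differ.

.. code:: python

    from pathlib import Path


    def has_eigen_headers(include_dir):
        """Return True if `include_dir` contains the `Eigen/Dense` header.

        Assumption: the headers are laid out as <include_dir>/Eigen/Dense.
        """
        return (Path(include_dir) / "Eigen" / "Dense").is_file()


    if __name__ == "__main__":
        # Default locations named in this README.
        for candidate in ("/usr/local/include", r"C:\Eigen"):
            print(candidate, has_eigen_headers(candidate))

If neither location reports ``True``, copy or link the Eigen headers there before building.
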
Manual installation:

To install HUMAP manually, download the project and proceed as follows:

.. code:: bash

    python setup.py bdist_wheel

.. code:: bash

    pip install dist/humap*.whl

Usage examples
--------------

The simplest usage of HUMAP is as follows:

Fitting the hierarchy

.. code:: python

    import humap
    from sklearn.datasets import fetch_openml

    # load MNIST (70,000 samples, 784 features) together with its labels
    X, y = fetch_openml('mnist_784', version=1, return_X_y=True)

    # build a hierarchy with three levels
    hUmap = humap.HUMAP([0.2, 0.2])
    hUmap.fit(X, y)

    # embed level 2
    embedding2 = hUmap.transform(2)

Refer to notebooks/ for complete examples.
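
To get a feel for how large each level of the hierarchy might be, the arithmetic can be sketched in plain Python. This is an illustration under an assumption, not HUMAP's own code: it assumes each entry of the list passed to ``humap.HUMAP`` is the fraction of the previous level's points kept in the next level. ``level_sizes`` is a hypothetical helper, and the sizes HUMAP actually produces may differ.

.. code:: python

    def level_sizes(n_points, fractions):
        """Approximate the number of points at each hierarchy level.

        Assumption: each entry of `fractions` is the share of the previous
        level's points that is kept in the next level.
        """
        sizes = [n_points]
        for fraction in fractions:
            sizes.append(round(sizes[-1] * fraction))
        return sizes


    # MNIST has 70,000 samples; [0.2, 0.2] matches the example above.
    print(level_sizes(70000, [0.2, 0.2]))  # prints [70000, 14000, 2800]

Under this reading, level 0 is the whole dataset and level 2, the one embedded above, would hold roughly 2,800 points.
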
C++ UMAP implementation

You can also fit a one-level HUMAP hierarchy, which essentially fits a UMAP projection.

.. code:: python

    # X is the data matrix loaded in the previous example
    umap_reducer = humap.UMAP()
    embedding = umap_reducer.fit_transform(X)

Citation
--------

Please use the following reference to cite HUMAP in your work:

.. code:: bibtex

    @misc{marciliojr_humap2021,
      title={HUMAP: Hierarchical Uniform Manifold Approximation and Projection},
      author={Wilson E. Marcílio-Jr and Danilo M. Eler and Fernando V. Paulovich and Rafael M. Martins},
      year={2021},
      eprint={2106.07718},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
    }

License
-------

HUMAP follows the 3-clause BSD license and uses the open-source NNDescent implementation from `EFANNA <https://github.com/ZJULearning/efanna>`_. It also uses a C++ implementation of `UMAP <http://github.com/lmcinnes/umap>`_ for embedding hierarchy levels.

E-mail me (wilson_jr at outlook.com) if you would like to contribute.