Modification of the UMAP algorithm to allow for fast approximate projections of new data points.

Approximate UMAP

Description

This package provides the classes ApproxUMAP and ApproxAlignedUMAP that allow for fast approximate projections of new data points in the target space.

The fit and fit_transform methods of ApproxUMAP are nearly identical to those of umap.UMAP; they simply fit an additional sklearn.neighbors.NearestNeighbors estimator.

Only the transform method differs significantly: to speed up projection, it approximates the position of new data points in the embedding space. Each projection is approximated by finding the point's nearest neighbors in the source space and computing the weighted average of their positions in the embedding space. The weights decrease with the distance in the source space: they are the inverse of the distances if fn='inv', or a decaying exponential of the distances if fn='exp'.
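The idea can be sketched in plain NumPy. This is only an illustration under stated assumptions, not the package's implementation: the function approx_project, its n_neighbors and eps arguments, the brute-force Euclidean neighbor search (a stand-in for the fitted sklearn.neighbors.NearestNeighbors estimator), and the random emb_train array (a stand-in for exact UMAP embeddings) are all hypothetical. It uses inverse-distance weights.

```python
import numpy as np


def approx_project(X_train, emb_train, X_new, n_neighbors=5, eps=1e-12):
    """Approximate embeddings of X_new as inverse-distance-weighted
    averages of the embeddings of their nearest training points."""
    # Pairwise Euclidean distances, shape (n_new, n_train).
    diff = X_new[:, None, :] - X_train[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Indices and distances of the n_neighbors closest training points.
    idx = np.argsort(dist, axis=1)[:, :n_neighbors]
    d = np.take_along_axis(dist, idx, axis=1)

    # Inverse-distance weights, normalised to sum to 1 for each new point.
    # eps is an assumption of this sketch: it avoids a division by zero
    # when a new point coincides with a training point.
    w = 1.0 / (d + eps)
    w /= w.sum(axis=1, keepdims=True)

    # Weighted average of the neighbours' embeddings, shape (n_new, n_dims).
    return np.einsum("ij,ijk->ik", w, emb_train[idx])


rng = np.random.default_rng(0)
X_train = rng.random((100, 10))
emb_train = rng.random((100, 2))  # stand-in for exact UMAP embeddings
X_new = rng.random((5, 10))
emb_new = approx_project(X_train, emb_train, X_new)
```

Because each approximate projection is a convex combination of existing embeddings, it always lies within the region spanned by the neighbors' embeddings, and a point that coincides with a training point lands (numerically) on that point's embedding.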

Formally, the projection of a new point $x$ is approximated as follows: $$u=\sum_{i=1}^{n}\frac{f(k d_i)}{\sum_{j=1}^{n}f(k d_j)}u_i$$ with $x_1\dots x_n$ the $n$ nearest neighbours of $x$ in the source space among the points used for training (i.e., passed to fit or fit_transform), $d_i=\mathrm{distance}(x, x_i)$, $u_1\dots u_n$ the exact UMAP projections of $x_1\dots x_n$, and $k$ the temperature parameter (not the number of neighbours). The function $f(z)$ corresponds to $\frac{1}{z}$ if fn='inv', and to $\frac{1}{e^{z}}$ if fn='exp'.
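To make the role of the temperature concrete, the normalised weights in the formula above can be computed for a few example distances. This is a minimal sketch that reads the formula literally; the helper neighbor_weights and the example distances are hypothetical and not part of the package.

```python
import numpy as np


def neighbor_weights(d, k=1.0, fn="exp"):
    """Normalised weights f(k*d_i) / sum_j f(k*d_j) for neighbour distances d."""
    d = np.asarray(d, dtype=float)
    if fn == "inv":
        f = 1.0 / (k * d)   # f(z) = 1 / z, undefined when a distance is 0
    elif fn == "exp":
        f = np.exp(-k * d)  # f(z) = 1 / e^z
    else:
        raise ValueError(f"unknown fn: {fn!r}")
    return f / f.sum(axis=-1, keepdims=True)


d = np.array([0.5, 1.0, 2.0])  # distances to three neighbours
w_inv = neighbor_weights(d, k=1.0, fn="inv")
w_inv_k10 = neighbor_weights(d, k=10.0, fn="inv")
w_exp = neighbor_weights(d, k=1.0, fn="exp")
w_exp_k10 = neighbor_weights(d, k=10.0, fn="exp")
```

Under this literal reading, the temperature cancels out in the normalisation when fn='inv' (w_inv and w_inv_k10 are identical), so it only changes the result when fn='exp'. There, the weights are a softmax over the negated, scaled distances: a larger temperature concentrates the weight on the closest neighbour, while a smaller one approaches a uniform average.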

The original behavior of UMAP's transform method can be obtained using the transform_exact method.

Installation

The package can be installed via pip:

pip install approx-umap

Usage

The usage of ApproxUMAP is similar to that of any scikit-learn transformer:

import numpy as np
from approx_umap import ApproxUMAP

X = np.random.rand(100, 10)

emb_exact = ApproxUMAP(fn='exp', k=1).fit_transform(X)  # exact UMAP projections

projector = ApproxUMAP(fn='exp', k=1).fit(X)
emb_approx = projector.transform(X)  # approximate UMAP projection
emb_approx_exact = projector.transform_exact(X)  # exact UMAP projection

The class ApproxAlignedUMAP additionally implements the methods update and update_transform to create aligned embeddings of new data points with respect to the training data.

import numpy as np
from approx_umap import ApproxAlignedUMAP

X = np.random.rand(100, 10)
X_new = np.random.rand(10, 10)

emb_exact = ApproxAlignedUMAP(fn='exp', k=1).fit_transform(X)  # exact UMAP projections

projector = ApproxAlignedUMAP(fn='exp', k=1).fit(X)

emb_aligned = projector.update_transform(X_new)  # exact aligned UMAP projections
assert emb_aligned.shape[0] == X.shape[0] + X_new.shape[0]  # returns the aligned embeddings of the whole history

emb_approx_aligned = projector.transform(X_new)  # approximate aligned UMAP projections

Citation

Please cite this work as:

@inproceedings{approx-umap2024,
    title = {Approximate UMAP allows for high-rate online visualization of high-dimensional data streams},
    author = {Peter Wassenaar and Pierre Guetschel and Michael Tangermann},
    year = {2024},
    month = {September},
    booktitle = {9th Graz Brain-Computer Interface Conference},
    address = {Graz, Austria},
    url = {https://arxiv.org/abs/2404.04001},
}
