Modification of the UMAP algorithm to allow for fast approximate projections of new data points.
Approximate UMAP
Description
This package provides the classes ApproxUMAP and ApproxAlignedUMAP, which allow for fast approximate projections of new data points in the target space.
The fit and fit_transform methods of ApproxUMAP are nearly identical to those of umap.UMAP; they simply fit an additional sklearn.neighbors.NearestNeighbors estimator.
Only the transform method differs significantly; it approximates the projection of new data points into the embedding space to improve projection speed.
The projections are approximated by finding the nearest neighbors in the
source space and computing their weighted average in the embedding space.
The weights are the inverse of the distances in the source space.
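As a small worked illustration (the numbers are made up for this example, not taken from the package), suppose a new point has two nearest neighbours at source-space distances $d_1=1$ and $d_2=3$, whose embeddings are $u_1=(0,0)$ and $u_2=(4,0)$. With inverse-distance weights normalised to sum to one:
$$w_1=\frac{1/1}{1/1+1/3}=\frac{3}{4},\qquad w_2=\frac{1/3}{1/1+1/3}=\frac{1}{4},\qquad u=w_1u_1+w_2u_2=(1,0)$$
The approximate projection therefore lies on the segment between the two neighbour embeddings, closer to the embedding of the nearer neighbour.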
Formally, the projection of a new point $x$ is approximated as follows:
$$u=\sum_{i=1}^{n}\frac{f(k d_i)}{\sum_{j=1}^{n}f(k d_j)}u_i$$
with $x_1\dots x_n$ the $n$ nearest neighbours of $x$ in the source space among the points used for training (i.e., passed to fit or fit_transform), $d_i=\mathrm{distance}(x, x_i)$, $u_1\dots u_n$ the exact UMAP projections of $x_1\dots x_n$, and $k$ the temperature parameter.
The function $f(\cdot)$ corresponds to $\frac{1}{\cdot}$ if fn='inv', and to $\frac{1}{e^{\cdot}}$ if fn='exp'.
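The approximation described above can be sketched in a few lines of plain Python. This is a minimal illustration of the formula, not the package's implementation: the function name approx_project, the n_neighbors argument, the Euclidean metric, and the handling of a zero distance are all assumptions made for this example. Only the names fn and k (the temperature parameter) mirror the options mentioned in the text.

```python
import math


def approx_project(x, train_points, train_embedding, n_neighbors=3, k=1.0, fn="inv"):
    """Approximate the embedding of x as a weighted average of the
    embeddings of its nearest training points (illustrative sketch)."""
    # Source-space distances from x to every training point
    # (Euclidean distance is an assumption of this sketch).
    dists = [math.dist(x, p) for p in train_points]
    # Indices of the n_neighbors closest training points.
    nearest = sorted(range(len(dists)), key=lambda i: dists[i])[:n_neighbors]

    if fn == "inv":
        # An exact match would give an infinite weight; returning its
        # embedding directly is a choice made for this sketch.
        for i in nearest:
            if dists[i] == 0.0:
                return list(train_embedding[i])
        weights = [1.0 / (k * dists[i]) for i in nearest]
    elif fn == "exp":
        weights = [math.exp(-k * dists[i]) for i in nearest]
    else:
        raise ValueError("fn must be 'inv' or 'exp'")

    # Normalise the weights and average the neighbour embeddings.
    total = sum(weights)
    dim = len(train_embedding[0])
    return [
        sum(w * train_embedding[i][d] for w, i in zip(weights, nearest)) / total
        for d in range(dim)
    ]


# Toy data: three training points and their (made-up) 2-D embeddings.
train_points = [(0.0, 0.0), (4.0, 0.0), (100.0, 100.0)]
train_embedding = [(0.0, 0.0), (4.0, 0.0), (50.0, 50.0)]
u_inv = approx_project((1.0, 0.0), train_points, train_embedding, n_neighbors=2, fn="inv")
u_exp = approx_project((1.0, 0.0), train_points, train_embedding, n_neighbors=2, fn="exp")
```

One consequence of the formula as stated: with fn='inv', the factor $k$ appears in both the numerator and the denominator and cancels, so in this sketch the temperature only changes the result when fn='exp', where a larger $k$ concentrates the weight on the closest neighbours.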
The original behavior of UMAP's transform method can be obtained using the transform_exact method.
Installation
The package can be installed via pip:
pip install approx-umap
Usage
The usage of ApproxUMAP is similar to that of any scikit-learn transformer:
import numpy as np
from approx_umap import ApproxUMAP
X = np.random.rand(100, 10)
emb_exact = ApproxUMAP(fn='exp', k=1).fit_transform(X) # exact UMAP projections
projector = ApproxUMAP(fn='exp', k=1).fit(X)
emb_approx = projector.transform(X) # approximate UMAP projection
emb_approx_exact = projector.transform_exact(X) # exact UMAP projection
The class ApproxAlignedUMAP additionally implements the methods update and update_transform to create aligned embeddings of new data points with respect to the training data.
import numpy as np
from approx_umap import ApproxAlignedUMAP
X = np.random.rand(100, 10)
X_new = np.random.rand(10, 10)
emb_exact = ApproxAlignedUMAP(fn='exp', k=1).fit_transform(X) # exact UMAP projections
projector = ApproxAlignedUMAP(fn='exp', k=1).fit(X)
emb_aligned = projector.update_transform(X_new) # exact aligned UMAP projections
assert emb_aligned.shape[0] == X.shape[0] + X_new.shape[0] # returns the aligned embeddings of the whole history
emb_approx_aligned = projector.transform(X_new) # approximate aligned UMAP projections
Citation
Please cite this work as:
@inproceedings{approxumap2024,
title = {Approximate UMAP allows for high-rate online visualization of high-dimensional data streams},
author = {Peter Wassenaar and Pierre Guetschel and Michael Tangermann},
year = {2024},
month = {September},
booktitle = {9th Graz Brain-Computer Interface Conference},
address = {Graz, Austria},
url = {https://arxiv.org/abs/2404.04001},
}