Contrastive neighbor embeddings (CNE) for dimensionality reduction and clustering
Project description
CNE is a probabilistic, self-supervised deep learning model that compresses high-dimensional data into a low-dimensional embedding. It is a general-purpose algorithm that works with multiple types of data, including images, time series, and tabular data. It uses the InfoNCE objective, a variational lower bound on mutual information, to improve local structure preservation in the compressed latent space, and it simultaneously learns a cluster distribution (a prior over the latent embedding) during optimization. Overlapping clusters are merged automatically by optimizing a variational upper bound on entropy, so the number of clusters does not have to be specified manually, provided the initial number of clusters is large enough. CNE produces embeddings of similar quality to existing dimensionality reduction methods, can detect outliers, scales to large, out-of-core datasets, and can easily add new data to an existing embedding or clustering.
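To make the InfoNCE objective mentioned above more concrete, here is a minimal NumPy sketch of a generic InfoNCE loss over pairs of embeddings. It is not taken from the CNE code base: the function name `info_nce_loss`, the use of cosine similarity, and the `temperature` value are all assumptions chosen for illustration, and the actual package may define similarity and negatives differently.

```python
import numpy as np


def info_nce_loss(anchors, positives, temperature=0.5):
    """Generic InfoNCE loss (illustrative; not CNE's actual implementation).

    Row i of `anchors` and row i of `positives` form a positive pair;
    every other row of `positives` serves as a negative for anchor i.
    """
    # Assumption: cosine similarity, so L2-normalize each embedding.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # Pairwise similarities scaled by the temperature.
    logits = a @ p.T / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Cross-entropy with the matching index (the diagonal) as the target.
    return float(-np.mean(np.diag(log_softmax)))


rng = np.random.default_rng(0)
z = rng.normal(size=(64, 2))                      # 64 points in a 2-D embedding
neighbors = z + 0.01 * rng.normal(size=z.shape)   # slightly perturbed copies
unrelated = rng.normal(size=z.shape)              # independent random points

aligned_loss = info_nce_loss(z, neighbors)
random_loss = info_nce_loss(z, unrelated)
print(aligned_loss, random_loss)
```

The loss is lower when each point's positive partner lies close to it in the embedding than when the partners are unrelated, which is the sense in which minimizing it encourages neighbors to stay together in the latent space.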