Contrastive neighbor embeddings (CNE) for dimensionality reduction and clustering

Project description

CNE is a probabilistic, self-supervised deep learning model that compresses high-dimensional data into a low-dimensional embedding. It is a general-purpose algorithm that works with multiple types of data, including images, time series, and tabular data. It uses the InfoNCE objective, a variational lower bound on mutual information, to improve local structure preservation in the compressed latent space, and it simultaneously learns a cluster distribution (a prior over the latent embedding) during optimization. Overlapping clusters are merged automatically by optimizing a variational upper bound on entropy, so the number of clusters does not have to be specified manually, provided the initial number of clusters is large enough. CNE produces embeddings comparable in quality to those of existing dimensionality reduction methods; it can detect outliers, scales to large, out-of-core datasets, and can easily add new data to an existing embedding/clustering.
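To make the InfoNCE objective mentioned above more concrete, here is a minimal NumPy sketch of the generic loss for a batch of (anchor, positive) embedding pairs, where the other rows in the batch serve as negatives. This is not CNE's implementation or API: the function name `info_nce_loss`, the `temperature` parameter, and the use of cosine similarity are assumptions made purely for illustration.

```python
import numpy as np


def info_nce_loss(anchors, positives, temperature=0.5):
    """Generic InfoNCE loss (illustrative sketch, not CNE's own code).

    Row i of `positives` is treated as the positive for row i of `anchors`;
    every other row in the batch acts as a negative for that anchor.
    """
    # L2-normalize so that dot products are cosine similarities (an assumption).
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # Pairwise similarity matrix, shape (n, n), scaled by the temperature.
    logits = (a @ p.T) / temperature

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Average negative log-probability of picking the true positive.
    return -np.mean(np.diag(log_probs))


# Hypothetical usage: positives are slightly perturbed copies of the anchors,
# standing in for neighbors in the original high-dimensional space.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 2))
loss = info_nce_loss(z, z + 0.01 * rng.normal(size=z.shape))
```

For a batch of size n, the quantity log(n) minus this loss is the usual InfoNCE lower-bound estimate of the mutual information between anchors and positives, so minimizing the loss tightens that estimate.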
