
PyTorch implementation of the DIME method to detect out-of-distribution observations in deep learning.

Project description

Distance to Modelled Embedding (DIME)

This repo contains an implementation of DIME, a method to detect out-of-distribution (OOD) observations in deep learning. DIME provides a flexible way to detect OOD observations with minimal computational overhead, assuming only access to intermediate features from an ANN.

Schematic describing DIME

The DIME workflow is summarized in four steps.

  1. Given a trained ANN and training set observations, obtain intermediate feature representations of the training set observations (here denoted embeddings). If the embeddings have more than two dimensions, aggregate them into a 2D NxP matrix (for instance by global average pooling of NxCxHxW representations from a CNN).
  2. Linearly approximate the training set embedding by a hyperplane found by truncated singular value decomposition.
  3. Given new observations, obtain the corresponding intermediate representation.
  4. In the embedding space, measure the distance to the hyperplane (modelled embedding) to determine whether observations are OOD.

In an optional step following step 2, you can calibrate the distances against a calibration set to obtain, for each new observation, the probability of observing a distance less than or equal to its distance.
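The four steps above can be sketched from scratch in plain PyTorch. This is a minimal illustration of the idea, not the package's implementation; the function names and toy data below are invented for the example:

```python
import torch

def fit_hyperplane(x, rank):
    # Step 2: center the embeddings and keep the top-`rank` right singular
    # vectors of a truncated SVD; they span the approximating hyperplane.
    mean = x.mean(dim=0)
    _, _, vh = torch.linalg.svd(x - mean, full_matrices=False)
    return mean, vh[:rank]  # (P,), (rank, P)

def distance_to_hyperplane(x_new, mean, v):
    # Step 4: the distance is the norm of the residual that remains after
    # projecting the centered embedding onto the hyperplane.
    centered = x_new - mean
    residual = centered - centered @ v.T @ v
    return residual.norm(dim=1)

# Toy embeddings: 100 training points in 5-D lying close to a 2-D plane.
torch.manual_seed(0)
basis = torch.randn(2, 5)
x_train = torch.randn(100, 2) @ basis + 0.01 * torch.randn(100, 5)
mean, v = fit_hyperplane(x_train, rank=2)

# Points in the plane get small distances; off-plane perturbations do not.
d_in = distance_to_hyperplane(torch.randn(10, 2) @ basis, mean, v)
d_out = distance_to_hyperplane(x_train + 5 * torch.randn(100, 5), mean, v)
```

For step 1 with NxCxHxW features from a CNN, global average pooling is simply `features.mean(dim=(2, 3))`, yielding the NxP matrix the sketch expects.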

Get started

Simply install from pip:

pip install dime-pytorch

Examples

Given a 2D-tensor, fit the hyperplane.

import torch

from dime import DIME

x = torch.tensor(...)  # N x P torch 2D float-tensor.
modelled_embedding = DIME().fit(x)

To obtain probabilities, calibrate the percentiles, preferably against a separate dataset. Chaining is fine:

x_cal = torch.tensor(...)  # N_cal x P torch 2D float-tensor.
modelled_embedding = DIME().fit(x).calibrate(x_cal)
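Conceptually, calibration amounts to storing the empirical distribution of calibration-set distances; a probability is then just an empirical-CDF value. A hedged sketch of that idea (not the package's implementation; `empirical_cdf` is invented here):

```python
import torch

def empirical_cdf(distances_cal):
    # Returns a function mapping each distance to the fraction of
    # calibration-set distances less than or equal to it.
    def prob(d):
        return (distances_cal[None, :] <= d[:, None]).float().mean(dim=1)
    return prob

torch.manual_seed(0)
cal = torch.rand(1000)  # stand-in for calibration-set distances
prob = empirical_cdf(cal)
p = prob(torch.tensor([0.0, 0.5, 1.0]))
```

A distance at the low end of the calibration distribution maps to a probability near 0, and one beyond all calibration distances maps to 1.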

Given fitted hyperplane, you can calculate distances on new observations:

x_new = torch.tensor(...)  # N_new x P 2D float-tensor.
modelled_embedding.distance_to_hyperplane(x_new)  # -> 1D float-tensor, length N_new

To obtain the probability that a calibration-set observation has a distance less than or equal to each new observation's distance, you need to have calibrated the percentiles as shown above. You then receive the probabilities by passing the return_probabilities keyword:

modelled_embedding.distance_to_hyperplane(x_new, return_probabilities=True)  # -> 1D float-tensor, length N_new

You can also use the alternative formulation of distance within the hyperplane, optionally as probabilities:

modelled_embedding.distance_within_hyperplane(x_new)  # -> 1D float-tensor, length N_new
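The within-hyperplane distance can be understood as a variance-scaled (Mahalanobis-like) distance of the projected coordinates from the training-set center: far-out positions *within* the plane are also suspicious. The following is one plausible from-scratch formulation, not the package's exact definition; the function name and toy data are invented:

```python
import torch

def distance_within_hyperplane(x_new, x_train, rank):
    # Project onto the top-`rank` singular directions and scale each
    # coordinate by that direction's standard deviation, so directions
    # of very different variance contribute comparably.
    mean = x_train.mean(dim=0)
    _, s, vh = torch.linalg.svd(x_train - mean, full_matrices=False)
    scores = (x_new - mean) @ vh[:rank].T          # coordinates in the plane
    std = s[:rank] / (x_train.shape[0] - 1) ** 0.5  # per-direction std
    return (scores / std).norm(dim=1)

torch.manual_seed(0)
x_train = torch.randn(200, 8)
center = x_train.mean(dim=0, keepdim=True)
d_center = distance_within_hyperplane(center, x_train, rank=3)         # ~0
d_far = distance_within_hyperplane(center + 100.0, x_train, rank=3)    # large
```

The training-set center projects to the origin of the plane, so its within-plane distance is zero, while a far-shifted point scores highly even if it happens to lie near the plane itself.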

License

Distributed under the MIT license. See LICENSE for more information.

© 2021 Sartorius AG
