
Diffusion-based Earth Mover's Distance.

Project description

Diffusion Earth Mover’s Distance embeds the Wasserstein distance between two distributions on a graph into L^1 in log-linear time.
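To make the idea concrete, here is a naive dense sketch of a multiscale diffusion embedding. It is not the library's implementation and it is cubic in the number of points rather than log-linear. The function name `dense_diffusion_embedding`, the parameters `max_scale` and `alpha`, the column normalisation of the adjacency matrix, and the scale weights are all assumptions made for illustration, loosely modelled on the paper cited in the Paper section.

```python
import numpy as np


def dense_diffusion_embedding(adj, distributions, max_scale=4, alpha=0.5):
    """Naive dense reference (hypothetical helper, not part of DiffusionEMD)."""
    adj = np.asarray(adj, dtype=float)
    mu = np.asarray(distributions, dtype=float)

    # Assumption: column-normalise so that P @ mu keeps each column's total mass.
    P = adj / adj.sum(axis=0, keepdims=True)

    # diffused[k] holds P^(2^k) @ mu for k = 0..max_scale.
    diffused = []
    P_pow = P.copy()  # P^(2^0)
    for _ in range(max_scale + 1):
        diffused.append(P_pow @ mu)
        P_pow = P_pow @ P_pow  # repeated squaring: P^(2^k) -> P^(2^(k+1))

    # Differences between consecutive dyadic scales, with assumed weights.
    blocks = []
    for k in range(max_scale):
        weight = 2.0 ** (-(max_scale - k - 1) * alpha)
        blocks.append(weight * (diffused[k + 1] - diffused[k]))
    blocks.append(diffused[max_scale])  # coarsest scale is kept as-is

    # Stack the scales along the point axis, one distribution per row.
    return np.concatenate(blocks, axis=0).T


# Toy example: a 10-node path graph with self-loops and 5 random distributions.
rng = np.random.default_rng(0)
n = 10
adj = np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
mu = rng.random((n, 5))
mu /= mu.sum(axis=0, keepdims=True)  # each column sums to 1

emb = dense_diffusion_embedding(adj, mu)
print(emb.shape)  # (5, 50)
```

The L1 distance between two rows of `emb` then plays the role of the approximate Earth Mover's Distance between the corresponding distributions. The package avoids forming the dense powers explicitly.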

Installation

DiffusionEMD is available on PyPI. Install it by running:

pip install DiffusionEMD

Quick Start

DiffusionEMD follows the sklearn estimator framework. It provides two classes that operate quite differently. The first, DiffusionCheb, uses a Chebyshev approximation of the operator; we recommend it when the number of distributions is small compared to the number of points. The second, DiffusionTree, uses the Interpolative Decomposition method to compute the dyadic powers $P^{2^k}$ directly. Both classes are used in the same way: supply parameters, then fit to a graph and an array of distributions:

import numpy as np
from DiffusionEMD import DiffusionCheb

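# Set up an adjacency matrix and a set of distributions to embed.
# adj is (n_points, n_points); distributions is (n_points, n_distributions).
adj = np.ones((10, 10))
distributions = np.random.rand(10, 5)  # non-negative entries
distributions /= distributions.sum(axis=0)  # each column sums to 1
dc = DiffusionCheb()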

# Embeddings where the L1 distance approximates the Earth Mover's Distance
embeddings = dc.fit_transform(adj, distributions)
# Shape: (5, 60)
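Once the embeddings are computed, pairwise approximate distances are plain L1 distances between rows. The sketch below uses a random stand-in array with the same shape as the output above, so that it runs without DiffusionEMD installed; the variable names are illustrative only.

```python
import numpy as np

# Stand-in for the output of fit_transform: 5 embeddings of dimension 60.
rng = np.random.default_rng(0)
embeddings = rng.random((5, 60))

# Pairwise L1 distances via broadcasting:
# pairwise_l1[i, j] = sum over d of |embeddings[i, d] - embeddings[j, d]|.
diffs = embeddings[:, None, :] - embeddings[None, :, :]
pairwise_l1 = np.abs(diffs).sum(axis=-1)

print(pairwise_l1.shape)  # (5, 5)
```

The result is a symmetric matrix with a zero diagonal. For many distributions, a nearest-neighbour structure with the Manhattan metric avoids building the full matrix.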

Requirements are listed in requirements.txt.

Examples

Examples are in the notebooks directory. They show how the parameters behave on simple cases that are easy to visualize.

Paper

This code implements the algorithms described in this paper:

ArXiv Link: http://arxiv.org/abs/2102.12833

@InProceedings{pmlr-v139-tong21a,
  title =       {Diffusion Earth Mover’s Distance and Distribution Embeddings},
  author =      {Tong, Alexander Y and Huguet, Guillaume and Natik, Amine and Macdonald, Kincaid and Kuchroo, Manik and Coifman, Ronald and Wolf, Guy and Krishnaswamy, Smita},
  booktitle =   {Proceedings of the 38th International Conference on Machine Learning},
  pages =       {10336--10346},
  year =        {2021},
  editor =      {Meila, Marina and Zhang, Tong},
  volume =      {139},
  series =      {Proceedings of Machine Learning Research},
  month =       {18--24 Jul},
  publisher =   {PMLR},
  pdf =         {http://proceedings.mlr.press/v139/tong21a/tong21a.pdf},
  url =         {http://proceedings.mlr.press/v139/tong21a.html},
}
