Computing the Gromov–Hausdorff distance

These details have not been verified by PyPI

Project links

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

dGH

Given the distance matrices of some metric spaces $X$ and $Y$, estimates the Gromov–Hausdorff distance $$d_\text{GH}(X, Y) = \frac{1}{2}\inf_{f:X\to Y, g:Y\to X} \text{dis}\Big(\{(x, f(x)): x \in X\} \cup \{(g(y), y): y \in Y\}\Big),$$ where $$\text{dis}(R) = \sup_{(x, y), (x', y') \in R} |d_X(x, x') - d_Y(y, y')|$$ for $R \subseteq X \times Y$.

The distance is estimated from above by solving its parametric relaxation whose solutions are guaranteed to deliver $d_\text{GH}(X, Y)$ for sufficiently large value of the parameter $c$. The quadratic relaxation with affine constraints is solved using the Frank–Wolfe algorithm in $O(n^3)$ time per its iteration, where $n = \max{|X|, |Y|}$ is the number of points in the larger space. Even if the algorithm fails to find $d_\text{GH}(X, Y)$ exactly, the resulting solution provides its upper bound.

A detailed description of the relaxation, its optimality guarantees and optimization landscape, and the approach to solving it can be found in Computing the Gromov–Hausdorff distance using first-order methods.

Quickstart

To install the package from Python Package Index:

$ pip install dgh

Consider an example in which $X$ is comprised by the vertices of a $1 \times 10$ rectangle and $Y$ — by the vertices of a unit equilateral triangle together with a point that is 10 away fom them (see illustration).

Illustration of the example

To create their distance matrices (in which the (i,j)-th entry contains the distance between the i-th and j-th points of the space):

import numpy as np

X = np.array([[0, 1, 10, 10],
              [0, 0, 10, 10],
              [0, 0, 0, 1],
              [0, 0, 0, 0]])
X += X.T

Y = np.array([[0, 1, 1, 10],
              [0, 0, 1, 10],
              [0, 0, 0, 10],
              [0, 0, 0, 0]])
Y += Y.T

To compute (an upper bound of) their Gromov--Hausdorff distance $d_\text{GH}(X, Y)$:

import dgh

dGH = dgh.upper(X, Y)

In this case, the distance is computed exactly as $d_\text{GH}(X, Y)=\frac{1}{2}$.

Basics

By default, the computational budget allocated for the search is 100 Frank–Wolfe iterations. Bigger budget means more random restarts (and/or better convergence) and therefore the accuracy. To set the budget:

dGH = dgh.upper(X, Y, iter_budget=my_budget)

To obtain the mappings $f:X\to Y, g:Y\to X$ (as arrays s.t. $f_i = j \Leftrightarrow f(x_i) = y_j$) that deliver the found minimum:

dGH, f, g = dgh.upper(X, Y, return_fg=True)

Advanced

The performance can be improved by explicitly specifying the relaxation parameter $c \in (1, \infty)$. Small $c$ makes the relaxation easier to solve, but its solutions are more likely to deliver the Gromov–Hausdorff distance when $c$ is large.

By default, the method allocates half of the iteration budget to select the best value of $c$ for $(X, Y)$ from $1+10^{-4}, 1+10^{-2},\ldots,1+10^8$, and then spends the remaining half on refining the Gromov–Hausdorff distance using this $c$.

To reveal the value of $c$ selected after the search:

dgh.upper(X, Y, verbose=1)

To compute for specific value of $c$ (to avoid spending iterations on the search and/or to test for better accuracy):

dGH = dgh.upper(X, Y, c=my_c)

Contributing

If you found a bug or want to suggest an enhancement, you can create a GitHub Issue. Alternatively, you can email vlad.oles (at) proton (dot) me.

License

dGH is released under the MIT license.

Research

To cite dGH, you can use the following:

Oles, V. (2023). Computing the Gromov–Hausdorff distance using first-order methods. arXiv preprint arXiv:2307.13660.

@article{oles2023computing,
  title={Computing the Gromov--Hausdorff distance using first-order methods},
  author={Oles, Vladyslav},
  journal={arXiv preprint arXiv:2307.13660},
  year={2023}
}

Project details

These details have not been verified by PyPI

Project links

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Release history Release notifications | RSS feed

0.1.3

Dec 28, 2023

0.1.2 yanked

Dec 28, 2023

This version

0.1.1

Jul 26, 2023

0.1.0

Jul 24, 2023

0.0.8

May 11, 2023

0.0.7

May 10, 2023

0.0.6

May 9, 2023

0.0.5

May 7, 2023

0.0.4

May 7, 2023

0.0.3

May 7, 2023

0.0.2

May 7, 2023

0.0.1

May 7, 2023

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dgh-0.1.1.tar.gz (11.2 kB view hashes)

Uploaded Jul 26, 2023 Source

Built Distribution

dgh-0.1.1-py3-none-any.whl (9.6 kB view hashes)

Uploaded Jul 26, 2023 Python 3

Hashes for dgh-0.1.1.tar.gz

Hashes for dgh-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`4798f475611e3edb8132bc38fe95362adcabe185c64415497873c3100420ca65`
MD5	`ff5f3d7fe75368df01b5f334712c6e6c`
BLAKE2b-256	`f15f9fbb0d9a129b246e4d25283e8e0131f2c2646d816dd7f02d15e8472bc7ba`

Hashes for dgh-0.1.1-py3-none-any.whl

Hashes for dgh-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`15914eff3ce7d4eaf8493ee878223061ce816798e4fdcc1eeef2cbee77cd2133`
MD5	`70519001f933bebaaf56798e335689d5`
BLAKE2b-256	`10f6280ed95dda69f8d6788fe944bdcffd4d955afc64880efc833ec7ed30f757`