Computing the Gromov–Hausdorff distance
Project description
dGH
Given the distance matrices of metric spaces $X$ and $Y$, estimates the Gromov–Hausdorff distance $$d_\text{GH}(X, Y) = \frac{1}{2}\inf_{f:X\to Y, g:Y\to X} \text{dis}\Bigg(\Big\{\big(x, f(x)\big): x \in X\Big\} \cup \Big\{\big(g(y), y\big): y \in Y\Big\}\Bigg),$$ where $$\text{dis}(R) = \sup_{(x, y), (x', y') \in R} \big|d_X(x, x') - d_Y(y, y')\big|$$ is the distortion of a relation $R \subseteq X \times Y$.
The distance is estimated from above by minimizing its parametric relaxation whose solutions are guaranteed to deliver $d_\text{GH}(X, Y)$ for a sufficiently large value of the parameter $c>1$. The quadratic relaxation with affine constraints is minimized using conditional gradient descent in $O(n^3)$ time per iteration, where $n = \max\big\{|X|, |Y|\big\}$. The retrieved minimum is an upper bound of (and in many cases equals to) the Gromov–Hausdorff distance $d_\text{GH}(X, Y)$.
A detailed description of the relaxation, its optimality guarantees and optimization landscape, and the approach to minimizing it can be found in Computing the Gromov–Hausdorff distance using first-order methods.
Installation
To install the package from Python Package Index:
$ pip install dgh
Quickstart
Consider $X$ comprised by the vertices of a $1 \times 10$ rectangle and $Y$ — by the vertices of a unit equilateral triangle together with a point that is 10 away from each of them (see illustration).
To create their distance matrices (the $(i,j)$-th entry stores the distance between the $i$-th and $j$-th points):
>>> import numpy as np
>>> X = np.array([[0, 1, 10, 10],
... [0, 0, 10, 10],
... [0, 0, 0, 1],
... [0, 0, 0, 0]])
>>> X += X.T
>>> Y = np.array([[0, 1, 1, 10],
... [0, 0, 1, 10],
... [0, 0, 0, 10],
... [0, 0, 0, 0]])
>>> Y += Y.T
To compute (an upper bound of) their Gromov–Hausdorff distance $d_\text{GH}(X, Y)$:
>>> import dgh
>>> dGH = dgh.upper(X, Y)
>>> dGH
0.5
In this case, the distance $d_\text{GH}(X, Y)=\frac{1}{2}$ is computed exactly.
Iteration budget
By default, the algorithm is allocated 100 iterations of conditional gradient descent. The algorithm restarts from a random point every time after converging to an approximate solution (i.e. a stationary point) until the iteration budget is depleted. Bigger budget generally means longer run time and better accuracy.
To set the iteration budget:
>>> dGH = dgh.upper(X, Y, iter_budget=20)
>>> dGH
0.5
Optimal mapping pair
Every solution is a mapping pair $(f:X\to Y, g:Y\to X)$. To access the mapping pair delivering the retrieved minimum:
>>> dGH, f, g = dgh.upper(X, Y, return_fg=True)
>>> f
[2, 2, 3, 3]
>>> g
[1, 1, 1, 2]
The $i$-th entry in either mapping stores (the index of) the image of its codomain's $i$-th point. For example, here $g(y_3)=x_2$.
Relaxation parameter $c>1$
Explicitly specifying $c$ can improve the performance of the algorithm. Small $c$ makes the relaxation easier to minimize, but its solutions are more likely to deliver the Gromov–Hausdorff distance when $c$ is large.
By default, the method allocates half of the iteration budget to select the best value of $c$ from $1+10^{-4}, 1+10^{-2},\ldots,1+10^8$, and then spends the remaining half on refining the Gromov–Hausdorff distance using this $c$. You can specify $c$ explicitly to see if it results in better accuracy and/or to save iterations on the search.
To see the value of $c$ selected after the search (along with the run summary):
>>> dgh.upper(X, Y, verbose=1)
iteration budget 100 | c=auto | dGH≥0
spent 49 iterations to choose c=1.0001
proved dGH≤0.5 after 40 restarts
To specify $c$ explicitly:
>>> dGH = dgh.upper(X, Y, c=1000)
>>> dGH
0.5
Contributing
If you found a bug or want to suggest an enhancement, you can create a GitHub Issue. Alternatively, you can email vlad.oles (at) proton (dot) me.
License
dGH is released under the MIT license.
Research
To cite dGH, you can use the following:
Oles, V. (2023). Computing the Gromov–Hausdorff distance using first-order methods. arXiv preprint arXiv:2307.13660.
@article{oles2023computing,
title={Computing the Gromov--Hausdorff distance using first-order methods},
author={Oles, Vladyslav},
journal={arXiv preprint arXiv:2307.13660},
year={2023}
}
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file dgh-0.1.3.tar.gz
.
File metadata
- Download URL: dgh-0.1.3.tar.gz
- Upload date:
- Size: 11.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 78257ddc6e5ff4dba766564aee26900d154b55913ab27ab1be60afb48b66873d |
|
MD5 | 991ba042901c89497d9aa5738d52be91 |
|
BLAKE2b-256 | 08fe6999bb5098964d0f113f1ca3f55d9771175844a1638369a71d86cede2c36 |
File details
Details for the file dgh-0.1.3-py3-none-any.whl
.
File metadata
- Download URL: dgh-0.1.3-py3-none-any.whl
- Upload date:
- Size: 10.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 1b6b5af8d14121356f0ded2509597c6cf0e319ff70ee19b8ba5f85b8010ca1ef |
|
MD5 | 953f41c80a6d845ed5d671ee17d155c0 |
|
BLAKE2b-256 | bc2f40c19af3b5da4a9d7b546aaad1e7e5175270af8a5f672d7633fd150668f1 |