
Distances for graphs and Markov chains using optimal transport

Project description


Distances on graphs based on optimal transport

This is the implementation code for

Brugere, T., Wan, Z., & Wang, Y. (2023). Distances for Markov Chains, and Their Differentiation. arXiv preprint arXiv:2302.08621.

Setup

Installing as a library

The ot_markov_distances package can be installed with the following command:

pip install git+https://github.com/YusuLab/ot_markov_distances
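
If the install succeeded, the package is importable directly from Python. A minimal sanity check (assuming the import name matches the package name, as the project structure below suggests):

# Quick post-install check: import the package and show where it was installed.
import ot_markov_distances
print(ot_markov_distances.__file__)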

Dependencies

Python version

This project requires Python 3.10 or later. If your Python version is older than 3.10, you will need to upgrade (or create a new conda environment) to a suitable version (the latest release at the time of writing is 3.12).

Python dependencies

This package manages its dependencies via poetry. I recommend installing it; if you prefer to manage the dependencies manually instead, the full list is available in pyproject.toml.

Once poetry is installed, you can install the dependencies using our makefile

$ make .make/deps

or directly with poetry

$ poetry install

TUDataset

The TUDataset package is needed only if you plan to reproduce the classification experiment. It is not available via pip / poetry; to install it, follow the instructions in the tudataset repo, including the “Compilation of kernel baselines” section, and add the directory where you downloaded it to your $PYTHONPATH, e.g.:

$ export PYTHONPATH="/path/to/tudataset:$PYTHONPATH"
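
If you prefer not to modify your shell environment, the same effect can be obtained from inside a script or notebook; in the sketch below, "/path/to/tudataset" is a placeholder for wherever you cloned the repo:

# Alternative to the shell export above: make the tudataset checkout importable
# for the current Python process only. "/path/to/tudataset" is a placeholder.
import sys
sys.path.insert(0, "/path/to/tudataset")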

Project structure

.
├── docs                    # contains the generated docs (after typing make)
│   ├── build
│   │   └── html            # contains the html docs in readthedocs format
│   └── source
├── experiments             # contains jupyter notebooks with the experiments
│   └── utils               # contains helper code for the experiments
├── ot_markov_distances     # contains reusable library code for computing and differentiating the discounted WL distance
│   ├── discounted_wl.py    # implementation of our discounted WL distance
│   ├── __init__.py
│   ├── sinkhorn.py         # implementation of the Sinkhorn distance
│   ├── utils.py            # utility functions
│   └── wl.py               # implementation of the WL distance by Chen et al.
├── staticdocs              # contains the static source for the docs
│   ├── build
│   └── source
└── tests                   # contains sanity checks
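
The submodule names under ot_markov_distances match the files listed above. Without assuming any particular function signature, you can inspect what each submodule exports and then look the names up in the generated documentation:

# List the public names of each submodule (module names taken from the
# project structure above; see the docs for the actual signatures).
from ot_markov_distances import discounted_wl, sinkhorn, utils, wl

for module in (discounted_wl, sinkhorn, utils, wl):
    public = [name for name in dir(module) if not name.startswith("_")]
    print(f"{module.__name__}: {', '.join(public)}")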

Documentation

The documentation is available to read online.

You can build documentation and run tests using

$ make

Alternatively, you can build only the documentation using

$ make .make/build-docs

The documentation will be available in docs/build/html in the readthedocs format

Running Experiments

Running experiments requires installing development dependencies. This can be done by running

$ make .make/dev-deps

or alternatively

$ poetry install --with dev

Experiments can be found in the experiments/ directory (see Project structure).

The Barycenter and Coarsening experiments can be found in experiments/Barycenter.ipynb and experiments/Coarsening.ipynb.

The performance graphs are computed in experiments/Performance.ipynb.

Classification experiment

The classification experiment (see the first paragraph of section 6 in the paper) is not in a jupyter notebook; instead, it is run from the command line.

As an additional dependency it needs tudataset, which is not installable via pip. As described in the Setup section above, follow the instructions in the tudataset repo, including the “Compilation of kernel baselines” section, and add the directory where you downloaded it to your $PYTHONPATH.

Now you can run the classification experiment using the command

$ poetry run python -m experiments.classification
usage: python -m experiments.classification [-h] {datasets_info,distances,eval} ...

Run classification experiments on graph datasets

positional arguments:
  {datasets_info,distances,eval}
    datasets_info       Print information about given datasets
    distances           Compute distance matrices for given datasets
    eval                Evaluate a kernel based on distance matrix

options:
  -h, --help            show this help message and exit

The yaml file containing dataset information that should be passed to the command line is in experiments/grakel_datasets.yaml. Modifying this file should allow running the experiment on different datasets.
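
Each subcommand also accepts --help, which prints its own arguments (including how the yaml file is passed). A small driver that assumes nothing beyond the subcommand names shown above:

# Print the usage of each subcommand listed in the help output above.
# Only the subcommand names are assumed; run this from the repository root.
import subprocess

for subcommand in ("datasets_info", "distances", "eval"):
    subprocess.run(
        ["poetry", "run", "python", "-m", "experiments.classification",
         subcommand, "--help"],
        check=True,
    )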

