Distances for graphs and markov chains using optimal transport
Distances on graphs based on optimal transport
This repository contains the implementation code for:
Brugere, T., Wan, Z., & Wang, Y. (2023). Distances for Markov Chains, and Their Differentiation. ArXiv, abs/2302.08621.
Setup
Installing as a library
The ot_markov_distances package can be installed with the following command:
pip install ot_markov_distances
If for some reason you need to use CUDA 11.8 (i.e., you are installing torch+cuda118), use the following command instead:
pip install git+https://github.com/YusuLab/ot_markov_distances@cuda118
Dependencies
Python version
This project requires Python 3.10 at minimum. If your Python version is older than 3.10, update it (or create a new conda environment) with a more recent version (the latest release at the time of writing is 3.12).
Python dependencies
This package manages its dependencies via poetry. We recommend installing it (if you prefer to manage dependencies manually, the full list is available in pyproject.toml).
Once you have poetry, you can install the dependencies using our Makefile
$ make .make/deps
or directly with poetry
$ poetry install
TUDataset
This step is only needed if you plan to reproduce the classification experiment.
The TUDataset package is required to run the classification experiment, but it is not available via pip / poetry. To install it, follow the instructions in the tudataset repo, including the “Compilation of kernel baselines” section, and add the directory where you downloaded it to your $PYTHONPATH, e.g.:
$ export PYTHONPATH="/path/to/tudataset:$PYTHONPATH"
Project structure
.
├── docs                    # generated docs (after running make)
│   ├── build
│   │   └── html            # HTML docs in the readthedocs format
│   └── source
├── experiments             # Jupyter notebooks with the experiments
│   └── utils               # helper code for the experiments
├── ot_markov_distances     # reusable library code for computing and differentiating the discounted WL distance
│   ├── discounted_wl.py    # implementation of our discounted WL distance
│   ├── __init__.py
│   ├── sinkhorn.py         # implementation of the Sinkhorn distance
│   ├── utils.py            # utility functions
│   └── wl.py               # implementation of the WL distance by Chen et al.
├── staticdocs              # static source for the docs
│   ├── build
│   └── source
└── tests                   # sanity checks
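For orientation, `sinkhorn.py` implements an entropy-regularized optimal transport solver. The generic Sinkhorn iteration (sketched here in plain NumPy, independent of this package's actual API; function and variable names are illustrative) alternately rescales a Gibbs kernel until both marginals match:

```python
import numpy as np

def sinkhorn(mu, nu, C, eps=0.5, n_iter=200):
    """Entropy-regularized OT between histograms mu and nu with cost matrix C.

    Alternately scales the Gibbs kernel K = exp(-C / eps) so the resulting
    transport plan P has row marginal mu and column marginal nu.
    """
    K = np.exp(-C / eps)
    u = np.ones_like(mu)
    for _ in range(n_iter):
        v = nu / (K.T @ u)  # scale columns to match nu
        u = mu / (K @ v)    # scale rows to match mu
    return u[:, None] * K * v[None, :]

# Example: transport between two 3-bin histograms with |i - j| ground cost
mu = np.array([0.5, 0.3, 0.2])
nu = np.array([0.2, 0.2, 0.6])
C = np.abs(np.arange(3)[:, None] - np.arange(3)[None, :]).astype(float)
P = sinkhorn(mu, nu, C)
```

The package's own implementation is written in PyTorch so the resulting distances are differentiable; this NumPy version only illustrates the underlying iteration.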
Documentation
The documentation is available online.
You can build documentation and run tests using
$ make
Alternatively, you can build only the documentation using
$ make .make/build-docs
The documentation will be generated in docs/build/html in the readthedocs format.
Running Experiments
Running experiments requires installing development dependencies. This can be done by running
$ make .make/dev-deps
or alternatively
$ poetry install --with dev
Experiments can be found in the experiments/ directory (see Project structure).
The Barycenter and Coarsening experiments can be found in experiments/Barycenter.ipynb and experiments/Coarsening.ipynb.
The performance graphs are computed in experiments/Performance.ipynb.
Classification experiment
The classification experiment (see the first paragraph of Section 6 in the paper) is not in a Jupyter notebook; it is run from the command line.
As an additional dependency it needs tudataset, which is not installable via pip; see the TUDataset section above for installation instructions.
Now you can run the classification experiment using the command
$ poetry run python -m experiments.classification
usage: python -m experiments.classification [-h] {datasets_info,distances,eval} ...

Run classification experiments on graph datasets

positional arguments:
  {datasets_info,distances,eval}
    datasets_info       Print information about given datasets
    distances           Compute distance matrices for given datasets
    eval                Evaluate a kernel based on distance matrix

options:
  -h, --help            show this help message and exit
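The help output above is a standard argparse subcommand layout. A minimal sketch of how such an interface is typically wired (illustrative only, not the project's actual code) is:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the subcommand interface shown above."""
    parser = argparse.ArgumentParser(
        prog="python -m experiments.classification",
        description="Run classification experiments on graph datasets",
    )
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("datasets_info", help="Print information about given datasets")
    sub.add_parser("distances", help="Compute distance matrices for given datasets")
    sub.add_parser("eval", help="Evaluate a kernel based on distance matrix")
    return parser

# Each subcommand is selected by its name on the command line
args = build_parser().parse_args(["datasets_info"])
```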
The yaml file containing dataset information that should be passed to the command line is in experiments/grakel_datasets.yaml. Modifying this file should allow running the experiment on different datasets.
Project details
File details
Details for the file ot_markov_distances-1.0.1.tar.gz
File metadata
- Download URL: ot_markov_distances-1.0.1.tar.gz
- Upload date:
- Size: 17.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.2 CPython/3.11.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | ea6aaf44f2b0c62701e02e1053584b9a75950b5c9fbdd43525c1d8bd26a24cf7
MD5 | ece2f97124eb394b60e42a2115e0b5b7
BLAKE2b-256 | 4bcebda1f0e12cf3194b486928880d765700680a0f62b00dc3135b0d04396934
File details
Details for the file ot_markov_distances-1.0.1-py3-none-any.whl
File metadata
- Download URL: ot_markov_distances-1.0.1-py3-none-any.whl
- Upload date:
- Size: 17.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.2 CPython/3.11.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9558088aa0fae950f685707c7d90768fc46f58673e75ee3cd3842a570edc7058
MD5 | 1eae9e1c01bb228567288f2493031340
BLAKE2b-256 | 034fbea026d4ceacc6fa3dbdb171847896d7bc7934a121777700a718a401aa24