Models for calculating distances between synthesis routes

route-distances

This repository contains tools and routines to calculate distances between synthesis routes and to cluster them.

This repository is mainly intended for developers and researchers. If you want a fully functional tool that is easy to use, please consider looking into the AiZynthFinder project.

Prerequisites

Before you begin, ensure you have met the following requirements:

  • Linux, Windows and macOS are supported, as long as the dependencies are available on these platforms.

  • Anaconda or Miniconda is installed with Python 3.9 to 3.11.

The tool has been developed on a Linux platform, but the software has been tested on Windows 10 and macOS Catalina.
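
As a quick sanity check before installing, the supported Python range stated above can be tested from the interpreter itself. This is a minimal sketch; the helper name is_supported is hypothetical and not part of route-distances.

```python
import sys

# Supported range as stated in the prerequisites: Python 3.9 to 3.11.
MIN_VERSION = (3, 9)
MAX_VERSION = (3, 11)


def is_supported(version=None):
    """Return True if the (major, minor) version lies in the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {current} supported: {is_supported()}")
```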

Installation

For users

Set up your Python environment and then run:

pip install route-distances

For developers

First clone the repository using Git.

Then execute the following commands in the root of the repository:

conda env create -f conda-env.yml
conda activate routes-env
poetry install

The route_distances package is now installed in editable mode.

Usage

The package installs the cluster_aizynth_output command-line tool, which calculates distances and clusters for AiZynthFinder output:

cluster_aizynth_output --files finder_output1.hdf5 finder_output2.hdf5 --output finder_distances.hdf5 --nclusters 0 --model ted

This performs tree edit distance (TED) calculations and writes the result to the output file with two added columns: distance_matrix, holding the distances, and cluster_labels, holding the cluster label of each route.
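
To illustrate how the two columns relate, the sketch below groups route indices by cluster label and computes the average distance within each cluster. It assumes that each row holds a square distance matrix over the routes of one target, with one label per route. The numbers are invented and the helper functions are hypothetical; they are not part of route_distances.

```python
from collections import defaultdict

# Toy data for a single target with four routes (values are made up).
# distance_matrix[i][j] is the distance between route i and route j.
distance_matrix = [
    [0.0, 2.0, 9.0, 8.0],
    [2.0, 0.0, 7.0, 9.0],
    [9.0, 7.0, 0.0, 1.0],
    [8.0, 9.0, 1.0, 0.0],
]
# cluster_labels[i] is the cluster assigned to route i.
cluster_labels = [0, 0, 1, 1]


def group_routes(labels):
    """Map each cluster label to the list of route indices carrying it."""
    clusters = defaultdict(list)
    for route_index, label in enumerate(labels):
        clusters[label].append(route_index)
    return dict(clusters)


def mean_intra_cluster_distance(matrix, members):
    """Average pairwise distance between the routes in one cluster."""
    pairs = [(i, j) for i in members for j in members if i < j]
    if not pairs:
        return 0.0
    return sum(matrix[i][j] for i, j in pairs) / len(pairs)


clusters = group_routes(cluster_labels)
for label, members in clusters.items():
    print(label, members, mean_intra_cluster_distance(distance_matrix, members))
```

A low average distance within a cluster, compared with the distances between clusters, indicates that the grouped routes are similar to each other.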

An ML model for fast predictions can be found here: https://zenodo.org/record/4925903.

Once downloaded, the model checkpoint can be passed to the cluster_aizynth_output tool through the --model argument:

cluster_aizynth_output --files finder_output1.hdf5 finder_output2.hdf5 --output finder_distances.hdf5 --nclusters 0 --model chembl_10k_route_distance_model.ckpt
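
When the tool is called from a script, for example to process many output files, the command can be assembled in Python. The sketch below uses only the flags shown in the commands above; the function build_cluster_command and its default values are hypothetical and serve only as an illustration.

```python
import shlex
import subprocess


def build_cluster_command(files, output, model="ted", nclusters=0):
    """Assemble the argument list for the cluster_aizynth_output tool."""
    return [
        "cluster_aizynth_output",
        "--files", *files,
        "--output", output,
        "--nclusters", str(nclusters),
        "--model", model,
    ]


def run_cluster_command(command):
    """Run the tool; needs route-distances installed and existing input files."""
    subprocess.run(command, check=True)


command = build_cluster_command(
    files=["finder_output1.hdf5", "finder_output2.hdf5"],
    output="finder_distances.hdf5",
    model="chembl_10k_route_distance_model.ckpt",
)
# Print the command line that would be executed.
print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting problems with file names; call run_cluster_command(command) to execute the tool.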

For further details, please consult the documentation.

Development

Testing

The tests use the pytest package, which is installed by poetry.

Run the tests using:

pytest -v

Documentation generation

The documentation is generated by Sphinx from hand-written tutorials and docstrings.

The HTML documentation can be generated with:

invoke build-docs

Contributing

We welcome contributions, in the form of issues or pull requests.

If you have a question or want to report a bug, please submit an issue.

To contribute with code to the project, follow these steps:

  1. Fork this repository.
  2. Create a branch: git checkout -b <branch_name>.
  3. Make your changes and commit them: git commit -m '<commit_message>'.
  4. Push to the remote branch: git push
  5. Create the pull request.

Please use the black package for formatting and follow the PEP 8 style guide.

Contributors

  • Samuel Genheden

The contributors have limited time for support questions, but please do not hesitate to submit an issue (see above).

License

The software is licensed under the MIT license (see LICENSE file), and is free and provided as-is.

References

  1. Genheden S, Engkvist O, Bjerrum E (2021) Clustering of synthetic routes using tree edit distance. J. Chem. Inf. Model. 61:3899–3907 https://doi.org/10.1021/acs.jcim.1c00232
  2. Genheden S, Engkvist O, Bjerrum E (2022) Fast prediction of distances between synthetic routes with deep learning. Mach. Learn. Sci. Technol. 3:015018 https://doi.org/10.1088/2632-2153/ac4a91
