
Models for calculating distances between synthesis routes


route-distances


This repository contains tools and routines to calculate distances between synthesis routes and to cluster them.

This repository is mainly intended for developers and researchers. If you want a fully functional tool that is easy to use, please consider looking into the AiZynthFinder project.

Prerequisites

Before you begin, ensure you have met the following requirements:

  • You are on Linux, Windows or macOS. All three are supported, provided that the dependencies are available on the platform.

  • You have installed Anaconda or Miniconda with Python 3.9 to 3.11.

The tool has been developed on a Linux platform, and it has also been tested on Windows 10 and macOS Catalina.

Installation

For users

Set up your Python environment and then run:

pip install route-distances
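To confirm that the environment matches the prerequisites above and that the package is visible to the interpreter, a quick check along these lines may help. This is a minimal sketch using only the standard library; the helper names is_supported_python and installed_version are illustrative and not part of route-distances.

```python
import sys
from importlib import metadata


def is_supported_python(version=None):
    """Return True if the (major, minor) version is within 3.9 to 3.11."""
    major_minor = tuple((version or sys.version_info)[:2])
    return (3, 9) <= major_minor <= (3, 11)


def installed_version(distribution="route-distances"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Supported Python:", is_supported_python())
    print("route-distances version:", installed_version())
```

If installed_version returns None, the package was most likely installed into a different environment than the one running the script.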

For developers

First clone the repository using Git.

Then execute the following commands in the root of the repository:

conda env create -f conda-env.yml
conda activate routes-env
poetry install

The route_distances package is now installed in editable mode.

Usage

The package installs the cluster_aizynth_output command-line tool, which calculates distances and clusters for AiZynthFinder output:

cluster_aizynth_output --files finder_output1.hdf5 finder_output2.hdf5 --output finder_distances.hdf5 --nclusters 0 --model ted

This performs tree edit distance (TED) calculations and adds two columns to the output file: distance_matrix, which holds the distances, and cluster_labels, which holds the cluster label of each route.
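Once the output file has been loaded, the two columns can be combined to inspect the clustering. The sketch below uses a small made-up distance matrix and label list for a single target. It assumes that the matrix is square with one row per route and that the labels are listed in the same route order; the helper names are hypothetical and not part of route-distances.

```python
from collections import defaultdict


def group_routes_by_cluster(cluster_labels):
    """Map each cluster label to the list of route indices carrying it."""
    groups = defaultdict(list)
    for route_index, label in enumerate(cluster_labels):
        groups[label].append(route_index)
    return dict(groups)


def mean_intra_cluster_distance(distance_matrix, members):
    """Average pairwise distance between the routes in one cluster."""
    pairs = [
        distance_matrix[i][j]
        for pos, i in enumerate(members)
        for j in members[pos + 1:]
    ]
    # A cluster with a single route has no pairs; report 0.0 for it.
    return sum(pairs) / len(pairs) if pairs else 0.0


# Made-up example data (an assumption, not real tool output):
# four routes with pairwise distances and their cluster labels.
distance_matrix = [
    [0.0, 1.0, 6.0, 7.0],
    [1.0, 0.0, 5.0, 6.0],
    [6.0, 5.0, 0.0, 2.0],
    [7.0, 6.0, 2.0, 0.0],
]
cluster_labels = [0, 0, 1, 1]

groups = group_routes_by_cluster(cluster_labels)
for label, members in sorted(groups.items()):
    spread = mean_intra_cluster_distance(distance_matrix, members)
    print(f"cluster {label}: routes {members}, mean distance {spread}")
```

A small mean distance within a cluster, compared with the distances between clusters, suggests that the grouped routes are structurally similar.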

An ML model for fast predictions can be found here: https://zenodo.org/record/4925903.

After downloading, the model checkpoint can be passed to the cluster_aizynth_output tool through the --model argument:

cluster_aizynth_output --files finder_output1.hdf5 finder_output2.hdf5 --output finder_distances.hdf5 --nclusters 0 --model chembl_10k_route_distance_model.ckpt
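When clustering many result files, it can be convenient to assemble the command from Python. The following sketch only uses the flags that appear in the two invocations above. The function name build_cluster_command is hypothetical, and the call that actually runs the tool is left commented out because it requires route-distances and the input files to be present.

```python
import shlex
import subprocess


def build_cluster_command(files, output, model="ted", nclusters=0):
    """Assemble the cluster_aizynth_output argument list.

    `model` is either "ted" or a path to a model checkpoint, mirroring
    the two invocations shown above.
    """
    if not files:
        raise ValueError("at least one input file is required")
    return [
        "cluster_aizynth_output",
        "--files", *files,
        "--output", output,
        "--nclusters", str(nclusters),
        "--model", model,
    ]


command = build_cluster_command(
    ["finder_output1.hdf5", "finder_output2.hdf5"],
    "finder_distances.hdf5",
    model="chembl_10k_route_distance_model.ckpt",
)
print(shlex.join(command))
# Uncomment to run the tool (requires route-distances and the input files):
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.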

For further details, please consult the documentation.

Development

Testing

The tests use the pytest package, which is installed by poetry.

Run the tests using:

pytest -v

Documentation generation

The documentation is generated by Sphinx from hand-written tutorials and docstrings.

The HTML documentation can be generated with:

invoke build-docs

Contributing

We welcome contributions, in the form of issues or pull requests.

If you have a question or want to report a bug, please submit an issue.

To contribute code to the project, follow these steps:

  1. Fork this repository.
  2. Create a branch: git checkout -b <branch_name>.
  3. Make your changes and commit them: git commit -m '<commit_message>'
  4. Push to the remote branch: git push
  5. Create the pull request.

Please use the black package for formatting, and follow the PEP 8 style guide.

Contributors

  • Samuel Genheden

The contributors have limited time for support questions, but please do not hesitate to submit an issue (see above).

License

The software is licensed under the MIT license (see LICENSE file), and is free and provided as-is.

References

  1. Genheden S, Engkvist O, Bjerrum E (2021) Clustering of synthetic routes using tree edit distance. J. Chem. Inf. Model. 61:3899–3907 https://doi.org/10.1021/acs.jcim.1c00232
  2. Genheden S, Engkvist O, Bjerrum E (2022) Fast prediction of distances between synthetic routes with deep learning. Mach. Learn. Sci. Technol. 3:015018 https://doi.org/10.1088/2632-2153/ac4a91
