Advanced deep learning-based organic retrosynthesis engine
Project description
Odachi
Overview
The Odachi Retrosynthesis Engine provides a platform for predicting organic retrosynthetic disconnections using a graph convolutional network. It also exposes two custom TensorFlow layers for performing spectral graph convolutions. The engine powers the retrosynthesis.com website, which provides a clean and intuitive interface for running retrosynthetic predictions.
Requirements
The Odachi Engine is built in Python 3. It has only three requirements to run:
- TensorFlow 2.x
- scikit-learn
- NumPy
Reference
Installation
To download odachi, simply install it from PyPI via pip.
$ pip install odachi
Alternatively, you could install from source.
$ git clone https://github.com/sudo-rushil/odachi
$ cd odachi
$ python setup.py install
Verify your installation by running
>>> import odachi
>>> odachi.engine.model.Odachi()
<odachi.engine.model.Odachi object at 0x7f9ec80b3bd0>
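A lighter check that does not load the TensorFlow model is to read the installed version from the package metadata. The following is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`); the helper name `installed_version` is illustrative and is not part of Odachi.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("odachi")
    print(version if version is not None else "odachi is not installed")
```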
Examples
Predict bond disconnection
This is a simple way of finding a retrosynthetic disconnection in a molecule. The input to the model is the SMILES string of the molecule (here, aspirin).
from odachi.engine.model import Odachi
odachi = Odachi() # instantiate the engine and load the TensorFlow model in the backend.
results = odachi('O=C(C)Oc1ccccc1C(=O)O') # call prediction function on an input molecule.
print(results)
{'bonds': [2], 'smiles': 'O=C(C)Oc1ccccc1C(=O)O', 'svg':...}
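Because the engine is called like a function on a SMILES string and returns a dictionary with a 'bonds' entry, the same pattern extends to several molecules. The sketch below is hedged: `predict_many` and `fake_predictor` are hypothetical names introduced here, and the stand-in predictor returns made-up values so the example runs without Odachi or TensorFlow installed.

```python
from typing import Callable, Dict, Iterable, List


def predict_many(predictor: Callable[[str], dict],
                 smiles_list: Iterable[str]) -> Dict[str, List[int]]:
    """Map each SMILES string to the 'bonds' list returned by the predictor."""
    bonds_by_smiles = {}
    for smiles in smiles_list:
        result = predictor(smiles)
        bonds_by_smiles[smiles] = list(result["bonds"])
    return bonds_by_smiles


def fake_predictor(smiles: str) -> dict:
    """Stand-in for an Odachi() instance; returns a fixed, made-up result."""
    return {"bonds": [0], "smiles": smiles, "svg": "<svg/>"}


if __name__ == "__main__":
    # With Odachi installed, pass an Odachi() instance instead of fake_predictor.
    print(predict_many(fake_predictor, ["CCO", "CC(=O)O"]))
```

Passing the predictor as an argument keeps the loop independent of how the model is loaded, so the engine only needs to be instantiated once.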
Documentation
The Odachi package exposes four main objects: the GraphConv and ConvEmbed TensorFlow layers for spectral graph convolutions with knockdown, the Conv object for representing molecules as graphs, and the Odachi object for top-level predictions.
Layers
GraphConv
graph_conv = odachi.engine.layers.GraphConv(n,
num_feat = 41,
num_atoms = 130,
activation = tf.nn.elu,
knockdown = 0.1,
BATCH_SIZE = 1)
Layer for performing single-phase spectral graph convolutions. Inherits from tensorflow.keras.layers.Layer and has access to all associated methods.
Parameters
- n - Layer index for labeling purpose.
- num_feat - Number of features for each node in graph.
- num_atoms - Maximum number of nodes over all graphs.
- activation - Activation function for layer.
- knockdown - Convolutional knockdown threshold for spectral regularization.
- BATCH_SIZE - Number of graphs in each input batch (the leading dimension of A and X).
Call
A, X = graph_conv([A, X])
Parameters
- A - Adjacency matrix of graph. Has dimensions (BATCH_SIZE, num_atoms, num_atoms).
- X - Features matrix of graph. Has dimensions (BATCH_SIZE, num_atoms, num_feat).
Returns
- A - Adjacency matrix of graph. Unchanged from input.
- X - Convolved features matrix of graph.
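The call contract above (take A and X, return A unchanged and a transformed X of the same shape) can be illustrated with a toy, single-graph example in plain Python. This is not Odachi's spectral convolution: it is a generic neighborhood-averaging step with an ELU activation, with no learned weights and no knockdown, and the name `toy_graph_conv` is hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def elu(value: float) -> float:
    """Exponential linear unit with alpha = 1."""
    return value if value > 0 else math.exp(value) - 1.0


def toy_graph_conv(A: Matrix, X: Matrix) -> Tuple[Matrix, Matrix]:
    """Average each node's features with its neighbors', then apply ELU.

    Returns A unchanged and a new X of the same shape, mirroring the
    documented call signature. This is a generic illustration only.
    """
    num_nodes, num_feat = len(X), len(X[0])
    new_X = []
    for i in range(num_nodes):
        # Node i together with every node j connected to it.
        group = [i] + [j for j in range(num_nodes) if j != i and A[i][j]]
        row = [elu(sum(X[j][f] for j in group) / len(group))
               for f in range(num_feat)]
        new_X.append(row)
    return A, new_X


if __name__ == "__main__":
    # Three-node path graph 0-1-2 with two features per node.
    A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    _, X_out = toy_graph_conv(A, X)
    print(X_out)
```

Rows that belong to padding (all-zero adjacency and features) stay at zero under this step, which is one reason zero-padding to a fixed number of nodes is convenient.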
ConvEmbed
conv_embed = odachi.engine.layers.ConvEmbed(num_feat = 41,
num_atoms = 130,
depth = 10,
knock = 0.2,
BATCH_SIZE = 1)
Model object for performing stacked graph convolutions with the number of features staying constant across layers. Inherits from tensorflow.keras.Model.
Parameters
- num_feat - Number of features for each node in graph.
- num_atoms - Maximum number of nodes over all graphs.
- depth - Number of stacked convolutional layers.
- knock - Convolutional knockdown threshold.
- BATCH_SIZE - Number of graphs in each input batch (the leading dimension of A and X).
Call
X = conv_embed([A, X])
Parameters
- A - Initial adjacency matrix of graph. Has dimensions (BATCH_SIZE, num_atoms, num_atoms).
- X - Initial features matrix of graph. Has dimensions (BATCH_SIZE, num_atoms, num_feat).
Returns
- X - Fully convolved features matrix of graph.
Molecular Graph Representation
Conv
conv = odachi.data.conv.Conv(smiles)
Convolutional molecule (Conv) object for storing and representing molecules as featurized graphs upon which graph convolutional methods can be applied.
Parameters
- smiles - SMILES string representing the molecule to be stored as a featurized graph.
Attributes
- smiles - SMILES string of the molecule stored in the object.
- num_atoms - Number of atoms in the stored molecule.
- num_feat - Number of features per atom (default 41).
- adj_matrix - Adjacency matrix of molecular graph. Padded up to 130 nodes by default.
- atom_features - Features matrix of molecular graph. Padded up to 130 nodes by default.
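To make the padding concrete, the sketch below builds a zero-padded adjacency matrix from a list of bonds in plain Python. It is an illustration under stated assumptions, not the Conv implementation: the function name `padded_adjacency` is hypothetical, bonds are encoded as 0/1 entries without bond order or self-loops, and hydrogens are left implicit. The default of 130 nodes is taken from the attribute descriptions above.

```python
from typing import List, Sequence, Tuple


def padded_adjacency(num_atoms: int,
                     bonds: Sequence[Tuple[int, int]],
                     max_atoms: int = 130) -> List[List[int]]:
    """Build a symmetric 0/1 adjacency matrix zero-padded to max_atoms."""
    if num_atoms > max_atoms:
        raise ValueError("molecule has more atoms than max_atoms")
    matrix = [[0] * max_atoms for _ in range(max_atoms)]
    for i, j in bonds:
        if not (0 <= i < num_atoms and 0 <= j < num_atoms):
            raise ValueError(f"bond ({i}, {j}) refers to a missing atom")
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix


if __name__ == "__main__":
    # Ethanol heavy atoms (SMILES "CCO"): C0-C1 and C1-O2.
    adj = padded_adjacency(3, [(0, 1), (1, 2)])
    print(len(adj), len(adj[0]), sum(map(sum, adj)))
```

Padding every molecule to the same size gives all graphs the fixed (num_atoms, num_atoms) shape that the layers expect.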
Engine
Odachi
odachi = odachi.engine.model.Odachi(knock = 0.0)
Engine implementation that wraps all three phases of the retrosynthetic prediction process to allow for predictions to be made and streamed to the retrosynthesis.com website.
Parameters
- knock - Convolutional knockdown threshold for loading saved models.
Call
result_dict = odachi(smiles,
clusters = 2,
version = 9)
Parameters
- smiles - SMILES string of the query target molecule.
- clusters - Number of synthons to cluster the target molecule into.
- version - Version of the convolutional embedding to use for prediction. The latest version is 9.
Returns
- result_dict - Dictionary containing prediction data, with the following keys:
  - smiles - Original SMILES string of the target molecule.
  - bonds - List of bonds which are predicted to be disconnected.
  - svg - Raw SVG for rendering the predicted retrosynthetic disconnection.
  - time - Total prediction runtime.
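A small sketch of how the returned dictionary might be consumed is shown below. The key names come from the list above; the value types (a string for the SVG, a float for the runtime), the `OdachiResult` type, and the helpers `save_svg` and `summarize` are assumptions introduced for illustration, and the sample dictionary is made up so the example runs without the engine.

```python
from pathlib import Path
from typing import List, TypedDict


class OdachiResult(TypedDict, total=False):
    """Keys listed in the documentation; value types are assumptions."""
    smiles: str
    bonds: List[int]
    svg: str
    time: float


def save_svg(result: OdachiResult, path: str) -> Path:
    """Write the 'svg' entry of a result dictionary to a file."""
    target = Path(path)
    target.write_text(result["svg"], encoding="utf-8")
    return target


def summarize(result: OdachiResult) -> str:
    """One-line summary of the predicted disconnection."""
    bonds = ", ".join(str(b) for b in result["bonds"])
    return f"{result['smiles']}: disconnect bond(s) {bonds}"


if __name__ == "__main__":
    # Made-up result for illustration; a real one would come from Odachi().
    sample: OdachiResult = {"smiles": "CCO", "bonds": [1], "svg": "<svg/>"}
    print(summarize(sample))
    print(save_svg(sample, "disconnection.svg"))
```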
Download files
Source Distribution
File details
Details for the file odachi-1.0.1.tar.gz.
File metadata
- Download URL: odachi-1.0.1.tar.gz
- Upload date:
- Size: 1.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/42.0.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 937f35e085ddb54f49fd38dde872d28b1e65de3b35d15d8c38396d63c6ff4038
MD5 | c631bc6700d7be5ce4a4c4f0576f4e2d
BLAKE2b-256 | d01e5ce36118da21f1f68ad678e4ecfb92b794d2897a64981d0b18e262fc0fa5