Implementations of graph neural networks for molecular machine learning
MolGraph: Graph Neural Networks for Molecular Machine Learning
This is an early release: features are still being updated, added and experimented with, so API compatibility may break in future versions.
Any feedback is welcome!
Manuscript
See pre-print
Documentation
See readthedocs
Implementations
Graph tensor (GraphTensor)
- A composite tensor holding graph data.
- Has a ragged state (multiple graphs) and a non-ragged state (a single disjoint graph).
- Can switch between the two states (merge() and separate()).
- Can propagate node information (features) along edges (propagate()).
- Can add, update and remove graph data (update(), remove()).
- Has an associated GraphTensorSpec, which makes it compatible with the Keras and TensorFlow APIs.
  - This includes the keras.Sequential, keras.Functional, tf.data.Dataset and tf.saved_model APIs.
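The two states and the propagation step can be illustrated with a minimal plain-Python sketch. This is not MolGraph's implementation: the dictionary layout, the `graph_indicator` field and the three functions below are assumptions made for illustration, and every graph is assumed to contain at least one node.

```python
# Plain-Python sketch of the ragged <-> disjoint idea; NOT MolGraph's code.

def merge(graphs):
    """Merge a list of graphs (ragged state) into one disjoint graph."""
    merged = {'node_feature': [], 'edge_src': [], 'edge_dst': [],
              'graph_indicator': []}
    offset = 0
    for graph_index, graph in enumerate(graphs):
        num_nodes = len(graph['node_feature'])
        merged['node_feature'].extend(graph['node_feature'])
        # Shift local node indices so they stay unique in the merged graph.
        merged['edge_src'].extend(i + offset for i in graph['edge_src'])
        merged['edge_dst'].extend(i + offset for i in graph['edge_dst'])
        merged['graph_indicator'].extend([graph_index] * num_nodes)
        offset += num_nodes
    return merged


def separate(merged):
    """Invert merge(): split a disjoint graph back into a list of graphs."""
    num_graphs = max(merged['graph_indicator'], default=-1) + 1
    graphs = [{'node_feature': [], 'edge_src': [], 'edge_dst': []}
              for _ in range(num_graphs)]
    # Index of the first node of each graph in the merged numbering.
    offsets = [merged['graph_indicator'].index(g) for g in range(num_graphs)]
    for feature, g in zip(merged['node_feature'], merged['graph_indicator']):
        graphs[g]['node_feature'].append(feature)
    for src, dst in zip(merged['edge_src'], merged['edge_dst']):
        g = merged['graph_indicator'][dst]
        graphs[g]['edge_src'].append(src - offsets[g])
        graphs[g]['edge_dst'].append(dst - offsets[g])
    return graphs


def propagate(graph):
    """Sum the features of source nodes into their destination nodes."""
    dim = len(graph['node_feature'][0])
    updated = [[0.0] * dim for _ in graph['node_feature']]
    for src, dst in zip(graph['edge_src'], graph['edge_dst']):
        for k in range(dim):
            updated[dst][k] += graph['node_feature'][src][k]
    return updated


graphs = [
    {'node_feature': [[1.0], [2.0]], 'edge_src': [1, 0], 'edge_dst': [0, 1]},
    {'node_feature': [[3.0], [4.0], [5.0]],
     'edge_src': [1, 2, 0, 2, 1, 0], 'edge_dst': [0, 0, 1, 1, 2, 2]},
]
disjoint = merge(graphs)
print(propagate(disjoint))  # [[2.0], [1.0], [9.0], [8.0], [7.0]]
```

Because the merged graph has no edges between nodes of different molecules, a single pass over its edge list processes the whole batch without mixing information across graphs.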
Layers
- Convolutional
  - GCNConv
  - GINConv
  - GCNIIConv
  - GraphSageConv
- Attentional
  - GATConv
  - GATv2Conv
  - GTConv
  - GMMConv
  - GatedGCNConv
  - AttentiveFPConv
- Message-passing
- Distance-geometric
- Pre- and post-processing
  - In addition to the aforementioned GNN layers, there are several other layers that help with model building. See readout/, preprocessing/, postprocessing/ and positional_encoding/.
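As an illustration of what a readout step does, the sketch below pools node features into one vector per graph. It is a plain-Python approximation, not the code of MolGraph's readout layers: the function name, the `mode` argument and the `graph_indicator` list (mapping each node to its graph) are assumptions, and every graph is assumed to have at least one node.

```python
def readout(node_feature, graph_indicator, mode='mean'):
    """Pool node features into one feature vector per graph."""
    num_graphs = max(graph_indicator) + 1
    dim = len(node_feature[0])
    sums = [[0.0] * dim for _ in range(num_graphs)]
    counts = [0] * num_graphs
    for feature, g in zip(node_feature, graph_indicator):
        counts[g] += 1
        for k in range(dim):
            sums[g][k] += feature[k]
    if mode == 'sum':
        return sums
    if mode == 'mean':
        return [[v / counts[g] for v in sums[g]] for g in range(num_graphs)]
    raise ValueError(f"unsupported mode: {mode!r}")


# Three nodes: the first two belong to graph 0, the last one to graph 1.
node_feature = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]
graph_indicator = [0, 0, 1]
print(readout(node_feature, graph_indicator, mode='mean'))
print(readout(node_feature, graph_indicator, mode='sum'))
```

The pooled output has a fixed size per graph regardless of the number of atoms, which is what allows ordinary dense layers to follow the graph layers.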
Models
- Although model building is easy with MolGraph, there are some built-in GNN models:
  - DGIN
  - DMPNN
  - MPNN
- And models for improved interpretability of GNNs:
  - SaliencyMapping
  - IntegratedSaliencyMapping
  - SmoothGradSaliencyMapping
  - GradientActivationMapping (Recommended)
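The idea behind gradient-based saliency can be sketched in plain Python: each node is scored by how strongly the model output reacts to small changes in that node's features. The code below is only an illustration under stated assumptions. It approximates gradients with finite differences (a real implementation would typically use automatic differentiation), and `toy_model` is a hypothetical stand-in for a trained GNN, not part of MolGraph.

```python
def saliency(model, node_feature, eps=1e-4):
    """Per-node saliency: sum over features of |d model / d feature|.

    Gradients are approximated with central finite differences, so `model`
    can be any function mapping a list of node feature lists to a float.
    """
    scores = []
    for i, feature in enumerate(node_feature):
        score = 0.0
        for k in range(len(feature)):
            plus = [list(f) for f in node_feature]
            minus = [list(f) for f in node_feature]
            plus[i][k] += eps
            minus[i][k] -= eps
            score += abs(model(plus) - model(minus)) / (2 * eps)
        scores.append(score)
    return scores


def toy_model(node_feature):
    """Hypothetical model: per-node linear map, ReLU, then sum over nodes."""
    weights = [0.5, -2.0]
    total = 0.0
    for feature in node_feature:
        pre_activation = sum(w * x for w, x in zip(weights, feature))
        total += max(0.0, pre_activation)
    return total


# The first node activates the ReLU, the second one does not.
scores = saliency(toy_model, [[1.0, 0.0], [0.0, 1.0]])
print(scores)
```

In this toy case only the first node contributes to the output, so it receives a non-zero score while the second node's score is zero.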
Changelog
For a detailed list of changes, see the CHANGELOG.md.
Important notes
- Since version 0.5.0, the default normalization for the GNN layers is layer normalization. This significantly improved performance on some of the MoleculeNet datasets.
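For readers unfamiliar with layer normalization, here is a minimal plain-Python sketch of the operation applied to a single node's feature vector. It is not MolGraph's code: `gamma` and `beta` are scalars here for brevity, whereas in practice they are usually learned per-feature vectors, and `eps` is an assumed small constant for numerical stability.

```python
import math


def layer_norm(features, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature vector to (roughly) zero mean and unit variance."""
    mean = sum(features) / len(features)
    var = sum((x - mean) ** 2 for x in features) / len(features)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in features]


normalized = layer_norm([1.0, 2.0, 3.0])
print(normalized)
```

Unlike batch normalization, the statistics are computed per node over its features, so the result does not depend on which other graphs happen to be in the batch.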
Installation
Install via pip:
pip install molgraph
Install via docker:
git clone https://github.com/akensert/molgraph.git
cd molgraph/docker
docker build -t molgraph-tf[-gpu][-jupyter]/molgraph:0.0 molgraph-tf[-gpu][-jupyter]/
docker run -it [-p 8888:8888] molgraph-tf[-gpu][-jupyter]/molgraph:0.0
Now run your first program with MolGraph:
from tensorflow import keras
from molgraph import chemistry
from molgraph import layers
from molgraph import models

# Obtain dataset, specifically ESOL
esol = chemistry.datasets.get('esol')

# Define molecular graph encoder
atom_encoder = chemistry.Featurizer([
    chemistry.features.Symbol(),
    chemistry.features.Hybridization(),
    # ...
])
bond_encoder = chemistry.Featurizer([
    chemistry.features.BondType(),
    # ...
])
encoder = chemistry.MolecularGraphEncoder(atom_encoder, bond_encoder)

# Obtain features and associated labels
x_train = encoder(esol['train']['x'])
y_train = esol['train']['y']
x_test = encoder(esol['test']['x'])
y_test = esol['test']['y']

# Build model via Keras API
gnn_model = keras.Sequential([
    keras.layers.Input(type_spec=x_train.spec),
    layers.GATConv(name='gat_conv_1'),
    layers.GATConv(name='gat_conv_2'),
    layers.Readout(),
    keras.layers.Dense(units=1024, activation='relu'),
    keras.layers.Dense(units=y_train.shape[-1])
])

# Compile, fit and evaluate
gnn_model.compile(optimizer='adam', loss='mae')
gnn_model.fit(x_train, y_train, epochs=50)
scores = gnn_model.evaluate(x_test, y_test)

# Compute gradient activation maps for the two named GATConv layers
gam_model = models.GradientActivationMapping(
    model=gnn_model, layer_names=['gat_conv_1', 'gat_conv_2'])
maps = gam_model(x_train)
Requirements/dependencies
- Python (version >= 3.6 recommended)
- TensorFlow (version >= 2.7.0 recommended)
- RDKit (version >= 2022.3.3 recommended)
- NumPy (version >= 1.21.2 recommended)
- Pandas (version >= 1.0.3 recommended)
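To see which of these dependencies are present in the current environment, a small check along the following lines may help. It only reports installed versions next to the recommended minimums and does not compare them. The distribution names are assumptions (RDKit in particular may be installed under a different name), and `importlib.metadata` requires Python 3.8 or later.

```python
from importlib import metadata

# Recommended minimum versions from the list above. The distribution names
# are assumptions; adjust them to match how the packages were installed.
RECOMMENDED = {
    'tensorflow': '2.7.0',
    'rdkit': '2022.3.3',
    'numpy': '1.21.2',
    'pandas': '1.0.3',
}


def installed_versions(names):
    """Map each distribution name to its installed version, or None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(RECOMMENDED)
for name, minimum in RECOMMENDED.items():
    print(f"{name}: installed={report[name]}, recommended>={minimum}")
```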
Tested with
- Ubuntu 20.04 - Python 3.8.10
- macOS Monterey (12.3.1) - Python 3.10.3