Implementations of graph neural networks for molecular machine learning

Project description

MolGraph: Graph Neural Networks for Molecular Machine Learning

This is an early release; things are still being updated, added and experimented with. Hence, API compatibility may break in the future.

Any feedback is welcome!

Manuscript

See pre-print

Documentation

See readthedocs

Implementations

  • Graph tensor (GraphTensor)

    • A composite tensor holding graph data.
    • Has a ragged state (multiple graphs) and a non-ragged state (a single disjoint graph)
    • Can conveniently go between both states (merge() and separate())
    • Can propagate node information (features) based on edges (propagate())
    • Can add, update and remove graph data (update(), remove())
    • Has an associated GraphTensorSpec, which makes it compatible with the Keras and TensorFlow APIs.
      • This includes the keras.Sequential, Keras functional, tf.data.Dataset and tf.saved_model APIs.
  • Layers

  • Models

    • Although model building is easy with MolGraph, there are some built-in GNN models:
      • DGIN
      • DMPNN
      • MPNN
    • And models for improved interpretability of GNNs:
      • SaliencyMapping
      • IntegratedSaliencyMapping
      • SmoothGradSaliencyMapping
      • GradientActivationMapping (Recommended)
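The two GraphTensor states listed above can be illustrated without MolGraph. The following is a minimal pure-Python sketch, not MolGraph's implementation: the dictionary keys (node_feature, edge_src, edge_dst, graph_indicator) and the standalone functions merge, separate and propagate are assumptions chosen for illustration, and sum aggregation from source to destination nodes is only one possible way to propagate.

```python
def merge(graphs):
    """Combine a list of graphs into a single disjoint graph."""
    node_feature, edge_src, edge_dst, graph_indicator = [], [], [], []
    offset = 0
    for graph_index, graph in enumerate(graphs):
        num_nodes = len(graph["node_feature"])
        node_feature.extend(graph["node_feature"])
        # Shift node indices so that edges still point at the right nodes.
        edge_src.extend(s + offset for s in graph["edge_src"])
        edge_dst.extend(d + offset for d in graph["edge_dst"])
        # Remember which graph each node came from.
        graph_indicator.extend([graph_index] * num_nodes)
        offset += num_nodes
    return {
        "node_feature": node_feature,
        "edge_src": edge_src,
        "edge_dst": edge_dst,
        "graph_indicator": graph_indicator,
    }


def separate(disjoint):
    """Split a disjoint graph back into a list of graphs."""
    indicator = disjoint["graph_indicator"]
    num_graphs = max(indicator, default=-1) + 1
    graphs = [
        {"node_feature": [], "edge_src": [], "edge_dst": []}
        for _ in range(num_graphs)
    ]
    offsets = [None] * num_graphs  # first node index of each graph
    for node_index, g in enumerate(indicator):
        if offsets[g] is None:
            offsets[g] = node_index
        graphs[g]["node_feature"].append(disjoint["node_feature"][node_index])
    for s, d in zip(disjoint["edge_src"], disjoint["edge_dst"]):
        g = indicator[s]
        graphs[g]["edge_src"].append(s - offsets[g])
        graphs[g]["edge_dst"].append(d - offsets[g])
    return graphs


def propagate(disjoint):
    """Sum the features of source nodes at each destination node."""
    dim = len(disjoint["node_feature"][0])
    out = [[0.0] * dim for _ in disjoint["node_feature"]]
    for s, d in zip(disjoint["edge_src"], disjoint["edge_dst"]):
        for k in range(dim):
            out[d][k] += disjoint["node_feature"][s][k]
    return out


# Two small graphs: a 2-node graph and a 3-node chain, edges in both directions.
graphs = [
    {"node_feature": [[1.0], [2.0]], "edge_src": [0, 1], "edge_dst": [1, 0]},
    {"node_feature": [[3.0], [4.0], [5.0]],
     "edge_src": [0, 1, 1, 2], "edge_dst": [1, 0, 2, 1]},
]
disjoint = merge(graphs)
```

Working on the disjoint form means one pass over a flat edge list handles every graph in the batch, while graph_indicator keeps enough information to recover the individual graphs.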

Changelog

For a detailed list of changes, see the CHANGELOG.md.

Important notes

  • Since version 0.5.0, the default normalization for the GNN layers is layer normalization. This significantly improved performance on some of the MoleculeNet datasets.
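For readers unfamiliar with the term, layer normalization rescales each feature vector using its own mean and variance. The sketch below shows only that core computation in plain Python; it omits the learnable scale and offset that a real layer would have, the function name layer_norm is hypothetical, and how MolGraph applies the operation inside its layers is not shown here.

```python
import math


def layer_norm(features, epsilon=1e-3):
    """Normalize one feature vector to roughly zero mean and unit variance."""
    mean = sum(features) / len(features)
    variance = sum((x - mean) ** 2 for x in features) / len(features)
    # epsilon guards against division by zero for constant vectors.
    return [(x - mean) / math.sqrt(variance + epsilon) for x in features]


normalized = layer_norm([1.0, 2.0, 3.0, 4.0])
```

Because the statistics are computed per vector, the result does not depend on how many nodes or graphs are in the batch.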

Installation

Install via pip:

pip install molgraph

Install via docker:

git clone https://github.com/akensert/molgraph.git
cd molgraph/docker
docker build -t molgraph-tf[-gpu][-jupyter]/molgraph:0.0 molgraph-tf[-gpu][-jupyter]/
docker run -it [-p 8888:8888] molgraph-tf[-gpu][-jupyter]/molgraph:0.0

Now run your first program with MolGraph:

from tensorflow import keras
from molgraph import chemistry
from molgraph import layers
from molgraph import models

# Obtain dataset, specifically ESOL
esol = chemistry.datasets.get('esol')

# Define molecular graph encoder
atom_encoder = chemistry.Featurizer([
    chemistry.features.Symbol(),
    chemistry.features.Hybridization(),
    # ...
])

bond_encoder = chemistry.Featurizer([
    chemistry.features.BondType(),
    # ...
])

encoder = chemistry.MolecularGraphEncoder(atom_encoder, bond_encoder)

# Obtain features and associated labels
x_train = encoder(esol['train']['x'])
y_train = esol['train']['y']

x_test = encoder(esol['test']['x'])
y_test = esol['test']['y']

# Build model via Keras API
gnn_model = keras.Sequential([
    keras.layers.Input(type_spec=x_train.spec),
    layers.GATConv(name='gat_conv_1'),
    layers.GATConv(name='gat_conv_2'),
    layers.Readout(),
    keras.layers.Dense(units=1024, activation='relu'),
    keras.layers.Dense(units=y_train.shape[-1])
])

# Compile, fit and evaluate
gnn_model.compile(optimizer='adam', loss='mae')
gnn_model.fit(x_train, y_train, epochs=50)
scores = gnn_model.evaluate(x_test, y_test)

# Compute gradient activation maps
gam_model = models.GradientActivationMapping(
    model=gnn_model, layer_names=['gat_conv_1', 'gat_conv_2'])

maps = gam_model(x_train)
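In the model above, the Readout layer sits between the node-level GATConv layers and the graph-level Dense layers. Conceptually, a readout pools the node features of each graph into one vector per graph. The following is a rough sketch of that idea in plain Python, not MolGraph's implementation: the function name readout, its mode argument and the graph_indicator list are assumptions made for illustration.

```python
def readout(node_feature, graph_indicator, mode="mean"):
    """Pool node features into one feature vector per graph."""
    num_graphs = max(graph_indicator) + 1
    dim = len(node_feature[0])
    totals = [[0.0] * dim for _ in range(num_graphs)]
    counts = [0] * num_graphs
    for feature, g in zip(node_feature, graph_indicator):
        counts[g] += 1
        for k, value in enumerate(feature):
            totals[g][k] += value
    if mode == "sum":
        return totals
    if mode == "mean":
        # max(c, 1) avoids dividing by zero for a graph without nodes.
        return [[v / max(c, 1) for v in row] for row, c in zip(totals, counts)]
    raise ValueError(f"unknown mode: {mode!r}")


# Three nodes: the first two belong to graph 0, the last to graph 1.
node_feature = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
graph_indicator = [0, 0, 1]
pooled = readout(node_feature, graph_indicator, mode="mean")
```

The output has one row per molecule regardless of how many atoms each molecule has, which is what lets ordinary Dense layers follow.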

Requirements/dependencies

  • Python (version >= 3.6 recommended)
  • TensorFlow (version >= 2.7.0 recommended)
  • RDKit (version >= 2022.3.3 recommended)
  • NumPy (version >= 1.21.2 recommended)
  • Pandas (version >= 1.0.3 recommended)
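To compare an environment against the list above, the installed versions can be read with the standard library. This is a small sketch that assumes Python 3.8 or later (for importlib.metadata); the helper name installed_versions is hypothetical, and the distribution names passed in, especially "rdkit", are assumptions that may differ depending on how each package was installed.

```python
from importlib import metadata


def installed_versions(distribution_names):
    """Return a dict mapping each distribution name to its version, or None."""
    versions = {}
    for name in distribution_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed under this name
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to your environment.
    names = ["tensorflow", "rdkit", "numpy", "pandas", "molgraph"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not found'}")
```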

Tested with

  • Ubuntu 20.04 - Python 3.8.10
  • macOS Monterey (12.3.1) - Python 3.10.3

Download files

Source Distribution

molgraph-0.5.2.tar.gz (110.4 kB)

Uploaded Source

Built Distribution

molgraph-0.5.2-py3-none-any.whl (190.1 kB)

Uploaded Python 3

File details

Details for the file molgraph-0.5.2.tar.gz.

File metadata

  • Download URL: molgraph-0.5.2.tar.gz
  • Upload date:
  • Size: 110.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.6

File hashes

Hashes for molgraph-0.5.2.tar.gz
Algorithm    Hash digest
SHA256       5af98d31e80b59c7497889c858b46e3446b006afb595321c49cad52733b42986
MD5          27c6bb973a128447f27343ece53592a7
BLAKE2b-256  94d31aea43374b2674a064b7f66594b58697d626602a2ce5335793d4f821417d

File details

Details for the file molgraph-0.5.2-py3-none-any.whl.

File metadata

  • Download URL: molgraph-0.5.2-py3-none-any.whl
  • Upload date:
  • Size: 190.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.6

File hashes

Hashes for molgraph-0.5.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       423651af4431334cde8d24aaadd752b5d242ae6dcc9a36a9154e2c4175227785
MD5          d11d566f464cba20cca781c8c1c90302
BLAKE2b-256  45c99daebbf71bd8972b1bbde73088ec0f6495bc5065c982c5e4cfd7735049d2
