Graphium: Scaling molecular GNNs to infinity.
Scaling molecular GNNs to infinity
A deep learning library focused on graph representation learning for real-world chemical tasks.
- ✅ State-of-the-art GNN architectures.
- 🐍 Extensible API: build your own GNN model and train it with ease.
- ⚗️ Rich featurization: powerful and flexible built-in molecular featurization.
- 🧠 Pretrained models: for fast and easy inference or transfer learning.
- ⮔ Ready-to-use training loop based on PyTorch Lightning.
- 🔌 Have a new dataset? Graphium provides a simple plug-and-play interface. Change the path, the name of the columns to predict, the atomic featurization, and you’re ready to play!
Documentation
Visit https://graphium-docs.datamol.io/.
You can try running Graphium on Graphcore IPUs for free on Gradient.
Installation for developers
For CPU and GPU developers
Use mamba:
# Install Graphium's dependencies in a new environment named `graphium`
mamba env create -f env.yml -n graphium
# Install Graphium in dev mode
mamba activate graphium
pip install --no-deps -e .
For IPU developers
mkdir ~/.venv # Create the folder for the environment
python3 -m venv ~/.venv/graphium_ipu # Create the environment
source ~/.venv/graphium_ipu/bin/activate # Activate the environment
# Install the PopTorch wheel
# Make sure this is the 3.3 SDK
# Change the link according to your operating system and the `PATH_TO_SDK`
pip install PATH_TO_SDK/poptorch-3.3.0+113432_960e9c294b_ubuntu_20_04-cp38-cp38-linux_x86_64.whl
# Enable Poplar SDK (including Poplar and PopART)
source PATH_TO_SDK/enable
# Install the IPU specific and graphium requirements
pip install -r requirements_ipu.txt
# Install Graphium in dev mode
pip install --no-deps -e .
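The PopTorch wheel named above carries the `cp38` tag, which means it was built for CPython 3.8, so the virtual environment needs a matching interpreter. The following is a minimal sketch of that check, assuming the standard wheel naming scheme; the helper names `wheel_python_tag` and `interpreter_tag` are illustrative and not part of Graphium or PopTorch.

```python
import sys

# Wheel filename copied from the install command above.
WHEEL = "poptorch-3.3.0+113432_960e9c294b_ubuntu_20_04-cp38-cp38-linux_x86_64.whl"


def wheel_python_tag(filename: str) -> str:
    """Return the Python tag of a wheel filename.

    Wheel names end in {python tag}-{abi tag}-{platform tag}.whl,
    so the Python tag is the third field from the end.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 fields when a build tag is present
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    return parts[-3]


def interpreter_tag() -> str:
    """Tag of the running CPython interpreter, e.g. 'cp38' for Python 3.8."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


wanted = wheel_python_tag(WHEEL)
found = interpreter_tag()
if wanted == found:
    print(f"Interpreter {found} matches the wheel.")
else:
    print(f"Wheel expects {wanted} but this interpreter is {found}.")
```

Running this inside the activated environment before `pip install` shows whether the interpreter and the wheel agree.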
If you are new to Graphcore IPUs, you can find more details in the section below: First Time Running on IPUs.
Training a model
To learn how to train a model, we invite you to look at the documentation, or the Jupyter notebooks available here.
If you are not familiar with PyTorch or PyTorch-Lightning, we highly recommend going through their tutorials first.
Running an experiment
We have set up Graphium with hydra for managing config files. To run an experiment, go to the expts/ folder. For example, to benchmark a GCN on the ToyMix dataset, run
graphium-train dataset=toymix model=gcn
To change parameters specific to this experiment, like switching from fp16 to fp32 precision, you can either override them directly in the CLI via
graphium-train dataset=toymix model=gcn trainer.trainer.precision=32
or change them permanently in the dedicated experiment config under expts/hydra-configs/toymix_gcn.yaml.
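To illustrate what a dotted override such as `trainer.trainer.precision=32` does, here is a minimal pure-Python sketch that applies one `key=value` override to a nested dictionary. It is not hydra's implementation, which also handles typing, defaults lists, and interpolation. The `apply_override` helper and the config fragment are hypothetical; the real keys live in the YAML files under expts/.

```python
import ast
import copy
from typing import Any, Dict


def apply_override(config: Dict[str, Any], override: str) -> Dict[str, Any]:
    """Return a copy of `config` with one `a.b.c=value` override applied."""
    key_path, sep, raw_value = override.partition("=")
    if not sep or not key_path:
        raise ValueError(f"expected 'key=value', got {override!r}")
    try:
        value = ast.literal_eval(raw_value)  # "32" -> 32, "0.1" -> 0.1
    except (ValueError, SyntaxError):
        value = raw_value  # plain strings such as "gpu" stay strings

    result = copy.deepcopy(config)  # leave the caller's config untouched
    node = result
    *parents, leaf = key_path.split(".")
    for key in parents:
        node = node.setdefault(key, {})  # create missing levels on the way down
    node[leaf] = value
    return result


# Hypothetical config fragment, assumed only for this illustration.
config = {"trainer": {"trainer": {"precision": 16}}}
updated = apply_override(config, "trainer.trainer.precision=32")
print(updated)  # {'trainer': {'trainer': {'precision': 32}}}
```

Each dot in the key descends one level in the config tree, which is why the override names the `trainer` group twice.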
Integrating hydra also allows you to quickly switch between accelerators. E.g., running
graphium-train dataset=toymix model=gcn accelerator=gpu
automatically selects the correct configs to run the experiment on GPU. Finally, you can also run a fine-tuning loop:
graphium-train +finetuning=admet
To use a config file you built from scratch you can run
graphium-train --config-path [PATH] --config-name [CONFIG]
Thanks to the modular nature of hydra, you can reuse many of our config settings for your own experiments with Graphium.
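When launching several experiments from a script, it can help to assemble the `graphium-train` command line programmatically. The sketch below only builds and prints the command using the overrides shown above; `build_train_command` is a hypothetical helper, not part of Graphium.

```python
import shlex
from typing import Dict, List


def build_train_command(overrides: Dict[str, str]) -> List[str]:
    """Build an argv list for `graphium-train` from key=value overrides."""
    return ["graphium-train"] + [f"{key}={value}" for key, value in overrides.items()]


cmd = build_train_command({"dataset": "toymix", "model": "gcn", "accelerator": "gpu"})
print(shlex.join(cmd))  # graphium-train dataset=toymix model=gcn accelerator=gpu

# To actually run it (from the expts/ folder, with Graphium installed), one could use:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting issues if an override value ever contains spaces.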
First Time Running on IPUs
This section gives new IPU developers a more detailed walkthrough of setting up an environment to use Graphcore IPUs with Graphium.
# Set up a virtual environment as normal
mkdir ~/.venv # Create the folder for the environment
python3 -m venv ~/.venv/graphium_ipu # Create the environment
source ~/.venv/graphium_ipu/bin/activate # Activate the environment
python3 -m pip install --upgrade pip
# We can download the Poplar SDK directly using `wget` - more details on the various Graphcore downloads can be found here `https://www.graphcore.ai/downloads`
# NOTE: For simplicity this will download the SDK directly where you run this command, we recommend doing this outside the Graphium directory.
# Make sure to download the right file according to your operating system
wget -q -O 'poplar_sdk-ubuntu_20_04-3.3.0-208993bbb7.tar.gz' 'https://downloads.graphcore.ai/direct?package=poplar-poplar_sdk_ubuntu_20_04_3.3.0_208993bbb7-3.3.0&file=poplar_sdk-ubuntu_20_04-3.3.0-208993bbb7.tar.gz'
# Unzip the SDK file
tar -xzf poplar_sdk-ubuntu_20_04-3.3.0-208993bbb7.tar.gz
# Then use pip to install the wheel
python3 -m pip install poplar_sdk-ubuntu_20_04-3.3.0+1403-208993bbb7/poptorch-3.3.0+113432_960e9c294b_ubuntu_20_04-cp38-cp38-linux_x86_64.whl
# Enable Poplar SDK (including Poplar and PopART)
source poplar_sdk-ubuntu_20_04-3.3.0+1403-208993bbb7/enable
# Then as a quick test make sure poptorch is correctly installed
# If it is not, this command will fail with an import error.
python3 -c "import poptorch;print('poptorch installed correctly')"
# Install the IPU specific and graphium requirements
pip install -r requirements_ipu.txt
# Install Graphium in dev mode
python -m pip install --no-deps -e .
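As a complement to the quick import test above, the following sketch reports whether the expected packages can be located in the active environment without importing them, so it does not raise when one is missing. The `is_installed` helper is illustrative. Note that finding the package does not prove the Poplar SDK has been enabled; the real `import poptorch` can still fail in that case.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report on the two packages the steps above are meant to install.
for name in ("poptorch", "graphium"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```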
License
Under the Apache-2.0 license. See LICENSE.
Documentation
- Diagram for data processing in molGPS.
- Diagram for multi-task network in molGPS.
File details
Details for the file graphium-2.3.0.tar.gz.
File metadata
- Download URL: graphium-2.3.0.tar.gz
- Upload date:
- Size: 4.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 3d3249bdef55c44cd0cca8d96a3f5f6be25b951f9e4af3a474b7b839ba426c8a
MD5 | 86bb908d55592930a9bdc60c107a9216
BLAKE2b-256 | 6d1b1d8da7ec72f32f655161a8b95a1c4d0ae35999d43250dd557b36731c52cc
File details
Details for the file graphium-2.3.0-py3-none-any.whl.
File metadata
- Download URL: graphium-2.3.0-py3-none-any.whl
- Upload date:
- Size: 1.0 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 533ddf2a054684d4defc471c487a8789e989253af202c432e6889f1675f29ec5
MD5 | 3c7073a936f508ef9803d02486722d64
BLAKE2b-256 | a6176f964e073bead232aec9fb4d572ef9e0f573d0b333e3015b9020b5ee9b72
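A downloaded file can be checked against the SHA256 digests listed above. The sketch below uses only the standard library and assumes the files were saved under their original names in the current directory; the helper names `sha256_of` and `verify` are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file details above.
EXPECTED_SHA256 = {
    "graphium-2.3.0.tar.gz": "3d3249bdef55c44cd0cca8d96a3f5f6be25b951f9e4af3a474b7b839ba426c8a",
    "graphium-2.3.0-py3-none-any.whl": "533ddf2a054684d4defc471c487a8789e989253af202c432e6889f1675f29ec5",
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path) -> bool:
    """Compare a downloaded file against the digest listed for its name."""
    expected = EXPECTED_SHA256.get(path.name)
    return expected is not None and sha256_of(path) == expected


# Check whichever of the two files are present in the current directory.
for name in EXPECTED_SHA256:
    candidate = Path(name)
    if candidate.exists():
        print(name, "OK" if verify(candidate) else "MISMATCH")
```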