
Molecular property prediction based on Graph Convolution Network published by Deep4Chem


D4C molecular property prediction


Introduction

This project is a deep learning application for predicting molecular properties. The implemented models include conventional graph convolutional networks as well as interpretable and hierarchical architectures. D4CMPP is designed to be user-friendly: a model can be trained with a single function call or a single command.


Installation

$ pip install D4CMPP

It is recommended to install dgl and torch first with the appropriate versions for your system.
Note that this package was developed with dgl==2.3.0 and torch==2.1.2.
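
Before installing, it can help to see which versions of dgl and torch are already present. The following is a minimal sketch using only the Python standard library; the helper names `installed_versions` and `report` are illustrative and not part of D4CMPP.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions this README says the package was developed with.
DEVELOPED_WITH = {"dgl": "2.3.0", "torch": "2.1.2"}


def installed_versions(packages=DEVELOPED_WITH):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def report(found, expected=DEVELOPED_WITH):
    """Build one human-readable line per package."""
    lines = []
    for name, want in expected.items():
        have = found.get(name)
        if have is None:
            status = "not installed"
        elif have.split("+")[0] == want:
            # Ignore local build tags such as "+cu121" when comparing.
            status = "matches"
        else:
            status = "differs"
        lines.append(f"{name}: installed={have}, developed with={want} ({status})")
    return lines


print("\n".join(report(installed_versions())))
```

A "differs" result is not necessarily a problem; the README only states which versions were used during development.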


How to start training

Using module

  1. Place the CSV file in your working directory
    • The SMILES of molecules should be in the "compound" column.
    • The SMILES of corresponding solvent should be in the "solvent" column. (optional)
    • At least one additional column must hold a molecular property for each molecule.
    • Some built-in datasets are available, such as 'Aqsoldb' and 'lipophilicity'. Use the 'test' dataset for a trial run.
  2. Import the train module of the D4CMPP package
> from D4CMPP import train
  3. Check and choose the ID of the deep learning model in 'network_refer.yaml'. Alternatively, pass any invalid ID as the 'network' argument to print the list of implemented network IDs.
    • Note that the networks incorporating the effect of solvents have the postfix 'wS', which indicates 'with solvent' (e.g. GCNwS).
> train(network="invalidID", data="test")
  4. Pass the network ID as the 'network' argument, the CSV file name as the 'data' argument, and the name(s) of the column(s) holding the target property as the 'target' argument.
> train(network="GCN", data="test", target=["Abs","Emi"])
  5. The graphs are then generated, training starts, and the training results are saved under the './_Models/' directory.
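
The CSV layout described in step 1 can be prepared and sanity-checked before calling `train`. The sketch below uses only the standard library. The column names "compound" and "solvent" come from the steps above, and "Abs" and "Emi" are the targets from the example call. The helper names, the SMILES rows, and the numeric values are placeholders for illustration, not real measurements and not part of D4CMPP.

```python
import csv
from pathlib import Path

# Placeholder rows: the property values are made up for illustration only.
ROWS = [
    {"compound": "c1ccccc1", "solvent": "CCO", "Abs": 1.0, "Emi": 2.0},
    {"compound": "CCO", "solvent": "O", "Abs": 3.0, "Emi": 4.0},
]


def write_dataset(path, rows):
    """Write rows to a CSV file with a header line; return the column names."""
    fieldnames = list(rows[0].keys())
    with Path(path).open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return fieldnames


def check_dataset(path, targets):
    """Return a list of problems found in the CSV (an empty list means none)."""
    problems = []
    with Path(path).open(newline="") as handle:
        reader = csv.DictReader(handle)
        columns = reader.fieldnames or []
        for name in ["compound", *targets]:
            if name not in columns:
                problems.append(f'missing column "{name}"')
        if problems:
            return problems
        # The header is line 1, so data rows start at line 2.
        for line_number, row in enumerate(reader, start=2):
            if not (row["compound"] or "").strip():
                problems.append(f"line {line_number}: empty SMILES")
            if not any((row[t] or "").strip() for t in targets):
                problems.append(f"line {line_number}: no target value")
    return problems
```

For example, `check_dataset("my_dataset.csv", ["Abs", "Emi"])` returns an empty list when the "compound" column and both target columns are present and filled. The check only looks at the file layout; it does not validate the SMILES strings themselves.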

Using source code

You can also run training directly from the source code. The command below is equivalent to the example above.

$ python main.py -n GCN -d test -t Abs,Emi

Continue training

You can load the saved model and continue training by

> train(LOAD_PATH="GCN_model_test_Abs,Emi_20240101_000000")

or

$ python main.py -l GCN_model_test_Abs,Emi_20240101_000000
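
Saved runs in the examples above end with a timestamp such as `20240101_000000`. Assuming every run directory follows that `..._YYYYMMDD_HHMMSS` pattern (an inference from the example name, not a documented guarantee), the most recent run can be looked up as sketched below. The helper `latest_run` is hypothetical and not part of D4CMPP.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumed suffix of a run directory name: _YYYYMMDD_HHMMSS
STAMP = re.compile(r"_(\d{8}_\d{6})$")


def latest_run(models_dir="_Models", prefix=""):
    """Return the name of the newest run directory, or None if there is none."""
    root = Path(models_dir)
    if not root.is_dir():
        return None
    best_name, best_time = None, None
    for entry in root.iterdir():
        match = STAMP.search(entry.name)
        if match is None or not entry.is_dir():
            continue
        if not entry.name.startswith(prefix):
            continue
        try:
            stamp = datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
        except ValueError:
            # Digits matched the pattern but do not form a valid date.
            continue
        if best_time is None or stamp > best_time:
            best_name, best_time = entry.name, stamp
    return best_name
```

The returned name could then be passed as `LOAD_PATH`, for example `train(LOAD_PATH=latest_run(prefix="GCN"))`, after checking that the result is not `None`.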

Transfer learning

You can try transfer learning from the pretrained model by

> train(TRANSFER_PATH="GCN_model_test_Abs,Emi_20240101_000000", data="Aqsoldb", target=["Solubility"])

or

$ python main.py --transfer GCN_model_test_Abs,Emi_20240101_000000 -d Aqsoldb -t Solubility
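
When the command-line form is driven from a script, the argument list can be assembled programmatically. This sketch uses only the flags that appear in this README (`-n`, `-d`, `-t`, `-l`, `--transfer`) and joins multiple targets with commas as in the examples. The helper `build_command` is hypothetical, and it assumes `main.py` is in the current directory.

```python
import sys


def build_command(network=None, data=None, targets=None, load=None, transfer=None):
    """Assemble an argument list for main.py from the flags shown in this README."""
    command = [sys.executable, "main.py"]
    if network:
        command += ["-n", network]
    if load:
        command += ["-l", load]
    if transfer:
        command += ["--transfer", transfer]
    if data:
        command += ["-d", data]
    if targets:
        # Multiple targets are comma-separated, as in "-t Abs,Emi".
        command += ["-t", ",".join(targets)]
    return command


# Mirrors the transfer learning command above.
command = build_command(
    transfer="GCN_model_test_Abs,Emi_20240101_000000",
    data="Aqsoldb",
    targets=["Solubility"],
)
print(" ".join(command))
```

The list can be handed to `subprocess.run(command, check=True)` from the directory that contains `main.py`. Passing a list rather than a single string avoids shell quoting issues with the comma in the model name.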

Analyzer

For additional tasks or analysis, the trained model can be loaded through the Analyzer module. You should import an appropriate analyzer for your trained model. In general, MolAnalyzer supports every model.

from D4CMPP.Analyzer import MolAnalyzer
ma = MolAnalyzer("_Models/GCN_model_test_Abs,Emi_20240101_000000")

Analyzers generally support prediction on external data, for example a single SMILES string:

ma.predict("CCC")
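
To predict several molecules, the single-molecule call above can be wrapped in a loop. The sketch below relies only on the `predict(smiles)` call shown in this README. The helper `predict_many` is hypothetical, and it assumes that `predict` raises an exception for input it cannot handle, which is not confirmed here.

```python
def predict_many(analyzer, smiles_list):
    """Call analyzer.predict once per SMILES string.

    Returns (results, failures): results maps SMILES to whatever predict
    returned, failures maps SMILES to the error message that was raised.
    """
    results, failures = {}, {}
    for smiles in smiles_list:
        try:
            results[smiles] = analyzer.predict(smiles)
        except Exception as error:  # assumption: bad input raises
            failures[smiles] = str(error)
    return results, failures


# Usage with a trained model (requires D4CMPP and a saved model directory):
# from D4CMPP.Analyzer import MolAnalyzer
# ma = MolAnalyzer("<path to a saved model directory>")
# results, failures = predict_many(ma, ["CCC", "CCO"])
```

Collecting failures separately keeps one unparsable molecule from aborting the whole batch.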

Acknowledgements

This project includes code from the GC-GNN by Adem Rosenkvist Nielsen Aouichaoui (arnaou@kt.dtu.dk), licensed under the MIT License.

  • URL: GC-GNN
  • files: AttentiveFP.py, DMPNN.py
  • Description: The source code in these files was adopted from this project.

Additionally, we acknowledge that various other code and workflows in this project are based on or inspired by these projects. While much of the original code was modified, we recognize that the coding style and some workflows were influenced by the corresponding projects.
