
Molecular property prediction based on Graph Convolution Network published by Deep4Chem


D4C molecular property prediction


Introduction

This project is a deep learning application for predicting molecular properties. The implemented models range from conventional graph convolutional networks to interpretable and hierarchical architectures. D4CMPP is designed to be user-friendly: a model can be trained with a single function call or command.


Installation

$ pip install D4CMPP

It is recommended to install dgl and torch first, using the versions appropriate for your system.
This package was developed with dgl==2.3.0 and torch==2.1.2.
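
Before training, it can help to see how the installed versions compare with the ones named above. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `compare_versions` are illustrative and not part of D4CMPP, and a mismatch does not necessarily mean the package will fail.

```python
from importlib import metadata

# Versions the package was developed with, taken from the note above.
DEVELOPED_WITH = {"dgl": "2.3.0", "torch": "2.1.2"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected, lookup=installed_version):
    """Map each package name to a tuple (expected, installed, matches)."""
    report = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # A local build tag such as "2.1.2+cu121" still counts as a match.
        matches = found is not None and found.split("+")[0] == wanted
        report[name] = (wanted, found, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in compare_versions(DEVELOPED_WITH).items():
        status = "ok" if ok else "differs"
        print(f"{name}: developed with {wanted}, installed {found} ({status})")
```

The `lookup` parameter exists only so the comparison can be exercised without dgl or torch being installed.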


How to start training

Using module

  1. Place the CSV file in your working directory.
    • The SMILES of each molecule should be in the "compound" column.
    • The SMILES of the corresponding solvent should be in the "solvent" column. (optional)
    • Each molecule needs at least one molecular property value.
    • Some built-in datasets are included, such as 'Aqsoldb' and 'lipophilicity'. Use the 'test' dataset for a trial run.
  2. Import the train module of the D4CMPP package.
> from D4CMPP import train
  3. Check and choose the ID of the deep learning model in 'network_refer.yaml'. Alternatively, pass any invalid ID as the 'network' argument to print the list of implemented network IDs.
    • Networks that incorporate the effect of the solvent have the suffix 'wS', which stands for 'with solvent' (e.g. GCNwS).
> train(network="invalidID", data="test")
  4. Pass the network ID as the 'network' argument, the CSV file name as the 'data' argument, and the name(s) of the column(s) holding the target property as the 'target' argument.
> train(network="GCN", data="test", target=["Abs","Emi"])
  5. The graphs are then generated, training starts, and the training results are saved under the './_Models/' directory.
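
The CSV layout from step 1 can be produced and checked with the standard library. This is a minimal sketch under stated assumptions: `dataset_csv` and `header_problems` are hypothetical helpers, not part of D4CMPP, and the property values in the demo are placeholders rather than measurements.

```python
import csv
import io


def dataset_csv(rows, targets, with_solvent=False):
    """Render rows (dicts) as CSV text using the column layout from step 1."""
    columns = ["compound"] + (["solvent"] if with_solvent else []) + list(targets)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Missing values are written as empty cells.
        writer.writerow({column: row.get(column, "") for column in columns})
    return buffer.getvalue()


def header_problems(csv_text, targets):
    """Return a list of header problems; an empty list means the layout is fine."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    problems = []
    if "compound" not in header:
        problems.append('missing "compound" column')
    for target in targets:
        if target not in header:
            problems.append(f'missing target column "{target}"')
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    rows = [{"compound": "CCO", "solvent": "O", "Abs": 1.0, "Emi": 2.0}]
    text = dataset_csv(rows, ["Abs", "Emi"], with_solvent=True)
    print(text)
    print(header_problems(text, ["Abs", "Emi"]))
```

Writing the returned text to a file in the working directory gives a dataset whose column names can then be passed as the 'target' argument.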

Using source code

You can also run training directly from the source code, as shown below. This is equivalent to the example above.

$ python main.py -n GCN -d test -t Abs,Emi
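
When launching several runs from a script, the command above can be assembled from the same values that the `train` call takes. The sketch below uses only the `-n`, `-d` and `-t` flags shown in this document; `training_command` is a hypothetical helper, and it assumes that multiple targets are joined with commas, as in the example.

```python
import shlex


def training_command(network, data, targets, script="main.py"):
    """Build the argument list for the command-line form shown above."""
    return ["python", script, "-n", network, "-d", data, "-t", ",".join(targets)]


if __name__ == "__main__":
    args = training_command("GCN", "test", ["Abs", "Emi"])
    # Print the command in a form that can be pasted into a shell.
    print(shlex.join(args))
    # To launch it from Python instead: subprocess.run(args, check=True)
```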

Continue training

You can load a saved model and continue training with

> train(LOAD_PATH="GCN_model_test_Abs,Emi_20240101_000000")

or

$ python main.py -l GCN_model_test_Abs,Emi_20240101_000000
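
To resume the most recent run, its directory name has to be found first. The sketch below assumes that saved runs end in a `YYYYMMDD_HHMMSS` timestamp, as the example name suggests; this pattern is inferred from the example, not from a documented convention, and `run_timestamp` and `latest_run` are hypothetical helpers.

```python
import os
from datetime import datetime


def run_timestamp(run_name):
    """Parse the trailing YYYYMMDD_HHMMSS stamp of a run name, or return None."""
    parts = run_name.rsplit("_", 2)
    if len(parts) != 3:
        return None
    try:
        return datetime.strptime(parts[1] + parts[2], "%Y%m%d%H%M%S")
    except ValueError:
        return None


def latest_run(run_names):
    """Return the name with the newest timestamp, ignoring unparseable names."""
    stamped = [(run_timestamp(name), name) for name in run_names]
    stamped = [(stamp, name) for stamp, name in stamped if stamp is not None]
    return max(stamped)[1] if stamped else None


if __name__ == "__main__":
    # Assumes the default output directory mentioned in this document.
    if os.path.isdir("_Models"):
        print(latest_run(os.listdir("_Models")))
```

The returned name can then be passed as `LOAD_PATH` or after `-l`.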

Transfer learning

You can apply transfer learning from a pretrained model with

> train(TRANSFER_PATH="GCN_model_test_Abs,Emi_20240101_000000", data="Aqsoldb", target=["Solubility"])

or

$ python main.py --transfer GCN_model_test_Abs,Emi_20240101_000000 -d Aqsoldb -t Solubility

Analyzer

For additional tasks or analysis, a trained model can be loaded through the Analyzer module. Import the analyzer appropriate for your trained model; MolAnalyzer generally supports every model.

from D4CMPP.Analyzer import MolAnalyzer
ma = MolAnalyzer("_Models/GCN_model_test_Abs,Emi_20240101_000000")

Analyzers generally support prediction on external data.

ma.predict("CCC")
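
For more than a handful of molecules, it is convenient to loop over the SMILES strings and keep going when one of them fails. The sketch below is a hypothetical wrapper that takes any single-SMILES callable such as `ma.predict`; it assumes nothing about the return value of `predict` and stores whatever it returns. Whether `predict` also accepts a list directly is not covered here.

```python
def predict_many(predict, smiles_list):
    """Call predict on each SMILES; collect results and failures separately."""
    results, failures = {}, {}
    for smiles in smiles_list:
        try:
            results[smiles] = predict(smiles)
        except Exception as error:
            # Record the failure and continue with the remaining molecules.
            failures[smiles] = str(error)
    return results, failures
```

With a loaded analyzer, this could be used as `results, failures = predict_many(ma.predict, ["CCC", "CCO"])`.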

Acknowledgements

This project includes code from GC-GNN by Adem Rosenkvist Nielsen Aouichaoui (arnaou@kt.dtu.dk), licensed under the MIT License.

  • URL: GC-GNN
  • Files: AttentiveFP.py, DMPNN.py
  • Description: The source code of these files was adopted from this project.

We also acknowledge that various other code and workflows in this project are based on or inspired by these projects. Although much of the original code was modified, the coding style and some workflows were influenced by them.
