Activity-cliff awareness (ACA) loss and ACANet
Activity Cliff Awareness
Code repository for the activity-cliff-awareness (ACA) loss and the graph-based ACANet model
About
1) ACALoss
This study proposes the activity-cliff-awareness (ACA) loss for improving molecular activity prediction by deep learning models. The ACA loss enhances both metric learning in the latent space and task learning in the target space during training, making the network aware of the activity-cliff issue. For more details, please refer to the paper titled "Online triplet contrastive learning enables efficient cliff awareness in molecular activity prediction."
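The idea of combining a task loss in the target space with a triplet term in the latent space can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the clsar implementation: the function names `mine_triplets` and `aca_like_loss` are hypothetical, the mining rule (positives closer than `cliff_lower` in label space, negatives at least `cliff_upper` away), the label-derived margin, the L1 embedding distance, and the averaging over all mined triplets are assumptions made for the example.

```python
def mine_triplets(labels, cliff_lower=0.2, cliff_upper=1.0):
    """Return (anchor, positive, negative) index triplets chosen by label gaps.

    Assumed rule: a positive has |y_a - y_p| < cliff_lower,
    a negative has |y_a - y_n| >= cliff_upper.
    """
    triplets = []
    size = len(labels)
    for a in range(size):
        for p in range(size):
            if p == a or abs(labels[a] - labels[p]) >= cliff_lower:
                continue
            for n in range(size):
                if n == a or abs(labels[a] - labels[n]) < cliff_upper:
                    continue
                triplets.append((a, p, n))
    return triplets


def l1_distance(u, v):
    """L1 distance between two embedding vectors (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(u, v))


def aca_like_loss(labels, predictions, embeddings,
                  alpha=0.1, cliff_lower=0.2, cliff_upper=1.0):
    """Return (loss, number of mined triplets) for one batch.

    loss = MAE(labels, predictions) + alpha * mean triplet hinge term.
    """
    mae = sum(abs(y - y_hat) for y, y_hat in zip(labels, predictions)) / len(labels)
    triplets = mine_triplets(labels, cliff_lower, cliff_upper)
    if not triplets:
        return mae, 0
    total = 0.0
    for a, p, n in triplets:
        # Assumed soft margin: the further apart the labels, the larger the margin.
        margin = abs(labels[a] - labels[n]) - abs(labels[a] - labels[p])
        gap = l1_distance(embeddings[a], embeddings[p]) - l1_distance(embeddings[a], embeddings[n])
        total += max(gap + margin, 0.0)
    return mae + alpha * total / len(triplets), len(triplets)


# Toy batch: molecules 0 and 1 have similar activity, molecule 2 differs strongly.
labels = [5.0, 5.1, 7.0]
predictions = [5.0, 5.1, 7.0]
embeddings = [[0.0, 0.0], [1.0, 0.0], [0.5, 0.0]]
loss, n_triplets = aca_like_loss(labels, predictions, embeddings)
print(loss, n_triplets)
```

In this toy batch the predictions are perfect, so the MAE term is zero and the remaining loss comes only from the latent space, where the dissimilar molecule sits closer to each anchor than the similar one does.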
2) ACANet
ACANet is a deep learning model built on the proposed ACALoss and a graph neural network. It tunes the hyperparameters of ACALoss automatically and provides a high-level interface for training and testing, so users can work with it much as they would with scikit-learn.
Model performance with and without AC-awareness
Figure: ACA loss vs. MAE loss, compared on the external test set and on the number of mined triplets during training.
More details on usage and performance can be found here.
ACA loss implementation
ACA loss usage
# PyTorch
from clsar.model.loss import ACALoss
aca_loss = ACALoss(alpha=0.1, cliff_lower=0.2, cliff_upper=1.0, p=1.0, squared=False)
# labels and predictions live in the target space; embeddings are the latent vectors
loss = aca_loss(labels, predictions, embeddings)
loss.backward()
# TensorFlow
from clsar.model.loss_tf import ACALoss
Installation
pip install clsar
Run ACANet
from clsar import ACANet
# Xs_train: SMILES strings of the training set (1D array)
# y_train_pIC50: pChEMBL labels of the training set (1D array)
## init ACANet
clf = ACANet(gpuid = 0, work_dir = './')
## estimate the loss hyperparameters (cliff_lower, cliff_upper, and alpha) from the training set
dfp = clf.opt_cliff_by_cv(Xs_train, y_train_pIC50, total_epochs=50, n_repeats=3)
dfa = clf.opt_alpha_by_cv(Xs_train, y_train_pIC50, total_epochs=100, n_repeats=3)
## fit the model using 5-fold cross-validation
clf.cv_fit(Xs_train, y_train_pIC50, verbose=1)
## predict with the 5 sub-models; the output is the average of their predictions
## Xs_test: SMILES strings of the test set (1D array)
test_pred_pIC50 = clf.cv_predict(Xs_test)
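The averaging over sub-models mentioned in the comments above, followed by a simple error check against held-out labels, can be sketched as follows. This is an illustration only: `average_predictions`, `mae`, `rmse`, and all numbers are hypothetical and do not come from clsar or from the paper. It assumes each sub-model returns one prediction per test molecule.

```python
from math import sqrt


def average_predictions(submodel_preds):
    """Average per-molecule predictions across k sub-models.

    submodel_preds: list of k lists, each holding one prediction per molecule.
    """
    k = len(submodel_preds)
    return [sum(column) / k for column in zip(*submodel_preds)]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Made-up pIC50 predictions from 5 sub-models for 3 test molecules.
fold_preds = [
    [6.0, 7.2, 5.1],
    [6.2, 7.0, 5.3],
    [5.8, 7.4, 4.9],
    [6.1, 7.1, 5.0],
    [5.9, 7.3, 5.2],
]
ensemble = average_predictions(fold_preds)

# Made-up ground-truth labels for the same molecules.
y_test = [6.0, 7.0, 5.0]
print(ensemble, mae(y_test, ensemble), rmse(y_test, ensemble))
```

With real data, `test_pred_pIC50` would take the place of `ensemble` and the measured test labels the place of `y_test`.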
Citation
Shen W, Cui C, Su X, Zhang Z, Velez-Arce A, Wang J, et al. Activity Cliff-Informed Contrastive Learning for Molecular Property Prediction. ChemRxiv. 2024; doi:10.26434/chemrxiv-2023-5cz7s-v2.