Neural networks for MolMap generated features

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Development Status
- 3 - Alpha
Intended Audience
- Developers
License
- OSI Approved :: Apache Software License
Natural Language
- English
Programming Language

Project description

molmapnets

Neural networks for regression and classification for molecular data, using MolMap generated features.

#all_slow

This package implements the neural network architects originally presented in the MolMap package, with two important differences:

The package is written using literate programming so all functionalities are written and tested in Jupyter notebooks, and the implementation, testing, and documentation are done together at the same time. You can find the documentation on the package website.
The models are implemented in PyTorch.

Install

First you need to install MolMap and ChemBench (you can find the detailed installation guide here), then simply

pip install molmapnets

How to use the package

We need ChemBench for the datasets, MolMap for feature extraction, and finally molmapnets for the neural networks.

from chembench import dataset
from molmap import MolMap

RDKit WARNING: [13:50:43] Enabling RDKit 2019.09.3 jupyter extensions

from molmapnets.data import SingleFeatureData, DoubleFeatureData
from molmapnets.models import MolMapRegression

And for model training we also need torch

import torch
from torch import nn, optim
from torch.utils.data import Dataset, DataLoader, random_split
torch.set_default_dtype(torch.float64)

Load and process data, using the eSOL dataset here for regression

data = dataset.load_ESOL()

total samples: 1128

descriptor = MolMap(ftype='descriptor', metric='cosine',)

descriptor.fit(verbose=0, method='umap', min_dist=0.1, n_neighbors=15,)

2021-07-23 13:50:53,798 - INFO - [bidd-molmap] - Applying grid feature map(assignment), this may take several minutes(1~30 min)
2021-07-23 13:50:56,904 - INFO - [bidd-molmap] - Finished

feature extraction

X = descriptor.batch_transform(data.x)

100%|##########| 1128/1128 [06:08<00:00,  2.78it/s]

Prepare data for model training

esol = SingleFeatureData(data.y, X)

Train, validation, and test split

train, val, test = random_split(esol, [904,112,112], generator=torch.Generator().manual_seed(7))

Batch data loader

train_loader = DataLoader(train, batch_size=8, shuffle=True)
val_loader = DataLoader(val, batch_size=8, shuffle=True)
test_loader = DataLoader(test, batch_size=8, shuffle=True)

Initialise model

model = MolMapRegression()

epochs = 5
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.MSELoss()

Train model. The users are encouraged to tweak the training loop to achieve better performance

for epoch in range(epochs):

    running_loss = 0.0
    for i, (xb, yb) in enumerate(train_loader):

        xb, yb = xb.to(device), yb.to(device)

        # zero gradients
        optimizer.zero_grad()

        # forward propagation
        pred = model(xb)

        # loss calculation
        loss = criterion(pred, yb)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()

    print('[Epoch: %2d] Training loss: %.3f' %
          (epoch + 1, running_loss / (i+1)))

print('Training finished')

/Users/olivier/opt/anaconda3/envs/molmap/lib/python3.6/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  ../c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)


[Epoch:  1] Training loss: 4.530
[Epoch:  2] Training loss: 1.803
[Epoch:  3] Training loss: 1.541
[Epoch:  4] Training loss: 1.209
[Epoch:  5] Training loss: 1.092
Training finished

Please refer to the package documentation for more detailed usage.

Project details

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Development Status
- 3 - Alpha
Intended Audience
- Developers
License
- OSI Approved :: Apache Software License
Natural Language
- English
Programming Language

Release history Release notifications | RSS feed

This version

0.0.1

Jul 23, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

molmapnets-0.0.1.tar.gz (14.1 kB view hashes)

Uploaded Jul 23, 2021 Source

Built Distribution

molmapnets-0.0.1-py3-none-any.whl (11.0 kB view hashes)

Uploaded Jul 23, 2021 Python 3

Hashes for molmapnets-0.0.1.tar.gz

Hashes for molmapnets-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`2a50600cb3fecad274a719533e82ac391eadd3bd6d9120a745a5d84d7a557efd`
MD5	`4c098de2b92bc3ffedba49e38582f44f`
BLAKE2b-256	`c89c5322af9575d37a7677431cbcc113c24c0e086fc1ae282acc404984b2b7ba`

Hashes for molmapnets-0.0.1-py3-none-any.whl

Hashes for molmapnets-0.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b468f5b03c2de5fe1e0cba81809d27fcf92d7b535ad868f963b4ad0b241869da`
MD5	`d80ef689ff329deba2f2ffef4681332d`
BLAKE2b-256	`a5560b629469949fc63aa837443cb4e1b93d95d8e345f386d8062c8ce284c509`