
This project trains neural networks on compound-protein interaction data and interprets the learned model by interactively displaying the transformed chemical landscape and the visualized SAR for chemicals of interest.

VISAR

VISAR: an interactive tool for dissecting chemical features learned by deep neural network QSAR models

Qingyang Ding, Siyu Hou, Songpeng Zu, Yonghui Zhang, Shao Li

Bioinformatics Division and Center for Synthetic and Systems Biology, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, China

School of Pharmaceutical Science, Tsinghua University, Beijing 100084, China.

Please contact dingqy14@mails.tsinghua.edu.cn if you have questions or suggestions.

Aims of this project

While many previous works focus on improving the predictive performance of models, few have looked into the trained model to check whether it is learning what is truly important, or linked what the model has learned back to useful insights.

Here we take a step forward in interpreting the features learned by deep neural network QSAR models and present VISAR, an interactive tool for visualizing structure-activity relationships and the chemical activity landscape based on the learned features, thus providing deeper insight into the neural network 'black box'. For a learning task, VISAR first provides users with functions to build, train and test the deep neural network models.

The rationale of VISAR workflow is shown in the schematic diagram below:

[Figure: schematic diagram of the VISAR workflow]

Starting from the trained weights of the neural network QSAR models, VISAR provides visualization tools for dissecting the learned chemical features on three levels: 1) on the macro level, compounds with weighted features are clustered, forming different chemical landscapes for different tasks; 2) on the meso level, pharmacophoric features can be identified within each local cluster of chemicals on the landscape that share similar structure and similar activity; 3) on the micro level, the SAR pattern is built for each compound for each task.

The VISAR workflow features:

  • For a learning task, VISAR first provides users with functions to build, train and test the neural network models.
  • The learned parameters of the models are then mapped back as weights on each atom and visualized as structure-activity relationship (SAR) patterns, showing the substructures that the trained model suggests contribute positively or negatively.
  • VISAR takes the transformed features of the chemicals and builds activity landscapes, showing the correlation between the descriptor space after model training and the experimental activity space.
  • With the interactive web application of VISAR, users can explore the chemical space and the SAR pattern of each chemical.
  • The clusters of chemicals on the landscape can then be analyzed for active pharmacophores.

We propose that VISAR can serve as a helpful workflow for training and interactively analyzing deep neural network QSAR models.
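To make the idea of mapping learned parameters back to atom weights more concrete, here is a minimal sketch in plain Python. It is not VISAR's actual implementation: the function `atom_weights`, the bit-to-atom mapping and all numeric values are hypothetical, and the rule of sharing each fingerprint-bit weight evenly among the atoms that set the bit is an assumption chosen for illustration.

```python
def atom_weights(bit_weights, bit_to_atoms, n_atoms):
    """Spread each fingerprint-bit weight evenly over the atoms that set the bit.

    bit_weights:  {bit index: learned contribution of that bit} (hypothetical)
    bit_to_atoms: {bit index: atom indices covered by that bit} (hypothetical)
    """
    weights = [0.0] * n_atoms
    for bit, w in bit_weights.items():
        atoms = bit_to_atoms.get(bit, [])
        if not atoms:
            continue  # bit not present in this molecule
        share = w / len(atoms)
        for idx in atoms:
            weights[idx] += share
    return weights


# Toy molecule with 4 atoms and 3 fingerprint bits (made-up numbers).
bit_weights = {10: 0.8, 42: -0.4, 77: 0.2}
bit_to_atoms = {10: [0, 1], 42: [2], 77: [0, 1, 2, 3]}

weights = atom_weights(bit_weights, bit_to_atoms, n_atoms=4)
positive = [i for i, w in enumerate(weights) if w > 0]  # suggested positive contributors
negative = [i for i, w in enumerate(weights) if w < 0]  # suggested negative contributors
print(weights, positive, negative)
```

In this toy case atoms 0, 1 and 3 end up with positive weights and atom 2 with a negative one, which is the kind of per-atom signal that can be colored onto a structure drawing as an SAR pattern.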

Workflow

The training, testing and result-processing pipeline is available as template Jupyter notebooks in the repository.

After training, start the app from a command prompt with 'bokeh serve --show VISAR_webapp' for interactive exploration.
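If you prefer to start the app from a Python script rather than typing the command, the following sketch wraps the same command line. The helper names `build_command` and `launch` are hypothetical and not part of VISAR; the only thing taken from this README is the command itself, and the script assumes it is run from the directory that contains `VISAR_webapp`.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(app_dir="VISAR_webapp"):
    """Return the argument list for the command quoted in this README."""
    return ["bokeh", "serve", "--show", str(app_dir)]


def launch(app_dir="VISAR_webapp"):
    """Start the Bokeh server; blocks until the server is stopped."""
    if shutil.which("bokeh") is None:
        raise RuntimeError("bokeh executable not found on PATH")
    if not Path(app_dir).is_dir():
        raise FileNotFoundError(f"app directory not found: {app_dir}")
    return subprocess.run(build_command(app_dir), check=True)
```

Calling `launch()` performs two quick checks first, so a missing Bokeh installation or a wrong working directory is reported before the server is started.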

To generate an SDF file for selected compounds and to run the pharmacophore analysis, refer to the template Jupyter notebook.

Usage instructions

  1. Get your local copy of the VISAR repository (including the template Jupyter notebooks) by either
  • downloading it as a zip archive and unzipping it, or

  • cloning it to your computer using git:

git clone https://github.com/Svvord/visar.git
  2. For the training environment, Python 3.5 is recommended. The environment depends on: DeepChem, RDKit, Keras, TensorFlow, NumPy, Pandas, scikit-learn, SciPy.
# Install packages via pip (which is probably installed by default in your environment)
pip install visar
  3. For visualization, preparing the working environment with Conda is recommended; the setup below follows TeachOpenCADD.
# Create and activate an environment called `visar`
conda create -n visar python=3.6
conda activate visar

# Install packages via conda
conda install jupyter  # Installs also ipykernel
conda install -c rdkit rdkit  # Installs also numpy and pandas
conda install -c samoturk pymol  # Installs also freeglut and glew
conda install -c conda-forge pmw  # Necessary for PyMol terminal window to pop up
conda install -c conda-forge scikit-learn  # Installs also scipy
conda install -c conda-forge seaborn  # Installs also matplotlib
conda install bokeh

# start the web app
cd /path/of/visar
bokeh serve --show VISAR_webapp
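
Before running the notebooks, it can help to confirm that the training dependencies listed in step 2 are importable in the active environment. The sketch below uses only the standard library; the helper `check_modules` is hypothetical, and the import names are assumed to be the usual module names of those packages.

```python
import importlib.util

# Import names assumed for the training dependencies listed in this README.
TRAINING_MODULES = [
    "deepchem", "rdkit", "keras", "tensorflow",
    "numpy", "pandas", "sklearn", "scipy",
]


def check_modules(names):
    """Map each top-level module name to True if it can be found, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Report the status of each dependency without importing it.
for name, found in check_modules(TRAINING_MODULES).items():
    print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

Because `find_spec` only locates a module and does not import it, the check stays fast even for heavy packages.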
