
Prediction and re-engineering of the cofactor specificity of Rossmann-fold proteins


Rossmann Toolbox

The Rossmann Toolbox provides two deep learning models for predicting the cofactor specificity of Rossmann enzymes based on either the sequence or the structure of the beta-alpha-beta cofactor binding motif.


Installation

Create a conda environment:

conda create --name rtb python=3.6.2
conda activate rtb

Install pip in the environment:

conda install pip

Install rtb using requirements.txt:

pip install -r requirements.txt

Usage

Sequence-based approach

The input is a full-length sequence. The algorithm first detects Rossmann cores (i.e. the β-α-β motifs that interact with the cofactor) in the sequence and then evaluates their cofactor specificity:

import matplotlib.pyplot as plt
from rossmann_toolbox import RossmannToolbox
rtb = RossmannToolbox(use_gpu=True)

# Example 1
# The b-a-b core is predicted in the full-length sequence

data = {'3m6i_A': 'MASSASKTNIGVFTNPQHDLWISEASPSLESVQKGEELKEGEVTVAVRSTGICGSDVHFWKHGCIGPMIVECDHVLGHESAGEVIAVHPSVKSIKVGDRVAIEPQVICNACEPCLTGRYNGCERVDFLSTPPVPGLLRRYVNHPAVWCHKIGNMSYENGAMLEPLSVALAGLQRAGVRLGDPVLICGAGPIGLITMLCAKAAGACPLVITDIDEGRLKFAKEICPEVVTHKVERLSAEESAKKIVESFGGIEPAVALECTGVESSIAAAIWAVKFGGKVFVIGVGKNEIQIPFMRASVREVDLQFQYRYCNTWPRAIRLVENGLVDLTRLVTHRFPLEDALKAFETASDPKTGAIKVQIQSLE'}

preds = rtb.predict(data, mode='seq', core_detect_mode='dl', importance=False)

# Example 2
# The b-a-b cores are provided by the user (WT vs mutant)

data = {'seq_wt': 'AGVRLGDPVLICGAGPIGLITMLCAKAAGACPLVITDIDEGR', # WT, binds NAD
        'seq_mut': 'AGVRLGDPVLICGAGPIGLITMLCAKAAGACPLVITSRDEGR'} # D211S, I212R mutant, binds NADP

preds, imps = rtb.predict(data, mode='core', importance=True)

# Example 3
# Which residues contributed most to the prediction of WT as NAD-binding?
seq_len = len(data['seq_wt'])
plt.errorbar(list(range(1, seq_len+1)),
             imps['seq_wt']['NAD'][0], yerr=imps['seq_wt']['NAD'][1], ecolor='grey')
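
As a complement to the plot in Example 3, the per-residue importances can also be ranked to list the top contributors. The sketch below is illustrative only: `top_residues` is a hypothetical helper that is not part of rossmann_toolbox, it assumes that the first element of `imps['seq_wt']['NAD']` holds the per-residue scores and the second their errors (as the `errorbar` call above suggests), and it assumes that larger scores mean larger contributions. The numbers are placeholder values, not model output.

```python
from typing import List, Sequence, Tuple


def top_residues(sequence: str,
                 means: Sequence[float],
                 errors: Sequence[float],
                 k: int = 3,
                 first_position: int = 1) -> List[Tuple[str, float, float]]:
    """Return the k residues with the highest scores as (label, mean, error).

    Hypothetical helper, not part of rossmann_toolbox. Labels combine the
    residue letter with its position, counted from `first_position`.
    """
    if not (len(sequence) == len(means) == len(errors)):
        raise ValueError("sequence, means and errors must have equal length")
    # Sort residue indices by decreasing score (assumption: larger = more important)
    order = sorted(range(len(sequence)), key=lambda i: means[i], reverse=True)
    return [(f"{sequence[i]}{i + first_position}", means[i], errors[i])
            for i in order[:k]]


# Placeholder scores for a five-residue toy sequence (NOT real predictions)
toy_seq = "ITDID"
toy_means = [0.1, 0.05, 0.6, 0.3, 0.2]
toy_errors = [0.01, 0.01, 0.05, 0.04, 0.02]

for label, mean, error in top_residues(toy_seq, toy_means, toy_errors, k=2):
    print(label, mean, error)
```

With real predictions, one would presumably pass `data['seq_wt']`, `imps['seq_wt']['NAD'][0]` and `imps['seq_wt']['NAD'][1]` instead of the toy values.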

For more examples of how to use the sequence-based approach, see example_minimal.ipynb.
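
When comparing a wild-type core with mutants, as in Example 2, it can be convenient to generate the mutant cores programmatically rather than editing the strings by hand. The following is a minimal sketch: `apply_mutations` is a hypothetical helper that is not part of rossmann_toolbox, and `core_start = 175` is the position of the first core residue in the full-length 3m6i_A sequence from Example 1, obtained by locating the core within that sequence.

```python
import re
from typing import Iterable

# Point mutations written as <wild-type residue><position><new residue>, e.g. D211S
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(core: str, mutations: Iterable[str], core_start: int) -> str:
    """Apply point mutations to a core sequence (hypothetical helper).

    Positions refer to the full-length sequence numbering; `core_start` is the
    1-based position of the first core residue in that numbering.
    """
    residues = list(core)
    for mutation in mutations:
        match = _MUTATION.match(mutation)
        if match is None:
            raise ValueError(f"Cannot parse mutation: {mutation!r}")
        wild_type, replacement = match.group(1), match.group(3)
        index = int(match.group(2)) - core_start
        if not 0 <= index < len(residues):
            raise ValueError(f"{mutation} lies outside the core")
        if residues[index] != wild_type:
            raise ValueError(
                f"{mutation}: expected {wild_type}, found {residues[index]}")
        residues[index] = replacement
    return "".join(residues)


core_wt = 'AGVRLGDPVLICGAGPIGLITMLCAKAAGACPLVITDIDEGR'
core_start = 175  # first core residue in the full-length numbering of 3m6i_A

core_mut = apply_mutations(core_wt, ['D211S', 'I212R'], core_start)

# Same layout as the input dictionary used in Example 2
data = {'seq_wt': core_wt, 'seq_mut': core_mut}
print(data['seq_mut'])
```

The wild-type check guards against off-by-one numbering mistakes: if the residue at the given position does not match the one named in the mutation, the helper raises an error instead of silently producing a wrong sequence.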

Structure-based approach

Structure-based predictions are not currently available. We are working on a new version that will provide predictions and also support the design of specificity-shifting mutations.

EGATConv layer

The structure-based predictor includes an EGAT layer, a graph attention layer that supports edge features. The layer is available in DGL as EGATConv; see the DGL documentation for details. For a detailed description of the EGAT layer and its usage, please refer to the supplementary materials of the Rossmann Toolbox paper.

Remarks

How to cite?

If you find the rossmann-toolbox useful, please cite the paper:

Rossmann-toolbox: a deep learning-based protocol for the prediction and design of cofactor specificity in Rossmann fold proteins
Kamil Kamiński, Jan Ludwiczak, Maciej Jasiński, Adriana Bukala, Rafal Madaj, Krzysztof Szczepaniak, Stanisław Dunin-Horkawicz
Briefings in Bioinformatics, Volume 23, Issue 1, January 2022, bbab371

Contact

If you have any questions, problems or suggestions, please contact us.

Funding

This work was supported by the First TEAM program of the Foundation for Polish Science co-financed by the European Union under the European Regional Development Fund.

