Enzyme Commission Number Prediction
ECPICK
Biologically interpretable deep learning enhances trustworthy enzyme commission number prediction and discovers potential motif sites
The rapid growth in the number of uncharacterized enzymes and their functional diversity calls for accurate and trustworthy computational tools for functional annotation. However, current approaches offer little basis for trusting their predictions or interpreting their models, which limits reliability on a multi-label classification problem with thousands of classes. Here, we demonstrate that our novel biologically interpretable deep learning model (ECPICK) provides a robust solution for trustworthy prediction of enzyme commission (EC) numbers, with significantly enhanced predictive power and the capability to discover potential motif sites. ECPICK learns complex sequential patterns of amino acids and their hierarchical structures from twenty million proteins to produce EC number predictions. Furthermore, ECPICK identifies the amino acids in a given protein sequence that contribute significantly to the prediction, without multiple sequence alignment; these may match known motif sites, supporting trust in the prediction, or indicate potential new motif sites. Our intensive assessment showed not only a marked improvement in predictive performance on the largest databases (UniProt, PDB, and KEGG), but also a capability to discover new motif sites in microorganisms. ECPICK will be a reliable EC number prediction tool for identifying the protein functions of an increasing number of uncharacterized enzymes.
- Website: http://ecpick.dataxlab.org
- Documentation: https://readthedocs.org/projects/ecpick
- Source code: https://github.com/datax-lab/ECPICK
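For readers unfamiliar with the notation, an EC number is a four-level hierarchical label (class, subclass, sub-subclass, serial number), which is the hierarchy the description above refers to. The sketch below is not part of ECPICK; `ECNumber` and `parse_ec_number` are hypothetical names used only to illustrate the structure. It accepts `-` for an unspecified lower level and deliberately does not handle preliminary serials such as `n1`.

```python
from typing import NamedTuple, Optional

# The seven top-level EC classes.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


class ECNumber(NamedTuple):
    """Hypothetical container for the four hierarchy levels."""
    enzyme_class: int
    subclass: Optional[int]
    sub_subclass: Optional[int]
    serial: Optional[int]


def parse_ec_number(text: str) -> ECNumber:
    """Split a string such as 'EC 3.4.21.4' or '3.4.-.-' into its levels."""
    cleaned = text.strip()
    if cleaned.upper().startswith("EC"):
        cleaned = cleaned[2:].strip(" :")
    fields = cleaned.split(".")
    if len(fields) != 4:
        raise ValueError("expected four dot-separated levels: %r" % text)
    # '-' marks a level that is left unspecified; int() rejects anything else
    # that is not a plain number.
    levels = [None if field == "-" else int(field) for field in fields]
    if levels[0] not in EC_CLASSES:
        raise ValueError("unknown top-level EC class in %r" % text)
    return ECNumber(*levels)


if __name__ == "__main__":
    ec = parse_ec_number("EC 3.4.-.-")
    print(ec, EC_CLASSES[ec.enzyme_class])
```

Because each level refines the one before it, two EC numbers that share a longer prefix describe more closely related reactions.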
Installation
ECPICK supports Python 3.6+. Additionally, you will need biopython, numpy, scikit-learn, torch, and tqdm. However, these packages should be installed automatically when installing this codebase.
Dependencies
ECPICK is available through PyPI and can be installed with pip:
$ pip install ecpick
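After installing, one way to confirm that the dependencies are importable is a small check like the following. This is a minimal sketch, not an ECPICK utility: `find_missing` is a hypothetical helper, and the only assumption beyond the package list is the standard import names (`Bio` for biopython, `sklearn` for scikit-learn).

```python
import importlib.util

# PyPI package name -> importable module name.
ECPICK_DEPENDENCIES = {
    "biopython": "Bio",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "torch": "torch",
    "tqdm": "tqdm",
}


def find_missing(packages):
    """Return the package names whose module cannot be located."""
    return [
        package
        for package, module in packages.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing(ECPICK_DEPENDENCIES)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

If anything is reported missing, installing it by its PyPI name with pip should resolve it.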
Documentation
Read the documentation on readthedocs (Getting ready)
Quick Start
from ecpick import ECPICK
ecpick = ECPICK()
ecpick.predict_fasta(fasta_path='sample.fasta', output_path='output')
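The Quick Start above expects a FASTA file on disk. The sketch below shows one way to prepare such a file with the standard library and then pass it to `predict_fasta`. The helpers `write_fasta`, `read_fasta`, and `predict_if_available` are hypothetical names, and the sequence is an arbitrary placeholder rather than a real enzyme. Only `ECPICK()` and `predict_fasta(fasta_path=..., output_path=...)` come from the Quick Start; the format of what is written to `output_path` is not described here, so the sketch does not read it back.

```python
def write_fasta(records, path, width=60):
    """Write (identifier, sequence) pairs as FASTA, wrapping sequence lines."""
    with open(path, "w") as handle:
        for identifier, sequence in records:
            handle.write(">" + identifier + "\n")
            for start in range(0, len(sequence), width):
                handle.write(sequence[start:start + width] + "\n")


def read_fasta(path):
    """Read a FASTA file back into {identifier: sequence} as a sanity check."""
    records = {}
    current = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                current = line[1:].split()[0]
                records[current] = ""
            elif current is not None:
                records[current] += line
    return records


def predict_if_available(fasta_path, output_path):
    """Run the Quick Start call; return False if ecpick is not installed."""
    try:
        from ecpick import ECPICK
    except ImportError:
        return False
    ECPICK().predict_fasta(fasta_path=fasta_path, output_path=output_path)
    return True


if __name__ == "__main__":
    # Arbitrary placeholder sequence, for illustration only.
    toy_sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ" * 3
    write_fasta([("toy_protein_1", toy_sequence)], "sample.fasta")
    print(read_fasta("sample.fasta").keys())
    predict_if_available("sample.fasta", "output")
```

Reading the file back before predicting is a cheap way to catch malformed input, such as a missing `>` header line, before the model is loaded.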
Usage
Links:
- ECPICK Web server: http://ecpick.dataxlab.org
References
Not available yet.