Deep learning tool for protein orthologous group assignment

DeepNOG: protein orthologous groups assignment

Assign proteins to orthologous groups (eggNOG 5) on CPUs or GPUs with deep networks. DeepNOG is much faster than alignment-based methods while providing accuracy similar to HMMER.

Installation guide

The easiest way to install DeepNOG is to obtain it from PyPI:

pip install deepnog

Alternatively, you can clone or download the bleeding-edge version from GitHub and run

pip install /path/to/DeepNOG

If you plan to extend DeepNOG as a developer, run

pip install -e /path/to/DeepNOG

instead.

deepnog can also be installed from bioconda like this:

conda config --add channels pytorch
conda install pytorch deepnog

Usage

Call the deepnog command line tool with a protein sequence file in FASTA format. Example usages:

  • deepnog infer proteins.faa
    • Predicted groups of proteins in proteins.faa will be written to the console. By default, eggNOG5 bacteria level is used.
  • deepnog infer proteins.faa --out prediction.csv
    • Write into prediction.csv instead
  • deepnog infer proteins.faa -db eggNOG5 -t 1236 -V 3 -c 0.99
    • Predict eggNOG5 Gammaproteobacteria (tax 1236) groups
    • Discard individual predictions below 99 % confidence (-c 0.99)
    • Show detailed progress report (-V 3)
  • deepnog train train.fa val.fa train.csv val.csv -a deepnog -e 15 --shuffle -r 123 -db eggNOG5 -t 3 -o /path/to/outdir
    • Train a model for the (hypothetical) tax level 3 of eggNOG5 with a fixed random seed for reproducible results.
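
For scripted pipelines, the inference calls above can be assembled programmatically. The following is a minimal sketch, not part of deepnog itself: the helper names build_infer_command and run_infer are hypothetical, and only the flags shown in the examples above (--out, -db, -t, -V, -c) are used.

```python
import subprocess
from typing import List, Optional


def build_infer_command(
    fasta: str,
    out: Optional[str] = None,
    database: Optional[str] = None,
    tax: Optional[int] = None,
    verbose: Optional[int] = None,
    confidence: Optional[float] = None,
) -> List[str]:
    """Assemble a `deepnog infer` argument list (hypothetical helper).

    Only flags that appear in the usage examples are emitted; options left
    as None are omitted so that deepnog falls back to its own defaults.
    """
    cmd = ["deepnog", "infer", fasta]
    if out is not None:
        cmd += ["--out", out]
    if database is not None:
        cmd += ["-db", database]
    if tax is not None:
        cmd += ["-t", str(tax)]
    if verbose is not None:
        cmd += ["-V", str(verbose)]
    if confidence is not None:
        cmd += ["-c", str(confidence)]
    return cmd


def run_infer(**kwargs) -> None:
    """Run the assembled command; raises CalledProcessError on failure."""
    subprocess.run(build_infer_command(**kwargs), check=True)
```

Calling run_infer(fasta="proteins.faa", out="prediction.csv") would then correspond to the second example above, assuming deepnog is installed and on the PATH.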

The individual models for OG predictions are not stored on GitHub or PyPI, because they exceed the file size limits there (a model can be up to 200 MB). deepnog automatically downloads the models and puts them into a cache directory (default: ~/deepnog_data/). You can change this directory by setting the DEEPNOG_DATA environment variable.
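
The lookup order described above can be illustrated as follows. This is a sketch of the documented behavior, not deepnog's actual implementation; the function name resolve_cache_dir is hypothetical.

```python
import os
from pathlib import Path


def resolve_cache_dir() -> Path:
    """Return the model cache directory as described in the text.

    Assumption: DEEPNOG_DATA, when set and non-empty, takes precedence;
    otherwise the documented default ~/deepnog_data is used.
    """
    override = os.environ.get("DEEPNOG_DATA")
    if override:
        return Path(override).expanduser()
    return Path.home() / "deepnog_data"
```

To redirect the cache for a single session, set DEEPNOG_DATA in the environment before calling deepnog, for example on a machine where the home directory has a small quota.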

For help and advanced options, call deepnog --help, and deepnog infer --help or deepnog train --help for specific options for inference or training, respectively. See also the user & developer guide.

File formats supported

Preferred: FASTA (raw, .gz, or .xz)

DeepNOG supports protein sequences stored in any of the file formats listed at https://biopython.org/wiki/SeqIO, but has been tested with the FASTA format only.
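
To check an input file before running inference, the three preferred variants (raw, .gz, .xz) can be opened with the Python standard library alone. The sketch below is illustrative and independent of deepnog's own reader; open_sequence_file and read_fasta are hypothetical helpers, and the compression is guessed from the file extension only.

```python
import gzip
import lzma
from pathlib import Path
from typing import Iterator, TextIO, Tuple


def open_sequence_file(path: str) -> TextIO:
    """Open a raw, .gz, or .xz file in text mode, chosen by file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".gz":
        return gzip.open(path, "rt")
    if suffix == ".xz":
        return lzma.open(path, "rt")
    return open(path, "rt")


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from a FASTA file."""
    with open_sequence_file(path) as handle:
        name, chunks = None, []
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if name is not None:
                    yield name, "".join(chunks)
                header = line[1:].split()
                name, chunks = (header[0] if header else ""), []
            elif line and name is not None:
                chunks.append(line)
        # Emit the final record after the last line.
        if name is not None:
            yield name, "".join(chunks)
```

The handle returned by open_sequence_file can also be passed to Biopython, for example Bio.SeqIO.parse(handle, "fasta"), when a full-featured parser is preferred.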

Databases currently supported

  • eggNOG 5.0
    • taxonomic level 1 (root level)
    • taxonomic level 2 (bacteria level)
    • For >100 additional eggNOG 5.0 levels, consult the docs.
  • COG 2020
  • (for additional databases/levels, please create an issue on GitHub, or train a model yourself; training is new in v1.2)

Deep network architectures currently supported

  • DeepNOG
  • DeepFam (no precomputed model currently available)

Required packages

deepnog builds upon the following packages:

  • PyTorch
  • NumPy
  • pandas
  • scikit-learn
  • tensorboard
  • Biopython
  • PyYAML
  • tqdm
  • pytest (for tests only)

See also requirements/*.txt for platform-specific recommendations. Sometimes specific versions are required because of platform-specific bugs in packages that deepnog depends on.
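
When an installation misbehaves, it can help to see which of the packages listed above are importable in the current environment. The following is a small diagnostic sketch, not a deepnog feature; check_requirements is a hypothetical helper, and it checks only for presence, not for versions.

```python
from importlib.util import find_spec
from typing import Dict

# Import names for the packages listed above. The import name differs from
# the distribution name for PyTorch, scikit-learn, Biopython, and PyYAML.
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "NumPy": "numpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "tensorboard": "tensorboard",
    "Biopython": "Bio",
    "PyYAML": "yaml",
    "tqdm": "tqdm",
}


def check_requirements() -> Dict[str, bool]:
    """Map each package name to True if its module can be located."""
    return {
        package: find_spec(module) is not None
        for package, module in REQUIRED_MODULES.items()
    }


if __name__ == "__main__":
    for package, available in check_requirements().items():
        print(f"{package}: {'found' if available else 'MISSING'}")
```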

Acknowledgements

This research is supported by the Austrian Science Fund (FWF): P27703, P31988; and by the GPU grant program of NVIDIA Corporation.

Citation

If you use DeepNOG, please consider citing our research article:

Roman Feldbauer, Lukas Gosch, Lukas Lüftinger, Patrick Hyden, Arthur Flexer, Thomas Rattei, DeepNOG: Fast and accurate protein orthologous group assignment, Bioinformatics, 2020, btaa1051, https://doi.org/10.1093/bioinformatics/btaa1051
