
Deep learning tool for protein orthologous group assignment

DeepNOG: protein orthologous groups assignment

Assign proteins to orthologous groups (eggNOG 5) on CPUs or GPUs with deep networks. DeepNOG is much faster than alignment-based methods, providing accuracy similar to HMMER.

Installation guide

The easiest way to install DeepNOG is to obtain it from PyPI:

pip install deepnog

Alternatively, you can clone or download the bleeding-edge version from GitHub and run

pip install /path/to/DeepNOG

If you plan to extend DeepNOG as a developer, run

pip install -e /path/to/DeepNOG

instead.

deepnog can also be installed from the bioconda channel:

conda install -c bioconda deepnog

Usage

Call the deepnog command line tool with a protein sequence file in FASTA format. Example usages:

  • deepnog infer proteins.faa
    • Predicted groups of proteins in proteins.faa will be written to the console. By default, eggNOG5 bacteria level is used.
  • deepnog infer proteins.faa --out prediction.csv
    • Write into prediction.csv instead
  • deepnog infer proteins.faa -db eggNOG5 -t 1236 -V 3 -c 0.99
    • Predict eggNOG5 Gammaproteobacteria (tax 1236) groups
    • Discard individual predictions below 99 % confidence (-c 0.99)
    • Show detailed progress report (-V 3)
  • deepnog train train.fa val.fa train.csv val.csv -a deepnog -e 15 --shuffle -r 123 -db eggNOG5 -t 3 -o /path/to/outdir
    • Train a model for the (hypothetical) tax level 3 of eggNOG5 with a fixed random seed for reproducible results.
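
When deepnog is called from a pipeline script rather than typed by hand, the inference examples above can be assembled programmatically. The following is a minimal sketch, not part of deepnog itself: the helper name build_infer_command and its parameter names are hypothetical, and it only uses the flags shown in the examples above (--out, -db, -t, -V, -c).

```python
import shlex
from typing import List, Optional


def build_infer_command(
    fasta: str,
    out: Optional[str] = None,
    database: Optional[str] = None,
    tax: Optional[int] = None,
    verbose: Optional[int] = None,
    confidence: Optional[float] = None,
) -> List[str]:
    """Assemble an argument list for `deepnog infer` (hypothetical helper).

    Options left as None are omitted, so deepnog falls back to its defaults.
    """
    cmd = ["deepnog", "infer", fasta]
    if out is not None:
        cmd += ["--out", out]
    if database is not None:
        cmd += ["-db", database]
    if tax is not None:
        cmd += ["-t", str(tax)]
    if verbose is not None:
        cmd += ["-V", str(verbose)]
    if confidence is not None:
        cmd += ["-c", str(confidence)]
    return cmd


if __name__ == "__main__":
    cmd = build_infer_command(
        "proteins.faa", database="eggNOG5", tax=1236, verbose=3, confidence=0.99
    )
    # Print the command in a copy-pasteable form.
    print(shlex.join(cmd))
    # To execute it (requires an installed deepnog):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list to subprocess.run avoids shell quoting problems with file names that contain spaces.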

The individual models for OG predictions are not stored on GitHub or PyPI because they exceed the file size limits there (a model can be up to 200 MB). deepnog downloads the required models automatically and stores them in a cache directory (default: ~/deepnog_data/). You can change this directory by setting the DEEPNOG_DATA environment variable.
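
To check in advance where models will end up, for example before reserving disk space on a cluster node, the lookup described above can be mirrored in a few lines. This is a sketch of the documented behavior only, not deepnog's actual implementation; the function name deepnog_data_dir is hypothetical.

```python
import os
from pathlib import Path


def deepnog_data_dir() -> Path:
    """Return the expected model cache directory (illustrative sketch).

    Mirrors the documented rule: use DEEPNOG_DATA if it is set,
    otherwise fall back to ~/deepnog_data.
    """
    override = os.environ.get("DEEPNOG_DATA")
    if override:
        return Path(override).expanduser()
    return Path.home() / "deepnog_data"


if __name__ == "__main__":
    print(deepnog_data_dir())
```

To redirect the cache for a single run, set the variable only in the environment of the child process, for example subprocess.run(cmd, env=dict(os.environ, DEEPNOG_DATA="/scratch/models")), where /scratch/models is a placeholder path.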

For help and advanced options, call deepnog --help, and deepnog infer --help or deepnog train --help for specific options for inference or training, respectively. See also the user & developer guide.

File formats supported

Preferred: FASTA (raw, .gz, or .xz)

DeepNOG supports protein sequences stored in any file format listed at https://biopython.org/wiki/SeqIO, but has been tested with FASTA files only.
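
To inspect an input file before running inference, raw, .gz, and .xz FASTA files can all be read with the Python standard library. The sketch below is independent of deepnog, which relies on Biopython for parsing; the function names open_text and read_fasta are hypothetical, and the compression type is guessed from the file extension alone.

```python
import gzip
import lzma
from pathlib import Path
from typing import Iterator, Tuple


def open_text(path):
    """Open a raw, .gz, or .xz file in text mode, chosen by file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".gz":
        return gzip.open(path, "rt")
    if suffix == ".xz":
        return lzma.open(path, "rt")
    return open(path, "rt")


def read_fasta(path) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from a FASTA file."""
    record_id, chunks = None, []
    with open_text(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if there is one.
                if record_id is not None:
                    yield record_id, "".join(chunks)
                # The ID is the first whitespace-delimited token after '>'.
                header = line[1:].split()
                record_id, chunks = (header[0] if header else ""), []
            elif record_id is not None:
                chunks.append(line)
    if record_id is not None:
        yield record_id, "".join(chunks)


if __name__ == "__main__":
    import sys

    # Count the records in the file given on the command line.
    print(sum(1 for _ in read_fasta(sys.argv[1])))
```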

Databases currently supported

  • eggNOG 5.0
    • taxonomic level 1 (root level)
    • taxonomic level 2 (bacteria level)
    • For >100 additional eggNOG 5.0 levels, consult the docs.
  • COG 2020
  • (For additional databases or levels, please create an issue on GitHub, or train a model yourself; training is new in v1.2.)

Deep network architectures currently supported

  • DeepNOG
  • DeepFam (no precomputed model currently available)

Required packages

deepnog builds upon the following packages:

  • PyTorch
  • NumPy
  • pandas
  • scikit-learn
  • tensorboard
  • Biopython
  • PyYAML
  • tqdm
  • pytest (for tests only)

See also requirements/*.txt for platform-specific recommendations. Specific versions are sometimes required because of platform-specific bugs in the deepnog requirements.

Acknowledgements

This research is supported by the Austrian Science Fund (FWF): P27703, P31988; and by the GPU grant program of Nvidia corporation.

Citation

If you use DeepNOG, please consider citing our research article:

Roman Feldbauer, Lukas Gosch, Lukas Lüftinger, Patrick Hyden, Arthur Flexer, Thomas Rattei, DeepNOG: Fast and accurate protein orthologous group assignment, Bioinformatics, 2020, btaa1051, https://doi.org/10.1093/bioinformatics/btaa1051
