Detecting novel human viruses from DNA reads with reverse-complement neural networks.

DeePaC-vir

DeePaC-vir is a plugin for DeePaC (see below) shipping built-in models for novel human virus detection directly from NGS reads. For details, see our preprint on bioRxiv: https://www.biorxiv.org/content/10.1101/2020.01.29.925354v5

DeePaC

DeePaC is a Python package and a CLI tool for predicting labels (e.g. pathogenic potentials) from short DNA sequences (e.g. Illumina reads) with interpretable reverse-complement neural networks. For details, see our preprint on bioRxiv: https://www.biorxiv.org/content/10.1101/535286v3 and the paper in Bioinformatics: https://doi.org/10.1093/bioinformatics/btz541. For details on the interpretability functionality of DeePaC, see the preprint here: https://www.biorxiv.org/content/10.1101/2020.01.29.925354v2
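As a rough illustration of the term "reverse-complement": a read sequenced from one DNA strand and its reverse complement describe the same molecule, so a classifier should ideally treat both the same way. The sketch below only shows the reverse-complement operation itself. The function name `reverse_complement` is hypothetical and is not part of the DeePaC API; how the networks use this property is described in the papers linked above.

```python
# Illustrative only: this is not DeePaC code.
# Complement table for the four bases; N (unknown base) maps to itself.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (hypothetical helper)."""
    # Complement each base, then reverse the string.
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    read = "ATGC"
    print(read, "->", reverse_complement(read))
```

Applying the function twice returns the original sequence, which is a quick sanity check for any reverse-complement implementation.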

Documentation can be found here: https://rki_bioinformatics.gitlab.io/DeePaC/. See also the main repo here: https://gitlab.com/rki_bioinformatics/DeePaC.

Installation

With Bioconda (recommended)

You can install DeePaC with bioconda. Set up the bioconda channel first (channel ordering is important):

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

We recommend setting up an isolated conda environment:

# Python 3.6, 3.7 and 3.8 are supported
conda create -n my_env python=3.8
conda activate my_env

Then run one of the following:

# For GPU support (recommended)
conda install tensorflow-gpu deepacvir
# Basic installation (CPU-only)
conda install deepacvir

With pip

We recommend setting up an isolated conda environment (see above). Alternatively, you can use a virtualenv virtual environment (note that DeePaC requires Python 3):

# use -p to select the desired Python interpreter (Python 3.6 or higher required)
virtualenv -p /usr/bin/python3 my_env
source my_env/bin/activate

You can then install DeePaC with pip. For GPU support, you need to install CUDA and cuDNN manually first (see the TensorFlow installation guide for details). Then run:

# For GPU support (recommended)
pip install tensorflow-gpu
pip install deepacvir

Alternatively, if you don't need GPU support:

# Basic installation (CPU-only)
pip install deepacvir
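To confirm that the installation succeeded, you can query the installed distribution from Python. This is a minimal sketch, assuming Python 3.8 or higher (where `importlib.metadata` is part of the standard library). The helper name `installed_version` is hypothetical; the distribution name `deepacvir` is the one used in the pip command above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent.

    Hypothetical helper for illustration; it is not part of DeePaC.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("deepacvir")
    if version is None:
        print("deepacvir is not installed in this environment")
    else:
        print("deepacvir version:", version)
```

Run the script inside the activated environment (conda or virtualenv), since it only inspects the interpreter it is executed with.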

Usage

DeePaC-vir can be used exactly like the base version of DeePaC. To use the plugin, replace the deepac command with deepac-vir. Visit https://gitlab.com/rki_bioinformatics/DeePaC for the DeePaC readme describing basic usage.

For example, you can use the following commands:

# See help
deepac-vir --help

# Run quick tests (e.g. on CPUs)
deepac-vir test -q
# Full tests
deepac-vir test -a

# Predict using a rapid CNN (trained on VHDB data)
deepac-vir predict -r input.fasta
# Predict using a sensitive LSTM (trained on VHDB data)
deepac-vir predict -s input.fasta
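If you want to call the predictions from a Python script, the commands above can be wrapped with `subprocess`. This is a minimal sketch rather than an official interface: the helper names `build_predict_command` and `run_predict` and the `"rapid"`/`"sensitive"` labels are hypothetical, and only the `predict`, `-r` and `-s` arguments shown above are used.

```python
import shutil
import subprocess
from pathlib import Path

# Flags taken from the examples above: -r (rapid CNN), -s (sensitive LSTM).
# The dictionary keys are labels chosen for this sketch only.
MODEL_FLAGS = {"rapid": "-r", "sensitive": "-s"}


def build_predict_command(fasta_path, model="rapid"):
    """Build the argument list for `deepac-vir predict` (hypothetical helper)."""
    if model not in MODEL_FLAGS:
        raise ValueError(f"model must be one of {sorted(MODEL_FLAGS)}")
    return ["deepac-vir", "predict", MODEL_FLAGS[model], str(Path(fasta_path))]


def run_predict(fasta_path, model="rapid"):
    """Run the prediction if deepac-vir is on PATH; return None otherwise."""
    if shutil.which("deepac-vir") is None:
        return None
    # check=True raises CalledProcessError if the command exits with an error.
    return subprocess.run(build_predict_command(fasta_path, model), check=True)


if __name__ == "__main__":
    print(" ".join(build_predict_command("input.fasta", model="sensitive")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.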

More examples are available at https://gitlab.com/rki_bioinformatics/DeePaC.

Supplementary data and scripts

Training, validation and test datasets are available here: https://doi.org/10.5281/zenodo.3630803. In the main DeePaC repository (https://gitlab.com/rki_bioinformatics/DeePaC) you can find the R scripts and data files used in the papers for dataset preprocessing and benchmarking.

Cite us

If you find DeePaC useful, please cite:

@article{10.1093/bioinformatics/btz541,
    author = {Bartoszewicz, Jakub M and Seidel, Anja and Rentzsch, Robert and Renard, Bernhard Y},
    title = "{DeePaC: predicting pathogenic potential of novel DNA with reverse-complement neural networks}",
    journal = {Bioinformatics},
    year = {2019},
    month = {07},
    issn = {1367-4803},
    doi = {10.1093/bioinformatics/btz541},
    url = {https://doi.org/10.1093/bioinformatics/btz541},
    eprint = {http://oup.prod.sis.lan/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btz541/28971344/btz541.pdf},
}

@article {Bartoszewicz2020.01.29.925354,
    author = {Bartoszewicz, Jakub M. and Seidel, Anja and Renard, Bernhard Y.},
    title = {Interpretable detection of novel human viruses from genome sequencing data},
    elocation-id = {2020.01.29.925354},
    year = {2020},
    doi = {10.1101/2020.01.29.925354},
    publisher = {Cold Spring Harbor Laboratory},
    URL = {https://www.biorxiv.org/content/early/2020/02/01/2020.01.29.925354},
    eprint = {https://www.biorxiv.org/content/early/2020/02/01/2020.01.29.925354.full.pdf},
    journal = {bioRxiv}
}
