Skip to main content

Predicting pathogenic potentials of novel strains of known bacterial species.

Project description

DeePaC-strain

DeePaC-strain is a plugin for DeePaC (see below) shipping built-in models for predicting pathogenic potentials of novel strains of known bacterial species.

DeePaC

DeePaC is a python package and a CLI tool for predicting labels (e.g. pathogenic potentials) from short DNA sequences (e.g. Illumina reads) with interpretable reverse-complement neural networks. For details, see our preprint on bioRxiv: https://www.biorxiv.org/content/10.1101/535286v3 and the paper in Bioinformatics: https://doi.org/10.1093/bioinformatics/btz541. For details regarding the interpretability functionalities of DeePaC, see the preprint here: https://www.biorxiv.org/content/10.1101/2020.01.29.925354v2

Documentation can be found here: https://rki_bioinformatics.gitlab.io/DeePaC/.

Installation

Recommended: set up an environment

We recomment setting up an isolated conda environment:

conda create -n my_env python=3.6
conda activate my_env

or, alternatively, a virtualenv:

virtualenv --system-site-packages my_env
source my_env/bin/activate

With conda (recommended)

install with bioconda

You can install DeePaC-strain with bioconda. Set up the bioconda channel first, and then:

conda install deepacstrain

DeePaC will be installed automatically.

With pip

You can also install DeePaC-strain with pip:

pip install deepacstrain

Note: TensorFlow 2.0 is not yet supported.

GPU support

To use GPUs, you need to install the GPU version of TensorFlow. In conda, install tensorflow-gpu from the defaults channel before deepac:

conda remove tensorflow
conda install -c defaults tensorflow-gpu=1.15 
conda install deepacstrain

DeePaC will be installed automatically. Note: TensorFlow 2.0 is not yet supported.

If you're using pip, you need to install CUDA and CuDNN first (see TensorFlow installation guide for details). Then you can do the same as above:

pip uninstall tensorflow
pip install tensorflow-gpu==1.15

Usage

DeePaC-strain may be used exactly as the base version of DeePaC. To use the plugin, substitute the deepac command for deepac-strain. Visit https://gitlab.com/rki_bioinformatics/DeePaC for a DeePaC readme describing basic usage.

For example, you can use the following commands:

# See help
deepac-strain --help

# Run quick tests (eg. on CPUs)
deepac-strain test -q
# Full tests on a GPU
deepac-strain test -a -g 1

# Predict using a rapid CNN (trained on VHDB data) using a GPU
deepac-strain predict -r -g 1 input.fasta
# Predict using a sensitive LSTM (trained on VHDB data) using a GPU
deepac-strain predict -s -g 1 input.fasta

Supplementary data and scripts

In the main DeePaC repository (https://gitlab.com/rki_bioinformatics/DeePaC) you can find the R scripts and data files used in the papers for dataset preprocessing and benchmarking.

Cite us

If you find DeePaC useful, please cite:

@article{10.1093/bioinformatics/btz541,
    author = {Bartoszewicz, Jakub M and Seidel, Anja and Rentzsch, Robert and Renard, Bernhard Y},
    title = "{DeePaC: predicting pathogenic potential of novel DNA with reverse-complement neural networks}",
    journal = {Bioinformatics},
    year = {2019},
    month = {07},
    issn = {1367-4803},
    doi = {10.1093/bioinformatics/btz541},
    url = {https://doi.org/10.1093/bioinformatics/btz541},
    eprint = {http://oup.prod.sis.lan/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btz541/28971344/btz541.pdf},
}

@article {Bartoszewicz2020.01.29.925354,
    author = {Bartoszewicz, Jakub M. and Seidel, Anja and Renard, Bernhard Y.},
    title = {Interpretable detection of novel human viruses from genome sequencing data},
    elocation-id = {2020.01.29.925354},
    year = {2020},
    doi = {10.1101/2020.01.29.925354},
    publisher = {Cold Spring Harbor Laboratory},
    URL = {https://www.biorxiv.org/content/early/2020/02/01/2020.01.29.925354},
    eprint = {https://www.biorxiv.org/content/early/2020/02/01/2020.01.29.925354.full.pdf},
    journal = {bioRxiv}
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

deepacstrain-0.2.0.tar.gz (30.1 MB view hashes)

Uploaded Source

Built Distribution

deepacstrain-0.2.0-py3-none-any.whl (30.1 MB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page