DeepBGC - Biosynthetic Gene Cluster detection and classification
DeepBGC detects biosynthetic gene clusters (BGCs) in bacterial and fungal genomes using deep learning. It employs a Bidirectional Long Short-Term Memory (BiLSTM) Recurrent Neural Network and a word2vec-like vector embedding of Pfam protein domains. The product class and activity of detected BGCs are predicted using a Random Forest classifier.
Install using bioconda (recommended)
- Install Bioconda by following Step 1 and 2 from: https://bioconda.github.io/
- Run `conda install deepbgc` to install DeepBGC and all of its dependencies
Install using pip
If you don't mind installing the HMMER and Prodigal dependencies manually, you can also install DeepBGC using pip:
- Install Python version 2.7+ or 3.4+
- Install Prodigal and put the `prodigal` binary on your PATH: https://github.com/hyattpd/Prodigal/releases
- Install HMMER and put the `hmmscan` and `hmmpress` binaries on your PATH: http://hmmer.org/download.html
- Run `pip install deepbgc` to install DeepBGC
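Because the pip route leaves the external tools to you, it can help to confirm that they are visible on your PATH before running DeepBGC. The following is a minimal sketch using only standard shell built-ins; the `missing_tools` helper is a hypothetical name introduced here and is not part of DeepBGC.

```shell
# Hypothetical helper (not part of DeepBGC): print each named tool
# that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the binaries named in the install steps above.
missing=$(missing_tools prodigal hmmscan hmmpress)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing from PATH:" $missing
else
    echo "All external tools found."
fi
```

If any tool is reported missing, revisit the corresponding install step and make sure the directory containing the binary is included in PATH.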
Use DeepBGC
Download models and Pfam database
Before you can use DeepBGC, download the trained models and the Pfam database:
deepbgc download
You can display downloaded dependencies and models using:
deepbgc info
Detection and classification
Detect and classify BGCs in a genomic sequence. Proteins and Pfam domains are detected automatically if not already annotated (this requires HMMER and Prodigal).
# Show command help docs
deepbgc pipeline --help
# Detect and classify BGCs in mySequence.fa using the DeepBGC algorithm and save the output to the mySequence directory.
deepbgc pipeline mySequence.fa
This will produce a directory with multiple files and a README.txt with file descriptions.
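To process several genomes, the single-file command above can be wrapped in a loop. This is a sketch under stated assumptions: the `run_pipeline_on_dir` function and the `DEEPBGC_CMD` variable are hypothetical names introduced here (not DeepBGC features), and the input files are assumed to use the `.fa` extension. Only the `deepbgc pipeline <file>` invocation shown above is used.

```shell
# Hypothetical helper: run the DeepBGC pipeline on every .fa file in a directory.
# DEEPBGC_CMD is an assumed override hook (not a DeepBGC option) so the loop
# can be dry-run, for example with DEEPBGC_CMD=echo.
run_pipeline_on_dir() {
    input_dir=$1
    cmd=${DEEPBGC_CMD:-deepbgc}
    for fasta in "$input_dir"/*.fa; do
        # Skip the unexpanded pattern when the directory has no .fa files.
        [ -e "$fasta" ] || continue
        echo "Processing $fasta"
        # Report a failure but keep going with the remaining files.
        $cmd pipeline "$fasta" || echo "Pipeline failed for $fasta" >&2
    done
}

# Usage (assumes a directory named genomes/ containing FASTA files):
# run_pipeline_on_dir genomes
```

Setting `DEEPBGC_CMD=echo` prints the commands that would be run without executing them, which is a cheap way to check the file list before starting a long job.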
Example output
See the DeepBGC Example Result Notebook. The data can be downloaded from the releases page.
Model training
You can train your own BGC detection and classification models; see `deepbgc train --help` for documentation and examples.
DeepBGC positives, negatives and other training and validation data can be found on the releases page.