DeepBGC - Biosynthetic Gene Cluster detection and classification

DeepBGC detects biosynthetic gene clusters (BGCs) in bacterial and fungal genomes using deep learning. It employs a Bidirectional Long Short-Term Memory (BiLSTM) Recurrent Neural Network and a word2vec-like vector embedding of Pfam protein domains. The product class and activity of detected BGCs are predicted using a Random Forest classifier.

Figure: DeepBGC architecture

Install using bioconda (recommended)

  • Install Bioconda by following Steps 1 and 2 from: https://bioconda.github.io/
  • Run conda install deepbgc to install DeepBGC and all of its dependencies

Install using pip

If you don't mind installing the HMMER and Prodigal dependencies manually, you can also install DeepBGC using pip:

pip install deepbgc
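
Because a pip install leaves HMMER and Prodigal to you, it can help to confirm that both are reachable before running DeepBGC. The following is a minimal sketch, not part of DeepBGC: `missing_tools` is a hypothetical helper, and the executable names `hmmscan` and `prodigal` are assumed from standard HMMER and Prodigal installations.

```shell
# Hypothetical helper (not part of DeepBGC): print the name of every
# given tool that cannot be found on PATH. Prints nothing if all are found.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}
```

Calling `missing_tools hmmscan prodigal` prints one line per tool that still needs to be installed, and prints nothing when both are available.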

Use DeepBGC

Download models and Pfam database

Before you can use DeepBGC, download the trained models and the Pfam database:

deepbgc download

You can display downloaded dependencies and models using:

deepbgc info

Detection and classification

Figure: DeepBGC pipeline

Detect and classify BGCs in a genomic sequence. Proteins and Pfam domains are detected automatically if they are not already annotated (this requires HMMER and Prodigal).

# Show command help docs
deepbgc pipeline --help

# Detect and classify BGCs in mySequence.fa using the DeepBGC algorithm and save the output to the mySequence directory.
deepbgc pipeline mySequence.fa

This will produce a directory with multiple files and a README.txt with file descriptions.
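
To process several sequence files, the single command above can be wrapped in a loop. This is a rough sketch under stated assumptions: `run_pipeline_batch`, `output_dir_for` and the `DEEPBGC_CMD` variable are hypothetical names introduced here, and the output directory rule (file name without its final extension) is inferred only from the `mySequence.fa` example, so check it against your own runs.

```shell
# Hypothetical helper: guess the output directory for an input file by
# dropping the directory part and the final extension
# (mySequence.fa -> mySequence). Assumption based on the example above.
output_dir_for() {
    name=$(basename "$1")
    printf '%s\n' "${name%.*}"
}

# Run "deepbgc pipeline" on each input in turn and keep going if one fails.
# DEEPBGC_CMD is a hypothetical override for dry runs (e.g. DEEPBGC_CMD=echo).
run_pipeline_batch() {
    for fasta in "$@"; do
        if ! ${DEEPBGC_CMD:-deepbgc} pipeline "$fasta"; then
            printf 'pipeline failed for %s\n' "$fasta" >&2
        fi
    done
}
```

For example, `run_pipeline_batch first.fa second.fa` runs the pipeline twice, and `cat "$(output_dir_for first.fa)/README.txt"` would then show the file descriptions for the first result, provided the directory naming assumption holds.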

Example output

See the DeepBGC Example Result Notebook. The data can be downloaded from the releases page.

Figure: Detected BGC regions

Model training

You can train your own BGC detection and classification models; see deepbgc train --help for documentation and examples.

DeepBGC positives, negatives and other training and validation data can be found on the releases page.
