DeepBGC - Biosynthetic Gene Cluster detection and classification

DeepBGC detects biosynthetic gene clusters (BGCs) in bacterial and fungal genomes using deep learning. It employs a Bidirectional Long Short-Term Memory (BiLSTM) Recurrent Neural Network and a word2vec-like vector embedding of Pfam protein domains. The product class and activity of detected BGCs are predicted using a Random Forest classifier.

[Figure: DeepBGC architecture]

Install using pip

pip install deepbgc

Use DeepBGC

Download models and Pfam database

Before you can use DeepBGC, download the trained models and the Pfam database:

deepbgc download

You can display downloaded dependencies and models using:

deepbgc info

Detection and classification

[Figure: DeepBGC pipeline]

Detect and classify BGCs in a genomic sequence. Proteins and Pfam domains are detected automatically if the sequence is not already annotated; this step requires HMMER and Prodigal to be installed.
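
Because unannotated input relies on external tools, it can be worth checking for them before starting a long run. The following is a minimal Python sketch and not part of DeepBGC: the helper name missing_tools is hypothetical, and the executable names prodigal and hmmscan are assumptions about how Prodigal and HMMER are exposed on the PATH.

```python
import shutil


def missing_tools(tools=("prodigal", "hmmscan")):
    """Return the names in `tools` that cannot be found on the PATH.

    The default names are assumptions: `prodigal` for Prodigal and
    `hmmscan` as a representative HMMER executable.
    """
    # shutil.which returns None when no matching executable is found.
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("Prodigal and HMMER executables found.")
```

If the check reports a missing tool, install it before running the pipeline on a sequence that has no protein or Pfam annotations.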

# Show command help docs
deepbgc pipeline --help

# Detect and classify BGCs in mySequence.fa using the DeepBGC algorithm and save the output to the mySequence directory.
deepbgc pipeline mySequence.fa

This will produce a directory with multiple files and a README.txt with file descriptions.
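
To get a quick overview of a finished run, the output directory can be inspected from Python. This is a minimal sketch under stated assumptions: the helper summarize_output is hypothetical, and README.txt is the only file name taken from the description above; the other files depend on the DeepBGC version and options used.

```python
from pathlib import Path


def summarize_output(out_dir):
    """Return (sorted file names, README text or None) for an output directory."""
    out_path = Path(out_dir)
    # Only regular files directly inside the directory are listed.
    files = sorted(p.name for p in out_path.iterdir() if p.is_file())
    readme = out_path / "README.txt"
    readme_text = readme.read_text() if readme.is_file() else None
    return files, readme_text


if __name__ == "__main__":
    # "mySequence" is the directory created by `deepbgc pipeline mySequence.fa`.
    if Path("mySequence").is_dir():
        names, readme_text = summarize_output("mySequence")
        print("\n".join(names))
        if readme_text:
            print(readme_text)
```

Reading README.txt first is the safer route, since it describes what each generated file contains.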

Example output

See the DeepBGC Example Result Notebook. The data can be downloaded from the releases page.

[Figure: Detected BGC Regions]

Model training

You can train your own BGC detection and classification models. See deepbgc train --help for documentation and examples.

The DeepBGC positive and negative samples, along with other training and validation data, can be found on the releases page.

Download files

Source Distribution

deepbgc-0.1.3.tar.gz (41.2 kB)

Built Distribution

deepbgc-0.1.3-py3-none-any.whl (58.3 kB)

File details

Details for the file deepbgc-0.1.3.tar.gz.

File metadata

  • Download URL: deepbgc-0.1.3.tar.gz
  • Upload date:
  • Size: 41.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.7

File hashes

Hashes for deepbgc-0.1.3.tar.gz
Algorithm    Hash digest
SHA256       3a91988182fae36eacdf570b9e9bc2db2afb81183c14002bdc0cb60296188dc4
MD5          e304153c3f92624aded17b723f6e17d9
BLAKE2b-256  88d87572ad95f22a63c331c7ad962fc37a11bd3b59e8e4676d175551d3d8d3e6

File details

Details for the file deepbgc-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: deepbgc-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 58.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.7

File hashes

Hashes for deepbgc-0.1.3-py3-none-any.whl
Algorithm    Hash digest
SHA256       2128c2d0dd5151979e27dc11fcacbbc79cb04cfc8f868ddb2612ea56e3247a71
MD5          775a1c8e5cd7f91b6bda17fac4e17cd2
BLAKE2b-256  39cd82d59973cac6548000f9260143678b0cb8c1b8b714e1098c2d81ebbf0fcd
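
The SHA256 digests listed for the two files above can be checked against a downloaded copy. The following is a minimal Python sketch using only the standard library; the helper names sha256_of and matches_published_hash are hypothetical, and the sketch assumes the downloaded file keeps its original file name.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file hash listings above.
EXPECTED_SHA256 = {
    "deepbgc-0.1.3.tar.gz":
        "3a91988182fae36eacdf570b9e9bc2db2afb81183c14002bdc0cb60296188dc4",
    "deepbgc-0.1.3-py3-none-any.whl":
        "2128c2d0dd5151979e27dc11fcacbbc79cb04cfc8f868ddb2612ea56e3247a71",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path):
    """Return True only if the file name is known and its digest matches."""
    expected = EXPECTED_SHA256.get(Path(path).name)
    return expected is not None and sha256_of(path) == expected


if __name__ == "__main__":
    for name in EXPECTED_SHA256:
        if Path(name).is_file():
            print(name, "OK" if matches_published_hash(name) else "MISMATCH")
```

A mismatch usually points to an incomplete or altered download, so the file should be fetched again before installing it.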