Detect Biosynthetic Gene Clusters (BGCs) in HiFi metagenomic data

HiFiBGC

HiFiBGC is a tool to detect Biosynthetic Gene Clusters (BGCs) in PacBio HiFi metagenomic data.

Installation

Option 1: mamba

mamba create -n hifibgc -c conda-forge -c bioconda -c amityadav -y hifibgc

mamba activate hifibgc

mamba is preferred over conda (Option 2 below) because it takes much less time and uses less memory (RAM).
To install mamba itself, see the mamba installation instructions.

Option 2: conda

conda create -n hifibgc -c conda-forge -c bioconda -c amityadav -y hifibgc

conda activate hifibgc

Usage

Install prerequisites

The command below needs to be run only once. It installs a required database and a tool.

hifibgc install

Run on test data

Test the HiFiBGC installation on a small dataset using the command below.

hifibgc test

On successful completion of the above command, you should see a message like "Snakemake finished successfully" on the terminal, along with an output directory named hifibgc1.out.
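
The check described above can be scripted. The following is a minimal sketch, not part of HiFiBGC: the function name hifibgc_smoke_test is hypothetical, and it assumes that `hifibgc test` exits with a non-zero status on failure and that hifibgc1.out is created in the current working directory.

```shell
# Hypothetical helper (not provided by HiFiBGC): run the bundled test and
# confirm that the expected output directory appeared.
# Assumptions: `hifibgc test` exits non-zero on failure, and hifibgc1.out is
# created in the current working directory.
hifibgc_smoke_test() {
    if ! hifibgc test; then
        echo "hifibgc test failed" >&2
        return 1
    fi
    if [ ! -d hifibgc1.out ]; then
        echo "hifibgc1.out not found in $(pwd)" >&2
        return 1
    fi
    echo "HiFiBGC test run looks OK"
}
```

Calling hifibgc_smoke_test from an empty working directory keeps the test output separate from real runs.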

Run on real data

Run HiFiBGC with default options on the required input (.fastq) file:

hifibgc run --input input.fastq

Specify the output directory and the number of threads:

hifibgc run --input input.fastq --output outdir --threads 50

Specify the bigscape_cutoff option:

hifibgc run --input input.fastq --bigscape_cutoff 0.3
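
When several samples need processing, the options shown above can be combined in a small shell loop. This is a minimal sketch under stated assumptions, not a feature of HiFiBGC: the function name run_hifibgc_batch, the layout of one .fastq file per sample in a single directory, the per-sample output folder naming, and the default of 8 threads are all chosen for illustration. Only the --input, --output, and --threads options from the examples above are used.

```shell
# Hypothetical helper (not provided by HiFiBGC): run HiFiBGC once per
# .fastq file found in a directory, writing each result to its own folder.
# Usage: run_hifibgc_batch READS_DIR OUT_ROOT [THREADS]
run_hifibgc_batch() {
    reads_dir=$1
    out_root=$2
    threads=${3:-8}   # assumed default thread count; adjust for your machine

    mkdir -p "$out_root"
    for fq in "$reads_dir"/*.fastq; do
        # Skip the unexpanded pattern when the directory has no .fastq files.
        [ -e "$fq" ] || continue
        sample=$(basename "$fq" .fastq)
        # Stop at the first failing sample so errors are not silently skipped.
        hifibgc run --input "$fq" --output "$out_root/$sample.out" \
            --threads "$threads" || return 1
    done
}
```

For example, run_hifibgc_batch reads results 50 would process every reads/*.fastq file with 50 threads and write each result to results/<sample>.out.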

Output

The HiFiBGC output directory contains the following folders and files.

.
└── hifibgc1.out
    ├── 01_assembly --> Output from three assemblers
    ├── 02_mapping_reads_to_merged_assembly --> Read mapping to concatenated assembly and extraction of unmapped reads 
    ├── 03_antismash --> BGC prediction
    ├── 04_bgc_clustering --> BGC clustering
    ├── 05_final_output --> Primary output of HiFiBGC
    ├── benchmarks --> Resource usage and time consumption by different components of HiFiBGC
    ├── config.yaml --> Configuration file for HiFiBGC run
    ├── hifibgc.log --> Snakemake log file
    └── logs --> Logs associated with different tools used in HiFiBGC

Of these, the folder 05_final_output contains the primary output of HiFiBGC, specifically the following folders and files.

├── 05_final_output
│   ├── BGC_all --> Folder containing all BGC .gbk files
│   ├── BGC_all_metadata.tsv --> File containing metadata associated with all BGCs
│   ├── BGC_representative --> Folder containing representative BGC .gbk files
│   └── upsetplot --> UpSet plot comparison of results from three assemblers and unmapped reads
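
A quick way to inspect the primary output is to count the BGC files and metadata lines. The following is a minimal sketch, not part of HiFiBGC: the function name summarize_hifibgc_output is hypothetical, the search for .gbk files is recursive because the exact nesting inside BGC_all and BGC_representative is not specified here, and no assumption is made about whether BGC_all_metadata.tsv has a header row.

```shell
# Hypothetical helper (not provided by HiFiBGC): summarise the primary output.
# Usage: summarize_hifibgc_output OUTPUT_DIR   (e.g. hifibgc1.out)
summarize_hifibgc_output() {
    final_dir=$1/05_final_output
    if [ ! -d "$final_dir" ]; then
        echo "not found: $final_dir" >&2
        return 1
    fi
    # Assumption: .gbk files may sit in subfolders, so search recursively.
    n_all=$(find "$final_dir/BGC_all" -type f -name '*.gbk' | wc -l)
    n_rep=$(find "$final_dir/BGC_representative" -type f -name '*.gbk' | wc -l)
    n_lines=$(wc -l < "$final_dir/BGC_all_metadata.tsv")
    # $((...)) strips the padding some wc implementations add.
    echo "all BGCs: $((n_all))"
    echo "representative BGCs: $((n_rep))"
    echo "metadata lines (header, if any, included): $((n_lines))"
}
```

For example, summarize_hifibgc_output hifibgc1.out prints three lines of counts for the test run's output directory.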

Download files

Source Distribution

hifibgc-0.1.11.tar.gz (28.4 MB)

File details

Details for the file hifibgc-0.1.11.tar.gz.

File metadata

  • Download URL: hifibgc-0.1.11.tar.gz
  • Size: 28.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.16

File hashes

Hashes for hifibgc-0.1.11.tar.gz
Algorithm    Hash digest
SHA256       efa9a3bded12c99672b735caacc515d2c7142ddec77ef8ddbe4f7b8d165c657e
MD5          ecb82f1987fac26378f0281de717671c
BLAKE2b-256  2161d6c636b76ea16f8b9394c61240804243652193a760d9245bd6d4794a134f
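
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch: the function name verify_sha256 is hypothetical, and it assumes that either sha256sum (GNU coreutils) or shasum (commonly available on macOS) is installed.

```shell
# Hypothetical helper: compare a file's SHA256 digest with an expected value.
# Usage: verify_sha256 FILE EXPECTED_HEX_DIGEST
verify_sha256() {
    file=$1
    expected=$2
    # Prefer sha256sum; fall back to shasum -a 256 when it is not installed.
    if command -v sha256sum >/dev/null 2>&1; then
        actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    else
        actual=$(shasum -a 256 "$file" | cut -d ' ' -f 1)
    fi
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}
```

For example, verify_sha256 hifibgc-0.1.11.tar.gz efa9a3bded12c99672b735caacc515d2c7142ddec77ef8ddbe4f7b8d165c657e prints an OK line when the download is intact.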
