Detect Biosynthetic Gene Clusters (BGCs) in HiFi metagenomic data
HiFiBGC
HiFiBGC is a tool to detect Biosynthetic Gene Clusters (BGCs) in PacBio HiFi metagenomic data.
Installation
Option 1: mamba
mamba create -n hifibgc -c conda-forge -c bioconda -c amityadav -y hifibgc
mamba activate hifibgc
mamba is preferred over conda (Option 2 below) because it takes much less time and uses less memory (RAM).
For installing mamba itself, see the mamba documentation.
Option 2: conda
conda create -n hifibgc -c conda-forge -c bioconda -c amityadav -y hifibgc
conda activate hifibgc
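After either option, it can be useful to confirm that the `hifibgc` executable is visible in the activated environment. The sketch below uses only the standard `command -v` shell builtin; `check_tool` is a hypothetical helper name introduced here and is not part of HiFiBGC.

```shell
# Report whether a command is available on PATH.
# check_tool is a generic helper written for this example, not a HiFiBGC feature.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1 -> $(command -v "$1")"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Run this inside the activated environment.
check_tool hifibgc || echo "hifibgc not on PATH; activate the environment first (mamba activate hifibgc)"
```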
Usage
Install prerequisites
The command below needs to be run only once. It installs a required database and a tool.
hifibgc install
Run on test data
Test the HiFiBGC installation on a small dataset using the command below.
hifibgc test
On successful completion of the above command, you should see a message such as "Snakemake finished successfully" in the terminal, along with an output directory named hifibgc1.out.
Run on real data
Run HiFiBGC with default options on the required input (.fastq) file:
hifibgc run --input input.fastq
Specify the output directory and the number of threads:
hifibgc run --input input.fastq --output outdir --threads 50
Specify the bigscape_cutoff option:
hifibgc run --input input.fastq --bigscape_cutoff 0.3
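When several samples need processing, the commands above can be wrapped in a loop. The following is a minimal sketch, assuming one `.fastq` file per sample in a single directory; the function name `run_batch`, the `DRY_RUN` variable, and the `reads`/`results` directory names are hypothetical. Only the `--input`, `--output`, and `--threads` options shown above are used. By default the sketch only prints the commands so they can be reviewed first.

```shell
# Print (default) or execute one `hifibgc run` call per FASTQ file in a directory.
# Usage: run_batch READS_DIR OUT_ROOT [THREADS]
# Set DRY_RUN=0 to actually launch the runs.
run_batch() {
    reads_dir=$1
    out_root=$2
    threads=${3:-8}
    for fq in "$reads_dir"/*.fastq; do
        [ -e "$fq" ] || continue              # glob matched nothing
        sample=$(basename "$fq" .fastq)       # e.g. reads/a.fastq -> a
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "hifibgc run --input $fq --output $out_root/$sample --threads $threads"
        else
            mkdir -p "$out_root"
            hifibgc run --input "$fq" --output "$out_root/$sample" --threads "$threads"
        fi
    done
}

# Dry run: prints one command per sample found in ./reads (nothing if none exist).
run_batch reads results 16
```

Writing each sample to its own output directory keeps the results of different runs separate.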
Output
The output directory from HiFiBGC contains the following folders and files.
.
└── hifibgc1.out
    ├── 01_assembly --> Output from three assemblers
    ├── 02_mapping_reads_to_merged_assembly --> Read mapping to concatenated assembly and extraction of unmapped reads
    ├── 03_antismash --> BGC prediction
    ├── 04_bgc_clustering --> BGC clustering
    ├── 05_final_output --> Primary output of HiFiBGC
    ├── benchmarks --> Resource usage and time consumption by different components of HiFiBGC
    ├── config.yaml --> Configuration file for HiFiBGC run
    ├── hifibgc.log --> Snakemake log file
    └── logs --> Logs associated with different tools used in HiFiBGC
Of these, the folder 05_final_output contains the primary output of HiFiBGC, specifically the following folders and files.
├── 05_final_output
│   ├── BGC_all --> Folder containing all BGC .gbk files
│   ├── BGC_all_metadata.tsv --> File containing metadata associated with all BGCs
│   ├── BGC_representative --> Folder containing representative BGC .gbk files
│   └── upsetplot --> Upsetplot comparison of results from three assemblers and unmapped reads
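A quick way to inspect a finished run is to count the `.gbk` files in the two BGC folders and list the column names of the metadata file. The sketch below relies only on the folder and file names listed above; the helper name `summarize_bgcs` is hypothetical. It makes two assumptions: `.gbk` files may sit in subfolders (hence `find`), and the first line of `BGC_all_metadata.tsv` is a tab-separated header.

```shell
# Summarize the primary output of a HiFiBGC run directory.
# Usage: summarize_bgcs OUTPUT_DIR   (e.g. hifibgc1.out)
summarize_bgcs() {
    final_dir=$1/05_final_output
    if [ ! -d "$final_dir" ]; then
        echo "no 05_final_output folder under $1" >&2
        return 1
    fi
    # Count .gbk files in each BGC folder (searching subfolders as well).
    for sub in BGC_all BGC_representative; do
        n=$(find "$final_dir/$sub" -type f -name '*.gbk' 2>/dev/null | wc -l | tr -d ' ')
        echo "$sub: $n .gbk files"
    done
    # Print the first line of the metadata file, one column name per line
    # (assumes the first line is a tab-separated header).
    meta=$final_dir/BGC_all_metadata.tsv
    if [ -f "$meta" ]; then
        head -n 1 "$meta" | tr '\t' '\n'
    fi
}

# Summarize the test-run output if it exists in the current directory.
if [ -d hifibgc1.out ]; then
    summarize_bgcs hifibgc1.out
fi
```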
Source Distribution
File details
Details for the file hifibgc-0.1.11.tar.gz.
File metadata
- Download URL: hifibgc-0.1.11.tar.gz
- Upload date:
- Size: 28.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.16
File hashes
Algorithm | Hash digest
---|---
SHA256 | efa9a3bded12c99672b735caacc515d2c7142ddec77ef8ddbe4f7b8d165c657e
MD5 | ecb82f1987fac26378f0281de717671c
BLAKE2b-256 | 2161d6c636b76ea16f8b9394c61240804243652193a760d9245bd6d4794a134f
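If the source distribution is downloaded manually, its SHA256 digest can be compared with the value in the table above. This is a minimal sketch, assuming the GNU coreutils `sha256sum` command is available (on macOS, `shasum -a 256` produces the same digest format); `verify_sha256` is a hypothetical helper name.

```shell
# Compare a file's SHA256 digest with an expected value; returns 0 on match.
verify_sha256() {
    file=$1
    expected=$2
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}

# Check the tarball if it is present in the current directory.
# The expected digest is the SHA256 value listed in the table above.
if [ -f hifibgc-0.1.11.tar.gz ]; then
    verify_sha256 hifibgc-0.1.11.tar.gz efa9a3bded12c99672b735caacc515d2c7142ddec77ef8ddbe4f7b8d165c657e
fi
```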