Detect Biosynthetic Gene Clusters (BGCs) in HiFi metagenomic data
Project description
HiFiBGC
HiFiBGC is a tool to detect Biosynthetic Gene Clusters (BGCs) in PacBio HiFi metagenomic data.
Installation
Option 1: mamba
mamba create -n hifibgc -c conda-forge -c bioconda -c amityadav -y hifibgc
mamba activate hifibgc
mamba is preferred over below conda as it takes much lesser time and consumes lesser memory (RAM).
mamba can be installed from here.
Option 2: conda
conda create -n hifibgc -c conda-forge -c bioconda -c amityadav -y hifibgc
conda activate hifibgc
HiFiBGC uses following third-party tools: hifiasm-meta, metaFlye, HiCanu, Minimap2, SAMtools, antiSMASH, BiG-SCAPE, complex-upsetplot, Snaketool, Snaketool-utils
Usage
Install prerequisites
Below command need to be run only once. It installs a required database and a tool.
hifibgc install
Run on test data
Test installation of HiFiBGC on a small dataset using below command.
hifibgc test
On successful completion of above command, you should see something like Snakemake finished successfully
on terminal and an output directory hifibgc1.out
.
Run on real data
Run HiFiBGC with default options with a required input (.fastq) file:
hifibgc run --input input.fastq
Specify output directory and no of threads:
hifibgc run --input input.fastq --output outdir --threads 50
Specify bigscape_cutoff option:
hifibgc run --input input.fastq --bigscape_cutoff 0.3
Output
The output directory from HiFiBGC contains following folders and files.
.
└── hifibgc1.out
├── 01_assembly --> Output from three assemblers
├── 02_mapping_reads_to_merged_assembly --> Read mapping to concatenated assembly and extraction of unmapped reads
├── 03_antismash --> BGC prediction
├── 04_bgc_clustering --> BGC clustering
├── 05_final_output --> Primary output of HiFiBGC
├── benchmarks --> Resource usage and time consumption by different components of HiFiBGC
├── config.yaml --> Configuration file for HiFiBGC run
├── hifibgc.log --> Snakemake log file
└── logs --> Logs associated with different tools used in HiFiBGC
Among above, the folder 05_final_output
contains primary output of HiFiBGC, specifically following folders and files.
├── 05_final_output
│ ├── BGC_all --> Folder containing all BGC .gbk files
│ ├── BGC_all_metadata.tsv --> File containing metadata associated with all BGCs
│ ├── BGC_representative --> Folder containing representative BGC .gbk files
│ ├── upsetplot --> Upsetplot comparison of results from three assemblers and unmapped reads
Commands
$hifibgc --help
Usage: hifibgc [OPTIONS] COMMAND [ARGS]...
Detect Biosynthetic Gene Clusters (BGCs) in HiFi metagenomic data. For
more options, run: hifibgc command --help
Options:
-v, --version Show the version and exit.
-h, --help Show this message and exit.
Commands:
run Run HiFiBGC
install Install required database and tool
test Test HiFiBGC
config Copy the system default config file
citation Print the citation(s) for this tool
$hifibgc run --help
Usage: hifibgc run [OPTIONS] [SNAKE_ARGS]...
Run HiFiBGC
Options:
--input TEXT Input file [required]
--output PATH Output directory [default: hifibgc1.out]
--bigscape_cutoff FLOAT BiG-SCAPE cutoff parameter [default: 0.3]
--configfile TEXT Custom config file [default:
(outputDir)/config.yaml]
--threads INTEGER Number of threads to use [default: 80]
--use-conda / --no-use-conda Use conda for Snakemake rules [default: use-
conda]
--conda-prefix PATH Custom conda env directory
--snake-default TEXT Customise Snakemake runtime args [default:
--rerun-incomplete, --printshellcmds,
--nolock, --show-failed-logs]
-h, --help Show this message and exit.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file hifibgc-0.1.14.tar.gz
.
File metadata
- Download URL: hifibgc-0.1.14.tar.gz
- Upload date:
- Size: 28.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.16
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 8ce7d663d2093e43e7b2945e9cdd16cb41fc8db6f94df267e4bb24aec6cd5ee4 |
|
MD5 | 0be939b778bc017e63ff054c66d611ef |
|
BLAKE2b-256 | 7ef8ef7065f93ec9e32c3267ae6738a371b693ed16168d3acf536af0a4c02a10 |