Detect Biosynthetic Gene Clusters (BGCs) in HiFi metagenomic data

HiFiBGC

HiFiBGC is a tool to detect Biosynthetic Gene Clusters (BGCs) in PacBio HiFi metagenomic data.

Installation

Option 1: mamba

mamba create -n hifibgc -c conda-forge -c bioconda -c amityadav -y hifibgc

mamba activate hifibgc

mamba is preferred over conda (Option 2 below) because it takes much less time and uses less memory (RAM).
See the mamba documentation for installation instructions.

Option 2: conda

conda create -n hifibgc -c conda-forge -c bioconda -c amityadav -y hifibgc

conda activate hifibgc

Usage

Install prerequisites

The command below needs to be run only once. It installs a required database and a tool.

hifibgc install

Run on test data

Test the HiFiBGC installation on a small dataset using the command below.

hifibgc test

On successful completion of the above command, you should see a message such as "Snakemake finished successfully" in the terminal, along with an output directory named hifibgc1.out.

Run on real data

Run HiFiBGC with default options on the required input (.fastq) file:

hifibgc run --input input.fastq  

Specify the output directory and the number of threads:

hifibgc run --input input.fastq --output outdir --threads 50

Specify the bigscape_cutoff option:

hifibgc run --input input.fastq --bigscape_cutoff 0.3
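
When several samples need processing, the commands above can be wrapped in a small loop. The following is a minimal sketch, not part of HiFiBGC itself: the function name run_hifibgc_batch and the per-sample output naming (sample.out) are assumptions made for illustration, and only the --input, --output and --threads options shown above are used.

```shell
# Hypothetical wrapper (not provided by HiFiBGC): run HiFiBGC once per FASTQ
# file, writing each sample to its own output directory.
# Usage: run_hifibgc_batch THREADS FILE.fastq [FILE.fastq ...]
run_hifibgc_batch() {
    threads="$1"
    shift
    for fastq in "$@"; do
        # Derive the output name from the file name, e.g. reads/a.fastq -> a.out
        # (assumed naming scheme, chosen for this example only).
        sample=$(basename "$fastq" .fastq)
        # Stop at the first failing sample so errors are not silently skipped.
        hifibgc run --input "$fastq" --output "${sample}.out" --threads "$threads" || return 1
    done
}
```

For example, run_hifibgc_batch 50 sample1.fastq sample2.fastq would produce sample1.out and sample2.out in the current directory. Samples run one after another, so the thread count applies to each run in turn.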

Output

The output directory from HiFiBGC contains the following folders and files.

.
└── hifibgc1.out
    ├── 01_assembly --> Output from three assemblers
    ├── 02_mapping_reads_to_merged_assembly --> Read mapping to concatenated assembly and extraction of unmapped reads 
    ├── 03_antismash --> BGC prediction
    ├── 04_bgc_clustering --> BGC clustering
    ├── 05_final_output --> Primary output of HiFiBGC
    ├── benchmarks --> Resource usage and time consumption by different components of HiFiBGC
    ├── config.yaml --> Configuration file for HiFiBGC run
    ├── hifibgc.log --> Snakemake log file
    └── logs --> Logs associated with different tools used in HiFiBGC
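
The listing above can double as a quick sanity check after a run. The sketch below is a hypothetical helper (check_hifibgc_layout is not a HiFiBGC command) that reports any entry from the listing that is absent. It assumes every run produces all nine entries, which may not hold for runs that stop early.

```shell
# Hypothetical helper (not provided by HiFiBGC): verify that an output
# directory contains the entries listed in this README.
# Usage: check_hifibgc_layout [OUTPUT_DIR]   (defaults to hifibgc1.out)
check_hifibgc_layout() {
    outdir="${1:-hifibgc1.out}"
    missing=0
    for entry in 01_assembly 02_mapping_reads_to_merged_assembly 03_antismash \
                 04_bgc_clustering 05_final_output benchmarks config.yaml \
                 hifibgc.log logs; do
        # -e matches both files and directories.
        if [ ! -e "$outdir/$entry" ]; then
            echo "missing: $outdir/$entry"
            missing=1
        fi
    done
    if [ "$missing" -eq 0 ]; then
        echo "layout OK: $outdir"
    fi
    # Return 0 when complete, 1 when anything is missing.
    return "$missing"
}
```

Calling check_hifibgc_layout with no argument inspects hifibgc1.out, the directory created by the test run.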

Of these, the folder 05_final_output contains the primary output of HiFiBGC, specifically the following folders and files.

├── 05_final_output
│   ├── BGC_all --> Folder containing all BGC .gbk files
│   ├── BGC_all_metadata.tsv --> File containing metadata associated with all BGCs
│   ├── BGC_representative --> Folder containing representative BGC .gbk files
│   └── upsetplot --> UpSet plot comparing results from three assemblers and unmapped reads
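
To get a quick overview of a finished run, the .gbk files in the two BGC folders can be counted. This is a minimal sketch under stated assumptions: summarize_final_output is a hypothetical helper name, and the count simply covers every file ending in .gbk anywhere under each folder, since the exact nesting inside BGC_all and BGC_representative is not described here.

```shell
# Hypothetical helper (not provided by HiFiBGC): count BGC .gbk files in the
# final output folder.
# Usage: summarize_final_output [FINAL_OUTPUT_DIR]
summarize_final_output() {
    final="${1:-hifibgc1.out/05_final_output}"
    # find searches recursively, so nested .gbk files are counted as well.
    all=$(find "$final/BGC_all" -type f -name '*.gbk' | wc -l)
    rep=$(find "$final/BGC_representative" -type f -name '*.gbk' | wc -l)
    # Arithmetic expansion strips the padding some wc implementations add.
    echo "all BGCs: $((all))"
    echo "representative BGCs: $((rep))"
}
```

The per-BGC details remain in BGC_all_metadata.tsv, which can be opened in any spreadsheet or TSV viewer.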
