Viral metagenomics framework for short and long reads

Project description


A hecatomb is a great sacrifice or an extensive loss. Hecatomb the software empowers an analyst to make data-driven decisions to 'sacrifice' false-positive viral reads from metagenomes and so enrich for true-positive viral reads. This process frequently results in a great loss of suspected viral sequences / contigs.

Documentation

Complete documentation is hosted at Read the Docs

Citation

Hecatomb is currently on bioRxiv!

Quick start guide

Snakemake profiles (for running on HPCs)

Hecatomb is powered by Snakemake and greatly benefits from the use of Snakemake profiles on HPC clusters. More information and an example of setting up Snakemake profiles for Hecatomb are available in the documentation.

Install Hecatomb

Option 1: pip

# Optional: create a virtual environment with conda or venv
conda create -n hecatomb python=3.10

# activate
conda activate hecatomb

# Install
pip install hecatomb

Option 2: Conda

# Create the conda env and install hecatomb in one step
conda create -n hecatomb -c conda-forge -c bioconda hecatomb

# activate
conda activate hecatomb

Check installation

hecatomb --help

Install databases

# locally: using 8 threads (default is 32 threads)
hecatomb install --threads 8

# HPC: using a snakemake profile named 'slurm'
hecatomb install --profile slurm

Run test dataset

# locally: using 32 threads and 64 GB RAM by default
hecatomb test

# HPC: using a profile named 'slurm'
hecatomb test --profile slurm

Inputs

Parsing samples with --reads

You can pass either a directory of reads or a TSV file to --reads. Note that Hecatomb expects your read file names to include common R1/R2 tags.

  • Directory: Hecatomb will infer sample names and _R1/_R2 pairs from the filenames.
  • TSV file: Hecatomb expects 2 or 3 columns, with column 1 being the sample name and columns 2 and 3 the read files.

More information and examples are available here
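
As a rough illustration of the TSV layout described above, the following sketch builds a 3-column file from a directory of paired reads. The directory, the sample names, and the `_R1.fastq.gz` / `_R2.fastq.gz` suffixes are hypothetical, and the absence of a header row is an assumption. This is not how Hecatomb itself parses a directory. The `hecatomb run` call in the final comment is taken from the Hecatomb documentation rather than from this page.

```shell
set -eu

# Hypothetical input: a scratch directory with empty placeholder read files.
reads_dir="$(mktemp -d)"
touch "$reads_dir/sampleA_R1.fastq.gz" "$reads_dir/sampleA_R2.fastq.gz" \
      "$reads_dir/sampleB_R1.fastq.gz" "$reads_dir/sampleB_R2.fastq.gz"

samples_tsv="$reads_dir/samples.tsv"
: > "$samples_tsv"   # start with an empty file (assumption: no header row)

for r1 in "$reads_dir"/*_R1.fastq.gz; do
    # Sample name = file name with the R1 suffix removed.
    sample="$(basename "$r1" _R1.fastq.gz)"
    r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
    if [ -f "$r2" ]; then
        # Paired reads: sample name, R1 file, R2 file.
        printf '%s\t%s\t%s\n' "$sample" "$r1" "$r2" >> "$samples_tsv"
    else
        # Unpaired reads: sample name and a single read file.
        printf '%s\t%s\n' "$sample" "$r1" >> "$samples_tsv"
    fi
done

cat "$samples_tsv"

# The resulting file could then be passed to Hecatomb, for example:
#   hecatomb run --reads "$samples_tsv"
```

A TSV is mainly useful when file names do not follow the R1/R2 convention or when sample names should differ from the file names.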

Library preprocessing with --trim

Hecatomb uses Trimnami for read trimming, which supports many different trimming methods. Current options are fastp (default), prinseq, roundAB, filtlong (long reads), cutadapt (FASTA input), and notrim (skip trimming). See Trimnami's documentation for more information.
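
A small pre-flight check can catch a mistyped trimming method before a long run starts. The sketch below only validates a name against the options listed above and prints the command it would run. The helper `is_valid_trimmer` and the `reads/` directory are hypothetical, and the `hecatomb run` subcommand is assumed from the Hecatomb documentation.

```shell
set -eu

# Trimming methods listed in this README.
valid_trimmers="fastp prinseq roundAB filtlong cutadapt notrim"

# is_valid_trimmer NAME: succeed if NAME is one of the listed methods.
is_valid_trimmer() {
    for t in $valid_trimmers; do
        if [ "$t" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# Use the first argument if given, otherwise the default method (fastp).
trimmer="${1:-fastp}"

if is_valid_trimmer "$trimmer"; then
    # Only print the command; 'reads/' is a placeholder directory.
    echo "Would run: hecatomb run --reads reads/ --trim $trimmer"
else
    echo "Unknown trimming method: $trimmer" >&2
    exit 1
fi
```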

Dependencies

The only dependency you need to get up and running with Hecatomb is conda or the Python package manager pip. Hecatomb relies on conda (and mamba) to ensure portability and ease of installation of its dependencies. All of Hecatomb's dependencies are installed during installation or at runtime, so you don't have to worry about a thing!

Links

Hecatomb @ PyPI

Hecatomb @ bioconda

Hecatomb @ bio.tools

Hecatomb @ WorkflowHub
