
Viral metagenomics framework for short and long reads

Project description



A hecatomb is a great sacrifice or an extensive loss. Hecatomb the software empowers an analyst to make data-driven decisions to 'sacrifice' false-positive viral reads from metagenomes to enrich for true-positive viral reads. This process frequently results in a great loss of suspected viral sequences / contigs.


Documentation

Complete documentation is hosted at Read the Docs

Citation

Hecatomb: an integrated software platform for viral metagenomics, Michael J Roach, Sarah J Beecroft, Kathie A Mihindukulasuriya, Leran Wang, Anne Paredes, Luis Alberto Chica Cárdenas, Kara Henry-Cocks, Lais Farias Oliveira Lima, Elizabeth A Dinsdale, Robert A Edwards, Scott A Handley, GigaScience, Volume 13, 2024, giae020, https://doi.org/10.1093/gigascience/giae020

Quick start guide

Install Hecatomb

Option 1: pip

# Optional: create a virtual environment with conda or venv
conda create -n hecatomb python=3.10

# activate
conda activate hecatomb

# Install
pip install hecatomb

Option 2: Conda

# Create the conda env and install hecatomb in one step
conda create -n hecatomb -c conda-forge -c bioconda hecatomb

# activate
conda activate hecatomb

Check installation

hecatomb --help

Install databases and envs

Download the databases

# 8 threads = 8 downloads at a time
hecatomb install --threads 8

Optional: prebuild envs

These are automatically built when running hecatomb, but manually pre-building is useful if your cluster nodes are isolated from the internet.

hecatomb test build_envs

Run test dataset

# locally: using 32 threads and 64 GB RAM by default
hecatomb test --threads 32

# HPC: using a profile named 'slurm'
hecatomb test --profile slurm

Snakemake profiles (for running on HPCs)

Hecatomb is powered by Snakemake and greatly benefits from the use of Snakemake profiles for HPC clusters. More information and examples for setting up Snakemake profiles for Hecatomb are available in the documentation.

NOTE: Hecatomb currently uses Snakemake version 7. The recent version 8 for Snakemake has some breaking changes, including some changes to the command line interface for cluster execution. Any new Snakemake v8 profiles might not work with Hecatomb. Please open an issue if you need help setting up a profile.
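
As a rough illustration of what a Snakemake v7 profile can look like, the sketch below writes a minimal config.yaml for a Slurm cluster. The function name write_slurm_profile is hypothetical, and the sbatch options, resource names (mem_mb, time) and values are placeholder assumptions rather than Hecatomb's recommended settings; adapt them to your scheduler and check the Hecatomb documentation for a tested profile.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal Snakemake v7 profile for a Slurm cluster.
# The sbatch options and resource values are placeholders, not Hecatomb defaults.
write_slurm_profile() {
    local profile_dir="$1"
    mkdir -p "$profile_dir"
    cat > "$profile_dir/config.yaml" <<'EOF'
# Keys mirror Snakemake's long command-line options (v7 syntax)
cluster: "sbatch --parsable --cpus-per-task={threads} --mem={resources.mem_mb} --time={resources.time}"
default-resources: [mem_mb=2000, time=1440]
jobs: 100
latency-wait: 60
use-conda: true
EOF
}

# Snakemake looks for named profiles under ~/.config/snakemake/<name>/
# write_slurm_profile "$HOME/.config/snakemake/slurm"
```

With the profile written to a directory named slurm, it can be selected with --profile slurm as in the test command above.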

Inputs

Parsing samples with --reads

You can pass either a directory of reads or a TSV file to --reads. Note that Hecatomb expects paired read file names to include common R1/R2 tags.

  • Directory: Hecatomb will infer sample names and various R1/2 tag combinations from the filenames.
  • TSV file: Hecatomb expects 2 or 3 columns, with column 1 being the sample name and columns 2 and 3 the read files.

More information and examples are available in the documentation.
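
If you prefer an explicit TSV over directory parsing, the file can be generated with a few lines of shell. The sketch below is only an illustration: make_samples_tsv is a hypothetical helper, it assumes paired files are named <sample>_R1... and <sample>_R2..., and it writes no header row.

```shell
#!/usr/bin/env bash
# Sketch: build a samples TSV (sample name, R1 file, optional R2 file) from a directory.
# Assumes paired files are named <sample>_R1<suffix> and <sample>_R2<suffix>.
make_samples_tsv() {
    local dir="$1" r1 r2 base sample
    for r1 in "$dir"/*_R1*; do
        [ -e "$r1" ] || continue              # no matches: the glob stays literal
        base="$(basename "$r1")"
        sample="${base%%_R1*}"                # text before the first _R1 tag
        r2="$dir/${base/_R1/_R2}"             # mate: first _R1 replaced with _R2
        if [ -e "$r2" ]; then
            printf '%s\t%s\t%s\n' "$sample" "$r1" "$r2"
        else
            printf '%s\t%s\n' "$sample" "$r1" # unpaired: 2 columns only
        fi
    done
}

# Example:
# make_samples_tsv reads > samples.tsv
```

The resulting samples.tsv can then be passed to --reads in place of the directory.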

Longread support with --longreads

Pass the --longreads argument to tell Hecatomb that you are using long reads.

Library preprocessing with --trim

Hecatomb uses Trimnami for read trimming, which supports many trimming methods. Current options are fastp (default), prinseq, roundAB, filtlong (for longreads), cutadapt (FASTA input), and notrim (skip trimming). See Trimnami's documentation for more information.

Configuration

You can configure advanced parameters for Hecatomb.

# Copy the default config
hecatomb config

# Edit the config file in your favourite text editor
nano hecatomb.out/hecatomb.config.yaml

Dependencies

The only dependency you need to get up and running with Hecatomb is conda or the Python package manager pip. Hecatomb relies on conda to ensure portability and ease of installation of its dependencies. All of Hecatomb's dependencies are installed during installation or at runtime, so you don't have to worry about a thing!

Links

Hecatomb @ PyPI

Hecatomb @ bioconda

Hecatomb @ bio.tools

Hecatomb @ WorkflowHub

Hecatomb RRID:SCR_025002



Download files


Source Distribution

hecatomb-1.3.4.tar.gz (98.5 MB)

Uploaded Source

Built Distribution


hecatomb-1.3.4-py3-none-any.whl (98.6 MB)

Uploaded Python 3

File details

Details for the file hecatomb-1.3.4.tar.gz.

File metadata

  • Download URL: hecatomb-1.3.4.tar.gz
  • Upload date:
  • Size: 98.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.23

File hashes

Hashes for hecatomb-1.3.4.tar.gz
Algorithm Hash digest
SHA256 0963aa7c071c84072338aa914a17d40f510d2cad4fe8640960752d0b58cc8469
MD5 e4256188dab4fef346bf5c1471999150
BLAKE2b-256 b9000ee6b8728ae6977dc8dfeada809b7e00be1c679178798debce3bf0134df1

See more details on using hashes here.
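
One way to check a downloaded file against a published digest is sketched below. verify_sha256 is a hypothetical helper, and the sketch assumes GNU coreutils' sha256sum is on the PATH (on macOS, shasum -a 256 is the usual substitute).

```shell
#!/usr/bin/env bash
# Sketch: compare a file's SHA256 digest with a published value.
# Assumes sha256sum (GNU coreutils) is available.
verify_sha256() {
    local file="$1" expected="$2" actual
    actual="$(sha256sum "$file" | awk '{print $1}')"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}

# Example with the source distribution digest listed above:
# verify_sha256 hecatomb-1.3.4.tar.gz 0963aa7c071c84072338aa914a17d40f510d2cad4fe8640960752d0b58cc8469
```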

File details

Details for the file hecatomb-1.3.4-py3-none-any.whl.

File metadata

  • Download URL: hecatomb-1.3.4-py3-none-any.whl
  • Upload date:
  • Size: 98.6 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.23

File hashes

Hashes for hecatomb-1.3.4-py3-none-any.whl
Algorithm Hash digest
SHA256 134601b18955886f18211b0a5cc61ce0442dd142d94580634adc99f30953681f
MD5 04d38d884f3ae55f4d5e52e62b8d1eb0
BLAKE2b-256 6b6c4a3a33a138b5c7d3ad141c748db3cb2a80a4b51be5f8e8bd6a5e161ae8fd

See more details on using hashes here.
