
Obtain tidy alignment coverage info from sorted BAM files


AlignCov

AlignCov is a bioinformatics tool that reads a sorted BAM file and writes two tidy tab-separated tables: (a) alignment summary statistics and (b) per-position read depths.

Future plans

  • Create a Bioconda recipe
  • Create a Docker image

Introduction

This script takes a sorted BAM file as input and uses SAMtools and the Python library pandas to generate two tables:

  • _stats.tsv: A table of alignment summary statistics, including fold-coverages (fold_cov) and proportions of target lengths covered by mapped reads (prop_cov).
    • target: Name of the target.
    • seqlen: Length of the target sequence (bp).
    • depth: Total number of base pairs mapped to the target.
    • len_cov: Total number of base pairs within the target that are covered by at least one mapped read.
    • prop_cov: Proportion of the target length covered by at least one mapped read (len_cov / seqlen).
    • fold_cov: Fold-coverage of mapped reads to the target, i.e. the mean read depth across the whole target sequence (depth / seqlen).
  • _depth.tsv: A table of read depths for each bp position of each target.
    • target: Name of the target.
    • position: Base pair position within the target.
    • depth: Total number of reads aligned to the base pair position within the target.
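
The relationship between the two tables can be sketched in a few lines of pandas. This is an illustration of the column definitions above, not AlignCov's actual implementation. The function name summarize_depth and the seqlens mapping are hypothetical, and target lengths are passed in separately because a depth table does not necessarily list positions with zero depth.

```python
import pandas as pd


def summarize_depth(depth, seqlens):
    """Collapse a per-position depth table into per-target statistics.

    depth   : DataFrame with columns target, position, depth
    seqlens : dict mapping target name to sequence length in bp (assumed input)
    """
    # Start from the target lengths so targets without any reads are kept.
    stats = pd.DataFrame({"seqlen": pd.Series(seqlens, dtype="int64")})

    # depth: total number of base pairs mapped to each target.
    stats["depth"] = depth.groupby("target")["depth"].sum()

    # len_cov: number of positions covered by at least one read.
    covered = depth[depth["depth"] > 0]
    stats["len_cov"] = covered.groupby("target").size()

    # Targets absent from the depth table have no mapped bases.
    stats = stats.fillna(0)

    stats["prop_cov"] = stats["len_cov"] / stats["seqlen"]
    stats["fold_cov"] = stats["depth"] / stats["seqlen"]

    stats.index.name = "target"
    return stats.reset_index()


if __name__ == "__main__":
    # Toy data: a 4 bp target with one uncovered position.
    toy = pd.DataFrame({
        "target": ["A", "A", "A", "A"],
        "position": [1, 2, 3, 4],
        "depth": [2, 0, 3, 1],
    })
    print(summarize_depth(toy, {"A": 4}))
```

In the toy data, three of four positions are covered and six bases are mapped in total, so prop_cov and fold_cov differ even though both are normalized by seqlen.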

Dependencies

  • samtools>=1.15
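
Because SAMtools is an external dependency that pip does not install, it can help to confirm the installed version before running AlignCov. The following is a minimal sketch, assuming samtools is on the PATH and that the first line of `samtools --version` has the form "samtools X.Y". The helper names parse_samtools_version and samtools_is_supported are hypothetical and not part of AlignCov.

```python
import re
import shutil
import subprocess

# Minimum version listed under Dependencies.
MIN_SAMTOOLS = (1, 15)


def parse_samtools_version(version_text):
    """Extract (major, minor) from the output of `samtools --version`."""
    match = re.search(r"samtools (\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError("could not find a samtools version string")
    return int(match.group(1)), int(match.group(2))


def samtools_is_supported():
    """Return True if samtools is installed and at least MIN_SAMTOOLS."""
    if shutil.which("samtools") is None:
        return False
    result = subprocess.run(
        ["samtools", "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_samtools_version(result.stdout) >= MIN_SAMTOOLS


if __name__ == "__main__":
    print("samtools OK" if samtools_is_supported() else "samtools missing or too old")
```

Comparing versions as integer tuples avoids the string-comparison pitfall where "1.9" would sort after "1.15".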

Installation

AlignCov can be installed with pip:

pip install aligncov

Usage

Quick start

For a sorted BAM file named 'bacillus.bam', compute alignment statistics and read depths, and save results to files named 'subtilis_stats.tsv' and 'subtilis_depth.tsv':

$ aligncov -i bacillus.bam -o subtilis
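
The same command can be repeated over a directory of BAM files from Python. This is a minimal sketch, assuming aligncov is installed and on the PATH. Only the -i and -o flags come from the program's help message; the helper names build_aligncov_command and run_aligncov_batch and the directory layout are hypothetical.

```python
import subprocess
from pathlib import Path


def build_aligncov_command(bam_path, out_dir):
    """Build the aligncov command line for one sorted BAM file.

    The output base name is the BAM file name without its extension,
    placed inside out_dir.
    """
    bam_path = Path(bam_path)
    out_base = Path(out_dir) / bam_path.stem
    return ["aligncov", "-i", str(bam_path), "-o", str(out_base)]


def run_aligncov_batch(bam_dir, out_dir):
    """Run aligncov once for every *.bam file found in bam_dir."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for bam_path in sorted(Path(bam_dir).glob("*.bam")):
        # check=True stops the batch at the first failing file.
        subprocess.run(build_aligncov_command(bam_path, out_dir), check=True)


if __name__ == "__main__":
    print(build_aligncov_command("bacillus.bam", "results"))
```

Passing the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.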

More options

To show the program's help message:

$ aligncov -h
usage: aligncov [-h] -i INPUT [-o OUTPUT]

Parse a sorted BAM file to generate two tables: a table of alignment summary statistics ('_stats.tsv'), including fold-coverages (fold_cov) and proportions of target lengths covered by mapped reads (prop_cov), and a table of read
depths ('_depth.tsv') for each bp position of each target.

options:
  -h, --help            show this help message and exit

Required:
  -i INPUT, --input INPUT
                        Path to sorted BAM file to process.

Optional:
  -o OUTPUT, --output OUTPUT
                        Path and base name of files to save as tab-separated tables ('[output]_stats.tsv', '[output]_depth.tsv'). Default: 'sample'
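
Once the two tables are written, they can be loaded back into pandas for downstream filtering. The sketch below assumes that both files carry a header row with the column names listed in the Introduction. The function names load_aligncov_tables and well_covered_targets and the 0.9 threshold are hypothetical choices for illustration.

```python
import pandas as pd


def load_aligncov_tables(prefix):
    """Read '[prefix]_stats.tsv' and '[prefix]_depth.tsv' into DataFrames."""
    stats = pd.read_csv(f"{prefix}_stats.tsv", sep="\t")
    depth = pd.read_csv(f"{prefix}_depth.tsv", sep="\t")
    return stats, depth


def well_covered_targets(stats, min_prop_cov=0.9):
    """Return names of targets whose prop_cov is at least min_prop_cov."""
    return stats.loc[stats["prop_cov"] >= min_prop_cov, "target"].tolist()


if __name__ == "__main__":
    # 'sample' is the default output base name when -o is not given.
    stats, depth = load_aligncov_tables("sample")
    print(well_covered_targets(stats))
```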

Credits

Packages

  • pandas: McKinney W. 2011. pandas: a foundational Python library for data analysis and statistics. Python for High Performance and Scientific Computing 1–9.

Dependencies

  • SAMtools: Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, Li H. 2021. Twelve years of SAMtools and BCFtools. GigaScience 10(2) giab008. doi: 10.1093/gigascience/giab008

Project structure

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.0.2 (2023-08-13)

  • Improved/fixed documentation on GitHub and PyPI.

0.0.1 (2023-08-12)

  • First release on PyPI.
