A collection of utilities for computing statistics on genomic sequences and alignments.

NGS statter

A package to compute read and alignment statistics for fastq and BAM files. It is used as part of an NGS analysis pipeline to generate intermediate QC results and final alignment statistics for all samples.

Parsers

  • Fastq parser
  • GFF parser
  • BAM parser
  • kraken2 report aggregator
  • fastp trimming stats aggregator
  • STAR aligner alignment stats to json
  • Parser to fix the UMI position in flexbar fastq files

Plotting functions

  • Fastq read length distribution plot
  • Overall aligned read length distribution plot
  • Gene type specific read length distribution plot
  • Alignment statistics

Installation

This package can be installed using pip:

$ pip install ngs-statter

Documentation

Detailed documentation is available on this Read the Docs page. A brief description of the commands available in this package is given below.

Commands

Use ngs-statter -h for a list of available helper commands.

Commands are grouped based on the input as follows:

alignment

Parse alignment (.bam) files and generate basic statistics

| command | description | usage |
|---|---|---|
| bam | Compute basic alignment statistics for a generic BAM file | ngs-statter bam -h |
| STAR | Compute alignment statistics for a STAR aligned BAM file and output as JSON | ngs-statter STAR -h |

crosslink

Parse crosslink (.bed) output files from Shoji or htseq-clip

| command | description | usage |
|---|---|---|
| csv-meta-example | Print example CSV metadata file for crosslink plotting to console | ngs-statter csv-meta-example -h |
| count-crosslinks | Count crosslinking sites over secondary structure/primary motif regions | ngs-statter count-crosslinks -h |
| crosslink-line-plot | Plot crosslinking sites over secondary structure/primary motif regions as line plots | ngs-statter crosslink-line-plot -h |
| crosslink-heatmap | Plot crosslinking sites over secondary structure/primary motif regions as heatmaps | ngs-statter crosslink-heatmap -h |

fastq

Parse fastq files to compute and plot read-length distributions

| command | description | usage |
|---|---|---|
| parse-read-length | Compute fastq read length distribution | ngs-statter parse-read-length -h |
| plot-read-length | Read in json formatted output files (from parse-read-length) and output html plot | ngs-statter plot-read-length -h |
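
To make the idea of a read-length distribution concrete, here is a minimal Python sketch that is independent of ngs-statter. It assumes an uncompressed fastq stream with exactly four lines per record; the function name read_length_distribution and the JSON layout are illustrative assumptions, not the package's API or output format.

```python
import io
import json
from collections import Counter


def read_length_distribution(handle):
    """Count read lengths in a 4-line-per-record fastq stream (hypothetical helper)."""
    counts = Counter()
    for line_number, line in enumerate(handle):
        # The sequence is the second line of every four-line record.
        if line_number % 4 == 1:
            counts[len(line.strip())] += 1
    # Sort by read length so the result is stable and easy to plot.
    return dict(sorted(counts.items()))


# Tiny in-memory example: two 4-base reads and one 6-base read.
example = io.StringIO(
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nACGTAC\n+\nIIIIII\n"
    "@r3\nTTGA\n+\nIIII\n"
)
distribution = read_length_distribution(example)
print(json.dumps(distribution))
```

For the actual input and output options of the package commands, consult ngs-statter parse-read-length -h.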

flexbar

Fix the position of UMIs in flexbar output fastq files

| command | description | usage |
|---|---|---|
| fix-header | Fix flexbar UMI headers | ngs-statter fix-header -h |

genetype

Given a gff3 formatted gene annotation file and an alignment (.bam) file, compute and plot the read-length distribution of reads aligning to each gene type (protein coding, lncRNA, ...).

| command | description | usage |
|---|---|---|
| parse-gene-type-read-length | Compute gene type read length distributions | ngs-statter parse-gene-type-read-length -h |
| plot-gene-type-read-length | Plot gene type read length distributions using outputs from parse-gene-type-read-length | ngs-statter plot-gene-type-read-length -h |
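
As a rough illustration of the annotation side of this step, the sketch below builds a gene ID to gene type lookup from gff3 text using only the standard library. It is not how ngs-statter is implemented: the function name gene_types_from_gff3 is hypothetical, and the attribute key gene_type is an assumption that matches GENCODE-style annotations (other sources may use a different key, such as biotype). Assigning aligned reads to genes would additionally need a BAM reader and is left out.

```python
import io


def gene_types_from_gff3(handle, type_key="gene_type"):
    """Return {gene_id: gene_type} for 'gene' features (hypothetical helper)."""
    gene_types = {}
    for line in handle:
        # Skip directives, comments, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # gff3 has nine tab-separated columns; column 3 is the feature type.
        if len(fields) != 9 or fields[2] != "gene":
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attributes = dict(
            item.split("=", 1) for item in fields[8].split(";") if "=" in item
        )
        if "ID" in attributes and type_key in attributes:
            gene_types[attributes["ID"]] = attributes[type_key]
    return gene_types


# Made-up annotation lines for illustration only.
example = io.StringIO(
    "##gff-version 3\n"
    "chr1\tHAVANA\tgene\t100\t900\t.\t+\t.\tID=gene1;gene_type=protein_coding\n"
    "chr1\tHAVANA\texon\t100\t200\t.\t+\t.\tID=exon1;Parent=gene1\n"
    "chr1\tHAVANA\tgene\t2000\t2500\t.\t-\t.\tID=gene2;gene_type=lncRNA\n"
)
gene_types = gene_types_from_gff3(example)
print(gene_types)
```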

kraken

Given NCBI taxonomy files and a set of Kraken2 species classification reports, merge the reports into one single output.

| command | description | usage |
|---|---|---|
| collect-reports | Collect kraken2 reports into a single file | ngs-statter collect-reports -h |
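
The merging idea can be sketched in plain Python as follows. This is a simplified illustration, not the package's implementation: it assumes the standard six-column, tab-separated kraken2 report (percentage, clade reads, direct reads, rank code, taxid, name), it ignores the NCBI taxonomy files that collect-reports takes as input, and the function names and sample data are made up.

```python
import io
from collections import defaultdict


def parse_kraken2_report(handle):
    """Yield (taxid, name, clade_reads) from a six-column kraken2 report."""
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        # Columns: percent, clade reads, direct reads, rank code, taxid, name.
        # Names are indented by taxonomic depth, so strip the padding.
        yield fields[4], fields[5].strip(), int(fields[1])


def merge_reports(reports):
    """Merge {sample: handle} into {(taxid, name): {sample: clade_reads}}."""
    merged = defaultdict(dict)
    for sample, handle in reports.items():
        for taxid, name, reads in parse_kraken2_report(handle):
            merged[(taxid, name)][sample] = reads
    return dict(merged)


# Two made-up reports for illustration only.
report_a = io.StringIO(
    " 10.00\t10\t10\tU\t0\tunclassified\n"
    " 90.00\t90\t5\tS\t9606\t      Homo sapiens\n"
)
report_b = io.StringIO(
    " 60.00\t60\t60\tU\t0\tunclassified\n"
    " 40.00\t40\t2\tS\t9606\t      Homo sapiens\n"
)
merged = merge_reports({"a": report_a, "b": report_b})
for (taxid, name), per_sample in merged.items():
    print(taxid, name, per_sample)
```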

sample

| command | description | usage |
|---|---|---|
| sample-stats | Collect statistics from various steps in a workflow (trimming, alignment, ...) and compile into a single json file | ngs-statter sample-stats -h |
| compile-stats | Collect stats from multiple samples (output from sample-stats) into a single tsv file | ngs-statter compile-stats -h |
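
The sketch below shows one way per-sample json statistics could be flattened into a single tsv table. It is a standalone illustration under stated assumptions: each sample is assumed to be a flat json object, and the function name compile_stats, the keys raw_reads and aligned_reads, and the NA placeholder are all hypothetical rather than the format that sample-stats or compile-stats actually uses.

```python
import csv
import io
import json


def compile_stats(sample_json, out_handle):
    """Write one tsv row per sample from {sample_name: json_string}."""
    records = {name: json.loads(text) for name, text in sample_json.items()}
    # Use the union of keys so samples with missing steps still fit the table.
    columns = sorted({key for stats in records.values() for key in stats})
    writer = csv.writer(out_handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample"] + columns)
    for name in sorted(records):
        row = [records[name].get(column, "NA") for column in columns]
        writer.writerow([name] + row)


# Made-up per-sample statistics for illustration only.
samples = {
    "s1": '{"raw_reads": 1000, "aligned_reads": 900}',
    "s2": '{"raw_reads": 500}',
}
buffer = io.StringIO()
compile_stats(samples, buffer)
print(buffer.getvalue())
```

Taking the union of keys across samples keeps the table rectangular even when a sample skipped a workflow step.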

Hentze Group, EMBL Heidelberg

Built Distribution

ngs_statter-1.20.3-cp312-cp312-manylinux_2_28_x86_64.whl (3.7 MB; CPython 3.12, manylinux glibc 2.28+, x86-64)
