A collection of utilities for computing statistics on genomic sequences and alignments.
NGS statter
A package to compute read and alignment statistics for FASTQ and BAM files. It is used as part of an NGS analysis pipeline to generate intermediate QC results and final alignment statistics for all samples.
Parsers
- Fastq parser
- GFF parser
- BAM parser
- kraken2 report aggregator
- fastp trimming stats aggregator
- STAR aligner alignment stats to JSON converter
- Parser to fix UMI positions in flexbar FASTQ files
Plotting functions
- Fastq read length distribution plot
- Overall aligned read length distribution plot
- Gene type specific read length distribution plot
- Alignment statistics
Installation
This package can be installed using pip:
$ pip install ngs-statter
Documentation
Detailed documentation is available on Read the Docs. A brief description of the commands available in this package is given below.
Commands
Use ngs-statter -h for a list of available commands.
Commands are grouped based on the input as follows:
alignment
Parse alignment (.bam) files and generate basic statistics
| command | description | usage |
|---|---|---|
| bam | Compute basic alignment statistics for a generic BAM file | ngs-statter bam -h |
| STAR | Compute alignment statistics for a STAR aligned BAM file and output as JSON | ngs-statter STAR -h |
crosslink
Parse output crosslink (.bed) files from Shoji or htseq-clip
| command | description | usage |
|---|---|---|
| csv-meta-example | Print example CSV metadata file for crosslink plotting to console | ngs-statter csv-meta-example -h |
| count-crosslinks | Count crosslinking sites over secondary structure/primary motif regions | ngs-statter count-crosslinks -h |
| crosslink-line-plot | Plot crosslinking sites over secondary structure/primary motif regions as line plots | ngs-statter crosslink-line-plot -h |
| crosslink-heatmap | Plot crosslinking sites over secondary structure/primary motif regions as heatmaps | ngs-statter crosslink-heatmap -h |
fastq
Parse fastq files to compute and plot read-length distributions
| command | description | usage |
|---|---|---|
| parse-read-length | Compute fastq read length distribution | ngs-statter parse-read-length -h |
| plot-read-length | Read in JSON formatted output files (from parse-read-length) and output an HTML plot | ngs-statter plot-read-length -h |
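To make the idea of a read-length distribution concrete, here is a minimal sketch in plain Python. It is not the implementation used by parse-read-length, and the function name `read_length_distribution` is hypothetical. It assumes a well-formed FASTQ with exactly four lines per record.

```python
import io
import json
from collections import Counter
from itertools import islice


def read_length_distribution(handle):
    """Count read lengths in a FASTQ stream with 4 lines per record.

    Hypothetical helper for illustration; not part of ngs-statter.
    """
    counts = Counter()
    # The sequence is the second line of each record, so take every
    # fourth line starting at index 1.
    for seq in islice(handle, 1, None, 4):
        counts[len(seq.rstrip("\r\n"))] += 1
    # Sort by read length so the output is stable.
    return dict(sorted(counts.items()))


# Three reads: two of length 4, one of length 6.
example = io.StringIO(
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nACGTAC\n+\nIIIIII\n"
    "@r3\nACGT\n+\nIIII\n"
)
dist = read_length_distribution(example)
print(json.dumps(dist))
```

For a gzip-compressed file, the same function can be given a handle opened with `gzip.open(path, "rt")` from the standard library.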
flexbar
Fix the position of UMIs in flexbar output fastq files
| command | description | usage |
|---|---|---|
| fix-header | Fix flexbar UMI headers | ngs-statter fix-header -h |
genetype
Given a gff3 formatted gene annotation file and an alignment (.bam) file, compute and plot read-length distribution of reads aligning to gene types (protein coding, lncRNA,...).
| command | description | usage |
|---|---|---|
| parse-gene-type-read-length | Compute gene type read length distributions | ngs-statter parse-gene-type-read-length -h |
| plot-gene-type-read-length | Plot gene type read length distributions using outputs from parse-gene-type-read-length | ngs-statter plot-gene-type-read-length -h |
kraken
Given NCBI taxonomy files and a set of Kraken2 species classification reports, merge the reports into a single output.
| command | description | usage |
|---|---|---|
| collect-reports | Collect kraken2 reports into a single file | ngs-statter collect-reports -h |
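As a rough illustration of what merging reports involves, the sketch below combines clade-level read counts from several Kraken2 reports, keyed by taxonomy ID. It assumes the standard six-column, tab-separated Kraken2 report layout (percentage, clade reads, direct reads, rank code, taxonomy ID, name). It does not use NCBI taxonomy files and is not the logic behind collect-reports. The function name `merge_kraken_reports` and the sample names are hypothetical.

```python
def merge_kraken_reports(reports):
    """Merge Kraken2 reports into per-taxon, per-sample clade read counts.

    `reports` maps a sample name to an iterable of report lines.
    Hypothetical helper for illustration; not part of ngs-statter.
    """
    names = {}   # taxonomy ID -> scientific name
    counts = {}  # taxonomy ID -> {sample name: clade read count}
    for sample, lines in reports.items():
        for line in lines:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 6:
                continue  # skip blank or malformed lines
            taxid = fields[4].strip()
            # Names are indented to show taxonomy depth; strip the padding.
            names[taxid] = fields[5].strip()
            counts.setdefault(taxid, {})[sample] = int(fields[1])
    return names, counts


# Two small made-up reports for samples "a" and "b".
report_a = [
    " 50.00\t50\t50\tU\t0\tunclassified\n",
    " 50.00\t50\t10\tS\t562\t    Escherichia coli\n",
]
report_b = [
    "100.00\t20\t20\tS\t562\t    Escherichia coli\n",
]
names, counts = merge_kraken_reports({"a": report_a, "b": report_b})
for taxid in sorted(counts, key=int):
    print(taxid, names[taxid], counts[taxid])
```

Taxa missing from a sample simply have no entry for that sample, so a downstream table can fill those cells with zero.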
sample
| command | description | usage |
|---|---|---|
| sample-stats | Collect statistics from various steps in a workflow (trimming, alignment,...) and compile into a single json file | ngs-statter sample-stats -h |
| compile-stats | Collect stats from multiple samples (output from sample-stats) into a single TSV file | ngs-statter compile-stats -h |
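The step from per-sample JSON to one TSV can be pictured with the sketch below. It assumes each sample's statistics are a flat mapping of field name to value. The field names (`input_reads`, `aligned_reads`), the `NA` placeholder and the function name `compile_stats` are hypothetical, and the actual output layout of compile-stats may differ.

```python
import csv
import io


def compile_stats(samples):
    """Build a TSV table with one row per sample and one column per field.

    `samples` maps a sample name to a flat dict of statistics.
    Hypothetical helper for illustration; not part of ngs-statter.
    """
    # Use the union of all field names so samples with missing
    # fields still line up under the same header.
    columns = sorted({key for stats in samples.values() for key in stats})
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample", *columns])
    for name in sorted(samples):
        row = [samples[name].get(column, "NA") for column in columns]
        writer.writerow([name, *row])
    return buffer.getvalue()


# Made-up per-sample statistics; in practice each dict could come
# from json.load() on one per-sample JSON file.
tsv = compile_stats(
    {
        "s1": {"input_reads": 100, "aligned_reads": 90},
        "s2": {"input_reads": 50},
    }
)
print(tsv, end="")
```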
Hentze Group, EMBL Heidelberg