RNA-seq Quality Control — modernized fork of RSeQC

rseqc-redux

rseqc-redux is a modernized fork of RSeQC (RNA-seq Quality Control), originally by Liguo Wang. It updates the RSeQC 5.0.1 codebase with modern Python packaging, comprehensive tests, and CI — while preserving the original functionality.

Requirements

  • Python 3.10 or higher
  • Indexed BAM files (.bam with corresponding .bam.bai index files)
  • A BED-format gene model / reference annotation
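
These requirements can be checked before any tool is run. The following is a minimal sketch using only the standard library; check_inputs is a hypothetical helper, not part of rseqc-redux, and it assumes the index sits beside the BAM as sample.bam.bai.

```python
import sys
from pathlib import Path


def check_inputs(bam, bed, min_python=(3, 10)):
    """Return a list of problems; an empty list means every check passed.

    Hypothetical helper for illustration; not part of rseqc-redux.
    """
    problems = []
    if sys.version_info < min_python:
        problems.append("Python %d.%d or higher is required" % min_python)
    bam_path = Path(bam)
    if not bam_path.is_file():
        problems.append(f"BAM file not found: {bam_path}")
    # Assumption: the index is named <file>.bam.bai and sits beside the BAM.
    index_path = Path(str(bam_path) + ".bai")
    if not index_path.is_file():
        problems.append(f"BAM index not found: {index_path}")
    if not Path(bed).is_file():
        problems.append(f"BED gene model not found: {bed}")
    return problems
```

Calling check_inputs("sample.bam", "gene_model.bed") and printing each returned message gives a quick pre-flight report; the check only tests that the files exist, not that their contents are valid.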

Dependencies

Core dependencies are automatically installed:

  • pysam — BAM/SAM file handling
  • bx-python — Interval indexing and overlap
  • numpy — Numerical operations
  • pyBigWig — BigWig file I/O
  • matplotlib / logomaker — Plotting

Installation

From PyPI (Recommended)

pip install rseqc-redux

From Source

git clone https://github.com/semenko/rseqc-redux.git
cd rseqc-redux
pip install .

Development Installation

git clone https://github.com/semenko/rseqc-redux.git
cd rseqc-redux
uv sync
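
After any of the installation routes above, one way to confirm that the package is visible to the current interpreter is to query its distribution metadata. This is a sketch using the standard library's importlib.metadata; installed_version is a hypothetical helper name, and "rseqc-redux" is the distribution name used in the pip command above.

```python
from importlib import metadata


def installed_version(dist_name="rseqc-redux"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else "rseqc-redux is not installed in this environment")
```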

Quick Start

# Basic BAM statistics
bam_stat -i sample.bam

# Infer RNA-seq strandedness
infer_experiment -r gene_model.bed -i sample.bam

# Transcript integrity number
tin -i sample.bam -r gene_model.bed

# Gene body coverage
geneBody_coverage -r gene_model.bed -i sample.bam -o output

# Read distribution over genomic features
read_distribution -r gene_model.bed -i sample.bam

# Junction annotation
junction_annotation -r gene_model.bed -i sample.bam -o output
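
The same commands can be driven from Python when several samples need processing. The sketch below assumes only the -i, -r and -o flags shown above; build_command and run_tool are hypothetical helpers, not an API provided by rseqc-redux.

```python
import shutil
import subprocess


def build_command(tool, bam, gene_model=None, out_prefix=None):
    """Build an argv list using only the -i, -r and -o flags from the Quick Start."""
    cmd = [tool, "-i", bam]
    if gene_model is not None:
        cmd += ["-r", gene_model]
    if out_prefix is not None:
        cmd += ["-o", out_prefix]
    return cmd


def run_tool(cmd):
    """Run one command, raising if the tool is missing or exits non-zero."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH; is rseqc-redux installed?")
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    commands = [
        build_command("bam_stat", "sample.bam"),
        build_command("infer_experiment", "sample.bam", gene_model="gene_model.bed"),
        build_command("geneBody_coverage", "sample.bam",
                      gene_model="gene_model.bed", out_prefix="output"),
    ]
    for cmd in commands:
        # Print rather than execute; pass cmd to run_tool() to actually run it.
        print(" ".join(cmd))
```

Passing an argv list to subprocess.run avoids shell quoting problems with file names that contain spaces.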

Available Tools

Tool                 Description
bam_stat             Summarize mapping statistics of a BAM file
bam2fq               Convert BAM alignments to FASTQ format
bam2wig              Convert BAM to wiggle/BigWig
divide_bam           Equally divide BAM file into n parts
split_bam            Split BAM by chromosome
split_paired_bam     Split paired-end BAM into two single-end BAMs
infer_experiment     Infer RNA-seq strandedness
inner_distance       Inner distance between read pairs
RNA_fragment_size    Fragment size statistics per gene
tin                  Calculate Transcript Integrity Number
geneBody_coverage    Gene body coverage profile
geneBody_coverage2   Gene body coverage from BigWig input
read_distribution    Reads over genomic features (CDS, UTR, intron, etc.)
read_duplication     Read duplication rate
read_GC              GC content of reads
read_NVC             Nucleotide composition (ACGT) along reads
read_quality         Per-position quality scores
read_hexamer         Hexamer frequency analysis
junction_annotation  Annotate splice junctions
junction_saturation  Splice junction saturation analysis
FPKM_count           FPKM expression quantification
FPKM-UQ              Upper-quartile normalized FPKM
RPKM_saturation      RPKM saturation analysis
mismatch_profile     Mismatch profile along reads
insertion_profile    Insertion profile along reads
deletion_profile     Deletion profile along reads
clipping_profile     Clipping profile along reads
normalize_bigwig     Normalize BigWig signal to fixed wigsum
overlay_bigwig       Pairwise operations on two BigWig files
sc_bamStat           Single-cell RNA-seq mapping statistics
sc_editMatrix        Barcode/UMI error correction heatmaps
sc_seqLogo           DNA sequence logo from FASTQ/FASTA
sc_seqQual           Sequencing quality heatmap from FASTQ

Upgrading from RSeQC 5.x or earlier rseqc-redux releases

Soft clipping bug fix (post-6.1.0)

fetch_exon() in bam_cigar.py had a long-standing bug inherited from the original RSeQC: soft clip CIGAR operations incorrectly advanced the reference coordinate, shifting exon boundaries rightward for reads with leading soft clips. The sibling function fetch_intron() already handled this correctly. This affects output from any script that resolves read-to-genome exon coordinates: bam2wig, read_distribution, read_duplication, RPKM_saturation, FPKM_count, and inner_distance.
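
To make the coordinate shift concrete, here is a minimal sketch of converting a CIGAR to exon blocks. It is not the code in bam_cigar.py: exon_blocks and its soft_clip_advances switch are hypothetical names, the integer operation codes follow the SAM specification numbering, and deletions are assumed to split blocks.

```python
# CIGAR operation codes in SAM specification order:
# M=0, I=1, D=2, N=3, S=4, H=5, P=6, '='=7, X=8
M, I, D, N, S, H, P, EQ, X = range(9)


def exon_blocks(start, cigar, soft_clip_advances=False):
    """Return 0-based, half-open (start, end) reference blocks for one alignment.

    start is the leftmost aligned reference position; cigar is a list of
    (operation, length) pairs. Setting soft_clip_advances=True reproduces the
    described bug, for illustration only.
    """
    pos = start
    blocks = []
    for op, length in cigar:
        if op in (M, EQ, X):
            # Aligned bases cover the reference and form an exon block.
            blocks.append((pos, pos + length))
            pos += length
        elif op in (D, N):
            # Deletions and skipped regions consume reference without a block.
            pos += length
        elif op == S and soft_clip_advances:
            # Buggy behavior: a soft clip must not move the reference position.
            pos += length
        # I, H, P, and S (when handled correctly) consume no reference bases.
    return blocks


cigar = [(S, 5), (M, 20), (N, 100), (M, 30)]
print(exon_blocks(100, cigar))                           # [(100, 120), (220, 250)]
print(exon_blocks(100, cigar, soft_clip_advances=True))  # [(105, 125), (225, 255)]
```

With a 5 bp leading soft clip, every block in the buggy output sits 5 bp to the right of the correct one. A trailing soft clip has no effect in either mode because no block follows it.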

Should I re-run my analyses? Almost certainly not. The shifts are small (1–10 bp, matching the soft clip length) and only affect the ~5% of reads with leading soft clips. Gene-level counts, coverage profiles, and quality metrics change by amounts well below meaningful thresholds. See CHANGES.md for a per-script impact table.
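
To check how these figures apply to a particular dataset, the fraction of reads with a leading soft clip can be estimated from CIGAR strings (column 6 of SAM records). This is a sketch under that assumption; leading_soft_clip and soft_clip_summary are hypothetical helpers, not part of rseqc-redux, and how the CIGAR strings are extracted is left to the reader.

```python
import re

# One CIGAR element: a length followed by an operation letter.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def leading_soft_clip(cigar):
    """Return the length of the leading soft clip in a CIGAR string (0 if none)."""
    for length, op in _CIGAR_OP.findall(cigar):
        if op == "H":
            # A hard clip may precede the soft clip; skip past it.
            continue
        return int(length) if op == "S" else 0
    return 0  # empty or unavailable CIGAR ("*")


def soft_clip_summary(cigars):
    """Return (fraction of reads with a leading soft clip, largest clip length)."""
    clips = [leading_soft_clip(c) for c in cigars]
    if not clips:
        return 0.0, 0
    clipped = [c for c in clips if c > 0]
    return len(clipped) / len(clips), max(clips)


print(soft_clip_summary(["5S20M", "25M", "10H3S12M", "20M5S"]))  # (0.5, 5)
```

CIGAR strings are written in reference order regardless of read strand, so the first soft clip in the string is the one on the left side of the alignment.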

Contributing

Contributions are welcome!

Development Commands

# Install development dependencies
uv sync

# Run tests
uv run pytest

# Run full test matrix
uv run nox

# Lint and format
uv run ruff check .
uv run ruff format .

# Type check
uv run mypy rseqc/

License

Distributed under the terms of the GPLv3+ license, rseqc-redux is free and open source software.

Issues

If you encounter any problems, please file an issue with a description of the problem, steps to reproduce, and relevant error messages.

Credits

Original RSeQC by Liguo Wang — rseqc.sourceforge.net

Modernization by Nick Semenkovich (@semenko).
