Reference-based analysis and quantification of long RNA reads

IsoQuant

Full IsoQuant documentation is available at ablab.github.io/IsoQuant. This README is provided for convenience and is not a full user manual.

Current version: see VERSION file.

About IsoQuant

IsoQuant is a tool for the genome-based analysis of long RNA reads, such as PacBio or Oxford Nanopore. IsoQuant reconstructs and quantifies transcript models with high precision and decent recall. If a reference annotation is given, IsoQuant also assigns reads to the annotated isoforms based on their intron and exon structure, and quantifies annotated genes, isoforms, exons, and introns. If reads are grouped (e.g. according to cell type), counts are reported according to the provided grouping.

The latest IsoQuant version can be downloaded from github.com/ablab/IsoQuant/releases/latest.

Full IsoQuant documentation is available at ablab.github.io/IsoQuant.

Supported sequencing data

IsoQuant supports all kinds of long RNA data:

  • PacBio CCS
  • ONT dRNA / ONT cDNA
  • Assembled / corrected transcript sequences

Reads must be provided in FASTQ/FASTA format (can be gzipped) or unmapped BAM format. If you have already aligned your reads to the reference genome, simply provide sorted and indexed BAM files. IsoQuant expects reads to contain polyA tails. For more reliable transcript model construction, do not trim polyA tails.
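
As a quick pre-flight check for aligned input, the sketch below tests whether an index file sits next to a BAM. The helper name `bam_has_index` is hypothetical and not part of IsoQuant. The check is only a hint: it assumes the index uses one of the common file names (`.bam.bai`, `.bam.csi`, or `.bai`), and it does not inspect the BAM itself.

```shell
# bam_has_index: hypothetical helper, not part of IsoQuant.
# Returns 0 if an index file is found next to the given BAM, 1 otherwise.
bam_has_index() {
    bam=$1
    # samtools index writes <file>.bam.bai by default (or <file>.bam.csi with -c);
    # some other tools write <file>.bai instead.
    [ -f "$bam.bai" ] || [ -f "$bam.csi" ] || [ -f "${bam%.bam}.bai" ]
}

# To produce a sorted, indexed BAM with samtools (file names are placeholders):
#   samtools sort -o sample1.sorted.bam sample1.bam
#   samtools index sample1.sorted.bam
```

A possible use is `bam_has_index sample1.sorted.bam || echo "index missing"` before starting a long run.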

IsoQuant can also take aligned Illumina reads to correct long-read spliced alignments. However, short reads are not used to discover transcript models or compute abundances.

Supported reference data

A reference genome is mandatory and should be provided in multi-FASTA format (can be gzipped).

A reference gene annotation is not mandatory but is likely to increase precision and recall. It can be provided in GFF/GTF format (can be gzipped).

A pre-constructed minimap2 index can also be provided to reduce mapping time.

Citation

The paper describing IsoQuant algorithms and benchmarking is available at 10.1038/s41587-022-01565-y.

To try IsoQuant, you can use the data from the publication, available at zenodo.org/record/7611877.

Feedback and bug reports

Your comments, bug reports, and suggestions are very welcome. They will help us further improve IsoQuant. If you have any trouble running IsoQuant, please send us isoquant.log from the <output_dir> directory.

You can leave your comments and bug reports at our GitHub repository tracker or send them via email: isoquant.rna@gmail.com.

Quick start

  • IsoQuant can be installed via pip:

    pip install isoquant
    
  • Via conda (bioconda channel):

    conda create -c conda-forge -c bioconda -n isoquant python=3.12 isoquant
    
  • Or from GitHub:

    git clone https://github.com/ablab/IsoQuant.git 
    cd IsoQuant
    git checkout latest
    pip install -e .
    

Installation typically takes no more than a few minutes.

  • If running directly from the source archive, you will need Python 3 (3.8 or higher) with gffutils, pysam, biopython, pyfaidx, ssw-py, editdistance, and some other common Python libraries installed; see requirements.txt for details. You will also need minimap2 and samtools in your $PATH. All required Python libraries can be installed via:

    pip install -r requirements.txt
    
  • Verify your installation by running (typically takes less than 1 minute):

    isoquant --test
    
  • To run IsoQuant on raw FASTQ/FASTA files, use the following command:

    isoquant --reference /PATH/TO/reference_genome.fasta \
    --genedb /PATH/TO/gene_annotation.gtf \
    --fastq /PATH/TO/sample1.fastq.gz /PATH/TO/sample2.fastq.gz \
    --data_type (assembly|pacbio_ccs|nanopore) -o OUTPUT_FOLDER
    

    For example, using the toy data provided within this repository (run from the repository root):

    isoquant --fastq isoquant_tests/simple_data/chr9.4M.ont.sim.fq.gz \
    --reference isoquant_tests/simple_data/chr9.4M.fa.gz \
    --genedb isoquant_tests/simple_data/chr9.4M.gtf.gz \
    --data_type nanopore --complete_genedb -p TEST_DATA --output isoquant_test
    
  • To run IsoQuant on aligned reads (make sure your BAM is sorted and indexed) use the following command:

    isoquant --reference /PATH/TO/reference_genome.fasta \
    --genedb /PATH/TO/gene_annotation.gtf \
    --bam /PATH/TO/sample1.sorted.bam /PATH/TO/sample2.sorted.bam \
    --data_type (assembly|pacbio_ccs|nanopore) -o OUTPUT_FOLDER
    
  • If using official annotations that contain gene and transcript features, use --complete_genedb to save time.

  • Using a reference annotation is optional since version 3.0; you may perform de novo transcript discovery by omitting the --genedb option:

    isoquant --reference /PATH/TO/reference_genome.fasta \
    --fastq /PATH/TO/sample1.fastq.gz /PATH/TO/sample2.fastq.gz \
    --data_type (assembly|pacbio_ccs|nanopore) -o OUTPUT_FOLDER
    
  • If multiple files are provided, IsoQuant will create a single output annotation and a single set of gene/transcript expression tables.
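
The source-archive setup above requires minimap2 and samtools to be on $PATH. A minimal sketch of a pre-flight check follows, assuming a POSIX shell; the helper name `missing_tools` is hypothetical and not part of IsoQuant.

```shell
# missing_tools: hypothetical helper, not part of IsoQuant.
# Prints the name of every listed tool that is not found on $PATH
# and returns non-zero if at least one is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Example call before a run:
#   missing_tools minimap2 samtools || echo "install the tools listed above first"
```

The function only confirms that the executables can be found; it does not check their versions.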
