Plotting suite for Oxford Nanopore sequencing data and alignments

Plotting tool for Oxford Nanopore sequencing data and alignments.

Example plot

The example plot above shows a bivariate plot comparing log transformed read length with average basecall Phred quality score. More examples can be found in the gallery on my blog ‘Gigabase Or Gigabyte’.
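
The two axes of such a plot can be derived per read as sketched below. This is an illustration only, not NanoPlot's own code: the function names are hypothetical, the quality string is assumed to use the standard Phred+33 encoding, and the mean quality is assumed to be computed by averaging error probabilities rather than raw Phred scores.

```python
import math


def mean_phred(quality_string, offset=33):
    """Mean quality of one read, averaged over error probabilities.

    ASSUMPTION: Phred+33 encoding and probability-based averaging;
    NanoPlot may compute its average differently.
    """
    # Convert each ASCII character to a Phred score, then to an error probability.
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    # Average the probabilities and convert the result back to the Phred scale.
    return -10 * math.log10(sum(probs) / len(probs))


def plot_coordinates(sequence, quality_string):
    """Return (log10 read length, mean quality) for a single read."""
    return math.log10(len(sequence)), mean_phred(quality_string)


# A 1000-base read in which every base has Phred score 20 (character '5').
x, y = plot_coordinates("ACGT" * 250, "5" * 1000)
print(round(x, 2), round(y, 2))
```

Averaging probabilities instead of Phred scores keeps a few very poor bases from being masked by many good ones, which is why the sketch takes that route.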

In addition to the plots, a NanoStats file is created that summarizes key features of the dataset.

This script extracts data from Oxford Nanopore sequencing output in the following formats:
- fastq files (can be bgzip, bzip2 or gzip compressed)
- fastq files generated by albacore or MinKNOW containing additional information (can be bgzip, bzip2 or gzip compressed)
- sorted bam files
- sequencing_summary.txt output table generated by albacore
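
As a rough illustration of how compressed fastq input can be handled, the sketch below picks a reader from the file extension and iterates over records. It is not taken from NanoPlot: the function names are hypothetical, the extension-based detection is an assumption, and records are assumed to span exactly four lines.

```python
import bz2
import gzip


def open_fastq(path):
    """Open a fastq file as text, choosing a reader from the file extension.

    ASSUMPTION: compression is inferred from the extension only.
    """
    if path.endswith((".gz", ".bgz")):
        # bgzip output is a valid multi-member gzip stream, so gzip can read it.
        return gzip.open(path, "rt")
    if path.endswith(".bz2"):
        return bz2.open(path, "rt")
    return open(path, "rt")


def read_fastq(path):
    """Yield (identifier, sequence, quality) for each four-line fastq record."""
    with open_fastq(path) as handle:
        while True:
            header = handle.readline().rstrip()
            if not header:
                break  # end of file
            sequence = handle.readline().rstrip()
            handle.readline()  # skip the '+' separator line
            quality = handle.readline().rstrip()
            # Drop the leading '@' and any description after the identifier.
            yield header[1:].split()[0], sequence, quality
```

Calling `list(read_fastq("reads1.fastq.gz"))` would then return one tuple per read, regardless of which of the three compression formats was used.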

INSTALLATION

pip install NanoPlot

Upgrade to a newer version using:
pip install NanoPlot --upgrade

or

conda install -c bioconda nanoplot

STATUS

The script is written for python3.

USAGE

NanoPlot [-h] [-v] [-t THREADS] [--maxlength MAXLENGTH]
                [--drop_outliers] [--downsample DOWNSAMPLE] [--loglength]
                [--readtype {1D,2D,1D2}] [--alength] [-c COLOR] [-o OUTDIR]
                [-p PREFIX]
                [-f {eps,jpeg,jpg,pdf,pgf,png,ps,raw,rgba,svg,svgz,tif,tiff}]
                [--plots [{kde,hex,dot,pauvre} [{kde,hex,dot,pauvre} ...]]]
                [--barcoded]
                (--fastq [FASTQ [FASTQ ...]] | --fastq_rich [FASTQ_RICH [FASTQ_RICH ...]] | --fastq_minimal [FASTQ_MINIMAL [FASTQ_MINIMAL ...]] | --summary [SUMMARY [SUMMARY ...]] | --bam [BAM [BAM ...]] | --listcolors)


Exactly one of the following input arguments is required:
    --fastq file(s)         Data presented is in fastq format exported from fast5
                            files by e.g. poretools.
    --fastq_rich file(s)    Data presented is in fastq format generated by
                            Albacore or MinKNOW with additional information concerning
                            channel and time.
    --bam file(s)           Data presented as a sorted bam file.
    --summary file(s)       Data is a summary file generated by albacore.
    --fastq_minimal file(s) Data is in fastq format generated by albacore or
                            MinKNOW with additional information concerning channel
                            and time. Minimal data is extracted swiftly without
                            elaborate checks.
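
For the --summary input, the kind of extraction involved can be sketched as reading a tab-separated table and keeping the read length and mean quality of each read. This is an illustration under stated assumptions, not NanoPlot's implementation: the function name is hypothetical, and the column names `sequence_length_template` and `mean_qscore_template` are assumed here and may differ between albacore versions and read types.

```python
import csv
import io


def read_summary(handle,
                 length_col="sequence_length_template",
                 quality_col="mean_qscore_template"):
    """Return a list of (read length, mean quality) from a tab-separated summary.

    ASSUMPTION: the column names are illustrative defaults; check the header
    of your own summary file.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [(int(row[length_col]), float(row[quality_col])) for row in reader]


# A tiny made-up table standing in for a real summary file.
example = io.StringIO(
    "read_id\tchannel\tstart_time\tsequence_length_template\tmean_qscore_template\n"
    "read1\t12\t3.5\t1500\t9.8\n"
    "read2\t40\t7.25\t820\t11.2\n"
)
records = read_summary(example)
print(records)
```

With a real file, the same function can be called as `read_summary(open("sequencing_summary.txt"))`.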


Each of these options can take one or multiple files e.g.
--summary summary1.txt summary2.txt summary3.txt
--bam alignment1.bam alignment2.bam


Arguments for optional filtering:
    --readtype              Specify read type to extract from summary file
                            Options: 1D (default), 2D or 1D2
    --barcoded              Use if you want to split the summary file by barcode
    --maxlength MAXLENGTH   Drop reads longer than length N.
    --downsample DOWNSAMPLE Reduce dataset to N reads by random sampling.
    --drop_outliers         Drop outlier reads with extremely long length.
    --loglength             Logarithmic scaling of lengths in plots.
    --alength               Use aligned read lengths rather than sequenced length (bam mode).
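
The effect of the length-related options can be sketched on a plain list of read lengths. This is a hypothetical illustration, not NanoPlot's code: the function name, the order in which the steps are applied, and in particular the outlier rule (more than three standard deviations above the median) are assumptions made for the example.

```python
import math
import random
import statistics


def filter_lengths(lengths, maxlength=None, drop_outliers=False,
                   downsample=None, loglength=False, seed=None):
    """Apply length filters in an assumed order and return the kept values."""
    kept = list(lengths)
    if maxlength is not None:
        # Drop reads longer than maxlength.
        kept = [n for n in kept if n <= maxlength]
    if drop_outliers and len(kept) > 1:
        # ASSUMPTION: an outlier is a read more than 3 standard deviations
        # above the median; the rule NanoPlot uses may differ.
        cutoff = statistics.median(kept) + 3 * statistics.pstdev(kept)
        kept = [n for n in kept if n <= cutoff]
    if downsample is not None and downsample < len(kept):
        # Reduce the dataset to `downsample` reads by random sampling.
        kept = random.Random(seed).sample(kept, downsample)
    if loglength:
        # Log-transform lengths for plotting (lengths must be positive).
        kept = [math.log10(n) for n in kept]
    return kept


lengths = [1000] * 20 + [1_000_000]
print(len(filter_lengths(lengths, drop_outliers=True)))
print(len(filter_lengths(lengths, downsample=5, seed=1)))
```

Passing a `seed` makes the random downsampling reproducible, which helps when comparing plots between runs.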


Optional output arguments:
    -o, --outdir OUTDIR     Specify directory in which output has to be created.
    -p, --prefix PREFIX     Specify a prefix to be used for the output files.
    -c, --color COLOR       Specify a color for the plots
                            must be a valid matplotlib color (see color_options.txt)
                            default: green
    -f, --format FORMAT     Specify the output format for the plots,
                            options are: eps, jpeg, jpg, pdf, pgf, png, ps, raw, rgba, svg, svgz, tif, tiff
                            default: png
    --plots PLOTS           Specify which type of bivariate plots have to be made
                            options are: hex, kde, dot, pauvre (multiple can be specified together)
                            default: hex, kde, dot


General arguments:
    -h, --help              show this help message and exit
    -v, --version           Print version and exit.
    -t, --threads THREADS   Max number of threads to be used by the script
    --listcolors            Give a list of all colors which can be used for plotting

EXAMPLES

NanoPlot --summary sequencing_summary.txt --loglength -o summary-plots-log-transformed
NanoPlot -t 2 --fastq reads1.fastq.gz reads2.fastq.gz --maxlength 40000 --plots hex dot
NanoPlot -t 12 --color yellow --bam alignment1.bam alignment2.bam alignment3.bam --downsample 10000 -o bamplots_downsampled
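
When NanoPlot is called from a pipeline, the same command lines can be assembled programmatically. The helper below is a hypothetical sketch, not part of NanoPlot: it only uses flags listed in the usage section above and builds an argument list suitable for `subprocess.run`.

```python
import shutil
import subprocess

# Input flags taken from the usage section above.
INPUT_FLAGS = {"--fastq", "--fastq_rich", "--fastq_minimal", "--summary", "--bam"}


def build_nanoplot_command(input_flag, files, threads=None, maxlength=None,
                           loglength=False, plots=None, outdir=None):
    """Build a NanoPlot argument list (hypothetical helper, not NanoPlot API)."""
    if input_flag not in INPUT_FLAGS:
        raise ValueError(f"unknown input flag: {input_flag}")
    command = ["NanoPlot"]
    if threads is not None:
        command += ["-t", str(threads)]
    command += [input_flag, *files]
    if maxlength is not None:
        command += ["--maxlength", str(maxlength)]
    if loglength:
        command.append("--loglength")
    if plots:
        command += ["--plots", *plots]
    if outdir is not None:
        command += ["-o", outdir]
    return command


def run_nanoplot(command):
    """Run the command, failing early if NanoPlot is not on the PATH."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError("NanoPlot executable not found on PATH")
    return subprocess.run(command, check=True)


command = build_nanoplot_command(
    "--fastq", ["reads1.fastq.gz", "reads2.fastq.gz"],
    threads=2, maxlength=40000, plots=["hex", "dot"],
)
print(" ".join(command))
# prints: NanoPlot -t 2 --fastq reads1.fastq.gz reads2.fastq.gz --maxlength 40000 --plots hex dot
```

Building a list rather than a single string avoids shell quoting problems with file names that contain spaces.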

This script now also provides read length vs mean quality plots in the ‘pauvre’-style from [@conchoecia](https://github.com/conchoecia).

ACKNOWLEDGMENTS

I welcome all suggestions, bug reports, feature requests and contributions. Please leave an issue or open a pull request. I will usually respond within a day, or rarely within a few days.

COMPANION SCRIPTS

  • NanoComp: comparing multiple runs

  • NanoStat: statistical summary report of reads or alignments

  • NanoFilt: filtering and trimming of reads

  • NanoLyse: removing contaminant reads (e.g. lambda control DNA) from fastq


Source Distribution

NanoPlot-0.24.0.tar.gz (9.6 kB view details)

File hashes

Hashes for NanoPlot-0.24.0.tar.gz
Algorithm Hash digest
SHA256 2ca95f8b1a225051c2cd86eb33c63105ee28f9234f8715c5eceeb9cb53ca2381
MD5 e51c2cd54b4488320e7effa8a5559bef
BLAKE2b-256 3c725d77b8dea0d31ad0cea1e0afa0a015736f7a8b56f59e60c0beed39c4696c
