Plotting suite for Oxford Nanopore sequencing data and alignments
Plotting tool for long read sequencing data and alignments.
NanoPlot is also available as a web service.
The example plot above shows a bivariate plot comparing log-transformed read length with average basecall Phred quality score. More examples can be found in the gallery on my blog ‘Gigabase Or Gigabyte’.
In addition to the plots, NanoPlot creates a NanoStats file summarizing key features of the dataset.
INSTALLATION
pip install NanoPlot
The script is written for Python 3.
OUTPUT
NanoPlot creates:
- a statistical summary
- a number of plots
- an HTML summary file
USAGE
NanoPlot [-h] [-v] [-t THREADS] [--verbose] [--store] [--raw] [-o OUTDIR]
         [-p PREFIX] [--maxlength N] [--minlength N] [--drop_outliers]
         [--downsample N] [--loglength] [--percentqual] [--alength]
         [--minqual N] [--readtype {1D,2D,1D2}] [--barcoded] [-c COLOR]
         [-f {eps,jpeg,jpg,pdf,pgf,png,ps,raw,rgba,svg,svgz,tif,tiff}]
         [--plots [{kde,hex,dot,pauvre} [{kde,hex,dot,pauvre} ...]]]
         [--listcolors] [--no-N50] [--N50] [--title TITLE]
         (--fastq file [file ...] | --fastq_rich file [file ...] |
         --fastq_minimal file [file ...] | --summary file [file ...] |
         --bam file [file ...] | --cram file [file ...] | --pickle pickle)

General options:
  -h, --help            show the help and exit
  -v, --version         Print version and exit.
  -t, --threads THREADS
                        Set the allowed number of threads to be used by the script
  --verbose             Write log messages also to terminal.
  --store               Store the extracted data in a pickle file for future plotting.
  --raw                 Store the extracted data in tab separated file.
  -o, --outdir OUTDIR   Specify directory in which output has to be created.
  -p, --prefix PREFIX   Specify an optional prefix to be used for the output files.

Options for filtering or transforming input prior to plotting:
  --maxlength N         Drop reads longer than length specified.
  --minlength N         Drop reads shorter than length specified.
  --drop_outliers       Drop outlier reads with extreme long length.
  --downsample N        Reduce dataset to N reads by random sampling.
  --loglength           Logarithmic scaling of lengths in plots.
  --percentqual         Use qualities as theoretical percent identities.
  --alength             Use aligned read lengths rather than sequenced length (bam mode)
  --minqual N           Drop reads with an average quality lower than specified.
  --readtype            Which read type to extract information about from a summary file.
                        One of 1D (default), 2D, 1D2
  --barcoded            Use if you want to split the summary file by barcode

Options for customizing the plots created:
  -c, --color COLOR     Specify a color for the plots, must be a valid matplotlib color
  -f, --format          Specify the output format of the plots.
                        One of png [default], eps,jpeg,jpg,pdf,pgf,ps,raw,rgba,svg,svgz,tif,tiff
  --plots               Specify which bivariate plots have to be made.
                        One or more of 'dot' (default), 'kde' (default), 'hex' and 'pauvre'
  --listcolors          List the colors which are available for plotting and exit.
  --no-N50              Hide the N50 mark in the read length histogram
  --N50                 Show the N50 mark in the read length histogram
  --title TITLE         Add a title to all plots, requires quoting if using spaces

Input data sources, one of these is required.:
  --fastq file [file ...]
                        Data is in one or more default fastq file(s).
  --fasta file [file ...]
                        Data is in one or more default fasta file(s).
  --fastq_rich file [file ...]
                        Data is in one or more fastq file(s) generated by albacore or MinKNOW
                        with additional information concerning channel and time.
  --fastq_minimal file [file ...]
                        Data is in one or more fastq file(s) generated by albacore or MinKNOW
                        with additional information concerning channel and time.
                        Minimal data is extracted swiftly without elaborate checks.
  --summary file [file ...]
                        Data is in one or more summary file(s) generated by albacore.
  --bam file [file ...]
                        Data is in one or more sorted bam file(s).
  --cram file [file ...]
                        Data is in one or more sorted cram file(s).
  --pickle pickle       Data is a pickle file stored earlier.
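The --percentqual option plots qualities as theoretical percent identities. The sketch below shows one way such a conversion can be done, assuming the standard Phred definition Q = -10 * log10(p), where p is the per-base error probability. The function name is hypothetical, and this is not necessarily how NanoPlot computes the value internally.

```python
def phred_to_percent_identity(q):
    """Convert a Phred quality score to a theoretical percent identity.

    Assumes the standard Phred relation Q = -10 * log10(p_error).
    Hypothetical helper for illustration, not NanoPlot's own code.
    """
    error_probability = 10 ** (-q / 10)
    return 100 * (1 - error_probability)


# Print the theoretical identity for a few typical quality values.
for q in (7, 10, 20):
    print(f"Q{q}: {phred_to_percent_identity(q):.2f}%")
```

Under this assumption a quality of 10 corresponds to roughly 90% identity and a quality of 20 to roughly 99%.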
EXAMPLE USAGE
NanoPlot --summary sequencing_summary.txt --loglength -o summary-plots-log-transformed
NanoPlot -t 2 --fastq reads1.fastq.gz reads2.fastq.gz --maxlength 40000 --plots hex dot
NanoPlot -t 12 --color yellow --bam alignment1.bam alignment2.bam alignment3.bam --downsample 10000 -o bamplots_downsampled
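For scripted use, the same kind of command can be assembled and launched from Python. The following is a minimal sketch that only uses flags listed in the usage text; the helper name `build_nanoplot_command`, the output directory `fastq-plots`, and the default values are assumptions made for illustration.

```python
import os
import shutil
import subprocess


def build_nanoplot_command(fastq_files, outdir, threads=2, maxlength=None,
                           plots=("hex", "dot")):
    """Assemble a NanoPlot argument list (hypothetical helper).

    Only flags documented in the usage text are used:
    -t, -o, --maxlength, --plots and --fastq.
    """
    command = ["NanoPlot", "-t", str(threads), "-o", outdir]
    if maxlength is not None:
        command += ["--maxlength", str(maxlength)]
    command += ["--plots", *plots]
    command += ["--fastq", *fastq_files]
    return command


fastq_files = ["reads1.fastq.gz", "reads2.fastq.gz"]
command = build_nanoplot_command(fastq_files, "fastq-plots", maxlength=40000)
print(" ".join(command))

# Run only when NanoPlot is on PATH and every input file exists.
if shutil.which("NanoPlot") and all(os.path.exists(f) for f in fastq_files):
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems, which matters for options such as --title that may contain spaces.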
This script now also provides read length vs mean quality plots in the ‘pauvre’-style from [@conchoecia](https://github.com/conchoecia).
ACKNOWLEDGMENTS
I welcome all suggestions, bug reports, feature requests and contributions. Please leave an issue or open a pull request. I will usually respond within a day, or rarely within a few days.
PLOTS GENERATED
Plot | Fastq | Fastq_rich | Fastq_minimal | Bam | Summary | Options | Style |
---|---|---|---|---|---|---|---|
Histogram of read length | x | x | x | x | x | N50 | |
Histogram of (log transformed) read length | x | x | x | x | x | N50 | |
Bivariate plot of length against base call quality | x | x | | x | x | log transformation | dot, hex, kde, pauvre |
Heatmap of reads per channel | | x | | | x | | |
Cumulative yield plot | | x | x | | x | | |
Violin plot of read length over time | | x | x | | x | | |
Violin plot of base call quality over time | | x | | | x | | |
Bivariate plot of aligned read length against sequenced read length | | | | x | | | dot, hex, kde |
Bivariate plot of percent reference identity against read length | | | | x | | log transformation | dot, hex, kde |
Bivariate plot of percent reference identity against base call quality | | | | x | | | dot, hex, kde |
Bivariate plot of mapping quality against read length | | | | x | | log transformation | dot, hex, kde |
Bivariate plot of mapping quality against basecall quality | | | | x | | | dot, hex, kde |
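The N50 option marks the N50 value in the read length histograms. As a rough illustration of what that value represents, the sketch below computes it with the common definition: the read length L such that reads of length L or longer contain at least half of all sequenced bases. The function name and the sample lengths are hypothetical, and NanoPlot's own calculation may handle edge cases differently.

```python
def n50(read_lengths):
    """Return the N50 of a collection of read lengths.

    Uses the common definition: walk through reads from longest to
    shortest and return the length at which the running total of bases
    first reaches half of the total. Hypothetical helper for illustration.
    """
    if not read_lengths:
        raise ValueError("read_lengths must not be empty")
    half_total = sum(read_lengths) / 2
    running_total = 0
    for length in sorted(read_lengths, reverse=True):
        running_total += length
        if running_total >= half_total:
            return length


# Made-up read lengths: 32 bases in total, so half is 16.
lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(lengths))  # the two reads of length 8 already cover 16 bases
```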
COMPANION SCRIPTS
CITATION
If you use this tool, please consider citing our publication.
Source Distribution
File details
Details for the file NanoPlot-1.12.0.tar.gz.
File metadata
- Download URL: NanoPlot-1.12.0.tar.gz
- Upload date:
- Size: 13.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 8b8b8be89cea05231736f314e9de014d52819dafa9349b07236c1db229e1c168 |
MD5 | 84c680589c6379aaf5209b59bf518fe5 |
BLAKE2b-256 | e690cf482375730cfe28e6bc96553d0d29f4d72a02f48fb4346a51aadeda4015 |