Filtering and trimming of Oxford Nanopore Sequencing data
Filtering and trimming of Oxford Nanopore sequencing data.
Filtering on quality and/or read length, and optional trimming after
passing filters.
Reads from stdin, writes to stdout.
Intended to be used:
- directly after fastq extraction
- prior to mapping
- in a stream between extraction and mapping
Due to a discrepancy between the calculated read quality and the quality
as summarized by albacore, this script has, since v1.1.0, accepted an
optional --summary argument. Passing the sequencing_summary.txt file from
albacore with this argument makes NanoFilt filter on the quality scores
from the summary. It’s also faster.
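To illustrate the idea behind summary-based filtering, here is a minimal Python sketch. It is not NanoFilt's own code. It assumes the summary is tab-separated and has columns named read_id and mean_qscore_template; both column names, the helper names, and the choice to drop reads missing from the summary are assumptions made for this example.

```python
import csv
import io


def load_summary_qualities(handle, id_col="read_id", q_col="mean_qscore_template"):
    """Map read id -> mean quality from a tab-separated summary file.

    The column names are assumptions about the summary layout.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[id_col]: float(row[q_col]) for row in reader}


def passes_summary_filter(read_id, qualities, min_quality):
    """Keep a read only if the summary lists it with a high enough quality.

    Reads absent from the summary are dropped (an assumption of this sketch).
    """
    return qualities.get(read_id, float("-inf")) >= min_quality


# Tiny made-up summary, for illustration only.
example = io.StringIO(
    "read_id\tmean_qscore_template\n"
    "read-a\t11.2\n"
    "read-b\t7.9\n"
)
qualities = load_summary_qualities(example)
kept = [
    rid
    for rid in ("read-a", "read-b", "read-c")
    if passes_summary_filter(rid, qualities, 10)
]
print(kept)
```

A dictionary lookup per read avoids recomputing an average over every base quality, which is one plausible reason the summary route would be faster.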
INSTALLATION AND UPGRADING:
pip install nanofilt
pip install nanofilt --upgrade
or
conda install -c bioconda nanofilt
STATUS
NanoFilt is written for Python 3 and should also work with Python 2.7.
USAGE:
NanoFilt [-h] [-q QUALITY] [-l LENGTH] [--headcrop HEADCROP] [--tailcrop TAILCROP]

optional arguments:
  -h, --help            show this help message and exit
  -s --summary SUMMARYFILE
                        optional, the sequencing_summary file from albacore
                        for extracting quality scores
  -q, --quality QUALITY Filter on a minimum average read quality score
  -l, --length LENGTH   Filter on a minimum read length
  --headcrop HEADCROP   Trim n nucleotides from start of read
  --tailcrop TAILCROP   Trim n nucleotides from end of read
  --minGC MINGC         Sequences must have GC content >= to this. Float
                        between 0.0 and 1.0. Ignored if using summary file.
  --maxGC MAXGC         Sequences must have GC content <= to this. Float
                        between 0.0 and 1.0. Ignored if using summary file.
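The options above can be pictured as a per-read decision: test the read against the filters first, then trim it if it passes. The Python sketch below is an illustration under stated assumptions, not NanoFilt's implementation. The function names are hypothetical, qualities are assumed to be Phred+33 encoded, the average quality is assumed to be derived from the mean error probability, and a read that would be empty after cropping is assumed to be dropped.

```python
from math import log10


def mean_quality(qual_string, offset=33):
    """Average quality via the mean error probability (an assumed method)."""
    probs = [10 ** (-(ord(c) - offset) / 10) for c in qual_string]
    return -10 * log10(sum(probs) / len(probs))


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence, between 0.0 and 1.0."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def filter_and_trim(seq, qual, quality=0, length=1, headcrop=0, tailcrop=0,
                    min_gc=0.0, max_gc=1.0):
    """Return the trimmed (seq, qual) if the read passes, otherwise None."""
    if len(seq) == 0 or len(seq) < length:
        return None  # too short
    if mean_quality(qual) < quality:
        return None  # average quality too low
    if not (min_gc <= gc_fraction(seq) <= max_gc):
        return None  # GC content outside the requested range
    end = len(seq) - tailcrop
    if headcrop >= end:
        return None  # nothing left after cropping (assumed behaviour)
    # Trimming happens only after the read has passed every filter.
    return seq[headcrop:end], qual[headcrop:end]


# Made-up read: ten bases, every base at quality 40 ("I" in Phred+33).
result = filter_and_trim("ACGTACGTAC", "IIIIIIIIII",
                         quality=10, length=5, headcrop=2, tailcrop=3)
print(result)
```

Filtering before trimming means the length threshold applies to the original read, so a read that passes -l can still come out shorter than that threshold once the crops are applied.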
Examples:
gunzip -c reads.fastq.gz | NanoFilt -q 10 -l 500 --headcrop 50 | minimap2 genome.fa - | samtools sort -O BAM -@24 -o alignment.bam -
gunzip -c reads.fastq.gz | NanoFilt -q 12 --headcrop 75 | gzip > trimmed-reads.fastq.gz
gunzip -c reads.fastq.gz | NanoFilt -q 10 | gzip > highQuality-reads.fastq.gz