utilities for performing various preprocessing steps on sequencing reads
# seq_qc - Python package with modules for preprocessing sequencing reads
## About
seq_qc is a Python package for performing quality control tasks on sequencing reads. It currently provides three programs: a dereplicator for paired-end or single-end reads, a tool for trimming reads based on length and quality score thresholds, and a tool for interleaving paired-end reads.
## Requirements
Python 2.7+ or 3.4+
Python Libraries:
screed
## Installation
pip install seq_qc
## Usage
filter_replicates takes a fastq or fasta file as input. If the reads are paired-end, the pairs can be either in separate files or in a single interleaved file. It can search for three types of replicates: exact, 5’-prefix, and reverse-complement.
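The three replicate types can be illustrated with a minimal Python 3 sketch for single-end reads. This is not the filter_replicates implementation: the function names `reverse_complement` and `find_replicates` are hypothetical, the definitions used (a prefix replicate is a read identical to the 5’ start of a longer kept read; a reverse-complement replicate is a read whose reverse complement equals a kept read) are assumptions, and the pairwise comparison is chosen for clarity rather than speed.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_replicates(reads, prefix=False, rev_comp=False):
    """Split reads into unique reads and replicates (hypothetical helper).

    reads: iterable of (name, sequence) tuples.
    Returns two lists of read names: (unique, replicates).
    """
    # Visit longer reads first so a shorter read can match the start of a kept one.
    ordered = sorted(reads, key=lambda read: len(read[1]), reverse=True)
    kept = []  # sequences of the reads retained so far
    unique, replicates = [], []
    for name, seq in ordered:
        seq = seq.upper()
        rc = reverse_complement(seq) if rev_comp else None
        is_replicate = False
        for other in kept:
            if seq == other:  # exact replicate
                is_replicate = True
            elif prefix and other.startswith(seq):  # 5'-prefix replicate
                is_replicate = True
            elif rev_comp and rc == other:  # reverse-complement replicate
                is_replicate = True
            if is_replicate:
                break
        if is_replicate:
            replicates.append(name)
        else:
            kept.append(seq)
            unique.append(name)
    return unique, replicates
```

With only the defaults, the sketch flags exact replicates; passing `prefix=True` or `rev_comp=True` enables the other two checks, loosely mirroring the command-line options.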
### Examples
filter_replicates -1 input_forward.fastq -2 input_reverse.fastq --prefix --rev-comp
filter_replicates -1 input_forward.fastq.gz -2 input_reverse.fastq.gz -o output_forward.fastq.gz -v output_reverse.fastq.gz --prefix --rev-comp
filter_replicates -1 input_interleaved.fastq -o output_interleaved.fasta --interleaved --format fasta --log output.log
filter_replicates -1 input_singles.fastq -o output_singles.fastq
qtrim takes only fastq files as input. It can perform a variety of trimming steps, including trimming low-quality bases from the start and end of a read, truncating the read at the position of the first ambiguous base, and trimming a read using a sliding-window approach. It also supports cutting the read to a desired length by removing bases from the start or end of the read.
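The trimming steps described above can be sketched in Python 3 as follows. This is an illustration under stated assumptions, not qtrim's actual code: the helper names `phred33_scores` and `trim_read` are hypothetical, quality strings are assumed to be Phred+33 encoded, and the sliding window is assumed to cut the read at the start of the first window whose mean quality falls below the threshold.

```python
def phred33_scores(quality_string):
    """Decode a Phred+33 quality string into a list of integer scores."""
    return [ord(char) - 33 for char in quality_string]


def trim_read(seq, qual, leading=0, trailing=0, window=None, trunc_n=False):
    """Trim one read and return the trimmed (sequence, quality) pair.

    leading, trailing: minimum quality required at the start and end of the read.
    window: optional (size, mean_quality) tuple for sliding-window trimming.
    trunc_n: if True, cut the read at the first ambiguous base (N).
    """
    scores = phred33_scores(qual)
    start, end = 0, len(seq)

    # Remove low-quality bases from the start, then from the end.
    while start < end and scores[start] < leading:
        start += 1
    while end > start and scores[end - 1] < trailing:
        end -= 1

    # Truncate at the first ambiguous base inside the retained region.
    if trunc_n:
        n_pos = seq.find("N", start, end)
        if n_pos != -1:
            end = n_pos

    # Cut at the start of the first window whose mean quality is too low.
    if window is not None:
        size, threshold = window
        for i in range(start, end - size + 1):
            if sum(scores[i:i + size]) / size < threshold:
                end = i
                break

    return seq[start:end], qual[start:end]
```

A minimum-length filter would then amount to discarding any read whose trimmed sequence is shorter than the chosen cutoff.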
### Examples
qtrim -1 input_forward.fastq.gz -2 input_reverse.fastq.gz -o output_forward.fastq.gz -v output_reverse.fastq.gz -s output_singles.fastq.gz --qual-type phred33 --sliding-window 10:20
qtrim -1 input_interleaved.fastq -o output_interleaved.fastq --interleaved --leading 20 --trailing 20 --trunc-n --min-len 60
interleave_pairs takes paired reads in separate fastq or fasta files and interleaves them in a single file. It can also act as a simple format conversion tool by allowing the input to be in fastq format and the output in fasta format.
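Interleaving itself is a simple alternation of the two inputs, as the following Python 3 sketch shows. The names `interleave` and `to_fasta` and the `(name, sequence, quality)` record layout are assumptions made for illustration; they are not the interleave_pairs API.

```python
from itertools import zip_longest


def interleave(forward, reverse):
    """Yield forward and reverse records alternately, pair by pair."""
    for fwd, rev in zip_longest(forward, reverse):
        # zip_longest pads the shorter input with None, exposing a mismatch.
        if fwd is None or rev is None:
            raise ValueError("forward and reverse inputs have different numbers of reads")
        yield fwd
        yield rev


def to_fasta(record):
    """Format a (name, sequence, quality) record as fasta, dropping the quality."""
    name, seq = record[0], record[1]
    return ">{}\n{}\n".format(name, seq)
```

Converting fastq input to fasta output then only requires formatting each interleaved record with `to_fasta` before writing it.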
### Examples
interleave_pairs -1 input_forward.fastq.gz -2 input_reverse.fastq.gz --format fasta -o output_interleaved.fasta
interleave_pairs -1 input_forward.fasta -2 input_reverse.fasta --format fasta -o output_interleaved.fasta