Specific, sensitive, and speedy trimming of NGS reads.
Atropos
Atropos is a tool for specific, sensitive, and speedy trimming of NGS reads. It is a fork of the venerable Cutadapt read trimmer (https://github.com/marcelm/cutadapt, DOI:10.14806/ej.17.1.200), with the primary improvements being:
- Multi-threading support, including an extremely fast "parallel write" mode.
- Implementation of a new insert alignment-based trimming algorithm for paired-end reads that is substantially more sensitive and specific than the original Cutadapt adapter alignment-based algorithm. This algorithm can also correct mismatches between the overlapping portions of the reads.
- Options for trimming specific types of data (miRNA, bisulfite-seq).
- A new command ('detect') that will detect adapter sequences and other potential contaminants.
- A new command ('error') that will estimate the sequencing error rate, which helps to select the appropriate adapter- and quality- trimming parameter values.
- A new command ('qc') that generates read statistics similar to FastQC. The trim command can also compute read metrics both before and after trimming (using the '--metrics' option).
- Improved summary reports, including support for serialization formats (JSON, YAML, pickle), support for user-defined templates (via the optional Jinja2 dependency), and integration with MultiQC.
- The ability to merge overlapping reads (this is experimental and the functionality is limited).
- The ability to write the summary report and log messages to separate files.
- The ability to read/write SAM, BAM, and interleaved FASTQ files.
- Direct trimming of reads from an SRA accession.
- A progress bar, and other minor usability enhancements.
Manual installation
Atropos is available from PyPI and can be installed using pip.
First install dependencies:
- Required
- Optional
- pytest (for running unit tests)
- bamnostic or pysam (SAM/BAM support)
- khmer 2.0+ (for detecting low-frequency adapter contamination)
- jinja2 (for user-defined report formats)
- ngstream (for SRA streaming), which requires ngs
Pip can be used to install atropos and optional dependencies, e.g.:
pip install atropos[tqdm,bamnostic,ngstream]
Conda
There is an Atropos recipe in Bioconda.
conda install -c bioconda atropos
Docker
A Docker image is available for Atropos in Docker Hub.
docker run jdidion/atropos <arguments>
Usage
Atropos is almost fully backward-compatible with cutadapt. If you currently use cutadapt, you can simply install Atropos and then substitute the executable name in your command line, with one key difference: you need to use options to specify input file names. For example:
atropos -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGAGTTA -o trimmed.fq.gz -se reads.fq.gz
To take advantage of multi-threading, set the --threads option:
atropos --threads 8 -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGAGTTA -o trimmed.fq.gz -se reads.fq.gz
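For scripted pipelines, the single-end commands above can also be assembled programmatically. The following is a minimal sketch, not part of Atropos: `build_trim_command` is a hypothetical helper, and it uses only the options shown in the examples above (`--threads`, `-a`, `-o`, `-se`).

```python
import shlex


def build_trim_command(adapter, output, single_end, threads=None):
    """Build an argument list for a single-end Atropos run.

    Hypothetical helper for illustration; it only mirrors the options
    used in the examples above.
    """
    args = ["atropos"]
    if threads is not None:
        # Multi-threading is opt-in, as in the command-line example.
        args += ["--threads", str(threads)]
    # Unlike cutadapt, the input file is passed with an option (-se).
    args += ["-a", adapter, "-o", output, "-se", single_end]
    return args


cmd = build_trim_command(
    "AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGAGTTA",
    "trimmed.fq.gz",
    "reads.fq.gz",
    threads=8,
)
# Print a shell-quoted form of the command (shlex.join needs Python 3.8+).
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once Atropos is installed.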
To take advantage of the new aligner (if you have paired-end reads with 3' adapters), set the --aligner option to 'insert':
atropos --aligner insert -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTG \
-A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT -o trimmed.1.fq.gz -p trimmed.2.fq.gz \
-pe1 reads.1.fq.gz -pe2 reads.2.fq.gz
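The idea behind insert-based trimming can be illustrated with a toy sketch. This is not Atropos's implementation: the function names and the `min_overlap` and `max_error_rate` parameters are assumptions chosen for illustration. The sketch relies on one observation: when the sequenced fragment (the insert) is shorter than the read length, both reads run past the insert into the adapter, so the first k bases of read 1 should match the reverse complement of the first k bases of read 2, where k is the insert length.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def insert_trim(read1, read2, min_overlap=10, max_error_rate=0.1):
    """Trim a read pair by locating the insert (toy illustration only).

    Tries candidate insert lengths from longest to shortest and accepts
    the first length at which the read 1 prefix matches the reverse
    complement of the read 2 prefix within the allowed mismatch count.
    Returns the pair unchanged if no candidate is accepted.
    """
    longest = min(len(read1), len(read2))
    for insert_len in range(longest, min_overlap - 1, -1):
        prefix1 = read1[:insert_len]
        prefix2_rc = reverse_complement(read2[:insert_len])
        allowed = int(max_error_rate * insert_len)
        if mismatches(prefix1, prefix2_rc) <= allowed:
            # Everything past the insert is treated as adapter.
            return read1[:insert_len], read2[:insert_len]
    return read1, read2


# A 20-base insert sequenced with 30-base reads: each read ends with
# the first 10 bases of an adapter.
insert = "ACGTTGCAAGGCTTACGATC"
r1 = insert + "AGATCGGAAG"
r2 = reverse_complement(insert) + "AGATCGGAAG"
trimmed1, trimmed2 = insert_trim(r1, r2)
print(trimmed1, trimmed2)
```

A real implementation would do more than this sketch, for example confirming that the trimmed-off bases resemble the expected adapters and handling indels and base qualities.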
See the Documentation for more complete usage information.
Using Atropos as a library
While we consider the command-line interface to be stable, the internal code organization of Atropos is likely to change. At this time, we recommend against interfacing directly with Atropos as a library (or, if you do, being prepared for your code to break).
Publications
Atropos is published in PeerJ.
Please cite as:
Didion JP, Martin M, Collins FS. (2017) Atropos: specific, sensitive, and speedy trimming of sequencing reads. PeerJ 5:e3720 https://doi.org/10.7717/peerj.3720
The results in the paper can be fully reproduced using the workflow defined in the paper directory.
The citation for the original Cutadapt paper is:
Marcel Martin. "Cutadapt removes adapter sequences from high-throughput sequencing reads." EMBnet.Journal, 17(1):10-12, May 2011. http://dx.doi.org/10.14806/ej.17.1.200