Specific, sensitive, and speedy trimming of NGS reads.
Atropos
Atropos is a tool for specific, sensitive, and speedy trimming of NGS reads. It is a fork of the venerable Cutadapt read trimmer (https://github.com/marcelm/cutadapt, DOI:10.14806/ej.17.1.200), with the primary improvements being:
- Multi-threading support, including an extremely fast "parallel write" mode.
- Implementation of a new insert alignment-based trimming algorithm for paired-end reads that is substantially more sensitive and specific than the original Cutadapt adapter alignment-based algorithm. This algorithm can also correct mismatches between the overlapping portions of the reads.
- Options for trimming specific types of data (miRNA, bisulfite-seq).
- A new command ('detect') that will detect adapter sequences and other potential contaminants.
- A new command ('error') that will estimate the sequencing error rate, which helps to select the appropriate adapter- and quality- trimming parameter values.
- A new command ('qc') that generates read statistics similar to FastQC. The trim command can also compute read metrics both before and after trimming (using the '--metrics' option).
- Improved summary reports, including support for serialization formats (JSON, YAML, pickle), support for user-defined templates (via the optional Jinja2 dependency), and integration with MultiQC.
- The ability to merge overlapping reads (this is experimental and the functionality is limited).
- The ability to write the summary report and log messages to separate files.
- The ability to read/write SAM, BAM, and interleaved FASTQ files.
- Direct trimming of reads from an SRA accession.
- A progress bar, and other minor usability enhancements.
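The idea behind insert alignment-based trimming, mentioned in the list above, can be illustrated with a short Python sketch. This is not the Atropos implementation: the function names (reverse_complement, find_insert_length, trim_pair), the parameters min_overlap and max_error_rate, and the plain mismatch counting are all assumptions made for illustration. The sketch relies only on the geometry of paired-end reads: when the insert is shorter than the read length, the insert is a prefix of read 1 and a suffix of the reverse complement of read 2, so everything after the insert is adapter.

```python
# Illustrative sketch only; names and thresholds are hypothetical,
# not taken from the Atropos code base.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_insert_length(read1, read2, min_overlap=10, max_error_rate=0.1):
    """Find the longest insert length for which the prefix of read 1 matches
    the suffix of reverse-complemented read 2 within the allowed error rate.
    Returns None if no candidate of at least min_overlap bases matches."""
    rc2 = reverse_complement(read2)
    for length in range(min(len(read1), len(rc2)), min_overlap - 1, -1):
        prefix = read1[:length]
        suffix = rc2[len(rc2) - length:]
        mismatches = sum(a != b for a, b in zip(prefix, suffix))
        if mismatches <= max_error_rate * length:
            return length
    return None


def trim_pair(read1, read2, **kwargs):
    """Cut both reads back to the detected insert; leave them unchanged
    if no insert overlap is found."""
    length = find_insert_length(read1, read2, **kwargs)
    if length is None:
        return read1, read2
    return read1[:length], read2[:length]


# A 20-base insert sequenced with 30-base reads: each read runs 10 bases
# into the adapter (made-up sequences for the example).
insert = "GATTACAGGCTCATTGACCA"
adapter = "AGATCGGAAG"
read1 = insert + adapter
read2 = reverse_complement(insert) + adapter

print(find_insert_length(read1, read2))  # 20
print(trim_pair(read1, read2)[0] == insert)  # True
```

A production trimmer would additionally confirm that the bases after the insert match the expected adapter sequences and would use base qualities when resolving mismatches; the sketch omits both.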
Manual installation
Atropos is available from PyPI and can be installed using pip.
First install dependencies:
- Required
- Python 3.6+ (Python 2.x is NOT supported)
- Note: Reading from SAM/BAM files is not currently supported in Python 3.8, due to the fact that pysam is not compatible with Python 3.8. This is a temporary limitation that will be fixed before the final release of Atropos 2.0.0.
- Cython 0.25.2+/0.29+/0.29.14+, depending on whether you're using Python 3.6/3.7/3.8 (pip install Cython)
- loguru
- pokrok 0.2.0+
- xphyle 4.2.1+
- Optional
- pytest (for running unit tests)
- pysam (SAM/BAM support)
- khmer 2.0+ (for detecting low-frequency adapter contamination)
- jinja2 (for user-defined report formats)
- ngstream (for SRA streaming), which requires ngs
Pip can be used to install atropos and optional dependencies, e.g.:
pip install atropos[tqdm,pysam,srastream]
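Before installing, it can help to check the current environment against the lists above. The following Python sketch is one way to do that. The helper names (minimum_cython, missing_optional) are hypothetical, the Cython minimums are copied from the requirement list, and the assumption that each optional package is imported under the same name as it is listed (pysam, khmer, jinja2, ngstream) is not verified here.

```python
import importlib.util
import sys

# Minimum Cython version per Python minor version, as listed in the
# requirements above.
MIN_CYTHON = {(3, 6): "0.25.2", (3, 7): "0.29", (3, 8): "0.29.14"}

# Optional dependencies and the feature each one enables. The import
# names are assumed to equal the package names.
OPTIONAL = {
    "pysam": "SAM/BAM support",
    "khmer": "detecting low-frequency adapter contamination",
    "jinja2": "user-defined report formats",
    "ngstream": "SRA streaming",
}


def minimum_cython(version_info=sys.version_info):
    """Return the listed minimum Cython version for a Python version,
    or None if the version is not covered by the list above."""
    key = (version_info[0], version_info[1])
    if key < (3, 6):
        raise RuntimeError("Atropos requires Python 3.6+")
    return MIN_CYTHON.get(key)


def missing_optional(modules=OPTIONAL):
    """Return the optional modules that cannot be found, with the
    feature that each one would enable."""
    return {
        name: feature
        for name, feature in modules.items()
        if importlib.util.find_spec(name) is None
    }


if __name__ == "__main__":
    print("Minimum Cython:", minimum_cython())
    for name, feature in missing_optional().items():
        print(f"Optional dependency not found: {name} ({feature})")
```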
Conda
There is an Atropos recipe in Bioconda.
conda install -c bioconda atropos
Docker
A Docker image is available for Atropos in Docker Hub.
docker run jdidion/atropos <arguments>
Usage
Atropos is almost fully backward-compatible with cutadapt. If you currently use cutadapt, you can simply install Atropos and then substitute the executable name in your command line, with one key difference: you need to use options to specify input file names. For example:
atropos -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGAGTTA -o trimmed.fq.gz -se reads.fq.gz
To take advantage of multi-threading, set the --threads option:
atropos --threads 8 -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGAGTTA -o trimmed.fq.gz -se reads.fq.gz
To take advantage of the new aligner (if you have paired-end reads with 3' adapters), set the --aligner option to 'insert':
atropos --aligner insert -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTG \
-A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT -o trimmed.1.fq.gz -p trimmed.2.fq.gz \
-pe1 reads.1.fq.gz -pe2 reads.2.fq.gz
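When Atropos is driven from a pipeline script, the command lines above can be assembled programmatically. The sketch below is a hypothetical helper (build_trim_command is not part of Atropos) that uses only the options shown in the examples above: -a, -A, -o, -p, -se, -pe1, -pe2, --threads, and --aligner. It builds the argument list without running anything.

```python
from typing import List, Optional


def build_trim_command(
    adapter1: str,
    output1: str,
    input1: str,
    adapter2: Optional[str] = None,
    output2: Optional[str] = None,
    input2: Optional[str] = None,
    threads: Optional[int] = None,
    aligner: Optional[str] = None,
) -> List[str]:
    """Assemble an atropos argument list for single-end or paired-end input.

    Paired-end mode is selected by passing input2, in which case adapter2
    and output2 are required as well.
    """
    cmd = ["atropos"]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    if aligner is not None:
        cmd += ["--aligner", aligner]
    cmd += ["-a", adapter1]
    if input2 is None:
        # Single-end: input is given with -se instead of positionally.
        cmd += ["-o", output1, "-se", input1]
    else:
        if adapter2 is None or output2 is None:
            raise ValueError("paired-end input needs adapter2 and output2")
        cmd += ["-A", adapter2, "-o", output1, "-p", output2,
                "-pe1", input1, "-pe2", input2]
    return cmd


# Mirrors the multi-threaded single-end example above.
print(build_trim_command(
    "AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGAGTTA",
    "trimmed.fq.gz",
    "reads.fq.gz",
    threads=8,
))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once Atropos is installed; passing a list rather than a shell string avoids quoting problems with file names.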
See the Documentation for more complete usage information.
Using Atropos as a library
While we consider the command-line interface to be stable, the internal code organization of Atropos is likely to change. At this time, we recommend against interfacing directly with Atropos as a library (or, if you do, being prepared for your code to break).
Publications
Atropos is published in PeerJ.
Please cite as:
Didion JP, Martin M, Collins FS. (2017) Atropos: specific, sensitive, and speedy trimming of sequencing reads. PeerJ 5:e3720 https://doi.org/10.7717/peerj.3720
The results in the paper can be fully reproduced using the workflow defined in the paper directory.
The citation for the original Cutadapt paper is:
Marcel Martin. "Cutadapt removes adapter sequences from high-throughput sequencing reads." EMBnet.Journal, 17(1):10-12, May 2011. http://dx.doi.org/10.14806/ej.17.1.200
Hashes for atropos-2.0.0a1-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 9cd379d2925850dd7e02905375115d2363237a610b76f32ab264e6418fa6f85f
MD5 | c0a63070ac33ec275bea14d5ed2612af
BLAKE2b-256 | 17d6f61706d3e314662e0cb2d69191df07ac62534306e15098ef53362ba8dc78