A QIIME2 plugin to trim ITS regions using ITSxpress

Project description

This is the end-of-life version 1 of q2_itsxpress and of the command-line version of ITSxpress; see the 1.8.1-EOL branch of ITSxpress. Version 2 of ITSxpress includes the Qiime2 plugin together with the command-line version of ITSxpress.


See ITSxpress 1.8.1-EOL branch here: ITSxpress-1.8.1-EOL

Authors

  • Adam R. Rivers, US Department of Agriculture, Agricultural Research Service

  • Kyle C. Weber, US Department of Agriculture, Agricultural Research Service

  • Sveinn V. Einarsson, US Department of Agriculture, Agricultural Research Service

Citation

Rivers AR, Weber KC, Gardner TG et al. ITSxpress: Software to rapidly trim internally transcribed spacer sequences with quality scores for marker gene analysis. F1000Research 2018, 7:1418. doi: 10.12688/f1000research.15704.1

Introduction

The internally transcribed spacer (ITS) is a region between the small subunit (SSU) and large subunit (LSU) rRNA genes. It is a commonly used phylogenetic marker for Fungi and other Eukaryotes. The ITS contains the 5.8S gene and two variable-length spacer regions. In amplicon sequencing studies it is common practice to trim off the conserved (SSU, 5.8S or LSU) regions. Bengtsson-Palme et al. (2013) published a software package, ITSx, to do this.

Q2-ITSxpress extends this work by rapidly trimming FASTQ sequences within Qiime2. Q2-ITSxpress is the Qiime2 plugin version of the standalone command-line utility ITSxpress. Q2-ITSxpress is designed to support the calling of exact sequence variants rather than OTUs. This newer method of sequence error correction requires quality score data for each sequence, so each input sequence must be trimmed. ITSxpress makes this possible by taking FASTQ data, de-replicating the sequences, then identifying the start and stop sites using HMMSearch. Results are parsed and the trimmed files are returned. The ITS1, ITS2 or the entire ITS region including the 5.8S rRNA gene can be selected. ITSxpress uses the HMM models from ITSx, so results are nearly identical.
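
To illustrate why de-replication helps, the sketch below counts total reads versus unique sequences in a FASTQ file using standard Unix tools. It is only an illustration of the idea, not the method ITSxpress uses internally. The file demo.fastq and its reads are invented, and the sketch assumes four-line FASTQ records.

```shell
# Write a tiny, made-up FASTQ file (4 lines per record) for illustration.
tmpdir=$(mktemp -d)
fastq="$tmpdir/demo.fastq"
cat > "$fastq" <<'EOF'
@read1
ACGTACGT
+
IIIIIIII
@read2
ACGTACGT
+
HHHHHHHH
@read3
TTGGCCAA
+
IIIIIIII
EOF

# The sequence is the 2nd line of every 4-line record.
total=$(awk 'NR % 4 == 2' "$fastq" | wc -l | tr -d ' ')
# Identical sequences collapse to a single representative with sort -u.
unique=$(awk 'NR % 4 == 2' "$fastq" | sort -u | wc -l | tr -d ' ')

echo "total reads: $total, unique sequences: $unique"
rm -rf "$tmpdir"
```

In this toy file two of the three reads share a sequence but carry different quality strings, which is why each read still has to be trimmed individually even though only unique sequences need to be searched.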

Requirements/Dependencies

  • Qiime2 is required to run Q2-itsxpress (for stand alone software see ITSxpress)

  • To install Qiime2 follow these instructions: https://docs.qiime2.org/2022.8/install/

  • This end-of-life version 1 of q2-itsxpress and ITSxpress is ONLY compatible with Qiime2 version 2022.8, so make sure to follow the link above.

  • We use mamba because it resolves package dependencies faster and more reliably, but conda can be substituted.
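
The Qiime2 2022.8 environment file differs by platform. The following sketch picks a file name based on the operating system. The macOS name matches the one used in the installation steps; the Linux name is assumed to follow the same pattern, so confirm it against the Qiime2 installation page linked above before downloading.

```shell
# Choose the Qiime2 2022.8 environment file for the current platform.
case "$(uname -s)" in
    Darwin) platform="osx" ;;
    Linux)  platform="linux" ;;   # assumed to follow the same naming pattern
    *)      platform="unsupported" ;;
esac

yml="qiime2-2022.8-py38-${platform}-conda.yml"
url="https://data.qiime2.org/distro/core/${yml}"

if [ "$platform" = "unsupported" ]; then
    echo "No Qiime2 2022.8 environment file is assumed for this platform."
else
    echo "Environment file: $url"
    # To download the file and build the environment, run:
    #   wget "$url"
    #   mamba env create -n qiime2-2022.8 --file "$yml"
fi
```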

Q2-itsxpress plugin installation

  1. Create a new Qiime2 2022.8 environment. This example uses the macOS environment file.

wget https://data.qiime2.org/distro/core/qiime2-2022.8-py38-osx-conda.yml
mamba env create -n qiime2-2022.8 --file qiime2-2022.8-py38-osx-conda.yml

  2. Activate the Qiime2 conda environment.

mamba activate qiime2-2022.8

  3. Install ITSxpress from Bioconda and q2-itsxpress with pip. Be sure to install both in the Qiime2 environment using the following commands.

mamba install -c bioconda itsxpress==1.8.1
pip install q2-itsxpress

  4. In your Qiime2 environment, refresh the plugins.

qiime dev refresh-cache

  5. Check that the ITSxpress plugin is installed. You should see output similar to the screenshot below.

qiime itsxpress

(Screenshot: example output of qiime itsxpress)
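
The final check can also be scripted. This is a minimal sketch, assuming the environment is named qiime2-2022.8 as in the steps above; check_itsxpress is a hypothetical helper name, not part of Qiime2 or ITSxpress.

```shell
# Report whether the ITSxpress plugin is visible to the Qiime2 command-line interface.
# check_itsxpress is a hypothetical helper defined here for illustration.
check_itsxpress() {
    if ! command -v qiime >/dev/null 2>&1; then
        echo "qiime not found: activate the qiime2-2022.8 environment first"
        return 1
    fi
    if qiime itsxpress --help >/dev/null 2>&1; then
        echo "itsxpress plugin is available"
    else
        echo "itsxpress plugin not found: reinstall it and run 'qiime dev refresh-cache'"
        return 1
    fi
}

check_itsxpress || echo "installation check failed"
```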

Usage

Within Qiime2 you can trim paired-end or single-end reads using these commands:

qiime itsxpress trim-pair

qiime itsxpress trim-pair-output-unmerged

qiime itsxpress trim-single

  1. qiime itsxpress trim-single

This command takes single-end data and returns trimmed reads. The sequences may have been merged previously or may have been generated by a long-read technology like PacBio. Merged and long reads trimmed by this function can be used by Deblur, but only long reads (not merged reads) trimmed by this function should be passed to Dada2, because Dada2's statistical model for estimating error rates was not designed for pre-merged reads.

| Command requirement | Description |
| --- | --- |
| --i-per-sample-sequences | The artifact that contains the sequence file(s). Either joined paired-end reads or a single fastq. One sequence file in the qza data folder. |
| --p-region | The region to extract: ITS2, ITS1, or ALL. |
| --p-taxa | The taxonomic group sequenced: A, B, C, D, E, F, G, H, I, L, M, O, P, Q, R, S, T, U, V, ALL. |
| --p-threads | The number of threads to use. |
| --o-trimmed | The resulting trimmed sequences from ITSxpress in qza format. |
| --p-cluster-id | The percent identity for clustering reads; set to 1 for exact dereplication. |
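
As a sketch of how these options fit together, the following trims the full ITS region from single-end fungal reads. The file names long_reads.qza and long_reads_trimmed.qza are hypothetical. The command is only printed when qiime or the input file is unavailable, so the snippet is safe to run anywhere.

```shell
# Hypothetical file names; replace them with your own artifacts.
input_qza="long_reads.qza"
output_qza="long_reads_trimmed.qza"

# Run the command for real only when qiime and the input exist; otherwise print it.
if command -v qiime >/dev/null 2>&1 && [ -f "$input_qza" ]; then
    runner=""
else
    runner="echo"
fi

$runner qiime itsxpress trim-single \
    --i-per-sample-sequences "$input_qza" \
    --p-region ALL \
    --p-taxa F \
    --p-threads 2 \
    --o-trimmed "$output_qza"
```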

  2. qiime itsxpress trim-pair

This command takes paired-end data and returns merged, trimmed reads. The merged reads trimmed by this function can be used by Deblur but not by Dada2, whose statistical model for estimating error rates was not designed for pre-merged reads. For Dada2, use qiime itsxpress trim-pair-output-unmerged instead.

| Command requirement | Description |
| --- | --- |
| --i-per-sample-sequences | The artifact that contains the sequence files. Paired-end reads only. Two sequence files in the qza data folder. |
| --p-region | The region to extract: ITS2, ITS1, or ALL. |
| --p-taxa | The taxonomic group sequenced: A, B, C, D, E, F, G, H, I, L, M, O, P, Q, R, S, T, U, V, ALL. |
| --p-threads | The number of threads to use. |
| --o-trimmed | The resulting trimmed sequences from ITSxpress in qza format. |
| --p-cluster-id | The percent identity for clustering reads; set to 1 for exact dereplication. |
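
The thread count can be taken from the machine rather than hard-coded. This sketch reads the number of online CPUs with getconf and prints the resulting trim-pair command; the file names paired.qza and trimmed.qza are hypothetical, and using every available CPU is an assumption that may not suit shared machines.

```shell
# Number of online CPUs; fall back to 1 if getconf cannot report it.
threads=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Hypothetical file names; the command is printed rather than executed.
echo "qiime itsxpress trim-pair --i-per-sample-sequences paired.qza --p-region ITS2 --p-taxa F --p-threads $threads --o-trimmed trimmed.qza"
```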

  3. qiime itsxpress trim-pair-output-unmerged

This command takes paired-end data and returns unmerged, trimmed reads. The unmerged reads trimmed by this function can be used by Dada2 but not by Deblur. For Deblur, use qiime itsxpress trim-pair.

| Command requirement | Description |
| --- | --- |
| --i-per-sample-sequences | The artifact that contains the sequence files. Only paired-end reads will work. Two sequence files in the qza data folder. |
| --p-region | The region to extract: ITS2, ITS1, or ALL. |
| --p-taxa | The taxonomic group sequenced: A, B, C, D, E, F, G, H, I, L, M, O, P, Q, R, S, T, U, V, ALL. |
| --p-threads | The number of threads to use. |
| --o-trimmed | The resulting trimmed sequences from ITSxpress in qza format. |
| --p-cluster-id | The percent identity for clustering reads; set to 1 for exact dereplication. |
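
A possible way to chain this command with Dada2 is sketched below. The file and directory names are hypothetical, and run is a small dry-run helper defined here, not a Qiime2 feature: it prints each command unless DRY_RUN=0 is set. Truncation lengths of 0 are used on the assumption that fixed-length truncation is undesirable for variable-length ITS reads.

```shell
# Dry-run helper (hypothetical): prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Hypothetical file names.
paired_qza="paired.qza"
trimmed_qza="trimmed_unmerged.qza"

# Trim ITS2 from both reads and keep them unmerged.
run qiime itsxpress trim-pair-output-unmerged \
    --i-per-sample-sequences "$paired_qza" \
    --p-region ITS2 --p-taxa F --p-threads 2 \
    --o-trimmed "$trimmed_qza"

# Denoise with Dada2; a truncation length of 0 disables truncation.
run qiime dada2 denoise-paired \
    --i-demultiplexed-seqs "$trimmed_qza" \
    --p-trunc-len-f 0 --p-trunc-len-r 0 \
    --output-dir dada2out
```

Set DRY_RUN=0 in the environment to execute the two commands instead of printing them.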

Taxa Key

| Code | Taxonomic group |
| --- | --- |
| A | Alveolata |
| B | Bryophyta |
| C | Bacillariophyta |
| D | Amoebozoa |
| E | Euglenozoa |
| F | Fungi |
| G | Chlorophyta (green algae) |
| H | Rhodophyta (red algae) |
| I | Phaeophyceae (brown algae) |
| L | Marchantiophyta (liverworts) |
| M | Metazoa |
| O | Oomycota |
| P | Haptophyceae (prymnesiophytes) |
| Q | Raphidophyceae |
| R | Rhizaria |
| S | Synurophyceae |
| T | Tracheophyta (higher plants) |
| U | Eustigmatophyceae |
| ALL | All |

Example

Use case: trimming the ITS2 region from a fungal amplicon sequencing dataset stored as a PairedEndSequencesWithQuality qza, using two CPU threads. The example file, paired.qza, is in the Tests folder.

qiime itsxpress trim-pair --i-per-sample-sequences ~/paired.qza --p-region ITS2 \
--p-taxa F --p-threads 2 --o-trimmed ~/Desktop/out.qza
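
To inspect the result, the trimmed artifact can be summarized with the Qiime2 demux plugin. This is a sketch: the input path is the output of the example above, the .qzv file name is hypothetical, and the step is skipped when qiime or the trimmed file is not present.

```shell
# Output of the example above; the visualization file name is hypothetical.
trimmed="$HOME/Desktop/out.qza"
summary="$HOME/Desktop/out.qzv"

if command -v qiime >/dev/null 2>&1 && [ -f "$trimmed" ]; then
    # Summarize read counts and quality of the trimmed reads.
    qiime demux summarize --i-data "$trimmed" --o-visualization "$summary"
    status="summarized"
else
    status="skipped"
    echo "Skipping summary: qiime or $trimmed is not available"
fi
```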

License information

This software is a work of the United States Department of Agriculture, Agricultural Research Service and is released under a Creative Commons CC0 public domain attribution.
