Caltha

A Python package for processing UMI-tagged mixed amplicon metabarcoding data.

Installation

The current version of Caltha requires Python 3.8+.

To install Caltha with pip, run:

pip install caltha

NOTE: Caltha requires one additional dependency, vsearch (2.14.2), which is not installed by the Caltha pip or conda package.
The following conda command should install it:

conda install -c bioconda vsearch
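
Because vsearch is installed separately, it can help to confirm that it is reachable before running Caltha. The following is a minimal sketch using only the Python standard library; the function names `find_vsearch` and `vsearch_version_banner` are hypothetical helpers and are not part of Caltha.

```python
import shutil
import subprocess
from typing import Optional


def find_vsearch() -> Optional[str]:
    """Return the full path of the vsearch executable, or None if it is not on PATH."""
    return shutil.which("vsearch")


def vsearch_version_banner(executable: str) -> str:
    """Run `vsearch --version` and return whatever text it prints."""
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # The banner may be written to stdout or stderr, so keep both streams.
    return (result.stdout + result.stderr).strip()


if __name__ == "__main__":
    path = find_vsearch()
    if path is None:
        print("vsearch not found on PATH; install it with conda first.")
    else:
        print(f"vsearch found at {path}")
        print(vsearch_version_banner(path))
```

If the reported version differs from 2.14.2, the pinned form of the conda command (`vsearch=2.14.2`) can be used instead.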

How to run

Caltha can be run directly from the command line.

usage: caltha [-h] [-v] [-i FLINPUT] [-t FLTABULAR] [-z FLPREZIP] [-b FLBLAST]
              [-f STRFORMAT] [-l STRLOCATION] [-a STRANCHOR] [-u INTUMILENGTH]
              [-y FLTIDENTITY] [-c INTABUNDANCE] [-w STRFORWARD]
              [-r STRREVERSE] [-d STRDIRECTORY] [-@ INTTHREADS]

A python package for processing UMI tagged mixed amplicon metabarcoding data.

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -i FLINPUT, --input FLINPUT
                        The input fasta/fastq file(s). This can either be a
                        zip archive or a single fasta/fastq file.
  -t FLTABULAR, --tabular FLTABULAR
                        The output tabular zip file.
  -z FLPREZIP, --zip FLPREZIP
                        The pre validation zip file.
  -b FLBLAST, --blast FLBLAST
                        The output blast zip file.
  -f STRFORMAT, --format STRFORMAT
                        The format of the input file
                        [fasta/fastq]. (default: fasta)
  -l STRLOCATION, --location STRLOCATION
                        Search for UMIs at the 5'-end [umi5], 3'-end [umi3] or 
                        at the 5'-end and 3'-end [umidouble]. (default: umi5)
  -a STRANCHOR, --anchor STRANCHOR
                        Which anchor type to use
                        [primer/adapter/zero]. (default: primer)
  -u INTUMILENGTH, --length INTUMILENGTH
                        The length of the UMI sequence. (default: 5)
  -y FLTIDENTITY, --identity FLTIDENTITY
                        The identity percentage with which to perform the
                        validation. (default: 0.97)
  -c INTABUNDANCE, --abundance INTABUNDANCE
                        The minimum abundance of a sequence in order for it
                        to be included during validation. (default: 1)
  -w STRFORWARD, --forward STRFORWARD
                        The 5'-end anchor nucleotides.
  -r STRREVERSE, --reverse STRREVERSE
                        The 3'-end anchor nucleotides.
  -d STRDIRECTORY, --directory STRDIRECTORY
                        The location of the temporary working directory
                        (not created by program). (default: .)
  -@ INTTHREADS, --threads INTTHREADS
                        The number of threads to run Caltha
                        with. (default: number of threads available on system)

This python package requires one extra dependency which can be easily
installed with conda (conda install -c bioconda vsearch=2.14.2).
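
As an illustration of how the options above fit together, the sketch below assembles a Caltha command line from Python. It uses only the flags listed in the help text; the helper name `build_caltha_command`, the file names, and the anchor sequences are hypothetical placeholders. The help text does not state which options are mandatory for each mode, so passing all of them here is an assumption.

```python
import shlex
from typing import List, Optional


def build_caltha_command(
    input_path: str,
    tabular_zip: str,
    prevalidation_zip: str,
    blast_zip: str,
    forward_anchor: str,
    reverse_anchor: str,
    file_format: str = "fasta",
    location: str = "umi5",
    anchor: str = "primer",
    umi_length: int = 5,
    identity: float = 0.97,
    abundance: int = 1,
    directory: str = ".",
    threads: Optional[int] = None,
) -> List[str]:
    """Build the argument list for a caltha call; defaults mirror the help text."""
    command = [
        "caltha",
        "-i", input_path,
        "-t", tabular_zip,
        "-z", prevalidation_zip,
        "-b", blast_zip,
        "-f", file_format,
        "-l", location,
        "-a", anchor,
        "-u", str(umi_length),
        "-y", str(identity),
        "-c", str(abundance),
        "-w", forward_anchor,
        "-r", reverse_anchor,
        "-d", directory,
    ]
    # Leave -@ out so Caltha falls back to its own default thread count.
    if threads is not None:
        command += ["-@", str(threads)]
    return command


if __name__ == "__main__":
    # Placeholder file names and anchor sequences, for illustration only.
    example = build_caltha_command(
        "reads.zip", "tabular.zip", "prevalidation.zip", "blast.zip",
        forward_anchor="ACGTACGT", reverse_anchor="TGCATGCA",
        file_format="fastq",
    )
    print(shlex.join(example))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once Caltha and vsearch are installed. The working directory given with `-d` must already exist, since the help text notes it is not created by the program.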

Further documentation can be found here.
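
To make the `--location`, `--anchor` and `--length` options more concrete, here is a minimal sketch of what a 5'-end UMI search with a primer anchor could look like. It is not Caltha's implementation: the function name `split_umi5` is hypothetical, and the sketch assumes that the UMI sits directly upstream of the anchor and that the anchor is matched exactly.

```python
from typing import Optional, Tuple


def split_umi5(
    read: str, forward_anchor: str, umi_length: int = 5
) -> Optional[Tuple[str, str]]:
    """Return (umi, remainder) for a read whose UMI directly precedes the anchor.

    Returns None when the anchor is absent or when fewer than umi_length
    bases precede it.
    """
    anchor_start = read.find(forward_anchor)
    if anchor_start < umi_length:
        # Covers both "anchor not found" (-1) and "too few bases upstream".
        return None
    umi = read[anchor_start - umi_length:anchor_start]
    remainder = read[anchor_start:]
    return umi, remainder


if __name__ == "__main__":
    # Hypothetical read: 5-base UMI, then a placeholder anchor, then the amplicon.
    print(split_umi5("AACGTGGTCAACATTTTCCCC", "GGTCAACA"))
    # -> ('AACGT', 'GGTCAACATTTTCCCC')
```

Reads that share a UMI can then be grouped before validation; how Caltha groups and validates them is covered by its own documentation rather than by this sketch.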

Source(s)

  • Python Software Foundation,
    Python 3.8+. 2019.
    Python
  • Rognes T, Flouri T, Nichols B, Quince C, Mahe F,
    VSEARCH: A versatile open source tool for metagenomics.
    PeerJ. 2016. doi: 10.7717/peerj.2584
    vsearch
  • Augspurger T, Ayd W, Bartak C, Battiston P, Cloud P, Garcia M,
    Python Data Analysis Library.
    Pandas
  • Langa L, Willing C, Meyer C, Zijlstra J, Naylor M, Dollenstein Z,
    The uncompromising Python code formatter.
    Black
  • Ziadé T, Cordasco I,
    Your tool for style guide enforcement.
    Flake8
  • Sottile A, Struys K, Kuehl C, Finkle M,
    A framework for managing and maintaining multi-language pre-commit hooks.
    Pre-commit
  • Python Software Foundation,
    The Python Package index.
    PyPI
  • Du L,
    A lightweight Python C extension for easy access to sequences from plain and gzipped fasta/q files.
    Pyfastx
  • Cock P, Antao T, Chang J, Chapman B, Cox C, Dalke A,
    Biopython: freely available Python tools for computational molecular biology and bioinformatics.
    Bioinformatics. 2009; 25(11): 1422-1423. doi: 10.1093/bioinformatics/btp163
    Biopython

Author(s)

Citation

Copyright (C) 2018 Jasper Boom

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License version 3 as
published by the Free Software Foundation.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
