TreeSAPP is a functional and taxonomic annotation tool for genomes and metagenomes.

TreeSAPP: Tree-based Sensitive and Accurate Phylogenetic Profiler

Connor Morgan-Lang, Ryan McLaughlin, Grace Zhang, Kevin Chan, Zachary Armstrong, and Steven J. Hallam

Overview

TreeSAPP is a Python package for phylogenetically annotating genomes and metagenomes.

[Figure: diagram of the TreeSAPP workflow]

Installation

TreeSAPP supports Python versions 3.5, 3.6, 3.7 and 3.8.
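
Before installing, it can help to confirm that your interpreter falls in that range. The following is a minimal sketch, not part of TreeSAPP: the function name python_supported is made up for illustration, and it assumes the interpreter you intend to use is the one that python3 resolves to.

```shell
# Succeed if a "major.minor" version string is one of the versions listed above.
python_supported() {
    case "$1" in
        3.5|3.6|3.7|3.8) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on whichever interpreter `python3` resolves to, if there is one.
if command -v python3 >/dev/null 2>&1; then
    version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    if python_supported "$version"; then
        echo "Python $version is supported"
    else
        echo "Python $version is outside the supported range (3.5 to 3.8)"
    fi
fi
```

If the check reports an unsupported version, creating a conda environment or virtual environment with a supported Python is one way around it.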

Conda

TreeSAPP and most of its dependencies can be installed in its own environment using conda.

conda create -n treesapp_cenv -c bioconda -c conda-forge treesapp
conda activate treesapp_cenv

If you plan on building your own reference packages, you will also need USEARCH.

Singularity

If you are working in an HPC environment without conda, a Singularity container is also available:

singularity pull library://cmorganl/default/treesapp
singularity exec treesapp.sif treesapp

PyPI

The most recent release of TreeSAPP is hosted on the Python Package Index (PyPI) and can be installed with pip install treesapp. Alternatively, you can install the latest development version locally from a git clone. In either case, we recommend installing within a virtual environment created with the Python package virtualenv.

cd ~/bin
virtualenv ~/bin/treesapp_venv
source ~/bin/treesapp_venv/bin/activate
git clone https://github.com/hallamlab/TreeSAPP.git
cd TreeSAPP/
python setup.py sdist
pip install dist/treesapp*.tar.gz

If you installed TreeSAPP with pip or by cloning the development version from GitHub, you will need to install any dependencies you do not already have. Their executables must be on your environment's PATH.
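
A quick way to see which executables are still missing from your PATH is sketched below. The function name missing_from_path is hypothetical, and the tool names you pass in are up to you; this README does not list the dependency executables, so none are assumed here.

```shell
# Print each named executable that cannot be found on PATH, one per line.
# Prints nothing when every name resolves.
missing_from_path() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For example, missing_from_path usearch would print usearch if no executable by that name is found (assuming the USEARCH binary is installed under the name usearch, which may differ on your system).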

Running TreeSAPP

To list all the sub-commands, run treesapp.

To test the assign workflow, run:

treesapp assign -i ~/bin/TreeSAPP/test_data/marker_test_suite.faa -m prot --trim_align -o assign_test -t M0701,M0702,M0705

To assign sequences in your genome of interest:

treesapp assign -i Any.fasta -o ~/path/to/output/directory/

As in the test command above, we recommend using the --trim_align flag and increasing the number of threads with -n.
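
If you have several genomes to annotate, the recommendation above can be applied in a loop. This is only a sketch: the helper assign_cmd, the file names genomeA.fasta and genomeB.fasta, the output directory layout and the thread count of 4 are all placeholders. It prints the commands instead of running them, so you can inspect them first.

```shell
# Print (without running) a treesapp assign command for one input FASTA.
# Arguments: input FASTA, output directory, number of threads.
# Only flags shown elsewhere in this README are used.
assign_cmd() {
    echo "treesapp assign -i $1 -o $2 --trim_align -n $3"
}

# Dry run over placeholder genome files, one output directory per genome.
for fasta in genomeA.fasta genomeB.fasta; do
    assign_cmd "$fasta" "assign_output/${fasta%.fasta}" 4
done
```

Once the printed commands look right, they can be run by pasting them into a shell. Note that paths containing spaces would need quoting, which this sketch does not handle.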

Tutorials

If we do not yet have a reference package for a gene you are interested in, please try building a new reference package. If you run into any problems, or would like to collaborate on building many reference packages, don't hesitate to email us or create a new issue with an 'enhancement' label.

To determine whether the sequences used to build your new reference package are what you think they are, and whether it might unexpectedly annotate homologous sequences, see the purity tutorial.

If you are working with a particularly complex reference package, from an orthologous group for example, or have extra phylogenetic information you'd like to include in your classifications, try annotating extra features with treesapp layer.

Yet to come

Download files

Source Distribution

treesapp-0.6.7.tar.gz (7.0 MB)

Built Distributions

treesapp-0.6.7-cp38-cp38-manylinux2010_x86_64.whl (7.8 MB): CPython 3.8, manylinux (glibc 2.12+), x86-64
treesapp-0.6.7-cp38-cp38-macosx_10_9_x86_64.whl (7.4 MB): CPython 3.8, macOS 10.9+, x86-64
treesapp-0.6.7-cp37-cp37m-manylinux2010_x86_64.whl (7.8 MB): CPython 3.7m, manylinux (glibc 2.12+), x86-64
treesapp-0.6.7-cp37-cp37m-macosx_10_6_intel.whl (7.4 MB): CPython 3.7m, macOS 10.6+, intel
treesapp-0.6.7-cp36-cp36m-manylinux2010_x86_64.whl (7.8 MB): CPython 3.6m, manylinux (glibc 2.12+), x86-64
treesapp-0.6.7-cp36-cp36m-macosx_10_6_intel.whl (7.4 MB): CPython 3.6m, macOS 10.6+, intel
treesapp-0.6.7-cp35-cp35m-manylinux2010_x86_64.whl (7.8 MB): CPython 3.5m, manylinux (glibc 2.12+), x86-64
treesapp-0.6.7-cp35-cp35m-macosx_10_6_intel.whl (7.4 MB): CPython 3.5m, macOS 10.6+, intel
