Assigns taxonomic ranks based on evolutionary divergence.

PhyloRank

If you are looking to classify genomes according to the methodology used by the GTDB, we recommend using our companion tool GTDB-Tk instead of PhyloRank. PhyloRank is intended to aid the manual taxonomic curation of trees inferred from genomes spanning the bacterial or archaeal domain.

PhyloRank provides functionality for calculating the relative evolutionary divergence (RED) of taxa in a tree and for finding the best placement of taxonomic labels in a tree. Other functionality is in development and is currently unsupported.

Install

The simplest way to install this package is through pip:

sudo pip install phylorank

This package makes use of the numpy, matplotlib, jinja2 (>=2.7.3), and dendropy (>=4.1.0) Python libraries. These must be installed separately. PhyloRank also uses mpld3 (>=0.2), which is included directly in this package. It also requires biolib (>=0.1.0), which is installed along with PhyloRank if you install through pip.

Starting with v0.1.0, PhyloRank requires Python 3.
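
Before running PhyloRank, it can be handy to confirm that the separately installed libraries are importable. The following is a minimal sketch, not part of PhyloRank itself; it assumes each library's import name matches the name listed above, and it does not check version numbers. The helper name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec

# Import names of the libraries listed above (assumed to match the package names).
REQUIRED_MODULES = ["numpy", "matplotlib", "jinja2", "dendropy"]


def missing_modules(names):
    """Return the subset of module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing libraries: " + ", ".join(missing))
    else:
        print("All listed libraries are importable.")
```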

Calculating RED

PhyloRank can calculate the relative evolutionary divergence (RED) of taxa in a tree in order to identify taxa at a given taxonomic rank that have conspicuous placements in the tree. This information can be used to refine the placement of taxa in the tree with the aim of taxa at the same rank having similar RED. RED values can be calculated using:

>phylorank outliers <input_tree> <taxonomy_file> <output_dir>

where <input_tree> is a tree in Newick format that is decorated with taxonomic information, the <taxonomy_file> indicates the taxonomic assignment of each genome in the tree (see File Formats below), and <output_dir> is the location to store all results. This command assumes the taxonomically-decorated tree and taxonomy file are congruent. The taxonomy file is required in order to establish taxonomic affiliations that could not be provided in the tree (e.g., a species with a single representative and consequently no internal node in the tree). The output directory contains a table indicating the RED value calculated for all taxa along with a plot indicating the RED value of taxa at each rank.
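
To make the quantity concrete, here is a small sketch of how RED can be computed on a toy tree. It assumes the definition of RED as it is usually stated for the GTDB: the root has RED 0, every leaf has RED 1, and an internal node n with parent RED p gets p + (d/u)(1 - p), where d is the branch length from n to its parent and u is the mean distance from the parent to the leaves below n. The `Node` class and function names are hypothetical; PhyloRank's own implementation operates on real tree files and may differ in its details.

```python
class Node:
    """Minimal rooted-tree node; `length` is the branch length to the parent."""

    def __init__(self, name=None, length=0.0, children=()):
        self.name = name
        self.length = length
        self.children = list(children)

    def is_leaf(self):
        return not self.children


def leaf_distances(node):
    """Distances from `node` to each leaf below it."""
    if node.is_leaf():
        return [0.0]
    return [child.length + d for child in node.children for d in leaf_distances(child)]


def compute_red(root):
    """Map each node to its RED value, visiting nodes from the root down."""
    red = {root: 0.0}
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in parent.children:
            if child.is_leaf():
                red[child] = 1.0
            else:
                # u: mean distance from the parent to the leaves below `child`.
                dists = [child.length + d for d in leaf_distances(child)]
                u = sum(dists) / len(dists)
                p = red[parent]
                red[child] = p + (child.length / u) * (1.0 - p)
            stack.append(child)
    return red


# Toy tree ((A:1,B:1)X:1,C:2): node X sits halfway between root and leaves.
x = Node("X", 1.0, [Node("A", 1.0), Node("B", 1.0)])
root = Node("root", 0.0, [x, Node("C", 2.0)])
print(compute_red(root)[x])  # prints 0.5
```

The sketch does not guard against a subtree whose branch lengths are all zero, which would make u zero.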

Decorating a Tree

PhyloRank can decorate a tree with taxonomic information. This is accomplished by determining the placement for each taxon that results in the highest F-measure (see McDonald et al., 2012). If the provided taxonomy is incongruent with the topology of the tree, only the most cohesive lineage of a polyphyletic group will be labelled, and taxa at the same rank may end up nested within each other (e.g., a lineage may contain two genus labels). The resulting decorated tree can be manually curated to obtain a taxonomy that is consistent with the tree topology. Alternatively, the resulting F-measure scores can be examined to assess the congruency between the taxonomy and tree topology. A tree can be decorated using:

>phylorank decorate <input_tree> <taxonomy_file> <output_tree> --skip_rd_refine

where <input_tree> is a tree in Newick format, <taxonomy_file> indicates the taxonomic assignment of each genome in the tree (see File Formats below), and <output_tree> is the desired name of the decorated tree. The --skip_rd_refine flag indicates that the placement of taxa in the tree should not be adjusted to account for their RED. Adjusting for RED is only recommended once the taxonomy and tree topology have been established as being largely congruent. The file <output_tree>-table indicates the F-measure, precision, and recall for each taxon in the tree and the file <output_tree>-taxonomy gives the assigned taxonomy of each genome in the tree.
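
The F-measure, precision, and recall reported for a taxon can be illustrated with a short sketch. It assumes the standard definitions: precision is the fraction of genomes below a node that belong to the taxon, recall is the fraction of the taxon's genomes that fall below the node, and the F-measure is their harmonic mean. The function names and the dictionary of candidate nodes are hypothetical and only illustrate the idea; this is not PhyloRank's code.

```python
def f_measure(node_genomes, taxon_genomes):
    """Return (f_measure, precision, recall) for placing a taxon label on a node.

    node_genomes:  set of genome IDs descending from the candidate node
    taxon_genomes: set of genome IDs assigned to the taxon in the taxonomy
    """
    shared = len(node_genomes & taxon_genomes)
    if shared == 0:
        return 0.0, 0.0, 0.0
    precision = shared / len(node_genomes)
    recall = shared / len(taxon_genomes)
    return 2 * precision * recall / (precision + recall), precision, recall


def best_placement(candidates, taxon_genomes):
    """Pick the candidate node (name -> genome set) with the highest F-measure."""
    return max(candidates, key=lambda name: f_measure(candidates[name], taxon_genomes)[0])


# A taxon with three genomes and two candidate nodes for its label.
taxon = {"g1", "g2", "g3"}
candidates = {
    "node_a": {"g1", "g2"},              # pure but incomplete
    "node_b": {"g1", "g2", "g3", "g4"},  # complete but includes an outsider
}
print(best_placement(candidates, taxon))  # prints node_b
```

Here node_a scores 0.8 (precision 1, recall 2/3) while node_b scores about 0.857 (precision 0.75, recall 1), so the label would go on node_b.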

File formats

The taxonomy file is a simple tab-separated values file with two columns indicating the genome ID and Greengenes-style taxonomy string, e.g.:

>GCF_001687105.1    d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rhodobacterales;f__Rhodobacteraceae;g__Yangia;s__
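
A taxonomy file in this format can be read with a few lines of Python. The sketch below is a hypothetical helper rather than PhyloRank's own parser; it assumes one genome per line, a single tab between the two columns, and the seven rank prefixes (d__ through s__) seen in the example above.

```python
RANK_PREFIXES = ("d__", "p__", "c__", "o__", "f__", "g__", "s__")


def parse_taxonomy_line(line):
    """Split one line into (genome_id, list of rank labels)."""
    genome_id, taxonomy = line.rstrip("\n").split("\t", 1)
    ranks = [r.strip() for r in taxonomy.split(";")]
    return genome_id, ranks


def check_ranks(ranks):
    """Return True if the labels carry the seven expected prefixes in order."""
    return len(ranks) == len(RANK_PREFIXES) and all(
        rank.startswith(prefix) for rank, prefix in zip(ranks, RANK_PREFIXES)
    )


def read_taxonomy(path):
    """Read a taxonomy file into a dict mapping genome ID to rank labels."""
    taxonomy = {}
    with open(path) as handle:
        for line in handle:
            if line.strip():  # skip blank lines
                genome_id, ranks = parse_taxonomy_line(line)
                taxonomy[genome_id] = ranks
    return taxonomy
```

An empty label such as s__ is kept as-is, so a genome without a species assignment still has seven entries.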

Example data

Example data is provided in the example_data directory. The example tree can be decorated and the RED of each taxon established as follows:

>phylorank decorate --skip_rd_refine ar122_r89.tree ar122_taxonomy_r89.tsv ar122_r89.decorated.tree
>phylorank outliers ar122_r89.decorated.tree ar122_taxonomy_r89.tsv output_dir
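
The same two steps can be scripted from Python when they need to be repeated over several trees. This is a sketch that only wraps the two commands shown above with `subprocess`; the function names are hypothetical, and it assumes the `phylorank` executable is on the PATH.

```python
import subprocess


def decorate_cmd(input_tree, taxonomy_file, output_tree, skip_rd_refine=True):
    """Build the argument list for `phylorank decorate`."""
    cmd = ["phylorank", "decorate", input_tree, taxonomy_file, output_tree]
    if skip_rd_refine:
        cmd.append("--skip_rd_refine")
    return cmd


def outliers_cmd(input_tree, taxonomy_file, output_dir):
    """Build the argument list for `phylorank outliers`."""
    return ["phylorank", "outliers", input_tree, taxonomy_file, output_dir]


def run_example(tree, taxonomy, decorated_tree, output_dir):
    """Decorate the tree, then compute RED on the decorated tree."""
    for cmd in (
        decorate_cmd(tree, taxonomy, decorated_tree),
        outliers_cmd(decorated_tree, taxonomy, output_dir),
    ):
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

Calling `run_example("ar122_r89.tree", "ar122_taxonomy_r89.tsv", "ar122_r89.decorated.tree", "output_dir")` from the example_data directory would issue the two commands in order.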

Cite

If you find this package useful, please cite this Git repository (https://github.com/dparks1134/PhyloRank).

Copyright

Copyright © 2015 Donovan Parks. See LICENSE for further details.
