
Routines for loading, saving, and manipulating taxonomic trees


Taxonomy


This is a Rust library for reading, writing, and editing biological taxonomies. There are associated Python bindings for accessing most of the functionality from Python.

This library was initially developed as a component of One Codex's metagenomic classification pipeline before being refactored out, expanded, and open-sourced. It can be used as is with a number of taxonomic formats, or the Taxonomy trait it provides can be used to add last common ancestor, traversal, and other methods to a downstream package's taxonomy implementation.
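To make "last common ancestor" and "traversal" concrete, here is a minimal pure-Python sketch over a child-to-parent map. It does not use this library and is not its implementation; the names `lineage`, `lca`, and `PARENTS` are hypothetical and exist only for illustration.

```python
def lineage(parents, node):
    """Return the path from node up to the root, node first."""
    path = [node]
    while path[-1] in parents:
        path.append(parents[path[-1]])
    return path


def lca(parents, a, b):
    """Return the deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors_of_a = set(lineage(parents, a))
    for node in lineage(parents, b):
        if node in ancestors_of_a:
            return node
    return None  # a and b do not share a root


# Hypothetical child -> parent map; the root "E" has no entry.
PARENTS = {"A": "E", "D": "E", "B": "D", "C": "D"}

print(lineage(PARENTS, "B"))   # ['B', 'D', 'E']
print(lca(PARENTS, "B", "C"))  # D
print(lca(PARENTS, "A", "C"))  # E
```

Walking the second lineage from the node upward means the first shared node found is the deepest one, which is what makes it the last common ancestor.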

The library ships with a number of features:

  • Common support for taxonomy handling across Rust and Python
  • Fast, with low(er) memory usage
  • NCBI taxonomy, JSON ("tree" and "node_link_data" formats), Newick, and PhyloXML support
  • Easily extensible (in Rust) to support other formats and operations

Python Usage

The Python taxonomy API can open and manipulate all of the formats from the Rust library:

from taxonomy import Taxonomy

tax = Taxonomy.from_newick('(A,(B,C)D)E;')
assert tax.parent('A') == 'E'
assert tax.parent('B') == 'D'
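For readers unfamiliar with Newick, the sketch below shows how the parent relationships asserted above follow from the string itself. It is a simplified illustration, not the library's parser: the helper `newick_parents` is hypothetical, and it assumes every node is labelled and that the input has no branch lengths, quoting, or comments.

```python
def newick_parents(newick):
    """Return {child: parent} for a Newick string whose nodes are all labelled."""
    text = newick.strip().rstrip(";")
    parents = {}
    stack = []     # each entry collects the children of one open "(" group
    closed = None  # children of the group just closed, waiting for their parent's label
    label = ""
    for ch in text + ",":  # the trailing comma is a sentinel that flushes the last label
        if ch == "(":
            stack.append([])
        elif ch in ",)":
            if closed is not None:
                # The label just read names the parent of the closed group.
                for child in closed:
                    parents[child] = label
                closed = None
            if stack:
                stack[-1].append(label)
            label = ""
            if ch == ")":
                closed = stack.pop()
        else:
            label += ch
    return parents


parents = newick_parents("(A,(B,C)D)E;")
print(parents["A"])  # E
print(parents["B"])  # D
```

In Newick, a parent's label follows the closing parenthesis of its children, which is why the sketch holds each closed group until the next label has been read.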

If you have the NCBI taxonomy locally (found on their FTP), you can use that too:

ncbi_tax = Taxonomy.from_ncbi('./nodes.dmp', './names.dmp')
assert ncbi_tax.name('562') == 'Escherichia coli'
assert ncbi_tax.rank('562') == 'species'

Note that taxonomy IDs in NCBI format are integers, but they are converted to strings on import. We find that working with string taxonomy IDs greatly simplifies interoperation between different taxonomy systems.
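The string-ID convention can be illustrated with a small stand-alone sketch that reads rows in the nodes.dmp layout (fields separated by a tab, a pipe, and a tab) and keeps every ID as a string. This is not the library's loader: `parse_nodes` and `SAMPLE` are hypothetical, and the sample rows are abbreviated to the first three fields of the real file.

```python
# Abbreviated, illustrative rows: tax ID, parent tax ID, rank.
SAMPLE = (
    "1\t|\t1\t|\tno rank\t|\n"
    "561\t|\t543\t|\tgenus\t|\n"
    "562\t|\t561\t|\tspecies\t|\n"
)


def parse_nodes(text):
    """Map each tax ID (kept as a string) to its (parent ID, rank) pair."""
    nodes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Drop the trailing "\t|" terminator, then split on the field separator.
        fields = line.rstrip("\t|").split("\t|\t")
        tax_id, parent_id, rank = fields[0], fields[1], fields[2]
        nodes[tax_id] = (parent_id, rank)
    return nodes


nodes = parse_nodes(SAMPLE)
print(nodes["562"])  # ('561', 'species')
print(562 in nodes)  # False: IDs are strings, not integers
```

Because the keys are strings, the same mapping could also hold IDs from a system that uses non-numeric identifiers, such as the labels in a Newick tree.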

Installation

Rust

This library can be added to an existing Cargo.toml file and installed straight from crates.io.

Python

You can install the Python bindings directly from PyPI (binaries are only built for select architectures) with:

pip install taxonomy

Development

Rust

There is a test suite runnable with cargo test. To test the Python bindings, you need to enable the additional python_test feature: cargo test --features python_test.

Python

To work on the Python library on a Mac OS X/Unix system (requires Python 3):

# you need the nightly version of Rust installed
curl https://sh.rustup.rs -sSf | sh
rustup default nightly

# finally, install the library
maturin install --cargo-extra-args="--features=python"

Building binary wheels and pushing to PyPI

# The Mac build requires switching through a few different Python versions
maturin build --cargo-extra-args="--features=python" --release --strip

# The Linux build is automated through cross-compiling in a Docker image
docker run --rm -v $(pwd):/io konstin2/maturin:master build --cargo-extra-args="--features=python" --release --strip
twine upload target/wheels/*

Other Taxonomy Libraries

There are taxonomic toolkits for other programming languages that offer different features and provided some inspiration for this library:

ETE Toolkit (http://etetoolkit.org/): a Python taxonomy library

Taxize (https://ropensci.github.io/taxize-book/): an R toolkit for working with taxonomic data

Files for taxonomy, version 0.4.0

Filename                                            Size      File type  Python version
taxonomy-0.4.0-cp38-cp38-macosx_10_7_x86_64.whl     283.8 kB  Wheel      cp38
taxonomy-0.4.0-cp37-cp37m-manylinux1_x86_64.whl     317.8 kB  Wheel      cp37
taxonomy-0.4.0-cp37-cp37m-macosx_10_7_x86_64.whl    283.8 kB  Wheel      cp37
taxonomy-0.4.0-cp36-cp36m-manylinux1_x86_64.whl     317.9 kB  Wheel      cp36
taxonomy-0.4.0-cp36-cp36m-macosx_10_7_x86_64.whl    284.0 kB  Wheel      cp36
taxonomy-0.4.0-cp35-cp35m-manylinux1_x86_64.whl     317.9 kB  Wheel      cp35
taxonomy-0.4.0-cp35-cp35m-macosx_10_7_x86_64.whl    283.9 kB  Wheel      cp35
