Routines for loading, saving, and manipulating taxonomic trees

Taxonomy

This is a Rust library for reading, writing, and editing biological taxonomies. There are associated Python bindings for accessing most of the functionality from Python.

This library was initially developed as a component of One Codex's metagenomic classification pipeline before being refactored out, expanded, and open-sourced. It can be used as is with a number of taxonomic formats, or the Taxonomy trait it provides can be used to add methods such as last common ancestor and traversal to a downstream package's taxonomy implementation.

The library ships with a number of features:

  • Common support for taxonomy handling across Rust and Python
  • Fast, with low(er) memory usage
  • NCBI taxonomy, JSON ("tree" and "node_link_data" formats), Newick, and PhyloXML support
  • Easily extensible (in Rust) to support other formats and operations

Python Usage

The Python taxonomy API can open and manipulate all of the formats from the Rust library:

from taxonomy import Taxonomy

tax = Taxonomy.from_newick('(A,(B,C)D)E;')
assert tax.parent('A') == 'E'
assert tax.parent('B') == 'D'
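
To make the parent relationships in that Newick string concrete, here is a minimal pure-Python sketch of the same tree as a child-to-parent map, with a lineage walk and a last common ancestor lookup built on top of it. The PARENTS dict and the lineage and lca helpers are hypothetical names used only for illustration; they are not the taxonomy package's API or its implementation.

```python
# Hypothetical stand-in for the tree '(A,(B,C)D)E;' as a child -> parent map.
# The root (E) has no parent. Nothing here comes from the taxonomy package.
PARENTS = {"A": "E", "D": "E", "B": "D", "C": "D", "E": None}


def lineage(node, parents):
    """Return the path from node up to the root, starting with node itself."""
    path = []
    while node is not None:
        path.append(node)
        node = parents[node]
    return path


def lca(a, b, parents):
    """Return the deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors_of_a = set(lineage(a, parents))
    # Walk upward from b; the first node also found above a is the LCA.
    for node in lineage(b, parents):
        if node in ancestors_of_a:
            return node
    return None


assert lineage("B", PARENTS) == ["B", "D", "E"]
assert lca("B", "C", PARENTS) == "D"
assert lca("A", "B", PARENTS) == "E"
```

Walking both lineages is linear in the depth of the tree, which is enough for a sketch; a real implementation may well use a different strategy.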

If you have the NCBI taxonomy locally (available from the NCBI FTP site), you can use that too:

ncbi_tax = Taxonomy.from_ncbi('./nodes.dmp', './names.dmp')
assert ncbi_tax.name('562') == 'Escherichia coli'
assert ncbi_tax.rank('562') == 'species'

Note that Taxonomy IDs in NCBI format are integers, but they're converted to strings on import. We find working with "string taxonomy IDs" greatly simplifies interoperation between different taxonomy systems.
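
As a rough illustration of what keeping IDs as strings looks like, the sketch below parses a few lines laid out like NCBI's nodes.dmp (fields separated by a tab, a pipe, and a tab, with each line ending in a tab and a pipe) and never converts the IDs to integers. SAMPLE_NODES and parse_nodes are hypothetical names for this example only, the sample is trimmed to the first three columns, and this is not how the library itself reads the file.

```python
# Three lines in the layout of NCBI's nodes.dmp. Real files carry more
# columns; this illustrative sample keeps only tax ID, parent tax ID, and rank.
SAMPLE_NODES = (
    "1\t|\t1\t|\tno rank\t|\n"
    "561\t|\t543\t|\tgenus\t|\n"
    "562\t|\t561\t|\tspecies\t|\n"
)


def parse_nodes(text):
    """Map each tax ID (kept as a string) to a (parent ID, rank) pair."""
    nodes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Drop the trailing "\t|" terminator, then split on the separator.
        if line.endswith("\t|"):
            line = line[:-2]
        fields = line.split("\t|\t")
        tax_id, parent_id, rank = fields[0], fields[1], fields[2]
        # No int() conversion: IDs stay strings, as described above.
        nodes[tax_id] = (parent_id, rank)
    return nodes


nodes = parse_nodes(SAMPLE_NODES)
assert nodes["562"] == ("561", "species")
assert 562 not in nodes  # integer keys are never created
```

With string keys, IDs from a taxonomy that uses non-numeric identifiers can share the same mapping without a separate code path.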

Installation

Rust

This library can be added to an existing Cargo.toml file and installed straight from crates.io.

Python

You can install the Python bindings directly from PyPI (binaries are only built for select architectures) with:

pip install taxonomy

Development

Rust

There is a test suite runnable with cargo test.

Python

To work on the Python library on a Mac OS X/Unix system (requires Python 3):

# you need the nightly version of Rust installed
curl https://sh.rustup.rs -sSf | sh
rustup default nightly

# finally, install the library
./setup.py install  # (or ./setup.py develop)

Building binary wheels and pushing to PyPI

# For each supported Python version and architecture combination...
## On a Mac
python setup.py install
python setup.py bdist_wheel
twine upload dist/*

## On Linux
# I built the 0.3.1 wheels with a forked version of pyo3-pack; we should reevaluate the next time
# we build wheels if we can just get this to work with rust-python.
git clone https://github.com/onecodex/pyo3-pack
docker run -it --rm --entrypoint /bin/bash -v .../pyo3-pack:/pyo3pack -v .../taxonomy/:/mnt konstin2/pyo3-pack
cd /pyo3pack
cargo install --path . --force
cd /mnt
rustup default nightly
pyo3-pack build --python-feature-gate python

Other Taxonomy Libraries

There are taxonomic toolkits for other programming languages that offer different features and provided some inspiration for this library:

  • ETE Toolkit (http://etetoolkit.org/): a Python taxonomy library
  • Taxize (https://ropensci.github.io/taxize-book/): an R toolkit for working with taxonomic data
