taxopy

A Python package for obtaining complete lineages and the lowest common ancestor (LCA) from a set of taxonomic identifiers.

Installation

There are two ways to install taxopy:

  • Using pip:
pip install taxopy
  • Using conda:
conda install -c conda-forge -c bioconda taxopy

Usage

import taxopy

First, download the taxonomic information from NCBI's servers and load it into a TaxDb object:

taxdb = taxopy.TaxDb()
# You can also use your own set of taxonomy files:
taxdb = taxopy.TaxDb(nodes_dmp="taxdb/nodes.dmp", names_dmp="taxdb/names.dmp", keep_files=True)
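For orientation, NCBI's nodes.dmp holds one node per line, with fields separated by a tab, a vertical bar and a tab; the first three fields are the taxonomic identifier, the parent identifier and the rank. The sketch below shows how such lines could be read into plain dictionaries. It is not taxopy's own code: parse_nodes is a hypothetical helper, and the sample lines are hand-written and truncated to the first three fields.

```python
import io


def parse_nodes(handle):
    """Read nodes.dmp-style lines into taxid -> parent and taxid -> rank dicts."""
    taxid2parent, taxid2rank = {}, {}
    for line in handle:
        # Fields are delimited by "\t|\t" and each line ends with "\t|\n".
        fields = line.rstrip("\t|\n").split("\t|\t")
        taxid, parent, rank = fields[0], fields[1], fields[2]
        taxid2parent[taxid] = parent
        taxid2rank[taxid] = rank
    return taxid2parent, taxid2rank


# Hand-written sample in the nodes.dmp layout (first three fields only).
sample = io.StringIO(
    "1\t|\t1\t|\tno rank\t|\n"
    "131567\t|\t1\t|\tno rank\t|\n"
    "2\t|\t131567\t|\tsuperkingdom\t|\n"
)
taxid2parent, taxid2rank = parse_nodes(sample)
print(taxid2parent["2"], taxid2rank["2"])
```

Identifiers are kept as strings here to match the string keys used in the examples in this README.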

The TaxDb object stores the name, rank and parent-child relationships of each taxonomic identifier:

print(taxdb.taxid2name['2'])    # Bacteria
print(taxdb.taxid2parent['2'])  # 131567
print(taxdb.taxid2rank['2'])    # superkingdom

To get information about a given taxon, create a Taxon object from its taxonomic identifier:

human = taxopy.Taxon('9606', taxdb)
gorilla = taxopy.Taxon('9593', taxdb)
lagomorpha = taxopy.Taxon('9975', taxdb)

Each Taxon object stores a variety of information, such as the rank, identifier and name of the input taxon, and the identifiers and names of all the parent taxa:

print(lagomorpha.rank)  # order
print(lagomorpha.name)  # Lagomorpha
print(lagomorpha.name_lineage)
# ['Lagomorpha', 'Glires', 'Euarchontoglires', 'Boreoeutheria', 'Eutheria', 'Theria', 'Mammalia', 'Amniota', 'Tetrapoda', 'Dipnotetrapodomorpha', 'Sarcopterygii', 'Euteleostomi', 'Teleostomi', 'Gnathostomata', 'Vertebrata', 'Craniata', 'Chordata', 'Deuterostomia', 'Bilateria', 'Eumetazoa', 'Metazoa', 'Opisthokonta', 'Eukaryota', 'cellular organisms', 'root']
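A lineage like the one above can be derived from nothing more than a child-to-parent mapping, by following parents until the root is reached. The following is a minimal sketch of that idea, not taxopy's implementation: lineage is a hypothetical helper, and the toy dictionaries collapse all the intermediate ranks between Lagomorpha and Eukaryota.

```python
def lineage(taxid, taxid2parent):
    """Return the identifiers from `taxid` up to the root, most specific first."""
    path = [taxid]
    # The root is its own parent in NCBI's nodes.dmp, so stop there.
    while taxid2parent[path[-1]] != path[-1]:
        path.append(taxid2parent[path[-1]])
    return path


# Toy data: intermediate taxa are omitted for brevity.
toy_parent = {"1": "1", "131567": "1", "2759": "131567", "9975": "2759"}
toy_name = {
    "1": "root",
    "131567": "cellular organisms",
    "2759": "Eukaryota",
    "9975": "Lagomorpha",
}

ids = lineage("9975", toy_parent)
print(ids)
print([toy_name[taxid] for taxid in ids])
```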

You can get the lowest common ancestor of a list of taxa using the find_lca function:

human_lagomorpha_lca = taxopy.find_lca([human, lagomorpha], taxdb)
print(human_lagomorpha_lca.name)  # Euarchontoglires
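Conceptually, the lowest common ancestor is the most specific taxon that appears in every lineage. The sketch below illustrates this on plain lists of names; it is an assumption about the general technique rather than a description of how find_lca works internally. lca_from_lineages is a hypothetical helper and the lineages are abridged by hand.

```python
def lca_from_lineages(lineages):
    """Return the most specific taxon present in every lineage.

    Each lineage is ordered from the taxon itself up to the root.
    """
    shared = set(lineages[0]).intersection(*lineages[1:])
    # Lineages run from specific to general, so the first shared entry is the LCA.
    for taxon in lineages[0]:
        if taxon in shared:
            return taxon
    raise ValueError("lineages share no taxon")


# Abridged lineages, written by hand for illustration.
human_lineage = ["Homo sapiens", "Homininae", "Primates", "Euarchontoglires", "Mammalia", "root"]
lagomorpha_lineage = ["Lagomorpha", "Glires", "Euarchontoglires", "Mammalia", "root"]

print(lca_from_lineages([human_lineage, lagomorpha_lineage]))
```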

You can also use the find_majority_vote function to find the most specific taxon that is shared by more than half of the lineages in a list of taxa:

majority_vote = taxopy.find_majority_vote([human, gorilla, lagomorpha], taxdb)
print(majority_vote.name)  # Homininae
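The majority-vote rule can be illustrated the same way: count how many lineages contain each taxon, then keep the deepest taxon whose count exceeds half the number of lineages. This is a sketch of the rule as stated above, not taxopy's code. majority_vote_from_lineages is a hypothetical helper, the lineages are abridged by hand, and the depth calculation assumes that lineages sharing a taxon also share the same path from that taxon to the root.

```python
from collections import Counter


def majority_vote_from_lineages(lineages):
    """Return the most specific taxon found in more than half of the lineages."""
    # Count each taxon once per lineage that contains it.
    counts = Counter(taxon for lineage in lineages for taxon in set(lineage))
    best, best_depth = None, -1
    for lineage in lineages:
        for index, taxon in enumerate(lineage):
            if counts[taxon] > len(lineages) / 2:
                depth = len(lineage) - index  # number of steps to the root end
                if depth > best_depth:
                    best, best_depth = taxon, depth
                # Entries are ordered specific to general, so the first hit
                # is the deepest qualifying taxon in this lineage.
                break
    return best


# Abridged lineages, written by hand for illustration.
human_lineage = ["Homo sapiens", "Homininae", "Primates", "Euarchontoglires", "Mammalia", "root"]
gorilla_lineage = ["Gorilla gorilla", "Homininae", "Primates", "Euarchontoglires", "Mammalia", "root"]
lagomorpha_lineage = ["Lagomorpha", "Glires", "Euarchontoglires", "Mammalia", "root"]

print(majority_vote_from_lineages([human_lineage, gorilla_lineage, lagomorpha_lineage]))
```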

Acknowledgements

Some of the code used in taxopy was taken from the CAT/BAT tool for taxonomic classification of contigs and metagenome-assembled genomes.
