Convert NCBI TaxIDs to scientific species names and vice versa

get-tax-info

Load the NCBI taxonomy (names.dmp and nodes.dmp) into a SQLite database for fast, indexed lookups.
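
To illustrate the general idea, here is a minimal sketch of loading names.dmp-style rows into SQLite and querying them through indexes. It is not the package's actual code: the table layout, the index names, and the helpers `parse_dmp_line`, `scientific_name`, and `taxid_of` are assumptions made for this example, and the two sample rows stand in for the real dump file.

```python
import sqlite3

# Two sample rows in the NCBI .dmp layout: fields are separated by
# "\t|\t" and every row ends with "\t|".
NAMES_DMP = (
    "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|\n"
    "9606\t|\thuman\t|\t\t|\tgenbank common name\t|\n"
)


def parse_dmp_line(line):
    """Split one .dmp row into its fields (hypothetical helper)."""
    line = line.rstrip("\n")
    if line.endswith("\t|"):
        line = line[:-2]
    return line.split("\t|\t")


conn = sqlite3.connect(":memory:")
# Assumed schema, chosen for this sketch only.
conn.execute("CREATE TABLE names (tax_id INTEGER, name TEXT, name_class TEXT)")

rows = []
for line in NAMES_DMP.splitlines():
    tax_id, name, _unique_name, name_class = parse_dmp_line(line)
    rows.append((int(tax_id), name, name_class))
conn.executemany("INSERT INTO names VALUES (?, ?, ?)", rows)

# Indexes let SQLite answer both lookup directions without a full scan.
conn.execute("CREATE INDEX idx_names_tax_id ON names (tax_id)")
conn.execute("CREATE INDEX idx_names_name ON names (name)")


def scientific_name(conn, tax_id):
    """Return the scientific name for a TaxID, or None if unknown."""
    row = conn.execute(
        "SELECT name FROM names WHERE tax_id = ? AND name_class = 'scientific name'",
        (tax_id,),
    ).fetchone()
    return row[0] if row else None


def taxid_of(conn, name):
    """Return the TaxID for a name, or None if unknown."""
    row = conn.execute(
        "SELECT tax_id FROM names WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None
```

With the sample rows loaded, `scientific_name(conn, 9606)` and `taxid_of(conn, "Homo sapiens")` resolve a TaxID to a name and back.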

Features

  • Fast: Indexed SQLite queries for names, parents, and children.
  • Automatic: Downloads and converts NCBI data on first run.
  • Easy Storage: Uses standard user cache directories by default.
  • BUSCO Integration: Maps TaxIDs to the best BUSCO dataset for lineage analysis.

Installation

pip install git+https://github.com/MrTomRod/get-tax-info

Configuration

By default, the database is stored in the user cache directory (e.g., ~/.cache/get-tax-info/).

  • Change location: Set the GET_TAX_INFO_DB environment variable or pass db_path to GetTaxInfo.
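
The lookup order can be pictured with the sketch below. It is an illustration under stated assumptions, not the package's implementation: `resolve_db_path` is a hypothetical helper, the precedence (explicit argument first, then the environment variable, then the cache directory) is assumed, and the file name `taxonomy.db` is invented for the example.

```python
import os
from pathlib import Path


def resolve_db_path(db_path=None, env=None):
    """Pick a database location (hypothetical helper, assumed precedence)."""
    env = os.environ if env is None else env
    # 1. An explicit argument wins.
    if db_path is not None:
        return Path(db_path)
    # 2. Otherwise fall back to the environment variable, if set.
    if env.get("GET_TAX_INFO_DB"):
        return Path(env["GET_TAX_INFO_DB"])
    # 3. Otherwise use the user cache directory.
    #    "taxonomy.db" is a made-up file name for this sketch.
    return Path.home() / ".cache" / "get-tax-info" / "taxonomy.db"
```

The resolved path is the kind of value that could then be passed as `db_path` to `GetTaxInfo`.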

Python Usage

from get_tax_info import GetTaxInfo, TaxID

# Automatically download/init data on first use
gti = GetTaxInfo()

# Use the TaxID object (recommended)
t = gti.get_taxid_object(2590146)  # Ektaphelenchus kanzakii
print(t.scientific_name, t.rank)   # 'Ektaphelenchus kanzakii', 'species'

# Parents and children
parent = t.parent                  # <TaxID 483517 (Ektaphelenchus)>
children = parent.children         # List of TaxID objects

# Ancestor at specific rank
genus = t.tax_at_rank('genus')
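
For readers curious how a rank lookup like the one above can work, here is a minimal sketch that walks parent links until it meets the requested rank. It does not use the package and makes no claim about its internals: `ancestor_at_rank` and `children_of` are hypothetical helpers, returning `None` for a missing rank is an assumption, and the node table is abridged (the genus is attached directly to the root, which is not its real parent).

```python
# Abridged toy table: tax_id -> (parent_tax_id, rank).
# As in NCBI's nodes.dmp, the root (TaxID 1) is its own parent.
NODES = {
    1: (1, "no rank"),
    483517: (1, "genus"),            # Ektaphelenchus; real intermediate ranks omitted
    2590146: (483517, "species"),    # Ektaphelenchus kanzakii
}


def ancestor_at_rank(tax_id, rank, nodes=NODES):
    """Walk towards the root and return the first TaxID with the given rank."""
    current = tax_id
    while True:
        parent, current_rank = nodes[current]
        if current_rank == rank:
            return current
        if parent == current:  # reached the root without a match
            return None
        current = parent


def children_of(tax_id, nodes=NODES):
    """Return the direct children of a TaxID, sorted for stable output."""
    return sorted(
        child for child, (parent, _rank) in nodes.items()
        if parent == tax_id and child != tax_id
    )
```

Calling `ancestor_at_rank(2590146, "genus")` on this table returns 483517, matching the parent shown in the usage example.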

CLI Usage

# Get BUSCO dataset for a TaxID
get-tax-info taxid-to-busco-dataset --taxid 110

# Add TaxID and BUSCO columns to a CSV/TSV table
get-tax-info add-taxid-column table.csv --sep ,

A complete demonstration of the BUSCO workflow (including Podman usage) can be found in demo_busco_workflow.sh.


Note: BUSCO dataset mapping requires pre-downloaded lineages. See get_busco.py for details.
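
One way to picture the mapping from a TaxID to a BUSCO dataset is sketched below: climb the lineage and take the first ancestor that has a dataset. This is an assumption about what "best" means (the most specific matching lineage), not a description of get_busco.py. The helper `best_busco_dataset` is hypothetical, and the lineage and dataset tables are a small hand-written excerpt for Escherichia coli rather than downloaded data.

```python
# Toy lineage for Escherichia coli (TaxID 562): tax_id -> parent tax_id.
PARENT = {
    562: 561,        # Escherichia coli -> Escherichia
    561: 543,        # Escherichia -> Enterobacteriaceae
    543: 91347,      # Enterobacteriaceae -> Enterobacterales
    91347: 1236,     # Enterobacterales -> Gammaproteobacteria
    1236: 1224,      # Gammaproteobacteria -> Proteobacteria
    1224: 2,         # Proteobacteria -> Bacteria
    2: 131567,       # Bacteria -> cellular organisms
    131567: 1,       # cellular organisms -> root
    1: 1,            # the root is its own parent
}

# Illustrative excerpt: TaxIDs that have a BUSCO lineage dataset.
BUSCO_DATASETS = {
    91347: "enterobacterales_odb10",
    1236: "gammaproteobacteria_odb10",
    1224: "proteobacteria_odb10",
    2: "bacteria_odb10",
}


def best_busco_dataset(tax_id, parent=PARENT, datasets=BUSCO_DATASETS):
    """Return the dataset of the closest ancestor (or self) that has one."""
    current = tax_id
    while True:
        if current in datasets:
            return datasets[current]
        nxt = parent[current]
        if nxt == current:  # reached the root: no dataset applies
            return None
        current = nxt
```

Starting from 562, the walk passes Escherichia and Enterobacteriaceae before stopping at Enterobacterales, the most specific lineage in the toy table.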
