
Cross-Temporal and Cross-Database Biological Identifier Mapping.


Read the documentation at https://idtrack.readthedocs.io/



Modern biology constantly mixes identifiers from different years, databases, and genome builds. The result is a familiar set of problems: IDs disappear, symbols change, references disagree, and “the same gene” isn’t always represented the same way across datasets.

IDTrack is built for that reality. It provides a time-aware, audit-friendly way to translate and harmonize biological identifiers across Ensembl releases and across external namespaces (HGNC, UniProt, RefSeq, Entrez, …), while keeping ambiguity explicit instead of silently forcing a single answer.

What makes IDTrack different

  • Time-aware mapping: treat Ensembl releases as a “time axis” and travel forward/backward through identifier history.

  • Assembly-aware mapping: harmonize identifiers across genome builds (e.g. GRCh37 ↔ GRCh38) and respect external databases that are assembly-scoped.

  • Snapshot boundary for reproducibility: build a release-bounded graph snapshot so results are stable and repeatable.

  • Explicit external database opt-in: choose which external namespaces participate via a small, editable YAML contract.

  • Transparency over coercion: conversions are naturally classified as 1→0 (no match), 1→1 (clean), or 1→n (ambiguous).

  • Scale-ready workflows: caching and snapshot reuse make repeated conversions and multi-dataset harmonization practical.
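To make the time-axis idea and the 1→0 / 1→1 / 1→n classification concrete, here is a minimal sketch in plain Python. It does not use IDTrack's API. The history table, the release numbers, the gene names, and the helper names `travel_forward` and `classify` are all invented for illustration.

```python
# Toy identifier history, invented for illustration only (not IDTrack data).
# Key: (release, identifier) -> identifiers that succeed it in release + 1.
# An empty list means the identifier was retired without a successor.
TOY_HISTORY = {
    (100, "GENE_A"): ["GENE_A"],
    (101, "GENE_A"): ["GENE_A1", "GENE_A2"],  # split into two successors
    (100, "GENE_B"): ["GENE_B"],
    (101, "GENE_B"): [],  # retired
    (100, "GENE_C"): ["GENE_C"],
    (101, "GENE_C"): ["GENE_C"],  # unchanged
}


def travel_forward(identifier, start_release, end_release, history=TOY_HISTORY):
    """Follow an identifier one release at a time and return every reachable target."""
    current = {identifier}
    for release in range(start_release, end_release):
        successors = set()
        for ident in current:
            # Identifiers with no recorded history simply drop out.
            successors.update(history.get((release, ident), []))
        current = successors
    return sorted(current)


def classify(targets):
    """Label a conversion by how many targets it produced."""
    if not targets:
        return "1-to-0"  # no match
    if len(targets) == 1:
        return "1-to-1"  # clean
    return "1-to-n"  # ambiguous: report all candidates instead of picking one


for gene in ("GENE_A", "GENE_B", "GENE_C"):
    targets = travel_forward(gene, 100, 102)
    print(gene, classify(targets), targets)
```

The point of the sketch is the shape of the result: every target is returned and labelled, so an ambiguous or failed conversion stays visible to the caller rather than being collapsed into a single answer.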

Who is it for?

  • Wet-lab researchers who need a reliable, step-by-step path from “my gene list is old” to “my analysis is reproducible”.

  • Bioinformaticians who want release-pinned, auditable conversions in notebooks, pipelines, and integration workflows.

  • Atlas builders / integrators who need to harmonize gene identifiers across many cohorts (different Ensembl releases, symbols, and external IDs), keep an explicit audit trail of what mapped/failed/was ambiguous, and ship a release-pinned, reproducible feature space for downstream integration and publication.

Common use cases

  • Dataset harmonization before integration (single-cell, bulk, atlas-scale collections).

  • Legacy data rescue (old Ensembl releases, mixed symbols/IDs, retired identifiers).

  • Publication-grade reproducibility (pin a snapshot boundary + share the exact external configuration).

  • Cross-database interoperability when collaborators use different identifier conventions.
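For the reproducibility use case, one possible way to "pin and share" a run is to store a small manifest next to the results. The sketch below uses only the Python standard library and does not call IDTrack. The function name `build_manifest`, the manifest fields, the release number, the version string, and the placeholder configuration text are assumptions made for illustration; the real YAML schema is whatever the IDTrack documentation defines.

```python
import hashlib
import json


def build_manifest(snapshot_release, external_config_text, package_version):
    """Return a JSON-serializable record of what a conversion run depended on."""
    # Hashing the configuration text lets collaborators verify they use the same file.
    digest = hashlib.sha256(external_config_text.encode("utf-8")).hexdigest()
    return {
        "package_version": package_version,
        "snapshot_release": snapshot_release,
        "external_config_sha256": digest,
    }


# Placeholder text standing in for the external-database YAML file (hypothetical).
config_text = "# contents of the external-database YAML file go here\n"

# Release number and version string are example values, not recommendations.
manifest = build_manifest(110, config_text, "0.0.5")
print(json.dumps(manifest, indent=2, sort_keys=True))
```

Sharing the manifest together with the configuration file itself gives readers enough to check that they are rebuilding against the same snapshot boundary and the same set of external namespaces.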

Documentation and tutorials

The documentation includes a full tutorial suite designed to be the primary learning resource:

  • Documentation: https://idtrack.readthedocs.io/

  • Tutorials: start from the “Tutorials” section in the docs (Part 0 → Part 7).


