Robust protein marker to gene symbol resolution backed by SQLite.

rpg_conv

rpg_conv resolves protein marker aliases (for example ki67, SMA, CD57) to canonical gene symbols using a local SQLite database populated from a bundled merged Ensembl reference table (human + mouse aliases together).

Install

pip install rpg_conv

For development:

pip install -e ".[dev]"

Quick Start

from rpg_conv import GeneResolver

resolver = GeneResolver()  # creates/loads a local SQLite DB in ~/.rpg_conv

print(resolver.resolve_one("ki67"))    # MKI67
print(resolver.resolve_one("a-sma", return_top=False)) # contains ACTA2 among merged aliases
print(resolver.resolve_one("CD57", return_top=False))  # contains B3GAT1 among merged aliases
print(resolver.resolve_one("PDCD1"))   # PDCD1

CLI

rpg-conv "ki--67"
rpg-conv "ki--67" --no-only-return-confident
rpg-conv "CD57" --return-ensembl-id
rpg-conv "hox1" --no-return-top --sep "|"
rpg-conv "hox1" --return-df --verbose

Data model

The SQLite database stores one merged alias table, populated from the bundled Ensembl reference rows on first initialization. Each entry holds:

  • Ensembl gene ID
  • canonical gene symbols
  • aliases/synonyms
  • normalized alias keys used for robust lookup
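
To make the data model concrete, here is a minimal sketch of what such a merged alias table could look like, using only the standard library. The table name, column names, index, and the normalize rule (lowercase, keep letters and digits only) are assumptions made for illustration, not the package's actual schema, and the Ensembl IDs are placeholders rather than real identifiers.

```python
import re
import sqlite3


def normalize(alias: str) -> str:
    """Hypothetical normalization: lowercase and drop non-alphanumerics."""
    return re.sub(r"[^a-z0-9]", "", alias.lower())


# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE gene_alias (
        ensembl_id TEXT NOT NULL,  -- Ensembl gene ID
        symbol     TEXT NOT NULL,  -- canonical gene symbol
        alias      TEXT NOT NULL,  -- alias/synonym as written
        alias_key  TEXT NOT NULL   -- normalized alias used for lookup
    )
    """
)
conn.execute("CREATE INDEX idx_alias_key ON gene_alias (alias_key)")

# Placeholder IDs; real rows would come from the bundled reference table.
rows = [
    ("ENSG_EXAMPLE_1", "MKI67", "Ki-67"),
    ("ENSG_EXAMPLE_1", "MKI67", "MKI67"),
    ("ENSG_EXAMPLE_2", "ACTA2", "a-SMA"),
]
conn.executemany(
    "INSERT INTO gene_alias VALUES (?, ?, ?, ?)",
    [(eid, sym, alias, normalize(alias)) for eid, sym, alias in rows],
)


def lookup(query: str) -> list[str]:
    """Return the canonical symbols whose normalized alias equals the query's."""
    cur = conn.execute(
        "SELECT DISTINCT symbol FROM gene_alias WHERE alias_key = ? ORDER BY symbol",
        (normalize(query),),
    )
    return [row[0] for row in cur.fetchall()]
```

Storing the normalized key next to the original alias keeps lookups to a single indexed equality check, so spelling variants such as "ki--67" and "Ki-67" land on the same row.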

Matching behavior

  • Direct normalized DB match is tried first.
  • If there is no direct match, a fuzzy search runs using rapidfuzz.distance.Levenshtein.
  • With only_return_confident=True, only exact matches or strict fuzzy matches (Levenshtein distance < confidence_distance_lt) are accepted.
  • With only_return_confident=False, the worst-case fallback is to return the original query string unchanged.
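
The matching order above can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's implementation: the package uses rapidfuzz.distance.Levenshtein, while this sketch substitutes a small pure-Python edit distance so it runs without dependencies. The ALIASES dictionary, the normalize rule, the default threshold of 2, and returning None when no confident match exists are all hypothetical. The sketch also goes straight to the original-query fallback and skips any looser fuzzy matching the resolver may try first.

```python
import re
from typing import Optional

# Hypothetical stand-in for the alias table: normalized key -> canonical symbol.
ALIASES = {"ki67": "MKI67", "mki67": "MKI67", "asma": "ACTA2", "pdcd1": "PDCD1"}


def normalize(text: str) -> str:
    """Assumed normalization: lowercase and drop non-alphanumerics."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (stand-in for rapidfuzz)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def resolve(query: str,
            only_return_confident: bool = True,
            confidence_distance_lt: int = 2) -> Optional[str]:
    key = normalize(query)

    # 1. Direct normalized match is tried first.
    if key in ALIASES:
        return ALIASES[key]

    # 2. Otherwise, find the closest alias key by edit distance.
    best_key = min(ALIASES, key=lambda k: levenshtein(key, k))
    if levenshtein(key, best_key) < confidence_distance_lt:
        return ALIASES[best_key]  # strict fuzzy match

    # 3. No confident match: None (assumed) or the original query as a fallback.
    return None if only_return_confident else query
```

With these assumptions, resolve("Ki-67") hits the direct match, resolve("ki68") is accepted as a strict fuzzy match at distance 1, and resolve("zzzzzz", only_return_confident=False) returns the query unchanged.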
