
GeneThesaurus is a Python package that translates gene aliases and old gene symbols to the current HGNC standard gene symbols.

GeneThesaurus v3.2.0

GeneThesaurus is a Python package that translates between different gene standards using publicly available data from HGNC and NIH.

Presently, GeneThesaurus supports translating:

  • gene aliases and old gene symbols to the current HGNC standard gene symbols
  • gene symbols to Ensembl identifiers
  • Ensembl identifiers to gene symbols
  • Entrez (NCBI) identifiers to gene symbols or Ensembl identifiers

Please get in touch (or consider submitting a pull request to this project) if you need translation between other formats.

Installation

You can install GeneThesaurus with:

pip install gene-thesaurus

Example usage

from gene_thesaurus import GeneThesaurus
gt = GeneThesaurus(data_dir='/tmp')

outdated_gene = 'TNFSF2'      # old symbol, now 'TNF'
up_to_date_gene = 'ETV6'      # already the current symbol
fake_gene = 'NOTAREALGENE'    # not a real gene symbol
# Named 'genes' rather than 'input' to avoid shadowing the Python built-in.
genes = [outdated_gene, up_to_date_gene, fake_gene]

#############################
### update_gene_symbols() ###
#############################

updated_genes = gt.update_gene_symbols(genes)
print(updated_genes)
# {'TNFSF2': 'TNF'}

#########################
### translate_genes() ###
#########################

# Valid values for source and target are 'symbol', 'ensembl_id' and 'entrez_id'.

translated_genes = gt.translate_genes(genes, source='symbol', target='ensembl_id')
print(translated_genes)
# {'TNFSF2': 'ENSG00000232810', 'ETV6': 'ENSG00000139083'}
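Judging by the example output above, both methods return a dictionary that omits genes with no result: update_gene_symbols() lists only the symbols that changed, and translate_genes() leaves out the unrecognised gene. The sketch below shows one possible way to work with such results in plain Python. The helper names apply_symbol_updates and find_untranslated are hypothetical and not part of GeneThesaurus, and the dictionaries are copied from the output above, so the snippet runs without the package installed.

```python
from typing import Dict, Iterable, List


def apply_symbol_updates(genes: Iterable[str], updates: Dict[str, str]) -> List[str]:
    """Return genes in input order, replacing any symbol that has an update."""
    # Hypothetical helper: symbols without an entry are passed through unchanged.
    return [updates.get(gene, gene) for gene in genes]


def find_untranslated(genes: Iterable[str], translated: Dict[str, str]) -> List[str]:
    """Return the genes that have no entry in a translation result."""
    # Hypothetical helper: assumes missing keys mean "no translation found".
    return [gene for gene in genes if gene not in translated]


genes = ['TNFSF2', 'ETV6', 'NOTAREALGENE']

# Results copied from the example output above (not fetched here).
updates = {'TNFSF2': 'TNF'}
translated = {'TNFSF2': 'ENSG00000232810', 'ETV6': 'ENSG00000139083'}

current_symbols = apply_symbol_updates(genes, updates)
print(current_symbols)
# ['TNF', 'ETV6', 'NOTAREALGENE']

missing = find_untranslated(genes, translated)
print(missing)
# ['NOTAREALGENE']
```

Note that apply_symbol_updates keeps unrecognised symbols such as 'NOTAREALGENE' as they are, so it may be worth checking the output of find_untranslated before relying on the merged list.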
