GeneThesaurus v1.0.2

GeneThesaurus is a Python package that translates gene aliases and old gene symbols to the current HGNC standard gene symbols.

Installation

You can install GeneThesaurus with:

pip install gene-thesaurus

Example Usage

GeneThesaurus takes a list of gene names and returns a list in which every alias or old symbol it recognizes is replaced by the latest HGNC standard gene symbol.

By default, a gene name that cannot be found is returned unchanged. If 'nullify_missing' is set to True, names that cannot be found are returned as None instead.

Default example

import gene_thesaurus

genes = gene_thesaurus.translate_genes(['TNFSF2', 'ERBB1', 'VPF', 'ZSCAN5CP', 'MISSING_GENE'], data_dir='/tmp')

print(genes)
# ['TNF', 'EGFR', 'VEGFA', 'ZSCAN5C', 'MISSING_GENE']
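
To see which names were actually rewritten, the input and output lists can be compared position by position. The following is a minimal sketch, not part of GeneThesaurus: the helper name renamed_genes is hypothetical, and it assumes the returned list has the same length and order as the input, as the example above suggests. The lists are copied from that example so the snippet runs without the package installed.

```python
def renamed_genes(original, translated):
    """Map each input name to its new symbol, keeping only names that changed.

    Hypothetical helper; assumes both lists are parallel (same length, same order).
    """
    if len(original) != len(translated):
        raise ValueError("input and output lists must have the same length")
    return {old: new for old, new in zip(original, translated) if old != new}


# Input and output copied from the default example above.
original = ['TNFSF2', 'ERBB1', 'VPF', 'ZSCAN5CP', 'MISSING_GENE']
translated = ['TNF', 'EGFR', 'VEGFA', 'ZSCAN5C', 'MISSING_GENE']

renamed = renamed_genes(original, translated)
print(renamed)
# {'TNFSF2': 'TNF', 'ERBB1': 'EGFR', 'VPF': 'VEGFA', 'ZSCAN5CP': 'ZSCAN5C'}
```

Note that in the default mode an unchanged name such as 'MISSING_GENE' only tells you the name was not rewritten; this comparison alone does not say whether it was missing from the lookup.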

Example with 'nullify_missing'

import gene_thesaurus

genes = gene_thesaurus.translate_genes(['TNFSF2', 'ERBB1', 'VPF', 'ZSCAN5CP', 'MISSING_GENE'], data_dir='/tmp', nullify_missing=True)

print(genes)
# ['TNF', 'EGFR', 'VEGFA', 'ZSCAN5C', None]
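
With 'nullify_missing', the None entries make it straightforward to separate translated names from unresolved ones. The sketch below is one possible way to do that, under the same assumption that the output list is parallel to the input; the helper name split_translations is hypothetical and not part of GeneThesaurus. The lists are copied from the example above so the snippet runs on its own.

```python
def split_translations(original, translated):
    """Return (mapping, unresolved) from parallel input and output lists.

    Hypothetical helper; a None in `translated` marks a name that was not found.
    """
    if len(original) != len(translated):
        raise ValueError("input and output lists must have the same length")
    mapping = {}
    unresolved = []
    for name, symbol in zip(original, translated):
        if symbol is None:
            unresolved.append(name)
        else:
            mapping[name] = symbol
    return mapping, unresolved


# Input and output copied from the 'nullify_missing' example above.
original = ['TNFSF2', 'ERBB1', 'VPF', 'ZSCAN5CP', 'MISSING_GENE']
translated = ['TNF', 'EGFR', 'VEGFA', 'ZSCAN5C', None]

mapping, unresolved = split_translations(original, translated)
print(mapping)
# {'TNFSF2': 'TNF', 'ERBB1': 'EGFR', 'VPF': 'VEGFA', 'ZSCAN5CP': 'ZSCAN5C'}
print(unresolved)
# ['MISSING_GENE']
```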
