GeneThesaurus
GeneThesaurus is a Python package that translates gene aliases and old gene symbols to the current HGNC standard gene symbols.
Installation
You can install GeneThesaurus with:
pip install gene-thesaurus
Example Usage
GeneThesaurus takes a list of gene names and returns a list in which every recognized alias or outdated symbol is replaced by the current HGNC standard gene symbol.
By default, a gene name that cannot be found is returned unchanged. If 'nullify_missing' is set to True, such names are returned as None instead.
Default example
import gene_thesaurus
genes = gene_thesaurus.translate_genes(['TNFSF2', 'ERBB1', 'VPF', 'ZSCAN5CP', 'MISSING_GENE'], data_dir='/tmp')
print(genes)
# ['TNF', 'EGFR', 'VEGFA', 'ZSCAN5C', 'MISSING_GENE']
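To see which names were actually rewritten, the input and output lists can be compared position by position. The sketch below is only an illustration: it assumes, as the example output suggests, that the returned list is in the same order as the input, and it hard-codes the lists from the example above instead of calling the package, so it runs without any downloaded data. The variable names are made up for this illustration.

```python
# Input and output copied from the default example above.
original = ['TNFSF2', 'ERBB1', 'VPF', 'ZSCAN5CP', 'MISSING_GENE']
translated = ['TNF', 'EGFR', 'VEGFA', 'ZSCAN5C', 'MISSING_GENE']

# Assumption: translated[i] is the result for original[i].
pairs = list(zip(original, translated))

# Names that were replaced by a different symbol.
renamed = {old: new for old, new in pairs if old != new}

# Names that came back unchanged.
unchanged = [old for old, new in pairs if old == new]

print(renamed)
print(unchanged)
```

In default mode, a name that comes back unchanged was presumably either already a current symbol or not found at all; the returned list alone does not distinguish the two cases, which is where 'nullify_missing' helps.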
Example with 'nullify_missing'
import gene_thesaurus
genes = gene_thesaurus.translate_genes(['TNFSF2', 'ERBB1', 'VPF', 'ZSCAN5CP', 'MISSING_GENE'], data_dir='/tmp', nullify_missing=True)
print(genes)
# ['TNF', 'EGFR', 'VEGFA', 'ZSCAN5C', None]
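With 'nullify_missing', the None entries make it easy to separate resolved symbols from names that could not be found. The following is a minimal sketch under the same assumption that the output order matches the input order. It reuses the lists from the example above instead of calling the package, and the variable names are hypothetical.

```python
# Input and output copied from the 'nullify_missing' example above.
original = ['TNFSF2', 'ERBB1', 'VPF', 'ZSCAN5CP', 'MISSING_GENE']
translated = ['TNF', 'EGFR', 'VEGFA', 'ZSCAN5C', None]

# Input names with no translation (assumes positional alignment).
missing = [old for old, new in zip(original, translated) if new is None]

# Symbols that were resolved, with the None placeholders dropped.
resolved = [new for new in translated if new is not None]

print(missing)
print(resolved)
```

Checking with `is None` instead of truthiness keeps the filter explicit about what counts as a missing gene.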