GeneThesaurus v1.0.2
GeneThesaurus is a Python package that translates gene aliases and old gene symbols to the current HGNC standard gene symbols.
Installation
You can install GeneThesaurus with:
pip install gene-thesaurus
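To confirm the install from Python, a minimal sketch using only the standard library is shown below. The helper name installed_version is hypothetical and not part of GeneThesaurus. The distribution name is assumed to match the one in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="gene-thesaurus"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Hypothetical helper, shown only to verify that the pip install succeeded.
found = installed_version()
print(found if found is not None else "gene-thesaurus is not installed")
```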
Example Usage
GeneThesaurus takes a list of gene names and returns a list in which every name it recognizes as an alias or old symbol is replaced by the current HGNC standard gene symbol.
By default, a gene name that cannot be found is returned unchanged. If 'nullify_missing' is set to True, each missing gene is returned as None instead.
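The fallback behavior described above can be illustrated with a toy sketch. This is not the package's implementation: toy_translate and TOY_ALIASES are hypothetical names, and the table holds only the four aliases used in this README, whereas the real package resolves names against HGNC data.

```python
# Toy alias table built from the examples in this README (an assumption for
# illustration only; it knows nothing beyond these four aliases).
TOY_ALIASES = {
    'TNFSF2': 'TNF',
    'ERBB1': 'EGFR',
    'VPF': 'VEGFA',
    'ZSCAN5CP': 'ZSCAN5C',
}


def toy_translate(names, nullify_missing=False):
    """Mimic the documented fallback: keep unknown names, or return None for them."""
    result = []
    for name in names:
        if name in TOY_ALIASES:
            result.append(TOY_ALIASES[name])
        else:
            # Unknown name: pass it through, or nullify it when requested.
            result.append(None if nullify_missing else name)
    return result


print(toy_translate(['VPF', 'MISSING_GENE']))
# ['VEGFA', 'MISSING_GENE']
print(toy_translate(['VPF', 'MISSING_GENE'], nullify_missing=True))
# ['VEGFA', None]
```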
Default example
import gene_thesaurus
genes = gene_thesaurus.translate_genes(['TNFSF2', 'ERBB1', 'VPF', 'ZSCAN5CP', 'MISSING_GENE'], data_dir='/tmp')
print(genes)
# ['TNF', 'EGFR', 'VEGFA', 'ZSCAN5C', 'MISSING_GENE']
Example with 'nullify_missing'
import gene_thesaurus
genes = gene_thesaurus.translate_genes(['TNFSF2', 'ERBB1', 'VPF', 'ZSCAN5CP', 'MISSING_GENE'], data_dir='/tmp', nullify_missing=True)
print(genes)
# ['TNF', 'EGFR', 'VEGFA', 'ZSCAN5C', None]
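When 'nullify_missing' is used, it is often handy to see which inputs were not resolved. The sketch below assumes the returned list has the same length and order as the input, which the examples suggest but this README does not state. The helper split_translations is hypothetical, and the translated list is copied from the example output above rather than produced by a live call.

```python
def split_translations(original, translated):
    """Pair each input name with its translation and list the unresolved names."""
    if len(original) != len(translated):
        raise ValueError("input and output lists differ in length")
    mapping = dict(zip(original, translated))
    unresolved = [name for name, symbol in mapping.items() if symbol is None]
    return mapping, unresolved


original = ['TNFSF2', 'ERBB1', 'VPF', 'ZSCAN5CP', 'MISSING_GENE']
# Copied from the example output above; in practice this would come from
# gene_thesaurus.translate_genes(original, data_dir='/tmp', nullify_missing=True)
translated = ['TNF', 'EGFR', 'VEGFA', 'ZSCAN5C', None]

mapping, unresolved = split_translations(original, translated)
print(mapping['ERBB1'])
# EGFR
print(unresolved)
# ['MISSING_GENE']
```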
Download files
Source Distribution
gene-thesaurus-1.0.2.tar.gz (4.3 kB)
Built Distribution
gene_thesaurus-1.0.2-py3-none-any.whl
Hashes for gene_thesaurus-1.0.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | e59a9b25183bedbb1d68f72649f2cbb587e745e9e4d51d87d5bded3270bed5ca
MD5 | 84fcf67cf2a233b94efd50e23a983084
BLAKE2b-256 | c846656ff37efae639f30a75603cc930d6a95e40f5a00dcdc56ec3f2b2b610ff