
Map gene ids using UniProt.



Tool for converting between various gene ids.


$ pip install gene_map


$ gene_map --help
Usage: gene_map [OPTIONS]

  Map gene ids between various formats.

  -i, --input TEXT                If it exists, treated as file with
                                  whitespace-separated gene ids. Otherwise
                                  treated as a gene id itself.  [required]
  --from TEXT                     Source ID type.  [required]
  --to TEXT                       Target ID type.  [required]
  -o, --output FILENAME           CSV-file to save result to.
  --organism [ARATH_3702|CAEEL_6239|CHICK_9031|DANRE_7955|DICDI_44689|DROME_7227|ECOLI_83333|HUMAN_9606|MOUSE_10090|RAT_10116|SCHPO_284812|YEAST_559292]
                                  Organism to convert IDs in.
  --cache-dir DIRECTORY           Folder to store ID-databases in.
  -q, --quiet                     Suppress logging of mapping-statistics.
  --force-download                Force download of mapping-database.
  --help                          Show this message and exit.

Getting started

Command-line usage

Inputs can be either gene ids or files containing whitespace-separated gene ids:

$ cat mygenes.txt
P63244 P08246
$ gene_map \
    -i P35222 -i InvalidID -i mygenes.txt -i P04637 \
    --from ACC --to Gene_Name \
    -o gene_mapping.csv
Mapped 4/5 genes.
$ cat gene_mapping.csv
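
The rule for -i (an existing file is read as whitespace-separated ids, anything else is taken as an id itself) can be sketched in a few lines of Python. This is only an illustration of the behaviour described in the help text, not gene_map's actual code; the function name expand_inputs is hypothetical, and the use of os.path.isfile for the "if it exists" check is an assumption.

```python
import os
import tempfile


def expand_inputs(inputs):
    """Expand inputs into gene ids: existing files are read, anything else is an id."""
    ids = []
    for item in inputs:
        if os.path.isfile(item):
            with open(item) as handle:
                # split() without arguments splits on any run of whitespace
                ids.extend(handle.read().split())
        else:
            ids.append(item)
    return ids


# Recreate the example above: one file with two ids, mixed with ids given directly.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mygenes.txt")
    with open(path, "w") as handle:
        handle.write("P63244 P08246\n")
    resolved = expand_inputs(["P35222", "InvalidID", path, "P04637"])

print(resolved)
```

With these inputs the list holds five ids in total, one of which (InvalidID) is not a real accession.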

If the ID types of the inputs are not known, use --from auto to try to convert all of them regardless of type:

$ gene_map \
    -i P35222 \
    -i TP53 \
    -i '9606.ENSP00000306407' \
    --from auto \
    --to GeneID
Mapped 3/3 genes.

Caution: with --from auto, an ID that is valid for more than one ID type may be matched against an unintended type and produce an unexpected mapping. All IDs are treated as strings.
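
To see why an ID that is valid for several types is a problem, consider the toy model below. It is not how gene_map resolves --from auto; the tables, type names and ids are all made up for illustration. It only shows that a string can be a valid key in more than one ID type, and that comparing ids as strings makes 7 and "7" the same id.

```python
# Toy tables with made-up ids; NOT real UniProt data and NOT gene_map internals.
TOY_TABLES = {
    "TypeA": {"X1": "gene_alpha", "X2": "gene_beta"},
    "TypeB": {"X2": "gene_gamma", "7": "gene_delta"},
}


def candidate_types(identifier):
    """Return every toy id type in which `identifier` is a valid key."""
    key = str(identifier)  # ids are compared as strings
    return sorted(name for name, table in TOY_TABLES.items() if key in table)


print(candidate_types("X1"))  # ['TypeA']
print(candidate_types("X2"))  # ['TypeA', 'TypeB']
print(candidate_types(7))     # ['TypeB']
```

When more than one type matches, as for "X2" here, passing the source type explicitly with --from removes the ambiguity.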

API usage

>>> from gene_map import GeneMapper

>>> stringdb_ids = ['9606.ENSP00000306407', '9606.ENSP00000337461']
>>> gm = GeneMapper()  # defaults to HUMAN_9606
>>> gm.query(stringdb_ids, source_id_type='STRING', target_id_type='GeneID')
#                ID_from  ID_to
#0  9606.ENSP00000306407  79007
#1  9606.ENSP00000337461  90529
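
For lookups it can be handy to turn the ID_from/ID_to pairs into a dictionary. The sketch below uses only the standard library and takes plain (source, target) pairs; the helper name group_mapping is hypothetical. Assuming the query result is a pandas DataFrame with exactly these two columns (the output above suggests this, but it is an assumption), df.itertuples(index=False) would yield such pairs. Targets are collected in lists because one source ID may map to several target IDs.

```python
from collections import defaultdict


def group_mapping(pairs):
    """Group (source_id, target_id) pairs into {source_id: [target_id, ...]}."""
    result = defaultdict(list)
    for source_id, target_id in pairs:
        # ids are kept as strings, matching how gene_map treats them
        result[str(source_id)].append(str(target_id))
    return dict(result)


# The two rows from the query output above.
rows = [
    ("9606.ENSP00000306407", "79007"),
    ("9606.ENSP00000337461", "90529"),
]
mapping = group_mapping(rows)
print(mapping["9606.ENSP00000306407"])  # ['79007']
```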


Files for gene-map, version 0.4.4
gene_map-0.4.4.tar.gz (8.2 kB), file type: Source
