A Python wrapper for the UniProt Mapping RESTful API.

License: MIT

UniProtMapper

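An (unofficial) Python wrapper for the UniProt Retrieve/ID Mapping RESTful API. This package supports mapping identifiers between databases (UniProtIDMapper), retrieving UniProt return fields (UniProtRetriever), and parsing UniProt-SwissProt JSON responses (SwissProtParser).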

Installation

From PyPI:

pip install uniprot-id-mapper

From source:

git clone https://github.com/David-Araripe/UniProtMapper
cd UniProtMapper
pip install .

Usage

UniProtIDMapper

Supported databases and their respective types are stored under the attribute self.supported_dbs_with_types. These are also available as a list under self._supported_fields.

from UniProtMapper import UniProtIDMapper

mapper = UniProtIDMapper()
print(mapper.supported_dbs_with_types)

To map a list of UniProt IDs to Ensembl IDs, the user can either call the object directly or use the mapIDs method.

result, failed = mapper.mapIDs(
    ids=["P30542", "Q16678", "Q02880"], from_db="UniProtKB_AC-ID", to_db="Ensembl"
)
>>> Retrying in 3s
>>> Fetched: 3 / 3

result, failed = mapper(
    ids=["P30542", "Q16678", "Q02880"], from_db="UniProtKB_AC-ID", to_db="Ensembl"
)
>>> Retrying in 3s
>>> Fetched: 3 / 3

Where result is the following pandas DataFrame:

   UniProtKB_AC-ID             Ensembl
0           P30542  ENSG00000163485.17
1           Q16678  ENSG00000138061.12
2           Q02880  ENSG00000077097.17
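The Ensembl identifiers above carry a version suffix after the dot (for example ".17"). When the mapped IDs need to be joined against a resource that uses unversioned gene IDs, the suffix can be dropped. The following is a minimal sketch on plain strings; the helper name strip_ensembl_version is hypothetical and not part of UniProtMapper.

```python
def strip_ensembl_version(ensembl_id: str) -> str:
    """Return the Ensembl stable ID without its trailing '.<version>' suffix."""
    # partition() leaves the ID unchanged when it contains no dot.
    stable_id, _, _version = ensembl_id.partition(".")
    return stable_id


# Values copied from the result table shown above.
mapped = {
    "P30542": "ENSG00000163485.17",
    "Q16678": "ENSG00000138061.12",
    "Q02880": "ENSG00000077097.17",
}

unversioned = {acc: strip_ensembl_version(eid) for acc, eid in mapped.items()}
print(unversioned["P30542"])  # ENSG00000163485
```

On the DataFrame itself, the same idea can be expressed with the pandas string accessor, e.g. result["Ensembl"].str.split(".").str[0].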

UniProtRetriever

This class supports retrieving any of the UniProt return fields. The user can access these directly from the object, under the attribute self.fields_table, e.g.:

import pandas as pd
from UniProtMapper import UniProtRetriever

field_retriever = UniProtRetriever()
df = field_retriever.fields_table
df.head()
   Label                 Legacy Returned Field  Returned Field  Field Type
0  Entry                 id                     accession       Names & Taxonomy
1  Entry Name            entry name             id              Names & Taxonomy
2  Gene Names            genes                  gene_names      Names & Taxonomy
3  Gene Names (primary)  genes(PREFERRED)       gene_primary    Names & Taxonomy
4  Gene Names (synonym)  genes(ALTERNATIVE)     gene_synonym    Names & Taxonomy

Similar to UniProtIDMapper, the user can either call the object directly or use the retrieveFields method to obtain the response.

result, failed = field_retriever.retrieveFields(["Q02880"])
>>> Fetched: 1 / 1

result, failed = field_retriever(["Q02880"])
>>> Fetched: 1 / 1
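It can be convenient to screen identifiers locally before sending them to the API. The sketch below assumes the accession number pattern published in the UniProt help pages; the helper split_valid_accessions is hypothetical and not part of UniProtMapper. Note that it only recognises accessions, so entry names (such as those accepted by "UniProtKB_AC-ID") would be reported as invalid.

```python
import re

# UniProtKB accession pattern as documented by UniProt (assumed here).
UNIPROT_AC = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def split_valid_accessions(ids):
    """Split ids into (valid, invalid) lists based on the accession pattern."""
    valid, invalid = [], []
    for identifier in ids:
        # fullmatch() requires the whole string to match the pattern.
        if UNIPROT_AC.fullmatch(identifier):
            valid.append(identifier)
        else:
            invalid.append(identifier)
    return valid, invalid


valid, invalid = split_valid_accessions(
    ["P30542", "Q16678", "Q02880", "not-an-id"]
)
print(valid)    # ['P30542', 'Q16678', 'Q02880']
print(invalid)  # ['not-an-id']
```

The valid list can then be passed on to mapIDs or retrieveFields as in the examples above.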

Custom returned fields can be retrieved by passing a list of fields to the fields parameter. These fields need to be within UniProtRetriever.fields_table["Returned Field"] and will be returned with columns named as their respective Label.

The object already has a list of default fields under self.default_fields, but these are ignored if the parameter fields is passed.

fields = ["accession", "organism_name", "structure_3d"]
result, failed = field_retriever.retrieveFields(["Q02880"],
                                                fields=fields)
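Since the requested fields must appear in the "Returned Field" column and come back named by their Label, it may help to check a field list and preview the resulting column names before querying. This is a minimal sketch that uses only the five rows from the fields_table excerpt shown earlier; the helper labels_for is hypothetical and not part of UniProtMapper.

```python
# (Label, Returned Field) pairs copied from the fields_table excerpt.
FIELDS_EXCERPT = [
    ("Entry", "accession"),
    ("Entry Name", "id"),
    ("Gene Names", "gene_names"),
    ("Gene Names (primary)", "gene_primary"),
    ("Gene Names (synonym)", "gene_synonym"),
]


def labels_for(fields, table=FIELDS_EXCERPT):
    """Return the column labels for the requested return fields.

    Raises ValueError if a field is not present in the table.
    """
    label_by_field = {field: label for label, field in table}
    unknown = [f for f in fields if f not in label_by_field]
    if unknown:
        raise ValueError(f"Unsupported return fields: {unknown}")
    return [label_by_field[f] for f in fields]


print(labels_for(["accession", "gene_primary"]))
# ['Entry', 'Gene Names (primary)']
```

With the full table, the lookup could be built from the DataFrame instead, e.g. dict(zip(df["Returned Field"], df["Label"])). The excerpt here is incomplete, so fields such as "organism_name" are rejected by this sketch even though the package supports them.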

SwissProtParser

Querying data from UniProt-SwissProt

Retrieving JSON responses for UniProt-SwissProt (reviewed) entries is also possible, for example:

result, failed = mapper(
    ids=["P30542", "Q16678", "Q02880"], from_db="UniProtKB_AC-ID", to_db="UniProtKB-Swiss-Prot"
)

print(result[0])
>>> {'from': 'P30542',
>>>  'to': {'entryType': 'UniProtKB reviewed (Swiss-Prot)',
>>>   'primaryAccession': 'P30542',
>>> ...
>>>     'Beta strand': 2,
>>>     'Turn': 1},
>>>    'uniParcId': 'UPI00000503E1'}}}
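Each element of result is a nested dictionary with the queried ID under 'from' and the entry under 'to'. The following sketch shows one way to pull out a few top-level values. It works on a trimmed stand-in that contains only keys visible in the printed output above; the helper summarize is hypothetical.

```python
def summarize(record):
    """Return (queried ID, primary accession, reviewed flag) for one record."""
    entry = record["to"]
    # Reviewed entries are labelled 'UniProtKB reviewed (Swiss-Prot)'.
    reviewed = entry.get("entryType", "").startswith("UniProtKB reviewed")
    return record["from"], entry.get("primaryAccession"), reviewed


# Trimmed stand-in for result[0]; the real response has many more keys.
record = {
    "from": "P30542",
    "to": {
        "entryType": "UniProtKB reviewed (Swiss-Prot)",
        "primaryAccession": "P30542",
    },
}

print(summarize(record))
# ('P30542', 'P30542', True)
```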

SwissProt responses from UniProtIDMapper can be parsed using the SwissProtParser class. The fields that can be extracted from UniProt (passed through the toquery parameter) are stored under self._supported_fields, and the supported cross-referenced datasets (passed through the crossrefs parameter) are stored under self._crossref_dbs.

parser = SwissProtParser(
    toquery=["organism", "tissueExpression", "cellLocation"], crossrefs=["GO"]
)
parser(result[0]['to'])

>>> {'organism': 'Homo sapiens',
>>>  'tissueExpression': '',
>>>  'cellLocation': 'Cell membrane',
>>>  'GO_crossref': ['GO:0030673~GoTerm~C:axolemma',
>>>   'GO:0030673~GoEvidenceType~IEA:Ensembl',
>>> ...
>>>   'GO:0007165~GoEvidenceType~TAS:ProtInc',
>>>   'GO:0001659~GoTerm~P:temperature homeostasis',
>>>   'GO:0001659~GoEvidenceType~IEA:Ensembl',
>>>   'GO:0070328~GoTerm~P:triglyceride homeostasis',
>>>   'GO:0070328~GoEvidenceType~IEA:Ensembl']}
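The cross-reference strings in the output above appear to follow an "id~property~value" layout. Assuming that layout holds for every entry, they can be regrouped into one dictionary per GO term. This is a sketch only; the helper group_go_crossrefs is hypothetical and not part of UniProtMapper.

```python
from collections import defaultdict


def group_go_crossrefs(entries):
    """Group 'id~property~value' strings into {id: {property: value}}."""
    grouped = defaultdict(dict)
    for entry in entries:
        # Split at most twice so a '~' inside the value would be preserved.
        go_id, key, value = entry.split("~", 2)
        grouped[go_id][key] = value
    return dict(grouped)


# A few entries copied from the parser output shown above.
go_crossref = [
    "GO:0030673~GoTerm~C:axolemma",
    "GO:0030673~GoEvidenceType~IEA:Ensembl",
    "GO:0001659~GoTerm~P:temperature homeostasis",
    "GO:0001659~GoEvidenceType~IEA:Ensembl",
]

grouped = group_go_crossrefs(go_crossref)
print(grouped["GO:0030673"])
# {'GoTerm': 'C:axolemma', 'GoEvidenceType': 'IEA:Ensembl'}
```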

Both UniProtIDMapper.mapIDs and UniProtIDMapper.__call__ accept a SwissProtParser through the parser parameter, as in:

result, failed = mapper(
    ids=["P30542", "Q16678", "Q02880"],
    from_db="UniProtKB_AC-ID",
    to_db="UniProtKB-Swiss-Prot",
    parser=parser,
)
