
A Python wrapper for the UniProt Mapping RESTful API.

UniProtMapper

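An (unofficial) Python wrapper for the UniProt Retrieve/ID Mapping RESTful API. This package supports mapping identifiers between UniProt and other databases (UniProtIDMapper), retrieving UniProt return fields (UniProtRetriever), and parsing UniProt-SwissProt responses (SwissProtParser).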

Installation

From PyPI:

pip install uniprot-id-mapper

From source:

git clone https://github.com/David-Araripe/UniProtMapper
cd UniProtMapper
pip install .

Usage

UniProtIDMapper

Supported databases and their respective types are stored under the attribute self.supported_dbs_with_types. They are also available as a list under self._supported_fields.

from UniProtMapper import UniProtIDMapper

mapper = UniProtIDMapper()
print(mapper.supported_dbs_with_types)

To map a list of UniProt IDs to Ensembl IDs, the user can either use the mapIDs method or call the object directly.

result, failed = mapper.mapIDs(
    ids=["P30542", "Q16678", "Q02880"], from_db="UniProtKB_AC-ID", to_db="Ensembl"
)
>>> Retrying in 3s
>>> Fetched: 3 / 3

result, failed = mapper(
    ids=["P30542", "Q16678", "Q02880"], from_db="UniProtKB_AC-ID", to_db="Ensembl"
)
>>> Retrying in 3s
>>> Fetched: 3 / 3

Where result is the following pandas DataFrame:

  UniProtKB_AC-ID             Ensembl
0          P30542  ENSG00000163485.17
1          Q16678  ENSG00000138061.12
2          Q02880  ENSG00000077097.17
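
The Ensembl identifiers above carry a version suffix (for example .17). If a downstream tool expects unversioned gene IDs, the suffix can be stripped with pandas. The following is a minimal sketch and not part of UniProtMapper: it rebuilds the DataFrame shown above by hand so that it runs without a network call, and the column name Ensembl_unversioned is an arbitrary choice.

```python
import pandas as pd

# Hand-built copy of the result shown above (no network call needed).
result = pd.DataFrame(
    {
        "UniProtKB_AC-ID": ["P30542", "Q16678", "Q02880"],
        "Ensembl": [
            "ENSG00000163485.17",
            "ENSG00000138061.12",
            "ENSG00000077097.17",
        ],
    }
)

# Split once on the last "." and keep the part before it.
# "Ensembl_unversioned" is an arbitrary, illustrative column name.
result["Ensembl_unversioned"] = result["Ensembl"].str.rsplit(".", n=1).str[0]
print(result)
```

Identifiers without a version suffix pass through unchanged, since splitting them yields a single element.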

UniProtRetriever

This class supports retrieving any of the UniProt return fields. The user can access these directly from the object, under the attribute self.fields_table, e.g.:

import pandas as pd
from UniProtMapper import UniProtRetriever

field_retriever = UniProtRetriever()
df = field_retriever.fields_table
df.head()
   Label                 Legacy Returned Field  Returned Field  Field Type
0  Entry                 id                     accession       Names & Taxonomy
1  Entry Name            entry name             id              Names & Taxonomy
2  Gene Names            genes                  gene_names      Names & Taxonomy
3  Gene Names (primary)  genes(PREFERRED)       gene_primary    Names & Taxonomy
4  Gene Names (synonym)  genes(ALTERNATIVE)     gene_synonym    Names & Taxonomy

Similar to UniProtIDMapper, the user can either call the object directly or use the retrieveFields method to obtain the response.

result, failed = field_retriever.retrieveFields(["Q02880"])
>>> Fetched: 1 / 1

result, failed = field_retriever(["Q02880"])
>>> Fetched: 1 / 1

Custom returned fields can be retrieved by passing a list of fields to the fields parameter. These fields need to be within UniProtRetriever.fields_table["Returned Field"] and will be returned with columns named as their respective Label.

The object already has a list of default fields under self.default_fields, but these are ignored if the parameter fields is passed.

fields = ["accession", "organism_name", "structure_3d"]
result, failed = field_retriever.retrieveFields(["Q02880"],
                                                fields=fields)
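
Because every requested field must appear in the Returned Field column, it can help to check a list before sending the request. The helper below is a hypothetical sketch (split_known_fields is not part of the package). To stay runnable offline it uses only the five rows of fields_table printed earlier; real code would pass field_retriever.fields_table instead.

```python
import pandas as pd

# First rows of fields_table as printed earlier in this README.
# Real code would use field_retriever.fields_table instead.
fields_table = pd.DataFrame(
    {
        "Label": [
            "Entry",
            "Entry Name",
            "Gene Names",
            "Gene Names (primary)",
            "Gene Names (synonym)",
        ],
        "Returned Field": [
            "accession",
            "id",
            "gene_names",
            "gene_primary",
            "gene_synonym",
        ],
    }
)


def split_known_fields(fields, table):
    """Hypothetical helper: separate supported fields from unsupported ones."""
    supported = set(table["Returned Field"])
    known = [f for f in fields if f in supported]
    unknown = [f for f in fields if f not in supported]
    return known, unknown


# "genes" is a legacy field name, so it is reported as unknown.
known, unknown = split_known_fields(["accession", "gene_primary", "genes"], fields_table)
print(known)
print(unknown)
```

A check like this catches legacy names such as genes, which appear in the Legacy Returned Field column but not in Returned Field.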

SwissProtParser

Querying data from UniProt-SwissProt

JSON responses for UniProt-SwissProt (reviewed) entries can also be retrieved, for example:

result, failed = mapper(
    ids=["P30542", "Q16678", "Q02880"], from_db="UniProtKB_AC-ID", to_db="UniProtKB-Swiss-Prot"
)

print(result[0])
>>> {'from': 'P30542',
>>>  'to': {'entryType': 'UniProtKB reviewed (Swiss-Prot)',
>>>   'primaryAccession': 'P30542',
>>> ...
>>>     'Beta strand': 2,
>>>     'Turn': 1},
>>>    'uniParcId': 'UPI00000503E1'}}}

SwissProt responses from UniProtIDMapper can be parsed using the SwissProtParser class. The fields that can be extracted from UniProt (parameter toquery) are listed under self._supported_fields, and the supported cross-referenced datasets (parameter crossrefs) are listed under self._crossref_dbs.

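# Assumption: SwissProtParser is exported from the package root like the other classes.
from UniProtMapper import SwissProtParser

parser = SwissProtParser(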
    toquery=["organism", "tissueExpression", "cellLocation"], crossrefs=["GO"]
)
parser(result[0]['to'])

>>> {'organism': 'Homo sapiens',
>>>  'tissueExpression': '',
>>>  'cellLocation': 'Cell membrane',
>>>  'GO_crossref': ['GO:0030673~GoTerm~C:axolemma',
>>>   'GO:0030673~GoEvidenceType~IEA:Ensembl',
>>> ...
>>>   'GO:0007165~GoEvidenceType~TAS:ProtInc',
>>>   'GO:0001659~GoTerm~P:temperature homeostasis',
>>>   'GO:0001659~GoEvidenceType~IEA:Ensembl',
>>>   'GO:0070328~GoTerm~P:triglyceride homeostasis',
>>>   'GO:0070328~GoEvidenceType~IEA:Ensembl']}
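
Each GO_crossref entry in the output above appears to follow an id~key~value layout. That layout is inferred from the example, not from documented behavior. Under that assumption, the flat list can be grouped per GO identifier. The helper group_crossrefs below is hypothetical and not part of the package; the input strings are copied from the output above.

```python
from collections import defaultdict

# A few GO_crossref entries copied from the output above.
go_crossref = [
    "GO:0030673~GoTerm~C:axolemma",
    "GO:0030673~GoEvidenceType~IEA:Ensembl",
    "GO:0001659~GoTerm~P:temperature homeostasis",
    "GO:0001659~GoEvidenceType~IEA:Ensembl",
]


def group_crossrefs(entries):
    """Hypothetical helper: group 'id~key~value' strings into {id: {key: value}}."""
    grouped = defaultdict(dict)
    for entry in entries:
        # Assumes the three-part layout seen in the example output.
        db_id, key, value = entry.split("~", 2)
        grouped[db_id][key] = value
    return dict(grouped)


go_terms = group_crossrefs(go_crossref)
print(go_terms["GO:0030673"])
```

An entry that does not contain two ~ separators would raise a ValueError here, which makes unexpected formats visible instead of silently skipping them.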

Both the mapIDs and __call__ methods of UniProtIDMapper accept a SwissProtParser through the parser parameter, for example:

result, failed = mapper(
    ids=["P30542", "Q16678", "Q02880"],
    from_db="UniProtKB_AC-ID",
    to_db="UniProtKB-Swiss-Prot",
    parser=parser,
)
