A Python wrapper for the UniProt Mapping RESTful API.
UniProtMapper
An (unofficial) Python wrapper for the UniProt Retrieve/ID Mapping RESTful API. This package supports the following functionalities:
- Map UniProt IDs to other identifiers (handled by UniProtIDMapper);
- Retrieve any of the supported return fields (handled by UniProtRetriever);
- Parse JSON UniProt-SwissProt responses (handled by SwissProtParser).
Installation
From PyPI:
pip install uniprot-id-mapper
From source:
git clone https://github.com/David-Araripe/UniProtMapper
cd UniProtMapper
pip install .
Usage
UniProtIDMapper
Supported databases and their respective types are stored under the attribute self.supported_dbs_with_types. They are also available as a list under self._supported_fields.
from UniProtMapper import UniProtIDMapper
mapper = UniProtIDMapper()
print(mapper.supported_dbs_with_types)
To map a list of UniProt IDs to Ensembl IDs, the user can either use the mapIDs method or call the object directly. Both return the mapped results together with the IDs that failed to map.
result, failed = mapper.mapIDs(
ids=["P30542", "Q16678", "Q02880"], from_db="UniProtKB_AC-ID", to_db="Ensembl"
)
>>> Retrying in 3s
>>> Fetched: 3 / 3
result, failed = mapper(
ids=["P30542", "Q16678", "Q02880"], from_db="UniProtKB_AC-ID", to_db="Ensembl"
)
>>> Retrying in 3s
>>> Fetched: 3 / 3
Where result is the following pandas DataFrame:
| | UniProtKB_AC-ID | Ensembl |
|---|---|---|
| 0 | P30542 | ENSG00000163485.17 |
| 1 | Q16678 | ENSG00000138061.12 |
| 2 | Q02880 | ENSG00000077097.17 |
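The Ensembl identifiers in this table carry a version suffix (the number after the dot). If a downstream tool expects unversioned IDs, the suffix can be stripped. The following is a minimal sketch on plain strings copied from the table; strip_ensembl_version is a hypothetical helper and is not part of UniProtMapper.

```python
def strip_ensembl_version(ensembl_id: str) -> str:
    """Return the identifier without its trailing '.<version>' suffix, if any."""
    base, sep, version = ensembl_id.rpartition(".")
    # Only strip when the part after the last dot is a plain number.
    if sep and version.isdigit():
        return base
    return ensembl_id


# Values copied from the result table above.
mapped = {
    "P30542": "ENSG00000163485.17",
    "Q16678": "ENSG00000138061.12",
    "Q02880": "ENSG00000077097.17",
}

unversioned = {acc: strip_ensembl_version(eid) for acc, eid in mapped.items()}
print(unversioned)
```

Assuming result is a pandas DataFrame with an Ensembl column as shown, the same helper could be applied with result["Ensembl"].map(strip_ensembl_version).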
UniProtRetriever
This class supports retrieving any of the UniProt return fields. The user can access these directly from the object, under the attribute self.fields_table, e.g.:
import pandas as pd
from UniProtMapper import UniProtRetriever
field_retriever = UniProtRetriever()
df = field_retriever.fields_table
df.head()
| | Label | Legacy Returned Field | Returned Field | Field Type |
|---|---|---|---|---|
| 0 | Entry | id | accession | Names & Taxonomy |
| 1 | Entry Name | entry name | id | Names & Taxonomy |
| 2 | Gene Names | genes | gene_names | Names & Taxonomy |
| 3 | Gene Names (primary) | genes(PREFERRED) | gene_primary | Names & Taxonomy |
| 4 | Gene Names (synonym) | genes(ALTERNATIVE) | gene_synonym | Names & Taxonomy |
Similar to UniProtIDMapper, the user can either call the object directly or use the retrieveFields method to obtain the response.
result, failed = field_retriever.retrieveFields(["Q02880"])
>>> Fetched: 1 / 1
result, failed = field_retriever(["Q02880"])
>>> Fetched: 1 / 1
Custom returned fields can be retrieved by passing a list of fields to the fields parameter. These fields need to be within UniProtRetriever.fields_table["Returned Field"] and will be returned with columns named as their respective Label.
The object already has a list of default fields under self.default_fields, but these are ignored if the parameter fields is passed.
fields = ["accession", "organism_name", "structure_3d"]
result, failed = field_retriever.retrieveFields(["Q02880"],
fields=fields)
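Because requested fields have to appear in the Returned Field column, it can help to check a field list before sending the request. The sketch below does this on plain lists; split_supported_fields is a hypothetical helper (not part of the package), and the supported list is a small stand-in built from field names mentioned in this section rather than the full fields_table.

```python
from typing import Iterable, List, Tuple


def split_supported_fields(
    requested: Iterable[str], supported: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Split requested field names into (valid, unknown), preserving order."""
    supported_set = set(supported)
    valid: List[str] = []
    unknown: List[str] = []
    for name in requested:
        if name in supported_set:
            valid.append(name)
        else:
            unknown.append(name)
    return valid, unknown


# Stand-in for field_retriever.fields_table["Returned Field"]; only names
# that appear in this section are listed here.
supported = [
    "accession", "id", "gene_names", "gene_primary", "gene_synonym",
    "organism_name", "structure_3d",
]

# "structure_typo" is a deliberately misspelled, made-up field name.
valid, unknown = split_supported_fields(
    ["accession", "organism_name", "structure_typo"], supported
)
print("valid:", valid)
print("unknown:", unknown)
```

With a real retriever, the supported list would presumably come from field_retriever.fields_table["Returned Field"].tolist().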
SwissProtParser
Querying data from UniProt-SwissProt
JSON responses for UniProt-SwissProt (reviewed) entries can also be retrieved by setting to_db to UniProtKB-Swiss-Prot:
result, failed = mapper(
ids=["P30542", "Q16678", "Q02880"], from_db="UniProtKB_AC-ID", to_db="UniProtKB-Swiss-Prot"
)
print(result[0])
>>> {'from': 'P30542',
>>> 'to': {'entryType': 'UniProtKB reviewed (Swiss-Prot)',
>>> 'primaryAccession': 'P30542',
>>> ...
>>> 'Beta strand': 2,
>>> 'Turn': 1},
>>> 'uniParcId': 'UPI00000503E1'}}}
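Each item in this response pairs the queried ID (from) with the full entry (to). A minimal sketch of indexing such items by the queried ID follows; index_by_query is a hypothetical helper, and the record is a trimmed stand-in that keeps only keys visible in the output above.

```python
from typing import Any, Dict, List

REVIEWED = "UniProtKB reviewed (Swiss-Prot)"


def index_by_query(records: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Map each queried ID ('from') to its returned entry ('to')."""
    return {record["from"]: record["to"] for record in records}


# Trimmed stand-in for the response shown above; real entries hold many more keys.
records = [
    {
        "from": "P30542",
        "to": {"entryType": REVIEWED, "primaryAccession": "P30542"},
    },
]

by_query = index_by_query(records)
entry = by_query["P30542"]
is_reviewed = entry["entryType"] == REVIEWED
print(entry["primaryAccession"], is_reviewed)
```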
SwissProt responses from UniProtIDMapper can be parsed using the SwissProtParser class. The fields to extract from UniProt are passed through the toquery parameter, and the supported values are listed under self._supported_fields. The cross-referenced databases are passed through the crossrefs parameter, and the supported values are listed under self._crossref_dbs.
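# Import path assumed to mirror the UniProtRetriever import above.
from UniProtMapper import SwissProtParser
parser = SwissProtParser(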
toquery=["organism", "tissueExpression", "cellLocation"], crossrefs=["GO"]
)
parser(result[0]['to'])
>>> {'organism': 'Homo sapiens',
>>> 'tissueExpression': '',
>>> 'cellLocation': 'Cell membrane',
>>> 'GO_crossref': ['GO:0030673~GoTerm~C:axolemma',
>>> 'GO:0030673~GoEvidenceType~IEA:Ensembl',
>>> ...
>>> 'GO:0007165~GoEvidenceType~TAS:ProtInc',
>>> 'GO:0001659~GoTerm~P:temperature homeostasis',
>>> 'GO:0001659~GoEvidenceType~IEA:Ensembl',
>>> 'GO:0070328~GoTerm~P:triglyceride homeostasis',
>>> 'GO:0070328~GoEvidenceType~IEA:Ensembl']}
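Judging from this output, each cross-reference string appears to follow an id~property~value layout. Under that assumption, the entries can be grouped per GO ID as sketched below; group_crossrefs is a hypothetical helper and not part of the package, and the sample list reuses strings from the output above.

```python
from collections import defaultdict
from typing import Dict, Iterable


def group_crossrefs(entries: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Group 'id~property~value' strings into {id: {property: value}}."""
    grouped: Dict[str, Dict[str, str]] = defaultdict(dict)
    for entry in entries:
        # Split into at most three parts so a '~' inside the value survives.
        ref_id, prop, value = entry.split("~", 2)
        grouped[ref_id][prop] = value
    return dict(grouped)


# Strings copied from the parser output above.
go_crossref = [
    "GO:0030673~GoTerm~C:axolemma",
    "GO:0030673~GoEvidenceType~IEA:Ensembl",
    "GO:0001659~GoTerm~P:temperature homeostasis",
    "GO:0001659~GoEvidenceType~IEA:Ensembl",
]

terms = group_crossrefs(go_crossref)
print(terms["GO:0030673"])
```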
Both UniProtIDMapper.mapIDs and UniProtIDMapper.__call__ accept a SwissProtParser through the parser parameter, as in:
result, failed = mapper(
ids=["P30542", "Q16678", "Q02880"],
from_db="UniProtKB_AC-ID",
to_db="UniProtKB-Swiss-Prot",
parser=parser,
)