A Python wrapper for the UniProt Mapping RESTful API.
UniProtMapper
An (unofficial) Python wrapper for the UniProt Retrieve/ID Mapping RESTful API. This package supports the following functionalities:
- Map UniProt IDs to other identifiers (handled by UniProtIDMapper);
- Retrieve any of the supported return fields (handled by UniProtRetriever);
- Parse JSON UniProt-SwissProt responses (handled by SwissProtParser).
Installation
From PyPI:
pip install uniprot-id-mapper
From source:
git clone https://github.com/David-Araripe/UniProtMapper
cd UniProtMapper
pip install .
Usage
UniProtIDMapper
Supported databases and their respective types are stored under the attribute self.supported_dbs_with_types. These are also found as a list under self._supported_fields.
from UniProtMapper import UniProtIDMapper
mapper = UniProtIDMapper()
print(mapper.supported_dbs_with_types)
To map a list of UniProt IDs to Ensembl IDs, the user can either use the mapIDs method or call the object directly. Both return the mapped result and the IDs that failed to map.
result, failed = mapper.mapIDs(
ids=["P30542", "Q16678", "Q02880"], from_db="UniProtKB_AC-ID", to_db="Ensembl"
)
>>> Retrying in 3s
>>> Fetched: 3 / 3
result, failed = mapper(
ids=["P30542", "Q16678", "Q02880"], from_db="UniProtKB_AC-ID", to_db="Ensembl"
)
>>> Retrying in 3s
>>> Fetched: 3 / 3
Where result is the following pandas DataFrame:
| | UniProtKB_AC-ID | Ensembl |
|---|---|---|
| 0 | P30542 | ENSG00000163485.17 |
| 1 | Q16678 | ENSG00000138061.12 |
| 2 | Q02880 | ENSG00000077097.17 |
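The Ensembl identifiers in the table above carry a version suffix (the digits after the dot). When joining against resources that use unversioned gene IDs, that suffix may need to be dropped. A minimal sketch in plain Python, using the values shown above; strip_ensembl_version is a hypothetical helper and not part of UniProtMapper:

```python
def strip_ensembl_version(ensembl_id: str) -> str:
    """Return the Ensembl ID without its trailing '.<version>' suffix."""
    # rpartition splits on the last dot; IDs without a dot are returned unchanged.
    stem, dot, version = ensembl_id.rpartition(".")
    return stem if dot and version.isdigit() else ensembl_id


# Values copied from the result table above.
mapped = {
    "P30542": "ENSG00000163485.17",
    "Q16678": "ENSG00000138061.12",
    "Q02880": "ENSG00000077097.17",
}
unversioned = {acc: strip_ensembl_version(eid) for acc, eid in mapped.items()}
print(unversioned["P30542"])  # ENSG00000163485
```

On the DataFrame itself, the same helper could presumably be applied with result["Ensembl"].map(strip_ensembl_version), assuming the column is named as in the table.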
UniProtRetriever
This class supports retrieving any of the UniProt return fields. The user can access these directly from the object, under the attribute self.fields_table, e.g.:
import pandas as pd
from UniProtMapper import UniProtRetriever
field_retriever = UniProtRetriever()
df = field_retriever.fields_table
df.head()
| | Label | Legacy Returned Field | Returned Field | Field Type |
|---|---|---|---|---|
| 0 | Entry | id | accession | Names & Taxonomy |
| 1 | Entry Name | entry name | id | Names & Taxonomy |
| 2 | Gene Names | genes | gene_names | Names & Taxonomy |
| 3 | Gene Names (primary) | genes(PREFERRED) | gene_primary | Names & Taxonomy |
| 4 | Gene Names (synonym) | genes(ALTERNATIVE) | gene_synonym | Names & Taxonomy |
Similar to UniProtIDMapper, the user can either call the object directly or use the retrieveFields method to obtain the response.
result, failed = field_retriever.retrieveFields(["Q02880"])
>>> Fetched: 1 / 1
result, failed = field_retriever(["Q02880"])
>>> Fetched: 1 / 1
Custom returned fields can be retrieved by passing a list of fields to the fields parameter. These fields must be listed in UniProtRetriever.fields_table["Returned Field"], and they are returned as columns named after their respective Label.
The object already has a list of default fields under self.default_fields, but these are ignored if the fields parameter is passed.
fields = ["accession", "organism_name", "structure_3d"]
result, failed = field_retriever.retrieveFields(["Q02880"],
fields=fields)
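Since fields must come from the "Returned Field" column, it can help to check a request before sending it. The sketch below hard-codes the five rows from the fields_table preview above as a stand-in for the full table; check_fields and RETURNED_FIELD_TO_LABEL are hypothetical names, not part of the package.

```python
# Stand-in for fields_table: "Returned Field" -> "Label", copied from the
# five-row preview above. The real table has many more rows.
RETURNED_FIELD_TO_LABEL = {
    "accession": "Entry",
    "id": "Entry Name",
    "gene_names": "Gene Names",
    "gene_primary": "Gene Names (primary)",
    "gene_synonym": "Gene Names (synonym)",
}


def check_fields(fields, known=RETURNED_FIELD_TO_LABEL):
    """Return the expected column labels, or raise if a field is unknown."""
    unknown = [f for f in fields if f not in known]
    if unknown:
        raise ValueError(f"Unsupported return fields: {unknown}")
    return [known[f] for f in fields]


print(check_fields(["accession", "gene_primary"]))
# ['Entry', 'Gene Names (primary)']
```

With the real object, the lookup could be built from the table instead, for example dict(zip(df["Returned Field"], df["Label"])), where df is field_retriever.fields_table.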
SwissProtParser
Querying data from UniProt-SwissProt
Retrieving JSON UniProt-SwissProt (reviewed) responses is also possible, for example:
result, failed = mapper(
ids=["P30542", "Q16678", "Q02880"], from_db="UniProtKB_AC-ID", to_db="UniProtKB-Swiss-Prot"
)
print(result[0])
>>> {'from': 'P30542',
>>> 'to': {'entryType': 'UniProtKB reviewed (Swiss-Prot)',
>>> 'primaryAccession': 'P30542',
>>> ...
>>> 'Beta strand': 2,
>>> 'Turn': 1},
>>> 'uniParcId': 'UPI00000503E1'}}}
SwissProt responses from UniProtIDMapper can be parsed using the SwissProtParser class. The fields that can be extracted from UniProt (passed via the toquery parameter) are stored under self._supported_fields, and the cross-referenced datasets (passed via the crossrefs parameter) are stored under self._crossref_dbs.
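# Assumes SwissProtParser is exported at package level, like UniProtRetriever.
from UniProtMapper import SwissProtParser
parser = SwissProtParser(
toquery=["organism", "tissueExpression", "cellLocation"], crossrefs=["GO"]
)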
parser(result[0]['to'])
>>> {'organism': 'Homo sapiens',
>>> 'tissueExpression': '',
>>> 'cellLocation': 'Cell membrane',
>>> 'GO_crossref': ['GO:0030673~GoTerm~C:axolemma',
>>> 'GO:0030673~GoEvidenceType~IEA:Ensembl',
>>> ...
>>> 'GO:0007165~GoEvidenceType~TAS:ProtInc',
>>> 'GO:0001659~GoTerm~P:temperature homeostasis',
>>> 'GO:0001659~GoEvidenceType~IEA:Ensembl',
>>> 'GO:0070328~GoTerm~P:triglyceride homeostasis',
>>> 'GO:0070328~GoEvidenceType~IEA:Ensembl']}
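The GO_crossref entries above appear to follow an id~property~value layout, with one string per property. That layout is inferred from the output shown here, not from documented behavior. Under that assumption, the entries can be regrouped per GO term; group_go_crossrefs is a hypothetical helper:

```python
from collections import defaultdict


def group_go_crossrefs(entries):
    """Group 'id~property~value' strings into {id: {property: value}}."""
    grouped = defaultdict(dict)
    for entry in entries:
        # Split into at most three parts so a '~' inside the value is preserved.
        go_id, prop, value = entry.split("~", 2)
        grouped[go_id][prop] = value
    return dict(grouped)


# Entries copied from the parser output above.
go_crossref = [
    "GO:0030673~GoTerm~C:axolemma",
    "GO:0030673~GoEvidenceType~IEA:Ensembl",
    "GO:0001659~GoTerm~P:temperature homeostasis",
    "GO:0001659~GoEvidenceType~IEA:Ensembl",
]
terms = group_go_crossrefs(go_crossref)
print(terms["GO:0030673"])
# {'GoTerm': 'C:axolemma', 'GoEvidenceType': 'IEA:Ensembl'}
```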
Both the UniProtIDMapper.mapIDs and __call__ methods accept a SwissProtParser as a parameter, such as in:
result, failed = mapper(
ids=["P30542", "Q16678", "Q02880"],
from_db="UniProtKB_AC-ID",
to_db="UniProtKB-Swiss-Prot",
parser=parser,
)
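For responses that were fetched without parser=, the same parser can be applied afterwards, as in parser(result[0]['to']) earlier. The sketch below generalizes that call to a list of responses. It uses a stand-in callable (toy_parser) instead of a real SwissProtParser and a truncated response containing only keys visible in the output shown earlier; parse_responses is a hypothetical helper, and this does not claim to reproduce what the package returns when parser is passed.

```python
def parse_responses(responses, parser):
    """Apply `parser` to the 'to' payload of each response, keyed by 'from'."""
    return {resp["from"]: parser(resp["to"]) for resp in responses}


def toy_parser(entry):
    # Stand-in for a SwissProtParser instance; it only picks two keys.
    return {key: entry.get(key) for key in ("primaryAccession", "entryType")}


# Truncated response, using only keys visible in the output shown earlier.
responses = [
    {
        "from": "P30542",
        "to": {
            "entryType": "UniProtKB reviewed (Swiss-Prot)",
            "primaryAccession": "P30542",
        },
    }
]
parsed = parse_responses(responses, toy_parser)
print(parsed["P30542"]["primaryAccession"])  # P30542
```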