Getting UniProt Data from UniProt Accession IDs through the UniProt REST API
UniProt Database Web Parser Project
Description
This parser can be used to parse UniProt accession IDs and obtain related data from the UniProt web database.
Installation
To install the package, use:
python -m pip install uniprotparser
or
python3 -m pip install uniprotparser
Usage
Basic Usage
Since version 1.2.0, the package exposes the "from" and "to" mapping parameters of the UniProt API, which let you specify the database to map from and the database to map to. The available values can be listed as follows:
from uniprotparser import get_from_fields, get_to_fields
# To get all available fields to map from
from_fields = get_from_fields()
print(from_fields)
# To get all available fields to map to
to_fields = get_to_fields()
print(to_fields)
These parameters can be passed to the parse method of the UniprotParser class as follows:
from uniprotparser.betaparser import UniprotParser
parser = UniprotParser()
for p in parser.parse(ids=["P06493"], to_key="UniProtKB", from_key="UniProtKB_AC-ID"):
    print(p)
CLI Interface
With version 1.1.0, a simple command-line interface has been added to the package.
Usage: uniprotparser [OPTIONS]

Options:
  -i, --input FILENAME   Input file containing a list of accession ids
  -o, --output FILENAME  Output file
  --help                 Show this message and exit.
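The help text does not spell out the layout of the input file. The following is a minimal sketch for preparing one, assuming one accession ID per line (an assumption, not something the help text confirms). The helper name write_accession_file is hypothetical and not part of the package.

```python
from pathlib import Path


def write_accession_file(path, accessions):
    """Write accession IDs to a text file, one per line (assumed layout)."""
    # Strip whitespace, drop blank entries, and remove duplicates while keeping order
    unique = list(dict.fromkeys(a.strip() for a in accessions if a.strip()))
    Path(path).write_text("\n".join(unique) + "\n", encoding="utf-8")
    return unique
```

After calling, for example, write_accession_file("accessions.txt", ["P06493", "Q99490"]), the file could be passed to the CLI as: uniprotparser -i accessions.txt -o result.tsv (both file names are placeholders).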
Asyncio Support
With version 1.0.5, support for asyncio through aiohttp has been added to betaparser. Usage can be seen as follows:
from uniprotparser.betaparser import UniprotParser
from io import StringIO
import asyncio
import pandas as pd
async def main():
    example_acc_list = ["Q99490", "Q8NEJ0", "Q13322", "P05019", "P35568", "Q15323"]
    parser = UniprotParser()
    df = []
    # Yield result for 500 accession ids at a time
    async for r in parser.parse_async(ids=example_acc_list):
        df.append(pd.read_csv(StringIO(r), sep="\t"))
    # If more than one batch was returned, consolidate them into one dataframe
    if len(df) > 1:
        df = pd.concat(df, ignore_index=True)
    else:
        df = df[0]

asyncio.run(main())
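The comment in the example notes that results arrive for 500 accession IDs at a time. Purely as an illustration of that kind of batching (a sketch, not the package's actual implementation; the chunked helper and the placeholder IDs are hypothetical), a list of IDs can be split into fixed-size chunks like this:

```python
from typing import Iterator, List, Sequence


def chunked(ids: Sequence[str], size: int = 500) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


# 1,200 placeholder IDs are split into batches of 500, 500, and 200
batches = list(chunked([f"ID{i}" for i in range(1200)]))
print([len(b) for b in batches])  # [500, 500, 200]
```

Each batch would correspond to one item yielded by the async loop above, which is why the example collects the per-batch dataframes in a list before concatenating them.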
Legacy API
To parse a UniProt accession ID with the legacy API:
from uniprotparser.parser import UniprotSequence
protein_id = "seq|P06493|swiss"
acc_id = UniprotSequence(protein_id, parse_acc=True)
# Access ACCID
print(acc_id.accession)
# Access isoform id
print(acc_id.isoform)
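To show how an accession and isoform can be pulled out of a header string like the one above, here is a minimal sketch built on the accession pattern published in the UniProt help pages. It is not the package's implementation. The extract_accession function is hypothetical, and the assumption that an isoform appears as a "-<number>" suffix is illustrative.

```python
import re
from typing import Optional, Tuple

# Accession pattern documented by UniProt, followed by an optional isoform suffix
ACCESSION_RE = re.compile(
    r"(?P<acc>[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][0-9][A-Z0-9]{2}[0-9]){1,2})"
    r"(?P<iso>-[0-9]+)?"
)


def extract_accession(text: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (accession, isoform suffix or None) for the first match, else None."""
    match = ACCESSION_RE.search(text)
    if match is None:
        return None
    return match.group("acc"), match.group("iso")


print(extract_accession("seq|P06493|swiss"))        # ('P06493', None)
print(extract_accession("sp|P06493-2|CDK1_HUMAN"))  # ('P06493', '-2')
```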
To get additional data from the UniProt online database:
from uniprotparser.parser import UniprotParser
from io import StringIO
import pandas as pd
protein_accession = "P06493"
parser = UniprotParser([protein_accession])
# To get tabulated data
result = []
for i in parser.parse("tab"):
    tab_data = pd.read_csv(i, sep="\t")
    last_column_name = tab_data.columns[-1]
    tab_data.rename(columns={last_column_name: "query"}, inplace=True)
    result.append(tab_data)
fin = pd.concat(result, ignore_index=True)
# To get fasta sequence
with open("fasta_output.fasta", "wt") as fasta_output:
    for i in parser.parse():
        fasta_output.write(i)
License
This project is licensed under the MIT License. See the LICENSE file for details.
File details
Details for the file uniprotparser-1.3.1.tar.gz.
File metadata
- Download URL: uniprotparser-1.3.1.tar.gz
- Upload date:
- Size: 9.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.2.1 CPython/3.10.12 Linux/6.6.87.2-microsoft-standard-WSL2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 758d1393291912f891e1c003255f8e1405bbddaac10aa6bbd235389dc71d4966 |
| MD5 | 7f450f6dd173a7e4e8ded16a94618826 |
| BLAKE2b-256 | 2b55b12d9603b07dbf7c30b3ec76aa8364cb66abd2daeb7defcbd4e55bf2a32b |
File details
Details for the file uniprotparser-1.3.1-py3-none-any.whl.
File metadata
- Download URL: uniprotparser-1.3.1-py3-none-any.whl
- Upload date:
- Size: 10.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.2.1 CPython/3.10.12 Linux/6.6.87.2-microsoft-standard-WSL2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b7aeb4b5f0b8b2951abe6e4d695e54250910a6126ae2efa8c01b21f949794257 |
| MD5 | 20902277b199190917eeb16bf2697269 |
| BLAKE2b-256 | e0157f7afafbd9f52911d580a0b54ffb786e2b23a42743eee60666629c31fc63 |