Getting UniProt Data from UniProt Accession IDs through the UniProt REST API

UniProt Database Web Parser Project

TLDR: This parser takes UniProt accession ids and retrieves the related data from the UniProt web database.

To install:

python -m pip install uniprotparser

or

python3 -m pip install uniprotparser

With version 1.1.0, a simple CLI has been added to the package.

Usage: uniprotparser [OPTIONS]

Options:
  -i, --input FILENAME   Input file containing a list of accession ids
  -o, --output FILENAME  Output file
  --help                 Show this message and exit.
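As a minimal sketch of preparing input for the CLI, the snippet below writes accession ids to a text file and builds the matching command line. The one-id-per-line layout, the file names, and the `.tsv` extension are assumptions, since the help text only says the input file contains a list of accession ids. The helper `write_accession_file` is hypothetical and not part of the package.

```python
from pathlib import Path


def write_accession_file(ids, path):
    """Write accession ids to a text file, one per line (assumed layout)."""
    Path(path).write_text("\n".join(ids) + "\n", encoding="utf-8")
    return path


ids = ["Q99490", "Q8NEJ0", "Q13322"]
input_path = write_accession_file(ids, "accessions.txt")

# Command assembled from the documented -i/-o options; run it in a shell
command = ["uniprotparser", "-i", input_path, "-o", "uniprot_result.tsv"]
print(" ".join(command))
```

The command is only printed here, so the sketch runs without the package installed.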

With version 1.0.5, support for asyncio through aiohttp has been added to the betaparser module. It can be used as follows:

from uniprotparser.betaparser import UniprotParser
from io import StringIO
import asyncio
import pandas as pd

async def main():
    example_acc_list = ["Q99490", "Q8NEJ0", "Q13322", "P05019", "P35568", "Q15323"]
    parser = UniprotParser()
    df = []
    # Each iteration yields the result for 500 accession ids at a time
    async for r in parser.parse_async(ids=example_acc_list):
        df.append(pd.read_csv(StringIO(r), sep="\t"))

    # Consolidate all results into one dataframe (empty if nothing was returned)
    if len(df) > 0:
        df = pd.concat(df, ignore_index=True)
    else:
        df = pd.DataFrame()
    return df

df = asyncio.run(main())
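The comment about 500 accession ids per yielded result implies that a long id list is processed in batches. Below is a minimal sketch of that kind of batching, written independently of the package. The `chunked` helper is hypothetical, and only the batch size of 500 is taken from the comment in the example.

```python
def chunked(items, size=500):
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 1200 placeholder ids (not real accessions) to illustrate the split
placeholder_ids = [f"ID{n}" for n in range(1200)]
batch_sizes = [len(batch) for batch in chunked(placeholder_ids)]
print(batch_sizes)
```

If the parser batches requests in a similar way, a list longer than 500 ids would produce more than one result in the loop, which is why the examples collect the results in a list before concatenating them.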

With version 1.0.2, support for the new UniProt REST API has been added under the betaparser module of the package.

To use this module, follow the example below:

from uniprotparser.betaparser import UniprotParser
from io import StringIO

import pandas as pd
example_acc_list = ["Q99490", "Q8NEJ0", "Q13322", "P05019", "P35568", "Q15323"]
parser = UniprotParser()
df = []
# Each iteration yields the result for 500 accession ids at a time
for r in parser.parse(ids=example_acc_list):
    df.append(pd.read_csv(StringIO(r), sep="\t"))

# Consolidate all results into one dataframe (empty if nothing was returned)
if len(df) > 0:
    df = pd.concat(df, ignore_index=True)
else:
    df = pd.DataFrame()
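The example treats each yielded result as tab-separated text. Where pandas is not available, the same consolidation can be sketched with the standard library alone. The chunk strings and column names below are made-up placeholders standing in for the text yielded by `parser.parse()`, and `merge_tsv_chunks` is a hypothetical helper, not part of the package.

```python
import csv
from io import StringIO


def merge_tsv_chunks(chunks):
    """Combine tab-separated text chunks into one list of row dicts."""
    rows = []
    for chunk in chunks:
        # Each chunk carries its own header line, so parse them separately
        reader = csv.DictReader(StringIO(chunk), delimiter="\t")
        rows.extend(reader)
    return rows


# Placeholder chunks; real chunks would come from the parser
chunks = [
    "Entry\tLength\nACC_A\t10\n",
    "Entry\tLength\nACC_B\t20\n",
]
rows = merge_tsv_chunks(chunks)
print([row["Entry"] for row in rows])
```

All values stay as strings in this version, whereas `pd.read_csv` infers column types.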

To parse a UniProt accession with the legacy API:

from uniprotparser.parser import UniprotSequence

protein_id = "seq|P06493|swiss"

acc_id = UniprotSequence(protein_id, parse_acc=True)

#Access ACCID
acc_id.accession

#Access isoform id
acc_id.isoform
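To illustrate what extracting an accession and isoform from such a string involves, here is a standalone sketch based on the accession pattern published in the UniProt help pages. It is not the package's implementation. The function `split_accession`, its return shape, and the second input string are assumptions made for illustration.

```python
import re

# UniProt accession pattern, followed by an optional "-<number>" isoform suffix
ACCESSION_RE = re.compile(
    r"(?P<accession>[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})"
    r"(?:-(?P<isoform>\d+))?"
)


def split_accession(text):
    """Return (accession, isoform) for the first accession found, else None."""
    match = ACCESSION_RE.search(text)
    if match is None:
        return None
    return match.group("accession"), match.group("isoform")


print(split_accession("seq|P06493|swiss"))
print(split_accession("sp|P06493-2|EXAMPLE"))
```

The isoform is `None` when the string carries no isoform suffix. How `UniprotSequence` represents a missing isoform may differ.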

To get additional data from the UniProt online database with the legacy API:

from uniprotparser.parser import UniprotParser
from io import StringIO
#Install pandas first to handle tabulated data
import pandas as pd

protein_accession = "P06493"

parser = UniprotParser([protein_accession])

#To get tabulated data
result = []
for i in parser.parse("tab"):
    tab_data = pd.read_csv(i, sep="\t")
    last_column_name = tab_data.columns[-1]
    tab_data.rename(columns={last_column_name: "query"}, inplace=True)
    result.append(tab_data)
fin = pd.concat(result, ignore_index=True)

#To get fasta sequence
with open("fasta_output.fasta", "wt") as fasta_output:
    for i in parser.parse():
        fasta_output.write(i)

