A Python package to retrieve bioinformatics data for given UniProt IDs.
pyuniprot
pyuniprot parses the UniProt txt file for a given UniProt ID into a Python object. All of the information becomes programmatically accessible from Python, the most used programming language in bioinformatics.
The Python object is mostly a dictionary of categories, and each category is wrapped as a dataclass so that its attributes are easy to access through dot notation. Convenience functions will be provided for some common use cases.
Contributions are very welcome!
Install
pip install pyuniprotkb
Usage
- Read a local file ./P01116.txt, or download it from UniProtKB and save it to the current directory
from pyuniprot import Uniprot

uniprot_id = "P01116"
# If './P01116.txt' exists, pyuniprot reads it directly; if not, pyuniprot
# downloads it from UniProtKB first and (optionally) saves it to './'.
uniprot = Uniprot(uniprot_id, local_download_dir='./', save_txt=True)
# Print all the information that UniProtKB has for this UniProt ID.
print(uniprot.category_lines)
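The read-or-download behaviour described above can be sketched roughly as follows. This is not pyuniprot's own code: `load_entry_text` and `fetch_from_uniprot` are hypothetical names, and the REST URL pattern is an assumption of this sketch.

```python
from pathlib import Path
from typing import Callable
from urllib.request import urlopen


def fetch_from_uniprot(uniprot_id: str) -> str:
    """Download the flat-text entry (the URL pattern is an assumption)."""
    url = f"https://rest.uniprot.org/uniprotkb/{uniprot_id}.txt"
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def load_entry_text(
    uniprot_id: str,
    local_download_dir: str = "./",
    save_txt: bool = True,
    fetch: Callable[[str], str] = fetch_from_uniprot,
) -> str:
    """Return the entry text, preferring a local copy over a download."""
    path = Path(local_download_dir) / f"{uniprot_id}.txt"
    if path.exists():
        # A local copy exists, so no network access is needed.
        return path.read_text(encoding="utf-8")
    text = fetch(uniprot_id)
    if save_txt:
        # Keep a copy so the next call can read it from disk.
        path.write_text(text, encoding="utf-8")
    return text
```

Passing the downloader in as `fetch` keeps the caching logic testable without network access.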
- Access category information
print(uniprot.category_lines['AC'])
# result: AC(primary_accession='P01116;', secondary_accessions=['A8K8Z5', 'B0LPF9', 'P01118', 'Q96D10'])
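To illustrate what dot-notation access on a category looks like, here is a minimal sketch with a stand-in dataclass. The `AC` class and `parse_ac_line` below are written for illustration only and are not pyuniprot's own definitions; the field names simply mirror the printed result above, and the input line is assembled from the accessions shown there. Unlike that result, this sketch strips the trailing semicolon.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AC:
    """Stand-in for the accession-number category (not pyuniprot's class)."""
    primary_accession: str
    secondary_accessions: List[str] = field(default_factory=list)


def parse_ac_line(line: str) -> AC:
    """Parse one 'AC   P01116; A8K8Z5; ...' line of a UniProt flat file."""
    # Drop the two-letter code, then split the ';'-separated accessions.
    accessions = [item.strip() for item in line[2:].split(";") if item.strip()]
    return AC(primary_accession=accessions[0],
              secondary_accessions=accessions[1:])


# A dictionary keyed by two-letter code, as category_lines is described.
category_lines = {
    "AC": parse_ac_line("AC   P01116; A8K8Z5; B0LPF9; P01118; Q96D10;")
}
ac = category_lines["AC"]
print(ac.primary_accession)     # attribute access via dot notation
print(ac.secondary_accessions)
```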
Currently, most information can only be accessed through the category_lines dictionary. Its keys are the two-letter line codes used by UniProtKB.
They are (in my understanding):
- ID: UniProtKB identifiers
- AC: accession numbers
- DT: entry brief history
- DE: protein names/descriptions
- GN: gene name (by HGNC?)
- OS, OC, OX (grouped as OSCX in pyuniprot): organism names
- RN, RP, RC, RX, RA, RT, RL (grouped as RNPCXATL in pyuniprot): references
- CC: activity comments
- DR: database cross references
- PE: protein evidence level
- KW: protein keywords
- FT: protein feature tables
- SQ: protein sequence
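As a rough illustration of how these two-letter codes structure a flat-text entry, the sketch below groups the lines of an entry by their leading code. It is a simplified stand-in, not pyuniprot's parser: `group_lines_by_code` is a hypothetical helper, it does not merge codes into groups such as OSCX, and the sample entry is a made-up toy record rather than real UniProtKB data.

```python
from collections import defaultdict
from typing import Dict, List


def group_lines_by_code(entry_text: str) -> Dict[str, List[str]]:
    """Map each two-letter line code to the content of its lines."""
    grouped = defaultdict(list)
    for line in entry_text.splitlines():
        if line.startswith("//"):
            break  # '//' terminates an entry
        if not line.strip():
            continue
        if line.startswith("  "):
            # Sequence data lines carry no code; file them under 'SQ'.
            grouped["SQ"].append(line.strip())
        else:
            # The code is followed by three spaces, then the content.
            grouped[line[:2]].append(line[5:].rstrip())
    return dict(grouped)


# Toy entry for illustration only (not a real UniProtKB record).
SAMPLE = """\
ID   TOY_ENTRY               Reviewed;          10 AA.
AC   X00000;
OS   Homo sapiens (Human).
OX   NCBI_TaxID=9606;
SQ   SEQUENCE   10 AA;
     MTEYKLVVVG
//
"""

grouped = group_lines_by_code(SAMPLE)
print(sorted(grouped))
print(grouped["AC"])
```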