APFID - Arbitrary Protein Fragment IDentifier parser
APFID
APFID stands for Arbitrary Protein Fragment IDentifier and serves as a unique identifier of a molecular structure fragment in databases such as PDB or AlphaFold, or in a user's own database if necessary. It is useful for identifying and storing data in large-scale studies of structural motifs or domains. It is used by the PSSKB.ORG database as its structure identifier and also in some structural studies performed by the Laboratory of Structural Proteomics at IBMC, Moscow.
This library contains a basic parser utility for working with APFIDs in Python 3.10+.
APFID structure:
Commonly, it looks like this: 1YSI_A111_A191, but there are some options described below.
An APFID consists of three underscore-separated parts:
- experiment_id: identifier in the database: PDB ID, AlphaFold ID, or a user-given one
- Start position: chain ID and residue number
- End position: chain ID and residue number
Note: the chain ID must be the code used in the PDB structure file, not the one in the deposition details. At rcsb.org the correct ID is labeled Auth ID.
To identify a whole chain, one can use the short form {experiment_id}_{chain_id} and omit what follows. So 1EDI_A is a perfectly valid APFID.
In some cases, the second underscore can be replaced with "-", so 1YSI_A111_A191 is equivalent to 1YSI_A111-A191.
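The three-part layout described above can be sketched with a regular expression. This is a minimal illustration, not the library's own parser: the names `APFID_V1`, `Fragment` and `parse_v1` are hypothetical, and the pattern assumes that chain IDs consist of letters only and that residue numbers are non-negative integers.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for illustration only. Assumptions: chain IDs are
# letters, residue numbers are non-negative integers.
APFID_V1 = re.compile(
    r"^(?P<experiment_id>[A-Za-z0-9-]+)"   # PDB, AlphaFold or user-given ID
    r"_(?P<chain_id>[A-Za-z]+)"            # chain of the start position
    r"(?:(?P<start>\d+)"                   # start residue number
    r"[_-](?P<end_chain_id>[A-Za-z]+)"     # second separator may be "_" or "-"
    r"(?P<end>\d+))?$"                     # end residue number
)


class Fragment(NamedTuple):
    experiment_id: str
    chain_id: str
    start: Optional[int]
    end_chain_id: Optional[str]
    end: Optional[int]


def parse_v1(text: str) -> Fragment:
    """Split an APFID into its parts; the short whole-chain form leaves
    start, end_chain_id and end as None."""
    match = APFID_V1.match(text)
    if match is None:
        raise ValueError(f"not a recognisable APFID: {text!r}")
    whole_chain = match["start"] is None
    return Fragment(
        experiment_id=match["experiment_id"],
        chain_id=match["chain_id"],
        start=None if whole_chain else int(match["start"]),
        end_chain_id=match["end_chain_id"],
        end=None if whole_chain else int(match["end"]),
    )


print(parse_v1("1YSI_A111_A191"))
print(parse_v1("1YSI_A111-A191") == parse_v1("1YSI_A111_A191"))
print(parse_v1("1EDI_A"))
```

Because the experiment ID part allows hyphens, an AlphaFold-style ID such as AF-Q8RX87-F1-V2 followed by `_A` is also accepted by this sketch.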
APFIDv2
A more forgiving way to identify a structure, though it has not yet been used for our own structures. The basic rules are:
{experiment_id}[:{model}]_{chain_id}[{start}_[{chain2_id}_]{end}]
where:
- experiment_id: identifier in the database: PDB ID, AlphaFold ID, or a user-given one
- model: ID of the model within the same file (can be omitted, in which case it falls back to zero, i.e. the first model). Added to navigate through NMR and MD models
- chain_id: ID of the chain to begin at
- chain2_id: ID of the chain to end at. If it differs from chain_id, all chains that fall alphabetically between chain_id and chain2_id are included. Not fully implemented yet, as no realistic use cases are foreseen
- start, end: residue numbers within the chains
Examples:
1YSI_A111_A191
1YSI:10_A111_191
1YSI:10_A111-191
1YSI:10_A
1YSI_A
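As a rough sketch, the examples listed above can be matched with a single regular expression that adds the optional model number and makes the second chain ID optional. The names `APFID_V2` and `parse_v2` are hypothetical and not part of the library. The pattern follows the listed examples (second chain ID written directly before the end residue, separator either "_" or "-") and assumes letter-only chain IDs and non-negative residue numbers.

```python
import re

# Hypothetical pattern, modelled on the examples in this README rather than
# on the library's source code.
APFID_V2 = re.compile(
    r"^(?P<experiment_id>[A-Za-z0-9-]+)"  # database ID
    r"(?::(?P<model>\d+))?"               # optional ":model"
    r"_(?P<chain_id>[A-Za-z]+)"           # chain to begin at
    r"(?:(?P<start>\d+)[_-]"              # start residue, then "_" or "-"
    r"(?P<chain2_id>[A-Za-z]+)?"          # optional chain to end at
    r"(?P<end>\d+))?$"                    # end residue
)


def parse_v2(text: str) -> dict:
    """Return the parts of an APFIDv2 string as a dict."""
    match = APFID_V2.match(text)
    if match is None:
        raise ValueError(f"not a recognisable APFIDv2: {text!r}")
    fields = match.groupdict()
    return {
        "experiment_id": fields["experiment_id"],
        # An omitted model falls back to zero (the first model).
        "model": int(fields["model"] or 0),
        "chain_id": fields["chain_id"],
        # Assumption: an omitted end chain means the same chain as the start.
        "chain2_id": fields["chain2_id"] or fields["chain_id"],
        "start": int(fields["start"]) if fields["start"] else None,
        "end": int(fields["end"]) if fields["end"] else None,
    }


for example in ("1YSI_A111_A191", "1YSI:10_A111_191", "1YSI:10_A111-191",
                "1YSI:10_A", "1YSI_A"):
    print(example, parse_v2(example))
```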
Database IDs:
The ID is case-insensitive, e.g. APFID 1EDI_A is equal to 1edi_A
(but neither is equal to 1edi_a, as case can be significant for the chain ID).
For AlphaFold, the structure ID can be used in the form AF-Q8RX87-F1-V2.
User database experiment_ids (for example, structures uploaded by users in PSSKB services) should have the prefix 'USR' and a total length not equal to 4. Letters, numbers and hyphens are allowed.
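The comparison and naming rules above could be checked along these lines. Both helpers, `same_apfid` and `looks_like_user_id`, are hypothetical names used for illustration and are not part of the library. The sketch assumes that the 'USR' prefix is matched case-insensitively, like the rest of the experiment ID, and it does not treat "_" and "-" separators as interchangeable.

```python
import re


def same_apfid(a: str, b: str) -> bool:
    """Compare two APFIDs: the experiment ID ignores case, the rest does not."""
    exp_a, _, rest_a = a.partition("_")
    exp_b, _, rest_b = b.partition("_")
    return exp_a.upper() == exp_b.upper() and rest_a == rest_b


def looks_like_user_id(experiment_id: str) -> bool:
    """Check the user-database rules: 'USR' prefix, length other than 4,
    and only letters, digits and hyphens."""
    return (
        experiment_id.upper().startswith("USR")  # assumption: prefix ignores case
        and len(experiment_id) != 4
        and re.fullmatch(r"[A-Za-z0-9-]+", experiment_id) is not None
    )


print(same_apfid("1EDI_A", "1edi_A"))        # True
print(same_apfid("1EDI_A", "1edi_a"))        # False
print(looks_like_user_id("USR-my-model-1"))  # True
print(looks_like_user_id("USR1"))            # False
```

Excluding length 4 keeps user IDs from being mistaken for four-character PDB IDs, which is presumably why the rule exists.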
INSTALLATION
pip install apfid
USAGE
Parsing from a string:
from apfid import parse_apfid
apf = parse_apfid('1YSI_A111_A191')
print(apf)
# 1YSI_A111_A191
print(apf.upper())
# 1YSI_A111_A191
print(apf.lower())
# 1ysi_A111_A191
print(apf.experiment_id)
# 1YSI
print(apf.chain_id, apf.start, apf.end)
# A 111 191
Creating from parameters:
from apfid import Apfid
apf = Apfid('1YSQ', 'A', 111, 191, 10, version=2)
apf.upper()
# '1YSQ:10_A111_191'