A toolkit for functional annotation of ligands in the PDB
PDBe RelLig
Relevant Ligands in PDB
The PDB holds over 200,000 entries, and about 75% of these structures contain at least one ligand bound to a protein or nucleic acid. However, not all ligands are biologically relevant: some are present because of experimental necessities, such as aiding crystallisation or enabling cryoprotection, while others play biologically significant roles as cofactors, reactants or drugs. The biological role of the ligands in a PDB entry is not annotated in the PDB/mmCIF files. PDBe RelLig is designed to bridge this gap by automatically annotating a ligand's functional role under the following categories:
- cofactor
- reactant
- drug
Installation
Create and activate a virtual environment, then install PDBe RelLig:
pip install pdberellig
Dependencies
- Python ^3.10
- click 8.1.7
- pdbeccdutils ^0.8.6
- sparqlwrapper ^2.0.0
- pandas ^2.2.3
Running the pipeline
The pipeline can be run in three modes:
cofactors
pdberellig cofactors --cif <path_to_ligand_cif_file> --ligand-type <type_of_ligand> --out-dir <path_to_output>
pipeline inputs
Options:
--cif TEXT path to input cif file [required]
--ligand-type [CCD|PRD|CLC] type of ligand in the PDB [required]
--out-dir TEXT path to output directory [required]
--help Show this message and exit.
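For batch runs it can be convenient to drive the command from Python. The following is a minimal sketch, not part of the package: it only assembles the options listed above and hands them to `subprocess`. The helper names `build_cofactors_command` and `run_cofactors` are hypothetical, and creating the output directory beforehand is an assumption about what the tool expects.

```python
import subprocess
from pathlib import Path

# Choices listed for --ligand-type in the options above.
LIGAND_TYPES = {"CCD", "PRD", "CLC"}


def build_cofactors_command(cif: str, ligand_type: str, out_dir: str) -> list[str]:
    """Assemble the argument list for `pdberellig cofactors` (hypothetical helper)."""
    if ligand_type not in LIGAND_TYPES:
        raise ValueError(f"ligand_type must be one of {sorted(LIGAND_TYPES)}")
    return [
        "pdberellig", "cofactors",
        "--cif", cif,
        "--ligand-type", ligand_type,
        "--out-dir", out_dir,
    ]


def run_cofactors(cif: str, ligand_type: str, out_dir: str) -> None:
    """Run the cofactors mode; raises CalledProcessError on a non-zero exit."""
    # Assumption: the output directory should exist before the tool runs.
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_cofactors_command(cif, ligand_type, out_dir), check=True)
```

A call such as `run_cofactors("HEM.cif", "CCD", "results")` would then run one ligand; the file and directory names here are placeholders.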
pipeline outputs
<ligand_id>_cofactor_annotation.json - a JSON file containing the proteins that interact with the ligand and the ligand's similarity to the template and representative molecules of cofactor classes.
Example
{
"HEM": {
"template": {
"id": "HEA",
"similarity": 0.717
},
"representative": {
"id": "HEA",
"similarity": 0.717
},
"pdb_chains": [
{
"pdb_id": "3ks0",
"auth_asym_id": "A",
"struct_asym_id": "C",
"uniprot_id": "P00175",
"ec_number": "1.1.2.3"
},
{
"pdb_id": "1ltd",
"auth_asym_id": "A",
"struct_asym_id": "A",
"uniprot_id": "P00175",
"ec_number": "1.1.2.3"
}
]
}
}
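The annotation file can be post-processed with the standard library. The sketch below assumes the layout shown in the example (a top-level ligand ID mapping to `template`, `representative` and `pdb_chains`) and groups the interacting chains by UniProt accession. The function name `chains_by_uniprot` is hypothetical, and the inline string stands in for a real `<ligand_id>_cofactor_annotation.json` file.

```python
import json


def chains_by_uniprot(annotation: dict) -> dict[str, list[str]]:
    """Group 'pdb_id:auth_asym_id' labels by UniProt accession (hypothetical helper)."""
    grouped: dict[str, list[str]] = {}
    for record in annotation.values():
        # Assumption: every ligand record carries a "pdb_chains" list as in the example.
        for chain in record.get("pdb_chains", []):
            label = f'{chain["pdb_id"]}:{chain["auth_asym_id"]}'
            grouped.setdefault(chain["uniprot_id"], []).append(label)
    return grouped


# Inline copy of the example output; in practice use json.load() on the output file.
example = json.loads("""
{
  "HEM": {
    "template": {"id": "HEA", "similarity": 0.717},
    "representative": {"id": "HEA", "similarity": 0.717},
    "pdb_chains": [
      {"pdb_id": "3ks0", "auth_asym_id": "A", "struct_asym_id": "C",
       "uniprot_id": "P00175", "ec_number": "1.1.2.3"},
      {"pdb_id": "1ltd", "auth_asym_id": "A", "struct_asym_id": "A",
       "uniprot_id": "P00175", "ec_number": "1.1.2.3"}
    ]
  }
}
""")

grouped = chains_by_uniprot(example)
print(grouped)
```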
reactants
pdberellig reactants --cif <path_to_ligand_cif_file> --ligand-type <type_of_ligand> --chebi-structure-file <csv_file_with_chebi_mol> --out-dir <path_to_output>
pipeline inputs
Options:
--cif TEXT path to input cif file [required]
--ligand-type [CCD|PRD|CLC] type of ligand in the PDB [required]
--chebi-structure-file TEXT Path to the ChEBI SDF file [required]
--out-dir TEXT path to output directory [required]
--update-chebi Path to the ChEBI archive files
--minimal-ligand-size INTEGER Minimum ligand size. [default: 5]
--help Show this message and exit.
pipeline outputs
<ligand_id>_reactant_annotation.tsv - a TSV file containing the proteins that interact with the ligand and the ligand's similarity to the participants of the reactions catalysed by each protein.
Example
pdb_id | auth_asym_id | struct_asym_id | uniprot_id | rhea_id | chebi_id | similarity |
---|---|---|---|---|---|---|
1r3q | A | A | P06132 | 19865 | 57308 | 0.714 |
1r3q | A | A | P06132 | 31239 | 62626 | 0.8 |
1r3q | A | A | P06132 | 31239 | 62631 | 1.0 |
1r3s | A | A | P06132 | 19865 | 57308 | 0.714 |
1r3s | A | A | P06132 | 31239 | 62626 | 0.8 |
1r3s | A | A | P06132 | 31239 | 62631 | 1.0 |
1r3v | A | A | P06132 | 19865 | 57308 | 0.714 |
1r3v | A | A | P06132 | 31239 | 62626 | 0.8 |
1r3v | A | A | P06132 | 31239 | 62631 | 1.0 |
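Because one chain can match several reaction participants, a common follow-up is to keep only the closest match per chain. The sketch below does this with the standard `csv` module. It assumes the output is tab-delimited with a header row carrying the column names shown in the example; the function name `best_reactant_match` is hypothetical, and the sample rows are copied from the table above.

```python
import csv
import io


def best_reactant_match(handle) -> dict[tuple[str, str], dict[str, str]]:
    """Return, per (pdb_id, auth_asym_id), the row with the highest similarity."""
    best: dict[tuple[str, str], dict[str, str]] = {}
    # Assumption: tab-delimited file whose header matches the example columns.
    for row in csv.DictReader(handle, delimiter="\t"):
        key = (row["pdb_id"], row["auth_asym_id"])
        if key not in best or float(row["similarity"]) > float(best[key]["similarity"]):
            best[key] = row
    return best


# Sample rows from the example; in practice pass open(path, newline="").
columns = ["pdb_id", "auth_asym_id", "struct_asym_id", "uniprot_id",
           "rhea_id", "chebi_id", "similarity"]
rows = [
    ["1r3q", "A", "A", "P06132", "19865", "57308", "0.714"],
    ["1r3q", "A", "A", "P06132", "31239", "62626", "0.8"],
    ["1r3q", "A", "A", "P06132", "31239", "62631", "1.0"],
]
sample = "\n".join("\t".join(fields) for fields in [columns, *rows])

best = best_reactant_match(io.StringIO(sample))
for (pdb_id, chain), row in best.items():
    print(pdb_id, chain, row["rhea_id"], row["chebi_id"], row["similarity"])
```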
drugs
pdberellig drugs --cif <path_to_ligand_cif_file> --ligand-type <type_of_ligand> --out-dir <path_to_output>
pipeline inputs
Options:
--cif TEXT path to input cif file [required]
--out-dir TEXT path to output directory [required]
--ligand-type [CCD|PRD|CLC] type of ligand in the PDB [required]
--help Show this message and exit.
pipeline outputs
<ligand_id>_drug_annotation.tsv - a TSV file listing the pharmacologically active drug targets that interact with the ligand.
Example
pdb_id | auth_asym_id | struct_asym_id | uniprot_id | name | organism |
---|---|---|---|---|---|
7n9g | A | A | P00519 | Tyrosine-protein kinase ABL1 | Humans |
3pyy | B | B | P00519 | Tyrosine-protein kinase ABL1 | Humans |
6npu | B | B | P00519 | Tyrosine-protein kinase ABL1 | Humans |
6npe | A | A | P00519 | Tyrosine-protein kinase ABL1 | Humans |
2hyy | A | A | P00519 | Tyrosine-protein kinase ABL1 | Humans |
6npu | A | A | P00519 | Tyrosine-protein kinase ABL1 | Humans |
6npv | B | B | P00519 | Tyrosine-protein kinase ABL1 | Humans |
6npe | B | B | P00519 | Tyrosine-protein kinase ABL1 | Humans |
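The same PDB entry can appear once per chain, so a summary of distinct entries per target is often more useful than the raw rows. The sketch below assumes a tab-delimited file with the header shown in the example; `entries_per_target` is a hypothetical helper, and the sample rows are taken from the table above.

```python
import csv
import io


def entries_per_target(handle) -> dict[str, list[str]]:
    """Map each UniProt accession to its sorted, de-duplicated PDB entry IDs."""
    entries: dict[str, set[str]] = {}
    # Assumption: tab-delimited file whose header matches the example columns.
    for row in csv.DictReader(handle, delimiter="\t"):
        entries.setdefault(row["uniprot_id"], set()).add(row["pdb_id"])
    return {target: sorted(ids) for target, ids in entries.items()}


columns = ["pdb_id", "auth_asym_id", "struct_asym_id", "uniprot_id", "name", "organism"]
rows = [
    ["7n9g", "A", "A", "P00519", "Tyrosine-protein kinase ABL1", "Humans"],
    ["6npu", "B", "B", "P00519", "Tyrosine-protein kinase ABL1", "Humans"],
    ["6npu", "A", "A", "P00519", "Tyrosine-protein kinase ABL1", "Humans"],
]
sample = "\n".join("\t".join(fields) for fields in [columns, *rows])

summary = entries_per_target(io.StringIO(sample))
print(summary)
```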
Contribution
We encourage you to contribute to this project. The package uses poetry for packaging and dependency management. You can develop locally using:
git clone https://github.com/PDBeurope/rellig.git
cd rellig
pip install poetry
poetry install --with dev,doc
pre-commit install
The pre-commit hook will run linting and formatting, and will update poetry.lock. The poetry.lock file locks all dependencies and ensures that they match the versions in pyproject.toml.
To add a new dependency
# Latest resolvable version
poetry add <package>
# Optionally fix a version
poetry add <package>@<version>
To change a version of a dependency, either edit pyproject.toml and run:
poetry sync --with dev
or
poetry add <package>@<version>
Documentation
The documentation is generated using sphinx with sphinx_rtd_theme and hosted on GitHub Pages. To generate the documentation locally:
cd doc
poetry run sphinx-build -b html . _build/html
# See the documentation at http://localhost:8080.
python -m http.server 8080 -d _build/html