Model peptide-MHC I complexes using anchor distance restraints in MODELLER
PANDORA
Peptide ANchored mODelling fRAmework for peptide-MHC complexes
Overview
PANDORA is an anchor-restrained modelling pipeline for generating peptide-MHC structures.
It contains multiple functions to pre-process data and can exploit domain knowledge provided by the user to guide the modelling.
PANDORA documentation can be found at: https://csb-pandora.readthedocs.io/en/latest/
Dependencies
PANDORA requires MODELLER, Python and some Python libraries. The following must be available before you start installing PANDORA:
- Python 3
- conda
- pip3
The installation process takes care of installing the following dependencies (see Installation); there is no need to install them yourself.
The following dependencies can be used to predict peptide anchor positions, but have to be installed manually:
Installation
Conda Installation (suggested)
1. Get a MODELLER license key:
Before installing PANDORA, you first need to activate MODELLER's license. Request a MODELLER license at: https://salilab.org/modeller/registration.html
Replace XXXX with your MODELLER License key and run the command:
export KEY_MODELLER='XXXX'
2. Install PANDORA
Install with conda:
conda install -c csb-nijmegen csb-pandora -c salilab -c bioconda
GitHub / Pypi installation
1. Install Modeller:
Before installing PANDORA, you first need to activate MODELLER's license. Request a MODELLER license at: https://salilab.org/modeller/registration.html
Replace XXXX with your MODELLER License key and run the command:
export KEY_MODELLER='XXXX'
Then install MODELLER with:
conda install -y -c salilab modeller
2. Install Muscle
PANDORA relies on muscle (https://anaconda.org/bioconda/muscle), which can be installed via bioconda:
conda install -c bioconda muscle
3. Install PANDORA
Pypi installation:
pip install csb-pandora
Alternatively, GitHub installation:
Clone the repository:
git clone https://github.com/X-lab-3D/PANDORA.git
Enter the cloned directory and install the package together with its dependencies:
cd PANDORA
pip install -e .
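After either installation route, it can help to confirm that the pieces are visible from the Python interpreter you intend to use. The following is a minimal sketch, not part of PANDORA: it assumes the package is importable as PANDORA (as in the examples in this document), that MODELLER exposes a Python module named modeller, and that the bioconda package provides an executable called muscle.

```python
import importlib.util
import shutil


def check_pandora_environment():
    """Report which PANDORA prerequisites are visible from this interpreter."""
    return {
        # find_spec returns None when a top-level package cannot be imported.
        "PANDORA": importlib.util.find_spec("PANDORA") is not None,
        "modeller": importlib.util.find_spec("modeller") is not None,
        # shutil.which returns None when the executable is not on PATH
        # (the executable name "muscle" is an assumption).
        "muscle": shutil.which("muscle") is not None,
    }


for name, found in check_pandora_environment().items():
    print(f"{name:10s} {'found' if found else 'MISSING'}")
```

A MISSING entry usually means the wrong conda environment is active or the corresponding installation step was skipped.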
Generate / download template Database
PANDORA needs a PDB template database to work (retrieved from the IMGT database). You can download it from https://github.com/X-lab-3D/PANDORA_database (pMHC I only, generated on 23/03/2021) and follow the instructions there. Make sure you re-path your database as explained in those instructions.
Alternatively, you can generate your own template database (suggested) with the following Python 3 code:
## import requested modules
from PANDORA.PMHC import PMHC
from PANDORA.Pandora import Pandora
from PANDORA.Database import Database
## A. Create local Database
db = Database.Database()
db.construct_database(save='path/to/pandora_Database.pkl')
Note: generating a database can take more than an hour and a half, so we advise running it as a background process or submitting it as a cluster job.
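Because building the database is slow, a script will typically want to build it once and reuse the saved file afterwards. The helper below is a sketch of that pattern and is not part of PANDORA; load_or_build is a hypothetical name, and the loading and building steps are passed in as callables so that the helper itself assumes nothing about PANDORA internals.

```python
from pathlib import Path


def load_or_build(db_path, load, build):
    """Return a database, running the slow build only if no saved file exists.

    `load` and `build` are callables supplied by the caller; each receives
    the database path as a string.
    """
    db_path = Path(db_path)
    if not db_path.exists():
        build(str(db_path))  # slow step: executed once, then the file is reused
    return load(str(db_path))


# Intended use with the calls shown in this document (not executed here):
#   from PANDORA.Database import Database
#   db = load_or_build(
#       'path/to/pandora_Database.pkl',
#       load=Database.load,
#       build=lambda p: Database.Database().construct_database(save=p),
#   )
```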
(Optional) Install NetMHCpan and/or NetMHCIIpan
PANDORA lets the user choose to predict the peptide's anchor residues instead of using the conventional predefined anchor residues. In that case you need to download NetMHCpan (for peptide:MHC class I) and/or NetMHCIIpan (for peptide:MHC class II). To install them, you can simply run:
python netMHCpan_install.py
Tutorial
Example 1 : Generating a peptide:MHC complex given the peptide sequence
PANDORA requires at least the following information to generate models:
- Peptide sequence
- MHC allele
Steps:
A. Load the template database (see Installation, Generate / download template Database)
B. Create a Target object based on the given target information
C. Generate n pMHC models (default n=20)
Please note that you can specify the output directory yourself; otherwise the models are written to a default directory.
## import requested modules
from PANDORA.PMHC import PMHC
from PANDORA.Pandora import Pandora
from PANDORA.Database import Database
## A. Load local Database
db = Database.load('path/to/pandora_Database.pkl')
## B. Create Target object
target = PMHC.Target(id = 'myTestCase',
    allele_type = 'HLA-A*0201',
    peptide = 'LLFGYPVYV',
    anchors = [2,9])
## C. Perform modelling
case = Pandora.Pandora(target, db)
case.model()
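A quick sanity check of the peptide and anchors before creating the Target can catch typos early. The sketch below is not part of PANDORA; check_target_inputs is a hypothetical helper, and it assumes that peptides use the 20 standard one-letter amino acid codes and that anchors are 1-based positions (consistent with anchors = [2,9] for the 9-residue peptide above).

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def check_target_inputs(peptide, anchors):
    """Raise ValueError if the peptide or its 1-based anchors look wrong."""
    if not peptide:
        raise ValueError("peptide sequence is empty")
    unknown = sorted(set(peptide) - STANDARD_AMINO_ACIDS)
    if unknown:
        raise ValueError(f"unexpected residues in peptide: {unknown}")
    for position in anchors:
        # Anchors count residues from 1, so the last valid value is len(peptide).
        if not 1 <= position <= len(peptide):
            raise ValueError(f"anchor {position} is outside 1..{len(peptide)}")
    return True


check_target_inputs('LLFGYPVYV', [2, 9])  # passes silently
```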
Example 2 : Create multiple loop models in a directory of your choice
Several options can be passed as arguments to the functions.
For instance:
- Generate more models for your modelling case
- Specify the output directory yourself
- Give your target a name
- Predict anchors by NetMHCpan
Please note that, if anchors is not specified and use_netmhcpan is set to False, PANDORA will automatically assign canonical anchors (P2 and PΩ).
from PANDORA.PMHC import PMHC
from PANDORA.Pandora import Pandora
from PANDORA.Database import Database
## A. load the pregenerated Database of all pMHC PDBs as templates
db = Database.load('path/to/pandora_Database.pkl')
## B. Create Target object
target = PMHC.Target(id = 'myTestCase',
    allele_type = ['HLA-B*5301', 'HLA-B*5301'],
    peptide = 'TPYDINQML',
    use_netmhcpan = True)
## C. Perform modelling
case = Pandora.Pandora(target, db, output_dir = '/your/directory/')
case.model(n_loop_models=100) # Generates 100 models
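For reference, the canonical MHC class I anchors mentioned above (P2 and PΩ, the last residue) can be written out explicitly. This is only an illustration of the convention with 1-based positions, not PANDORA's internal code; canonical_anchors is a hypothetical name.

```python
def canonical_anchors(peptide):
    """Return the canonical class I anchors: P2 and the last residue (1-based)."""
    if len(peptide) < 2:
        raise ValueError("peptide is too short to have a P2 anchor")
    return [2, len(peptide)]


print(canonical_anchors('TPYDINQML'))  # prints [2, 9]
```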
Example 3 : Benchmark PANDORA on one modelling case
Evaluate the framework on a target with a known experimental structure:
- Provide the PDB ID for the Target class
- Set benchmark=True for the modelling (calculates the L-RMSD, which shows how far each model is from the experimental structure)
from PANDORA.PMHC import PMHC
from PANDORA.Pandora import Pandora
from PANDORA.Database import Database
## A. Load pregenerated database of all pMHC PDBs as templates
db = Database.load('path/to/pandora_Database.pkl')
## B. Create Target object
target = PMHC.Target('1A1M',
    db.MHCI_data['1A1M'].allele_type,
    db.MHCI_data['1A1M'].peptide,
    anchors = db.MHCI_data['1A1M'].anchors)
## C. Perform modelling
case = Pandora.Pandora(target, db)
case.model(benchmark=True)
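To make the benchmark metric concrete, the sketch below computes a plain RMSD between two matched lists of 3D coordinates. It is an illustration only and not PANDORA's implementation: it assumes both structures are already superposed and that the atoms are listed in the same order, and it says nothing about which peptide atoms PANDORA includes in its L-RMSD.

```python
import math


def rmsd(model_coords, reference_coords):
    """Root-mean-square deviation between two equally long lists of (x, y, z)."""
    if not model_coords or len(model_coords) != len(reference_coords):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (a - b) ** 2
        for atom_model, atom_ref in zip(model_coords, reference_coords)
        for a, b in zip(atom_model, atom_ref)
    )
    # Average the squared distances over atoms, then take the square root.
    return math.sqrt(squared_sum / len(model_coords))


# Two atoms, each displaced by 1 along x relative to the reference.
print(rmsd([(0, 0, 0), (1, 0, 0)], [(1, 0, 0), (2, 0, 0)]))  # prints 1.0
```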
Example 4: Model a peptide:MHCI complex with an alpha helix in the peptide
Provide predicted secondary structure information for the peptide (helix or beta strand) to guide the modelling:
from PANDORA.PMHC import PMHC
from PANDORA.Pandora import Pandora
from PANDORA.Database import Database
## A. Load pregenerated database of all pMHC PDBs as templates
db = Database.load('path/to/pandora_Database.pkl')
## B. Create Target object
target = PMHC.Target(id = 'myMHCIITestCase',
    allele_type = ['MH1-B*2101', 'MH1-B*2101'],
    peptide = 'TAGQSNYDRL',
    anchors = [2,10],
    helix = ['4', '9'])
## C. Perform modelling
case = Pandora.Pandora(target, db)
case.model(helix=target.helix)
Example 5: Benchmark PANDORA on multiple cases (running in parallel on multiple cores)
PANDORA can model large batches of peptides in parallel. You need to provide the following peptide information in a .tsv or .csv file:
- Peptide sequence, MHC allele name
Note: you can also add further information to your file, including anchors, templates and IDs for each case.
The Wrapper class takes care of generating the PANDORA target objects and parallelizing the modelling on the given number of cores:
from PANDORA.Pandora import Pandora
from PANDORA.Database import Database
from PANDORA.Wrapper import Wrapper
## A. Load pregenerated database of all pMHC PDBs as templates
db = Database.load('path/to/pandora_Database.pkl')
## B. Create the wrapper object
wrap = Wrapper()
## C. Create all Target Objects based on peptides in the .tsv file
wrap.create_targets('datafile.tsv', db)
## D. Perform modelling
wrap.run_pandora(num_cores=128)
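An input file for the wrapper can be produced with Python's csv module. The sketch below is not part of PANDORA, and the layout is an assumption: it writes the peptide in the first column and the allele in the second, tab-separated and without a header row. Check the PANDORA documentation for the column layout that Wrapper.create_targets actually expects.

```python
import csv


def write_cases(path, cases):
    """Write (peptide, allele) pairs as tab-separated rows, one case per line."""
    with open(path, "w", newline="") as handle:
        # Column order (peptide, then allele) is an assumption of this sketch.
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(cases)


# Example call using the two peptides from Examples 1 and 2:
#   write_cases('datafile.tsv', [('LLFGYPVYV', 'HLA-A*0201'),
#                                ('TPYDINQML', 'HLA-B*5301')])
```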
Example 6: Generating a peptide:MHC class II complex given the peptide sequence
To model a peptide:MHC class II complex, you only need to pass MHC_class='II' to PMHC.Target() (by default, MHC class I is modelled).
from PANDORA.PMHC import PMHC
from PANDORA.Pandora import Pandora
from PANDORA.Database import Database
## A. Load pregenerated database of all pMHC PDBs as templates
db = Database.load('path/to/pandora_Database.pkl')
target = PMHC.Target(id='myMHCIITestCase',
    MHC_class = 'II',
    allele_type = ['HLA-DRA*0102', 'HLA-DRA*0101', 'HLA-DRB1*0101'],
    peptide = 'GELIGILNAAKVPAD',
    anchors = [4, 7, 9, 12])
case = Pandora.Pandora(target, db)
case.model()
Note: For MHC II, no canonical anchors can be defined. The user must therefore either install and use NetMHCIIpan or pass the anchor positions directly as anchors in PMHC.Target().
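When anchors are entered by hand, it is easy to be off by one. The sketch below maps 1-based anchor positions to the residues they select so the choice can be checked by eye; anchor_residues is a hypothetical helper and not part of PANDORA.

```python
def anchor_residues(peptide, anchors):
    """Map 1-based anchor positions to the residues they select."""
    for position in anchors:
        if not 1 <= position <= len(peptide):
            raise ValueError(f"anchor {position} is outside 1..{len(peptide)}")
    # Position 1 is the first residue, so subtract 1 for Python indexing.
    return {position: peptide[position - 1] for position in anchors}


print(anchor_residues('GELIGILNAAKVPAD', [4, 7, 9, 12]))
# prints {4: 'I', 7: 'L', 9: 'A', 12: 'V'}
```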
Code Design
PANDORA is implemented with an object-oriented design (OOD), resulting in a comprehensible and user-friendly framework.
See Class Diagram
Output
The following file structure is prepared to store the output files for each case. Each modelling case is given a specific name based on the target and template IDs.
Please note that the modelling results, including the generated models, are stored by default in the ./PANDORA_files/data/outputs/ directory.
- Main outputs: molpdf_DOPE.tsv, *BL*.pdb, modeller.log
- Input files prepared for modelling: contacts_*.list, *.ali
- .py files: MODELLER scripts
- MODELLER by-product outputs (generated during the modelling): *D0*, *DL*, *IL*.pdb, *.ini, *.lrsr, *.rsr, *.sch, ...
PANDORA_files
└── data
    └── outputs                          Default directory to save output
        └── <target_name>_<template_id>  Each modelling case is given a specific name
            ├── molpdf_DOPE.tsv          Ranking of all models by MODELLER's molpdf and DOPE scoring functions
            ├── *BL*.pdb                 Final models
            ├── modeller.log             Log file generated by MODELLER, describing the modelling steps and any issues that arose during modelling
            ├── *.ali                    Alignment file between template(s) and target used for modelling
            ├── contacts_*.list          Contact restraints
            ├── MyLoop.py                MODELLER script to set loop modelling parameters for the peptide
            ├── cmd_modeller_ini.py      MODELLER script to generate an initial model to extract restraints from
            ├── cmd_modeller.py          MODELLER script to set the main modelling parameters
            ├── *.ini                    Model generated by placing the target atoms at the same coordinates as the template's atoms
            ├── *IL*.pdb                 Initial loop model
            └── ...
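Given this layout, the final models of one case can be collected with a glob on the *BL*.pdb pattern. This is a minimal sketch that relies only on the file naming shown above; list_final_models is a hypothetical helper and not part of PANDORA.

```python
from pathlib import Path


def list_final_models(case_dir):
    """Return the sorted file names of the final models (*BL*.pdb) in a case directory."""
    return sorted(path.name for path in Path(case_dir).glob("*BL*.pdb"))


# Example call, with the placeholders replaced by your own case name:
#   list_final_models('./PANDORA_files/data/outputs/<target_name>_<template_id>')
```

Initial loop models (*IL*.pdb) and other MODELLER by-products are not matched by this pattern, so only the final models are returned.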
Issues
If you have questions or find a bug, please report it in the GitHub issue tracker.