Software for analysing deformation between protein structures.
Protein Deformation Analysis (PDAnalysis)
Python package for calculating deformation between protein structures.
The code is still under active development, but it should work for most cases. If you encounter any problems, either open an Issue or send an email to John McBride.
Documentation for PDAnalysis can be found here
Description
Calculates local similarity:
- Local Distance Difference Test (LDDT)
Calculates local deformation:
- Local Distance Difference (LDD)
- Neighborhood Distance
- Effective Strain
- Shear Strain
- Non-affine Strain
Calculates global deformation:
- Root-mean-squared-deviation (RMSD)
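As an illustration of the global measure listed above, the following is a minimal NumPy sketch of RMSD after optimal superposition (the Kabsch algorithm). It is not taken from the PDAnalysis source: the function name `kabsch_rmsd` is hypothetical, and it assumes both structures are given as `(N, 3)` arrays with one coordinate per residue, in the same residue order.

```python
import numpy as np


def kabsch_rmsd(coords_a, coords_b):
    """RMSD between two (N, 3) coordinate arrays after optimal superposition."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("expected two arrays of identical shape (N, 3)")

    # Remove translation by centring both structures on their centroids.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)

    # The SVD of the covariance matrix gives the optimal rotation (Kabsch).
    u, _, vt = np.linalg.svd(a.T @ b)

    # Flip the last axis if needed so the result is a proper rotation,
    # not a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate A onto B and measure the remaining per-residue displacement.
    a_rot = a @ rot.T
    return float(np.sqrt(((a_rot - b) ** 2).sum(axis=1).mean()))
```

Because RMSD depends on a single global superposition, it can hide where a structure changes; the local measures listed above are computed per residue instead.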
Getting Started
Dependencies
- python>=3.7
- numpy
- scipy
- pandas
- Biopython
Installing
Using PDAnalysis as a module
- pip install PDAnalysis
Using PDAnalysis from the command line
- Clone this repository
git clone https://github.com/mirabdi/PDAnalysis.git
- Run
python setup.py install
- From inside the cloned repository run
python main.py
Executing program
python main.py --protA [path(s) to protein A] --protB [path(s) to protein B]
Protein Input Types:
- "--protA" :: Path to a single protein, or multiple paths separated by spaces. Allowed input types: .pdb, .cif, .npy, .txt.
- "--protB" :: Path to a single protein, or multiple paths separated by spaces. Allowed input types: .pdb, .cif, .npy, .txt.
- "--prot_listA" :: Path to a simple text file with a list of paths to proteins.
- "--prot_listB" :: Path to a simple text file with a list of paths to proteins.
Optional Arguments:
- "--method" :: Deformation method to use. Default is "strain". "all" calculates everything. Multiple methods can be used if separated by spaces. Allowed methods: "mut_dist", "strain", "shear", "non_affine", "ldd", "lddt", "rmsd". Note that "rmsd" only works for comparisons between single structures (no averaging); running "rmsd" with multiple input structures will only use one protein from A and one protein from B.
- "--min_plddt" :: Minimum pLDDT cutoff: exclude from calculations any atoms with pLDDT lower than the cutoff. Recommended value = 70.
- "--max_bfactor" :: Maximum B-factor cutoff: exclude from calculations any atoms with B-factor higher than the cutoff.
- "--neigh_cut" :: Neighbor cutoff distance in Angstroms.
- "--lddt_cutoffs" :: Specify the distance cutoffs used in LDDT calculation. Default = (0.5, 1, 2, 4).
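To make the roles of "--neigh_cut" and "--lddt_cutoffs" concrete, here is a minimal NumPy sketch of a per-residue LDDT-style score using one coordinate per residue (for example C-alpha). It is not the PDAnalysis implementation: the function name `local_lddt` and its parameter names are hypothetical, and the 15 Angstrom default for `neigh_cut` is an assumption borrowed from the inclusion radius in Mariani et al. (2013), not a documented default of this package.

```python
import numpy as np


def local_lddt(ref, model, neigh_cut=15.0, cutoffs=(0.5, 1.0, 2.0, 4.0)):
    """Per-residue fraction of neighbour distances preserved in `model`.

    `ref` and `model` are (N, 3) arrays in the same residue order.
    Residues with no neighbours within `neigh_cut` get NaN.
    """
    ref = np.asarray(ref, dtype=float)
    model = np.asarray(model, dtype=float)
    if ref.shape != model.shape:
        raise ValueError("ref and model must have the same shape")

    # All pairwise distances within each structure (no superposition needed).
    d_ref = np.linalg.norm(ref[:, None, :] - ref[None, :, :], axis=-1)
    d_mod = np.linalg.norm(model[:, None, :] - model[None, :, :], axis=-1)

    # Neighbours are defined in the reference structure; exclude self-pairs.
    neigh = (d_ref < neigh_cut) & ~np.eye(len(ref), dtype=bool)
    diff = np.abs(d_ref - d_mod)

    # For each pair, the fraction of cutoffs under which the distance is kept.
    preserved = np.mean([(diff < c) & neigh for c in cutoffs], axis=0)

    n_neigh = neigh.sum(axis=1)
    scores = np.full(len(ref), np.nan)
    has = n_neigh > 0
    scores[has] = preserved.sum(axis=1)[has] / n_neigh[has]
    return scores
```

Two identical structures score 1.0 at every residue. Smaller cutoffs make the score sensitive to subtler changes in local geometry.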
Examples:
Calculate "Effective Strain" between two PDB conformations, ignoring residues with pLDDT < 70:
python main.py --protA test_data/Lysozyme/AF-P61626-F1-model_v4.pdb --protB test_data/Lysozyme/AF-P79180-F1-model_v4.pdb --min_plddt 70
Calculate "LDDT" between one PDB file and an averaged configuration of two PDB files, using a neighbor cutoff distance of 10 Angstroms:
python main.py --protA test_data/GFP/GFP_mutant.pdb --protB test_data/GFP/GFP_WT.pdb test_data/GFP/GFP_WT_ver2.pdb --neigh_cut 10 --method lddt
Calculate "Effective Strain" and "Shear Strain" between two lists of PDB files (averaging over all proteins in each list), using a neighbor cutoff distance of 12 Angstroms and ignoring residues with pLDDT < 70:
python main.py --prot_listA test_data/GFP/mut_all.txt --prot_listB test_data/GFP/wt_all.txt --neigh_cut 12 --min_plddt 70 --method strain shear
Calculate all measures between two PDB conformations, using alternative LDDT cutoff values:
python main.py --protA test_data/Lysozyme/AF-P61626-F1-model_v4.pdb --protB test_data/Lysozyme/AF-P79180-F1-model_v4.pdb --method all --lddt_cutoffs 0.125 0.25 0.5 1
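The "--min_plddt" examples above rely on the fact that AlphaFold model files store per-residue pLDDT in the B-factor column of each ATOM record. The sketch below shows how that column can be read to see which residues a cutoff of 70 would keep. It is a standalone illustration using only the fixed-width PDB column layout; the function name `ca_plddt_mask` is hypothetical and this is not how PDAnalysis itself reads files.

```python
def ca_plddt_mask(pdb_text, min_plddt=70.0):
    """Return (residue_number, pLDDT, keep) for each C-alpha ATOM record."""
    rows = []
    for line in pdb_text.splitlines():
        # PDB fixed-width columns (1-indexed): atom name 13-16,
        # residue number 23-26, B-factor 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            resnum = int(line[22:26])
            plddt = float(line[60:66])
            rows.append((resnum, plddt, plddt >= min_plddt))
    return rows
```

For instance, `ca_plddt_mask(open("test_data/Lysozyme/AF-P61626-F1-model_v4.pdb").read())` lists each residue with its pLDDT and whether it passes the cutoff. For experimental structures the same column holds the B-factor, which is what "--max_bfactor" filters on.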
Help
For more detailed help on running the code:
python main.py --help
Authors
Version History
- 0.0.0
  - Pre-release
- 0.0.1
  - First working release
License
This project is licensed under the MIT License - see the LICENSE.md file for details
Acknowledgments
- If you use PDAnalysis in your work, please cite McBride, J. M., Polev, K., Abdirasulov, A., Reinharz, V., Grzybowski, B. A., & Tlusty, T. AlphaFold2 can predict single-mutation effects.
- If using LDDT, please cite Mariani, Valerio, Marco Biasini, Alessandro Barbato, and Torsten Schwede (2013), lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests, Bioinformatics 29 (21), 2722–2728.
- The code for Shear Strain was provided by Jacques Rougemont. If using this method, please cite Eckmann, J P, J. Rougemont, and T. Tlusty (2019), Colloquium: Proteins: The physics of amorphous evolving matter, Rev. Mod. Phys. 91, 031001.
- If using Non-affine Strain, please cite Falk, M L, and J. S. Langer (1998), Dynamics of viscoplastic deformation in amorphous solids, Phys. Rev. E 57, 7192–7205. We also give credit to matscipy for an earlier implementation.