`AF analysis` is a Python library for analyzing AlphaFold results.
[![Documentation Status](https://readthedocs.org/projects/af2-analysis/badge/?version=latest)](https://af2-analysis.readthedocs.io/en/latest/?badge=latest) [![codecov](https://codecov.io/gh/samuelmurail/af_analysis/graph/badge.svg?token=WOJYQKKOP7)](https://codecov.io/gh/samuelmurail/af_analysis) [![Build Status](https://dev.azure.com/samuelmurailRPBS/af_analysis/_apis/build/status%2Fsamuelmurail.af_analysis?branchName=main)](https://dev.azure.com/samuelmurailRPBS/af_analysis/_build/latest?definitionId=2&branchName=main) [![PyPI - Version](https://img.shields.io/pypi/v/af-analysis)](https://pypi.org/project/af-analysis/) [![Downloads](https://static.pepy.tech/badge/af2-analysis)](https://pepy.tech/project/af2-analysis)
# About Alphafold Analysis
<img src="https://raw.githubusercontent.com/samuelmurail/af_analysis/master/docs/source/logo.jpeg" alt="AF Analysis Logo" width="200" style="display: block; margin: auto;"/>
af-analysis is a Python package for the analysis of AlphaFold protein structure predictions. It is designed to simplify and streamline work with protein structures generated by [AlphaFold 2][AF2], [AlphaFold 3][AF3] and their derivatives such as [ColabFold][ColabFold], [AlphaFold-Multimer][AF2-M] and [AlphaPulldown][AlphaPulldown].

- Source code repository: [https://github.com/samuelmurail/af_analysis](https://github.com/samuelmurail/af_analysis)
## Statement of Need
AlphaFold 2 and its derivatives have revolutionized protein structure prediction, achieving remarkable accuracy. Analyzing the abundance of resulting structural models can be challenging and time-consuming. Existing tools often require separate scripts for calculating various quality metrics (pDockQ, pDockQ2, LIS score) and assessing model diversity. af-analysis addresses these challenges by providing a unified and user-friendly framework for in-depth analysis of AlphaFold 2 results.
## Main features:
- Import AlphaFold or ColabFold prediction directories as pandas DataFrames for efficient data handling.
- Calculate and add additional structural quality metrics to the DataFrame, including:
  - pDockQ
  - pDockQ2
  - LIS score
- Visualize predicted protein models.
- Cluster generated models to identify diverse conformations.
- Select the best models based on defined criteria.
- Add your custom metrics to the DataFrame for further analysis.
## Installation
af-analysis is available on PyPI and can be installed using pip:
```bash
pip install af_analysis
```
You can also install the latest version directly from the GitHub repository:
```bash
pip install git+https://github.com/samuelmurail/af_analysis.git@main
```
Alternatively, clone the repository and install af-analysis from the local copy:
```bash
git clone https://github.com/samuelmurail/af_analysis
cd af_analysis
pip install .
```
## Documentation
The full documentation is available at [ReadTheDocs](https://af-analysis.readthedocs.io/en/latest/).
## Usage
### Importing data
Create a `Data` object by passing the path to the directory that contains the results of the AlphaFold 2 or ColabFold run.
```python
import af_analysis

my_data = af_analysis.Data('MY_AF_RESULTS_DIR')
```
The extracted data are available in the `df` attribute of the `Data` object.
```python
my_data.df
```
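Because the results are imported as a pandas DataFrame, the usual pandas tools can be used to inspect them. The sketch below runs on a small stand-in DataFrame rather than on real output, so it is self-contained. The `ranking_confidence` column is the one used in the plotting examples of this README; the `query` and `model` columns and all values are hypothetical.

```python
import pandas as pd

# Stand-in for my_data.df. Column names other than 'ranking_confidence'
# and all values are made up for illustration.
df = pd.DataFrame(
    {
        "query": ["complex_A"] * 3 + ["complex_B"] * 2,
        "model": [1, 2, 3, 1, 2],
        "ranking_confidence": [0.61, 0.83, 0.47, 0.72, 0.55],
    }
)

# Rank all models from most to least confident.
ranked = df.sort_values("ranking_confidence", ascending=False)

# Keep the single best model of each query.
best_per_query = df.loc[df.groupby("query")["ranking_confidence"].idxmax()]

print(ranked.head())
print(best_per_query)
```

With real data, replacing `df` by `my_data.df` should be enough, provided the columns of interest exist under the names used here.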
### Analysis
The `analysis` module contains several functions that add metrics such as [pDockQ][pdockq] and [pDockQ2][pdockq2]:
```python
from af_analysis import analysis

analysis.pdockq(my_data)
analysis.pdockq2(my_data)
```
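For readers unfamiliar with pDockQ, the following is a minimal sketch of the sigmoid described by Bryant et al. (2022), which maps the mean interface pLDDT and the number of interface contacts to a score. It is not taken from the af-analysis source and does not claim to reproduce `analysis.pdockq`. The function name is hypothetical, and the constants are quoted from the paper as best recalled, so they should be checked against the reference before use.

```python
import math

# Sigmoid parameters as reported by Bryant et al. (2022). Treat them as an
# assumption and verify against the paper before relying on them.
L_MAX, X0, K, B = 0.724, 152.611, 0.052, 0.018


def pdockq_from_interface(mean_interface_plddt: float, n_contacts: int) -> float:
    """Hypothetical helper: pDockQ from interface pLDDT and contact count."""
    if n_contacts <= 0:
        # No interface contacts: this sketch returns 0 by convention.
        return 0.0
    # The score depends on pLDDT scaled by the log of the contact count.
    x = mean_interface_plddt * math.log10(n_contacts)
    return L_MAX / (1.0 + math.exp(-K * (x - X0))) + B


# A confident, well-packed interface scores higher than a weak one.
print(pdockq_from_interface(90.0, 100))
print(pdockq_from_interface(50.0, 10))
```

The sigmoid is bounded between `B` and `L_MAX + B`, so scores from this sketch never reach 1.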
### Docking Analysis
The `docking` module contains several functions that add metrics such as the [LIS score][LIS]:
```python
from af_analysis import docking

docking.LIS_pep(my_data)
```
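Since the computed metrics are added to the DataFrame, they can be combined into custom metrics and used to select models with plain pandas. The sketch below again uses a stand-in DataFrame; the `pdockq` column name, the `custom_score` definition and both thresholds are assumptions chosen for illustration, not values recommended by the package.

```python
import pandas as pd

# Stand-in for my_data.df after metrics have been added. The 'pdockq'
# column name and all values are hypothetical.
df = pd.DataFrame(
    {
        "ranking_confidence": [0.83, 0.47, 0.72],
        "pdockq": [0.62, 0.21, 0.45],
    }
)

# Custom metric: plain average of two existing scores (illustrative only).
df["custom_score"] = (df["ranking_confidence"] + df["pdockq"]) / 2

# Select models that pass both (arbitrary) thresholds.
selected = df[(df["ranking_confidence"] >= 0.5) & (df["pdockq"] >= 0.23)]

print(selected.sort_values("custom_score", ascending=False))
```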
### Plots
As a first step, you can visualize the pLDDT, the PAE matrix and the model scores. The `show_info()` function displays the model scores together with the pLDDT plot and the PAE matrix in an interactive way.
<img src="https://raw.githubusercontent.com/samuelmurail/af_analysis/master/docs/source/_static/show_info.gif" alt="Interactive Visualization" width="100%" style="display: block; margin: auto;"/>
Plot the MSA, the pLDDT and the PAE:
```python
my_data.plot_msa()
my_data.plot_plddt([0, 1])
best_model_index = my_data.df['ranking_confidence'].idxmax()
my_data.plot_pae(best_model_index)
```
Show the 3D structure (requires the nglview package):
```python
my_data.show_3d(my_data.df['ranking_confidence'].idxmax())
```
## Dependencies
af_analysis requires the following dependencies:
- pdb_numpy
- pandas
- numpy
- tqdm
- seaborn
- cmcrameri
- nglview
- ipywidgets
- mdanalysis
## Contributing
af-analysis is an open-source project and contributions are welcome. If you find a bug or have a feature request, please open an issue on the GitHub repository at https://github.com/samuelmurail/af_analysis. If you would like to contribute code, please fork the repository and submit a pull request.
## Authors
- Alaa Regei, Graduate Student - [Université Paris Cité](https://u-paris.fr).
- [Samuel Murail](https://samuelmurail.github.io/PersonalPage/), Associate Professor - [Université Paris Cité](https://u-paris.fr), [CMPLI](http://bfa.univ-paris-diderot.fr/equipe-8/).
See also the list of [contributors](https://github.com/samuelmurail/af_analysis/contributors) who participated in this project.
## License
This project is licensed under the GNU General Public License version 2 - see the LICENSE file for details.
# References
- Jumper et al. Nature (2021) doi: [10.1038/s41586-021-03819-2][AF2]
- Abramson et al. Nature (2024) doi: [10.1038/s41586-024-07487-w][AF3]
- Mirdita et al. Nat. Methods (2022) doi: [10.1038/s41592-022-01488-1][ColabFold]
- Evans et al. bioRxiv (2021) doi: [10.1101/2021.10.04.463034][AF2-M]
- Bryant et al. Nat. Commun. (2022) doi: [10.1038/s41467-022-28865-w][pdockq]
- Zhu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btad424][pdockq2]
- Kim et al. bioRxiv (2024) doi: [10.1101/2024.02.19.580970][LIS]
- Yu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btac749][AlphaPulldown]

[AF2]: https://www.nature.com/articles/s41586-021-03819-2 "Jumper et al. Nature (2021) doi: 10.1038/s41586-021-03819-2"
[AF3]: https://www.nature.com/articles/s41586-024-07487-w "Abramson et al. Nature (2024) doi: 10.1038/s41586-024-07487-w"
[ColabFold]: https://www.nature.com/articles/s41592-022-01488-1 "Mirdita et al. Nat Methods (2022) doi: 10.1038/s41592-022-01488-1"
[AF2-M]: https://www.biorxiv.org/content/10.1101/2021.10.04.463034v2 "Evans et al. bioRxiv (2021) doi: 10.1101/2021.10.04.463034"
[pdockq]: https://www.nature.com/articles/s41467-022-28865-w "Bryant et al. Nat Commun (2022) doi: 10.1038/s41467-022-28865-w"
[pdockq2]: https://academic.oup.com/bioinformatics/article/39/7/btad424/7219714 "Zhu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btad424"
[LIS]: https://www.biorxiv.org/content/10.1101/2024.02.19.580970v1 "Kim et al. bioRxiv (2024) doi: 10.1101/2024.02.19.580970"
[AlphaPulldown]: https://doi.org/10.1093/bioinformatics/btac749 "Yu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btac749"