
`AF2 analysis` is a Python library for analyzing AlphaFold results.


[![Documentation Status](https://readthedocs.org/projects/af2-analysis/badge/?version=latest)](https://af2-analysis.readthedocs.io/en/latest/?badge=latest) [![codecov](https://codecov.io/gh/samuelmurail/af2_analysis/graph/badge.svg?token=WOJYQKKOP7)](https://codecov.io/gh/samuelmurail/af2_analysis) [![Build Status](https://dev.azure.com/samuelmurailRPBS/af2_analysis/_apis/build/status%2Fsamuelmurail.af2_analysis?branchName=main)](https://dev.azure.com/samuelmurailRPBS/af2_analysis/_build/latest?definitionId=2&branchName=main) [![PyPI - Version](https://img.shields.io/pypi/v/af2-analysis)](https://pypi.org/project/af2-analysis/) [![Downloads](https://static.pepy.tech/badge/af2-analysis)](https://pepy.tech/project/af2-analysis)

# About Alphafold2 Analysis

<img src="https://raw.githubusercontent.com/samuelmurail/af2_analysis/master/docs/source/logo.jpeg" alt="AF2 Analysis Logo" width="200" style="display: block; margin: auto;"/>

af2-analysis is a Python package for the analysis of AlphaFold protein structure predictions. It is designed to simplify and streamline work with protein structures generated by [AlphaFold 2][AF2], [AlphaFold 3][AF3] and their derivatives, such as [ColabFold][ColabFold], [AlphaFold-Multimer][AF2-M] and [AlphaPulldown][AlphaPulldown].

## Statement of Need

AlphaFold 2 and its derivatives have revolutionized protein structure prediction, achieving remarkable accuracy. Analyzing the abundance of resulting structural models can be challenging and time-consuming. Existing tools often require separate scripts for calculating various quality metrics (pDockQ, pDockQ2, LIS score) and assessing model diversity. af2-analysis addresses these challenges by providing a unified and user-friendly framework for in-depth analysis of AlphaFold 2 results.

## Main features:

  • Import AlphaFold or ColabFold prediction directories as pandas DataFrames for efficient data handling.

  • Calculate and add additional structural quality metrics to the DataFrame, including:
    • pDockQ

    • pDockQ2

    • LIS score

  • Visualize predicted protein models.

  • Cluster generated models to identify diverse conformations.

  • Select the best models based on defined criteria.

  • Add your custom metrics to the DataFrame for further analysis.

## Installation

  • af2-analysis is available on PyPI and can be installed using pip:

```bash
pip install af2_analysis
```

  • You can install the latest version directly from the GitHub repository:

```bash
pip install git+https://github.com/samuelmurail/af2_analysis.git@main
```

  • AF2-Analysis can also be installed from a local clone of the GitHub repository:

```bash
git clone https://github.com/samuelmurail/af2_analysis
cd af2_analysis
pip install .
```

## Documentation

The full documentation is available at [ReadTheDocs](https://af2-analysis.readthedocs.io/en/latest/).

## Usage

### Importing data

Create the `Data` object by passing the path of the directory that contains the results of the AlphaFold 2 or ColabFold run.

```python
import af2_analysis

my_data = af2_analysis.Data('MY_AF2_RESULTS_DIR')
```

Extracted data are available in the `df` attribute of the `Data` object.

```python
my_data.df
```
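Because `df` is a pandas DataFrame, models can be filtered and ranked with ordinary pandas operations. The sketch below uses a small stand-in DataFrame rather than real results: the `ranking_confidence` column appears elsewhere in this README, while the `query` and `model` columns and the 0.5 threshold are assumptions made for illustration.

```python
import pandas as pd

# Stand-in for `my_data.df`; real column names and values may differ.
df = pd.DataFrame(
    {
        "query": ["complex_A", "complex_A", "complex_B", "complex_B"],  # assumed column
        "model": [1, 2, 1, 2],  # assumed column
        "ranking_confidence": [0.71, 0.84, 0.42, 0.39],
    }
)

# Keep only models above a confidence threshold (value chosen arbitrarily).
confident = df[df["ranking_confidence"] >= 0.5]

# Pick the highest-ranked model for each query.
best_idx = df.groupby("query")["ranking_confidence"].idxmax()
best_per_query = df.loc[best_idx]
```

The same pattern should apply to any numeric metric column that is later added to the DataFrame.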

### Analysis

  • The `analysis` module contains several functions that add metrics such as [pdockQ][pdockq] and [pdockQ2][pdockq2]:

```python
from af2_analysis import analysis

analysis.pdockq(my_data)
analysis.pdockq2(my_data)
```
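For orientation, pDockQ maps the mean interface pLDDT and the number of interface contacts onto a sigmoid. The function below is a minimal sketch of that final step only, using the fit constants as reported by Bryant et al.; it is not the code used by af2-analysis. The function name is hypothetical, the interface detection itself is left out, and the constants should be checked against the paper before relying on them.

```python
import math


def pdockq_from_interface(mean_interface_plddt: float, n_interface_contacts: int) -> float:
    """Sigmoid step of pDockQ (hypothetical helper, constants from Bryant et al.)."""
    if n_interface_contacts < 1:
        # Without any interface contact the score is set to zero.
        return 0.0
    # Combined descriptor: mean interface pLDDT scaled by log10 of the contact count.
    x = mean_interface_plddt * math.log10(n_interface_contacts)
    # Sigmoid with L = 0.724, x0 = 152.611, k = 0.052, b = 0.018.
    return 0.724 / (1.0 + math.exp(-0.052 * (x - 152.611))) + 0.018
```

The score rises with both the interface confidence and the interface size, so a large but poorly predicted interface and a small but confident one can receive similar values.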

### Docking Analysis

  • The `docking` module contains several functions that add metrics such as the [LIS Score][LIS]:

```python
from af2_analysis import docking

docking.LIS_pep(my_data)
```
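Custom metrics can be added in the same spirit with plain pandas, by computing a new column from existing ones. The sketch below assumes a hypothetical `plddt` column holding per-residue scores for each model; the column name, the helper function and the cutoff of 70 are illustrative and not part of the af2-analysis API.

```python
import pandas as pd

# Stand-in for `my_data.df` with a hypothetical per-residue `plddt` column.
df = pd.DataFrame(
    {
        "model": [1, 2],
        "plddt": [[92.0, 88.0, 45.0], [70.0, 65.0, 60.0]],
    }
)


def fraction_confident(plddt_values, cutoff=70.0):
    """Return the fraction of residues with pLDDT at or above `cutoff`."""
    return sum(v >= cutoff for v in plddt_values) / len(plddt_values)


# Store the custom metric as a new column next to the existing ones.
df["frac_confident"] = df["plddt"].apply(fraction_confident)
```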

### Plots

  • As a first step, you can visualize the pLDDT, the PAE matrix and the model scores. The `show_info()` function displays the model scores together with the pLDDT plot and the PAE matrix in an interactive way.

<img src="https://raw.githubusercontent.com/samuelmurail/af2_analysis/master/docs/source/_static/show_info.gif" alt="Interactive Visualization" width="100%" style="display: block; margin: auto;"/>

  • Plot the MSA, the pLDDT and the PAE:

```python
my_data.plot_msa()
my_data.plot_plddt([0,1])

# Index of the model with the highest ranking confidence
best_model_index = my_data.df['ranking_confidence'].idxmax()
my_data.plot_pae(best_model_index)
```

  • Show the 3D structure (requires the nglview package):

```python
my_data.show_3d(my_data.df['ranking_confidence'].idxmax())
```
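Since the 3D view depends on nglview, it can help to check that the package is importable before calling `show_3d()`, for example in a shared notebook. The helper below is a small sketch using only the standard library; `has_module` is a hypothetical name, not a function of af2-analysis.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


if not has_module("nglview"):
    # Only a hint for the user; nothing is installed automatically.
    print("nglview is not installed; try `pip install nglview` to enable the 3D view.")
```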

## Dependencies

af2_analysis requires the following dependencies:

  • pdb_numpy

  • pandas

  • numpy

  • tqdm

  • seaborn

  • cmcrameri

  • nglview

  • ipywidgets

  • mdanalysis

## Contributing

af2-analysis is an open-source project and contributions are welcome. If you find a bug or have a feature request, please open an issue on the GitHub repository at https://github.com/samuelmurail/af2_analysis. If you would like to contribute code, please fork the repository and submit a pull request.

## Authors

See also the list of [contributors](https://github.com/samuelmurail/af2_analysis/contributors) who participated in this project.

## License

This project is licensed under the GNU General Public License v2.0 - see the LICENSE file for details.

# References

  • Jumper et al. Nature (2021) doi: [10.1038/s41586-021-03819-2][AF2]

  • Abramson et al. Nature (2024) doi: [10.1038/s41586-024-07487-w][AF3]

  • Mirdita et al. Nature Methods (2022) doi: [10.1038/s41592-022-01488-1][ColabFold]

  • Evans et al. bioRxiv (2021) doi: [10.1101/2021.10.04.463034][AF2-M]

  • Bryant et al. Nat. Commun. (2022) doi: [10.1038/s41467-022-28865-w][pdockq]

  • Zhu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btad424][pdockq2]

  • Kim et al. bioRxiv (2024) doi: [10.1101/2024.02.19.580970][LIS]

  • Yu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btac749][AlphaPulldown]

[AF2]: https://www.nature.com/articles/s41586-021-03819-2 "Jumper et al. Nature (2021) doi: 10.1038/s41586-021-03819-2"
[AF3]: https://www.nature.com/articles/s41586-024-07487-w "Abramson et al. Nature (2024) doi: 10.1038/s41586-024-07487-w"
[ColabFold]: https://www.nature.com/articles/s41592-022-01488-1 "Mirdita et al. Nat Methods (2022) doi: 10.1038/s41592-022-01488-1"
[AF2-M]: https://www.biorxiv.org/content/10.1101/2021.10.04.463034v2 "Evans et al. bioRxiv (2021) doi: 10.1101/2021.10.04.463034"
[pdockq]: https://www.nature.com/articles/s41467-022-28865-w "Bryant et al. Nat Commun (2022) doi: 10.1038/s41467-022-28865-w"
[pdockq2]: https://academic.oup.com/bioinformatics/article/39/7/btad424/7219714 "Zhu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btad424"
[LIS]: https://www.biorxiv.org/content/10.1101/2024.02.19.580970v1 "Kim et al. bioRxiv (2024) doi: 10.1101/2024.02.19.580970"
[AlphaPulldown]: https://doi.org/10.1093/bioinformatics/btac749 "Yu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btac749"
