`AF analysis` is a Python library for analyzing AlphaFold results.

[![Documentation Status](https://readthedocs.org/projects/af2-analysis/badge/?version=latest)](https://af2-analysis.readthedocs.io/en/latest/?badge=latest) [![codecov](https://codecov.io/gh/samuelmurail/af_analysis/graph/badge.svg?token=WOJYQKKOP7)](https://codecov.io/gh/samuelmurail/af_analysis) [![Build Status](https://dev.azure.com/samuelmurailRPBS/af_analysis/_apis/build/status%2Fsamuelmurail.af_analysis?branchName=main)](https://dev.azure.com/samuelmurailRPBS/af_analysis/_build/latest?definitionId=2&branchName=main) [![PyPI - Version](https://img.shields.io/pypi/v/af-analysis)](https://pypi.org/project/af-analysis/) [![Downloads](https://static.pepy.tech/badge/af2-analysis)](https://pepy.tech/project/af2-analysis)

# About Alphafold Analysis

<img src="https://raw.githubusercontent.com/samuelmurail/af_analysis/master/docs/source/logo.jpeg" alt="AF Analysis Logo" width="200" style="display: block; margin: auto;"/>

af-analysis is a Python package for the analysis of AlphaFold protein structure predictions. It is designed to simplify and streamline work with protein structures generated by [AlphaFold 2][AF2], [AlphaFold 3][AF3] and derivatives such as [ColabFold][ColabFold], [AlphaFold-Multimer][AF2-M] and [AlphaPulldown][AlphaPulldown].

## Statement of Need

AlphaFold 2 and its derivatives have revolutionized protein structure prediction, achieving remarkable accuracy. However, analyzing the large number of resulting structural models can be challenging and time-consuming. Existing tools often require separate scripts to calculate the various quality metrics (pDockQ, pDockQ2, LIS score) and to assess model diversity. af-analysis addresses these challenges by providing a unified and user-friendly framework for in-depth analysis of AlphaFold 2 results.

## Main features:

  • Import AlphaFold or ColabFold prediction directories as pandas DataFrames for efficient data handling.

  • Calculate and add additional structural quality metrics to the DataFrame, including:
    • pDockQ

    • pDockQ2

    • LIS score

  • Visualize predicted protein models.

  • Cluster generated models to identify diverse conformations.

  • Select the best models based on defined criteria.

  • Add your custom metrics to the DataFrame for further analysis.

## Installation

  • af-analysis is available on PyPI and can be installed using pip:

```bash
pip install af_analysis
```

  • You can install the latest version from the GitHub repository:

```bash
pip install git+https://github.com/samuelmurail/af_analysis.git@main
```

  • af-analysis can also be installed by cloning the GitHub repository:

```bash
git clone https://github.com/samuelmurail/af_analysis
cd af_analysis
pip install .
```

## Documentation

The full documentation is available at [ReadTheDocs](https://af-analysis.readthedocs.io/en/latest/).

## Usage

### Importing data

Create the `Data` object, giving the path of the directory containing the results of the AlphaFold 2/ColabFold run.

```python
import af_analysis

my_data = af_analysis.Data('MY_AF_RESULTS_DIR')
```

Extracted data are available in the `df` attribute of the `Data` object.

```python
my_data.df
```
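
Since the prediction data are imported as a pandas DataFrame, ordinary pandas operations can be used to rank models, filter them, or add a custom column. The sketch below uses a small toy DataFrame as a stand-in for `my_data.df`. Only the `ranking_confidence` column name comes from this README; `model_name`, `mean_plddt`, `confidence_rank` and the 0.7 threshold are hypothetical and chosen for illustration.

```python
import pandas as pd

# Toy stand-in for my_data.df. Only 'ranking_confidence' is named in this README;
# the other columns are hypothetical.
df = pd.DataFrame(
    {
        "model_name": ["model_1", "model_2", "model_3"],
        "ranking_confidence": [0.71, 0.88, 0.65],
        "mean_plddt": [82.0, 90.5, 78.3],
    }
)

# Add a custom metric: rank of each model by ranking confidence (1 = best).
df["confidence_rank"] = df["ranking_confidence"].rank(ascending=False).astype(int)

# Select models above an arbitrary threshold, best first.
best = df[df["ranking_confidence"] > 0.7].sort_values(
    "ranking_confidence", ascending=False
)

# Index label of the single best model, as used in the plotting examples.
best_index = df["ranking_confidence"].idxmax()

print(best[["model_name", "ranking_confidence", "confidence_rank"]])
```

The same pattern should carry over to the real DataFrame, provided the column names are adapted to those actually present in `my_data.df`.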

### Analysis

  • The `analysis` package contains several functions to add metrics such as [pDockQ][pdockq] and [pDockQ2][pdockq2]:

```python
from af_analysis import analysis

analysis.pdockq(my_data)
analysis.pdockq2(my_data)
```
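
For intuition about what the pDockQ value represents: in Bryant et al., pDockQ is a sigmoid fitted to the mean interface pLDDT multiplied by the log10 of the number of interface contacts. The sketch below illustrates that formula only and is not the code used by `analysis.pdockq`. The function name is hypothetical, the constants are the fitted parameters as recalled from the paper and should be checked against it, and returning 0 for an empty interface is a choice made in this sketch.

```python
import math

# Sigmoid parameters as reported by Bryant et al. (2022).
# Assumption: values recalled from the paper; verify before relying on them.
L_MAX, X0, K, B = 0.724, 152.611, 0.052, 0.018


def pdockq_from_interface(mean_interface_plddt: float, n_contacts: int) -> float:
    """Hypothetical helper: pDockQ from precomputed interface statistics."""
    if n_contacts < 1:
        # No interface contacts: this sketch returns 0 rather than evaluating the sigmoid.
        return 0.0
    x = mean_interface_plddt * math.log10(n_contacts)
    return L_MAX / (1.0 + math.exp(-K * (x - X0))) + B


# A confident, well-packed interface scores higher than a sparse, uncertain one.
print(pdockq_from_interface(90.0, 100))
print(pdockq_from_interface(60.0, 10))
```

Computing the inputs (which residues form the interface, and how many contacts they make) requires the model coordinates, which is the part that the library handles.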

### Docking Analysis

  • The `docking` package contains several functions to add metrics such as the [LIS Score][LIS]:

```python
from af_analysis import docking

docking.LIS_pep(my_data)
```
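
As a rough illustration of the idea behind the LIS score: as described by Kim et al., inter-chain PAE values below a 12 Å cutoff are rescaled to the 0 to 1 range as `(cutoff - PAE) / cutoff` and then averaged. The sketch below implements that rescaling for one direction of one chain pair, on a PAE matrix given as nested lists. It is not the implementation behind `docking.LIS_pep`. The function name and arguments are hypothetical, and the cutoff and definition are assumptions that should be checked against the paper.

```python
from typing import List, Sequence


def lis_one_direction(
    pae: Sequence[Sequence[float]],
    chain_a: List[int],
    chain_b: List[int],
    cutoff: float = 12.0,
) -> float:
    """Hypothetical helper: mean rescaled PAE over (a, b) pairs below the cutoff."""
    scores = [
        (cutoff - pae[i][j]) / cutoff
        for i in chain_a
        for j in chain_b
        if pae[i][j] < cutoff
    ]
    # No confident inter-chain pair: report 0 in this sketch.
    return sum(scores) / len(scores) if scores else 0.0


# Toy 4x4 PAE matrix: residues 0-1 belong to chain A, residues 2-3 to chain B.
toy_pae = [
    [0.0, 2.0, 6.0, 20.0],
    [2.0, 0.0, 3.0, 15.0],
    [7.0, 4.0, 0.0, 1.0],
    [25.0, 18.0, 1.0, 0.0],
]

print(lis_one_direction(toy_pae, chain_a=[0, 1], chain_b=[2, 3]))
```

Because the PAE matrix is not symmetric, the A to B and B to A blocks generally give different values, so a full score for a chain pair would need to combine both directions.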

### Plots

  • As a first step, the user can visualize the pLDDT, the PAE matrix and the model scores. The `show_info()` function displays the model scores, together with the pLDDT plot and the PAE matrix, in an interactive way.

<img src="https://raw.githubusercontent.com/samuelmurail/af_analysis/master/docs/source/_static/show_info.gif" alt="Interactive Visualization" width="100%" style="display: block; margin: auto;"/>

  • Plot the MSA, pLDDT and PAE:

```python
my_data.plot_msa()
my_data.plot_plddt([0, 1])

# Index of the model with the highest ranking confidence
best_model_index = my_data.df['ranking_confidence'].idxmax()
my_data.plot_pae(best_model_index)
```

  • Show the 3D structure (requires the nglview package):

```python
my_data.show_3d(my_data.df['ranking_confidence'].idxmax())
```

## Dependencies

af_analysis requires the following dependencies:

  • pdb_numpy

  • pandas

  • numpy

  • tqdm

  • seaborn

  • cmcrameri

  • nglview

  • ipywidgets

  • mdanalysis

## Contributing

af-analysis is an open-source project and contributions are welcome. If you find a bug or have a feature request, please open an issue on the GitHub repository at https://github.com/samuelmurail/af_analysis. If you would like to contribute code, please fork the repository and submit a pull request.

## Authors

See also the list of [contributors](https://github.com/samuelmurail/af_analysis/contributors) who participated in this project.

## License

This project is licensed under the GNU General Public License version 2 - see the LICENSE file for details.

# References

  • Jumper et al. Nature (2021) doi: [10.1038/s41586-021-03819-2][AF2]

  • Abramson et al. Nature (2024) doi: [10.1038/s41586-024-07487-w][AF3]

  • Mirdita et al. Nature Methods (2022) doi: [10.1038/s41592-022-01488-1][ColabFold]

  • Evans et al. bioRxiv (2021) doi: [10.1101/2021.10.04.463034][AF2-M]

  • Bryant et al. Nat. Commun. (2022) doi: [10.1038/s41467-022-28865-w][pdockq]

  • Zhu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btad424][pdockq2]

  • Kim et al. bioRxiv (2024) doi: [10.1101/2024.02.19.580970][LIS]

  • Yu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btac749][AlphaPulldown]

[AF2]: https://www.nature.com/articles/s41586-021-03819-2 "Jumper et al. Nature (2021) doi: 10.1038/s41586-021-03819-2"
[AF3]: https://www.nature.com/articles/s41586-024-07487-w "Abramson et al. Nature (2024) doi: 10.1038/s41586-024-07487-w"
[ColabFold]: https://www.nature.com/articles/s41592-022-01488-1 "Mirdita et al. Nat Methods (2022) doi: 10.1038/s41592-022-01488-1"
[AF2-M]: https://www.biorxiv.org/content/10.1101/2021.10.04.463034v2 "Evans et al. bioRxiv (2021) doi: 10.1101/2021.10.04.463034"
[pdockq]: https://www.nature.com/articles/s41467-022-28865-w "Bryant et al. Nat Commun (2022) doi: 10.1038/s41467-022-28865-w"
[pdockq2]: https://academic.oup.com/bioinformatics/article/39/7/btad424/7219714 "Zhu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btad424"
[LIS]: https://www.biorxiv.org/content/10.1101/2024.02.19.580970v1 "Kim et al. bioRxiv (2024) doi: 10.1101/2024.02.19.580970"
[AlphaPulldown]: https://doi.org/10.1093/bioinformatics/btac749 "Yu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btac749"
