rafm

Project description

AlphaFold model and two crystal structures of calmodulin

rafm computes per-model measures associated with atomic-level accuracy for AlphaFold models from pLDDT confidence scores. Output is written to a tab-separated file.

Installation

You can install rafm via pip from PyPI:

$ pip install rafm

Usage

rafm --help lists all commands. Current commands are:

  • plddt-stats

    Calculate stats on bounded pLDDTs from a list of AlphaFold model files in PDB format.

    Options:

    • --criterion FLOAT

      The cutoff value on truncated pLDDT for possible utility. [default: 91.2]

    • --min-length INTEGER

      The minimum sequence length for which to calculate truncated stats. [default: 20]

    • --min-count INTEGER

      The minimum number of truncated pLDDT values for which to calculate stats. [default: 20]

    • --lower-bound INTEGER

      The pLDDT value below which stats will not be calculated. [default: 80]

    • --upper-bound INTEGER

      The pLDDT value above which stats will not be calculated. [default: 100]

    • --file-stem TEXT

      Output file name stem. [default: rafm]

    Output columns (where NN is the bounds specifier, default: 80):

    • residues_in_pLDDT

      The number of residues in the AlphaFold model.

    • pLDDT_mean

      The mean value of pLDDT over all residues.

    • pLDDT_median

      The median value of pLDDT over all residues.

    • pLDDTNN_count

      The number of residues within bounds.

    • pLDDTNN_frac

      The fraction of pLDDT values within bounds, if the count is greater than the minimum.

    • pLDDTNN_mean

      The mean of pLDDT values within bounds, if the count is greater than the minimum.

    • pLDDTNN_median

      The median of pLDDT values within bounds, if the count is greater than the minimum.

    • LDDT_expect

      The expectation value of global LDDT over the residues with LDDT within bounds. Only produced if default bounds are used.

    • passing

      True if the model passed the criterion, False otherwise. Only produced if default bounds are used.

    • file

      The path to the model file.
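The column definitions above can be illustrated with a minimal sketch. This is not rafm's actual implementation: the function name `plddt_stats` is hypothetical, the bounds are assumed to be inclusive, the minimums are assumed to be met when the length or count is at least the minimum, and `passing` is assumed to compare the bounded median against the criterion. LDDT_expect is omitted because its formula is not given here.

```python
from statistics import mean, median


def plddt_stats(plddts, lower_bound=80, upper_bound=100,
                min_length=20, min_count=20, criterion=91.2):
    """Summarize per-residue pLDDT values for one model (illustrative only)."""
    if not plddts:
        raise ValueError("need at least one pLDDT value")
    nn = f"pLDDT{lower_bound}"  # NN in the column names is the lower bound
    # Assumption: both bounds are inclusive.
    bounded = [v for v in plddts if lower_bound <= v <= upper_bound]
    stats = {
        "residues_in_pLDDT": len(plddts),
        "pLDDT_mean": mean(plddts),
        "pLDDT_median": median(plddts),
        f"{nn}_count": len(bounded),
        f"{nn}_frac": None,
        f"{nn}_mean": None,
        f"{nn}_median": None,
        "passing": False,
    }
    # Assumption: bounded stats need both minimums to be met (>=).
    if len(plddts) >= min_length and len(bounded) >= min_count:
        stats[f"{nn}_frac"] = len(bounded) / len(plddts)
        stats[f"{nn}_mean"] = mean(bounded)
        stats[f"{nn}_median"] = median(bounded)
        # Assumption: the criterion is applied to the bounded median.
        stats["passing"] = stats[f"{nn}_median"] >= criterion
    return stats
```

With the defaults, a model with too few residues at or above 80 gets no bounded statistics and is reported as not passing in this sketch.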

  • plddt-select-residues

    Writes a tab-separated file of residues from passing models, using an input file of values selected by plddt-stats. Input options are the same as plddt-stats.

    Output columns:

    • file

      Path to the model file.

    • residue

      Residue number, starting from 0 and numbered sequentially. Note that all residues will be written, regardless of bounds set.

    • pLDDT

      pLDDT value for that residue.
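As a rough sketch of the per-residue output described above, the following reads pLDDT values from PDB-format lines and writes file, residue, and pLDDT columns. AlphaFold model files store pLDDT in the B-factor field; everything else here is an assumption for illustration, including the function names `read_plddts` and `write_residue_rows` and the choice to take one value per residue from CA atoms. It is not rafm's own code, and the caller is expected to pass only models that passed.

```python
import csv


def read_plddts(pdb_lines):
    """Return one pLDDT per residue from the B-factor field of CA atoms."""
    values = []
    for line in pdb_lines:
        # PDB fixed columns: atom name in 13-16, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            values.append(float(line[60:66]))
    return values


def write_residue_rows(models, out):
    """Write tab-separated rows for (path, plddts) pairs to a text stream.

    Residues are numbered sequentially from 0, and every residue is
    written regardless of any bounds.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["file", "residue", "pLDDT"])
    for path, plddts in models:
        for index, value in enumerate(plddts):
            writer.writerow([path, index, value])
```

A caller could open each passing model file, pass the file object to `read_plddts`, and collect the results into the `models` list before writing.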

Statistical Basis

The default parameters were chosen to select for LDDT values greater than 80 on a set of crystal structures obtained since AlphaFold was trained. The distributions of LDDT scores for the passing and non-passing sets, along with an (overlapping) set of PDB files at 100% sequence identity over at least 80% of the sequence, look like this:

Distribution of high-scoring, low-scoring, and high-similarity structures

The markers on the x-axis refer to the sizes of conformational changes observed in various protein crystal structures:

  • CALM

    Between calcium-bound and calcium-free calmodulin (depicted in the logo image above).

  • ERK2

    Between unphosphorylated and doubly-phosphorylated ERK2 kinase.

  • HB

    Between R- and T-state hemoglobin.

  • MB

    Between carbonmonoxy- and deoxy-myoglobin.

When applied to a set of “dark” genomes with no previous PDB entries, the distributions of median pLDDT scores with a lower bound of 80 and per-residue pLDDT scores look like this:

Distribution of pLDDT80 scores and per-residue pLDDT scores

Contributing

Contributions are very welcome. To learn more, see the Contributor Guide.

License

Distributed under the terms of the MIT license, rafm is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.

Credits

This project was generated from the UNM Translational Informatics Python Cookiecutter template.

rafm was written by Joel Berendzen and Jessica Binder.
