Self-correcting protein folding with differentiable NMR constraints

🧬 ResonanceFlow: Differentiable Protein Structure Prediction with NMR Self-Correction

ResonanceFlow is a JAX-native protein structure prediction framework that integrates differentiable biophysics with experimental NMR constraints. It allows models to "self-correct" by propagating gradients from physical violations (atomic clashes, bad geometry) and NMR observables (RDCs, NOE distances) back into the network weights, end-to-end, with no manual refinement step.


🚀 Key Features

  • JAX-Native Gradient Flow: end-to-end differentiability from experimental constraints to model weights via jax.grad.
  • Saupe Tensor RDC Loss: differentiable least-squares fitting of the alignment tensor at every forward pass (Bax & Tjandra 1997; Cornilescu et al. 1998).
  • NOE Distance Restraints: flat-bottomed harmonic penalty on upper-bound violations. NOE distances are the primary source of 3D information in protein NMR (Wüthrich 1986; Güntert et al. 1997).
  • Biophysically Correct Geometry: bond length loss calibrated to the canonical Cα–Cα distance of 3.80 Å (Engh & Huber 1991).
  • Differentiable Steric Clash: harmonic atom-overlap penalty with optional AMBER/CHARMM-style 1-2/1-3 bonded exclusions, powered by jax-md.
  • RDC Quality Metric: built-in Q-factor (Cornilescu et al. 1998) for structural validation without additional tooling.
  • PBC Support: periodic boundary conditions for simulation-box contexts.
  • Transformer-to-Coords: a pre-LN Transformer architecture that maps amino acid sequences directly to physical 3D Cα coordinates.
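
To make the Saupe tensor bullet concrete, here is a minimal sketch of a differentiable least-squares tensor fit in plain JAX. It is not the package's implementation: the function names, the ridge term, and the choice to absorb d_max into the fitted tensor are all assumptions made for illustration.

```python
import jax
import jax.numpy as jnp

def saupe_design_matrix(unit_vecs):
    """Map the 5 independent elements (Syy, Szz, Sxy, Sxz, Syz) to v^T S v."""
    x, y, z = unit_vecs[:, 0], unit_vecs[:, 1], unit_vecs[:, 2]
    # Sxx is eliminated through the traceless condition Sxx = -(Syy + Szz).
    return jnp.stack([y**2 - x**2, z**2 - x**2, 2*x*y, 2*x*z, 2*y*z], axis=1)

def predict_rdc(vecs, measured, ridge=1e-6):
    """Fit the (d_max-scaled) tensor to the data, then back-calculate couplings."""
    unit = vecs / jnp.linalg.norm(vecs, axis=1, keepdims=True)
    a = saupe_design_matrix(unit)
    # Ridge-stabilised normal equations (an assumption of this sketch).
    s = jnp.linalg.solve(a.T @ a + ridge * jnp.eye(5), a.T @ measured)
    return a @ s

def rdc_mse(vecs, measured):
    """Mean squared error between back-calculated and measured couplings."""
    return jnp.mean((predict_rdc(vecs, measured) - measured) ** 2)

# Synthetic check: couplings generated from a known tensor are fitted back.
vecs = jax.random.normal(jax.random.PRNGKey(0), (20, 3))
unit = vecs / jnp.linalg.norm(vecs, axis=1, keepdims=True)
s_true = jnp.array([5.0, -10.0, 2.5, 1.5, -3.5])   # hypothetical tensor, in Hz
measured = saupe_design_matrix(unit) @ s_true
loss = rdc_mse(vecs, measured)                      # close to zero
grads = jax.grad(rdc_mse)(vecs, measured)           # d(loss)/d(bond vectors)
```

Because the fit is a linear solve, the gradient with respect to the bond vectors passes through the fitted tensor, which is what lets an RDC mismatch reach the upstream coordinates.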

🧠 The Concept: "Self-Correction"

Traditional folding models are trained on static PDB snapshots. ResonanceFlow instead makes physical laws and NMR data part of the training signal itself:

  Sequence  →  [Transformer]  →  Cα Coordinates
                                       │
                  ┌────────────────────┼───────────────────────┐
                  ▼                    ▼                       ▼
           Steric Clash          Bond Length              RDC / NOE
             Penalty               Loss                  Mismatch
                  └────────────────────┼───────────────────────┘
                                       │  ∇θ L_total
                                       ▼
                              [Optimizer Step]

Gradients from every constraint flow back simultaneously into the model weights, so the model learns not just from data but from physics.
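
The loop in the diagram can be sketched in a few lines of plain JAX. This is a toy under stated assumptions: a single linear layer stands in for the Transformer, the two loss functions are simplified stand-ins for the package's kernels, and the unweighted sum and plain gradient-descent step are illustrative choices.

```python
import jax
import jax.numpy as jnp

def bond_length_loss(coords, target=3.8):
    """Mean squared deviation of consecutive distances from the target (Å)."""
    d = jnp.sqrt(jnp.sum((coords[1:] - coords[:-1]) ** 2, axis=-1) + 1e-8)
    return jnp.mean((d - target) ** 2)

def clash_loss(coords, radius=1.5, exclude_bonded_range=1):
    """Harmonic overlap penalty, skipping pairs within the bonded range."""
    n = coords.shape[0]
    diff = coords[:, None, :] - coords[None, :, :]
    dist = jnp.sqrt(jnp.sum(diff ** 2, axis=-1) + 1e-8)
    idx = jnp.arange(n)
    mask = (idx[None, :] - idx[:, None]) > exclude_bonded_range  # upper triangle
    overlap = jnp.maximum(2.0 * radius - dist, 0.0)
    return jnp.sum(jnp.where(mask, overlap ** 2, 0.0)) / n

def predict_coords(params, features):
    """Stand-in 'model': a linear map from per-residue features to 3D."""
    return features @ params["w"]

def total_loss(params, features):
    coords = predict_coords(params, features)
    return bond_length_loss(coords) + clash_loss(coords)

key_f, key_w = jax.random.split(jax.random.PRNGKey(0))
features = jax.random.normal(key_f, (8, 4))          # 8 residues, 4 features
params = {"w": jax.random.normal(key_w, (4, 3))}

loss_and_grad = jax.jit(jax.value_and_grad(total_loss))
initial_loss, _ = loss_and_grad(params, features)
for _ in range(100):
    loss, grads = loss_and_grad(params, features)
    # Plain gradient descent; a real run would use an optimizer such as Optax.
    params = jax.tree_util.tree_map(lambda p, g: p - 0.005 * g, params, grads)
final_loss, final_grads = loss_and_grad(params, features)
```

Both penalties contribute to a single scalar, so one call to jax.value_and_grad yields the combined gradient with respect to the weights.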


🛠️ Installation

pip install resonance-flow

For development (includes linting, type-checking, testing, and docs):

git clone https://github.com/elkins/resonance-flow.git
cd resonance-flow
pip install -e ".[dev]"

Requirements: Python 3.10+, JAX ≥ 0.4, Flax, Optax, jax-md, NumPy.


🧪 Quick Start

Run the self-correction demo

from resonance_flow.train import main

state = main(num_steps=100)
# Step   0 | Total Loss: 12.3421 | Steric: 0.0012 | Bond: 1.2034 | RDC: 0.0087
# Step  10 | Total Loss:  4.1823 | ...
# Step 100 | Total Loss:  0.0031 | ...

Use individual loss functions

import jax                     # needed for jax.random below
import jax.numpy as jnp
from resonance_flow import (
    get_steric_clash_loss,
    get_bond_length_loss,
    rdc_loss,
    rdc_q_factor,
    noe_upper_bound_loss,
    estimate_nh_proxy_vectors,
)

# ── Steric clash (AMBER-style 1-2 bonded exclusion) ──────────────────────────
clash_fn   = get_steric_clash_loss(exclude_bonded_range=1)
positions  = jnp.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]])
atom_radii = jnp.array([1.5, 1.5])
clash_fn(positions, atom_radii)          # → 0.0  (no overlap)

# ── Bond length (Cα–Cα virtual bond, Engh & Huber 1991) ─────────────────────
bond_fn  = get_bond_length_loss()        # default target = 3.8 Å
ca_chain = jnp.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]])
bond_fn(ca_chain)                        # → ~0.0

# ── RDC loss (Saupe tensor fitting) ──────────────────────────────────────────
nh_vecs      = jnp.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.],
                          [0.7, 0.7, 0.], [0.7, 0., 0.7], [0., 0.7, 0.7]])
measured_rdc = jnp.array([10., -5., 2., 0., 4., 8.])
rdc_loss(nh_vecs, measured_rdc)          # → scalar MSE

# ── RDC Q-factor (structure quality; Q ≤ 0.20 = high quality) ───────────────
rdc_q_factor(nh_vecs, measured_rdc)      # → 0–1 (lower is better)

# ── N-H proxy vectors from Cα coordinates (Cα-only models) ──────────────────
ca_coords = jax.random.normal(jax.random.PRNGKey(0), (10, 3))
nh_proxy  = estimate_nh_proxy_vectors(ca_coords)   # → (8, 3) unit vectors

# ── NOE upper-bound distance restraints (Wüthrich 1986) ─────────────────────
noe_pairs    = jnp.array([[0, 1]])       # atom index pairs into `positions` (2 atoms, 4.0 Å apart)
upper_bounds = jnp.array([5.0])
noe_upper_bound_loss(positions, noe_pairs, upper_bounds)  # → 0.0 (4.0 Å is within the 5.0 Å bound)
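
For readers who want to see what a flat-bottomed upper-bound penalty does, the following is a minimal sketch in plain JAX. The function name and the mean reduction are assumptions of this sketch and may differ from the package's noe_upper_bound_loss.

```python
import jax
import jax.numpy as jnp

def flat_bottom_upper_bound_loss(coords, pairs, upper_bounds):
    """Zero while a pair is within its bound, quadratic in the excess beyond it."""
    delta = coords[pairs[:, 0]] - coords[pairs[:, 1]]
    dist = jnp.sqrt(jnp.sum(delta ** 2, axis=-1) + 1e-8)
    violation = jnp.maximum(dist - upper_bounds, 0.0)
    return jnp.mean(violation ** 2)

coords = jnp.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]])
pairs = jnp.array([[0, 1], [0, 2]])      # (0, 1) is 3.8 Å apart, (0, 2) is 7.6 Å
bounds = jnp.array([5.0, 5.0])

loss = flat_bottom_upper_bound_loss(coords, pairs, bounds)
# ≈ 3.38: one satisfied restraint (0.0) and one violated by 2.6 Å (6.76), averaged
grad = jax.grad(flat_bottom_upper_bound_loss)(coords, pairs, bounds)
```

The gradient is zero for atom 1, whose only restraint is satisfied, and pulls atoms 0 and 2 toward each other along x. Satisfied restraints therefore leave the structure alone, and only violated ones generate a correction.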

🔬 Scientific Basis

All loss functions and validation metrics are grounded in published, peer-reviewed NMR methodology:

Loss / Metric                  | Scientific Basis
RDC loss (Saupe tensor)        | Bax & Tjandra, J. Biomol. NMR 1997; Cornilescu et al., JACS 1998
RDC Q-factor                   | Cornilescu et al., JACS 1998; Clore & Garrett, JACS 1999
NOE distance restraints        | Wüthrich, NMR of Proteins and Nucleic Acids 1986; Güntert et al., J. Mol. Biol. 1997
Cα–Cα bond distance (3.8 Å)    | Engh & Huber, Acta Crystallogr. A 1991
N-H proxy vectors              | Zweckstetter & Bax, JACS 2000
Bonded exclusion (1-2/1-3)     | Cornell et al. (AMBER), JACS 1995; MacKerell et al. (CHARMM), J. Phys. Chem. B 1998
d_max = 21 700 Hz              | Ottiger & Bax, JACS 1998
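
As an illustration of the Q-factor row, the sketch below uses the common definition Q = rms(D_obs − D_calc) / rms(D_obs). The function name and argument order are hypothetical and need not match rdc_q_factor, which also performs the tensor fit internally.

```python
import jax.numpy as jnp

def q_factor(d_calc, d_obs):
    """RMS deviation between calculated and observed RDCs, normalised by rms(D_obs)."""
    return jnp.sqrt(jnp.sum((d_obs - d_calc) ** 2) / jnp.sum(d_obs ** 2))

d_obs = jnp.array([10.0, -5.0, 2.0, 0.0, 4.0, 8.0])
q_perfect = q_factor(d_obs, d_obs)          # 0.0: exact agreement
q_scaled = q_factor(0.9 * d_obs, d_obs)     # ≈ 0.1: every coupling 10% too small
```

Since Q is normalised by the size of the observed couplings, it can be compared across datasets measured in different alignment media.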

🧬 Architecture

TransformerCoordinatePredictor
├── Embedding         (vocab_size=21, d_model=128)
├── Positional Embed  (learned, max_len=512)
├── N × Pre-LN Block
│   ├── LayerNorm → MultiHeadSelfAttention → Residual
│   └── LayerNorm → FFN (4× expand, GELU) → Residual
└── LayerNorm → Linear(3)   # → (batch, seq_len, 3) Cα coordinates

The pre-LN layout (LayerNorm applied before attention and before the FFN, inside each residual branch) keeps gradients well-behaved at initialization and follows the convention recommended by Xiong et al. 2020.
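
The layout above can be sketched in plain JAX as follows. This is an illustrative reading of the diagram, not the package's Flax module: the parameter names, the 4 attention heads, the 2 layers, the bias-free projections, and the LayerNorm without learned scale are all assumptions, and the sketch handles one unbatched sequence.

```python
import jax
import jax.numpy as jnp

def layer_norm(x, eps=1e-5):
    """LayerNorm over the feature axis (no learned scale or bias in this sketch)."""
    mean = x.mean(axis=-1, keepdims=True)
    return (x - mean) / jnp.sqrt(x.var(axis=-1, keepdims=True) + eps)

def self_attention(p, x, num_heads):
    seq_len, d_model = x.shape
    head_dim = d_model // num_heads
    def split(t):  # (seq, d_model) -> (heads, seq, head_dim)
        return t.reshape(seq_len, num_heads, head_dim).transpose(1, 0, 2)
    q, k, v = split(x @ p["wq"]), split(x @ p["wk"]), split(x @ p["wv"])
    attn = jax.nn.softmax(q @ k.transpose(0, 2, 1) / jnp.sqrt(head_dim), axis=-1)
    return (attn @ v).transpose(1, 0, 2).reshape(seq_len, d_model) @ p["wo"]

def pre_ln_block(p, x, num_heads=4):
    # Pre-LN: normalise the input of each sub-layer, then add the residual.
    x = x + self_attention(p, layer_norm(x), num_heads)
    return x + jax.nn.gelu(layer_norm(x) @ p["w1"]) @ p["w2"]

def init_params(key, vocab_size=21, d_model=128, max_len=512, num_layers=2):
    k_embed, k_pos, k_head, *k_blocks = jax.random.split(key, 3 + num_layers)
    def dense(k, n_in, n_out):
        return jax.random.normal(k, (n_in, n_out)) * n_in ** -0.5
    blocks = []
    for kb in k_blocks:
        ks = jax.random.split(kb, 6)
        blocks.append({
            "wq": dense(ks[0], d_model, d_model), "wk": dense(ks[1], d_model, d_model),
            "wv": dense(ks[2], d_model, d_model), "wo": dense(ks[3], d_model, d_model),
            "w1": dense(ks[4], d_model, 4 * d_model),   # 4x FFN expansion
            "w2": dense(ks[5], 4 * d_model, d_model),
        })
    return {"embed": 0.02 * jax.random.normal(k_embed, (vocab_size, d_model)),
            "pos": 0.02 * jax.random.normal(k_pos, (max_len, d_model)),
            "blocks": blocks, "head": dense(k_head, d_model, 3)}

def predict_ca(params, tokens):
    x = params["embed"][tokens] + params["pos"][: tokens.shape[0]]
    for block in params["blocks"]:
        x = pre_ln_block(block, x)
    return layer_norm(x) @ params["head"]       # (seq_len, 3)

params = init_params(jax.random.PRNGKey(0))
tokens = jnp.array([0, 5, 12, 20, 3])           # hypothetical residue indices
coords = predict_ca(params, tokens)             # shape (5, 3)
head_grad = jax.grad(lambda p: jnp.sum(predict_ca(p, tokens) ** 2))(params)["head"]
```

A batch dimension can be added with jax.vmap over the tokens argument, which gives the (batch, seq_len, 3) output shape shown in the diagram.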


🤝 Contributing

Contributions are welcome! Please open an issue or pull request. The project follows:

  • Formatting + Linting: ruff / ruff format
  • Type checking: mypy
  • Testing: pytest with coverage

# Run the full quality pipeline before submitting a PR
ruff check resonance_flow tests
ruff format resonance_flow tests
mypy resonance_flow tests
pytest --cov=resonance_flow tests

📚 Documentation

Full theory, API reference, and examples at elkins.github.io/resonance-flow.


⚖️ License

MIT © George Elkins


🔗 Related Projects

ResonanceFlow is the most complete end-to-end model in this ecosystem, depending on:

  • diff-biophys: Differentiable RDC, NOE, bond-length, and clash kernels
  • synth-nmr: NMR parameter libraries (chemical shifts, Karplus, RDC)
  • synth-pdb: Protein structure data generation
  • TorsionTuner: Single-structure refinement using similar torsion-space kinematics
  • diff-ensemble: Ensemble counterpart for IDPs

📖 Citation

@software{resonance_flow,
  author  = {Elkins, George},
  title   = {ResonanceFlow: Differentiable protein structure prediction with NMR self-correction},
  year    = {2024},
  url     = {https://github.com/elkins/resonance-flow},
  version = {0.1.0}
}
