Self-correcting protein folding with differentiable NMR constraints

🧬 ResonanceFlow: Differentiable Protein Structure Prediction with NMR Self-Correction

ResonanceFlow is a JAX-native protein structure prediction framework that integrates differentiable biophysics with experimental NMR constraints. It lets models "self-correct" by propagating gradients from physical violations (atomic clashes, bad geometry) and NMR observables (RDCs, NOE distances) back into the network weights, end to end, with no manual refinement step.


🚀 Key Features

  • JAX-Native Gradient Flow: End-to-end differentiability from experimental constraints to model weights via jax.grad.
  • Saupe Tensor RDC Loss: Differentiable least-squares fitting of the alignment tensor at every forward pass (Bax & Tjandra 1997; Cornilescu et al. 1998).
  • NOE Distance Restraints: Flat-bottomed harmonic penalty on upper-bound violations, the primary 3D information source in protein NMR (Wüthrich 1986; Güntert et al. 1997).
  • Biophysically Correct Geometry: Bond length loss calibrated to the canonical Cα–Cα distance of 3.80 Å (Engh & Huber 1991).
  • Differentiable Steric Clash: Harmonic atom-overlap penalty with optional AMBER/CHARMM-style 1-2/1-3 bonded exclusions, powered by jax-md.
  • RDC Quality Metric: Built-in Q-factor and Q_free cross-validation (Cornilescu et al. 1998; Clore & Garrett 1999) for structural validation without additional tooling.
  • Backbone Conformational Checks: Pseudo-torsion angle calculation (Oldfield & Hubbard 1994) to verify secondary structure plausibility in Cα-only models.
  • PBC Support: Periodic boundary conditions for simulation-box contexts.
  • Transformer-to-Coords: A pre-LN Transformer architecture that maps amino acid sequences directly to physical 3D Cα coordinates.
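
A pseudo-torsion is the dihedral angle spanned by four consecutive Cα atoms. The block below is a minimal sketch of that calculation in JAX, not the package's own implementation; the function name `ca_pseudo_torsions` is hypothetical and the sign convention is the usual atan2 dihedral formula.

```python
import jax.numpy as jnp


def ca_pseudo_torsions(ca):
    """Dihedral angle (radians) for each run of four consecutive C-alpha atoms.

    ca: (N, 3) coordinates. Returns an array of shape (N - 3,).
    """
    b1 = ca[1:-2] - ca[:-3]   # bond vector i   -> i+1
    b2 = ca[2:-1] - ca[1:-2]  # bond vector i+1 -> i+2 (the rotation axis)
    b3 = ca[3:] - ca[2:-1]    # bond vector i+2 -> i+3
    n1 = jnp.cross(b1, b2)    # normal of the plane through atoms i, i+1, i+2
    n2 = jnp.cross(b2, b3)    # normal of the plane through atoms i+1, i+2, i+3
    x = jnp.sum(n1 * n2, axis=-1)
    y = jnp.linalg.norm(b2, axis=-1) * jnp.sum(b1 * n2, axis=-1)
    return jnp.arctan2(y, x)


# Four atoms whose last bond is rotated a quarter turn about the central bond.
quad = jnp.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [1.0, 0.0, 1.0]])
angles = ca_pseudo_torsions(quad)  # single angle of pi/2 (a +90 degree twist)
```

Because the result is built only from `jnp` operations, it can be differentiated with respect to the coordinates wherever the geometry is not degenerate (for example, three collinear atoms).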

🧠 The Concept: "Self-Correction"

Traditional folding models are trained on static PDB snapshots. ResonanceFlow instead teaches a model to listen to physical laws and NMR data during training itself:

  Sequence  →  [Transformer]  →  Cα Coordinates
                                       │
                  ┌────────────────────┼───────────────────────┐
                  ▼                    ▼                       ▼
           Steric Clash          Bond Length              RDC / NOE
             Penalty               Loss                  Mismatch
                  └────────────────────┼───────────────────────┘
                                       │  ∇θ L_total
                                       ▼
                              [Optimizer Step]

Gradients from every constraint flow back simultaneously into the model weights, so the model learns not just from data but from physics.
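
To make the gradient flow concrete, here is a toy sketch under simplifying assumptions: the coordinates themselves stand in for the Transformer output, and two hand-written penalties (hypothetical `bond_loss` and `clash_loss`, not the package's kernels) are summed and minimized with plain gradient descent. In ResonanceFlow the same gradient would continue through the network into its weights.

```python
import jax
import jax.numpy as jnp

CA_CA = 3.8  # target virtual-bond length in angstroms (value quoted in this README)


def bond_loss(x):
    # Mean squared deviation of consecutive distances from the target length.
    d = jnp.linalg.norm(x[1:] - x[:-1], axis=-1)
    return jnp.mean((d - CA_CA) ** 2)


def clash_loss(x, min_dist=4.0):
    # Harmonic penalty on pairs closer than min_dist (an assumed cutoff).
    diff = x[:, None, :] - x[None, :, :]
    d = jnp.sqrt(jnp.sum(diff ** 2, axis=-1) + 1e-9)
    i, j = jnp.triu_indices(x.shape[0], k=2)  # skip self and bonded neighbours
    overlap = jnp.maximum(min_dist - d[i, j], 0.0)
    return jnp.mean(overlap ** 2)


def total_loss(x):
    return bond_loss(x) + clash_loss(x)


grad_fn = jax.jit(jax.value_and_grad(total_loss))

x = 3.0 * jax.random.normal(jax.random.PRNGKey(0), (8, 3))  # random start
initial_loss, _ = grad_fn(x)
for _ in range(200):
    loss, g = grad_fn(x)
    x = x - 0.05 * g  # one gradient-descent step on every term at once
final_loss = total_loss(x)
```

Both penalties contribute to a single gradient, so one update moves the structure toward satisfying all of them together rather than one after another.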


🛠️ Installation

pip install resonance-flow

For development (includes linting, type-checking, testing, and docs):

git clone https://github.com/elkins/resonance-flow.git
cd resonance-flow
pip install -e ".[dev]"

Requirements: Python 3.10+, JAX ≥ 0.4, Flax, Optax, jax-md, NumPy.


🧪 Quick Start

Run the self-correction demo

from resonance_flow.train import main

state = main(num_steps=100)
# Step   0 | Total Loss: 12.3421 | Steric: 0.0012 | Bond: 1.2034 | RDC: 0.0087
# Step  10 | Total Loss:  4.1823 | ...
# Step 100 | Total Loss:  0.0031 | ...

Use individual loss functions

import jax
import jax.numpy as jnp
from resonance_flow import (
    get_steric_clash_loss,
    get_bond_length_loss,
    rdc_loss,
    rdc_q_factor,
    rdc_q_free,
    noe_upper_bound_loss,
    estimate_nh_proxy_vectors,
)

# ── Steric clash (AMBER-style 1-2 bonded exclusion) ──────────────────────────
clash_fn   = get_steric_clash_loss(exclude_bonded_range=1)
positions  = jnp.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]])
atom_radii = jnp.array([1.5, 1.5])
clash_fn(positions, atom_radii)          # → 0.0  (no overlap)

# ── Bond length (Cα–Cα virtual bond, Engh & Huber 1991) ──────────────────────
bond_fn  = get_bond_length_loss()        # default target = 3.8 Å
ca_chain = jnp.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]])
bond_fn(ca_chain)                        # → ~0.0

# ── RDC loss (Saupe tensor fitting) ──────────────────────────────────────────
nh_vecs      = jnp.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.],
                          [0.7, 0.7, 0.], [0.7, 0., 0.7], [0., 0.7, 0.7]])
measured_rdc = jnp.array([10., -5., 2., 0., 4., 8.])
rdc_loss(nh_vecs, measured_rdc)          # → scalar MSE

# ── RDC Q-factor (structure quality; Q ≤ 0.20 = high quality) ────────────────
rdc_q_factor(nh_vecs, measured_rdc)      # → 0–1 (lower is better)
train_mask = jnp.array([True, True, True, False, False, False])
rdc_q_free(nh_vecs, measured_rdc, train_mask)  # → Q-factor on held-out data

# ── N-H proxy vectors from Cα coordinates (Cα-only models) ───────────────────
ca_coords = jax.random.normal(jax.random.PRNGKey(0), (10, 3))
nh_proxy  = estimate_nh_proxy_vectors(ca_coords)   # → (8, 3) unit vectors

# ── NOE upper-bound distance restraints (Wüthrich 1986) ──────────────────────
# Pair indices must refer to rows of the coordinate array that is passed in.
noe_pairs    = jnp.array([[0, 1]])
upper_bounds = jnp.array([5.0])
noe_upper_bound_loss(positions, noe_pairs, upper_bounds)  # → 0.0 (4.0 Å is within the 5.0 Å bound)
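
For readers who want to see what a flat-bottomed upper-bound penalty does, the following is a minimal sketch under stated assumptions: the function name `flat_bottom_upper_bound` is hypothetical, and the mean-of-squared-violations form is one common choice, not necessarily what `noe_upper_bound_loss` computes internally.

```python
import jax
import jax.numpy as jnp


def flat_bottom_upper_bound(positions, pairs, upper_bounds):
    """Mean squared violation of upper-bound distance restraints.

    Zero while a pair stays within its bound; quadratic beyond it.
    """
    diff = positions[pairs[:, 0]] - positions[pairs[:, 1]]
    dist = jnp.sqrt(jnp.sum(diff ** 2, axis=-1) + 1e-12)  # eps keeps sqrt smooth
    violation = jnp.maximum(dist - upper_bounds, 0.0)     # the flat bottom
    return jnp.mean(violation ** 2)


coords = jnp.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]])
pairs = jnp.array([[0, 1], [0, 2]])   # (i, j) atom indices
bounds = jnp.array([5.0, 6.0])        # upper bounds in angstroms

# Pair (0, 1) at 3.8 is satisfied; pair (0, 2) at 7.6 exceeds its bound by 1.6.
noe_value = flat_bottom_upper_bound(coords, pairs, bounds)
noe_grads = jax.grad(flat_bottom_upper_bound)(coords, pairs, bounds)
```

In this sketch only the violated restraint produces a gradient: atoms 0 and 2 are pulled toward each other, while atom 1, which takes part only in the satisfied restraint, receives none.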

🎓 Interactive Tutorial Catalog

Experience ResonanceFlow directly in your browser via Google Colab. These interactive tutorials cover everything from basic biophysics to advanced structural self-correction.

| Tutorial | Difficulty | Time | Action |
|---|---|---|---|
| Self-Correction Demo | ⭐ Beginner | 15 min | Open In Colab |
| Biophysical Constraints | ⭐ Beginner | 15 min | Open In Colab |
| Differentiable NMR | ⭕ Intermediate | 25 min | Open In Colab |
| Transformer-to-Coords | 🏔️ Advanced | 30 min | Open In Colab |

🔬 Scientific Basis

All loss functions and validation metrics are grounded in published, peer-reviewed NMR methodology:

| Loss / Metric | Scientific Basis |
|---|---|
| RDC loss (Saupe tensor) | Bax & Tjandra, J. Biomol. NMR 1997; Cornilescu et al., JACS 1998 |
| RDC Q-factor | Cornilescu et al., JACS 1998; Clore & Garrett, JACS 1999 |
| NOE distance restraints | Wüthrich, NMR of Proteins and Nucleic Acids 1986; Güntert et al., J. Mol. Biol. 1997 |
| Cα–Cα bond distance (3.8 Å) | Engh & Huber, Acta Crystallogr. A 1991 |
| N-H proxy vectors | Zweckstetter & Bax, JACS 2000 |
| Bonded exclusion (1-2/1-3) | Cornell et al. (AMBER), JACS 1995; MacKerell et al. (CHARMM), J. Phys. Chem. B 1998 |
| d_max = 21 700 Hz | Ottiger & Bax, JACS 1998 |
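
The Saupe-tensor fit and the Q-factor can be sketched in a few lines. This is an illustrative version under assumptions, not the package's code: the helper names are hypothetical, the scale factor d_max is folded into the five fitted tensor elements, and Q is taken as rms(measured − predicted) / rms(measured).

```python
import jax.numpy as jnp


def saupe_design_matrix(vecs):
    # One row per bond vector; columns multiply (Sxx, Syy, Sxy, Sxz, Syz).
    # Szz is eliminated through the traceless condition Szz = -(Sxx + Syy).
    u = vecs / jnp.linalg.norm(vecs, axis=-1, keepdims=True)
    x, y, z = u[:, 0], u[:, 1], u[:, 2]
    return jnp.stack(
        [x * x - z * z, y * y - z * z, 2 * x * y, 2 * x * z, 2 * y * z], axis=-1
    )


def back_calculate(vecs, measured):
    # Least-squares fit of the five tensor elements, then predicted couplings.
    a = saupe_design_matrix(vecs)
    s, *_ = jnp.linalg.lstsq(a, measured)
    return a @ s


def q_factor(measured, predicted):
    rms_err = jnp.sqrt(jnp.mean((measured - predicted) ** 2))
    return rms_err / jnp.sqrt(jnp.mean(measured ** 2))


vecs = jnp.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.],
                  [0.7, 0.7, 0.], [0.7, 0., 0.7], [0., 0.7, 0.7]])

# Couplings generated from a known tensor are reproduced almost exactly.
s_true = jnp.array([1.0, -2.0, 0.5, 0.3, -0.7])
exact = saupe_design_matrix(vecs) @ s_true
q_exact = q_factor(exact, back_calculate(vecs, exact))

# The Quick Start values are not consistent with any single tensor, so Q > 0.
measured = jnp.array([10., -5., 2., 0., 4., 8.])
q_measured = q_factor(measured, back_calculate(vecs, measured))
```

With six couplings and five free parameters the fit is barely overdetermined, which is one reason a held-out Q_free is more informative than the working Q.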

🧬 Architecture

TransformerCoordinatePredictor
├── Embedding         (vocab_size=21, d_model=128)
├── Positional Embed  (learned, max_len=512)
├── N × Pre-LN Block
│   ├── LayerNorm → MultiHeadDotProductAttention → Residual
│   └── LayerNorm → FFN (4× expand, GELU) → Residual
└── LayerNorm → Linear(3)   # → (batch, seq_len, 3) Cα coordinates

The pre-LN layout (LayerNorm applied before attention and before the feed-forward network, inside each residual branch) keeps gradients well-behaved at initialization and follows the convention recommended by Xiong et al. 2020.
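
As a rough illustration of the pre-LN ordering, the sketch below writes one block in plain JAX. It is deliberately simplified and is not the package's Flax module: attention is single-head, LayerNorm has no learned scale or bias, there are no bias terms, and all names and sizes (`pre_ln_block`, `init_params`, `d_model=16`) are hypothetical.

```python
import jax
import jax.numpy as jnp


def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / jnp.sqrt(var + eps)


def self_attention(x, p):
    q, k, v = x @ p["wq"], x @ p["wk"], x @ p["wv"]
    scores = q @ k.T / jnp.sqrt(q.shape[-1])
    return jax.nn.softmax(scores, axis=-1) @ v @ p["wo"]


def pre_ln_block(x, p):
    # Normalize first, then transform, then add back to the residual stream.
    x = x + self_attention(layer_norm(x), p)
    x = x + jax.nn.gelu(layer_norm(x) @ p["w1"]) @ p["w2"]
    return x


def init_params(key, d_model=16):
    shapes = {
        "wq": (d_model, d_model), "wk": (d_model, d_model),
        "wv": (d_model, d_model), "wo": (d_model, d_model),
        "w1": (d_model, 4 * d_model), "w2": (4 * d_model, d_model),
        "w_out": (d_model, 3),  # final projection to x, y, z
    }
    keys = jax.random.split(key, len(shapes))
    return {name: 0.02 * jax.random.normal(k, shape)
            for k, (name, shape) in zip(keys, shapes.items())}


params = init_params(jax.random.PRNGKey(0))
tokens = jax.random.normal(jax.random.PRNGKey(1), (10, 16))  # (seq_len, d_model)
hidden = pre_ln_block(tokens, params)
pred_coords = layer_norm(hidden) @ params["w_out"]  # (seq_len, 3) coordinates
```

The residual stream itself is never normalized inside the block, which is the property that distinguishes pre-LN from the original post-LN layout.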


🤝 Contributing

Contributions are welcome! Please open an issue or pull request. The project follows:

  • Formatting + Linting: ruff / ruff format
  • Type checking: mypy
  • Testing: pytest with coverage

# Run the full quality pipeline before submitting a PR
ruff check resonance_flow tests
ruff format resonance_flow tests
mypy resonance_flow tests
pytest --cov=resonance_flow tests

📚 Documentation

Full theory, API reference, and examples at elkins.github.io/resonance-flow.


⚖️ License

MIT © George Elkins


🔗 Related Projects

ResonanceFlow is the most complete end-to-end model in this ecosystem, depending on:

  • diff-biophys: Differentiable RDC, NOE, bond-length, and clash kernels
  • synth-nmr: NMR parameter libraries (chemical shifts, Karplus, RDC)
  • synth-pdb: Protein structure data generation
  • TorsionTuner: Single-structure refinement using similar torsion-space kinematics
  • diff-ensemble: Ensemble counterpart for IDPs

📖 Citation

@software{resonance_flow,
  author  = {Elkins, George},
  title   = {ResonanceFlow: Differentiable protein structure prediction with NMR self-correction},
  year    = {2026},
  url     = {https://github.com/elkins/resonance-flow},
  version = {0.1.0}
}
