Refining Openfold predictions with Crystallographic Likelihood Targets

Refining Openfold predictions with Crystallographic/Cryo-EM LiKElihood Targets (ROCKET)

This is the code repository for the paper "AlphaFold as a Prior: Experimental Structure Determination Conditioned on a Pretrained Neural Network".

You can find detailed documentation and walk-through tutorials at: https://rocket-9.gitbook.io/rocket-docs

Installation

1. Install OpenFold

To ensure usability, we forked the OpenFold repo and sorted out a couple of details in the installation guide. Here is what we advise ROCKET users to do:

Note: The OpenFold installation requires approximately 6 GB of free space to download weights. Please ensure you start in a directory with sufficient available space.

Note: To ensure a smooth installation and execution of ROCKET, install on a GPU machine that matches the hardware you’ll use in production. In other words, for HPC users, if you plan to run your code on a node with a particular GPU model, request the same GPU model when you install OpenFold. This is important because the installation process performs a hardware-specific compilation. We also recommend using GPUs with CUDA Compute Capability 8.0 or higher.
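
The two notes above can be turned into a quick preflight check. The following is a minimal sketch, not part of ROCKET or OpenFold: the function names free_gb and cap_ok are hypothetical, and it assumes your nvidia-smi is recent enough to support the compute_cap query field.

```shell
#!/bin/sh
# Preflight sketch: check free disk space and GPU compute capability.
# Thresholds (6 GB, capability 8.0) come from the notes above;
# free_gb and cap_ok are hypothetical helper names.

# Print the free space, in whole GB, on the filesystem holding a directory.
free_gb() {
    df -Pk "${1:-.}" | awk 'NR==2 { print int($4 / 1024 / 1024) }'
}

# Succeed if a compute capability such as "8.6" is at least 8.0.
cap_ok() {
    awk -v cap="$1" 'BEGIN { exit !(cap + 0 >= 8.0) }'
}

if [ "$(free_gb .)" -lt 6 ]; then
    echo "Warning: less than 6 GB free in $(pwd -P)"
fi

# Assumption: this nvidia-smi supports the compute_cap query field.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader |
    while IFS=, read -r name cap; do
        cap="${cap# }"   # drop the space that follows the comma
        if cap_ok "$cap"; then
            echo "OK: $name (compute capability $cap)"
        else
            echo "Warning: $name has compute capability $cap (< 8.0)"
        fi
    done
else
    echo "nvidia-smi not found; run this check on the GPU node you will use"
fi
```

Run it from the directory where you plan to clone the repo, on the same kind of node you will use in production.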

  1. Clone our fork of the OpenFold repo and switch to the pl_upgrades branch, which works with CUDA 12:

    git clone https://github.com/minhuanli/rocket_openfold.git
    cd rocket_openfold
    git checkout pl_upgrades
    
  2. Create a conda/mamba env with the environment.yml

    Note: If you work on an HPC cluster that manages packages with a tool like module, purge all loaded modules before this step to avoid conflicts.

    mamba env create -n <env_name_you_like> -f environment.yml
    mamba activate <env_name_you_like>
    

    The main change we made is moving the flash-attn package out of the yml file so that you can install it manually afterwards. This is necessary because this OpenFold version relies on PyTorch 2.1, which is incompatible with the latest flash-attn, so a plain pip install flash-attn would fail. In addition, the --no-build-isolation flag allows ninja to be used for compilation, which is much faster.

  3. Install a compatible flash-attn (the latest flash-attn release with noted support for pytorch-2.1 + cuda-12.1):

    pip install flash-attn==2.2.2 --no-build-isolation
    
  4. Run the setup script to install OpenFold, and configure kernels and folding resources

    ./scripts/install_third_party_dependencies.sh
    

    Add the following lines to <path_to_your_conda_env>/etc/conda/activate.d/env_vars.sh (create the file if it doesn't exist):

    #!/bin/sh
    
    export LIBRARY_PATH=$CONDA_PREFIX/lib:$LIBRARY_PATH
    export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
    

    This way, the paths are prepended automatically every time you activate this env.

  5. Download the AlphaFold2 weights and add the resources path to the system environment (ROCKET needs this):

    ./scripts/download_alphafold_params.sh ./openfold/resources
    

    Note: You can also download the OpenFold weights if you want to try them.

    Append the following line to <path_to_your_conda_env>/etc/conda/activate.d/env_vars.sh, which you should have created in the previous step:

    export OPENFOLD_RESOURCES="<ABSOLUTE_PATH_TO_OPENFOLD_FOLDER>/openfold/resources"
    

    <ABSOLUTE_PATH_TO_OPENFOLD_FOLDER> should be the output of pwd -P run from the root of the OpenFold repo.

    Deactivate and reactivate your Python environment. You should then be able to run the following and see the path:

    echo $OPENFOLD_RESOURCES 
    
  6. Check your OpenFold build with unit tests:

    ./scripts/run_unit_tests.sh
    

    Ensure you see no errors:

    ...
    Time to load evoformer_attn op: 243.8257336616516 seconds
    ............s...s.sss.ss.....sssssssss.sss....ssssss..s.s.s.ss.s......s.s..ss...ss.s.s....s........
    ----------------------------------------------------------------------
    Ran 117 tests in 275.889s
    
    OK (skipped=41)
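
The test run in step 6 is long, so it can help to save the output and check the summary line automatically. The following is a minimal sketch, assuming the script prints a Python unittest-style summary like the one shown above ("OK ..." on success, "FAILED ..." otherwise). The function name tests_passed and the log file name unit_tests.log are hypothetical.

```shell
#!/bin/sh
# Succeed only if a unittest-style log contains an "OK" summary line
# and no "FAILED" summary line. tests_passed is a hypothetical helper.
tests_passed() {
    grep -q '^[[:space:]]*OK' "$1" && ! grep -q '^[[:space:]]*FAILED' "$1"
}

# Only run the tests if we are inside the OpenFold repo.
if [ -x ./scripts/run_unit_tests.sh ]; then
    # Keep a copy of the output while still showing it on screen.
    ./scripts/run_unit_tests.sh 2>&1 | tee unit_tests.log
    if tests_passed unit_tests.log; then
        echo "OpenFold unit tests passed"
    else
        echo "OpenFold unit tests did not report OK; inspect unit_tests.log" >&2
    fi
fi
```

Skipped tests are expected, as in the sample output above, and do not count as failures here.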
    

2. Install Phenix (required for automatic preprocessing and post-refinement)

Phenix is required for automatic data preprocessing and for post-refinement when polishing final model geometry. Follow the steps below to install it and add the path to the system environment variables:

  1. Download the latest nightly-build Phenix Python 3 installer from https://phenix-online.org/download. Note that you have to download the installer from the show-all link, with a version newer than 2.0rc1-5647.

  2. Run the installer

    bash phenix-installer-2.0rc1-5617-<platform>.sh
    

    You will be prompted for your preferred installation path. After you specify it, you will see:

    Phenix will now be installed into this location:
    <phenix_directory>/phenix-2.0rc1-5617
    

    Note: <phenix_directory> must be an absolute path. The installer will create <phenix_directory>/phenix-2.0rc1-5617 and install there.

  3. Append the following line to <path_to_your_conda_env>/etc/conda/activate.d/env_vars.sh, which you should have created in the previous section:

    export PHENIX_ROOT="<phenix_directory>/phenix-2.0rc1-5617"
    

    <phenix_directory> is where you installed Phenix in the previous step.

    Deactivate and reactivate your Python environment. You should then be able to run the following and see the path:

    echo $PHENIX_ROOT 
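
The env_vars.sh edits in this section and the OpenFold section are easy to apply twice by accident, and a mistyped path only shows up later. The following is a minimal sketch of an idempotent append plus a check that each variable points at a real directory. The helper names append_once and check_dir are hypothetical and not part of ROCKET, OpenFold, or Phenix.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
# Creates the file and its parent directory if needed.
append_once() {
    file="$1"; line="$2"
    mkdir -p "$(dirname "$file")"
    touch "$file"
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Succeed if the value passed for a named variable is an existing directory.
check_dir() {
    name="$1"; value="$2"
    if [ -n "$value" ] && [ -d "$value" ]; then
        echo "$name -> $value"
    else
        echo "$name is unset or not a directory: '$value'" >&2
        return 1
    fi
}

# Example use (placeholders are the ones from the instructions above;
# single quotes keep $CONDA_PREFIX unexpanded inside the file):
#   f="<path_to_your_conda_env>/etc/conda/activate.d/env_vars.sh"
#   append_once "$f" 'export LIBRARY_PATH=$CONDA_PREFIX/lib:$LIBRARY_PATH'
#   append_once "$f" 'export PHENIX_ROOT="<phenix_directory>/phenix-2.0rc1-5617"'

# Non-destructive check of the current shell; run after reactivating the env.
check_dir OPENFOLD_RESOURCES "${OPENFOLD_RESOURCES:-}" || true
check_dir PHENIX_ROOT "${PHENIX_ROOT:-}" || true
```

Running the script only reports on the two variables. The append calls are left as comments so that nothing is written until you fill in the real paths.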
    

3. Install ROCKET

Install ROCKET. First move to the parent folder and clone the ROCKET repo there (so you don't mix the ROCKET repo with the OpenFold one), then install it with pip:

git clone https://github.com/alisiafadini/ROCKET.git
cd ROCKET
pip install .

It will automatically install dependencies like SFcalculator and reciprocalspaceship.

Note: If you get errors about incompatibility of prompt_toolkit, ignore them.

For development mode, run:

pip install -e .

Run rk.score --help after installation. If you see the normal help text without errors, you are good to go!
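
If you want to script this final check, for example in a cluster job prologue, the following minimal sketch may help. It relies only on the rk.score --help command mentioned above. The helper name have_cmd is hypothetical.

```shell
#!/bin/sh
# Succeed if the given command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd rk.score; then
    # Discard the help text itself; only the exit status matters here.
    if rk.score --help >/dev/null 2>&1; then
        echo "rk.score is installed and its help text loads"
    else
        echo "rk.score was found but --help failed; check the install output" >&2
    fi
else
    echo "rk.score not found; is the conda environment active?" >&2
fi
```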

Citing

@article{fadini2025alphafold,
  title={AlphaFold as a Prior: Experimental Structure Determination Conditioned on a Pretrained Neural Network},
  author={Fadini, Alisia and Li, Minhuan and McCoy, Airlie J and Terwilliger, Thomas C and Read, Randy J and Hekstra, Doeke and AlQuraishi, Mohammed},
  journal={bioRxiv},
  year={2025}
}
