Structural heterogeneous cryoEM reconstruction: https://github.com/Gabriel-Ducrocq/cryoSPHERE

Project description

cryoSPHERE: Single-particle heterogeneous reconstruction from cryo EM

CryoSPHERE is a heterogeneous structural reconstruction software for cryoEM data. It requires an estimate of the CTF and the pose of each image, which can be obtained with other software. CryoSPHERE works with two YAML files: a parameters.yaml file, describing the hyperparameters used to train cryoSPHERE, and an image.yaml file, describing the images in the dataset. You can find commented examples of these files in the repository.

Installation

CryoSPHERE is available as a Python package named cryosphere. Create a conda environment, install cryosphere with pip, and then install pytorch3d:

conda create -n cryosphere python==3.9.20
conda activate cryosphere
pip install cryosphere
conda install pytorch3d -c pytorch3d

Training

Preliminary: consensus reconstruction.

The first step, before running cryoSPHERE on a dataset, is to run a homogeneous reconstruction software such as RELION or cryoSPARC. This should yield a star file containing the poses of each image, the CTF, and information about the images, as well as one or several mrcs file(s) containing the actual images. You should also obtain one or several mrc files corresponding to consensus reconstruction(s). For this tutorial, we assume your images are in a file called particles.mrcs and that, after a consensus reconstruction, you obtain a star file named particles.star and a consensus reconstruction file called consensus_map.mrc. This naming is not mandatory; your files can have arbitrary names as long as the extensions are correct.

This step is important to obtain an estimate of the CTF and the pose of each image.

First step: centering the structure

Fit a good atomic structure of the protein of interest into the volume obtained at the previous step (consensus_map.mrc), using e.g. ChimeraX. Save this structure in pdb format as fitted_structure.pdb. You can now use the cryosphere command line tools to center the structure and volume:

cryosphere_center_origin --pdb_file_path fitted_structure.pdb --mrc_file_path consensus_map.mrc

This yields a pdb file fitted_structure_centered.pdb containing the centered structure and an mrc file consensus_map_centered.mrc containing the centered consensus volume.
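As an illustration of what centering means here, the following minimal NumPy sketch (a hypothetical helper, not the actual cryosphere_center_origin implementation) translates a set of atomic coordinates so that their centroid sits at the origin:

```python
import numpy as np

def center_coords(coords):
    """Translate coordinates so their centroid lies at the origin."""
    centroid = coords.mean(axis=0)
    return coords - centroid, centroid

# toy atom positions in angstroms (illustrative values)
atoms = np.array([[10.0, 12.0, 8.0],
                  [14.0, 12.0, 8.0],
                  [12.0, 16.0, 8.0]])
centered, shift = center_coords(atoms)
print(np.allclose(centered.mean(axis=0), 0.0))  # True
```

The real tool applies the matching translation to the mrc volume as well, so that structure and map stay aligned after centering.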

First step bis (optional)

Since the dataset is usually very noisy, it might be helpful to apply a low-pass filter to the images. To determine the bandwidth cutoff, first turn the centered structure into a volume, using the same GMM representation of the protein as is used when training cryoSPHERE:

cryosphere_structure_to_volume --image_yaml /path/to/image.yaml --structure_path /path/to/fitted_structure_centered.pdb --output_path /path/to/fitted_structure_centered_volume.mrc

You can now compute the Fourier Shell Correlation (FSC) between fitted_structure_centered_volume.mrc and consensus_map_centered.mrc using available software. Find the frequency cutoff_freq at which the FSC equals 0.5, and set lp_bandwidth: 1/cutoff_freq in parameters.yaml. This means that, in the images, the frequencies freq > 1/lp_bandwidth are set to 0.
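If you prefer to compute the FSC yourself, a minimal NumPy sketch (assuming two equally sized cubic volumes on the same grid; an illustration, not a validated replacement for dedicated FSC tools) could look like this:

```python
import numpy as np

def fsc_curve(vol1, vol2, voxel_size=1.0):
    """Fourier Shell Correlation of two cubic volumes, shell by shell."""
    f1 = np.fft.fftshift(np.fft.fftn(vol1))
    f2 = np.fft.fftshift(np.fft.fftn(vol2))
    n = vol1.shape[0]
    grid = np.indices(vol1.shape) - n // 2
    radii = np.sqrt((grid ** 2).sum(axis=0)).astype(int)
    freqs, corrs = [], []
    for r in range(1, n // 2):
        shell = radii == r
        num = (f1[shell] * np.conj(f2[shell])).sum().real
        den = np.sqrt((np.abs(f1[shell]) ** 2).sum()
                      * (np.abs(f2[shell]) ** 2).sum())
        freqs.append(r / (n * voxel_size))  # spatial frequency in 1/angstrom
        corrs.append(num / den)
    return np.array(freqs), np.array(corrs)
```

In practice you would load the two .mrc volumes (e.g. with the mrcfile package) and take cutoff_freq as the first frequency where the curve crosses 0.5.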

Second step

The second step is to run cryoSPHERE itself. To run it, you need two yaml files: a parameters.yaml file, defining all the parameters of the training run, and an image.yaml file, containing information about the images. You need to set the folder_experiment entry of parameters.yaml to the path of the folder containing your data. You also need to change the base_structure entry to fitted_structure_centered.pdb. You can then run cryosphere using the command line tool:
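The entries mentioned above might look like this in parameters.yaml (paths and the bandwidth value are placeholders; the commented example in the repository documents the full set of hyperparameters):

```yaml
folder_experiment: /path/to/experiment_folder          # folder containing your data
base_structure: /path/to/fitted_structure_centered.pdb # centered pdb from the first step
lp_bandwidth: 8.0                                      # optional: 1/cutoff_freq from the FSC
```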

cryosphere_train --experiment_yaml /path/to/parameters.yaml

This command creates a folder named cryoSPHERE containing the PyTorch model checkpoints ckpt.pt, one saved at the end of each epoch. It also copies the parameters.yaml and image.yaml files into this directory and creates a run.log file to log training data.

Analysis

You can first get the latent variables corresponding to the images and generate a PCA analysis of the latent space, with a latent traversal of the first principal components:

cryosphere_analyze --experiment_yaml /path/to/parameters.yaml --model /path/to/model.pt --output_path /path/to/output_folder --no-generate_structures

where model.pt is the saved torch model you want to analyze and output_folder is the folder where you want to save the results of the analysis. This will create the following directory structure:

analysis
   |   z.npy
   |   pc0
   |       |   structure_z_1.pdb
   |       .
   |       .
   |       .
   |       |   structure_z_10.pdb
   |       |   pca.png
   |   pc1
   |       |   structure_z_1.pdb
   |       .
   |       .
   |       .
If you want to generate all structures (one for each image), you can set --generate_structures instead. This will skip the PCA step. The file z.npy contains the latent variable associated with each image (in the same order as the images in the star file), the .pdb files are the structures sampled along each principal component (from the lowest to the highest value along that PC), and the .png files are images of the PCA decompositions.
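The PCA traversal performed by the analysis tool can be sketched as follows (plain NumPy, a simplified stand-in for the actual implementation; the array sizes and the percentile range are assumptions):

```python
import numpy as np

# fake latent variables standing in for np.load("analysis/z.npy")
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 8))

# PCA via SVD on the centered latent variables
z_centered = z - z.mean(axis=0)
_, _, vt = np.linalg.svd(z_centered, full_matrices=False)
pc0 = vt[0]                      # first principal axis

# traverse pc0 from low to high: 10 evenly spaced points between the
# 5th and 95th percentiles of the projections, mapped back to latent space
proj = z_centered @ pc0
points = np.linspace(np.percentile(proj, 5), np.percentile(proj, 95), 10)
traversal = z.mean(axis=0) + points[:, None] * pc0   # 10 latent vectors
print(traversal.shape)  # (10, 8)
```

Each of these latent vectors is then decoded into one of the structure_z_1.pdb through structure_z_10.pdb files shown above.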

It is also possible to get the structures corresponding to specific images. Save the latent variables corresponding to the images of interest into a file z_interest.npy. You can then run:

cryosphere_analyze --experiment_yaml /path/to/parameters.yaml --model /path/to/model.pt --output_path /path/to/output_folder --z /path/to/z_interest.npy --generate_structures

Setting the --z /path/to/z_interest.npy argument will directly decode the latent variables in z_interest.npy into structures.
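For example, selecting the latent variables of a few images by their row index in z.npy could look like this (the indices and array sizes are made up for illustration; in practice you would np.load the real z.npy):

```python
import numpy as np

# z.npy holds one latent vector per image, in star-file order;
# here a small fake array stands in for np.load("analysis/z.npy")
z = np.arange(200 * 8, dtype=np.float32).reshape(200, 8)

indices = [5, 42, 107]           # hypothetical images of interest
z_interest = z[indices]
np.save("z_interest.npy", z_interest)
print(z_interest.shape)  # (3, 8)
```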
