Skip to main content

pySymStat: average orientations, including both spatial rotations and projection directions, while accounting for molecular symmetry.

Project description

pySymStat Overview

pySymStat is a Python software package designed to average orientations, including both spatial rotations and projection directions, while accounting for molecular symmetry.

Publications

This program package implements algorithm proposed in the paper The Moments of Orientation Estimations Considering Molecular Symmetry in Cryo-EM.

Installation

CryoSieve is an open-source software, developed using Python. Please access our source code on GitHub.

Installing pySymStat

Install pySymStat by executing the following command with Pip:

pip install pysymstat

Verifying Installation

Open your Python environment, and execute the following command to import the pySymStat package:

import pySymStat

Then, list all functions, classes, and modules included in pySymStat by running:

print(dir(pySymStat))

Tutorial

The function averaging_SO3_G.mean_variance_SO3_G calculates the mean and variance of a given set of spatial rotations represented as unit quaternions. Meanwhile, the function averaging_S2_G.mean_variance_S2_G computes the mean and variance of a given set of projection directions represented as unit vectors. Click the hyperlinks to view the descriptions, help information, source code, and demo script for these two functions.

A set of functions related to these calculations is also provided in this package. Please refer to the Function List section for a comprehensive list.

Function List

conversion.euler_to_quaternion

euler_to_quaternion(src: numpy.ndarray) -> numpy.ndarray
    The `euler_to_quaternion` function converts Euler angles (Relion's convention) to a unit quaternion.
    - `src`: This is a Numpy vector of type `np.float64` with a length of 3, representing `rlnAngleRot`, `rlnAngleTilt`, and `rlnAnglePsi` in Relion's starfile convention.

source demo

conversion.quaternion_to_euler

quaternion_to_euler(src: numpy.ndarray) -> numpy.ndarray
    The `quaternion_to_euler` function converts a unit quaternion to Euler angles (Relion's convention).
    - `src`: This is a Numpy vector of type `np.float64` with a length of 4, which is a quaternion representing a spatial rotation.

source demo

quaternion.quat_conj

quat_conj(q: numpy.ndarray) -> numpy.ndarray
    The `quat_conj` functions calculates the conjugate of an input quaternion.
    - `q`: This is a quaternion and should be a NumPy vector of type `np.float64` with a length of 4.

source demo

quaternion.quat_mult

quat_mult(q1: numpy.ndarray, q2: numpy.ndarray) -> numpy.ndarray
    The `quat_mult` functions calculates the product of two input quaternions.
    - `q1` and `q2`: This are two quaternions and should be NumPy vectors of type `np.float64` with a length of 4.

source demo

quaternion.quat_rotate

quat_rotate(q: numpy.ndarray, v: numpy.ndarray) -> numpy.ndarray
    The `quat_rotate` functions rotate this a vector based on a spatial rotation represented by a unit quarternion.
    - `q`: This is a quaternion and should be a NumPy vector of type `np.float64` with a length of 4. It represents the spatial rotation.
    - `v`: This is a vector in 3D space, which to be rotated.

source demo

distance.distance_SO3

distance_SO3(q1: numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]], q2: numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]], type: Literal['arithmetic', 'geometric'] = 'arithmetic') -> float
    The `distance_SO3` function calculates either the arithmetic or geometric distance between two spatial rotations.
    - `q1`, `q2`: These are the unit quaternion representations of spatial rotations, each a Numpy vector of type `np.float64` with a length of 4.
    - `type`: Specifies the type of distance calculation. Options are 'arithmetic' or 'geometric'.

source demo

distance.distance_S2

distance_S2(v1: numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]], v2: numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]], type: Literal['arithmetic', 'geometric'] = 'arithmetic') -> float
    The `distance_S2` function calculates either the arithmetic or geometric distance between two projection directions.
    - `v1`, `v2`: These are the unit vectors representing projection directions, each a Numpy vector of type `np.float64` with a length of 3.
    - `type`: Specifies the type of distance calculation. Options are 'arithmetic' or 'geometric'.

source demo

get_sym_grp

get_sym_grp(sym)
    The `get_sym_grp` function retrieves the elements, the Cayley table, and the irreducible representations for a specified molecular symmetry symbol.
    - `sym`: The molecular symmetry symbol. Acceptable inputs include `Cn`, `Dn`, `T`, `O`, `I`, `I1`, `I2`, `I3`. The symbols `I`, `I1`, `I2`, `I3` all denote icosahedral symmetry, but with different conventions. Notably, `I` is equivalent to `I2`. This convention is used in Relion. For more details, visit [Relion Conventions](https://relion.readthedocs.io/en/release-3.1/Reference/Conventions.html#symmetry).

source demo

distance.distance_SO_G

distance_SO3_G(q1: numpy.ndarray, q2: numpy.ndarray, sym_grp: str, type: Literal['arithmetic', 'geometric'] = 'arithmetic') -> Tuple[float, numpy.ndarray[Any, numpy.dtype[numpy.float64]]]
    The `distance_SO3_G` function calculates either the arithmetic or geometric distance between two spatial rotations with molecular symmetry.
    - `q1`, `q2`: These are the unit quaternion representations of spatial rotations, each a Numpy vector of type `np.float64` with a length of 4.
    - `sym`: The molecular symmetry symbol. Acceptable inputs include `Cn`, `Dn`, `T`, `O`, `I`, `I1`, `I2`, `I3`. The symbols `I`, `I1`, `I2`, `I3` all denote icosahedral symmetry, but with different conventions. Notably, `I` is equivalent to `I2`. This convention is used in Relion. For more details, visit [Relion Conventions](https://relion.readthedocs.io/en/release-3.1/Reference/Conventions.html#symmetry).
    - `type`: Specifies the type of distance calculation. Options are 'arithmetic' or 'geometric'.

    Output:
    - `output[0]`: The distance between two spatial rotations with molecular symmetry.
    - `output[1]`: The closest representative of `q1` to `q2`

source demo

distance.distance_S2_G

distance_S2_G(v1: numpy.ndarray, v2: numpy.ndarray, sym_grp: str, type: Literal['arithmetic', 'geometric'] = 'arithmetic') -> Tuple[float, numpy.ndarray[Any, numpy.dtype[numpy.float64]]]
    The `distance_S2_G` function calculates either the arithmetic or geometric distance between two projection directions with molecular symmetry.
    - `v1`, `v2`: These are the unit vectors representing projection directions, each a Numpy vector of type `np.float64` with a length of 3.
    - `sym`: The molecular symmetry symbol. Acceptable inputs include `Cn`, `Dn`, `T`, `O`, `I`, `I1`, `I2`, `I3`. The symbols `I`, `I1`, `I2`, `I3` all denote icosahedral symmetry, but with different conventions. Notably, `I` is equivalent to `I2`. This convention is used in Relion. For more details, visit [Relion Conventions](https://relion.readthedocs.io/en/release-3.1/Reference/Conventions.html#symmetry).
    - `type`: Specifies the type of distance calculation. Options are 'arithmetic' or 'geometric'.

    Output:
    - `output[0]`: The distance between two projection direction with molecular symmetry.
    - `output[1]`: The closest representative of `v1` to `v2`

source demo

averaging_SO3.mean_SO3

mean_SO3(quats: numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]], type: Literal['arithmetic'] = 'arithmetic') -> numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]]
    The `mean_SO3` function calculates the mean of a set of spatial rotations.

    - `quats`: Unit quaternion representations of a set of spatial rotations. It is a numpy array of shape `(n, 4)` with a data type of `np.float64`.
    - `type`: Specifies the type of distance. Only accepts the value `arithmetic`."

source demo

averaging_SO3.variance_SO3

variance_SO3(quats: numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]], type: Literal['arithmetic'] = 'arithmetic', mean: Literal[None, numpy.ndarray[Any, numpy.dtype[numpy.float64]]] = None) -> float
    The `variance_SO3` function calculates the variances of a set of spatial rotations.

    - `quats`: Unit quaternion representations of spatial rotations, provided as a numpy array with the shape `(n, 4)` and a data type of `np.float64`.
    - `type`: Specifies the type of distance calculation to be used. It only accepts the value `arithmetic`.
    - `mean`: Specifies the mean of the input spatial rotations. If this is `None`, the variance is calculated in an unsupervised manner.

source demo

averaging_S2.mean_S2

mean_S2(vecs: numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]], type: Literal['arithmetic'] = 'arithmetic') -> numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]]
    The `mean_S2` function calculates the mean of a set of projection directions.

    - `quats`: Unit vector representations of a set of projection directions. It is a numpy array of shape `(n, 3)` with a data type of `np.float64`.
    - `type`: Specifies the type of distance. Only accepts the value `arithmetic`."

source demo

averaging_S2.variance_S2

variance_S2(vecs: numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]], type: Literal['arithmetic'] = 'arithmetic', mean: Literal[None, numpy.ndarray[Any, numpy.dtype[numpy.float64]]] = None) -> float
    The `variance_S2` function calculates the variances of a set of projection directions.

    - `vecs`: Unit vector representations of projection directions, provided as a numpy array with the shape `(n, 3)` and a data type of `np.float64`.
    - `type`: Specifies the type of distance calculation to be used. It only accepts the value `arithmetic`.
    - `mean`: Specifies the mean of the input projection directions. If this is `None`, the variance is calculated in an unsupervised manner.

source demo

averaging_SO3_G.mean_variance_SO3_G

mean_variance_SO3_G(quats: numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]], sym_grp, type: Literal['arithmetic'] = 'arithmetic', **kwargs)
    The `mean_variance_SO3_G` function calculates the mean and variance of a set of spatial rotations with molecular symmetry.

    - `quats`: Unit quaternion representations of spatial rotations. It is a numpy array of shape `(n, 4)` with a data type of `np.float64`.
    - `sym`: The molecular symmetry symbol. Acceptable inputs include `Cn`, `Dn`, `T`, `O`, `I`, `I1`, `I2`, `I3`. The symbols `I`, `I1`, `I2`, `I3` all denote icosahedral symmetry, but with different conventions. Notably, `I` is equivalent to `I2`. This convention is used in Relion. For more details, visit [Relion Conventions](https://relion.readthedocs.io/en/release-3.1/Reference/Conventions.html#symmetry).
    - `type`: Specifies the type of distance calculation to be used. It only accepts the value `arithmetic`.

    Output:
    - `output[0]`: The mean of these spatial rotations.
    - `output[1]`: The variance of these spatial rotations.
    - `output[2]`: The correct representatives of these spatial rotations.
    - `output[3]`: The index of elements in the symmetry group corresponding to the correct representative.

source demo

averaging_S2_G.mean_variance_S2_G

mean_variance_S2_G(vecs: numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]], sym_grp, type: Literal['arithmetic'] = 'arithmetic', **kwargs)
    The `mean_variance_S2_G` function calculates the mean and variance of a set of projection directions with molecular symmetry.

    - `quats`: Unit quaternion representations of projection directions. It is a numpy array of shape `(n, 3)` with a data type of `np.float64`.
    - `sym`: The molecular symmetry symbol. Acceptable inputs include `Cn`, `Dn`, `T`, `O`, `I`, `I1`, `I2`, `I3`. The symbols `I`, `I1`, `I2`, `I3` all denote icosahedral symmetry, but with different conventions. Notably, `I` is equivalent to `I2`. This convention is used in Relion. For more details, visit [Relion Conventions](https://relion.readthedocs.io/en/release-3.1/Reference/Conventions.html#symmetry).
    - `type`: Specifies the type of distance calculation to be used. It only accepts the value `arithmetic`.

    Output:
    - `output[0]`: The mean of these projection directions.
    - `output[1]`: The variance of these projection directions.
    - `output[2]`: The correct representatives of these projection directions.
    - `output[3]`: The index of elements in the symmetry group corresponding to the correct representative.

source demo

Test Data and Reproduction of Results

In the test folder of the pySymStat repository, we provide the dataset and expected results for our simulation experiments. Interested researchers can utilize this data to reproduce the results presented in Section 4.1 of our paper.

Please note that the contents of the test folder are not included in the PyPI release version of pySymStat. Users are required to clone the repository into their local directory using the following command:

git clone https://github.com/mxhulab/pySymStat.git

Alternatively, users can download the Release source code file and extract it.

To rerun the test, navigate to the test directory and extract the data.zip file into a folder named data. Then, execute the run.sh script with the following command:

cd test
unzip data.zip -d data
bash run.sh

Please note that the entire testing process may take a considerable amount of time to complete. We have provided all intermediate results in the data.zip file. If you only wish to view the summary result, you can run the summary.py script using the following command:

python summary.py

This will print a summary of all the tests as follows:

Approximation ability of \tilde{L}^{SO(3)} to L^{SO(3)}.
  For group C2, RoE = 33.7%, Pr[RCG < 0.01] = 50.6%, Pr[RCG < 0.1] = 95.3%.
  For group C7, RoE = 44.8%, Pr[RCG < 0.01] = 60.5%, Pr[RCG < 0.1] = 94.2%.
  For group D2, RoE = 93.4%, Pr[RCG < 0.01] = 96.7%, Pr[RCG < 0.1] = 100.0%.
  For group D7, RoE = 96.1%, Pr[RCG < 0.01] = 99.1%, Pr[RCG < 0.1] = 100.0%.
  For group T , RoE = 98.4%, Pr[RCG < 0.01] = 99.9%, Pr[RCG < 0.1] = 100.0%.
  For group O , RoE = 98.6%, Pr[RCG < 0.01] = 100.0%, Pr[RCG < 0.1] = 100.0%.
  For group I , RoE = 99.9%, Pr[RCG < 0.01] = 100.0%, Pr[RCG < 0.1] = 100.0%.

Approximation ability of \tilde{L}^{S2} to L^{S2}.
According to THEOREM 2.2, RoE should be 100%.
  For group C2, RoE = 100.0%.
  For group C7, RoE = 100.0%.
  For group D2, RoE = 100.0%.
  For group D7, RoE = 100.0%.
  For group T , RoE = 100.0%.
  For group O , RoE = 100.0%.
  For group I , RoE = 100.0%.

Plot scatter point graph of (\tilde{L}^{S^2}, L^{S^2}).
Plot scatter point graph of (\tilde{L}^{SO(3)}, L^{SO(3)}).

Approximation ability of NUG approach with our rounding for \tilde{L}^{SO(3)}.
Show (RoE, max-RCG).
| Group |   d_{SO(3)}^A   |   d_{SO(3)}^G   |    d_{S2}^A     |    d_{S2}^G     |
|-------------------------------------------------------------------------------|
|   C2  | ( 98.1%, 0.82%) | ( 98.7%, 1.32%) | (100.0%, 0.00%) | (100.0%, 0.00%) |
|   C7  | ( 99.9%, 0.22%) | ( 98.7%, 2.33%) | (100.0%, 0.00%) | (100.0%, 0.00%) |
|   D2  | ( 99.9%, 0.13%) | (100.0%, 0.00%) | ( 99.9%, 0.02%) | (100.0%, 0.00%) |
|   D7  | (100.0%, 0.00%) | ( 99.9%, 0.58%) | (100.0%, 0.00%) | (100.0%, 0.00%) |
|   T   | (100.0%, 0.00%) | (100.0%, 0.00%) | (100.0%, 0.00%) | (100.0%, 0.00%) |
|   O   | (100.0%, 0.00%) | (100.0%, 0.00%) | (100.0%, 0.00%) | (100.0%, 0.00%) |
|   I   | (100.0%, 0.00%) | (100.0%, 0.00%) | (100.0%, 0.00%) | (100.0%, 0.00%) |
|-------------------------------------------------------------------------------|

Approximation ability of NUG approach with original rounding for \tilde{L}^{SO(3)}.
Show (RoE, max-RCG).
| Group |   d_{SO(3)}^A   |   d_{SO(3)}^G   |    d_{S2}^A     |    d_{S2}^G     |
|-------------------------------------------------------------------------------|
|   C2  | ( 81.1%, 4.32%) | ( 80.4%, 7.88%) | ( 90.1%, 7.94%) | ( 89.2%, 11.76%) |
|   C7  | ( 92.2%, 6.66%) | ( 84.9%, 10.06%) | ( 99.0%, 4.21%) | ( 98.9%, 1.66%) |
|-------------------------------------------------------------------------------|
The (RoE, max-RCG) results for varying $m$ under $d_{SO(3)}^A$.
|-------------------------------------------------------------------------------|
| Group |   m=12, c=0.99  |   m=4, c=0.99   |   m=20, c=0.5   |    m=20, c=0    |
|   C2  | ( 98.1%, 0.82%) | ( 98.0%, 0.82%) | ( 89.6%, 2.40%) | ( 89.6%, 2.40%) |
|   C7  | ( 99.8%, 0.50%) | ( 97.9%, 8.27%) | ( 95.7%, 2.48%) | ( 95.3%, 2.48%) |
|   D2  | ( 99.9%, 0.13%) | ( 99.9%, 0.13%) | ( 94.6%, 8.74%) | ( 94.6%, 8.74%) |
|   D7  | (100.0%, 0.00%) | ( 99.8%, 1.62%) | ( 96.8%, 13.11%) | ( 96.5%, 17.62%) |
|   T   | (100.0%, 0.00%) | (100.0%, 0.00%) | ( 98.2%, 33.32%) | ( 98.2%, 33.32%) |
|   O   | (100.0%, 0.00%) | (100.0%, 0.00%) | ( 99.2%, 6.89%) | ( 99.2%, 6.89%) |
|   I   | (100.0%, 0.00%) | (100.0%, 0.00%) | ( 99.7%, 2.00%) | ( 99.7%, 2.00%) |
|-------------------------------------------------------------------------------|

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pysymstat-1.3.tar.gz (56.3 kB view details)

Uploaded Source

Built Distribution

pysymstat-1.3-py3-none-any.whl (46.6 kB view details)

Uploaded Python 3

File details

Details for the file pysymstat-1.3.tar.gz.

File metadata

  • Download URL: pysymstat-1.3.tar.gz
  • Upload date:
  • Size: 56.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.0 CPython/3.12.4

File hashes

Hashes for pysymstat-1.3.tar.gz
Algorithm Hash digest
SHA256 00cde8e4163470d810f5b44399451275f3848be4fc00e3f9e3cb60d2574084d0
MD5 0ee58dd549bce42c224d527d129ff441
BLAKE2b-256 104db36c4fb7e1df7a05fdd3198420828ce18b890e3b444d50805b86149fbba1

See more details on using hashes here.

File details

Details for the file pysymstat-1.3-py3-none-any.whl.

File metadata

  • Download URL: pysymstat-1.3-py3-none-any.whl
  • Upload date:
  • Size: 46.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.0 CPython/3.12.4

File hashes

Hashes for pysymstat-1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 1dde04ff0b7cee2ae06681d351a9efdf68833f4a17cc1ca1e75fa06d6e688aba
MD5 9bc29a026e7091f315d0b2e715364c15
BLAKE2b-256 e098fb64e7d15a534326eaaa85ccfe5d68b46bea804c54d84a8ea3c1f96ed291

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page