A Python package reimplementing the Simple Atom Depth Index Calculator (SADIC) software
SADIC v2: A Modern Implementation of the Simple Atom Depth Index Calculator
This repository contains the source code of SADIC v2, a modern implementation of the Simple Atom Depth Index Calculator. It computes the SADIC depth index, a measure of atom depth in protein molecules.
The package is designed to be easy to use and to provide a fast and efficient implementation of the algorithm.
It can be used as a command-line tool or as a Python library. The package exposes a function for each individual step of the algorithm, giving the user full control over the computation, as well as a high-level function that computes the SADIC indices in a single line of Python code.
Authors
- Giacomo Nunziati
- Alessia Lucia Prete
- Sara Marziali
Install
Install from PyPI
pip install sadic
Install from source
git clone https://github.com/nunziati/sadic.git
cd sadic
pip install .
Requirements
SADIC v2 requires the packages numpy, scipy, biopandas and biopython.
The requirements are automatically downloaded while installing this package.
To install the requirements separately, run the following command:
pip install -U -r requirements.txt
Usage
The algorithm processes a protein structure and computes the depth index for each atom in the structure.
The protein structure can be provided as a PDB code or as a path to a PDB file.
The package is integrated with BioPython and BioPandas, so the input can also be provided as a BioPython Structure object or a BioPandas PDB Entity object.
Command Line Interface (CLI)
The CLI is a simplified interface to the package.
It accepts the input only as a PDB code or as a path to a PDB file, and it writes the output as a PDB file.
sadic <input> --output <output> [--config <config_file>]
Input can be:
- a PDB code of a protein structure
- a path to a PDB file (.pdb or .tar.gz)
Output must be a path to a PDB file (.pdb or .tar.gz)
The config file is optional and, if specified, must be a path to a Python file (.py) containing two dictionaries:
- sadic_config: a dictionary containing the configuration parameters for the SADIC algorithm
- output_config: a dictionary containing the configuration parameters for the output file
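As an illustration of the config file format, the sketch below writes a small config file and loads it back. Only the two dictionary names, sadic_config and output_config, come from the description above. The keys inside them and the load_config helper are hypothetical placeholders, not the package's actual options or API.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical config file content: only the two dictionary names are taken
# from the description above; the keys inside them are invented placeholders.
CONFIG_TEXT = '''
sadic_config = {"example_parameter": 1.0}
output_config = {"example_option": "value"}
'''


def load_config(path):
    """Execute a Python config file and return its two dictionaries."""
    namespace = runpy.run_path(str(path))
    configs = []
    for name in ("sadic_config", "output_config"):
        value = namespace.get(name)
        if not isinstance(value, dict):
            raise ValueError(f"{path} must define a dictionary named {name}")
        configs.append(value)
    return tuple(configs)


# Write the sample config to a temporary file and read it back
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.py"
    config_path.write_text(CONFIG_TEXT)
    sadic_config, output_config = load_config(config_path)
```

Because the config is an ordinary Python file, values can be computed or imported rather than only written as literals.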
Python interface
Simple usage
import sadic
# Input protein
pdb_code = "1GWD"
# Run the pipeline
result = sadic.sadic(pdb_code)
# (optional) Useful to retrieve the depth indices from the result object
output = result.get_depth_index()
# Save the output to a file
result.save_pdb("1gwd_sadic.pdb")
Filters and aggregations
Note: filters, atom aggregations and model aggregations are optional and independent from each other.
They can be used in any combination.
import sadic
# Input protein
pdb_code = "1GWD"
# Define the filter options
# Only return the SADIC indices for the atoms composing the alanine and glycine residues
filter_arg = {"residue_name": ["ALA", "GLY"]}
# Define the atom aggregation options
# Compute the depth index for each residue by averaging the depth indices of the atoms composing it
group_by = "residue_number"
aggregation_function = "mean"
atom_aggregation_arg = (group_by, aggregation_function)
# Define the model aggregation options
# If the pdb file contains multiple models, they can be aggregated
# In this case, the depth indices of corresponding atoms in different models are averaged
model_aggregation_arg = "mean"
# Run the pipeline
# Filter by residue name
result = sadic.sadic(pdb_code, filter_by = filter_arg)
# (optional) Useful to retrieve the depth indices from the result object
# Aggregate the depth indices of the atoms of the same residue
output = result.get_depth_index(atom_aggregation = atom_aggregation_arg)
# Save the output to a file
# Aggregate the depth indices of the different models
result.save_pdb("1gwd_sadic.pdb", model_aggregation = model_aggregation_arg)
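To make the meaning of a filter and of an atom aggregation concrete, here is a minimal standalone sketch that does not use the sadic package. The atom records and the two helper functions are hypothetical; they only mimic the idea of keeping the atoms of selected residues and then averaging the depth indices per residue number.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-atom records: (residue_number, residue_name, depth_index)
atoms = [
    (1, "ALA", 1.5),
    (1, "ALA", 0.5),
    (2, "LYS", 1.75),
    (3, "GLY", 0.25),
    (3, "GLY", 0.75),
]


def filter_atoms(records, residue_names):
    """Keep only the atoms whose residue name is in the given set."""
    return [record for record in records if record[1] in residue_names]


def aggregate_by_residue(records):
    """Average the depth indices of the atoms sharing a residue number."""
    groups = defaultdict(list)
    for number, _name, depth in records:
        groups[number].append(depth)
    return {number: mean(values) for number, values in groups.items()}


selected = filter_atoms(atoms, {"ALA", "GLY"})
per_residue = aggregate_by_residue(selected)
```

In this toy data the lysine atom is dropped by the filter, and each remaining residue ends up with one averaged value.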
Software
Our approach involves modeling each protein as a solid object composed of spheres centered on single atoms.
SADIC simulates probing the protein with a reference sphere, computed as the largest sphere inscribed in its molecular structure.
Let $r$ be the radius of this sphere and $V_{r_{max}}$ its volume.
During the simulation, the reference sphere is centered on each atom $i$ in turn, and the exposed volume $V_{r,i}$ is calculated.
The atom depth index $D_{i,r}$ of the $i$-th atom is given by:
$$ D_{i,r} = \frac{2V_{r,i}}{V_{r_{max}}} $$
The exposed volume $V_{r,i}$ indicates the volume of the portion of the reference sphere centered on the $i$-th atom that does not intersect the solid representation of the protein.
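The formula can be tried out numerically with a rough sketch like the one below. It is not the package's implementation: the depth_index function, its parameters and the sampling scheme are assumptions made for illustration. It samples a regular grid of points inside the reference sphere and counts the fraction that falls outside every atom sphere, which approximates $V_{r,i} / V_{r_{max}}$.

```python
import numpy as np


def depth_index(center, atom_centers, atom_radii, r, step=0.25):
    """Estimate D = 2 * V_exposed / V_sphere from a grid of sample points."""
    center = np.asarray(center, dtype=float)
    atom_centers = np.asarray(atom_centers, dtype=float).reshape(-1, 3)
    atom_radii = np.asarray(atom_radii, dtype=float).reshape(-1)

    # Regular grid over the bounding cube, restricted to the reference sphere
    axis = np.arange(-r, r + step, step)
    offsets = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    offsets = offsets.reshape(-1, 3)
    offsets = offsets[np.linalg.norm(offsets, axis=1) <= r]
    points = center + offsets

    # A sample point is buried if it lies inside at least one atom sphere
    dists = np.linalg.norm(points[:, None, :] - atom_centers[None, :, :], axis=2)
    is_buried = (dists <= atom_radii[None, :]).any(axis=1)

    exposed_fraction = 1.0 - is_buried.mean()
    return 2.0 * exposed_fraction


# No surrounding atoms: the whole reference sphere is exposed
isolated = depth_index([0, 0, 0], np.empty((0, 3)), [], r=2.0)
# One very large atom covering the whole reference sphere: nothing is exposed
fully_buried = depth_index([0, 0, 0], [[0, 0, 0]], [10.0], r=2.0)
```

Under this formula the index ranges from 0 for a fully buried atom to 2 for a fully exposed one, and an atom with half of its reference sphere exposed, as on a flat surface, scores about 1.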
Main algorithm
The SADIC v2 algorithm proceeds in several stages:
- Loading of protein data;
- Creation of the structured PDB entity;
- For each model found in the PDB file:
    - Creation of the continuous-space model of the protein under analysis;
    - Voxelization and definition of the discrete-space model approximating the protein solid;
    - Filling of the internal cavities of the protein;
    - Computation of the reference radius, which is used for the depth index calculation;
    - Computation of the depth indices for the atoms selected by the user.
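The voxelization, cavity filling and reference radius stages listed above can be sketched with NumPy and SciPy as follows. This is an illustrative approximation under stated assumptions, not the package's code: the function names, the default resolution and the use of a Euclidean distance transform to estimate the largest inscribed sphere are all choices made for this example.

```python
import numpy as np
from scipy import ndimage


def voxelize(centers, radii, resolution=0.5):
    """Mark every grid point that falls inside at least one atom sphere."""
    centers = np.asarray(centers, dtype=float)
    radii = np.asarray(radii, dtype=float)
    # Bounding box of all spheres, padded by one voxel of empty space
    low = (centers - radii[:, None]).min(axis=0) - resolution
    high = (centers + radii[:, None]).max(axis=0) + resolution
    axes = [np.arange(low[k], high[k] + resolution / 2, resolution) for k in range(3)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    solid = np.zeros(grid.shape[:3], dtype=bool)
    for center, radius in zip(centers, radii):
        solid |= np.linalg.norm(grid - center, axis=-1) <= radius
    return solid


def fill_cavities(solid):
    """Fill empty regions that are fully enclosed by the solid."""
    return ndimage.binary_fill_holes(solid)


def reference_radius(solid, resolution=0.5):
    """Estimate the radius of the largest sphere inscribed in the voxel solid."""
    # Each solid voxel gets its distance to the nearest empty voxel;
    # the maximum is the deepest point of the solid.
    distances = ndimage.distance_transform_edt(solid, sampling=resolution)
    return float(distances.max())


# Two overlapping toy "atoms" (coordinates and radii are made up)
solid = fill_cavities(voxelize([[0, 0, 0], [1.5, 0, 0]], [1.8, 1.8]))
radius = reference_radius(solid)
```

A finer resolution gives a closer approximation of the continuous solid at the cost of a larger grid, which is the trade-off behind exposing the grid resolution as a user option.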
Architecture
The software architecture of SADIC v2 is organized into distinct sub-packages:
- pdb for organizing the data of the input protein and managing the result of the execution of the algorithms;
- solid for modeling and manipulating the continuous-space and discrete-space solids representing the molecule;
- algorithm, where the core algorithms are defined.
The main sadic package exposes an API with a single function for executing the depth index computation pipeline.
Functionalities
Different types of input are supported:
- PDB code
- PDB file (raw .pdb or compressed .tar.gz)
- BioPython Structure object
- BioPandas PDB Entity object
The user can specify different options:
- Reference sphere radius
- Van der Waals radii for the atoms
- Grid resolution for the discretization of the protein
- Protein models to consider (in case of multiple models)
- Atom filters, to select only a subset of atoms
- Atom aggregations, to compute the depth index for groups of atoms
- Model aggregations, to obtain a single depth index for each atom (in case of multiple models)
The output can be obtained in different forms:
- Python list
- Numpy array
- Save to a .txt file
- Save to a .npy file (NumPy)
- PDB file (raw .pdb or compressed .tar.gz)
License
This project is MIT licensed.