A library for protein domain segmentation using Merizo and Chainsaw.
Protein Domain Segmentation Library
This library provides a single interface to two protein domain segmentation tools, Merizo and Chainsaw. It hides the setup each tool requires, so you can predict protein domain boundaries directly from PDB files or MDAnalysis universes.
Features
- Unified Interface: Use a consistent API to access both Merizo and Chainsaw functionalities.
- Flexible Input Support: Predict from PDB files or MDAnalysis universes.
- Lightweight Usage: Avoid integrating full tool codebases directly into your project.
Requirements
Python 3.10+
Installation
Install the library and its dependencies using pip:
pip install protein-domain-segmentation
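After installing, it can help to confirm that the interpreter meets the Python 3.10+ requirement and that pip registered the distribution. The following is a minimal sketch using only the standard library; the helper name `check_environment` is hypothetical and not part of this package.

```python
import sys
from importlib import metadata


def check_environment(dist_name: str = "protein-domain-segmentation") -> dict:
    """Report whether the interpreter and the installed package look usable."""
    try:
        # Look up the version recorded by pip for this distribution.
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return {
        "python_ok": sys.version_info >= (3, 10),  # requirement stated above
        "installed_version": version,              # None if not installed
    }


if __name__ == "__main__":
    print(check_environment())
```

If `installed_version` is `None`, the package is not visible to the interpreter you ran the check with, which usually points to a different virtual environment.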
Usage
Below is an example demonstrating how to use the MerizoCluster and ChainsawCluster classes:
from protein_domain_segmentation import MerizoCluster, ChainsawCluster
import MDAnalysis as mda
# Initialize clusters
merizo_cluster = MerizoCluster()
chainsaw_cluster = ChainsawCluster()
# Example PDB file path
pdb_path = "example.pdb"
# Predict chopping using Merizo
merizo_result = merizo_cluster.predict_from_pdb(pdb_path)
print("Merizo Results:", merizo_result)
# Predict chopping using Chainsaw
chainsaw_result = chainsaw_cluster.predict_from_pdb(pdb_path)
print("Chainsaw Results:", chainsaw_result)
# Predict chopping from an MDAnalysis Universe
u = mda.Universe(pdb_path)
merizo_from_universe = merizo_cluster.predict_from_universe(u)
print("Merizo Results from Universe:", merizo_from_universe)
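Because both classes expose the same `predict_from_pdb` method, a small helper can run several tools over several files. The sketch below is an illustration only: `predict_many` is a hypothetical name, it relies solely on the `predict_from_pdb(path)` call shown above, and it makes no assumption about the structure of the returned results.

```python
from typing import Any, Dict, Iterable, Mapping


def predict_many(
    predictors: Mapping[str, Any],
    pdb_paths: Iterable[str],
) -> Dict[str, Dict[str, Any]]:
    """Run each predictor on each PDB file.

    `predictors` maps a label to any object exposing predict_from_pdb(path),
    e.g. {"merizo": MerizoCluster(), "chainsaw": ChainsawCluster()}.
    A failure is stored as the exception instance instead of aborting the batch.
    """
    results: Dict[str, Dict[str, Any]] = {}
    for path in pdb_paths:
        per_tool: Dict[str, Any] = {}
        for label, predictor in predictors.items():
            try:
                per_tool[label] = predictor.predict_from_pdb(path)
            except Exception as exc:  # one bad file should not stop the run
                per_tool[label] = exc
        results[path] = per_tool
    return results
```

Storing exceptions next to successful results keeps the output shape uniform, so a later step can decide whether to retry, skip, or report the failing files.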
Warning for non-Linux Users (Merizo only)
If you are not on a Linux system, you might need to compile Stride, which is used by Merizo for secondary structure assignment.
Installation instructions for Stride are available on its official website.
Once it is built, set the STRIDE_PATH environment variable to point to the binary:
export STRIDE_PATH=/path/to/stride_executable
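Before running Merizo, you may want to check from Python that `STRIDE_PATH` points to a usable file. This is a minimal sketch: `find_stride` is a hypothetical helper, and the fallback to a `stride` binary on `PATH` is an assumption for illustration, not documented behavior of this library.

```python
import os
import shutil
from typing import Mapping, Optional


def find_stride(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return a usable Stride executable path, or None if none is found."""
    candidate = env.get("STRIDE_PATH")
    if candidate:
        # Accept the configured path only if it is an executable file.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
        return None
    # Assumption: fall back to a `stride` binary found on PATH.
    return shutil.which("stride", path=env.get("PATH"))


if __name__ == "__main__":
    print("Stride executable:", find_stride())
```

A `None` result when `STRIDE_PATH` is set means the variable points to a missing or non-executable file, which is worth fixing before calling `MerizoCluster`.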
API Overview
Classes
- Cluster: Abstract base class for clustering tools.
  - predict_from_pdb(pdb_path: str, model_params: Dict = None): Predicts the chopping of a protein from a PDB file.
  - predict_from_universe(universe: mda.Universe, model_params: Dict = None): Predicts the chopping of a protein from an MDAnalysis Universe.
- MerizoCluster: Implementation of Cluster for Merizo.
- ChainsawCluster: Implementation of Cluster for Chainsaw.
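The class layout above follows the usual abstract-base-class pattern. The sketch below shows how such an interface could look using Python's `abc` module. It is not the library's source code: `ClusterSketch` and `SingleDomainCluster` are hypothetical names, the universe argument is typed as `Any` to avoid importing MDAnalysis, and the returned dictionary is an invented format used only for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class ClusterSketch(ABC):
    """Illustrative stand-in for the documented Cluster base class."""

    @abstractmethod
    def predict_from_pdb(
        self, pdb_path: str, model_params: Optional[Dict] = None
    ) -> Any:
        """Predict the chopping of a protein from a PDB file."""

    @abstractmethod
    def predict_from_universe(
        self, universe: Any, model_params: Optional[Dict] = None
    ) -> Any:
        """Predict the chopping of a protein from an MDAnalysis Universe."""


class SingleDomainCluster(ClusterSketch):
    """Hypothetical toy backend: reports the whole structure as one domain."""

    def predict_from_pdb(self, pdb_path, model_params=None):
        return {"source": pdb_path, "domains": ["all"]}

    def predict_from_universe(self, universe, model_params=None):
        return {"source": "universe", "domains": ["all"]}
```

With this pattern, code that accepts the base type works with any backend, and instantiating the abstract class directly raises `TypeError`.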
Methods
- MerizoCluster.predict_from_pdb(pdb_path: str, model_params: Dict = None): Uses Merizo to predict chopping from a PDB file.
- ChainsawCluster.predict_from_pdb(pdb_path: str, model_params: Dict = None): Uses Chainsaw to predict chopping from a PDB file.
Credits
This package wraps the functionality of the following tools:
- Merizo: Developed by the UCL Bioinformatics Group. Visit the Merizo GitHub repository.
- Chainsaw: Developed by Jude Wells and collaborators. Visit the Chainsaw GitHub repository.
Citation
If you use this package in your research, please cite the respective papers for Merizo and Chainsaw:
- Merizo: Lau et al., 2023. Merizo: a rapid and accurate protein domain segmentation method using invariant point attention. Nature Communications.
- Chainsaw: Wells et al., Chainsaw: protein domain segmentation with fully convolutional neural networks. Bioinformatics.
License
This package is distributed under the MIT License. See the LICENSE file for details.
Acknowledgments
We extend our gratitude to the developers of Merizo and Chainsaw for their invaluable contributions to the field of protein domain segmentation.