A library for protein domain segmentation using Merizo and Chainsaw.

Protein Domain Segmentation Library

This library provides a single interface for protein domain segmentation with Merizo and Chainsaw. It hides the setup each tool requires, so you can predict protein domain boundaries directly from PDB files or MDAnalysis universes.

Features

  • Unified Interface: Use a consistent API to access both Merizo and Chainsaw functionalities.
  • Flexible Input Support: Predict from PDB files or MDAnalysis universes.
  • Lightweight Usage: Avoid integrating full tool codebases directly into your project.

Requirements

Python 3.10+

Installation

Install the library and its dependencies using pip:

pip install protein-domain-segmentation

Usage

Below is an example demonstrating how to use the MerizoCluster and ChainsawCluster classes:

from protein_domain_segmentation import MerizoCluster, ChainsawCluster
import MDAnalysis as mda

# Initialize clusters
merizo_cluster = MerizoCluster()
chainsaw_cluster = ChainsawCluster()

# Example PDB file path
pdb_path = "example.pdb"

# Predict chopping using Merizo
merizo_result = merizo_cluster.predict_from_pdb(pdb_path)
print("Merizo Results:", merizo_result)

# Predict chopping using Chainsaw
chainsaw_result = chainsaw_cluster.predict_from_pdb(pdb_path)
print("Chainsaw Results:", chainsaw_result)

# Predict chopping from an MDAnalysis Universe
u = mda.Universe(pdb_path)
merizo_from_universe = merizo_cluster.predict_from_universe(u)
print("Merizo Results from Universe:", merizo_from_universe)
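When comparing both tools over several structures, it can help to collect the predictions in one place. The sketch below is only an illustration: `run_predictors` is a hypothetical helper, not part of this library, and it makes no assumption about the result format beyond storing whatever `predict_from_pdb` returns. Any object exposing `predict_from_pdb(pdb_path)` can be passed in, which should include `MerizoCluster()` and `ChainsawCluster()`.

```python
from typing import Any, Dict, Iterable, Tuple


def run_predictors(
    predictors: Dict[str, Any], pdb_paths: Iterable[str]
) -> Tuple[Dict[str, Dict[str, Any]], Dict[Tuple[str, str], Exception]]:
    """Run every predictor on every PDB path (hypothetical helper).

    `predictors` maps a label to any object with a
    `predict_from_pdb(pdb_path)` method.
    Returns (results, errors): results[path][label] holds whatever the
    predictor returned, errors[(path, label)] holds the exception raised.
    """
    results: Dict[str, Dict[str, Any]] = {}
    errors: Dict[Tuple[str, str], Exception] = {}
    for path in pdb_paths:
        results[path] = {}
        for label, predictor in predictors.items():
            try:
                results[path][label] = predictor.predict_from_pdb(path)
            except Exception as exc:
                # Record the failure and keep going with the other tools/files.
                errors[(path, label)] = exc
    return results, errors
```

A call such as `run_predictors({"merizo": merizo_cluster, "chainsaw": chainsaw_cluster}, [pdb_path])` would then give one entry per file, keyed by tool label, with failures reported separately instead of stopping the run.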

Warning for non-Linux Users (Merizo only)

If you are not on a Linux system, you might need to compile Stride, which is used by Merizo for secondary structure assignment.

You can find the Stride installation procedure on its official website.

Then set the STRIDE_PATH environment variable to point to the binary:

export STRIDE_PATH=/path/to/stride_executable
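The same variable can also be set from Python, which may be convenient in notebooks or scripts. The following is a minimal sketch, assuming the library reads STRIDE_PATH from the process environment when Merizo runs, so it should be called before creating a `MerizoCluster`. `configure_stride` is a hypothetical helper, not part of this library.

```python
import os
import shutil
from typing import Optional


def configure_stride(stride_binary: Optional[str] = None) -> str:
    """Set STRIDE_PATH for the current process (hypothetical helper).

    Uses `stride_binary` if given, otherwise looks for a `stride`
    executable on PATH. Raises FileNotFoundError if no file is found.
    """
    candidate = stride_binary or shutil.which("stride")
    if candidate is None or not os.path.isfile(candidate):
        raise FileNotFoundError("Stride binary not found; compile it first.")
    if not os.access(candidate, os.X_OK):
        raise PermissionError(f"{candidate} is not executable.")
    # Assumption: the library picks this variable up from the environment.
    os.environ["STRIDE_PATH"] = os.path.abspath(candidate)
    return os.environ["STRIDE_PATH"]
```

Checking the path up front turns a missing or non-executable binary into an immediate, readable error rather than a failure later during prediction.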

API Overview

Classes

  • Cluster: Abstract base class for clustering tools.

    • predict_from_pdb(pdb_path: str, model_params: Dict = None): Predicts the chopping of a protein from a PDB file.
    • predict_from_universe(universe: mda.Universe, model_params: Dict = None): Predicts the chopping of a protein from an MDAnalysis Universe.
  • MerizoCluster: Implementation of Cluster for Merizo.

  • ChainsawCluster: Implementation of Cluster for Chainsaw.
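The class layout above follows a common abstract-base-class pattern. The sketch below shows one way such a hierarchy could be organized; it is not the library's actual implementation. `SketchCluster` is a hypothetical name, and routing `predict_from_universe` through a temporary PDB file is an assumption made for illustration. The only MDAnalysis call used is `universe.atoms.write(filename)`.

```python
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class SketchCluster(ABC):
    """Illustrative base class; not the library's actual implementation."""

    @abstractmethod
    def predict_from_pdb(
        self, pdb_path: str, model_params: Optional[Dict] = None
    ) -> Any:
        """Tool-specific prediction, implemented by each subclass."""

    def predict_from_universe(
        self, universe: Any, model_params: Optional[Dict] = None
    ) -> Any:
        # Assumption: write the atoms to a temporary PDB file, then reuse
        # the PDB code path so each tool needs only one implementation.
        with tempfile.TemporaryDirectory() as tmp_dir:
            pdb_path = os.path.join(tmp_dir, "universe.pdb")
            universe.atoms.write(pdb_path)
            return self.predict_from_pdb(pdb_path, model_params)
```

With this design, a subclass only has to implement `predict_from_pdb`, and the universe entry point comes for free.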

Methods

  • MerizoCluster.predict_from_pdb(pdb_path: str, model_params: Dict = None): Uses Merizo to predict chopping from a PDB file.

  • ChainsawCluster.predict_from_pdb(pdb_path: str, model_params: Dict = None): Uses Chainsaw to predict chopping from a PDB file.

Credits

This package wraps the functionality of Merizo and Chainsaw.

Citation

If you use this package in your research, please cite the respective papers for Merizo and Chainsaw:

  • Merizo: Lau et al., 2023. Merizo: a rapid and accurate protein domain segmentation method using invariant point attention. Nature Communications.
  • Chainsaw: Wells et al., Chainsaw: protein domain segmentation with fully convolutional neural networks. Bioinformatics.

License

This package is distributed under the MIT License. See the LICENSE file for details.

Acknowledgments

We extend our gratitude to the developers of Merizo and Chainsaw for their invaluable contributions to the field of protein domain segmentation.
