
Searching for Rosetta Binaries.


RosettaPy

A Python utility for wrapping Rosetta command line tools.


Overview

RosettaPy is a Python module that locates binaries of the Rosetta biomolecular modeling suite by their naming pattern and runs Rosetta from the command line. The module includes:

  • An object-oriented RosettaFinder class to search for binaries.
  • A RosettaBinary dataclass to represent the binary and its attributes.
  • A command-line wrapper dataclass Rosetta for handling Rosetta runs.
  • A RosettaScriptsVariableGroup dataclass to represent Rosetta scripts variables.
  • A simplified result analyzer RosettaEnergyUnitAnalyser to read and interpret Rosetta output score files.
  • A series of example applications that follow the design elements and patterns described above.
    • PROSS
    • FastRelax
    • RosettaLigand
    • Supercharge
    • MutateRelax
    • Cartesian ddG (in progress)
  • Unit tests to ensure reliability and correctness.

Features

  • Flexible Binary Search: Finds Rosetta binaries based on their naming convention.
  • Platform Support: Supports Linux and macOS operating systems.
  • Customizable Search Paths: Allows specification of custom directories to search.
  • Structured Binary Representation: Uses a dataclass to encapsulate binary attributes.
  • Command-Line Shortcut: Provides a quick way to find binaries via the command line.
  • Available on PyPI: Installable via pip without the need to clone the repository.
  • Unit Tested: Includes tests for the RosettaFinder and RosettaBinary classes to ensure functionality.

Naming Convention

The binaries are expected to follow this naming pattern:

rosetta_scripts[[.mode].oscompilerrelease]
  • Binary Name: rosetta_scripts (default) or another name you specify.
  • Mode (optional): default, mpi, or static.
  • OS (optional): linux or macos.
  • Compiler (optional): gcc or clang.
  • Release (optional): release or debug.

Examples of valid binary filenames:

  • rosetta_scripts (dockerized Rosetta)
  • rosetta_scripts.linuxgccrelease
  • rosetta_scripts.mpi.macosclangdebug
  • rosetta_scripts.static.linuxgccrelease
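
The convention above can be expressed as a regular expression. The following is a minimal sketch written from the pattern as described here, not RosettaPy's own implementation; the names ParsedName and parse_binary_filename are hypothetical and chosen only for illustration.

```python
import re
from typing import NamedTuple, Optional


class ParsedName(NamedTuple):
    """Components of a Rosetta binary filename (hypothetical helper type)."""
    binary_name: str
    mode: Optional[str]
    os: Optional[str]
    compiler: Optional[str]
    release: Optional[str]


# Pattern derived from the convention above: name[[.mode].oscompilerrelease]
# This is an illustration, not the regex RosettaPy uses internally.
_PATTERN = re.compile(
    r"^(?P<binary_name>[A-Za-z0-9_]+)"
    r"(?:"
    r"(?:\.(?P<mode>default|mpi|static))?"
    r"\.(?P<os>linux|macos)(?P<compiler>gcc|clang)(?P<release>release|debug)"
    r")?$"
)


def parse_binary_filename(filename: str) -> Optional[ParsedName]:
    """Split a filename into its components, or return None if it does not match."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    return ParsedName(**match.groupdict())
```

A bare name such as rosetta_scripts parses with every optional component set to None, while a name with an unknown suffix is rejected.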

Installation

Ensure you have Python 3.6 or higher installed.

Install via PyPI

You can install RosettaPy directly from PyPI:

pip install RosettaPy -U

Usage

Command-Line Shortcut

RosettaPy provides a command-line shortcut to quickly locate Rosetta binaries.

Using the whichrosetta Command

After installing RosettaPy, you can use the whichrosetta command in your terminal.

whichrosetta <binary_name>

Example:

To find the relax binary:

relax_bin=$(whichrosetta relax)
echo "$relax_bin"

This command assigns the full path of the relax binary to the relax_bin variable and prints it.
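
The same lookup can be driven from a Python script by calling the command as a subprocess. This is a minimal sketch that assumes whichrosetta prints the binary path on standard output, as in the shell example above, and exits with a non-zero status when nothing is found; the helper name locate_rosetta_binary is hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def locate_rosetta_binary(binary_name: str) -> Optional[str]:
    """Return the path printed by `whichrosetta`, or None if it is unavailable."""
    # The command only exists after RosettaPy has been installed.
    if shutil.which("whichrosetta") is None:
        return None
    result = subprocess.run(
        ["whichrosetta", binary_name],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    path = result.stdout.strip()
    # Assumption: a failed lookup yields a non-zero exit code or empty output.
    if result.returncode != 0 or not path:
        return None
    return path
```

Within Python code, the RosettaFinder class described in the next sections avoids the extra process and is usually the more direct choice.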

Importing the Module

You can also use RosettaPy in your Python scripts.

from RosettaPy import RosettaFinder, RosettaBinary

Finding a Rosetta Binary in Python

# Initialize the finder (optional custom search path)
finder = RosettaFinder(search_path='/custom/path/to/rosetta/bin')

# Find the binary (default is 'rosetta_scripts')
rosetta_binary = finder.find_binary('rosetta_scripts')

# Access binary attributes
print(f"Binary Name: {rosetta_binary.binary_name}")
print(f"Mode: {rosetta_binary.mode}")
print(f"OS: {rosetta_binary.os}")
print(f"Compiler: {rosetta_binary.compiler}")
print(f"Release: {rosetta_binary.release}")
print(f"Full Path: {rosetta_binary.full_path}")

Wrapping Rosetta

import os

# Assumption: these classes are exported at the package top level,
# like RosettaFinder and RosettaBinary above.
from RosettaPy import Rosetta, RosettaEnergyUnitAnalyser, RosettaScriptsVariableGroup

# `pdb` (path to the input structure) and `variants` (an iterable of
# variant names) must be defined by the caller before this point.

# Create a Rosetta object with the desired parameters
rosetta = Rosetta(
    bin="rosetta_scripts",
    flags=[...],
    opts=[
        "-in:file:s", os.path.abspath(pdb),
        "-parser:protocol", "/path/to/my_rosetta_scripts.xml",
    ],
    output_dir=...,
    save_all_together=True,
    job_id=...,
)

# Create one task per variant
tasks = [
    {
        "rsv": RosettaScriptsVariableGroup.from_dict(
            {
                "var1": ...,
                "var2": ...,
                "var3": ...,
            }
        ),
        "-out:file:scorefile": f"{variant}.sc",
        "-out:prefix": f"{variant}.",
    }
    for variant in variants
]

# Run the Rosetta tasks
rosetta.run(inputs=tasks)

# Analyze the results
analyser = RosettaEnergyUnitAnalyser(score_file=rosetta.output_scorefile_dir)
best_hit = analyser.best_decoy
pdb_path = os.path.join(rosetta.output_pdb_dir, f'{best_hit["decoy"]}.pdb')

print("Analysis of the best decoy:")
print("-" * 79)
print(analyser.df.sort_values(by=analyser.score_term))

print("-" * 79)

print(f'Best Hit on this Rosetta run: {best_hit["decoy"]} - {best_hit["score"]}: {pdb_path}')
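
To make the analysis step more concrete, here is a stand-alone sketch of how a best decoy could be picked from Rosetta score file lines. It is not the implementation of RosettaEnergyUnitAnalyser. It assumes the usual score file layout, in which the first line starting with SCORE: holds the column names and each later SCORE: line holds one decoy; the helper names read_score_lines and pick_best_decoy are hypothetical.

```python
from typing import Dict, List


def read_score_lines(lines: List[str]) -> List[Dict[str, str]]:
    """Parse `SCORE:` lines into one dict per decoy, keyed by column name."""
    header = None
    rows = []
    for line in lines:
        if not line.startswith("SCORE:"):
            continue  # skip SEQUENCE: and any other non-score lines
        fields = line.split()[1:]
        if header is None:
            header = fields  # the first SCORE: line names the columns
            continue
        if fields == header or len(fields) != len(header):
            continue  # skip repeated headers and malformed rows
        rows.append(dict(zip(header, fields)))
    return rows


def pick_best_decoy(rows: List[Dict[str, str]], score_term: str = "total_score") -> Dict[str, str]:
    """Return the row with the lowest value of `score_term` (lower energy is better)."""
    return min(rows, key=lambda row: float(row[score_term]))
```

With a real file, the lines can be supplied with `open(path).read().splitlines()`.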

Environment Variables

The RosettaFinder searches the following directories by default:

  1. The directories on PATH, which is the usual location in dockerized Rosetta images.
  2. The path specified in the ROSETTA_BIN environment variable.
  3. ROSETTA3/bin
  4. ROSETTA/main/source/bin/
  5. A custom search path provided during initialization.
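
The list above can be sketched as a function that collects candidate directories from the environment. This is an illustration only: it assumes ROSETTA3 and ROSETTA are environment variables pointing at Rosetta installation roots and that the directories are tried in the order listed, neither of which is confirmed here. The function name candidate_search_dirs is hypothetical.

```python
import os
from typing import List, Mapping, Optional


def candidate_search_dirs(env: Mapping[str, str], custom_path: Optional[str] = None) -> List[str]:
    """Build an ordered, de-duplicated list of directories to search."""
    dirs: List[str] = []
    # 1. Every entry on PATH
    dirs.extend(p for p in env.get("PATH", "").split(os.pathsep) if p)
    # 2. ROSETTA_BIN, used as-is
    if env.get("ROSETTA_BIN"):
        dirs.append(env["ROSETTA_BIN"])
    # 3. ROSETTA3/bin (assumes ROSETTA3 is an environment variable)
    if env.get("ROSETTA3"):
        dirs.append(os.path.join(env["ROSETTA3"], "bin"))
    # 4. ROSETTA/main/source/bin (assumes ROSETTA is an environment variable)
    if env.get("ROSETTA"):
        dirs.append(os.path.join(env["ROSETTA"], "main", "source", "bin"))
    # 5. Custom path supplied by the caller
    if custom_path:
        dirs.append(custom_path)
    # Drop duplicates while keeping the first occurrence of each directory.
    seen = set()
    unique = []
    for d in dirs:
        if d not in seen:
            seen.add(d)
            unique.append(d)
    return unique
```

Passing `os.environ` as `env` evaluates the list for the current process, and a plain dict makes the behaviour easy to check in isolation.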

Running Tests

The project includes unit tests written for the pytest framework.

  1. Clone the repository (if not already done):

    git clone https://github.com/YaoYinYing/RosettaPy.git
    
  2. Navigate to the project directory:

    cd RosettaPy
    
  3. Run the tests:

    python -m pytest ./tests
    

Contributing

Contributions are welcome! Please submit a pull request or open an issue for bug reports and feature requests.

License

This project is licensed under the MIT License.

Acknowledgements

  • Rosetta Commons: The Rosetta software suite for the computational modeling and analysis of protein structures.

Contact

For questions or support, please contact:

  • Name: Yinying Yao
  • Email: yaoyy.hi(a)gmail.com
