RosettaPy
A Python utility for wrapping Rosetta command line tools.
Overview
RosettaPy is a Python module that locates binaries of the Rosetta biomolecular modeling suite that follow a specific naming pattern and runs Rosetta from the command line. The module includes:
- An object-oriented RosettaFinder class to search for binaries.
- A RosettaBinary dataclass to represent the binary and its attributes.
- A command-line wrapper dataclass Rosetta for handling Rosetta runs.
- A RosettaScriptsVariableGroup dataclass to represent Rosetta scripts variables.
- A simplified result analyzer RosettaEnergyUnitAnalyser to read and interpret Rosetta output score files.
- A series of example applications that follow the design elements and patterns described above:
  - PROSS
  - FastRelax
  - RosettaLigand
  - Supercharge
  - MutateRelax
  - Cartesian ddG (on the way)
- Unit tests to ensure reliability and correctness.
Features
- Flexible Binary Search: Finds Rosetta binaries based on their naming convention.
- Platform Support: Supports Linux and macOS operating systems.
- Customizable Search Paths: Allows specification of custom directories to search.
- Structured Binary Representation: Uses a dataclass to encapsulate binary attributes.
- Command-Line Shortcut: Provides a quick way to find binaries via the command line.
- Available on PyPI: Installable via pip without the need to clone the repository.
- Unit Tested: Includes tests for both classes to ensure functionality.
Naming Convention
The binaries are expected to follow this naming pattern:
rosetta_scripts[[.mode].oscompilerrelease]
- Binary Name: rosetta_scripts (default) or specified.
- Mode (optional): default, mpi, or static.
- OS (optional): linux or macos.
- Compiler (optional): gcc or clang.
- Release (optional): release or debug.
Examples of valid binary filenames:
- rosetta_scripts (dockerized Rosetta)
- rosetta_scripts.linuxgccrelease
- rosetta_scripts.mpi.macosclangdebug
- rosetta_scripts.static.linuxgccrelease
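The naming pattern above can be written as a regular expression. The following is a minimal sketch, not RosettaPy's actual implementation: the helper name parse_binary_name and the dictionary keys are hypothetical, and the binary name is assumed to consist of word characters only.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern mirroring rosetta_scripts[[.mode].oscompilerrelease].
# The whole suffix is optional; the mode is optional inside the suffix.
_BINARY_PATTERN = re.compile(
    r"^(?P<binary_name>\w+)"
    r"(?:"
    r"(?:\.(?P<mode>default|mpi|static))?"
    r"\.(?P<os>linux|macos)"
    r"(?P<compiler>gcc|clang)"
    r"(?P<release>release|debug)"
    r")?$"
)


def parse_binary_name(filename: str) -> Optional[Dict[str, Optional[str]]]:
    """Split a binary filename into its parts, or return None if it does not match."""
    match = _BINARY_PATTERN.match(filename)
    return match.groupdict() if match else None
```

Parts that are absent from the filename come back as None, so a bare rosetta_scripts still parses, while a name with a mode but no OS/compiler/release suffix is rejected.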
Installation
Ensure you have Python 3.8 or higher installed.
Install via PyPI
You can install RosettaPy directly from PyPI:
pip install RosettaPy -U
Usage
Command-Line Shortcut
RosettaPy provides a command-line shortcut to quickly locate Rosetta binaries.
Using the whichrosetta Command
After installing RosettaPy, you can use the whichrosetta command in your terminal.
whichrosetta <binary_name>
Example:
To find the relax binary:
relax_bin=$(whichrosetta relax)
echo $relax_bin
This command assigns the full path of the relax binary to the relax_bin variable and prints it.
Importing the Module
You can also use RosettaPy in your Python scripts.
from RosettaPy import RosettaFinder, RosettaBinary
Finding a Rosetta Binary in Python
# Initialize the finder (optional custom search path)
finder = RosettaFinder(search_path='/custom/path/to/rosetta/bin')
# Find the binary (default is 'rosetta_scripts')
rosetta_binary = finder.find_binary('rosetta_scripts')
# Access binary attributes
print(f"Binary Name: {rosetta_binary.binary_name}")
print(f"Mode: {rosetta_binary.mode}")
print(f"OS: {rosetta_binary.os}")
print(f"Compiler: {rosetta_binary.compiler}")
print(f"Release: {rosetta_binary.release}")
print(f"Full Path: {rosetta_binary.full_path}")
Wrapping Rosetta
# Imports
import os

from RosettaPy import Rosetta, RosettaScriptsVariableGroup, RosettaEnergyUnitAnalyser

# Create a Rosetta object with the desired parameters
# (pdb, variants, and nstruct are placeholders defined by your own script)
rosetta = Rosetta(
    bin="rosetta_scripts",
    flags=[...],
    opts=[
        "-in:file:s", os.path.abspath(pdb),
        "-parser:protocol", "/path/to/my_rosetta_scripts.xml",
    ],
    output_dir=...,
    save_all_together=True,
    job_id=...,
)

# Create one task for each variant
tasks = [
    {
        "rsv": RosettaScriptsVariableGroup.from_dict(
            {
                "var1": ...,
                "var2": ...,
                "var3": ...,
            }
        ),
        "-out:file:scorefile": f"{variant}.sc",
        "-out:prefix": f"{variant}.",
    }
    for variant in variants
]

# Run the tasks
rosetta.run(inputs=tasks)

# Or create distributed runs with structure labels (-nstruct)
options = [...]  # Optional list of options applied to all structure models
rosetta.run(nstruct=nstruct, inputs=options)

# Analyze the results
analyser = RosettaEnergyUnitAnalyser(score_file=rosetta.output_scorefile_dir)
best_hit = analyser.best_decoy
pdb_path = os.path.join(rosetta.output_pdb_dir, f'{best_hit["decoy"]}.pdb')

print("Analysis of the best decoy:")
print("-" * 79)
print(analyser.df.sort_values(by=analyser.score_term))
print("-" * 79)
print(f'Best Hit on this run: {best_hit["decoy"]} - {best_hit["score"]}: {pdb_path}')
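To make the analysis step less opaque, here is a minimal sketch of reading a Rosetta score file by hand. It is not how RosettaEnergyUnitAnalyser is implemented. It assumes the plain-text layout in which the header line and every decoy line start with SCORE:, and that a lower total_score is better. The function name read_score_lines is hypothetical and the sample values are illustrative, not real Rosetta output.

```python
from typing import Dict, List


def read_score_lines(lines: List[str]) -> List[Dict[str, str]]:
    """Parse 'SCORE:' lines; the first one is the header, the rest are decoys."""
    header: List[str] = []
    rows: List[Dict[str, str]] = []
    for line in lines:
        if not line.startswith("SCORE:"):
            continue  # skip other lines, e.g. a leading 'SEQUENCE:' line
        fields = line.split()[1:]  # drop the 'SCORE:' tag itself
        if not header:
            header = fields
        else:
            rows.append(dict(zip(header, fields)))
    return rows


# Illustrative values only, not real Rosetta output.
example = [
    "SEQUENCE:",
    "SCORE: total_score description",
    "SCORE: -10.5 model_0001",
    "SCORE: -12.0 model_0002",
]
rows = read_score_lines(example)
best = min(rows, key=lambda row: float(row["total_score"]))
print(best["description"])
```

All values are kept as strings and converted only when compared, which keeps the sketch independent of pandas.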
Environment Variables
The RosettaFinder searches the following directories by default:
- PATH, which is commonly used in dockerized Rosetta images.
- The path specified in the ROSETTA_BIN environment variable.
- ROSETTA3/bin
- ROSETTA/main/source/bin/
- A custom search path provided during initialization.
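The list above can be sketched as a function that collects candidate directories. This is an illustration under assumptions, not RosettaFinder's code: the function name candidate_dirs is hypothetical, ROSETTA3 and ROSETTA are assumed to be environment variables pointing at a Rosetta installation, and the list order is assumed to be the search order, which the list itself does not confirm.

```python
import os
from typing import List, Mapping, Optional


def candidate_dirs(env: Mapping[str, str], search_path: Optional[str] = None) -> List[str]:
    """Collect directories to search, in the order listed above, without duplicates."""
    dirs: List[str] = []
    # Every non-empty entry of PATH (the usual location in dockerized Rosetta images).
    dirs.extend(p for p in env.get("PATH", "").split(os.pathsep) if p)
    if env.get("ROSETTA_BIN"):
        dirs.append(env["ROSETTA_BIN"])
    # Assumption: ROSETTA3 and ROSETTA are environment variables.
    if env.get("ROSETTA3"):
        dirs.append(os.path.join(env["ROSETTA3"], "bin"))
    if env.get("ROSETTA"):
        dirs.append(os.path.join(env["ROSETTA"], "main", "source", "bin"))
    if search_path:
        dirs.append(search_path)
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(dirs))
```

Passing the environment as a mapping (for example os.environ) keeps the function easy to test without touching the real process environment.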
Running Tests
The project includes unit tests written with the pytest framework.
- Clone the repository (if not already done):
  git clone https://github.com/YaoYinYing/RosettaPy.git
- Navigate to the project directory:
  cd RosettaPy
- Run the tests:
  python -m pytest ./tests
Contributing
Contributions are welcome! Please submit a pull request or open an issue for bug reports and feature requests.
License
This project is licensed under the MIT License.
Acknowledgements
- Rosetta Commons: The Rosetta software suite for the computational modeling and analysis of protein structures.
Contact
For questions or support, please contact:
- Name: Yinying Yao
- Email: yaoyy.hi(a)gmail.com