Hydrascreen Python package.

Hydrascreen

This package provides a Python client for making predictions with the HydraScreen API. It lets users upload protein and ligand files, run predictions, and retrieve the predicted affinity and pose confidence for each prediction. A GUI tool with the same functionality is also available: HydraScreen GUI.

Installation

Install hydrascreen with pip:

pip install hydrascreen

Usage

Login

First, log in to HydraScreen by providing your email and your organization.

from hydrascreen import login

# login returns the predictor object used for all later prediction calls
predictor = login(
    email='user@email.com',
    organization='User Org'
)

Getting predictions

Call the predict_for_protein function to get predictions for your docked protein-ligand pairs.

protein_file must be a Path object pointing to a PDB file. The protein .pdb file must include explicit hydrogens and charges, and must be free of waters, metal ions, and salts.

ligand_files must be a list of Path objects pointing to docked SDF files. Each ligand file must contain a single molecule with one or more docked poses, including all hydrogens and charges.
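Some of the input requirements above can be checked locally before anything is uploaded. The following is a minimal sketch, not part of the hydrascreen package: the helper names (count_sdf_records, count_water_records, check_inputs) are hypothetical. It relies only on two file-format conventions: each SDF record ends with a `$$$$` line, and the PDB residue name sits in columns 18-20. It does not verify hydrogens, charges, metal ions, or salts.

```python
from pathlib import Path


def count_sdf_records(sdf_text: str) -> int:
    """Count records (docked poses) in SDF text; each record ends with a '$$$$' line."""
    return sum(1 for line in sdf_text.splitlines() if line.strip() == "$$$$")


def count_water_records(pdb_text: str) -> int:
    """Count ATOM/HETATM lines whose residue name (columns 18-20) is HOH."""
    return sum(
        1
        for line in pdb_text.splitlines()
        if line.startswith(("ATOM", "HETATM")) and line[17:20] == "HOH"
    )


def check_inputs(protein_file: Path, ligand_files: list) -> list:
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    if protein_file.suffix.lower() != ".pdb":
        problems.append(f"{protein_file}: expected a .pdb file")
    elif not protein_file.is_file():
        problems.append(f"{protein_file}: file not found")
    elif count_water_records(protein_file.read_text()) > 0:
        problems.append(f"{protein_file}: contains water (HOH) records")

    for ligand_file in ligand_files:
        if ligand_file.suffix.lower() != ".sdf":
            problems.append(f"{ligand_file}: expected a .sdf file")
        elif not ligand_file.is_file():
            problems.append(f"{ligand_file}: file not found")
        elif count_sdf_records(ligand_file.read_text()) == 0:
            problems.append(f"{ligand_file}: no '$$$$'-terminated records found")
    return problems
```

Calling check_inputs with the same protein_file and ligand_files you intend to pass to predict_for_protein returns an empty list when none of these basic checks fail.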

from pathlib import Path

# One protein PDB file and a list of docked ligand SDF files
results = predictor.predict_for_protein(
    protein_file=Path('/path/to/protein.pdb'),
    ligand_files=[
        Path('/path/to/ligand1.sdf'),
        Path('/path/to/ligand2.sdf')
    ]
)

The output is a results dataclass with two attributes, each a pandas DataFrame holding predictions for your protein-ligand pairs:

results.affinity: aggregated affinity scores of each protein-ligand complex
results.pose: pose scores for each pose separately

To run several proteins, each with its own ligands, loop over the inputs as follows:

from pathlib import Path

input_pairs = [
    {
        "protein_file": Path('/path/to/protein1.pdb'),
        "ligand_files": [
            Path('/path/to/ligand1.sdf'),
            Path('/path/to/ligand2.sdf')
        ]
    },
    {
        "protein_file": Path('/path/to/protein2.pdb'),
        "ligand_files": [
            Path('/path/to/ligand3.sdf'),
            Path('/path/to/ligand4.sdf')
        ]
    }
]

affinities = []
poses = []
for input_pair in input_pairs:
    results = predictor.predict_for_protein(**input_pair)
    affinities.append(results.affinity)
    poses.append(results.pose)

The output will be 2 lists of pandas DataFrames with the prediction results for your protein-ligand pairs.
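If a single table per result type is more convenient, the lists can be combined with pandas.concat. This is a minimal sketch: the DataFrames below are stand-ins for the affinities list built in the loop, and their values are placeholders, not real predictions.

```python
import pandas as pd

# Stand-ins for the `affinities` list from the loop above (placeholder values only)
affinities = [
    pd.DataFrame({
        "pdb_id": ["protein1", "protein1"],
        "ligand_id": ["ligand1", "ligand2"],
        "affinity": [0.0, 0.0],
    }),
    pd.DataFrame({
        "pdb_id": ["protein2", "protein2"],
        "ligand_id": ["ligand3", "ligand4"],
        "affinity": [0.0, 0.0],
    }),
]

# Stack the per-protein tables into one and renumber the rows from 0
all_affinities = pd.concat(affinities, ignore_index=True)
```

The same call works for the poses list. The combined table can then be written out with all_affinities.to_csv if needed.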

Outputs

Below is an example of the resulting affinity and pose DataFrames for a protein and 2 docked ligands, with 2 and 3 docked poses respectively.

Affinity

Columns:

pdb_id: Name of the protein the ligands are docked to (provided protein PDB file name).
ligand_id: Name of the ligand docked to the pdb_id protein (provided ligand SDF file name).
affinity: Predicted affinity of protein-ligand pair overall, expressed in pKi units.

pdb_id,  ligand_id,               affinity
protein, protein_docked_ligand_0, 0.84967568666
protein, protein_docked_ligand_1, 0.8498707
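Since the affinity is reported in pKi units, it can be converted to an inhibition constant using the standard definition pKi = -log10(Ki), with Ki in mol/L. The sketch below assumes the predicted values follow that definition; the function names are illustrative and not part of hydrascreen.

```python
def pki_to_ki_molar(pki: float) -> float:
    """Convert pKi to Ki in mol/L, assuming pKi = -log10(Ki)."""
    return 10.0 ** (-pki)


def pki_to_ki_nanomolar(pki: float) -> float:
    """Convert pKi to Ki in nmol/L (1 mol/L = 1e9 nmol/L)."""
    return 10.0 ** (9.0 - pki)
```

Under this definition, a higher pKi corresponds to a smaller Ki, that is, tighter predicted binding.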

Pose

Columns:

pdb_id: Name of the protein the ligands are docked to (provided protein PDB file name).
ligand_id: Name of the ligand docked to the pdb_id protein (provided ligand SDF file name).
pose_id: Sequential pose number based on the order of the docked ligand poses in the SDF file.
pose_confidence: Pose confidence, ranging from low (0) to high (1), indicating the model's confidence that the pose matches the true protein-ligand co-crystal structure. Note that this is based solely on the model's prediction, not on a direct comparison with an existing co-crystal structure.

pdb_id,  ligand_id,               pose_id, pose_confidence
protein, protein_docked_ligand_0, 0,       0.9360706533333333
protein, protein_docked_ligand_0, 1,       0.9487579333333334
protein, protein_docked_ligand_1, 0,       0.8837728666666665
protein, protein_docked_ligand_1, 1,       0.9275542666666666
protein, protein_docked_ligand_1, 2,       0.8115468833333334
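One common follow-up is to keep only the most confident pose per ligand and join it to the affinity table. The sketch below rebuilds the two example tables above by hand as stand-ins for results.affinity and results.pose; the best_pose_id and best_pose_confidence column names are choices made here, not something the package produces.

```python
import pandas as pd

# Stand-ins for results.affinity and results.pose, copied from the example tables
affinity = pd.DataFrame({
    "pdb_id": ["protein", "protein"],
    "ligand_id": ["protein_docked_ligand_0", "protein_docked_ligand_1"],
    "affinity": [0.84967568666, 0.8498707],
})
pose = pd.DataFrame({
    "pdb_id": ["protein"] * 5,
    "ligand_id": ["protein_docked_ligand_0"] * 2 + ["protein_docked_ligand_1"] * 3,
    "pose_id": [0, 1, 0, 1, 2],
    "pose_confidence": [
        0.9360706533333333,
        0.9487579333333334,
        0.8837728666666665,
        0.9275542666666666,
        0.8115468833333334,
    ],
})

# Row label of the highest-confidence pose within each protein-ligand pair
best_rows = pose.groupby(["pdb_id", "ligand_id"])["pose_confidence"].idxmax()
best_pose = pose.loc[best_rows].rename(
    columns={"pose_id": "best_pose_id", "pose_confidence": "best_pose_confidence"}
)

# One row per protein-ligand pair: affinity plus its most confident pose
summary = affinity.merge(best_pose, on=["pdb_id", "ligand_id"], how="left")
```

For the example data, the summary has one row per ligand, and pose 1 is the most confident pose for both ligands.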

Development

Install the development requirements and set up the pre-commit hooks:

pip install -r requirements-dev.txt
pre-commit install

License

HydraScreen is available for non-commercial use only. For more information, see the LICENSE file.

Release history

0.0.8: Dec 18, 2023
0.0.7: Nov 2, 2023
0.0.6: Oct 17, 2023
0.0.5: Oct 12, 2023
0.0.4: Sep 1, 2023
0.0.3: Aug 17, 2023 (this version)

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

hydrascreen-0.0.3-6-py3-none-any.whl (6.4 kB)

Uploaded Aug 17, 2023 Python 3

Hashes for hydrascreen-0.0.3-6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b3ae210e056fdce4b31d909258e4bb951a76795766cef4918a65b65f58cd364c`
MD5	`e72f4895f1ed43a895c24c3223153243`
BLAKE2b-256	`4a8bb947260091417bca4473dbccd596186c46e8be2ced26c067f2079834c2e2`