Python SDK for interacting with the QDX Tengu API and modules

tengu-py

Below we’ll walk through the process of building and running a drug discovery workflow, where we prepare a protein and ligand for molecular dynamics simulation, run the molecular dynamics, and perform a quantum lattice interaction energy calculation.

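First, install the following modules via pip (we require Python > 3.10). The imports below also use requests and py3Dmol, which may need to be installed separately if they are not already present.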

pip install tengu-py pdb-tools

0) Setup

This is where we prepare the tengu client, directories, and input data we’ll be working with

0.0) Imports

import os
import tarfile
from datetime import datetime
from pathlib import Path

from pdbtools import pdb_fetch, pdb_delhetatm, pdb_selchain, pdb_rplresname, pdb_keepcoord, pdb_selresname
import requests
import py3Dmol

import tengu

0.1) Credentials

# Set our token - ensure you have exported TENGU_TOKEN in your shell; or just replace the os.getenv with your token
TOKEN = os.getenv("TENGU_TOKEN")
# You might have a custom deployment url, by default it will use https://tengu.qdx.ai
URL = os.getenv("TENGU_URL") or "https://tengu.qdx.ai"
# These env variables will be read by default, so you can skip this step in future

0.2) Configuration

Let's set some global variables that define our project

# Define our project information
DESCRIPTION = "tengu-py demo notebook"
TAGS = ["qdx", "tengu-py-v2", "tutorial", "cdk2"]
WORK_DIR = Path.home() / "qdx" / "tengu-py-demo"
# Set our inputs
SYSTEM_PDB_PATH = WORK_DIR / "test.pdb"
PROTEIN_PDB_PATH = WORK_DIR / "test_P.pdb"
LIGAND_SMILES_STR = (
    "c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CO[P@@](=O)(O)O[P@](=O)(O)OP(=O)(O)O)O)O)N"
)
LIGAND_PDB_PATH = WORK_DIR / "test_L.pdb"
# where we want our jobs to run
TARGET = "NIX_SSH_3"

Ensure your workdir exists

# exist_ok=True avoids a FileExistsError when the directory is already there
os.makedirs(WORK_DIR, exist_ok=True)

0.2) Build your client

# Get our client, for calling modules and using the tengu API
# Note, access_token and url are optional, if you have set the env variables TENGU_TOKEN and TENGU_URL
# Workspace sets the location where we will store our session history file and module lock file
# By using the `build_provider_with_functions` method, we will also build helper functions calling each module
client = await tengu.build_provider_with_functions(
    access_token=TOKEN, url=URL, workspace=WORK_DIR, batch_tags=TAGS
)

0.3) Input selection

# fetch datafiles
complex = list(pdb_fetch.fetch_structure("1B39"))
protein = pdb_delhetatm.remove_hetatm(pdb_selchain.select_chain(complex, "A"))
# select the ATP residue
ligand = pdb_selresname.filter_residue_by_name(complex, "ATP")
# we require ligands to be labelled as UNL
ligand = pdb_rplresname.rename_residues(ligand, "ATP", "UNL")
# we don't want to repeat all of the remark / metadata that is already in the protein
ligand = pdb_keepcoord.keep_coordinates(ligand)
# write our files to the locations defined in the config block
with open(SYSTEM_PDB_PATH, "w") as f:
    for l in complex:
        f.write(str(l))
with open(PROTEIN_PDB_PATH, "w") as f:
    for l in protein:
        f.write(str(l))
with open(LIGAND_PDB_PATH, "w") as f:
    for l in ligand:
        f.write(str(l))
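
Before uploading, it can be worth confirming that the renaming step did what was intended. The sketch below reads residue names from the fixed columns of ATOM/HETATM records (columns 18 to 20 in the PDB format) and counts them. residue_name_counts and make_record are hypothetical helpers, and the sample records are made up for illustration.

```python
from collections import Counter


def residue_name_counts(lines):
    """Count residue names over ATOM/HETATM records.

    Hypothetical helper; relies on the fixed-width PDB layout, where the
    residue name occupies columns 18-20 (0-based slice 17:20).
    """
    counts = Counter()
    for line in lines:
        text = str(line)
        if text.startswith(("ATOM", "HETATM")):
            counts[text[17:20].strip()] += 1
    return counts


def make_record(serial, atom, resname, chain="A", resseq=1):
    # Builds only the leading columns of a HETATM record, enough for the check
    return f"HETATM{serial:>5} {atom:<4} {resname:>3} {chain}{resseq:>4}\n"


sample = [
    "REMARK made-up example\n",
    make_record(1, " PG ", "UNL"),
    make_record(2, " O1G", "UNL"),
]
print(residue_name_counts(sample))
```

Applied to the lines of the file at LIGAND_PDB_PATH, the only residue name reported should be UNL.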

0.4) View tengu modules

Tengu modules are “functions” that perform various computational chemistry tasks and can be run on HPC infrastructure. We maintain multiple versions of these functions so that your scripts will stay stable over upgrades.

# Get our latest modules as a dict[module_name, module_path]
# If a lock file exists, load it so that the run is reproducible
# This will be done automatically if you use the `build_provider_with_functions` method
modules = await client.get_latest_module_paths()
module_name = "hermes_energy"
module_path = modules[module_name]
print(module_path)
github:talo/tengu-prelude/efc6d8b3a8cc342cd9866d037abb77dac40a4d56#hermes_energy
  • module_name is a descriptive string and indicates the “function” the module is calling;
  • module_path is a versioned tengu “endpoint” for a module accessible via the client.

Using the same module_path string across multiple runs provides reproducibility.
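
To record which version a run used, the path string can be split into its parts. This is a minimal sketch: split_module_path is a hypothetical helper, and the layout "<source>/<revision>#<name>" is inferred from the single example printed above rather than from a documented grammar.

```python
from typing import NamedTuple


class ModuleRef(NamedTuple):
    source: str    # e.g. "github:talo/tengu-prelude"
    revision: str  # the commit hash embedded in the path
    name: str      # the module name after "#"


def split_module_path(module_path: str) -> ModuleRef:
    """Split a module path into source, revision and module name.

    Hypothetical helper; the path layout is an assumption based on the
    example output shown earlier.
    """
    location, sep, name = module_path.partition("#")
    if not sep or not name:
        raise ValueError(f"no '#<module>' suffix in {module_path!r}")
    source, slash, revision = location.rpartition("/")
    if not slash:
        raise ValueError(f"no revision segment in {module_path!r}")
    return ModuleRef(source, revision, name)


ref = split_module_path(
    "github:talo/tengu-prelude/efc6d8b3a8cc342cd9866d037abb77dac40a4d56#hermes_energy"
)
print(ref.revision[:7], ref.name)
```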

0.5) Use module functions

Next, we’ll use helper functions for the modules that we’ve fetched

If we have built a provider with functions, we can use the python help() function to describe their usage.

The QDX Type Description is a standard type definition across multiple programming languages to assist in interoperability. @ indicates that the type is stored in a file, which will be synced to cloud storage

help(client.convert)
Help on function convert in module tengu.provider:

async convert(*args: [list[typing.Union[str, ~T]], <class 'pathlib.Path'>], target: tengu.graphql_client.enums.ModuleInstanceTarget | None = <ModuleInstanceTarget.NIX: 'NIX'>, resources: tengu.graphql_client.input_types.ModuleInstanceResourcesInput | None = ModuleInstanceResourcesInput(gpus=0, gpu_mem=None, gpu_mem_units=None, cpus=None, nodes=None, mem=None, mem_units=None, storage=10, storage_units=<MemUnits.MB: 'MB'>, walltime=None, storage_mounts=None), tags: list[str] | None = None, restore: bool | None = None) -> [<class 'pathlib.Path'>]
    Convert biomolecular and chemical file formats to the QDX file format. Supports PDB and SDF
    
    Module version: github:talo/tengu-prelude/efc6d8b3a8cc342cd9866d037abb77dac40a4d56#convert
    
    QDX Type Description:
    
        format: PDB|SDF;
    
        input: @bytes 
    
    ->
    
        output: @[Conformer]
    
    
    
    :param format: the format of the input file
    :param input: the input file
    :return output: the output conformers

1) Running Tengu Modules

Below we’ll call modules using the functions created on the client.

The parameters to a tengu module function would look like the following

  • *args: The values or ids passed to the module function:
    1. For @Objects - A pathlib.Path or a file-like object like BufferedReader, FileIO, StringIO etc.: Loads the data in the file as an argument. NOTE: The uploaded value isn’t just the string of the file, so don’t pass the string directly; pass the path or wrap in StringIO.
    2. A tengu Provider.Argument or ArgId returned by a previous call to a tengu module via client.[some_module_name](): The ArgId type wraps data for use within tengu. It may refer to an object already uploaded to tengu storage, such as outputs of other run calls. See below for more details. It’s easier to understand when you see an example.
    3. A parameter, i.e. a value of any other type, including None: Ensure the values match what is outlined in the *args list
  • **kwargs
    • target: The machine we want to run on (eg. NIX_SSH for a cluster, GADI for a supercomputer).
    • resources: The resources to use on the target. The most commonly provided being {"gpus": n, "storage": storage_in_units, "storage_units": "B" | "MB" | "GB", "walltime": mins}.
    • tags: Tags to associate with our run, so we can easily look up our runs. By default, they will be populated by the batch_tags passed to the client on construction.
    • restore: If this is set to True - the function will check if a single module_instance exists for the same version of the function with the same tags, and return that instead of re-running.

The return value is a list of Provider.Arguments. You can wait for them to resolve by calling await your_argument.get(), or pass the arguments directly to subsequent functions, which will cause Tengu to do the waiting for you.
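
The idea of passing unresolved results straight into the next call can be illustrated with a toy model built on asyncio. Nothing below is tengu code: ToyArg and toy_module are stand-ins invented for this sketch, and the real client does the waiting remotely rather than in local tasks.

```python
import asyncio


class ToyArg:
    """Toy stand-in for a result handle; NOT the tengu Provider.Argument class."""

    def __init__(self, task):
        self._task = task

    async def get(self):
        # Waits until the producing job has finished, then returns its value
        return await self._task


def toy_module(func):
    """Wrap a plain function so that calling it submits a background job."""

    def submit(*args):
        async def run():
            # Resolve any handles produced by earlier jobs before running
            resolved = [await a.get() if isinstance(a, ToyArg) else a for a in args]
            return func(*resolved)

        return ToyArg(asyncio.create_task(run()))

    return submit


async def main():
    double = toy_module(lambda x: 2 * x)
    add_one = toy_module(lambda x: x + 1)
    first = double(20)       # returns a handle immediately
    second = add_one(first)  # the handle is passed on without awaiting it
    return await second.get()


print(asyncio.run(main()))
```

Only the final get() is awaited by the caller; the dependency between the two jobs is resolved inside the second job.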

You can see the status of all the jobs submitted for your workspace or session by calling client.status()

We will now demonstrate how this works in action

1.1) Input Preparation

1.1.1) Prep the protein

First we will run the protein preparation routine (using pdbfixer internally) to prepare the protein for molecular dynamics

# we can check the arguments and outputs for prepare_protein with help()
help(client.prepare_protein)
Help on function prepare_protein in module tengu.provider:

async prepare_protein(*args: [<class 'pathlib.Path'>], target: tengu.graphql_client.enums.ModuleInstanceTarget | None = <ModuleInstanceTarget.NIX_SSH_2: 'NIX_SSH_2'>, resources: tengu.graphql_client.input_types.ModuleInstanceResourcesInput | None = ModuleInstanceResourcesInput(gpus=1, gpu_mem=None, gpu_mem_units=None, cpus=None, nodes=None, mem=None, mem_units=None, storage=138, storage_units=<MemUnits.MB: 'MB'>, walltime=None, storage_mounts=None), tags: list[str] | None = None, restore: bool | None = None) -> [<class 'pathlib.Path'>, <class 'pathlib.Path'>]
    Prepare a PDB for downstream tasks: protonate, fill missing atoms, etc.
    
    Module version: github:talo/pdb2pqr/ff5abe87af13f31478ede490d37468a536621e9c#prepare_protein_tengu
    
    QDX Type Description:
    
        input_pdb: @bytes 
    
    ->
    
        output_qdxf: @[Conformer];
    
        output_pdb: @bytes
    
    
    
    :param input_pdb: An input protein as a file: one PDB file
    :return output_qdxf: An output protein a vec: one qdxf per model in pdb
    :return output_pdb: An output protein as a file: one PDB file
# Here we run the function, it will return a Provider.Arg which you can use to fetch the results
# We set restore = True so that we can restore a previous run to the same path with the same tags
(prepared_protein_qdxf, prepared_protein_pdb) = await client.prepare_protein(PROTEIN_PDB_PATH, target=TARGET, restore=True)
print(f"{datetime.now().time()} | Running protein prep!")
prepared_protein_qdxf  # this initially only has the id of your result; we will show how to fetch the actual value later
2024-01-23 22:01:26,788 - tengu - WARNING - Multiple module instances found with the same tags and path
22:01:27.084250 | Running protein prep!

Arg(id=d205516c-fdb8-477d-a704-4275681c8062, value=None)

1.1.2) Checking results

1.1.2.1) Run statuses

This will show the status of all of your runs

await client.status()
{'416d265d-5df9-42a2-90da-670ca5d55585': (<ModuleInstanceStatus.RESOLVING: 'RESOLVING'>,
  'prepare_protein',
  1)}
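
With many runs, the raw dictionary gets hard to scan. The following sketch groups runs by module name and status. summarize is a hypothetical helper that expects the {instance_id: (status, name, count)} shape shown in the output above, and ToyStatus is a stand-in enum so that the example runs without the client.

```python
from collections import Counter
from enum import Enum


class ToyStatus(Enum):
    """Stand-in for the status enum; only the members used below are listed."""

    RESOLVING = "RESOLVING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


def summarize(status_map):
    """Count module instances per (module name, status value).

    Hypothetical helper; the third tuple element is ignored because its
    meaning is not described in the source.
    """
    summary = Counter()
    for status, name, _count in status_map.values():
        summary[(name, status.value)] += 1
    return summary


example = {
    "id-1": (ToyStatus.COMPLETED, "prepare_protein", 1),
    "id-2": (ToyStatus.RESOLVING, "gmx", 1),
}
for (name, status), n in sorted(summarize(example).items()):
    print(f"{name:<16} {status:<10} {n}")
```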

1.1.2.2) Run Logs

If any of our runs fail, we can check their logs with

for instance_id, (status, name, count) in (await client.status()).items():
    if status.value == "FAILED":
        async for log_page in client.logs(instance_id, "stderr"):
            for log in log_page:
                print(log)
        break

1.1.2.3) Run Values

This will return the “value” of the output from the function - for files you will receive a url that you can download, otherwise you will receive them as python types

await prepared_protein_pdb.get()
2024-01-23 20:56:50,658 - tengu - INFO - Argument 54866acc-1e94-4bc5-a41d-1053e82b04f3 is now ModuleInstanceStatus.RESOLVING
2024-01-23 20:56:51,776 - tengu - INFO - Argument 54866acc-1e94-4bc5-a41d-1053e82b04f3 is now ModuleInstanceStatus.ADMITTED
2024-01-23 20:57:06,368 - tengu - INFO - Argument 54866acc-1e94-4bc5-a41d-1053e82b04f3 is now ModuleInstanceStatus.DISPATCHED
2024-01-23 20:57:13,044 - tengu - INFO - Argument 54866acc-1e94-4bc5-a41d-1053e82b04f3 is now ModuleInstanceStatus.QUEUED
2024-01-23 21:12:05,543 - tengu - INFO - Argument 54866acc-1e94-4bc5-a41d-1053e82b04f3 is now ModuleInstanceStatus.RUNNING
2024-01-23 21:12:28,201 - tengu - INFO - Argument 54866acc-1e94-4bc5-a41d-1053e82b04f3 is now ModuleInstanceStatus.AWAITING_UPLOAD

{'url': 'https://storage.googleapis.com/qdx-store/45d5362d-9fe5-4ecf-b8ec-b481ef200dbc?x-goog-signature=082e0f9043748a633104f51bd34ccf3186d07dd7c6e694d41a2cfddb77d247c31da83630cb3ba372ee8ec46ddc80213c820dbebe1a5af5917c217cd07791fdba38e5c9c6c9704756d13778ea0e80cf3a690b8108bf93136ec7283044f54b1f3909b2de7026ffd36b47e0db6cdb90ecb278380939c7cfc31c5b98c1f67b555fd82470e7e2f609b71600ea2701b04196bac390fd2c810cbad7351b0eb486316962bcdd4996584b595332f20504360b13e3f7d7d97cf2e8042f9a928e0f348f330e82e74c5adb598511cb46aa729c606ef93f599a9eef9a9a70b967177775631a12ceb2e6a249d15fca7182fa44f274e94cd8d910ffb2256669a0de7f0271b90377&x-goog-algorithm=GOOG4-RSA-SHA256&x-goog-credential=qdx-store-user%40humming-bird-321603.iam.gserviceaccount.com%2F20240123%2Fasia-southeast1%2Fstorage%2Fgoog4_request&x-goog-date=20240123T131309Z&x-goog-expires=3600&x-goog-signedheaders=host'}
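
For a file output, the returned dictionary carries a signed URL under the 'url' key, as in the output above. The query string in that example includes x-goog-expires=3600, which suggests the link is short-lived, so it is probably best fetched soon after it is issued. A minimal sketch of saving such a URL with the standard library follows. save_url is a hypothetical helper, not part of tengu-py, and it is demonstrated on a local file:// URL so that it runs offline.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def save_url(url: str, dest: Path) -> Path:
    """Stream the contents of `url` into `dest` and return `dest`.

    Hypothetical helper; not part of tengu-py.
    """
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# Demonstrated with a local file:// URL so the sketch runs offline
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source.pdb"
    source.write_text("REMARK example\n")
    copy = save_url(source.as_uri(), Path(tmp) / "objects" / "copy.pdb")
    print(copy.read_text(), end="")
```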

1.1.2.4) Downloads

We provide a utility to download files into your workspace. You can either provide a filename, in which case the file is saved to workspace/objects/[filename], or provide your own filepath, which the client will use as-is

try:
    await prepared_protein_pdb.download(filename="01_prepared_protein.pdb")
except FileExistsError:
    # we will raise an error if you try to overwrite an existing file, you can force the file to overwrite
    # by passing an absolute filepath instead
    pass
# we can read our prepared protein pdb like this
with open(client.workspace / "objects" / "01_prepared_protein.pdb", "r") as f:
    print(f.readline(), "...")
REMARK   1 PDBFIXER FROM: /home/ubuntu/.cache/tengu_store/run/416d265d-5df9-42a2-90da-670ca5d55585/.tmp/m2_protein.pdb
 ...

You should visualize your prepared protein to spot check any incorrectly transformed residues

view = py3Dmol.view()
with open(client.workspace / "objects" / "01_prepared_protein.pdb", "r") as f:
    view.addModel(f.read(), "pdb")
    view.setStyle({"cartoon":{"color":"spectrum"}})
    view.zoomTo()
    # view.show() # we can't have the widget in the readme

1.1.3) Prep the ligand

Next we will prepare the ligand (using gypsum_dl internally)

# we can check the inputs for prepare_ligand with help()
help(client.prepare_ligand)
Help on function prepare_ligand in module tengu.provider:

async prepare_ligand(*args: [<class 'str'>, <class 'pathlib.Path'>, dict[str, ~T]], target: tengu.graphql_client.enums.ModuleInstanceTarget | None = <ModuleInstanceTarget.NIX_SSH_3: 'NIX_SSH_3'>, resources: tengu.graphql_client.input_types.ModuleInstanceResourcesInput | None = ModuleInstanceResourcesInput(gpus=1, gpu_mem=None, gpu_mem_units=None, cpus=None, nodes=None, mem=None, mem_units=None, storage=18, storage_units=<MemUnits.MB: 'MB'>, walltime=None, storage_mounts=None), tags: list[str] | None = None, restore: bool | None = None) -> [<class 'pathlib.Path'>, <class 'pathlib.Path'>]
    Prepare ligand for sim. or quantum energy calc. using gypsum_dl
    
    Module version: github:talo/gypsum_dl/04acd1852cb3e2c8d0347e15763926fdf9a93a5d#prepare_ligand_tengu
    
    QDX Type Description:
    
        in: string;
    
        in: @bytes;
    
        in: {
    
        job_manager:string,
    
        let_tautomers_change_chirality:bool,
    
        max_ph:f32,
    
        max_variants_per_compound:i32,
    
        min_ph:f32,
    
        num_processors:i32,
    
        output_folder:string,
    
        pka_precision:f32,
    
        separate_output_files:bool,
    
        skip_adding_hydrogen:bool,
    
        skip_alternate_ring_conformations:bool,
    
        skip_enumerate_chiral_mol:bool,
    
        skip_enumerate_double_bonds:bool,
    
        skip_making_tautomers:bool,
    
        skip_optimize_geometry:bool,
    
        source:string,
    
        thoroughness:i32,
    
        use_durrant_lab_filters:bool
    
        } 
    
    ->
    
        out: @bytes;
    
        out: @bytes
    
    Prepare ligand for sim. or quantum energy calc. using gypsum_dl.
    
    
    Inputs:
        - An input molecule as a SMILES string
        - The same input molecule as a PDB file
        - A json file representing any other options to pass; see gypsum_dl docs for details
    
    
    Outputs:
        - A pdb file containing the prepared version of the molecule, ready for downstream use
ligand_prep_config = {
    "source": "",
    "output_folder": "./",
    "job_manager": "multiprocessing",
    "num_processors": -1,
    "max_variants_per_compound": 1,
    "thoroughness": 3,
    "separate_output_files": True,
    "min_ph": 6.4,
    "max_ph": 8.4,
    "pka_precision": 1.0,
    "skip_optimize_geometry": True,
    "skip_alternate_ring_conformations": True,
    "skip_adding_hydrogen": False,
    "skip_making_tautomers": True,
    "skip_enumerate_chiral_mol": True,
    "skip_enumerate_double_bonds": True,
    "let_tautomers_change_chirality": False,
    "use_durrant_lab_filters": True,
}
(prepared_ligand_pdb, prepared_ligand_sdf) = await client.prepare_ligand(
    LIGAND_SMILES_STR,
    LIGAND_PDB_PATH,
    ligand_prep_config,
    target=TARGET,
    restore=True,
)
print(f"{datetime.now().time()} | Running ligand prep!")
21:13:14.769402 | Running ligand prep!
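
A config record with this many fields is easy to mistype, and a mistake would otherwise only show up after the job has been submitted. The sketch below checks a config against the field list from the QDX Type Description above. config_problems is a hypothetical helper; it treats every field as required, on the assumption that the absence of a ? suffix (used for optional fields in the gmx description further down) means the field is mandatory.

```python
# Field names and types transcribed from the QDX Type Description above
# (string -> str, bool -> bool, i32 -> int, f32 -> float)
LIGAND_CONFIG_FIELDS = {
    "job_manager": str,
    "let_tautomers_change_chirality": bool,
    "max_ph": float,
    "max_variants_per_compound": int,
    "min_ph": float,
    "num_processors": int,
    "output_folder": str,
    "pka_precision": float,
    "separate_output_files": bool,
    "skip_adding_hydrogen": bool,
    "skip_alternate_ring_conformations": bool,
    "skip_enumerate_chiral_mol": bool,
    "skip_enumerate_double_bonds": bool,
    "skip_making_tautomers": bool,
    "skip_optimize_geometry": bool,
    "source": str,
    "thoroughness": int,
    "use_durrant_lab_filters": bool,
}


def config_problems(config, fields=LIGAND_CONFIG_FIELDS):
    """Return a list of human-readable problems; empty means the check passed."""
    problems = [f"missing field: {k}" for k in fields if k not in config]
    problems += [f"unknown field: {k}" for k in config if k not in fields]
    for key, expected in fields.items():
        if key not in config:
            continue
        value = config[key]
        # bool is a subclass of int in Python, so reject it for numeric fields
        ok = isinstance(value, expected) and (expected is bool or not isinstance(value, bool))
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            ok = True  # accept whole numbers such as 1 where a float is expected
        if not ok:
            problems.append(f"{key}: expected {expected.__name__}, got {type(value).__name__}")
    return problems


# Small demonstration with a one-field schema to keep the output short
print(config_problems({"thoroughness": "3", "extra": 1}, fields={"thoroughness": int}))
```

Calling config_problems(ligand_prep_config) with the record defined earlier should return an empty list.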
# we can check the status again
await client.status()
{'416d265d-5df9-42a2-90da-670ca5d55585': (<ModuleInstanceStatus.COMPLETED: 'COMPLETED'>,
  'prepare_protein',
  1)}
# we can download our outputs
try:
    await prepared_ligand_pdb.download(filename="01_prepped_ligand.pdb")
    await prepared_ligand_sdf.download(filename="01_prepped_ligand.sdf")
except FileExistsError:
    pass

print(f"{datetime.now().time()} | Downloaded prepped ligand!")
21:13:23.776710 | Downloaded prepped ligand!
# we can read our outputs
with open(client.workspace / "objects" / "01_prepped_ligand.sdf", "r") as f:
    print(f.readline(), f.readline(), "...")
untitled_0_molnum_0
      RDKit          3D
 ...

1.2) Run GROMACS (module: gmx_tengu / gmx_tengu_pdb)

Next we will run a molecular dynamics simulation on our protein and ligand, using gromacs (gmx)

help(client.gmx)
Help on function gmx in module tengu.provider:

async gmx(*args: [typing.Optional[~T], typing.Optional[~T], typing.Optional[~T], dict[str, ~T]], target: tengu.graphql_client.enums.ModuleInstanceTarget | None = <ModuleInstanceTarget.NIX_SSH: 'NIX_SSH'>, resources: tengu.graphql_client.input_types.ModuleInstanceResourcesInput | None = ModuleInstanceResourcesInput(gpus=4, gpu_mem=None, gpu_mem_units=None, cpus=None, nodes=None, mem=None, mem_units=None, storage=1034, storage_units=<MemUnits.MB: 'MB'>, walltime=None, storage_mounts=None), tags: list[str] | None = None, restore: bool | None = None) -> [<class 'pathlib.Path'>, <class 'pathlib.Path'>, <class 'pathlib.Path'>, <class 'pathlib.Path'>, <class 'pathlib.Path'>, <class 'pathlib.Path'>, <class 'pathlib.Path'>, <class 'pathlib.Path'>]
    Runs a molecular dynamics simluation using GROMACS from either protein, ligand pdbs or conformers as inputs
    
    Module version: github:talo/gmx_tengu_support/6bd881c6bb32ac85ab9cb9c95d2b16676cc72b7a#gmx_tengu
    
    QDX Type Description:
    
        conformer: @Conformer?;
    
        protein: @bytes?;
    
        ligand: @bytes?;
    
        gmx-config: {
    
        frame_sel:{
    
        end_frame:u32,
    
        interval:u32,
    
        start_frame:u32
    
        }?,
    
        ligand_charge:i8?,
    
        num_gpus:u8,
    
        num_replicas:u8?,
    
        params_overrides:{
    
        em:{
    
        coulombtype:string?,
    
        cutoff_scheme:string?,
    
        emstep:f64?,
    
        emtol:f64?,
    
        integrator:string?,
    
        ns_type:string?,
    
        nsteps:i32?,
    
        nstlist:i32?,
    
        pbc:string?,
    
        rcoulomb:f64?,
    
        rlist:f64?,
    
        rvdw:f64?
    
        }?,
    
        ions:{
    
        coulombtype:string?,
    
        cutoff_scheme:string?,
    
        emstep:f64?,
    
        emtol:f64?,
    
        integrator:string?,
    
        nsteps:i32?,
    
        nstlist:i32?,
    
        pbc:string?,
    
        rcoulomb:f64?,
    
        rlist:f64?,
    
        rvdw:f64?
    
        }?,
    
        md:{
    
        compressibility:f64?,
    
        constraint_algorithm:string?,
    
        constraints:string?,
    
        continuation:string?,
    
        coulombtype:string?,
    
        cutoff_scheme:string?,
    
        disp_corr:string?,
    
        dt:f64?,
    
        fourierspacing:f64?,
    
        gen_vel:string?,
    
        integrator:string?,
    
        lincs_iter:i32?,
    
        lincs_order:i32?,
    
        ns_type:string?,
    
        nstenergy:i32?,
    
        nsteps:i32?,
    
        nstlist:i32?,
    
        nstlog:i32?,
    
        nstxout_compressed:i32?,
    
        pbc:string?,
    
        pcoupl:string?,
    
        pcoupltype:string?,
    
        pme_order:i32?,
    
        rcoulomb:f64?,
    
        ref_p:f64?,
    
        ref_t:[f64]?,
    
        rlist:f64?,
    
        rvdw:f64?,
    
        rvdw_switch:f64?,
    
        tau_p:f64?,
    
        tau_t:[f64]?,
    
        tc_grps:string?,
    
        tcoupl:string?,
    
        vdw_modifier:string?,
    
        vdwtype:string?
    
        }?,
    
        npt:{
    
        compressibility:f64?,
    
        constraint_algorithm:string?,
    
        constraints:string?,
    
        continuation:string?,
    
        coulombtype:string?,
    
        cutoff_scheme:string?,
    
        define:string?,
    
        disp_corr:string?,
    
        dt:f64?,
    
        fourierspacing:f64?,
    
        gen_vel:string?,
    
        integrator:string?,
    
        lincs_iter:i32?,
    
        lincs_order:i32?,
    
        ns_type:string?,
    
        nstenergy:i32?,
    
        nsteps:i32?,
    
        nstlist:i32?,
    
        nstlog:i32?,
    
        nstxout_compressed:i32?,
    
        pbc:string?,
    
        pcoupl:string?,
    
        pcoupltype:string?,
    
        pme_order:i32?,
    
        rcoulomb:f64?,
    
        ref_p:f64?,
    
        ref_t:[f64]?,
    
        refcoord_scaling:string?,
    
        rlist:f64?,
    
        rvdw:f64?,
    
        rvdw_switch:f64?,
    
        tau_p:f64?,
    
        tau_t:[f64]?,
    
        tc_grps:string?,
    
        tcoupl:string?,
    
        vdw_modifier:string?,
    
        vdwtype:string?
    
        }?,
    
        nvt:{
    
        constraint_algorithm:string?,
    
        constraints:string?,
    
        continuation:string?,
    
        coulombtype:string?,
    
        cutoff_scheme:string?,
    
        define:string?,
    
        disp_corr:string?,
    
        dt:f64?,
    
        fourierspacing:f64?,
    
        gen_seed:i32?,
    
        gen_temp:f64?,
    
        gen_vel:string?,
    
        integrator:string?,
    
        lincs_iter:i32?,
    
        lincs_order:i32?,
    
        ns_type:string?,
    
        nstenergy:i32?,
    
        nsteps:i32?,
    
        nstlist:i32?,
    
        nstlog:i32?,
    
        nstxout_compressed:i32?,
    
        pbc:string?,
    
        pcoupl:string?,
    
        pme_order:i32?,
    
        rcoulomb:f64?,
    
        ref_t:[f64]?,
    
        rlist:f64?,
    
        rvdw:f64?,
    
        rvdw_switch:f64?,
    
        tau_t:[f64]?,
    
        tc_grps:string?,
    
        tcoupl:string?,
    
        vdw_modifier:string?,
    
        vdwtype:string?
    
        }?
    
        }?,
    
        save_wets:bool
    
        } 
    
    ->
    
        output_gros: @bytes;
    
        outputs_tpr: @bytes;
    
        outputs_tops: @bytes;
    
        output_logs: @bytes;
    
        outputs_md.ligand_in.dry.xtc: @bytes;
    
        outputs_md.ligand_in.dry.pdb: @bytes;
    
        index.ligand_in.ndx: @bytes;
    
        outputs_md.ligand_in.xtc: @bytes
    
    
    
    :param conformer: Optional Conformer in QDXF format; must provide either this argument, or the Protein PDB argument .
    :param protein: Protein PDB file; provide this if no Conformer qdxf is provided.
    :param ligand: Ligand PDB file
    :param gmx-config: Configuration record
    :return output_gros: Intermediate and final structure files through the simulation
    :return outputs_tpr: .tpr files of the production MD runs
    :return outputs_tops: .top files, and relevant .itp files used in the simulation.
    :return output_logs: Logs of the production MD runs
    :return outputs_md.ligand_in.dry.xtc: Trajectories, i.e., without water molecules, from the production MD runs
    :return outputs_md.ligand_in.dry.pdb: Outputs of select_frame, pdb frames without water
    :return index.ligand_in.ndx: Index file
    :return outputs_md.ligand_in.xtc: Wet trajectories, i.e., with water molecules, from the production MD runs
gmx_config = {
    # key name and integer nsteps values follow the QDX Type Description above
    "params_overrides": {
        "md": {"nsteps": 5000},
        "em": {"nsteps": 1000},
        "nvt": {"nsteps": 1000},
        "npt": {"nsteps": 1000},
        "ions": {},
    },
    "num_gpus": 0,
    "num_replicas": 1,
    "ligand_charge": None,
    "save_wets": False,
    "frame_sel": {
        "start_frame": 1,
        "end_frame": 10,
        "interval": 2,
    },
}
# we pass the outputs from our prior runs directly, instead of their values, to prevent them from being re-uploaded
gmx_results = await client.gmx(
    None,
    prepared_protein_pdb,
    prepared_ligand_pdb,
    gmx_config,
    resources={"gpus": 0, "storage": 1, "storage_units": "GB", "cpus": 48, "walltime": 60},
    target=TARGET
)
print(f"{datetime.now().time()} | Running GROMACS simulation!")
21:13:23.875843 | Running GROMACS simulation!
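
Nested config records such as gmx_config are easy to get subtly wrong in Python: a comma in place of a colon, as in {"nsteps", 5000}, produces a set rather than a mapping. The sketch below walks a config and reports values that JSON cannot represent. json_ready_problems is a hypothetical helper, and it assumes that config records are serialized to JSON-like data before being sent, which the source does not state explicitly.

```python
def json_ready_problems(config, path="config"):
    """Return descriptions of values that JSON cannot represent.

    Hypothetical helper; walks dicts, lists and tuples recursively.
    """
    if isinstance(config, dict):
        problems = []
        for key, value in config.items():
            if not isinstance(key, str):
                problems.append(f"{path}: non-string key {key!r}")
            problems += json_ready_problems(value, f"{path}.{key}")
        return problems
    if isinstance(config, (list, tuple)):
        problems = []
        for i, value in enumerate(config):
            problems += json_ready_problems(value, f"{path}[{i}]")
        return problems
    if config is None or isinstance(config, (str, int, float, bool)):
        return []
    return [f"{path}: {type(config).__name__} is not JSON-serializable"]


good = {"md": {"nsteps": 5000}}
bad = {"md": {"nsteps", 5000}}  # comma instead of colon: this is a set
print(json_ready_problems(good))
print(json_ready_problems(bad))
```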
# we can check the status again
await client.status()
{'ff45a908-db5e-437f-806a-894f5538dfad': (<ModuleInstanceStatus.RESOLVING: 'RESOLVING'>,
  'gmx',
  1),
 '416d265d-5df9-42a2-90da-670ca5d55585': (<ModuleInstanceStatus.COMPLETED: 'COMPLETED'>,
  'prepare_protein',
  1)}
print("Fetching gmx results")
try:
    await gmx_results[5].download(filename="02_gmx_dry_frames.tar.gz")
    await gmx_results[0].download(filename="02_gmx_lig_gro.tar.gz")

except FileExistsError:
    pass

print(f"{datetime.now().time()} | Downloaded GROMACS output!")
Fetching gmx results
2024-01-23 21:13:24,192 - tengu - INFO - Argument 05957a95-e618-4c3e-859d-18e8e3fbfd21 is now ModuleInstanceStatus.RESOLVING
2024-01-23 21:13:28,653 - tengu - INFO - Argument 05957a95-e618-4c3e-859d-18e8e3fbfd21 is now ModuleInstanceStatus.ADMITTED
2024-01-23 21:13:38,647 - tengu - INFO - Argument 05957a95-e618-4c3e-859d-18e8e3fbfd21 is now ModuleInstanceStatus.DISPATCHED
2024-01-23 21:13:44,180 - tengu - INFO - Argument 05957a95-e618-4c3e-859d-18e8e3fbfd21 is now ModuleInstanceStatus.RUNNING
2024-01-23 21:31:39,617 - tengu - INFO - Argument 05957a95-e618-4c3e-859d-18e8e3fbfd21 is now ModuleInstanceStatus.AWAITING_UPLOAD
21:32:59.792453 | Downloaded GROMACS output!
# Extract the "dry" (i.e. non-solvated) pdb frames we asked for
with tarfile.open(client.workspace / "objects" / "02_gmx_dry_frames.tar.gz", "r") as tf:
    selected_frame_pdbs = [tf.extractfile(member).read() for member in tf if "pdb" in member.name and member.isfile()]
    for i, frame in enumerate(selected_frame_pdbs):
        with open(client.workspace / "objects" / f"02_gmx_output_frame_{i}.pdb", "w") as pf:
            print(frame.decode("utf-8"), file=pf)
# Extract the ligand.gro file
with tarfile.open(client.workspace / "objects" / "02_gmx_lig_gro.tar.gz", "r") as tf:
    # member.isfile() skips directories, for which extractfile() returns None
    gro = [tf.extractfile(member).read() for member in tf if "md.ligand" in member.name and member.isfile()][0]
    with open(client.workspace / "objects" / "02_gmx_lig.gro", "w") as pf:
        print(gro.decode("utf-8"), file=pf)
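
The positional indices used on gmx_results follow the order of the return values printed by help(client.gmx). Giving them names makes later steps easier to read. This is a small sketch: name_outputs is a hypothetical helper, and the tuple of names is transcribed from the help output above.

```python
# Output names in the order listed by help(client.gmx) above
GMX_OUTPUT_NAMES = (
    "output_gros",
    "outputs_tpr",
    "outputs_tops",
    "output_logs",
    "outputs_md.ligand_in.dry.xtc",
    "outputs_md.ligand_in.dry.pdb",
    "index.ligand_in.ndx",
    "outputs_md.ligand_in.xtc",
)


def name_outputs(results, names=GMX_OUTPUT_NAMES):
    """Pair positional results with their names (hypothetical helper)."""
    results = list(results)
    if len(results) != len(names):
        raise ValueError(f"expected {len(names)} outputs, got {len(results)}")
    return dict(zip(names, results))


# Placeholder integers stand in for the handles returned by the gmx call
named = name_outputs(range(8))
print(named["outputs_md.ligand_in.dry.pdb"], named["output_gros"])
```

With the real results, named["outputs_md.ligand_in.dry.pdb"] would refer to the same handle as gmx_results[5].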

1.3) Run MM-PBSA

help(client.gmx_mmpbsa)
Help on function gmx_mmpbsa in module tengu.provider:

async gmx_mmpbsa(*args: [<class 'pathlib.Path'>, <class 'pathlib.Path'>, <class 'pathlib.Path'>, <class 'pathlib.Path'>, dict[str, ~T]], target: tengu.graphql_client.enums.ModuleInstanceTarget | None = <ModuleInstanceTarget.NIX_SSH_3: 'NIX_SSH_3'>, resources: tengu.graphql_client.input_types.ModuleInstanceResourcesInput | None = ModuleInstanceResourcesInput(gpus=0, gpu_mem=None, gpu_mem_units=None, cpus=None, nodes=None, mem=None, mem_units=None, storage=10, storage_units=<MemUnits.MB: 'MB'>, walltime=None, storage_mounts=None), tags: list[str] | None = None, restore: bool | None = None) -> [<class 'pathlib.Path'>]
    Run gmx mmpbsa on the outputs of a gmx simulation 
    
    Module version: github:talo/gmx_tengu_support/6bd881c6bb32ac85ab9cb9c95d2b16676cc72b7a#gmx_mmpbsa_tengu
    
    QDX Type Description:
    
        tpr_tar_gz: @bytes;
    
        dry_xtc_tar_gz: @bytes;
    
        index_file: @bytes;
    
        topol_file: @bytes;
    
        mmpbsa_config: {
    
        end_frame:u64,
    
        ie_segment:u32?,
    
        interaction_entropy:bool?,
    
        interval:u32?,
    
        num_cpus:u32,
    
        start_frame:u64,
    
        with_gb:bool?
    
        } 
    
    ->
    
        output: @bytes
    
    
    
    :param tpr_tar_gz: Compressed GROMACS output folder
    :param dry_xtc_tar_gz: Compressed GROMACS output folder
    :param index_file: Compressed GROMACS output folder
    :param topol_file: Compressed GROMACS output folder
    :param mmpbsa_config: Configuration record for mmpbsa:
        start_frame: Frame to start with
        end_frame: Frame to end with
        num_cpus: Number of CPUs to use - cannot be larger than the number of frames
        interaction_entropy: Calculate interaction entropy
        
    :return output: Compressed mmpbsa output folder
mmpbsa_config = {
    "start_frame": 1,
    "end_frame": 2,
    "num_cpus": 1,  # cannot be greater than number of frames
}
(mmpbsa_result_tar,) = await client.gmx_mmpbsa(
    gmx_results[1],
    gmx_results[4],
    gmx_results[6],
    gmx_results[2],
    mmpbsa_config,
    resources=tengu.Resources(storage=100, storage_units="MB", cpus=42, gpus=0, walltime=60, mem=1024 * 42),
    target="GADI",
)
print(f"{datetime.now().time()} | Running GROMACS MM-PBSA calculation!")
21:32:59.933969 | Running GROMACS MM-PBSA calculation!
print("Fetching gmx_mmpbsa results")
try:
    await mmpbsa_result_tar.download(filename="04_gmx_mmpbsa_run_folder.tar.gz")
except FileExistsError:
    pass
print(f"{datetime.now().time()} | Downloaded MM-PBSA results!")
Fetching gmx_mmpbsa results
2024-01-23 21:33:00,060 - tengu - INFO - Argument 224d9a20-f6a3-476a-8866-bea4450db4a8 is now ModuleInstanceStatus.RESOLVING
2024-01-23 21:34:29,513 - tengu - INFO - Argument 224d9a20-f6a3-476a-8866-bea4450db4a8 is now ModuleInstanceStatus.ADMITTED
2024-01-23 21:34:52,976 - tengu - INFO - Argument 224d9a20-f6a3-476a-8866-bea4450db4a8 is now ModuleInstanceStatus.DISPATCHED
2024-01-23 21:34:57,439 - tengu - INFO - Argument 224d9a20-f6a3-476a-8866-bea4450db4a8 is now ModuleInstanceStatus.QUEUED
2024-01-23 21:40:02,101 - tengu - INFO - Argument 224d9a20-f6a3-476a-8866-bea4450db4a8 is now ModuleInstanceStatus.AWAITING_UPLOAD
21:50:25.625550 | Downloaded MM-PBSA results!
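
The help text notes that num_cpus cannot be larger than the number of frames. A minimal sketch of deriving a safe value is shown below. mmpbsa_config_for is a hypothetical helper. Because the help text does not say whether end_frame is inclusive, the sketch uses the smaller count (end_frame - start_frame), which respects the limit under either reading.

```python
def mmpbsa_config_for(start_frame, end_frame, available_cpus):
    """Build an mmpbsa config whose num_cpus never exceeds the frame count.

    Hypothetical helper; uses the conservative (exclusive) frame count
    because the inclusivity of end_frame is not documented.
    """
    if end_frame < start_frame:
        raise ValueError("end_frame must not be smaller than start_frame")
    n_frames = max(1, end_frame - start_frame)
    return {
        "start_frame": start_frame,
        "end_frame": end_frame,
        "num_cpus": max(1, min(available_cpus, n_frames)),
    }


print(mmpbsa_config_for(1, 2, available_cpus=42))
```

For the frames used above (1 to 2) this yields num_cpus of 1, matching the value set by hand in mmpbsa_config.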
