Python SDK for interacting with the QDX Rush API and modules

rush-py

Quickstart

This document walks through executing jobs on the Rush platform. For a comprehensive guide to the concepts and to constructing a full workflow, see the full rush-py explainer document.

First, install the following packages with pip. Python > 3.10 is required.

pip install rush-py pdb-tools

0) Setup

This is where we prepare the Rush client, directories, and input data we’ll be working with.

0.0) Imports

import os
import tarfile
from datetime import datetime
from pathlib import Path

from pdbtools import pdb_fetch, pdb_delhetatm, pdb_selchain, pdb_rplresname, pdb_keepcoord, pdb_selresname
import requests
import py3Dmol

import rush

NOTE: This walkthrough assumes that you are running code in a Jupyter notebook, which allows top-level await calls. If you are writing a normal Python script, you will need to wrap your code in something like the following:

import asyncio

async def main():
    # your code here, including any await calls
    ...

asyncio.run(main())

0.1) Credentials

Retrieve your API token from the Rush UI.

You can either set the RUSH_TOKEN and RUSH_URL environment variables, or provide them as variables to the client directly.

For instructions on setting environment variables, see Wikipedia's extensive article on the topic.

RUSH_TOKEN = os.getenv("RUSH_TOKEN") or "YOUR_TOKEN_HERE"
RUSH_URL = os.getenv("RUSH_URL") or "https://tengu.qdx.ai"
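If you would rather fail fast than send the placeholder token to the API, a small check like the one below can help. This is only a sketch: the helper name `require_env` is hypothetical and is not part of rush-py; only the `RUSH_TOKEN` and `RUSH_URL` variable names and the default URL come from this guide.

```python
import os
from typing import Optional


def require_env(name: str, default: Optional[str] = None) -> str:
    """Return an environment variable's value, or `default` if it is unset.

    Hypothetical helper (not part of rush-py). Raises a clear error when the
    variable is unset or empty and no default was provided.
    """
    value = os.environ.get(name) or default
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# RUSH_URL falls back to the URL used in this guide.
rush_url = require_env("RUSH_URL", default="https://tengu.qdx.ai")
print(rush_url)
```

Calling `require_env("RUSH_TOKEN")` without a default raises a `RuntimeError` immediately if the token is missing, instead of failing later with an authentication error.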

0.2) Configuration

Let's set some global variables that define our project. These are not required, but they are good practice and help organize the jobs that will be persisted under your account.

Make sure you create a unique set of tags for each run. It is good practice to include at least the experiment name and the system name as tags.

EXPERIMENT = "tengu-py-v2-quickstart"
SYSTEM = "1B39"
TAGS = ["qdx", EXPERIMENT, SYSTEM]
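One possible way to keep tag sets unique per run is to append a timestamp tag. The sketch below assumes nothing about rush-py itself; `make_tags` is a hypothetical helper, and the tag format is an arbitrary choice.

```python
from datetime import datetime, timezone


def make_tags(experiment: str, system: str, *extra: str) -> list:
    """Build a tag list from the experiment, system, extras, and a UTC timestamp.

    Hypothetical helper (not part of rush-py).
    """
    run_stamp = datetime.now(timezone.utc).strftime("run-%Y%m%dT%H%M%SZ")
    return [experiment, system, *extra, run_stamp]


tags = make_tags("tengu-py-v2-quickstart", "1B39", "qdx")
print(tags)
```

A timestamp tag changes on every call, so leave it out if you intend to reuse the same tags later to look up or restore an earlier run.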

0.2) Build your client

Build the client used to call modules and access the Rush API.

As mentioned earlier, access_token and url are optional if you have set the RUSH_TOKEN and RUSH_URL environment variables.

batch_tags will be applied to each run that is spawned by this client.

A folder called .rush will be created in your workspace directory (this defaults to the current working directory and can be overridden by passing workspace= to the provider builder).

# By using the `build_provider_with_functions` method, we will also build helper functions calling each module
client = await rush.build_provider_with_functions(
    access_token=RUSH_TOKEN, url=RUSH_URL, batch_tags=TAGS
)
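The workspace override mentioned above might look like the following sketch. It assumes that `workspace=` accepts a `pathlib.Path` (this guide only states that the keyword exists), and `build_client_in` is a hypothetical wrapper, not a rush-py function.

```python
from pathlib import Path


async def build_client_in(workspace: Path, token: str, url: str, tags: list):
    """Hypothetical wrapper: build a Rush client rooted at `workspace`."""
    import rush  # imported lazily so rush-py is only needed when this is called

    # Make sure the workspace directory exists before handing it to the client.
    workspace.mkdir(parents=True, exist_ok=True)
    return await rush.build_provider_with_functions(
        access_token=token, url=url, workspace=workspace, batch_tags=tags
    )
```

In a notebook this could be called as `client = await build_client_in(Path("./runs/1B39"), RUSH_TOKEN, RUSH_URL, TAGS)`, which keeps each system's files in its own directory.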

0.3) Input selection

Fetch data files from RCSB to pass as input to the modules.

PROTEIN_PDB_PATH = client.workspace / f"{SYSTEM}_P.pdb"

# Fetch the full structure from RCSB, keep chain A, and drop HETATM records
structure = list(pdb_fetch.fetch_structure(SYSTEM))
protein = pdb_delhetatm.remove_hetatm(pdb_selchain.select_chain(structure, "A"))
with open(PROTEIN_PDB_PATH, "w") as f:
    for line in protein:
        f.write(str(line))

help(client.convert)
Help on function convert in module rush.provider:

async convert(*args: [list[typing.Union[str, ~T]], <class 'pathlib.Path'>], target: rush.graphql_client.enums.ModuleInstanceTarget | None = <ModuleInstanceTarget.NIX: 'NIX'>, resources: rush.graphql_client.input_types.ModuleInstanceResourcesInput | None = ModuleInstanceResourcesInput(gpus=0, gpu_mem=None, gpu_mem_units=None, cpus=None, nodes=None, mem=None, mem_units=None, storage=10, storage_units=<MemUnits.MB: 'MB'>, walltime=None, storage_mounts=None), tags: list[str] | None = None, restore: bool | None = None) -> [<class 'pathlib.Path'>]
    Convert biomolecular and chemical file formats to the QDX file format. Supports PDB and SDF
    
    Module version: github:talo/tengu-prelude/efc6d8b3a8cc342cd9866d037abb77dac40a4d56#convert
    
    QDX Type Description:
    
        format: PDB|SDF;
    
        input: @bytes 
    
    ->
    
        output: @[Conformer]
    
    
    
    :param format: the format of the input file
    :param input: the input file
    :return output: the output conformers
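Before passing the protein file to a module, it can be worth checking that the chain selection and HETATM removal did what you expected. The sketch below uses only the fixed-column layout of the PDB format; `summarize_pdb` is a hypothetical helper and is not part of rush-py or pdb-tools.

```python
def summarize_pdb(lines):
    """Count ATOM and HETATM records and collect chain IDs from PDB lines."""
    atoms = 0
    hetatms = 0
    chains = set()
    for line in lines:
        record = line[:6].strip()  # record name occupies columns 1-6
        if record == "ATOM":
            atoms += 1
            if len(line) > 21:
                chains.add(line[21])  # chain ID is column 22 (index 21)
        elif record == "HETATM":
            hetatms += 1
    return {"atoms": atoms, "hetatms": hetatms, "chains": sorted(chains)}


# Stand-in lines for illustration; a real file has thousands of records.
sample = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "ATOM      2  CA  MET B   1      26.266  25.413   2.842  1.00 10.38           C",
    "HETATM 1000  O   HOH A 201      10.000  10.000  10.000  1.00 20.00           O",
]
summary = summarize_pdb(sample)
print(summary)
```

For the file written above, `summarize_pdb(open(PROTEIN_PDB_PATH))` should report a single chain, "A", and no HETATM records if the filters behaved as intended.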

1) Running Rush Modules

You can view which modules are available, alongside their documentation, in the API Documentation.

1.1) Prep the protein

First, we will run the protein preparation routine (which uses pdbfixer and pdb2pqr internally) to prepare the protein for molecular dynamics.

# we can check the arguments and outputs for prepare_protein with help()
help(client.prepare_protein)
Help on function prepare_protein in module rush.provider:

async prepare_protein(*args: [<class 'pathlib.Path'>], target: rush.graphql_client.enums.ModuleInstanceTarget | None = <ModuleInstanceTarget.NIX_SSH_2_GPU: 'NIX_SSH_2_GPU'>, resources: rush.graphql_client.input_types.ModuleInstanceResourcesInput | None = ModuleInstanceResourcesInput(gpus=1, gpu_mem=None, gpu_mem_units=None, cpus=None, nodes=None, mem=None, mem_units=None, storage=138, storage_units=<MemUnits.MB: 'MB'>, walltime=None, storage_mounts=None), tags: list[str] | None = None, restore: bool | None = None) -> [<class 'pathlib.Path'>, <class 'pathlib.Path'>]
    Prepare a PDB for downstream tasks: protonate, fill missing atoms, etc.
    
    Module version: github:talo/pdb2pqr/ff5abe87af13f31478ede490d37468a536621e9c#prepare_protein_tengu
    
    QDX Type Description:
    
        input_pdb: @bytes 
    
    ->
    
        output_qdxf: @[Conformer];
    
        output_pdb: @bytes
    
    
    
    :param input_pdb: An input protein as a file: one PDB file
    :return output_qdxf: An output protein a vec: one qdxf per model in pdb
    :return output_pdb: An output protein as a file: one PDB file
# Here we run the function; it returns a Provider.Arg for each output, which you can use to fetch the results
# Pass restore=True to restore a previous run to the same path with the same tags
(prepared_protein_qdxf, prepared_protein_pdb) = await client.prepare_protein(
    PROTEIN_PDB_PATH
)
print(f"{datetime.now().time()} | Running protein prep!")
prepared_protein_qdxf  # this initially only has the id of your result; we will show how to fetch the actual value later
23:32:40.657673 | Running protein prep!

Arg(id=1c19095e-4bd0-4fa1-bd60-e52338e2d9c2, value=None)

1.3) Run statuses

This will show the status of all of your runs. You can also view run statuses in the Rush UI.

await client.status()
{'6e643129-f6e9-47f4-9b6f-414bacc29944': (<ModuleInstanceStatus.RESOLVING: 'RESOLVING'>,
  'prepare_protein',
  1),
 '0cae0860-f8c7-4afb-8fe2-144ab175a415': (<ModuleInstanceStatus.COMPLETED: 'COMPLETED'>,
  'prepare_protein',
  1),
 '0c2b5aa5-36c2-4180-b242-c2ff622a14f4': (<ModuleInstanceStatus.COMPLETED: 'COMPLETED'>,
  'prepare_protein',
  1)}
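With many runs, the raw dictionary gets hard to scan. The sketch below groups run IDs by status. It assumes each value is a tuple whose first item is the status and whose second item is the module name, as in the output shown above; `group_by_status` is a hypothetical helper, not part of rush-py.

```python
from collections import defaultdict


def group_by_status(statuses: dict) -> dict:
    """Group (module name, run ID) pairs by status name."""
    grouped = defaultdict(list)
    for run_id, (status, module_name, *_rest) in statuses.items():
        # Enum members expose .name; fall back to str() for plain values.
        status_name = getattr(status, "name", str(status))
        grouped[status_name].append((module_name, run_id))
    return dict(grouped)


# Stand-in data with plain strings in place of ModuleInstanceStatus members.
example = {
    "run-1": ("RESOLVING", "prepare_protein", 1),
    "run-2": ("COMPLETED", "prepare_protein", 1),
}
print(group_by_status(example))
```

In a notebook you could call `group_by_status(await client.status())` and then look at, for example, only the entries that are not yet completed.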

1.4) Run Values

This returns the “value” of the output from the function. For files, you will receive a URL that you can download; otherwise, you will receive the values as Python types.

protein_qdxf_value = await prepared_protein_qdxf.get()
len(protein_qdxf_value[0]["topology"]["symbols"])
2024-01-27 23:32:40,880 - rush - INFO - Argument 1c19095e-4bd0-4fa1-bd60-e52338e2d9c2 is now ModuleInstanceStatus.RESOLVING
2024-01-27 23:32:46,504 - rush - INFO - Argument 1c19095e-4bd0-4fa1-bd60-e52338e2d9c2 is now ModuleInstanceStatus.ADMITTED
2024-01-27 23:33:00,993 - rush - INFO - Argument 1c19095e-4bd0-4fa1-bd60-e52338e2d9c2 is now ModuleInstanceStatus.DISPATCHED
2024-01-27 23:33:06,618 - rush - INFO - Argument 1c19095e-4bd0-4fa1-bd60-e52338e2d9c2 is now ModuleInstanceStatus.RUNNING
2024-01-27 23:33:30,495 - rush - INFO - Argument 1c19095e-4bd0-4fa1-bd60-e52338e2d9c2 is now ModuleInstanceStatus.AWAITING_UPLOAD

4852
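Once the value has been fetched, it can be inspected with ordinary Python tools. The sketch below counts atoms per element, assuming that `topology["symbols"]` is a list of element symbol strings (the indexing above suggests this, but it is not confirmed here). The stand-in data is invented for illustration.

```python
from collections import Counter


def element_counts(conformer: dict) -> Counter:
    """Count atoms per element symbol in a conformer's topology."""
    return Counter(conformer["topology"]["symbols"])


# Stand-in data shaped like the value indexed above; real conformers are much larger.
example_conformer = {"topology": {"symbols": ["N", "C", "C", "O", "H", "H"]}}
counts = element_counts(example_conformer)
print(counts)
```

With the real result, `element_counts(protein_qdxf_value[0])` would give a quick composition check, and the counts should sum to the atom total printed above.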

1.5) Downloads

We provide a utility to download files into your workspace. You can either provide a filename, in which case the file is saved to workspace/objects/[filename], or provide your own filepath, which the client will use as-is.

await prepared_protein_pdb.download(filename="01_prepared_protein.pdb", overwrite=True)
# we can read our prepared protein pdb like this
with open(client.workspace / "objects" / "01_prepared_protein.pdb", "r") as f:
    print(f.readline(), "...")
REMARK   1 PDBFIXER FROM: /home/ubuntu/.cache/tengu_store/run/6e643129-f6e9-47f4-9b6f-414bacc29944/.tmp/m2_protein.pdb
 ...
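The two destination rules described in this section can be sketched as a small function. This is an illustration of the documented behavior, not rush-py's actual code; `resolve_download_path` is hypothetical, and the `filepath` parameter name is an assumption.

```python
from pathlib import Path
from typing import Optional, Union


def resolve_download_path(
    workspace: Path,
    filename: Optional[str] = None,
    filepath: Optional[Union[str, Path]] = None,
) -> Path:
    """Return where a download would land under the rules described above."""
    if (filename is None) == (filepath is None):
        raise ValueError("provide exactly one of filename or filepath")
    if filepath is not None:
        return Path(filepath)  # an explicit path is used as-is
    return workspace / "objects" / filename  # otherwise save under workspace/objects


print(resolve_download_path(Path("workspace"), filename="01_prepared_protein.pdb"))
```

Using a filename keeps all downloaded objects in one predictable place, which is why the example above can reopen the file from `client.workspace / "objects"`.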
