A Computational Workflow for Structure-Guided Design of Potent and Selective Kinase Peptide Substrates
Subtimizer
A. Contents
B. Overview
C. Configuration
D. Prerequisites
E. Installation
F. Usage
G. Citation
B. Overview
Subtimizer provides an automated, structure-guided workflow for designing peptide substrates for kinases. It integrates AlphaFold-Multimer for structural modeling, ProteinMPNN for sequence design, and AlphaFold2-based interface evaluation of designed substrates.
C. Configuration (Customizing SLURM Templates)
The workflow uses SLURM job scripts generated from templates. To customize these for your HPC environment (partition names, memory limits, modules):
- Initialize local templates:
subtimizer init-templates
This creates a subtimizer_templates/ directory in your current folder with copies of all default scripts.
- Edit the templates: Open the files in subtimizer_templates/ (e.g., fold_template.sh) and modify the #SBATCH directives or module load commands.
- Run Subtimizer: The tool will automatically detect and use your local templates instead of the package defaults.
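As an illustration of the editing step, the sketch below changes a partition name with sed. It is self-contained: it writes a stand-in template into a scratch directory rather than touching real files. The template content and the partition name a100 are assumptions for demonstration only; the directives in the actual templates may differ.

```shell
# Stand-in for the directory that `subtimizer init-templates` creates.
# The template content below is hypothetical; real templates may differ.
workdir="$(mktemp -d)"
mkdir -p "$workdir/subtimizer_templates"
cat > "$workdir/subtimizer_templates/fold_template.sh" <<'EOF'
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --mem=32G
module load cuda
EOF

# Replace the partition name; "a100" is a placeholder for your cluster's partition.
# The -i.bak form keeps a backup and works with both GNU and BSD sed.
sed -i.bak 's/^#SBATCH --partition=.*/#SBATCH --partition=a100/' \
    "$workdir/subtimizer_templates/fold_template.sh"

# Show the resulting directives.
grep '^#SBATCH' "$workdir/subtimizer_templates/fold_template.sh"
```

Editing the files by hand in a text editor achieves the same result; a scripted edit is mainly useful when the same change has to be applied to several templates.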
D. Prerequisites
Add ColabFold to PATH using
export PATH="/PathTo/colabfold/localcolabfold/colabfold-conda/bin:$PATH"
Set MPNN_PATH to the ProteinMPNN directory using
export MPNN_PATH="/PathTo/ProteinMPNN/"
Set DL_BINDER_DESIGN_PATH to the af2_initial_guess predict.py script using
export DL_BINDER_DESIGN_PATH="/PathTo/dl_binder_design/af2_initial_guess/predict.py"
- SLURM: This workflow is optimized for HPC environments using SLURM for job scheduling.
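A quick way to confirm these settings before launching jobs is a small check function such as the sketch below. It is not part of Subtimizer. It assumes that ColabFold exposes the colabfold_batch executable and that SLURM provides sbatch; adjust the tool names if your installation differs.

```shell
# Report missing prerequisites without stopping at the first problem.
check_prereqs() {
    problems=0
    # Executables expected on PATH (assumed names: colabfold_batch, sbatch).
    for tool in colabfold_batch sbatch; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not on PATH: $tool"
            problems=$((problems + 1))
        fi
    done
    # MPNN_PATH should point at the ProteinMPNN directory.
    if [ ! -d "${MPNN_PATH:-}" ]; then
        echo "MPNN_PATH is not a directory: ${MPNN_PATH:-unset}"
        problems=$((problems + 1))
    fi
    # DL_BINDER_DESIGN_PATH should point at the predict.py file.
    if [ ! -f "${DL_BINDER_DESIGN_PATH:-}" ]; then
        echo "DL_BINDER_DESIGN_PATH is not a file: ${DL_BINDER_DESIGN_PATH:-unset}"
        problems=$((problems + 1))
    fi
    echo "$problems problem(s) found"
}

check_prereqs
```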
E. Installation
1. Set Up a Conda/Mamba Environment
Create an environment named subtimizer_env with Python >= 3.9:
# Create the environment
mamba create -n subtimizer_env python=3.9 -y
# Activate the environment
mamba activate subtimizer_env
Set Up Worker Environments (Critical)
AlphaFold and ProteinMPNN run in separate environments to avoid dependency conflicts. Create these environments using the provided YAML files in the repository root.
mamba env create -f af2_des_env.yaml
mamba env create -f mpnn_des_env.yaml
2. Install Subtimizer
While in the subtimizer_env environment, you can install the package via PyPI (recommended) or from source.
Option A: Install from PyPI (Recommended)
pip install subtimizer
Option B: Install from Source (For Development)
Use this if you want to modify the code or templates.
git clone https://github.com/abeebyekeen/subtimizer.git
cd subtimizer
pip install -e .
Verify Installation:
subtimizer --help
F. Usage
The workflow is managed through the subtimizer command. Use subtimizer --help to see all available commands.
Common Command Line Options
Most subtimizer commands (fold, design, validate, fix-pdb) accept the following options to control execution:
- -n, --max-jobs <int>: Controls concurrency.
  - Default is 4. Increase this if you have more resources/GPUs available (e.g., -n 8).
  - Note: In parallel mode, this should match your SLURM script's layout.
- --start <int> / --end <int>: Process a subset of the list.
  - Example: --start 1 --end 10 (Processes items 1 through 10 in your input list).
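To see which entries a given range would cover, the list file can be previewed with sed before submitting anything. The sketch below assumes that --start and --end are 1-based and inclusive, as the example above suggests; preview_range is a hypothetical helper, not a Subtimizer command.

```shell
# Print the entries of a list file in the 1-based inclusive range start..end.
preview_range() {
    sed -n "${2},${3}p" "$1"
}

# Demo list with the four example complexes.
list="$(mktemp)"
printf 'AKT1_2akt1tide\nALK_axltide\nSGK1_1akt1tide\nTEC_srctide\n' > "$list"

# Entries that "--start 2 --end 3" would be expected to select.
preview_range "$list" 2 3
```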
1. Setup Project Structure
Initialize the directory structure for your kinase complexes.
Input: A file (e.g., example_list_of_complexes.dat) containing the list of folder names/complexes.
AKT1_2akt1tide
ALK_axltide
SGK1_1akt1tide
TEC_srctide
Change into your working directory:
cd examples
And create this file using:
echo -e "AKT1_2akt1tide\nALK_axltide\nSGK1_1akt1tide\nTEC_srctide" > example_list_of_complexes.dat
Command:
subtimizer setup --file example_list_of_complexes.dat --type initial
This creates the project directories and necessary subfolders for AlphaFold.
2. Run AlphaFold-Multimer
Launch AlphaFold-Multimer for the listed complexes.
Important: This step expects a FASTA file (e.g., AKT1_2akt1tide.fasta) to exist inside each complex folder. See the examples folder for an example of how to prepare the FASTA files.
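Before submitting folding jobs, it can help to confirm that every complex has its FASTA file. The sketch below assumes the file is named after its folder (for example AKT1_2akt1tide/AKT1_2akt1tide.fasta), which is inferred from the example above; missing_fastas is a hypothetical helper and the demo runs in a scratch directory.

```shell
# Print every complex in the list that lacks <complex>/<complex>.fasta.
# Paths are resolved relative to the current directory.
missing_fastas() {
    while IFS= read -r complex || [ -n "$complex" ]; do
        if [ -n "$complex" ] && [ ! -f "$complex/$complex.fasta" ]; then
            echo "$complex"
        fi
    done < "$1"
}

# Demo: two complexes, only one of which has a FASTA file.
demo="$(mktemp -d)"
printf 'AKT1_2akt1tide\nALK_axltide\n' > "$demo/example_list_of_complexes.dat"
mkdir -p "$demo/AKT1_2akt1tide" "$demo/ALK_axltide"
touch "$demo/AKT1_2akt1tide/AKT1_2akt1tide.fasta"

# Run from inside the working directory, as the subtimizer commands are.
( cd "$demo" && missing_fastas example_list_of_complexes.dat )
```

An empty result means every listed complex has a FASTA file in the expected place.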
Option A: Batch Mode (Default)
Submits individual jobs for each complex using fold_template.sh.
subtimizer fold --file example_list_of_complexes.dat --max-jobs 4
Option B: Parallel Mode (Multi-GPU)
Submits a single job (run_fold_parallel.sh) that manages a pool of parallel tasks on a multi-GPU node.
subtimizer fold --file example_list_of_complexes.dat --mode parallel --max-jobs 4
- --max-jobs: Number of parallel tasks (should match the number of GPUs requested in fold_parallel_template.sh).
- --start/--end: Optionally specify an explicit range of complexes to process from the list (e.g., --start 1 --end 10).
3. Run ProteinMPNN Design
Perform sequence design on the generated structures.
3.1 Setup MPNN Design Folders and Configurations
subtimizer setup --file example_list_of_complexes.dat --type mpnn
Edit the design_config.json file created in your working directory to customize chains_to_design (default: "B") or fixed_positions (default: "4") for specific complexes.
3.2 Run ProteinMPNN Design
Option A: Batch Mode (Default)
Submits individual jobs using design_template.sh.
subtimizer design --file example_list_of_complexes.dat --max-jobs 4
Option B: Parallel Mode
Submits a single multi-sequence job using design_parallel_template.sh.
subtimizer design --file example_list_of_complexes.dat --mode parallel --max-jobs 4 --start 1 --end 10
4. Analyze Design Results
Analyze sequence recovery.
Command:
subtimizer analyze --file example_list_of_complexes.dat
Generates:
- Combined FASTA files (all_design.fa)
- Sequence Logos (*_seqlogo.png)
- Sequence Recovery Plots (sequence_recovery_stripplot.png and .csv)
5. Sequence Clustering
Cluster designed sequences to remove duplicates and generate a summary file cluster_summary.dat.
subtimizer cluster --file example_list_of_complexes.dat
6. Prepare Designed Kinase-Peptide Complexes for Folding
Prepare sequences for AlphaFold-Multimer folding.
subtimizer prep-fold --file example_list_of_complexes.dat
7. Fold Designed Sequences with AlphaFold-Multimer
Note: The version of ProteinMPNN used in this work does not generate PDB files, so the designed sequences must be folded after design.
Newer versions of ProteinMPNN (and LigandMPNN) generate structures of the designed sequences, so this step may not be necessary with them.
Option A: Batch Mode (Default)
Run on a single node.
subtimizer fold --file example_list_of_complexes.dat --stage validation --max-jobs 4
Option B: Parallel Mode (Multi-GPU)
Distribute the folding of designed sequences across multiple GPUs on a single node.
subtimizer fold --file example_list_of_complexes.dat --stage validation --mode parallel --max-jobs 4
Tip for Multi-Node Parallelism: To scale up to multiple nodes (e.g., 4 nodes), launch the parallel command 4 times with different ranges:
- subtimizer fold ... --start 1 --end 2 (Node 1)
- subtimizer fold ... --start 3 --end 4 (Node 2)
- ... and so on.
Note: This requires manually creating different SLURM jobs or running from different interactive sessions.
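The ranges for each node can be computed rather than worked out by hand. The sketch below splits a list of a given length into contiguous chunks, one per node, and prints the matching --start/--end arguments. split_ranges is a hypothetical helper; it only prints the arguments and does not submit anything.

```shell
# Print one "--start S --end E" pair per node for a list with <total> entries.
split_ranges() {
    total=$1
    nodes=$2
    # Ceiling division so that every entry is covered.
    chunk=$(( (total + nodes - 1) / nodes ))
    start=1
    while [ "$start" -le "$total" ]; do
        end=$(( start + chunk - 1 ))
        if [ "$end" -gt "$total" ]; then
            end=$total
        fi
        echo "--start $start --end $end"
        start=$(( end + 1 ))
    done
}

# Eight complexes across four nodes: ranges 1-2, 3-4, 5-6, 7-8.
split_ranges 8 4
```

Each printed pair can be appended to a separate subtimizer fold command submitted from its own SLURM job or interactive session.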
8. Prepare PDBs for AF2 initial guess
Note: AF2 initial guess has two requirements for the input PDB:
- The binder (substrate) has to be the first chain.
- Residue numbers must not overlap between chains.
subtimizer fix-pdb --file example_list_of_complexes.dat
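To check whether a PDB file already meets these two requirements, its chain order and residue numbering can be inspected with awk. The sketch below is independent of subtimizer fix-pdb and relies only on the fixed-column PDB format (chain ID in column 22, residue number in columns 23-26). The tiny input file is made up for illustration.

```shell
# List chain IDs in the order they appear in the ATOM records.
chain_order() {
    awk '/^ATOM/ { c = substr($0, 22, 1)
                   if (c != prev) { printf "%s%s", sep, c; sep = " "; prev = c } }
         END { print "" }' "$1"
}

# List residue numbers that occur in more than one chain.
shared_residue_numbers() {
    awk '/^ATOM/ { c = substr($0, 22, 1); r = substr($0, 23, 4) + 0
                   if (!((c, r) in seen)) { seen[c, r] = 1; count[r]++ } }
         END { for (r in count) if (count[r] > 1) print r }' "$1" | sort -n
}

# Hypothetical input: chain A then chain B, both numbered from 1.
pdb="$(mktemp)"
{
    printf 'ATOM  %5d  CA  ALA %s%4d\n' 1 A 1
    printf 'ATOM  %5d  CA  GLY %s%4d\n' 2 A 2
    printf 'ATOM  %5d  CA  SER %s%4d\n' 3 B 1
} > "$pdb"

chain_order "$pdb"              # prints: A B
shared_residue_numbers "$pdb"   # prints: 1
```

If the substrate is chain B in this example, the file breaks both requirements: the binder is not the first chain, and residue number 1 appears in both chains.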
9. Validation (AF2 Initial Guess)
Run AlphaFold-based validation with initial guess.
Configuration: This step requires the path to the af2_initial_guess code (specifically predict.py). You can provide this path via the --binder-path argument or the DL_BINDER_DESIGN_PATH environment variable.
Setting the Environment Variable:
export DL_BINDER_DESIGN_PATH="/path/to/dl_binder_design/af2_initial_guess/predict.py"
subtimizer validate --file example_list_of_complexes.dat --binder-path /path/to/dl_binder_design/af2_initial_guess/predict.py
10. Reporting
Generates final reports, including:
- Merged score CSVs with weighted pTM_ipTM metric (0.2*pTM + 0.8*ipTM).
- Swarm plots of validation metrics.
- Data is copied to af2_init_guess/data/ for easy access.
subtimizer report --file example_list_of_complexes.dat
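The weighted pTM_ipTM metric is a simple linear combination, and a worked example shows how strongly it leans on the interface score. The sketch below evaluates 0.2*pTM + 0.8*ipTM with awk; weighted_score is a hypothetical helper and the input values are made up.

```shell
# Compute the weighted metric 0.2*pTM + 0.8*ipTM for one model.
weighted_score() {
    awk -v ptm="$1" -v iptm="$2" 'BEGIN { printf "%.3f\n", 0.2 * ptm + 0.8 * iptm }'
}

weighted_score 0.80 0.70   # prints 0.720
weighted_score 0.90 0.50   # prints 0.580
```

In the second case a higher pTM does not make up for the lower ipTM, because the interface term carries four times the weight.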
11. Workflow for Original (Parental) Substrates
To process the parental substrates (Legacy Steps 16-17), use the setup --type original command with the standard workflow tools.
- Setup: Creates original_subs folder and prepares files.
subtimizer setup --file example_list_of_complexes.dat --type original
- Process: Run commands pointing to the new files.
cd original_subs
# Fix PDBs
subtimizer fix-pdb --file ../example_list_of_complexes.dat
# Validation
subtimizer validate --file ../example_list_of_complexes.dat --max-jobs 4
# Reporting (Generates 'original' data)
subtimizer report --file ../example_list_of_complexes.dat
- Final Merge: Return to the main directory and run/re-run report to combine results.
cd ..
# This automatically detects 'original_subs' and merges the data
subtimizer report --file example_list_of_complexes.dat
12. ipSAE Evaluation
Perform ipSAE (interaction prediction Score from Aligned Errors) analysis on the folded structures.
- Prerequisite: Download ipSAE and add it to your PATH:
export PATH=$PATH:/path/to/ipSAE_directory
- Run ipSAE Calculation: This command submits a SLURM job to calculate ipSAE metrics for all structures (designed and parental).
subtimizer ipsae --file example_list_of_complexes.dat --max-jobs 16
  - Supports list slicing: subtimizer ipsae --file list.dat --start 1 --end 5
  - Arguments: --pae-cutoff (default: 15), --dist-cutoff (default: 15).
- Generate Final Reports: Run the report command again to generate ipSAE-specific plots (Regression and Colored Scatter plots).
subtimizer report --file example_list_of_complexes.dat
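For orientation, the sketch below prints the main sequence of commands from steps 1-10 and 12 in order. It is a dry run by default and executes nothing. The run and pipeline helpers and the DRY_RUN variable are hypothetical and not part of Subtimizer. Because several steps submit SLURM jobs, a real run would need to wait for those jobs to finish before moving on, which this sketch does not handle.

```shell
LIST="example_list_of_complexes.dat"

# Print a command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Batch-mode commands in the order the steps are described.
# The validate step relies on DL_BINDER_DESIGN_PATH being set.
pipeline() {
    run subtimizer setup --file "$LIST" --type initial
    run subtimizer fold --file "$LIST" --max-jobs 4
    run subtimizer setup --file "$LIST" --type mpnn
    run subtimizer design --file "$LIST" --max-jobs 4
    run subtimizer analyze --file "$LIST"
    run subtimizer cluster --file "$LIST"
    run subtimizer prep-fold --file "$LIST"
    run subtimizer fold --file "$LIST" --stage validation --max-jobs 4
    run subtimizer fix-pdb --file "$LIST"
    run subtimizer validate --file "$LIST" --max-jobs 4
    run subtimizer report --file "$LIST"
    run subtimizer ipsae --file "$LIST" --max-jobs 16
    run subtimizer report --file "$LIST"
}

DRY_RUN=1 pipeline
```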
G. Citation
If you use Subtimizer in your work, please cite:
Yekeen A.A., Meyer C.J., McCoy M., Posner B., Westover K.D. A Computational Workflow for Structure-Guided Design of Potent and Selective Kinase Peptide Substrates. bioRxiv (2025). https://doi.org/10.1101/2025.07.04.663216