pdb2reaction - Automated enzyme reaction path modeling from PDB structures

pdb2reaction: End-to-end Reaction-Path Modeling from PDB Structures Using Machine-Learning Interatomic Potentials

Overview

[Figure: pdb2reaction workflow overview]

pdb2reaction is a Python CLI toolkit for modeling enzymatic reaction pathways from PDB structures using machine-learning interatomic potentials (MLIPs). Each workflow step is also available as an individual subcommand (opt, scan, scan2d, path-search, tsopt, freq, irc, dft, energy-diagram, etc.) for fine-grained control.

A single command can generate a first-pass enzymatic reaction path:

# Multi-PDB mode (R + P → MEP)
pdb2reaction -i 1.R.pdb 3.P.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3'
# Scan mode (single structure → staged bond scans → MEP)
pdb2reaction -i 1.R.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
    --scan-lists '[("CS1 SAM 320","GPP 321 C7",1.60)]' \
                 '[("GPP 321 H11","GLU 186 OE2",0.90)]'

The full workflow — MEP search → TS optimization → IRC → thermochemistry → single-point DFT — can be run in one command:

pdb2reaction -i 1.R.pdb 3.P.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
    --tsopt --thermo --dft

Working examples are provided in the examples/ directory: run.sh contains the complete workflow commands for both the multi-structure MEP pipeline and the scan-based pipeline.


Given (i) two or more PDB files (R → ... → P), or (ii) one PDB with --scan-lists, or (iii) one TS candidate with --tsopt, pdb2reaction automatically:

  • extracts an active-site model around user-defined substrates to build a cluster model,
  • explores minimum-energy paths (MEPs) with GSM or DMF,
  • optionally optimizes transition states, runs vibrational analysis, IRC, and single-point DFT,

using machine-learning interatomic potentials (MLIPs).

Related tools

Tool | Use case | Repository
mlmm-toolkit | ML/MM (ONIOM) with full protein environment; automates MM parameter generation and ML region assignment from a single PDB input | https://github.com/t-0hmura/mlmm_toolkit
UMA–Pysisyphus Interface | YAML-input-based reaction mechanism analysis for small molecules | https://github.com/t-0hmura/uma_pysis

Both pdb2reaction and mlmm-toolkit include a custom GPU-optimized pysisyphus fork for geometry optimization, TS search, and IRC. This bundled fork is not compatible with the upstream pysisyphus package; do not install them side by side.

Important (prerequisites):

  • Input PDB files must already contain hydrogen atoms.
  • When providing multiple PDBs, they must contain the same atoms in the same order (only coordinates may differ).
  • Boolean CLI options accept both --flag / --no-flag and value style --flag True/False (yes/no, 1/0 are also accepted). Prefer toggle style in new scripts.
  • The workflow also works for small-molecule systems. If you omit --center/-c and --ligand-charge, you can use .xyz or .gjf inputs as well.
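
The first two prerequisites can be checked before launching a run. The sketch below is a minimal, standard-library-only illustration and is not part of pdb2reaction: the function names (atom_keys, has_hydrogens, same_atoms_same_order) are hypothetical, and it assumes standard fixed-width PDB columns (atom name in 13-16, residue name in 18-20, chain ID in 22, residue number in 23-26, element in 77-78). The atom-name fallback for files without element columns is a rough heuristic.

```python
from typing import List, Tuple

# (atom name, residue name, chain ID, residue number); hypothetical helper type
AtomKey = Tuple[str, str, str, str]


def atom_keys(pdb_text: str) -> List[AtomKey]:
    """Return one identity tuple per ATOM/HETATM record, in file order."""
    keys = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # Fixed-width PDB columns: name 13-16, resName 18-20, chain 22, resSeq 23-26.
            keys.append((line[12:16].strip(), line[17:20].strip(),
                         line[21:22].strip(), line[22:26].strip()))
    return keys


def has_hydrogens(pdb_text: str) -> bool:
    """True if any atom record looks like a hydrogen."""
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            element = line[76:78].strip().upper()
            if element == "H":
                return True
            # Heuristic fallback when the element columns are blank:
            # atom names such as "H11" or "1HB" are treated as hydrogens.
            name = line[12:16].strip().lstrip("0123456789")
            if not element and name.startswith("H"):
                return True
    return False


def same_atoms_same_order(pdb_a: str, pdb_b: str) -> bool:
    """True if both files list identical atoms in identical order (coordinates ignored)."""
    return atom_keys(pdb_a) == atom_keys(pdb_b)
```

Typical use would be to read each input with open(path).read() and stop early if has_hydrogens is false for any file or same_atoms_same_order is false for any pair.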


Agent Skills

pdb2reaction ships AI-agent instructions under .claude/skills/ so your agent can drive enzyme reaction-mechanism investigations via Claude Code, Cursor, etc.

The skill bundle covers:

  • End-to-end workflows and output parsing (summary.json, R/TS/P canonical paths)
  • CLI subcommands (extract, path-search, tsopt, freq, irc, dft, …)
  • Structure I/O (PDB / XYZ / GJF, charge & multiplicity decisions, link hydrogens & frozen atoms)
  • Installation & Setup instructions
  • HPC operation (PBS / SLURM, multi-GPU)

To activate, copy the .claude/skills/ directory into your project repository or home directory.


Installation

Linux with a CUDA-capable NVIDIA GPU is the validated production environment for the MLIP reaction-path workflows. The core Python package and CPU-only smoke tests also run on macOS and on Windows under WSL2.

Prerequisites

  • Python >= 3.11
  • CUDA 12.x

Minimal setup (CUDA 12.9)

pip install torch --index-url https://download.pytorch.org/whl/cu129
pip install pdb2reaction
plotly_get_chrome -y
huggingface-cli login

For the DMF method (an additional MEP search method)

Install cyipopt (recommended via conda):

conda install -c conda-forge cyipopt -y

For the full step-by-step guide (HPC module load, alternative backends, DFT extras, troubleshooting), see docs/installation.md.

DFT single-point (pdb2reaction dft)

DFT dependencies are not installed by default. To use pdb2reaction dft, install the [dft] extra:

pip install "pdb2reaction[dft]"

This installs PySCF, GPU4PySCF (x86_64 only), and related CUDA libraries.

Supported ML potentials

Potential | Repository | Install extra
UMA (default) | https://github.com/facebookresearch/fairchem | (included)
ORB | https://github.com/orbital-materials/orb-models | pip install "pdb2reaction[orb]"
MACE | https://github.com/ACEsuit/mace | See below
AIMNet2 | https://github.com/isayevlab/aimnetcentral | pip install "pdb2reaction[aimnet]"

MACE installation: Because mace-torch and fairchem-core (UMA) can pin incompatible versions of e3nn, we recommend installing MACE in a dedicated environment. To use MACE, uninstall fairchem-core first, then install MACE:

pip uninstall fairchem-core
pip install mace-torch
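
To confirm that an environment does not contain both packages, something like the following sketch could be used. It relies only on the standard-library importlib.metadata module; the helper names (installed_versions, conflicting_pair) are hypothetical, and the distribution names are taken from the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Callable, Dict, Iterable, Optional


def installed_versions(names: Iterable[str],
                       lookup: Callable[[str], str] = version) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if it is absent."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = lookup(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def conflicting_pair(found: Dict[str, Optional[str]],
                     a: str = "fairchem-core", b: str = "mace-torch") -> bool:
    """True if both distributions are installed in the same environment."""
    return found.get(a) is not None and found.get(b) is not None


if __name__ == "__main__":
    found = installed_versions(["fairchem-core", "mace-torch", "e3nn"])
    print(found)
    if conflicting_pair(found):
        print("fairchem-core and mace-torch are both installed; use separate environments.")
```

The lookup parameter exists only so the check can be exercised without touching the real environment.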

Quick Examples

The examples below use GPP C6-methyltransferase BezA (Tsutsumi et al., Angew. Chem. Int. Ed. 2022, 61, e202111217) — a two-step mechanism: electrophilic methyl transfer from SAM to GPP C6 (via C7 carbocation), then proton abstraction by glutamate (GLU 186). The complete commands are in examples/run.sh.

Full workflow (multi-structure MEP)

pdb2reaction -i 1.R.pdb 3.P.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
    --tsopt --thermo --out-dir result_mep
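
When the same run is launched from a script (for example inside a batch job), it can help to build the argument list in Python rather than concatenating strings. The sketch below is an illustration under stated assumptions: build_all_command is a hypothetical helper, and it uses only the flags that appear in the commands in this README.

```python
import shlex
from typing import List, Optional, Sequence


def build_all_command(inputs: Sequence[str], center: str, ligand_charge: str,
                      tsopt: bool = False, thermo: bool = False, dft: bool = False,
                      out_dir: Optional[str] = None) -> List[str]:
    """Assemble an argv list for the end-to-end pdb2reaction command."""
    cmd = ["pdb2reaction", "-i", *inputs, "-c", center, "-l", ligand_charge]
    if tsopt:
        cmd.append("--tsopt")
    if thermo:
        cmd.append("--thermo")
    if dft:
        cmd.append("--dft")
    if out_dir is not None:
        cmd += ["--out-dir", out_dir]
    return cmd


if __name__ == "__main__":
    cmd = build_all_command(["1.R.pdb", "3.P.pdb"], "SAM,GPP,MG", "SAM:1,GPP:-3",
                            tsopt=True, thermo=True, out_dir="result_mep")
    # Print a shell-safe rendering for logging; to execute it, pass the list
    # itself to subprocess.run(cmd, check=True) so no shell quoting is needed.
    print(shlex.join(cmd))
```

Passing a list to subprocess.run avoids quoting problems with values such as 'SAM:1,GPP:-3'.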

Scan mode (single structure → staged bond scans → MEP)

pdb2reaction -i 1.R.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
    --scan-lists '[("CS1 SAM 320","GPP 321 C7",1.60)]' \
                 '[("GPP 321 H11","GLU 186 OE2",0.90)]' \
    --tsopt --thermo --out-dir result_scan
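
Each --scan-lists argument in the command above is written as a bracketed list of (atom, atom, target value) tuples, one argument per stage. The sketch below generates and sanity-checks such strings. It assumes the value can be read as a Python literal, which matches the examples here but is not confirmed by this README; format_stage and check_stage are hypothetical helpers, and the meaning and units of the third field are whatever the tool expects.

```python
import ast
from typing import List, Sequence, Tuple

# (first atom selector, second atom selector, numeric target); hypothetical type
ScanEntry = Tuple[str, str, float]


def format_stage(entries: Sequence[ScanEntry]) -> str:
    """Render one scan stage in the bracketed tuple-list style used above."""
    parts = ['("%s","%s",%.2f)' % (a, b, d) for a, b, d in entries]
    return "[" + ",".join(parts) + "]"


def check_stage(text: str) -> List[ScanEntry]:
    """Parse a stage string as a Python literal and validate its shape."""
    parsed = ast.literal_eval(text)
    if not isinstance(parsed, list):
        raise ValueError("a scan stage must be a list of tuples")
    for entry in parsed:
        ok = (isinstance(entry, tuple) and len(entry) == 3
              and isinstance(entry[0], str) and isinstance(entry[1], str)
              and isinstance(entry[2], (int, float)))
        if not ok:
            raise ValueError(f"malformed scan entry: {entry!r}")
    return parsed


if __name__ == "__main__":
    stages = [
        format_stage([("CS1 SAM 320", "GPP 321 C7", 1.60)]),
        format_stage([("GPP 321 H11", "GLU 186 OE2", 0.90)]),
    ]
    for stage in stages:
        check_stage(stage)
        print(stage)
```

Each resulting string would be passed as a separate argument after --scan-lists, in stage order.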

TS optimization only

pdb2reaction -i TS_candidate.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
    --tsopt

Step-by-step workflow

1. Extract active-site model (cluster model): extract

pdb2reaction extract -i 1.R.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3'

2. Optimize geometry: opt

pdb2reaction opt -i model.pdb -l 'SAM:1,GPP:-3'

3. MEP search: path-opt

pdb2reaction path-opt -i R_model.pdb IM_model.pdb -l 'SAM:1,GPP:-3'

Recursive MEP search for multi-step reactions: path-search

pdb2reaction path-search -i R_model.pdb P_model.pdb -l 'SAM:1,GPP:-3'

4. TS optimization: tsopt

pdb2reaction tsopt -i hei.pdb -l 'SAM:1,GPP:-3'

5. Frequency analysis: freq

pdb2reaction freq -i ts_optimized.pdb -l 'SAM:1,GPP:-3'

6. IRC: irc

pdb2reaction irc -i ts_optimized.pdb -l 'SAM:1,GPP:-3'

7. DFT single-point: dft

pdb2reaction dft -i optimized.pdb -l 'SAM:1,GPP:-3'

CLI Subcommands

Workflow

Subcommand | Role | Documentation
all | End-to-end: extraction → MEP → TS → IRC → freq → DFT | docs/all.md

Structure Preparation

Subcommand | Role | Documentation
extract | Extract active-site model (cluster model) | docs/extract.md
fix-altloc | Resolve alternate conformations in PDB files | docs/fix-altloc.md
add-elem-info | Add/repair PDB element columns (77–78) | docs/add-elem-info.md

Optimization & Path Search

Subcommand | Role | Documentation
opt | Geometry optimization (L-BFGS or RFO) | docs/opt.md
tsopt | TS optimization (Dimer or RS-I-RFO) | docs/tsopt.md
path-opt | MEP optimization via GSM or DMF | docs/path-opt.md
path-search | Recursive MEP search with refinement | docs/path-search.md
scan | 1D bond-length driven scan | docs/scan.md
scan2d | 2D distance grid scan | docs/scan2d.md
scan3d | 3D distance grid scan | docs/scan3d.md

Analysis

Subcommand | Role | Documentation
freq | Vibrational frequency analysis + thermochemistry | docs/freq.md
irc | IRC calculation (EulerPC) | docs/irc.md
dft | Single-point DFT (GPU4PySCF / PySCF) | docs/dft.md
bond-summary | Compare structures and report bond changes | docs/bond-summary.md

Visualization

Subcommand | Role | Documentation
trj2fig | Energy plot from XYZ trajectory | docs/trj2fig.md
energy-diagram | Energy diagram from numeric values | docs/energy-diagram.md
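
The numeric values for an energy diagram are usually relative energies. As a small worked example of that arithmetic, the sketch below converts absolute energies to values relative to the first state. It assumes the inputs are in Hartree and the desired output is in kcal/mol; this README does not state which units energy-diagram expects, and relative_energies_kcal is a hypothetical helper, not part of the package.

```python
from typing import List, Sequence

# Conversion factor: 1 Hartree expressed in kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.5094740631


def relative_energies_kcal(energies_hartree: Sequence[float]) -> List[float]:
    """Return energies relative to the first entry, converted to kcal/mol."""
    reference = energies_hartree[0]
    return [(e - reference) * HARTREE_TO_KCAL_PER_MOL for e in energies_hartree]


if __name__ == "__main__":
    # Illustrative numbers only (R, TS, P); not results from any calculation.
    for label, value in zip(("R", "TS", "P"),
                            relative_energies_kcal([-100.00, -99.98, -100.01])):
        print(f"{label}: {value:.2f} kcal/mol")
```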

Tip: In tsopt, freq, and irc, setting --hessian-calc-mode Analytical is strongly recommended when you have enough VRAM.


HPC / Multi-GPU

On HPC clusters or multi-GPU workstations, pdb2reaction can parallelize UMA inference across nodes. Set workers and workers_per_node to enable parallel inference; see docs/hpc-example.md for details.


Getting Help

pdb2reaction --help
pdb2reaction <subcommand> --help
pdb2reaction <subcommand> --help-advanced
pdb2reaction all --help-advanced
# Shorthand alias (equivalent to pdb2reaction)
p2r --help
# Equivalent module invocation
python -m pdb2reaction --help

pdb2reaction all --help shows the core options; pdb2reaction all --help-advanced shows the full option list. The same progressive-help pattern (--help for core options, --help-advanced for the full list) applies to scan, scan2d, scan3d, the calculation commands (opt, path-opt, path-search, tsopt, freq, irc, dft), add-elem-info, trj2fig, energy-diagram, extract, and fix-altloc.

If you encounter any issues, please open an issue at https://github.com/t-0hmura/pdb2reaction/issues.


Citation

A preprint describing pdb2reaction is in preparation. Until it is available, if you find this work helpful for your research, please cite the software itself:

@software{ohmura2026pdb2reaction,
  author       = {Ohmura, Takuto},
  title        = {pdb2reaction},
  year         = {2026},
  month        = {4},
  version      = {0.3.8},
  url          = {https://github.com/t-0hmura/pdb2reaction},
  license      = {GPL-3.0},
  doi          = {10.5281/zenodo.19197865}
}

Known limitations

  • MACE and UMA cannot coexist in the same environment due to an e3nn version conflict. Use separate conda environments.
  • DFT single-point (pdb2reaction dft) is practical up to ~300 atoms; larger systems may require fragmentation.
  • ORB backend tends to converge transition states with extra small imaginary modes even when the reaction coordinate is correctly identified (i.e. mechanism recovery is usually fine but a clean single-saddle TS spectrum is not guaranteed). For quantitative studies that need a single-imaginary-mode TS, prefer UMA or MACE, or re-score ORB-converged geometries with DFT.
  • CPU-only execution is supported but 10-100x slower than GPU.

License

pdb2reaction is distributed under the GNU General Public License version 3 (GPL-3.0) and is available for academic and commercial use subject to the GPL-3.0 license terms.
