pdb2reaction - Automated enzyme reaction path modeling from PDB structures

pdb2reaction: automated reaction-path modeling directly from PDB structures

Overview

pdb2reaction is a Python CLI toolkit for turning PDB structures into enzymatic reaction pathways with machine-learning interatomic potentials (MLIPs). Each workflow step is also available as an individual subcommand (opt, scan, scan2d, path-search, tsopt, freq, irc, dft, energy-diagram, etc.) for fine-grained control.

A single command can generate a first-pass enzymatic reaction path:

# bezA (GPP C6-methyltransferase): methyl transfer (SAM→GPP C6) + proton abstraction (E170)
pdb2reaction -i 1.R.pdb 3.P.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3'
# Scan mode (single structure → staged bond scans → MEP)
pdb2reaction -i 1.R.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
    --scan-lists '[("CS1 SAM 320","GPP 321 C7",1.60)]' \
                 '[("GPP 321 H11","GLU 186 OE2",0.90)]'

The full workflow — MEP search → TS optimization → IRC → thermochemistry → single-point DFT — can be run in one command:

pdb2reaction -i 1.R.pdb 3.P.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
    --tsopt --thermo --dft

Working examples are provided in the examples/ directory, including complete scripts for the all workflow in both the multi-structure MEP and scan-based pipelines. The example system is GPP C6-methyltransferase BezA (Tsutsumi et al., Angew. Chem. Int. Ed. 2022, 61, e202111217), which catalyzes a two-step reaction: (1) electrophilic methyl transfer from SAM to the C6 position of GPP via a C7 carbocation intermediate, and (2) proton abstraction from C6 by the catalytic base E170 to yield 6-methylgeranyl pyrophosphate (6MGPP).


Given (i) two or more PDB files (R → ... → P), or (ii) one PDB with --scan-lists, or (iii) one TS candidate with --tsopt, pdb2reaction automatically:

  • extracts an active-site model around user-defined substrates to build a cluster model,
  • explores minimum-energy paths (MEPs) with GSM or DMF,
  • optionally optimizes transition states, runs vibrational analysis, IRC, and single-point DFT,

using machine-learning interatomic potentials (MLIPs).

Related tools

Tool | Use case | Repository
mlmm-toolkit | ML/MM (ONIOM) with full protein environment; automates MM parameter generation and ML region assignment from a single PDB input | https://github.com/t-0hmura/mlmm_toolkit
UMA–Pysisyphus Interface | YAML-input-based reaction mechanism analysis for small molecules | https://github.com/t-0hmura/uma_pysis

Both pdb2reaction and mlmm-toolkit include a custom GPU-optimized pysisyphus fork for geometry optimization, TS search, and IRC. This bundled fork is not compatible with the upstream pysisyphus package; do not install them side by side.

Important (prerequisites):

  • Input PDB files must already contain hydrogen atoms.
  • When providing multiple PDBs, they must contain the same atoms in the same order (only coordinates may differ).
  • Boolean CLI options accept both --flag / --no-flag and value style --flag True/False (yes/no, 1/0 are also accepted). Prefer toggle style in new scripts.
  • The workflow also works for small-molecule systems. If you omit --center/-c and --ligand-charge, you can use .xyz or .gjf inputs as well.
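
The first two prerequisites can be checked before launching a run. The following is a minimal sketch, not part of pdb2reaction: the function names are hypothetical, atoms are identified by the standard fixed-width PDB columns, and hydrogen detection is a heuristic based on the element columns (77–78) with a fallback to the atom name.

```python
from typing import List, Tuple

# (atom name, residue name, chain ID, residue number)
AtomKey = Tuple[str, str, str, str]


def read_atoms(pdb_text: str) -> List[AtomKey]:
    """Return an identity key for every ATOM/HETATM record, in file order."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # Fixed-width PDB columns (1-based): name 13-16, resName 18-20,
            # chainID 22, resSeq 23-26.
            atoms.append((line[12:16].strip(), line[17:20].strip(),
                          line[21:22].strip(), line[22:26].strip()))
    return atoms


def has_hydrogens(pdb_text: str) -> bool:
    """Heuristic: True if any record is hydrogen by element column or atom name."""
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        element = line[76:78].strip().upper()
        if element:
            if element in ("H", "D"):
                return True
            continue
        # Element column blank: fall back to the atom name, skipping leading digits.
        name = line[12:16].strip().lstrip("0123456789")
        if name.upper().startswith("H"):
            return True
    return False


def same_atom_order(*pdb_texts: str) -> bool:
    """True if all structures list identical atoms in identical order."""
    keys = [read_atoms(text) for text in pdb_texts]
    return all(k == keys[0] for k in keys[1:])
```

In practice one would read each input file (for example 1.R.pdb and 3.P.pdb) into a string and require both has_hydrogens and same_atom_order to hold before starting the workflow.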

Documentation

This software is still under development. Please use it at your own risk.


Installation

pdb2reaction requires Linux with a CUDA-capable GPU.

Prerequisites

  • Python >= 3.11
  • CUDA 12.x
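
A quick way to confirm these prerequisites from Python is sketched below. This is an illustrative check, not a pdb2reaction command; the function names are hypothetical, and the CUDA query is only attempted when PyTorch is already installed.

```python
import platform
import sys
from importlib.util import find_spec


def meets_python_requirement(version_info=sys.version_info) -> bool:
    """True for Python 3.11 or newer."""
    return tuple(version_info[:2]) >= (3, 11)


def environment_report() -> dict:
    """Collect the facts relevant to the prerequisites listed above."""
    report = {
        "python_ok": meets_python_requirement(),
        "linux": platform.system() == "Linux",
        "torch_installed": find_spec("torch") is not None,
        "cuda_available": None,      # unknown until torch can be imported
        "torch_cuda_version": None,
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check also runs without torch

        report["cuda_available"] = torch.cuda.is_available()
        report["torch_cuda_version"] = torch.version.cuda
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```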

Minimal setup (CUDA 12.9, torch 2.8.0)

pip install torch==2.8.0 --index-url https://download.pytorch.org/whl/cu129
pip install pdb2reaction
plotly_get_chrome -y
huggingface-cli login

For the DMF method

conda create -n pdb2reaction python=3.11 -y
conda activate pdb2reaction
conda install -c conda-forge cyipopt -y
pip install torch==2.8.0 --index-url https://download.pytorch.org/whl/cu129
pip install pdb2reaction
plotly_get_chrome -y

DFT single-point (pdb2reaction dft)

DFT dependencies are not installed by default. To use pdb2reaction dft, install the [dft] extra:

pip install "pdb2reaction[dft]"

This installs PySCF, GPU4PySCF (x86_64 only), and related CUDA libraries. Note that DFT single-point calculations are practical only for systems up to ~500 atoms; larger systems will require prohibitive compute time and memory.
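
To judge in advance whether a model is within that range, one can count the atom records in the PDB file. This is a minimal sketch with hypothetical names; the 500-atom figure is the approximate guideline quoted above, not a hard cutoff enforced by the tool.

```python
DFT_PRACTICAL_ATOM_LIMIT = 500  # approximate figure from the text, not a hard cutoff


def count_atoms(pdb_text: str) -> int:
    """Count ATOM/HETATM records (assumes a single-model PDB)."""
    return sum(1 for line in pdb_text.splitlines()
               if line.startswith(("ATOM", "HETATM")))


def dft_looks_practical(pdb_text: str, limit: int = DFT_PRACTICAL_ATOM_LIMIT) -> bool:
    """True if the atom count is at or below the guideline limit."""
    return count_atoms(pdb_text) <= limit
```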

For detailed installation instructions, see Installation.

Supported ML potentials

Potential | Repository | Install extra
UMA (default) | https://github.com/facebookresearch/fairchem | (included)
ORB | https://github.com/orbital-materials/orb-models | pip install "pdb2reaction[orb]"
MACE | https://github.com/ACEsuit/mace | See below
AIMNet2 | https://github.com/isayevlab/aimnetcentral | pip install "pdb2reaction[aimnet]"

MACE installation: MACE requires e3nn==0.4.4, which conflicts with fairchem-core (the UMA dependency). To use MACE, first uninstall fairchem-core, then install MACE:

pip uninstall fairchem-core
pip install mace-torch

UMA and MACE cannot coexist in the same environment. Use separate conda environments if you need both.
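
Whether an environment already contains both packages can be checked with the standard library. The sketch below is illustrative and uses hypothetical function names; the distribution names fairchem-core and mace-torch are taken from the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name, lookup=version):
    """Return the installed version string, or None if the package is absent."""
    try:
        return lookup(dist_name)
    except PackageNotFoundError:
        return None


def uma_mace_conflict(lookup=version) -> bool:
    """True if both fairchem-core (UMA) and mace-torch are installed together."""
    return (installed_version("fairchem-core", lookup) is not None
            and installed_version("mace-torch", lookup) is not None)


if __name__ == "__main__":
    if uma_mace_conflict():
        print("fairchem-core and mace-torch are both installed; "
              "use separate environments.")
```

The lookup parameter exists only so the check can be exercised without modifying the real environment.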


Quick Examples

The examples below use GPP C6-methyltransferase BezA (Tsutsumi et al., Angew. Chem. Int. Ed. 2022, 61, e202111217) — a two-step mechanism: electrophilic methyl transfer from SAM to GPP C6 (via C7 carbocation), then proton abstraction by E170. Complete working scripts are in examples/.

Full workflow (multi-structure MEP)

pdb2reaction -i 1.R.pdb 3.P.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
    --tsopt --thermo --out-dir result_mep

Scan mode (single structure → staged bond scans → MEP)

pdb2reaction -i 1.R.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
    --scan-lists '[("CS1 SAM 320","GPP 321 C7",1.60)]' \
                 '[("GPP 321 H11","GLU 186 OE2",0.90)]' \
    --tsopt --thermo --out-dir result_scan
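
When scan stages are generated by a script, the --scan-lists strings can be built and sanity-checked in Python before they reach the shell. This sketch assumes, based only on the syntax shown above, that each stage is a Python-literal list of (atom, atom, target distance) tuples; the helper names are hypothetical and not part of pdb2reaction.

```python
import ast
from typing import List, Sequence, Tuple

ScanTarget = Tuple[str, str, float]  # (atom spec, atom spec, target distance)


def format_stage(targets: Sequence[ScanTarget]) -> str:
    """Render one scan stage in the literal style used in the commands above."""
    items = ",".join(f'("{a}","{b}",{dist:.2f})' for a, b, dist in targets)
    return f"[{items}]"


def parse_stage(text: str) -> List[ScanTarget]:
    """Parse a stage string back into tuples, rejecting malformed entries."""
    stage = ast.literal_eval(text)  # evaluates literals only, never arbitrary code
    if not isinstance(stage, list):
        raise ValueError("a stage must be a list of tuples")
    for entry in stage:
        if not (isinstance(entry, tuple) and len(entry) == 3):
            raise ValueError(f"expected (atom, atom, distance), got {entry!r}")
        a, b, dist = entry
        if not (isinstance(a, str) and isinstance(b, str)):
            raise ValueError(f"atom specifications must be strings: {entry!r}")
        if not isinstance(dist, (int, float)) or dist <= 0:
            raise ValueError(f"target distance must be positive: {entry!r}")
    return stage


def scan_lists_args(stages: Sequence[Sequence[ScanTarget]]) -> List[str]:
    """Build the argument list for a subprocess call, one string per stage."""
    return ["--scan-lists", *(format_stage(stage) for stage in stages)]
```

Passing the resulting list to subprocess.run avoids shell quoting altogether, which matters because the stage strings contain both double quotes and parentheses.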

TS optimization only

pdb2reaction -i TS_candidate.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
    --tsopt

Step-by-step workflow

1. Extract active-site model (cluster model): extract

pdb2reaction extract -i 1.R.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' -r 6.0

2. Optimize geometry: opt

pdb2reaction opt -i model.pdb -l 'SAM:1,GPP:-3'

3. MEP search: path-opt

pdb2reaction path-opt -i R_model.pdb P_model.pdb -l 'SAM:1,GPP:-3'

4. TS optimization: tsopt

pdb2reaction tsopt -i hei.pdb -l 'SAM:1,GPP:-3'

5. Frequency analysis: freq

pdb2reaction freq -i ts_optimized.pdb -l 'SAM:1,GPP:-3'

6. IRC: irc

pdb2reaction irc -i ts_optimized.pdb -l 'SAM:1,GPP:-3'

7. DFT single-point: dft

pdb2reaction dft -i optimized.pdb -l 'SAM:1,GPP:-3'
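
The seven steps above can also be driven from a script. The sketch below only reuses the commands and options shown in this section; the input file names (model.pdb, hei.pdb, and so on) are the placeholders used above, not guaranteed output names of each step, and the function names are hypothetical. By default it prints the commands without running them.

```python
import shlex
import subprocess

LIGAND_CHARGES = "SAM:1,GPP:-3"


def step_commands(charges: str = LIGAND_CHARGES):
    """Return the step-by-step commands as argument lists, in workflow order."""
    base = "pdb2reaction"
    return [
        [base, "extract", "-i", "1.R.pdb", "-c", "SAM,GPP,MG", "-l", charges, "-r", "6.0"],
        [base, "opt", "-i", "model.pdb", "-l", charges],
        [base, "path-opt", "-i", "R_model.pdb", "P_model.pdb", "-l", charges],
        [base, "tsopt", "-i", "hei.pdb", "-l", charges],
        [base, "freq", "-i", "ts_optimized.pdb", "-l", charges],
        [base, "irc", "-i", "ts_optimized.pdb", "-l", charges],
        [base, "dft", "-i", "optimized.pdb", "-l", charges],
    ]


def run_steps(dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in step_commands():
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_steps(dry_run=True)
```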

CLI Subcommands

Workflow

Subcommand | Role | Documentation
all | End-to-end: extraction → MEP → TS → IRC → freq → DFT | docs/all.md

Structure Preparation

Subcommand | Role | Documentation
extract | Extract active-site model (cluster model) | docs/extract.md
fix-altloc | Resolve alternate conformations in PDB files | docs/fix_altloc.md
add-elem-info | Add/repair PDB element columns (77–78) | docs/add_elem_info.md
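
To see whether a file needs add-elem-info at all, one can look for atom records whose element columns (77–78) are blank. This is a small diagnostic sketch with a hypothetical function name; it only reports the affected lines and does not attempt to infer elements.

```python
from typing import List


def missing_element_lines(pdb_text: str) -> List[int]:
    """Return 1-based line numbers of ATOM/HETATM records with a blank element field."""
    missing = []
    for lineno, line in enumerate(pdb_text.splitlines(), start=1):
        # Columns 77-78 (1-based) hold the right-justified element symbol;
        # slicing a short line simply yields an empty string.
        if line.startswith(("ATOM", "HETATM")) and not line[76:78].strip():
            missing.append(lineno)
    return missing
```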

Optimization & Path Search

Subcommand | Role | Documentation
opt | Geometry optimization (L-BFGS or RFO) | docs/opt.md
tsopt | TS optimization (Dimer or RS-I-RFO) | docs/tsopt.md
path-opt | MEP optimization via GSM or DMF | docs/path_opt.md
path-search | Recursive MEP search with refinement | docs/path_search.md
scan | 1D bond-length driven scan | docs/scan.md
scan2d | 2D distance grid scan | docs/scan2d.md
scan3d | 3D distance grid scan | docs/scan3d.md

Analysis

Subcommand | Role | Documentation
freq | Vibrational frequency analysis + thermochemistry | docs/freq.md
irc | IRC calculation (EulerPC) | docs/irc.md
dft | Single-point DFT (GPU4PySCF / PySCF) | docs/dft.md
bond-summary | Compare structures and report bond changes | docs/bond-summary.md

Visualization

Subcommand | Role | Documentation
trj2fig | Energy plot from XYZ trajectory | docs/trj2fig.md
energy-diagram | Energy diagram from numeric values | docs/energy_diagram.md
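
Numeric values for an energy diagram are usually relative energies. The sketch below converts absolute energies in Hartree to kcal/mol relative to a reference state; it is an illustration only, since the units and input format that energy-diagram expects are not stated here and should be taken from docs/energy_diagram.md. The function name is hypothetical.

```python
from typing import List, Sequence

HARTREE_TO_KCAL_PER_MOL = 627.509474  # 1 Hartree in kcal/mol


def relative_energies(energies_hartree: Sequence[float],
                      reference_index: int = 0) -> List[float]:
    """Energies relative to the chosen reference state, in kcal/mol."""
    reference = energies_hartree[reference_index]
    return [(energy - reference) * HARTREE_TO_KCAL_PER_MOL
            for energy in energies_hartree]
```

For a reactant, transition state, and product listed in that order, the first value is zero by construction, the second is the barrier, and the third is the reaction energy.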

Tip: In tsopt, freq, and irc, setting --hessian-calc-mode Analytical is strongly recommended when you have enough VRAM.


HPC / Multi-GPU

On HPC clusters or multi-GPU workstations, pdb2reaction can parallelize UMA inference across nodes. Set workers and workers_per_node to enable parallel inference; see docs/uma_pysis.md for details.


Getting Help

pdb2reaction --help
pdb2reaction <subcommand> --help
pdb2reaction <subcommand> --help-advanced
pdb2reaction all --help-advanced
# Shorthand alias (equivalent to pdb2reaction)
p2r --help
# Equivalent module invocation
python -m pdb2reaction --help

pdb2reaction all --help shows the core options; pdb2reaction all --help-advanced shows the full option list. The subcommands scan, scan2d, scan3d, opt, path-opt, path-search, tsopt, freq, irc, dft, add-elem-info, trj2fig, energy-diagram, extract, and fix-altloc follow the same progressive-help pattern: --help for core options, --help-advanced for the full list.

If you encounter any issues, please open an issue at https://github.com/t-0hmura/pdb2reaction/issues.


Citation

A preprint describing pdb2reaction is in preparation. Until it is available, if you find this work helpful for your research, please cite the software itself:

@software{ohmura2026pdb2reaction,
  author       = {Ohmura, Takuto},
  title        = {pdb2reaction},
  year         = {2026},
  month        = {3},
  version      = {0.3.2},
  url          = {https://github.com/t-0hmura/pdb2reaction},
  license      = {GPL-3.0},
  doi          = {10.5281/zenodo.19197878}
}

Known limitations

  • MACE and UMA cannot coexist in the same environment due to an e3nn version conflict. Use separate conda environments.
  • DFT single-point (pdb2reaction dft) is practical up to ~500 atoms; larger systems may require fragmentation.
  • ORB backend has a higher failure rate on multi-step reactions (SVD failures in path optimization).
  • CPU-only execution is supported but 10-100x slower than GPU.

License

pdb2reaction is distributed under the GNU General Public License version 3 (GPL-3.0).
