pdb2reaction - Automated enzyme reaction path modeling from PDB structures
pdb2reaction: End-to-end Reaction-Path Modeling from PDB Structures Using Machine-Learning Interatomic Potentials
Overview
pdb2reaction is a Python CLI toolkit for modeling enzymatic reaction pathways from PDB structures using machine-learning interatomic potentials (MLIPs). Each workflow step is also available as an individual subcommand (opt, scan, scan2d, path-search, tsopt, freq, irc, dft, energy-diagram, etc.) for fine-grained control.
A single command can generate a first-pass enzymatic reaction path:
# Multi-PDB mode (R + P → MEP)
pdb2reaction -i 1.R.pdb 3.P.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3'
# Scan mode (single structure → staged bond scans → MEP)
pdb2reaction -i 1.R.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
--scan-lists '[("CS1 SAM 320","GPP 321 C7",1.60)]' \
'[("GPP 321 H11","GLU 186 OE2",0.90)]'
The full workflow — MEP search → TS optimization → IRC → thermochemistry → single-point DFT — can be run in one command:
pdb2reaction -i 1.R.pdb 3.P.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
--tsopt --thermo --dft
Working examples are provided in the examples/ directory: a run.sh with complete all workflow commands for both the multi-structure MEP and the scan-based pipeline.
Given (i) two or more PDB files (R → ... → P), or (ii) one PDB with --scan-lists, or (iii) one TS candidate with --tsopt, pdb2reaction automatically:
- extracts an active-site model around user-defined substrates to build a cluster model,
- explores minimum-energy paths (MEPs) with GSM or DMF,
- optionally optimizes transition states, runs vibrational analysis, IRC, and single-point DFT,
using machine-learning interatomic potentials (MLIPs).
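The commands above pass per-ligand charges as a compact string such as 'SAM:1,GPP:-3'. The following is a minimal sketch for sanity-checking such a string before launching a run. It assumes each comma-separated entry has the form NAME:CHARGE with an integer charge. The function name parse_ligand_charges is hypothetical and is not part of pdb2reaction, and how the tool itself interprets the string is not shown here.

```python
def parse_ligand_charges(spec: str) -> dict[str, int]:
    """Split a 'NAME:CHARGE,NAME:CHARGE' string into a name -> charge mapping.

    Hypothetical helper: the NAME:CHARGE reading is an assumption based on
    the example string 'SAM:1,GPP:-3'.
    """
    charges: dict[str, int] = {}
    for item in spec.split(","):
        name, sep, value = item.strip().partition(":")
        if not sep or not name.strip():
            raise ValueError(f"expected NAME:CHARGE, got {item!r}")
        # int() raises ValueError for empty or non-integer charges.
        charges[name.strip()] = int(value)
    return charges


charges = parse_ligand_charges("SAM:1,GPP:-3")
ligand_total = sum(charges.values())  # net charge of the listed ligands only
```

The summed value covers only the ligands named in the string; protein residues and metal ions in the cluster model are not accounted for by this sketch.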
Related tools
| Tool | Use case | Repository |
|---|---|---|
| mlmm-toolkit | ML/MM (ONIOM) with full protein environment — automates MM parameter generation and ML region assignment from a single PDB input | https://github.com/t-0hmura/mlmm_toolkit |
| UMA–Pysisyphus Interface | YAML-input-based reaction mechanism analysis for small molecules | https://github.com/t-0hmura/uma_pysis |
Both pdb2reaction and mlmm-toolkit include a custom GPU-optimized pysisyphus fork for geometry optimization, TS search, and IRC. This bundled fork is not compatible with the upstream pysisyphus package; do not install them side by side.
Important (prerequisites):
- Input PDB files must already contain hydrogen atoms.
- When providing multiple PDBs, they must contain the same atoms in the same order (only coordinates may differ).
- Boolean CLI options accept both --flag/--no-flag and value style --flag True/False (yes/no, 1/0 are also accepted). Prefer toggle style in new scripts.
- The workflow also works for small-molecule systems. If you omit --center/-c and --ligand-charge, you can use .xyz or .gjf inputs as well.
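The first two prerequisites can be checked before a run. The sketch below is a standalone pre-flight check, not a pdb2reaction feature: it reads ATOM/HETATM records using the fixed PDB columns, reports inputs that appear to lack hydrogens, and compares atom identity and order across files. The function names are hypothetical, and the fallback hydrogen test based on atom names is a heuristic.

```python
def atom_lines(pdb_text: str) -> list[str]:
    """Return the ATOM/HETATM lines of a PDB file."""
    return [l for l in pdb_text.splitlines() if l.startswith(("ATOM", "HETATM"))]


def atom_identity(line: str) -> tuple[str, str, str, str]:
    """(atom name, residue name, chain ID, residue number) from fixed PDB columns."""
    return (line[12:16].strip(), line[17:20].strip(),
            line[21:22].strip(), line[22:26].strip())


def is_hydrogen(line: str) -> bool:
    """Use the element field (columns 77-78) if present, else the atom name."""
    element = line[76:78].strip().upper()
    if element:
        return element in ("H", "D")
    # Heuristic fallback: names such as 'H11' or '1HB' after dropping leading digits.
    return line[12:16].strip().lstrip("0123456789").upper().startswith("H")


def check_pdb_texts(texts: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems: list[str] = []
    names = list(texts)
    for name in names:
        if not any(is_hydrogen(l) for l in atom_lines(texts[name])):
            problems.append(f"{name}: no hydrogen atoms found")
    if not names:
        return problems
    reference = [atom_identity(l) for l in atom_lines(texts[names[0]])]
    for name in names[1:]:
        current = [atom_identity(l) for l in atom_lines(texts[name])]
        if len(current) != len(reference):
            problems.append(f"{name}: {len(current)} atoms, expected {len(reference)}")
            continue
        for index, (ref, cur) in enumerate(zip(reference, current), start=1):
            if ref != cur:
                problems.append(f"{name}: atom {index} is {cur}, expected {ref}")
                break  # report only the first mismatch per file
    return problems
```

To use it on the example inputs, read each file into the dictionary, for instance with pathlib.Path("1.R.pdb").read_text(), and print whatever check_pdb_texts returns.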
Documentation
- Getting Started — Quick start and workflow overview
- Installation — Setup and dependency installation
- Examples — Working all workflow commands (MEP and scan pipelines) for BezA, in examples/run.sh
- YAML Reference — Configuration options
- JSON Output Reference — Machine-readable result.json schema
- Troubleshooting — Common errors, backend selection guide, VRAM requirements
- Full documentation: t-0hmura.github.io/pdb2reaction/
Agent Skills
pdb2reaction ships AI-agent instructions under .claude/skills/ so your agent can drive enzyme reaction-mechanism investigations via Claude Code, Cursor, etc.
The skill bundle covers:
- End-to-end workflows and output parsing (summary.json, R/TS/P canonical paths)
- CLI subcommands (extract, path-search, tsopt, freq, irc, dft, …)
- Structure I/O (PDB / XYZ / GJF, charge & multiplicity decisions, link hydrogens & frozen atoms)
- Installation & Setup instructions
- HPC operation (PBS / SLURM, multi-GPU)
To activate, copy the .claude/skills/ directory into your project repository or home directory.
Installation
Linux with a CUDA-capable NVIDIA GPU is the validated production environment for the MLIP reaction-path workflows. The core Python package and CPU-only smoke tests also run on macOS and on Windows under WSL2.
Prerequisites
- Python >= 3.11
- CUDA 12.x
Minimal setup (CUDA 12.9)
pip install torch --index-url https://download.pytorch.org/whl/cu129
pip install pdb2reaction
plotly_get_chrome -y
huggingface-cli login
For DMF method (Additional MEP search method)
Install cyipopt (recommended via conda):
conda install -c conda-forge cyipopt -y
For the full step-by-step guide (HPC module load, alternative backends, DFT extras, troubleshooting), see docs/installation.md.
DFT single-point (pdb2reaction dft)
DFT dependencies are not installed by default. To use pdb2reaction dft, install the [dft] extra:
pip install "pdb2reaction[dft]"
This installs PySCF, GPU4PySCF (x86_64 only), and related CUDA libraries.
Supported ML potentials
| Potential | Repository | Install extra |
|---|---|---|
| UMA (default) | https://github.com/facebookresearch/fairchem | (included) |
| ORB | https://github.com/orbital-materials/orb-models | pip install "pdb2reaction[orb]" |
| MACE | https://github.com/ACEsuit/mace | See below |
| AIMNet2 | https://github.com/isayevlab/aimnetcentral | pip install "pdb2reaction[aimnet]" |
MACE installation: Because mace-torch and fairchem-core (UMA) can pin incompatible versions of e3nn, we recommend installing MACE in a dedicated environment. To use MACE, uninstall fairchem-core first, then install MACE:
pip uninstall fairchem-core
pip install mace-torch
Quick Examples
The examples below use GPP C6-methyltransferase BezA (Tsutsumi et al., Angew. Chem. Int. Ed. 2022, 61, e202111217) — a two-step mechanism: electrophilic methyl transfer from SAM to GPP C6 (via C7 carbocation), then proton abstraction by glutamate (GLU 186). The complete commands are in examples/run.sh.
Full workflow (multi-structure MEP)
pdb2reaction -i 1.R.pdb 3.P.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
--tsopt --thermo --out-dir result_mep
Scan mode (single structure → staged bond scans → MEP)
pdb2reaction -i 1.R.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
--scan-lists '[("CS1 SAM 320","GPP 321 C7",1.60)]' \
'[("GPP 321 H11","GLU 186 OE2",0.90)]' \
--tsopt --thermo --out-dir result_scan
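Each --scan-lists argument in the command above looks like a Python literal: a list of (atom, atom, value) tuples, one list per scan stage. The sketch below validates such strings before they are passed to the CLI. It is an illustration only. That pdb2reaction parses the argument this way is an assumption based on the syntax, as is reading the third value as a positive target distance. The helper name parse_scan_stage is hypothetical.

```python
import ast


def parse_scan_stage(literal: str) -> list[tuple[str, str, float]]:
    """Parse one scan-stage string into (atom1, atom2, target) tuples.

    Assumes the Python-literal form used in the example commands.
    """
    try:
        stage = ast.literal_eval(literal)
    except SyntaxError as exc:
        raise ValueError(f"not a valid literal: {literal!r}") from exc
    if not isinstance(stage, list):
        raise ValueError("a scan stage must be a list of tuples")
    parsed = []
    for entry in stage:
        if not (isinstance(entry, tuple) and len(entry) == 3):
            raise ValueError(f"expected (atom1, atom2, target), got {entry!r}")
        atom1, atom2, target = entry
        if not (isinstance(atom1, str) and isinstance(atom2, str)):
            raise ValueError(f"atom selectors must be strings: {entry!r}")
        if isinstance(target, bool) or not isinstance(target, (int, float)):
            raise ValueError(f"target must be a number: {entry!r}")
        if target <= 0:
            raise ValueError(f"target must be positive: {entry!r}")
        parsed.append((atom1, atom2, float(target)))
    return parsed


# The two stages from the BezA example: methyl transfer, then proton abstraction.
stages = [
    parse_scan_stage('[("CS1 SAM 320","GPP 321 C7",1.60)]'),
    parse_scan_stage('[("GPP 321 H11","GLU 186 OE2",0.90)]'),
]
```

The atom selector strings are passed through unchanged, since the example uses more than one field order ("CS1 SAM 320" and "GPP 321 C7") and the accepted selector grammar is not specified in this section.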
TS optimization only
pdb2reaction -i TS_candidate.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
--tsopt
Step-by-step workflow
1. Extract active-site model (cluster model) — extract
pdb2reaction extract -i 1.R.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3'
2. Optimize geometry — opt
pdb2reaction opt -i model.pdb -l 'SAM:1,GPP:-3'
3. MEP search — path-opt
pdb2reaction path-opt -i R_model.pdb IM_model.pdb -l 'SAM:1,GPP:-3'
Recursive MEP search for multi-step reactions — path-search
pdb2reaction path-search -i R_model.pdb P_model.pdb -l 'SAM:1,GPP:-3'
4. TS optimization — tsopt
pdb2reaction tsopt -i hei.pdb -l 'SAM:1,GPP:-3'
5. Frequency analysis — freq
pdb2reaction freq -i ts_optimized.pdb -l 'SAM:1,GPP:-3'
6. IRC — irc
pdb2reaction irc -i ts_optimized.pdb -l 'SAM:1,GPP:-3'
7. DFT single-point — dft
pdb2reaction dft -i optimized.pdb -l 'SAM:1,GPP:-3'
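The seven steps above can also be driven from a script. The sketch below assembles the same command lines as argument lists and prints them. It only executes them when dry_run is set to False. The file names are the placeholders used in the steps above. Which files each step actually writes is not specified in this section, so the inputs would need to be adjusted to the real outputs. The function names build_steps and run_steps are hypothetical.

```python
import shlex
import subprocess

CENTER = "SAM,GPP,MG"
LIGAND_CHARGES = "SAM:1,GPP:-3"


def build_steps(recursive: bool = False) -> list[list[str]]:
    """Return argv lists mirroring the step-by-step commands above.

    recursive=True swaps path-opt for path-search (multi-step reactions).
    """
    if recursive:
        mep = ["path-search", "-i", "R_model.pdb", "P_model.pdb"]
    else:
        mep = ["path-opt", "-i", "R_model.pdb", "IM_model.pdb"]
    steps = [
        ["extract", "-i", "1.R.pdb", "-c", CENTER],
        ["opt", "-i", "model.pdb"],
        mep,
        ["tsopt", "-i", "hei.pdb"],
        ["freq", "-i", "ts_optimized.pdb"],
        ["irc", "-i", "ts_optimized.pdb"],
        ["dft", "-i", "optimized.pdb"],
    ]
    # Every command in the steps above ends with the same -l option.
    return [["pdb2reaction", *step, "-l", LIGAND_CHARGES] for step in steps]


def run_steps(steps: list[list[str]], dry_run: bool = True) -> None:
    """Print each command; run it and stop on the first failure unless dry_run."""
    for argv in steps:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

Calling run_steps(build_steps()) prints the commands without running anything, which is a cheap way to review a pipeline before submitting it to a GPU node.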
CLI Subcommands
Workflow
| Subcommand | Role | Documentation |
|---|---|---|
| all | End-to-end: extraction → MEP → TS → IRC → freq → DFT | docs/all.md |
Structure Preparation
| Subcommand | Role | Documentation |
|---|---|---|
| extract | Extract active-site model (cluster model) | docs/extract.md |
| fix-altloc | Resolve alternate conformations in PDB files | docs/fix-altloc.md |
| add-elem-info | Add/repair PDB element columns (77–78) | docs/add-elem-info.md |
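For context on the element columns mentioned for add-elem-info: in the fixed-width PDB format, the element symbol of each ATOM/HETATM record sits in columns 77–78. The sketch below is a standalone illustration that lists the records where that field is blank or the line is too short to contain it. It does not reproduce what add-elem-info does internally, and the function names are hypothetical.

```python
def element_field(line: str) -> str:
    """Element symbol from columns 77-78 ('' if blank or the line is too short)."""
    return line[76:78].strip()


def lines_missing_element(pdb_text: str) -> list[int]:
    """1-based line numbers of ATOM/HETATM records without an element symbol."""
    missing = []
    for number, line in enumerate(pdb_text.splitlines(), start=1):
        if line.startswith(("ATOM", "HETATM")) and not element_field(line):
            missing.append(number)
    return missing
```

A non-empty result suggests the file would benefit from a repair pass before extraction.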
Optimization & Path Search
| Subcommand | Role | Documentation |
|---|---|---|
| opt | Geometry optimization (L-BFGS or RFO) | docs/opt.md |
| tsopt | TS optimization (Dimer or RS-I-RFO) | docs/tsopt.md |
| path-opt | MEP optimization via GSM or DMF | docs/path-opt.md |
| path-search | Recursive MEP search with refinement | docs/path-search.md |
| scan | 1D bond-length driven scan | docs/scan.md |
| scan2d | 2D distance grid scan | docs/scan2d.md |
| scan3d | 3D distance grid scan | docs/scan3d.md |
Analysis
| Subcommand | Role | Documentation |
|---|---|---|
| freq | Vibrational frequency analysis + thermochemistry | docs/freq.md |
| irc | IRC calculation (EulerPC) | docs/irc.md |
| dft | Single-point DFT (GPU4PySCF / PySCF) | docs/dft.md |
| bond-summary | Compare structures and report bond changes | docs/bond-summary.md |
Visualization
| Subcommand | Role | Documentation |
|---|---|---|
| trj2fig | Energy plot from XYZ trajectory | docs/trj2fig.md |
| energy-diagram | Energy diagram from numeric values | docs/energy-diagram.md |
Tip: In tsopt, freq, and irc, setting --hessian-calc-mode Analytical is strongly recommended when you have enough VRAM.
HPC / Multi-GPU
On HPC clusters or multi-GPU workstations, pdb2reaction can parallelize UMA inference across nodes. Set workers and workers_per_node to enable parallel inference; see docs/hpc-example.md for details.
Getting Help
pdb2reaction --help
pdb2reaction <subcommand> --help
pdb2reaction <subcommand> --help-advanced
pdb2reaction all --help-advanced
# Shorthand alias (equivalent to pdb2reaction)
p2r --help
# Equivalent module invocation
python -m pdb2reaction --help
pdb2reaction all --help shows core options. Use pdb2reaction all --help-advanced for the full option list.
The following subcommands use the same progressive-help pattern (--help for core options, --help-advanced for the full option list): scan, scan2d, scan3d, the calculation commands (opt, path-opt, path-search, tsopt, freq, irc, dft), add-elem-info, trj2fig, energy-diagram, extract, and fix-altloc.
If you encounter any issues, please open an issue at https://github.com/t-0hmura/pdb2reaction/issues.
Citation
A preprint describing pdb2reaction is in preparation. In the meantime, if you find this work helpful for your research, please cite the software itself:
@software{ohmura2026pdb2reaction,
author = {Ohmura, Takuto},
title = {pdb2reaction},
year = {2026},
month = {4},
version = {0.3.8},
url = {https://github.com/t-0hmura/pdb2reaction},
license = {GPL-3.0},
doi = {10.5281/zenodo.19197865}
}
Known limitations
- MACE and UMA cannot coexist in the same environment due to an e3nn version conflict. Use separate conda environments.
- DFT single-point (pdb2reaction dft) is practical up to ~300 atoms; larger systems may require fragmentation.
- ORB backend tends to converge transition states with extra small imaginary modes even when the reaction coordinate is correctly identified (i.e. mechanism recovery is usually fine but a clean single-saddle TS spectrum is not guaranteed). For quantitative studies that need a single-imaginary-mode TS, prefer UMA or MACE, or re-score ORB-converged geometries with DFT.
- CPU-only execution is supported but 10-100x slower than GPU.
License
pdb2reaction is distributed under the GNU General Public License version 3 (GPL-3.0) and is available for academic and commercial use subject to the GPL-3.0 license terms.