FastMDXplora: Fully Automated SysTem for Molecular Dynamics eXploration
FastMDXplora is a project-level orchestrator for end-to-end molecular dynamics studies. A single command takes a protein structure (or PDB ID) from input to a publication-quality deliverable, coordinating four phases:
setup → simulation → analysis → report
FastMDXplora is the next generation of FastMDAnalysis (Aina & Kwan, J. Comput. Chem. 2026, DOI: 10.1002/jcc.70350) — the same automated, reproducibility-by-design philosophy, extended from trajectory analysis to the full molecular dynamics study: setup, simulation (including enhanced sampling), protein and protein-ligand analysis, and reporting. It is not a generic workflow engine — the workflow is built-in, the domain knowledge is built-in, and the user expresses intent rather than describing a workflow graph (a DAG, or directed acyclic graph, the task-and-dependency model used by tools like Snakemake and Nextflow).
Highlights
- Single-command end-to-end MD — from PDB to slides in one invocation
- Protein-ligand ready — parameterize a small-molecule ligand (OpenFF) from a feasible bound pose; ligand-aware analyses (pose RMSD, contacts, protein-ligand H-bonds) run automatically
- Project-level orchestrator pattern — shared state, registered phases, intelligent defaults, consolidated outputs
- Granular control when you want it — run any single phase independently
- Self-contained — the analysis and report phases have no heavy runtime dependencies
- Reproducibility built in — every run writes a structured manifest of parameters, software versions, and artifact paths
- Publication-quality reporting — automated slide deck, structured Markdown report, self-contained project bundle
Installation
FastMDXplora's four phases have different dependency footprints. The analysis and report phases work from pip alone; the setup and simulation phases need PDBFixer + OpenMM, which are distributed primarily through conda-forge. So there are two routes — pick by what you need.
Full install (all four phases) — from the git repo
The setup/simulation chemistry stack (OpenMM, PDBFixer) installs most reliably from conda-forge, so the full install uses the bundled environment.yml. We recommend mamba (a faster conda solver); plain conda works too.
git clone https://github.com/aai-research-lab/FastMDXplora.git
cd FastMDXplora
mamba env create -f environment.yml || conda env create -f environment.yml
conda activate fastmdxplora
pip install .
Don't have mamba? Either install Miniforge (see below), or just use conda; the || above falls back to it automatically.
Analysis + report only — from PyPI
If you only need to analyze existing trajectories and build reports (no simulation), plain pip is enough — no conda required:
pip install fastmdxplora # primary package
pip install fastmdx # alias (resolves to fastmdxplora)
This gives a fully working analysis + report pipeline, slide deck included (python-pptx is a core dependency). The setup and simulation phases emit a clear warning and skip gracefully until the chemistry stack is present. Add it via conda-forge (recommended, reliable across platforms):
conda install -c conda-forge pdbfixer openmm
or best-effort via the [md] pip extras (PDBFixer wheels are unavailable on some platforms, so conda is preferred):
pip install "fastmdxplora[md]"
Development install
git clone https://github.com/aai-research-lab/FastMDXplora.git
cd FastMDXplora
mamba env create -f environment.yml || conda env create -f environment.yml
conda activate fastmdxplora
pip install -e ".[test]" # editable, with the test dependencies
Verify
fastmdx --version
fastmdx info # versions + detected backends (OpenMM/PDBFixer)
Check which OpenMM platforms are available (CPU/CUDA/OpenCL):
python - <<'PY'
import openmm as mm
plats = [mm.Platform.getPlatform(i).getName() for i in range(mm.Platform.getNumPlatforms())]
print("Available platforms:", plats)
print("CUDA available" if "CUDA" in plats else "CPU-only — simulations will run on CPU")
PY
conda-forge package (coming soon). A single-command conda install -c conda-forge fastmdxplora (pulling every dependency, all four phases working out of the box) is planned once the recipe clears review. Until then, use the git + environment.yml route above.
Mamba / Miniforge (optional)
mamba is a drop-in, faster replacement for the conda solver — helpful because solving the OpenMM/CUDA stack is exactly where the classic solver is slow. If you don't have it, the easiest source is Miniforge (conda + mamba, preconfigured for conda-forge):
# Linux (x86_64) — see https://conda-forge.org/miniforge/ for macOS/Windows/ARM
curl -L -o "$HOME/Miniforge3.sh" \
"https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh"
bash "$HOME/Miniforge3.sh" -b -p "$HOME/miniforge3"
source "$HOME/miniforge3/etc/profile.d/conda.sh"
conda init "$(basename "$SHELL")"
If mamba still isn't on PATH afterward, add it to the base environment:
conda install -n base -c conda-forge mamba
For other operating systems (macOS Intel/Apple Silicon, Linux ARM64, Windows), grab the matching installer from the Miniforge releases page.
Examples
Command line
Run the full pipeline (setup → simulate → analyze → report):
fastmdx explore --system protein.pdb
Fetch a structure from the PDB by ID (auto-detected, fetched from RCSB):
fastmdx explore --system 1L2Y
Tune per-phase options (flags are namespaced by phase):
fastmdx explore -s protein.pdb --setup-ph 7.4 --simulate-duration-ns 100 --simulate-platform CUDA
Run only specific phases:
fastmdx explore -s protein.pdb --include setup simulation
Run a single phase (bare flags, no phase prefix):
fastmdx setup -s protein.pdb --ph 6.5
fastmdx simulate --output run_001 --duration-ns 50 --platform CUDA
fastmdx analyze --output run_001 --analyses rmsd rmsf rg
Drive a whole study from a config file (-c and -config also work):
fastmdx explore --config study.yml
Generate a commented config template to edit:
fastmdx init-config -o study.yml
The -s, -system, and --system forms are equivalent; xplore is an alias of explore.
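To make the auto-detection of a PDB ID versus a file path concrete, here is a minimal sketch of how a --system value could be classified. The helper name classify_system, the regular expression, and the returned labels are assumptions for illustration and are not taken from the FastMDXplora source.

```python
import re
from pathlib import Path

# Assumption: a classic PDB ID is 4 characters, one digit followed by three alphanumerics.
PDB_ID = re.compile(r"^[0-9][A-Za-z0-9]{3}$")


def classify_system(value: str) -> str:
    """Classify a --system value as 'path', 'pdb_id', or 'other' (hypothetical helper)."""
    if Path(value).suffix.lower() in {".pdb", ".cif"}:
        return "path"      # looks like a structure file
    if PDB_ID.match(value):
        return "pdb_id"    # would be fetched from RCSB
    return "other"         # e.g. a sequence, or an unrecognized input


for candidate in ["protein.pdb", "1L2Y", "NLYIQWLKDGGPSSGRPPPS"]:
    print(candidate, "->", classify_system(candidate))
```

Checking the file extension first avoids misreading a short file name as an ID; the real tool may resolve this differently.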
Python API
Run the full pipeline:
from fastmdxplora import FastMDXplora
fmdx = FastMDXplora(system="protein.pdb")
fmdx.explore()
Specify options and select phases:
fmdx = FastMDXplora(system="1L2Y")  # PDB ID, fetched from RCSB
results = fmdx.explore(
    include=["setup", "simulation", "analysis"],
    options={
        "simulation": {"duration_ns": 100, "temperature_K": 310, "platform": "CUDA"},
        "analysis": {"include": ["rmsd", "rg", "cluster"]},
    },
)
# explore() always returns a list of runs (a single study is a list of one)
for run in results:
    print(run.run_id, run.status)
    for phase in run.phases:
        print(" ", phase.name, phase.status)
Run a config file — one system, many systems, or a parameter sweep, all the same way:
fmdx = FastMDXplora(config="study.yml")
fmdx.explore()
Preview a run without executing (CLI --dry-run, or dry_run=True):
FastMDXplora(config="campaign.yml").explore(dry_run=True)
Recommended alias: import fastmdxplora as fastmdx
See Configuration files and Many systems and parameter sweeps for the YAML format, batches, sweeps, and parallel execution.
Configuration files
For anything beyond a quick run, capture the whole study in a single YAML file instead of a long flag list. The same file drives both the CLI and the Python API. Input is always given as a systems: list — even for a single system — so the file looks the same whether you study one protein or a dozen.
Generate a commented template to start from:
fastmdx init-config # writes fastmdxplora.yml (comprehensive)
fastmdx init-config --minimal -o study.yml # short starter
A study.yml looks like:
systems:
  - id: protein1
    system: protein.pdb        # PDB/CIF path, 4-char PDB ID, or sequence
output: ./my_study
include: [setup, simulation, analysis, report]
setup:
  ph: 7.4
  ion_concentration_M: 0.15
simulation:
  duration_ns: 100.0           # production length (equilibration is separate)
  temperature_K: 310.0
  platform: CUDA
analysis:
  include: [rmsd, rmsf, rg, cluster]
  selection: "name CA"
  options:
    cluster:
      methods: [kmeans, hierarchical]
      n_clusters: 5
report:
  title: "My MD Study"
Run it from the CLI or the API:
fastmdx explore --config study.yml # also: -c, -config
from fastmdxplora import FastMDXplora
FastMDXplora(config="study.yml").explore()
With a single system and no sweep, the output uses the familiar flat layout (my_study/setup/, my_study/simulation/, …) with the usual manifest.json and resolved_config.yml. Three things make this robust:
- Flags override the file. fastmdx explore --config study.yml --simulate-duration-ns 50 keeps everything in the file but runs 50 ns. Precedence is: command-line flags / API kwargs > config file > built-in defaults.
- Strict validation. A typo like pH: (wrong case) or simulaton: is rejected with a did-you-mean suggestion, so a misspelled key never silently runs with the default.
- Reproducibility. Every run writes resolved_config.yml, the fully merged configuration that actually ran (defaults + file + overrides). Feed it straight back to --config to reproduce the study exactly.
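The strict-validation behaviour can be illustrated with a small sketch built on Python's standard difflib module. The key lists below are a partial, hypothetical subset copied from the example study.yml, and check_keys is an illustrative helper, not FastMDXplora's actual validator.

```python
import difflib

# Hypothetical subset of valid keys, copied from the example study.yml.
VALID_KEYS = {
    "top": ["systems", "output", "include", "setup", "simulation", "analysis", "report"],
    "setup": ["ph", "ion_concentration_M"],
}


def check_keys(section: str, keys: list[str]) -> list[str]:
    """Return one error message per unknown key, with a suggestion when one is close."""
    valid = VALID_KEYS[section]
    # Compare in lower case so that 'pH' can still suggest 'ph'.
    lowered = {name.lower(): name for name in valid}
    errors = []
    for key in keys:
        if key in valid:
            continue
        close = difflib.get_close_matches(key.lower(), list(lowered), n=1, cutoff=0.6)
        hint = f" (did you mean '{lowered[close[0]]}'?)" if close else ""
        errors.append(f"unknown key '{key}' in {section}{hint}")
    return errors


print(check_keys("setup", ["pH"]))
print(check_keys("top", ["simulaton", "output"]))
```

Rejecting the unknown key, rather than ignoring it, is what prevents a misspelled option from silently falling back to its default.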
For a quick command-line one-off, -s/--system is shorthand that builds a one-element systems list for you:
fastmdx explore -s protein.pdb --simulate-duration-ns 50
Many systems and parameter sweeps
Because input is always a systems: list, studying several systems is just adding entries. Add a sweep: block to vary parameters, and FastMDXplora runs the full cross-product — each as a complete, self-contained study.
output: ./trpcage_campaign
include: [setup, simulation, analysis, report]
systems:
  - id: trpcage1
    system: trpcage.pdb
  - id: trpcage2
    system: trpcage.pdb
    setup: { ph: 6.5 }                        # optional per-system overrides
sweep:
  simulation.temperature_K: [300, 310, 320]   # dotted phase.option → values
  simulation.pressure_bar: [1.0, 1.2]         # multiple axes → cross-product
That config produces 2 systems × 3 temperatures × 2 pressures = 12 runs. When there is more than one run, each goes in its own runs/<id>/ subdirectory, indexed by a top-level batch_manifest.json, with a cross-run comparison/ report:
trpcage_campaign/
  batch_manifest.json
  comparison/                                          (cross-run report)
  runs/
    trpcage1__temperature_K-300__pressure_bar-1.0/     (a full study)
    trpcage1__temperature_K-300__pressure_bar-1.2/
    ...
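The run count and directory names above follow from a plain cross-product, which can be sketched with itertools.product. The expand_runs function is a hypothetical illustration, and the run-id pattern is inferred from the directory listing rather than from the FastMDXplora source.

```python
from itertools import product

systems = ["trpcage1", "trpcage2"]
sweep = {
    "simulation.temperature_K": [300, 310, 320],
    "simulation.pressure_bar": [1.0, 1.2],
}


def expand_runs(systems, sweep):
    """Yield (run_id, system_id, overrides) for every system x sweep-point combination."""
    axes = list(sweep)
    for system_id in systems:
        for values in product(*(sweep[axis] for axis in axes)):
            overrides = dict(zip(axes, values))
            # Assumed naming: option name without its phase prefix, then the value.
            tags = [f"{axis.split('.', 1)[1]}-{value}" for axis, value in overrides.items()]
            yield "__".join([system_id, *tags]), system_id, overrides


runs = list(expand_runs(systems, sweep))
print(len(runs))    # 12
print(runs[0][0])   # trpcage1__temperature_K-300__pressure_bar-1.0
```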
Run it exactly as any other config:
fastmdx explore --config campaign.yml
from fastmdxplora import FastMDXplora
FastMDXplora(config="campaign.yml").explore()
Each run is identical in structure to a single study (its own manifest.json, resolved_config.yml, and phase directories), so existing analysis tooling works per-run unchanged. Option precedence within a run is base config < per-system overrides < swept value. Typo'd sweep axes are rejected with the valid-option list, and a failed run is recorded while the others continue.
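The per-run precedence (base config < per-system overrides < swept value) amounts to layered dictionary merging. The sketch below shows one way to express it; deep_merge and apply_sweep are hypothetical helpers, not FastMDXplora functions.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base with override applied; nested dicts merge key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def apply_sweep(config: dict, swept: dict) -> dict:
    """Apply dotted 'phase.option' keys on top of config, so swept values win."""
    result = copy.deepcopy(config)
    for dotted, value in swept.items():
        phase, option = dotted.split(".", 1)
        result.setdefault(phase, {})[option] = value
    return result


base = {"setup": {"ph": 7.4}, "simulation": {"temperature_K": 310.0, "platform": "CUDA"}}
per_system = {"setup": {"ph": 6.5}}
swept = {"simulation.temperature_K": 300}

resolved = apply_sweep(deep_merge(base, per_system), swept)
print(resolved)
```

Applying the layers in this order means the last writer wins, which matches the stated precedence.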
Cross-run comparison report
After a multi-run study, FastMDXplora automatically builds a comparison/ report at the batch root that turns a directory of runs into a single analysis:
- Overlays — every run's per-frame trace (RMSD, Rg, Q-value, total SASA) drawn on one set of axes, labelled by its swept value, so divergence across the sweep is visible at a glance.
- Trends — each run reduced to a summary scalar (e.g. mean RMSD over the trajectory) and plotted against the swept parameter, giving a structure-property relationship.
- comparison_summary.csv: one row per run with the summary scalars, ready for further analysis.
- comparison_report.md: a written report tying the figures together, with a one-line quantitative takeaway per property (e.g. "across temperature_K 300 → 320, mean RMSD increases 0.21 → 0.23 nm").
It degrades gracefully (errored runs and missing analyses are skipped) and can be turned off with report: { comparison: false }.
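As a rough illustration of the trend reduction, the sketch below collapses each run's per-frame trace to its mean and phrases the change across the sweep as a one-line takeaway. The trace values are made-up toy numbers, and the takeaway function is hypothetical; the real report may compute and word this differently.

```python
from statistics import mean

# Toy per-frame RMSD traces (nm), keyed by swept temperature_K. Values are illustrative only.
traces = {
    300: [0.18, 0.20, 0.22],
    310: [0.19, 0.22, 0.25],
    320: [0.21, 0.24, 0.27],
}


def takeaway(param: str, prop: str, traces: dict, unit: str = "nm") -> str:
    """Reduce each trace to its mean and describe the change from first to last sweep value."""
    points = sorted((value, mean(series)) for value, series in traces.items())
    (x0, y0), (x1, y1) = points[0], points[-1]
    verb = "increases" if y1 > y0 else "decreases" if y1 < y0 else "is unchanged"
    return f"across {param} {x0} -> {x1}, mean {prop} {verb} {y0:.2f} -> {y1:.2f} {unit}"


print(takeaway("temperature_K", "RMSD", traces))
```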
Parallel execution
By default runs execute sequentially. An optional execution: block runs several at once:
execution:
  mode: parallel           # sequential (default) | parallel
  workers: 2               # how many runs at once
  devices: [0, 1]          # GPU indices, one run pinned per device
  continue_on_error: true
Parallelism is process-based: each run is a subprocess, because OpenMM contexts are not shared across threads and the GIL limits thread-level parallelism. On GPU, the safe pattern is one run per GPU: list your devices and each worker is pinned to a distinct index, assigned round-robin. Oversubscribing a single GPU is slower than running sequentially, so on GPU workers should not exceed the number of devices. When workers is unset, it defaults to one per device (GPU) or to the CPU count capped at the run count (CPU).
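The worker default and round-robin pinning described above can be sketched in a few lines. Both functions are hypothetical illustrations of the stated rules, not FastMDXplora's scheduler.

```python
from __future__ import annotations

import os


def default_workers(n_runs: int, devices: list[int] | None = None) -> int:
    """One worker per listed GPU device, else the CPU count capped at the number of runs."""
    if devices:
        return len(devices)
    return max(1, min(os.cpu_count() or 1, n_runs))


def pin_round_robin(run_ids: list[str], devices: list[int]) -> dict[str, int]:
    """Assign each run a device index, cycling through the device list in order."""
    return {run_id: devices[i % len(devices)] for i, run_id in enumerate(run_ids)}


run_ids = ["run_a", "run_b", "run_c", "run_d"]
print(default_workers(len(run_ids), devices=[0, 1]))   # 2
print(pin_round_robin(run_ids, devices=[0, 1]))
```

With two devices and two workers, at most one run occupies each GPU at a time, which is the one-run-per-GPU pattern recommended above.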
The four phases
| Phase | Purpose | Key outputs |
|---|---|---|
| setup | System preparation (fix, protonate, solvate, ionize) | prepared.pdb, solvated.pdb, setup_parameters.json |
| simulation | Minimize, NVT, NPT, production MD | production.dcd, topology.pdb, simulation_parameters.json |
| analysis | RMSD, RMSF, Rg, H-bonds, SS, cluster, SASA, dim-red, Q-value, dihedrals | <analysis>/*.dat, <analysis>/*.png, analysis_manifest.json |
| report | Slides, structured report, project bundle | report.md, slides.pptx, project_bundle.zip |
Each phase writes to a dedicated subdirectory under the project output root and produces a structured parameters manifest, so every artifact is traceable to the exact options that produced it.
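Because the outputs are predictable, a finished study can be sanity-checked with a short script. The sketch below assumes each file in the table sits directly inside a subdirectory named after its phase; that layout, and the missing_outputs helper, are assumptions for illustration.

```python
from pathlib import Path

# File names taken from the table above; their exact location is an assumption.
EXPECTED = {
    "setup": ["prepared.pdb", "solvated.pdb", "setup_parameters.json"],
    "simulation": ["production.dcd", "topology.pdb", "simulation_parameters.json"],
    "analysis": ["analysis_manifest.json"],
    "report": ["report.md", "slides.pptx", "project_bundle.zip"],
}


def missing_outputs(root) -> dict:
    """Map each phase to the expected files that are absent under root/<phase>/."""
    root = Path(root)
    missing = {}
    for phase, names in EXPECTED.items():
        absent = [name for name in names if not (root / phase / name).is_file()]
        if absent:
            missing[phase] = absent
    return missing


# An empty result means every expected file was found.
print(missing_outputs("my_study"))
```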
Documentation
Documentation is hosted at fastmdxplora.readthedocs.io (under development).
Citation
If you use FastMDXplora in your work, please cite the foundational FastMDAnalysis paper:
Aina, A.; Kwan, D. FastMDAnalysis: Software for Automated Analysis of Molecular Dynamics Trajectories. J. Comput. Chem. 2026, 47, e70350. DOI: 10.1002/jcc.70350
@article{aina2026fastmd,
author = {Aina, Adekunle and Kwan, Derrick},
title = {FastMDAnalysis: Software for Automated Analysis of Molecular Dynamics Trajectories},
journal = {Journal of Computational Chemistry},
volume = {47},
number = {8},
pages = {e70350},
year = {2026},
doi = {10.1002/jcc.70350},
}
Contributing
Contributions are welcome. See CONTRIBUTING.md. FastMDXplora follows the Contributor Covenant.
License
MIT — see LICENSE.
Acknowledgements
FastMDXplora is developed in the AAI Research Lab at California State University Dominguez Hills. It builds on a deep ecosystem of open-source scientific Python: MDTraj, OpenMM, PDBFixer, NumPy, SciPy, scikit-learn, Matplotlib, python-pptx, and many others.