Pharmacometric modeling workflow CLI
PKflow
A composable command-line workflow tool for pharmacometric modeling.
PKflow turns the run → diagnose → compare → report loop of population PK/PD modeling into a handful of scriptable commands. Fit a NONMEM model, collect its results into a tidy, file-based format, and generate goodness-of-fit plots, VPCs, bootstrap confidence intervals, shrinkage tables, η–covariate plots, and a shareable report — all from the terminal or as a Python library.
It is a clean-room Python rewrite of the ideas behind the classic Pirana workbench, with three deliberate design choices:
- File-based, not a database. Every run is a self-contained directory (results.yaml + parquet sidecars): diffable, reproducible, and git-friendly.
- A thin backend protocol. Modeling-engine specifics live behind a small parse / run / collect interface. Today that backend is NONMEM (via pharmpy); the diagnostics, workflows, and report layers are engine-agnostic.
- Pure functions you can test. The statistics (VPC binning, bootstrap CIs, shrinkage, comparison tables) are pure and unit-tested without needing NONMEM.
Status: early alpha (0.1.0a4). The NONMEM workflow below works end-to-end against a real nmfe binary. APIs may still change.
Table of contents
- Install
- Quickstart
- Configuration
- Examples — one per feature
- Run directory layout
- Architecture
- Development
- Contributing
- Roadmap
- Citation
- Acknowledgements
- Author
- License
Install
Requires Python ≥ 3.10.
pip install -e .
To actually run models you also need:
- A NONMEM installation with an nmfe script on PATH (or point at it in pkflow.toml; see Configuration).
- pandoc (system package), needed only for report --format html|docx. Markdown reports and everything else need no extra tooling.
Python dependencies (installed automatically): pharmpy-core, pandas,
pyarrow, plotnine, scikit-misc, jinja2, pyyaml, typer.
Quickstart
# 1. Fit a model — creates runs/<name>_<timestamp>/
pkflow run model.ctl
# 2. Look at the estimates
pkflow show runs/model_20260609_120000/
# 3. Diagnostics: GOF + VPC + shrinkage
pkflow diagnose runs/model_20260609_120000/
pkflow vpc runs/model_20260609_120000/
pkflow shrinkage runs/model_20260609_120000/
# 4. One report tying it all together
pkflow report runs/model_20260609_120000/ --format docx --gof
Every command is independent and operates on a saved run directory, so you can re-run, re-collect, and re-diagnose without re-fitting.
Configuration
Optional pkflow.toml in the working directory:
backend = "nonmem" # only backend today
executor = "local" # local subprocess runner
nmfe = "/opt/nm760/run/nmfe76" # path to your NONMEM nmfe script
runs_dir = "runs" # where run directories are created
All keys are optional. The values shown above are the defaults, except nmfe,
which defaults to nmfe75 on PATH. Override per-invocation with flags like
--backend / --runs-dir.
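The precedence described above (built-in defaults, then pkflow.toml, then command-line flags) can be sketched as a plain dictionary merge. This is only an illustration: load_config and the DEFAULTS dict are hypothetical names, not PKflow's actual loader, and tomllib is in the standard library only from Python 3.11 (the tomli backport offers the same interface on 3.10).

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # Python 3.10: fall back to the tomli backport
    import tomli as tomllib

# Defaults as documented above; the dict layout is an assumption.
DEFAULTS = {
    "backend": "nonmem",
    "executor": "local",
    "nmfe": "nmfe75",
    "runs_dir": "runs",
}


def load_config(toml_text="", **overrides):
    """Merge defaults, file contents, and per-invocation overrides (later wins)."""
    file_values = tomllib.loads(toml_text) if toml_text else {}
    # Flags that were not given on the command line arrive as None and are ignored.
    cli_values = {k: v for k, v in overrides.items() if v is not None}
    return {**DEFAULTS, **file_values, **cli_values}


cfg = load_config('nmfe = "/opt/nm760/run/nmfe76"\n', runs_dir="scratch", backend=None)
print(cfg)
```

Here the file sets nmfe, the flag sets runs_dir, and backend and executor fall back to their defaults.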
Examples
The examples below use a 2-compartment IV model warfarin.ctl. Replace it with
your own control stream — PKflow reads $INPUT, $DATA, parameter blocks, and
result files (.lst, .ext, .phi) through pharmpy.
1. Run a model
pkflow run warfarin.ctl
→ runs/warfarin_20260609_120000
status: ok ofv: 1234.56 (21.9s)
run creates an isolated run directory, copies the dataset in and rewrites
$DATA so models with relative data paths just work, executes NONMEM, then
collects everything into results.yaml + parquet sidecars (parameters,
predictions, η estimates, covariates).
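The run directory names in these examples follow a <name>_<timestamp> pattern. A minimal sketch of how such a name could be derived is shown here; run_dir_for is a hypothetical helper, and the timestamp format is inferred from the example output rather than taken from PKflow's source.

```python
from datetime import datetime
from pathlib import Path


def run_dir_for(model_path, runs_dir, now):
    """Build runs_dir/<model stem>_<YYYYMMDD_HHMMSS> for a control stream."""
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return Path(runs_dir) / f"{Path(model_path).stem}_{stamp}"


example = run_dir_for("warfarin.ctl", "runs", datetime(2026, 6, 9, 12, 0, 0))
print(example.as_posix())
```

Because the timestamp is part of the name, repeated fits of the same control stream land in separate directories instead of overwriting each other.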
2. Inspect saved results
pkflow show runs/warfarin_20260609_120000/
run : warfarin_20260609_120000
backend : nonmem
status : ok
ofv : 1234.56
aic/bic : 1250.56 / 1278.10
cond # : 18.3
parameters:
name type estimate se rse_pct
CL theta 0.134 0.0042 3.1
V1 theta 8.110 0.2100 2.6
Q theta 0.220 0.0180 8.2
OMEGA_1_1 omega 0.091 0.0150 16.4
show reads only the saved files — no NONMEM needed. Use pkflow collect <run_dir> to re-parse the NONMEM output of an existing run without re-fitting.
3. Compare runs
Rank competing models side by side. ΔOFV is relative to the best (lowest) run; failed runs are excluded from the "best" calculation.
pkflow compare runs/base_*/ runs/covCL_*/ runs/covCL_V_*/ --sort ofv --gof
run_id status ofv delta_ofv n_params aic bic condition_number
covCL_V_20260609 ok 1208.9 0.0 9 1226.9 1236.9 18.3
covCL_20260609 ok 1210.2 1.3 7 1224.2 1234.2 18.3
base_20260609 ok 1234.5 25.6 5 1244.5 1254.5 18.3
→ compare/comparison.csv
→ compare/compare_gof.png # overlaid DV-vs-PRED, colored by run
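The ranking logic can be sketched in a few lines. This is an illustration under assumptions, not the code in compare.py: rank_runs is a hypothetical name, failed runs are given empty delta_ofv and aic values (the README only says they are excluded from the "best" calculation), and AIC is computed as OFV + 2 × n_params, which agrees with the table above.

```python
def rank_runs(runs):
    """Add delta_ofv and aic to each run and sort by delta_ofv, failed runs last."""
    ok_ofvs = [r["ofv"] for r in runs if r["status"] == "ok"]
    best = min(ok_ofvs) if ok_ofvs else None  # lowest OFV among successful runs
    table = []
    for r in runs:
        row = dict(r)
        if r["status"] == "ok" and best is not None:
            row["delta_ofv"] = round(r["ofv"] - best, 6)
            row["aic"] = round(r["ofv"] + 2 * r["n_params"], 6)
        else:
            # Assumption: failed runs stay in the table but carry no statistics.
            row["delta_ofv"] = None
            row["aic"] = None
        table.append(row)
    return sorted(table, key=lambda row: (row["delta_ofv"] is None, row["delta_ofv"] or 0.0))


runs = [
    {"run_id": "base", "status": "ok", "ofv": 1234.5, "n_params": 5},
    {"run_id": "covCL", "status": "ok", "ofv": 1210.2, "n_params": 7},
    {"run_id": "covCL_V", "status": "ok", "ofv": 1208.9, "n_params": 9},
    {"run_id": "broken", "status": "failed", "ofv": None, "n_params": 9},
]
table = rank_runs(runs)
for row in table:
    print(row["run_id"], row["status"], row["delta_ofv"], row["aic"])
```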
4. Bootstrap confidence intervals
Nonparametric case-resampling bootstrap: subjects are resampled with replacement (and relabeled to keep duplicates distinct), the model is refit on each replicate, and percentile CIs are reported. Non-converged replicates are excluded and counted.
pkflow bootstrap warfarin.ctl --n 200 --ci 0.95
→ runs/warfarin_20260609_121500 (200 replicates)
converged: 196/200
name original_est boot_median boot_se ci_lo ci_hi n_success
CL 0.134 0.135 0.0051 0.125 0.145 196
V1 8.110 8.090 0.2400 7.640 8.580 196
OMEGA_1_1 0.091 0.087 0.0190 0.052 0.128 196
→ runs/.../bootstrap/bootstrap_summary.csv
Per-replicate run directories are cleaned up automatically; the summary and the
raw per-replicate estimates (replicate_params.parquet) are kept.
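The two statistical steps, case resampling with relabeling and percentile intervals, are easy to show in isolation. The sketch below uses only the standard library and hypothetical function names; the linear-interpolation percentile is one common convention and may differ from the one PKflow uses.

```python
import random


def resample_subjects(records, rng, id_key="ID"):
    """Draw subjects with replacement and give each draw a fresh ID."""
    by_id = {}
    for row in records:
        by_id.setdefault(row[id_key], []).append(row)
    ids = list(by_id)
    drawn = rng.choices(ids, k=len(ids))  # same number of subjects as the original
    resampled = []
    for new_id, old_id in enumerate(drawn, start=1):
        # Relabel so a subject drawn twice appears as two distinct individuals.
        resampled.extend({**row, id_key: new_id} for row in by_id[old_id])
    return resampled


def percentile(values, q):
    """Percentile with linear interpolation between order statistics (0 <= q <= 1)."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def percentile_ci(estimates, level=0.95):
    """Two-sided percentile interval from the estimates of converged replicates."""
    alpha = (1.0 - level) / 2.0
    return percentile(estimates, alpha), percentile(estimates, 1.0 - alpha)


data = [{"ID": i, "TIME": t, "DV": 10.0 * i + t} for i in (1, 2, 3) for t in (0, 1)]
replicate = resample_subjects(data, random.Random(1234))
ci_lo, ci_hi = percentile_ci([float(x) for x in range(101)], level=0.95)
print(len(replicate), ci_lo, ci_hi)
```

In a real bootstrap, the list passed to percentile_ci would hold one parameter's estimates from the converged replicates only, which is why the number of successes is reported alongside each interval.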
5. Goodness-of-fit plots
The standard 4-panel GOF (DV-vs-PRED, DV-vs-IPRED, CWRES-vs-PRED, CWRES-vs-TIME), rendered with plotnine:
pkflow diagnose runs/warfarin_20260609_120000/
runs/.../diagnostics/dv_vs_pred.png
runs/.../diagnostics/dv_vs_ipred.png
runs/.../diagnostics/cwres_vs_pred.png
runs/.../diagnostics/cwres_vs_time.png
→ 4 plot(s) in runs/.../diagnostics
GOF needs a $TABLE with DV PRED IPRED CWRES TIME written to an sdtab-style file so PKflow can find it.
6. Visual Predictive Check (VPC)
PKflow converts the fitted model to a simulation ($SIMULATION with N
subproblems), runs it, bins observations by time, and overlays the observed
5/50/95 percentiles on the simulated prediction intervals.
pkflow vpc runs/warfarin_20260609_120000/ --n-sim 500 --n-bins 10
→ runs/.../diagnostics/vpc.png (+ vpc.csv with the binned percentiles)
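The binning and percentile step for the observed data can be sketched as follows. The helper names are hypothetical, and equal-count bins are an assumption: the README does not say whether PKflow bins by equal counts, equal widths, or something else.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics (0 <= q <= 1)."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def bin_by_time(times, values, n_bins):
    """Split (time, value) pairs into n_bins groups of roughly equal size."""
    pairs = sorted(zip(times, values))
    n = len(pairs)
    chunks = [pairs[b * n // n_bins:(b + 1) * n // n_bins] for b in range(n_bins)]
    return [chunk for chunk in chunks if chunk]  # drop empty bins


def binned_percentiles(times, values, n_bins, probs=(0.05, 0.5, 0.95)):
    """One row per bin: median time plus the requested percentiles of the values."""
    rows = []
    for chunk in bin_by_time(times, values, n_bins):
        ts = [t for t, _ in chunk]
        ys = [y for _, y in chunk]
        row = {"t_mid": percentile(ts, 0.5)}
        for p in probs:
            row[f"p{round(p * 100)}"] = percentile(ys, p)
        rows.append(row)
    return rows


times = [float(t) for t in range(20)]
obs = [float(t) for t in range(20)]
rows = binned_percentiles(times, obs, n_bins=2)
for row in rows:
    print(row)
```

The same function can be applied to each simulated replicate; taking percentiles of those per-replicate values within each bin then gives the prediction intervals that the observed lines are overlaid on.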
7. η / ε shrinkage
A shrinkage table (flagging values above a threshold, default 30%) plus a faceted histogram of the individual η estimates.
pkflow shrinkage runs/warfarin_20260609_120000/ --threshold 0.30
parameter kind shrinkage shrinkage_pct high
ETA_1 eta 0.0868 8.68 False
ETA_2 eta 0.4171 41.71 True
ETA_3 eta 0.6388 63.88 True
→ runs/.../diagnostics/shrinkage_table.csv
→ runs/.../diagnostics/eta_distributions.png
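For reference, one widely used definition of η-shrinkage is 1 − SD(individual η estimates) / ω, where ω is the square root of the estimated OMEGA variance. The sketch below applies that SD-based definition; whether PKflow computes shrinkage this way or reads the values reported by NONMEM is not stated here, so treat the function as an assumption.

```python
from math import sqrt
from statistics import stdev


def eta_shrinkage_row(name, etas, omega_variance, threshold=0.30):
    """SD-based eta shrinkage: 1 - SD(eta_i) / sqrt(omega^2), flagged above threshold."""
    shrinkage = 1.0 - stdev(etas) / sqrt(omega_variance)
    return {
        "parameter": name,
        "kind": "eta",
        "shrinkage": shrinkage,
        "shrinkage_pct": 100.0 * shrinkage,
        "high": shrinkage > threshold,
    }


# Individual estimates with SD ~0.158 against an omega SD of ~0.316.
row = eta_shrinkage_row("ETA_1", [-0.2, -0.1, 0.0, 0.1, 0.2], omega_variance=0.1)
print(row)
```

In this toy case the individual estimates spread only half as widely as the population model implies, so shrinkage is about 50% and the row is flagged.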
8. η–covariate plots
Scatter of each η against each subject-level covariate, with a linear trend.
Covariates are auto-detected (constant-within-subject, varying across
subjects); override with --cov.
# auto-detect covariates
pkflow etacov runs/warfarin_20260609_120000/
# or name them explicitly
pkflow etacov runs/warfarin_20260609_120000/ --cov WT --cov SEX --cov AGE
→ runs/.../diagnostics/eta_covariates.png (facet grid: η rows × covariate cols)
→ runs/.../diagnostics/eta_covariates.csv
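The auto-detection rule (constant within each subject, varying across subjects) can be expressed directly. The function below is a hypothetical sketch over a list of row dictionaries, not PKflow's implementation, which presumably works on data frames.

```python
def detect_subject_covariates(records, id_key="ID"):
    """Return columns that are constant within each subject but differ across subjects."""
    columns = [c for c in records[0] if c != id_key]
    found = []
    for col in columns:
        per_subject = {}
        constant = True
        for row in records:
            first_seen = per_subject.setdefault(row[id_key], row[col])
            if first_seen != row[col]:
                constant = False  # value changes within a subject (e.g. TIME, DV)
                break
        if constant and len(set(per_subject.values())) > 1:
            found.append(col)
    return found


records = [
    {"ID": 1, "TIME": 0, "DV": 0.0, "WT": 70, "SEX": 0, "STUDY": 1},
    {"ID": 1, "TIME": 1, "DV": 5.1, "WT": 70, "SEX": 0, "STUDY": 1},
    {"ID": 2, "TIME": 0, "DV": 0.0, "WT": 82, "SEX": 1, "STUDY": 1},
    {"ID": 2, "TIME": 1, "DV": 4.3, "WT": 82, "SEX": 1, "STUDY": 1},
]
covariates = detect_subject_covariates(records)
print(covariates)
```

TIME and DV change within a subject and STUDY never changes at all, so only WT and SEX qualify in this example.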
9. Reports (md / html / docx)
Assemble fit summary, parameter table, shrinkage, any bootstrap result, and the
diagnostic plots into one document. Markdown is the canonical render; HTML and
Word are produced via pandoc.
# Markdown (no extra dependencies)
pkflow report runs/warfarin_20260609_120000/ --format md
# Word document, generating GOF plots first and embedding them
pkflow report runs/warfarin_20260609_120000/ --format docx --gof
→ runs/.../report/report.docx
10. Use it as a Python library
Everything the CLI does is available as importable functions. The statistics are
pure — feed them a Results object (from a saved run or constructed in memory):
from pathlib import Path
from pkflow import backends
from pkflow.executors import LocalExecutor
from pkflow.model import Results
from pkflow.compare import build_table
from pkflow.diagnostics import save_gof, shrinkage_table
from pkflow.workflows import bootstrap
be = backends.get("nonmem")
ex = LocalExecutor({"nmfe": "/opt/nm760/run/nmfe76"})
# parse → run → collect
model = be.parse(Path("warfarin.ctl"))
handle = be.run(model, Path("runs/wf"), ex)
res = be.collect(model, Path("runs/wf"), handle)
res.save(Path("runs/wf"))
# load a saved run later
res = Results.load(Path("runs/wf"))
# pure analytics
table = build_table([Results.load(p) for p in Path("runs").glob("*/")])
shr = shrinkage_table(res, threshold=0.3)
save_gof(res, Path("runs/wf/diagnostics"))
# a full bootstrap workflow
boot = bootstrap(model, res, be, ex, Path("runs/wf"), n=200, seed=1234)
print(boot.summary)
Run directory layout
A run directory is the unit of reproducibility:
runs/warfarin_20260609_120000/
├── results.yaml # fit metadata: status, ofv, aic/bic, cond#, shrinkage
├── parameters.parquet # estimates + SE + RSE%
├── predictions.parquet # $TABLE output (DV/PRED/IPRED/CWRES/...)
├── etas.parquet # individual η estimates
├── covariates.parquet # per-subject covariates
├── warfarin.ctl # the control stream that was run
├── diagnostics/ # GOF, VPC, shrinkage, η-covariate PNGs + CSVs
├── bootstrap/ # bootstrap_summary.csv + replicate_params.parquet
└── report/ # report.md / .html / .docx
Architecture
pkflow/
├── cli.py # typer entrypoint — every command is a thin wrapper
├── config.py # pkflow.toml loader
├── compare.py # cross-run table + overlaid GOF (pure functions)
├── model/
│ ├── base.py # backend-agnostic Model
│ └── results.py # unified Results + save/load (yaml + parquet)
├── backends/
│ ├── base.py # Backend protocol: parse / run / collect / simulate
│ └── nonmem.py # pharmpy-backed NONMEM implementation
├── executors/
│ └── local.py # local subprocess runner
├── diagnostics/
│ ├── gof.py # 4-panel goodness-of-fit
│ ├── vpc.py # backend-agnostic VPC (compute + plot)
│ └── shrinkage.py # shrinkage table, η distributions, η-covariate plots
├── workflows/
│ └── bootstrap.py # case-resampling bootstrap (pure stats + orchestrator)
└── report/
├── render.py # context builder + Jinja2 markdown + pandoc convert
└── templates/ # run_report.md.j2
Extending it is meant to be small:
- A new backend (e.g. another estimation engine) = one file implementing parse / run / collect.
- A new executor (e.g. Slurm/SGE) = one file implementing submit / wait.
The diagnostics, comparison, bootstrap, and report layers consume the unified
Results object and don't care which engine produced it.
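To make the extension points concrete, here is a sketch of what the two interfaces might look like as typing.Protocol classes. The parse / run / collect call shapes mirror the library example above; the submit / wait signatures, the Any placeholders, and the RecordingExecutor toy class are assumptions for illustration and may not match pkflow's actual definitions.

```python
from pathlib import Path
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class Executor(Protocol):
    # Hypothetical signatures: the README only names the two methods.
    def submit(self, command: list[str], cwd: Path) -> Any: ...
    def wait(self, handle: Any) -> int: ...


@runtime_checkable
class Backend(Protocol):
    def parse(self, path: Path) -> Any: ...
    def run(self, model: Any, run_dir: Path, executor: Executor) -> Any: ...
    def collect(self, model: Any, run_dir: Path, handle: Any) -> Any: ...


class RecordingExecutor:
    """Toy executor that records commands instead of launching them."""

    def __init__(self):
        self.submitted = []

    def submit(self, command, cwd):
        self.submitted.append((command, cwd))
        return len(self.submitted) - 1  # handle = index of the submitted job

    def wait(self, handle):
        return 0  # pretend every job exits successfully


ex = RecordingExecutor()
handle = ex.submit(["nmfe75", "run.ctl", "run.lst"], Path("runs/demo"))
print(isinstance(ex, Executor), ex.wait(handle))
```

Structural typing keeps new backends and executors free of inheritance: any class with the right methods satisfies the protocol, and a stand-in like the one above is enough to exercise orchestration code without NONMEM installed.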
Development
pip install -e ".[dev]"
python -m pytest # full suite
The test suite covers every module. Pure-function tests (config, results,
compare, bootstrap, VPC math, shrinkage, report rendering) run without
NONMEM using in-memory Results; NONMEM-dependent paths are exercised with a
real .mod template and stubbed/faked boundaries. Pandoc-dependent report tests
skip automatically when pandoc is absent.
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for
details. In short:
- Open an issue to discuss bugs or feature ideas before large changes.
- Follow test-driven development — add a failing test first, then the implementation. Keep statistics as pure functions where possible.
- Run python -m pytest and make sure the suite is green before opening a PR.
Roadmap
- Categorical-covariate boxplots in η–covariate plots
- Cluster executors (slurm, sge)
- Additional report sections and templating hooks
The backend protocol is intentionally general, but the project is focused on NONMEM for now.
Citation
If you use PKflow in your research, please cite it:
@software{zhang_pkflow,
author = {Zhang, Yufeng},
title = {PKflow: A composable command-line workflow tool for pharmacometric modeling},
year = {2026},
url = {https://github.com/kinginsun/pkflow}
}
Acknowledgements
PKflow stands on the shoulders of excellent open-source work:
- pharmpy — NONMEM control-stream parsing and result handling.
- plotnine — grammar-of-graphics plotting for all diagnostics.
- pandas, Typer, Jinja2, and pandoc.
- The original Pirana workbench, whose workflow inspired this rewrite.
Author
Yufeng Zhang
School of Pharmacy, The Chinese University of Hong Kong (CUHK)
Contact: zhangyf@cuhk.edu.hk
License
Released under the MIT License — see LICENSE.
MIT License
Copyright (c) 2026 Yufeng Zhang
PKflow is an independent Python project and is not affiliated with the original Pirana software.