
ML-Enhanced Pharmacometrics Toolkit


pharmacoml


pharmacoml is a benchmark-backed hybrid AI/ML covariate screening toolkit for population PK/PD. It combines explainable ML discovery, penalized confirmation, and SCM-style bridging in a single estimation-tool-agnostic Python workflow.

What It Is

pharmacoml helps pharmacometricians use a hybrid AI/ML screening workflow to identify and prioritize likely covariates from subject-level EBEs or individual parameters before formal model confirmation. It is designed to work with outputs from NONMEM, nlmixr2, Monolix, Pumas, or similar mixed-effects workflows.

The current release is evaluated against a fixed public benchmark suite that includes real public PK examples and paper-style benchmark scenarios.

What It Is Not

pharmacoml is not a replacement for final NLME estimation, full model search, or pharmacometric confirmation in the current release. It is a hybrid AI/ML covariate screening and preselection tool designed to reduce search space before SCM, backward elimination, or final model fitting.

Why It Is Different

  • Uses a hybrid AI/ML screening workflow that combines explainable ML discovery, penalized confirmation, and SCM-style bridging instead of relying on a single method.
  • Works with EBEs or individual parameters from any solver, including NONMEM, nlmixr2, Monolix, and Pumas, so screening is not tied to a single estimation engine.
  • Supports many screening backends, including explainable boosting, AALASSO, Stochastic Gates (STG), and an SCM-style bridge, rather than relying on a single screening model.
  • Includes pharmacometric screening features such as shrinkage-aware logic, biology-aware proxy preservation, and optional interaction screening.
  • Ships with a public benchmark suite, pinned baselines, and generated benchmark reports so workflow changes can be evaluated against fixed reference cases.
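The shrinkage-aware logic mentioned above can be illustrated with a minimal, stdlib-only sketch. The 30% cutoff and the flag-don't-drop rule here are illustrative assumptions, not pharmacoml's actual internals; the underlying idea is simply that EBE-based screening loses power on high-shrinkage parameters, so those parameters deserve separate handling:

```python
# Illustrative sketch of shrinkage-aware screening logic.
# The 0.30 cutoff and the flagging rule are assumptions for illustration,
# not pharmacoml's actual implementation.

def flag_high_shrinkage(parameter_shrinkage, cutoff=0.30):
    """Split parameters into reliable vs. caution groups by eta-shrinkage.

    EBE-based covariate screening is known to lose power when shrinkage
    is high, so high-shrinkage parameters are flagged for cautious
    treatment rather than screened naively.
    """
    reliable = {p: s for p, s in parameter_shrinkage.items() if s < cutoff}
    caution = {p: s for p, s in parameter_shrinkage.items() if s >= cutoff}
    return reliable, caution

reliable, caution = flag_high_shrinkage({"CL": 0.12, "V": 0.28, "KA": 0.45})
print(sorted(reliable))  # ['CL', 'V']
print(sorted(caution))   # ['KA']
```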

Installation

From PyPI:

pip install pharmacoml

For development:

git clone https://github.com/s-rani1/pharmacoml.git
cd pharmacoml
pip install -e ".[dev]"

Optional extras:

pip install -e ".[dev,dl,symbolic]"

Quick Start

import pandas as pd
from pharmacoml.covselect import HybridScreener

ebes = pd.read_csv("individual_parameters.csv")
covariates = pd.read_csv("covariates.csv")

report = HybridScreener(
    include_scm=True,
).fit(
    ebes=ebes,
    covariates=covariates,
    parameter_shrinkage={"CL": 0.12, "V": 0.28},
)

report.confirmed_covariates()   # recommended daily-use answer
report.candidate_covariates()   # shortlist to carry forward
report.core_covariates()        # strongest ML-supported signals
report.proxy_groups()           # correlated alternatives
print(report.to_nonmem_candidates())

For reproducible runs, set a fixed random_state. In the current release, the default hybrid workflow is stable across repeated runs when the data, settings, and random seed are unchanged. Experimental or more stochastic paths may vary more across runs and environments.
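The reproducibility point can be made concrete with a toy, stdlib-only sketch (this is not pharmacoml code): any bootstrap-style selection step driven by a seeded `random.Random` returns identical results across repeated runs on the same data, which is what a fixed `random_state` buys you:

```python
import random

def toy_bootstrap_selection(covariates, n_bootstrap=8, seed=0):
    """Toy stand-in for a stochastic screening step: resample covariates
    with replacement and count how often each one is drawn."""
    rng = random.Random(seed)  # fixed seed -> reproducible draws
    counts = {c: 0 for c in covariates}
    for _ in range(n_bootstrap):
        for c in rng.choices(covariates, k=len(covariates)):
            counts[c] += 1
    return counts

covs = ["WT", "AGE", "CRCL", "SEX"]
run1 = toy_bootstrap_selection(covs, seed=42)
run2 = toy_bootstrap_selection(covs, seed=42)
print(run1 == run2)  # True: same seed, same data -> identical results
```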

Example confirmed_covariates() output:

  parameter covariate functional_form confirmation_status
0        CL        WT           power                 scm
1         V        WT           power                 scm

Typical Outputs

  • confirmed_covariates(): compact answer after SCM-style confirmation
  • candidate_covariates(): practical shortlist for downstream PMx confirmation
  • core_covariates(): strongest ML-supported signals
  • proxy_groups(): correlated or overlapping covariate groups
  • interaction_covariates(): screened interactions when enabled
  • to_nonmem_candidates(): export-ready candidate block for downstream workflows

How to Read the Outputs

  • confirmed_covariates(): start here for the most compact daily-use answer. These are the covariates that survive the package's confirmation layer and are the clearest candidates to carry forward.
  • candidate_covariates(): use this as the practical shortlist for formal PMx confirmation. It is intentionally broader than confirmed_covariates() and is often the right input to SCM or backward elimination.
  • core_covariates(): the strongest ML-supported signals before confirmation. This is useful when you want to inspect what the AI/ML layer found most strongly, even if not every signal is retained in the final confirmed set.
  • proxy_groups(): review this whenever correlated covariates are plausible. It shows which variables are acting as correlated alternatives so you can make a pharmacometrically sensible choice downstream.
  • interaction_covariates(): only relevant when interaction screening is enabled. These are pairwise interaction terms that survived the screening workflow.
  • to_nonmem_candidates(): use this when you want a direct candidate block to carry into a downstream modeling workflow.
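What proxy_groups() conceptually reports can be sketched with a small stdlib example. The grouping rule below (absolute Pearson correlation above a threshold, with single-linkage merging) is an illustrative assumption, not pharmacoml's actual algorithm:

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def toy_proxy_groups(columns, threshold=0.8):
    """Group covariates whose pairwise |r| exceeds the threshold.

    Simple single-linkage grouping over a dict of name -> values;
    illustrative only, not pharmacoml's implementation.
    """
    groups = []
    for name in columns:
        for group in groups:
            if any(abs(pearson(columns[name], columns[m])) >= threshold
                   for m in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

data = {
    "WT":  [50, 60, 70, 80, 90],
    "BMI": [20, 22, 25, 27, 30],  # tracks WT closely -> proxy of WT
    "AGE": [30, 25, 40, 35, 28],  # unrelated
}
print(toy_proxy_groups(data))  # [['WT', 'BMI'], ['AGE']]
```

Reviewing such groups before confirmation is what lets you pick the pharmacometrically sensible member (e.g. WT over BMI) rather than letting a screening model choose arbitrarily among collinear inputs.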

For most users, the practical reading order is:

  1. confirmed_covariates()
  2. candidate_covariates()
  3. proxy_groups()
  4. to_nonmem_candidates()

For reproducibility, keep random_state fixed when comparing runs or benchmarking workflow changes.

Benchmarks

pharmacoml includes a fixed public benchmark suite for release calibration:

  • pheno (Pharmpy phenobarbital example)
  • Eleveld/Wahlquist public propofol data
  • ggPMX Monolix theophylline example
  • Asiimwe-style correlated-covariate simulation
  • Shap-Cov-style collinear simulation
  • optional Kekic public synthetic scenarios when available locally

Current agreement snapshot for the benchmark-backed default workflow:

Dataset                      Agreement
pheno                        Exact
eleveld_union                Exact
ggpmx_theophylline           Exact
high_shrinkage_user_input    Exact
age_pma_distinct             Exact
interaction_xor_screening    Exact
asiimwe_correlated_small_n   Partial
shapcov_collinear            Partial
The current fixed benchmark suite shows exact agreement on the real/public PK cases and targeted shrinkage, proxy, and interaction checks, with remaining errors concentrated in the hardest collinearity-heavy synthetic scenarios.

Run the benchmark suite:

PYTHONPATH=. python benchmarks/run_public_benchmarks.py --check

That command generates a reusable report bundle under benchmarks/reports/fixed_public/ by default:

  • public_benchmark_report.md
  • public_benchmark_summary.csv
  • public_benchmark_details.csv
  • public_benchmark_report.json

Use --no-report to skip artifact generation, or --report-dir <path> to write the bundle somewhere else.
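A summary like the agreement snapshot above can be tallied from the generated CSV with the stdlib alone. The column names `dataset` and `agreement` below are assumptions about the report schema, not documented fields, and the CSV contents are mocked in-memory for illustration:

```python
import csv
import io

# Hypothetical contents of public_benchmark_summary.csv; the column
# names "dataset" and "agreement" are assumptions, not a documented schema.
summary_csv = """dataset,agreement
pheno,Exact
eleveld_union,Exact
ggpmx_theophylline,Exact
high_shrinkage_user_input,Exact
age_pma_distinct,Exact
interaction_xor_screening,Exact
asiimwe_correlated_small_n,Partial
shapcov_collinear,Partial
"""

rows = list(csv.DictReader(io.StringIO(summary_csv)))
counts = {}
for row in rows:
    counts[row["agreement"]] = counts.get(row["agreement"], 0) + 1
print(counts)  # {'Exact': 6, 'Partial': 2}
```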

Experimental Consensus

For advanced benchmarking and model-family comparison, the experimental namespace exposes a curated multi-model consensus workflow:

from pharmacoml.covselect.experimental import MultiModelConsensusScreener

report = MultiModelConsensusScreener(
    top_k=3,
    n_bootstrap=8,
    include_neural=False,
).fit(ebes, covariates)

report.consensus_covariates()
report.selection_frequency_table()
report.compare_with_hybrid(ebes, covariates)
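Conceptually, multi-model consensus works by tallying how often each covariate is selected across model families and bootstrap resamples. The following stdlib-only sketch illustrates that selection-frequency idea; it is not the MultiModelConsensusScreener's internals, and the toy "screeners" are stand-ins for real fitted models:

```python
import random
from collections import Counter

def consensus_frequency(screeners, n_bootstrap=8, seed=0):
    """Tally selection frequency across model families and bootstrap
    resamples. Each 'screener' is just a callable returning a set of
    selected covariate names; real screeners would fit actual models."""
    rng = random.Random(seed)  # fixed seed for reproducible comparison
    counts = Counter()
    for _ in range(n_bootstrap):
        for select in screeners:
            counts.update(select(rng))
    total = n_bootstrap * len(screeners)
    return {cov: n / total for cov, n in counts.most_common()}

# Toy screeners: a stable signal (WT) plus a noisy one (SEX).
screeners = [
    lambda rng: {"WT"} | ({"SEX"} if rng.random() < 0.3 else set()),
    lambda rng: {"WT", "AGE"} if rng.random() < 0.9 else {"AGE"},
]
freq = consensus_frequency(screeners)
print(freq["WT"] >= freq.get("SEX", 0.0))  # True: WT is the consistent pick
```

Covariates with high selection frequency across dissimilar model families are more robust candidates than those favored by any single screener.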

Documentation

Rendered docs are available via GitHub Pages.

Methodological References

The default hybrid workflow implements and combines approaches described in recent pharmacometric ML literature on covariate screening, including Sibieude et al. (2021), Asiimwe et al. (2024), Brooks et al. (2025), Karlsen et al. (2025), and Kekic et al. (2026). The broader package also includes additional experimental screening and benchmarking capabilities.

How to Cite

If you use pharmacoml in your work, please cite the software repository. GitHub will also expose citation metadata directly via the repository citation panel.

Suggested citation:

Rani S. pharmacoml: Benchmark-backed hybrid AI/ML covariate screening toolkit
for population PK/PD. Version 0.1.2. GitHub.
https://github.com/s-rani1/pharmacoml

When relevant, also cite the methodological papers that informed the workflow, especially Sibieude et al. (2021), Asiimwe et al. (2024), Brooks et al. (2025), Karlsen et al. (2025), and Kekic et al. (2026).

Roadmap

Potential future expansion includes:

  • backend integration for formal model-confirmation workflows such as nlmixr2 and NONMEM
  • estimation-driven SCM and backward elimination
  • simulation and reporting layers for broader pharmacometric workflows
  • possible R integration paths via subprocess-based execution or rpy2

License

MIT

Download files

Download the file for your platform.

Source Distribution

pharmacoml-0.1.2.tar.gz (76.2 kB)

Uploaded Source

Built Distribution


pharmacoml-0.1.2-py3-none-any.whl (84.2 kB)

Uploaded Python 3

File details

Details for the file pharmacoml-0.1.2.tar.gz.

File metadata

  • Download URL: pharmacoml-0.1.2.tar.gz
  • Upload date:
  • Size: 76.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.8

File hashes

Hashes for pharmacoml-0.1.2.tar.gz

Algorithm    Hash digest
SHA256       9bc52f8aba863ec2305e425c6b05221274a389a060bca675f0693231d8b14f2f
MD5          07bfccdfc437b7a5945e3fe9869811c5
BLAKE2b-256  60fa547a21b612bfee2281b09dfa11ec5614ccf1ea3946bf8298aff83322408d


File details

Details for the file pharmacoml-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: pharmacoml-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 84.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.8

File hashes

Hashes for pharmacoml-0.1.2-py3-none-any.whl

Algorithm    Hash digest
SHA256       82d9cf33eecf573b4a5bcdd2991e8fa217572bce6f9c8f2c6c9047abb3aea50d
MD5          ab9872c6ba0a580c0a422dc495a86fc8
BLAKE2b-256  81037214c749df70d940f1ddfc4bce49b489748059b233fe1ece7a37ada0de9a

