
Implements Bayesian D-PDDM for Post-Deployment Deterioration Monitoring of ML models.


Bayesian D-PDDM

This package is a Bayesian implementation of the D-PDDM algorithm for post-deployment deterioration monitoring. Bayesian D-PDDM is a Bayesian approximation to D-PDDM, an algorithm that provably monitors model deterioration at deployment time. Bayesian D-PDDM:

  • Flags deteriorating shifts in the unsupervised deployment data distribution.
  • Resists flagging non-deteriorating shifts, unlike classical OOD detection methods that rely on distances or metrics between data distributions.

Installation and Requirements

This implementation requires Python >= 3.11.

The easiest way to install bayesian_dpddm is with pip:

pip install bayesian_dpddm

You can also install by cloning the GitHub repo:

# Clone the repo
git clone https://github.com/teivng/bayesian_dpddm.git

# Navigate into repo directory 
cd bayesian_dpddm

# Install the package and its dependencies
pip install .

Sweeping Instructions

All experiments are run from the root directory of the repo. We use hydra-core as an argparse on steroids, in tandem with wandb for sweeping. For a sweep configuration experiments/my_sweep.yaml, run:

wandb sweep experiments/my_sweep.yaml

to which wandb responds with:

wandb: Creating sweep from: experiments/my_sweep.yaml
wandb: Creating sweep with ID: <my_sweep_id>
wandb: View sweep at: https://wandb.ai/<my_wandb_team>/<my_project>/sweeps/<my_sweep_id>
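The team, project, and sweep ID in the "View sweep at" URL are the three pieces the wandb agent command needs. As a convenience, a small helper can derive the command from that URL. This is a minimal sketch: agent_command is a hypothetical helper of ours, and the team, project, and sweep names in the example are made up.

```python
from urllib.parse import urlparse


def agent_command(sweep_url: str) -> str:
    """Build a `wandb agent` command from a W&B sweep URL."""
    parts = [p for p in urlparse(sweep_url).path.split("/") if p]
    # Expected path layout: <team>/<project>/sweeps/<sweep_id>
    if len(parts) != 4 or parts[2] != "sweeps":
        raise ValueError(f"not a sweep URL: {sweep_url!r}")
    team, project, _, sweep_id = parts
    return f"wandb agent {team}/{project}/{sweep_id}"


# Made-up example values, for illustration only.
print(agent_command("https://wandb.ai/my-team/my-project/sweeps/abc123"))
# prints: wandb agent my-team/my-project/abc123
```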

Sweeping locally

Run a sweep agent with: wandb agent <my_wandb_team>/<my_project>/<my_sweep_id>.

Sweeping with slurm

The sbatch files are pre-configured for the Vaughan cluster. Edit the templates at will.

We execute a script to replace the wandb agent ... line in our .slrm files:

./experiments/replace_wandb_agent.sh "wandb agent <my_wandb_team>/<my_project>/<my_sweep_id>"
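For readers who want to see the idea behind this step, the following Python sketch shows one way such a replacement could work on the text of a .slrm file. It is not the contents of replace_wandb_agent.sh, which we do not reproduce here; the function name, the regular expression, and the sample template (including its #SBATCH line) are all assumptions made for illustration.

```python
import re

# Matches a whole line that starts (after optional indentation) with "wandb agent".
AGENT_LINE = re.compile(r"^([ \t]*)wandb agent\b.*$", flags=re.MULTILINE)


def replace_agent_line(script_text: str, new_command: str) -> str:
    """Swap every `wandb agent ...` line for `new_command`, keeping indentation."""
    # A function replacement avoids backslash escapes being interpreted.
    return AGENT_LINE.sub(lambda m: m.group(1) + new_command, script_text)


# Hypothetical template text, for illustration only.
template = (
    "#!/bin/bash\n"
    "#SBATCH --qos=normal\n"
    "wandb agent old-team/old-project/old-id\n"
)
updated = replace_agent_line(template, "wandb agent my-team/my-project/abc123")
print(updated)
```

Only lines beginning with wandb agent are rewritten, so the rest of the template is left untouched.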

Finally, spam jobs on the cluster and maximize your allocation per qos:

./experiments/sbatch_all.sh

Edit this script per your allocation.

Usage and Tutorials

In short, training a DPDDMMonitor consists of three steps:

  1. Train the base model of the monitor on I.I.D. training data
  2. With a held-out set of I.I.D. validation data, train the distribution of I.I.D. disagreement rates (Phi) of the monitor
  3. Deploy the base model and monitor by periodically running dpddm_test on batches of unsupervised deployment data

When dpddm_test returns True, the monitor recognizes that the base model may severely underperform on the unsupervised deployment data. This is the cue for ML practitioners to inspect the problem further and consider further measures such as adapting and retraining.
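To make the decision rule concrete, here is a toy sketch of the comparison behind steps 2 and 3. It is not the bayesian_dpddm API: the function names are hypothetical, and we assume, for illustration, that the test flags a deployment batch when its disagreement rate exceeds the (1 - alpha) empirical quantile of Phi. How the disagreeing predictions are produced by the monitor is not shown.

```python
import math


def disagreement_rate(base_preds, other_preds):
    """Fraction of inputs on which two label sequences differ."""
    if len(base_preds) != len(other_preds) or not base_preds:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(a != b for a, b in zip(base_preds, other_preds)) / len(base_preds)


def empirical_quantile(samples, q):
    """Smallest sample with at least a fraction q of the samples at or below it."""
    ordered = sorted(samples)
    index = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[index]


def flags_deterioration(phi, deployment_rate, alpha=0.05):
    """Flag when the deployment rate exceeds the (1 - alpha) quantile of phi."""
    return deployment_rate > empirical_quantile(phi, 1.0 - alpha)


# Made-up I.I.D. disagreement rates standing in for Phi (step 2).
phi = [0.10, 0.12, 0.11, 0.13, 0.09]

# Step 3: compare the rate measured on a deployment batch against Phi.
print(flags_deterioration(phi, deployment_rate=0.05))  # prints: False
print(flags_deterioration(phi, deployment_rate=0.30))  # prints: True
```

Under this assumed rule, a smaller alpha makes the monitor more conservative, since a deployment batch must disagree more strongly than almost every I.I.D. sample before it is flagged.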

For a full tutorial on how to deploy bayesian_dpddm to monitor a downstream task, consider running the notebook tutorials/classification.ipynb, where we train a DPDDMBayesianMonitor to monitor an induced deteriorating shift on the UCI Heart Disease dataset.

Citation

Coming soon.

