Skip to main content

Python Open-source package for ensuring robust and reliable ML models deployments

Project description

MED3pa Package

Table of Contents

Overview

Overview

The MED3pa package is specifically designed to address critical challenges in deploying machine learning models, particularly focusing on the robustness and reliability of models under real-world conditions. It provides comprehensive tools for evaluating model stability and performance in the face of covariate shifts, uncertainty, and problematic data profiles.

Key Functionalities

  • Covariate Shift Detection: Utilizing the Detectron subpackage, MED3pa can identify significant shifts in data distributions that might affect the model’s predictions. This feature is crucial for applications such as healthcare, where early detection of shifts can prevent erroneous decisions.

  • Uncertainty and Confidence Estimation: Through the med3pa subpackage, the package measures the uncertainty and predictive confidence at both individual and group levels. This helps in understanding the reliability of model predictions and in making informed decisions based on model outputs.

  • Identification of Problematic Profiles: MED3pa analyzes data profiles that consistently lead to poor model performance. This capability allows developers to refine training datasets or retrain models to handle these edge cases effectively.

Subpackages

Overview

The package is structured into four distinct subpackages:

  • datasets: Stores and manages the dataset.
  • models: Handles ML models operations.
  • detectron: Evaluates the model against covariate shift.
  • med3pa: Evaluates the model’s performance & extracts problematic profiles.

This modularity allows users to easily integrate and utilize specific functionalities tailored to their needs without dealing with unnecessary complexities.

Getting Started with the Package

To get started with MED3pa, follow the installation instructions and usage examples provided in the documentation.

Installation

pip install MED3pa

A simple exemple

   
    from MED3pa.datasets import DatasetsManager
    from MED3pa.models import BaseModelManager, ModelFactory
    from MED3pa.med3pa import Med3paDetectronExperiment
    
    # Initialize the DatasetsManager
    datasets = DatasetsManager()

    # Load datasets for training, validation, reference, and testing
    datasets.set_from_file(dataset_type="training", file='./tutorials/data/train_data.csv', target_column_name='Outcome')
    datasets.set_from_file(dataset_type="validation", file='./tutorials/data/val_data.csv', target_column_name='Outcome')
    datasets.set_from_file(dataset_type="reference", file='./tutorials/data/test_data.csv', target_column_name='Outcome')
    datasets.set_from_file(dataset_type="testing", file='./tutorials/data/test_data_shifted_0.6.csv', target_column_name='Outcome')


    # Initialize the model factory and load the pre-trained model
    factory = ModelFactory()
    model = factory.create_model_from_pickled("./tutorials/models/diabetes_xgb_model.pkl")

    # Set the base model using BaseModelManager
    base_model_manager = BaseModelManager()
    base_model_manager.set_base_model(model=model)

    # Execute the integrated MED3PA and Detectron experiment
    reference_results, test_results, detectron_results = Med3paDetectronExperiment.run(
        datasets=datasets,
        base_model_manager=base_model_manager,
    )

    # Save the results to a specified directory
    reference_det_results.save(file_path='./tutorials/med3pa_detectron_experiment_results/reference')
    test_det_results.save(file_path='./tutorials/med3pa_detectron_experiment_results/test')
    detectron_results.save(file_path='./tutorials/med3pa_detectron_experiment_results/detectron')

Tutorials

We have created many tutorial notebooks to assist you in learning how to use the different parts of the package.

Acknowledgement

MED3pa is an open-source package developed at the MEDomics-Udes laboratory with the collaboration of the international consortium MEDomics. We welcome any contribution and feedback.

References

This package utilizes the methods described in the following work:

Ginsberg, T., Liang, Z., & Krishnan, R. G. (2023). A Learning Based Hypothesis Test for Harmful Covariate Shift. In The Eleventh International Conference on Learning Representations.

Authors

Statement

This package is part of https://github.com/medomics, a package providing research utility tools for developing precision medicine applications.

Copyright (C) 2024 MEDomics consortium

GPL3 LICENSE SYNOPSIS

Here's what the license entails:

1. Anyone can copy, modify and distribute this software.
2. You have to include the license and copyright notice with each and every distribution.
3. You can use this software privately.
4. You can use this software for commercial purposes.
5. If you dare build your business solely from this code, you risk open-sourcing the whole code base.
6. If you modify it, you have to indicate changes made to the code.
7. Any modifications of this code base MUST be distributed with the same license, GPLv3.
8. This software is provided without warranty.
9. The software author or license can not be held liable for any damages inflicted by the software.

More information on about the LICENSE can be found here

Supported Python Versions

The MED3pa package is developed and tested with Python 3.12.3.

Additionally, it is compatible with the following Python versions:

  • Python 3.11.x
  • Python 3.10.x
  • Python 3.9.x

While the package may work with other versions of Python, these are the versions we officially support and recommend.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

med3pa-0.1.37.tar.gz (4.2 MB view details)

Uploaded Source

Built Distribution

MED3pa-0.1.37-py3-none-any.whl (98.0 kB view details)

Uploaded Python 3

File details

Details for the file med3pa-0.1.37.tar.gz.

File metadata

  • Download URL: med3pa-0.1.37.tar.gz
  • Upload date:
  • Size: 4.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.1 CPython/3.12.6

File hashes

Hashes for med3pa-0.1.37.tar.gz
Algorithm Hash digest
SHA256 6dde08329c5a048967470dca5f9b4fa7222395bde2545835050266e0bde48c50
MD5 dd68158f876afd6f23aa7a15c72004f8
BLAKE2b-256 b17051d5739339c51fafa1c938590bb89e297eabe135ee478fd72e066926695f

See more details on using hashes here.

File details

Details for the file MED3pa-0.1.37-py3-none-any.whl.

File metadata

  • Download URL: MED3pa-0.1.37-py3-none-any.whl
  • Upload date:
  • Size: 98.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.1 CPython/3.12.6

File hashes

Hashes for MED3pa-0.1.37-py3-none-any.whl
Algorithm Hash digest
SHA256 7bb115b3c9fc5daf8622cc484c586af021d2f3c7fa322eafa04618caa2dd955c
MD5 06f4bc59363205b184a8e6f8c14797f4
BLAKE2b-256 90cf6fe686216f8fc5aca90c64d1b55542b17c0f383cd773454dc0b448280cdf

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page