Custom ML metrics for medical applications

Med Metrics: Advanced Medical Machine Learning Evaluation Toolkit

Overview

med_metrics is a Python package for evaluating machine learning models in medical contexts. It provides metrics, compatibility assessments, and bootstrapping tools designed to assess the performance and clinical impact of models in healthcare.

Key Features

  • Specialized Medical Metrics: Functions for advanced metrics such as Number Needed to Treat (NNT) across decision thresholds, average height of NNT vs. treated curves, and net benefit analysis.
  • Curves for Evaluation: Generate various curves like NNT vs. treated, and net benefit curves, essential for visual and quantitative model analysis.
  • Compatibility Metrics: Evaluate prediction compatibility across model updates or between different models, crucial for maintaining trust in evolving medical ML applications.
  • Bootstrap Evaluation: Robust tools for performing bootstrap evaluations, enabling detailed performance comparisons across different machine learning models.
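
As a rough illustration of the net benefit analysis mentioned above, the sketch below computes net benefit at a single probability threshold using the standard decision-curve definition (true positives per patient minus false positives per patient weighted by the threshold odds). The function name `net_benefit_sketch` is hypothetical and is not part of med_metrics; the package's own implementation and signature may differ.

```python
import numpy as np

def net_benefit_sketch(y_true, y_score, threshold):
    """Decision-curve net benefit at one threshold (illustrative, not the med_metrics API)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    treated = y_score >= threshold          # patients flagged for treatment
    n = y_true.size
    tp = np.sum(treated & (y_true == 1))    # treated patients who have the outcome
    fp = np.sum(treated & (y_true == 0))    # treated patients who do not
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1.0 - threshold)

y_true = np.array([1, 1, 0, 0, 1, 0])
y_score = np.array([0.9, 0.8, 0.7, 0.2, 0.4, 0.1])
nb = net_benefit_sketch(y_true, y_score, threshold=0.5)
print(nb)
```

Sweeping `threshold` over a grid and plotting the result gives a net benefit curve of the kind listed under "Curves for Evaluation".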

Features

  • Compatibility Assessment: Evaluate how predictions change with model updates or across different models using functions like backwards_trust_compatibility.
  • Utility Functions: A range of utility functions for generating classification and confusion matrix curves, and handling various inputs for metrics calculation.
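
To make the idea of compatibility across model updates concrete, here is a minimal sketch of backward trust compatibility as it is commonly defined in the model-update literature: of the cases the old model got right, the fraction the new model also gets right. This README does not show the signature of `backwards_trust_compatibility`, so the helper below (`btc_sketch`, a hypothetical name) works on hard class predictions and should not be read as the package's implementation.

```python
import numpy as np

def btc_sketch(y_true, y_pred_old, y_pred_new):
    """Fraction of the old model's correct predictions that the new model keeps correct.

    Illustrative only; assumes hard class labels, which may differ from med_metrics.
    """
    y_true = np.asarray(y_true)
    old_correct = np.asarray(y_pred_old) == y_true
    new_correct = np.asarray(y_pred_new) == y_true
    if old_correct.sum() == 0:
        return float("nan")  # undefined when the old model was never correct
    return float((old_correct & new_correct).sum() / old_correct.sum())

y_true     = np.array([1, 0, 1, 1, 0])
y_pred_old = np.array([1, 0, 0, 1, 1])  # correct on indices 0, 1, 3
y_pred_new = np.array([1, 1, 1, 1, 0])  # correct on indices 0, 2, 3, 4
btc = btc_sketch(y_true, y_pred_old, y_pred_new)
print(btc)
```

In this toy case the new model is more accurate overall (4 of 5 versus 3 of 5) yet breaks one prediction the old model had right, which is exactly the situation a compatibility metric is meant to surface.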

Installation

To install med_metrics, run the following command:

pip install med-metrics

Dependencies

med_metrics requires the following libraries:

  • numpy
  • scikit-learn
  • scipy

These dependencies are automatically installed with med_metrics.

Usage

The med_metrics package can be used to perform bootstrap evaluations for model comparison. Below is an example showcasing how to compare two machine learning models using the package's bootstrapping functionality.

Model Comparison

This example runs a bootstrap analysis that compares two models on the roc_auc_score and average_NNTvsTreated metrics and generates ROC and NNT vs. Number Treated curves.

# Third-party libraries
import numpy as np
import pandas as pd  # used here only to tabulate the summary
from sklearn.metrics import roc_auc_score, roc_curve

# med_metrics
from med_metrics.bootstrap import bootstrap_evaluation, summarize_bootstrap_results
from med_metrics.curves import NNTvsTreated_curve
from med_metrics.metrics import average_NNTvsTreated
from med_metrics.plotting import plot_bootstrap_curve

# Simulation of ground truth and model predictions
n = 1000
rng = np.random.default_rng(42)
p = rng.uniform(0, 1, n)
q = rng.uniform(0, 1, n)
y_true = rng.binomial(1, p)

# Bootstrap parameters
y_scores = {'model_0': p * q, 'model_1': p}
metric_funcs = {'roc_auc_score': roc_auc_score, 'average_NNTvsTreated': average_NNTvsTreated}
metric_funcs_kwargs = {'average_NNTvsTreated': {'rho': 0.4}}
curve_funcs = {'roc_curve': roc_curve, 'NNTvsT': NNTvsTreated_curve}
curve_funcs_kwargs = {'NNTvsT': {'rho': 0.4}}

# Perform the bootstrap analysis
bootstrapped_results = bootstrap_evaluation(
    y_true=y_true,
    y_scores=y_scores,
    metric_funcs=metric_funcs,
    curve_funcs=curve_funcs,
    n_bootstraps=1000,
    random_state=42,
    metric_funcs_kwargs=metric_funcs_kwargs,
    curve_funcs_kwargs=curve_funcs_kwargs
)

# Summarize the bootstrap results
mf_summary_results, _ = summarize_bootstrap_results(bootstrapped_results)
# display() exists only in Jupyter/IPython sessions; print() works in any interpreter
print(pd.DataFrame(mf_summary_results))

# Plot the bootstrap analysis results
_ = plot_bootstrap_curve(bootstrapped_results, 'average_NNTvsTreated', 'NNTvsT',
                         xlabel='Number Treated', ylabel='NNT',
                         title='NNT vs. Number Treated', legend_title='Mean NNT (95% CI)')

The above code performs the bootstrap analysis and generates a summary table, as well as a plot for NNT vs. Number Treated. The results are shown below:

Bootstrap Summary Table

         roc_auc_score         average_NNTvsTreated
model_0  0.709 (0.676, 0.741)  3.844 (3.624, 4.04)
model_1  0.818 (0.794, 0.844)  3.539 (3.366, 3.698)
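
For readers who want to see how an estimate with an interval of this form can arise, the following is a minimal sketch of a percentile bootstrap for a single metric. It is not the code behind bootstrap_evaluation; the function name `percentile_bootstrap_sketch`, the 95% percentile interval, and the use of the mean as the point estimate are all assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def percentile_bootstrap_sketch(y_true, y_score, metric, n_bootstraps=200, alpha=0.05, seed=0):
    """Mean and percentile interval of `metric` over resampled data (illustrative)."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    n = y_true.size
    stats = []
    for _ in range(n_bootstraps):
        idx = rng.integers(0, n, size=n)      # resample rows with replacement
        if np.unique(y_true[idx]).size < 2:
            continue                          # AUC is undefined with a single class
        stats.append(metric(y_true[idx], y_score[idx]))
    stats = np.asarray(stats)
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return stats.mean(), lo, hi

# Simulated data in the same spirit as the example above
rng = np.random.default_rng(42)
p = rng.uniform(0, 1, 500)
y = rng.binomial(1, p)
mean_auc, lo, hi = percentile_bootstrap_sketch(y, p, roc_auc_score)
print(f"{mean_auc:.3f} ({lo:.3f}, {hi:.3f})")
```

Resampling the same row indices for every model, as a shared `idx` would allow, keeps model comparisons paired; whether med_metrics does this internally is not stated here.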

[Figure: NNT vs. Number Treated plot]
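
One way to read an NNT vs. Number Treated curve is sketched below. It rests on an assumption that this README does not confirm: that `rho` is the relative risk reduction of the treatment, so that treating the top-k ranked patients gives NNT = 1 / (rho × PPV at k). The function `nnt_vs_treated_sketch` is hypothetical and may not match what NNTvsTreated_curve computes.

```python
import numpy as np

def nnt_vs_treated_sketch(y_true, y_score, rho):
    """NNT when treating the k highest-scored patients, for k = 1..n.

    ASSUMPTION: NNT = 1 / (rho * PPV), with rho read as relative risk reduction.
    Illustrative only; not the med_metrics implementation.
    """
    y_true = np.asarray(y_true)
    order = np.argsort(-np.asarray(y_score), kind="stable")  # highest score first
    hits = np.cumsum(y_true[order])                # true positives among the top k
    n_treated = np.arange(1, y_true.size + 1)
    ppv = hits / n_treated
    with np.errstate(divide="ignore"):
        nnt = 1.0 / (rho * ppv)                    # inf where no positives are treated yet
    return n_treated, nnt

y_true = np.array([1, 0, 1, 0])
y_score = np.array([0.9, 0.8, 0.6, 0.1])
n_treated, nnt = nnt_vs_treated_sketch(y_true, y_score, rho=0.4)
print(dict(zip(n_treated.tolist(), nnt.tolist())))
```

Under this reading, a lower curve means fewer patients need to be treated per outcome prevented, and averaging the curve's height gives a single summary number.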

Modules Overview

  • bootstrap.py: Perform bootstrap evaluations and analyses.
  • compatibility_metrics.py: Functions for assessing prediction compatibility.
  • curves.py: Generate various evaluative curves.
  • metrics.py: Core module for specialized medical metrics.

Contributing

Contributions to med_metrics are welcome! Please read our contributing guidelines for more information on how to submit pull requests, report issues, or suggest enhancements.

License

med_metrics is released under the MIT License.

Contact

For questions or feedback, please contact Erkin Ötleş at hi@eotles.com.
