Custom ML metrics for medical applications
Med Metrics: Advanced Medical Machine Learning Evaluation Toolkit
Overview
med_metrics is a Python package tailored for the evaluation of machine learning models in medical contexts. This package offers a unique suite of metrics, compatibility assessments, and bootstrapping techniques specifically designed to assess the performance and impact of models in healthcare.
Key Features
- Specialized Medical Metrics: Functions for advanced metrics such as Number Needed to Treat (NNT) across decision thresholds, average height of NNT vs. treated curves, and net benefit analysis.
- Curves for Evaluation: Generate curves such as NNT vs. treated and net benefit curves, essential for visual and quantitative model analysis.
- Compatibility Metrics: Evaluate prediction compatibility across model updates or between different models, crucial for maintaining trust in evolving medical ML applications.
- Bootstrap Evaluation: Robust tools for performing bootstrap evaluations, enabling detailed performance comparisons across different machine learning models.
- Compatibility Assessment: Evaluate how predictions change with model updates or across different models using functions like backwards_trust_compatibility.
- Utility Functions: A range of utility functions for generating classification and confusion matrix curves, and handling various inputs for metrics calculation.
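To make the NNT and net benefit features above more concrete, here is a minimal NumPy sketch of both quantities at a single decision threshold. It does not use the med_metrics API: the function names are hypothetical, and it assumes that rho is the relative risk reduction of the treatment, so that NNT = 1 / (rho * PPV). The package's own definitions may differ. Net benefit follows the standard decision-curve formula TP/n - FP/n * pt / (1 - pt).

```python
import numpy as np


def confusion_counts(y_true, y_score, threshold):
    """Return (tp, fp, fn, tn) when patients with score >= threshold are treated."""
    y_true = np.asarray(y_true).astype(bool)
    treated = np.asarray(y_score) >= threshold
    tp = int(np.sum(treated & y_true))
    fp = int(np.sum(treated & ~y_true))
    fn = int(np.sum(~treated & y_true))
    tn = int(np.sum(~treated & ~y_true))
    return tp, fp, fn, tn


def nnt_at_threshold(y_true, y_score, threshold, rho):
    """NNT = 1 / (rho * PPV). Assumption: rho is the relative risk reduction."""
    tp, fp, _, _ = confusion_counts(y_true, y_score, threshold)
    if tp == 0:
        return float("inf")  # no treated patient would have had the event
    ppv = tp / (tp + fp)
    return 1.0 / (rho * ppv)


def net_benefit_at_threshold(y_true, y_score, threshold):
    """Decision-curve net benefit: TP/n - FP/n * pt / (1 - pt)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_score, threshold)
    n = tp + fp + fn + tn
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Tiny hand-checkable example: 8 patients, 4 events.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

nnt = nnt_at_threshold(y_true, y_score, threshold=0.5, rho=0.4)
nb = net_benefit_at_threshold(y_true, y_score, threshold=0.5)
print(f"NNT at 0.5: {nnt:.3f}")
print(f"Net benefit at 0.5: {nb:.3f}")
```

In this toy data, three of the four treated patients have the event, so PPV is 0.75, NNT is 1 / (0.4 * 0.75), roughly 3.33, and net benefit is 3/8 - 1/8 = 0.25. Repeating the calculation over a grid of thresholds gives the kind of curve the feature list describes.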
Installation
To install med_metrics, run the following command:
pip install med-metrics
Dependencies
med_metrics requires the following libraries:
- numpy
- scikit-learn
- scipy
These dependencies are automatically installed with med_metrics.
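After installing, a quick way to confirm that the package and its dependencies are importable is sketched below. It relies only on the standard library; the helper name missing_modules is hypothetical. Note that import names can differ from PyPI names: scikit-learn is imported as sklearn, and med-metrics as med_metrics.

```python
from importlib.util import find_spec

# Import names, not PyPI distribution names.
REQUIRED_MODULES = ["numpy", "sklearn", "scipy", "med_metrics"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
print("Missing modules:", missing if missing else "none")
```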
Usage
The med_metrics package can be used to perform bootstrap evaluations for model comparison. Below is an example showcasing how to compare two machine learning models using the package's bootstrapping functionality.
Model Comparison
This example demonstrates a bootstrap analysis to compare two models using roc_auc_score and average_NNTvsTreated metrics, as well as generating roc_curve and NNT vs. Number Treated curves.
import numpy as np
from med_metrics.bootstrap import bootstrap_evaluation, summarize_bootstrap_results
from med_metrics.plotting import plot_bootstrap_curve
from sklearn.metrics import roc_auc_score, roc_curve
from med_metrics.metrics import average_NNTvsTreated
from med_metrics.curves import NNTvsTreated_curve
import pandas as pd
# Simulation of ground truth and model predictions
n = 1000
rng = np.random.default_rng(42)
p = rng.uniform(0, 1, n)
q = rng.uniform(0, 1, n)
y_true = rng.binomial(1, p)
# Bootstrap parameters
y_scores = {'model_0': p * q, 'model_1': p}
metric_funcs = {'roc_auc_score': roc_auc_score, 'average_NNTvsTreated': average_NNTvsTreated}
metric_funcs_kwargs = {'average_NNTvsTreated': {'rho': 0.4}}
curve_funcs = {'roc_curve': roc_curve, 'NNTvsT': NNTvsTreated_curve}
curve_funcs_kwargs = {'NNTvsT': {'rho': 0.4}}
# Perform the bootstrap analysis
bootstrapped_results = bootstrap_evaluation(
    y_true=y_true,
    y_scores=y_scores,
    metric_funcs=metric_funcs,
    curve_funcs=curve_funcs,
    n_bootstraps=1000,
    random_state=42,
    metric_funcs_kwargs=metric_funcs_kwargs,
    curve_funcs_kwargs=curve_funcs_kwargs,
)
# Summarize the bootstrap results
mf_summary_results, _ = summarize_bootstrap_results(bootstrapped_results)
print(pd.DataFrame(mf_summary_results))  # inside a Jupyter notebook, display(...) also works
# Plot the bootstrap analysis results
_ = plot_bootstrap_curve(
    bootstrapped_results, 'average_NNTvsTreated', 'NNTvsT',
    xlabel='Number Treated', ylabel='NNT',
    title='NNT vs. Number Treated', legend_title='Mean NNT (95% CI)',
)
The above code performs the bootstrap analysis and generates a summary table, as well as a plot for NNT vs. Number Treated. The results are shown below:
Bootstrap Summary Table
         roc_auc_score         average_NNTvsTreated
model_0  0.709 (0.676, 0.741)  3.844 (3.624, 4.04)
model_1  0.818 (0.794, 0.844)  3.539 (3.366, 3.698)
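Each cell in the table reads as a point estimate followed by an interval. One common way to obtain such numbers is a percentile bootstrap: resample patients with replacement, recompute the metric, and take quantiles of the resampled values. The sketch below illustrates that idea with NumPy only. It is not the package's implementation; the names rank_auc and percentile_bootstrap are hypothetical, and the rank-based AUROC is a stand-in for roc_auc_score. The simulation follows a recipe similar to the example above, but it will not reproduce the table exactly.

```python
import numpy as np


def rank_auc(y_true, y_score):
    """AUROC via the Mann-Whitney U statistic.

    Ties created by resampling the same patient share a label, so they
    do not change the rank sum of the positive class.
    """
    order = np.argsort(y_score)
    ranks = np.empty(len(y_score))
    ranks[order] = np.arange(1, len(y_score) + 1)
    pos = y_true == 1
    n_pos = pos.sum()
    n_neg = len(y_true) - n_pos
    return float((ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg))


def percentile_bootstrap(y_true, y_score, metric, n_bootstraps=500, alpha=0.05, seed=0):
    """Return (mean, lower, upper) of `metric` over bootstrap resamples."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    rng = np.random.default_rng(seed)
    n = len(y_true)
    values = np.empty(n_bootstraps)
    for b in range(n_bootstraps):
        idx = rng.integers(0, n, size=n)  # sample patients with replacement
        values[b] = metric(y_true[idx], y_score[idx])
    lower, upper = np.quantile(values, [alpha / 2, 1 - alpha / 2])
    return float(values.mean()), float(lower), float(upper)


# Simulated outcomes whose true event probability is p; the scores are p itself.
rng = np.random.default_rng(42)
p = rng.uniform(0, 1, 1000)
y_true = rng.binomial(1, p)

mean, lower, upper = percentile_bootstrap(y_true, p, rank_auc)
print(f"{mean:.3f} ({lower:.3f}, {upper:.3f})")
```

When two models are compared, intervals that do not overlap (as for roc_auc_score in the table) suggest a difference larger than resampling noise, although a paired comparison on the same resamples is a stricter check.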
Plot: NNT vs. Number Treated
Modules Overview
- bootstrap.py: Perform bootstrap evaluations and analyses.
- compatibility_metrics.py: Functions for assessing prediction compatibility.
- curves.py: Generate various evaluative curves.
- metrics.py: Core module for specialized medical metrics.
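As an illustration of what a compatibility metric measures, the sketch below computes backward trust compatibility in its usual form: of the cases the old model classified correctly, the fraction that the updated model also classifies correctly. This is a standalone definition with a hypothetical function name. The signature of the package's own backwards_trust_compatibility is not shown here and may differ, for example by taking scores and a threshold rather than hard labels.

```python
import numpy as np


def btc_sketch(y_true, y_pred_old, y_pred_new):
    """Fraction of cases the old model got right that the new model also gets right."""
    y_true = np.asarray(y_true)
    old_correct = np.asarray(y_pred_old) == y_true
    new_correct = np.asarray(y_pred_new) == y_true
    if old_correct.sum() == 0:
        return float("nan")  # undefined when the old model was never correct
    return float((old_correct & new_correct).sum() / old_correct.sum())


y_true = [1, 0, 1, 1, 0, 0]
y_pred_old = [1, 0, 0, 1, 1, 0]  # correct on cases 0, 1, 3, 5
y_pred_new = [1, 0, 1, 0, 0, 0]  # correct on cases 0, 1, 2, 4, 5

score = btc_sketch(y_true, y_pred_old, y_pred_new)
print(score)
```

Here the new model is more accurate overall (5 of 6 correct versus 4 of 6), yet it breaks one case the old model handled correctly, so the score is 3/4 = 0.75. That gap between accuracy and compatibility is what such metrics are meant to surface when a deployed model is updated.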
Contributing
Contributions to med_metrics are welcome! Please read our contributing guidelines for more information on how to submit pull requests, report issues, or suggest enhancements.
License
med_metrics is released under the MIT License.
Contact
For questions or feedback, please contact Erkin Ötleş at hi@eotles.com.