Skip to main content

Machine Learning model performance metrics & charts with confidence intervals, optimized with numba to be fast

Project description

fronni

A Python library for quickly calculating & displaying machine learning model performance metrics with confidence intervals.

Data scientists spend a lot of time evaluating the performance of their machine learning models. A common means of doing so is a classification report, such as the one below built-into the scikit-learn library (referenced over 440,000 times on Github).

The problem with depending on this is that when you have imbalanced and/or small datasets in your test sample, the volume of data that makes up each point estimate can be very small. As a result, whatever estimates you have for the precision and recall values will likely be unrepresentative of what happens in the real world. Instead, it would be far better to present to the reader a range of values from confidence intervals.

We can easily create confidence intervals for any metric of interest by using the bootstrap technique to create hundreds of samples from the original dataset (with replacement), and then throw away the most extreme 5% of values to get a 95% confidence interval, for example.

Examples

With a pandas dataframe named 'df', you can generate your metrics like this:

from fronni.classification import classification_report
report = classification_report(df['label'], df['predicted'], n=1000)
print(report)

Requirements

  • Python >= 3.6

  • numba

  • numpy

  • scikit-learn

  • plotly

Installing fronni

pip install fronni

How fronni works

When the datasets are large, running the bootstrap as described above becomes a computationally expensive calculation. Even a relatively small sample of a million rows, when resampled 1,000 times, turns into a billion-row dataset and causes the calculations to take close to a minute. This isn’t a problem when doing these calculations offline, but engineers are often working interactively in a Jupyter notebook and want answers quickly. For this reason we decided to create a few fast implementations of these calculations for common machine learning metrics, such as precision & recall using the numba library, which provides a speedup of approximately 23X over regular Python parallel-processing code.

Full documentation

Functions from the classification module:

classification_report

Generates confidence intervals for precision, recall, & F1 metrics for a binary or multi-class classification model, given arrays of predicted & label values.

Parameter Type Default
label Numpy array or Pandas series None
predicted Numpy array or Pandas series None
n integer, number of bootstrap iterations 1,000
confidence_level integer value between 1 & 100 95
as_dict Boolean, return nested dictionary if True otherwise Pandas dataframe False
confidence_level value between 1 & 100 95
sort_by_sample_size Boolean, return the Pandas dataframe, sorted in descending order of class sample size False

plot_classification_report

Plots precision, recall, & confidence intervals for F1 metrics for a binary or multi-class classification model, given a classification report input.

Parameter Type Default
report output from classification_report None
save_to_filename string, path of filename image to save like "image.png" None

From the regression module:

regression_report

Generates confidence intervals for RMSE, MAE, and R^2 metrics for a regression model, given arrays of predicted & label values.

Parameter Type Default
label Numpy array or Pandas series None
predicted Numpy array or Pandas series None
n integer, number of bootstrap iterations 1,000
as_dict Boolean, return nested dictionary if True otherwise Pandas dataframe False

See the CONTRIBUTING file for how to help out.

License

fronni is Apache 2.0 licensed, as found in the LICENSE file.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fronni-0.0.5.tar.gz (11.6 kB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page