Package to calibrate and understand ML Models

ML Insights

Welcome to ML-Insights!

This package contains two core sets of functions:

  1. Calibration

  2. Interpreting Models

For probability calibration, the main class is SplineCalib. Given a set of model outputs and the “true” classes, you can fit a SplineCalib object. That object can then be used to calibrate future model predictions post-hoc.

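>>> import ml_insights as mli
>>> # Option 1: fit the calibrator on model outputs for a held-out validation set
>>> model.fit(X_train, y_train)
>>> valid_preds = model.predict_proba(X_valid)
>>> sc = mli.SplineCalib()
>>> sc.fit(valid_preds, y_valid)
>>> uncalib_preds = model.predict_proba(X_test)
>>> calib_preds = sc.calibrate(uncalib_preds)
>>> # Option 2: fit the calibrator on cross-validated predictions for the training set
>>> cv_preds = mli.cv_predictions(model, X_train, y_train)
>>> model.fit(X_train, y_train)
>>> sc = mli.SplineCalib()
>>> sc.fit(cv_preds, y_train)
>>> uncalib_preds = model.predict_proba(X_test)
>>> calib_preds = sc.calibrate(uncalib_preds)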
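
To make concrete what "calibrated" means, here is a minimal sketch of a reliability table in plain Python. It is not part of ML-Insights and is not how SplineCalib works internally; the function name `reliability_table` and the sample data are hypothetical. The idea is only that, for well-calibrated predictions, the mean predicted probability in each bin should be close to the observed positive rate in that bin.

```python
def reliability_table(probs, labels, n_bins=5):
    """Group predictions into equal-width bins and compare the mean
    predicted probability with the observed positive rate in each bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # Clamp the index so that p == 1.0 falls in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    rows = []
    for idx, members in enumerate(bins):
        if not members:
            continue  # skip empty bins
        mean_pred = sum(p for p, _ in members) / len(members)
        pos_rate = sum(y for _, y in members) / len(members)
        rows.append((idx, len(members), mean_pred, pos_rate))
    return rows


# Hypothetical positive-class predictions and their true labels.
probs = [0.1, 0.1, 0.3, 0.3, 0.9, 0.9, 0.9, 0.9]
labels = [0, 0, 0, 1, 1, 1, 0, 1]

for idx, count, mean_pred, pos_rate in reliability_table(probs, labels):
    print(f"bin {idx}: n={count} mean_pred={mean_pred:.2f} observed={pos_rate:.2f}")
```

A large gap between the two columns in a bin suggests that predictions in that range would benefit from calibration.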

For model interpretability, we provide the ice_plot and histogram_pair functions as well as other tools.

>>> rd = mli.get_range_dict(X_train)
>>> mli.ice_plot(model, X_test.sample(3), X_train.columns, rd)
>>> mli.histogram_pair(df.outcome, df.feature, bins=np.linspace(0,100,11))
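
The idea behind an ICE (individual conditional expectation) plot can be sketched without any library: take one observation, override a single feature with each value from a grid, and record the model's prediction each time. The code below is an illustration only, not the implementation of ice_plot; `ice_curve`, `toy_model`, and the feature names are hypothetical.

```python
def ice_curve(predict, row, feature, grid):
    """Return (value, prediction) pairs obtained by overriding one
    feature of a single observation with each value in `grid`."""
    curve = []
    for value in grid:
        modified = dict(row)  # copy so the original row is left untouched
        modified[feature] = value
        curve.append((value, predict(modified)))
    return curve


def toy_model(row):
    # Hypothetical stand-in for a fitted model: a hand-written score in [0, 1].
    score = 0.01 * row["age"] + (0.2 if row["smoker"] else 0.0)
    return min(max(score, 0.0), 1.0)


row = {"age": 40, "smoker": True}
for value, pred in ice_curve(toy_model, row, "age", [20, 40, 60, 80]):
    print(value, round(pred, 2))
```

Plotting one such curve per sampled observation shows how the prediction for each individual responds to the feature, rather than only the average response.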

Please see the documentation and examples at the links below.

Python

Python 3.4+

Disclaimer

We have tested this tool to the best of our ability, but understand that it may have bugs. It was most recently developed on Python 3.7.3. Use at your own risk, but feel free to report any bugs on our GitHub repository: https://github.com/numeristical/introspective

Installation

$ pip install ml_insights

Usage

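>>> import ml_insights as mli
>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.metrics import log_loss
>>> xray = mli.ModelXRay(model, data)
>>> # Calibrate a random forest and score the calibrated test predictions
>>> rfm = RandomForestClassifier(n_estimators=500, class_weight='balanced_subsample')
>>> rfm_cv = mli.SplineCalibratedClassifierCV(rfm)
>>> rfm_cv.fit(X_train, y_train)
>>> test_res_calib_cv = rfm_cv.predict_proba(X_test)[:, 1]
>>> log_loss(y_test, test_res_calib_cv)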
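
The log loss reported in the last step rewards well-calibrated probabilities. As a rough illustration, the sketch below computes binary log loss by hand; it is a hand-written stand-in rather than scikit-learn's implementation, and `binary_log_loss` and the sample values are hypothetical.

```python
import math


def binary_log_loss(labels, probs, eps=1e-15):
    """Mean negative log-likelihood of the labels under the predicted
    positive-class probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


# Three positives and one negative, so the true positive rate is 0.75.
labels = [1, 1, 1, 0]
overconfident = [0.99, 0.99, 0.99, 0.99]
moderate = [0.75, 0.75, 0.75, 0.75]

print(binary_log_loss(labels, overconfident))
print(binary_log_loss(labels, moderate))
```

The overconfident predictions are penalized heavily for the single negative case, so their log loss is higher than that of the predictions matching the observed rate.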

Source

Find the latest version on GitHub: https://github.com/numeristical/introspective

Feel free to fork and contribute!

License

Free software: MIT license

Developed By

  • Brian Lucena

  • Ramesh Sampath

References

Alex Goldstein, Adam Kapelner, Justin Bleich, and Emil Pitkin. 2014. Peeking Inside the Black Box: Visualizing Statistical Learning With Plots of Individual Conditional Expectation. Journal of Computational and Graphical Statistics (March 2014)
