Performance and Robustness Evaluation for Statistical Classifiers

Project description

PRESC: Performance and Robustness Evaluation for Statistical Classifiers

Join the chat at https://gitter.im/PRESC-outreachy/community

PRESC is a toolkit for the evaluation of machine learning classification models. Its goal is to provide insights into model performance which extend beyond standard scalar accuracy-based measures and into areas which tend to be overlooked in application, including:

  • Generalizability of the model to unseen data for which the training set may not be representative
  • Sensitivity to statistical error and methodological choices
  • Performance evaluation localized to meaningful subsets of the feature space
  • In-depth analysis of misclassifications and their distribution in the feature space
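
To make the idea of localized evaluation concrete, here is a minimal sketch that compares overall accuracy with accuracy inside bins of a single feature. It is not PRESC's API: the function name `accuracy_by_bin`, the bin edges, and the toy data are all assumptions made up for illustration.

```python
from collections import defaultdict


def accuracy_by_bin(feature_values, y_true, y_pred, bin_edges):
    """Return {bin_label: (accuracy, count)} for samples grouped by one feature.

    bin_edges must be sorted ascending; a value v falls in bin i when
    bin_edges[i] <= v < bin_edges[i + 1]. Values outside the edges are skipped.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for value, truth, pred in zip(feature_values, y_true, y_pred):
        for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
            if lo <= value < hi:
                label = f"[{lo}, {hi})"
                total[label] += 1
                correct[label] += int(truth == pred)
                break
    return {label: (correct[label] / total[label], total[label]) for label in total}


# Toy, made-up data purely for illustration (hypothetical feature and labels).
feature = [0.1, 0.4, 0.6, 0.9, 1.2, 1.7, 1.8, 1.9]
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 0, 0, 1, 1, 0]

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
per_bin = accuracy_by_bin(feature, y_true, y_pred, bin_edges=[0, 1, 2])

print(f"overall accuracy: {overall:.2f}")
for label, (acc, n) in per_bin.items():
    print(f"{label}: accuracy={acc:.2f} over {n} samples")
```

In this toy data the single overall score hides the fact that every misclassification falls in the upper half of the feature range, which is the kind of pattern a scalar accuracy measure cannot show.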

More details about the specific features we are considering are presented in the project roadmap. We believe that these evaluations are essential for developing confidence in the selection and tuning of machine learning models intended to address user needs, and are important prerequisites for building trustworthy AI.

As a tool, PRESC is intended for use by ML engineers to assist in the development and updating of models. It will be usable in the following ways:

  • As a standalone tool which produces a graphical report evaluating a given model and dataset
  • As a Python package/API which can be integrated into an existing pipeline
  • As a step in a Continuous Integration (CI) workflow: evaluations run as part of CI, for example on regular model updates, and fail if any metric takes an unacceptable value.
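
The CI use case above could be sketched roughly as follows. This is an illustrative gate written in plain Python, not PRESC's actual interface: the function names `find_violations` and `ci_exit_code`, the metric names, and the threshold values are all hypothetical.

```python
def find_violations(metrics, minimums):
    """Return a list of human-readable messages for metrics below their minimum."""
    violations = []
    for name, minimum in minimums.items():
        if name not in metrics:
            violations.append(f"{name}: metric missing from evaluation output")
        elif metrics[name] < minimum:
            violations.append(
                f"{name}: {metrics[name]:.3f} is below minimum {minimum:.3f}"
            )
    return violations


def ci_exit_code(metrics, minimums):
    """Print any violations and return 0 (pass) or 1 (fail) for the CI runner."""
    violations = find_violations(metrics, minimums)
    for message in violations:
        print(f"FAIL {message}")
    return 1 if violations else 0


# Hypothetical values: in practice the metrics would come from evaluating
# the updated model, and the minimums from project policy.
example_metrics = {"accuracy": 0.93, "recall": 0.71}
example_minimums = {"accuracy": 0.90, "recall": 0.80}
exit_code = ci_exit_code(example_metrics, example_minimums)
```

A CI script would typically pass the returned value to `sys.exit`, since most CI systems mark a step as failed when its process exits with a nonzero code.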

We are using the standard Python scientific stack (numpy/pandas/jupyter). In order to streamline development while the project is still in its early stages, we are restricting focus to scikit-learn supervised classification models, and we are prototyping report visualizations in Jupyter notebooks. For the time being, the following are considered out of scope:

  • Models built in machine learning frameworks other than scikit-learn
  • User-facing evaluations, e.g. explanations
  • Evaluations which depend explicitly on domain context or value judgements of features, e.g. protected demographic attributes. A domain expert could use PRESC to study misclassifications across such protected groups, say, but the PRESC evaluations themselves should be agnostic to such determinations.
  • Analyses which do not involve the model, e.g. class imbalance in the training data

There is a considerable body of recent academic research addressing these topics, as well as a number of open-source projects solving related problems. Where possible, we plan to offer integration with existing tools which align with our vision and goals.

This project was the subject of an Outreachy internship during Summer 2020. Submissions from the Spring 2020 application period have been archived in this repo in their original state, and will be integrated here as needed.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

presc-0.1.2.tar.gz (16.5 MB)

Uploaded Source

Built Distributions

presc-0.1.2-py3.7.egg (441.3 kB)

Uploaded Source

presc-0.1.2-py3-none-any.whl (392.1 kB)

Uploaded Python 3

File details

Details for the file presc-0.1.2.tar.gz.

File metadata

  • Download URL: presc-0.1.2.tar.gz
  • Upload date:
  • Size: 16.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.7.0 requests/2.25.1 setuptools/49.6.0.post20210108 requests-toolbelt/0.9.1 tqdm/4.58.0 CPython/3.7.6

File hashes

Hashes for presc-0.1.2.tar.gz
Algorithm    Hash digest
SHA256       0861be855747915c6690e1d125cefa42377e3148c85c5c2b7e3f9c76c30e57b8
MD5          a9eafd578871d1187dbcb5fd18f6c066
BLAKE2b-256  c8f0690938cb82fcd406c5710d9309ec28d51ffad4133bcc04712eef271ca744

See more details on using hashes here.

File details

Details for the file presc-0.1.2-py3.7.egg.

File metadata

  • Download URL: presc-0.1.2-py3.7.egg
  • Upload date:
  • Size: 441.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.7.0 requests/2.25.1 setuptools/49.6.0.post20210108 requests-toolbelt/0.9.1 tqdm/4.58.0 CPython/3.7.6

File hashes

Hashes for presc-0.1.2-py3.7.egg
Algorithm    Hash digest
SHA256       977cf7581a77efb9bae278009929499bc601c3b7b64868c47117800cd97ceb4a
MD5          8ed1cda71fe5703492dad75803a03e56
BLAKE2b-256  ee3bce4c7345b3169bd2fba34f99a0bcabf37df57d80fd7ce9b515b22508ef40

File details

Details for the file presc-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: presc-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 392.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.7.0 requests/2.25.1 setuptools/49.6.0.post20210108 requests-toolbelt/0.9.1 tqdm/4.58.0 CPython/3.7.6

File hashes

Hashes for presc-0.1.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       35814688a51ceb3b486d9f11c349e842e37e507aa61be16870e7f8c5c5b7a2a1
MD5          6206e425bae20f29c2a5b07bbf241cc5
BLAKE2b-256  939c09f143609a69eb6047f25e7cb68464c27543f694ab31c4cde745eea36289
