Performance Robustness Evaluation for Statistical Classifiers

PRESC: Performance and Robustness Evaluation for Statistical Classifiers

Join the chat at https://gitter.im/PRESC-outreachy/community

PRESC is a toolkit for the evaluation of machine learning classification models. Its goal is to provide insights into model performance which extend beyond standard scalar accuracy-based measures and into areas which tend to be underexplored in application, including:

  • Generalizability of the model to unseen data for which the training set may not be representative
  • Sensitivity to statistical error and methodological choices
  • Performance evaluation localized to meaningful subsets of the feature space
  • In-depth analysis of misclassifications and their distribution in the feature space
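To make the idea of localized evaluation concrete, here is a minimal sketch in plain Python. It is not PRESC's API: the function name `accuracy_by_bin`, the bin edges, and the toy data are all assumptions made up for illustration. It shows how a single overall accuracy figure can hide a region of the feature space where the model does noticeably worse.

```python
from collections import defaultdict


def accuracy_by_bin(feature_values, y_true, y_pred, bin_edges):
    """Return {bin_index: (accuracy, count)} for samples grouped by one feature.

    Bin i covers bin_edges[i] <= x < bin_edges[i + 1]; values outside the
    edges are skipped. (Hypothetical helper, not part of PRESC.)
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for x, truth, pred in zip(feature_values, y_true, y_pred):
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] <= x < bin_edges[i + 1]:
                counts[i] += 1
                hits[i] += int(truth == pred)
                break
    return {i: (hits[i] / counts[i], counts[i]) for i in sorted(counts)}


# Toy data, made up for illustration: the errors concentrate at large x.
feature = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0, 0, 1]

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
local = accuracy_by_bin(feature, y_true, y_pred, [0.0, 0.5, 1.0])
print(overall)  # 0.75
print(local)    # {0: (1.0, 4), 1: (0.5, 4)}
```

The overall accuracy of 0.75 says nothing about where the errors fall, while the per-bin view shows that all of them occur in the upper half of the feature range.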

More details about the specific features we are considering are presented in the project roadmap. We believe that these evaluations are essential for developing confidence in the selection and tuning of machine learning models intended to address user needs, and are important prerequisites towards building trustworthy AI.

PRESC also includes a package for building copies of machine learning classifiers.
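As a rough illustration of what copying a classifier can mean, the sketch below assumes the common approach of querying the original model on synthetic points and fitting a new model to its answers. This is an assumption about the general technique, not a description of PRESC's copying package: `original_model`, `build_copy`, the uniform sampling, and the 1-nearest-neighbour copy are all hypothetical choices.

```python
import random


def original_model(x):
    """Hypothetical black-box classifier: it can be queried but not inspected."""
    return 1 if x[0] + x[1] > 1.0 else 0


def build_copy(model, n_samples, rng):
    """Label synthetic points with `model` and return a 1-nearest-neighbour copy."""
    # Sample synthetic points uniformly from the unit square and let the
    # original model label them; no original training data is needed.
    points = [(rng.random(), rng.random()) for _ in range(n_samples)]
    labels = [model(p) for p in points]

    def copy_model(x):
        # Predict the label of the closest synthetic point.
        nearest = min(
            range(n_samples),
            key=lambda i: (points[i][0] - x[0]) ** 2 + (points[i][1] - x[1]) ** 2,
        )
        return labels[nearest]

    return copy_model


rng = random.Random(0)
copy_model = build_copy(original_model, n_samples=500, rng=rng)

# Fidelity: how often the copy agrees with the original on fresh points.
test_points = [(rng.random(), rng.random()) for _ in range(200)]
agreement = sum(copy_model(p) == original_model(p) for p in test_points) / len(test_points)
print(f"agreement with original: {agreement:.2f}")
```

The agreement rate measures how faithfully the copy reproduces the original's decisions, which is a different question from how accurate either model is on real data.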

As a tool, PRESC is intended for use by ML engineers to assist in the development and updating of models. It is usable in the following ways:

  • As a standalone tool which produces a graphical report evaluating a given model and dataset
  • As a Python package/API which can be integrated into an existing pipeline

A further goal is to use PRESC:

  • As a step in a Continuous Integration workflow: evaluations run as a part of CI, for example, on regular model updates, and fail if metrics produce unacceptable values.
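A CI gate of this kind might look like the following minimal sketch. It does not use any PRESC API: the metric names, the threshold values, and the helpers `find_violations` and `main` are hypothetical, and the metrics dictionary stands in for whatever the evaluation run would actually produce.

```python
import sys


def find_violations(metrics, minimums):
    """Return a message for every metric that is missing or below its minimum."""
    violations = []
    for name, floor in minimums.items():
        value = metrics.get(name)
        if value is None:
            violations.append(f"{name}: missing from evaluation output")
        elif value < floor:
            violations.append(f"{name}: {value:.3f} is below the minimum {floor:.3f}")
    return violations


def main(metrics, minimums):
    """Return a process exit code: 0 if all checks pass, 1 otherwise."""
    violations = find_violations(metrics, minimums)
    for message in violations:
        print(f"FAIL {message}", file=sys.stderr)
    return 1 if violations else 0


# Hypothetical values; in practice these would come from the evaluation run.
metrics = {"accuracy": 0.91, "recall_minority_class": 0.62}
minimums = {"accuracy": 0.90, "recall_minority_class": 0.70}
exit_code = main(metrics, minimums)
print(exit_code)  # 1
```

In a real CI script the last step would be `sys.exit(exit_code)`, since most CI systems mark a job as failed when a command exits with a non-zero status.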

For the time being, the following are considered out of scope:

  • User-facing evaluations, e.g. explanations
  • Evaluations which depend explicitly on domain context or value judgements of features, e.g. protected demographic attributes. A domain expert could use PRESC to study misclassifications across such protected groups, say, but the PRESC evaluations themselves should be agnostic to such determinations.
  • Analyses which do not involve the model, e.g. class imbalance in the training data

There is a considerable body of recent academic research addressing these topics, as well as a number of open-source projects solving related problems. Where possible, we plan to offer integration with existing tools which align with our vision and goals.

