A framework to diagnose ML models

License: MIT

MLDiag

This Python library helps you diagnose machine learning models before deployment.

Visit this introduction to learn more about MLDiag.

Features

  • Generate synthetic data with adversarial attacks to evaluate model robustness
  • Compute statistics on model behaviour
  • Simple, easy to use, and lightweight: diagnose data in 3 lines of code
  • Plug and play with any neural network framework (e.g. PyTorch, TensorFlow) or standard machine learning framework (e.g. scikit-learn)
  • Supports textual, image, audio, and structured data
  • Can be added to a CI workflow
  • Can be used from the command line or in Python scripts
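
The robustness idea behind the first feature can be sketched without any MLDiag API: score a model on a clean evaluation set, score it again on a perturbed copy, and compare the two. Everything below (the toy keyword classifier, the o_to_zero perturbation, and the function names) is hypothetical and only illustrates the principle; it is not how MLDiag implements its diagnostics.

```python
from typing import Callable, Sequence


def accuracy(predict: Callable[[str], int], texts: Sequence[str], labels: Sequence[int]) -> float:
    """Fraction of examples for which predict(text) equals the label."""
    correct = sum(1 for t, y in zip(texts, labels) if predict(t) == y)
    return correct / len(texts)


def robustness_gap(predict, perturb, texts, labels) -> float:
    """Accuracy on clean inputs minus accuracy on perturbed inputs."""
    perturbed = [perturb(t) for t in texts]
    return accuracy(predict, texts, labels) - accuracy(predict, perturbed, labels)


# Hypothetical toy model: predicts 1 if the word "good" appears, else 0.
def toy_predict(text: str) -> int:
    return 1 if "good" in text.lower() else 0


# Hypothetical perturbation: replace every letter "o" with the digit "0".
def o_to_zero(text: str) -> str:
    return text.replace("o", "0")


texts = ["a good film", "bad acting", "really good", "awful plot"]
labels = [1, 0, 1, 0]
gap = robustness_gap(toy_predict, o_to_zero, texts, labels)
print(f"accuracy drop under perturbation: {gap:.2f}")
```

A large gap suggests the model depends on surface details that the perturbation destroys; a gap near zero suggests it is robust to that particular kind of noise.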

Quick start

Installation

The library supports Python 3.7+ on Linux and Windows.

To install the library:

pip install mldiag

or install the latest version (including BETA features) directly from GitHub:

pip install git+https://github.com/AI-MEN/mldiag.git

Run a diagnostic

Method 1:

This method uses the command line only. It requires a model running as a web service. A complete example is provided as a demo:

  • Create a text classification model:
python examples/text_classification/tf_text_classification.py train --save_model_path=./mldiag

A TensorFlow model, model.h5, is created in the mldiag directory.

  • Run a text classification web service:
python examples/text_classification/flask_text_classification_service.py --model_path ./mldiag/model.h5

A local web service is now running at http://localhost:8080/query

  • Create the test set used to diagnose the model:
python examples/text_classification/tf_text_classification.py save_test_set --out_path=./mldiag

A test set, test.npy, is saved in the mldiag directory. It contains a NumPy array of text examples and their class labels.

  • Run the diagnostic application, which calls the web service:
python mldiag/cli.py diagnose --eval_set "./mldiag/test.npy" \
                              --config_file "examples/text_classification/config_text_classification.yaml" \
                              --service_url http://localhost:8080/query \
                              --report_path "./mldiag" \
                              --json_field "results"

Here, results is the key under which the web service returns its data as JSON (see the web service script).
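
To make the role of --json_field concrete, here is a hedged sketch of how a client could pull predictions out of a service response. The response body shown is invented for illustration, and extract_predictions is a hypothetical helper; the real payload format is defined by the web service script.

```python
import json
from typing import Any, List


def extract_predictions(response_body: str, json_field: str) -> List[Any]:
    """Parse a JSON response and return the value stored under json_field."""
    payload = json.loads(response_body)
    if json_field not in payload:
        raise KeyError(f"field {json_field!r} not found in service response")
    return payload[json_field]


# Invented example response, assuming the service wraps its output as {"results": [...]}.
body = '{"results": [0, 1, 1]}'
predictions = extract_predictions(body, "results")
print(predictions)
```

If the service used a different key, only the value passed as json_field would need to change.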

[Image: MLDiag capture, blog/capture.jpg in the GitHub repository]

Method 2:

This method uses Python scripts. It supports a number of machine learning models and data formats through wrappers. Ready-to-use wrappers can be found in mldiag/wrappers.py. A complete example is provided below as a demo.

  • Create a text classification model:
python examples/text_classification/tf_text_classification.py train --save_model_path=./mldiag

A TensorFlow model, model.h5, is created in the mldiag directory.

  • Call the Python script (the diagnostic config file is available in examples/text_classification/config_text_classification.yaml):
python examples/text_classification/tf_text_classification_diag.py run --model_path=./mldiag/model.h5 --report_path=./mldiag
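
The wrapper idea mentioned above can be pictured as a thin adapter that hides framework differences behind a single predict method. The class and function names below are hypothetical and are not taken from mldiag/wrappers.py; consult that module for the actual interfaces.

```python
from typing import Callable, List, Sequence


class PredictWrapper:
    """Hypothetical adapter: exposes any batch-scoring callable as predict(inputs) -> labels."""

    def __init__(self, score_fn: Callable[[Sequence[str]], Sequence[Sequence[float]]]):
        self._score_fn = score_fn

    def predict(self, inputs: Sequence[str]) -> List[int]:
        # Take the index of the highest score in each row as the predicted class.
        return [max(range(len(row)), key=row.__getitem__) for row in self._score_fn(inputs)]


# Stand-in for a framework model that returns one row of class scores per input.
def fake_scores(batch: Sequence[str]) -> List[List[float]]:
    return [[0.2, 0.8] if "good" in text else [0.9, 0.1] for text in batch]


wrapped = PredictWrapper(fake_scores)
predicted = wrapped.predict(["good film", "dull film"])
print(predicted)
```

With an adapter of this kind, diagnostic code only needs to know about predict, whatever framework produced the scores.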

Diagnostics

Diagnostic | Target    | Action   | Description
Textual    | Character | OCRError | Simulate OCR errors
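
As an illustration of what a character-level OCR diagnostic does, the sketch below swaps characters for visually similar ones with a given probability. The confusion table, function name, and parameters are assumptions made for this example, not MLDiag's own OCRError implementation.

```python
import random

# Small, hand-picked table of visually confusable characters (illustrative only).
OCR_CONFUSIONS = {"o": "0", "l": "1", "i": "1", "s": "5", "b": "6", "e": "c"}


def simulate_ocr_errors(text: str, rate: float = 0.3, seed: int = 0) -> str:
    """Replace confusable characters with their look-alikes with probability `rate`."""
    rng = random.Random(seed)  # local generator keeps the output reproducible
    out = []
    for ch in text:
        if ch in OCR_CONFUSIONS and rng.random() < rate:
            out.append(OCR_CONFUSIONS[ch])
        else:
            out.append(ch)
    return "".join(out)


noisy = simulate_ocr_errors("hello world", rate=1.0)  # every confusable character is swapped
clean = simulate_ocr_errors("hello world", rate=0.0)  # nothing is swapped
print(noisy, "|", clean)
```

Feeding text perturbed this way to a classifier and comparing its predictions with those on the clean text gives a rough measure of sensitivity to OCR noise.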

Recent Changes

See changelog for more details.

Extension Reading

Reference

This library uses:

  • data (e.g. captured from the internet),
  • research (e.g. following the augmenter idea),
  • models (e.g. using pre-trained models)

TODO: update sources.

See data source for more details.

Citing

@misc{shabou2020mldiag,
  title={Machine learning diagnosis},
  author={Aymen SHABOU},
  howpublished={https://github.com/AI-MEN/MLDiag},
  year={2020}
}

Contributions

Download files

Source Distribution

mldiag-0.0.1.tar.gz (35.1 kB)

Uploaded Source

Built Distribution

mldiag-0.0.1-py3-none-any.whl (35.0 kB)

Uploaded Python 3
