A framework to diagnose ML models
MLDiag
This Python library helps you diagnose machine learning models before deployment.
See the introduction to learn more about MLDiag.
Features
- Generate synthetic data with adversarial attacks to evaluate model robustness
- Compute statistics on model behaviour
- Simple, easy-to-use and lightweight library. Diagnose data in 3 lines of code
- Plug and play with any neural network framework (e.g. PyTorch, TensorFlow) or standard machine learning framework (e.g. scikit-learn)
- Supports textual, image, audio and structured data
- Can be added to a CI workflow
- Can be used from the command line or from Python scripts
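To make the robustness feature concrete, here is a minimal sketch of the general idea: score a model on clean inputs, score it again on perturbed copies, and report the drop in accuracy. It does not use MLDiag's API. The names `accuracy`, `robustness_drop`, `predict` and `upper_case` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence


def accuracy(predict: Callable[[str], int],
             inputs: Sequence[str],
             labels: Sequence[int]) -> float:
    """Fraction of inputs whose predicted label matches the true label."""
    correct = sum(predict(x) == y for x, y in zip(inputs, labels))
    return correct / len(inputs)


def robustness_drop(predict: Callable[[str], int],
                    perturb: Callable[[str], str],
                    inputs: Sequence[str],
                    labels: Sequence[int]) -> float:
    """Accuracy on clean inputs minus accuracy on perturbed inputs."""
    clean = accuracy(predict, inputs, labels)
    noisy = accuracy(predict, [perturb(x) for x in inputs], labels)
    return clean - noisy


# Toy stand-ins (assumptions, not part of MLDiag): a case-sensitive
# keyword "model" and a perturbation that upper-cases the text.
def predict(text: str) -> int:
    return 1 if "good" in text else 0


def upper_case(text: str) -> str:
    return text.upper()


texts = ["good film", "bad film", "really good", "awful"]
labels = [1, 0, 1, 0]
drop = robustness_drop(predict, upper_case, texts, labels)
```

In a CI workflow, a check of this kind could fail the build when `drop` exceeds a threshold chosen by the team.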
Quick Demo
Quick start
Installation
The library supports Python 3.7+ on Linux and Windows.
To install the library:
pip install mldiag
Or install the latest version (including beta features) directly from GitHub:
pip install git+https://github.com/AI-MEN/mldiag.git
Run a diagnostic
Method 1
This method uses the command line only. It requires a model running as a web service. A complete demo example follows:
- Create a text classification model:
python examples/text_classification/tf_text_classification.py train --save_model_path=./mldiag
A TensorFlow model, model.h5, is created in the mldiag directory.
- Run a text classification web service:
python examples/text_classification/flask_text_classification_service.py --model_path ./mldiag/model.h5
A local web service is now running at http://localhost:8080/query
- Create the test set used to diagnose the model:
python examples/text_classification/tf_text_classification.py save_test_set --out_path=./mldiag
A test set, test.npy, is saved in the mldiag directory.
It contains a NumPy array of text examples and their class labels.
- Run the diagnostic application, which calls the web service:
python mldiag/cli.py diagnose --eval_set "./mldiag/test.npy" \
  --config_file "examples/text_classification/config_text_classification.yaml" \
  --service_url http://localhost:8080/query \
  --report_path "./mldiag" \
  --json_field "results"
Here, results is the key under which the web service returns its data in the JSON response (see the web service script).
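The role of --json_field can be illustrated with a short sketch. It assumes the service answers with a JSON object such as {"results": [...]}; the actual payload of the demo service is defined in the web service script. The function name `extract_predictions` is hypothetical and is not part of MLDiag.

```python
import json
from typing import Any


def extract_predictions(response_body: str, json_field: str = "results") -> Any:
    """Parse a JSON response and return the value stored under json_field."""
    payload = json.loads(response_body)
    if json_field not in payload:
        raise KeyError(f"field {json_field!r} not found in service response")
    return payload[json_field]


# Example response body (an assumed shape, for illustration only).
body = '{"results": [0, 1, 1]}'
predictions = extract_predictions(body, json_field="results")
```

If the service used a different key, passing that key as `json_field` would be the only change needed on the caller's side.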

Method 2
This method uses Python scripts. It supports a number of machine learning models and data formats through wrappers. Ready-to-use wrappers can be found in mldiag/wrappers.py. A complete demo example follows:
- Create a text classification model:
python examples/text_classification/tf_text_classification.py train --save_model_path=./mldiag
A TensorFlow model, model.h5, is created in the mldiag directory.
- Call the Python script (the diagnostic config file is available in examples/text_classification/config_text_classification.yaml):
python examples/text_classification/tf_text_classification_diag.py run --model_path=./mldiag/model.h5 --report_path=./mldiag
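The wrapper idea mentioned above can be sketched as a small adapter that gives different models a common batch `predict` method. This illustrates the pattern only: the class name `CallableModelWrapper` and its interface are assumptions and may differ from what mldiag/wrappers.py actually provides.

```python
from typing import Any, Callable, List, Sequence


class CallableModelWrapper:
    """Adapt a per-example prediction function to a batch predict() method."""

    def __init__(self, predict_one: Callable[[Any], Any]) -> None:
        self._predict_one = predict_one

    def predict(self, batch: Sequence[Any]) -> List[Any]:
        # Apply the wrapped function to each example, preserving order.
        return [self._predict_one(example) for example in batch]


# Toy model (hypothetical): label a text 1 if it is longer than 5 characters.
wrapped = CallableModelWrapper(lambda text: int(len(text) > 5))
outputs = wrapped.predict(["short", "a longer text"])
```

With an adapter like this, the diagnostic code only needs to know about `predict`, whatever framework produced the underlying model.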
Diagnostics
| Diagnostic | Target | Action | Description |
|---|---|---|---|
| Textual | Character | OCRError | Simulate OCR errors |
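As an illustration of what a character-level OCR error diagnostic might do, the sketch below swaps characters for visually similar ones with a given probability. It is a simplified stand-in, not MLDiag's OCRError implementation; the confusion table `OCR_CONFUSIONS` and the function `simulate_ocr_errors` are hypothetical.

```python
import random

# Visually similar character pairs; an illustrative subset, not MLDiag's table.
OCR_CONFUSIONS = {"l": "1", "o": "0", "O": "0", "I": "l", "S": "5", "B": "8"}


def simulate_ocr_errors(text: str, prob: float, rng: random.Random) -> str:
    """Replace each confusable character with its look-alike with probability prob."""
    out = []
    for ch in text:
        if ch in OCR_CONFUSIONS and rng.random() < prob:
            out.append(OCR_CONFUSIONS[ch])
        else:
            out.append(ch)
    return "".join(out)


rng = random.Random(0)  # seeded so the perturbation is reproducible
always = simulate_ocr_errors("hello Bob", prob=1.0, rng=rng)
never = simulate_ocr_errors("hello Bob", prob=0.0, rng=rng)
```

Passing a seeded `random.Random` keeps the perturbed test set identical between runs, which makes diagnostic reports comparable.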
Recent Changes
See changelog for more details.
Further Reading
- Data Augmentation library for Text
- Does your NLP model able to prevent adversarial attack?
- How does Data Noising Help to Improve your NLP Model?
- Data Augmentation library for Speech Recognition
- Data Augmentation library for Audio
- Unsupervised Data Augmentation
- A Visual Survey of Data Augmentation in NLP
Reference
This library uses:
- data (e.g. captured from the internet),
- research (e.g. following augmenter ideas),
- models (e.g. using pre-trained models)
TODO: update sources
See data source for more details.
Citing
@misc{shabou2020mldiag,
title={Machine learning diagnosis},
author={Aymen SHABOU},
howpublished={https://github.com/AI-MEN/MLDiag},
year={2020}
}
Contributions