The modelscan package is a CLI tool for detecting unsafe operations in model files across various model serialization formats.
Project description
modelscan
# malicious code injection
command = "system"
malicious_code = """cat ~/.aws/secrets"""
modelscan is an open-source tool for scanning Machine Learning (ML) models. With modelscan, ML models can be scanned *without* loading them on your machine, which protects you from potential malicious code injection attacks.
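To see why scanning without loading matters: a pickle stream can name any importable callable to be invoked during unpickling. The sketch below is not modelscan's implementation. It is a minimal illustration, using only the standard library's `pickletools`, of how a pickle can be inspected for such references without ever calling `pickle.loads`. The `Payload` class, the `UNSAFE_MODULES` deny-list, and the `referenced_globals` helper are hypothetical names chosen for this example.

```python
import os
import pickle
import pickletools

# Hypothetical deny-list for illustration only; not modelscan's actual rule set.
# os.system is pickled under its defining module: "posix" on Unix, "nt" on Windows.
UNSAFE_MODULES = {"os", "posix", "nt", "subprocess", "builtins"}


class Payload:
    """Object whose unpickling would call os.system (it is never loaded here)."""

    def __reduce__(self):
        return (os.system, ("echo pwned",))


def referenced_globals(data: bytes) -> list[tuple[str, str]]:
    """Return the (module, name) pairs a pickle stream imports, without loading it."""
    found = []
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Older protocols carry "module name" directly in the opcode argument.
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+ pushes module and name as the two preceding strings.
            found.append((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found


# Serialize the payload (safe: dumping does not execute it) and inspect the bytes.
blob = pickle.dumps(Payload())
flagged = [(m, n) for m, n in referenced_globals(blob) if m in UNSAFE_MODULES]
print(flagged)
```

A real scanner needs a far more careful rule set than a module deny-list, but the principle is the same: the bytes are read as data and never executed.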
How modelscan works
Fig 1: An outline for scanning models using modelscan.
TODO: Add a gif here like NBDefense to show how modelscan works- example notebook from pytorch
Getting Started
- Install modelscan:
  pip install modelscan
- Scan the model:
  To scan a model from a local directory:
  modelscan -p /path/to/model_file
  To scan a model from Hugging Face:
  modelscan -hf /repo_id/model_file
- Inspect the modelscan result:
  The modelscan results include:
  - List of files scanned.
  - List of files not scanned.
  - A summary of scan results, categorized by the modelscan severity levels CRITICAL, HIGH, MEDIUM, and LOW.
  - A detailed list, under each severity level, of the malicious code found.
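As a rough mental model of the summary and per-severity lists described above, findings can be thought of as records grouped by severity level. The sketch below is purely illustrative: the `Finding` dataclass, its field names, and the sample entries are hypothetical and do not reflect modelscan's real output format or how it assigns severities. Only the four level names come from this README.

```python
from collections import Counter
from dataclasses import dataclass

# Severity levels as listed in this README, most severe first.
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for one flagged operation (not modelscan's schema)."""

    severity: str
    source: str
    operator: str


def summarize(findings: list[Finding]) -> dict[str, int]:
    """Count findings per severity level, including levels with zero findings."""
    counts = Counter(f.severity for f in findings)
    return {level: counts.get(level, 0) for level in SEVERITY_ORDER}


def group_by_severity(findings: list[Finding]) -> dict[str, list[Finding]]:
    """Build the detailed list of findings under each severity level."""
    grouped: dict[str, list[Finding]] = {level: [] for level in SEVERITY_ORDER}
    for finding in findings:
        grouped[finding.severity].append(finding)
    return grouped


# Made-up sample data for illustration; not real scan output.
findings = [
    Finding("CRITICAL", "example_model.pkl", "posix.system"),
    Finding("LOW", "example_model.pkl", "example_module.example_function"),
]
summary = summarize(findings)
details = group_by_severity(findings)
print(summary)
```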
More information on which ML models can be scanned using modelscan can be found here.
More information about modelscan severity levels can be found here.
Which ML Models can be Scanned using modelscan
At the moment, modelscan supports the following ML libraries.
PyTorch
PyTorch models can be saved and loaded using pickle. modelscan can scan models saved using pickle. A notebook illustrating modelscan usage and expected results with a PyTorch model is included in the ./examples folder. [TODO]
TensorFlow
TensorFlow uses saved_model for model serialization. modelscan can scan models saved using saved_model. A notebook illustrating modelscan usage and expected results with a TensorFlow model is included in the ./examples folder. [TODO]
Keras
Keras uses saved_model and h5 for model serialization. modelscan can scan models saved using saved_model and h5. A notebook illustrating modelscan usage and expected results with a Keras model is included in the ./examples folder. [TODO]
Classical ML libraries
modelscan also supports all ML libraries that use pickle for model serialization, such as scikit-learn, XGBoost, and CatBoost. A notebook illustrating modelscan usage and expected results with a classical ML model is included in the ./examples folder. [TODO]
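The serialization formats named above can often be told apart by their first few bytes or by directory layout. The following is a minimal sketch of that idea, assuming the well-known HDF5 and ZIP signatures and the pickle protocol header. `guess_format` and its return labels are hypothetical, and modelscan's own detection logic may differ.

```python
import pickle
import tempfile
from pathlib import Path

HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"  # 8-byte HDF5 signature (used by .h5 files)
ZIP_MAGIC = b"PK\x03\x04"          # ZIP local file header


def guess_format(path: Path) -> str:
    """Guess a model's serialization format without deserializing it."""
    if path.is_dir():
        # A TensorFlow SavedModel is a directory containing saved_model.pb.
        return "saved_model" if (path / "saved_model.pb").exists() else "unknown"
    with path.open("rb") as fh:
        head = fh.read(8)
    if head.startswith(HDF5_MAGIC):
        return "h5"
    if head.startswith(ZIP_MAGIC):
        # Recent torch.save versions write a ZIP archive that contains a pickle.
        return "zip_archive"
    if len(head) >= 2 and head[0] == 0x80 and 2 <= head[1] <= 5:
        # Pickle protocols 2-5 start with the PROTO opcode and a version byte.
        # Protocols 0 and 1 have no such header and are not detected here.
        return "pickle"
    return "unknown"


# Demonstrate on a harmless pickle written to a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "model.pkl"
    sample.write_bytes(pickle.dumps({"weights": [1, 2, 3]}))
    detected = guess_format(sample)
print(detected)
```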
Example Notebooks
TODO
modelscan CLI arguments:
The modelscan CLI arguments and their usage are as follows:
Argument | Explanation | Usage |
---|---|---|
-h or --help | Show help | modelscan -h |
-p or --path | Scan a model file in a local directory | modelscan -p /path/to/model_file |
-hf or --huggingface | Scan a model file on Hugging Face | modelscan -hf /repo/model_file |
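If you want to drive the CLI from a script, the flags in the table above can be assembled into an argument list and passed to `subprocess`. This is a sketch under assumptions: `build_scan_command` and `run_scan` are hypothetical helper names, and nothing is assumed here about modelscan's exit codes or output format, so the result is returned unparsed.

```python
import shutil
import subprocess
from typing import Optional


def build_scan_command(target: str, huggingface: bool = False) -> list[str]:
    """Build the argv for a scan using the -p or -hf flag from the table above."""
    flag = "-hf" if huggingface else "-p"
    return ["modelscan", flag, target]


def run_scan(target: str, huggingface: bool = False) -> Optional[subprocess.CompletedProcess]:
    """Run modelscan if it is installed; return None when it is not on PATH."""
    if shutil.which("modelscan") is None:
        return None
    # check=False: no assumption is made about what the exit code means.
    return subprocess.run(
        build_scan_command(target, huggingface),
        capture_output=True,
        text=True,
        check=False,
    )


command = build_scan_command("/path/to/model_file")
print(command)
```

Passing the arguments as a list rather than a shell string avoids quoting problems when a model path contains spaces.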
Contributing
We would love to have you contribute to our open-source modelscan project. If you would like to contribute, please follow the guidelines on the Contribution page.
Project details
Hashes for modelscan-0.1.0-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 0fd7ddc01645227d2f2c0631ef4e04c12a3ba8ce7d195310ef383e5c7a935d91 |
MD5 | 2287d550016804128cc5b0fa5499f502 |
BLAKE2b-256 | 3e006ca004b0e49a3930ecb423603e32c0959dbcbef91844ce5ebaebc0a91a27 |