
The modelscan package is a CLI tool for detecting unsafe operations in model files across various model serialization formats.

Project description

modelscan

# malicious code injection 
command = "system"
malicious_code = """cat ~/.aws/secrets""" 


modelscan is an open-source tool for scanning Machine Learning (ML) models. It scans model files *without* loading them on your machine, which protects you from malicious code that would otherwise run when the model is loaded.
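To make the idea of scanning without loading concrete, here is a minimal sketch using only the Python standard library. It is an illustration of the general technique, not modelscan's actual implementation. The `Payload` class and the `find_global_imports` helper are hypothetical names introduced for this example. The pickle is serialized but never loaded, so the embedded command never runs.

```python
import pickle
import pickletools


class Payload:
    """Hypothetical object: unpickling it would call os.system."""

    def __reduce__(self):
        import os

        # pickle stores "call os.system with this argument" in the byte stream.
        return (os.system, ("echo pwned",))


def find_global_imports(data: bytes) -> list[str]:
    """List the callables a pickle references, without unpickling it.

    Simplification: for STACK_GLOBAL, the module and name are taken from the
    two most recent string arguments, which ignores strings fetched from the memo.
    """
    found = []
    recent_strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Older protocols: arg is "module name" separated by a space.
            found.append(arg.replace(" ", "."))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name were pushed as strings just before.
            found.append(".".join(recent_strings[-2:]))
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return found


# Serializing is safe; only pickle.loads() would execute the payload.
payload_bytes = pickle.dumps(Payload())
# The module part of the reported name is platform-dependent.
print(find_global_imports(payload_bytes))
```

A scanner built on this idea would compare the reported names against a list of known-dangerous callables instead of printing them.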





How modelscan works




Fig 1: An outline for scanning models using modelscan.


TODO: Add a gif here like NBDefense to show how modelscan works- example notebook from pytorch



Getting Started

  1. Install modelscan:

    pip install modelscan
    
  2. Scan the model:

    To scan a model from a local directory:

    modelscan -p /path/to/model_file
    

    To scan a model from Hugging Face:

    modelscan -hf /repo_id/model_file
    
  3. Inspect the modelscan result:

    The modelscan results include:

    • List of files scanned.
    • List of files not scanned.
    • A summary of scan results, categorized by modelscan severity level: CRITICAL, HIGH, MEDIUM, and LOW.
    • A detailed list of the malicious code found at each severity level.

    More information on which ML models can be scanned using modelscan can be found here.

    More information about modelscan severity levels can be found here.
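If you want to run the scan from a Python script rather than a shell, the steps above can be sketched as a thin wrapper around the CLI. This assumes modelscan is installed and on your PATH. The function names `build_scan_command` and `run_scan` are hypothetical, and the sketch uses only the `-p` and `-hf` flags shown above. It makes no assumption about modelscan's exit codes or output format and simply returns the printed report.

```python
import subprocess


def build_scan_command(target: str, from_huggingface: bool = False) -> list[str]:
    """Build the modelscan argument list for a local path (-p) or Hugging Face file (-hf)."""
    flag = "-hf" if from_huggingface else "-p"
    return ["modelscan", flag, target]


def run_scan(target: str, from_huggingface: bool = False) -> str:
    """Run modelscan and return whatever it printed to standard output.

    check=False because this sketch does not assume what exit code
    modelscan uses when it finds issues.
    """
    result = subprocess.run(
        build_scan_command(target, from_huggingface),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling `run_scan("/path/to/model_file")` would then return the scan report as a string that you can print or save.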



Which ML Models can be Scanned using modelscan

At the moment, modelscan supports the following ML libraries.

PyTorch

PyTorch models can be saved and loaded using pickle, and modelscan can scan models saved using pickle. A notebook illustrating modelscan usage and expected results with a PyTorch model is included in the ./examples folder. [TODO]
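As a rough illustration of where the pickle data lives, recent PyTorch versions write `torch.save` output as a zip archive that contains a `data.pkl` member. The sketch below finds such members with the standard library only, without unpickling anything. It is an assumption-laden example rather than a description of how modelscan works: `list_pickle_members` and `make_demo_archive` are hypothetical helpers, and the demo archive is a hand-built stand-in for a real checkpoint.

```python
import io
import zipfile


def list_pickle_members(archive: bytes) -> list[str]:
    """Return the names of .pkl entries in a zip-format checkpoint, without unpickling them."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return [name for name in zf.namelist() if name.endswith(".pkl")]


def make_demo_archive() -> bytes:
    """Build a tiny stand-in archive laid out like a zip-format checkpoint."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        # A harmless pickle of None; it is never loaded in this example.
        zf.writestr("archive/data.pkl", b"\x80\x02N.")
        # Placeholder for raw tensor storage bytes.
        zf.writestr("archive/data/0", b"\x00" * 4)
    return buffer.getvalue()


print(list_pickle_members(make_demo_archive()))
```

Each member found this way could then be inspected opcode by opcode instead of being passed to `pickle.load`.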

TensorFlow

TensorFlow uses saved_model for model serialization, and modelscan can scan models saved using saved_model. A notebook illustrating modelscan usage and expected results with a TensorFlow model is included in the ./examples folder. [TODO]

Keras

Keras uses saved_model and h5 for model serialization, and modelscan can scan models saved in either format. A notebook illustrating modelscan usage and expected results with a Keras model is included in the ./examples folder. [TODO]



Classical ML libraries

modelscan also supports all ML libraries that use pickle for model serialization, such as Sklearn, XGBoost, and Catboost. A notebook illustrating modelscan usage and expected results with a model from one of these libraries is included in the ./examples folder. [TODO]



Example Notebooks

TODO



modelscan CLI arguments:

The modelscan CLI arguments and their usage are as follows:

| Argument | Explanation | Usage |
| --- | --- | --- |
| -h or --help | Get help | modelscan -h |
| -p or --path | Scan a model file in a local directory | modelscan -p /path/to/model_file |
| -hf or --huggingface | Scan a model file on Hugging Face | modelscan -hf /repo/model_file |



Contributing

We would love to have you contribute to our open-source modelscan project. If you would like to contribute, please follow the details on the Contribution page.

Download files

Source Distribution: modelscan-0.1.0.tar.gz (17.8 kB)

Built Distribution: modelscan-0.1.0-py3-none-any.whl (22.2 kB, Python 3)
