
ModelGuard 🛡️


A drop-in "seat-belt" library for machine learning model files that prevents hidden malware, verifies provenance, and works seamlessly across PyTorch, TensorFlow, scikit-learn, and ONNX.

🚨 The Problem

Machine learning models are increasingly being shared and downloaded from public repositories, but this creates serious security risks:

  • Arbitrary Code Execution: ML model formats based on Pickle can execute malicious code when loaded
  • Supply Chain Attacks: Models from untrusted sources can contain hidden malware
  • No Provenance Verification: There is typically no way to verify who created a model or whether it has been tampered with
  • Framework Fragmentation: Different security approaches for each ML framework
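The first risk is easy to demonstrate. Pickle lets any object define `__reduce__`, which smuggles an arbitrary callable that runs during deserialization. This sketch uses a harmless `print` as the payload; a real attack would return `os.system`, `subprocess.call`, or `eval` instead (the `EvilModel` class is hypothetical, for illustration only):

```python
import pickle

# Hypothetical malicious "model": __reduce__ smuggles a callable that
# pickle.loads will invoke during deserialization.
class EvilModel:
    def __reduce__(self):
        # Real attacks return os.system, subprocess.call, eval, etc. here.
        return (print, ("arbitrary code ran during pickle.loads!",))

payload = pickle.dumps(EvilModel())
result = pickle.loads(payload)  # executes the smuggled call
```

Note that loading never returns an `EvilModel` at all: the attacker's callable runs and its return value replaces the object, so the damage is done before any application code can inspect the result.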

✨ The Solution

ModelGuard provides comprehensive ML model security with:

🔒 Safe Loading - Blocks malicious Pickle opcodes with a restricted unpickler
🔐 Signature Verification - Verifies model provenance via Sigstore signatures
⚡ Zero Friction - Drop-in replacement requiring minimal code changes
🌐 Multi-Framework - Unified security across PyTorch, TensorFlow, scikit-learn, and ONNX
🚀 Production Ready - Full test suite (54/54 tests) passing

🚀 Quick Start

Installation

pip install ml-modelguard

Basic Usage

Option 1: Direct Replacement

# Before: Unsafe loading
import torch
model = torch.load('model.pth')

# After: Safe loading
import modelguard.torch as torch
model = torch.safe_load('model.pth')

Option 2: Context Manager (Recommended)

import modelguard
import torch

with modelguard.patched():
    model = torch.load('model.pth')  # Automatically secured
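The mechanism behind a context manager like this is temporary monkey-patching: swap the framework's loader for a guarded wrapper on entry, and restore it on exit. Here is a minimal, self-contained sketch of that idea; it is illustrative only, and the real library's internals may differ:

```python
import contextlib

# Sketch of the monkey-patching idea: temporarily replace a module's
# load() so every call is scanned before the original loader runs.
@contextlib.contextmanager
def patched_load(module, scan):
    original = module.load

    def guarded_load(path, *args, **kwargs):
        scan(path)  # expected to raise if the file looks malicious
        return original(path, *args, **kwargs)

    module.load = guarded_load
    try:
        yield
    finally:
        module.load = original  # always restore the real loader
```

The `try`/`finally` matters: even if loading raises, the original function is restored, so the patch cannot leak outside the `with` block.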

Option 3: CLI Scanning

# Scan a model file
modelguard scan model.pth

# Scan entire directory
modelguard scan ./models/ --recursive

# Get JSON output
modelguard scan model.pth --format json

🔧 Framework Support

PyTorch

import modelguard.torch as torch
model = torch.safe_load('model.pth')

TensorFlow/Keras

import modelguard.tensorflow as tf
model = tf.safe_load('model.h5')

scikit-learn

import modelguard.sklearn as sklearn
model = sklearn.safe_load('model.pkl')

ONNX

import modelguard.onnx as onnx
model = onnx.safe_load('model.onnx')
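A unified multi-framework API usually comes down to mapping a model file's extension to the loader that handles it. This is a hypothetical sketch of that dispatch step (the extension table and function names are assumptions, not ModelGuard's actual code):

```python
from pathlib import Path

# Hypothetical extension-to-framework table for a unified safe_load.
LOADERS = {
    ".pth": "torch", ".pt": "torch",
    ".h5": "tensorflow", ".keras": "tensorflow",
    ".pkl": "sklearn", ".joblib": "sklearn",
    ".onnx": "onnx",
}

def pick_framework(path: str) -> str:
    """Map a model file's extension to the framework loader that handles it."""
    suffix = Path(path).suffix.lower()
    try:
        return LOADERS[suffix]
    except KeyError:
        raise ValueError(f"unsupported model format: {suffix!r}")
```

Failing loudly on unknown extensions is the safer default here: silently falling back to a generic pickle load would reopen the exact hole the library exists to close.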

🛡️ Security Features

Malicious Code Detection

ModelGuard analyzes Pickle opcodes to detect dangerous patterns:

  • GLOBAL/STACK_GLOBAL opcodes that import arbitrary functions or classes
  • REDUCE opcodes that call those imports with attacker-controlled arguments
  • BUILD opcodes that set object state and can trigger custom __setstate__ code
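The standard library's `pickletools.genops` can walk a pickle stream's opcodes without executing it, which is the basis of this kind of static scan. The following is a simplified sketch in the same spirit (the opcode deny-list and function name are assumptions, not ModelGuard's actual rules):

```python
import pickle
import pickletools

# Hypothetical deny-list of opcodes that can import or execute code.
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "BUILD", "INST", "OBJ"}

def scan_pickle_bytes(data: bytes) -> list[str]:
    """Return names of suspicious opcodes found in a pickle stream,
    without ever deserializing it."""
    return [op.name for op, _arg, _pos in pickletools.genops(data)
            if op.name in SUSPICIOUS]
```

A plain dict of floats serializes with no suspicious opcodes, while any object smuggling a callable via `__reduce__` shows up immediately as GLOBAL/STACK_GLOBAL plus REDUCE.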

Signature Verification

Verify model authenticity using Sigstore:

# Sign a model
modelguard sign model.pth

# Verify signature
modelguard verify model.pth

Policy Enforcement

Configure security policies via environment variables or YAML:

# modelguard.yaml
enforce: true
require_signatures: true
trusted_signers:
  - "alice@company.com"
  - "bob@company.com"
max_file_size_mb: 1000

📊 Performance

ModelGuard is designed for production use with excellent performance:

  • Fast Scanning: scans a 100 MB model in under 150 ms
  • Memory Efficient: stable memory usage with no leaks
  • Concurrent Safe: thread-safe operations that scale linearly with worker count
  • Low Overhead: security checks add only modest latency to model loading

🔧 Configuration

Environment Variables

export MODELGUARD_ENFORCE=true
export MODELGUARD_REQUIRE_SIGNATURES=true
export MODELGUARD_TRUSTED_SIGNERS="alice@company.com,bob@company.com"
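Turning variables like these into a policy object is mostly boolean coercion plus splitting the comma-separated signer list. A hedged sketch of that parsing, assuming a hypothetical `Policy` dataclass (the real library's configuration model may differ):

```python
import os
from dataclasses import dataclass

# Hypothetical policy object populated from the environment variables above.
@dataclass
class Policy:
    enforce: bool = False
    require_signatures: bool = False
    trusted_signers: tuple[str, ...] = ()

def policy_from_env(env=os.environ) -> Policy:
    def flag(name: str) -> bool:
        # Treat common truthy spellings as True, everything else as False.
        return env.get(name, "").strip().lower() in {"1", "true", "yes"}

    signers = env.get("MODELGUARD_TRUSTED_SIGNERS", "")
    return Policy(
        enforce=flag("MODELGUARD_ENFORCE"),
        require_signatures=flag("MODELGUARD_REQUIRE_SIGNATURES"),
        trusted_signers=tuple(s.strip() for s in signers.split(",") if s.strip()),
    )
```

Passing the environment mapping as a parameter keeps the parser easy to test and lets a YAML-derived dict be fed through the same code path.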

Policy File

Create modelguard.yaml in your project root:

enforce: true
require_signatures: false
scan_on_load: true
max_file_size_mb: 1000
timeout_seconds: 30

📚 Examples

Enterprise Security Setup

import modelguard
import os

# Configure strict security policy
os.environ['MODELGUARD_ENFORCE'] = 'true'
os.environ['MODELGUARD_REQUIRE_SIGNATURES'] = 'true'
os.environ['MODELGUARD_TRUSTED_SIGNERS'] = 'security@company.com'

# All model loading is now secured
with modelguard.patched():
    import torch
    import tensorflow as tf

    # Both calls are automatically secured
    pytorch_model = torch.load('model.pth')
    tf_model = tf.keras.models.load_model('model.h5')

Development Workflow

import modelguard.torch as torch

# Safe loading with detailed feedback
try:
    model = torch.safe_load('untrusted_model.pth')
    print("✅ Model loaded safely")
except modelguard.MaliciousModelError as e:
    print(f"🚨 Malicious content detected: {e}")
except modelguard.SignatureError as e:
    print(f"🔐 Signature verification failed: {e}")

🧪 Testing

ModelGuard has comprehensive test coverage:

# Run all tests
pytest tests/

# Run specific test categories
pytest tests/test_policy.py      # Policy engine tests
pytest tests/test_scanner.py     # Malware detection tests
pytest tests/test_loaders.py     # Framework loader tests
pytest tests/test_performance.py # Performance benchmarks

Test Results: 54/54 tests passing ✅

🤝 Contributing

We welcome contributions! Here's how to get started:

Development Setup

  1. Fork and Clone

    git clone https://github.com/YOUR_USERNAME/Modelguard.git
    cd Modelguard
    
  2. Install Development Dependencies

    pip install -e ".[dev]"
    
  3. Run Tests

    pytest tests/
    
  4. Code Quality Checks

    ruff check src/ tests/
    mypy src/
    

What We Need Help With

  • 🐛 Bug Reports: Found an issue? Open an issue with details
  • 🚀 New Features: Ideas for improving ML security
  • 📚 Documentation: Help improve our docs and examples
  • 🧪 Testing: More test cases and edge case coverage
  • 🔧 Framework Support: Additional ML framework integrations

See our Contributing Guide for detailed guidelines.

📄 License

ModelGuard is licensed under the Apache License 2.0.


🙏 Acknowledgments

  • Sigstore for signature verification infrastructure
  • Python Security Team for security best practices
  • ML Community for feedback and testing

Made with ❤️ for the ML community's security



Download files


Source Distribution

ml_modelguard-0.2.0.tar.gz (48.3 kB)

Built Distribution

ml_modelguard-0.2.0-py3-none-any.whl (25.2 kB)

File details

Details for the file ml_modelguard-0.2.0.tar.gz.

File metadata

  • Size: 48.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

  • SHA256: 90eb043fe96ecacc53506d142028829c6b1471924a0b660de020c3fb514919ea
  • MD5: 948287e137c11f9ca3173801eca9c67f
  • BLAKE2b-256: 5fc93f55a80a4a13038085059d7fc036c7e77312c4c453b7bf128414a259f49b

File details

Details for the file ml_modelguard-0.2.0-py3-none-any.whl.

File metadata

  • Size: 25.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

  • SHA256: f2cb56aeb63e161927466a1dab9c52d45fe9a0c88a01eaab2ddfe48c03acc8e4
  • MD5: 92f8133d80b0e5294561dbf831813e08
  • BLAKE2b-256: 086d48041f5b197093f697d1a8c386389e368e71e51e752d106544b2812e86b1
