Static scanning library for detecting malicious code, potential backdoor indicators, and other security risks in ML model files
ModelAudit
Secure your AI models before deployment. Static scanner that detects malicious code, potential backdoor indicators, and security vulnerabilities in ML model files — without ever loading or executing them.
Full Documentation | Usage Examples | Supported Formats
Quick Start
Requires Python 3.10-3.13
pip install "modelaudit[all]"
# Scan a file or directory
modelaudit model.pkl
modelaudit ./models/
# Export results for CI/CD
modelaudit model.pkl --format json --output results.json
$ modelaudit suspicious_model.pkl
Files scanned: 1 | Issues found: 2 critical, 1 warning
1. suspicious_model.pkl (pos 28): [CRITICAL] Malicious code execution attempt
Why: Contains os.system() call that could run arbitrary commands
2. suspicious_model.pkl (pos 52): [WARNING] Dangerous pickle deserialization
Why: Could execute code when the model loads
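The report above points at byte positions inside the pickle stream. As a rough illustration of how a finding like this can be produced without unpickling anything, the sketch below walks the opcode stream with the standard-library pickletools module and flags imports from a small denylist. It is not ModelAudit's implementation: the function name, the denylist, and the simplified STACK_GLOBAL handling are all assumptions made for this example.

```python
import pickletools

# Hypothetical denylist for illustration only; real scanners use far broader rules.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess"}
# Opcodes that push a text string, used to resolve STACK_GLOBAL (protocol 4+).
STRING_OPCODES = {"UNICODE", "BINUNICODE", "SHORT_BINUNICODE", "BINUNICODE8"}


def find_suspicious_globals(data: bytes) -> list[tuple[int, str]]:
    """Return (byte position, dotted name) for denylisted imports in a pickle.

    The bytes are only parsed as opcodes; nothing is unpickled or executed.
    """
    findings = []
    recent_strings = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in STRING_OPCODES:
            recent_strings.append(arg)
            continue
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            module, _, name = arg.partition(" ")
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            # Simplification: assume the two most recent strings are module and name.
            module, name = recent_strings[-2:]
        else:
            continue
        if module.split(".")[0] in SUSPICIOUS_MODULES:
            findings.append((pos, f"{module}.{name}"))
    return findings


# Hand-written protocol 0 pickle that would call os.system("echo hi") if loaded.
DEMO_PAYLOAD = b"cos\nsystem\n(S'echo hi'\ntR."
```

Calling find_suspicious_globals(DEMO_PAYLOAD) reports the os.system import at position 0 without ever calling pickle.loads. A production scanner would also need to follow memoized strings and many more opcodes, which this sketch ignores.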
What It Detects
- Code execution attacks in Pickle, PyTorch, NumPy, and Joblib files
- Potential backdoor indicators — suspicious weight patterns, anomalous tensors, or hidden-code signals
- Embedded secrets — API keys, tokens, and credentials in model weights or metadata
- Network indicators — URLs, IPs, and socket usage that could enable data exfiltration
- Archive exploits — path traversal, symlink attacks in ZIP/TAR/7z files
- Unsafe ML operations — Lambda layers, custom ops, TorchScript/JIT, template injection
- Supply chain risks — tampering, license violations, suspicious configurations
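To make the "archive exploits" item above more concrete, here is a minimal sketch of a static check for path traversal and symlink entries in a ZIP file, using only the standard library. It is an illustration of the general technique, not ModelAudit's scanner; the function names and the demo archive are hypothetical.

```python
import io
import stat
import zipfile
from pathlib import PurePosixPath


def risky_zip_members(archive: bytes) -> list[tuple[str, str]]:
    """List (member name, reason) pairs without extracting any file."""
    findings = []
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        for info in zf.infolist():
            path = PurePosixPath(info.filename.replace("\\", "/"))
            # Absolute paths or ".." components can escape the extraction directory.
            if path.is_absolute() or ".." in path.parts:
                findings.append((info.filename, "path traversal"))
            # Unix mode bits live in the high 16 bits of external_attr.
            if stat.S_ISLNK(info.external_attr >> 16):
                findings.append((info.filename, "symlink"))
    return findings


def build_demo_archive() -> bytes:
    """Create an in-memory ZIP with one benign and one traversal entry."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("weights/model.bin", b"\x00")
        zf.writestr("../../etc/cron.d/job", b"* * * * * true")
    return buffer.getvalue()
```

Only the central directory is read here, so the check stays cheap even for large archives. TAR and 7z files need their own readers, which this sketch does not cover.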
Supported Formats
ModelAudit includes 44 registered scanners covering model, archive, and configuration formats:
| Format | Extensions | Risk |
|---|---|---|
| Pickle | .pkl, .pickle, .dill | HIGH |
| PyTorch | .pt, .pth, .ckpt, .bin | HIGH |
| Joblib | .joblib | HIGH |
| NumPy | .npy, .npz | HIGH |
| R Serialized | .rds, .rda, .rdata | HIGH |
| TensorFlow | .pb, .meta, SavedModel dirs | MEDIUM |
| Keras | .h5, .hdf5, .keras | MEDIUM |
| ONNX | .onnx | MEDIUM |
| CoreML | .mlmodel | LOW |
| MXNet | *-symbol.json, *-NNNN.params | LOW |
| NeMo | .nemo | MEDIUM |
| CNTK | .dnn, .cmf | MEDIUM |
| RKNN | .rknn | MEDIUM |
| Torch7 | .t7, .th, .net | HIGH |
| CatBoost | .cbm | MEDIUM |
| XGBoost | .bst, .model, .json, .ubj | MEDIUM |
| LightGBM | .lgb, .lightgbm, .model | MEDIUM |
| Llamafile | .llamafile, extensionless, .exe | MEDIUM |
| TorchServe | .mar | HIGH |
| SafeTensors | .safetensors | LOW |
| GGUF/GGML | .gguf, .ggml, .ggmf, .ggjt, .ggla, .ggsa | LOW |
| JAX/Flax | .msgpack, .flax, .orbax, .jax, .checkpoint, .orbax-checkpoint | LOW |
| TFLite | .tflite | LOW |
| ExecuTorch | .ptl, .pte | LOW |
| TensorRT | .engine, .plan, .trt | LOW |
| PaddlePaddle | .pdmodel, .pdiparams | LOW |
| OpenVINO | .xml | LOW |
| Skops | .skops | HIGH |
| PMML | .pmml | LOW |
| Compressed Wrappers | .gz, .bz2, .xz, .lz4, .zlib | MEDIUM |
Plus scanners for ZIP, TAR, 7-Zip, OCI layers, Jinja2 templates, JSON/YAML metadata, manifests, model cards, text files, and RAR recognition. RAR archives are reported as unsupported/fail-closed instead of being skipped.
View complete format documentation
Remote Sources
Scan models directly from remote registries and cloud storage:
# Hugging Face
modelaudit https://huggingface.co/gpt2
modelaudit hf://microsoft/DialoGPT-medium
# Cloud storage
modelaudit s3://bucket/model.pt
modelaudit gs://bucket/models/
# MLflow registry
modelaudit models:/MyModel/Production
# JFrog Artifactory (files and folders)
# Auth: export JFROG_API_TOKEN=...
modelaudit https://company.jfrog.io/artifactory/repo/model.pt
modelaudit https://company.jfrog.io/artifactory/repo/models/
# DVC-tracked models
modelaudit model.dvc
Authentication Environment Variables
- HF_TOKEN for private Hugging Face repositories
- AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY (and optional AWS_SESSION_TOKEN) for S3
- GOOGLE_APPLICATION_CREDENTIALS for GCS
- MLFLOW_TRACKING_URI for MLflow registry access
- JFROG_API_TOKEN or JFROG_ACCESS_TOKEN for JFrog Artifactory
- Store credentials in environment variables or a secrets manager, and never commit tokens/keys.
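A pre-flight check can catch unset credentials before a remote scan starts. The sketch below maps the environment variables listed above to the kind of source they apply to; the helper names, the source labels, and the way a target is classified (for example, treating any *.jfrog.io host as Artifactory) are assumptions for illustration and are not part of ModelAudit.

```python
from typing import Mapping, Optional
from urllib.parse import urlsplit

# Each source maps to groups of variables; one variable per group must be set.
# Variable names come from the list above; the grouping is an assumption.
CREDENTIALS = {
    "huggingface": [("HF_TOKEN",)],  # only needed for private repositories
    "s3": [("AWS_ACCESS_KEY_ID",), ("AWS_SECRET_ACCESS_KEY",)],
    "gcs": [("GOOGLE_APPLICATION_CREDENTIALS",)],
    "mlflow": [("MLFLOW_TRACKING_URI",)],
    "jfrog": [("JFROG_API_TOKEN", "JFROG_ACCESS_TOKEN")],
}


def classify_source(target: str) -> Optional[str]:
    """Guess the remote source type from a scan target (hypothetical heuristic)."""
    parts = urlsplit(target)
    host = parts.hostname or ""
    if parts.scheme == "hf" or host == "huggingface.co":
        return "huggingface"
    if parts.scheme == "s3":
        return "s3"
    if parts.scheme == "gs":
        return "gcs"
    if parts.scheme == "models":
        return "mlflow"
    if host.endswith(".jfrog.io"):
        return "jfrog"
    return None  # local path or unrecognized remote


def unset_credentials(target: str, env: Mapping[str, str]) -> list[str]:
    """Return the credential groups that have no variable set in env."""
    missing = []
    for alternatives in CREDENTIALS.get(classify_source(target), []):
        if not any(env.get(name) for name in alternatives):
            missing.append(" or ".join(alternatives))
    return missing
```

In practice the second argument would be os.environ. Passing a plain mapping keeps the check easy to test and avoids reading real secrets in unit tests.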
Installation
# Broad scanner coverage (recommended; excludes the TensorFlow runtime and platform-specific TensorRT)
pip install "modelaudit[all]"
# Core only (static scanners, pickle, NumPy, archives, manifests, metadata)
pip install modelaudit
# Specific frameworks (TensorFlow installs on Python 3.11-3.12; ONNX installs on Python 3.10-3.12)
pip install "modelaudit[tensorflow,pytorch,h5,onnx,safetensors]"
# CI/CD environments
pip install "modelaudit[all-ci]"
# On Python 3.11-3.12, add TensorFlow only when you need runtime-dependent checkpoint or weight analysis
pip install "modelaudit[all,tensorflow]"
# Docker
docker run --rm -v "$(pwd)":/app ghcr.io/promptfoo/modelaudit:latest model.pkl
The ONNX extra, including the ONNX portion of modelaudit[all], is packaged for Python 3.10-3.12.
CLI Options
Primary commands:
modelaudit [PATHS...] # Default scan command
modelaudit scan [OPTIONS] PATHS... # Explicit scan command
modelaudit scan --list-scanners # List scanner IDs for targeted scans
modelaudit metadata [OPTIONS] PATH # Extract model metadata safely (no deserialization by default)
modelaudit doctor [--show-failed] # Diagnose scanner/dependency availability
modelaudit debug [--json] [--verbose] # Environment and configuration diagnostics
modelaudit cache [stats|clear|cleanup] [OPTIONS]
Common scan options:
--format {text,json,sarif} Output format (default: auto-detected)
--output FILE Write results to file
--strict Fail on warnings, scan all file types, strict license validation
--sbom FILE Generate CycloneDX SBOM
--stream Process files one-by-one; remote downloads are deleted after scanning
--max-size SIZE Size limit (e.g., 10GB)
--timeout SECONDS Override scan timeout
--dry-run Preview what would be scanned
--verbose / --quiet Control output detail
--blacklist PATTERN Additional patterns to flag
--no-cache Disable result caching
--cache-dir DIR Set cache directory for downloads and scan results
--progress Force progress display
--scanners LIST Only run selected scanners (IDs/classes; comma-separated or repeated)
--exclude-scanner NAME Exclude a scanner from the active set (comma-separated or repeated)
--list-scanners List scanner IDs, class names, extensions, and dependencies
Targeted scanner selection:
# Discover scanner IDs and class names
modelaudit scan --list-scanners
modelaudit scan --list-scanners --format json
# Run only selected scanners
modelaudit scan ./models --scanners pickle,tf_savedmodel
modelaudit scan ./model.pkl --scanners PickleScanner
# Run the default scanner set except a noisy or slow scanner
modelaudit scan ./models --exclude-scanner weight_distribution
# For container formats, include both the container scanner and nested scanner
modelaudit scan ./archive.zip --scanners zip,pickle
--scanners starts from an explicit allowlist. --exclude-scanner subtracts scanners from either that allowlist or the default scanner set. Scanner selection is reflected in JSON output under scanner_selection.
For remote folders, ModelAudit narrows downloads by selected scanner extensions when safe, and keeps filtering conservative for container or header-routed scanners to avoid dropping extension-spoofed artifacts before scanning.
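The allowlist and exclusion semantics described above amount to simple set arithmetic. The following sketch models them in Python; it is a reading of the documented behavior, not ModelAudit's code, and the function names and the short default list are hypothetical.

```python
from typing import Iterable, Optional


def parse_selection(values: Iterable[str]) -> list[str]:
    """Flatten comma-separated and repeated option values into scanner names."""
    names = []
    for value in values:
        names.extend(part.strip() for part in value.split(",") if part.strip())
    return names


def resolve_scanners(
    default_set: Iterable[str],
    allowlist: Optional[Iterable[str]] = None,
    excluded: Optional[Iterable[str]] = None,
) -> list[str]:
    """Start from the allowlist if given, else the defaults, then subtract exclusions."""
    active = set(allowlist) if allowlist else set(default_set)
    return sorted(active - set(excluded or ()))


# Illustrative subset of scanner IDs taken from the examples above.
DEMO_DEFAULTS = ["pickle", "tf_savedmodel", "zip", "weight_distribution"]
```

For example, resolve_scanners(DEMO_DEFAULTS, excluded=["weight_distribution"]) keeps the other three scanners, while passing allowlist=["zip", "pickle"] restricts the run to those two before any exclusions apply.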
Metadata Extraction
# Human-readable summary (safe default: no model deserialization)
modelaudit metadata model.safetensors
# Machine-readable output
modelaudit metadata ./models --format json --output metadata.json
# Focus only on security-relevant metadata fields
modelaudit metadata model.onnx --security-only
--trust-loaders enables scanner metadata loaders that may deserialize model content. Only use this on trusted artifacts in isolated environments.
Exit Codes
- 0: No security issues detected
- 1: Security issues detected
- 2: Scan errors
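These exit codes are what a CI gate would typically branch on. Below is a minimal sketch of such a gate; the enum, the function names, and the choice to treat any unlisted code as an error are assumptions for this example rather than documented behavior.

```python
import subprocess
from enum import IntEnum


class ScanExit(IntEnum):
    """Exit codes as listed above."""
    CLEAN = 0
    ISSUES_FOUND = 1
    SCAN_ERROR = 2


def gate_decision(returncode: int) -> str:
    """Map an exit code to a pipeline action."""
    if returncode == ScanExit.CLEAN:
        return "pass"
    if returncode == ScanExit.ISSUES_FOUND:
        return "block"
    # Assumption: fail closed on scan errors and on any unlisted code.
    return "error"


def scan_and_decide(path: str) -> str:
    """Run the default scan command on path and return the gate decision."""
    completed = subprocess.run(["modelaudit", path], check=False)
    return gate_decision(completed.returncode)
```

Keeping the mapping in a pure function means the gate logic can be unit-tested without ModelAudit installed; only scan_and_decide needs the CLI on PATH.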
Telemetry and Privacy
ModelAudit includes telemetry for product reliability and usage analytics.
- Collected metadata can include command usage, scan timing, scanner/file-type usage, issue severity/type aggregates, sanitized model names/references, and coarse metadata like file extension/domain.
- URL telemetry strips userinfo, query strings, and fragments from model references. Avoid putting credentials in model names, file names, or artifact paths when telemetry is enabled.
- Model files are scanned locally and ModelAudit does not upload model binary contents as telemetry events.
- Telemetry is disabled automatically in CI/test environments and in editable development installs by default.
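To show what stripping userinfo, query strings, and fragments from a model reference looks like, here is a small sketch using urllib.parse. It illustrates the idea only and is not ModelAudit's sanitization code; the function name is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_sensitive_url_parts(reference: str) -> str:
    """Drop userinfo, query string, and fragment; keep scheme, host, port, path."""
    parts = urlsplit(reference)
    netloc = ""
    if parts.netloc:
        host = parts.hostname or ""
        if ":" in host:
            # hostname strips the brackets from IPv6 literals; put them back.
            host = f"[{host}]"
        netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, "", ""))
```

Credentials embedded in the path itself would survive this kind of stripping, which is why the advice above is to keep secrets out of model names, file names, and artifact paths.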
Opt out explicitly with either environment variable:
export PROMPTFOO_DISABLE_TELEMETRY=1
# or
export NO_ANALYTICS=1
To opt in during editable/development installs:
export MODELAUDIT_TELEMETRY_DEV=1
Output Examples
# JSON for CI/CD pipelines
modelaudit model.pkl --format json --output results.json
# SARIF for code scanning platforms
modelaudit model.pkl --format sarif --output results.sarif
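When the scan is driven from a script rather than a shell step, the same flags can be assembled programmatically. The sketch below builds the command shown above and loads the JSON report if one was written. The helper names are hypothetical, and no assumption is made about the report's fields beyond it being valid JSON.

```python
import json
import subprocess
from pathlib import Path
from typing import Any, Optional

SUPPORTED_FORMATS = {"text", "json", "sarif"}


def build_scan_command(target: str, output_file: str, fmt: str = "json") -> list[str]:
    """Assemble the CLI invocation using the --format and --output options."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    return ["modelaudit", target, "--format", fmt, "--output", output_file]


def scan_to_report(target: str, output_file: str) -> tuple[int, Optional[Any]]:
    """Run a JSON scan and return (exit code, parsed report or None)."""
    completed = subprocess.run(build_scan_command(target, output_file), check=False)
    report_path = Path(output_file)
    if not report_path.exists():
        # Assumption: a failed scan may not produce an output file.
        return completed.returncode, None
    return completed.returncode, json.loads(report_path.read_text(encoding="utf-8"))
```

Passing the arguments as a list avoids shell quoting problems with unusual file names. SARIF output is also JSON, so the same loader works with fmt="sarif" if scan_to_report is adapted to pass it through.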
Troubleshooting
- Run modelaudit doctor --show-failed to list unavailable scanners and missing optional deps.
- Run modelaudit debug --json to collect environment/config diagnostics for bug reports.
- Use modelaudit cache cleanup --max-age 30 to remove stale cache entries safely.
- If pip installs an older release, verify Python is supported (python --version; ModelAudit supports Python 3.10-3.13).
- For additional troubleshooting and cloud auth guidance, see:
Documentation
- Full docs: setup, configuration, and advanced usage
- Usage examples: CI/CD integration, remote scanning, SBOM generation
- Supported formats: detailed scanner documentation
- Support policy: supported Python/OS versions and maintenance policy
- Security model and limitations: what ModelAudit does and does not guarantee
- Compatibility matrix: file formats vs optional dependencies
- Scanner selection: targeted scanner allowlists and exclusions
- Metadata extraction guide: safe metadata workflows and --trust-loaders guidance
- Offline/air-gapped guide: secure operation without internet access
- Troubleshooting: run modelaudit doctor --show-failed to check scanner availability
License
MIT License — see LICENSE for details.