Security scanner for AI/ML model files
TensorTrap
Security scanner for AI/ML model files. Detect malicious code in pickle, safetensors, and GGUF files before loading them.
Why TensorTrap?
AI model files can contain executable code. Pickle files in particular can run arbitrary Python when loaded. TensorTrap analyzes model files without executing them, identifying dangerous patterns before they can harm your system.
Key statistics:
- 83.5% of Hugging Face models use pickle-based formats (arbitrary code execution risk)
- 2.1 billion monthly downloads from Hugging Face alone
- 100+ confirmed malicious models discovered on public repositories
Installation
pip install tensortrap
For development:
pip install "tensortrap[dev]"
Usage
Scan a single file:
tensortrap scan model.safetensors
Scan a directory:
tensortrap scan ./models/
Output as JSON (for tooling integration):
tensortrap scan model.pkl --json
Show file info without full scan:
tensortrap info model.safetensors
CLI Options
tensortrap scan <path> [OPTIONS]
Options:
-r, --recursive / -R, --no-recursive Scan directories recursively (default: recursive)
-j, --json Output results as JSON to console
-v, --verbose Show detailed output including info-level findings
--no-hash Skip computing file hashes
--report / --no-report Generate report files (default: enabled for directories)
-o, --report-dir PATH Directory to save reports (default: current directory)
-f, --report-formats TEXT Comma-separated formats: txt,json,html,csv (default: all)
Report Generation
When scanning directories, TensorTrap automatically generates reports in multiple formats:
# Scan with all report formats (default)
tensortrap scan ./models/
# Disable report generation
tensortrap scan ./models/ --no-report
# Specific formats only
tensortrap scan ./models/ -f txt,html
# Custom output directory
tensortrap scan ./models/ -o ./reports/
Reports are saved with timestamps: tensortrap_report_YYYYMMDD_HHMMSS.{txt,json,html,csv}
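The timestamped naming scheme makes it straightforward to find the most recent report from a script. The following is a minimal sketch, assuming only the file name pattern quoted above; the helper names `report_timestamp` and `latest_report` are hypothetical and not part of TensorTrap.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional

# Matches the naming scheme quoted above: tensortrap_report_YYYYMMDD_HHMMSS.<ext>
REPORT_NAME = re.compile(r"^tensortrap_report_(\d{8}_\d{6})\.(txt|json|html|csv)$")


def report_timestamp(filename: str) -> Optional[datetime]:
    """Return the timestamp encoded in a report file name, or None if it does not match."""
    match = REPORT_NAME.match(Path(filename).name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
    except ValueError:
        # Digits were present but do not form a real date/time.
        return None


def latest_report(report_dir: str, extension: str = "json") -> Optional[Path]:
    """Return the newest report of the given type in report_dir, or None if there is none."""
    stamped = []
    for path in Path(report_dir).glob(f"tensortrap_report_*.{extension}"):
        ts = report_timestamp(path.name)
        if ts is not None:
            stamped.append((ts, path))
    if not stamped:
        return None
    return max(stamped, key=lambda pair: pair[0])[1]
```

For example, `latest_report("./reports/")` would return the newest JSON report written by a scan that used `-o ./reports/`.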
Supported Formats
| Format | Extensions | Risk Level |
|---|---|---|
| Pickle | .pkl, .pickle, .pt, .pth, .bin, .ckpt, .joblib | High (code execution) |
| PyTorch ZIP | .pt, .pth (ZIP archives) | High (internal pickles) |
| Safetensors | .safetensors | Low (data only) |
| GGUF | .gguf | Medium (template injection) |
| ONNX | .onnx | Medium (path traversal) |
| Keras/HDF5 | .h5, .hdf5, .keras | High (Lambda layers, pickle) |
| YAML | .yaml, .yml | Medium (unsafe deserialization) |
| ComfyUI | .json (workflows) | High (eval nodes) |
| Images | .png, .jpg, .gif, .svg, .webp, .bmp, .tiff, .ico | Medium (polyglot attacks) |
| Video | .mp4, .mkv, .avi, .mov, .webm, .flv, .wmv | Medium (polyglot attacks) |
What We Detect
Pickle Files
- Dangerous imports: os, subprocess, socket, builtins, sys, etc.
- Code execution opcodes: REDUCE, BUILD, GLOBAL, INST, NEWOBJ
- Known malicious patterns: os.system, subprocess.Popen, eval, exec
- Nested pickle attacks: pickle importing pickle
Safetensors Files
- Oversized headers: Potential DoS attacks
- Embedded payloads: Pickle data hidden in metadata
- Suspicious patterns: Code snippets in metadata
- Invalid structure: Malformed headers, bad tensor offsets
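The structural checks above can be sketched against the published safetensors layout: an 8-byte little-endian header length, a UTF-8 JSON header, then the tensor data. The example below is an illustration under that assumption rather than TensorTrap's own code; the size cap and the names `check_safetensors` and `build_safetensors` are hypothetical.

```python
import json
import struct
from typing import List

# Assumed cap for illustration; pick a limit that suits your environment.
MAX_HEADER_BYTES = 100 * 1024 * 1024


def check_safetensors(data: bytes) -> List[str]:
    """Return a list of structural problems found in a safetensors byte string."""
    if len(data) < 8:
        return ["file too short for the 8-byte header length"]
    (header_len,) = struct.unpack("<Q", data[:8])
    if header_len > MAX_HEADER_BYTES:
        return [f"oversized header: {header_len} bytes"]
    if 8 + header_len > len(data):
        return ["header length exceeds file size"]
    try:
        header = json.loads(data[8:8 + header_len].decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return ["header is not valid UTF-8 JSON"]
    if not isinstance(header, dict):
        return ["header is not a JSON object"]

    problems = []
    payload_len = len(data) - 8 - header_len
    for name, info in header.items():
        if name == "__metadata__":
            continue  # free-form string metadata, not a tensor entry
        offsets = info.get("data_offsets") if isinstance(info, dict) else None
        if not (
            isinstance(offsets, list)
            and len(offsets) == 2
            and all(isinstance(o, int) for o in offsets)
        ):
            problems.append(f"tensor {name!r}: missing or malformed data_offsets")
            continue
        begin, end = offsets
        # Offsets are relative to the start of the data region after the header.
        if not (0 <= begin <= end <= payload_len):
            problems.append(f"tensor {name!r}: offsets {begin}-{end} outside data region")
    return problems


def build_safetensors(header: dict, payload: bytes) -> bytes:
    """Assemble a tiny safetensors-style byte string for testing the checker."""
    raw = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(raw)) + raw + payload


GOOD = build_safetensors({"w": {"dtype": "F32", "shape": [1], "data_offsets": [0, 4]}}, b"\x00" * 4)
BAD_OFFSETS = build_safetensors({"w": {"dtype": "F32", "shape": [1], "data_offsets": [0, 400]}}, b"\x00" * 4)
HUGE_HEADER = struct.pack("<Q", 2 ** 40)
```

Checking the declared length before reading the header matters: it avoids allocating memory for a header size an attacker controls.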
GGUF Files
- Invalid format: Wrong magic number, unknown versions
- Jinja template injection: CVE-2024-34359 patterns
- Anomalous structure: Excessive tensor/metadata counts
- Suspicious metadata: Code patterns in metadata values
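As an illustration of the format and structure checks, the sketch below reads only the fixed part of a GGUF header: the `GGUF` magic bytes, a 32-bit version, and the tensor and metadata counts. It assumes the little-endian layout of the public GGUF specification (32-bit counts in version 1, 64-bit counts from version 2); the thresholds and the name `check_gguf_header` are hypothetical and not taken from TensorTrap.

```python
import struct
from typing import List

GGUF_MAGIC = b"GGUF"
KNOWN_VERSIONS = {1, 2, 3}
# Assumed thresholds for illustration only.
MAX_TENSORS = 1_000_000
MAX_METADATA_KV = 1_000_000


def check_gguf_header(data: bytes) -> List[str]:
    """Return findings for the fixed-size portion of a GGUF header."""
    if data[:4] != GGUF_MAGIC:
        return ["wrong magic number"]
    if len(data) < 8:
        return ["truncated header"]
    (version,) = struct.unpack_from("<I", data, 4)
    if version not in KNOWN_VERSIONS:
        return [f"unknown version: {version}"]

    # Version 1 stored counts as 32-bit values; later versions use 64-bit.
    counts_format = "<II" if version == 1 else "<QQ"
    if len(data) < 8 + struct.calcsize(counts_format):
        return ["truncated header"]
    tensor_count, kv_count = struct.unpack_from(counts_format, data, 8)

    findings = []
    if tensor_count > MAX_TENSORS:
        findings.append(f"excessive tensor count: {tensor_count}")
    if kv_count > MAX_METADATA_KV:
        findings.append(f"excessive metadata count: {kv_count}")
    return findings


SAMPLE_HEADER = GGUF_MAGIC + struct.pack("<IQQ", 3, 291, 24)
SAMPLE_BAD_VERSION = GGUF_MAGIC + struct.pack("<IQQ", 99, 1, 1)
SAMPLE_EXCESSIVE = GGUF_MAGIC + struct.pack("<IQQ", 3, 2 ** 40, 1)
```

Template injection checks would additionally require parsing the metadata key/value section, which this sketch does not attempt.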
ONNX Files
- Path traversal: CVE-2024-27318, CVE-2024-5187 via external_data
- Suspicious external references: Access to system files
- Arbitrary file read/write: Via malicious external data paths
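The path traversal issue comes down to an external data location that escapes the model's own directory. The sketch below shows one way to classify such a location string; it does not parse ONNX files, and the function name `is_unsafe_external_location` is hypothetical. It assumes the location has already been extracted from a tensor's external_data entry.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_unsafe_external_location(location: str) -> bool:
    """Flag locations that are absolute or that climb out of the model directory."""
    posix = PurePosixPath(location)
    windows = PureWindowsPath(location)
    # Absolute paths, drive letters, and UNC shares all point outside the model directory.
    if posix.is_absolute() or windows.is_absolute() or windows.drive:
        return True
    # Any ".." component (with either separator style) can escape the directory.
    parts = set(posix.parts) | set(windows.parts)
    return ".." in parts
```

Both path flavours are checked because a model file may have been produced on a different operating system than the one scanning it.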
Keras/HDF5 Files
- Lambda layers: Arbitrary code execution on load
- Embedded pickle: Pickle-serialized custom objects
- Suspicious config patterns: eval(), exec(), os.system()
YAML Configuration Files
- Unsafe deserialization: !!python/object tags (CVE-2025-50460)
- Code execution: subprocess, os.system patterns
- Dynamic imports: __import__ patterns
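Unsafe YAML tags can be spotted in the raw text without invoking a YAML loader. The following is a minimal sketch, assuming the short `!!python/...` tag form used by PyYAML; the regular expression and the name `find_python_tags` are illustrative and do not cover the long `tag:yaml.org,2002:python/...` spelling.

```python
import re
from typing import List

# Matches tags such as !!python/object:pkg.Class or !!python/object/apply:os.system
PYTHON_TAG = re.compile(r"!!python/[\w/]+(?::[\w.]+)?")


def find_python_tags(text: str) -> List[str]:
    """Return every Python-specific YAML tag found in the text."""
    return PYTHON_TAG.findall(text)


SAMPLE_UNSAFE = "cmd: !!python/object/apply:os.system ['id']\n"
SAMPLE_SAFE = "learning_rate: 0.001\nlayers: [64, 64]\n"
```

Independently of scanning, loading configuration with `yaml.safe_load` rather than an unsafe loader avoids constructing arbitrary Python objects.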
ComfyUI Workflows
- Vulnerable nodes: ACE_ExpressionEval, HueAdjust (CVE-2024-21576/77)
- Code execution: eval() patterns in node inputs
- Arbitrary code: Malicious workflow structures
Polyglot & Media Files (Defense-in-Depth)
- Extension mismatch: Pickle/archive disguised as image (CVE-2025-1889)
- Archive-in-image: ZIP/7z/RAR appended to valid images
- Archive-in-video: Archives appended to video files
- SVG script injection: JavaScript, onclick handlers, data URIs
- Metadata payloads: Malicious code in EXIF/XMP metadata
- Double extensions: Tricks like model.pkl.png
- Trailing data: Hidden data after image end markers
- MKV attachments: Embedded files in Matroska containers
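The trailing-data and archive-in-image checks can be illustrated for PNG, where the file ends at the IEND chunk and anything after it is extra. This is a sketch under that assumption, not TensorTrap's implementation; `png_trailing_data` and `looks_like_archive` are hypothetical names, and the sample image is only a signature plus IHDR and IEND chunks, enough to exercise the chunk walk.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Well-known archive signatures: ZIP, 7z, RAR.
ARCHIVE_SIGNATURES = (b"PK\x03\x04", b"7z\xbc\xaf\x27\x1c", b"Rar!\x1a\x07")


def png_trailing_data(data: bytes) -> bytes:
    """Return the bytes that follow the IEND chunk (empty if there are none)."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    # Each chunk is: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
    while pos + 12 <= len(data):
        (length,) = struct.unpack_from(">I", data, pos)
        chunk_type = data[pos + 4:pos + 8]
        pos += 12 + length
        if chunk_type == b"IEND":
            return data[pos:]
    raise ValueError("no IEND chunk found")


def looks_like_archive(blob: bytes) -> bool:
    """True if the blob contains a known archive signature."""
    return any(signature in blob for signature in ARCHIVE_SIGNATURES)


def _chunk(chunk_type: bytes, body: bytes) -> bytes:
    """Build one PNG chunk with a correct CRC (used only for the sample below)."""
    crc = zlib.crc32(chunk_type + body)
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


SAMPLE_PNG = (
    PNG_SIGNATURE
    + _chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    + _chunk(b"IEND", b"")
)
SAMPLE_POLYGLOT = SAMPLE_PNG + b"PK\x03\x04hidden-archive"
```

The same idea applies to other formats with a defined end marker, such as the JPEG end-of-image marker, though each needs its own parser.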
Additional Detections
- Magic byte analysis: Detects disguised pickle files (CVE-2025-1889)
- 7z archives: nullifAI bypass detection (CVE-2025-1716)
- Obfuscation: Base64, hex encoding, compression, high entropy
- PyTorch archives: Extracts and scans internal pickle files
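High entropy is a common signal for compressed or encrypted content. A minimal sketch of a Shannon entropy check over bytes follows; the 7.5 bits-per-byte threshold and the function names are assumptions for illustration, not values taken from TensorTrap.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_high_entropy(data: bytes, threshold: float = 7.5) -> bool:
    """Flag data whose entropy exceeds the (assumed) threshold."""
    return shannon_entropy(data) > threshold


UNIFORM = bytes(range(256)) * 4   # every byte value equally likely
REPETITIVE = b"A" * 1024          # a single repeated byte
```

Entropy alone produces false positives, since legitimate tensor data and compressed archives also score high, so it is best treated as one signal among several.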
Exit Codes
- 0: All files safe (no critical/high findings)
- 1: Threats detected (critical or high severity findings)
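These exit codes make it possible to gate a pipeline on the scan result. The sketch below wraps the documented `tensortrap scan <path>` command with `subprocess.run`; the wrapper names are hypothetical, and the `command_prefix` parameter exists only so the logic can be exercised without TensorTrap installed.

```python
import subprocess
import sys
from typing import Sequence


def scan_is_clean(path: str, command_prefix: Sequence[str] = ("tensortrap", "scan")) -> bool:
    """Run the scanner on path and interpret its exit code (0 = safe, 1 = threats)."""
    result = subprocess.run([*command_prefix, path], capture_output=True, text=True)
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # Any other code is not documented above, so treat it as a failure to scan.
    raise RuntimeError(f"unexpected exit code {result.returncode}: {result.stderr.strip()}")


def gate(path: str) -> int:
    """Return a process exit status suitable for a CI step."""
    if scan_is_clean(path):
        return 0
    print(f"Refusing to use {path}: scanner reported critical or high findings", file=sys.stderr)
    return 1
```

Calling `gate("./models/")` from a CI script would fail the step whenever the scan reports a critical or high finding. If the `tensortrap` executable is not on the PATH, `subprocess.run` raises `FileNotFoundError`.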
Example Output
Collecting files from ./models/...
Found 15 model file(s)
⠋ Scanning: model.pkl ━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 15/15 0:00:02
model.pkl (pickle) - THREATS DETECTED
Severity Finding Action
!! CRITICAL Known malicious call: os.system DO NOT LOAD. Delete this file immediately.
* MEDIUM REDUCE opcode found 1 time(s) Normal for pickle models. Convert to safetensors.
Scanned 15 file(s): 14 safe, 1 with issues
1 critical, 1 medium
Reports saved:
TXT: ./tensortrap_report_20251211_120000.txt
JSON: ./tensortrap_report_20251211_120000.json
HTML: ./tensortrap_report_20251211_120000.html
CSV: ./tensortrap_report_20251211_120000.csv
JSON Output
{
"report_type": "tensortrap_security_scan",
"scan_target": "./models/",
"scan_date": "2025-12-11T12:00:00",
"summary": {
"total_files": 1,
"safe_files": 0,
"files_with_issues": 1,
"findings_by_severity": {"critical": 1, "medium": 1}
},
"results": [
{
"filepath": "model.pkl",
"format": "pickle",
"is_safe": false,
"max_severity": "critical",
"findings": [
{
"severity": "critical",
"message": "Known malicious call: os.system",
"location": 0,
"details": {"module": "os", "function": "system"},
"recommendation": "DO NOT LOAD. Delete this file immediately."
}
],
"scan_time_ms": 1.23,
"file_size": 256,
"file_hash": "abc123..."
}
]
}
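The JSON structure above is easy to consume from other tooling. As a minimal sketch that assumes only the fields shown in this example, the following collects the findings that should block a model from being loaded; the function name `blocking_findings` and the trimmed sample report are illustrative.

```python
import json
from typing import List, Sequence, Tuple

# Trimmed copy of the example report above, used as sample input.
SAMPLE_REPORT = json.loads("""
{
  "report_type": "tensortrap_security_scan",
  "summary": {"total_files": 1, "safe_files": 0, "files_with_issues": 1},
  "results": [
    {
      "filepath": "model.pkl",
      "format": "pickle",
      "is_safe": false,
      "max_severity": "critical",
      "findings": [
        {"severity": "critical", "message": "Known malicious call: os.system"},
        {"severity": "medium", "message": "REDUCE opcode found 1 time(s)"}
      ]
    }
  ]
}
""")


def blocking_findings(
    report: dict, blocking: Sequence[str] = ("critical", "high")
) -> List[Tuple[str, str, str]]:
    """Return (filepath, severity, message) for every finding at a blocking severity."""
    blocked = []
    for result in report.get("results", []):
        for finding in result.get("findings", []):
            if finding.get("severity") in blocking:
                blocked.append(
                    (result.get("filepath", ""), finding["severity"], finding.get("message", ""))
                )
    return blocked
```

In practice the report would come from `json.load` on a saved report file or from the console output of `tensortrap scan <path> --json`.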
Defense in Depth
TensorTrap focuses on AI model file security. For comprehensive protection of your AI workflow, we recommend combining TensorTrap with these complementary tools:
Recommended Security Stack
| Tool | Purpose | Install |
|---|---|---|
| TensorTrap | AI model file scanning | pip install tensortrap |
| Stego | Steganography detection | See stego-toolkit |
| YARA | Pattern-based malware detection | apt install yara / yara.readthedocs.io |
| RKHunter | Rootkit detection | apt install rkhunter |
| ClamAV | General antivirus | apt install clamav |
What Each Tool Catches
┌─────────────────────────────────────────────────────────────────┐
│ AI Workflow Security │
├─────────────────────────────────────────────────────────────────┤
│ Downloaded Models │ Generated Output │ System Level │
│ ───────────────── │ ──────────────── │ ──────────── │
│ TensorTrap ✓ │ Stego ✓ │ RKHunter ✓ │
│ • Pickle exploits │ • Hidden data │ • Rootkits │
│ • Format attacks │ • Steganography │ • Backdoors │
│ • Polyglot files │ │ │
│ │ │ ClamAV ✓ │
│ YARA ✓ │ │ • Known malware │
│ • Known signatures │ │ • Viruses │
└─────────────────────────────────────────────────────────────────┘
Quick Setup (Linux)
# Install TensorTrap
pip install tensortrap
# Install system tools
sudo apt update
sudo apt install yara rkhunter clamav clamav-daemon
# Initialize ClamAV database
sudo freshclam
# Run comprehensive scan
tensortrap scan ~/Models ~/Downloads # AI models + polyglot detection
yara -r /path/to/rules ~/Downloads # Pattern matching
rkhunter --check # System integrity
clamscan -r ~/Downloads # General malware
Contributing
Contributions welcome! See CONTRIBUTING.md for guidelines.
# Clone the repo
git clone https://github.com/realmarauder/TensorTrap.git
cd TensorTrap
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run linting
ruff check src/
mypy src/
License
MIT License - see LICENSE.
About
TensorTrap is developed by M2 Dynamics, specializing in AI/ML security consulting.