TensorTrap

Security scanner for AI/ML model files. Detect malicious code in pickle, safetensors, and GGUF files before loading them.

Why TensorTrap?

AI model files can contain executable code. Pickle files in particular can run arbitrary Python when loaded. TensorTrap analyzes model files without executing them, identifying dangerous patterns before they can harm your system.
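To make the risk concrete, here is a minimal sketch using only the Python standard library. The `Payload` class and its harmless `print` call are hypothetical stand-ins, and this is not TensorTrap's implementation. It only illustrates that a pickle stream can be disassembled with `pickletools.genops` without ever being loaded.

```python
import pickle
import pickletools


class Payload:
    """Hypothetical object: unpickling it would call print()."""

    def __reduce__(self):
        # pickle stores this callable and its arguments, then invokes
        # the callable when the data is loaded.
        return (print, ("code ran during load",))


def opcode_names(data: bytes) -> list[str]:
    """Disassemble a pickle stream without loading it."""
    return [opcode.name for opcode, _arg, _pos in pickletools.genops(data)]


blob = pickle.dumps(Payload())   # serializing does not run the payload
print(opcode_names(blob))        # the listing includes REDUCE
```

The message is never printed here because `pickle.loads` is never called; only the opcode names are inspected.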

Key statistics:

  • 83.5% of Hugging Face models use pickle-based formats (arbitrary code execution risk)
  • 2.1 billion monthly downloads from Hugging Face alone
  • 100+ confirmed malicious models discovered on public repositories

Installation

pip install tensortrap

For development:

pip install "tensortrap[dev]"

Usage

Scan a single file:

tensortrap scan model.safetensors

Scan a directory:

tensortrap scan ./models/

Output as JSON (for tooling integration):

tensortrap scan model.pkl --json

Show file info without full scan:

tensortrap info model.safetensors

CLI Options

tensortrap scan <path> [OPTIONS]

Options:
  -r, --recursive / -R, --no-recursive  Scan directories recursively (default: recursive)
  -j, --json                            Output results as JSON to console
  -v, --verbose                         Show detailed output including info-level findings
  --no-hash                             Skip computing file hashes
  --report / --no-report                Generate report files (default: enabled for directories)
  -o, --report-dir PATH                 Directory to save reports (default: current directory)
  -f, --report-formats TEXT             Comma-separated formats: txt,json,html,csv (default: all)

Report Generation

When scanning directories, TensorTrap automatically generates reports in multiple formats:

# Scan with all report formats (default)
tensortrap scan ./models/

# Disable report generation
tensortrap scan ./models/ --no-report

# Specific formats only
tensortrap scan ./models/ -f txt,html

# Custom output directory
tensortrap scan ./models/ -o ./reports/

Reports are saved with timestamps: tensortrap_report_YYYYMMDD_HHMMSS.{txt,json,html,csv}
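If you post-process reports from a script, the timestamp in the filename can be parsed back into a `datetime`. The sketch below assumes only the naming pattern shown above; the helper names `report_timestamp` and `latest_report` are illustrative and not part of TensorTrap.

```python
import re
from datetime import datetime
from typing import Optional

# Matches tensortrap_report_YYYYMMDD_HHMMSS.{txt,json,html,csv}
REPORT_NAME = re.compile(r"tensortrap_report_(\d{8}_\d{6})\.(txt|json|html|csv)$")


def report_timestamp(filename: str) -> Optional[datetime]:
    """Return the timestamp encoded in a report filename, or None."""
    match = REPORT_NAME.search(filename)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")


def latest_report(filenames: list[str]) -> Optional[str]:
    """Pick the most recent report among the given filenames."""
    dated = [(report_timestamp(name), name) for name in filenames]
    dated = [(stamp, name) for stamp, name in dated if stamp is not None]
    return max(dated)[1] if dated else None
```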

Supported Formats

Format       Extensions                                        Risk Level
Pickle       .pkl, .pickle, .pt, .pth, .bin, .ckpt, .joblib    High (code execution)
PyTorch ZIP  .pt, .pth (ZIP archives)                          High (internal pickles)
Safetensors  .safetensors                                      Low (data only)
GGUF         .gguf                                             Medium (template injection)
ONNX         .onnx                                             Medium (path traversal)
Keras/HDF5   .h5, .hdf5, .keras                                High (Lambda layers, pickle)
YAML         .yaml, .yml                                       Medium (unsafe deserialization)
ComfyUI      .json (workflows)                                 High (eval nodes)
Images       .png, .jpg, .gif, .svg, .webp, .bmp, .tiff, .ico  Medium (polyglot attacks)
Video        .mp4, .mkv, .avi, .mov, .webm, .flv, .wmv         Medium (polyglot attacks)

What We Detect

Pickle Files

  • Dangerous imports: os, subprocess, socket, builtins, sys, etc.
  • Code execution opcodes: REDUCE, BUILD, GLOBAL, INST, NEWOBJ
  • Known malicious patterns: os.system, subprocess.Popen, eval, exec
  • Nested pickle attacks: pickle importing pickle
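The import check above can be approximated with the standard library alone. This is a rough sketch, not TensorTrap's scanner: the denylist is a short illustrative sample, the function names are hypothetical, and the `STACK_GLOBAL` handling is a heuristic that assumes the module and attribute names are the two most recent string arguments in the stream.

```python
import pickle
import pickletools

# Illustrative denylist; os.system pickles as "posix" or "nt" depending on OS.
DANGEROUS_MODULES = {"os", "posix", "nt", "subprocess", "socket", "builtins", "sys"}


def find_imports(data: bytes) -> list[tuple[str, str]]:
    """Return (module, name) pairs referenced by GLOBAL or STACK_GLOBAL."""
    found = []
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            found.append((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found


def flagged(data: bytes) -> list[tuple[str, str]]:
    """Keep only imports whose top-level module is on the denylist."""
    return [
        (module, name)
        for module, name in find_imports(data)
        if module.split(".")[0] in DANGEROUS_MODULES
    ]
```

For example, `flagged(b"cos\nsystem\n(S'echo hi'\ntR.")` reports the `os.system` reference in a hand-written protocol 0 pickle without executing it.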

Safetensors Files

  • Oversized headers: Potential DoS attacks
  • Embedded payloads: Pickle data hidden in metadata
  • Suspicious patterns: Code snippets in metadata
  • Invalid structure: Malformed headers, bad tensor offsets
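The structural checks are possible because a safetensors file starts with an 8-byte little-endian header length followed by a JSON header. The sketch below shows the idea under stated assumptions: the size limit and the function name `check_safetensors` are illustrative and are not TensorTrap's actual values or API.

```python
import json
import struct

MAX_HEADER_BYTES = 100 * 1024 * 1024  # assumed limit for this sketch


def check_safetensors(data: bytes) -> list[str]:
    """Return structural problems found in a safetensors byte string."""
    if len(data) < 8:
        return ["file too short for the 8-byte header length"]
    (header_len,) = struct.unpack("<Q", data[:8])
    if header_len > MAX_HEADER_BYTES:
        return [f"oversized header: {header_len} bytes"]
    if 8 + header_len > len(data):
        return ["header length exceeds file size"]
    try:
        header = json.loads(data[8 : 8 + header_len])
    except ValueError:  # covers invalid UTF-8 and invalid JSON
        return ["header is not valid JSON"]
    if not isinstance(header, dict):
        return ["header is not a JSON object"]

    problems = []
    payload_len = len(data) - 8 - header_len
    for name, entry in header.items():
        if name == "__metadata__":
            continue  # free-form metadata, not a tensor entry
        offsets = entry.get("data_offsets") if isinstance(entry, dict) else None
        if (
            not isinstance(offsets, list)
            or len(offsets) != 2
            or not all(isinstance(o, int) for o in offsets)
        ):
            problems.append(f"{name}: missing or malformed data_offsets")
            continue
        begin, end = offsets
        # Offsets are relative to the first byte after the header.
        if not 0 <= begin <= end <= payload_len:
            problems.append(f"{name}: data_offsets {offsets} out of bounds")
    return problems
```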

GGUF Files

  • Invalid format: Wrong magic number, unknown versions
  • Jinja template injection: CVE-2024-34359 patterns
  • Anomalous structure: Excessive tensor/metadata counts
  • Suspicious metadata: Code patterns in metadata values
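As an illustration of the format and structure checks, the sketch below reads the fixed part of a GGUF header: the `GGUF` magic, a 32-bit version, and two 64-bit counts. It assumes little-endian files and versions 2 and 3 only (version 1 used 32-bit counts and is not handled). The thresholds and the function name are made up for this example and are not TensorTrap's.

```python
import struct

GGUF_MAGIC = b"GGUF"
KNOWN_VERSIONS = {2, 3}      # assumption: v1 is out of scope for this sketch
MAX_TENSORS = 100_000        # illustrative thresholds
MAX_METADATA_KV = 100_000


def check_gguf_header(data: bytes) -> list[str]:
    """Return problems found in the first 24 bytes of a GGUF file."""
    if len(data) < 24:
        return ["file too short for a GGUF header"]
    if data[:4] != GGUF_MAGIC:
        return [f"wrong magic number: {data[:4]!r}"]
    # uint32 version, uint64 tensor count, uint64 metadata key/value count
    version, tensor_count, kv_count = struct.unpack("<IQQ", data[4:24])
    problems = []
    if version not in KNOWN_VERSIONS:
        problems.append(f"unknown version: {version}")
    if tensor_count > MAX_TENSORS:
        problems.append(f"excessive tensor count: {tensor_count}")
    if kv_count > MAX_METADATA_KV:
        problems.append(f"excessive metadata count: {kv_count}")
    return problems
```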

ONNX Files

  • Path traversal: CVE-2024-27318, CVE-2024-5187 via external_data
  • Suspicious external references: Access to system files
  • Arbitrary file read/write: Via malicious external data paths
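ONNX tensors stored outside the model file carry a `location` string in their `external_data` entries, and that string is where a traversal payload would sit. The sketch below checks only the string itself and does not use the `onnx` package; the function name is hypothetical and the rules are a simplified approximation, not TensorTrap's logic.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_suspicious_location(location: str) -> bool:
    """Flag external-data locations that could escape the model directory."""
    # Interpret the string under both path conventions, since the model
    # may have been crafted on a different OS than the one loading it.
    for flavour in (PurePosixPath, PureWindowsPath):
        path = flavour(location)
        if path.anchor:            # absolute path, drive letter, or UNC share
            return True
        if ".." in path.parts:     # parent-directory traversal
            return True
    return False
```

With these rules, `weights.bin` and `tensors/layer0.bin` pass, while `../../etc/passwd`, `/etc/passwd`, and `C:\Windows\system.ini` are flagged.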

Keras/HDF5 Files

  • Lambda layers: Arbitrary code execution on load
  • Embedded pickle: Pickle-serialized custom objects
  • Suspicious config patterns: eval(), exec(), os.system()
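Keras model configurations are nested JSON objects in which each layer has a `class_name` and a `config`. A Lambda-layer check can therefore be sketched as a recursive walk over the parsed config. The function name and the `$`-rooted path notation below are illustrative choices, not TensorTrap's output format.

```python
def find_lambda_layers(config: object, path: str = "$") -> list[str]:
    """Return the locations of every Lambda layer in a parsed model config."""
    hits = []
    if isinstance(config, dict):
        if config.get("class_name") == "Lambda":
            hits.append(path)
        for key, value in config.items():
            hits.extend(find_lambda_layers(value, f"{path}.{key}"))
    elif isinstance(config, list):
        for index, item in enumerate(config):
            hits.extend(find_lambda_layers(item, f"{path}[{index}]"))
    return hits
```

The input is expected to be the result of `json.loads` on the model configuration, so no Keras import and no model loading is involved.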

YAML Configuration Files

  • Unsafe deserialization: !!python/object tags (CVE-2025-50460)
  • Code execution: subprocess, os.system patterns
  • Dynamic imports: __import__ patterns
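Python-specific YAML tags can be spotted in the raw text without parsing the document. The pattern below is a simplified sketch that covers only the short `!!python/...` form; the regular expression and function name are assumptions for illustration, not TensorTrap's rules.

```python
import re

# Matches tags such as !!python/object:pkg.Class or !!python/object/apply:os.system
PYTHON_TAG = re.compile(r"!!python/[\w/]+(?::[\w.]+)?")


def find_python_tags(text: str) -> list[str]:
    """Return every Python-specific YAML tag found in the raw text."""
    return PYTHON_TAG.findall(text)
```

Because the check works on text, the file is never handed to a YAML loader that might construct the tagged objects.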

ComfyUI Workflows

  • Vulnerable nodes: ACE_ExpressionEval, HueAdjust (CVE-2024-21576/77)
  • Code execution: eval() patterns in node inputs
  • Arbitrary code: Malicious workflow structures

Polyglot & Media Files (Defense-in-Depth)

  • Extension mismatch: Pickle/archive disguised as image (CVE-2025-1889)
  • Archive-in-image: ZIP/7z/RAR appended to valid images
  • Archive-in-video: Archives appended to video files
  • SVG script injection: JavaScript, onclick handlers, data URIs
  • Metadata payloads: Malicious code in EXIF/XMP metadata
  • Double extensions: Tricks like model.pkl.png
  • Trailing data: Hidden data after image end markers
  • MKV attachments: Embedded files in Matroska containers
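The trailing-data and archive-in-image checks can be illustrated for PNG, where the file ends with an `IEND` chunk and anything after it is not image data. This sketch walks the chunk list without validating CRCs. The function names are hypothetical and the approach is a simplified stand-in for whatever TensorTrap does internally.

```python
import struct
from typing import Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
ARCHIVE_MAGICS = {
    b"PK\x03\x04": "zip",
    b"7z\xbc\xaf\x27\x1c": "7z",
    b"Rar!\x1a\x07": "rar",
}


def png_trailing_data(data: bytes) -> Optional[bytes]:
    """Return the bytes after the IEND chunk, or None if not a parseable PNG."""
    if not data.startswith(PNG_SIGNATURE):
        return None
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        (length,) = struct.unpack(">I", data[pos : pos + 4])
        chunk_type = data[pos + 4 : pos + 8]
        pos += 8 + length + 4  # chunk header + payload + CRC
        if chunk_type == b"IEND":
            return data[pos:]
    return None  # no IEND chunk found


def embedded_archive(trailing: bytes) -> Optional[str]:
    """Name the first archive format whose magic bytes appear in the data."""
    for magic, name in ARCHIVE_MAGICS.items():
        if magic in trailing:
            return name
    return None
```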

Additional Detections

  • Magic byte analysis: Detects disguised pickle files (CVE-2025-1889)
  • 7z archives: nullifAI bypass detection (CVE-2025-1716)
  • Obfuscation: Base64, hex encoding, compression, high entropy
  • PyTorch archives: Extracts and scans internal pickle files
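High entropy is a common signal for compressed or encrypted content. A minimal sketch of the measurement is Shannon entropy over byte frequencies, which ranges from 0 (one repeated byte) to 8 bits per byte (uniformly distributed bytes). The function name is illustrative, and any threshold applied to the result would be a tuning choice not specified here.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of the data in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )
```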

Exit Codes

  • 0: All files safe (no critical/high findings)
  • 1: Threats detected (critical or high severity findings)
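These exit codes make it straightforward to gate a pipeline on the scan result. The wrapper below is a sketch: `scan_passes` is a hypothetical helper, and it treats any non-zero exit code as a failure, which is stricter than the two codes documented above.

```python
import subprocess


def scan_passes(path: str, command: tuple[str, ...] = ("tensortrap", "scan")) -> bool:
    """Return True when the scanner exits with code 0 for the given path."""
    result = subprocess.run([*command, path], capture_output=True, text=True)
    return result.returncode == 0
```

A loader script could call `scan_passes("model.pkl")` and refuse to open the file when it returns `False`. The `command` parameter exists only so the wrapper can be exercised without TensorTrap installed.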

Example Output

Collecting files from ./models/...
Found 15 model file(s)

⠋ Scanning: model.pkl ━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 15/15 0:00:02

model.pkl (pickle) - THREATS DETECTED

   Severity   Finding                                    Action
   !! CRITICAL  Known malicious call: os.system           DO NOT LOAD. Delete this file immediately.
   *  MEDIUM    REDUCE opcode found 1 time(s)             Normal for pickle models. Convert to safetensors.

Scanned 15 file(s): 14 safe, 1 with issues
  1 critical, 1 medium

Reports saved:
  TXT:  ./tensortrap_report_20251211_120000.txt
  JSON: ./tensortrap_report_20251211_120000.json
  HTML: ./tensortrap_report_20251211_120000.html
  CSV:  ./tensortrap_report_20251211_120000.csv

JSON Output

{
  "report_type": "tensortrap_security_scan",
  "scan_target": "./models/",
  "scan_date": "2025-12-11T12:00:00",
  "summary": {
    "total_files": 1,
    "safe_files": 0,
    "files_with_issues": 1,
    "findings_by_severity": {"critical": 1, "medium": 1}
  },
  "results": [
    {
      "filepath": "model.pkl",
      "format": "pickle",
      "is_safe": false,
      "max_severity": "critical",
      "findings": [
        {
          "severity": "critical",
          "message": "Known malicious call: os.system",
          "location": 0,
          "details": {"module": "os", "function": "system"},
          "recommendation": "DO NOT LOAD. Delete this file immediately."
        }
      ],
      "scan_time_ms": 1.23,
      "file_size": 256,
      "file_hash": "abc123..."
    }
  ]
}
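For tooling integration, the report can be consumed with the standard `json` module. The sketch below relies only on the fields shown in the example above. The function name is hypothetical, and the `"high"` severity string is an assumption based on the exit-code description.

```python
import json


def serious_findings(report_text: str) -> list[tuple[str, str, str]]:
    """Return (filepath, severity, message) for critical and high findings."""
    report = json.loads(report_text)
    hits = []
    for result in report.get("results", []):
        for finding in result.get("findings", []):
            severity = finding.get("severity")
            if severity in {"critical", "high"}:
                hits.append(
                    (result.get("filepath", ""), severity, finding.get("message", ""))
                )
    return hits
```

Applied to the example report, this yields a single entry for `model.pkl` with the `os.system` message.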

Defense in Depth

TensorTrap focuses on AI model file security. For comprehensive protection of your AI workflow, we recommend combining TensorTrap with these complementary tools:

Recommended Security Stack

Tool        Purpose                          Install
TensorTrap  AI model file scanning           pip install tensortrap
Stego       Steganography detection          See stego-toolkit
YARA        Pattern-based malware detection  apt install yara / yara.readthedocs.io
RKHunter    Rootkit detection                apt install rkhunter
ClamAV      General antivirus                apt install clamav

What Each Tool Catches

┌─────────────────────────────────────────────────────────────────┐
│                    AI Workflow Security                         │
├─────────────────────────────────────────────────────────────────┤
│  Downloaded Models    │  Generated Output    │  System Level    │
│  ─────────────────    │  ────────────────    │  ────────────    │
│  TensorTrap ✓         │  Stego ✓             │  RKHunter ✓      │
│  • Pickle exploits    │  • Hidden data       │  • Rootkits      │
│  • Format attacks     │  • Steganography     │  • Backdoors     │
│  • Polyglot files     │                      │                  │
│                       │                      │  ClamAV ✓        │
│  YARA ✓               │                      │  • Known malware │
│  • Known signatures   │                      │  • Viruses       │
└─────────────────────────────────────────────────────────────────┘

Quick Setup (Linux)

# Install TensorTrap
pip install tensortrap

# Install system tools
sudo apt update
sudo apt install yara rkhunter clamav clamav-daemon

# Initialize ClamAV database
sudo freshclam

# Run comprehensive scan
tensortrap scan ~/Models                # AI models + polyglot detection
tensortrap scan ~/Downloads             # AI models + polyglot detection
yara -r /path/to/rules ~/Downloads      # Pattern matching
sudo rkhunter --check                   # System integrity (requires root)
clamscan -r ~/Downloads                 # General malware

Contributing

Contributions welcome! See CONTRIBUTING.md for guidelines.

# Clone the repo
git clone https://github.com/realmarauder/TensorTrap.git
cd TensorTrap

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
ruff check src/
mypy src/

License

MIT License - see LICENSE.

About

TensorTrap is developed by M2 Dynamics, specializing in AI/ML security consulting.
