Production-grade static analysis tool for detecting malicious Python pickle files

PickleGuard

Production-grade static analysis for detecting malicious Python pickle files. Built to protect ML pipelines from pickle-based attacks.

Why PickleGuard?

Python's pickle format is a known security risk: deserializing a pickle can execute arbitrary code. As ML models are increasingly shared via pickle-based formats (.pt, .pth, .pkl), attackers exploit this to distribute malware disguised as models.

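PickleGuard detects these threats through deep opcode analysis of the pickle stream, without executing it. It targets evasion techniques, such as nested module paths and INST-opcode payloads, that the comparison table below lists as undetected by other scanners.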
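
To make the risk concrete, here is a minimal sketch that uses only the standard library (it is not PickleGuard code). The `Demo` class and its harmless `print` payload are hypothetical stand-ins for an attacker's `os.system` call. `pickletools.genops` lists the opcodes without running them, which is the general idea behind static opcode analysis.

```python
import pickle
import pickletools


class Demo:
    """Hypothetical stand-in for a malicious object."""

    def __reduce__(self):
        # pickle.loads() would call print(...) at load time; an attacker
        # would return something like (os.system, ("<command>",)) instead.
        return (print, ("side effect during load",))


# Protocol 2 encodes the callable reference with a GLOBAL opcode.
payload = pickle.dumps(Demo(), protocol=2)

# genops() only parses the byte stream; nothing is imported or called.
opcodes = [op.name for op, _arg, _pos in pickletools.genops(payload)]
print(opcodes)
```

The GLOBAL and REDUCE opcodes in the listing are the pair that loads a callable and invokes it, so a scanner can flag them without ever calling `pickle.loads`.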

Benchmark Results

Evaluated on the PickleBall dataset (84 malicious samples) and 268 benign models from HuggingFace:

Tool	True Positive Rate	False Positive Rate
PickleGuard	96.4%	0.0%
Picklescan	92.9%	6.2%
ModelScan	90.5%	N/A

Installation

pip install pickleguard

Quick Start

# Scan a model file
pickleguard scan model.pt

# Scan directory recursively
pickleguard scan ./models/ -r

# JSON output for CI/CD
pickleguard scan model.pt -f json

# SARIF output for GitHub Code Scanning
pickleguard scan ./models/ -f sarif -o results.sarif

What It Detects

Dangerous Callables (200+)

Code Execution: os.system, subprocess.Popen, eval, exec
Import Attacks: __import__, importlib.import_module
Network Operations: socket.socket, urllib.request.urlopen
File Operations: open, shutil.rmtree, os.remove
Deserialization Chains: pickle.loads, marshal.loads, yaml.load

Obfuscation Techniques

Technique	Description
Nested Module Paths	`torch.serialization.os.system`
INST Opcode Bypass	Evades GLOBAL+REDUCE detection
STACK_GLOBAL	Dynamic name resolution
BUILD Injection	Setting `__reduce__` via state
Encoded Payloads	Base64/hex obfuscated strings
Unicode Homoglyphs	Lookalike character substitution
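
As an illustration of the nested module path technique, the sketch below checks every adjacent pair of segments in a dotted name against a small denylist. It is an assumption about how such a check could work, not PickleGuard's implementation; `DANGEROUS_PAIRS` and `find_nested_dangerous` are hypothetical names, and the denylist is a tiny sample rather than the 200+ entries mentioned above.

```python
from typing import Optional

# Illustrative (module, attribute) pairs only; not the tool's real list.
DANGEROUS_PAIRS = {
    ("os", "system"),
    ("subprocess", "Popen"),
    ("builtins", "eval"),
    ("builtins", "exec"),
}


def find_nested_dangerous(module: str, name: str) -> Optional[str]:
    """Return the dangerous 'module.attr' hidden in a dotted path, if any."""
    parts = f"{module}.{name}".split(".")
    # Walk adjacent segments so that 'os.system' is found even when it is
    # reached through another module, e.g. torch.serialization.os.system.
    for left, right in zip(parts, parts[1:]):
        if (left, right) in DANGEROUS_PAIRS:
            return f"{left}.{right}"
    return None


print(find_nested_dangerous("torch.serialization", "os.system"))
print(find_nested_dangerous("collections", "OrderedDict"))
```

A check that compares only the top-level module name against a denylist would see `torch` here and miss the `os.system` segment, which is why the whole path is inspected.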

Supported Formats

Raw pickle (protocol 0-5)
PyTorch containers (.pt, .pth, .bin)
NumPy files (.npy, .npz)
SafeTensors (marked safe)
ONNX models
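
PyTorch container files are ZIP archives that carry a pickle stream inside them, so a scanner has to unpack the archive before it can look at opcodes. The sketch below shows one way to do that with the standard library. The function name `extract_pickle_members` and the choice to select members by the `.pkl`/`.pickle` suffix are assumptions for illustration, not PickleGuard's actual format detection.

```python
import io
import zipfile
from typing import Dict


def extract_pickle_members(data: bytes) -> Dict[str, bytes]:
    """Return {member name: raw bytes} for pickle-looking ZIP members."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        # Not a ZIP container; the caller would treat it as another format.
        return {}
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        return {
            member: archive.read(member)
            for member in archive.namelist()
            if member.endswith((".pkl", ".pickle"))
        }
```

Each returned byte string can then be passed to an opcode parser such as `pickletools.genops`, again without unpickling anything.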

Output Example

============================================================
File: malicious_model.pt
Format: pytorch_zip

Risk Assessment:
  Level: CRITICAL
  Score: 100/100
  Confidence: 100%

Obfuscation Detected:
  - NESTED_MODULE_PATH

Findings (2):

  [CRITICAL] dangerous_callable_nested_module_attack
    Dangerous callable: torch.serialization.os.system
    Callable: torch.serialization.os.system
    Position: 2

  [HIGH] obfuscation_nested_module_path
    Name contains dangerous segment 'os'
    Position: 2

============================================================

Python API

from pickle_scanner import PickleScanner

# Statically scan a single file
scanner = PickleScanner()
result = scanner.scan_file("model.pt")

# Report only CRITICAL results here; adjust the check to your own policy
if result.report.risk_level.name == "CRITICAL":
    print(f"Threat detected: {result.report.risk_score}/100")
    for finding in result.report.findings:
        print(f"  [{finding.severity}] {finding.message}")

CI/CD Integration

GitHub Actions

- name: Scan ML Models
  run: |
    pip install pickleguard
    pickleguard scan ./models/ -r -f sarif -o results.sarif

- name: Upload SARIF
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: results.sarif
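
If a pipeline should fail on findings rather than only upload them, the SARIF file can be inspected directly. The following is a minimal sketch based on the SARIF 2.1.0 layout (`runs[].results[].level`). It assumes, without confirmation from the PickleGuard documentation, that serious findings are emitted with level `error`; the helper names `count_sarif_levels` and `gate` are hypothetical.

```python
import json
from collections import Counter


def count_sarif_levels(sarif: dict) -> Counter:
    """Count results per SARIF level across all runs."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results") or []:
            # A result with no explicit level is counted as "warning" here.
            counts[result.get("level", "warning")] += 1
    return counts


def gate(path: str) -> int:
    """Return a process exit code: 1 if any error-level result exists."""
    with open(path, encoding="utf-8") as handle:
        counts = count_sarif_levels(json.load(handle))
    return 1 if counts["error"] else 0
```

A CI step could call `sys.exit(gate("results.sarif"))` after the scan so that the job fails when error-level results are present.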

Pre-commit Hook

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: pickleguard
        name: PickleGuard
        entry: pickleguard scan
        language: system
        files: \.(pt|pth|pkl|pickle)$

Custom Rules

Define custom detection rules in YAML:

rules:
  - name: "block_custom_module"
    severity: critical
    description: "Block imports from untrusted module"
    conditions:
      - opcode: [GLOBAL, INST]
        module: "untrusted_module"

pickleguard scan model.pt --rules custom_rules.yaml
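
To show how a rule like the one above might be evaluated, here is a small sketch in which the rule is a Python dict mirroring the YAML. The matching semantics are assumptions: a rule fires if any condition matches, and `module` is compared by exact equality. PickleGuard's real rule engine may differ, and `rule_matches` is a hypothetical helper.

```python
# The YAML rule from above, written as the dict a YAML loader would produce.
RULE = {
    "name": "block_custom_module",
    "severity": "critical",
    "description": "Block imports from untrusted module",
    "conditions": [
        {"opcode": ["GLOBAL", "INST"], "module": "untrusted_module"},
    ],
}


def rule_matches(rule: dict, opcode: str, module: str) -> bool:
    """Return True if any condition matches this opcode and module."""
    for condition in rule.get("conditions", []):
        if opcode in condition.get("opcode", []) and module == condition.get("module"):
            return True
    return False


print(rule_matches(RULE, "GLOBAL", "untrusted_module"))
print(rule_matches(RULE, "GLOBAL", "collections"))
```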

How It Works

PickleGuard uses a multi-stage analysis pipeline:

Format Detection: Identifies file type (pickle, PyTorch ZIP, NumPy, etc.)
Opcode Parsing: Extracts all pickle opcodes from the stream
Stack Simulation: Abstract interpretation without code execution
Threat Analysis: Matches against 200+ dangerous callable patterns
Obfuscation Detection: Identifies evasion techniques
Risk Scoring: Multi-factor scoring with context awareness
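
The opcode parsing and stack simulation stages can be approximated with the standard library's `pickletools`. The sketch below is a deliberately simplified stand-in, not PickleGuard's interpreter: it remembers only the strings pushed so far and ignores memo lookups, which is enough to resolve the names used by GLOBAL, INST, and simple STACK_GLOBAL sequences. The function name `imported_names` is hypothetical.

```python
import pickle
import pickletools
from typing import List, Tuple

# Opcodes that push a string onto the pickle stack.
STRING_OPS = {
    "STRING", "BINSTRING", "SHORT_BINSTRING",
    "UNICODE", "BINUNICODE", "SHORT_BINUNICODE", "BINUNICODE8",
}


def imported_names(data: bytes) -> List[Tuple[str, str]]:
    """List (module, name) pairs a pickle would import, without loading it."""
    found: List[Tuple[str, str]] = []
    strings: List[str] = []
    for op, arg, _pos in pickletools.genops(data):
        if op.name in STRING_OPS:
            strings.append(arg)
        elif op.name in ("GLOBAL", "INST"):
            # Both opcodes carry "module name" as one space-separated argument.
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif op.name == "STACK_GLOBAL" and len(strings) >= 2:
            # STACK_GLOBAL takes the module and name from the stack; this
            # sketch assumes they are the two most recently pushed strings.
            found.append((strings[-2], strings[-1]))
    return found


# A hand-written protocol 0 pickle that uses INST instead of GLOBAL+REDUCE.
inst_payload = b"(S'echo hi'\nios\nsystem\n."
print(imported_names(inst_payload))
print(imported_names(pickle.dumps(print, protocol=4)))
```

The resulting pairs are what a threat-analysis stage would compare against a list of dangerous callables. A full simulation would also need to model the memo and the remaining stack opcodes.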

Risk Levels

Level	Score	Description
CRITICAL	85-100	Confirmed dangerous callable
HIGH	60-84	Dangerous pattern or obfuscation
MEDIUM	30-59	Unknown callable detected
LOW	1-29	Minor indicators
SAFE	0	Clean file
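
The score bands in the table translate directly into a lookup. This sketch covers only the final score-to-level mapping; how the score itself is computed is not described here, and `risk_level` is a hypothetical helper name.

```python
def risk_level(score: int) -> str:
    """Map a 0-100 risk score to the level names in the table above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= 85:
        return "CRITICAL"
    if score >= 60:
        return "HIGH"
    if score >= 30:
        return "MEDIUM"
    if score >= 1:
        return "LOW"
    return "SAFE"


for sample in (100, 84, 30, 1, 0):
    print(sample, risk_level(sample))
```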

Comparison with Alternatives

Feature	PickleGuard	Picklescan	ModelScan	Fickling
TPR	96.4%	92.9%	90.5%	-
FPR	0.0%	6.2%	-	-
Nested Path Detection	Yes	No	No	No
INST Bypass Detection	Yes	No	No	No
PyTorch ZIP Support	Yes	Yes	Yes	No
Safe Builtin Whitelist	Yes	No	No	No
SARIF Output	Yes	No	Yes	No

CLI Reference

usage: pickleguard scan [-h] [-r] [-f {text,json,sarif}] [-o OUTPUT] [-v]
                        [--show-safe-patterns] [--rules RULES] path

positional arguments:
  path                  File or directory to scan

options:
  -r, --recursive       Scan directories recursively
  -f, --format          Output format (default: text)
  -o, --output          Write output to file
  -v, --verbose         Show detailed findings
  --show-safe-patterns  Include safe ML patterns in output
  --rules RULES         Custom rules YAML file

Contributing

Contributions welcome. Please ensure:

New detection rules include test cases
Changes maintain the 0% false positive rate on the benign benchmark models
Code passes ruff and mypy checks

# Development setup
pip install -e ".[dev]"
pytest
ruff check .
mypy pickle_scanner/

License

MIT License

Acknowledgments

PickleBall - Malicious pickle dataset
Fickling - Fickling research
ProtectAI - ModelScan

Release history

1.0.1 (this version): Feb 4, 2026
1.0.0: Jan 31, 2026

Download files

Source Distribution

pickleguard-1.0.1.tar.gz (65.3 kB)

Uploaded Feb 4, 2026 Source

Built Distribution

pickleguard-1.0.1-py3-none-any.whl (59.2 kB)

Uploaded Feb 4, 2026 Python 3

File details

Details for the file pickleguard-1.0.1.tar.gz.

File metadata

Download URL: pickleguard-1.0.1.tar.gz
Upload date: Feb 4, 2026
Size: 65.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.8.10

File hashes

Hashes for pickleguard-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`82e35a4c7f14f2dc3bcebd7b461d6ed941639bb514aa3bd0aef00efb167d730d`
MD5	`709c9829991f1b63294896cf54fc020e`
BLAKE2b-256	`6209a8dfcf774811fbe31e2d0e1dbde7107e077f0ea147a7ece801c1fc4aef0d`

File details

Details for the file pickleguard-1.0.1-py3-none-any.whl.

File metadata

Download URL: pickleguard-1.0.1-py3-none-any.whl
Upload date: Feb 4, 2026
Size: 59.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.8.10

File hashes

Hashes for pickleguard-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4df07e04141632b819b9184bb05d8dd6d221c6a383605cf2b3c6fc0d7127f4d8`
MD5	`6265725fd1ff5a6c7412999e46e5761a`
BLAKE2b-256	`151d97b874d8a0dc5d5550d6d742ee1c546a26b715c7f06448e9c365aead20f1`
