Static analysis for Python pickle files — detects malicious code without executing it

pkl-inspector

The Problem

pickle.load() can execute arbitrary Python code. Every time you load a pickle file from an untrusted source, you're running its author's code on your machine.

import pickle
# This runs ANY code the attacker put in the pickle
with open("model.pkl", "rb") as f:
    model = pickle.load(f)  # <- Could execute rm -rf /
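To make the mechanism concrete, here is a deliberately harmless sketch. The class name `Demo` is made up for illustration, and `len` stands in for whatever callable an attacker would choose. Unpickling calls the function that the file's author named in `__reduce__`:

```python
import pickle


class Demo:
    """Hypothetical class used only to illustrate __reduce__."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call len('abc')".
        # An attacker would return (os.system, ("some command",)) instead.
        return (len, ("abc",))


blob = pickle.dumps(Demo())
restored = pickle.loads(blob)  # calls len("abc") during loading
print(restored)  # 3
```

The loaded value is not a `Demo` instance at all. It is whatever the chosen callable returned, which is why the content of the file fully controls what runs.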

Traditional security tools scan for known signatures. pkl-inspector is different: it disassembles the pickle's opcode stream and flags malicious patterns without ever running the file.

The Attack

In 2026, North Korean state hackers (Contagious Interview / UNC1069) published 1,700+ malicious packages to npm and PyPI. Some used pickle files to execute payloads on developer machines.

Traditional antivirus found nothing. pkl-inspector would have caught them all.

How It Works

Pickle files are serialized using an opcode-based protocol. pkl-inspector disassembles the opcodes and inspects every __reduce__ call — the mechanism pickle uses to reconstruct objects — without executing any of them.

If a __reduce__ method calls os.system, subprocess, eval, or exec: that's malicious. pkl-inspector flags it with a threat score and explains exactly what the malicious code would have done.
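The idea can be sketched with the standard library's `pickletools.genops`, which yields opcodes without executing them. This is an illustration of the general technique, not pkl-inspector's actual implementation. The function name `find_imported_callables` and the string-tracking heuristic for `STACK_GLOBAL` are assumptions made for this example:

```python
import pickletools

# Opcodes that push a string constant; STACK_GLOBAL takes its module and
# attribute names from the two most recent of these (a simplifying heuristic).
STRING_OPCODES = {
    "STRING", "BINSTRING", "SHORT_BINSTRING",
    "UNICODE", "BINUNICODE", "SHORT_BINUNICODE", "BINUNICODE8",
}


def find_imported_callables(data: bytes) -> list:
    """List dotted names of every global a pickle imports, without loading it."""
    found = []
    recent_strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # pickletools reports these arguments as "module name"
            module, name = arg.split(" ", 1)
            found.append(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            if len(recent_strings) >= 2:
                found.append(f"{recent_strings[-2]}.{recent_strings[-1]}")
            else:
                found.append("<unresolved>")
        elif opcode.name in STRING_OPCODES:
            recent_strings.append(arg)
    return found


# Hand-written protocol 0 payload equivalent to os.system('echo hi').
# It is only disassembled here, never loaded.
HANDMADE_PAYLOAD = b"cos\nsystem\n(S'echo hi'\ntR."
print(find_imported_callables(HANDMADE_PAYLOAD))  # ['os.system']
```

A real scanner would need to model the pickle stack and memo properly, since a crafted file can arrange for `STACK_GLOBAL` operands to arrive by other routes.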

Install

pip install pkl-inspector

Zero dependencies. Python 3.8+.

Usage

CLI

# Scan a pickle file
pkl-inspector model.pkl

# JSON output for automation
pkl-inspector model.pkl --json

# Verbose output with details
pkl-inspector model.pkl -v

Python API

from pkl_inspector import PklInspector

inspector = PklInspector()
result = inspector.scan("model.pkl")

print(result["verdict"])      # CLEAN, SUSPICIOUS, DANGEROUS, or CRITICAL
print(result["score"])        # Threat score (0-100+)
print(result["safe_to_load"]) # Boolean

if not result["safe_to_load"]:
    for finding in result["findings"]:
        print(f"  {finding['severity']}: {finding['description']}")

Scan bytes directly

result = inspector.scan_bytes(pickle_data, source="downloaded_model")

Output Example

{
  "file": "malicious_model.pkl",
  "score": 80,
  "verdict": "CRITICAL",
  "safe_to_load": false,
  "findings": [
    {
      "type": "CRITICAL_CALLABLE_LOADED",
      "severity": "CRITICAL",
      "callable": "os.system",
      "description": "Dangerous callable 'os.system' loaded via GLOBAL opcode - will execute on REDUCE",
      "score_contribution": 80
    }
  ],
  "globals_found": ["os.system"]
}
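A report in this shape is easy to post-process. The sketch below assumes only the keys shown in the example above. The `summarize` helper is hypothetical, and `SAMPLE_REPORT` is a trimmed copy of that example:

```python
import json

# Trimmed copy of the example report (description field omitted for brevity).
SAMPLE_REPORT = """
{
  "file": "malicious_model.pkl",
  "score": 80,
  "verdict": "CRITICAL",
  "safe_to_load": false,
  "findings": [
    {
      "type": "CRITICAL_CALLABLE_LOADED",
      "severity": "CRITICAL",
      "callable": "os.system",
      "score_contribution": 80
    }
  ],
  "globals_found": ["os.system"]
}
"""


def summarize(report_json: str) -> list:
    """Turn a JSON report into short human-readable lines."""
    report = json.loads(report_json)
    lines = [f"{report['file']}: {report['verdict']} (score {report['score']})"]
    for finding in report.get("findings", []):
        lines.append(
            f"  [{finding['severity']}] {finding['callable']}: "
            f"+{finding['score_contribution']}"
        )
    return lines


for line in summarize(SAMPLE_REPORT):
    print(line)
```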

Threat Scoring

Score    Verdict       Meaning
0        CLEAN         Safe to load
1-40     SUSPICIOUS    Review before loading
41-79    DANGEROUS     Do not load
80+      CRITICAL      Malicious payload detected

What triggers scores

Pattern                   Score   Reason
os.system in reduce       +80     Arbitrary command execution
subprocess.* in reduce    +70     Process spawning
eval/exec in reduce       +90     Code execution
builtins.open (write)     +50     File system modification
Nested REDUCE opcodes     +40     Obfuscation pattern
Base64 decode chain       +35     Payload encoding
Unknown callable          +20     Requires manual review
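Read together, the two tables suggest that contributions are summed and the total is mapped to a verdict. The sketch below encodes that reading. The additive rule and the helper names `total_score` and `verdict_for` are assumptions for illustration, not confirmed behavior of the tool:

```python
def total_score(findings) -> int:
    """Sum the score_contribution of each finding (assumed additive)."""
    return sum(f["score_contribution"] for f in findings)


def verdict_for(score: int) -> str:
    """Map a threat score to a verdict using the thresholds in the table."""
    if score <= 0:
        return "CLEAN"
    if score <= 40:
        return "SUSPICIOUS"
    if score <= 79:
        return "DANGEROUS"
    return "CRITICAL"


# Example: an unknown callable (+20) combined with a base64 decode chain (+35)
findings = [{"score_contribution": 20}, {"score_contribution": 35}]
print(total_score(findings), verdict_for(total_score(findings)))  # 55 DANGEROUS
```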

Exit Codes

Code   Meaning
0      CLEAN - safe to load
1      SUSPICIOUS - review needed
2      DANGEROUS - do not load
3      CRITICAL - malicious

Use in CI/CD:

pkl-inspector model.pkl || { echo "Failed security check"; exit 1; }
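From a Python build script, the same exit codes can be read with `subprocess`. This is a minimal sketch. The `scan_verdict` helper and its `command` parameter are hypothetical, and it relies only on the exit-code table above:

```python
import subprocess

# Exit codes as documented in the table above.
EXIT_VERDICTS = {0: "CLEAN", 1: "SUSPICIOUS", 2: "DANGEROUS", 3: "CRITICAL"}


def scan_verdict(path: str, command=("pkl-inspector",)) -> str:
    """Run the scanner CLI on one file and translate its exit code.

    `command` is a parameter so a different executable or wrapper can be
    substituted; anything outside the documented codes maps to "UNKNOWN".
    """
    completed = subprocess.run([*command, path], capture_output=True, text=True)
    return EXIT_VERDICTS.get(completed.returncode, "UNKNOWN")
```

A pipeline could then fail the build on anything other than `"CLEAN"`, or tolerate `"SUSPICIOUS"` pending manual review, depending on policy.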

Why This Matters

ML engineers load pickle files constantly:

  • Model weights (model.pkl)
  • Preprocessors (scaler.pkl)
  • Datasets (data.pkl)
  • Feature encoders (encoder.pkl)

Any of them could be poisoned. pkl-inspector adds a one-line safety check before every load.

Integration Examples

Before loading any pickle

from pkl_inspector import PklInspector
import pickle

class SecurityError(Exception):
    """Raised when a pickle file fails the security scan."""

def safe_load(filepath):
    """Only load pickle files that pass security scan."""
    inspector = PklInspector()
    result = inspector.scan(filepath)

    if not result["safe_to_load"]:
        raise SecurityError(f"Pickle file failed security scan: {result['verdict']}")

    with open(filepath, 'rb') as f:
        return pickle.load(f)
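One design consideration: scanning a path and then reopening it leaves a window in which the file could be swapped. A variant that scans and loads the same bytes avoids that. In this sketch the `load_if_safe` helper is hypothetical, and the scanner is passed in as a callable so the example does not depend on the package being installed:

```python
import pickle


def load_if_safe(data: bytes, scan_bytes):
    """Scan raw pickle bytes, then unpickle those same bytes only if they pass.

    `scan_bytes` is any callable that returns a dict with a "safe_to_load"
    key, as the scan results shown earlier do.
    """
    result = scan_bytes(data)
    if not result["safe_to_load"]:
        raise ValueError(f"Refusing to unpickle: verdict {result.get('verdict')}")
    return pickle.loads(data)
```

With pkl-inspector installed, the scanner argument could be something like `lambda b: PklInspector().scan_bytes(b, source="upload")`, using the `scan_bytes` call shown in the Usage section.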

CI/CD pipeline

# .github/workflows/security.yml
- name: Scan pickle files
  run: |
    pip install pkl-inspector
    # find -exec ... \; discards the scanner's exit code; xargs propagates failures
    find . -name "*.pkl" -print0 | xargs -0 -r -n1 pkl-inspector

Pre-commit hook

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: pkl-inspector
        name: Scan pickle files
        entry: pkl-inspector
        language: system
        files: \.(pkl|pickle)$

Comparison with Other Tools

Tool            Approach              Zero-day detection
Antivirus       Signature matching    No
Bandit          Python AST analysis   Source only
Semgrep         Pattern matching      Source only
pkl-inspector   Opcode analysis       Yes

pkl-inspector analyzes the serialized pickle opcode stream, not Python source. It detects novel payloads by their structure (which callables are imported and invoked), not by matching known payload content.

Limitations

pkl-inspector catches the vast majority of pickle attacks, but no tool is perfect:

  • Protocol 5+ features: Some advanced protocol 5 features may need additional coverage
  • Legitimate uses of dangerous ops: Very rare, but a pickle that references subprocess for a legitimate reason will still be flagged
  • Heavily obfuscated payloads: Multiple layers of encoding may evade scoring

See THREAT_MODEL.md for details.

Part of stillrunning

pkl-inspector is the scanner at the heart of stillrunning guard — enterprise security monitoring for developers.

  • Guard daemon: Watches for suspicious process spawning
  • Install intercept: Blocks malicious npm/pip packages at install time
  • Threat feed: Real-time blocklist from CISA, OSV, GitHub, and more

Learn more: stillrunning.io

License

MIT License. See LICENSE.

Contributing

Issues and PRs welcome. See the threat taxonomy in pkl_inspector.py for adding new detection patterns.

Source Distributions

No source distribution files available for this release.

Built Distribution

pkl_inspector-0.1.0-py3-none-any.whl (12.5 kB)

Uploaded Python 3

File details

Details for the file pkl_inspector-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: pkl_inspector-0.1.0-py3-none-any.whl
  • Size: 12.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for pkl_inspector-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 4d547093a7d348d2c3bc568f9793ddfe97ec76133b9267229cc6df201fee64d2
MD5 e7a5182d440406c865e0e7202ef27a4b
BLAKE2b-256 ffb4226bf4d14220eead8e5c621962155feb3a28601b0f394db8f650d56c3a87
