Python Pickle Malware Scanner
Security scanner detecting Python Pickle files performing suspicious actions.
Getting started
Scan a malicious model on Hugging Face:
pip install picklescan
picklescan --huggingface ykilcher/totally-harmless-model
The scanner reports that the Pickle is calling eval() to execute arbitrary code:
https://huggingface.co/ykilcher/totally-harmless-model/resolve/main/pytorch_model.bin:archive/data.pkl: global import '__builtin__ eval' FOUND
----------- SCAN SUMMARY -----------
Scanned files: 1
Infected files: 1
Dangerous globals: 1
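To illustrate what a "dangerous global" is, here is a minimal sketch that lists the globals a Pickle imports without ever loading it. It relies only on the standard library's pickletools.genops and is not picklescan's actual implementation. The Payload class and the list_globals helper are hypothetical names introduced for this example, and the STACK_GLOBAL handling is a deliberate simplification (it assumes the module and name are the two most recently pushed strings).

```python
import pickle
import pickletools


class Payload:
    """Hypothetical object whose unpickling would call eval()."""

    def __reduce__(self):
        # pickle.loads() would call eval("1 + 1"); the expression is harmless on purpose.
        return (eval, ("1 + 1",))


def list_globals(data: bytes) -> set:
    """Return the (module, name) pairs a pickle imports, without loading it."""
    found = set()
    strings = []  # string arguments seen so far, used for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # The argument is "module name" joined by a single space.
            module, name = arg.split(" ", 1)
            found.add((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Simplification: assume the last two pushed strings are module and name.
            found.add((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found


# Protocol 2 emits a GLOBAL opcode; protocol 4 emits STACK_GLOBAL instead.
for protocol in (2, 4):
    data = pickle.dumps(Payload(), protocol=protocol)
    print(protocol, sorted(list_globals(data)))
```

The bytes are only disassembled, never passed to pickle.loads(), so nothing in the payload is executed. A real scanner also has to follow memoized strings and compare the result against a list of known-dangerous globals, which this sketch leaves out.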
The scanner can also load Pickles from local files, directories, URLs, and zip archives (the format PyTorch uses for its model files):
picklescan --path downloads/pytorch_model.bin
picklescan --path downloads
picklescan --url https://huggingface.co/sshleifer/tiny-distilbert-base-cased-distilled-squad/resolve/main/pytorch_model.bin
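The scan output earlier names the Pickle as pytorch_model.bin:archive/data.pkl, which reflects the zip layout of PyTorch model files. As a rough sketch of how Pickles can be pulled out of such an archive with the standard library, assuming members are recognized by a .pkl extension (picklescan's own detection may differ), consider the hypothetical helper pickles_in_zip below. The in-memory archive is fabricated for the example and contains a harmless dictionary.

```python
import io
import pickle
import zipfile


def pickles_in_zip(data: bytes) -> dict:
    """Map each *.pkl member name of a zip archive to its raw bytes."""
    members = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in archive.namelist():
            if name.endswith(".pkl"):
                # Read the bytes only; nothing is unpickled here.
                members[name] = archive.read(name)
    return members


# Build a tiny in-memory archive that mimics the archive/data.pkl layout.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("archive/data.pkl", pickle.dumps({"weights": [0.1, 0.2]}))
    archive.writestr("archive/version", "3\n")

found = pickles_in_zip(buffer.getvalue())
print(list(found))
```

The extracted bytes can then be inspected opcode by opcode instead of being loaded.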
To scan NumPy's .npy files, pip install the numpy package first.
The scanner's exit status codes follow the ClamAV convention:
- 0: scan did not find malware
- 1: scan found malware
- 2: scan failed
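These exit codes make the scanner easy to call from a script. The following is a minimal sketch, assuming picklescan is installed and on the PATH; describe_exit_code and scan_path are hypothetical helper names, and only the --path option shown earlier is used.

```python
import subprocess

# Meanings of the three documented exit status codes.
EXIT_MEANINGS = {
    0: "scan did not find malware",
    1: "scan found malware",
    2: "scan failed",
}


def describe_exit_code(code: int) -> str:
    """Translate a picklescan exit status into a readable message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def scan_path(path: str) -> str:
    """Run picklescan on a local path and describe the outcome."""
    result = subprocess.run(
        ["picklescan", "--path", path],
        capture_output=True,
        text=True,
    )
    return describe_exit_code(result.returncode)
```

For example, scan_path("downloads/pytorch_model.bin") returns one of the three messages, and a caller can refuse to load the model unless the exit code was 0.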
Develop
Create and activate the conda environment (miniconda is sufficient):
conda env create -f conda.yaml
conda activate picklescan
Install the package in editable mode to develop and test:
python3 -m pip install -e .
Edit with VS Code:
code .
Run unit tests:
pytest tests
Run manual tests:
- Local PyTorch (zip) file
mkdir downloads
wget -O downloads/pytorch_model.bin https://huggingface.co/ykilcher/totally-harmless-model/resolve/main/pytorch_model.bin
picklescan -l DEBUG -p downloads/pytorch_model.bin
- Remote PyTorch (zip) URL
picklescan -l DEBUG -u https://huggingface.co/prajjwal1/bert-tiny/resolve/main/pytorch_model.bin
Lint the code:
black src tests
flake8 src tests --count --show-source
Publish the package to PyPI: bump the package version in setup.cfg and create a GitHub release. This triggers the publish workflow.
Alternative manual steps to publish the package:
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m build
python3 -m twine upload dist/*
Test the package: bump the version of picklescan in conda.test.yaml and run:
conda env remove -n picklescan-test
conda env create -f conda.test.yaml
conda activate picklescan-test
picklescan --huggingface ykilcher/totally-harmless-model
Tested on Linux 5.10.102.1-microsoft-standard-WSL2 x86_64 (WSL2).
References
- pickletools.py -- The pickletools source code is the most detailed documentation of the Pickle format.
- Machine Learning Attack Series: Backdooring Pickle Files, Johann Rehberger, 2022
- Hugging Face Pickle Scanning, Luc Georges, 2022
- The hidden dangers of loading open-source AI models (ARBITRARY CODE EXPLOIT!), Yannic Kilcher, 2022
- Secure Machine Learning at Scale with MLSecOps, Alejandro Saucedo, 2022
- Backdooring Pickles: A decade only made things worse, ColdwaterQ, DEFCON 2022
- Never a dill moment: Exploiting machine learning pickle files, Evan Sultanik, 2021 (tool: Fickling)
- Exploiting Python pickles, David Hamann, 2020
- Dangerous Pickles - malicious python serialization, Evan Sangaline, 2017
- Python Pickle Security Problems and Solutions, Travis Cunningham, 2015
- Arbitrary code execution with Python pickles, Stephen Checkoway, 2013
- Sour Pickles, A serialised exploitation guide in one part, Marco Slaviero, BlackHat USA 2011 (see also: doc, slides)
Hashes for picklescan-0.0.11-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 069437226b1882c510580ef2f7ac380b177e57dc3a21f3636a829c5fbf4f3af2
MD5 | 9c4dfb2f6e288843e9990c84b0853922
BLAKE2b-256 | d787d3065078a104a21d5883426427f4298cf0389280ab503dc5359b1f715827