Standalone pickle security scanner extracted from ModelAudit

modelaudit-picklescan

Rust-backed, bounded, static pickle security scanner. Inspects Python pickle streams and PyTorch ZIP checkpoints without unpickling them, and returns a typed report you can feed into CI, SARIF exporters, or custom policy engines.

Why this package

Pickle deserialization is one of the most common supply-chain attack vectors in ML checkpoints. Existing Python-only scanners tend to fall into one of three traps: they unpickle the payload (unsafe), scan string literals only (imprecise), or fail open on large or malformed inputs (dangerous in CI). This package is a direct response:

- Rust scanner engine. Opcode walker, string analyzer, and nested-payload decoder are all native code.
- Fail-closed semantics. Every scan returns both a status (complete / inconclusive / error) and a verdict (clean / suspicious / malicious / unknown). Truncation, timeouts, budget exhaustion, and parser errors downgrade the verdict instead of silently returning clean.
- Bounded by construction. Opcode count, wall-clock timeout, string-literal bytes, nested-payload bytes, and recursion depth are all configurable caps with safe defaults. A malicious producer cannot force unbounded memory or CPU.
- Zero Python runtime dependencies. The wheel is self-contained: pip install modelaudit-picklescan and nothing else.
- Attested provenance. Release wheels are published to PyPI with Sigstore attestations via GitHub Actions trusted publishing.
- Typed, immutable reports. PickleReport, Finding, Notice, and ScanError are frozen dataclasses with to_dict() for serialization. The package ships py.typed for mypy / pyright.

Install

pip install modelaudit-picklescan

Pre-built abi3 wheels ship for Python 3.10–3.13 on five targets: Linux x86_64, Linux aarch64, macOS arm64, macOS x86_64, and Windows x64. Other platforms install from the sdist and require a Rust toolchain (see Building from source).

Quickstart

from modelaudit_picklescan import scan_file

report = scan_file("suspicious_model.pt")  # raw pickle or PyTorch ZIP checkpoint

print(f"status={report.status.value} verdict={report.verdict.value}")
for finding in report.findings:
    print(f"  [{finding.severity.value}] {finding.rule_code}: {finding.message}")
    if finding.location:
        print(f"    at {finding.location}")

Example output on a PyTorch ZIP whose inner pickle calls os.system through a REDUCE opcode:

status=complete verdict=malicious
  [critical] DANGEROUS_CALL: Found REDUCE opcode invoking os.system
    at suspicious_model.pt:archive/data.pkl (pos 42)

Example output on a truncated or oversized pickle where analysis is incomplete:

status=inconclusive verdict=unknown
  (no findings — scan was truncated, inspect report.notices and report.coverage)

The finding.location string follows the format {source} (pos {byte_offset}). The source on PyTorch ZIP members is {archive_path}:{member_name}.
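The location format described above can be parsed back into its parts when routing alerts. The sketch below is a minimal illustration that assumes only the documented `{source} (pos {byte_offset})` layout; `parse_location` and `ParsedLocation` are hypothetical helper names, not part of the package. It deliberately leaves `source` unsplit, because a Windows path such as `C:\models\m.pkl` also contains a colon.

```python
import re
from typing import NamedTuple, Optional

# Matches "<source> (pos <digits>)" as documented for finding.location.
_LOCATION_RE = re.compile(r"^(?P<source>.*) \(pos (?P<offset>\d+)\)$")


class ParsedLocation(NamedTuple):
    """Hypothetical container for the two documented parts of a location."""
    source: str
    byte_offset: int


def parse_location(location: str) -> Optional[ParsedLocation]:
    """Return the parsed parts, or None if the string does not match the format."""
    match = _LOCATION_RE.match(location)
    if match is None:
        return None
    return ParsedLocation(match.group("source"), int(match.group("offset")))


parsed = parse_location("suspicious_model.pt:archive/data.pkl (pos 42)")
print(parsed.source, parsed.byte_offset)
```

Returning None for unrecognized strings keeps callers from guessing at a location they could not parse.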

What it detects

Each finding carries a rule_code so downstream tooling can allowlist, suppress, or route alerts:

| Rule code | What it flags |
| --- | --- |
| `DANGEROUS_CALL` | REDUCE/NEWOBJ/NEWOBJ_EX opcodes invoking a callable known to execute code |
| `DANGEROUS_GLOBAL` | Imports of modules or classes that enable code execution when the pickle is loaded |
| `EXTENSION_REF` | `copyreg.extension` / `EXT1`/`EXT2`/`EXT4` opcodes that resolve through process state |
| `MALFORMED_STACK_GLOBAL` | `STACK_GLOBAL` operands crafted to bypass naive string-matching scanners |
| `PERSISTENT_ID` | `PERSID` / `BINPERSID` references that delegate object construction to the loader |
| `PICKLE_EXPANSION` | Oversized or amplified pickle structures consistent with zip-bomb-style payloads |
| `POST_BUDGET_GLOBAL` | Dangerous globals observed after the opcode budget, surfaced conservatively |
| `STRUCTURAL_TAMPER` | Opcode sequences that do not correspond to any legitimate pickle producer |
| `SUSPICIOUS_STRING` | High-signal string literals (shell metacharacters, import payloads, URLs) |
| `S203` | Non-allowlisted `__main__` global reference (requires manual review before loading) |
| `S213` | Raw (unencoded) nested pickle payload inside a byte field |
| `S601` | Base64-encoded nested pickle payload inside a string literal |
| `S602` | Hex-encoded nested pickle payload inside a string literal |
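As an illustration of routing on rule_code, the sketch below sorts findings into block, review, and suppressed buckets. `ExampleFinding` is a hypothetical stand-in that mirrors only the rule_code and message attributes used in the Quickstart, and the particular choice of suppressed and review codes is an example policy, not a recommendation from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExampleFinding:
    """Hypothetical stand-in exposing the attributes used below."""
    rule_code: str
    message: str


# Example policy only: which codes to suppress or send to manual review.
SUPPRESSED = frozenset({"SUSPICIOUS_STRING"})
MANUAL_REVIEW = frozenset({"S203", "PERSISTENT_ID"})


def route(findings):
    """Bucket findings by rule code; unknown codes block by default."""
    routed = {"block": [], "review": [], "suppressed": []}
    for finding in findings:
        if finding.rule_code in SUPPRESSED:
            routed["suppressed"].append(finding)
        elif finding.rule_code in MANUAL_REVIEW:
            routed["review"].append(finding)
        else:
            routed["block"].append(finding)
    return routed


demo = [
    ExampleFinding("DANGEROUS_CALL", "REDUCE invoking os.system"),
    ExampleFinding("S203", "__main__ global reference"),
    ExampleFinding("SUSPICIOUS_STRING", "URL in string literal"),
]
for bucket, items in route(demo).items():
    print(bucket, [f.rule_code for f in items])
```

Defaulting unknown codes to the block bucket means a rule code added in a later release is never silently ignored by an older policy.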

The scanner covers pickle protocols 0 through 5, recognizes short and extended opcodes, and reconstructs module.class targets for STACK_GLOBAL without executing them.
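To make "without executing them" concrete, here is a small sketch of static opcode walking using the standard library's pickletools.genops. It is not the package's Rust engine and is far less thorough: it resolves STACK_GLOBAL by remembering the two most recent string operands, a simplification that a crafted pickle could defeat. `Payload` and `list_global_refs` are hypothetical names used only for this example.

```python
import os
import pickle
import pickletools


class Payload:
    """Produces a pickle that would call os.system on load; it is never loaded here."""

    def __reduce__(self):
        return (os.system, ("echo pwned",))


def list_global_refs(data: bytes) -> list[tuple[str, int]]:
    """Walk opcodes without unpickling and collect (module.name, byte offset) pairs."""
    refs = []
    recent_strings = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the operand as "module name".
            module, _, name = arg.partition(" ")
            refs.append((f"{module}.{name}", pos))
        elif opcode.name == "STACK_GLOBAL":
            # Simplification: assume the last two string operands are module and name.
            if len(recent_strings) >= 2:
                refs.append((f"{recent_strings[-2]}.{recent_strings[-1]}", pos))
            else:
                refs.append(("<unresolved>", pos))
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return refs


blob = pickle.dumps(Payload(), protocol=4)  # serializing does not run os.system
for target, pos in list_global_refs(blob):
    print(f"{target} at byte {pos}")
```

The module part of the printed target depends on the platform (os.system lives in a platform-specific module), which is one reason real scanners match on normalized names rather than raw strings.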

When to use this vs. `modelaudit`

Use modelaudit-picklescan if you want a single-purpose library to embed in another tool: a linter, a model registry gate, a custom CI step, or a server-side scanner. It does pickle analysis and nothing else.

Use modelaudit if you want the full static scanner CLI: 40+ model/archive format scanners, SARIF and JSON output, remote-source scanning (Hugging Face, S3, GCS, JFrog, MLflow, DVC), license and secret detection, caching, progress reporting, and CI recipes. modelaudit uses this package internally for its pickle scanner.

API overview

from modelaudit_picklescan import (
    PickleScanner, ScanOptions,
    scan_file, scan_bytes, scan_stream,
    PickleReport, Finding, Notice, ScanError,
    Severity, ScanStatus, SafetyVerdict, CoverageSummary,
)

Three convenience entry points, each returning a PickleReport:

- scan_file(path, *, options=None): scan a .pkl / .pickle or a PyTorch ZIP checkpoint (detects the container, enumerates pickle members, combines reports).
- scan_bytes(data, *, source="<bytes>", options=None): scan an in-memory payload.
- scan_stream(stream, *, source="<stream>", size=None, options=None): scan a binary file-like object; falls back to bounded spooling when size is unknown.
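The idea of bounded spooling for a stream of unknown size can be sketched with the standard library alone. This is an illustration of the general technique, not the package's implementation: `spool_bounded` is a hypothetical helper, and the use of tempfile.SpooledTemporaryFile and the 1 MiB in-memory threshold are assumptions made for the example.

```python
import io
import tempfile
from typing import BinaryIO


def spool_bounded(stream: BinaryIO, cap: int, chunk_size: int = 64 * 1024):
    """Copy at most `cap` bytes from `stream` into a spooled temp file.

    Returns (spooled_file, truncated). `truncated` is True when the stream
    still had data after the cap was reached.
    """
    spooled = tempfile.SpooledTemporaryFile(max_size=1024 * 1024)
    remaining = cap
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            break  # end of stream before the cap
        spooled.write(chunk)
        remaining -= len(chunk)
    # Probe one extra byte only when the cap was reached, to tell an exact fit
    # from an overflow.
    truncated = remaining == 0 and bool(stream.read(1))
    spooled.seek(0)
    return spooled, truncated


data, truncated = spool_bounded(io.BytesIO(b"x" * 10), cap=4)
print(len(data.read()), truncated)
```

A caller following the fail-closed approach described earlier would treat truncated=True as grounds for an inconclusive result rather than scanning the prefix and reporting clean.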

For long-running services, construct PickleScanner(options=...) once and reuse it across calls.

Resource controls — `ScanOptions`

All fields have safe defaults; override only what you need.

| Field | Default | Meaning |
| --- | --- | --- |
| `timeout_s` | `3600.0` | Per-scan wall-clock timeout in seconds; maximum `86_400` |
| `max_opcodes` | `1_000_000` | Opcode budget; once exhausted, the scan is reported as partial (inconclusive) rather than complete |
| `post_budget_scan_bytes` | `100 MiB` | Bytes to keep scanning for globals after the opcode budget is exhausted |
| `max_known_stream_read_bytes` | `100 MiB` | Cap on streams with a known `size` |
| `max_unbounded_stream_read_bytes` | `8 MiB` | Cap on streams without a known `size` |
| `max_string_literal_scan_chars` | `8 MiB` | Cap on bytes inspected for `SUSPICIOUS_STRING` |
| `max_nested_pickle_bytes` | `2 MiB` | Cap on each decoded nested-payload inspection |
| `max_nested_depth` | `2` | Recursion depth for base64/hex-encoded pickles |

Construction validates every field; pass invalid values and you'll get a ValueError immediately instead of a misleading scan result.
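The validate-at-construction pattern can be sketched with a frozen dataclass. `ExampleOptions` below is a hypothetical mirror of three ScanOptions fields, using the defaults from the table; the exact rules the real class applies are not documented here, so the checks (including rejecting, rather than clamping, a timeout above 86_400) are assumptions.

```python
from dataclasses import dataclass

MAX_TIMEOUT_S = 86_400.0  # upper bound taken from the table above


@dataclass(frozen=True)
class ExampleOptions:
    """Hypothetical options object that rejects bad values at construction."""
    timeout_s: float = 3600.0
    max_opcodes: int = 1_000_000
    max_nested_depth: int = 2

    def __post_init__(self):
        # Assumed rule: timeouts must be positive and within the documented maximum.
        if not 0 < self.timeout_s <= MAX_TIMEOUT_S:
            raise ValueError(f"timeout_s must be in (0, {MAX_TIMEOUT_S}], got {self.timeout_s}")
        if self.max_opcodes <= 0:
            raise ValueError(f"max_opcodes must be positive, got {self.max_opcodes}")
        if self.max_nested_depth < 0:
            raise ValueError(f"max_nested_depth must be >= 0, got {self.max_nested_depth}")


strict = ExampleOptions(timeout_s=30.0, max_opcodes=200_000)
print(strict)

try:
    ExampleOptions(timeout_s=0)
except ValueError as exc:
    print("rejected:", exc)
```

Failing in the constructor means a misconfigured CI job stops before any file is scanned, instead of producing a report under limits nobody intended.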

Report contract — `PickleReport`

- status: ScanStatus. One of complete, inconclusive, or error.
- verdict: SafetyVerdict. One of clean, suspicious, malicious, or unknown; clean requires status=complete with no findings.
- findings: tuple[Finding, ...]. WARNING or CRITICAL security results.
- notices: tuple[Notice, ...]. DEBUG/INFO explainability and coverage notes (budget hits, truncation, unsupported members).
- errors: tuple[ScanError, ...]. Operational failures (short reads, malformed containers, engine errors).
- coverage: CoverageSummary. Reports bytes_scanned, bytes_total, opcode_count, and per-phase completion flags.
- metadata: Mapping[str, Any]. Container info (e.g. container_type="pytorch_zip", archive size, pickle members).
- duration_s: float. Scan wall-clock time.

Convenience accessors: report.has_security_findings, report.is_clean, report.to_dict().

Reports and all nested models are frozen — call to_dict() if you need a mutable payload for serialization. For aggregation, treat findings at warning/critical as security alerts; group notices by code rather than showing every INFO row as actionable.
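A consumer-side gate built on this contract might look like the sketch below. It works on the plain strings printed in the Quickstart (report.status.value and report.verdict.value) so it runs without the package installed. `gate_decision` and `summarize_notices` are hypothetical helpers, the three-way allow/block/review policy is an example, and the notice codes in the demo are invented for illustration.

```python
from collections import Counter
from typing import Iterable


def gate_decision(status: str, verdict: str) -> str:
    """Map a (status, verdict) pair to allow, block, or review."""
    if status == "complete" and verdict == "clean":
        return "allow"
    if verdict in {"suspicious", "malicious"}:
        return "block"
    # inconclusive, error, or unknown: never treated as clean
    return "review"


def summarize_notices(codes: Iterable[str]) -> Counter:
    """Group notice codes so a dashboard shows counts, not one row per notice."""
    return Counter(codes)


print(gate_decision("complete", "malicious"))
print(gate_decision("inconclusive", "unknown"))
# Hypothetical notice codes, used only to show the grouping.
print(summarize_notices(["BUDGET_HIT", "TRUNCATED", "BUDGET_HIT"]))
```

The gate only allows the one combination the contract defines as clean; every other pair falls through to block or review.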

PyTorch ZIP checkpoints

scan_file auto-detects PyTorch ZIP containers (archives containing data.pkl plus version / byteorder metadata), enumerates pickle members (including hidden ones identified by content sniffing, not just by extension), and combines the per-member reports into a single container-level report with metadata.container_type="pytorch_zip". Archives are capped at 10,000 members, and each member pickle at 512 MiB. Hitting either limit is reported through a structured notice rather than a silent skip.
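Member enumeration by content sniffing can be approximated with the standard zipfile module. The sketch below is an assumption-laden illustration, not the package's detection logic: `find_pickle_members` and `looks_like_pickle` are hypothetical, the sniff only checks for a protocol 2 to 5 PROTO header, and notices are plain strings rather than structured objects.

```python
import io
import pickle
import zipfile

PICKLE_SUFFIXES = (".pkl", ".pickle")


def looks_like_pickle(head: bytes) -> bool:
    """Heuristic: protocol 2+ pickles start with PROTO (0x80) and a protocol number."""
    return len(head) >= 2 and head[0] == 0x80 and 2 <= head[1] <= 5


def find_pickle_members(archive, max_members: int = 10_000):
    """Return (member_names, notices) for entries that look like pickles."""
    members, notices = [], []
    with zipfile.ZipFile(archive) as zf:
        infos = [info for info in zf.infolist() if not info.is_dir()]
        if len(infos) > max_members:
            # Report the cap instead of silently skipping entries.
            notices.append(f"member cap hit: inspected {max_members} of {len(infos)} entries")
            infos = infos[:max_members]
        for info in infos:
            with zf.open(info) as fh:
                head = fh.read(2)
            if info.filename.endswith(PICKLE_SUFFIXES) or looks_like_pickle(head):
                members.append(info.filename)
    return members, notices


def build_demo_archive() -> io.BytesIO:
    """Build a tiny in-memory archive with one named and one hidden pickle."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("archive/data.pkl", pickle.dumps({"weights": [0.1, 0.2]}))
        zf.writestr("archive/version", b"3\n")
        zf.writestr("archive/extra.bin", pickle.dumps([1, 2, 3]))
        zf.writestr("archive/data/0", b"\x00" * 16)
    buf.seek(0)
    return buf


print(find_pickle_members(build_demo_archive()))
```

A two-byte sniff can misclassify raw tensor storage that happens to begin with the same bytes, so a real scanner needs stronger evidence before treating a member as a pickle.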

Building from source

Wheels cover five targets; any other platform or a custom Python ABI requires building from source:

# Requires Rust 1.83+ and a working C toolchain
pip install modelaudit-picklescan --no-binary modelaudit-picklescan

From a checkout:

pip install packages/modelaudit-picklescan
# or, for development, build the Rust extension and install it into the active virtualenv:
maturin develop --release -m packages/modelaudit-picklescan/Cargo.toml

Stability and versioning

modelaudit-picklescan follows semantic versioning. 0.x should be read as pre-1.0 — expect small adjustments as the API settles. The working intent, reflected in the current code, is:

- Resource-control defaults (ScanOptions) are tuned conservatively; changes that relax a default will be called out in the changelog.
- Public report models (PickleReport, Finding, Notice, ScanError) and their field names are the supported surface for serialization and downstream tooling.
- Rule codes are intended to be additive (new codes rather than renames) so that downstream allowlists and suppressions remain stable.
- Verdict semantics: SafetyVerdict.CLEAN is returned only when ScanStatus.COMPLETE holds and there are no findings; truncation, timeouts, and engine errors never produce CLEAN. This is enforced in _combine_verdict / _with_*_notice in api.py.
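The verdict invariant stated above can be expressed as a small function and checked exhaustively. This is a sketch of the rule as described, not the code in api.py: `ExampleStatus`, `ExampleVerdict`, and `combine_verdict` are hypothetical, and the mapping from critical findings to malicious and warning findings to suspicious is an assumption.

```python
from enum import Enum
from itertools import product


class ExampleStatus(Enum):
    COMPLETE = "complete"
    INCONCLUSIVE = "inconclusive"
    ERROR = "error"


class ExampleVerdict(Enum):
    CLEAN = "clean"
    SUSPICIOUS = "suspicious"
    MALICIOUS = "malicious"
    UNKNOWN = "unknown"


def combine_verdict(status: ExampleStatus, severities: frozenset) -> ExampleVerdict:
    """Derive a verdict; CLEAN requires a complete scan with no findings."""
    if "critical" in severities:  # assumed mapping
        return ExampleVerdict.MALICIOUS
    if "warning" in severities:  # assumed mapping
        return ExampleVerdict.SUSPICIOUS
    if status is ExampleStatus.COMPLETE:
        return ExampleVerdict.CLEAN
    return ExampleVerdict.UNKNOWN


# Exhaustively confirm that CLEAN never appears for an incomplete scan
# or alongside any finding.
severity_sets = [frozenset(), frozenset({"warning"}), frozenset({"critical"}),
                 frozenset({"warning", "critical"})]
for status, severities in product(ExampleStatus, severity_sets):
    verdict = combine_verdict(status, severities)
    if verdict is ExampleVerdict.CLEAN:
        assert status is ExampleStatus.COMPLETE and not severities
print("invariant holds for all combinations")
```

Because the input space is tiny, checking every combination is cheaper and more convincing than sampling a few cases.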

Any change to the items above will be announced in CHANGELOG.md and the GitHub release notes.

Security and reporting

Please do not open public GitHub issues for suspected vulnerabilities. See the project security policy for coordinated disclosure.

License

MIT. See LICENSE.

Project details

Release history

0.1.5

May 3, 2026

0.1.4

May 2, 2026

This version

0.1.3

Apr 27, 2026

0.1.2

Apr 17, 2026

Download files

Source Distribution

modelaudit_picklescan-0.1.3.tar.gz (190.8 kB)

Uploaded Apr 27, 2026. Source

Built Distributions

modelaudit_picklescan-0.1.3-cp310-abi3-win_amd64.whl (450.8 kB)

Uploaded Apr 27, 2026. CPython 3.10+, Windows x86-64

modelaudit_picklescan-0.1.3-cp310-abi3-manylinux_2_28_x86_64.whl (593.6 kB)

Uploaded Apr 27, 2026. CPython 3.10+, manylinux: glibc 2.28+ x86-64

modelaudit_picklescan-0.1.3-cp310-abi3-manylinux_2_28_aarch64.whl (589.5 kB)

Uploaded Apr 27, 2026. CPython 3.10+, manylinux: glibc 2.28+ ARM64

modelaudit_picklescan-0.1.3-cp310-abi3-macosx_11_0_arm64.whl (546.4 kB)

Uploaded Apr 27, 2026. CPython 3.10+, macOS 11.0+ ARM64

modelaudit_picklescan-0.1.3-cp310-abi3-macosx_10_12_x86_64.whl (549.9 kB)

Uploaded Apr 27, 2026. CPython 3.10+, macOS 10.12+ x86-64

File details

Details for the file modelaudit_picklescan-0.1.3.tar.gz.

File metadata

Download URL: modelaudit_picklescan-0.1.3.tar.gz
Upload date: Apr 27, 2026
Size: 190.8 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for modelaudit_picklescan-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`e7f2bff25765ec670b39d783bbf8f8f77537bf2cebb9f6d45349b093546a2615`
MD5	`268c2ca9aba6c406b7d25e485fe57482`
BLAKE2b-256	`ea5e93a439c5364e6ec38f768bdb0264be43280ff0d1e8c74bb08d1892719c9b`

See more details on using hashes here.

Provenance

The following attestation bundles were made for modelaudit_picklescan-0.1.3.tar.gz:

Publisher: release-please.yml on promptfoo/modelaudit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: modelaudit_picklescan-0.1.3.tar.gz
- Subject digest: e7f2bff25765ec670b39d783bbf8f8f77537bf2cebb9f6d45349b093546a2615
- Sigstore transparency entry: 1392549407
- Sigstore integration time: Apr 27, 2026
Source repository:
- Permalink: promptfoo/modelaudit@dca64f84f46b7f722dee1731450a1b148ec3bd2b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/promptfoo
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-please.yml@dca64f84f46b7f722dee1731450a1b148ec3bd2b
- Trigger Event: push

File details

Details for the file modelaudit_picklescan-0.1.3-cp310-abi3-win_amd64.whl.

File metadata

Download URL: modelaudit_picklescan-0.1.3-cp310-abi3-win_amd64.whl
Upload date: Apr 27, 2026
Size: 450.8 kB
Tags: CPython 3.10+, Windows x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for modelaudit_picklescan-0.1.3-cp310-abi3-win_amd64.whl
Algorithm	Hash digest
SHA256	`e3bc3eb371dfb5146e1325724f55c7776278c08d5acf7ffd117bfcc19874c0ff`
MD5	`3e0ef8917135665ec88b74c65c25bfc6`
BLAKE2b-256	`f352bf0efad8ee0fd9556b1dd21add20081886135aa1c8e8f58b8afbc3c8091c`

Provenance

The following attestation bundles were made for modelaudit_picklescan-0.1.3-cp310-abi3-win_amd64.whl:

Publisher: release-please.yml on promptfoo/modelaudit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: modelaudit_picklescan-0.1.3-cp310-abi3-win_amd64.whl
- Subject digest: e3bc3eb371dfb5146e1325724f55c7776278c08d5acf7ffd117bfcc19874c0ff
- Sigstore transparency entry: 1392549411
- Sigstore integration time: Apr 27, 2026
Source repository:
- Permalink: promptfoo/modelaudit@dca64f84f46b7f722dee1731450a1b148ec3bd2b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/promptfoo
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-please.yml@dca64f84f46b7f722dee1731450a1b148ec3bd2b
- Trigger Event: push

File details

Details for the file modelaudit_picklescan-0.1.3-cp310-abi3-manylinux_2_28_x86_64.whl.

File metadata

Download URL: modelaudit_picklescan-0.1.3-cp310-abi3-manylinux_2_28_x86_64.whl
Upload date: Apr 27, 2026
Size: 593.6 kB
Tags: CPython 3.10+, manylinux: glibc 2.28+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for modelaudit_picklescan-0.1.3-cp310-abi3-manylinux_2_28_x86_64.whl
Algorithm	Hash digest
SHA256	`a31fb74fec29eb356289b253de3d006a36a7bd1d11f46ac219a91e509212afa5`
MD5	`48ce2fd2f69da3467e91b514f6641970`
BLAKE2b-256	`29bf214f656cf2dfe3e7c5b76d0be5d55c1d71bff39fca7a3774bc913f33c9a6`

Provenance

The following attestation bundles were made for modelaudit_picklescan-0.1.3-cp310-abi3-manylinux_2_28_x86_64.whl:

Publisher: release-please.yml on promptfoo/modelaudit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: modelaudit_picklescan-0.1.3-cp310-abi3-manylinux_2_28_x86_64.whl
- Subject digest: a31fb74fec29eb356289b253de3d006a36a7bd1d11f46ac219a91e509212afa5
- Sigstore transparency entry: 1392549414
- Sigstore integration time: Apr 27, 2026
Source repository:
- Permalink: promptfoo/modelaudit@dca64f84f46b7f722dee1731450a1b148ec3bd2b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/promptfoo
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-please.yml@dca64f84f46b7f722dee1731450a1b148ec3bd2b
- Trigger Event: push

File details

Details for the file modelaudit_picklescan-0.1.3-cp310-abi3-manylinux_2_28_aarch64.whl.

File metadata

Download URL: modelaudit_picklescan-0.1.3-cp310-abi3-manylinux_2_28_aarch64.whl
Upload date: Apr 27, 2026
Size: 589.5 kB
Tags: CPython 3.10+, manylinux: glibc 2.28+ ARM64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for modelaudit_picklescan-0.1.3-cp310-abi3-manylinux_2_28_aarch64.whl
Algorithm	Hash digest
SHA256	`a148750c58b04d3c039688955f74ed24b8a517d08db3e5dd8c867bc49ddfde9c`
MD5	`aa962ef8871eea2cfe27bbc6a141b3d1`
BLAKE2b-256	`eac88393faa7f9c7d5fa0080c07577e88f6f3f32233dd5cc443927a75e933837`

Provenance

The following attestation bundles were made for modelaudit_picklescan-0.1.3-cp310-abi3-manylinux_2_28_aarch64.whl:

Publisher: release-please.yml on promptfoo/modelaudit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: modelaudit_picklescan-0.1.3-cp310-abi3-manylinux_2_28_aarch64.whl
- Subject digest: a148750c58b04d3c039688955f74ed24b8a517d08db3e5dd8c867bc49ddfde9c
- Sigstore transparency entry: 1392549413
- Sigstore integration time: Apr 27, 2026
Source repository:
- Permalink: promptfoo/modelaudit@dca64f84f46b7f722dee1731450a1b148ec3bd2b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/promptfoo
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-please.yml@dca64f84f46b7f722dee1731450a1b148ec3bd2b
- Trigger Event: push

File details

Details for the file modelaudit_picklescan-0.1.3-cp310-abi3-macosx_11_0_arm64.whl.

File metadata

Download URL: modelaudit_picklescan-0.1.3-cp310-abi3-macosx_11_0_arm64.whl
Upload date: Apr 27, 2026
Size: 546.4 kB
Tags: CPython 3.10+, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for modelaudit_picklescan-0.1.3-cp310-abi3-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`f4d977b552db8bd23ef8e2436aaec4f47d8dffd7783d564266fbdd2ff50a60c5`
MD5	`cfc66b8f7eba55867c46283d9be37174`
BLAKE2b-256	`a6a07ae3370d555d50930ae51123b53e85e0aff9d5d69e51dc4d2b90d7fc0ba3`

Provenance

The following attestation bundles were made for modelaudit_picklescan-0.1.3-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: release-please.yml on promptfoo/modelaudit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: modelaudit_picklescan-0.1.3-cp310-abi3-macosx_11_0_arm64.whl
- Subject digest: f4d977b552db8bd23ef8e2436aaec4f47d8dffd7783d564266fbdd2ff50a60c5
- Sigstore transparency entry: 1392549415
- Sigstore integration time: Apr 27, 2026
Source repository:
- Permalink: promptfoo/modelaudit@dca64f84f46b7f722dee1731450a1b148ec3bd2b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/promptfoo
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-please.yml@dca64f84f46b7f722dee1731450a1b148ec3bd2b
- Trigger Event: push

File details

Details for the file modelaudit_picklescan-0.1.3-cp310-abi3-macosx_10_12_x86_64.whl.

File metadata

Download URL: modelaudit_picklescan-0.1.3-cp310-abi3-macosx_10_12_x86_64.whl
Upload date: Apr 27, 2026
Size: 549.9 kB
Tags: CPython 3.10+, macOS 10.12+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for modelaudit_picklescan-0.1.3-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm	Hash digest
SHA256	`fa22c9201da10f6493fb96834ecda0d8abe9aa3438581facf2c432c06740f361`
MD5	`160aee0b923fc250b80fe9e2ba078de8`
BLAKE2b-256	`092bbfeadc1d0c942870c6df8a81413d7301268ed838e66d8222729a193ef911`

Provenance

The following attestation bundles were made for modelaudit_picklescan-0.1.3-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: release-please.yml on promptfoo/modelaudit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: modelaudit_picklescan-0.1.3-cp310-abi3-macosx_10_12_x86_64.whl
- Subject digest: fa22c9201da10f6493fb96834ecda0d8abe9aa3438581facf2c432c06740f361
- Sigstore transparency entry: 1392549418
- Sigstore integration time: Apr 27, 2026
Source repository:
- Permalink: promptfoo/modelaudit@dca64f84f46b7f722dee1731450a1b148ec3bd2b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/promptfoo
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-please.yml@dca64f84f46b7f722dee1731450a1b148ec3bd2b
- Trigger Event: push
