Chunked pickle serialization — split large objects into Git-friendly slices.

pickle-jar

A container of pickle slices.

Serialize Python objects to disk the same way you'd use pickle — but instead of one monolithic file, pickle-jar splits the output into small, numbered chunks inside a directory. This makes it easy to commit large serialized objects (ML model weights, embeddings, datasets) to Git repositories that enforce per-file size limits.

Install

pip install pickle-jar

Quick start

import jar

# Save any picklable object
jar.dump(my_model.state_dict(), "model_weights")

# Load it back
weights = jar.load("model_weights")

The jar.dump call above creates a directory called model_weights/ containing numbered chunk files (0.pkl, 1.pkl, …). Each chunk defaults to 5 MB (5,000,000 bytes), well below the 100 MB per-file limit that GitHub enforces.
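To make the chunking idea concrete, here is a minimal sketch of how such a dump could work. It is not pickle-jar's actual implementation: the function name dump_chunked is hypothetical, and it assumes the object is pickled in one pass and the resulting bytes are sliced into fixed-size pieces.

```python
import pickle
import shutil
from pathlib import Path


def dump_chunked(obj, path, chunk_size=5_000_000):
    """Pickle obj and write it as numbered chunk files inside path.

    Hypothetical illustration only, not pickle-jar's own code.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    directory = Path(path)
    if directory.exists():
        # Assumption: an existing directory is replaced wholesale.
        shutil.rmtree(directory)
    directory.mkdir(parents=True)

    payload = pickle.dumps(obj)
    count = 0
    # Slice the pickled bytes into pieces of at most chunk_size bytes.
    for start in range(0, len(payload), chunk_size):
        chunk = payload[start:start + chunk_size]
        (directory / f"{count}.pkl").write_bytes(chunk)
        count += 1
    return count
```

Note that under this approach only the concatenation of all chunks is a valid pickle stream; an individual chunk file cannot be unpickled on its own.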

API

jar.dump(obj, path, chunk_size=5_000_000)

Serialize obj and write it as chunked .pkl files inside path.

Parameter    Type         Description
obj          Any          Any picklable Python object.
path         str | Path   Directory to create (overwritten if it exists).
chunk_size   int          Max bytes per chunk file. Default 5_000_000 (5 MB).

Returns the number of chunk files written.
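If you want a rough idea of the chunk count before writing anything, the arithmetic can be sketched as below. This assumes the pickled byte stream is split evenly into pieces of at most chunk_size bytes, which the description above suggests but does not guarantee; estimate_chunk_count is a hypothetical helper, not part of the jar API.

```python
import math
import pickle


def estimate_chunk_count(obj, chunk_size=5_000_000):
    """Estimate how many chunk files a pickled object would need.

    Hypothetical helper; assumes even slicing of the pickled bytes.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    payload_size = len(pickle.dumps(obj))
    return math.ceil(payload_size / chunk_size)
```

The estimate pickles the object in memory, so it costs about as much as the serialization step of a real dump.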

jar.load(path)

Reassemble and deserialize an object from a jar directory.

Parameter    Type         Description
path         str | Path   Directory previously created by jar.dump.

Returns the deserialized Python object.
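Reassembly can be sketched in a few lines. This is an illustration under assumptions, not pickle-jar's own code: load_chunked is a hypothetical name, and it assumes the directory holds only files named <number>.pkl whose contents concatenate into a single pickle stream. The chunks are sorted numerically because a plain string sort would place 10.pkl before 2.pkl.

```python
import pickle
from pathlib import Path


def load_chunked(path):
    """Concatenate numbered chunk files in order and unpickle the result.

    Hypothetical illustration only; raises ValueError if a .pkl file
    has a non-numeric name.
    """
    directory = Path(path)
    # Sort by the integer in the file name, not by the name as a string.
    chunk_files = sorted(directory.glob("*.pkl"), key=lambda p: int(p.stem))
    if not chunk_files:
        raise FileNotFoundError(f"no .pkl chunks found in {directory}")
    payload = b"".join(p.read_bytes() for p in chunk_files)
    return pickle.loads(payload)
```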

Tuning chunk size

# Smaller chunks for strict hosting limits
jar.dump(obj, "output", chunk_size=1_000_000)   # 1 MB per file

# Larger chunks when size limits aren't a concern
jar.dump(obj, "output", chunk_size=50_000_000)  # 50 MB per file
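After choosing a chunk size, it can be worth confirming that nothing in the output directory exceeds your host's limit before committing. The helper below is a small sketch using only the standard library; oversized_files is a hypothetical name and not part of pickle-jar.

```python
from pathlib import Path


def oversized_files(path, limit_bytes):
    """Return sorted (name, size) pairs for files larger than limit_bytes."""
    results = []
    for entry in Path(path).iterdir():
        if not entry.is_file():
            continue
        size = entry.stat().st_size
        if size > limit_bytes:
            results.append((entry.name, size))
    return sorted(results)
```

An empty list means every file in the directory fits within the limit.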

Security warning

pickle-jar uses Python's pickle module under the hood, and unpickling data with pickle.loads() can execute arbitrary code. Never load jar directories from untrusted sources. The same caveat applies to pickle itself, to torch.load when called with weights_only=False, and to similar serialization tools.
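For context, the standard library documents one mitigation: subclassing pickle.Unpickler and overriding find_class so that only an allow-list of globals can be resolved. The sketch below follows that pattern. It is not a pickle-jar feature, the names RestrictedUnpickler and restricted_loads are illustrative, and the allow-list is an assumption that suits only plain data.

```python
import builtins
import io
import pickle

# Assumption: only these harmless builtins may be referenced by a pickle.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global outside SAFE_BUILTINS."""

    def find_class(self, module, name):
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            f"global '{module}.{name}' is forbidden"
        )


def restricted_loads(data):
    """Like pickle.loads, but rejects disallowed globals."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

This narrows the attack surface but does not make pickle safe in general, and real payloads such as model weights usually need globals that a strict allow-list would reject. Treating untrusted jar directories as off-limits remains the safer rule.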

Development

# Clone and set up
git clone https://github.com/jkvc/pickle-jar.git
cd pickle-jar
uv venv .venv && source .venv/bin/activate
uv pip install -e ".[dev]"

# Run tests & lint
pytest tests/ -v
ruff check .

License

MIT
