Seamless serialization and deserialization of complex Python objects — a more portable and readable alternative to pickle for data-heavy workflows.
DataZip
DataZip is a Python library that extends zipfile.ZipFile to provide seamless serialization and deserialization of complex Python objects — a more portable and readable alternative to pickle for data science workflows.
Why DataZip?
- Human-inspectable archives: DataZip files are standard .zip files. You can open them with any archive tool and inspect the contents.
- Broad type support: Works out of the box with pandas DataFrames/Series, NumPy arrays, Polars DataFrames, datetimes, paths, sets, frozensets, complex numbers, and custom classes.
- Efficient storage: Tabular data is stored as Parquet and arrays as .npy. JSON is used for metadata and simple types.
- Lazy loading: Objects and data are deserialized only when they are accessed, so individual objects can be loaded efficiently from very large files. Nested access avoids deserializing unnecessary enclosing objects.
- No pickle by default: Most types are serialized without pickle, making files safer and more portable.
- Custom class integration: Any class that implements __getstate__/__setstate__ (the standard pickle protocol) works automatically. The IOMixin makes it even simpler.
- Pluggable type support: Teach DataZip how to handle any third-party or stdlib type by registering encoder/decoder pairs with DataZip.register_coders. The bundled NumPy, pandas, Polars, and Plotly integrations are themselves built on this hook; see the User Guide for details.
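To make the "standard .zip" and `__getstate__`/`__setstate__` points concrete, here is a minimal sketch that uses only the standard library. It is not DataZip's implementation or on-disk layout: the `Thermostat` class, the `save_state`/`load_state` helpers, and the `<name>.json` member naming are all hypothetical and chosen purely for illustration.

```python
import json
import zipfile
from io import BytesIO


class Thermostat:
    """Hypothetical example class; not part of DataZip."""

    def __init__(self, target: float, mode: str = "auto") -> None:
        self.target = target
        self.mode = mode

    def __getstate__(self) -> dict:
        # Describe the instance with plain, JSON-friendly values.
        return {"target": self.target, "mode": self.mode}

    def __setstate__(self, state: dict) -> None:
        # Restore attributes from a previously captured state.
        self.target = state["target"]
        self.mode = state["mode"]


def save_state(archive: zipfile.ZipFile, name: str, obj) -> None:
    """Write the object's state as a JSON member of an ordinary zip archive."""
    archive.writestr(f"{name}.json", json.dumps(obj.__getstate__()))


def load_state(archive: zipfile.ZipFile, name: str, cls: type):
    """Rebuild an instance from its stored state without using pickle."""
    state = json.loads(archive.read(f"{name}.json"))
    obj = cls.__new__(cls)  # create the instance without calling __init__
    obj.__setstate__(state)
    return obj


buffer = BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    save_state(archive, "thermostat", Thermostat(21.5, mode="eco"))

with zipfile.ZipFile(buffer, "r") as archive:
    member_names = archive.namelist()  # any zip tool could list these members
    restored = load_state(archive, "thermostat", Thermostat)
```

Because the state is stored as readable JSON inside a regular zip member, the archive can be inspected without running any class code, which is the property the list above contrasts with pickle.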
Quick Example

    from io import BytesIO

    import pandas as pd
    from datazip import DataZip

    # Write
    buffer = BytesIO()
    with DataZip(buffer, "w") as z:
        z["df"] = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
        z["config"] = {"threshold": 0.5, "labels": ["a", "b"]}
        z["values"] = {1, 2, frozenset([3, 4])}

    # Read
    with DataZip(buffer, "r") as z:
        df = z["df"]
        config = z["config"]

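The dictionary-style reads in the example pair naturally with the lazy loading described earlier. The sketch below shows the general idea using only the standard library. `LazyJsonZip` is a hypothetical class, not DataZip's reader, and it assumes every member is a JSON file named `<key>.json`.

```python
import json
import zipfile
from io import BytesIO


class LazyJsonZip:
    """Hypothetical reader that decodes a member only when it is requested."""

    def __init__(self, file) -> None:
        self._zip = zipfile.ZipFile(file, "r")
        self._cache: dict = {}
        self.decode_count = 0  # illustration only: number of members decoded

    def __getitem__(self, key: str):
        if key not in self._cache:
            # The member is read and parsed only at first access.
            raw = self._zip.read(f"{key}.json")
            self._cache[key] = json.loads(raw)
            self.decode_count += 1
        return self._cache[key]

    def keys(self) -> list:
        # Listing keys needs only the zip directory, not the member contents.
        suffix = ".json"
        return [n[: -len(suffix)] for n in self._zip.namelist() if n.endswith(suffix)]

    def close(self) -> None:
        self._zip.close()


# Build a small archive with one tiny and one larger member.
buffer = BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("config.json", json.dumps({"threshold": 0.5, "labels": ["a", "b"]}))
    archive.writestr("big.json", json.dumps(list(range(100_000))))

reader = LazyJsonZip(buffer)
available = reader.keys()        # no member has been decoded yet
config = reader["config"]        # decodes only config.json
decoded_after_one_access = reader.decode_count
reader.close()
```

In this sketch the large `big.json` member is never parsed, because only `config` was requested.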
Supported Types
| Category | Types |
|---|---|
| Primitives | str, int, float, bool, None, complex |
| Collections | dict, list, tuple, set, frozenset, deque, defaultdict |
| Date/Time | datetime, pandas.Timestamp |
| Paths | pathlib.Path |
| Custom | Any class with __getstate__/__setstate__ |
| Optional | numpy.ndarray, pandas.DataFrame, pandas.Series, polars.DataFrame, polars.LazyFrame, polars.Series, xarray.Dataset, Plotly figures |
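Several of the types in the table (sets, frozensets, complex numbers, datetimes, paths) have no native JSON representation. One common way to carry them through JSON is a tagged encoding, sketched below with the standard library. The `__type__` marker and the field names are assumptions made for this example; they are not DataZip's actual format.

```python
import json
from datetime import datetime
from pathlib import Path

TAG = "__type__"  # hypothetical marker key, chosen for this sketch


def encode_extra(obj):
    """json.dumps `default` hook: wrap unsupported types in a tagged dict."""
    if isinstance(obj, (set, frozenset)):
        return {TAG: type(obj).__name__, "items": list(obj)}
    if isinstance(obj, complex):
        return {TAG: "complex", "real": obj.real, "imag": obj.imag}
    if isinstance(obj, datetime):
        return {TAG: "datetime", "iso": obj.isoformat()}
    if isinstance(obj, Path):
        return {TAG: "path", "path": str(obj)}
    raise TypeError(f"Cannot encode {type(obj).__name__}")


def decode_extra(d: dict):
    """json.loads `object_hook`: turn tagged dicts back into Python objects."""
    kind = d.get(TAG)
    if kind == "set":
        return set(d["items"])
    if kind == "frozenset":
        return frozenset(d["items"])
    if kind == "complex":
        return complex(d["real"], d["imag"])
    if kind == "datetime":
        return datetime.fromisoformat(d["iso"])
    if kind == "path":
        return Path(d["path"])
    return d  # ordinary dicts pass through unchanged


original = {
    "values": {1, 2, frozenset([3, 4])},
    "z": 1 + 2j,
    "when": datetime(2024, 1, 2, 3, 4, 5),
    "where": Path("data/raw.csv"),
}

text = json.dumps(original, default=encode_extra)
restored = json.loads(text, object_hook=decode_extra)
```

Note that `json` converts tuples to lists before the `default` hook is consulted, so preserving tuples would need a different approach, such as pre-processing the object tree before encoding.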
Installation
pip install datazip
See the Installation page for full details including optional dependencies.
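Since support for the types in the "Optional" row depends on third-party packages, it can be useful to check which of them are importable in the current environment. The snippet below is a small standard-library sketch. The package list is taken from the table above, and `available_optional_packages` is a hypothetical helper, not part of DataZip.

```python
from importlib.util import find_spec

# Import names of the third-party packages mentioned in the "Optional" row.
OPTIONAL_PACKAGES = ["numpy", "pandas", "polars", "xarray", "plotly"]


def available_optional_packages(names=OPTIONAL_PACKAGES) -> dict:
    """Map each package name to True if it can be imported, else False."""
    return {name: find_spec(name) is not None for name in names}


status = available_optional_packages()
for name, installed in status.items():
    print(f"{name}: {'installed' if installed else 'missing'}")
```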
File details
Details for the file datazip-0.3.0.tar.gz.
File metadata
- Download URL: datazip-0.3.0.tar.gz
- Upload date:
- Size: 26.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.11.16 {"installer":{"name":"uv","version":"0.11.16","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 29e7e4fc6d25ec475a58bd1104967eef89564257c91d45a76fe64f1fb7d64eb9 |
| MD5 | 3dd428e53d751bdda272e50b225bd061 |
| BLAKE2b-256 | fd938a1648604ecf81229e668efe1d2be19553e9bf7a595c519d0178240bc9bd |
File details
Details for the file datazip-0.3.0-py3-none-any.whl.
File metadata
- Download URL: datazip-0.3.0-py3-none-any.whl
- Upload date:
- Size: 17.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.11.16 {"installer":{"name":"uv","version":"0.11.16","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 97375169ea22f5c4f26d2d938e65350f933d555e193e21f79407d226ee60334d |
| MD5 | 52febee6ba2758ae899c7be2f4fd39af |
| BLAKE2b-256 | e0e07a873530819aab872c3912b5b61df26e9ebbfc6c16aa472fecc83151eda5 |