IO for multiple python objects to/from a single file

packio

Packio lets you store and retrieve multiple Python objects in a single file. See the docs.

Example:

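import tempfile
from pathlib import Path

import dummio
import pandas as pd
from packio import Reader, Writer

# define some objects and an output filepath
# (a temporary directory stands in for pytest's tmp_path fixture)
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
lookup = {"a": 1, "b": 2}
tmp_path = Path(tempfile.mkdtemp())
filepath = tmp_path / "data.packio"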

# save both objects to the same filepath
with Writer(filepath) as writer:
    df.to_parquet(writer.file("df.parquet"))
    dummio.json.save(lookup, filepath=writer.file("lookup.json"))

# load the objects from the file
with Reader(filepath) as reader:
    df2 = pd.read_parquet(reader.file("df.parquet"))
    lookup2 = dummio.json.load(reader.file("lookup.json"))

assert df.equals(df2)
assert lookup == lookup2

Available on PyPI: pip install packio

Why a single file and not a directory?

In a word, encapsulation. Copying or moving a single file is simpler than copying or moving a directory, especially when transferring data across platforms, such as to or from cloud storage. A file is also more tamper-resistant: it is typically harder to accidentally modify the contents of a file than to add or remove files or subdirectories in a directory.
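To make the single-file idea concrete, here is a minimal sketch that uses only the standard library's zipfile module. It illustrates the general idea of bundling several objects into one file and is not a description of how packio works internally. The helper names save_bundle and load_bundle are hypothetical.

```python
import json
import tempfile
import zipfile
from pathlib import Path


def save_bundle(path, objects):
    """Write each JSON-serializable object as one member of a single zip file.

    Hypothetical helper for illustration only; not part of packio.
    """
    with zipfile.ZipFile(path, "w") as archive:
        for name, obj in objects.items():
            archive.writestr(f"{name}.json", json.dumps(obj))


def load_bundle(path):
    """Read every member back into a dict keyed by the member's stem."""
    with zipfile.ZipFile(path) as archive:
        return {
            Path(name).stem: json.loads(archive.read(name))
            for name in archive.namelist()
        }


# One file on disk holds both objects, so it can be copied or uploaded as a unit.
bundle_path = Path(tempfile.mkdtemp()) / "bundle.zip"
objects = {"lookup": {"a": 1, "b": 2}, "columns": ["a", "b"]}
save_bundle(bundle_path, objects)
restored = load_bundle(bundle_path)
```

Moving bundle_path moves everything at once, whereas a directory layout would require a recursive copy and leaves room for members to go missing along the way.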

Why not pickle?

Although pickle may be the most common approach for serializing complex Python objects, there are strong reasons to avoid it. As summarized by Gemini, "Python's pickle module, while convenient, has drawbacks. It poses security risks due to potential code execution vulnerabilities when handling untrusted data. Compatibility issues arise because it's Python-specific and version-dependent. Maintaining pickle can be challenging due to refactoring difficulties and complex debugging." See also Ben Frederickson.
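The code-execution concern can be seen in a small, harmless sketch. The class name Payload is hypothetical, and int stands in for the far more dangerous callables that an untrusted pickle could name. The point is that the pickle stream, not the loading program, decides which callable runs during loading.

```python
import pickle


class Payload:
    """Hypothetical class whose pickle stream names a callable to run on load."""

    def __reduce__(self):
        # pickle stores "call int('42')" instead of the object's own state.
        # A malicious stream could name any importable callable here.
        return (int, ("42",))


blob = pickle.dumps(Payload())

# Loading calls int("42"); no Payload instance is ever reconstructed.
result = pickle.loads(blob)
```

Here result is the integer returned by the callable, not a Payload, which is why unpickling data from an untrusted source is unsafe.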

Development

Create and activate a virtual environment for development:

git clone git@github.com:zkurtz/packio.git
cd packio
pip install uv
uv sync
source .venv/bin/activate
pre-commit install
