
IO for multiple Python objects to/from a single file


packio

Packio lets you store and retrieve multiple Python objects using a single file. For example:

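import tempfile
from pathlib import Path

import dummio
import pandas as pd
from packio import Reader, Writer

# define some objects and an output filepath
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
lookup = {"a": 1, "b": 2}
# tmp_path can be any writable directory (e.g. the pytest tmp_path fixture)
tmp_path = Path(tempfile.mkdtemp())
filepath = tmp_path / "data.packio"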

# save both objects to the same filepath
with Writer(filepath) as writer:
    df.to_parquet(writer.file("df.parquet"))
    dummio.json.save(lookup, filepath=writer.file("lookup.json"))

# load the objects from the file
with Reader(filepath) as reader:
    df2 = pd.read_parquet(reader.file("df.parquet"))
    lookup2 = dummio.json.load(reader.file("lookup.json"))

assert df.equals(df2)
assert lookup == lookup2

Available on PyPI: pip install packio

Why a single file and not a directory?

In a word, encapsulation. Copying or moving a single file is simpler than copying or moving a directory, especially when moving data across platforms, such as to or from cloud storage. A file is also more tamper-resistant: it is typically harder to accidentally modify the contents of a file than to add or remove files or subdirectories in a directory.
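To make the encapsulation argument concrete, here is a minimal sketch using only the standard library. It is not a description of how packio works internally; it assumes a plain zip archive as the container, and the object names and file names are hypothetical.

```python
import json
import shutil
import tempfile
import zipfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())
archive = workdir / "data.zip"

# Hypothetical objects to store; any serializable content would do.
objects = {
    "lookup.json": {"a": 1, "b": 2},
    "config.json": {"name": "demo", "version": 1},
}

# Write every object as a separate member of one archive file.
with zipfile.ZipFile(archive, "w") as zf:
    for name, obj in objects.items():
        zf.writestr(name, json.dumps(obj))

# Moving the whole collection is a single-file copy.
copied = Path(shutil.copy2(archive, workdir / "copy.zip"))

# Read each member back from the copy.
with zipfile.ZipFile(copied) as zf:
    restored = {name: json.loads(zf.read(name)) for name in zf.namelist()}

assert restored == objects
```

With a directory, the same transfer would need a recursive copy, and nothing would prevent a member file from being added or dropped along the way.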

Why not pickle?

Although pickle may be the most common approach for serializing complex Python objects, there are strong reasons to dislike it. As summarized by Gemini, "Python's pickle module, while convenient, has drawbacks. It poses security risks due to potential code execution vulnerabilities when handling untrusted data. Compatibility issues arise because it's Python-specific and version-dependent. Maintaining pickle can be challenging due to refactoring difficulties and complex debugging." See also Ben Frederickson.
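The code execution risk mentioned in the quote can be seen in a few lines. The sketch below uses a hypothetical Payload class and a deliberately harmless expression; it only illustrates that unpickling calls whatever callable the byte stream names.

```python
import pickle


class Payload:
    """Hypothetical class used only to illustrate the risk."""

    def __reduce__(self):
        # Tells pickle to "rebuild" this object by calling eval("6 * 7").
        # A hostile pickle could name os.system or any other callable instead.
        return (eval, ("6 * 7",))


data = pickle.dumps(Payload())

# Loading runs the callable named in the byte stream and returns its result,
# so the caller gets an int back rather than a Payload instance.
result = pickle.loads(data)
print(result)  # 42
```

Formats such as JSON or parquet describe data only, so loading them does not run arbitrary callables in this way.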

Development

Create and activate a virtual environment for development:

git clone git@github.com:zkurtz/packio.git
cd packio
pip install uv
uv sync
source .venv/bin/activate
pre-commit install
