High-performance, multi-language dataset storage format

Zippy (ZDS) Python Package

High-performance, HuggingFace-compatible dataset storage format.

Installation

pip install zippy-data

# With optional dependencies
pip install "zippy-data[pandas]"
pip install "zippy-data[all]"

Quick Start

from zippy import ZDSStore, ZDataset, ZIterableDataset

# Create a store
store = ZDSStore.open("./my_dataset", collection="train")

# Add documents
store.put("doc1", {"text": "Hello world", "label": 1})
store.put("doc2", {"text": "Goodbye world", "label": 0})

# Map-style dataset (random access)
dataset = store.to_dataset()
print(dataset[0])  # {'text': 'Hello world', 'label': 1}
print(len(dataset))  # 2

# Iterable dataset (streaming)
iterable = store.to_iterable_dataset()
for doc in iterable:
    print(doc)

# With shuffle buffer
for doc in iterable.shuffle(buffer_size=1000):
    print(doc)
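
The `shuffle(buffer_size=...)` call above presumably randomizes order while streaming, without holding the whole collection in memory. As a rough illustration of that general technique (not Zippy's actual implementation, which this page does not show), the sketch below implements a shuffle buffer in plain Python. The function name `shuffle_buffered` and its `seed` parameter are hypothetical.

```python
import random
from typing import Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def shuffle_buffered(
    items: Iterable[T], buffer_size: int, seed: Optional[int] = None
) -> Iterator[T]:
    """Yield every item exactly once, in an approximately shuffled order.

    Hypothetical helper for illustration only; it is not part of Zippy.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be >= 1")
    rng = random.Random(seed)
    buffer: List[T] = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything.
            buffer.append(item)
            continue
        # Emit a random buffered item and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # The stream is exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer


if __name__ == "__main__":
    for value in shuffle_buffered(range(10), buffer_size=4, seed=0):
        print(value)
```

A larger buffer gives an order closer to a full shuffle at the cost of more memory; a buffer of size 1 leaves the stream in its original order.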

Multi-Collection Management with ZDSStore

from zippy import ZDSStore

# Omit the collection argument to get a root-capable store handle
store = ZDSStore.open("./ml_data", native=True)

# Open as many collections as you need from the same handle
train = store.collection("train")
test = store.collection("test")
validation = store.collection("validation")

train.put("doc1", {"split": "train"})
test.put("doc1", {"split": "test"})
validation.put("doc1", {"split": "validation"})

print(store.list_collections())  # ['test', 'train', 'validation']

# Context manager closes the shared root when you're done
with ZDSStore.open("./ml_data", "train") as train_store:
    train_store.put("doc2", {"split": "train"})

> ℹ️ `ZDSRoot` still exists under the hood (and is exposed via `store.root`)
> for advanced scenarios that need explicit read/write modes or manual
> locking, but most workflows can just rely on the unified
> `ZDSStore.open()` entry point shown above.
>
> ⚠️ Closing the root invalidates every reader/writer for that path. Only call `store.root.close()` during shutdown/cleanup.

DataFrame Integration

from zippy import read_zds, to_zds

# Load as DataFrame (requires pandas)
df = read_zds("./my_dataset", collection="train")

# Export DataFrame to ZDS
to_zds(df, "./output", collection="exported")

HuggingFace Compatibility

ZDS datasets are designed to work seamlessly with HuggingFace training loops:

from zippy import ZIterableDataset

dataset = ZIterableDataset.from_store("./my_dataset", collection="train")

# Works with DataLoader
from torch.utils.data import DataLoader
loader = DataLoader(dataset, batch_size=32)
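
With `batch_size=32`, the loader groups consecutive samples from the iterable dataset and collates each group into one batch. As a rough, framework-free approximation of that step for dict-shaped documents, the sketch below batches records into dicts of lists. PyTorch's real default collation additionally converts numeric fields to tensors. `batch_records` and `collate_records` are hypothetical names, not part of Zippy or PyTorch.

```python
from typing import Any, Dict, Iterable, Iterator, List

Record = Dict[str, Any]


def collate_records(batch: List[Record]) -> Dict[str, List[Any]]:
    """Turn a list of dict records into one dict of per-field lists."""
    keys = batch[0].keys()
    return {key: [record[key] for record in batch] for key in keys}


def batch_records(
    records: Iterable[Record], batch_size: int
) -> Iterator[Dict[str, List[Any]]]:
    """Group consecutive records into collated batches (illustration only)."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    batch: List[Record] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield collate_records(batch)
            batch = []
    if batch:
        # The final batch may be smaller than batch_size.
        yield collate_records(batch)


if __name__ == "__main__":
    docs = [
        {"text": "Hello world", "label": 1},
        {"text": "Goodbye world", "label": 0},
    ]
    for batch in batch_records(docs, batch_size=2):
        print(batch)
```

The sketch assumes every record in a batch has the same keys, which matches the documents used in the Quick Start.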


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

zippy_data-0.1.2.tar.gz (80.0 kB)

Uploaded: Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

zippy_data-0.1.2-cp313-cp313-win_amd64.whl (522.6 kB)

Uploaded: CPython 3.13, Windows x86-64

zippy_data-0.1.2-cp313-cp313-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl (1.2 MB)

Uploaded: CPython 3.13, macOS 10.12+ universal2 (ARM64, x86-64), macOS 10.12+ x86-64, macOS 11.0+ ARM64

zippy_data-0.1.2-cp312-cp312-win_amd64.whl (523.6 kB)

Uploaded: CPython 3.12, Windows x86-64

zippy_data-0.1.2-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl (1.2 MB)

Uploaded: CPython 3.12, macOS 10.12+ universal2 (ARM64, x86-64), macOS 10.12+ x86-64, macOS 11.0+ ARM64

zippy_data-0.1.2-cp311-cp311-win_amd64.whl (523.2 kB)

Uploaded: CPython 3.11, Windows x86-64

zippy_data-0.1.2-cp311-cp311-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl (1.2 MB)

Uploaded: CPython 3.11, macOS 10.12+ universal2 (ARM64, x86-64), macOS 10.12+ x86-64, macOS 11.0+ ARM64

zippy_data-0.1.2-cp310-cp310-win_amd64.whl (522.5 kB)

Uploaded: CPython 3.10, Windows x86-64

zippy_data-0.1.2-cp310-cp310-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl (1.2 MB)

Uploaded: CPython 3.10, macOS 10.12+ universal2 (ARM64, x86-64), macOS 10.12+ x86-64, macOS 11.0+ ARM64

zippy_data-0.1.2-cp39-cp39-win_amd64.whl (523.2 kB)

Uploaded: CPython 3.9, Windows x86-64

zippy_data-0.1.2-cp39-cp39-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl (1.2 MB)

Uploaded: CPython 3.9, macOS 10.12+ universal2 (ARM64, x86-64), macOS 10.12+ x86-64, macOS 11.0+ ARM64

zippy_data-0.1.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (698.2 kB)

Uploaded: CPython 3.8, manylinux: glibc 2.17+ x86-64

File details

Details for the file zippy_data-0.1.2.tar.gz.

File metadata

  • Download URL: zippy_data-0.1.2.tar.gz
  • Upload date:
  • Size: 80.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for zippy_data-0.1.2.tar.gz
Algorithm Hash digest
SHA256 cced48e87cd5dcf505b3c9554ae7b7a7ca68c10e116f7d07b628036d4672d82c
MD5 66392359f1c8bf8a4ee77e34e1b1e4a3
BLAKE2b-256 e4e55a00c3a392bc4b3543eb80e635cd6e4b7601082c14c0faa3c38c2e3bfc84

See more details on using hashes here.
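
One way to use the digests above is to check a downloaded file before installing it. The following is a minimal sketch using only Python's standard `hashlib`. It assumes the source distribution was saved to the current directory under its original name, and the helper name `sha256_of` is hypothetical.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for zippy_data-0.1.2.tar.gz
EXPECTED_SHA256 = "cced48e87cd5dcf505b3c9554ae7b7a7ca68c10e116f7d07b628036d4672d82c"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    archive = Path("zippy_data-0.1.2.tar.gz")  # assumed download location
    if archive.exists():
        actual = sha256_of(archive)
        print("OK" if actual == EXPECTED_SHA256 else f"MISMATCH: {actual}")
    else:
        print(f"{archive} not found; download it first")
```

pip can also enforce digests itself through its hash-checking mode (`--require-hashes`, with `--hash=sha256:...` entries in a requirements file).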

Provenance

The following attestation bundles were made for zippy_data-0.1.2.tar.gz:

Publisher: release.yml on zippydata/zippy

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file zippy_data-0.1.2-cp313-cp313-win_amd64.whl.

File metadata

  • Download URL: zippy_data-0.1.2-cp313-cp313-win_amd64.whl
  • Upload date:
  • Size: 522.6 kB
  • Tags: CPython 3.13, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for zippy_data-0.1.2-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 f9708973ab81f22c615dcf09719ae006a5e196edab4d59db83bef26e9db56711
MD5 3b421050e0ef3c0e1b66390a63c3f82e
BLAKE2b-256 08e52fa19914f184000ad6aaf886d8e8ca9993ca7c13d3c478e90894910ede92

Provenance

The following attestation bundles were made for zippy_data-0.1.2-cp313-cp313-win_amd64.whl:

Publisher: release.yml on zippydata/zippy

File details

Details for the file zippy_data-0.1.2-cp313-cp313-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl.

File metadata

File hashes

Hashes for zippy_data-0.1.2-cp313-cp313-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl
Algorithm Hash digest
SHA256 ee49d53e63f5b5d43becdd391e41f77e0b686a35d3628b9050202c9d2094962b
MD5 8136745c595a867febb5a208ba5605ce
BLAKE2b-256 ee3070c6cbabec1faf58ace53c399488b27b3f064db3e68087fe1199dbb2eb28

Provenance

The following attestation bundles were made for zippy_data-0.1.2-cp313-cp313-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl:

Publisher: release.yml on zippydata/zippy

File details

Details for the file zippy_data-0.1.2-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: zippy_data-0.1.2-cp312-cp312-win_amd64.whl
  • Upload date:
  • Size: 523.6 kB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for zippy_data-0.1.2-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 af0dc85564fd40c4b0919582f615463633b2d212465d8763c80bc7e9a088e89b
MD5 843c1acfeb8f94d321ca01abd38c0bf2
BLAKE2b-256 989a9efdb5803e196fa5dd742f732e06e52b427a1b250475bf5eaf5b5cf3030f

Provenance

The following attestation bundles were made for zippy_data-0.1.2-cp312-cp312-win_amd64.whl:

Publisher: release.yml on zippydata/zippy

File details

Details for the file zippy_data-0.1.2-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl.

File metadata

File hashes

Hashes for zippy_data-0.1.2-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl
Algorithm Hash digest
SHA256 4f2ea6a1eb19e802142100e9c41fb7cef21d5e01fd8a698407c328a5ac9e3202
MD5 87fd8990d2141373e4966533f8d1d094
BLAKE2b-256 9eecaa598233fb0037a9408aa6f4e2a71e550508356c053f5457e5d61c3b7704

Provenance

The following attestation bundles were made for zippy_data-0.1.2-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl:

Publisher: release.yml on zippydata/zippy

File details

Details for the file zippy_data-0.1.2-cp311-cp311-win_amd64.whl.

File metadata

  • Download URL: zippy_data-0.1.2-cp311-cp311-win_amd64.whl
  • Upload date:
  • Size: 523.2 kB
  • Tags: CPython 3.11, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for zippy_data-0.1.2-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 01eca4a17fe565a3d8d22471a3edc60ca1288a4ad043fa648f2ee681397a3c8b
MD5 c1602a481e53bbab5b12c99714b277bc
BLAKE2b-256 587309416f2cd859ed5d212c71787cb7b6c108131c95e2e0600d046a9017edb2

Provenance

The following attestation bundles were made for zippy_data-0.1.2-cp311-cp311-win_amd64.whl:

Publisher: release.yml on zippydata/zippy

File details

Details for the file zippy_data-0.1.2-cp311-cp311-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl.

File metadata

File hashes

Hashes for zippy_data-0.1.2-cp311-cp311-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl
Algorithm Hash digest
SHA256 1d8a6567069d9492ff438825802fded7778e2b636f4c6e2fe9d8d8ef88eeb3d4
MD5 020c3688b769b49e87b34222a1a9daf8
BLAKE2b-256 cb173ab61faf596f3a9c49cfe383185b47d57836694ab97429caeecf533fd9d4

Provenance

The following attestation bundles were made for zippy_data-0.1.2-cp311-cp311-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl:

Publisher: release.yml on zippydata/zippy

File details

Details for the file zippy_data-0.1.2-cp310-cp310-win_amd64.whl.

File metadata

  • Download URL: zippy_data-0.1.2-cp310-cp310-win_amd64.whl
  • Upload date:
  • Size: 522.5 kB
  • Tags: CPython 3.10, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for zippy_data-0.1.2-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 875b0732c62ee7880e30dd250686735349f1fe23895ee431cbdf91772ed423c2
MD5 fc18eff7a6d2497eb9bd761dbaff8b76
BLAKE2b-256 24dd106e9f06fa9c0adbb5849c9e00cfd75d9a5722bb9124ace0d226142c1834

Provenance

The following attestation bundles were made for zippy_data-0.1.2-cp310-cp310-win_amd64.whl:

Publisher: release.yml on zippydata/zippy

File details

Details for the file zippy_data-0.1.2-cp310-cp310-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl.

File metadata

File hashes

Hashes for zippy_data-0.1.2-cp310-cp310-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl
Algorithm Hash digest
SHA256 674cd636dddef55932ae0d6f44170dfe6880bb2d6330d82dcc181cf373fdd03d
MD5 41fdef3d77bc8a067ac1407fa3ff9241
BLAKE2b-256 6431158ff3b9e4b5f5af5066a31e6e8c5ef0e3eedb3a26438619c8d17214eb0e

Provenance

The following attestation bundles were made for zippy_data-0.1.2-cp310-cp310-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl:

Publisher: release.yml on zippydata/zippy

File details

Details for the file zippy_data-0.1.2-cp39-cp39-win_amd64.whl.

File metadata

  • Download URL: zippy_data-0.1.2-cp39-cp39-win_amd64.whl
  • Upload date:
  • Size: 523.2 kB
  • Tags: CPython 3.9, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for zippy_data-0.1.2-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 06c066745c8f052da71242a553cfea4e4177ac1a32f2a42bd2f07986cb75fc2b
MD5 abf79e8e9e13c0e0c468f7bfbfa79f23
BLAKE2b-256 81c55c72e1893579b1acc0093b3057fe33d29e87411f42b238256a01cc106a83

Provenance

The following attestation bundles were made for zippy_data-0.1.2-cp39-cp39-win_amd64.whl:

Publisher: release.yml on zippydata/zippy

File details

Details for the file zippy_data-0.1.2-cp39-cp39-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl.

File metadata

File hashes

Hashes for zippy_data-0.1.2-cp39-cp39-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl
Algorithm Hash digest
SHA256 90f9f6dab88a2749d2732febee0cf4a3fc46641ccb330224b5e1e3ff72086b50
MD5 94c57ec6c62dc02441b60b7fe46e4434
BLAKE2b-256 b2862936a5154ec182627566eab8148ffbf98fe76d682fa49e442858a6ba7929

Provenance

The following attestation bundles were made for zippy_data-0.1.2-cp39-cp39-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl:

Publisher: release.yml on zippydata/zippy

File details

Details for the file zippy_data-0.1.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for zippy_data-0.1.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 f7f86368d3317b2c06a9a137cf4763e6f6b463b706c846bbfc3e79bb4aca1efb
MD5 3e0dc4c72c7ad580106063c9dfd6363c
BLAKE2b-256 200c1c9be670ee97dc624ceda46ee754608b402767ec7cae0da9dbacaff746df

Provenance

The following attestation bundles were made for zippy_data-0.1.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on zippydata/zippy
