Python bindings to libcozip, the reference writer for the Cloud-Optimized ZIP (cozip) format.

cozip — Cloud Optimized ZIP

What is cozip?

A ZIP file you can open like a table — over the network, without downloading it.

cozip puts a Parquet manifest called __metadata__ at byte 0 — one row per entry with name, offset, size, plus any columns you add (split, label, class...). DuckDB, Arrow, and Polars query it directly. Range requests fetch only the bytes you actually need.

A 20 GB archive becomes a queryable table.

It's still a ZIP. unzip, zipfile.ZipFile, your OS's preview window — all unchanged.
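
As a small illustration of that compatibility, the standard-library zipfile module can list entries with no cozip import at all. This is only a sketch: list_entries is a hypothetical helper name, and dataset.zip in the usage comment is a placeholder. How the __metadata__ manifest itself shows up in the listing is not covered here.

```python
import zipfile


def list_entries(archive):
    """Return (name, uncompressed size) for every entry in a ZIP archive.

    `archive` is a path or a binary file-like object. Hypothetical helper,
    not part of cozip; it only uses the standard library.
    """
    with zipfile.ZipFile(archive) as zf:
        return [(info.filename, info.file_size) for info in zf.infolist()]


# Usage (placeholder file name):
# for name, size in list_entries("dataset.zip"):
#     print(name, size)
```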

Install

pip install cozip

Usage

Two functions: write and read.

Write

import cozip
import polars as pl

df = pl.DataFrame({
    "path":  ["local/tile_001.tif", "local/tile_002.tif", "local/tile_003.tif"],
    "name":  ["tile_001.tif", "tile_002.tif", "tile_003.tif"],
    "split": ["train", "val", "train"],
    "label": ["cloud", "water", "forest"],
})

cozip.write("dataset.zip", df)

Two reserved columns. path is where the file lives on disk — it's consumed at write time and dropped. name is how the entry is stored inside the archive and becomes part of __metadata__. Every other column rides along and becomes queryable on read.
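
For a larger dataset, the two reserved columns can be derived from a directory listing instead of typed by hand. The sketch below uses only the standard library. manifest_columns is a hypothetical helper, and using the path relative to the root folder as name is an assumption about how you want entries laid out, not something cozip requires.

```python
from pathlib import Path


def manifest_columns(root, pattern="*.tif"):
    """Build the two reserved columns for every matching file under `root`.

    Hypothetical helper: `path` is the on-disk location, `name` is the
    entry name to store in the archive (here, the path relative to root).
    """
    root = Path(root)
    files = sorted(p for p in root.rglob(pattern) if p.is_file())
    return {
        "path": [str(p) for p in files],
        "name": [p.relative_to(root).as_posix() for p in files],
    }


# Usage, following the Write example:
# columns = manifest_columns("local")
# columns["split"] = ["train"] * len(columns["path"])  # any extra column
# cozip.write("dataset.zip", pl.DataFrame(columns))
```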

Read

df = cozip.read("dataset.zip")

Local file or remote URL — same call. You get a DataFrame back with one row per entry, including offset and size resolved against the archive.

import cozip
import polars as pl

df = cozip.read("https://example.com/dataset.zip")

# query the manifest like any DataFrame
batch = df.filter(pl.col("split") == "train").sample(32)

# batch.select(["name", "offset", "size"]) is everything you need
# to range-request the payloads
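
One way to turn a manifest row into a range request, using only the standard library. This is a sketch under stated assumptions: offset is taken to be the first payload byte and size the number of stored bytes (check SPEC.md for the exact meaning), the server is assumed to honor Range headers, and range_header and fetch_entry are hypothetical helpers, not part of cozip. If an entry is stored compressed, the returned bytes would still need decompressing.

```python
import urllib.request


def range_header(offset, size):
    """HTTP Range header value for `size` bytes starting at `offset`.

    Byte ranges are inclusive on both ends, hence the -1.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return f"bytes={offset}-{offset + size - 1}"


def fetch_entry(url, offset, size):
    """Fetch one entry's stored bytes with a single ranged GET.

    Assumes `offset` is the first payload byte and `size` the stored byte
    count; raises if the server does not answer with 206 Partial Content.
    """
    request = urllib.request.Request(url, headers={"Range": range_header(offset, size)})
    with urllib.request.urlopen(request) as response:
        if response.status != 206:
            raise RuntimeError(f"server ignored Range header (status {response.status})")
        return response.read()


# Usage with the `batch` DataFrame from the Read example:
# for row in batch.select(["name", "offset", "size"]).iter_rows(named=True):
#     payload = fetch_entry("https://example.com/dataset.zip", row["offset"], row["size"])
```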

Bindings

Language    Install
Python      pip install cozip
R           install.packages("cozip", repos = "https://asterisk-labs.r-universe.dev")
Julia       Pkg.add("Cozip")
JavaScript  npm install cozip
WASM        browser bundle, no Node required
DuckDB      INSTALL cozip FROM community; LOAD cozip;
C           vendored single-header cozip.h

All bindings call into the same C11 core. Byte-exact behavior across runtimes.

Specification

The on-disk format is defined in SPEC.md. Any conforming implementation reads any cozip ever written.

License

MIT


Developed with ❤️ by Asterisk Labs

Download files

Source Distribution

cozip-2026.5.10.tar.gz (15.3 kB)

Uploaded Source

Built Distributions

cozip-2026.5.10-py3-none-win_amd64.whl (98.1 kB)

Uploaded Python 3, Windows x86-64

cozip-2026.5.10-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (123.3 kB)

Uploaded Python 3, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.28+ x86-64

cozip-2026.5.10-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl (123.8 kB)

Uploaded Python 3, manylinux: glibc 2.17+ ARM64, manylinux: glibc 2.28+ ARM64

cozip-2026.5.10-py3-none-macosx_11_0_universal2.whl (182.6 kB)

Uploaded Python 3, macOS 11.0+ universal2 (ARM64, x86-64)

File details

Details for the file cozip-2026.5.10.tar.gz.

File metadata

  • Download URL: cozip-2026.5.10.tar.gz
  • Upload date:
  • Size: 15.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cozip-2026.5.10.tar.gz
Algorithm Hash digest
SHA256 95748f37b37989100327909cd5fc0eb27b618129b21bcb84541d37ed01b0b770
MD5 76e7f9a1e31d7831e4dd295e81312563
BLAKE2b-256 19e9f0ea5eb3e6802fefa6ddd7905064d2527f6545cf510a031323c99048d5aa
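
A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch: sha256_of is a hypothetical helper name, and the file is assumed to sit in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns empty bytes at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage, with the digest published for the source distribution:
# expected = "95748f37b37989100327909cd5fc0eb27b618129b21bcb84541d37ed01b0b770"
# assert sha256_of("cozip-2026.5.10.tar.gz") == expected
```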

See more details on using hashes here.

Provenance

The following attestation bundles were made for cozip-2026.5.10.tar.gz:

Publisher: release.yml on asterisk-labs/taco

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file cozip-2026.5.10-py3-none-win_amd64.whl.

File metadata

  • Download URL: cozip-2026.5.10-py3-none-win_amd64.whl
  • Upload date:
  • Size: 98.1 kB
  • Tags: Python 3, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cozip-2026.5.10-py3-none-win_amd64.whl
Algorithm Hash digest
SHA256 18b60c9a2d5300ba4f7c5e48f4d7c122446544cebc944e9483ea2c3f9c48ff7a
MD5 11c9369b5730109841e352e1ceb437ad
BLAKE2b-256 38d8b173c94b23998c7b56d8c4533e7f45f987b6a6064ab460e4545ac2e4a0dc

Provenance

The following attestation bundles were made for cozip-2026.5.10-py3-none-win_amd64.whl:

Publisher: release.yml on asterisk-labs/taco

File details

Details for the file cozip-2026.5.10-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for cozip-2026.5.10-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 d97eb067c12c2f5918b5461b6ab04bb50772ceb5d056cf6e03f52e471d5c4af2
MD5 3c028f994b353d5d3ed62f92ad8e144c
BLAKE2b-256 795bc5101dec528c157a5f3a02c2a5964dcc0b51d3ee1c578063bc24aef02f1a

Provenance

The following attestation bundles were made for cozip-2026.5.10-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl:

Publisher: release.yml on asterisk-labs/taco

File details

Details for the file cozip-2026.5.10-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl.

File hashes

Hashes for cozip-2026.5.10-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 f053442e351ec86342ef623334f3647eccfc6782d075cfe8c137d66a824ff0e2
MD5 c6cf145f9e26793d8c35a8c9c83282e7
BLAKE2b-256 6b38a71cf4a7aee8dc406d4579d7065e15d70d92946bf68f46b695a33a737807

Provenance

The following attestation bundles were made for cozip-2026.5.10-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl:

Publisher: release.yml on asterisk-labs/taco

File details

Details for the file cozip-2026.5.10-py3-none-macosx_11_0_universal2.whl.

File hashes

Hashes for cozip-2026.5.10-py3-none-macosx_11_0_universal2.whl
Algorithm Hash digest
SHA256 3b8f996e3bee95a08d84cd6232d877da3ed360d477b6606e55021df4379c8d41
MD5 22134c60e00bd89c6452c68d19fbae8a
BLAKE2b-256 aee8fb4866c6ef4b7929a7ccac708fac9753239ecee2af5ab714bd4519d1c9b8

Provenance

The following attestation bundles were made for cozip-2026.5.10-py3-none-macosx_11_0_universal2.whl:

Publisher: release.yml on asterisk-labs/taco
