Python bindings to libcozip, the reference writer for the Cloud-Optimized ZIP (cozip) format.

Project description

cozip — Cloud Optimized ZIP


What is cozip?

A ZIP file you can open like a table — over the network, without downloading it.

cozip puts a Parquet manifest called __metadata__ at byte 0 — one row per entry with name, offset, size, plus any columns you add (split, label, class...). DuckDB, Arrow, and Polars query it directly. Range requests fetch only the bytes you actually need.

A 20 GB archive becomes a queryable table.

It's still a ZIP. unzip, zipfile.ZipFile, your OS's preview window — all unchanged.
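
As a rough illustration of the "still a ZIP" point, the sketch below uses only the standard-library zipfile module. It does not call cozip: it builds a stand-in archive in memory whose first entry is named __metadata__ (with placeholder bytes, not a real Parquet manifest) and then lists it like any other ZIP. Storing entries uncompressed (ZIP_STORED) is an assumption made for the sketch, not something stated above.

```python
import io
import zipfile

# Build a tiny stand-in archive in memory. The first entry is named
# __metadata__; its bytes are a placeholder, NOT a real Parquet manifest.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
    archive.writestr("__metadata__", b"placeholder, not real Parquet")
    archive.writestr("tile_001.tif", b"fake tile bytes")

# Any ordinary ZIP reader sees plain entries, manifest included.
with zipfile.ZipFile(buffer) as archive:
    entry_names = archive.namelist()
    first_offset = archive.infolist()[0].header_offset  # local header of entry 0
    tile_bytes = archive.read("tile_001.tif")

print(entry_names, first_offset)
```

The manifest is simply the first entry, so tools that know nothing about cozip can still list and extract everything.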

Install

pip install cozip

Usage

Two functions: write and read.

Write

import cozip
import polars as pl

df = pl.DataFrame({
    "path":  ["local/tile_001.tif", "local/tile_002.tif", "local/tile_003.tif"],
    "name":  ["tile_001.tif", "tile_002.tif", "tile_003.tif"],
    "split": ["train", "val", "train"],
    "label": ["cloud", "water", "forest"],
})

cozip.write("dataset.zip", df)

Two columns are reserved: path and name. path is where the file lives on disk; it is consumed at write time and dropped from the manifest. name is how the entry is stored inside the archive, and it becomes a column of __metadata__. Every other column rides along and becomes queryable on read.
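
For larger datasets the path and name columns are usually generated rather than typed by hand. The sketch below is one way to do that with pathlib. The helper name build_manifest_columns and the <root>/<split>/<file> directory layout are assumptions made for illustration, and it assumes file names are unique across splits.

```python
from pathlib import Path


def build_manifest_columns(root):
    """Collect files under root into path / name / split columns.

    Hypothetical helper: assumes a <root>/<split>/<file> layout.
    """
    columns = {"path": [], "name": [], "split": []}
    for file_path in sorted(Path(root).glob("*/*")):
        if not file_path.is_file():
            continue
        columns["path"].append(str(file_path))          # location on disk
        columns["name"].append(file_path.name)          # entry name in the archive
        columns["split"].append(file_path.parent.name)  # extra queryable column
    return columns


# Usage, following the write example:
#   df = pl.DataFrame(build_manifest_columns("local"))
#   cozip.write("dataset.zip", df)
```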

Read

df = cozip.read("dataset.zip")

Local file or remote URL — same call. You get a DataFrame back with one row per entry, including offset and size resolved against the archive.

import cozip
import polars as pl

df = cozip.read("https://example.com/dataset.zip")

# query the manifest like any DataFrame
# (sample(32) raises if fewer than 32 rows match the filter)
batch = df.filter(pl.col("split") == "train").sample(32)

# batch.select(["name", "offset", "size"]) is everything you need
# to range-request the payloads
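
Turning offset and size into HTTP range requests might look like the sketch below. It assumes that offset points at the first byte of an entry's payload and that size is the number of stored bytes to fetch, which is not spelled out here; check SPEC.md before relying on it. The helper names and the sample rows are made up for illustration, and the requests are only prepared, not sent.

```python
import urllib.request


def range_header(offset, size):
    """Return the Range header value covering `size` bytes from `offset`.

    HTTP byte ranges are inclusive on both ends, hence the -1.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return f"bytes={offset}-{offset + size - 1}"


def build_range_request(url, offset, size):
    """Prepare (but do not send) a GET request for one entry's bytes."""
    return urllib.request.Request(url, headers={"Range": range_header(offset, size)})


# Rows as they might come out of batch.select(["name", "offset", "size"]).
# The numbers are invented for this sketch.
rows = [
    {"name": "tile_001.tif", "offset": 4096, "size": 1000},
    {"name": "tile_003.tif", "offset": 9216, "size": 2048},
]
prepared = {
    row["name"]: build_range_request(
        "https://example.com/dataset.zip", row["offset"], row["size"]
    )
    for row in rows
}
```

Passing a prepared request to urllib.request.urlopen would fetch only that slice, provided the server honors range requests.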

Bindings

Language     Install
Python       pip install cozip
R            install.packages("cozip", repos = "https://asterisk-labs.r-universe.dev")
Julia        Pkg.add("Cozip")
JavaScript   npm install cozip
WASM         browser bundle, no Node required
DuckDB       INSTALL cozip FROM community; LOAD cozip;
C            vendored single-header cozip.h

All bindings call into the same C11 core. Byte-exact behavior across runtimes.

Specification

The on-disk format is defined in SPEC.md. Any conforming implementation reads any cozip ever written.

License

MIT


Developed with ❤️ by

Asterisk Labs

Download files

Source Distribution

cozip-2026.5.11.tar.gz (15.3 kB)

Tags: Source

Built Distributions

cozip-2026.5.11-py3-none-win_amd64.whl (98.1 kB)

Tags: Python 3, Windows x86-64

cozip-2026.5.11-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (123.3 kB)

Tags: Python 3, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.28+ x86-64

cozip-2026.5.11-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl (123.8 kB)

Tags: Python 3, manylinux: glibc 2.17+ ARM64, manylinux: glibc 2.28+ ARM64

cozip-2026.5.11-py3-none-macosx_11_0_universal2.whl (182.6 kB)

Tags: Python 3, macOS 11.0+ universal2 (ARM64, x86-64)

File details

Details for the file cozip-2026.5.11.tar.gz.

File metadata

  • Download URL: cozip-2026.5.11.tar.gz
  • Upload date:
  • Size: 15.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cozip-2026.5.11.tar.gz
Algorithm     Hash digest
SHA256        c560c5f9346e9fcd93dec4c0013fb6df6e212ad5d46bde7f53993e0a69c6a9bd
MD5           b4ab57c8dc49bdc5487086a17648f2a0
BLAKE2b-256   b2e488f72992afa780ad2a0bd827e1ab59dd26fe7e23019c45c6c5159e772147
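
A downloaded file can be checked against the published SHA256 digest with the standard-library hashlib module. This is a minimal sketch: the function names are illustrative, and the local path you pass in is whatever location you saved the file to.

```python
import hashlib

# SHA256 digest published for cozip-2026.5.11.tar.gz
EXPECTED_SHA256 = "c560c5f9346e9fcd93dec4c0013fb6df6e212ad5d46bde7f53993e0a69c6a9bd"


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files are never loaded fully into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA-256 equals the expected hex digest."""
    return sha256_of(path) == expected
```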

Provenance

The following attestation bundles were made for cozip-2026.5.11.tar.gz:

Publisher: release.yml on asterisk-labs/taco

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file cozip-2026.5.11-py3-none-win_amd64.whl.

File metadata

  • Download URL: cozip-2026.5.11-py3-none-win_amd64.whl
  • Upload date:
  • Size: 98.1 kB
  • Tags: Python 3, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cozip-2026.5.11-py3-none-win_amd64.whl
Algorithm     Hash digest
SHA256        21ca6be55acf78a8a6001afe292cd7c4f6f97b31b6c004d0d8e4c6af0b19e4a2
MD5           0bd5cb155f0c74e0a487678ca7233701
BLAKE2b-256   f52533f737afd649454c527a1db1123d75a79c0679034e7eeebffddab4695240

Provenance

The following attestation bundles were made for cozip-2026.5.11-py3-none-win_amd64.whl:

Publisher: release.yml on asterisk-labs/taco

File details

Details for the file cozip-2026.5.11-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for cozip-2026.5.11-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl
Algorithm     Hash digest
SHA256        31d7ced71d12dd53a81be9a4fbfe6379437c9d6c5918a56eb40568c0653abfa3
MD5           ed6b35ccb3bc7473202a2dac7e5721b5
BLAKE2b-256   8fe40c431decbf67cb19d843180b0fe5a34eb1c34920363dd4ef6a859b207fba

Provenance

The following attestation bundles were made for cozip-2026.5.11-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl:

Publisher: release.yml on asterisk-labs/taco

File details

Details for the file cozip-2026.5.11-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl.

File hashes

Hashes for cozip-2026.5.11-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl
Algorithm     Hash digest
SHA256        9ad29eb288b1bb15822a5a6721fbb2d385e893f91d3d8bc3f5b48e151fe841e3
MD5           4760fc0bde31861fb3fe1d7a039052ee
BLAKE2b-256   77fe3d86c23650b01ec20056a144ed435d7fa0632638f250e7c2648d7213b0bb

Provenance

The following attestation bundles were made for cozip-2026.5.11-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl:

Publisher: release.yml on asterisk-labs/taco

File details

Details for the file cozip-2026.5.11-py3-none-macosx_11_0_universal2.whl.

File hashes

Hashes for cozip-2026.5.11-py3-none-macosx_11_0_universal2.whl
Algorithm     Hash digest
SHA256        69525e542894e314f0ffaa70a47d0a778ec3072dcdef57c3884a9c0fbd87f432
MD5           7f9c82839ec0475115707d9ece6a58e3
BLAKE2b-256   7f4a05cc94c684ca0b29efca69a7c481991da1eaf603eaf2a0d8ea93a04efb3e

Provenance

The following attestation bundles were made for cozip-2026.5.11-py3-none-macosx_11_0_universal2.whl:

Publisher: release.yml on asterisk-labs/taco
