cozip — Cloud Optimized ZIP

What is cozip?

A ZIP file you can open like a table — over the network, without downloading it.

cozip puts a Parquet manifest called __metadata__ at byte 0 — one row per entry with name, offset, size, plus any columns you add (split, label, class...). DuckDB, Arrow, and Polars query it directly. Range requests fetch only the bytes you actually need.

A 20 GB archive becomes a queryable table.

It's still a ZIP. unzip, zipfile.ZipFile, your OS's preview window — all unchanged.
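
To make the (offset, size) idea concrete, here is a minimal sketch that uses only the Python standard library, not cozip itself. It builds a tiny uncompressed ZIP in memory, derives each entry's byte span from its local file header, and slices one payload straight out of the raw bytes. The `__metadata__` entry here holds placeholder bytes rather than real Parquet, the helper names are hypothetical, and whether cozip's own `offset` column points at the payload start in exactly this way is an assumption; SPEC.md is the authority.

```python
import io
import struct
import zipfile


def build_demo_zip() -> bytes:
    """Build a small uncompressed ZIP in memory (a stand-in, not a real cozip archive)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        # Placeholder bytes only; a real cozip manifest would be a Parquet file.
        zf.writestr("__metadata__", b"placeholder-manifest")
        zf.writestr("tile_001.tif", b"fake tile payload 1")
        zf.writestr("tile_002.tif", b"fake tile payload 2")
    return buf.getvalue()


def payload_span(raw: bytes, info: zipfile.ZipInfo) -> tuple[int, int]:
    """Return (offset, size) of an entry's stored bytes inside the archive."""
    # A local file header is 30 fixed bytes followed by the file name and an
    # extra field. Their lengths are little-endian uint16 values at bytes 26 and 28.
    name_len, extra_len = struct.unpack_from("<HH", raw, info.header_offset + 26)
    return info.header_offset + 30 + name_len + extra_len, info.compress_size


raw = build_demo_zip()
with zipfile.ZipFile(io.BytesIO(raw)) as zf:
    names = zf.namelist()
    spans = {info.filename: payload_span(raw, info) for info in zf.infolist()}

# With entries stored uncompressed, a plain byte slice is the file content.
offset, size = spans["tile_002.tif"]
tile_bytes = raw[offset : offset + size]
```

Over HTTP, the same slice would become a single range request for `size` bytes starting at `offset`, which is why a manifest of names, offsets, and sizes is enough to fetch individual entries.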

Install

pip install cozip

Usage

Two functions: write and read.

Write

import cozip
import polars as pl

df = pl.DataFrame({
    "path":  ["local/tile_001.tif", "local/tile_002.tif", "local/tile_003.tif"],
    "name":  ["tile_001.tif", "tile_002.tif", "tile_003.tif"],
    "split": ["train", "val", "train"],
    "label": ["cloud", "water", "forest"],
})

cozip.write("dataset.zip", df)

Two reserved columns. path is where the file lives on disk — it's consumed at write time and dropped. name is how the entry is stored inside the archive and becomes part of __metadata__. Every other column rides along and becomes queryable on read.
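
For larger datasets you will probably not type the reserved columns by hand. The sketch below shows one way to collect them from a directory. `manifest_columns` is a hypothetical helper, not part of cozip, and it uses the bare file name as the entry name to mirror the example above.

```python
from pathlib import Path


def manifest_columns(root: str, pattern: str = "*.tif") -> dict[str, list[str]]:
    """Collect the two reserved columns for every file in `root` matching `pattern`."""
    files = sorted(Path(root).glob(pattern))
    return {
        # Where the file lives on disk; consumed at write time and dropped.
        "path": [str(f) for f in files],
        # How the entry is stored inside the archive.
        "name": [f.name for f in files],
    }
```

The result can then be extended with your own columns and handed to the writer, for example `columns["split"] = [...]` followed by `cozip.write("dataset.zip", pl.DataFrame(columns))`.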

Read

df = cozip.read("dataset.zip")

Local file or remote URL — same call. You get a DataFrame back with one row per entry, including offset and size resolved against the archive.

import cozip
import polars as pl

df = cozip.read("https://example.com/dataset.zip")

# query the manifest like any DataFrame
# (sample(32) needs at least 32 matching rows; pass with_replacement=True otherwise)
batch = df.filter(pl.col("split") == "train").sample(32)

# batch.select(["name", "offset", "size"]) is everything you need
# to range-request the payloads
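
The range request itself can be sketched with the standard library. This is an illustration rather than cozip's own fetch path: `range_header` and `fetch_entry` are hypothetical helpers, and the sketch assumes `offset` points at the first payload byte, `size` is the number of stored bytes, and entries are stored uncompressed so the returned bytes are the file itself.

```python
import urllib.request


def range_header(offset: int, size: int) -> str:
    """HTTP Range value covering `size` bytes starting at `offset` (the end is inclusive)."""
    if offset < 0 or size <= 0:
        raise ValueError("offset must be >= 0 and size must be > 0")
    return f"bytes={offset}-{offset + size - 1}"


def fetch_entry(url: str, offset: int, size: int, timeout: float = 30.0) -> bytes:
    """Fetch one entry's bytes with a single range request."""
    req = urllib.request.Request(url, headers={"Range": range_header(offset, size)})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # 206 Partial Content means the server honored the range.
        # 200 would mean it ignored the header and is sending the whole file.
        if resp.status != 206:
            raise RuntimeError(f"server did not honor Range (status {resp.status})")
        return resp.read()
```

With a Polars manifest, one possible loop is `for row in batch.select(["name", "offset", "size"]).iter_rows(named=True): data = fetch_entry(url, row["offset"], row["size"])`.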

Bindings

Language    Install
Python      pip install cozip
R           install.packages("cozip", repos = "https://asterisk-labs.r-universe.dev")
Julia       Pkg.add("Cozip")
JavaScript  npm install cozip
WASM        browser bundle, no Node required
DuckDB      INSTALL cozip FROM community; LOAD cozip;
C           vendored single-header cozip.h

All bindings call into the same C11 core. Byte-exact behavior across runtimes.

Specification

The on-disk format is defined in SPEC.md. Any conforming implementation reads any cozip ever written.

License

MIT


Developed with ❤️ by

Asterisk Labs
