COZIP: Cloud-Optimized ZIP
Open a ZIP like a table.
What is cozip?
A ZIP file you can open like a table — over the network, without downloading it.
cozip puts a Parquet manifest called __metadata__ at byte 0 — one row per entry with name, offset, size, plus any columns you add (split, label, class...). DuckDB, Arrow, and Polars query it directly. Range requests fetch only the bytes you actually need.
A 20 GB archive becomes a queryable table.
It's still a ZIP: unzip, zipfile.ZipFile, and your OS's preview window all work unchanged.
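To make the idea of a (name, offset, size) row concrete, here is a minimal standard-library sketch that derives such rows from an ordinary ZIP and then reads one payload by seeking straight to its bytes. It is not cozip's implementation: the helpers `payload_offset` and `manifest_rows` are hypothetical, the entries are stored uncompressed, and `offset` is assumed to mean the first byte of the entry's data (SPEC.md is the authority on what cozip actually records).

```python
import io
import struct
import zipfile

LOCAL_HEADER_SIZE = 30  # fixed-length part of a ZIP local file header


def payload_offset(fileobj, info):
    """Absolute position of an entry's data, found by skipping its local header."""
    # Bytes 26..29 of the local header hold the name length and extra-field length.
    fileobj.seek(info.header_offset + 26)
    name_len, extra_len = struct.unpack("<HH", fileobj.read(4))
    return info.header_offset + LOCAL_HEADER_SIZE + name_len + extra_len


def manifest_rows(fileobj):
    """Hypothetical helper: one {name, offset, size} dict per archive entry."""
    with zipfile.ZipFile(fileobj) as zf:
        infos = zf.infolist()
    return [
        {
            "name": info.filename,
            "offset": payload_offset(fileobj, info),
            "size": info.compress_size,  # bytes actually stored in the archive
        }
        for info in infos
    ]


# Build a small in-memory archive with uncompressed (stored) entries.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("tile_001.tif", b"A" * 10)
    zf.writestr("tile_002.tif", b"B" * 20)

rows = manifest_rows(buf)

# Read the second entry without touching the first: seek, then read `size` bytes.
buf.seek(rows[1]["offset"])
second_payload = buf.read(rows[1]["size"])
print(rows[1]["name"], len(second_payload))
```

Over the network, the seek-and-read step would become an HTTP range request for the same byte span.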
Install
pip install cozip
Usage
Two functions: write and read.
Write
import cozip
import polars as pl
df = pl.DataFrame({
    "path": ["local/tile_001.tif", "local/tile_002.tif", "local/tile_003.tif"],
    "name": ["tile_001.tif", "tile_002.tif", "tile_003.tif"],
    "split": ["train", "val", "train"],
    "label": ["cloud", "water", "forest"],
})
cozip.write("dataset.zip", df)
Two reserved columns. path is where the file lives on disk — it's consumed at write time and dropped. name is how the entry is stored inside the archive and becomes part of __metadata__. Every other column rides along and becomes queryable on read.
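The column rules above can be illustrated with plain dictionaries. This is only a sketch of the described behavior, not cozip's code: `split_rows` is a hypothetical helper, and the offset and size columns that the writer adds are left out.

```python
RESERVED = ("path", "name")  # the two reserved columns described above


def split_rows(rows):
    """Separate what is read from disk (path) from what ends up in the manifest."""
    sources, manifest = [], []
    for row in rows:
        missing = [col for col in RESERVED if col not in row]
        if missing:
            raise KeyError(f"missing reserved column(s): {missing}")
        # path tells the writer where to find the bytes; name is the entry name.
        sources.append((row["path"], row["name"]))
        # Everything except path rides along into the manifest.
        manifest.append({k: v for k, v in row.items() if k != "path"})
    return sources, manifest


rows = [
    {"path": "local/tile_001.tif", "name": "tile_001.tif", "split": "train", "label": "cloud"},
    {"path": "local/tile_002.tif", "name": "tile_002.tif", "split": "val", "label": "water"},
]
sources, manifest = split_rows(rows)
print(manifest[0])
```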
Read
df = cozip.read("dataset.zip")
Local file or remote URL — same call. You get a DataFrame back with one row per entry, including offset and size resolved against the archive.
import cozip
import polars as pl
df = cozip.read("https://example.com/dataset.zip")
# Query the manifest like any DataFrame: draw 32 random training entries
# (sample(32) needs at least 32 matching rows).
batch = df.filter(pl.col("split") == "train").sample(32)
# batch.select(["name", "offset", "size"]) is everything you need
# to range-request the payloads.
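As a rough sketch of what range-requesting a payload could look like with only the standard library, the helpers below turn an (offset, size) pair into an HTTP Range header. The function names and the example offsets are hypothetical, and the sketch assumes `offset` is the first byte of the entry's data and `size` is its stored length; check SPEC.md before relying on that.

```python
import urllib.request


def range_header(offset, size):
    """HTTP Range value covering `size` bytes starting at `offset` (end is inclusive)."""
    if size <= 0:
        raise ValueError("size must be positive")
    return f"bytes={offset}-{offset + size - 1}"


def payload_request(url, offset, size):
    """Build, but do not send, a GET request for one entry's bytes."""
    return urllib.request.Request(url, headers={"Range": range_header(offset, size)})


def fetch_payload(url, offset, size):
    """Send the request; insist on 206 so a full-file response is never mistaken for a slice."""
    with urllib.request.urlopen(payload_request(url, offset, size)) as resp:
        if resp.status != 206:
            raise RuntimeError(f"server ignored the Range header (status {resp.status})")
        return resp.read()


# Made-up rows, shaped like batch.select(["name", "offset", "size"]).to_dicts() in polars.
rows = [
    {"name": "tile_001.tif", "offset": 4096, "size": 1024},
    {"name": "tile_003.tif", "offset": 9216, "size": 2048},
]
requests = [
    payload_request("https://example.com/dataset.zip", r["offset"], r["size"])
    for r in rows
]
for row, req in zip(rows, requests):
    print(row["name"], req.get_header("Range"))
```

If entries are compressed inside the archive, the fetched bytes would still need to be decompressed before use.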
Bindings
| Language | Install |
|---|---|
| Python | pip install cozip |
| R | install.packages("cozip", repos = "https://asterisk-labs.r-universe.dev") |
| Julia | Pkg.add("Cozip") |
| JavaScript | npm install cozip |
| WASM | browser bundle, no Node required |
| DuckDB | INSTALL cozip FROM community; LOAD cozip; |
| C | vendored single-header cozip.h |
All bindings call into the same C11 core. Byte-exact behavior across runtimes.
Specification
The on-disk format is defined in SPEC.md. Any conforming implementation reads any cozip ever written.
License
MIT