Polars IO
Polars IO utility library
Helpers that make it easier to read and write Hive-partitioned Parquet datasets with Polars.
It is primarily a library for working with datasets, and it also includes a command-line interface for inspecting Parquet files and datasets.
Dataset
Example usage of polario.hive_dataset.HiveDataset:
    from polario.hive_dataset import HiveDataset
    import polars as pl

    df = pl.from_dicts(
        [
            {"p1": 1, "v": 1},
            {"p1": 2, "v": 1},
        ]
    )
    ds = HiveDataset("file:///tmp/", partition_columns=["p1"])
    ds.write(df)
    for partition_df in ds.read_partitions():
        print(partition_df)
To model data storage, we use three layers: dataset, partition, fragment.
- Each dataset is a lexically ordered set of partitions
- Each partition is a lexically ordered set of fragments
- Each fragment is a file on disk with rows in any order
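
The three layers above can be illustrated with a small standard-library sketch that walks a directory tree. This is not polario's implementation: the helper names (list_partitions, list_fragments, parse_partition_values), the key=value directory naming, and the .parquet file suffix are assumptions chosen for illustration, following the usual Hive partitioning convention.

```python
from pathlib import Path


def list_partitions(dataset_root: Path) -> list[Path]:
    """Return the directories that hold fragment files, in lexical order.

    Assumes fragments are files ending in .parquet (an illustrative choice).
    """
    partition_dirs = {p.parent for p in dataset_root.rglob("*.parquet")}
    return sorted(partition_dirs, key=lambda p: p.as_posix())


def list_fragments(partition_dir: Path) -> list[Path]:
    """Return the fragment files of one partition, in lexical order by name."""
    return sorted(partition_dir.glob("*.parquet"), key=lambda p: p.name)


def parse_partition_values(dataset_root: Path, partition_dir: Path) -> dict[str, str]:
    """Turn path components such as 'p1=1/p2=a' into {'p1': '1', 'p2': 'a'}.

    Assumes every directory level below the root is named key=value.
    """
    relative = partition_dir.relative_to(dataset_root)
    return dict(part.split("=", 1) for part in relative.parts)
```

Under these assumptions, a dataset rooted at /tmp/ with partition column p1 would contain directories such as p1=1 and p1=2, each holding one or more fragment files. Row order inside a fragment is not constrained, so only the partition and fragment listings are sorted.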