Polars IO
Polars IO utility library
Helpers that make it easier to read and write Hive-partitioned Parquet datasets with Polars.
It is primarily a library for working with datasets, but it also includes a command-line interface for inspecting Parquet files and datasets.
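For readers new to the term, Hive partitioning conventionally stores the rows of each partition under a directory named `column=value`. The sketch below illustrates only that naming convention, using the standard library. `partition_path` and `parse_partition_path` are hypothetical helpers written for this illustration; they are not part of polario's API, and they ignore the escaping of special characters that real implementations may apply.

```python
from pathlib import PurePosixPath


def partition_path(values: dict[str, object]) -> str:
    """Build a Hive-style relative path such as 'p1=1/p2=a'."""
    return "/".join(f"{column}={value}" for column, value in values.items())


def parse_partition_path(path: str) -> dict[str, str]:
    """Recover partition column values (as strings) from a Hive-style path."""
    result = {}
    for part in PurePosixPath(path).parts:
        # Only 'column=value' directory names carry partition information;
        # plain file names such as 'part-0.parquet' are skipped.
        if "=" in part:
            column, value = part.split("=", 1)
            result[column] = value
    return result


print(partition_path({"p1": 1}))
print(parse_partition_path("p1=1/part-0.parquet"))
```

Note that values parsed back from a path are strings; restoring the original column type is left to the reader of the dataset.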
Dataset
Example use of polario.dataset.HiveDataset:
from polario.dataset import HiveDataset
import polars as pl

df = pl.from_dicts(
    [
        {"p1": 1, "v": 1},
        {"p1": 2, "v": 1},
    ]
)

ds = HiveDataset("file:///tmp/", partition_columns=["p1"])
ds.write(df)

for partition_df in ds.read_partitions():
    print(partition_df)
To model data storage, we use three layers: dataset, partition, fragment.
- Each dataset is a lexically ordered set of partitions.
- Each partition is a lexically ordered set of fragments.
- Each fragment is a file on disk with rows in any order.
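The three layers can be pictured as a walk over a directory tree. The following is a minimal sketch, not polario's implementation: it assumes a single partition column, that partitions are `column=value` directories directly under the dataset root, and that fragments are `.parquet` files inside them. The function names and file names are hypothetical, and empty placeholder files stand in for real Parquet fragments.

```python
import tempfile
from pathlib import Path


def iter_fragments(dataset_root: Path):
    """Yield (partition name, fragment path) in lexical order at both levels."""
    partitions = sorted(p for p in dataset_root.iterdir() if p.is_dir())
    for partition in partitions:
        for fragment in sorted(partition.glob("*.parquet")):
            yield partition.name, fragment


def build_demo(root: Path) -> None:
    """Create a tiny dataset layout, deliberately out of order."""
    layout = [("p1=2", "b.parquet"), ("p1=1", "b.parquet"), ("p1=1", "a.parquet")]
    for partition, fragment in layout:
        (root / partition).mkdir(exist_ok=True)
        # Empty placeholder file standing in for a real parquet fragment
        (root / partition / fragment).touch()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_demo(root)
    listing = [(name, path.name) for name, path in iter_fragments(root)]

print(listing)
```

Because the ordering is lexical rather than numeric, a partition named `p1=10` sorts before `p1=2`; zero-padding partition values is one way to keep the two orders aligned.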
Download files
Source Distribution
Built Distribution
File details
Details for the file polario-0.3.1.tar.gz.
File metadata
- Download URL: polario-0.3.1.tar.gz
- Upload date:
- Size: 10.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | c0fd2b319a2cf6afe0002f591bb8cf978dcaf7aeba9eaa1dfa092b818e917d2b
MD5 | 60a41276b7f0379916afcec309aad197
BLAKE2b-256 | 3c4ed4528725b88d9b5599fd2e4342e30312c355605b73822cacba4ce367b9e2
File details
Details for the file polario-0.3.1-py3-none-any.whl.
File metadata
- Download URL: polario-0.3.1-py3-none-any.whl
- Upload date:
- Size: 11.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 81211e7fc6ff625e31534e4ae4078fee2bd15b6f44dfff2f00f772223b71c866
MD5 | 290090232097d0a0d6c901bbfbcd13e0
BLAKE2b-256 | dc07439369f7315f2c070fc3bd47bb41f265b5e9848e0f697bdfb00f4c40ef4d