
Polars IO

Polars IO utility library

Helpers that make it easier to read and write Hive-partitioned Parquet datasets with Polars.

It is primarily a library for working with datasets, but it also includes a command-line interface for inspecting Parquet files and datasets.
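For readers new to Hive partitioning: the convention encodes partition column values as `column=value` directory names in the file path. The following is a minimal sketch of that convention using only the standard library. It is not part of polario's API; the function name `parse_hive_partition` and the example path (including the fragment file name) are hypothetical and chosen for illustration.

```python
from pathlib import PurePosixPath


def parse_hive_partition(path: str) -> dict[str, str]:
    """Collect key=value directory segments from a Hive-style path."""
    values: dict[str, str] = {}
    # Only look at the directories, not the file name itself
    for segment in PurePosixPath(path).parent.parts:
        key, sep, value = segment.partition("=")
        if sep and key:
            values[key] = value
    return values


# Hypothetical fragment path; the file name is made up for illustration
example = "dataset/p1=1/p2=a/fragment-0.parquet"
print(parse_hive_partition(example))  # {'p1': '1', 'p2': 'a'}
```

Note that values recovered from a path are strings; casting them back to the column's type is left to the reader of the dataset.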

Dataset

Example use of polario.hive_dataset.HiveDataset:

import polars as pl

from polario.hive_dataset import HiveDataset

df = pl.from_dicts(
    [
        {"p1": 1, "v": 1},
        {"p1": 2, "v": 1},
    ]
)

ds = HiveDataset("file:///tmp/", partition_columns=["p1"])

ds.write(df)

for partition_df in ds.read_partitions():
    print(partition_df)

To model data storage, we use three layers: dataset, partition, and fragment.

  • Each dataset is a lexically ordered set of partitions
  • Each partition is a lexically ordered set of fragments
  • Each fragment is a file on disk with rows in any order
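The three layers above can be pictured as a directory tree: the dataset is the root, each partition is a subdirectory, and each fragment is a file inside it. The sketch below walks such a tree with the standard library and returns partitions and fragments in lexical order. It does not use polario itself; the helper `list_layers`, the `*.parquet` file pattern, and the file names are assumptions for illustration, and the files are empty placeholders rather than real Parquet data.

```python
import tempfile
from pathlib import Path


def list_layers(dataset_root: Path) -> dict[str, list[str]]:
    """Map each partition directory to its fragment files, both sorted lexically."""
    layers: dict[str, list[str]] = {}
    for partition in sorted(p for p in dataset_root.iterdir() if p.is_dir()):
        layers[partition.name] = sorted(f.name for f in partition.glob("*.parquet"))
    return layers


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical layout, created out of order on purpose
    for partition, fragment in [
        ("p1=2", "b.parquet"),
        ("p1=1", "b.parquet"),
        ("p1=1", "a.parquet"),
    ]:
        (root / partition).mkdir(exist_ok=True)
        (root / partition / fragment).touch()
    layout = list_layers(root)

print(layout)  # {'p1=1': ['a.parquet', 'b.parquet'], 'p1=2': ['b.parquet']}
```

Lexical order compares names as strings, so a partition named `p1=10` sorts before `p1=2`. Zero-padded values keep lexical and numeric order aligned when that matters.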

Download files

Source Distribution

polario-0.3.2.tar.gz (82.5 kB view details)

Uploaded Source

Built Distribution

polario-0.3.2-py3-none-any.whl (12.3 kB view details)

Uploaded Python 3

File details

Details for the file polario-0.3.2.tar.gz.

File metadata

  • Download URL: polario-0.3.2.tar.gz
  • Size: 82.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for polario-0.3.2.tar.gz
Algorithm Hash digest
SHA256 606c349a17c1b9b5270c0cec3ddd43811a5e9be48dd5c8d88c7c062b231278e3
MD5 fbb9c599f006945445af636e61bc20e2
BLAKE2b-256 e5b279707c642ae916638e3ada4f17a8333c3677a9ef4121d4bf176b63402f7e

Provenance

The following attestation bundles were made for polario-0.3.2.tar.gz:

Publisher: publish.yml on bneijt/polario

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file polario-0.3.2-py3-none-any.whl.

File metadata

  • Download URL: polario-0.3.2-py3-none-any.whl
  • Size: 12.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for polario-0.3.2-py3-none-any.whl
Algorithm Hash digest
SHA256 a7ae6cf6b2724214153171bfa5cf580bc369a022b44107b587d086117f0f5d32
MD5 d139123d3cea7d14c33c9533ffe8c913
BLAKE2b-256 02c189ffb8ed86305d4e7cbff40ce652f390c710129d3609a536d9ad9849f0ca

Provenance

The following attestation bundles were made for polario-0.3.2-py3-none-any.whl:

Publisher: publish.yml on bneijt/polario

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
