

plateau

flat files, flat land


plateau is a Python library to manage (create, read, update, delete) large amounts of tabular data in a blob store. It stores data as datasets, which it presents to the user as pandas DataFrames. A dataset is a collection of files with the same schema that reside in a blob store. plateau uses a metadata definition to handle these datasets efficiently. For distributed access to and manipulation of datasets, plateau offers a Dask interface.

Storing data distributed over multiple files in a blob store (S3, ABS, GCS, etc.) allows for a fast, cost-efficient and highly scalable data infrastructure. A downside of storing data solely in an object store is that the store itself gives little to no guarantees beyond the consistency of a single file. In particular, it cannot guarantee the consistency of a dataset that spans several files. If we demand a consistent state of our dataset at all times, we need to track which files currently belong to it. plateau frees us from having to do this manually.
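The tracking problem can be illustrated with a toy model. This is not plateau's implementation or API, only a minimal sketch under the assumption that the blob store guarantees nothing beyond atomic writes of single keys (modeled here as a plain dict). The names commit_dataset, read_dataset and the ".meta.json" key are hypothetical. Data files are written first, and a single metadata write then publishes them, so readers only ever see a complete version.

```python
import json


def commit_dataset(store, dataset_id, partitions):
    """Write all partition files, then publish them with ONE metadata write.

    `store` is any dict-like key/value blob store (hypothetical stand-in).
    """
    meta_key = f"{dataset_id}.meta.json"
    # Next version number: 1 for a new dataset, otherwise previous + 1.
    if meta_key in store:
        version = json.loads(store[meta_key])["version"] + 1
    else:
        version = 1

    keys = []
    for i, payload in enumerate(partitions):
        key = f"{dataset_id}/v{version}/part-{i}.bin"
        store[key] = payload  # not visible to readers yet
        keys.append(key)

    # The single-key write is the commit point: readers see either the
    # old file list or the new one, never a mixture.
    store[meta_key] = json.dumps({"version": version, "files": keys}).encode()
    return version


def read_dataset(store, dataset_id):
    """Read only the files listed in the committed metadata."""
    meta = json.loads(store[f"{dataset_id}.meta.json"])
    return [store[key] for key in meta["files"]]
```

A file that was uploaded but never committed (for example after a crashed job) is simply ignored by read_dataset, because it is not listed in the metadata.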

The plateau.io module provides building blocks to create and modify these datasets in data pipelines. plateau handles I/O, tracks dataset partitions and selects subsets of data transparently.
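To make "selects subsets of data" concrete, the following sketch shows one common way such a selection can work: partition values are encoded in the file keys, so a reader can decide which files to open without reading any of them. The "column=value" key layout and the function names partition_values and select_files are assumptions made for illustration, not plateau's documented layout or API.

```python
def partition_values(key):
    """Parse `column=value` path segments from a blob key into a dict."""
    return dict(seg.split("=", 1) for seg in key.split("/") if "=" in seg)


def select_files(keys, **wanted):
    """Return the keys whose partition values match every given filter."""
    return [
        key
        for key in keys
        if all(partition_values(key).get(col) == val for col, val in wanted.items())
    ]


# Hypothetical file keys of a dataset partitioned by country and year.
keys = [
    "sales/country=DE/year=2023/part-0.parquet",
    "sales/country=DE/year=2024/part-0.parquet",
    "sales/country=FR/year=2024/part-0.parquet",
]
german_files = select_files(keys, country="DE")
```

Only the two files under country=DE would need to be fetched from the store here; the French partition is skipped based on its key alone.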

Installation

This project is managed by pixi. You can install the package in development mode using:

git clone https://github.com/data-engineering-collective/plateau
cd plateau

pixi run pre-commit-install
pixi run postinstall
pixi run test

plateau is also available on PyPI and can be installed with pip:

pip install plateau

Contributing

Find details on how to contribute here.
