plateau

flat files, flat land

License: MIT

plateau is a Python library to manage (create, read, update, delete) large amounts of tabular data in a blob store. It stores data as datasets, which it presents to the user as pandas DataFrames. A dataset is a collection of files with the same schema that reside in a blob store. plateau uses a metadata definition to handle these datasets efficiently. For distributed access and manipulation of datasets, plateau offers a Dask interface.

Storing data distributed over multiple files in a blob store (S3, ABS, GCS, etc.) allows for a fast, cost-efficient and highly scalable data infrastructure. A downside of storing data solely in an object store is that the store itself gives little to no guarantees beyond the consistency of a single file. In particular, it cannot guarantee the consistency of your dataset as a whole. If we demand a consistent state of our dataset at all times, we need to track the state of the dataset. plateau frees us from having to do this manually.
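To make the consistency problem concrete, here is a minimal sketch of tracking dataset state by hand. It is not plateau's implementation or on-disk format: a local temporary directory stands in for the blob store, partitions are JSON files, and the names `write_partition`, `commit`, `read_dataset` and `dataset.json` are hypothetical. The idea it illustrates is that readers only trust a single metadata file, so a partition becomes visible only once that one file is replaced.

```python
import json
import os
import tempfile
from pathlib import Path


def write_partition(root: Path, name: str, rows: list[dict]) -> str:
    """Write one partition file and return its key relative to the dataset root."""
    key = f"partitions/{name}.json"
    path = root / key
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(rows))
    return key


def commit(root: Path, partition_keys: list[str]) -> None:
    """Publish a new dataset state by replacing the single metadata file."""
    # Write to a temporary file first, then swap it in with one rename,
    # so readers see either the old state or the new one.
    fd, tmp_name = tempfile.mkstemp(dir=root, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump({"partitions": sorted(partition_keys)}, handle)
    os.replace(tmp_name, root / "dataset.json")


def read_dataset(root: Path) -> list[dict]:
    """Read only the partitions listed in the committed metadata."""
    meta_path = root / "dataset.json"
    if not meta_path.exists():
        return []
    meta = json.loads(meta_path.read_text())
    rows = []
    for key in meta["partitions"]:
        rows.extend(json.loads((root / key).read_text()))
    return rows


def demo() -> tuple[list[dict], list[dict], list[dict]]:
    """Show that an uncommitted partition stays invisible to readers."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        first = write_partition(root, "part-0", [{"id": 1}])
        commit(root, [first])
        after_first_commit = read_dataset(root)

        second = write_partition(root, "part-1", [{"id": 2}])
        before_second_commit = read_dataset(root)  # part-1 is written but not committed
        commit(root, [first, second])
        after_second_commit = read_dataset(root)
    return after_first_commit, before_second_commit, after_second_commit
```

Calling `demo()` returns the rows a reader would see at three points in time. Doing this bookkeeping for every pipeline, with schemas and concurrent writers on top, is the manual work the paragraph above refers to.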

The plateau.io module provides building blocks to create and modify these datasets in data pipelines. plateau handles I/O, tracks dataset partitions and selects subsets of data transparently.

Installation

Installers for the latest released version are available on the Python Package Index (PyPI) and on conda-forge.

# Install with pip
pip install plateau
# Install with conda/mamba, optionally add conda-forge as a source
# conda config --add channels conda-forge
mamba install plateau
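To confirm that the installation landed in the environment you expect, one option is to ask the standard library for the installed distribution's version. The helper name `installed_version` is hypothetical; it relies only on `importlib.metadata` and assumes the distribution is registered under the name `plateau`, as in the commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("plateau")
    print(f"plateau {found}" if found else "plateau is not installed")
```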
