Pipeline to Aggregate Data for Optimised Cloud Capabilities

Project description

PADOCC Package

PADOCC (Pipeline to Aggregate Data for Optimised Cloud Capabilities) is a data aggregation pipeline that creates Kerchunk (or alternative) reference files to represent datasets stored in a variety of original formats. The pipeline currently supports writing JSON or Parquet Kerchunk files for NetCDF/HDF input files. Future development is planned to add support for GeoTIFF, GRIB and possibly Met Office (.pp) files, and to use the Pangeo Rechunker tool to create Zarr stores for datasets that are incompatible with Kerchunk.
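
For orientation, a Kerchunk reference file maps each chunk key either to inline data or to a [path, offset, length] entry pointing into the original file, so the original bytes never need to be copied. The sketch below illustrates that idea using only the Python standard library. It is not PADOCC's implementation: the helper names resolve_reference and demo, the key names and the toy binary file are all assumptions made for illustration.

```python
import base64
import os
import tempfile


def resolve_reference(refs, key):
    """Return the bytes that a Kerchunk-style reference entry points to."""
    entry = refs["refs"][key]
    if isinstance(entry, str):
        # Inline data is stored directly in the reference file,
        # optionally base64-encoded with a "base64:" prefix.
        if entry.startswith("base64:"):
            return base64.b64decode(entry[len("base64:"):])
        return entry.encode("utf-8")
    if len(entry) == 1:
        # A one-element entry refers to the whole file.
        with open(entry[0], "rb") as source:
            return source.read()
    # Otherwise the entry is [path, offset, length]: a byte range in the file.
    path, offset, length = entry
    with open(path, "rb") as source:
        source.seek(offset)
        return source.read(length)


def demo():
    """Build a toy file plus references to it, then resolve two chunks."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy.bin")  # hypothetical stand-in for a NetCDF file
        with open(path, "wb") as handle:
            handle.write(b"HEADER" + b"chunk-0" + b"chunk-1")
        refs = {
            "version": 1,
            "refs": {
                ".zgroup": '{"zarr_format": 2}',  # inline metadata
                "temp/0": [path, 6, 7],           # bytes 6..12 of the file
                "temp/1": [path, 13, 7],          # bytes 13..19 of the file
            },
        }
        return [resolve_reference(refs, key) for key in ("temp/0", "temp/1")]


if __name__ == "__main__":
    print(demo())
```

In practice a reference file of this kind would be opened through a library such as fsspec rather than resolved by hand; the point here is only that the reference file stores locations, not data.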

Example Notebooks at this link

Documentation hosted at this link

Kerchunk Pipeline

Pre-release b

Release date: 20th January 2025

This pre-release contains updated source code and source-code documentation, but some of the main hand-written descriptions (those not generated from the source) may be out of date. Please refer to the release notes for details of what has changed.

This package acknowledges contributions by Matt Brown as a pre-release tester.

Installation

To install this package, clone the repository using git clone. If release v1.3 has not yet been published, switch to the MigrationOO branch with git checkout MigrationOO.

Then follow the steps below to install the package with the necessary dependencies.

python -m venv .venv
source .venv/bin/activate
pip install poetry
poetry install
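
After the steps above, one way to confirm that the package is visible in the active environment is to query its installed version. This is a minimal sketch using the standard-library importlib.metadata module; the helper name installed_version is an assumption for illustration, and the distribution name "padocc" is taken from the file names on this page.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("padocc")
    if found is None:
        print("padocc is not installed in this environment")
    else:
        print(f"padocc {found} is installed")
```

Run it with the virtual environment activated, otherwise the check inspects a different Python installation.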

Usage

Please refer to the tests/ scripts for how to use the GroupOperation and ProjectOperation classes.

Download files

Source Distribution

padocc-1.3.1.tar.gz (9.9 MB)

Uploaded Source

Built Distribution

padocc-1.3.1-py3-none-any.whl (10.0 MB)

Uploaded Python 3

File details

Details for the file padocc-1.3.1.tar.gz.

File metadata

  • Download URL: padocc-1.3.1.tar.gz
  • Upload date:
  • Size: 9.9 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.5 CPython/3.11.9 Linux/5.14.0-503.21.1.el9_5.x86_64

File hashes

Hashes for padocc-1.3.1.tar.gz
Algorithm Hash digest
SHA256 43d1ba9eb6a93af672aa45aa16ca0187feaae8d9d74a7100b9f6b16b4bdc0cf3
MD5 e2c9ce8eb8d51060fee651435f717e5f
BLAKE2b-256 5a7eb4dcdce91134c74472417bc3803eaac4805bb2545c5c1e72ed63510abc71

See more details on using hashes here.

File details

Details for the file padocc-1.3.1-py3-none-any.whl.

File metadata

  • Download URL: padocc-1.3.1-py3-none-any.whl
  • Upload date:
  • Size: 10.0 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.5 CPython/3.11.9 Linux/5.14.0-503.21.1.el9_5.x86_64

File hashes

Hashes for padocc-1.3.1-py3-none-any.whl
Algorithm Hash digest
SHA256 78b227dd829b6f9c482c4b2dc44e24fc6123a6fbb4c7fb61186b2ca93230df6b
MD5 5d6fb5ea213e06fd457a0adc1850344a
BLAKE2b-256 fb64d11efd43f24d4533c7f31d197f2fcbafb07bd449e5d00c7780e3b4ce2c42
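
The SHA256 digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using the standard-library hashlib module; the helper names sha256_of_file and verify_download are assumptions for illustration, and the expected digests are copied from the hash listings on this page.

```python
import hashlib
import os

# SHA256 digests copied from the file hash listings above.
EXPECTED_SHA256 = {
    "padocc-1.3.1.tar.gz":
        "43d1ba9eb6a93af672aa45aa16ca0187feaae8d9d74a7100b9f6b16b4bdc0cf3",
    "padocc-1.3.1-py3-none-any.whl":
        "78b227dd829b6f9c482c4b2dc44e24fc6123a6fbb4c7fb61186b2ca93230df6b",
}


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file's digest matches the expected one for its name."""
    expected = EXPECTED_SHA256.get(os.path.basename(path))
    if expected is None:
        raise KeyError(f"no expected digest recorded for {path!r}")
    return sha256_of_file(path) == expected
```

For example, verify_download("padocc-1.3.1.tar.gz") returns False if the local file differs from the published one in any byte. pip can also enforce hashes itself through its hash-checking mode (the --require-hashes option with a requirements file).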

