Pipeline to Aggregate Data for Optimised Cloud Capabilities

Project description

PADOCC Package

PADOCC (Pipeline to Aggregate Data for Optimised Cloud Capabilities) is a data aggregation pipeline for creating Kerchunk (or alternative) files that represent datasets stored in various original formats. The pipeline currently supports writing JSON/Parquet Kerchunk files for input NetCDF/HDF files. Further development will allow GeoTiff, GRIB and possibly Met Office (.pp) files to be represented, and will use the Pangeo Rechunker tool to create Zarr stores for Kerchunk-incompatible datasets.

Example Notebooks at this link

Documentation hosted at this link

Release 1.3.5

Release date: 17 April 2025

See the release notes for details.

This package acknowledges contributions by Matt Brown as a pre-release tester.

Installation

To install this package, clone the repository using git clone. If release v1.3 has not yet been released, also switch to the MigrationOO branch with git checkout MigrationOO.

Then follow the steps below to install the package with the necessary dependencies.

python -m venv .venv
source .venv/bin/activate
pip install poetry
poetry install
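
Once the steps above have run, one way to confirm that the package landed in the active environment is to query its installed metadata. The following is a minimal sketch using only the Python standard library; the distribution name "padocc" is taken from the file names on this page, and the helper name installed_version is hypothetical rather than part of the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string for dist_name, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    # "padocc" is assumed to be the distribution name, as in the file names.
    found = installed_version("padocc")
    if found is None:
        print("padocc is not installed in this environment")
    else:
        print(f"padocc {found} is installed")
```

Run it with the virtual environment activated, so that it inspects the same environment that poetry installed into.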

Usage

Please refer to the tests/ scripts for how to use the GroupOperation and ProjectOperation classes.

Download files

Source Distribution

padocc-1.4.0a0.tar.gz (9.9 MB)

Uploaded Source

Built Distribution

padocc-1.4.0a0-py3-none-any.whl (10.0 MB)

Uploaded Python 3

File details

Details for the file padocc-1.4.0a0.tar.gz.

File metadata

  • Download URL: padocc-1.4.0a0.tar.gz
  • Upload date:
  • Size: 9.9 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.2 CPython/3.11.9 Linux/5.14.0-570.28.1.el9_6.x86_64

File hashes

Hashes for padocc-1.4.0a0.tar.gz
Algorithm    Hash digest
SHA256       e7514980c7e25ac6c4a499a82153d9f3cb0dd31087457b53c438f7aa594a8f20
MD5          94427cfd526915bf7e7ac6137c6c2f9a
BLAKE2b-256  6c6095be0d059a8e66a05483f4f8bdb5d6090a28d8328eda33fdcfcfda337108

File details

Details for the file padocc-1.4.0a0-py3-none-any.whl.

File metadata

  • Download URL: padocc-1.4.0a0-py3-none-any.whl
  • Upload date:
  • Size: 10.0 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.2 CPython/3.11.9 Linux/5.14.0-570.28.1.el9_6.x86_64

File hashes

Hashes for padocc-1.4.0a0-py3-none-any.whl
Algorithm    Hash digest
SHA256       5845f313a6a62d859abaf0b1229ecf81ae68f8e9ab8f5b52d587a7329e5fc4fb
MD5          cff8a81a696bc5c411e3444374d510c5
BLAKE2b-256  0f4b16438f063ca9fe4bdac8c6d860d155ada022f9ccbb2ddcca6f367240db1b
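
The published digests can be used to check that a downloaded file is intact. The following is a minimal sketch using Python's standard hashlib module; it assumes the wheel has been downloaded into the current directory under the file name listed above, and the helper names sha256_of_file and matches_published_hash are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published value."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed local path: the wheel saved in the current working directory.
    wheel = Path("padocc-1.4.0a0-py3-none-any.whl")
    # SHA256 digest copied from the hash listing for this file.
    expected = "5845f313a6a62d859abaf0b1229ecf81ae68f8e9ab8f5b52d587a7329e5fc4fb"
    if wheel.exists():
        ok = matches_published_hash(wheel, expected)
        print("hash matches" if ok else "hash MISMATCH")
    else:
        print(f"{wheel} not found; download it first")
```

The same check works for the source distribution by substituting its file name and SHA256 digest.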
