Pipeline to Aggregate Data for Optimised Cloud Capabilities

PADOCC Package

Now a repository under the cedadev group!

Padocc (Pipeline to Aggregate Data for Optimised Cloud Capabilities) is a data aggregation pipeline for creating Kerchunk (or alternative) files that represent datasets stored in various original formats. The pipeline currently supports writing JSON or Parquet Kerchunk files for input NetCDF/HDF files. Further development will allow GeoTIFF, GRIB and possibly Met Office (.pp) files to be represented, and will use the Pangeo Rechunker tool to create Zarr stores for Kerchunk-incompatible datasets.
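For readers unfamiliar with Kerchunk, the sketch below shows roughly what a JSON reference file contains: a mapping from Zarr-style keys either to inline metadata strings or to `[url, offset, length]` byte ranges inside the original file. It uses only the standard library and does not call padocc or kerchunk. The function name `build_reference_set`, the variable name `temperature`, the file name `example.nc` and all offsets are hypothetical values chosen for illustration, and real NetCDF chunks are rarely laid out this regularly.

```python
import json


def build_reference_set(source_url, n_chunks, chunk_len=10, data_start=0):
    """Build a toy Kerchunk-style (version 1) reference set.

    Assumes a single 1-D float64 variable stored as uncompressed,
    contiguous chunks starting at byte `data_start` of `source_url`.
    All of these are illustrative assumptions.
    """
    itemsize = 8  # bytes per float64 value
    chunk_bytes = chunk_len * itemsize

    refs = {
        # Inline Zarr metadata is stored as JSON strings.
        ".zgroup": json.dumps({"zarr_format": 2}),
        "temperature/.zarray": json.dumps({
            "zarr_format": 2,
            "shape": [n_chunks * chunk_len],
            "chunks": [chunk_len],
            "dtype": "<f8",
            "compressor": None,
            "filters": None,
            "fill_value": None,
            "order": "C",
        }),
    }

    # Each chunk key points at a byte range in the original file.
    for i in range(n_chunks):
        offset = data_start + i * chunk_bytes
        refs[f"temperature/{i}"] = [source_url, offset, chunk_bytes]

    return {"version": 1, "refs": refs}


example = build_reference_set("example.nc", n_chunks=3, data_start=1024)
print(json.dumps(example, indent=2))
```

The point of the structure is that no data is copied: a reader resolves each chunk key to a byte range and fetches only that range from the original file.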

Example Notebooks at this link

Documentation hosted at this link

Kerchunk Pipeline

Installation

To install this package, clone the repository using git clone. If release v1.3 has not yet been released, also switch to the MigrationOO branch with git checkout MigrationOO.

Then follow the steps below to install the package with the necessary dependencies.

python -m venv .venv
source .venv/bin/activate
pip install poetry
poetry install
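After the steps above, one way to confirm that the package landed in the active virtual environment is to query its installed metadata. This is a minimal sketch using only the standard library. It assumes the distribution is registered under the name `padocc` (as in the published file names), and the helper `installed_version` is a hypothetical name, not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="padocc"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version()
if found:
    print(f"padocc version: {found}")
else:
    print("padocc is not installed in this environment")
```

Run this with the interpreter from `.venv` so that it inspects the same environment that Poetry installed into.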

Usage

Please refer to the tests/ scripts for how to use the GroupOperation and ProjectOperation classes.
