dataset preparation for data-driven weather models
mllam-data-prep
This package aims to provide a declarative way to prepare training data for data-driven (i.e. machine learning) weather forecasting models. A training dataset is constructed by declaring in a yaml configuration file (for example example.danra.yaml) the data sources, the variables to extract, the transformations to apply to the data, and the target variable(s) of the model architecture to map the data to.
The configuration is principally a means to represent how the dimensions of a given variable in a source dataset should be mapped to the dimensions and input variables of the model architecture to be trained.
The full configuration file specification is given in mllam_data_prep/config/spec.py.
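To make the idea of mapping source dimensions onto architecture dimensions concrete, here is a minimal sketch in plain Python. It is not the package's implementation and does not follow the configuration schema in spec.py; the function `map_dims`, the dimension names (`x`, `y`, `grid_index`) and the sizes are all hypothetical and chosen only for illustration. It shows that when several source dimensions are collapsed into one target dimension, the target size is the product of the source sizes.

```python
from math import prod


def map_dims(source_sizes, dim_mapping):
    """Compute target dimension sizes from source dimension sizes.

    source_sizes: dict of source dimension name -> length
    dim_mapping:  dict of target dimension name -> list of source
                  dimensions that are stacked into it (hypothetical format,
                  not the mllam-data-prep config schema)
    """
    target_sizes = {}
    for target_dim, source_dims in dim_mapping.items():
        missing = [d for d in source_dims if d not in source_sizes]
        if missing:
            raise KeyError(f"unknown source dimension(s): {missing}")
        # Stacking dimensions multiplies their lengths together.
        target_sizes[target_dim] = prod(source_sizes[d] for d in source_dims)
    return target_sizes


# Illustrative source variable with dimensions (time, x, y).
source_sizes = {"time": 8, "x": 4, "y": 3}

# Keep time as-is and flatten the horizontal grid into one dimension.
dim_mapping = {"time": ["time"], "grid_index": ["x", "y"]}

target_sizes = map_dims(source_sizes, dim_mapping)
print(target_sizes)
```

In this sketch the 4 by 3 horizontal grid becomes a single `grid_index` dimension of length 12, while `time` is passed through unchanged.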
Installation
To simply use mllam-data-prep, you can install the most recent tagged version from PyPI with pip:
python -m pip install mllam-data-prep
Developing mllam-data-prep
To work on developing mllam-data-prep, it is easiest to install and manage the dependencies with pdm. To get started, clone your fork of the main repo locally:
git clone https://github.com/<your-github-username>/mllam-data-prep
cd mllam-data-prep
Use pdm to create and use a virtualenv:
pdm venv create
pdm use --venv in-project
pdm install
All the linting is handled by pre-commit, which can be set up to run automatically on each git commit by installing the git commit hook:
pdm run pre-commit install
Then branch, commit, push and make a pull request :)
Usage
The package is designed to be used as a command-line tool. The main command is mllam-data-prep, which takes a configuration file as input and outputs a training dataset in the form of a .zarr dataset named after the config file (e.g. example.danra.yaml produces example.danra.zarr).
python -m mllam_data_prep example.danra.yaml
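The naming convention described above can be sketched from Python as follows. This is a small illustration using only the standard library, not part of the package: the helper names `output_path_for` and `build_command` are hypothetical, and placing the output next to the config file is an assumption (the text only states that the name is derived from the config file).

```python
import sys
from pathlib import Path


def output_path_for(config_path):
    """Derive the expected .zarr name from the config file name.

    Only the final suffix is replaced, so example.danra.yaml
    becomes example.danra.zarr.
    """
    return Path(config_path).with_suffix(".zarr")


def build_command(config_path):
    """Build the argument list for the documented command-line invocation."""
    return [sys.executable, "-m", "mllam_data_prep", str(config_path)]


config = Path("example.danra.yaml")
print(output_path_for(config))
print(build_command(config)[1:])
```

The list returned by `build_command` could be passed to `subprocess.run(cmd, check=True)` to run the tool from a script, provided the package is installed in the same Python environment.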