Tool for aggregating raw NWP files into .zarr files
NWP Consumer
Download and convert weather data for use in ML pipelines
Some renewables, such as solar and wind, generate power according to the weather conditions. Forecasting their output therefore requires predictions of how these conditions will change. Many meteorological organisations provide Numerical Weather Prediction (NWP) data, which can then be used for model training and inference.
This data is often very large and can come in various formats. Furthermore, these formats are not necessarily suitable for training, so may require preprocessing and conversion.
This package aims to streamline the collection and processing of this NWP data.
[!Note] This is not built to replace tools such as Herbie. It is built to produce data specific to the needs of Open Climate Fix's models, so things like the output format and the variable selection are hard-coded. If you need a more configurable CLI-driven tool, consider using Herbie instead.
Installation
Install from PyPI using pip:
$ pip install nwp-consumer
Or use the container image:
$ docker pull ghcr.io/openclimatefix/nwp-consumer
Example usage
To download the latest available day of GFS data:
$ nwp-consumer consume
To create an archive of a month of GFS data:
[!Note] This will download several gigabytes of data to your home partition. Make sure you have plenty of free space (and time!)
$ nwp-consumer archive --year 2024 --month 1
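To archive several months in a row, the same command can be repeated with a different --month value. The following is a minimal sketch that only builds and prints the command lines, assuming nothing beyond the --year and --month flags shown above; the archive_commands helper is hypothetical and not part of the package.

```python
import shlex


def archive_commands(year: int, months: range) -> list[list[str]]:
    """Build one `nwp-consumer archive` invocation per month (hypothetical helper)."""
    return [
        ["nwp-consumer", "archive", "--year", str(year), "--month", str(month)]
        for month in months
    ]


# Print the commands instead of running them, since each one
# downloads several gigabytes of data.
for cmd in archive_commands(2024, range(1, 4)):
    print(shlex.join(cmd))
```

Each printed line can be run in a shell, or passed to subprocess.run once you are sure there is enough disk space.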
Documentation
Documentation is generated via pdoc. To build the documentation, run the following command in the repository root:
$ PDOC_ALLOW_EXEC=1 python -m pdoc -o docs --docformat=google src/nwp_consumer
[!Note] The PDOC_ALLOW_EXEC=1 environment variable is required due to a facet of the ocf_blosc2 library, which imports itself automatically and hence necessitates execution to be enabled.
FAQ
How do I authenticate with model repositories that require accounts?
Authentication and model repository selection are both handled via environment variables. Choose a repository via the MODEL_REPOSITORY environment variable. The environment variables each repository requires are listed in that repository's metadata function. The consumer warns about any missing variables at runtime.
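If you would rather catch missing variables before launching the consumer, a small pre-flight check can help. This is a sketch only, not the package's own check: the missing_env_vars helper and the EXAMPLE_API_KEY name are hypothetical, and the real list of required names has to come from the chosen repository's metadata.

```python
import os
from collections.abc import Mapping, Sequence


def missing_env_vars(
    required: Sequence[str], env: Mapping[str, str] = os.environ
) -> list[str]:
    """Return the required variable names that are unset or empty (hypothetical helper)."""
    return [name for name in required if not env.get(name)]


# MODEL_REPOSITORY comes from the README; EXAMPLE_API_KEY is a placeholder
# standing in for whatever the selected repository actually requires.
missing = missing_env_vars(["MODEL_REPOSITORY", "EXAMPLE_API_KEY"])
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```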
How do I use an S3 bucket for created stores?
The ZARRDIR environment variable can be set to an S3 URL (e.g. s3://some-bucket-name/some-prefix). Valid credentials for accessing the bucket must be discoverable in the environment as per Botocore's documentation.
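A malformed URL is easy to miss until a write fails, so it can be worth validating the value first. The sketch below uses only the standard library to split an S3 URL into bucket and prefix; split_s3_url is a hypothetical helper, and how the package itself parses ZARRDIR is not shown here.

```python
from urllib.parse import urlparse


def split_s3_url(url: str) -> tuple[str, str]:
    """Split 's3://bucket/prefix' into (bucket, prefix) (hypothetical helper)."""
    parsed = urlparse(url)
    # Reject local paths and other schemes rather than guessing a bucket name
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URL: {url!r}")
    return parsed.netloc, parsed.path.lstrip("/")


bucket, prefix = split_s3_url("s3://some-bucket-name/some-prefix")
print(bucket, prefix)
```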
How do I change what variables are pulled?
With difficulty! This package pulls data specifically tailored to Open Climate Fix's needs, and as such, the data it pulls (and the schema that data is surfaced with) is a fixed part of the package. A large part of the value proposition of this consumer is that the data it produces is consistent and comparable between different sources, so pull requests to the effect of adding or changing this for a specific model are unlikely to be approved.
However, you can make the changes you need by cloning the repo and modifying the expected coordinates in the metadata of the relevant model repository.
Development
Linting and static type checking
This project uses MyPy for static type checking and Ruff for linting. Installing the development dependencies makes them available in your virtual environment.
Use them via:
$ python -m mypy .
$ python -m ruff check .
Be sure to do this periodically while developing to catch any errors early and prevent headaches with the CI pipeline. It may seem like a hassle at first, but it prevents accidental creation of a whole suite of bugs.
Running the test suite
Run the unit tests with:
$ python -m unittest discover -s src/nwp_consumer -p "test_*.py"
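The -p "test_*.py" pattern means discovery only picks up files whose names start with test_. As an illustration, a file such as test_example.py (a hypothetical name) placed under src/nwp_consumer could look like the sketch below; the celsius_to_kelvin function is invented purely to give the test something to check and is not part of the package.

```python
import unittest


def celsius_to_kelvin(celsius: float) -> float:
    """Hypothetical function under test, not part of nwp-consumer."""
    return celsius + 273.15


class TestCelsiusToKelvin(unittest.TestCase):
    """Discovered because the class subclasses unittest.TestCase."""

    def test_freezing_point(self) -> None:
        self.assertAlmostEqual(celsius_to_kelvin(0.0), 273.15)

    def test_absolute_zero(self) -> None:
        self.assertAlmostEqual(celsius_to_kelvin(-273.15), 0.0)
```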
Further reading
On packaging a python project using setuptools and pyproject.toml:
- The official PyPA packaging guide.
- A step-by-step practical guide on the godatadriven blog.
- The pyproject.toml metadata specification.
On hexagonal architecture:
- A concrete example using Python.
- An overview of the fundamentals incorporating TypeScript.
- Another example using Go.
On the directory structure:
- The official PyPA discussion on src and flat layouts.
Contributing and community
- PRs are welcome! See the Organisation Profile for details on contributing
- Find out about our other projects in the OCF Meta Repo
- Check out the OCF blog for updates
- Follow OCF on LinkedIn
Part of the Open Climate Fix community.