File format conversions to cfdb
cfdb-ingest
Convert meteorological model output to cfdb with standardized CF conventions
Documentation: https://mullenkamp.github.io/cfdb-ingest/
Source Code: https://github.com/mullenkamp/cfdb-ingest
Overview
cfdb-ingest converts meteorological model output stored as netCDF4/HDF5 files into cfdb. It standardizes variable names and attributes to follow CF conventions, so that datasets from different sources can be used through a single interface.
Supported sources:
- WRF -- wrfout NetCDF files (all variables in one file per time range)
- ERA5 -- NCAR ERA5 NetCDF files (one variable per file, surface + pressure level + invariant products)
Key features:
- Automatic variable mapping -- source variable names are translated to CF-standard names with proper metadata via cfdb-vars
- Named height coordinates -- surface variables at specific heights (0m, 2m, 10m, 100m) get their own named coordinates (e.g. height_2m), allowing them to coexist with pressure-level variables without ambiguity
- Wind rotation (WRF) -- grid-relative wind components are rotated to earth-relative
- VIMF computation (ERA5) -- native calculation of vertically integrated moisture flux from Q, U, and V
- 3D level interpolation (WRF) -- eta-level variables are interpolated to user-specified height or pressure levels
- Auto pressure level detection (ERA5) -- pressure levels are read directly from source files
- Split or combined output (ERA5) -- create one cfdb per variable or combine into a single dataset
- WPS intermediate file export -- convert cfdb datasets to WPS intermediate format for metgrid.exe
- Spatial and temporal filtering -- subset by bounding box and/or date range
- Multi-file support -- seamlessly spans multiple input files
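The wind rotation listed above can be illustrated with the standard WRF rotation formula, which uses the COSALPHA and SINALPHA fields stored in wrfout files. This is a minimal single-point sketch in plain Python. The function name `rotate_to_earth` is hypothetical, and nothing here shows how cfdb-ingest performs the rotation internally.

```python
import math


def rotate_to_earth(u_grid, v_grid, cosalpha, sinalpha):
    """Rotate grid-relative wind components to earth-relative ones.

    cosalpha and sinalpha are the local map rotation factors
    (COSALPHA and SINALPHA in a wrfout file) at the same grid point.
    Hypothetical helper for illustration only.
    """
    u_earth = u_grid * cosalpha - v_grid * sinalpha
    v_earth = v_grid * cosalpha + u_grid * sinalpha
    return u_earth, v_earth


# Example: a grid rotated by 30 degrees relative to true east/north.
alpha = math.radians(30.0)
u_e, v_e = rotate_to_earth(3.0, 4.0, math.cos(alpha), math.sin(alpha))

# A pure rotation changes direction but not wind speed.
speed_before = math.hypot(3.0, 4.0)
speed_after = math.hypot(u_e, v_e)
```

Because the transform is a pure rotation, comparing `speed_before` and `speed_after` is a quick sanity check when validating converted output.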
Performance
cfdb-ingest is designed for high-performance processing of large meteorological datasets:
- Vectorized rechunking -- utilizes rechunkit for optimized HDF5 reads, even when extracting small spatial subsets across many timesteps.
- Parallel initialization -- multi-threaded file scanning and metadata extraction for fast startup.
- HDF5 chunk caching -- manages the HDF5 chunk cache to prevent redundant I/O during per-timestep transformations.
- Synchronized multi-variable rechunking -- synchronized iteration for derived variables (like VIMF) to eliminate redundant reads of shared source variables.
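To make the last point concrete, the sketch below computes both components of vertically integrated moisture flux for one column in a single pass, so that Q is consumed once rather than once per component. It assumes the textbook definition (the integral of q·u and q·v over pressure, divided by g) and a trapezoidal rule. The function name `vimf_column` and the integration scheme are assumptions for illustration, not the package's actual implementation.

```python
G = 9.80665  # standard gravity, m s^-2


def vimf_column(pressure_pa, q, u, v):
    """Integrate moisture flux over one vertical column.

    pressure_pa: pressure levels in Pa (monotonic, either direction)
    q: specific humidity (kg/kg) on those levels
    u, v: wind components (m/s) on those levels
    Returns (eastward, northward) flux in kg m^-1 s^-1.
    Hypothetical helper; trapezoidal integration is an assumption.
    """
    flux_u = 0.0
    flux_v = 0.0
    # Single pass over the levels: each q value feeds both components.
    for k in range(len(pressure_pa) - 1):
        dp = abs(pressure_pa[k + 1] - pressure_pa[k])
        qu_mean = 0.5 * (q[k] * u[k] + q[k + 1] * u[k + 1])
        qv_mean = 0.5 * (q[k] * v[k] + q[k + 1] * v[k + 1])
        flux_u += qu_mean * dp
        flux_v += qv_mean * dp
    return flux_u / G, flux_v / G


# Example column with uniform humidity and wind on four pressure levels.
levels = [100000.0, 85000.0, 70000.0, 50000.0]
q = [0.01] * 4
u = [10.0] * 4
v = [-5.0] * 4
east, north = vimf_column(levels, q, u, v)
```

With uniform inputs the result reduces to q·u·Δp/g and q·v·Δp/g over the total pressure depth, which gives a simple hand check.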
Installation
Requires Python >= 3.10.
pip install cfdb-ingest
Quick Start
WRF
    from cfdb_ingest import WrfIngest

    wrf = WrfIngest('wrfout_d01_2023-02-12_00:00:00.nc')
    wrf.convert(
        cfdb_path='output.cfdb',
        variables=['T2', 'WIND10'],
        start_date='2023-02-12T06:00',
        end_date='2023-02-12T18:00',
    )
Or from the command line:

    cfdb-ingest wrf wrfout_d01_*.nc output.cfdb -v T2,WIND10 -s 2023-02-12T06:00 -e 2023-02-12T18:00
For WPS export, use the --preset wps flag:
cfdb-ingest wrf /path/to/wrfout/ output.cfdb --preset wps -s 2023-02-10 -e 2023-02-10_06
cfdb-to-int output.cfdb -s 2023-02-10 -e 2023-02-10_06
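The two WPS commands above can be chained from a script. The following is a minimal sketch that only reuses the arguments shown in those commands. The helper names `wps_export_commands` and `run_wps_export` are hypothetical, and the default dry-run mode returns the commands without executing anything.

```python
import subprocess


def wps_export_commands(wrfout_dir, cfdb_path, start, end):
    """Build the two CLI invocations shown above as argument lists."""
    ingest = [
        "cfdb-ingest", "wrf", wrfout_dir, cfdb_path,
        "--preset", "wps", "-s", start, "-e", end,
    ]
    export = ["cfdb-to-int", cfdb_path, "-s", start, "-e", end]
    return [ingest, export]


def run_wps_export(wrfout_dir, cfdb_path, start, end, dry_run=True):
    """Run ingest then export; stop at the first failing step.

    With dry_run=True (the default) nothing is executed and the
    commands are only returned, which is useful for inspection.
    """
    commands = wps_export_commands(wrfout_dir, cfdb_path, start, end)
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)
    return commands


commands = run_wps_export(
    "/path/to/wrfout/", "output.cfdb", "2023-02-10", "2023-02-10_06"
)
```

Passing argument lists instead of a shell string avoids quoting problems with paths and dates.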
ERA5
    from cfdb_ingest import Era5Ingest

    era5 = Era5Ingest('/path/to/era5/*.nc')
    era5.convert(
        cfdb_path='era5.cfdb',
        variables=['SP', 'VAR_2T', 'T', 'U', 'V'],
        start_date='2020-01-01',
        end_date='2020-01-31',
    )
Or from the command line:

    # Combined: multiple variables in one cfdb
    cfdb-ingest era5 /path/to/era5/*.nc output.cfdb -v SP,VAR_2T,T,U,V -s 2020-01-01 -e 2020-01-31

    # Split: one cfdb file per variable
    cfdb-ingest era5 /path/to/era5/*.nc /output/dir/ --split -v SP,T
See the full documentation at https://mullenkamp.github.io/cfdb-ingest/ for details.
License
This project is licensed under the terms of the Apache Software License 2.0.