File format conversions to cfdb
Project description
cfdb-ingest
Convert meteorological model output to cfdb with standardized CF conventions
Documentation: https://mullenkamp.github.io/cfdb-ingest/
Source Code: https://github.com/mullenkamp/cfdb-ingest
Table of Contents
Overview
cfdb-ingest converts meteorological file formats (netCDF4/HDF5, GRIB2, etc.) from various model outputs into cfdb. It standardizes variable names and attributes to be consistent with CF conventions, making it straightforward to work with datasets from different sources through a single interface.
Key features:
- Automatic variable mapping -- source variable names are translated to CF-standard names with proper metadata (standard_name, units, encoding)
- Wind rotation -- grid-relative wind components are rotated to earth-relative using COSALPHA/SINALPHA
- 3D level interpolation -- eta-level variables are interpolated to user-specified height levels above ground
- Variable merging -- surface and level-interpolated variants of the same quantity (e.g. T2 at 2 m and T at arbitrary heights) are merged into a single output variable
- Spatial and temporal filtering -- subset by bounding box (WGS84) and/or date range before writing
- Multi-file support -- seamlessly spans multiple input files, including cross-file precipitation accumulation
- Configurable chunking -- tune output chunk shapes for different access patterns
Supported Formats
| Source | Class | CRS Projections |
|---|---|---|
| WRF (wrfout) | WrfIngest |
Lambert Conformal Conic, Polar Stereographic, Mercator, Lat-Lon |
WRF Variables
Surface variables (fixed height above ground):
| Key | cfdb Name | Height | Source Vars | Transform |
|---|---|---|---|---|
T2 |
air_temp |
2 m | T2 | direct |
PSFC |
surface_pressure |
0 m | PSFC | direct |
Q2 |
specific_humidity |
2 m | Q2 | direct |
RAIN |
precip |
0 m | RAINNC, RAINC | accumulation increment |
WIND10 |
wind_speed |
10 m | U10, V10 | wind rotation |
WIND_DIR10 |
wind_direction |
10 m | U10, V10 | wind rotation |
U10 |
u_wind |
10 m | U10, V10 | wind rotation |
V10 |
v_wind |
10 m | U10, V10 | wind rotation |
TSK |
soil_temp |
0 m | TSK | direct |
SWDOWN |
shortwave_radiation |
0 m | SWDOWN | direct |
GLW |
longwave_radiation |
0 m | GLW | direct |
SNOWH |
snow_depth |
0 m | SNOWH | direct |
SLP |
mslp |
0 m | PSFC, T2, HGT | hypsometric reduction |
3D level-interpolated variables (interpolated to user-specified target_levels):
| Key | cfdb Name | Source Vars | Transform |
|---|---|---|---|
T |
air_temp |
T, P, PB, PH, PHB | potential to actual temperature |
WIND |
wind_speed |
U, V, PH, PHB | unstagger + rotation |
WIND_DIR |
wind_direction |
U, V, PH, PHB | unstagger + rotation |
U |
u_wind |
U, V, PH, PHB | unstagger + rotation |
V |
v_wind |
U, V, PH, PHB | unstagger + rotation |
Q |
specific_humidity |
QVAPOR, PH, PHB | mixing ratio to specific humidity |
Installation
Requires Python >= 3.11.
pip install cfdb-ingest
Python API
Basic conversion
from cfdb_ingest import WrfIngest
wrf = WrfIngest('wrfout_d01_2023-02-12_00:00:00.nc')
# Convert selected variables for a time window
wrf.convert(
cfdb_path='output.cfdb',
variables=['T2', 'WIND10', 'precip'],
start_date='2023-02-12T06:00',
end_date='2023-02-12T18:00',
)
Multi-file input
wrf = WrfIngest([
'wrfout_d01_2023-02-12_00:00:00.nc',
'wrfout_d01_2023-02-13_00:00:00.nc',
])
# All timesteps across both files are merged automatically
wrf.convert(cfdb_path='output.cfdb', variables=['T2'])
Spatial subsetting with a bounding box
wrf.convert(
cfdb_path='output.cfdb',
variables=['T2'],
bbox=(165.0, -47.0, 175.0, -40.0), # (min_lon, min_lat, max_lon, max_lat)
)
3D level interpolation
# Interpolate 3D temperature and wind to specific heights above ground
wrf.convert(
cfdb_path='output.cfdb',
variables=['T', 'WIND'],
target_levels=[100.0, 500.0, 1000.0, 2000.0],
bbox=(165.0, -47.0, 175.0, -40.0),
)
Merging surface and 3D variables
Variables sharing a cfdb name are automatically merged. For example, T2 (2 m) and T (levels) both map to air_temp and produce a single output variable spanning all heights:
wrf.convert(
cfdb_path='output.cfdb',
variables=['T2', 'T'],
target_levels=[100.0, 500.0],
)
# Output height coordinate: [2.0, 100.0, 500.0]
Custom chunk shape
The output chunk shape defaults to (1, 1, ny, nx) (one full spatial slab per timestep per height level). Override it for different access patterns:
wrf.convert(
cfdb_path='output.cfdb',
variables=['T2'],
chunk_shape=(1, 1, 50, 50), # (time, z, y, x)
)
Inspecting metadata before conversion
wrf = WrfIngest('wrfout_d01_2023-02-12_00:00:00.nc')
wrf.crs # pyproj.CRS
wrf.times # numpy datetime64 array
wrf.x, wrf.y # 1D projected coordinate arrays
wrf.variables # dict of available variable mappings
wrf.bbox_geographic # (min_lon, min_lat, max_lon, max_lat)
Variable name resolution
variables accepts mapping keys (T2), source variable names (RAINNC), or cfdb names (air_temp). When a cfdb name maps to multiple keys, all are included:
wrf.resolve_variables(['air_temp']) # ['T2', 'T']
wrf.resolve_variables(['RAINNC']) # ['RAIN']
wrf.resolve_variables(None) # all available keys
CLI
cfdb-ingest provides a cfdb-ingest command with a wrf subcommand.
Basic usage
cfdb-ingest wrf wrfout_d01_2023-02-12_00:00:00.nc output.cfdb \
-v T2,WIND10 \
-s 2023-02-12T06:00 \
-e 2023-02-12T18:00
All options
cfdb-ingest wrf [OPTIONS] INPUT_PATHS... CFDB_PATH
| Option | Short | Description |
|---|---|---|
--variables |
-v |
Comma-separated variable names |
--start-date |
-s |
Start date (ISO format) |
--end-date |
-e |
End date (ISO format) |
--bbox |
-b |
Bounding box: min_lon,min_lat,max_lon,max_lat |
--target-levels |
-l |
Comma-separated height levels in meters |
--chunk-shape |
-c |
Output chunk shape: time,z,y,x (e.g. 1,1,50,50) |
--max-mem |
Read buffer size in bytes (default: 128 MiB) | |
--compression |
Compression algorithm: zstd or lz4 |
Examples
# Convert with spatial subset
cfdb-ingest wrf wrfout_d01_*.nc output.cfdb \
-v T2 -b 165.0,-47.0,175.0,-40.0
# 3D temperature at specific height levels
cfdb-ingest wrf wrfout_d01_*.nc output.cfdb \
-v T -l 100,500,1000,2000 -b 165.0,-47.0,175.0,-40.0
# Custom chunk shape for time-series access patterns
cfdb-ingest wrf wrfout_d01_*.nc output.cfdb \
-v T2,WIND10 -c 24,1,50,50
Development
Setup environment
We use UV to manage the development environment and production build.
uv sync
Run tests
uv run pytest
Lint and format
uv run lint
License
This project is licensed under the terms of the Apache Software License 2.0.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file cfdb_ingest-0.1.6.tar.gz.
File metadata
- Download URL: cfdb_ingest-0.1.6.tar.gz
- Upload date:
- Size: 18.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
4b1c8ab374e76107e768031f3f9b8d4d7de08d9890edfd9587dabbbc1d1dd1da
|
|
| MD5 |
3bb6b9f87535561717aea10dc53de874
|
|
| BLAKE2b-256 |
b3cdabc9a33cfc3c62f537938b9a4ff06ae4728f3140be2fa1cc031d5c6b811f
|
File details
Details for the file cfdb_ingest-0.1.6-py3-none-any.whl.
File metadata
- Download URL: cfdb_ingest-0.1.6-py3-none-any.whl
- Upload date:
- Size: 19.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f5b573832c50ecc7debb0ca42f240506ceeef7f5b589b5468937f639fb81857d
|
|
| MD5 |
4c1cd0316cdcd71493490322dc8773be
|
|
| BLAKE2b-256 |
077ba3fbde1336ef9296c8174186d4175da17d21bd3cd432ce9764063251b209
|