File format conversions to cfdb

cfdb-ingest

Convert meteorological model output to cfdb with standardized CF conventions


Documentation: https://mullenkamp.github.io/cfdb-ingest/

Source Code: https://github.com/mullenkamp/cfdb-ingest


Overview

cfdb-ingest converts meteorological model output files (netCDF4/HDF5) into cfdb. It standardizes variable names and attributes to follow CF conventions, so datasets from different sources can be used through a single interface.

Supported sources:

  • WRF -- wrfout NetCDF files (all variables in one file per time range)
  • ERA5 -- NCAR ERA5 NetCDF files (one variable per file, surface + pressure level + invariant products)

Key features:

  • Automatic variable mapping -- source variable names are translated to CF-standard names with proper metadata via cfdb-vars
  • Named height coordinates -- surface variables at specific heights (0m, 2m, 10m, 100m) get their own named coordinates (e.g. height_2m), allowing them to coexist with pressure-level variables without ambiguity
  • Wind rotation (WRF) -- grid-relative wind components are rotated to earth-relative
  • VIMF computation (ERA5) -- native calculation of vertically integrated moisture flux from Q, U, and V
  • 3D level interpolation (WRF) -- eta-level variables are interpolated to user-specified height or pressure levels
  • Auto pressure level detection (ERA5) -- pressure levels are read directly from source files
  • Split or combined output (ERA5) -- create one cfdb per variable or combine into a single dataset
  • WPS intermediate file export -- convert cfdb datasets to WPS intermediate format for metgrid.exe
  • Spatial and temporal filtering -- subset by bounding box and/or date range
  • Multi-file support -- seamlessly spans multiple input files
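
As an illustration of the wind rotation feature, the sketch below applies the standard WRF rotation using the map rotation factors that wrfout files store as COSALPHA and SINALPHA. The function name rotate_to_earth is hypothetical and this is not taken from the cfdb-ingest source; it only shows the arithmetic under that assumption.

```python
import math


def rotate_to_earth(u_grid, v_grid, cosalpha, sinalpha):
    """Rotate grid-relative wind components to earth-relative ones.

    cosalpha and sinalpha are the local map rotation factors
    (COSALPHA and SINALPHA in a wrfout file).
    Hypothetical helper, not part of the cfdb-ingest API.
    """
    u_earth = u_grid * cosalpha - v_grid * sinalpha
    v_earth = v_grid * cosalpha + u_grid * sinalpha
    return u_earth, v_earth


# Example: a 10 m/s grid-relative westerly where the grid is rotated by 30 degrees.
alpha = math.radians(30.0)
u_e, v_e = rotate_to_earth(10.0, 0.0, math.cos(alpha), math.sin(alpha))
print(u_e, v_e)
```

The rotation preserves wind speed, which makes a convenient sanity check when comparing converted output against the source file.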

Performance

cfdb-ingest is designed for high-performance processing of large meteorological datasets:

  • Vectorized rechunking -- uses rechunkit for optimized HDF5 reads, even when extracting small spatial subsets across many timesteps.
  • Parallel initialization -- multi-threaded file scanning and metadata extraction for fast startup.
  • HDF5 chunk caching -- manages the HDF5 chunk cache to prevent redundant I/O during per-timestep transformations.
  • Synchronized multi-variable rechunking -- synchronized iteration for derived variables (like VIMF) to eliminate redundant reads of shared source variables.
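
To make the shared-read idea concrete, here is a minimal sketch of a vertically integrated moisture flux calculation for a single column. It integrates q*u and q*v over pressure with the trapezoidal rule and divides by gravity, touching each level of q once for both components. The function name vimf, the integration scheme, and the column-wise layout are assumptions for illustration; the source does not state how cfdb-ingest performs the integration.

```python
G = 9.80665  # standard gravity, m s-2


def vimf(pressure_pa, q, u, v):
    """Trapezoidal integral of q*u and q*v over pressure for one column.

    pressure_pa: pressure levels in Pa (monotonic, either direction)
    q: specific humidity in kg/kg; u, v: wind components in m/s
    Returns (eastward, northward) moisture flux in kg m-1 s-1.
    Hypothetical helper, not part of the cfdb-ingest API.
    """
    east = 0.0
    north = 0.0
    for k in range(len(pressure_pa) - 1):
        dp = abs(pressure_pa[k + 1] - pressure_pa[k])
        # q at each level is reused for both flux components
        east += 0.5 * (q[k] * u[k] + q[k + 1] * u[k + 1]) * dp
        north += 0.5 * (q[k] * v[k] + q[k + 1] * v[k + 1]) * dp
    return east / G, north / G


# Example column with three pressure levels.
east, north = vimf(
    [100000.0, 85000.0, 70000.0],
    [0.012, 0.008, 0.004],
    [5.0, 8.0, 12.0],
    [1.0, 2.0, 3.0],
)
print(east, north)
```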

Installation

Requires Python >= 3.10.

pip install cfdb-ingest

Quick Start

WRF

from cfdb_ingest import WrfIngest

wrf = WrfIngest('wrfout_d01_2023-02-12_00:00:00.nc')
wrf.convert(
    cfdb_path='output.cfdb',
    variables=['T2', 'WIND10'],
    start_date='2023-02-12T06:00',
    end_date='2023-02-12T18:00',
)

From the command line:

cfdb-ingest wrf wrfout_d01_*.nc output.cfdb -v T2,WIND10 -s 2023-02-12T06:00 -e 2023-02-12T18:00

For WPS export, use the --preset wps flag:

cfdb-ingest wrf /path/to/wrfout/ output.cfdb --preset wps -s 2023-02-10 -e 2023-02-10_06
cfdb-to-int output.cfdb -s 2023-02-10 -e 2023-02-10_06

ERA5

from cfdb_ingest import Era5Ingest

era5 = Era5Ingest('/path/to/era5/*.nc')
era5.convert(
    cfdb_path='era5.cfdb',
    variables=['SP', 'VAR_2T', 'T', 'U', 'V'],
    start_date='2020-01-01',
    end_date='2020-01-31',
)

From the command line:

# Combined: multiple variables in one cfdb
cfdb-ingest era5 /path/to/era5/*.nc output.cfdb -v SP,VAR_2T,T,U,V -s 2020-01-01 -e 2020-01-31

# Split: one cfdb file per variable
cfdb-ingest era5 /path/to/era5/*.nc /output/dir/ --split -v SP,T
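
The split mode is shown above only for the command line. A rough Python counterpart, assuming convert() can simply be called once per variable, might look like the sketch below. The helper convert_split and the <variable>.cfdb file naming are hypothetical and may not match what --split actually produces.

```python
from pathlib import Path


def convert_split(ingest, out_dir, variables, **convert_kwargs):
    """Write one cfdb per variable by calling ingest.convert() repeatedly.

    Hypothetical helper: the <variable>.cfdb naming is an assumption,
    not the documented behaviour of the --split flag.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for var in variables:
        path = out_dir / f"{var}.cfdb"
        ingest.convert(cfdb_path=str(path), variables=[var], **convert_kwargs)
        paths.append(path)
    return paths


# Usage (requires cfdb-ingest and ERA5 input files):
# from cfdb_ingest import Era5Ingest
# era5 = Era5Ingest('/path/to/era5/*.nc')
# convert_split(era5, '/output/dir/', ['SP', 'T'],
#               start_date='2020-01-01', end_date='2020-01-31')
```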

See the full documentation at https://mullenkamp.github.io/cfdb-ingest/ for details.

License

This project is licensed under the terms of the Apache Software License 2.0.
