Community Forcing Service — acquire-and-subset access to meteorological forcing for hydrological models
CFS — Community Forcing Service
Acquire-and-subset access to meteorological forcing products for hydrological modelling.
Acquiring forcing for a modeling study traditionally means bespoke scripting
per product — every product has its own API, native variable names, units,
accumulation conventions, and grid, and the accumulation-to-rate conversion is
re-implemented (and mis-implemented) in every group's scripts. CFS replaces
that with one async interface over 33 products that stops at a canonical,
CF-aligned xarray.Dataset — deliberately leaving catchment/HRU remapping and
model-specific file formats to modeling frameworks (e.g. SYMFLUENCE).
Documentation: https://darriey.github.io/CFS/
CFS is the third member of the community-data triad alongside CAS (Community Attribute Service) and CSFS (Community Streamflow Service):
| Service | Data | Returns |
|---|---|---|
| CAS | geospatial attributes (DEM, soil, land cover) | harmonized zonal statistics |
| CSFS | streamflow observations | harmonized station time series |
| CFS | meteorological forcing | canonical, subset xarray.Dataset |
The boundary (why CFS stops where it does)
CFS does exactly one job: acquire a forcing product, subset it to a bounding
box + time range, harmonize it to a canonical schema, and hand back a lazy
xarray.Dataset. That's it.
It deliberately does not:
- remap to HRUs / sub-basins,
- write model-specific forcing schemas (SUMMA, FUSE, mizuRoute, …),
- serialize monthly NetCDF chunks or handle HPC filesystem locking.
Those steps are model- and deployment-specific, so they stay in the consumer (e.g. SYMFLUENCE). Keeping the boundary here is what makes CFS reusable across frameworks rather than a SYMFLUENCE library in disguise.
upstream store ──▶ subset to bbox+time ──▶ harmonize to canonical ──▶ xr.Dataset
(Zarr/S3/…)        cfs.subset.bbox         cfs.subset.canonical           │
                                                                          ▼
                                                       [ consumer: HRU remap + model schema ]
Canonical schema (canonical-v1)
Every connector renames native variables to CF-aligned canonical names and
converts to canonical SI units (see cfs/core/vocabulary.py). Precipitation and
radiation are always returned as rates (kg m-2 s-1, W m-2), never
accumulations — the conversion that most often goes wrong is done once, here.
The output contract (names, units, attrs, grid layouts, time conventions) is
specified normatively in
the canonical-v1 spec.
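As a rough illustration of the accumulation-to-rate conversion described above, the sketch below turns a precipitation depth accumulated over one step into a mass flux (kg m-2 s-1) and an accumulated energy (J m-2) into a mean power (W m-2). The function names and the choice of metres of water as the native unit are assumptions made for this example; they are not taken from cfs/core/vocabulary.py.

```python
WATER_DENSITY = 1000.0  # kg m-3, so 1 m of water depth equals 1000 kg m-2


def depth_to_flux(depth_m: float, step_seconds: float) -> float:
    """Convert a precipitation depth (m) accumulated over one step to kg m-2 s-1."""
    if step_seconds <= 0:
        raise ValueError("step_seconds must be positive")
    return depth_m * WATER_DENSITY / step_seconds


def energy_to_power(energy_j_m2: float, step_seconds: float) -> float:
    """Convert radiation accumulated over one step (J m-2) to a mean W m-2."""
    if step_seconds <= 0:
        raise ValueError("step_seconds must be positive")
    return energy_j_m2 / step_seconds


# 0.36 mm of precipitation accumulated over one hour
flux = depth_to_flux(0.00036, 3600.0)
# 1.8 MJ m-2 of shortwave radiation accumulated over one hour
power = energy_to_power(1.8e6, 3600.0)
print(f"{flux:.1e} kg m-2 s-1, {power:.0f} W m-2")
```

Dividing by the wrong step length (or forgetting the division entirely) is the typical failure mode, which is why doing it once in the connector layer is attractive.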
Install
pip install 'community-forcing-service[climate]' # xarray, zarr, gcsfs, dask, netcdf4
The distribution is named community-forcing-service (the name cfs is taken
on PyPI), but the import package and CLI are still cfs (import cfs). From a
checkout:
pip install -e '.[climate]'
Use
cfs providers # list registered providers
cfs products # list products + canonical variables
cfs fetch \
  -P era5_arco:single_levels \
  -b -114.5,50.7,-114.0,51.1 \
  --start 2015-06-01T00:00 --end 2015-06-01T06:00 \
  -v air_temperature,precipitation_flux
Python:
import asyncio

from cfs.core.models import BoundingBox, TimeRange
from cfs.core.registry import discover, get_connector
from cfs.core.vocabulary import CanonicalVar


async def main():
    discover()  # populate the connector registry
    Conn = get_connector("era5_arco")
    async with Conn() as conn:
        ds, result = await conn.fetch(
            "era5_arco:single_levels",
            BoundingBox(min_lon=-114.5, min_lat=50.7, max_lon=-114.0, max_lat=51.1),
            TimeRange(start=..., end=...),  # supply the start/end of the window
            variables=[CanonicalVar.AIR_TEMPERATURE, CanonicalVar.PRECIPITATION_FLUX],
        )
    # ds: lazy canonical cube; result: FetchResult provenance/shape metadata
    return ds, result


ds, result = asyncio.run(main())
Adding a connector
Subclass BaseForcingConnector (optionally mix in ZarrStoreMixin), implement
list_products() and fetch(), declare a VariableMapping table mapping native
names → canonical vars + linear unit conversions, and decorate with
@register("slug"). discover() finds it automatically.
Providers
33 connectors — 31 live-verified against their upstream stores (19 anonymous +
12 auth-gated, confirmed with real CDS and Earthdata credentials); mswep and
em_earth are offline-verified pending access/credentials. Highlights:
| Category | Products |
|---|---|
| Global / regional reanalyses | ERA5 (ARCO + CDS), ERA5-Land, MERRA-2, CARRA, CERRA, RDRS/CaSR, BARRA-R2, CONUS404, NARR, WFDE5 |
| Analysis / observation grids | AORC (+ NWM grid), NLDAS-2, HRRR, NWM operational, Daymet, gridMET, nClimGrid-Daily, GLDAS, FLDAS, E-OBS |
| Satellite / merged precipitation | CHIRPS, CHIRTS, GPM IMERG, PERSIANN-CDR, CMORPH, MSWEP, EM-Earth |
| Forecasts | GFS (deterministic), GEFS (ensemble, member dimension) |
| Climate projections | NEX-GDDP-CMIP6, NA-CORDEX |
The full per-provider table — grid type, access protocol, auth, verification
status, and the per-provider caveats (rolling archive windows, unverified
units, slow OPeNDAP paths, derivation notes) — lives in the
provider catalog, with the
machine-readable version in
inventory/providers.yaml.
CDS connectors need ~/.cdsapirc; Earthdata connectors need
EARTHDATA_TOKEN (or ~/.netrc / EARTHDATA_USERNAME+PASSWORD) with the
"NASA GESDISC DATA ARCHIVE" app authorized. GFS/GEFS need the forecast
extra:
pip install 'community-forcing-service[climate,cds,earthdata,forecast]'
Note that CFS is a passthrough service — every fetch hits the provider's live store, so transient upstream outages (THREDDS restarts, S3 hiccups, CDS queue congestion) can surface as fetch errors independent of CFS itself.
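Because transient upstream failures surface to the caller, a consumer may want to wrap fetches in a retry loop. The sketch below is one possible approach, not something CFS is documented to provide: with_retries is a hypothetical helper, and the exception types it retries on are an assumption, since the errors a given connector raises are not specified here.

```python
import asyncio
import random
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def with_retries(
    call: Callable[[], Awaitable[T]],
    attempts: int = 4,
    base_delay: float = 1.0,
    # Assumed set of "transient" errors; adjust to what the connector raises.
    retry_on: Tuple[Type[BaseException], ...] = (OSError, asyncio.TimeoutError),
) -> T:
    """Await call(), retrying transient failures with exponential backoff and jitter."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the upstream error
            delay = base_delay * 2 ** (attempt - 1)
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")


# Demo with a simulated fetch that fails twice before succeeding.
calls = {"n": 0}


async def flaky_fetch() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated upstream hiccup")
    return "dataset"


result = asyncio.run(with_retries(flaky_fetch, base_delay=0.01))
print(result, "after", calls["n"], "calls")
```

In real use, call would be a zero-argument callable that wraps conn.fetch with the desired product, bounding box, and time range.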
Hardening / robustness
- Range QC (cfs/qc.py): every fetch samples the harmonized cube against each canonical variable's physical valid_range and reports out-of-range values in FetchResult.warnings, catching unit-conversion bugs (a precip flux of 8.6 instead of 1e-4) before they reach a model. Advisory; never fails a fetch. Toggle with CFS_QC_ENABLED.
- Fetch guardrails: shared _guard_area (CFS_MAX_AREA_DEG2) and cell-count (CFS_MAX_CELLS_PER_FETCH) checks on the base class refuse accidental continental/decadal pulls; enforced uniformly via _finalize.
- Reset-aware de-accumulation (cfs/subset/deaccumulate.py): running-total fields (ERA5-Land tp/ssrd/strd) are converted to per-step increments before unit conversion, handling daily resets.
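Two of the items above, reset-aware de-accumulation and advisory range QC, can be sketched in a few lines. This is an illustration under stated assumptions rather than the code in cfs/subset/deaccumulate.py or cfs/qc.py: the function names are hypothetical, resets are detected here by a drop in the running total (the real module may key on timestamps instead), and the bounds in VALID_RANGE are made up for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def deaccumulate(totals: Sequence[float]) -> List[float]:
    """Turn a running total into per-step increments.

    A drop in the running total is treated as a reset, in which case the
    value itself is the amount accumulated since that reset. The first
    value is assumed to follow a reset.
    """
    increments: List[float] = []
    previous = 0.0
    for total in totals:
        if total < previous:          # accumulator was reset before this step
            increments.append(total)
        else:
            increments.append(total - previous)
        previous = total
    return increments


VALID_RANGE: Dict[str, Tuple[float, float]] = {
    # Illustrative bounds only, not the values used by CFS.
    "precipitation_flux": (0.0, 0.1),    # kg m-2 s-1
    "air_temperature": (170.0, 340.0),   # K
}


def range_warnings(name: str, values: Sequence[float]) -> List[str]:
    """Return advisory warnings for values outside the variable's valid range."""
    lo, hi = VALID_RANGE[name]
    present = [v for v in values if not math.isnan(v)]  # skip missing values
    bad = [v for v in present if not lo <= v <= hi]
    if not bad:
        return []
    return [f"{name}: {len(bad)} of {len(present)} values outside [{lo}, {hi}]"]


steps = deaccumulate([1.0, 3.0, 6.0, 0.5, 2.0])   # reset between 6.0 and 0.5
warnings = range_warnings("precipitation_flux", [1e-4, 8.6])
print(steps)
print(warnings)
```

A value of 8.6 is what a flux of 1e-4 kg m-2 s-1 looks like when it is left in mm/day, so even a loose upper bound catches that class of mistake.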
Derived variables
When a provider lacks a canonical field, CFS derives it once, in a tested place
(cfs/derive/). Currently: specific humidity from relative humidity
(cfs/derive/humidity.py, Bolton 1980 saturation vapour pressure) — used by
CARRA/CERRA, which ship 2 m RH rather than specific humidity. Derivation inputs
(RH) are consumed, not emitted: they do not appear in the canonical output.
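For readers unfamiliar with the derivation, the sketch below shows one standard way to get specific humidity from relative humidity using Bolton's (1980) saturation vapour pressure formula. The function names and argument order are assumptions for this example and may differ from cfs/derive/humidity.py; in particular, the calculation needs air pressure, and which pressure field the real implementation uses is not stated here.

```python
import math

EPSILON = 0.622  # ratio of molar masses, water vapour to dry air


def saturation_vapour_pressure_hpa(temp_k: float) -> float:
    """Bolton (1980) saturation vapour pressure over liquid water, in hPa."""
    temp_c = temp_k - 273.15
    return 6.112 * math.exp(17.67 * temp_c / (temp_c + 243.5))


def specific_humidity(rh_percent: float, temp_k: float, pressure_pa: float) -> float:
    """Specific humidity (kg kg-1) from RH (%), temperature (K) and pressure (Pa)."""
    e_hpa = (rh_percent / 100.0) * saturation_vapour_pressure_hpa(temp_k)
    p_hpa = pressure_pa / 100.0
    # q = eps * e / (p - (1 - eps) * e)
    return EPSILON * e_hpa / (p_hpa - (1.0 - EPSILON) * e_hpa)


# 50 % RH at 20 degC and standard sea-level pressure
q = specific_humidity(50.0, 293.15, 101325.0)
print(f"q = {q:.4f} kg kg-1")
```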
Tests
pytest -m 'not network' # offline: harmonization + subsetting logic
pytest -m network # integration: real ERA5 fetch from GCS
Naming note
"CFS" also denotes NOAA's Climate Forecast System (CFSR/CFSv2), itself a
forcing product. If a CFSR connector is ever added it must use a disambiguated
slug (e.g. cfsr) to avoid collision with the service name.