A plugin for climetlab to retrieve the Eumetnet postprocessing benchmark dataset.

The Eumetnet postprocessing benchmark dataset Climetlab plugin

It eases the download of the dataset's time-aligned forecasts, reforecasts (hindcasts) and observations (ERA5 reanalysis).

Warning: The current development stage is Alpha.

Using climetlab to access the data

See the demo notebooks (available on Binder).

The climetlab Python package gives access to the data with a few lines of code, such as:

# Uncomment the line below if climetlab and the plugin are not yet installed
#!pip install climetlab climetlab-eumetnet-postprocessing-benchmark
import climetlab as cml
ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface', "2017-12-02", "2t", "highres")
fcs = ds.to_xarray()

which downloads the deterministic (high-resolution) forecasts for the 2 metre temperature. Once obtained, the corresponding observations (if available) can be retrieved in xarray format with the get_observations_as_xarray method:

obs = ds.get_observations_as_xarray()

Datasets description

There are two main datasets:

1 - Gridded Data

  • The Eumetnet postprocessing benchmark dataset contains ECMWF ensemble and deterministic forecasts over a large portion of Europe, from 36° to 67° in latitude and from -6° to 17° in longitude, and covers the years 2017-2018.

  • It also contains the corresponding ERA5 reanalysis for the purpose of providing observations for the benchmark.

  • For some dates, it also contains reforecasts that cover 20 years of past forecasts recomputed with the most recent model version.

  • All the forecasts and reforecasts provided are the noon ECMWF runs.

  • The ensemble forecasts and reforecasts also contain the control run by default.

  • The gridded data resolution is 0.25° x 0.25° which corresponds roughly to 25 kilometers.

  • Please note that you can presently only retrieve one forecast date for each climetlab.load_dataset call.
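Because each call handles a single forecast date, a longer period has to be covered by looping over the dates. The sketch below shows one way to do this. The helper names forecast_dates and load_each_date are hypothetical (they are not part of the plugin), and the loading function is passed in as an argument so that the loop itself does not depend on climetlab:

```python
from datetime import date, timedelta


def forecast_dates(start, end):
    """Return ISO date strings from start to end, both included."""
    first = date.fromisoformat(start)
    last = date.fromisoformat(end)
    n_days = (last - first).days
    # An end date earlier than the start date yields an empty list.
    return [(first + timedelta(days=i)).isoformat() for i in range(n_days + 1)]


def load_each_date(dates, load_one):
    """Call load_one(date) once per date and collect the results by date."""
    return {d: load_one(d) for d in dates}


# Hypothetical usage (needs climetlab, the plugin and network access):
# import climetlab as cml
# name = 'eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface'
# datasets = load_each_date(
#     forecast_dates("2017-12-01", "2017-12-03"),
#     lambda d: cml.load_dataset(name, d, "2t", "highres"),
# )
```

Each entry of the resulting dictionary is then a separate dataset object, one per forecast date.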

There are 7 gridded sub-datasets:

1.1 - Extreme Forecast Index

All the Extreme Forecast Index (EFI) variables can be obtained for each forecast date.

It includes:

TODO: add links for the efi

| Parameter name | ECMWF key | Remarks |
| --- | --- | --- |
| 2 metre temperature efi | 2ti | |
| 10 metre wind speed efi | 10wsi | |
| 10 metre wind gust efi | 10fgi | |
| cape efi | capei | |
| cape shear efi | capesi | |
| Maximum temperature at 2m efi | mx2ti | |
| Minimum temperature at 2m efi | mn2ti | |
| Snowfall efi | sfi | |
| Total precipitation efi | tpi | |

The EFI variables are available for the model step ranges (in hours) 0-24, 24-48, 48-72, 72-96, 96-120, 120-144 and 144-168.

Usage: The EFI variables can be retrieved by calling

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-efi', date, parameter)
ds.to_xarray()

where the date argument is a string with a single date, and the parameter argument is a string or a list of strings with the ECMWF keys described above. Setting the parameter to 'all' downloads all the EFI parameters.

Example:

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-efi', "2017-12-02", "2ti")
ds.to_xarray()
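Since the parameter argument accepts a single key, a list of keys or 'all', it can be useful to check the keys against the table above before sending a request. The following is a minimal sketch of such a check; EFI_KEYS and check_efi_parameter are hypothetical names introduced for illustration, and how the plugin itself reacts to an unknown key is not covered here:

```python
# ECMWF keys copied from the EFI table above.
EFI_KEYS = ("2ti", "10wsi", "10fgi", "capei", "capesi", "mx2ti", "mn2ti", "sfi", "tpi")


def check_efi_parameter(parameter):
    """Return parameter unchanged if it is 'all' or only uses known EFI keys."""
    if parameter == "all":
        return parameter
    # Accept either a single key or an iterable of keys.
    keys = [parameter] if isinstance(parameter, str) else list(parameter)
    unknown = [k for k in keys if k not in EFI_KEYS]
    if unknown:
        raise ValueError(f"Unknown EFI key(s): {unknown}")
    return parameter
```

The returned value can be passed directly as the parameter argument of cml.load_dataset.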

Remark:

By definition, observations are not available for Extreme Forecast Indices (EFI).

1.2 - Surface variable forecasts

The surface variables can be obtained for each forecast date, both for the ensemble (51 members) and deterministic runs.

It includes:

| Parameter name | ECMWF key | Remarks |
| --- | --- | --- |
| 2 metre temperature | 2t | |
| 10 metre U wind component | 10u | |
| 10 metre V wind component | 10v | |
| Total cloud cover | tcc | |
| 100 metre U wind component anomaly | 100ua | Observations not available |
| 100 metre V wind component anomaly | 100va | Observations not available |
| Convective available potential energy | cape | |
| Soil temperature level 1 | stl1 | |
| Total column water | tcw | |
| Total column water vapour | tcwv | |
| Volumetric soil water layer 1 | swvl1 | |
| Snow depth | sd | |
| Convective inhibition | cin | Observations not available |
| Visibility | vis | Observations not available |

Some missing observations will become available later.

The forecasts are available for the following model steps (in hours): every hour from 0 to 90, every 3 hours from 93 to 144, and every 6 hours from 150 to 240. All the steps are automatically retrieved.
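The list of steps has a regular structure: hourly up to 90, every 3 hours up to 144, then every 6 hours up to 240. When the steps are needed in code, for instance to select a subset after calling to_xarray, they can be rebuilt as in the sketch below. The function name surface_forecast_steps is hypothetical and not part of the plugin:

```python
def surface_forecast_steps():
    """Rebuild the list of surface forecast steps (in hours)."""
    hourly = range(0, 91)            # 0, 1, ..., 90
    three_hourly = range(93, 145, 3)  # 93, 96, ..., 144
    six_hourly = range(150, 241, 6)   # 150, 156, ..., 240
    return [*hourly, *three_hourly, *six_hourly]


steps = surface_forecast_steps()
print(len(steps))  # number of steps retrieved for each forecast date
```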

Usage: The surface variable forecasts can be retrieved by calling

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface', date, parameter, kind)
ds.to_xarray()

where the date argument is a string with a single date, and the parameter argument is a string or a list of strings with the ECMWF keys described above. Setting the parameter to 'all' downloads all the surface parameters. The kind argument selects the deterministic or ensemble forecasts, by setting it to 'highres' or 'ensemble' respectively.

Example:

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface', "2017-12-02", "sd", "highres")
ds.to_xarray()

1.3 - Pressure level variable forecasts

The variables on pressure levels can be obtained for each forecast date, both for the ensemble (51 members) and deterministic runs.

It includes:

| Parameter name | Level | ECMWF key | Remarks |
| --- | --- | --- | --- |
| Temperature | 850 | t | |
| U component of wind | 700 | u | |
| V component of wind | 700 | v | |
| Geopotential | 500 | z | |
| Specific humidity | 700 | q | |
| Relative humidity | 850 | r | |

The forecasts are available for the same model steps as the surface variables above.

Usage: The pressure level variable forecasts can be retrieved by calling

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-pressure', date, parameter, level, kind)
ds.to_xarray()

where the date argument is a string with a single date, and the parameter argument is a string or a list of strings with the ECMWF keys described above. Setting the parameter to 'all' downloads all the parameters at the given pressure level. The level argument is the pressure level, as a string or an integer. The kind argument selects the deterministic or ensemble forecasts, by setting it to 'highres' or 'ensemble' respectively.

Example:

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-pressure', "2017-12-02", "z", 500, "highres")
ds.to_xarray()
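Each pressure level variable is listed at a single level in the table above, so the level argument has to match the chosen key. A small lookup, sketched below, can help to find which keys go with a given level. PRESSURE_LEVELS and keys_at_level are hypothetical names introduced for this illustration and are not provided by the plugin:

```python
# Level of each ECMWF key, copied from the pressure level table above.
PRESSURE_LEVELS = {"t": 850, "u": 700, "v": 700, "z": 500, "q": 700, "r": 850}


def keys_at_level(level):
    """Return the sorted ECMWF keys listed at the given pressure level."""
    level = int(level)  # the level may be given as a string or an integer
    return sorted(k for k, lev in PRESSURE_LEVELS.items() if lev == level)
```

For example, keys_at_level(700) lists the keys that can be requested together with level=700.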

1.4 - Postprocessed surface variable forecasts

Postprocessed surface variables can be obtained for each forecast date, both for the ensemble (51 members) and deterministic runs. A postprocessed variable is either accumulated, averaged or filtered.

It includes:

| Parameter name | ECMWF key | Remarks |
| --- | --- | --- |
| Total precipitation | tp | |
| Surface sensible heat flux | sshf | |
| Surface latent heat flux | slhf | |
| Surface net solar radiation | ssr | |
| Surface net thermal radiation | str | |
| Convective precipitation | cp | |
| Maximum temperature at 2 metres | mx2t6 | |
| Minimum temperature at 2 metres | mn2t6 | |
| Surface solar radiation downwards | ssrd | |
| Surface thermal radiation downwards | strd | |
| 10 metre wind gust | 10fg6 | |

All these variables are accumulated or filtered over the last 6 hours preceding a given forecast timestamp. Therefore, the forecasts are available for the model steps (in hours) from 6 to 240, every 6 hours. All the steps are automatically retrieved.
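To make the 6-hour convention concrete, the sketch below computes the time window that a given step refers to, given the start time of the model run. The function six_hour_window is a hypothetical helper written for this illustration, and the run time is an input rather than something read from the dataset:

```python
from datetime import datetime, timedelta


def six_hour_window(run_time, step):
    """Return (start, end) of the 6 hours preceding the forecast timestamp."""
    if step < 6 or step > 240 or step % 6 != 0:
        raise ValueError("step must be a multiple of 6 between 6 and 240")
    end = run_time + timedelta(hours=step)  # forecast timestamp
    return end - timedelta(hours=6), end


# Window covered by step 24 of a run started on 2017-12-02 at 12:00.
start, end = six_hour_window(datetime(2017, 12, 2, 12), 24)
print(start, end)
```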

Usage: The postprocessed surface variable forecasts can be retrieved by calling

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface-postprocessed', date, parameter, kind)
ds.to_xarray()

where the date argument is a string with a single date, and the parameter argument is a string or a list of strings with the ECMWF keys described above. The kind argument selects the deterministic or ensemble forecasts, by setting it to 'highres' or 'ensemble' respectively.

Remark: For technical reasons, most fields cannot be retrieved together with others and must be downloaded individually. E.g. a request with parameter=['tp', 'mx2t6'] will fail while one with parameter='tp' will succeed.
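Given this restriction, one way to collect several postprocessed fields is to issue one request per key. The sketch below only shows the looping pattern: load_one_by_one is a hypothetical helper, and the function that performs the actual request is passed in by the caller:

```python
def load_one_by_one(parameters, load_one):
    """Call load_one(key) separately for each ECMWF key and collect the results."""
    if isinstance(parameters, str):
        parameters = [parameters]  # a single key is treated as a one-item list
    return {key: load_one(key) for key in parameters}


# Hypothetical usage (needs climetlab, the plugin and network access):
# import climetlab as cml
# name = ('eumetnet-postprocessing-benchmark-training-data-'
#         'gridded-forecasts-surface-postprocessed')
# fields = load_one_by_one(
#     ["tp", "mx2t6"],
#     lambda key: cml.load_dataset(name, "2017-12-02", key, "highres"),
# )
```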

Example:

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface-postprocessed', "2017-12-02", "mx2t6", "highres")
ds.to_xarray()

1.5 - Surface variable reforecasts

The surface variables for the ensemble reforecasts (11 members) can be obtained for each reforecast date. All the variables described in section 1.2 above are available.

The reforecasts are available for the model steps (in hours) from 0 to 240, every 6 hours. All the steps are automatically retrieved.

Remark: The ECMWF reforecasts are only available on Mondays and Thursdays. Providing any other date will fail.
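The day of the week can be checked before sending a request, using only the Python standard library. In the sketch below, is_reforecast_date is a hypothetical helper name. It only tests the weekday and does not verify that the date lies within the period covered by the dataset:

```python
from datetime import date


def is_reforecast_date(day):
    """Return True if the ISO date string falls on a Monday or a Thursday."""
    # date.weekday() returns 0 for Monday and 3 for Thursday.
    return date.fromisoformat(day).weekday() in (0, 3)


print(is_reforecast_date("2017-12-28"))  # the date used in the examples of this section
```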

Usage: The surface variable reforecasts can be retrieved by calling

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-reforecasts-surface', date, parameter)
ds.to_xarray()

where the date argument is a string with a single date, and the parameter argument is a string or a list of strings with the ECMWF keys. Setting the parameter to 'all' downloads all the surface parameters.

Example:

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-reforecasts-surface', "2017-12-28", "sd")
ds.to_xarray()

1.6 - Pressure level variable reforecasts

The variables on pressure levels for the ensemble reforecasts (11 members) can be obtained for each reforecast date. All the variables described in section 1.3 above are available.

The reforecasts are available for the same model steps as the surface variable reforecasts described in section 1.5.

Remark: The ECMWF reforecasts are only available on Mondays and Thursdays. Providing any other date will fail.

Usage: The pressure level variable reforecasts can be retrieved by calling

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-reforecasts-pressure', date, parameter, level)
ds.to_xarray()

where the date argument is a string with a single date, and the parameter argument is a string or a list of strings with the ECMWF keys. Setting the parameter to 'all' downloads all the parameters at the given pressure level. The level argument is the pressure level, as a string or an integer.

Example:

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-reforecasts-pressure', "2017-12-28", "z", 500)
ds.to_xarray()

1.7 - Postprocessed surface variable reforecasts

Postprocessed surface variables as described in section 1.4 can also be obtained as ensemble reforecasts (11 members).

The reforecasts are available for the same model steps as the surface variable reforecasts described in section 1.5.

Remark: The ECMWF reforecasts are only available on Mondays and Thursdays. Providing any other date will fail.

Usage: The postprocessed surface variable reforecasts can be retrieved by calling

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-reforecasts-surface-postprocessed', date, parameter)
ds.to_xarray()

where the date argument is a string with a single date, and the parameter argument is a string or a list of strings with the ECMWF keys.

Remark: For technical reasons, most fields cannot be retrieved together with others and must be downloaded individually. E.g. a request with parameter=['tp', 'mx2t6'] will fail while one with parameter='tp' will succeed.

Example:

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-reforecasts-surface-postprocessed', "2017-12-28", "mx2t6")
ds.to_xarray()

2 - Stations Data

Not yet provided.

Major ECMWF model changes

TODO

Support and contributing

Please open an issue on GitHub.

LICENSE

See the LICENSE file for the code, and the DATA_LICENSE for the data.

Authors

See the CONTRIBUTORS.md file.
