Using queries to the ESGF API to generate urls and keyword arguments for receipe generation in pangeo-forge
Project description
pangeo-forge-esgf
Using queries to the ESGF API to generate urls and keyword arguments for receipe generation in pangeo-forge
Install
You can install pangeo-forge-esgf via pip:
pip install pangeo-forge-esgf
If you want all the required dependencies for testing and development simply do:
pip install pangeo-forge-esgf[dev]
Parsing a list of instance ids using wildcards
Pangeo forge recipes require the user to provide exact instance_id's for the datasets they want to be processed. Discovering these with the web search can become cumbersome, especially when dealing with a large number of members/models etc.
pangeo-forge-esgf
provides some functions to query the ESGF API based on instance_id values with wildcards.
For example if you want to find all the zonal (uo
) and meridonal (vo
) velocities available for the lgm
experiment of PMIP, you can do:
from pangeo_forge_esgf.parsing import parse_instance_ids
parse_iids = [
"CMIP6.PMIP.*.*.lgm.*.*.[uo, vo].*.*",
]
# Comma separated values in square brackets will be expanded and the above is equivalent to:
# parse_iids = [
# "CMIP6.PMIP.*.*.lgm.*.*.[uo, vo].*.*", # this is equivalent to passing
# "CMIP6.PMIP.*.*.lgm.*.*.vo.*.*",
# ]
iids = []
for piid in parse_iids:
iids.extend(parse_instance_ids(piid))
iids
and you will get:
['CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.uo.gn.v20191002',
'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Odec.uo.gn.v20200212',
'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Omon.uo.gn.v20200212',
'CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.uo.gr1.v20200911',
'CMIP6.PMIP.MPI-M.MPI-ESM1-2-LR.lgm.r1i1p1f1.Omon.uo.gn.v20200909',
'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Omon.vo.gn.v20200212',
'CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.vo.gn.v20191002',
'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Odec.vo.gn.v20200212',
'CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.vo.gr1.v20200911',
'CMIP6.PMIP.MPI-M.MPI-ESM1-2-LR.lgm.r1i1p1f1.Omon.vo.gn.v20190710']
Eventually I hope I can leverage this functionality to handle user requests in PRs that add wildcard instance_ids, but for now this might be helpful to manually construct lists of instance_ids to submit to a pangeo-forge feedstock.
Generating PGF recipe input (urls) from instance_ids
from pangeo_forge_esgf import get_urls_from_esgf
iids = ['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']
url_dict = await get_urls_from_esgf(iids)
url_dict['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']
gives
100%|██████████| 5/5 [00:01<00:00, 4.98it/s]
Processing responses
Processing responses: Expected files per iid
Processing responses: Check for missing iids
Processing responses: Flatten results
Processing responses: Group results
Find responsive urls
100%|██████████| 1/1 [00:00<00:00, 3.25it/s]
['https://esgf-data1.llnl.gov/thredds/fileServer/css03_data/CMIP6/CMIP/CSIRO-ARCCSS/ACCESS-CM2/historical/r1i1p1f1/SImon/sifb/gn/v20200817/sifb_SImon_ACCESS-CM2_historical_r1i1p1f1_gn_185001-201412.nc']
or if you want to see detaile debugging statements
from pangeo_forge_esgf import get_urls_from_esgf, setup_logging
setup_logging('DEBUG')
iids = ['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']
url_dict = await get_urls_from_esgf(iids)
url_dict['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file pangeo_forge_esgf-0.3.1.tar.gz
.
File metadata
- Download URL: pangeo_forge_esgf-0.3.1.tar.gz
- Upload date:
- Size: 21.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.12.3
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 2ccd659b29e43071ee4ca5075e2cb7bf7af449fe77ceb4ce3cd28dd410ca522e |
|
MD5 | 3b66f87011105d67ff800156b661724b |
|
BLAKE2b-256 | 86308ecce769206c8783a8ec665a0776617d7663ed158507d67275665954c2e4 |
File details
Details for the file pangeo_forge_esgf-0.3.1-py3-none-any.whl
.
File metadata
- Download URL: pangeo_forge_esgf-0.3.1-py3-none-any.whl
- Upload date:
- Size: 18.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.12.3
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | fd84c5c4857ce376a8d4238c9c3b1180f9f4cc52caf1cf250f4fc512c2aa91f5 |
|
MD5 | 4ce5378be7080df931eebb6f5f6649e0 |
|
BLAKE2b-256 | fbd461ba41fe0b7b12de3e3ae34753db39e2773e2275beba0f2a4b4726ab79b0 |