pangeo-forge-esgf

Use queries to the ESGF API to generate URLs and keyword arguments for recipe generation in pangeo-forge.

Install

You can install pangeo-forge-esgf via pip:

pip install pangeo-forge-esgf

To also install the dependencies needed for testing and development, run:

pip install "pangeo-forge-esgf[dev]"

Parsing a list of instance ids using wildcards

Pangeo Forge recipes require the user to provide the exact instance_ids of the datasets they want processed. Discovering these through the ESGF web search becomes cumbersome, especially when dealing with a large number of members, models, and so on.

pangeo-forge-esgf provides some functions to query the ESGF API based on instance_id values with wildcards.

For example, if you want to find all the zonal (uo) and meridional (vo) velocities available for the lgm experiment of PMIP, you can do:

from pangeo_forge_esgf.parsing import parse_instance_ids
parse_iids = [
    "CMIP6.PMIP.*.*.lgm.*.*.[uo, vo].*.*",
]
# Comma-separated values in square brackets are expanded, so the above is equivalent to:
# parse_iids = [
#     "CMIP6.PMIP.*.*.lgm.*.*.uo.*.*",
#     "CMIP6.PMIP.*.*.lgm.*.*.vo.*.*",
# ]
iids = []
for piid in parse_iids:
    iids.extend(parse_instance_ids(piid))
iids

and you will get:

['CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.uo.gn.v20191002',
 'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Odec.uo.gn.v20200212',
 'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Omon.uo.gn.v20200212',
 'CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.uo.gr1.v20200911',
 'CMIP6.PMIP.MPI-M.MPI-ESM1-2-LR.lgm.r1i1p1f1.Omon.uo.gn.v20200909',
 'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Omon.vo.gn.v20200212',
 'CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.vo.gn.v20191002',
 'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Odec.vo.gn.v20200212',
 'CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.vo.gr1.v20200911',
 'CMIP6.PMIP.MPI-M.MPI-ESM1-2-LR.lgm.r1i1p1f1.Omon.vo.gn.v20190710']
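
To make the square-bracket expansion concrete, here is a minimal sketch of how a pattern like the one above could be expanded into one pattern per value. The helper `expand_brackets` is hypothetical and written only for illustration. It is not part of pangeo-forge-esgf, and the library's own expansion may work differently.

```python
import itertools
import re


# Hypothetical helper for illustration only; not part of pangeo-forge-esgf.
def expand_brackets(pattern: str) -> list[str]:
    """Expand every "[a, b]" group in a pattern into separate patterns."""
    # With one capture group, re.split alternates literal text (even indices)
    # and the contents of each bracket group (odd indices).
    parts = re.split(r"\[([^\]]*)\]", pattern)
    choices = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            choices.append([part])  # literal text: a single option
        else:
            choices.append([value.strip() for value in part.split(",")])
    # The Cartesian product covers patterns with more than one bracket group.
    return ["".join(combo) for combo in itertools.product(*choices)]


expanded = expand_brackets("CMIP6.PMIP.*.*.lgm.*.*.[uo, vo].*.*")
print(expanded)
```

A pattern without brackets comes back unchanged as a single-element list, so the same loop can handle both kinds of input.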

Eventually I hope I can leverage this functionality to handle user requests in PRs that add wildcard instance_ids, but for now this might be helpful to manually construct lists of instance_ids to submit to a pangeo-forge feedstock.
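
When curating such a list by hand, it can help to split each instance_id into its facets and filter on them. The sketch below assumes the ten dot-separated fields follow the usual CMIP6 facet order. The names `FACETS` and `split_iid` are introduced here for illustration and are not part of pangeo-forge-esgf. It uses a few of the instance_ids from the output above and keeps only the models that provide both uo and vo as monthly data on the native grid.

```python
from collections import defaultdict

# Assumed facet order for CMIP6 instance_ids (illustrative, not a library constant).
FACETS = [
    "mip_era", "activity_id", "institution_id", "source_id", "experiment_id",
    "member_id", "table_id", "variable_id", "grid_label", "version",
]


# Hypothetical helper for illustration only.
def split_iid(iid: str) -> dict[str, str]:
    """Split an instance_id into a facet-name -> value mapping."""
    values = iid.split(".")
    if len(values) != len(FACETS):
        raise ValueError(f"Expected {len(FACETS)} facets, got {len(values)}: {iid}")
    return dict(zip(FACETS, values))


iids = [
    "CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.uo.gn.v20191002",
    "CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Odec.uo.gn.v20200212",
    "CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Omon.uo.gn.v20200212",
    "CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Omon.vo.gn.v20200212",
    "CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.vo.gn.v20191002",
]

# Collect the variables each model offers as monthly (Omon) native-grid (gn) data.
variables_by_model = defaultdict(set)
for iid in iids:
    facets = split_iid(iid)
    if facets["table_id"] == "Omon" and facets["grid_label"] == "gn":
        variables_by_model[facets["source_id"]].add(facets["variable_id"])

# Keep only models where both velocity components are present.
complete_models = sorted(
    model for model, found in variables_by_model.items() if found == {"uo", "vo"}
)
print(complete_models)
```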

Generating PGF recipe input (URLs) from instance_ids

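from pangeo_forge_esgf import get_urls_from_esgf

iids = ['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']
# Top-level `await` works in Jupyter/IPython; in a plain script, run the coroutine with asyncio.run(...)
url_dict = await get_urls_from_esgf(iids)
# Look up the list of file URLs found for one instance_id
url_dict['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']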

gives

100%|██████████| 5/5 [00:01<00:00,  4.98it/s]
Processing responses
Processing responses: Expected files per iid
Processing responses: Check for missing iids
Processing responses: Flatten results
Processing responses: Group results
Find responsive urls
100%|██████████| 1/1 [00:00<00:00,  3.25it/s]
['https://esgf-data1.llnl.gov/thredds/fileServer/css03_data/CMIP6/CMIP/CSIRO-ARCCSS/ACCESS-CM2/historical/r1i1p1f1/SImon/sifb/gn/v20200817/sifb_SImon_ACCESS-CM2_historical_r1i1p1f1_gn_185001-201412.nc']

Or, if you want to see detailed debugging statements:

from pangeo_forge_esgf import get_urls_from_esgf, setup_logging
setup_logging('DEBUG')
iids = ['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']
url_dict = await get_urls_from_esgf(iids)
url_dict['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']
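
The snippets above use top-level `await`, which works in a Jupyter or IPython session but not in a plain Python script. One possible way to call an async function from a script is to wrap it with `asyncio.run`. The sketch below uses a hypothetical stand-in coroutine, `fake_get_urls`, so that it runs without network access. With the real library you would pass `get_urls_from_esgf` instead, assuming it is a coroutine function as the `await` usage above suggests. The wrapper name `fetch_urls_sync` is likewise made up for this example.

```python
import asyncio


# Hypothetical stand-in for get_urls_from_esgf so the sketch runs offline.
# It returns one made-up URL per instance_id in the same dict-of-lists shape.
async def fake_get_urls(iids):
    await asyncio.sleep(0)  # placeholder for the real network requests
    return {iid: [f"https://example.org/{iid}.nc"] for iid in iids}


# Hypothetical wrapper: drive any coroutine function of this shape from sync code.
def fetch_urls_sync(get_urls, iids):
    """Run an async URL lookup to completion and return its result."""
    return asyncio.run(get_urls(iids))


iid = "CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817"
url_dict = fetch_urls_sync(fake_get_urls, [iid])
print(url_dict[iid])
```

Note that `asyncio.run` raises a `RuntimeError` when called from inside an already running event loop, which is why notebooks use `await` directly instead.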
